Spring 2023: Data Mining

From MKWiki

Latest revision as of 16:03, 13 August 2023

Logistics

  • Class Timings: Wednesdays 1:00 pm - 3:00 pm (5th and 6th slots) and Thursdays 10:45 am - 12:45 pm (3rd and 4th slots)
  • Classroom: R33
  • Lab Timings: Mondays 8:45 am - 12:45 pm (1st - 4th slots)
  • Labs: CS Lab 5

Course Overview

Lectures

Lecture | Topic | Lecture Slides | Readings
Unit 1 / Chapter 1 | Introduction: 1.1 What Is Data Mining?, 1.2 Challenges, 1.3 Data Mining Origins, 1.4 Data Mining Tasks | 1Intro.pdf (http://mkbhandari.com/mkwiki/data/spring2023/DM/1Intro.pdf) | Chapter 1 (CB1)
Unit 2 / Chapter 2 | Data mining techniques: 2.1 Types of data, 2.2 Data quality, 2.3.1 Aggregation, 2.3.2 Sampling, 2.3.3 Dimensionality reduction (up to pg 51), 2.3.4 Feature subset selection (up to pg 52), 2.3.5 Feature creation (up to pg 55), 2.3.6 Discretization (up to pg 59), 2.3.7 Variable transformations, 2.4.3 Dissimilarity among data objects, 2.4.4 Similarity among data objects | 2DMT.pdf (http://mkbhandari.com/mkwiki/data/spring2023/DM/2DMT.pdf) | Chapter 2 (CB1)
Unit 5 / Chapter 6 | Association Rules: 6.1 Problem definition, 6.2 Frequent itemset generation, 6.3 Rule generation (up to pg 351) | 3AR.pdf (http://mkbhandari.com/mkwiki/data/spring2023/DM/3AR.pdf) | Chapter 6 (CB1)
Unit 3 / Chapter 4 | Classification: Basic Concepts and Techniques: 4.1 Preliminaries, 4.2 General Approach to Solving a Classification Problem, 4.3 Decision Tree Induction (up to pg 165), 4.5 Evaluating the Performance of a Classifier | 4Classification.pdf (http://mkbhandari.com/mkwiki/data/spring2023/DM/4CL.pdf) | Chapter 4 (CB1)
Unit 4 / Chapter 5 | Classification: Alternative Techniques: 5.1 Rule Based Classifier (up to pg 212), 5.2 Nearest Neighbor Classifiers, 5.3 Bayesian Classifiers (complete for discrete data; only the introduction of the Bayes classifier for continuous attributes, up to pg 233), 5.7.1 Alternative Metrics | Read from Authors' web page (https://www-users.cse.umn.edu/~kumar001/dmbook/index.php#item4) | Chapter 5 (CB1)
Unit 6 / Chapter 8 | Clustering: 8.1 Basic concepts of clustering analysis, 8.2 K-Means (8.2.1-8.2.5 except 8.2.3), 8.3 Agglomerative Hierarchical Clustering (except pg 522-524), 8.4 DBSCAN | 5CA.pdf (http://mkbhandari.com/mkwiki/data/spring2023/DM/5CA.pdf) | Chapter 8 (CB1)

Assignments and Tests

Class Assignments

  • Assignment No. 1,
  • Assignment No. 2,

Tests and Quizzes

  • Test 1 :
  • Test 2 :

Resources

Course Books:

  • CB1: Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson Education.

References:

  • R2: Data Mining: Concepts and Techniques, 3rd edition, Jiawei Han and Micheline Kamber.
  • R3: Data Mining: A Tutorial Based Primer, Richard Roiger, Michael Geatz, Pearson Education 2003.
  • R4: Introduction to Data Mining with Case Studies, G.K. Gupta, PHI 2006.
  • R5: Insight into Data Mining: Theory and Practice, Soman K. P., Diwakar Shyam, Ajay V., PHI 2006.