Difference between revisions of "Fall 2023: Open Source Software (BA Prog)"

From MKWiki
Line 16:
  !Readings
  |-
- | style="width: 12%; " | Unit 1 / Chapter 1
+ | style="width: 12%; " | Unit 1
- | style="width: 60%" |  '''''Introduction''''': 1.1 - What Is Data Mining? 1.2 Challenges 1.3 Data Mining Origins 1.4 Data Mining Tasks
+ | style="width: 60%" |  '''''Introduction to Open Source Softwares'''''
- | style="width: 15%" | [http://mkbhandari.com/mkwiki/data/spring2023/DM/1Intro.pdf '''1Intro.pdf''']
+ | style="width: 15%" |
  | Chapter 1 (CB1)
  |-
- | Unit 2 / Chapter 2
+ | Unit 2
  |  '''''Data mining techniques''''': 2.1- Types of data, 2.2 – Data Quality, 2.3.1 Aggregation, 2.3.2 Sampling, 2.3.3 Dimensionality reduction – upto pg 51, 2.3.4 Feature subset selection upto pg 52, 2.4.5 Feature creation upto pg 55, 2.3.6 Discretization upto pg 59, 2.3.7 variable transformations 2.4.3 Dissimilarity among data objects 2.4.4 similarity among data objects
  |  [http://mkbhandari.com/mkwiki/data/spring2023/DM/2DMT.pdf '''2DMT.pdf''']

Revision as of 19:42, 27 August 2023

Logistics

  • Class Timings: Mondays and Tuesdays 8:30 am - 9:30 am (1st slot)
  • Classroom: R45
  • Lab Timings: Fridays 8:30 am - 12:30 pm (1st - 4th slots)
  • Labs: CL-3

Course Overview

Lectures

Lecture | Topic | Lecture Slides | Readings
Unit 1 | Introduction to Open Source Software | | Chapter 1 (CB1)
Unit 2 | Data mining techniques: 2.1 Types of data, 2.2 Data quality, 2.3.1 Aggregation, 2.3.2 Sampling, 2.3.3 Dimensionality reduction (up to pg. 51), 2.3.4 Feature subset selection (up to pg. 52), 2.3.5 Feature creation (up to pg. 55), 2.3.6 Discretization (up to pg. 59), 2.3.7 Variable transformations, 2.4.3 Dissimilarity among data objects, 2.4.4 Similarity among data objects | 2DMT.pdf | Chapter 2 (CB1)
Unit 5 / Chapter 6 | Association Rules: 6.1 Problem definition, 6.2 Frequent itemset generation, 6.3 Rule generation (up to pg. 351) | 3AR.pdf | Chapter 6 (CB1)
Unit 3 / Chapter 4 | Classification: Basic Concepts and Techniques: 4.1 Preliminaries, 4.2 General Approach to Solving a Classification Problem, 4.3 Decision Tree Induction (up to pg. 165), 4.5 Evaluating the Performance of a Classifier | 4Classification.pdf | Chapter 4 (CB1)
Unit 4 / Chapter 5 | Classification: Alternative Techniques: 5.1 Rule-Based Classifier (up to pg. 212), 5.2 Nearest Neighbor Classifiers, 5.3 Bayesian Classifiers (complete for discrete data; only the introduction of the Bayes classifier for continuous attributes, up to pg. 233), 5.7.1 Alternative Metrics | Read from Authors' web page | Chapter 5 (CB1)
Unit 6 / Chapter 8 | Clustering: 8.1 Basic concepts of clustering analysis, 8.2 K-Means (8.2.1-8.2.5 except 8.2.3), 8.3 Agglomerative Hierarchical Clustering (except pg. 522-524), 8.4 DBSCAN | 5CA.pdf | Chapter 8 (CB1)

Assignments and Tests

Class Assignments

  • Assignment No. 1
  • Assignment No. 2

Tests and Quizzes

  • Test 1:
  • Test 2:

Resources

Course Books:

  • CB1: Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson Education.

References:

  • R2: Data Mining: Concepts and Techniques, 3rd edition, Jiawei Han and Micheline Kamber.
  • R3: Data Mining: A Tutorial-Based Primer, Richard Roiger, Michael Geatz, Pearson Education 2003.
  • R4: Introduction to Data Mining with Case Studies, G.K. Gupta, PHI 2006.
  • R5: Insight into Data Mining: Theory and Practice, Soman K. P., Diwakar Shyam, Ajay V., PHI 2006.