Difference between revisions of "Spring 2023: Data Mining"

From MKWiki
 
(25 intermediate revisions by the same user not shown)
Line 6: Line 6:
  
 
== Course Overview ==
 
== Course Overview ==
* As per the Delhi University [http://cs.du.ac.in/uploads/ug_guidelines/BSc-H-CS/V/BHCS11-Internet%20Technologies.pdf Course Guidelines]
+
* As per the Delhi University [https://cs.du.ac.in/uploads/ug_guidelines/BSc-H-CS/VI/DM%20Guidelines%2014Jan2022.pdf Course Guidelines]
  
 
== Lectures ==
 
== Lectures ==
{| class="wikitable" style="text-align: justify; width: 100%";  
+
{| class="wikitable" style="text-align: left; width: 100%";  
 
|-
 
|-
 
!Lecture
 
!Lecture
Line 16: Line 16:
 
!Readings
 
!Readings
 
|-
 
|-
| style="width: 10%; " | Unit/Chapter 1.3 (26/07/22)
+
| style="width: 12%; " | Unit 1 / Chapter 1
| style="width: 60%" |  '''''Introduction to Internet''''': What is Internet? Evolution of the Internet. Working of the Internet. Difference between Intranet and Internet.
+
| style="width: 60%" |  '''''Introduction''''': 1.1 - What Is Data Mining? 1.2 Challenges 1.3 Data Mining Origins 1.4 Data Mining Tasks
| style="width: 15%" | [http://mkbhandari.com/mkwiki/data/fall2022/1.3Internet.pdf '''1.3.pdf'''] <br> [http://mkbhandari.com/mkwiki/data/fall2022/Milestones.pdf '''Key Milestones.pdf''']  
+
| style="width: 15%" | [http://mkbhandari.com/mkwiki/data/spring2023/DM/1Intro.pdf '''1Intro.pdf''']
| Chapter 1 (R1)
+
| Chapter 1 (CB1)
 
|-
 
|-
| Unit/Chapter 3.1 (01/08/22, 02/08/22)
+
| Unit 2 / Chapter 2
| '''''Web Servers''''': Introduction to Web Servers. Working, Configuring, Hosting, and Managing Web Servers(class assignment). Client-side Technologies, Server-side Technologies, Hybrid Technologies.
+
|   '''''Data mining techniques''''': 2.1 Types of data, 2.2 Data quality, 2.3.1 Aggregation, 2.3.2 Sampling, 2.3.3 Dimensionality reduction (up to pg. 51), 2.3.4 Feature subset selection (up to pg. 52), 2.3.5 Feature creation (up to pg. 55), 2.3.6 Discretization (up to pg. 59), 2.3.7 Variable transformations, 2.4.3 Dissimilarity among data objects, 2.4.4 Similarity among data objects
| [http://mkbhandari.com/mkwiki/data/fall2022/3.1WebServers.pdf '''3.1.pdf''']
+
| [http://mkbhandari.com/mkwiki/data/spring2023/DM/2DMT.pdf '''2DMT.pdf''']
| Chapter 1-2 (R1)
+
| Chapter 2 (CB1)
 
|-
 
|-
| Unit/Chapter 3.2 (22/08/22)
+
| Unit 5 / Chapter 6
| '''''Proxy Servers''''': Introduction, Working, Types of Proxies, setting up and managing a proxy server.
+
|   '''''Association Rules''''': 6.1-Problem definition, 6.2-Frequent itemset generation, 6.3-Rule generation till Pg 351
| [https://www.varonis.com/blog/what-is-a-proxy-server/   Online ]
+
| [http://mkbhandari.com/mkwiki/data/spring2023/DM/3AR.pdf '''3AR.pdf''']
| Up to Proxy server risks
+
| Chapter 6 (CB1)
 
|-
 
|-
| Unit/Chapter 4.2(a) (22/08/22)
+
| Unit 3 / Chapter 4
| Introduction to '''''forums, blogging''''', portfolio, developing a responsive website.
+
|   '''''Classification: Basic Concepts and Techniques''''': 4.1 – Preliminaries, 4.2 – General Approach to Solving a Classification Problem, 4.3 – Decision Tree Induction (till pg. 165), 4.5 – Evaluating the Performance of a Classifier
| [https://imtips.co/blog-or-forum.html Online]
+
| [http://mkbhandari.com/mkwiki/data/spring2023/DM/4CL.pdf '''4Classification.pdf''']
| Except Common FAQ’s on Forums and Blogs)
+
| Chapter 4 (CB1)
 
|-
 
|-
| Unit/Chapter 4.2(b) (23/08/22, 29/08/22, 05/09/22, 06/09/22, 12/09/22, 13/09/22, 19/09/22, 26/09/22, 27/09/22, 10/10/22, 11/10/22)
+
| Unit 4 / Chapter 5
|  JavaScript, jQuery, AJAX and JSON <br>
+
|   '''''Classification: Alternative Techniques''''': 5.1 – Rule Based Classifier (up to pg. 212), 5.2 – Nearest Neighbor Classifiers, 5.3 – Bayesian Classifiers (complete for discrete data; only the introduction of the Bayes classifier for continuous attributes) till pg. 233, 5.7.1 – Alternative Metrics
(1) Basic JavaScript Instruction <br>
+
[https://www-users.cse.umn.edu/~kumar001/dmbook/index.php#item4 Read from Authors' web page]
(2) Functions, Methods & Objects <br>
+
| Chapter 5 (CB1)
(3) Decisions and Loops <br>
 
(4) Document Object Model  <br>
 
(5) Events <br>
 
(6) jQuery <br>
 
(7) AJAX, JSON
 
| [http://mkbhandari.com/mkwiki/data/fall2022/4.2JS.pdf '''4.2JS.pdf'''] <br>
 
[http://mkbhandari.com/mkwiki/data/fall2022/4.2DOM.pdf '''4.2DOM.pdf'''] <br>
 
[http://mkbhandari.com/mkwiki/data/fall2022/4.2jQuery.pdf '''4.2jQuery.pdf'''] <br>
 
[http://mkbhandari.com/mkwiki/data/fall2022/4.2AJAXandJSON.pdf '''4.2AJAXnJSON.pdf''']  
 
| Chapter 2-8 (R2) <hr> Partial PDFs have been uploaded, and contents were covered from the Reference Book in the classroom this semester. You are required to Read from the Reference Books as per the DUCS IT guidelines.
 
 
|-
 
|-
| Student Presentations ()
+
| Unit 6 / Chapter 8
| ABCD<br>
+
|   '''''Clustering''''': 8.1 Basic concepts of clustering analysis, 8.2 K-Means (8.2.1-8.2.5 except 8.2.3), 8.3 Agglomerative Hierarchical Clustering (except pg 522-524), 8.4 DBSCAN
(1) '''NodeJS''' - ''Aditi Kumari, Shreya Singh, Tanisha Sharma, Yashi choudhary, Yash lohia''<br>
+
[http://mkbhandari.com/mkwiki/data/spring2023/DM/5CA.pdf '''5CA.pdf''']
(2) '''Bootstrap''' - ''Raj Khatri, Pratham Sharma, Prakash Kr. Singh, Purbak Sengupta''<br>
+
| Chapter 8 (CB1)
(3) '''Search Engines-Components, Working and Optimisation''' - Rajat Sharma, Rishabh Sharma, Ramit Yadav, Shashank Kestwal <br>
 
 
| [http://mkbhandari.com/mkwiki/data/fall2022/4.2JS.pdf '''4.2JS.pdf'''] <br>
 
[http://mkbhandari.com/mkwiki/data/fall2022/4.2DOM.pdf '''4.2DOM.pdf'''] <br>
 
[http://mkbhandari.com/mkwiki/data/fall2022/4.2jQuery.pdf '''4.2jQuery.pdf'''] <br>
 
[http://mkbhandari.com/mkwiki/data/fall2022/4.2AJAXandJSON.pdf '''4.2AJAXnJSON.pdf''']  
 
| Chapter 2-8 (R2) <hr> Partial PDFs have been uploaded, and contents were covered from the Reference Book in the classroom this semester. You are required to Read from the Reference Books as per the DUCS IT guidelines.
 
 
|}
 
|}
  
 
== Assignments and Tests==
 
== Assignments and Tests==
 
===Class Assignments===
 
===Class Assignments===
* '''''Assignment No. 1''''', Uploaded on Google Classroom, '''Submission Deadline''': 01/11/2022.
+
* '''''Assignment No. 1''''',  
* '''''Group Assignment/ Presentations'''''
+
* '''''Assignment No. 2''''',
  
 
===Tests and Quizzes===
 
===Tests and Quizzes===
* '''Test 1''' : 18/10/2022
+
* '''Test 1''' :
 +
* '''Test 2''' :
  
 
== Resources ==
 
== Resources ==
* '''R1''': Learning PHP, MySQL, JavaScript, CSS & HTML5, (Robin Nixon), 3rd Edition, O’Reilly Media. <br>
+
===Course Books:===
* '''R2''': JavaScript and JQuery – Interactive Front-end Web Development, (Jon Duckett), John Wiley and Sons, Inc. <br>
+
* '''CB1''': Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson Education. <br>
* '''R3:''' Web Design with HTML and CSS Digital Classroom, (Jeremy Osborn, Jemmifer Smith and AGI Creative Team), Wiley Publishing, Inc.
+
===References:===
 +
* '''R2''': Data Mining: Concepts and Techniques, 3rd edition, Jiawei Han and Micheline Kamber. <br>
 +
* '''R3''': Data Mining: A Tutorial Based Primer, Richard Roiger, Michael Geatz, Pearson Education 2003. <br>
 +
* '''R4''': Introduction to Data Mining with Case Studies, G.K. Gupta, PHI 2006. <br>
 +
* '''R5''': Insight into Data Mining: Theory and Practice, Soman K. P., Diwakar Shyam, Ajay V., PHI 2006

Latest revision as of 16:03, 13 August 2023

Logistics

  • Class Timings: Wednesdays 1:00 pm - 3:00 pm (5th and 6th slots) and Thursdays 10:45 am - 12:45 pm (3rd and 4th slots)
  • Classroom: R33
  • Lab Timings: Mondays 8:45 am - 12:45 pm (1st - 4th slots)
  • Labs: CS Lab 5

Course Overview

  • As per the Delhi University Course Guidelines (https://cs.du.ac.in/uploads/ug_guidelines/BSc-H-CS/VI/DM%20Guidelines%2014Jan2022.pdf)

Lectures

Lecture | Topic | Lecture Slides | Readings
Unit 1 / Chapter 1 | Introduction: 1.1 What Is Data Mining?; 1.2 Challenges; 1.3 Data Mining Origins; 1.4 Data Mining Tasks | 1Intro.pdf | Chapter 1 (CB1)
Unit 2 / Chapter 2 | Data mining techniques: 2.1 Types of data, 2.2 Data quality, 2.3.1 Aggregation, 2.3.2 Sampling, 2.3.3 Dimensionality reduction (up to pg. 51), 2.3.4 Feature subset selection (up to pg. 52), 2.3.5 Feature creation (up to pg. 55), 2.3.6 Discretization (up to pg. 59), 2.3.7 Variable transformations, 2.4.3 Dissimilarity among data objects, 2.4.4 Similarity among data objects | 2DMT.pdf | Chapter 2 (CB1)
Unit 5 / Chapter 6 | Association Rules: 6.1 Problem definition, 6.2 Frequent itemset generation, 6.3 Rule generation (till pg. 351) | 3AR.pdf | Chapter 6 (CB1)
Unit 3 / Chapter 4 | Classification: Basic Concepts and Techniques: 4.1 Preliminaries, 4.2 General Approach to Solving a Classification Problem, 4.3 Decision Tree Induction (till pg. 165), 4.5 Evaluating the Performance of a Classifier | 4Classification.pdf | Chapter 4 (CB1)
Unit 4 / Chapter 5 | Classification: Alternative Techniques: 5.1 Rule Based Classifier (up to pg. 212), 5.2 Nearest Neighbor Classifiers, 5.3 Bayesian Classifiers (complete for discrete data; only the introduction of the Bayes classifier for continuous attributes) till pg. 233, 5.7.1 Alternative Metrics | Read from Authors' web page | Chapter 5 (CB1)
Unit 6 / Chapter 8 | Clustering: 8.1 Basic concepts of clustering analysis, 8.2 K-Means (8.2.1-8.2.5 except 8.2.3), 8.3 Agglomerative Hierarchical Clustering (except pg. 522-524), 8.4 DBSCAN | 5CA.pdf | Chapter 8 (CB1)

Assignments and Tests

Class Assignments

  • Assignment No. 1
  • Assignment No. 2

Tests and Quizzes

  • Test 1 :
  • Test 2 :

Resources

Course Books:

  • CB1: Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson Education.

References:

  • R2: Data Mining: Concepts and Techniques, 3rd edition, Jiawei Han and Micheline Kamber.
  • R3: Data Mining: A Tutorial Based Primer, Richard Roiger, Michael Geatz, Pearson Education 2003.
  • R4: Introduction to Data Mining with Case Studies, G.K. Gupta, PHI 2006.
  • R5: Insight into Data Mining: Theory and Practice, Soman K. P., Diwakar Shyam, Ajay V., PHI 2006