Difference between revisions of "Fall 2025: Data Mining-1"

From MKWiki
(Created page with "DM-1")
 
 
(14 intermediate revisions by the same user not shown)
Line 1: Line 1:
DM-1
+
== '''Lab 0: Getting Started''' ( week of 04<sup>th</sup>, 11<sup>th</sup> & 18<sup>th</sup> August 2025 ) ==
 +
{| class="wikitable" style="text-align: justify;"
 +
|-
 +
! Task No. 
 +
! Task 
 +
! Assessment Period
 +
! Submission Deadline
 +
|-
 +
| style="width: 8%"  | 1
 +
| style="width: 60%" | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html
 +
| style="width: 15%" |  --
 +
| --
 +
|-
 +
| 2 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html  || -- || --
 +
|-
 +
| 3 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html || -- || --
 +
|}
 +
 
 +
== '''Lab 1: ''' ( week of 25<sup>th</sup> August & 01<sup>st</sup> September 2025  ) ==
 +
{| class="wikitable" style="text-align: justify;"
 +
|-
 +
! Task No. 
 +
! Task 
 +
! Assessment Period
 +
! Submission Deadline
 +
|-
 +
| style="width: 8%; text-align: center;" | 1
 +
| style="width: 60%" | Apply data cleaning techniques to any dataset (e.g., the Paper Reviews dataset in the UCI repository). Techniques may include handling missing values, outliers, and inconsistent values. Prepare a set of validation rules based on the dataset and validate the data against them.
 +
| style="width: 15%" |  25/08/2025 - 01/09/2025
 +
| 02/09/2025
 +
|}
 +
 
 +
== '''Lab 2: ''' ( week of 08<sup>th</sup> & 15<sup>th</sup> September 2025  ) ==
 +
{| class="wikitable" style="text-align: justify;"
 +
|-
 +
! Task No. 
 +
! Task 
 +
! Assessment Period
 +
! Submission Deadline
 +
|-
 +
| style="width: 8%; text-align: center;" | 2
 +
| style="width: 60%" | Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, and sampling to any dataset.
 +
| style="width: 15%" |  08/09/2025 - 15/09/2025
 +
| 22/09/2025
 +
|}
 +
 
 +
== '''Lab 3: ''' ( week of 22<sup>nd</sup> September 2025  ) ==
 +
{| class="wikitable" style="text-align: justify;"
 +
|-
 +
! Task No. 
 +
! Task 
 +
! Assessment Period
 +
! Submission Deadline
 +
|-
 +
| style="width: 8%; text-align: center;" | 5
 +
| style="width: 60%" | Apply the simple K-means algorithm to cluster any dataset. Compare the quality of the resulting clusters while varying the algorithm's parameters. For a given set of parameters, plot a line graph of the MSE obtained after each iteration.
 +
| style="width: 15%" |  22/09/2025 - 06/10/2025
 +
| 06/10/2025
 +
|}

Latest revision as of 23:16, 21 September 2025

Lab 0: Getting Started ( week of 04th, 11th & 18th August 2025 )

{| class="wikitable"
|-
! Task No. !! Task !! Assessment Period !! Submission Deadline
|-
| 1 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html || -- || --
|-
| 2 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html || -- || --
|-
| 3 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html || -- || --
|}

Lab 1: ( week of 25th August & 01st September 2025 )

{| class="wikitable"
|-
! Task No. !! Task !! Assessment Period !! Submission Deadline
|-
| 1 || Apply data cleaning techniques to any dataset (e.g., the Paper Reviews dataset in the UCI repository). Techniques may include handling missing values, outliers, and inconsistent values. Prepare a set of validation rules based on the dataset and validate the data against them. || 25/08/2025 - 01/09/2025 || 02/09/2025
|}
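As a starting point for the Lab 1 task, the three cleaning steps and the validation rules can be sketched in plain Python. Everything here is an illustrative assumption: the records are a made-up toy sample (not the Paper Reviews dataset), the column names <code>score</code> and <code>lang</code> are hypothetical, and outliers are detected with a simple range rule rather than a statistical test.

```python
import statistics

# Hypothetical toy records (NOT the Paper Reviews dataset); None marks a missing value.
records = [
    {"id": 1, "score": 4.0, "lang": "EN"},
    {"id": 2, "score": None, "lang": "en "},
    {"id": 3, "score": 3.5, "lang": "es"},
    {"id": 4, "score": 40.0, "lang": "ES"},      # out-of-range value, likely an entry error
    {"id": 5, "score": 4.5, "lang": "en"},
    {"id": 6, "score": 3.0, "lang": "english"},
]

# Validation rules: column name -> predicate that a clean value must satisfy.
RULES = {
    "score": lambda v: v is not None and 1.0 <= v <= 5.0,
    "lang": lambda v: v in {"en", "es"},
}


def validate(rows):
    """Return the (id, column) pairs that violate a validation rule."""
    return [(r["id"], col) for r in rows for col, ok in RULES.items() if not ok(r[col])]


def clean(rows):
    """Return a cleaned copy of rows; the input list is left untouched."""
    cleaned = [dict(r) for r in rows]

    # 1. Inconsistent values: normalise case and whitespace, map known synonyms.
    synonyms = {"english": "en", "spanish": "es"}
    for r in cleaned:
        lang = r["lang"].strip().lower()
        r["lang"] = synonyms.get(lang, lang)

    # 2. Outliers: scores outside the assumed 1..5 range are treated as missing.
    for r in cleaned:
        if r["score"] is not None and not 1.0 <= r["score"] <= 5.0:
            r["score"] = None

    # 3. Missing values: impute the median of the observed scores.
    observed = [r["score"] for r in cleaned if r["score"] is not None]
    median = statistics.median(observed)
    for r in cleaned:
        if r["score"] is None:
            r["score"] = median
    return cleaned


if __name__ == "__main__":
    print("violations before cleaning:", validate(records))
    print("violations after cleaning: ", validate(clean(records)))
```

Running the validation both before and after cleaning gives a simple way to report how many rule violations each technique removed. Median imputation and the range rule are only one possible choice; other strategies (dropping rows, IQR-based outlier detection) fit the same structure.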

Lab 2: ( week of 08th & 15th September 2025 )

{| class="wikitable"
|-
! Task No. !! Task !! Assessment Period !! Submission Deadline
|-
| 2 || Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, and sampling to any dataset. || 08/09/2025 - 15/09/2025 || 22/09/2025
|}
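Several of the pre-processing techniques named in the Lab 2 task reduce to a few lines each. The sketch below uses only the Python standard library; the function names and the sample values are hypothetical, and each function assumes a non-empty numeric column whose values are not all equal.

```python
import math
import random
import statistics

# Hypothetical numeric column used only for illustration.
values = [2.0, 4.0, 6.0, 8.0, 10.0]


def min_max(xs):
    """Normalization: rescale to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def z_score(xs):
    """Standardization: zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def log_transform(xs):
    """Transformation: natural log, valid for strictly positive values only."""
    return [math.log(x) for x in xs]


def equal_width_bins(xs, k):
    """Discretization: index (0..k-1) of the equal-width bin each value falls in."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / k
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((x - lo) / width), k - 1) for x in xs]


def binarize(xs, threshold):
    """Binarization: 1 if the value exceeds the threshold, else 0."""
    return [1 if x > threshold else 0 for x in xs]


def sample_without_replacement(xs, n, seed=0):
    """Sampling: n distinct items, reproducible through a fixed seed."""
    return random.Random(seed).sample(xs, n)


if __name__ == "__main__":
    print(min_max(values))
    print(z_score(values))
    print(equal_width_bins(values, 4))
    print(binarize(values, 5.0))
    print(sample_without_replacement(values, 3))
```

Aggregation is omitted here because it depends on how the chosen dataset is grouped; a per-group mean computed with <code>statistics.fmean</code> would follow the same pattern.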

Lab 3: ( week of 22nd September 2025 )

{| class="wikitable"
|-
! Task No. !! Task !! Assessment Period !! Submission Deadline
|-
| 5 || Apply the simple K-means algorithm to cluster any dataset. Compare the quality of the resulting clusters while varying the algorithm's parameters. For a given set of parameters, plot a line graph of the MSE obtained after each iteration. || 22/09/2025 - 06/10/2025 || 06/10/2025
|}
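One way to obtain the per-iteration MSE asked for in the Lab 3 task is to record it inside the K-means loop. The sketch below is a minimal pure-Python version of Lloyd's algorithm, not a reference implementation. It assumes that "MSE" means the mean squared Euclidean distance from each point to its assigned centroid, and the points, the function names, and the parameters <code>k</code>, <code>max_iter</code> and <code>seed</code> are all illustrative.

```python
import random

# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(data, k, max_iter=20, seed=0):
    """Run Lloyd's algorithm; return (centroids, labels, mse_history)."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)   # initial centroids: k distinct data points
    labels = None
    history = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in data]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, new_labels) if lab == j]
            if members:               # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        # MSE of this iteration: mean squared distance to the assigned centroid.
        mse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(data, new_labels)) / len(data)
        history.append(mse)
        if new_labels == labels:      # assignments stopped changing: converged
            break
        labels = new_labels
    return centroids, labels, history


if __name__ == "__main__":
    for k in (1, 2, 3):
        _, _, history = kmeans(points, k)
        print(f"k={k}: MSE per iteration = {history}")
```

Plotting <code>history</code> against the iteration number, for instance with matplotlib's <code>plt.plot(range(1, len(history) + 1), history)</code>, gives the requested line graph. Repeating the run with different values of <code>k</code> or <code>seed</code> is one simple way to compare clusterings under varying parameters.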