Fall 2025: Data Mining-1
Latest revision as of 23:16, 21 September 2025
Lab 0: Getting Started (weeks of 4th, 11th & 18th August 2025)
Task No. | Task | Assessment Period | Submission Deadline |
---|---|---|---|
1 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html | -- | -- |
2 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html | -- | -- |
3 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html | -- | -- |
Lab 1 (weeks of 25th August & 1st September 2025)
Task No. | Task | Assessment Period | Submission Deadline |
---|---|---|---|
1 | Apply data cleaning techniques to any dataset (e.g., the Paper Reviews dataset in the UCI repository). Techniques may include handling missing values, outliers, and inconsistent values. Prepare a set of validation rules based on the dataset and check the data against them. | 25/08/2025 - 01/09/2025 | 02/09/2025 |
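
The cleaning steps named in Task 1 can be sketched as follows. This is a minimal illustration in plain Python (standard library only), not a prescribed solution. The toy records, the field names `score` and `decision`, the assumed valid score range, and the choice of an IQR fence for outliers and median imputation for missing values are all assumptions made for the example; they are not taken from the Paper Reviews dataset.

```python
from statistics import median, quantiles

# Hypothetical toy records; field names and values are invented for illustration.
RAW = [
    {"id": 1, "score": 4.0, "decision": "accept"},
    {"id": 2, "score": None, "decision": "Accept "},   # missing score, messy label
    {"id": 3, "score": 3.5, "decision": "REJECT"},     # inconsistent casing
    {"id": 4, "score": 40.0, "decision": "reject"},    # likely a data-entry error
    {"id": 5, "score": 2.5, "decision": "accept"},
    {"id": 6, "score": 3.0, "decision": "reject"},
]

VALID_DECISIONS = {"accept", "reject"}  # assumed label set
SCORE_RANGE = (0.0, 5.0)                # assumed valid range for a score


def normalize_labels(rows):
    """Inconsistent values: trim whitespace and lower-case the label."""
    return [{**r, "decision": r["decision"].strip().lower()} for r in rows]


def mask_outliers(rows, field, k=1.5):
    """Outliers: set values outside the IQR fences to None (treated as missing)."""
    observed = [r[field] for r in rows if r[field] is not None]
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [
        ({**r, field: None}
         if r[field] is not None and not (low <= r[field] <= high)
         else r)
        for r in rows
    ]


def impute_median(rows, field):
    """Missing values: replace None with the median of the observed values."""
    fill = median(r[field] for r in rows if r[field] is not None)
    return [({**r, field: fill} if r[field] is None else r) for r in rows]


def violations(rows):
    """Validation rules: return (id, rule) pairs for every broken rule."""
    problems = []
    for r in rows:
        score = r["score"]
        if score is None or not (SCORE_RANGE[0] <= score <= SCORE_RANGE[1]):
            problems.append((r["id"], "score must be present and within range"))
        if r["decision"] not in VALID_DECISIONS:
            problems.append((r["id"], "decision must be a known label"))
    return problems


def clean(rows):
    """Run the three cleaning steps in order; the input list is not modified."""
    rows = normalize_labels(rows)
    rows = mask_outliers(rows, "score")
    return impute_median(rows, "score")


cleaned = clean(RAW)
print("violations before cleaning:", violations(RAW))
print("violations after cleaning: ", violations(cleaned))
```

Running the validation rules both before and after cleaning gives a simple way to show what each step fixed. Outliers are masked before imputation so that the extreme value does not distort the median used as the fill value.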
Lab 2 (weeks of 8th & 15th September 2025)
Task No. | Task | Assessment Period | Submission Deadline |
---|---|---|---|
2 | Apply data pre-processing techniques (e.g., standardization/normalization, transformation, aggregation, discretization/binarization, sampling) to any dataset. | 08/09/2025 - 15/09/2025 | 22/09/2025 |
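
Each pre-processing technique listed in Task 2 can be sketched as a small function. The example below is one possible illustration using only the Python standard library. The sample values, the function names, the log transform as the chosen transformation, and equal-width binning as the chosen discretization are assumptions made for the example.

```python
import math
import random
from statistics import mean, pstdev

# Hypothetical numeric attribute, chosen so that the mean is 5 and the std is 2.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


def standardize(xs):
    """Standardization: z-scores with zero mean and unit (population) std."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def min_max(xs):
    """Normalization: rescale linearly to the interval [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def log_transform(xs):
    """Transformation: log(1 + x) compresses large non-negative values."""
    return [math.log1p(x) for x in xs]


def equal_width_bins(xs, n_bins):
    """Discretization: map each value to one of n_bins equal-width intervals."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((x - lo) / width), n_bins - 1) for x in xs]


def binarize(xs, threshold):
    """Binarization: 1 if the value exceeds the threshold, else 0."""
    return [1 if x > threshold else 0 for x in xs]


def aggregate_mean(pairs):
    """Aggregation: mean value per group key from (key, value) pairs."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return {key: mean(vals) for key, vals in groups.items()}


def sample_without_replacement(xs, k, seed=0):
    """Sampling: draw k items without replacement, reproducibly."""
    return random.Random(seed).sample(xs, k)


print("z-scores:  ", standardize(values))
print("min-max:   ", [round(v, 3) for v in min_max(values)])
print("log1p:     ", [round(v, 3) for v in log_transform(values)])
print("3 bins:    ", equal_width_bins(values, 3))
print("binarized: ", binarize(values, 4.5))
print("aggregated:", aggregate_mean([("a", 1.0), ("a", 3.0), ("b", 10.0)]))
print("sample:    ", sample_without_replacement(values, 3))
```

Passing an explicit seed to the sampler keeps results reproducible across runs, which makes it easier to compare the effect of the other steps.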
Lab 3 (week of 22nd September 2025)
Task No. | Task | Assessment Period | Submission Deadline |
---|---|---|---|
5 | Apply the simple K-means algorithm to cluster any dataset. Compare clustering performance while varying the algorithm parameters. For a given set of parameters, plot a line graph of the MSE obtained after each iteration. | 22/09/2025 - 06/10/2025 | 06/10/2025 |
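
A minimal sketch of the K-means task, assuming Lloyd's algorithm with initial centroids drawn at random from the data points. The function names, the synthetic two-blob dataset, and the definition of MSE as the mean squared distance from each point to its assigned centroid (recorded after every assignment step) are assumptions for illustration. The code uses only the Python standard library and records the per-iteration MSE rather than plotting it.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=20, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels, mse_history)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k random data points
    labels = None
    history = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Record the MSE of this assignment against the current centroids.
        mse = sum(
            squared_distance(p, centroids[l]) for p, l in zip(points, new_labels)
        ) / len(points)
        history.append(mse)
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels, history


# Hypothetical data: two Gaussian blobs around (0, 0) and (10, 10).
_rng = random.Random(42)
points = [(_rng.gauss(0, 1), _rng.gauss(0, 1)) for _ in range(20)]
points += [(_rng.gauss(10, 1), _rng.gauss(10, 1)) for _ in range(20)]

# Vary one algorithm parameter (k) and compare the final MSE.
for k in (1, 2, 3, 4):
    _, _, history = kmeans(points, k, seed=1)
    print(f"k={k}: iterations={len(history)}, final MSE={history[-1]:.3f}")
```

The returned `mse_history` list holds one value per iteration, so it can be passed directly to a plotting library for the required line graph, for example `matplotlib.pyplot.plot(range(1, len(history) + 1), history, marker="o")` if matplotlib is installed. Varying `seed` and `max_iter` in the same loop shows how initialization and iteration budget affect the result.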