Fall 2024: Data Mining Lab
Instructions
- Please be on time to avoid the Attendance Penalty.
- Please sign the Attendance Register before you take a seat.
- Please put your mobile phone on silent mode.
- Submit each lab assignment in Google Classroom (GC) for evaluation before the deadline; deadlines will be announced lab-wise in GC.
- Turn off (shut down) your assigned computer and arrange the chair before you leave the lab.
Guidelines
- As per the DUCS guidelines for DSE: Data Mining
Lab 0: Getting Started ( week of 05th & 12th August 2024 )
Q. NO. | Program | Practical No. | Remarks |
---|---|---|---|
1 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html | Practice Set No. 1 | Introduction to Python |
2 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html | Practice Set No. 2 | Introduction to Numpy and Pandas |
3 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html | Practice Set No. 3 | Data Exploration |
Lab 1: ( week of 19th & 26th August 2024 )
Q. NO. | Program | Practical No. | Remarks |
---|---|---|---|
1 | Apply data cleaning techniques on any dataset (e.g. Chronic Kidney Disease dataset from UCI repository). Techniques may include handling missing values, outliers and inconsistent values. Also, a set of validation rules may be specified for the particular dataset and validation checks performed. | Practical No. 1 | Dataset: kidneyDisease.csv Download from Kaggle: Chronic Kidney Disease dataset |
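One way to approach the cleaning steps listed above is sketched below with pandas. This is only an illustration under stated assumptions: the tiny DataFrame, the column names (`age`, `bp`, `class`), the choice of median/mode imputation, IQR clipping, and the three validation rules are all hypothetical and are not taken from kidneyDisease.csv or from the official guidelines.

```python
import numpy as np
import pandas as pd


def clean(df, clip_cols):
    """Return a cleaned copy of df (illustrative choices, not the only ones)."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Missing values: impute numeric columns with the median.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Inconsistent values: remove stray whitespace and unify case.
            # Assumes every non-numeric column holds text labels.
            out[col] = out[col].str.strip().str.lower()
            # Missing values: impute text columns with the most frequent label.
            mode = out[col].mode()
            if not mode.empty:
                out[col] = out[col].fillna(mode.iloc[0])
    for col in clip_cols:
        # Outliers: clip to the usual 1.5 * IQR fences.
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out


def validate(df, rules):
    """Count the rows that violate each rule (rule returns True when valid)."""
    return {name: int((~rule(df)).sum()) for name, rule in rules.items()}


# Hypothetical toy data standing in for a real dataset.
raw = pd.DataFrame({
    "age": [48.0, 7.0, np.nan, 62.0, 51.0, 60.0],
    "bp": [80.0, 50.0, 80.0, 70.0, 800.0, 90.0],
    "class": ["ckd", " CKD", "notckd", "ckd\t", None, "notckd"],
})

# Hypothetical validation rules; real rules depend on the dataset's domain.
rules = {
    "age_in_range": lambda d: d["age"].between(0, 120),
    "bp_plausible": lambda d: d["bp"].between(40, 250),
    "class_known": lambda d: d["class"].isin(["ckd", "notckd"]),
}

violations_before = validate(raw, rules)
cleaned = clean(raw, clip_cols=["bp"])
violations_after = validate(cleaned, rules)
print(violations_before)
print(violations_after)
```

Running the validation checks before and after cleaning gives a simple way to report what the cleaning changed. Clipping is applied only to the columns named in `clip_cols`, since a blanket outlier rule can distort legitimate values such as the age of a child.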
Lab 2: ( week of 2nd & 9th September 2024 )
Q. NO. | Program | Practical No. | Remarks |
---|---|---|---|
1 | Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, sampling etc. on any dataset | Practical No. 2 | Dataset: rain.csv Download from data.gov.in: Rainfall in India |
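The pre-processing techniques named above can each be expressed in a line or two of pandas. The sketch below uses synthetic data; the column names (`region`, `year`, `rainfall_mm`) and the generated values are assumptions for illustration and do not reflect the layout of rain.csv.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: 3 hypothetical regions, 20 years each.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "region": np.repeat(["A", "B", "C"], 20),
    "year": np.tile(np.arange(2001, 2021), 3),
    "rainfall_mm": rng.gamma(shape=2.0, scale=400.0, size=60),
})
x = df["rainfall_mm"]

# Standardization: zero mean, unit (population) standard deviation.
df["z_score"] = (x - x.mean()) / x.std(ddof=0)

# Normalization: min-max rescaling to the interval [0, 1].
df["min_max"] = (x - x.min()) / (x.max() - x.min())

# Transformation: log(1 + x) to reduce right skew.
df["log_rain"] = np.log1p(x)

# Discretization: three equal-frequency bins.
df["level"] = pd.qcut(x, q=3, labels=["low", "medium", "high"])

# Binarization: 1 if rainfall is above the overall mean, else 0.
df["above_mean"] = (x > x.mean()).astype(int)

# Aggregation: per-region mean and total rainfall.
by_region = df.groupby("region")["rainfall_mm"].agg(["mean", "sum"])

# Sampling: a simple random 20% sample and a stratified sample per region.
simple_sample = df.sample(frac=0.2, random_state=0)
stratified_sample = df.groupby("region").sample(n=4, random_state=0)

print(by_region)
print(stratified_sample["region"].value_counts())
```

Equal-frequency binning via `pd.qcut` is used here; `pd.cut` would give equal-width bins instead, and which one is preferable depends on how skewed the attribute is.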
Lab 3: ( week of 16th, 23rd & 30th September 2024 )
Q. NO. | Program | Practical No. | Remarks |
---|---|---|---|
1 | Writing/Review of Chapter 1, Chapter 3, and Chapter 4 of Project Report | Project Work | |
Lab 4: ( week of 7th October 2024 )
Q. NO. | Program | Practical No. | Remarks |
---|---|---|---|
1 | Apply the simple K-means algorithm for clustering any dataset. Compare the performance of clusters by varying the algorithm parameters. For a given set of parameters, plot a line graph depicting MSE obtained after each iteration. | Practical No. 3 | Dataset: Mall_Customers.csv Download from Kaggle: Mall Customer Segmentation Data |
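A minimal from-scratch sketch of K-means that records the MSE at every iteration is given below. It assumes Euclidean distance, random initialization from the data points, and synthetic two-dimensional data in place of Mall_Customers.csv; the function names `kmeans` and `plot_mse` are hypothetical. MSE here means the mean squared distance of each point to its assigned centroid, measured at the assignment step of each iteration.

```python
import numpy as np


def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    """Simple K-means; returns (centroids, labels, mse_history)."""
    rng = np.random.default_rng(seed)
    # Initialization: pick k distinct data points as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    mse_history = []
    for _ in range(max_iter):
        # Assignment step: squared distance of every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        mse_history.append(float(d2[np.arange(len(X)), labels].mean()))
        # Update step: move each centroid to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels, mse_history


def plot_mse(history):
    """Line graph of MSE against iteration number (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    plt.plot(range(1, len(history) + 1), history, marker="o")
    plt.xlabel("Iteration")
    plt.ylabel("MSE")
    plt.title("K-means: MSE per iteration")
    plt.show()


# Synthetic data: three Gaussian blobs (a stand-in for a real dataset).
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(loc=center, scale=0.5, size=(50, 2))
    for center in [(0, 0), (5, 5), (0, 5)]
])

centroids, labels, history = kmeans(X, k=3)

# Varying a parameter: final MSE for several values of k.
final_mse = {k: kmeans(X, k)[2][-1] for k in (2, 3, 4, 5)}
print(history)
print(final_mse)
```

Calling `plot_mse(history)` draws the per-iteration line graph. Re-running `kmeans` with different values of `k` or `seed` is one simple way to compare clusterings, since the result depends on the initial centroids.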
Projects
Team No. | Project Title | Team Members | Outcomes/Remarks |
---|---|---|---|
1 | Understanding the Monsoon Pattern in Eastern Gangetic Plain | | |
2 | NIRF Ranking Prediction | | |
3 | Student Performance Prediction | | |
4 | FIFA Prediction | | |
5 | Breast Cancer Prediction | | |
6 | YouTube spam comments classification | | |
7 | Olympic Data Analysis and Prediction | | |
8 | Credit Card Fraud Detection | | |
9 | CreditMap: Exploring Credit Score Patterns through Data Mining | | |
10 | Movie Recommendation System | | |
11 | Wine Quality Prediction | | |