Difference between revisions of "Fall 2024: Data Mining Lab"
| (12 intermediate revisions by the same user not shown) | |||
| Line 58: | Line 58: | ||
|} | |} | ||
| − | == '''Lab 3:''' ( week of 16<sup>th</sup> | + | == '''Lab 3:''' ( week of 16<sup>th</sup>, 23<sup>rd</sup> & 30<sup>th</sup>September 2024 ) == |
{| class="wikitable" style="text-align: justify; width: 100%"; | {| class="wikitable" style="text-align: justify; width: 100%"; | ||
|- | |- | ||
| Line 67: | Line 67: | ||
|- | |- | ||
| style="width: 8%" | 1 | | style="width: 8%" | 1 | ||
| − | | style="width: 60%" | Writing/Review of Chapter 1 and Chapter | + | | style="width: 60%" | Writing/Review of Chapter 1, Chapter 3, and Chapter 4 of Project Report |
| style="width: 15%" | Project Work | | style="width: 15%" | Project Work | ||
| | | | ||
| + | |} | ||
| + | |||
| + | == '''Lab 4:''' ( week of 7<sup>th</sup> October 2024 ) == | ||
| + | {| class="wikitable" style="text-align: justify; width: 100%"; | ||
| + | |- | ||
| + | ! Q. NO. | ||
| + | ! Program | ||
| + | ! Practical No. | ||
| + | ! Remarks | ||
| + | |- | ||
| + | | style="width: 8%" | 1 | ||
| + | | style="width: 60%" | Apply simple K-means algorithm for clustering any dataset. Compare the performance of clusters by varying the algorithm parameters. For a given set of parameters, plot a line graph depicting MSE obtained after each iteration. | ||
| + | | style="width: 15%" | Practical No. 3 | ||
| + | | '''Dataset:''' [http://mkbhandari.com/mkwiki/data/fall2024/dm/datasets/Mall_Customers.csv '''Mall_Customers.csv'''] <br> | ||
| + | '''Download from data from kaggle:''' [https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python Mall Customer Segmentation Data] | ||
|} | |} | ||
| Line 83: | Line 98: | ||
| style="width: 45%" | Understanding the Monsoon Pattern in Eastern Gangatic Plain | | style="width: 45%" | Understanding the Monsoon Pattern in Eastern Gangatic Plain | ||
| style="width: 25%" | | | style="width: 25%" | | ||
| − | # Akshary Sharma (25019) | + | # '''Akshary Sharma (25019)''' |
# Abhay Yadav (25040) | # Abhay Yadav (25040) | ||
| + | # Anuj Gupta (25042) | ||
# Amar Kumar (25065) | # Amar Kumar (25065) | ||
# Kunal Verma (25073) | # Kunal Verma (25073) | ||
| Line 93: | Line 109: | ||
|- | |- | ||
|2|| NIRF Ranking Prediction|| | |2|| NIRF Ranking Prediction|| | ||
| − | # Abhishek Prasad (25007) | + | # '''Abhishek Prasad (25007)''' |
# Vishal Kumar (25014) | # Vishal Kumar (25014) | ||
# Nitish Kumar (25023) | # Nitish Kumar (25023) | ||
| + | # Anshu Kumar Dubey (25036) | ||
# Sunny Chauhan (25050) | # Sunny Chauhan (25050) | ||
| | | | ||
| Line 102: | Line 119: | ||
* Project Presentation: | * Project Presentation: | ||
|- | |- | ||
| − | |3|| Student Performance Prediction|| | + | |3|| Student Performance Prediction || |
| − | # Himanshu Kumar (25016) | + | # '''Himanshu Kumar (25016)''' |
# Kanan Pal (25072) | # Kanan Pal (25072) | ||
# Khushboo Yadav (25082) | # Khushboo Yadav (25082) | ||
| Line 114: | Line 131: | ||
|4|| FIFA Prediction || | |4|| FIFA Prediction || | ||
# Arihant (25003) | # Arihant (25003) | ||
| − | # Ayush Pundir (25027) | + | # '''Ayush Pundir (25027)''' |
# Pratyush (25060) | # Pratyush (25060) | ||
# Ashish (25066) | # Ashish (25066) | ||
| Line 124: | Line 141: | ||
|5|| Breast Cancer Prediction || | |5|| Breast Cancer Prediction || | ||
# Vidhan (25044) | # Vidhan (25044) | ||
| − | # Sandeep Kumar Sharma (25047) | + | # '''Sandeep Kumar Sharma (25047)''' |
# Ayushman Pandey (25094) | # Ayushman Pandey (25094) | ||
# Tanishk Panchal (25095) | # Tanishk Panchal (25095) | ||
| Line 136: | Line 153: | ||
# Shatrughan (25084) | # Shatrughan (25084) | ||
# Om Ranjan (25085) | # Om Ranjan (25085) | ||
| − | # Aman Sagar (25086) | + | # '''Aman Sagar (25086)''' |
| | | | ||
* Dataset: | * Dataset: | ||
| Line 144: | Line 161: | ||
|7|| Olympic Data Analysis and Prediction || | |7|| Olympic Data Analysis and Prediction || | ||
# Kusum (25002) | # Kusum (25002) | ||
| − | # Aditya Kumar (25012) | + | # '''Aditya Kumar (25012)''' |
# Divyanshi (25021) | # Divyanshi (25021) | ||
# Tushar Rana (25064) | # Tushar Rana (25064) | ||
| Line 153: | Line 170: | ||
|- | |- | ||
|8|| Credit Card Fraud Detection || | |8|| Credit Card Fraud Detection || | ||
| − | # Ansh Raj ( | + | # Ritesh Dhawan (25037) |
| − | # Uday Raj Verma (250xx) | + | # Bitthal Varshney (25041) |
| − | # | + | # Ansh Raj (25081) |
| − | # | + | # '''Uday Raj Verma (25083)''' |
| + | # Astitwa Rawat (25088) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |9|| CreditMap: Exploring Credit Score Patterns through Data Mining || | ||
| + | # Himanshu Singh (25017) | ||
| + | # '''Garvit Kumar (25018)''' | ||
| + | # Mayank (25022) | ||
| + | # Abhishek Kumar Singh(25032) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |10|| Movie Recommendation System || | ||
| + | # Tanya Agrahari (25030) | ||
| + | # Prakash Mishra (25035) | ||
| + | # '''Adarsh Singh (25074)''' | ||
| + | # Shivam Verma (25078) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |11|| Wine Quality Prediction || | ||
| + | # '''Shivam Soni (250xx)''' | ||
| + | # Kashif (250xx) | ||
| + | # Akash Pathak (250xx) | ||
| + | # Priyanshu Sachan (250xx) | ||
| | | | ||
* Dataset: | * Dataset: | ||
Latest revision as of 21:54, 26 November 2024
Instructions
- Please be on time to avoid the Attendance Penalty.
- Please sign the Attendance Register before you take a seat.
- Please put your mobile phone in silent mode.
- Each lab assignment must be submitted in Google Classroom for evaluation (each lab will be announced there; submit before the deadline).
- Turn off (shut down) your assigned computer and arrange the chair before you leave the lab.
Guidelines
- As per the DUCS guidelines for DSE: Data Mining
Lab 0: Getting Started ( week of 05th & 12th August 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html | Practice Set No. 1 | Introduction to Python |
| 2 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html | Practice Set No. 2 | Introduction to Numpy and Pandas |
| 3 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html | Practice Set No. 3 | Data Exploration |
Lab 1: ( week of 19th & 26th August 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Apply data cleaning techniques on any dataset (e.g. the Chronic Kidney Disease dataset from the UCI repository). Techniques may include handling missing values, outliers and inconsistent values. Also, a set of validation rules may be specified for the particular dataset and validation checks performed. | Practical No. 1 | Dataset: kidneyDisease.csv; download from Kaggle: Chronic Kidney Disease dataset |
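
One possible starting point for this practical, sketched in plain Python so that it runs without the dataset: median imputation for missing values, the 1.5 × IQR rule for outliers, and a simple range check as a validation rule. The function names, the `ages` sample and the 0 to 120 range are illustrative assumptions; they are not part of the assignment or of kidneyDisease.csv.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def violates_range(values, low, high):
    """Validation rule: return the indices of values outside [low, high]."""
    return [i for i, v in enumerate(values) if not (low <= v <= high)]


# Made-up sample column with one missing value and one implausible entry.
ages = [48, 53, None, 62, 51, 60, 480, 55]
cleaned = impute_median(ages)
print(cleaned)
print(iqr_outliers(cleaned))
print(violates_range(cleaned, 0, 120))
```

With the real dataset, the same steps would typically be applied column by column after loading the CSV, and the validation rules would be chosen from the attribute descriptions of that dataset.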
Lab 2: ( week of 2nd & 9th September 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, sampling, etc. on any dataset. | Practical No. 2 | Dataset: rain.csv; download from data.gov.in: Rainfall in India |
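
Three of the techniques named above can be sketched in a few lines of plain Python: min-max normalization, z-score standardization and equal-width discretization. The function names and the `rainfall_mm` values are made up for illustration and do not come from rain.csv.

```python
import statistics


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to rescale
    return [(v - lo) / (hi - lo) for v in values]


def z_score_standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def equal_width_bins(values, n_bins):
    """Discretize values into n_bins equal-width intervals, numbered from 0."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up monthly rainfall figures in millimetres.
rainfall_mm = [0.0, 50.0, 100.0, 150.0, 200.0]
print(min_max_normalize(rainfall_mm))
print(z_score_standardize(rainfall_mm))
print(equal_width_bins(rainfall_mm, 4))
```

Min-max scaling is sensitive to extreme values, while z-scores are not bounded to a fixed range; comparing the two on the same column is one way to discuss the trade-off in the lab report.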
Lab 3: ( week of 16th, 23rd & 30th September 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Writing/Review of Chapter 1, Chapter 3, and Chapter 4 of Project Report | Project Work | |
Lab 4: ( week of 7th October 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Apply the simple K-means algorithm to cluster any dataset. Compare the performance of clusters by varying the algorithm parameters. For a given set of parameters, plot a line graph depicting the MSE obtained after each iteration. | Practical No. 3 | Dataset: Mall_Customers.csv; download from Kaggle: Mall Customer Segmentation Data |
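
A minimal sketch of the simple K-means (Lloyd's) procedure that records the MSE after each iteration, written in plain Python so it runs without any dataset. The function name `kmeans`, its parameters and the six toy points are assumptions for illustration; here the MSE is taken as the mean squared Euclidean distance from each point to its assigned centroid.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Run Lloyd's algorithm; return (centroids, labels, mse_history)."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    mse_history = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        total = 0.0
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = min(range(k), key=dists.__getitem__)
            total += dists[labels[i]]
        mse_history.append(total / len(points))
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: the update no longer moves any centroid
        centroids = new_centroids
    return centroids, labels, mse_history


# Toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels, mse_history = kmeans(points, k=2)
print(labels)
print(mse_history)
```

The `mse_history` list can be passed to a plotting library, for example `plt.plot(range(1, len(mse_history) + 1), mse_history)` with matplotlib, to obtain the required line graph. Re-running with different values of `k`, `seed` and `max_iter` gives the parameter comparison the assignment asks for.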
Projects
| Team No. | Project Title | Team Members | Outcomes/Remarks |
|---|---|---|---|
| 1 | Understanding the Monsoon Pattern in Eastern Gangatic Plain | | |
| 2 | NIRF Ranking Prediction | | |
| 3 | Student Performance Prediction | | |
| 4 | FIFA Prediction | | |
| 5 | Breast Cancer Prediction | | |
| 6 | YouTube spam comments classification | | |
| 7 | Olympic Data Analysis and Prediction | | |
| 8 | Credit Card Fraud Detection | | |
| 9 | CreditMap: Exploring Credit Score Patterns through Data Mining | | |
| 10 | Movie Recommendation System | | |
| 11 | Wine Quality Prediction | | |