Difference between revisions of "Spring 2025: Software Engineering Lab"
(Created page with "SE Lab") |
|||
| Line 1: | Line 1: | ||
| − | + | =='''Instructions'''== | |
| + | * Please be on time to avoid the '''Attendance Penalty'''. | ||
| + | * Please put your mobile phone in '''Silent Mode'''. | ||
| + | * Each lab assignment must be submitted in '''Google Classroom''' for evaluation (details are posted in Google Classroom for each lab; submit before the deadline). | ||
| + | * '''Turn off (shut down) your assigned computer and arrange the chair''' before you leave the lab. | ||
| + | |||
| + | ==''' Guidelines'''== | ||
| + | * As per DUCS guidelines [http://mkbhandari.com/mkwiki/data/fall2024/dm/DMGuideline.pdf '''DSE: Data Mining'''] | ||
| + | |||
| + | == '''Lab 0: Getting Started''' ( week of 05<sup>th</sup> & 12<sup>th</sup> August 2024 ) == | ||
| + | {| class="wikitable" style="text-align: justify; width: 100%" | ||
| + | |- | ||
| + | ! Q. NO. | ||
| + | ! Program | ||
| + | ! Practical No. | ||
| + | ! Remarks | ||
| + | |- | ||
| + | | style="width: 8%" | 1 | ||
| + | | style="width: 60%" | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html | ||
| + | | style="width: 15%" | Practice Set No. 1 | ||
| + | | Introduction to Python | ||
| + | |- | ||
| + | | 2 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html || Practice Set No. 2 || Introduction to Numpy and Pandas | ||
| + | |- | ||
| + | | 3 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html || Practice Set No. 3 || Data Exploration | ||
| + | |} | ||
| + | |||
| + | == '''Lab 1:''' ( week of 19<sup>th</sup> & 26<sup>th</sup> August 2024 ) == | ||
| + | {| class="wikitable" style="text-align: justify; width: 100%" | ||
| + | |- | ||
| + | ! Q. NO. | ||
| + | ! Program | ||
| + | ! Practical No. | ||
| + | ! Remarks | ||
| + | |- | ||
| + | | style="width: 8%" | 1 | ||
| + | | style="width: 60%" | Apply data cleaning techniques on any dataset (e.g. Chronic Kidney Disease dataset from UCI repository). Techniques may include handling missing values, outliers and inconsistent values. Also, a set of validation rules may be specified for the particular dataset and validation checks performed. | ||
| + | | style="width: 15%" | Practical No. 1 | ||
| + | | '''Dataset:''' [http://mkbhandari.com/mkwiki/data/fall2024/dm/datasets/kidneyDisease.csv '''kidneyDisease.csv'''] <br> | ||
| + | '''Download from Kaggle:''' [https://www.kaggle.com/datasets/mansoordaku/ckdisease Chronic Kidney Disease dataset] <br> | ||
| + | '''Tutorial:''' [https://www.kaggle.com/code/alexisbcook/handling-missing-values#How-many-missing-data-points-do-we-have? Tutorial on Handling Missing values] | ||
| + | |} | ||
| + | |||
| + | == '''Lab 2:''' ( week of 2<sup>nd</sup> & 9<sup>th</sup> September 2024 ) == | ||
| + | {| class="wikitable" style="text-align: justify; width: 100%" | ||
| + | |- | ||
| + | ! Q. NO. | ||
| + | ! Program | ||
| + | ! Practical No. | ||
| + | ! Remarks | ||
| + | |- | ||
| + | | style="width: 8%" | 1 | ||
| + | | style="width: 60%" | Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, sampling etc. on any dataset | ||
| + | | style="width: 15%" | Practical No. 2 | ||
| + | | '''Dataset:''' [http://mkbhandari.com/mkwiki/data/fall2024/dm/datasets/rain.csv '''rain.csv'''] <br> | ||
| + | '''Download from data.gov.in:''' [https://www.data.gov.in/catalog/rainfall-india Rainfall in India] | ||
| + | |} | ||
| + | |||
| + | == '''Lab 3:''' ( week of 16<sup>th</sup>, 23<sup>rd</sup> & 30<sup>th</sup> September 2024 ) == | ||
| + | {| class="wikitable" style="text-align: justify; width: 100%" | ||
| + | |- | ||
| + | ! Q. NO. | ||
| + | ! Program | ||
| + | ! Practical No. | ||
| + | ! Remarks | ||
| + | |- | ||
| + | | style="width: 8%" | 1 | ||
| + | | style="width: 60%" | Writing/Review of Chapter 1, Chapter 3, and Chapter 4 of Project Report | ||
| + | | style="width: 15%" | Project Work | ||
| + | | | ||
| + | |} | ||
| + | |||
| + | == '''Lab 4:''' ( week of 7<sup>th</sup> October 2024 ) == | ||
| + | {| class="wikitable" style="text-align: justify; width: 100%" | ||
| + | |- | ||
| + | ! Q. NO. | ||
| + | ! Program | ||
| + | ! Practical No. | ||
| + | ! Remarks | ||
| + | |- | ||
| + | | style="width: 8%" | 1 | ||
| + | | style="width: 60%" | Apply simple K-means algorithm for clustering any dataset. Compare the performance of clusters by varying the algorithm parameters. For a given set of parameters, plot a line graph depicting MSE obtained after each iteration. | ||
| + | | style="width: 15%" | Practical No. 3 | ||
| + | | '''Dataset:''' [http://mkbhandari.com/mkwiki/data/fall2024/dm/datasets/Mall_Customers.csv '''Mall_Customers.csv'''] <br> | ||
| + | '''Download from Kaggle:''' [https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python Mall Customer Segmentation Data] | ||
| + | |} | ||
| + | |||
| + | == '''Projects''' == | ||
| + | {| class="wikitable" style="text-align: justify; width: 100%" | ||
| + | |- | ||
| + | ! Team No. | ||
| + | ! Project Title | ||
| + | ! Team Members | ||
| + | ! Outcomes/Remarks | ||
| + | |- | ||
| + | | style="width: 8%" | 1 | ||
| + | | style="width: 45%" | Understanding the Monsoon Pattern in Eastern Gangetic Plain | ||
| + | | style="width: 25%" | | ||
| + | # '''Akshary Sharma (25019)''' | ||
| + | # Abhay Yadav (25040) | ||
| + | # Anuj Gupta (25042) | ||
| + | # Amar Kumar (25065) | ||
| + | # Kunal Verma (25073) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |2|| NIRF Ranking Prediction|| | ||
| + | # '''Abhishek Prasad (25007)''' | ||
| + | # Vishal Kumar (25014) | ||
| + | # Nitish Kumar (25023) | ||
| + | # Anshu Kumar Dubey (25036) | ||
| + | # Sunny Chauhan (25050) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |3|| Student Performance Prediction || | ||
| + | # '''Himanshu Kumar (25016)''' | ||
| + | # Kanan Pal (25072) | ||
| + | # Khushboo Yadav (25082) | ||
| + | # Diksha Joshi (25091) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |4|| FIFA Prediction || | ||
| + | # Arihant (25003) | ||
| + | # '''Ayush Pundir (25027)''' | ||
| + | # Pratyush (25060) | ||
| + | # Ashish (25066) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |5|| Breast Cancer Prediction || | ||
| + | # Vidhan (25044) | ||
| + | # '''Sandeep Kumar Sharma (25047)''' | ||
| + | # Ayushman Pandey (25094) | ||
| + | # Tanishk Panchal (25095) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |6|| YouTube spam comments classification || | ||
| + | # Devesh Chauhan (25011) | ||
| + | # Shatrughan (25084) | ||
| + | # Om Ranjan (25085) | ||
| + | # '''Aman Sagar (25086)''' | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |7|| Olympic Data Analysis and Prediction || | ||
| + | # Kusum (25002) | ||
| + | # '''Aditya Kumar (25012)''' | ||
| + | # Divyanshi (25021) | ||
| + | # Tushar Rana (25064) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |8|| Credit Card Fraud Detection || | ||
| + | # Ritesh Dhawan (25037) | ||
| + | # Bitthal Varshney (25041) | ||
| + | # Ansh Raj (25081) | ||
| + | # '''Uday Raj Verma (25083)''' | ||
| + | # Astitwa Rawat (25088) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |9|| CreditMap: Exploring Credit Score Patterns through Data Mining || | ||
| + | # Himanshu Singh (25017) | ||
| + | # '''Garvit Kumar (25018)''' | ||
| + | # Mayank (25022) | ||
| + | # Abhishek Kumar Singh (25032) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |10|| Movie Recommendation System || | ||
| + | # Tanya Agrahari (25030) | ||
| + | # Prakash Mishra (25035) | ||
| + | # '''Adarsh Singh (25074)''' | ||
| + | # Shivam Verma (25078) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |- | ||
| + | |11|| Wine Quality Prediction || | ||
| + | # '''Shivam Soni (250xx)''' | ||
| + | # Kashif (250xx) | ||
| + | # Akash Pathak (250xx) | ||
| + | # Priyanshu Sachan (250xx) | ||
| + | | | ||
| + | * Dataset: | ||
| + | * Report: | ||
| + | * Project Presentation: | ||
| + | |} | ||
Revision as of 13:06, 15 December 2024
Instructions
- Please be on time to avoid the Attendance Penalty.
- Please put your mobile phone in Silent Mode.
- Each lab assignment must be submitted in Google Classroom for evaluation (details are posted in Google Classroom for each lab; submit before the deadline).
- Turn off (shut down) your assigned computer and arrange the chair before you leave the lab.
Guidelines
- As per DUCS guidelines DSE: Data Mining
Lab 0: Getting Started ( week of 05th & 12th August 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html | Practice Set No. 1 | Introduction to Python |
| 2 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html | Practice Set No. 2 | Introduction to Numpy and Pandas |
| 3 | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html | Practice Set No. 3 | Data Exploration |
Lab 1: ( week of 19th & 26th August 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Apply data cleaning techniques on any dataset (e.g. Chronic Kidney Disease dataset from UCI repository). Techniques may include handling missing values, outliers and inconsistent values. Also, a set of validation rules may be specified for the particular dataset and validation checks performed. | Practical No. 1 | Dataset: kidneyDisease.csv; Download from Kaggle: Chronic Kidney Disease dataset |
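The cleaning steps named in Practical No. 1 can be sketched as follows. This is a minimal illustration rather than a prescribed solution: it uses only the Python standard library, and the toy records, the column names (`age`, `bp`, `class`), the allowed ranges in `RANGES` and the label set in `LABELS` are all assumptions made for the example, not values taken from kidneyDisease.csv.

```python
import csv
import io
import statistics

# Hypothetical toy records (not the real kidneyDisease.csv); the column names
# and the rules below are assumptions made for this sketch.
RAW = "id,age,bp,class\n1,48,80,ckd\n2,,70,CKD\n3,62,,notckd\n4,51,500,ckd\t\n5,35,75,Notckd\n"
RANGES = {"age": (0, 120), "bp": (40, 250)}  # validation rule: allowed range per column
LABELS = {"ckd", "notckd"}                   # validation rule: allowed class labels


def load_rows(text):
    """Parse CSV text into dicts; strip whitespace and turn blanks into None."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k: (v.strip() or None) for k, v in rec.items()} for rec in reader]


def clean(rows):
    """Return (cleaned_rows, violations) after fixing labels, ranges and gaps."""
    violations = []
    for row in rows:
        # Inconsistent values: unify the spelling of class labels.
        row["class"] = row["class"].lower() if row["class"] else None
        for col, (low, high) in RANGES.items():
            value = float(row[col]) if row[col] is not None else None
            # Outliers: a value outside the allowed range is recorded and
            # then treated as missing.
            if value is not None and not low <= value <= high:
                violations.append((row["id"], col, value))
                value = None
            row[col] = value
    # Missing values: impute each numeric column with the median of valid values.
    for col in RANGES:
        median = statistics.median(r[col] for r in rows if r[col] is not None)
        for row in rows:
            if row[col] is None:
                row[col] = median
    return rows, violations


def validate(rows):
    """Check every row against the validation rules; True if all rows pass."""
    return all(
        row["class"] in LABELS
        and all(low <= row[col] <= high for col, (low, high) in RANGES.items())
        for row in rows
    )


cleaned, violations = clean(load_rows(RAW))
print("rule violations found:", violations)
print("all rows valid after cleaning:", validate(cleaned))
```

In the lab the same steps would more likely be written with pandas, as introduced in Lab 0; the range limits and label set are the parts to adapt to the chosen dataset.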
Lab 2: ( week of 2nd & 9th September 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, sampling etc. on any dataset | Practical No. 2 | Dataset: rain.csv Download from data.gov.in: Rainfall in India |
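The pre-processing techniques listed in Practical No. 2 can each be written in a few lines. The sketch below is illustrative only: the rainfall readings are made-up numbers (not taken from rain.csv), the helper names are hypothetical, and the bin count and threshold are arbitrary choices for the example.

```python
import math
import random
import statistics

# Hypothetical monthly rainfall readings in mm (not taken from rain.csv).
rainfall = [12.0, 30.0, 45.0, 150.0, 300.0, 280.0, 90.0, 20.0]


def min_max(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def z_score(values):
    """Standardization: zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def equal_width_bins(values, k):
    """Discretization: map each value to one of k equal-width bin indices."""
    low, high = min(values), max(values)
    width = (high - low) / k
    # The maximum would land in bin k, so clamp it into the last bin (k - 1).
    return [min(int((v - low) / width), k - 1) for v in values]


def binarize(values, threshold):
    """Binarization: 1 if the value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


normalized = min_max(rainfall)
standardized = z_score(rainfall)
# Transformation: log(1 + x) compresses the long right tail of the readings.
logged = [math.log1p(v) for v in rainfall]
bins = equal_width_bins(rainfall, k=3)
wet = binarize(rainfall, threshold=100.0)
# Sampling: draw 4 readings without replacement; the fixed seed makes it repeatable.
sample = random.Random(0).sample(rainfall, k=4)
# Aggregation: collapse the readings into a total and a mean.
total, mean = sum(rainfall), statistics.fmean(rainfall)

print(bins, wet, total, mean)
```

Min-max scaling and z-scores answer different needs: the first bounds the values, the second makes columns with different units comparable, so which one to apply depends on the analysis that follows.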
Lab 3: ( week of 16th, 23rd & 30th September 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Writing/Review of Chapter 1, Chapter 3, and Chapter 4 of Project Report | Project Work | |
Lab 4: ( week of 7th October 2024 )
| Q. NO. | Program | Practical No. | Remarks |
|---|---|---|---|
| 1 | Apply simple K-means algorithm for clustering any dataset. Compare the performance of clusters by varying the algorithm parameters. For a given set of parameters, plot a line graph depicting MSE obtained after each iteration. | Practical No. 3 | Dataset: Mall_Customers.csv; Download from Kaggle: Mall Customer Segmentation Data |
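One way to obtain the per-iteration MSE asked for in Practical No. 3 is to implement the standard iterative K-means procedure directly and record the error after every pass. The sketch below makes several assumptions: the function names are hypothetical, the points are invented (income, spending score) pairs rather than rows of Mall_Customers.csv, and initial centroids are simply `k` randomly chosen points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=20, seed=0):
    """Simple K-means; returns (centroids, labels, mse_history)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k random data points
    mse_history = []
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        # MSE of this iteration: mean squared distance to the assigned centroid.
        mse = sum(
            squared_distance(p, centroids[lab])
            for p, lab in zip(points, new_labels)
        ) / len(points)
        mse_history.append(mse)
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
    return centroids, labels, mse_history


# Hypothetical (annual income, spending score) pairs, invented for this sketch.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels, history = kmeans(points, k=2)
for iteration, mse in enumerate(history, start=1):
    print(f"iteration {iteration}: MSE = {mse:.4f}")
```

The `history` list can be passed to a plotting library to draw the line graph, and rerunning with different `k`, `seed` or `max_iter` values gives the parameter comparison the practical asks for.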
Projects
| Team No. | Project Title | Team Members | Outcomes/Remarks |
|---|---|---|---|
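| 1 | Understanding the Monsoon Pattern in Eastern Gangetic Plain | 1. Akshary Sharma (25019); 2. Abhay Yadav (25040); 3. Anuj Gupta (25042); 4. Amar Kumar (25065); 5. Kunal Verma (25073) | Dataset: ; Report: ; Project Presentation: |
| 2 | NIRF Ranking Prediction | 1. Abhishek Prasad (25007); 2. Vishal Kumar (25014); 3. Nitish Kumar (25023); 4. Anshu Kumar Dubey (25036); 5. Sunny Chauhan (25050) | Dataset: ; Report: ; Project Presentation: |
| 3 | Student Performance Prediction | 1. Himanshu Kumar (25016); 2. Kanan Pal (25072); 3. Khushboo Yadav (25082); 4. Diksha Joshi (25091) | Dataset: ; Report: ; Project Presentation: |
| 4 | FIFA Prediction | 1. Arihant (25003); 2. Ayush Pundir (25027); 3. Pratyush (25060); 4. Ashish (25066) | Dataset: ; Report: ; Project Presentation: |
| 5 | Breast Cancer Prediction | 1. Vidhan (25044); 2. Sandeep Kumar Sharma (25047); 3. Ayushman Pandey (25094); 4. Tanishk Panchal (25095) | Dataset: ; Report: ; Project Presentation: |
| 6 | YouTube spam comments classification | 1. Devesh Chauhan (25011); 2. Shatrughan (25084); 3. Om Ranjan (25085); 4. Aman Sagar (25086) | Dataset: ; Report: ; Project Presentation: |
| 7 | Olympic Data Analysis and Prediction | 1. Kusum (25002); 2. Aditya Kumar (25012); 3. Divyanshi (25021); 4. Tushar Rana (25064) | Dataset: ; Report: ; Project Presentation: |
| 8 | Credit Card Fraud Detection | 1. Ritesh Dhawan (25037); 2. Bitthal Varshney (25041); 3. Ansh Raj (25081); 4. Uday Raj Verma (25083); 5. Astitwa Rawat (25088) | Dataset: ; Report: ; Project Presentation: |
| 9 | CreditMap: Exploring Credit Score Patterns through Data Mining | 1. Himanshu Singh (25017); 2. Garvit Kumar (25018); 3. Mayank (25022); 4. Abhishek Kumar Singh (25032) | Dataset: ; Report: ; Project Presentation: |
| 10 | Movie Recommendation System | 1. Tanya Agrahari (25030); 2. Prakash Mishra (25035); 3. Adarsh Singh (25074); 4. Shivam Verma (25078) | Dataset: ; Report: ; Project Presentation: |
| 11 | Wine Quality Prediction | 1. Shivam Soni (250xx); 2. Kashif (250xx); 3. Akash Pathak (250xx); 4. Priyanshu Sachan (250xx) | Dataset: ; Report: ; Project Presentation: |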