Difference between revisions of "Spring 2025: Software Engineering Lab"

From MKWiki
(Created page with "SE Lab")
 
Line 1: Line 1:
SE Lab
+
=='''Instructions'''==
 +
* Please be on time to avoid the '''Attendance Penalty'''.
 +
* Please put your mobile phone on '''Silent Mode'''.
 +
* Submit each lab assignment in '''Google Classroom''' for evaluation (assignments are announced there lab by lab; submit before the deadline).
 +
* '''Shut down your assigned computer and arrange the chair''' before you leave the lab.
 +
 
 +
==''' Guidelines'''==
 +
* As per DUCS guidelines [http://mkbhandari.com/mkwiki/data/fall2024/dm/DMGuideline.pdf  '''DSE: Data Mining''']
 +
 
 +
== '''Lab 0: Getting Started''' ( week of 05<sup>th</sup> & 12<sup>th</sup> August 2024 ) ==
 +
{| class="wikitable" style="text-align: justify; width: 100%;"
 +
|-
 +
! Q. NO. 
 +
! Program 
 +
! Practical No. 
 +
! Remarks
 +
|-
 +
| style="width: 8%"  | 1
 +
| style="width: 60%" | https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html
 +
| style="width: 15%" |  Practice Set No. 1
 +
| Introduction to Python
 +
|-
 +
| 2 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html  || Practice Set No. 2 || Introduction to Numpy and Pandas
 +
|-
 +
| 3 || https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html || Practice Set No. 3 || Data Exploration
 +
|}
 +
 
 +
== '''Lab 1:''' ( week of 19<sup>th</sup>  &  26<sup>th</sup>  August 2024 ) ==
 +
{| class="wikitable" style="text-align: justify; width: 100%;"
 +
|-
 +
! Q. NO. 
 +
! Program 
 +
! Practical No. 
 +
! Remarks
 +
|-
 +
| style="width: 8%"  | 1
 +
| style="width: 60%" | Apply data cleaning techniques to any dataset (e.g. the Chronic Kidney Disease dataset from the UCI repository). Techniques may include handling missing values, outliers, and inconsistent values. You may also specify a set of validation rules for the chosen dataset and run validation checks against them.
 +
| style="width: 15%" |  Practical No. 1
 +
| '''Dataset:''' [http://mkbhandari.com/mkwiki/data/fall2024/dm/datasets/kidneyDisease.csv '''kidneyDisease.csv'''] <br>
 +
'''Download from Kaggle:''' [https://www.kaggle.com/datasets/mansoordaku/ckdisease Chronic Kidney Disease dataset] <br>
 +
'''Tutorial:''' [https://www.kaggle.com/code/alexisbcook/handling-missing-values#How-many-missing-data-points-do-we-have? Tutorial on Handling Missing values]
 +
|}
 +
 
 +
== '''Lab 2:''' ( week of 2<sup>nd</sup> &  9<sup>th</sup>  September 2024 ) ==
 +
{| class="wikitable" style="text-align: justify; width: 100%;"
 +
|-
 +
! Q. NO. 
 +
! Program 
 +
! Practical No. 
 +
! Remarks
 +
|-
 +
| style="width: 8%"  | 1
 +
| style="width: 60%" | Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, and sampling to any dataset.
 +
| style="width: 15%" |  Practical No. 2
 +
| '''Dataset:''' [http://mkbhandari.com/mkwiki/data/fall2024/dm/datasets/rain.csv '''rain.csv'''] <br>
 +
'''Download from data.gov.in:''' [https://www.data.gov.in/catalog/rainfall-india Rainfall in India]
 +
|}
 +
 
 +
== '''Lab 3:''' ( week of 16<sup>th</sup>, 23<sup>rd</sup> &  30<sup>th</sup> September 2024 ) ==
 +
{| class="wikitable" style="text-align: justify; width: 100%;"
 +
|-
 +
! Q. NO. 
 +
! Program 
 +
! Practical No. 
 +
! Remarks
 +
|-
 +
| style="width: 8%"  | 1
 +
| style="width: 60%" | Writing/Review of Chapter 1, Chapter 3, and Chapter 4 of Project Report
 +
| style="width: 15%" |  Project Work
 +
|
 +
|}
 +
 
 +
== '''Lab 4:''' ( week of 7<sup>th</sup> October 2024 ) ==
 +
{| class="wikitable" style="text-align: justify; width: 100%;"
 +
|-
 +
! Q. NO. 
 +
! Program 
 +
! Practical No. 
 +
! Remarks
 +
|-
 +
| style="width: 8%"  | 1
 +
| style="width: 60%" | Apply simple K-means algorithm for clustering any dataset. Compare the performance of clusters by varying the algorithm parameters. For a given set of parameters, plot a line graph depicting MSE obtained after each iteration.
 +
| style="width: 15%" |  Practical No. 3
 +
| '''Dataset:''' [http://mkbhandari.com/mkwiki/data/fall2024/dm/datasets/Mall_Customers.csv '''Mall_Customers.csv'''] <br>
 +
'''Download from Kaggle:''' [https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python Mall Customer Segmentation Data]
 +
|}
 +
 
 +
== '''Projects''' ==
 +
{| class="wikitable" style="text-align: justify; width: 100%;"
 +
|-
 +
! Team No. 
 +
! Project Title 
 +
! Team Members
 +
! Outcomes/Remarks
 +
|-
 +
| style="width: 8%"  | 1
 +
| style="width: 45%" | Understanding the Monsoon Pattern in Eastern Gangetic Plain
 +
| style="width: 25%" | 
 +
# '''Akshary Sharma (25019)'''
 +
# Abhay Yadav (25040)
 +
# Anuj Gupta (25042)
 +
# Amar Kumar (25065)
 +
# Kunal Verma (25073)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|2|| NIRF Ranking Prediction||
 +
# '''Abhishek Prasad (25007)'''
 +
# Vishal Kumar (25014)
 +
# Nitish Kumar (25023)
 +
# Anshu Kumar Dubey (25036)
 +
# Sunny Chauhan (25050)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|3|| Student Performance Prediction ||
 +
# '''Himanshu Kumar (25016)'''
 +
# Kanan Pal (25072)
 +
# Khushboo Yadav (25082)
 +
# Diksha Joshi (25091)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|4|| FIFA Prediction ||
 +
# Arihant (25003)
 +
# '''Ayush Pundir (25027)'''
 +
# Pratyush (25060)
 +
# Ashish (25066)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|5|| Breast Cancer Prediction ||
 +
# Vidhan (25044)
 +
# '''Sandeep Kumar Sharma (25047)'''
 +
# Ayushman Pandey (25094)
 +
# Tanishk Panchal (25095)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|6|| YouTube spam comments classification ||
 +
# Devesh Chauhan (25011)
 +
# Shatrughan  (25084)
 +
# Om Ranjan (25085)
 +
# '''Aman Sagar (25086)'''
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|7|| Olympic Data Analysis and Prediction ||
 +
# Kusum (25002)
 +
# '''Aditya Kumar (25012)'''
 +
# Divyanshi (25021)
 +
# Tushar Rana (25064)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|8|| Credit Card Fraud Detection ||
 +
# Ritesh Dhawan (25037)
 +
# Bitthal Varshney (25041)
 +
# Ansh Raj (25081)
 +
# '''Uday Raj Verma (25083)'''
 +
# Astitwa Rawat (25088)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|9|| CreditMap: Exploring Credit Score Patterns through Data Mining ||
 +
# Himanshu Singh (25017)
 +
# '''Garvit Kumar (25018)'''
 +
# Mayank  (25022)
 +
# Abhishek Kumar Singh (25032)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|10|| Movie Recommendation System ||
 +
# Tanya Agrahari (25030)
 +
# Prakash Mishra (25035)
 +
# '''Adarsh Singh (25074)'''
 +
# Shivam Verma (25078)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|-
 +
|11|| Wine Quality Prediction ||
 +
# '''Shivam Soni (250xx)'''
 +
# Kashif (250xx)
 +
# Akash Pathak (250xx)
 +
# Priyanshu Sachan (250xx)
 +
|
 +
* Dataset:
 +
* Report:
 +
* Project Presentation:
 +
|}

Revision as of 13:06, 15 December 2024

Instructions

  • Please be on time to avoid the Attendance Penalty.
  • Please put your mobile phone on Silent Mode.
  • Submit each lab assignment in Google Classroom for evaluation (assignments are announced there lab by lab; submit before the deadline).
  • Shut down your assigned computer and arrange the chair before you leave the lab.

Guidelines

Lab 0: Getting Started ( week of 05th & 12th August 2024 )

Q. NO. Program Practical No. Remarks
1 https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial1/tutorial1.html Practice Set No. 1 Introduction to Python
2 https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial2/tutorial2.html Practice Set No. 2 Introduction to Numpy and Pandas
3 https://www.cse.msu.edu/~ptan/dmbook/tutorials/tutorial3/tutorial3.html Practice Set No. 3 Data Exploration

Lab 1: ( week of 19th & 26th August 2024 )

Q. NO. Program Practical No. Remarks
1 Apply data cleaning techniques to any dataset (e.g. the Chronic Kidney Disease dataset from the UCI repository). Techniques may include handling missing values, outliers, and inconsistent values. You may also specify a set of validation rules for the chosen dataset and run validation checks against them. Practical No. 1 Dataset: kidneyDisease.csv

Download from Kaggle: Chronic Kidney Disease dataset
Tutorial: Tutorial on Handling Missing values
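
As a rough illustration of the Lab 1 task, the sketch below walks through the three steps named in the practical (missing values, outliers, validation rules) on a tiny in-memory table. The column names `age` and `bp`, the sample values, the choice of median imputation and the 1.5 × IQR rule, and the rule that age must lie between 0 and 120 are all assumptions made for this example; none of them come from kidneyDisease.csv. Only the Python standard library is used so that each step stays visible.

```python
from statistics import median, quantiles

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 48, "bp": 80},
    {"age": None, "bp": 70},
    {"age": 62, "bp": None},
    {"age": 51, "bp": 90},
    {"age": 250, "bp": 80},   # inconsistent: breaks the age rule used below
    {"age": 55, "bp": 400},   # extreme value
]

def count_missing(rows, col):
    """Number of records whose value in `col` is missing."""
    return sum(1 for r in rows if r[col] is None)

def impute_median(rows, col):
    """Return new records with missing `col` values replaced by the median
    of the observed values (one common choice among several)."""
    fill = median(r[col] for r in rows if r[col] is not None)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]

def iqr_outliers(rows, col, k=1.5):
    """Indices of records whose `col` value lies outside
    [Q1 - k*IQR, Q3 + k*IQR]. Assumes no missing values remain."""
    values = [r[col] for r in rows]
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < low or v > high]

def validate(rows, rules):
    """Return (row_index, column) pairs that fail a validation rule.
    `rules` maps a column name to a function returning True for valid values."""
    return [(i, col)
            for i, r in enumerate(rows)
            for col, is_valid in rules.items()
            if not is_valid(r[col])]

cleaned = impute_median(impute_median(rows, "age"), "bp")
print("missing age before:", count_missing(rows, "age"))
print("bp outlier rows:", iqr_outliers(cleaned, "bp"))
print("rule violations:", validate(cleaned, {"age": lambda v: 0 <= v <= 120}))
```

In the lab itself the same steps would more likely be applied to a pandas DataFrame loaded from the CSV file; the order (count, impute, flag outliers, validate) is one reasonable sequence rather than a prescribed one.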

Lab 2: ( week of 2nd & 9th September 2024 )

Q. NO. Program Practical No. Remarks
1 Apply data pre-processing techniques such as standardization/normalization, transformation, aggregation, discretization/binarization, and sampling to any dataset. Practical No. 2 Dataset: rain.csv

Download from data.gov.in: Rainfall in India
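
The pre-processing techniques listed for Lab 2 can be sketched in a few lines each. The example below is a minimal illustration on a made-up list of rainfall values; the numbers, the function names, the bin count, and the threshold are assumptions for this example and are not taken from rain.csv. It uses only the standard library, and it assumes the values are not all identical (otherwise the scaling steps would divide by zero).

```python
import random
from statistics import mean, pstdev

# Hypothetical monthly rainfall values in mm.
rainfall = [12.0, 0.0, 35.5, 8.2, 120.4, 60.1]

def min_max(values):
    """Normalization: rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardization: subtract the mean, divide by the population std. dev."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def equal_width_bins(values, n_bins):
    """Discretization: map each value to a bin index 0 .. n_bins-1,
    with all bins covering the same width of the value range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def binarize(values, threshold):
    """Binarization: 1 if the value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

def total_by_group(pairs):
    """Aggregation: sum the values of (key, value) pairs per key."""
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0.0) + value
    return totals

def sample_without_replacement(values, k, seed=0):
    """Sampling: draw k distinct items; the seed makes the draw repeatable."""
    return random.Random(seed).sample(values, k)

print(min_max(rainfall))
print(equal_width_bins(rainfall, 3))
print(binarize(rainfall, 10.0))
print(total_by_group([("A", 1.0), ("B", 2.0), ("A", 3.0)]))
```

Which technique is appropriate depends on the attribute and on the later analysis; for example, min-max scaling is sensitive to a single extreme value, while equal-width bins can leave most records in one bin when the data is skewed.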

Lab 3: ( week of 16th, 23rd & 30th September 2024 )

Q. NO. Program Practical No. Remarks
1 Writing/Review of Chapter 1, Chapter 3, and Chapter 4 of Project Report Project Work

Lab 4: ( week of 7th October 2024 )

Q. NO. Program Practical No. Remarks
1 Apply simple K-means algorithm for clustering any dataset. Compare the performance of clusters by varying the algorithm parameters. For a given set of parameters, plot a line graph depicting MSE obtained after each iteration. Practical No. 3 Dataset: Mall_Customers.csv

Download from Kaggle: Mall Customer Segmentation Data
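
For the Lab 4 task, one way to obtain the MSE after each iteration is to write the K-means loop by hand and record the error at every assignment step. The sketch below does this in plain Python for points given as tuples; the function names, the random initialization from the data points, the toy points, and the handling of empty clusters (the old centroid is kept) are assumptions of this example, not requirements of the practical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=20, seed=0):
    """Simple K-means. Returns (centroids, labels, mse_history), where
    mse_history[i] is the mean squared distance of every point to its
    assigned centroid in iteration i."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # pick k data points as initial centroids
    mse_history = []
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        mse = sum(sq_dist(p, centroids[l])
                  for p, l in zip(points, labels)) / len(points)
        mse_history.append(mse)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])   # empty cluster: keep as is
        if new_centroids == centroids:               # converged
            break
        centroids = new_centroids
    return centroids, labels, mse_history

# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, labels, mse_history = kmeans(points, k=2)
print(labels)
print(mse_history)
```

The `mse_history` list can be passed to a plotting library (for example `matplotlib.pyplot.plot`) to draw the line graph, and calling `kmeans` with different values of `k`, `seed`, or `max_iter` gives the runs to compare.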

Projects

Team No. Project Title Team Members Outcomes/Remarks
1 Understanding the Monsoon Pattern in Eastern Gangetic Plain
  1. Akshary Sharma (25019)
  2. Abhay Yadav (25040)
  3. Anuj Gupta (25042)
  4. Amar Kumar (25065)
  5. Kunal Verma (25073)
  • Dataset:
  • Report:
  • Project Presentation:
2 NIRF Ranking Prediction
  1. Abhishek Prasad (25007)
  2. Vishal Kumar (25014)
  3. Nitish Kumar (25023)
  4. Anshu Kumar Dubey (25036)
  5. Sunny Chauhan (25050)
  • Dataset:
  • Report:
  • Project Presentation:
3 Student Performance Prediction
  1. Himanshu Kumar (25016)
  2. Kanan Pal (25072)
  3. Khushboo Yadav (25082)
  4. Diksha Joshi (25091)
  • Dataset:
  • Report:
  • Project Presentation:
4 FIFA Prediction
  1. Arihant (25003)
  2. Ayush Pundir (25027)
  3. Pratyush (25060)
  4. Ashish (25066)
  • Dataset:
  • Report:
  • Project Presentation:
5 Breast Cancer Prediction
  1. Vidhan (25044)
  2. Sandeep Kumar Sharma (25047)
  3. Ayushman Pandey (25094)
  4. Tanishk Panchal (25095)
  • Dataset:
  • Report:
  • Project Presentation:
6 YouTube spam comments classification
  1. Devesh Chauhan (25011)
  2. Shatrughan (25084)
  3. Om Ranjan (25085)
  4. Aman Sagar (25086)
  • Dataset:
  • Report:
  • Project Presentation:
7 Olympic Data Analysis and Prediction
  1. Kusum (25002)
  2. Aditya Kumar (25012)
  3. Divyanshi (25021)
  4. Tushar Rana (25064)
  • Dataset:
  • Report:
  • Project Presentation:
8 Credit Card Fraud Detection
  1. Ritesh Dhawan (25037)
  2. Bitthal Varshney (25041)
  3. Ansh Raj (25081)
  4. Uday Raj Verma (25083)
  5. Astitwa Rawat (25088)
  • Dataset:
  • Report:
  • Project Presentation:
9 CreditMap: Exploring Credit Score Patterns through Data Mining
  1. Himanshu Singh (25017)
  2. Garvit Kumar (25018)
  3. Mayank (25022)
  4. Abhishek Kumar Singh (25032)
  • Dataset:
  • Report:
  • Project Presentation:
10 Movie Recommendation System
  1. Tanya Agrahari (25030)
  2. Prakash Mishra (25035)
  3. Adarsh Singh (25074)
  4. Shivam Verma (25078)
  • Dataset:
  • Report:
  • Project Presentation:
11 Wine Quality Prediction
  1. Shivam Soni (250xx)
  2. Kashif (250xx)
  3. Akash Pathak (250xx)
  4. Priyanshu Sachan (250xx)
  • Dataset:
  • Report:
  • Project Presentation: