Topic outline

  • Welcome to Data Mining and Machine Learning Lab

    Dear Students,

    Welcome to the Data Mining and Machine Learning Lab course! I am Al Amin Biswas. I look forward to the opportunity to learn and grow together with you in this course. Your success is what matters most to me as a teacher. Please don't hesitate to contact me if you have difficulties with the course materials or any other academic problems. To keep things running smoothly, general questions related to the course should be posted on the Discussion Forum. If you have a question that you would rather not post in the Forum, you can send it by email for a quick response. My contact information is given below. Success in an online class requires just as much work and effort as success in a traditional classroom. Lastly, I expect full cooperation from all of you. I hope we will enjoy this journey together.

    Stay safe and healthy. Thank you all for joining this course.

    Best Regards, 

    Al Amin Biswas

    Instructor Information 

    Name: Al Amin Biswas
    Designation: Lecturer
    Office Address: Room: AB4-505, Dept. of CSE, Ashulia Campus, DIU.
    Email: alamin.cse@diu.edu.bd
    Teacher Initial: AAB
    Phone: +8801740071456

    Google Classroom Code: 5sqs7d3

    Google Meet Link: https://meet.google.com/tsu-cwrc-pmr


    Counselling Hour


    Course Rationale
    An introduction to data mining; data preparation, model building, and data mining techniques such as clustering, decision trees, and neural networks; induction of predictive models from data: classification, regression, and probability estimation; application case studies; review and comparison of data mining software tools.

    Course Objectives

    • To apply the concept of data mining in solving problems
    • To demonstrate applications of data mining using tools
    • To apply knowledge of data mining in project work

    Course Outcomes (COs)

    • CO1 Able to demonstrate basic knowledge of Weka and Python for data mining and machine learning
    • CO2 Able to implement different data mining and machine learning algorithms like classification, prediction, clustering and association rule mining to solve real-world problems using Weka and Python
    • CO3 Able to compare and evaluate different data mining and machine learning algorithms like classification, prediction, clustering and association rule mining using Weka and/or Python
    • CO4 Able to apply implementation knowledge of data mining and machine learning in developing research ideas
    Grading Scheme
    Attendance: 10%
    Lab Performance: 25%

    Project / Lab Report: 25%
    Final Exam: 40%

    • Recommended Books
    1. Introduction to Data Mining and Applications
    2. Data Mining Concepts and Techniques
    3. Data Mining Techniques
    4. Data Mining Using Weka
    5. Weka Manual
    6. Data Mining Using Python
    • Global Data Repository for Data Mining and/or Machine Learning
    1. WISDM
    2. UCI ML Repository
    3. KDD Cup
    4. Kaggle
    5. KDnuggets

    • Standard Templates
    1. IEEE Template
    2. ACM Template

  • Week 1: Introduction to Weka

    Topics of Discussion

    • Introduction to Weka
    • Relationship to data mining
    • Overview of data mining with Weka
    • Data visualization in Weka


    Expected Learning Outcome

    • Appreciation of the need for data mining with Weka
    • Understanding of the relationship between Weka and data mining
    • Visualization of different data mining tasks with Weka
    • Visualization of data in Weka for data mining

  • Week 2: Classification Problem analysis and Performance Evaluation Procedure of Classifier

    Topics of Discussion

    • Review of data mining tasks and related application examples
    • Discussion of classification problems with real-life examples
    • Discussion of the binary confusion matrix generated by a classifier
    • Discussion of performance evaluation metrics
    • Course project team formation and discussion


    Expected Learning Outcome

    • Hands-on acquaintance with and practice of calculating performance evaluation metrics
    • Team formation for the course project
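
    The metric calculations practiced this week can be sketched in plain Python as follows. This is a minimal illustration, not course code: the function name `binary_metrics` and the four counts are assumptions chosen only for the example.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common metrics from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With these counts, accuracy is 70/100, precision is 40/50, and recall is 40/60, which is a convenient set of numbers to verify by hand.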

  • Week 3: Multiclass & Multilabel Problem, Regression Problem, and Evaluation Procedure

    Topics of Discussion

    • Discussion of multiclass and multilabel problems
    • Discussion of the multiclass confusion matrix
    • Discussion of performance evaluation metrics for the multiclass confusion matrix
    • Discussion of the ROC curve and micro- and macro-averaged precision, recall, and F1-score
    • Discussion of regression problems
    • Evaluation procedure for regression models
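
    The difference between macro and micro averaging on a multiclass confusion matrix can be sketched as below. This is an illustrative sketch under stated assumptions: rows are taken to be actual classes and columns predicted classes, and the function names and the 3x3 matrix are hypothetical.

```python
def per_class_counts(matrix):
    """Return (tp, fp, fn) per class; rows = actual, columns = predicted."""
    n = len(matrix)
    counts = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, actually other
        fn = sum(matrix[k]) - tp                       # actually k, predicted other
        counts.append((tp, fp, fn))
    return counts


def safe_div(a, b):
    return a / b if b else 0.0


def averaged_scores(matrix):
    counts = per_class_counts(matrix)
    # Macro: average the per-class scores, so every class weighs the same
    macro_p = sum(safe_div(tp, tp + fp) for tp, fp, _ in counts) / len(counts)
    macro_r = sum(safe_div(tp, tp + fn) for tp, _, fn in counts) / len(counts)
    # Micro: pool the counts first, so every sample weighs the same
    tp_sum = sum(tp for tp, _, _ in counts)
    fp_sum = sum(fp for _, fp, _ in counts)
    fn_sum = sum(fn for _, _, fn in counts)
    micro_p = safe_div(tp_sum, tp_sum + fp_sum)
    micro_r = safe_div(tp_sum, tp_sum + fn_sum)
    return {"macro_precision": macro_p, "macro_recall": macro_r,
            "micro_precision": micro_p, "micro_recall": micro_r}


# Hypothetical 3-class confusion matrix, for illustration only
confusion = [[5, 1, 0],
             [2, 3, 1],
             [0, 2, 6]]
scores = averaged_scores(confusion)
print(scores)
```

    In a single-label multiclass problem like this one, micro-averaged precision and recall both equal overall accuracy (here 14 correct out of 20), while the macro averages are pulled down by the weakest class.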

    Expected Learning Outcome

    • Hands-on acquaintance with and practice of the discussed topics
    • Hands-on calculation of performance evaluation metrics
    • Critical analysis of the results generated by classifiers and regression models
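
    For the regression part, a few widely used error measures can be computed by hand as in the sketch below. The outline does not say which regression metrics the lab uses, so the choice of MAE, MSE, RMSE, and R² here is an assumption, as are the function name and the sample values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R^2 for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    ss_res = sum(e * e for e in errors)                 # unexplained variation
    r2 = 1 - ss_res / ss_tot if ss_tot else 0.0
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical actual and predicted values, for illustration only
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 7.5, 10.0]
reg = regression_metrics(actual, predicted)
print(reg)
```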

  • Week 4: Classification Using Weka

    Topics of Discussion

    • Discussion of imbalanced datasets
    • Oversampling and undersampling in data analysis
    • SMOTE: Synthetic Minority Over-sampling Technique
    • Data Randomization in Weka after SMOTE
    • K-fold Cross Validation
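
    The core idea behind SMOTE, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in plain Python. This is a simplified illustration and not Weka's SMOTE filter: the function name `smote_sketch`, the default `k`, and the sample points are all assumptions.

```python
import random


def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Create synthetic points on line segments between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # random position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class points (two numeric features each)
minority_points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]
new_points = smote_sketch(minority_points, n_synthetic=4)
print(new_points)
```

    Because every synthetic point lies between two existing minority points, the new samples stay inside the region the minority class already occupies.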

    Expected Learning Outcome

    • Handling of imbalanced datasets
    • Necessity of randomization in Weka after SMOTE and the impact of K-fold cross validation on results
    • Problem-solving skills in classification and prediction
    • Skill in using Weka as a data mining tool for classification and prediction
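
    Why ordering matters for K-fold cross validation can be seen in a small sketch. It assumes, for illustration, that the oversampled minority instances sit together at the end of the dataset and that folds are cut as contiguous blocks; the function name `k_fold_indices` and the label list are hypothetical, and the sketch does not reproduce Weka's own evaluation procedure.

```python
import random


def k_fold_indices(n_samples, k, shuffle=True, seed=0):
    """Return a list of (train_indices, test_indices) pairs for k folds."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most 1
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, test))
        start += size
    return splits


# Hypothetical labels: six majority samples followed by six minority samples
labels = [0] * 6 + [1] * 6

unshuffled = k_fold_indices(len(labels), k=3, shuffle=False)
shuffled = k_fold_indices(len(labels), k=3, shuffle=True)

for name, splits in (("unshuffled", unshuffled), ("shuffled", shuffled)):
    for fold, (_, test) in enumerate(splits):
        print(name, "fold", fold, "test labels:", [labels[i] for i in test])
```

    Without shuffling, the first test fold contains only class 0 and the last only class 1, so a model is tested on a class distribution very different from the one it was trained on. Shuffling before splitting mixes the classes across folds.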

  • Week 5: Data Preprocessing and Classification Using Weka

    Topics of Discussion

    • Data discretization [Data Preprocessing in Weka]
    • Numeric Transform [Data Preprocessing in Weka]
    • Real-life classification and regression problem analysis

    Expected Learning Outcome

    • Problem-solving skills in classification and prediction
    • Skill in using Weka as a data mining tool for data preprocessing
    • Skill in using Weka as a data mining tool for classification and prediction
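
    Discretization turns a numeric attribute into a small number of intervals. One common scheme, equal-width binning, is sketched below in plain Python to show what happens to the values; the function name `equal_width_bins` and the sample ages are assumptions, and the sketch is not a reproduction of any particular Weka filter.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to the index of its equal-width interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)  # all values identical: a single bin
    bins = []
    for v in values:
        index = int((v - lo) / width)
        bins.append(min(index, n_bins - 1))  # the maximum goes in the last bin
    return bins


# Hypothetical numeric attribute, for illustration only
ages = [18, 22, 25, 31, 40, 58]
age_bins = equal_width_bins(ages, n_bins=4)
print(age_bins)  # [0, 0, 0, 1, 2, 3]
```

    Here the range 18 to 58 is cut into four intervals of width 10, so the three youngest ages share the first bin.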

  • Week 6: Feature Selection and Feature Ranking

    Topics of Discussion

    • Feature Selection and Feature Ranking
    • Types of Feature Selection and Their Significance
    • Wrapper, Filter, and Intrinsic Feature Selection

    Expected Learning Outcome

    • Understanding of the importance of Feature Selection and Feature Ranking in ML
    • Impact of Feature Selection and Feature Ranking on Model Performance
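
    As one example of the filter approach, features can be ranked by a score computed independently of any classifier. The sketch below uses the absolute Pearson correlation with the target as that score; this choice of score, the function names, and the small dataset are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0


def rank_features(features, target):
    """Return (name, score) pairs sorted from most to least relevant."""
    scored = {name: abs(pearson(column, target))
              for name, column in features.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical dataset: three candidate features and an exam score target
features = {
    "hours_studied": [1, 2, 3, 4, 5],
    "shoe_size": [7, 9, 8, 7, 9],
    "absences": [5, 4, 4, 2, 1],
}
exam_score = [50, 55, 65, 70, 80]

ranking = rank_features(features, exam_score)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

    A filter method would keep the top-ranked features and drop the rest. A wrapper method would instead train a model on candidate feature subsets and compare their performance.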

  • Week 7: Midterm Week


    Topics of Practice

    • Week 01 to Week 06

    Expected Learning Outcome

    • Cumulative outcome of week 01 to week 06

    • Week 8: Data Preprocessing using Python

      Topics of Discussion

      • IDE installation for Python programming
      • Missing values handling
      • Label Encoder, OneHotEncoder
      • Dataset Splitting
      • Feature Scaling

      Expected Learning Outcome

      • Appreciation of the need for machine learning with Python
      • Understanding of the relationship between Python and machine learning
      • Visualization of different machine learning tasks with Python
      • Data preprocessing using Python
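
      The preprocessing steps listed above can be sketched in plain Python to show what each one does to the data. In the lab these steps would normally be done with library classes such as the label encoder and one-hot encoder named in the topics; the helper functions, the mean-imputation strategy, the standardization formula, and the tiny dataset below are assumptions for illustration only.

```python
import random


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def label_encode(column):
    """Map each category to an integer code (codes follow sorted order)."""
    mapping = {c: i for i, c in enumerate(sorted(set(column)))}
    return [mapping[v] for v in column], mapping


def one_hot_encode(column):
    """Turn each category into a 0/1 vector with one position per category."""
    classes = sorted(set(column))
    return [[1 if v == c else 0 for c in classes] for v in column], classes


def split_indices(n_samples, test_ratio=0.25, seed=0):
    """Shuffle the row indices and split them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_ratio))
    return indices[n_test:], indices[:n_test]


def standardize(train, test):
    """Scale to zero mean and unit variance using training statistics only."""
    mean = sum(train) / len(train)
    std = (sum((v - mean) ** 2 for v in train) / len(train)) ** 0.5 or 1.0
    return [(v - mean) / std for v in train], [(v - mean) / std for v in test]


# Hypothetical columns, for illustration only
ages = [20, None, 40, 30]
cities = ["Dhaka", "Khulna", "Dhaka", "Sylhet"]

ages_filled = impute_mean(ages)
city_codes, city_mapping = label_encode(cities)
city_one_hot, city_classes = one_hot_encode(cities)
train_idx, test_idx = split_indices(len(ages_filled))
train_scaled, test_scaled = standardize([ages_filled[i] for i in train_idx],
                                        [ages_filled[i] for i in test_idx])
print(ages_filled, city_codes, city_one_hot, train_scaled, test_scaled)
```

      Note that the scaling statistics come from the training rows only and are then reused for the test rows, so no information from the test set leaks into training.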

    • Week 9: Regression using Python


      Topics of Discussion

      • Linear Regression
      • Multiple Linear Regression
      • Predictive model's output analysis

      Expected Learning Outcome

      • Implementation of Linear Regression and Linear Problem Solving
      • Implementation of Multiple Linear Regression and Output analysis of predictive models
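
      Simple linear regression fits a line y = slope * x + intercept by least squares. A minimal sketch of the closed-form solution in plain Python is given below; the function names and the data points are assumptions chosen so the result is easy to check by hand. Multiple linear regression extends the same idea to one coefficient per input feature.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance = sum((a - mean_x) ** 2 for a in x)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_line(x_value, slope, intercept):
    return slope * x_value + intercept


# Hypothetical data that lie exactly on the line y = 2x + 1
x_values = [1, 2, 3, 4, 5]
y_values = [3, 5, 7, 9, 11]

slope, intercept = fit_simple_linear_regression(x_values, y_values)
print(slope, intercept)                    # 2.0 1.0
print(predict_line(6, slope, intercept))   # 13.0
```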

    • Week 10: Classification using Python

      Topics of Discussion

      • Real-life classification problems
      • Classifier: Logistic Regression
      • Model's Performance: Accuracy, Precision, Recall, F1-Score
      • Receiver Operating Characteristic (ROC) curve
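
      An ROC curve is traced by lowering the decision threshold step by step and recording the false positive rate and true positive rate at each step. The sketch below does this in plain Python and computes the area under the curve with the trapezoidal rule. It assumes all scores are distinct (ties are not handled); the function names, labels, and scores are hypothetical.

```python
def roc_points(y_true, scores):
    """Return the (FPR, TPR) points of the ROC curve, assuming distinct scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:  # each step lowers the threshold past one sample
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical true labels and classifier scores, for illustration only
true_labels = [1, 1, 0, 1, 0, 0]
pred_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(true_labels, pred_scores)
auc_value = area_under_curve(curve)
print(curve)
print(round(auc_value, 4))
```

      The area equals the fraction of (positive, negative) pairs in which the positive sample receives the higher score, here 8 of 9 pairs.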

      Expected Learning Outcome

      • Problem-solving skills in classification and prediction
      • Performance analysis of the classifier
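
      To make the classifier itself concrete, logistic regression on a single feature can be trained with batch gradient descent in a few lines of plain Python. This is a teaching sketch, not the library implementation used in the lab: the function names, learning rate, epoch count, and the pass/fail data are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict_class(x, w, b, threshold=0.5):
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Hypothetical data: hours studied and whether the student passed (1) or not (0)
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

weight, bias = train_logistic(hours, passed)
predictions = [predict_class(h, weight, bias) for h in hours]
accuracy = sum(p == y for p, y in zip(predictions, passed)) / len(passed)
print(weight, bias, predictions, accuracy)
```

      The predicted labels can then be compared with the true labels to build a confusion matrix and compute accuracy, precision, recall, and F1-score.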

    • Week 11: Presentation of Project (Using Python or Weka)

      Project Presentation


      • Live demonstration of project
      • Questions regarding the project

    • Week 12: Final Examination

      Semester Final Examination Week


      Topics to be included in final exam:

      • Final Viva
      • Lab Final (Written)

      • Lab Final Question and Answer Assignment