Indian Institute of Technology, Kharagpur
Department of Industrial and Systems Engineering
Syllabus for Statistical Learning with Applications (IM31202)
Course Title: Statistical Learning with Applications (IM31202)
Credits: 3-1-0 (4 Credits)
Instructor: Sayak Roychowdhury, PhD; Assistant Professor; Department of Industrial &
Systems Engineering
Contact: Tel: 84754; email: sroychowdhury@iem.iitkgp.ac.in
TA: Bhosale Akshay Tanaji, Research Scholar (ISE), bhosaleakshay78@iitkgp.ac.in
Course Objective: With the explosion of “Big Data” problems, statistical learning has
become relevant in many scientific areas as well as marketing, finance, and other business
disciplines. This is true for systems engineering and operations management problems as well.
This course is created for ISE undergraduate students as a core subject, to introduce the
fundamental techniques of statistical learning and applications in the industrial engineering
domain. It will cover feature engineering and data-driven analytical, predictive, classification,
and tree-based models. The course will help students develop the skills needed to derive key
insights from data using the concepts of statistical learning.
Textbook: James, G., Witten, D., Hastie, T., & Tibshirani, R. An Introduction to
Statistical Learning, 2nd Edition. New York: Springer.
Reference Books:
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: data
mining, inference, and prediction. Springer Science & Business Media.
Leskovec, J., Rajaraman, A., & Ullman, J. D. (2020). Mining of massive data sets.
Cambridge University Press.
Haykin, S. S. (2009). Neural networks and learning machines.
Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
Course Content:
Introduction to Statistical Learning (Week 1)
What Is Statistical Learning?, Why Estimate f?, How Do We Estimate f?, Maximum
Likelihood Estimation, Model Interpretability, Supervised Versus Unsupervised Learning,
Regression Versus Classification Problems, Assessing Model Accuracy, Measuring the
Quality of Fit, Bias-Variance Trade-Off, The Classification Setting, Case Examples in
Industrial and Systems Engineering
Linear Regression (Week 2)
Simple Linear Regression, Multiple Linear Regression, Estimating the Coefficients,
Assessing the Accuracy of the Model, Qualitative Predictors, Extensions of the Linear
Model, Potential Problems, Case Examples
Classification (Weeks 3-4)
Overview, Logistic Regression, Multiple Logistic Regression, Logistic Regression for >2
Response Classes, Linear Discriminant Analysis, Bayes’ Theorem for Classification,
Linear Discriminant Analysis for p = 1 and p > 1, Quadratic Discriminant Analysis,
Comparison of Classification Methods, Case Examples
Resampling Methods (Week 5)
Cross-Validation, k-Fold Cross-Validation, Bootstrap, Case Examples
Model Selection and Regularization (Week 6)
Subset Selection, Choosing the Optimal Model (AIC, BIC), Ridge Regression, The Lasso,
Dimension Reduction Methods, Principal Components Regression, High-Dimensional Data,
Case Examples
Non-linear Regression (Week 7)
Polynomial Regression, Step Functions, Basis Functions, Regression Splines,
Piecewise Polynomials, Constraints and Splines, Smoothing Splines, Local Regression,
Generalized Additive Models (GAM), Case Examples
Tree-Based Methods (Weeks 8-9)
The Basics of Decision Trees, Regression Trees, Classification Trees, Trees Versus
Linear Models, Advantages and Disadvantages of Trees, Bagging, Random Forests,
Boosting, Case Examples
Statistical Learning using ANN (Week 10)
Feed-forward Network Functions, Network Training, Error Backpropagation, The
Hessian Matrix, Regularization, Mixture Density Networks, Examples
Support Vector Machines (Week 11)
Maximal Margin Classifier, What Is a Hyperplane?, Separating Hyperplane, Support
Vector Classifiers, Support Vector Machines, Classification with Non-linear Decision
Boundaries, SVMs with More than Two Classes, One-Versus-One/All Classification,
Case Examples
Unsupervised Learning (Week 12)
The Challenge of Unsupervised Learning, Principal Components Analysis, K-Means
Clustering, Hierarchical Clustering, Model-Based Clustering, Case Examples