WIP - ML-22-DEC Weekend
Course overview
Organization interest in Data science/ML/Python
Core spaces of DS, AI, ML, DL
BIG DATA definition, Evolution, drivers
V’s of BIG DATA
BIG DATA - Frameworks
Data Science Roles, skills
Overview
Intro on Data Science with Python
Basic Stats
•Observations and Variables
•Types of Variables
•Central Tendency
•Distribution of the Data
•Confidence Intervals
•Multi-collinearity
demo of code samples
Hypothesis Tests
STATS
overview, code demo
STATS
Pearson’s Correlation Coefficient - theory
code - demo
Spearman’s Rank Correlation
code - demo
Student’s t-test - theory
code - demo
Paired Student’s t-test
code - demo
Chi2 - theory
code - demo
code - demo
Kendall-Tau - theory
code - demo
Analysis of Variance Test (ANOVA) - theory
code - demo
code - demo
Normality Tests
using visual plots
UNDERSTANDING RELATIONSHIPS
Summary tables
Specific calculations
Visualization tools
GENERATING GROUPS **
Theory on clustering
code demo -
basic operators
Python: Environment
Day 1 Setup & Basic
Python Data structures - lists, tuples, sets, Strings and dicts
Functions & lambda functions
List comprehensions
Exception handling
Dataframes
Day 2 PANDAS aggregation
merge
quick-tips
Exercises
Day 2 PANDAS
Numpy - Basics
Day 2 Numpy
Numpy - Basics
Plotting - basics
Day 2 plotting
Plotting from pandas
Parallel plots
Intro to ML
Overview of KNN
Day 2 KNN, Overview
KNN - optimum K
KNN
KNN, Data
Day 3 preprocessing
KNN - advanced tuning (algorithm, recommendation)
KNN, Data
Day 3 preprocessing
KNN - questions
NB vs KNN
NB - on adult income data, EDA, Preproc, Normality
polynomial regression
Polynomial Regression
Overview
Example code -
Stepwise Regression
Regularization - overview
Vectors - Norms
Ridge Regression
Alpha selection (using Yellowbrick)
Lasso Regression
ElasticNet Regression
Logit - Multi-nomial
code (using iris data)
code
Simple DT explanation
Overview
code
Grid Search
code
SVM - overview
Types of SVMs
Matrix Operations
Significance to DS/ML
Gradient Descent
(foundation for advanced machine learning algorithms and deep learning)
Divisive clustering
Basics
code
2 - KMEANS - Overview
KMEANS - understanding (using excel)
ML-K-MEANS-00-basics.ipynb
ML-K-MEANS-01-basics.ipynb
3 - DBSCAN - overview
code
UNSUP - metrics
code - demo on metric for clusterings
code
code (KNN vs DBSCAN)
Comparison of algorithms
4 - Association Rules
OPTICS algorithm
Anomaly detection
Local Outlier Factor
Neural Networks (overview)
Autoencoders (overview)
Deep Belief Nets
Hebbian Learning
Generative Adversarial Networks
Self-organizing map
Machine Learning
- Feature selection
Overview
Filter method
Filter - Variance threshold 00
Filter - Variance threshold 01
Filter - correlation threshold
Wrapper method
SFS - wine quality
SFS - Iris
SBS - Iris
SFFS - Iris
SBFS - Iris
SFS - using parameters for all types
SFS - with regression problems
SFS - with grid search
SFS - select k-best features
Dimensionality reduction
- Feature Extraction
PCA - overview
PCA - Maths
PCA - Maths - Eigen values, vectors
PCA - basics
Day 5
SVD - Examples
Day 5
Linear Discriminant Analysis
LDA - examples
•Blending
•Bagging
ensemble.GradientBoostingRegressor
AdaBoost
XGBoost
Vectorization
Clustering - basics
Next Week
EDA/Preproc steps
Naïve Bayes
SVM
UNSUP - Kmeans
UNSUP - Agglomerative
UNSUP - DBSCAN
Ensemble learning
Pre-processing/EDA
Project review
Feature Engg (PCA, )
Text analytics
code - sampling-distribution-2-diff-SAMPLE-MEANS
code - sampling-distribution-3-SAMPLE-PROPORTION
overview
code - estimation-CI-0.ipynb
code - estimation-CI-1-t-DISTRIBUTION
code - EDA-Relationship-10-1-CORR-iris.ipynb
code - EDA-Relationship-10-3-t-test-iris.ipynb
code - EDA-Relationship-10-4-chi2-census.ipynb
code - EDA-Relationship-10-5-chi2-titanic.ipynb
code - EDA-Relationship-10-2-kendall-tau-iris.ipynb
code - EDA-Relationship-10-6-ANOVA-1-way.ipynb
code - EDA-Relationship-10-7-ANOVA-1-way.ipynb
code - ML-NB-12-adult-income-EDA-normality-tests
EDA-cleaning-the-data-outlier-detection-00.ipynb
EDA-cleaning-the-data-outlier-detection-01-Isolation Forest.ipynb
code - EDA-Transformation-01-categorical-to-numeric.ipynb
code - EDA-Transformation-02-scaling.ipynb
slides - 10 - 0 - ML.pptx
slides - 10 - 1 - ML - Metrics.pptx
slides - 10 - 0 - ML.pptx
python-core-01-basics.ipynb
python-data-structures-01-DICT
python-data-structures-02-LIST
python-data-structures-03-SETS
python-data-structures-04-STRINGS
python-data-structures-05-TUPLES
python-core-02-functions.ipynb
python-core-03-generators
python-core-04-Iterators
python-core-05-exception-handling
python-core-25-programs
python-core-26-easy-programs
05 - 05 Python - PANDAS
python-PANDAS-00-data types
python-PANDAS-01-series
python-PANDAS-02-dataframes1
python-PANDAS-03-dataframes2
python-PANDAS-04-aggregation1
python-PANDAS-06-merge
python-PANDAS-05-quick-tips
python-PANDAS-20-exercises
python-NUMPY-00
python-NUMPY-01
python-NUMPY-02
python-PLOTS-basic
python-PLOTS-map
python-PLOTS-pandas
python-PLOTS-parallel
slides - 10 - 0 DS with Python - ML
ebook - (410)
slides - 10 - 0 - ML.pptx
slides - 10 - 1 DS with Python - ML - SUP - KNN: understand KD tree and ball tree, compare with brute-force algorithm
excel - kd-tree.xlsx
excel - Ball-tree.xlsx
code - ML-KNN-24-KD-tree-basics-00
code - ML-KNN-24-KD-tree-basics-01
Brute-force, KD tree, Ball-tree - how they work
slides - 10 - 1 DS with Python - ML - SUP - KNN
code - ML-KNN-20-regression.ipynb
code - ML-KNN-21-regression-boston-housing-00.ipynb KNN reg vs lin reg
code - ML-KNN-21-regression-boston-housing-01.ipynb
ML-kNN-EDA-iris
ML-kNN-EDA-tips
ML-NB-10-iris
ML-NB-11-titanic
ML-LINREG-00-basics
regression, feature importance
regression metrics
ML-LINREG-10-advertising
ML-LINREG-12-auto-mpg regression
ML-LINREG-13-glass regression
code - ML-LINREG-16-head-size-1-sklearn.ipynb
code - ML-LINREG-16-head-size-2-custom-OLS-code.ipynb
code - ML-LINREG-02-STATSmodels-1-OLS.ipynb
ML-LINREG-25-optimize-advertising
slides - 10 - 3 DS with Python - ML - SUP - Linear Regression Dummy and Effect Coding
slides - 10 - 3 DS with Python - ML - SUP - Linear Regression
ML-POLY-REG-basics-00.ipynb
ML-POLY-REG-basics-01.ipynb
ML-POLY-REG-basics-02.ipynb
ML-POLY-REG-basics-03.ipynb
ML-POLY-REG-basics-04-basis-fn-regression.ipynb
ML-STEPWISE-REG-00-basics.ipynb
ML-STEPWISE-REG-using-lin-reg-10-house-prices.ipynb
ML-LINREG-06-SSE-using-matrix-formula.ipynb
ML-LINREG-07-using-matrix-formula
python-PLOTS-meshgrid-contour-00
00 - 12 DS - Maths.pptx
ML-LINREG-30-grad-des-contour-lines.ipynb
(below sections)
ML-SUP-REG-10-community-Ridge-Lasso-Logistic.ipynb
ML-SUP-REG-11-boston-housing-Ridge-Lasso-ElasticNet.ipynb
VIS-REG-02-Alpha-Selection
ML-LASSO-00-basics.ipynb
ML-LASSO-01-basics.ipynb
ML-LASSO-feature-selection-10-boston-housing.ipynb
Ridge to show shrinking, Lasso for feature selection
ML-LASSO-feature-selection-11-hitters-RIDGE-compare.ipynb
ML-LASSO-12-wine-taste.ipynb
ML-LASSO-13-wine-taste-detailed.ipynb
ML-ElasticNet-00-basics.ipynb
ML-ElasticNet-10-house-prices.ipynb
ML-ElasticNet-11-house-prices-detailed.ipynb
sigmoid-function.ipynb
ML-LOGIT-00-intro-glass-categorical
ML-LOGIT-11-titanatic
ML-LOGIT-12-bank-deposit-plan
ML-DECTREE-00-basics
ML-DECTREE-01-basics-clf-reg.ipynb
ML-DECTREE-02-basics-explain-tree.ipynb
ML-DECTREE-03-balance-scale-data-implement-tree.ipynb
ML-DECTREE-10-iris
ML-DECTREE-11-iris-graph
code - ML-DECTREE-20-Regression-boston-housing.ipynb
code - ML-DECTREE-21-Regression-dummy-data.ipynb
ML-GRID-search-00.ipynb
ML-GRID-search-10-iris.ipynb
00 - 12 DS - Maths.pptx
GRADIENT-DESCENT-00-Intro.ipynb Maths
GRADIENT-DESCENT-01-learning_rate.ipynb learning rate, high low value
GRADIENT-DESCENT-03-basics.ipynb
GRADIENT-DESCENT-04-basics.ipynb
GRADIENT-DESCENT-10-city-pop-food-truck.ipynb
GRADIENT-DESCENT-11-housing-all-variants.ipynb
slides - 10 - 0 - ML.pptx Theoretical understanding
ML-UNSUP-agg-00.ipynb
ML-UNSUP-agg-01-basics.ipynb
code - ML-K-MEANS-15-bad-init.ipynb
code - ML-K-MEANS-14-titanic-some-tuning.ipynb Few tuning tips, accuracy
custom k-means, sklearn version, weakness of k-means - varying number of clusters
code - ML-K-MEANS-13-xclara-custom-sklearn
slides - 20 - 1 - ML - UNSUP - DBSCAN
code - ML-UNSUP-DBSCAN-00
slides - 10 - 1 - ML - Metrics.pptx
code - ML-UNSUP-metrics
ML-UNSUP-DBSCAN-03-clustering-compare
slides -
ML-ENSEMBLE-models-00-pima
code - ML-FS-filter-VarianceThreshold-00.ipynb
code - ML-FS-filter-VarianceThreshold-01.ipynb
code - ML-FS-filter-corr-Threshold-00.ipynb
chi-ex1.xlsx
ML-FS-filter-selectKbest-chi2-10-iris.ipynb
ML-FS-filter-selectKbest-chi2-11-pima.ipynb
ML-FS-wrapper-SFS-10-wine-quality-RF
ML-FS-wrapper-SFS-11-iris-KNN
ML-FS-wrapper-SBS-11-iris-KNN
ML-FS-wrapper-SFFS-11-iris-KNN
ML-FS-wrapper-SBFS-11-iris-KNN
ML-FS-wrapper-SFS-12-iris-KNN-all-types
ML-FS-wrapper-SFS-13-regression-boston-data
ML-FS-wrapper-SFS-14-Grid-search-iris
ML-FS-wrapper-SFS-16-best-k-feature-wine
ML-FS-wrapper-RFE-10-pima.ipynb
ML-FS-wrapper-RFE-11-iris.ipynb
ML-FS-99-11-breast-cancer-using-RF.ipynb
code - ML-FE-PCA-00-MATHS-detailed
code - ML-FE-PCA-00-eigen vectors values.ipynb
ML-FE-PCA-00-Basics
ML-FE-PCA-01-PCA
ML-FE-PCA-02-PCA
iris, explained variance, varying PCs
ML-FE-PCA-03-PCA
ML-FE-PCA-04-PCA-inner-working-wip
code - ML-FE-PCA-10-PCA-kidney-disease
code - ML-FE-PCA-12-workings-of-pca-iris.ipynb
maths using numpy/scipy
slides - overview
slides - 00 - 02 DS - Feature Engg
ML-FE-LDA-00-basics.ipynb
ML-FE-LDA-11-iris.ipynb
slides - overview
ML-ENSEMBLE-00-scoring-methods.ipynb
slides - overview
ML-ENSEMBLE-02-stacking-custom.ipynb
ML-ENSEMBLE-03-stacking.ipynb
ML-ENSEMBLE-04-stacking-proba-as-meta-features.ipynb
ML-ENSEMBLE-05-stacking-GridSearch.ipynb
ML-ENSEMBLE-06-stackingCV-classification.ipynb
ML-ENSEMBLE-07-stackingCV-classification-proba-as-meta-features.ipynb
ML-ENSEMBLE-08-stackingCV-classification-grid-search.ipynb
ML-ENSEMBLE-10-stacking-Reg.ipynb
ML-ENSEMBLE-01-blending.ipynb
ML-ENSEMBLE-30-bagging-00.ipynb
slides - overview
code - ML-ENSEMBLE-60-GB-10-titanic-tuning.ipynb
code - ML-ENSEMBLE-60-GB-11-ca-housing-wip.ipynb
code - ML-ENSEMBLE-60-GB-12-boston-housing-regression.ipynb
slides - overview
ML-ENSEMBLE-models-00-pima.ipynb
TM-Sentiment-00-basics
TM-Sentiment-10-IMDb-movie-reviews
TM-lib-WordCloud-00-basics
TM-lib-WordCloud-01-basics
TM-Sentiment-11-US-GOP-Debate
http://sentiment.christopherpotts.net/
TM-CLUSTERING-00-basics
TM-CLUSTERING-01-basics
TM-CLUSTERING-03-detailed
TM-lib-GENSIM-00-basics.ipynb
TM-lib-GENSIM-01-basics.ipynb
TM-lib-GENSIM-02-basics.ipynb
TM-TOPIC-modelling-00.ipynb
TM-TOPIC-modelling-01.ipynb
TM-TOPIC-modelling-02.ipynb
TM-TOPIC-modelling-10-Simple
TM-TOPIC-modelling-11-20-newsgroups
TM-TOPIC-modelling-12-brown
0.5, 0.25, 0.5, 1, 0.5, 0.25, 0.25 (total: 3.25)
Status
2019, revisit HDBSCAN
Linear Algebra
Vectors, Matrices, and Systems of Linear Equations
Linear Transformations
Determinants and Eigenvalues
Inner Product Spaces, Orthogonal Projection, Least Squares
Singular Value Decomposition
Matrices
Calculus
Gradient Descent & Derivatives
Multivariate Calculus
Khan Academy Calculus
Khan Academy – Basic Matrix operations
Khan Academy – Linear Algebra