21-Day Data Science & AI Roadmap
Day 1: Introduction to Data Science & AI
Understand the difference between Data Science, Machine Learning, Artificial Intelligence,
and Deep Learning with real-world use cases.
Day 2: Python Basics
Learn Python syntax, data types, loops, functions, and conditional statements.
Day 3: Python for Data Science - NumPy
Work with arrays, broadcasting, reshaping, and indexing using the NumPy library.
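A minimal sketch of reshaping, broadcasting, and indexing, assuming NumPy is installed. The array values and variable names are made up for illustration.

```python
import numpy as np

# Reshaping: turn a flat array of 6 values into a 2x3 matrix.
matrix = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]

# Broadcasting: the 1-D row (shape (3,)) is stretched across both rows.
row = np.array([10, 20, 30])
shifted = matrix + row                   # [[10, 21, 32], [13, 24, 35]]

# Indexing: a single element, a column slice, and a boolean mask.
element = matrix[1, 2]                   # row 1, column 2
second_column = matrix[:, 1]             # every row, column 1
evens = matrix[matrix % 2 == 0]          # only the even entries, flattened
```

Broadcasting works here because the trailing dimension of `matrix` (3) matches the length of `row`.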
Day 4: Data Handling with Pandas
Load and explore datasets with DataFrames, then filter, group, and aggregate the data.
Day 5: Data Cleaning & Preprocessing
Handle missing values and duplicates, and encode categorical variables.
Day 6: Data Visualization
Use Matplotlib and Seaborn to create visualizations like bar plots, histograms, and
heatmaps.
Day 7: Descriptive Statistics
Understand and calculate mean, median, standard deviation, and correlations.
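A small sketch using only Python's standard library. The `hours` and `scores` samples are invented for illustration, and `pearson` is a hand-written helper, not a library function.

```python
import math
import statistics

# Hypothetical sample: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]

mean_score = statistics.mean(scores)        # central tendency
median_score = statistics.median(scores)    # middle value, robust to outliers
std_score = statistics.stdev(scores)        # sample standard deviation (divides by n - 1)

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(hours, scores)                  # close to 1: strong positive relationship
```

In practice pandas exposes the same quantities on a DataFrame; computing them by hand once makes the formulas concrete.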
Day 8: Introduction to Machine Learning
Understand supervised learning, the ML workflow, and how to split datasets into training and test sets.
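The splitting step can be sketched without any libraries. `holdout_split` is a hypothetical helper written for this example (scikit-learn offers its own splitting utilities), and the dataset is made up.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly, then hold out a fraction for testing."""
    shuffled = rows[:]                       # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # fixed seed gives the same split every run
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: (feature, label) pairs.
data = [(x, x % 2) for x in range(10)]
train, test = holdout_split(data, test_fraction=0.2)
```

The model is fitted on `train` only; `test` stays unseen until the final evaluation.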
Day 9: Logistic Regression for Classification
Train and evaluate a binary classification model using logistic regression.
Day 10: Model Evaluation Metrics
Use metrics like accuracy, precision, recall, F1-score, and the confusion matrix.
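To show how these metrics relate to the confusion matrix, here is a plain-Python sketch for the binary case. The function names and the label lists are assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Precision asks "of the predicted positives, how many were right?", while recall asks "of the actual positives, how many were found?".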
Day 11: Decision Trees & Random Forests
Learn decision trees and how random forests combine many trees to improve accuracy.
Day 12: Feature Engineering & Selection
Create new features, scale values, and use feature importance to select key variables.
Day 13: Unsupervised Learning - K-Means Clustering
Group similar data points using clustering algorithms and the elbow method.
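A rough sketch of K-Means (Lloyd's algorithm) and the elbow method on one-dimensional data, in plain Python. The helper `kmeans_1d`, its deterministic initialisation, and the sample points are all assumptions chosen to keep the example short. Real projects would typically use a library implementation on multi-dimensional data.

```python
def kmeans_1d(points, k, iterations=20):
    """Lloyd's algorithm on 1-D data; returns (centroids, inertia)."""
    ordered = sorted(points)
    # Deterministic start: spread the k initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    # Inertia: sum of squared distances from each point to its closest centroid.
    inertia = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, inertia

# Hypothetical data with three obvious groups around 1, 5, and 9.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
inertias = {k: kmeans_1d(points, k)[1] for k in range(1, 5)}
```

Plotting `inertias` against k, the curve drops sharply up to k = 3 and then flattens; that bend is the "elbow" that suggests three clusters for this data.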
Day 14: Mini Project - Titanic End-to-End
Apply all skills to predict Titanic survival: data cleaning, modeling, and evaluation.
Day 15: GitHub + Portfolio
Document and publish your notebook and results to GitHub with a proper README.
Day 16: Cross-Validation (CV)
Get more reliable estimates of model performance using K-Fold or Stratified K-Fold cross-validation.
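The fold bookkeeping behind K-Fold can be sketched as follows. `k_fold_indices` is a hypothetical helper for illustration; it neither shuffles nor stratifies, both of which a library implementation would offer.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
```

Every sample lands in exactly one validation fold, so averaging the k validation scores uses all of the data for evaluation without ever scoring a model on rows it was trained on.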
Day 17: Hyperparameter Tuning
Use GridSearchCV and RandomizedSearchCV to search for well-performing hyperparameter values.
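The idea behind an exhaustive grid search can be sketched in a few lines. This is not GridSearchCV itself: `grid_search` and `toy_score` are hypothetical, and `toy_score` stands in for a cross-validated model score.

```python
import itertools

def grid_search(param_grid, score_fn):
    """Try every combination in param_grid and return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical stand-in for a cross-validated score; it peaks at
# max_depth=5 and n_estimators=100 by construction.
def toy_score(max_depth, n_estimators):
    return 1.0 - 0.01 * abs(max_depth - 5) - 0.001 * abs(n_estimators - 100)

grid = {"max_depth": [3, 5, 7], "n_estimators": [50, 100, 200]}
best_params, best_score = grid_search(grid, toy_score)
```

A grid of 3 x 3 values means 9 evaluations; the cost grows multiplicatively with each added hyperparameter, which is why a randomized search that samples only some combinations is often preferred for large grids.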
Day 18: Ensemble Learning
Combine multiple models using VotingClassifier, Bagging, and Boosting methods.
Day 19: PCA (Dimensionality Reduction)
Reduce the number of features while retaining the most important information.
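One common way to compute PCA is through the singular value decomposition of the centered data. The sketch below assumes NumPy is installed; the `pca` helper and the sample matrix are made up for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project X (samples x features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_variance = singular_values ** 2 / (len(X) - 1)
    ratio = explained_variance / explained_variance.sum()
    return X_centered @ Vt[:n_components].T, ratio[:n_components]

# Hypothetical data: the second feature is roughly twice the first,
# so a single component should capture almost all of the variance.
X = np.array([[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]])
X_reduced, explained_ratio = pca(X, n_components=1)
```

The explained variance ratio tells you how much information each kept component retains, which is the usual guide for choosing `n_components`.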
Day 20: Model Explainability (SHAP/LIME)
Interpret and visualize why your model made specific predictions using SHAP or LIME.
Day 21: Streamlit App or Advanced Project
Deploy your model using Streamlit, or start a project on a new dataset to apply what you've
learned.