Data Science & AI Curriculum
1. Introduction to Python
● What is Python?
● Features of Python
● Python 2 vs Python 3
● Installing Python and IDEs (VS Code, PyCharm, Jupyter)
● Writing and running your first Python script
● Python interactive mode vs script mode
2. Python Basics
● Variables and data types
● int, float, str, bool, NoneType
● Comments (single-line, multi-line)
● Type casting
● Input/output functions: input(), print()
● f-strings and formatting strings
3. Operators
● Arithmetic operators
● Assignment operators
● Comparison (Relational) operators
● Logical operators
● Bitwise operators
● Identity operators (is, is not)
● Membership operators (in, not in)
● Operator precedence
4. Conditional Statements
● if, if-else, if-elif-else
● Nested if
● Short-hand if statements
● pass statement
5. Loops and Iteration
● while loop
● for loop
● range() function
● break, continue, pass
● Nested loops
● Loop else block
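The loop else block listed above is easy to misread: the else branch runs only when the loop finishes without hitting break. A minimal sketch, assuming a hypothetical helper find_first_negative that is not part of the curriculum:

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only if the loop completed without executing `break`.
        result = None
    return result


print(find_first_negative([3, 1, -4, 1, -5]))  # -4
print(find_first_negative([3, 1, 4]))          # None
```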
6. Data Structures in Python
6.1 Strings
● Creating and accessing strings
● String slicing and indexing
● String methods (e.g., upper(), split(), find())
● Membership testing with the in operator
● String formatting (format(), f-strings)
6.2 Lists
● Creating lists
● Indexing, slicing
● List methods (append(), insert(), remove(), sort())
● List comprehensions
6.3 Tuples
● Creating tuples
● Tuple unpacking
● Immutability
● Tuple methods
6.4 Sets
● Creating sets
● Set operations (union, intersection, difference)
● Set methods (add(), remove(), discard())
6.5 Dictionaries
● Creating dictionaries
● Accessing, updating, deleting items
● Dictionary methods (get(), items(), keys(), values())
● Dictionary comprehensions
7. Functions
● Defining functions with def
● Arguments and return values
Types of arguments:
● Default
● Keyword
● Arbitrary (*args, **kwargs)
● Recursion
● Lambda functions
● map(), filter(), reduce()
● zip(), enumerate()
8. Modules and Packages
● Importing built-in modules (math, random, datetime, etc.)
● Creating and using user-defined modules
● The if __name__ == "__main__" idiom
● from module import ... syntax
● Installing external packages using pip
9. File Handling
● Opening files (open())
● Reading and writing text files
● Reading and writing CSV files
● Working with file modes: 'r', 'w', 'a', 'rb', 'wb'
● Context manager (with statement)
● File methods: read(), readline(), readlines()
10. Exception Handling
● Syntax: try-except
● finally block
● else block
● Raising exceptions (raise)
● Built-in exceptions (e.g., ValueError, TypeError)
● Creating custom exceptions
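The pieces listed above (try-except, else, finally, raise, and a custom exception) can be combined in one short sketch. The names InsufficientFundsError, withdraw, and safe_withdraw are hypothetical and chosen only for illustration:

```python
class InsufficientFundsError(Exception):
    """Hypothetical custom exception used only in this illustration."""


def withdraw(balance, amount):
    if amount <= 0:
        raise ValueError("amount must be positive")  # built-in exception
    if amount > balance:
        raise InsufficientFundsError(f"cannot withdraw {amount} from {balance}")
    return balance - amount


def safe_withdraw(balance, amount):
    """Return (new_balance, log), recording which blocks ran."""
    log = []
    try:
        new_balance = withdraw(balance, amount)
    except InsufficientFundsError as exc:
        log.append(f"failed: {exc}")
        new_balance = balance
    else:
        log.append("ok")    # runs only when no exception was raised
    finally:
        log.append("done")  # always runs
    return new_balance, log


print(safe_withdraw(100, 30))   # (70, ['ok', 'done'])
print(safe_withdraw(100, 300))  # (100, ['failed: cannot withdraw 300 from 100', 'done'])
```

Note that the ValueError is deliberately not caught here, so it propagates to the caller after the finally block runs.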
11. Object-Oriented Programming (OOP)
● Classes and objects
● __init__ method (constructor)
● The self parameter
● Instance vs class variables
● Methods (instance, class, static)
● Inheritance
● Method overriding
● super() function
● Encapsulation, Abstraction, Polymorphism
● Dunder/Magic Methods (__str__, __repr__, etc.)
12. Advanced Python Topics
12.1 Iterators and Generators
● iter(), next()
● Creating custom iterators
● Generator functions using yield
● Generator expressions
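As a rough illustration of the topics above, the same countdown can be written as a generator function and as a custom iterator class. The names countdown and Countdown are made up for this sketch:

```python
def countdown(start):
    """Generator function: yields start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


class Countdown:
    """Custom iterator that produces the same values as countdown()."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = iter(Countdown(3))
print(next(it))                             # 3
print(list(countdown(3)))                   # [3, 2, 1]
print(list(n * n for n in countdown(3)))    # generator expression: [9, 4, 1]
```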
12.2 Decorators
● Function decorators
● Chaining decorators
● @property decorator
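A small sketch of the three decorator topics above. The decorators shout and exclaim, the function greet, and the class Circle are hypothetical examples, not part of any library:

```python
import functools
import math


def shout(func):
    """Decorator that upper-cases the wrapped function's return value."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


def exclaim(func):
    """Decorator that appends '!' to the wrapped function's return value."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) + "!"
    return wrapper


@shout      # outermost: applied last
@exclaim    # innermost: applied first
def greet(name):
    return f"hello, {name}"


class Circle:
    def __init__(self, radius):
        self._radius = radius

    @property
    def area(self):
        # Accessed as circle.area, without parentheses.
        return math.pi * self._radius ** 2


print(greet("ada"))    # HELLO, ADA!
print(Circle(2).area)
```

Chained decorators apply bottom-up, so greet is first wrapped by exclaim and the result is then wrapped by shout.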
12.3 Context Managers
● with statement
● Custom context managers using classes
● contextlib module
12.4 Regular Expressions
● re module
● match(), search(), findall(), sub()
● Meta-characters and special sequences
12.5 Comprehensions
● List, dict, set, and generator comprehensions
13. Working with External Libraries
● requests (HTTP requests)
● json (parsing JSON data)
● os, sys (OS-level operations)
● time, datetime
● shutil, glob
● argparse (CLI arguments)
14. Working with Data (Pandas + NumPy)
● Introduction to NumPy arrays and operations
Pandas:
● DataFrame and Series
● Reading/writing data (read_csv, to_csv)
● Indexing, filtering, sorting
● Handling missing data
● Grouping and aggregations
15. Multithreading and Multiprocessing
● threading module
● multiprocessing module
● Use cases and differences
16. Virtual Environments & Packaging
● Creating virtual environments using venv, virtualenv
● requirements.txt
● Creating and installing Python packages
● setup.py, __init__.py
17. Libraries for Data Science
● NumPy: Arrays and Mathematical Operations
● Pandas: DataFrames, Series, Data Manipulation
● Matplotlib and Seaborn: Data Visualization
● SciPy: Scientific Computing
18. Web Scraping
● Retrieving data from APIs
● Scraping with Beautiful Soup
2. Introduction to Statistics and Math for Data Science
3. Descriptive Statistics
3.1 Central Tendency
● Mean (Arithmetic, Geometric, Harmonic)
● Median
● Mode
3.2 Dispersion (Variability)
● Range
● Variance
● Standard deviation
● Interquartile Range (IQR)
● Coefficient of Variation
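The dispersion measures above can be computed with Python's standard statistics module. This is a sketch on a made-up sample; the quartiles use the module's default "exclusive" method, and other tools may use a different quartile convention:

```python
import statistics

data = [4, 8, 6, 5, 3, 9, 7]  # made-up sample for illustration

data_range = max(data) - min(data)
variance = statistics.variance(data)          # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)              # sample standard deviation
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles, default 'exclusive' method
iqr = q3 - q1
cv = std_dev / statistics.mean(data)          # coefficient of variation

print(data_range, iqr)  # 6 4.0
print(round(variance, 3), round(std_dev, 3), round(cv, 3))
```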
3.3 Shape of Data
● Skewness (positive, negative, symmetric)
● Kurtosis (leptokurtic, platykurtic, mesokurtic)
4. Data Types and Scales of Measurement
● Qualitative vs Quantitative data
● Discrete vs Continuous data
Levels of measurement:
● Nominal
● Ordinal
● Interval
● Ratio
5. Data Collection and Sampling
● Population vs Sample
Types of Sampling:
● Random sampling
● Stratified sampling
● Cluster sampling
● Systematic sampling
● Convenience sampling
● Bias and variability
6. Data Visualization
● Histogram
● Box plot
● Bar chart
● Pie chart
● Scatter plot
● Heatmap
● Pair plots (with seaborn/pandas)
7. Probability Theory
● Basic terminology (experiment, sample space, event)
● Types of events (independent, mutually exclusive, exhaustive)
● Classical vs Empirical vs Subjective probability
● Addition and multiplication rules
● Conditional probability
● Bayes’ Theorem
● Complementary rule
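Conditional probability, Bayes' Theorem, and the complementary rule come together in the classic screening-test calculation. The sketch below uses made-up numbers (1% prevalence, 95% sensitivity, 5% false-positive rate), and bayes_posterior is a hypothetical helper name:

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Law of total probability; P(not H) = 1 - P(H) is the complementary rule.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.161
```

Even with an accurate test, a positive result here implies only about a 16% chance of having the condition, because the condition is rare.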
8. Probability Distributions
8.1 Discrete Distributions
● Bernoulli distribution
● Binomial distribution
● Poisson distribution
8.2 Continuous Distributions
● Uniform distribution
● Normal distribution (Gaussian)
● Exponential distribution
● Log-normal distribution
8.3 Key Properties
● Probability Density Function (PDF)
● Cumulative Distribution Function (CDF)
● Z-scores and standardization
● Central Limit Theorem (CLT)
9. Inferential Statistics
● Population vs sample
● Sampling distribution
● Estimation (point estimate vs interval estimate)
● Confidence intervals (CI)
10. Hypothesis Testing
● Null and Alternative Hypothesis (H0 vs H1)
● Type I and Type II errors
● p-value and significance level (α)
● One-tailed vs two-tailed tests
● Steps in hypothesis testing
10.1 Common Tests
● Z-test (one-sample, two-sample)
● T-test (independent, paired)
● ANOVA (One-way and Two-way)
● Chi-Square test (goodness of fit, independence)
● F-test
11. Correlation and Covariance
● Covariance (positive, negative, zero)
● Pearson correlation coefficient
● Spearman rank correlation
● Causation vs correlation
12. Linear Algebra (For ML & Data Science)
● Scalars, vectors, matrices, tensors
● Matrix operations:
● Addition, subtraction, multiplication
● Transpose, inverse, determinant
● Identity and diagonal matrices
● Rank of a matrix
● Linear transformations
● Eigenvalues and eigenvectors
● Dot product and cross product
● Norms (L1, L2)
● Applications in ML (PCA, SVD)
13. Calculus (For ML & Deep Learning)
13.1 Differential Calculus
● Limits and continuity
● Derivatives and rules (product, quotient, chain rule)
● Partial derivatives
● Gradient, slope, tangent
● Optimization (minima, maxima)
● Cost function & gradient descent
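To connect derivatives, optimization, and gradient descent, here is a minimal sketch that minimizes the toy cost function f(x) = (x - 3)^2. The function names, the learning rate, and the step count are assumptions chosen for illustration:

```python
def cost(x):
    return (x - 3) ** 2


def cost_grad(x):
    # Derivative of (x - 3)^2 with respect to x.
    return 2 * (x - 3)


def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


x_min = gradient_descent(cost_grad, x0=0.0)
print(x_min, cost(x_min))  # x_min is approximately 3, cost approximately 0
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 on every step, which is why 100 steps are more than enough here.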
13.2 Integral Calculus
● Indefinite and definite integrals
● Area under the curve
● Applications in ML (e.g., loss functions)
14. Discrete Mathematics
● Set theory
● Logic and truth tables
● Functions and relations
● Graph theory basics (nodes, edges, trees)
Combinatorics:
● Permutations and combinations
● Factorials and counting problems
SQL
1. Introduction to Databases & SQL
● What is a database?
● Types of databases: Relational vs Non-relational
● What is SQL?
● RDBMS vs DBMS
● SQL dialects (MySQL, PostgreSQL, SQLite, SQL Server, Oracle)
● Setting up the environment (MySQL Workbench / pgAdmin / SQLite)
2. Database & Table Basics
● Creating a database: CREATE DATABASE
● Using a database: USE
● Dropping a database: DROP DATABASE
● Creating tables with CREATE TABLE
● Data types (INT, VARCHAR, TEXT, DATE, BOOLEAN, etc.)
Constraints:
● PRIMARY KEY
● FOREIGN KEY
● NOT NULL
● UNIQUE
● DEFAULT
● CHECK
● Dropping tables: DROP TABLE
● Modifying tables: ALTER TABLE (ADD, DROP, MODIFY column)
3. Basic Data Operations (CRUD)
● INSERT data into tables
● SELECT data from tables
● UPDATE existing records
● DELETE records from tables
4. Filtering and Sorting
● WHERE clause
● Logical operators: AND, OR, NOT
● Comparison operators: =, !=, <, >, <=, >=
● Pattern matching: LIKE, NOT LIKE, %, _
● BETWEEN, IN, IS NULL, IS NOT NULL
● ORDER BY clause (ASC/DESC)
● LIMIT and OFFSET
5. Working with Functions
Aggregate functions:
● COUNT(), SUM(), AVG(), MAX(), MIN()
String functions:
● LENGTH(), UPPER(), LOWER(), CONCAT(), SUBSTRING(), REPLACE(), TRIM()
Date functions:
● NOW(), CURDATE(), DATEDIFF(), DATE_ADD(), EXTRACT()
Mathematical functions:
● ROUND(), CEIL(), FLOOR(), MOD()
6. Grouping and Aggregating
● GROUP BY clause
● HAVING clause (vs WHERE)
● Grouping multiple columns
● Nested aggregations
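The difference between WHERE and HAVING can be tried without installing a database server by using Python's built-in sqlite3 module. The orders table and its rows are made up for this sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 50.0), ("ann", 70.0), ("bob", 20.0), ("bob", 15.0), ("cy", 200.0)],
)

rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    WHERE amount > 10            -- filters individual rows before grouping
    GROUP BY customer
    HAVING SUM(amount) > 100     -- filters whole groups after aggregation
    ORDER BY total DESC
    """
).fetchall()
conn.close()

print(rows)  # [('cy', 1, 200.0), ('ann', 2, 120.0)]
```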
7. SQL Joins (Combining Tables)
● INNER JOIN
● LEFT JOIN (LEFT OUTER JOIN)
● RIGHT JOIN (RIGHT OUTER JOIN)
● FULL OUTER JOIN
● CROSS JOIN
● Joining more than 2 tables
● Aliases with joins
● Self joins
8. Subqueries & Nested Queries
● Subquery in SELECT
● Subquery in WHERE and FROM
● Correlated subqueries
● EXISTS, NOT EXISTS
● IN vs EXISTS
9. Views
● Creating views: CREATE VIEW
● Updating views
● Dropping views
● Advantages and limitations
10. Indexes & Performance
● Creating indexes: CREATE INDEX
● Unique index
● Composite index
● Dropping indexes
● Performance benefits and trade-offs
11. Transactions and ACID
● What is a transaction?
● BEGIN, COMMIT, ROLLBACK
● Savepoints
● ACID properties (Atomicity, Consistency, Isolation, Durability)
● Isolation levels
12. Stored Procedures and Functions
● CREATE PROCEDURE
● CALL, IN, OUT parameters
● CREATE FUNCTION
● Differences between procedures and functions
● Dropping procedures and functions
13. Triggers and Events
● What are triggers?
● BEFORE and AFTER triggers
● Row-level vs statement-level triggers
● Creating, updating, and deleting triggers
● Scheduled events
14. Advanced SQL
● CASE statement
● Common Table Expressions (CTEs): WITH clause
Window functions:
● ROW_NUMBER(), RANK(), DENSE_RANK()
● LEAD(), LAG(), NTILE()
● Pivot tables using SQL
● Recursive queries
15. Database Normalization
● What is normalization?
● 1NF, 2NF, 3NF, BCNF
● Decomposition and dependency preservation
● Denormalization
4. Data Visualization and BI Tools
4.1. Power BI
● Data Loading and Transformation
● Power Query Editor
● DAX Formulas and Measures
● Creating Interactive Dashboards
● Publishing and Sharing Reports
4.2. Tableau
● Data Connection and Preparation
● Creating Basic and Advanced Charts
● Filters, Parameters, Calculated Fields
● Dashboards and Storytelling
● Tableau Prep for Data Cleaning
5. Data Wrangling & Exploratory Data Analysis (EDA)
● Handling Missing Values
● Data Type Conversion
● Outlier Detection
● Feature Engineering & Feature Selection
● Correlation and Trend Analysis
● Data Imbalance Handling Techniques
Machine Learning
1. Introduction to Machine Learning
● What is Machine Learning?
Types of ML:
● Supervised Learning
● Unsupervised Learning
● Semi-supervised Learning
● Reinforcement Learning
● ML vs AI vs Deep Learning
2. Data Preprocessing
● Importing datasets
● Handling missing data
● Mean/Median/Mode imputation
● Forward/Backward fill
● Handling categorical data
● Label Encoding
● One-Hot Encoding
● Feature Scaling
● MinMaxScaler
● StandardScaler
● Train-Test Split
● Pipeline creation using Pipeline and ColumnTransformer
3. Exploratory Data Analysis (EDA)
● Summary statistics
● Univariate, bivariate, multivariate analysis
● Correlation matrix & heatmap
● Outlier detection and treatment (Z-score, IQR)
● Distribution plots (histogram, boxplot, KDE)
● Feature importance analysis
4. Supervised Learning – Regression
4.1 Linear Regression
● Simple and Multiple Linear Regression
● Assumptions of linear regression
● Evaluation metrics: MAE, MSE, RMSE, R²
Regularization:
● Lasso Regression
● Ridge Regression
● ElasticNet
4.2 Polynomial Regression
● Polynomial features
● Overfitting and underfitting
5. Supervised Learning – Classification
5.1 Logistic Regression
● Binary and Multiclass classification
● Sigmoid function
● Confusion matrix
● Accuracy, Precision, Recall, F1-Score, ROC-AUC
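The confusion-matrix metrics above follow directly from four counts. The sketch below computes them by hand on made-up labels; classification_metrics is a hypothetical helper, and in practice a library such as scikit-learn would normally be used instead:

```python
def classification_metrics(y_true, y_pred):
    """Binary metrics from confusion-matrix counts (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]  # made-up predictions
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For simplicity this sketch does not guard against division by zero, which occurs when a class is never predicted or never present.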
5.2 K-Nearest Neighbors (KNN)
5.3 Support Vector Machines (SVM)
● Linear and non-linear SVM
● Kernel trick (RBF, polynomial, sigmoid)
5.4 Decision Trees
● Gini index vs Entropy
● Overfitting and pruning
5.5 Random Forest
● Ensemble concept
● Feature importance
● Hyperparameter tuning
5.6 Gradient Boosting
● AdaBoost
● XGBoost
● LightGBM
● CatBoost
6. Unsupervised Learning
6.1 Clustering
● K-Means clustering
● Elbow method & silhouette score
● Hierarchical clustering
● DBSCAN
6.2 Dimensionality Reduction
● PCA (Principal Component Analysis)
● t-SNE
● LDA (Linear Discriminant Analysis)
6.3 Association Rule Learning
● Apriori algorithm
7. Model Evaluation and Selection
● Cross-validation (K-Fold, Stratified K-Fold)
● GridSearchCV vs RandomizedSearchCV
● Bias-Variance tradeoff
● Learning curves
● Precision-Recall tradeoff
● ROC-AUC Curve
8. Feature Engineering
● Feature creation and extraction
● Handling date and time features
9. Time Series Forecasting (Basic)
● Time series components
● Lag features, rolling statistics
● AR, MA, ARMA, ARIMA
● Stationarity and differencing
● ACF & PACF plots
● Seasonal decomposition
● Prophet model
Deep Learning
1. Introduction to Deep Learning
● What is Deep Learning?
● Deep Learning vs Machine Learning
● Why Deep Learning? Use-cases and advantages
● History and evolution of neural networks
● AI → ML → DL → ANN/CNN/RNN → Transformers
2. Neural Networks Basics
● Biological Neuron vs Artificial Neuron
● Perceptron (single-layer)
Activation Functions:
● Sigmoid
● Tanh
● ReLU
● Leaky ReLU
● Softmax
● Feedforward and Backpropagation
Loss functions:
● MSE, MAE
● Cross-entropy loss
Optimizers:
● SGD
● Adam
● RMSProp
● Momentum
3. Building Neural Networks from Scratch
● NumPy implementation of ANN
● Forward pass and backward pass
● Weight initialization
● Hyperparameters: learning rate, batch size, epochs
● Underfitting vs Overfitting
● Train/Validation/Test split
● Regularization:
● L1/L2
● Dropout
● Early stopping
4. Deep Neural Networks (DNN)
● Multi-layer Perceptrons (MLPs)
● Depth vs Width of networks
● Vanishing/exploding gradients
● Batch Normalization
● Model tuning: grid search, random search
● Saving/loading models
5. Computer Vision with Convolutional Neural Networks (CNN)
● Image basics: pixels, channels, filters
● Convolution operation
● Pooling layers: MaxPool, AvgPool
● CNN architecture:
● LeNet
● AlexNet
● VGG16/VGG19
● ResNet (skip connections)
● Inception, MobileNet, EfficientNet
● Data augmentation
● Transfer Learning & Fine-tuning
● Image classification, object detection basics
Natural Language Processing
1. Introduction to NLP
● What is NLP?
● NLP vs NLU vs NLG
Applications of NLP:
● Chatbots
● Sentiment Analysis
● Search Engines
● Spam Detection
● Machine Translation
● Challenges in NLP
● Structured vs Unstructured Text
2. Text Preprocessing
● Text cleaning basics
● Tokenization
● Word tokenization
● Sentence tokenization
● Normalization
● Lowercasing
● Removing punctuation, special characters
● Removing stopwords
● Stemming vs Lemmatization
● POS (Part-of-Speech) tagging
● Named Entity Recognition (NER)
● Spell correction
● Regex for text patterns
3. Feature Extraction from Text
● Bag of Words (BoW)
● Term Frequency (TF)
● Inverse Document Frequency (IDF)
● TF-IDF
● N-grams (unigram, bigram, trigram)
● Vocabulary and document matrix
Text vectorization with:
● CountVectorizer
● TfidfVectorizer
● Document frequency analysis
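TF, IDF, and TF-IDF from the list above can be computed by hand on a tiny made-up corpus. This sketch uses the plain textbook definition idf = ln(N / df); scikit-learn's TfidfVectorizer applies a smoothed IDF and normalizes each row by default, so its numbers will differ:

```python
import math
from collections import Counter

docs = ["the cat sat", "the dog sat", "the cat ran fast"]  # made-up corpus
tokenized = [doc.split() for doc in docs]                  # naive whitespace tokenization


def term_frequency(tokens):
    """Relative frequency of each term within one document."""
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}


def inverse_document_frequency(tokenized_docs):
    """ln(N / df), where df is the number of documents containing the term."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for tokens in tokenized_docs for term in set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


idf = inverse_document_frequency(tokenized)
tfidf = [
    {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
    for tokens in tokenized
]

print(idf["the"])       # 0.0, because "the" appears in every document
print(tfidf[0]["cat"])
```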
4. Word Embeddings (Vector Representations)
● One-hot encoding
● Word2Vec (CBOW & Skip-gram)
● GloVe (Global Vectors for Word Representation)
● FastText
● Comparison: BoW vs Word2Vec vs GloVe
● Visualizing embeddings using t-SNE or PCA
9. MLOps (Machine Learning Operations)
● CI/CD for ML
● Model Monitoring and Logging
● MLflow, DVC
● Git, GitHub Actions
● Deployment using Cloud Services (AWS, GCP, Azure)
Generative AI
1. Introduction to Generative AI
● What is Generative AI?
● Discriminative vs Generative models
● Why is Generative AI important?
● Evolution of GenAI
Real-world applications:
● ChatGPT, Bard, Claude
● DALL·E, Midjourney, Stable Diffusion
● Music & voice generation
● Code generation
2. Foundations of Generative Models
● What is a generative model?
● Data distribution learning
● Types of generative models:
● Explicit vs implicit models
● Directed vs undirected models
● Key architectures: VAEs, GANs, Transformers
3. Transformers & Foundation Models
● Introduction to Transformer architecture
● Self-attention
● Positional encoding
● Multi-head attention
● Encoder vs Decoder vs Encoder-Decoder models
● Pretraining & fine-tuning
4. Large Language Models (LLMs)
● What are LLMs?
● BERT vs GPT
Key models:
● GPT-2, GPT-3, GPT-4
● T5
● PaLM
● LLaMA, Mistral, Falcon, Claude, Gemini
● Tokenization: BPE, SentencePiece
● Prompt engineering
● Chain-of-Thought reasoning
● In-context learning & few-shot learning
● Instruction-tuning
● Retrieval-Augmented Generation (RAG)
5. Diffusion Models
● What are diffusion models?
● Forward & reverse processes
● Denoising Diffusion Probabilistic Models (DDPM)
● Training and sampling process
Popular models:
● Stable Diffusion
● Imagen
● GLIDE
6. Multimodal Generative AI
● What is multimodal AI?
● Combining text, images, audio, and video
● CLIP (Contrastive Language–Image Pretraining)
● Flamingo (text + image)
● Visual Question Answering (VQA)
● Audio + Text (Whisper + GPT)
● Video generation models
7. Fine-Tuning & Customization
● Fine-tuning vs Prompt-tuning vs LoRA
● Dataset preparation for fine-tuning
● Parameter-efficient tuning (PEFT)
Tools:
● PEFT by Hugging Face
● DeepSpeed, bitsandbytes, QLoRA
● Training custom LLMs or image generators
8. Hugging Face Transformers, Datasets, Diffusers
● LangChain for LLM workflows
● OpenAI API (ChatGPT, DALL·E, Whisper)
● Vertex AI, Azure OpenAI, Amazon Bedrock
● Replicate.com for hosted models
● Gradio / Streamlit for GenAI apps
● Weights & Biases (WandB) for experiment tracking
11. BASICS OF AI AGENTS
11.1. Introduction to AI Agents
● What is an agent in AI?
● Agent = Perceives + Acts
● Types of agents:
o Simple Reflex Agents
o Model-based Reflex Agents
o Goal-based Agents
o Utility-based Agents
o Learning Agents
11.2. Agent Architecture
● Perception → Decision → Action
● PEAS Framework (Performance measure, Environment, Actuators, Sensors)
● Environment types:
o Fully vs Partially Observable
o Deterministic vs Stochastic
o Episodic vs Sequential
o Static vs Dynamic
o Discrete vs Continuous
11.3. Simple Agent Programs
● Rule-based agents
● IF-THEN rules
● Condition-action rules
11.4. Problem Solving Agents
● Search problem formulation
● Uninformed search (DFS, BFS, UCS)
● Informed search (A*, Greedy)
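Of the uninformed strategies above, breadth-first search (BFS) is the simplest to sketch: it explores states level by level and therefore returns a path with the fewest edges. The graph below and the function name bfs_shortest_path are made up for illustration:

```python
from collections import deque


def bfs_shortest_path(graph, start, goal):
    """Return a fewest-edges path from start to goal, or None if unreachable."""
    frontier = deque([start])   # FIFO queue of states to expand
    parent = {start: None}      # doubles as the visited set
    while frontier:
        node = frontier.popleft()
        if node == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                frontier.append(neighbor)
    return None


graph = {  # hypothetical directed state space
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}
print(bfs_shortest_path(graph, "A", "F"))  # ['A', 'B', 'D', 'F']
```

Replacing the FIFO queue with a stack would give DFS, and replacing it with a priority queue ordered by path cost would give UCS.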
12. Career Readiness
● Resume Building
● LinkedIn Optimization
● Mock Interviews
● Project Presentation Skills
● Freelancing & Portfolio Websites