UNIT-3: Regression and Logistic Regression (Detailed Notes)
📘 1. Regression – Concepts
📌 Definition:
Regression is a fundamental statistical technique used in data science and machine learning to explore the
relationship between a dependent variable (target) and one or more independent variables (predictors or
features). The objective of regression is not only to describe how the variables are related but also to
predict the value of the dependent variable when the values of the independent variables are known.
Regression falls under supervised learning, where the algorithm is trained on a labeled dataset.
📌 Objectives of Regression:
• To predict numerical (continuous) values.
• To identify and quantify relationships between variables.
• To support business decision-making with data-driven insights.
📌 Types of Regression:
1. Linear Regression (Simple and Multiple)
2. Polynomial Regression
3. Logistic Regression (used for classification)
4. Regularized Regression (Lasso, Ridge, Elastic Net)
📌 Real-life Applications:
• Predicting house prices
• Forecasting sales or stock prices
• Estimating the effect of study hours on exam scores
📘 2. Simple Linear Regression
📌 Definition:
Simple linear regression is the most basic form of regression analysis where the relationship between one
independent variable (X) and one dependent variable (Y) is modeled using a straight line. It assumes that
this relationship is linear and continuous.
📌 Mathematical Equation:
Y = β0 + β1 X + ϵ
Where:
• Y : Dependent variable (output)
• X : Independent variable (input)
• β0 : Intercept
• β1 : Slope of the line
• ϵ : Error term (residual)
📌 Example:
Suppose we want to predict the marks of a student based on the number of hours studied. We collect data
from multiple students, fit a line through the data points, and use that line for prediction.
📌 Graphical Representation:
Imagine a 2D scatter plot where:
• X-axis = Hours studied
• Y-axis = Marks obtained
• A straight line passes through the data points (best fit line)
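📌 Code Sketch:
A minimal sketch of simple linear regression in Python using scikit-learn; the hours/marks values below are made-up illustration data, not from a real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up illustration data: hours studied (X) and marks obtained (Y)
hours = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
marks = np.array([35, 42, 50, 55, 61, 68, 74, 80])

model = LinearRegression()
model.fit(hours, marks)  # estimates β0 (intercept) and β1 (slope)

print("Intercept (β0):", model.intercept_)
print("Slope (β1):", model.coef_[0])
print("Predicted marks for 6.5 hours:", model.predict([[6.5]])[0])
```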
📘 3. Multiple Linear Regression
📌 Definition:
Multiple linear regression is an extension of simple linear regression. It models the relationship between a
dependent variable and two or more independent variables. It is used when the dependent variable is
influenced by multiple factors.
📌 Mathematical Model:
Y = β0 + β1 X1 + β2 X2 + ... + βn Xn + ϵ
📌 Example:
Predicting a house price based on:
• Area (in sq. ft.)
• Number of rooms
• Distance from city center
• Age of the property
All of these become independent variables contributing to the price (Y).
📌 Graph:
With two features, the model can be visualized in 3D: each axis represents one variable and the fit is a
plane (not a line). With more features, the fit is a hyperplane that cannot be drawn directly.
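📌 Code Sketch:
A minimal sketch of the house-price example with scikit-learn; the areas, room counts, distances, ages, and prices are all hypothetical illustration values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: [area sq.ft., rooms, distance from center (km), age (yrs)]
X = np.array([
    [1200, 3,  5, 10],
    [1500, 4,  8,  5],
    [ 800, 2,  3, 20],
    [2000, 5, 12,  2],
    [1000, 3,  6, 15],
])
y = np.array([250000, 320000, 180000, 410000, 210000])  # illustrative prices

model = LinearRegression().fit(X, y)
print("Intercept β0:", model.intercept_)
print("Coefficients β1..β4:", model.coef_)
print("Predicted price:", model.predict([[1300, 3, 7, 8]])[0])
```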
📘 4. Polynomial Regression
📌 Definition:
Polynomial regression models the relationship between the dependent and independent variables as an
nth degree polynomial. It is useful when the data shows a curvilinear trend that linear models cannot
capture.
📌 Mathematical Model:
Y = β0 + β1 X + β2 X² + β3 X³ + ... + βn Xⁿ + ϵ
📌 Example:
Predicting plant growth over time where the growth rate increases with time, slows down, and then stops –
a non-linear pattern best modeled with a quadratic or cubic polynomial.
📌 Graph:
Shows a U-shaped or S-shaped curve fitted through the data points, depending on the polynomial degree.
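📌 Code Sketch:
A minimal sketch of the plant-growth example: a degree-3 polynomial fitted with scikit-learn's PolynomialFeatures in a pipeline. The day/height values are illustrative.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Illustrative plant-growth data: growth speeds up, then levels off
days   = np.array([[1], [5], [10], [15], [20], [25], [30]])
height = np.array([2.0, 6.5, 14.0, 20.5, 24.0, 25.5, 26.0])  # cm

# Degree-3 polynomial: Y = β0 + β1X + β2X² + β3X³
poly_model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
poly_model.fit(days, height)

print("Predicted height at day 18:", poly_model.predict([[18]])[0])
```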
📘 5. BLUE Assumptions (Best Linear Unbiased Estimator)
BLUE comes from the Gauss–Markov theorem, which states that when the assumptions below hold, the
ordinary least squares (OLS) estimator is the Best Linear Unbiased Estimator of the regression coefficients.
📌 Assumptions:
1. Linearity – The relationship between X and Y is linear.
2. Independence – Residuals (errors) are independent.
3. Homoscedasticity – Residuals have constant variance.
4. No Multicollinearity – Independent variables are not highly correlated.
5. Normality – Errors are normally distributed (not required by the Gauss–Markov theorem itself, but important for hypothesis testing and confidence intervals).
📌 Violations and Solutions:
• Multicollinearity ➝ Remove redundant variables, use PCA
• Heteroscedasticity ➝ Apply log or square root transformations
• Autocorrelation ➝ Use time series models
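📌 Code Sketch:
A short sketch of one diagnostic: checking multicollinearity with Variance Inflation Factors (VIF) via statsmodels. The predictor values are hypothetical; a VIF above roughly 5–10 is a common rule of thumb for flagging a problem.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictors; area and rooms are deliberately correlated
df = pd.DataFrame({
    "area":  [1200, 1500, 800, 2000, 1000, 1700],
    "rooms": [3, 4, 2, 5, 3, 4],
    "age":   [10, 5, 20, 2, 15, 7],
})

X = sm.add_constant(df)  # VIF is usually computed with an intercept included
for i, col in enumerate(X.columns):
    if col != "const":
        print(col, "VIF:", variance_inflation_factor(X.values, i))
```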
📘 6. Least Squares Estimation (LSE)
📌 Definition:
Least Squares Estimation is a mathematical method used to determine the best-fit line in regression by
minimizing the sum of the squares of the differences between observed and predicted values.
📌 Mathematical Objective:
SSE = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²
Minimize this to find the optimal β values.
📌 Steps:
1. Assume a linear model: Y = β0 + β1 X
2. Compute residuals (difference between actual and predicted)
3. Square and sum the residuals (SSE)
4. Find values of β0 and β1 that minimize SSE
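📌 Code Sketch:
A minimal sketch of the closed-form LSE solution for simple linear regression in plain NumPy, following the four steps above; the data are illustrative.

```python
import numpy as np

# Illustrative data (same hours/marks idea as earlier)
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([35, 42, 50, 55, 61, 68, 74, 80], dtype=float)

# Closed-form LSE for simple linear regression:
# β1 = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²,  β0 = ȳ − β1·x̄
x_bar, y_bar = x.mean(), y.mean()
beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0 = y_bar - beta1 * x_bar

residuals = y - (beta0 + beta1 * x)
sse = np.sum(residuals ** 2)  # the quantity LSE minimizes
print(f"β0={beta0:.2f}, β1={beta1:.2f}, SSE={sse:.2f}")
```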
📌 Applications:
• Forecasting sales
• Estimating trends in financial data
📘 7. Variable Rationalization
📌 Definition:
Variable rationalization is the process of selecting the most relevant variables (features), transforming them
appropriately, and engineering new features to enhance model performance.
📌 Steps:
1. Feature Selection – Identify and keep the most predictive variables
2. Feature Transformation – Normalize or log-transform variables
3. Feature Engineering – Create new features (e.g., BMI from height and weight)
4. Dimensionality Reduction – Use PCA to reduce the number of variables
📌 Example:
In a student performance dataset, instead of using raw attendance and study hours, we create a composite
score like: Effort Index = Attendance × Study Hours
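📌 Code Sketch:
A short sketch of the Effort Index example with pandas, combining feature engineering and a simple min-max transformation; the attendance and study-hour figures are hypothetical.

```python
import pandas as pd

# Hypothetical student-performance data
df = pd.DataFrame({
    "attendance_pct": [90, 75, 60, 95, 80],
    "study_hours":    [10, 6, 4, 12, 8],
})

# Feature engineering: composite score from the example above
df["effort_index"] = df["attendance_pct"] * df["study_hours"]

# Feature transformation: min-max normalization of the new feature
rng = df["effort_index"].max() - df["effort_index"].min()
df["effort_index_norm"] = (df["effort_index"] - df["effort_index"].min()) / rng
print(df)
```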
📘 8. Model Building & Evaluation
📌 Steps in Building a Regression Model:
1. Data Collection – Gather relevant, clean data
2. Data Preprocessing – Handle missing values, outliers
3. Feature Selection & Engineering
4. Splitting Data – Train-test split (e.g., 80/20)
5. Model Training – Apply regression algorithm
6. Evaluation – Assess with performance metrics
📌 Evaluation Metrics:
• R² (Coefficient of Determination): Proportion of variance explained
• Adjusted R²: Adjusted for number of predictors
• MSE/MAE/RMSE: Error-based metrics (lower is better)
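📌 Code Sketch:
A minimal end-to-end sketch of steps 4–6 using scikit-learn on synthetic data; the true coefficients (3 and 2) and noise level are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

# Synthetic data standing in for a cleaned, preprocessed dataset
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 1, size=100)

# Step 4: 80/20 train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Step 5: model training
model = LinearRegression().fit(X_train, y_train)

# Step 6: evaluation
y_pred = model.predict(X_test)
print("R²:  ", r2_score(y_test, y_pred))
print("MAE: ", mean_absolute_error(y_test, y_pred))
print("RMSE:", mean_squared_error(y_test, y_pred) ** 0.5)
```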
📘 9. Logistic Regression
📌 Definition:
Logistic Regression is a classification algorithm used when the dependent variable is binary (Yes/No, 1/0). It
estimates the probability that a given input belongs to a certain category using the logistic (sigmoid)
function.
📌 Logistic Function:
P(Y = 1) = 1 / (1 + e^−(β0 + β1 X))
📌 Example:
Predicting whether a customer will buy a product (1) or not (0) based on age and income.
📌 Output:
A probability between 0 and 1. A threshold (e.g., 0.5) is then applied to convert the probability into a class label.
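📌 Code Sketch:
A minimal sketch of the customer-purchase example with scikit-learn's LogisticRegression; the ages, incomes, and buy/no-buy labels are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical customers: [age, income in $1000s]; 1 = bought, 0 = did not
X = np.array([[22, 25], [35, 60], [45, 80], [28, 40], [52, 95],
              [30, 30], [41, 70], [25, 28], [48, 85], [33, 50]])
y = np.array([0, 1, 1, 0, 1, 0, 1, 0, 1, 1])

clf = LogisticRegression().fit(X, y)

# predict_proba returns [P(Y=0), P(Y=1)]; apply a 0.5 threshold
proba = clf.predict_proba([[38, 65]])[0, 1]
print("P(buy):", proba, "→ class:", int(proba >= 0.5))
```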
📘 10. Logistic Model Evaluation Metrics
📌 Key Metrics:
• Confusion Matrix – TP, TN, FP, FN
• Accuracy – Correct predictions / Total predictions
• Precision – TP / (TP + FP)
• Recall – TP / (TP + FN)
• F1 Score – Harmonic mean of precision and recall
• ROC Curve & AUC – Model’s ability to distinguish between classes
• Pseudo R² – McFadden’s R² for logistic models
• AIC/BIC – Lower values indicate better fit (penalizes complexity)
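📌 Code Sketch:
A short sketch computing the main metrics above with sklearn.metrics on illustrative labels and probabilities. Pseudo R² and AIC/BIC are typically obtained from statsmodels instead and are not shown here.

```python
from sklearn.metrics import (confusion_matrix, accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Illustrative true labels, predicted labels, and predicted probabilities
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3, 0.95, 0.25]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("Accuracy: ", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))
print("ROC AUC:  ", roc_auc_score(y_true, y_prob))  # uses probabilities
```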
📘 11. Business Applications of Regression Models
📌 Domain-wise Use Cases:
• Finance – Credit scoring, fraud detection, stock price forecasting
• Marketing – Customer segmentation, churn prediction, campaign effectiveness
• Healthcare – Disease diagnosis (e.g., diabetes prediction)
• Retail – Product recommendation, demand forecasting
• HR – Employee attrition modeling, recruitment analytics
• Manufacturing – Predictive maintenance, defect detection
• Transportation – Route optimization, delivery forecasting
• Education – Student dropout prediction, adaptive learning systems
📌 Tools Used:
Python, R, SQL, Excel, Power BI, Tableau