24CSA524: Machine Learning
Remya Rajesh
LINEAR REGRESSION
Example 1

• Plot a graph between the cost and area of the house
• The area of the house is represented on the X-axis, while cost is represented on the Y-axis
• What will regression do?
  • Fit a line through these points

Area (sq. feet) X    Cost (Lakh) Y
1000                 30
1200                 40
1300                 50
1450                 70
1495                 70
1600                 80

[Plot: house area (X-axis) vs. cost (Y-axis), with a line fitted through the points]
LINEAR REGRESSION
• Predict the cost for a house with area = 1100 sq. feet?
• How is this line represented mathematically?
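The slide's question can be answered with a short sketch: fit a line to the area/cost table from Example 1 and evaluate it at 1100 sq. feet. This uses NumPy's `polyfit` as one convenient least-squares fitter; the data values come from the slide.

```python
import numpy as np

# House data from the Example 1 slide: area (sq. feet) vs. cost (lakh)
area = np.array([1000, 1200, 1300, 1450, 1495, 1600])
cost = np.array([30, 40, 50, 70, 70, 80])

# Fit a straight line cost ≈ theta1 * area + theta0 by least squares;
# polyfit returns coefficients from highest degree down
theta1, theta0 = np.polyfit(area, cost, deg=1)

# Predict the cost of a house with area 1100 sq. feet (roughly 35 lakh)
predicted = theta1 * 1100 + theta0
print(round(predicted, 1))
```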
Introduction to Linear Regression
Example 2
Regression
• We assume that we have 𝑘 feature variables (also called independent variables).
• The target variable is also known as the dependent variable.
• We are given a dataset of the form (𝒙1, 𝑦1), …, (𝒙𝑛, 𝑦𝑛), where 𝒙𝒊 is a 𝑘-dimensional feature vector (real valued) and 𝑦𝑖 is a real value.
• We want to learn a function ℎ which, given a feature vector 𝒙𝒊, predicts a value 𝑦𝑖′ = ℎ(𝒙𝒊) that is as close as possible (or equal) to the value 𝑦𝑖.
• Minimize the sum of squares:

  Σ_i (𝑦𝑖 − ℎ(𝒙𝒊))²
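A minimal sketch of minimizing this sum of squares for 𝑘-dimensional features, using NumPy's `lstsq` on a small made-up dataset (the data values below are illustrative assumptions, chosen to lie exactly on y = 2·x1 + x2 + 1):

```python
import numpy as np

# Toy dataset: n = 5 samples, k = 2 features (values are invented for illustration)
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0],
              [5.0, 2.5]])
y = np.array([5.0, 5.5, 8.5, 12.0, 13.5])  # exactly y = 2*x1 + x2 + 1

# Append a column of ones so an intercept is learned as well
A = np.hstack([X, np.ones((X.shape[0], 1))])

# lstsq returns the parameter vector minimizing sum_i (y_i - h(x_i))^2
theta, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)

def h(x):
    """Learned hypothesis: predicts y' for a k-dimensional feature vector x."""
    return np.dot(theta[:-1], x) + theta[-1]

print(h(np.array([3.0, 2.0])))
```

Because the toy targets are exactly linear in the features, the recovered parameters match the generating coefficients.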
The Simple Regression Model
• Definition: a simple regression of y on x explains variable y in terms of a single variable x:

  y = θ0 + θ1·x + u

• θ0: intercept; θ1: slope parameter
• y: dependent variable, LHS variable, explained variable, response variable, …
• x: independent variable, RHS variable, explanatory variable, control variable, …
• u: error term, disturbance, unobservables, …
The Simple Regression Model
• Example: soybean yield and fertilizer quantity

  yield = θ0 + θ1·fertilizer + u

  θ1 measures the effect of the amount of fertilizer on yield; u captures rainfall, land quality, presence of parasites, …

• Example: a simple wage equation

  wage = θ0 + θ1·educ + u

  θ1 measures the change in wage given a number of years of education; u captures total experience, current experience, work ethic, work interest, workshops attended, …
Linear regression
• Univariate linear regression

  Training Set → Learning Algorithm → Estimated Model h

  Size of house (x) → h → predicted price (y)

  Model representation: h(x) = θ0 + θ1x
Basic Idea: Method 1
• Using a linear equation h(x) = θ0 + θ1x, compute the predicted value for a given input x.
Linear Regression: Prediction Model (Example 3)

• Given one variable X (years of experience)
• Goal: predict the value of Y (salary, Rs 1,000)
• Questions:
  • When X = 10, what is Y?
  • When X = 25, what is Y?
• This is known as regression

X (years of experience)   Y (salary, Rs 1,000)
3                         30
8                         57
9                         64
13                        72
3                         36
6                         43
11                        59
21                        90
1                         20
16                        83
Example

Salary Dataset

X (years of experience)   Y (salary in 1000s)
3                         30
8                         57
9                         64
13                        72
3                         36
6                         43
11                        59
21                        90
1                         20
16                        83

From the given dataset we get: x̄ = 9.1, ȳ = 55.4

θ1 = [(3 − 9.1)(30 − 55.4) + (8 − 9.1)(57 − 55.4) + … + (16 − 9.1)(83 − 55.4)] / [(3 − 9.1)² + (8 − 9.1)² + … + (16 − 9.1)²] ≈ 3.54 ≈ 3.5

θ0 = 55.4 − 3.54 × 9.1 ≈ 23.2

Thus, y ≈ 23.2 + 3.5x
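The hand computation above can be checked in plain Python, using the same closed-form expressions for the slope and intercept:

```python
# Salary dataset from the slide
x = [3, 8, 9, 13, 3, 6, 11, 21, 1, 16]       # years of experience
y = [30, 57, 64, 72, 36, 43, 59, 90, 20, 83]  # salary in Rs 1,000

x_bar = sum(x) / len(x)   # 9.1
y_bar = sum(y) / len(y)   # 55.4

# theta1 = sum (x_i - x_bar)(y_i - y_bar) / sum (x_i - x_bar)^2
num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
den = sum((xi - x_bar) ** 2 for xi in x)
theta1 = num / den               # about 3.54, rounded to 3.5 on the slide
theta0 = y_bar - theta1 * x_bar  # about 23.2

print(round(theta1, 2), round(theta0, 1))
```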
Linear Regression Example

[Plot: salary (Y, Rs 1,000) vs. years of experience (X) for the example data, with the fitted line Y = 3.5X + 23.2]

For the example data, when x = 10 years, the predicted salary is 23.2 + 3.5 × 10 = 58.2, i.e., about Rs 58,200 per year.
Multivariate models

• Simple regression model:

  x (Education) → y (Income)

• Multivariate or multiple regression model:

  x1 (Education), x2 (Gender), x3 (Experience), x4 (Age) → y (Income)
More than one prediction attribute

• Consider two independent attributes X1, X2 and a dependent variable Y
• For example:
  • X1 = years of experience
  • X2 = age
  • Y = salary
• Equation: h(x) = θ0 + θ1·X1 + θ2·X2
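One standard way to fit such a two-attribute model is the normal equation, θ = (XᵀX)⁻¹XᵀY (a technique not shown on the slide but equivalent to minimizing the squared error). The sketch below uses made-up values for experience, age, and salary:

```python
import numpy as np

# Hypothetical toy data: X1 = years of experience, X2 = age, Y = salary (Rs 1,000)
X1 = np.array([3, 8, 9, 13, 6])
X2 = np.array([25, 30, 33, 38, 29])
Y = np.array([30, 57, 64, 72, 43])

# Design matrix with a leading column of ones for the intercept theta0
X = np.column_stack([np.ones(len(X1)), X1, X2])

# Normal equation: solve (X^T X) theta = X^T Y
theta = np.linalg.solve(X.T @ X, X.T @ Y)
theta0, theta1, theta2 = theta

# Predict salary for X1 = 10 years of experience, X2 = 32 years of age
predicted = theta0 + theta1 * 10 + theta2 * 32
```

Solving the linear system with `np.linalg.solve` avoids explicitly inverting XᵀX, which is both cheaper and numerically safer.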
Outliers
• Regression is sensitive to outliers:
• The line will “tilt” to accommodate very extreme values
• Solution: remove the outliers
• But make sure that they do not capture useful information
Normalization
• In regression problems, sometimes our features have very different scales:
  • For example: predicting the GDP of a country using the count of home owners and the income as features
• Solution: normalize the features by replacing the values with their z-scores
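A z-score normalization sketch, using invented values for the two differently scaled features mentioned above:

```python
import numpy as np

# Features on very different scales (values are hypothetical)
home_owners = np.array([2.0e6, 3.5e6, 1.2e6, 4.8e6])   # counts in the millions
income = np.array([35_000.0, 42_000.0, 28_000.0, 55_000.0])  # tens of thousands

def z_score(feature):
    """Replace raw values with z-scores: (value - mean) / standard deviation."""
    return (feature - feature.mean()) / feature.std()

home_owners_z = z_score(home_owners)
income_z = z_score(income)
# After normalization both features have mean 0 and standard deviation 1,
# so neither dominates the regression purely because of its scale
```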
Predictive Model challenges with linearity
Hypothesis

• h(x) = θ0 + θ1x, where θ0 and θ1 are the parameters
• Let's visualize this hypothesis:
  • θ0 = 1.5, θ1 = 0 gives h(x) = 1.5 + 0·x (a horizontal line at 1.5)
  • θ0 = 0, θ1 = 1 gives h(x) = 0 + 1·x
  • θ0 = 0, θ1 = 0.5 gives h(x) = 0 + 0.5·x
Optimize Cost Function

minimize over θ0, θ1:  (1/2m) Σ_{i=1}^{m} (h(x^(i)) − y^(i))²

minimize over θ0, θ1:  (1/2m) Σ_{i=1}^{m} (θ0 + θ1x^(i) − y^(i))²

J(θ0, θ1) = (1/2m) Σ_{i=1}^{m} (h(x^(i)) − y^(i))²

Goal: minimize J(θ0, θ1) over θ0, θ1

where J(θ0, θ1) is the cost function, or squared error function
Cost Function

• How do we best fit our data?
• Choose the values of θ such that the difference between h(x) (the predicted value) and y (the actual value) is minimum
• To calculate this, define an error function, also called the cost function:

  J(θ) = h(x) − y

• Square the error, because some points lie above and some below the line (the signed errors would cancel):

  J(θ) = (h(x) − y)²

• Sum the error over ALL points:

  J(θ) = Σ_{i=1}^{m} (h(x^(i)) − y^(i))²

• Average it over the m points:

  J(θ) = (1/m) Σ_{i=1}^{m} (h(x^(i)) − y^(i))²

• Divide by 2 to make the later calculation easier:

  J(θ) = (1/2m) Σ_{i=1}^{m} (h(x^(i)) − y^(i))²
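The final cost J can be evaluated directly. The sketch below reuses the salary dataset from the earlier example and compares the fitted parameters (θ0 = 23.2, θ1 = 3.5) against an arbitrary guess:

```python
# Salary dataset from the earlier example
x = [3, 8, 9, 13, 3, 6, 11, 21, 1, 16]
y = [30, 57, 64, 72, 36, 43, 59, 90, 20, 83]
m = len(x)

def J(theta0, theta1):
    """Squared-error cost: (1/2m) * sum of squared prediction errors."""
    return sum((theta0 + theta1 * xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

# The fitted parameters give a much lower cost than an arbitrary guess
print(J(23.2, 3.5), J(0.0, 1.0))
```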
Evaluation
• The R-squared (R²) score measures the fraction of the variance in y that the model explains; the higher the R-squared score, the better the model fits your data
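For instance, the R-squared of the fitted salary model y = 23.2 + 3.5x from the earlier example can be computed as 1 − SS_res/SS_tot:

```python
# Salary dataset and the fitted model from the earlier example
x = [3, 8, 9, 13, 3, 6, 11, 21, 1, 16]
y = [30, 57, 64, 72, 36, 43, 59, 90, 20, 83]

y_bar = sum(y) / len(y)
y_pred = [23.2 + 3.5 * xi for xi in x]

ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))  # residual sum of squares
ss_tot = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 3))
```

An R² near 1 means the line explains almost all the variation in salary; here the fit explains well over 90% of it.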