Lecture W2ab

Uploaded by

ayeshatanveer.cs


CS-871: Machine Learning

Week 2 – Linear Regression

Instructor: Dr. Daud Abdullah
Email: daud.abdullah@seecs.edu.pk
General Conduct
• Be respectful of others

• Only speak when it is your turn, and preferably raise your hand if you want
to say something

• Do not cut off others when they are talking or asking questions

• Join the class on time, and always close the door when entering so that you
cause minimal disruption
Linear regression
with one variable

Supervised Learning
Regression Problem: Predict real-valued output

Andrew Ng
Supervised Learning
The computer is presented with example inputs and
their desired outputs, and the goal is to learn a general
rule that maps inputs to outputs.

Supervised Learning: Regression
• Goal: Determine the function that maps x to y
• Function: Approximated using the dataset
• The machine learns which value of y is usually obtained for a given value of x
• Formulated as a function
• Given any unseen x as input, the function provides an expected y
Training set of housing prices (Portland, OR)

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Linear regression with one variable is also called univariate linear regression.

Notation:
m = number of training examples (the rows of the table)
x's = "input" variable / features
y's = "output" variable / "target" variable
(x, y) = one training example
$(x^{(i)}, y^{(i)})$ = the i-th training example
Examples: $x^{(1)} = 2104$, $x^{(2)} = 1416$, $y^{(1)} = 460$
How do we represent h?

[Figure: Training Set → Learning Algorithm → h. The size of a house (x) is fed to the hypothesis h, which outputs the estimated house price (y). Example fits shown: a linear hypothesis and a degree-7 polynomial hypothesis.]

h maps from x's to y's.

Restriction bias: consider only linear functions.
Training Set

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Hypothesis: $h_\Theta(x) = \Theta_0 + \Theta_1 x$

$\Theta_i$'s: Parameters
How to choose the $\Theta_i$'s?
[Figure: three example hypotheses, each plotted for x and y from 0 to 3]
h(x) = 1.5 + 0·x
h(x) = 0.5·x
h(x) = 1 + 0.5·x
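Idea: Choose $\Theta_0, \Theta_1$ so that $h_\Theta(x)$ is close to $y$ for our training examples $(x^{(i)}, y^{(i)})$.

$$\min_{\Theta_0, \Theta_1} \; \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2, \qquad h_\Theta(x^{(i)}) = \Theta_0 + \Theta_1 x^{(i)}$$

Cost function (squared error function):

$$J(\Theta_0, \Theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2$$

Minimize $J(\Theta_0, \Theta_1)$ over $\Theta_0, \Theta_1$.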
Linear regression with one variable
Cost function intuition I
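General case:
Hypothesis: $h_\Theta(x) = \Theta_0 + \Theta_1 x$
Parameters: $\Theta_0, \Theta_1$
Cost Function: $J(\Theta_0, \Theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2$
Goal: $\min_{\Theta_0, \Theta_1} J(\Theta_0, \Theta_1)$

Simplified case ($\Theta_0 = 0$):
Hypothesis: $h_\Theta(x) = \Theta_1 x$
Parameters: $\Theta_1$
Cost Function: $J(\Theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2$
Goal: $\min_{\Theta_1} J(\Theta_1)$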
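Left panel: $h_\Theta(x)$ (for fixed $\Theta_1$, this is a function of x). Right panel: $J(\Theta_1)$ (a function of the parameter $\Theta_1$).

[Figure: training points (1, 1), (2, 2), (3, 3) with the line for $\Theta_1 = 1$; J plotted against $\Theta_1$ from -0.5 to 2.5]

With $\Theta_1 = 1$ and m = 3:

$$J(\Theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2 = \frac{1}{2m} \sum_{i=1}^{m} \left( \Theta_1 x^{(i)} - y^{(i)} \right)^2 = \frac{1}{2m} \left( 0^2 + 0^2 + 0^2 \right) = 0$$

so $J(1) = 0$.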
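Left panel: $h_\Theta(x)$ (for fixed $\Theta_1$, this is a function of x). Right panel: $J(\Theta_1)$ (a function of the parameter $\Theta_1$).

[Figure: the same training points with the line for $\Theta_1 = 0.5$; the vertical gaps between $y^{(i)}$ and $h_\Theta(x^{(i)})$ are marked]

$$J(0.5) = \frac{1}{2 \cdot 3} \left[ (0.5 - 1)^2 + (1 - 2)^2 + (1.5 - 3)^2 \right] = \frac{1}{6} \cdot 3.5 \approx 0.58$$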
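Left panel: $h_\Theta(x)$ (for fixed $\Theta_1$, this is a function of x). Right panel: $J(\Theta_1)$ (a function of the parameter $\Theta_1$).

[Figure: the same training points with the line for $\Theta_1 = 0$]

$$J(0) = \frac{1}{2 \cdot 3} \left[ 1^2 + 2^2 + 3^2 \right] = \frac{1}{6} \cdot 14 \approx 2.3$$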
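The three hand computations of J for the simplified hypothesis can be checked numerically. The sketch below assumes the training points are (1, 1), (2, 2), (3, 3), which is what the arithmetic on these slides implies; the function name `cost` is only illustrative.

```python
import numpy as np

# Training points implied by the worked examples: (1, 1), (2, 2), (3, 3).
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])


def cost(theta1, x, y):
    """Squared error cost J(theta1) for the simplified hypothesis h(x) = theta1 * x."""
    m = len(x)
    residuals = theta1 * x - y
    return np.sum(residuals ** 2) / (2 * m)


for theta1 in (1.0, 0.5, 0.0):
    print(f"J({theta1}) = {cost(theta1, x, y):.2f}")
# J(1.0) = 0.00
# J(0.5) = 0.58
# J(0.0) = 2.33
```

Evaluating `cost` over a range of values of theta1 traces out the bowl-shaped curve drawn in the right-hand panels.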
Linear regression with one variable
Cost function intuition II
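Hypothesis: $h_\Theta(x) = \Theta_0 + \Theta_1 x$

Parameters: $\Theta_0, \Theta_1$

Cost Function: $J(\Theta_0, \Theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\Theta_0, \Theta_1} J(\Theta_0, \Theta_1)$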
Left panel: h(x) (for fixed Θ0, Θ1, this is a function of x). Right panel: J(Θ0, Θ1) (a function of the parameters Θ0, Θ1).

[Figure: Price ($) in 1000's plotted against Size in feet² (x) from 0 to 3000, with the line for Θ0 = 50, Θ1 = 0.06]
Contour plots

Left panel: h(x) (for fixed Θ0, Θ1, this is a function of x). Right panel: J(Θ0, Θ1) (a function of the parameters Θ0, Θ1).

Is the slope positive or negative? What is the value of

Do you agree?
Left panel: h(x) (for fixed Θ0, Θ1, this is a function of x). Right panel: J(Θ0, Θ1) (a function of the parameters Θ0, Θ1).

h(x) = 360 + 0·x, that is, Θ0 = 360 and Θ1 = 0
Linear regression with one variable
Gradient descent
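Have some function $J(\Theta_0, \Theta_1)$
Want $\min_{\Theta_0, \Theta_1} J(\Theta_0, \Theta_1)$

Outline:
• Start with some $\Theta_0, \Theta_1$
• Keep changing $\Theta_0, \Theta_1$ to reduce $J(\Theta_0, \Theta_1)$
until we hopefully end up at a minimum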
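The outline above can be sketched in a few lines for the one-variable case. This is a minimal illustration rather than the lecture's own code: it assumes the squared error cost J from the earlier slides, the toy points (1, 1), (2, 2), (3, 3), and a hypothetical learning rate `alpha` and iteration count.

```python
import numpy as np


def gradient_descent(x, y, alpha=0.1, num_iters=2000):
    """Minimise J(theta0, theta1) = 1/(2m) * sum((theta0 + theta1 * x - y)^2)."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0  # start with some theta0, theta1
    for _ in range(num_iters):
        error = theta0 + theta1 * x - y
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = np.sum(error) / m
        grad1 = np.sum(error * x) / m
        # Simultaneous update: both gradients are computed before either parameter changes.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
theta0, theta1 = gradient_descent(x, y)
print(theta0, theta1)  # close to 0 and 1, the line that passes through all three points
```

Because this cost is convex, the loop heads towards the single global minimum as long as `alpha` is small enough; a value that is too large would make the updates diverge.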
[Figure: J(Θ0, Θ1) plotted over the Θ0 and Θ1 axes, shown twice]
Linear regression with one variable
Gradient descent for linear regression
[Figure: J(Θ0, Θ1) plotted over the Θ0 and Θ1 axes]
Convex function

Bowl-shaped

[Figure: left panel, h(x) for fixed Θ0, Θ1 as a function of x; right panel, J(Θ0, Θ1) as a function of the parameters]
Training, Validation and Testing

• The dataset is usually split into a training set, a validation set, and a testing set
• The training set is used to train your model and estimate its parameters
• The validation set is used to validate the performance of your model and tune the hyper-parameters
• The testing set is used to check the accuracy of your final model
• We need our model to perform well on unseen data
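One possible way to produce the three subsets is a shuffled index split. The sketch below is an assumption-laden illustration: the function name `train_val_test_split`, the 60/20/20 proportions, and the fixed seed are all hypothetical choices, not something the slides prescribe.

```python
import numpy as np


def train_val_test_split(X, y, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the example indices once, then cut them into three disjoint parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return (X[train_idx], y[train_idx]), (X[val_idx], y[val_idx]), (X[test_idx], y[test_idx])


# Ten made-up examples with two features each.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10, dtype=float)
train, val, test = train_val_test_split(X, y)
print(len(train[0]), len(val[0]), len(test[0]))  # 6 2 2
```

The testing part is set aside until the end, so its score reflects performance on data the model has not seen during training or tuning.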
Choosing Step Size (Learning Rate)

$$\mathbf{w}_{new} = \mathbf{w}_{old} - \alpha \nabla_{\mathbf{w}} \mathrm{TrainLoss}(\mathbf{w})$$

• Could be constant

• Could be decreasing: $1/\sqrt{\text{number of updates made so far}}$

Initial 1
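The two step-size choices can be compared with a small sketch. Everything here is illustrative: the one-dimensional loss (w - 3)^2, the starting value `alpha0`, and the decision to add 1 to the update count (so the very first update does not divide by zero) are assumptions, not part of the slide.

```python
import math


def constant_step(alpha0, num_updates):
    """Constant learning rate: ignore the update count."""
    return alpha0


def decreasing_step(alpha0, num_updates):
    """Learning rate proportional to 1/sqrt(number of updates made so far).

    Assumption: 1 is added to the count so the first update (count 0) is defined.
    """
    return alpha0 / math.sqrt(num_updates + 1)


def minimise(grad, w, step_rule, alpha0=0.25, num_iters=100):
    """Apply w_new = w_old - alpha * grad(w_old) repeatedly."""
    for t in range(num_iters):
        w = w - step_rule(alpha0, t) * grad(w)
    return w


# Hypothetical one-dimensional training loss (w - 3)^2, whose gradient is 2 * (w - 3).
grad = lambda w: 2.0 * (w - 3.0)
print(minimise(grad, 0.0, constant_step))    # approaches the minimiser w = 3
print(minimise(grad, 0.0, decreasing_step))  # also approaches 3, with shrinking steps
```

A decreasing schedule takes large steps early and smaller ones later, which can help when a constant rate keeps overshooting near the minimum.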

Regression Model Performance Evaluation

• Mean Squared Error (MSE): Average squared difference between predicted and actual values.
• Root Mean Squared Error (RMSE): Square root of MSE, expressed in the same units as the target.
• Mean Absolute Error (MAE): Average absolute difference between predicted and actual values.
• R-Squared (R²): Proportion of variance in the target variable explained by the model. Usually between 0 and 1 (it can be negative when the model fits worse than always predicting the mean). Higher means better performance.
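The four metrics follow directly from their definitions. In the sketch below the actual prices are the four values from the housing table, while the predictions are made-up numbers used only to exercise the formulas; `regression_metrics` is an illustrative name.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R-squared for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(residuals))
    ss_res = np.sum(residuals ** 2)                  # variation left unexplained
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


y_true = [460, 232, 315, 178]   # prices from the housing table
y_pred = [450, 240, 300, 190]   # hypothetical predictions
print(regression_metrics(y_true, y_pred))
```

Here the MSE is 133.25 and the MAE is 11.25, and R² comes out just below 0.99 because the residuals are small compared with the spread of the prices.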
Multiple features (variables)
Size in feet² (x)    Price ($) in 1000's (y)
2104                 400
1416                 232
1534                 315
852                  178
…                    …

$$f_{w,b}(x) = wx + b$$
Multiple features (variables)
Size in feet²   Number of bedrooms   Number of floors   Age of home in years   Price ($) in 1000's
2104            5                    1                  45                     460
1416            3                    2                  40                     232
1534            3                    2                  30                     315
852             2                    1                  36                     178
…               …                    …                  …                      …

$x_j$ = j-th feature
$n$ = number of features
$\mathbf{x}^{(i)}$ = features of the i-th training example
$x_j^{(i)}$ = value of feature j in the i-th training example
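Model:
Previously: $f_{w,b}(x) = wx + b$

With n features: $f_{\mathbf{w},b}(\mathbf{x}) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$

In vector notation: $f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$

This model is called multiple linear regression.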
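The dot-product form and the written-out sum give the same prediction, which a short sketch can confirm. The feature vector is the second row of the multi-feature table; the weights `w` and bias `b` are hypothetical values chosen only for illustration.

```python
import numpy as np


def predict(x, w, b):
    """Multiple linear regression model f(x) = w . x + b."""
    return np.dot(w, x) + b


w = np.array([0.1, 4.0, 10.0, -2.0])  # hypothetical weights, one per feature
b = 80.0                               # hypothetical bias
x = np.array([1416, 3, 2, 40])         # size, bedrooms, floors, age

# The same quantity written out term by term: w1*x1 + w2*x2 + ... + wn*xn + b.
explicit = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
print(predict(x, w, b), explicit)
```

Both expressions evaluate to about 173.6 here; the vectorised form is shorter to write and lets NumPy perform the multiply-and-add in optimised code.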
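Previous notation:
Parameters: $w_1, \cdots, w_n$ and $b$
Model: $f_{\mathbf{w},b}(\mathbf{x}) = w_1 x_1 + \cdots + w_n x_n + b$
Cost function: $J(w_1, \cdots, w_n, b)$
Gradient descent:
repeat {
$w_j = w_j - \alpha \frac{\partial}{\partial w_j} J(w_1, \cdots, w_n, b)$
$b = b - \alpha \frac{\partial}{\partial b} J(w_1, \cdots, w_n, b)$
}

Vector notation:
Parameters: $\mathbf{w} = [w_1 \cdots w_n]$ and $b$
Model: $f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b$
Cost function: $J(\mathbf{w}, b)$
Gradient descent:
repeat {
$w_j = w_j - \alpha \frac{\partial}{\partial w_j} J(\mathbf{w}, b)$
$b = b - \alpha \frac{\partial}{\partial b} J(\mathbf{w}, b)$
}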
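Gradient descent

One feature:
repeat {
$w = w - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}$
$b = b - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)$
simultaneously update $w, b$
}
Here $\frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)} = \frac{\partial}{\partial w} J(w, b)$.

n features ($n \geq 2$):
repeat {
$w_1 = w_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_1^{(i)}$
⋮
$w_n = w_n - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_n^{(i)}$
$b = b - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right)$
simultaneously update $w_j$ (for $j = 1, \cdots, n$) and $b$
}
Here $\frac{1}{m} \sum_{i=1}^{m} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_1^{(i)} = \frac{\partial}{\partial w_1} J(\mathbf{w}, b)$.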
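The n-feature update rules can be written compactly with matrix operations. The following is a sketch under stated assumptions: the rows of `X` are the training examples, the data set is a made-up one generated from y = 1 + 2·x1 - 3·x2, and `alpha` and `num_iters` are hypothetical settings.

```python
import numpy as np


def gradient_descent(X, y, alpha, num_iters):
    """Batch gradient descent for f(x) = w . x + b with the squared error cost."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(num_iters):
        error = X @ w + b - y        # f(x^(i)) - y^(i) for every example i
        grad_w = X.T @ error / m     # entry j is (1/m) * sum_i error_i * x_j^(i)
        grad_b = error.sum() / m
        # Simultaneous update: both gradients use the old w and b.
        w = w - alpha * grad_w
        b = b - alpha * grad_b
    return w, b


# Made-up data generated from y = 1 + 2*x1 - 3*x2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 3.0, -2.0, 0.0])
w, b = gradient_descent(X, y, alpha=0.5, num_iters=1000)
print(w, b)  # close to [2, -3] and 1
```

Computing `grad_w` and `grad_b` before changing either parameter is what the slides mean by a simultaneous update.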
Linear Regression Example
• The training data given for linear regression is:

𝒙        𝑦
[1,0]    2
[1,0]    4
[0,1]    1

• Initialize the weights as 0. Calculate the updated weights for this problem using 2 iterations.
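One way to check a hand calculation for this exercise is to run the two iterations in code. The exercise does not state a learning rate, a bias term, or the exact loss, so the sketch below assumes a learning rate of 0.1, no bias term, and the 1/(2m) squared error cost used earlier in the lecture; different assumptions give different numbers.

```python
import numpy as np

X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([2.0, 4.0, 1.0])
alpha = 0.1        # assumed learning rate, not given in the exercise
w = np.zeros(2)    # weights initialised to 0, as the exercise asks

history = []
for iteration in range(1, 3):
    error = X @ w - y                # predictions minus targets (no bias term assumed)
    grad = X.T @ error / len(y)      # gradient of the 1/(2m) squared error cost
    w = w - alpha * grad
    history.append(w.copy())
    print(f"iteration {iteration}: w = {w}")
```

Under these assumptions the first iteration gives w ≈ [0.2, 0.0333] and the second gives w ≈ [0.3867, 0.0656].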
QUESTIONS???
ACKNOWLEDGEMENT!
• Various contents in this presentation have been taken from different books,
lecture notes, and the web. These solely belong to their owners, and are here used
only for clarifying various educational concepts. Any copyright infringement is
not intended.
