18/11/2021
Supervised learning (Part Ⅱ)
Outline
- Linear regression
• Model representation
• Cost function
• Gradient Descent
- Logistic regression
• Generative vs discriminative classifiers
• Hypothesis representation
• Cost function
Linear regression
Regression: An example
Regression: predicting a real-valued output.
Training set → Learning Algorithm → Hypothesis
Size of house → Hypothesis → Estimated price
House pricing prediction
[Scatter plot: Price ($) in 1000's against Size in m^2]
Training set
Size in m^2 (x)    Price ($) in 1000's (y)
2104               460
1416               232
1534               315
852                178
…                  …
(m = 47)

• Notation:
  • m = number of training examples
  • x = input variable / features
  • y = output variable / target variable
  • (x, y) = one training example
  • (x^(i), y^(i)) = the i-th training example
• Examples: x^(1) = 2104,  x^(2) = 1416,  y^(1) = 460
Model representation
Training set → Learning Algorithm → Hypothesis h
Size of house → h(x) → Estimated price
[Plot: Price ($) in 1000's against Size in m^2, with the hypothesis drawn as a straight line through the data]
This is univariate linear regression: linear regression with one variable.
Cost function
Training set (same as above, m = 47):
Size in m^2 (x)    Price ($) in 1000's (y)
2104               460
1416               232
1534               315
852                178
…                  …

• Hypothesis: h_θ(x) = θ_0 + θ_1 x
• θ_0, θ_1: parameters/weights
• How do we choose θ_0, θ_1?
Cost function
[Three plots: different choices of θ_0, θ_1 give different straight-line fits h_θ(x) to the same data]
Cost function
• Idea: choose θ_0, θ_1 so that h_θ(x) is close to y for our training examples (x, y).

    J(θ_0, θ_1) = (1/(2m)) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2

• J(θ_0, θ_1) is called the cost function.
[Plot: Price ($) in 1000's against Size in m^2, with a candidate straight-line hypothesis h_θ(x)]
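As a concrete illustration, here is a minimal NumPy sketch of this cost function; the four housing examples come from the training-set table above, and the θ values are arbitrary placeholders.

```python
import numpy as np

# Four examples from the training-set table (size in m^2, price in 1000's of $)
x = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])

def h(theta0, theta1, x):
    """Hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def cost(theta0, theta1, x, y):
    """Squared-error cost J(theta0, theta1) = (1/(2m)) * sum((h(x) - y)^2)."""
    m = len(y)
    return np.sum((h(theta0, theta1, x) - y) ** 2) / (2 * m)

print(cost(0.0, 0.2, x, y))    # J for one (arbitrary) choice of parameters
print(cost(50.0, 0.15, x, y))  # a different choice gives a different cost
```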
Cost function (simplified)

Original:
• Hypothesis: h_θ(x) = θ_0 + θ_1 x
• Parameters: θ_0, θ_1
• Cost function: J(θ_0, θ_1) = (1/(2m)) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2
• Goal: minimize over θ_0, θ_1:  J(θ_0, θ_1)

Simplified (fix θ_0 = 0):
• Hypothesis: h_θ(x) = θ_1 x
• Parameters: θ_1
• Cost function: J(θ_1) = (1/(2m)) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2
• Goal: minimize over θ_1:  J(θ_1)
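To see how J(θ_1) behaves, a small sketch using a toy dataset (1,1), (2,2), (3,3) in the spirit of the plots on the following slides (this toy data is an assumption, chosen so the minimum sits at θ_1 = 1):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])  # toy inputs
y = np.array([1.0, 2.0, 3.0])  # toy targets: y = x, so theta1 = 1 is optimal
m = len(y)

def J(theta1):
    """Simplified cost with theta0 fixed to 0: J(theta1) = (1/(2m)) sum((theta1*x - y)^2)."""
    return np.sum((theta1 * x - y) ** 2) / (2 * m)

for t in [0.0, 0.5, 1.0, 1.5]:
    print(f"J({t}) = {J(t):.3f}")
# J(1.0) = 0 is the minimum; moving theta1 away from 1 increases the cost.
```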
Cost function
[Left: training data with the simplified hypothesis h_θ(x) = θ_1 x plotted for a fixed θ_1 (a function of x). Right: J(θ_1) plotted as a function of θ_1; each choice of θ_1 contributes one point on the J(θ_1) curve. Repeated on the following slides for several values of θ_1; the minimum of J(θ_1) corresponds to the best-fitting line.]
Cost function
• Hypothesis: h_θ(x) = θ_0 + θ_1 x
• Parameters: θ_0, θ_1
• Cost function: J(θ_0, θ_1) = (1/(2m)) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2
• Goal: minimize over θ_0, θ_1:  J(θ_0, θ_1)
[Surface and contour plots of J(θ_0, θ_1) over the (θ_0, θ_1) plane]
How do we find parameters θ_0, θ_1 that minimize J(θ_0, θ_1)?
Gradient descent
Have some function J(θ_0, θ_1).
Want min over θ_0, θ_1 of J(θ_0, θ_1).
Outline:
• Start with some θ_0, θ_1
• Keep changing θ_0, θ_1 to reduce J(θ_0, θ_1) until we hopefully end up at a minimum
Gradient descent
Repeat until convergence {
    θ_j := θ_j − α ∂J(θ_0, θ_1)/∂θ_j    (for j = 0 and j = 1)
}
α: learning rate (step size)
∂J(θ_0, θ_1)/∂θ_j: partial derivative (rate of change of J with respect to θ_j)
Gradient descent
Correct: simultaneous update
    temp0 := θ_0 − α ∂J(θ_0, θ_1)/∂θ_0
    temp1 := θ_1 − α ∂J(θ_0, θ_1)/∂θ_1
    θ_0 := temp0
    θ_1 := temp1

Incorrect:
    temp0 := θ_0 − α ∂J(θ_0, θ_1)/∂θ_0
    θ_0 := temp0
    temp1 := θ_1 − α ∂J(θ_0, θ_1)/∂θ_1    (this gradient now uses the already-updated θ_0)
    θ_1 := temp1
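A minimal sketch of why the order matters. The quadratic cost used here is a stand-in chosen so that both partial derivatives depend on both parameters; it is not the regression cost from these slides.

```python
# Stand-in cost (not the course's J): J(t0, t1) = (t0 + t1)^2,
# with dJ/dt0 = dJ/dt1 = 2*(t0 + t1).
alpha = 0.1

# Correct: simultaneous update -- both gradients use the OLD parameters.
theta0, theta1 = 1.0, 2.0
temp0 = theta0 - alpha * 2 * (theta0 + theta1)
temp1 = theta1 - alpha * 2 * (theta0 + theta1)
theta0, theta1 = temp0, temp1
print(theta0, theta1)   # 0.4 1.4

# Incorrect: sequential update -- theta1's gradient sees the already-updated theta0.
theta0, theta1 = 1.0, 2.0
theta0 = theta0 - alpha * 2 * (theta0 + theta1)   # 0.4
theta1 = theta1 - alpha * 2 * (theta0 + theta1)   # 2 - 0.1*2*(0.4 + 2.0) = 1.52
print(theta0, theta1)   # 0.4 1.52 -- not the same as the simultaneous update
```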
[Plot of J(θ_1) against θ_1: to the left of the minimum the slope ∂J(θ_1)/∂θ_1 < 0, so the update θ_1 := θ_1 − α ∂J(θ_1)/∂θ_1 increases θ_1; to the right the slope is > 0, so the update decreases θ_1. Either way θ_1 moves toward the minimum.]
Learning rate
[Figures illustrating the effect of the learning rate α on the gradient descent steps]
Gradient descent for linear regression
Repeat until convergence {
    θ_j := θ_j − α ∂J(θ_0, θ_1)/∂θ_j    (for j = 0 and j = 1)
}
• Linear regression model:
    h_θ(x) = θ_0 + θ_1 x
    J(θ_0, θ_1) = (1/(2m)) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2
Computing the partial derivatives
• ∂J(θ_0, θ_1)/∂θ_j = ∂/∂θ_j [ (1/(2m)) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2 ]
                    = ∂/∂θ_j [ (1/(2m)) Σ_{i=1}^{m} (θ_0 + θ_1 x^(i) − y^(i))^2 ]
• j = 0:  ∂J(θ_0, θ_1)/∂θ_0 = (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
• j = 1:  ∂J(θ_0, θ_1)/∂θ_1 = (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x^(i)
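As a sanity check on these two formulas, a short sketch compares the analytic partial derivatives with finite-difference approximations of J (the data and θ values are arbitrary):

```python
import numpy as np

x = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])
m = len(y)

def J(t0, t1):
    return np.sum((t0 + t1 * x - y) ** 2) / (2 * m)

t0, t1, eps = 10.0, 0.1, 1e-4

# Analytic partial derivatives from the formulas above.
d0 = np.sum(t0 + t1 * x - y) / m
d1 = np.sum((t0 + t1 * x - y) * x) / m

# Central finite-difference approximations of the same derivatives.
d0_num = (J(t0 + eps, t1) - J(t0 - eps, t1)) / (2 * eps)
d1_num = (J(t0, t1 + eps) - J(t0, t1 - eps)) / (2 * eps)

print(d0, d0_num)  # the two values should agree closely
print(d1, d1_num)
```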
Gradient descent for linear regression
Repeat until convergence {
    θ_0 := θ_0 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
    θ_1 := θ_1 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x^(i)
}
Update θ_0 and θ_1 simultaneously.
Batch gradient descent
• "Batch": each step of gradient descent uses all m training examples.
Repeat until convergence {
    θ_0 := θ_0 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
    θ_1 := θ_1 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x^(i)
}
(m: number of training examples)
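Putting the update rules together, a minimal batch-gradient-descent sketch for the one-feature housing data; the learning rate, the iteration budget, and the rescaling of the size feature (dividing by 1000 so the updates are stable) are assumptions, not values from the slides.

```python
import numpy as np

# One-feature training data from the table (size rescaled to 1000's of m^2 for stability).
x = np.array([2104.0, 1416.0, 1534.0, 852.0]) / 1000.0
y = np.array([460.0, 232.0, 315.0, 178.0])
m = len(y)

alpha = 0.1                # learning rate (assumed)
theta0, theta1 = 0.0, 0.0

for _ in range(2000):                      # fixed iteration budget instead of a convergence test
    h = theta0 + theta1 * x                # hypothesis on all m examples ("batch")
    grad0 = np.sum(h - y) / m
    grad1 = np.sum((h - y) * x) / m
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1  # simultaneous update

print(theta0, theta1)                      # fitted intercept and slope
print(theta0 + theta1 * 1.5)               # predicted price (in 1000's of $) for a 1500 m^2 house
```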
Training dataset with one feature
Size in m^2 (x)    Price ($) in 1000's (y)
2104               460
1416               232
1534               315
852                178
…                  …

h_θ(x) = θ_0 + θ_1 x
Multiple features (input variables)
Size in m^2 (x_1)   Number of bedrooms (x_2)   Number of floors (x_3)   Age of home in years (x_4)   Price ($) in 1000's (y)
2104                5                          1                        45                           460
1416                3                          2                        40                           232
1534                3                          2                        30                           315
852                 2                          1                        36                           178
…                   …                          …                        …                            …

Notation:
• n = number of features
• x^(i) = input features of the i-th training example
• x_j^(i) = value of feature j in the i-th training example
Examples: x^(2) = ?    x_3^(2) = ?
Multiple features (input variables)
Hypothesis
Previously (one feature): h_θ(x) = θ_0 + θ_1 x
Now (n features): h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + ⋯ + θ_n x_n
Matrix representation
h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + ⋯ + θ_n x_n
• For convenience of notation, define x_0 = 1 (x_0^(i) = 1 for all examples).
• x = [x_0, x_1, …, x_n]^T ∈ R^(n+1),   θ = [θ_0, θ_1, …, θ_n]^T ∈ R^(n+1)
• h_θ(x) = θ_0 x_0 + θ_1 x_1 + ⋯ + θ_n x_n = θ^T x
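A small NumPy sketch of the θ^T x form, using the first row of the multi-feature table with x_0 = 1 prepended; the θ values are arbitrary placeholders.

```python
import numpy as np

# First row of the multi-feature table, with x0 = 1 prepended:
# [x0, size, bedrooms, floors, age]
x = np.array([1.0, 2104.0, 5.0, 1.0, 45.0])

# Placeholder parameter vector theta in R^(n+1) (values are arbitrary).
theta = np.array([80.0, 0.1, 10.0, 3.0, -2.0])

h = theta @ x        # h_theta(x) = theta^T x
print(h)             # same as 80 + 0.1*2104 + 10*5 + 3*1 - 2*45
```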
Gradient descent
• Previously (n = 1):
Repeat until convergence {
    θ_0 := θ_0 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
    θ_1 := θ_1 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x^(i)
}
• New algorithm (n ≥ 1):
Repeat until convergence {
    θ_j := θ_j − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)
}
Simultaneously update θ_j for j = 0, 1, …, n (with x_0^(i) = 1).
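A vectorized sketch of this update over all θ_j at once, using the four rows of the table above; the learning rate, iteration count, and the per-column scaling are assumptions.

```python
import numpy as np

# Design matrix: columns are [x0=1, size, bedrooms, floors, age]; targets are prices.
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([460.0, 232.0, 315.0, 178.0])
m, n_plus_1 = X.shape

# Crude feature scaling (see the next slide) so one learning rate works for all features.
scale = X.max(axis=0)
Xs = X / scale

alpha, theta = 0.3, np.zeros(n_plus_1)
for _ in range(5000):
    grad = Xs.T @ (Xs @ theta - y) / m   # (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i), for every j
    theta = theta - alpha * grad          # simultaneous update of all theta_j

print(theta / scale)                      # parameters expressed on the original feature scale
```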
Gradient descent in practice: Feature scaling
• Idea: make sure features are on a similar scale (e.g., −1 ≤ x_j ≤ 1); see the sketch after this slide.
• E.g., x_1 = size (0–2000 m^2), x_2 = number of bedrooms (1–5).
[Contour plots of the cost function over the parameters: with unscaled features the contours are elongated and gradient descent converges slowly; with scaled features they are more even and descent converges faster.]
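A minimal sketch of one common way to do this (mean normalization to zero mean and unit standard deviation; the slide only asks for features on a similar scale, so this exact recipe is an assumption):

```python
import numpy as np

# Two features on very different scales: size in m^2 and number of bedrooms.
size     = np.array([2104.0, 1416.0, 1534.0, 852.0])
bedrooms = np.array([5.0, 3.0, 3.0, 2.0])

def standardize(v):
    """Rescale a feature to zero mean and unit standard deviation."""
    return (v - v.mean()) / v.std()

print(standardize(size))       # values now roughly in [-1.5, 1.5]
print(standardize(bedrooms))   # same rough range, so gradient descent treats both features evenly
```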
Gradient descent in practice: Learning rate
• α too small: slow convergence.
• α too large: may not converge.
• To choose α, try a range of values, e.g., 0.001, …, 0.01, …, 0.1, …, 1 (see the sketch below).
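A sketch of this tuning loop: run a few gradient-descent iterations for each candidate α and watch whether the cost decreases. It reuses the one-feature housing setup; the candidate list and iteration count are assumptions.

```python
import numpy as np

x = np.array([2104.0, 1416.0, 1534.0, 852.0]) / 1000.0   # rescaled size
y = np.array([460.0, 232.0, 315.0, 178.0])
m = len(y)

def run_gd(alpha, iters=100):
    """Run `iters` gradient-descent steps and return the final cost J."""
    t0 = t1 = 0.0
    for _ in range(iters):
        h = t0 + t1 * x
        t0, t1 = t0 - alpha * np.sum(h - y) / m, t1 - alpha * np.sum((h - y) * x) / m
    return np.sum((t0 + t1 * x - y) ** 2) / (2 * m)

for alpha in [0.001, 0.01, 0.1, 1.0]:
    print(f"alpha={alpha}: J after 100 iterations = {run_gd(alpha):.3g}")
# Too small an alpha barely reduces J; too large an alpha makes J blow up.
```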
Logistic regression
Logistic regression: An example
[Plot: Malignant? (1 = Yes, 0 = No) against Tumor Size]
This is a classification problem.
If we use linear regression?
[Plot: Malignant? (1 = Yes, 0 = No) against Tumor Size, with a straight line h_θ(x) = θ^T x fit to the data]
• Threshold the classifier output at 0.5:
  • If h_θ(x) ≥ 0.5, predict "y = 1"
  • If h_θ(x) < 0.5, predict "y = 0"
If we use linear regression?
Classification: y = 1 or y = 0
h_θ(x) = θ^T x (from linear regression) can be > 1 or < 0
Logistic regression: 0 ≤ h_θ(x) ≤ 1
Despite its name, logistic regression is a classification algorithm.
Classification
• Learn h: X -> Y
  – X: features
  – Y: target classes
• Suppose you know P(Y|X) exactly; how should you classify?
  – Bayes classifier: y* = h_Bayes(x) = argmax_y P(Y = y | X = x)
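A tiny sketch of the Bayes classifier rule for a case where P(Y|X = x) is known and given as a table; the class names and probabilities are made up for illustration.

```python
# Hypothetical known conditional distribution P(Y = y | X = x) for one input x.
p_y_given_x = {"benign": 0.3, "malignant": 0.7}

# Bayes classifier: pick the class with the highest conditional probability.
y_star = max(p_y_given_x, key=p_y_given_x.get)
print(y_star)   # "malignant"
```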
Generative vs. Discriminative Classifiers - Intuition
• Generative classifier, e.g., Naïve Bayes:
  • Assume some functional form for P(X|Y) and P(Y)
  • Estimate the parameters of P(X|Y) and P(Y) directly from the training data
  • Use Bayes' rule to calculate P(Y|X = x)
  • This is a 'generative' model:
    • Indirect computation of P(Y|X) through Bayes' rule
    • A probabilistic model of each class
• Discriminative classifier, e.g., Logistic Regression:
  • Assume some functional form for P(Y|X)
  • Estimate the parameters of P(Y|X) directly from the training data
  • This is a 'discriminative' model:
    • Directly learns P(Y|X)
    • Focuses on the decision boundary
Logistic Regression: Hypothesis
• In logistic regression, we learn the conditional distribution P(y|x).
• Let p_y(x; θ) be our estimate of P(y|x), where θ is a vector of adjustable parameters.
• Assume there are two classes, y = 0 and y = 1, and
    p_1(x; θ) = 1 / (1 + e^(−θ^T x))
    p_0(x; θ) = 1 − 1 / (1 + e^(−θ^T x))
• This is equivalent to
    log( p_1(x; θ) / p_0(x; θ) ) = θ^T x
• That is, the log odds of class 1 is a linear function of x.
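A quick numerical check of this equivalence; the θ and x values below are arbitrary.

```python
import numpy as np

theta = np.array([-1.0, 0.8, 0.5])   # arbitrary parameters (theta_0 is the intercept)
x     = np.array([1.0, 2.0, -1.0])   # arbitrary input with x_0 = 1

z  = theta @ x                        # theta^T x
p1 = 1.0 / (1.0 + np.exp(-z))         # p_1(x; theta)
p0 = 1.0 - p1                         # p_0(x; theta)

print(np.log(p1 / p0), z)             # the log odds equals theta^T x (up to rounding)
```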
Hypothesis representation
• Want 0 ≤ h_θ(x) ≤ 1
• h_θ(x) = g(θ^T x), where g(z) = 1 / (1 + e^(−z))
• g is called the sigmoid function (or logistic function)
[Plot: the sigmoid g(z) against z]
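A minimal sigmoid sketch, just to make the shape of g concrete:

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

for z in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    print(z, sigmoid(z))
# g(0) = 0.5; g(z) -> 0 as z -> -inf and g(z) -> 1 as z -> +inf.
```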
Interpretation of hypothesis output
• h_θ(x) = estimated probability that y = 1 on input x
• Example: if x = [x_0; x_1] = [1; tumorSize] and h_θ(x) = 0.7,
  tell the patient there is a 70% chance of the tumor being malignant.
Logistic regression
h_θ(x) = g(θ^T x), where g(z) = 1 / (1 + e^(−z)) and z = θ^T x
[Plot: the sigmoid g(z); g(z) ≥ 0.5 exactly when z ≥ 0]
Suppose we predict "y = 1" if h_θ(x) ≥ 0.5, i.e. when z = θ^T x ≥ 0,
and predict "y = 0" if h_θ(x) < 0.5, i.e. when z = θ^T x < 0.
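A sketch of this decision rule; since g(z) ≥ 0.5 exactly when z ≥ 0, the prediction only needs the sign of θ^T x. The parameters and inputs below are arbitrary.

```python
import numpy as np

def predict(theta, x):
    """Predict y = 1 if h_theta(x) = g(theta^T x) >= 0.5, i.e. if theta^T x >= 0."""
    return 1 if theta @ x >= 0 else 0

theta = np.array([-3.0, 1.0, 1.0])                  # arbitrary parameters
print(predict(theta, np.array([1.0, 1.0, 1.0])))    # theta^T x = -1 -> predict 0
print(predict(theta, np.array([1.0, 2.0, 2.0])))    # theta^T x = +1 -> predict 1
```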
Decision boundary
[Plot: training examples plotted by Tumor Size (x_1) and Age (x_2), separated by a linear decision boundary]
E.g., h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2)
• Predict "y = 1" if θ_0 + θ_1 x_1 + θ_2 x_2 ≥ 0; the set of points where θ_0 + θ_1 x_1 + θ_2 x_2 = 0 is the decision boundary.
Cost Function
Training set with m examples: { (x^(1), y^(1)), (x^(2), y^(2)), …, (x^(m), y^(m)) }
x = [x_0, x_1, …, x_n]^T ∈ R^(n+1),  x_0 = 1,  y ∈ {0, 1}
h_θ(x) = 1 / (1 + e^(−θ^T x))
How do we choose the parameters θ?
Reminder: Cost function for Linear Regression
J(θ) = (1/(2m)) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2 = (1/m) Σ_{i=1}^{m} Cost(h_θ(x^(i)), y^(i)),
where Cost(h_θ(x), y) = (1/2) (h_θ(x) − y)^2.
(With the sigmoid hypothesis this squared-error cost is non-convex in θ, which is why logistic regression uses a different cost function.)
Cost function for Logistic Regression
Cost(h_θ(x), y) = −log(h_θ(x))        if y = 1
                  −log(1 − h_θ(x))    if y = 0
[Plots of −log(h_θ(x)) and −log(1 − h_θ(x)) for h_θ(x) between 0 and 1]
Logistic regression cost function
• Cost(h_θ(x), y) = −log(h_θ(x))        if y = 1
                    −log(1 − h_θ(x))    if y = 0
• Written in one line: Cost(h_θ(x), y) = −y log(h_θ(x)) − (1 − y) log(1 − h_θ(x))
  • If y = 1: Cost(h_θ(x), y) = −log(h_θ(x))
  • If y = 0: Cost(h_θ(x), y) = −log(1 − h_θ(x))
Logistic regression
J(θ) = (1/m) Σ_{i=1}^{m} Cost(h_θ(x^(i)), y^(i))
     = −(1/m) Σ_{i=1}^{m} [ y^(i) log(h_θ(x^(i))) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]
Learning: fit the parameters θ by solving min_θ J(θ).
Prediction: given a new x, output h_θ(x) = 1 / (1 + e^(−θ^T x)).
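A minimal sketch of J(θ) for logistic regression on a tiny made-up dataset; the data and θ values are arbitrary, and a small clip keeps log(0) out of the way.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, eps=1e-12):
    """Cross-entropy cost J(theta) = -(1/m) sum[y log h + (1 - y) log(1 - h)]."""
    h = sigmoid(X @ theta)
    h = np.clip(h, eps, 1.0 - eps)        # avoid log(0)
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

# Tiny made-up dataset: one feature plus the x0 = 1 column.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])
y = np.array([0.0, 0.0, 1.0, 1.0])

print(cost(np.array([0.0, 0.0]), X, y))    # log(2) ~ 0.693 for an uninformed theta
print(cost(np.array([-4.0, 2.0]), X, y))   # a theta that separates the data has lower cost
```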
Reference
• Andrew Y. Ng. Machine Learning. Stanford University.