Lecture III.3 : Logistic Regression
III.3.1. Introduction to Classification
Let’s now talk about the classification problem. This is just like the regression problem, except that the value y we now want to predict is a discrete random variable instead of a continuous one.
We can state our classification problem like this:
Given a training dataset:
D = {(x1 , y1 ); (x2 , y2 ); … ; (xn , yn )}
where:
xi are the input variables;
yi are the corresponding labels, which belong to a value set T (T can be {1; 2; …; n} or some other discrete value set).
Our goal is to find a prediction function f(x) that accurately predicts the label y (which belongs to the value set T) of an unseen datapoint x.
Let’s begin with the binary classification problem, where y can only take two values, 0 and 1. For instance, if we are trying to build a spam classifier for email, then x(i) may be some features of an email, and y is 1 if it’s a spam email and 0 otherwise.
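As a purely illustrative sketch, a tiny binary-labeled dataset for the spam example might look like this in Python; the specific feature encoding is an assumption for illustration, not part of the lecture:

```python
# Hypothetical toy dataset for binary classification (spam filtering).
# Each feature vector could encode e.g. [num_links, num_shouting_words];
# y = 1 marks spam, y = 0 marks a normal email.
dataset = [
    ([5.0, 12.0], 1),  # many links and shouting words -> spam
    ([0.0, 1.0], 0),   # a normal email
    ([7.0, 9.0], 1),
    ([1.0, 0.0], 0),
]

xs = [x for x, _ in dataset]  # input variables x_i
ys = [y for _, y in dataset]  # labels y_i
labels = set(ys)              # the value set T = {0, 1}
```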
III.3.2. Logistic Regression
Let’s begin by choosing a new hypothesis function hθ(x). In this model, our hypothesis function will be:

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$

where:

$$g(z) = \frac{1}{1 + e^{-z}}$$

This function is called the logistic function or the sigmoid function.
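The sigmoid and the hypothesis hθ(x) = g(θᵀx) can be sketched in a few lines of Python (a minimal illustration; the function names are our own):

```python
import math

def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def h(theta: list, x: list) -> float:
    """Hypothesis h_theta(x) = g(theta^T x)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))
```

Note that g(0) = 0.5, g(z) tends toward 1 as z grows large, and toward 0 as z grows very negative, so the output can always be read as a probability.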
III.3.2.1. Why choose the logistic function?
Let’s compare the graphs of several candidate activation functions.
The yellow one represents linear regression. This line is unbounded, so it’s not suitable for this problem (though we could fix that with a simple clipping rule: if y > 1, set y = 1, and if y < 0, set y = 0). However, this is still not a good choice, as linear regression is sensitive to noise.
Here is an example: [figure: a linear fit skewed by noisy points]
The red one represents the hard threshold (which looks close to PLA). PLA also doesn’t work efficiently on this problem, since our data isn’t linearly separable (this will be discussed later).
Therefore, the blue and green lines seem much more suitable for our problem.
III.3.2.2. Logistic Regression under Probabilistic Interpretation
In this section, we are working with a binary classification problem, so let’s
assume that:
$$P(y = 1 \mid x; \theta) = h_\theta(x)$$
$$P(y = 0 \mid x; \theta) = 1 - h_\theta(x)$$
Note that this can be written more compactly as
$$p(y \mid x; \theta) = (h_\theta(x))^{y}\,(1 - h_\theta(x))^{1-y}$$
Assuming that the m training examples were generated independently, we can write the likelihood of the parameters as:

$$L(\theta) = p(\vec{y} \mid X; \theta) = \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}; \theta) = \prod_{i=1}^{m} \left(h_\theta(x^{(i)})\right)^{y^{(i)}} \left(1 - h_\theta(x^{(i)})\right)^{1 - y^{(i)}}$$
As before, it will be easier to maximize the log-likelihood:

$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h(x^{(i)})\right) \right]$$
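The log-likelihood translates directly into code; here is a minimal Python sketch (function names are our own, not from the notes):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(theta, xs, ys):
    """l(theta) = sum_i [ y_i * log h(x_i) + (1 - y_i) * log(1 - h(x_i)) ]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total
```

Each term is the log of a probability, so ℓ(θ) is always ≤ 0, and maximizing it pushes the predicted probabilities toward the observed labels.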
How do we maximize the likelihood ? Similar to our derivation in the case of linear
regression, we can use gradient ascent.
Our update formula:
$$\theta_j := \theta_j + \alpha \frac{\partial}{\partial \theta_j} \ell(\theta)$$
Note that we are maximizing our function rather than minimizing it; therefore, the update moves in the direction of the gradient (that is, along the direction of steepest ascent).
Let’s start by working with just one training example (x, y) and take derivatives to derive the stochastic gradient ascent rule:

$$\frac{\partial}{\partial \theta_j} \ell(\theta) = \left( y \cdot \frac{1}{g(\theta^T x)} - (1 - y) \cdot \frac{1}{1 - g(\theta^T x)} \right) \cdot \frac{\partial}{\partial \theta_j} g(\theta^T x)$$
Here, we will use a useful property of the logistic function:

$$g'(z) = g(z)\,(1 - g(z))$$
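This identity can be sanity-checked numerically against a central finite difference (an illustrative sketch, not part of the original notes):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Analytic derivative: g'(z) = g(z) * (1 - g(z))."""
    g = sigmoid(z)
    return g * (1.0 - g)

def numeric_grad(f, z, eps=1e-6):
    """Central finite-difference approximation of f'(z)."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)
```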
Therefore:

$$\frac{\partial}{\partial \theta_j} g(\theta^T x) = g(\theta^T x)\,(1 - g(\theta^T x))\,x_j$$
Then :
$$\frac{\partial}{\partial \theta_j} \ell(\theta) = y\,(1 - g(\theta^T x))\,x_j - (1 - y)\,g(\theta^T x)\,x_j = (y - g(\theta^T x))\,x_j = (y - h_\theta(x))\,x_j$$
In conclusion, our stochastic gradient ascent rule is :
$$\theta_j := \theta_j + \alpha\left(y^{(i)} - h_\theta(x^{(i)})\right) x_j^{(i)}$$
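Putting the rule together, a minimal stochastic gradient ascent loop might look like this in Python (the learning rate, epoch count, and toy dataset are arbitrary illustrative choices, not from the notes):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sga_logistic(xs, ys, alpha=0.1, epochs=200, seed=0):
    """Stochastic gradient ascent for logistic regression:
    theta_j := theta_j + alpha * (y_i - h_theta(x_i)) * x_i_j."""
    rng = random.Random(seed)
    theta = [0.0] * len(xs[0])
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)  # visit training examples in random order
        for i in idx:
            h = sigmoid(sum(t * xj for t, xj in zip(theta, xs[i])))
            for j in range(len(theta)):
                theta[j] += alpha * (ys[i] - h) * xs[i][j]
    return theta

# Toy linearly separable data, with an intercept feature x_0 = 1.
xs = [[1.0, 2.0], [1.0, 3.0], [1.0, -2.0], [1.0, -3.0]]
ys = [1, 1, 0, 0]
theta = sga_logistic(xs, ys)
preds = [1 if sigmoid(sum(t * xj for t, xj in zip(theta, x))) > 0.5 else 0
         for x in xs]
```

Each inner step moves θ a little in the direction that makes the observed label more probable; the residual (y − hθ(x)) shrinks as the fit improves.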
This looks identical to the LMS update rule; however, it is not the same algorithm, because $h_\theta(x^{(i)})$ is now defined as a non-linear function of $\theta^T x^{(i)}$.
It’s surprising that we end up with the same update rule for two different learning algorithms and learning problems.
Is this a coincidence, or is there a deeper reason behind it?
We’ll answer this when we get to GLMs (Generalized Linear Models).