Rahul Narayanan - Generalizedlinearmodel

The document provides an overview of Generalized Linear Models (GLMs), detailing their definition, components, and applications in regression and classification. It explains the differences between normal and binomial distributions, as well as the use of link functions in establishing relationships between variables. Additionally, it includes examples of simple linear regression and logistic regression to illustrate the concepts discussed.

Uploaded by

morilloatilio


Generalized Linear Model

By
Rahul Narayanan
Agenda

• Refresher

• Definition of Generalized Linear Model

• What is a Normal Distribution

• What is a Linear Model

• Linear Modelling for Regression (Simple Linear Regression)

• Linear Modelling for Classification (Logistic Regression)

• Generalizing Linear Modelling of Classification and Regression using GLM

• Some GLMs
Refresher :

Types of ML :
• Supervised (GLM belongs here)
• Unsupervised
• Reinforcement

Classification
• Response/output/dependent variable: categorical (or) discrete
• Examples: Yes/No, Survived/Dead, Lion/Tiger/Cheetah etc.

Regression
• Response/output/dependent variable: continuous (-∞ to +∞)
• Examples: 100.70, 25, -75.25
Quiz :

Question 1 :

Suppose you are working on a weather prediction model, and you would like to predict whether or not it will be raining at 5pm tomorrow.

Is this a Classification or a Regression problem ?

Ans : Classification

It will rain - 1

It will not rain - 0

Quiz :

Question 2 :

The HR department of an organization wants a salary prediction tool to decide the salary of a new employee based on his/her experience.

Is this a Classification or a Regression problem ?

Ans : Regression

Independent variable --> Experience in years

Dependent variable --> Salary

Quiz :

Question 3 :
Weight of car (kg) Engine capacity (litre) Mileage (kmpl)
890 1.2 21
1200 1.6 19
920 2.2 15
700 1.0 22

Is this a Classification or a Regression problem ?

Ans : Regression

Independent variable --> Weight of the car, Engine capacity

Dependent variable --> Mileage

Generalized Linear Model

Definition :
The Generalized Linear Model (GLM) extends the General Linear Model. It relates the expected value of the dependent variable to a linear combination of the independent variables via a specified link function. Moreover, the model allows the dependent variable to have a non-normal distribution.

There are three components to a GLM :

1. Random Component

2. Systematic Component

3. Link Function
Normal Distribution

Definition :
A Normal Distribution is a distribution of data in which most of the values cluster in the middle (around the mean) and the rest of the values fall away symmetrically on both sides of the mean.

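Example
• Height of humans
• Salaries of employees

[Figure: bell curve of human heights, with axis marks 4.6 (-3σ), 4.9 (-2σ), 5.2 (-1σ), 5.5 (µ), 5.8 (+1σ), 6.1 (+2σ), 6.4 (+3σ)]

• 68.2% of the values lie within ±1σ of the mean
• 95.4% of the values lie within ±2σ of the mean
• 99.7% of the values lie within ±3σ of the mean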
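The three percentages can be checked numerically. The sketch below is only an illustration: it assumes a perfectly normal variable and uses the standard identity P(|X - µ| < kσ) = erf(k/√2). The helper name within_k_sigma is made up for this example.

```python
import math

def within_k_sigma(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints the share of values inside mean ± k*sigma as a percentage
    print(f"within ±{k}σ: {within_k_sigma(k) * 100:.2f}%")
```

The exact value for ±1σ is about 68.27%, which the slide shows as 68.2%.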
Linear Model

Definition :
A Linear model is one in which a constant change in input/Independent variable results in a constant change
in output/Dependent variable.
Example 1 (exactly linear):
X 1 3 5 7 9 (each step: +2)
Y 10 20 30 40 50 (each step: +10)

Example 2 (approximately linear):
X 1 3 5 7 9 (each step: +2)
Y 4.8 10 15.3 20.2 25.3 (steps: +5.2, +5.3, +4.9, +5.1, i.e. ≈ +5)
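A quick way to test the "constant change" idea is to compute the step between consecutive values. This is a minimal sketch using the two small tables from this slide; the helper name steps is an assumption for illustration.

```python
def steps(values):
    """Differences between consecutive values, rounded to hide floating-point noise."""
    return [round(b - a, 2) for a, b in zip(values, values[1:])]

x = [1, 3, 5, 7, 9]
y_exact = [10, 20, 30, 40, 50]
y_noisy = [4.8, 10, 15.3, 20.2, 25.3]

print(steps(x))        # constant step in the input
print(steps(y_exact))  # constant step in the output: exactly linear
print(steps(y_noisy))  # steps close to 5: approximately linear
```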

Linear Modelling
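x y
0 0
1 2
2 4
3 6
4 8
5 10

Equation of line: y = mx + b, where m is the slope and b is the y-intercept.
For the data above: y = 2x + 0, i.e. y = 2x

Types of slope: (+)ve slope, (-)ve slope, infinite slope, no slope

[Figure: plot of Y against X for the points above, with the slope and the y-intercept marked]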
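As a small illustration, the slope and y-intercept can be recovered from any two points of the table and then checked against the rest. This is a sketch, not part of the original slides; the names points and predict are chosen for the example.

```python
points = [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]

# Slope = change in y divided by change in x, taken from the first and last point
(x0, y0), (x1, y1) = points[0], points[-1]
m = (y1 - y0) / (x1 - x0)
b = y0 - m * x0  # y-intercept: the value of y when x = 0

def predict(x):
    """Evaluate the line y = mx + b."""
    return m * x + b

# Every point in the table lies on the recovered line
on_line = all(predict(x) == y for x, y in points)
print(f"y = {m}x + {b}, all points on the line: {on_line}")
```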
Simple Linear Regression (Linear Modelling technique for Regression)

Problem:
As a hotel owner you want to predict the tip amount ($) of a meal for any given bill amount. Therefore one evening you collect data for six meals.

Meal # Tip amount ($)
1 5
2 17
3 11
4 8
5 14
6 5

Unfortunately, when you begin to look at your data, you realize you only collected data for the tip amount and not the meal amount (total bill). So this is the best data you have.
Simple Linear Regression (contd)
Meal # Tip amount ($) Distance from best fit line
1 5 -5
2 17 +7
3 11 +1
4 8 -2
5 14 +4
6 5 -5

[Figure: tip amount vs. meal #, with the best fit line drawn at the mean tip amount]

ȳ = 10, so the best fit line is y = 0x + 10
Sum of squared error (SSE) = (-5)² + 7² + 1² + (-2)² + 4² + (-5)² = 120
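The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch using the six tip amounts from the table; the variable names are chosen for illustration.

```python
tips = [5, 17, 11, 8, 14, 5]

# With no independent variable, the best constant prediction is the mean
mean_tip = sum(tips) / len(tips)

# Distance of each observation from the mean line
residuals = [t - mean_tip for t in tips]

# Sum of squared errors around the mean line
sse = sum(r ** 2 for r in residuals)

print(mean_tip, residuals, sse)
```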
Simple Linear Regression (contd)
Total Bill Amount ($) Tip amount ($)
34 5
108 17
64 11
88 8
99 14
51 5

[Figure: tip amount vs. bill amount, with the line y = 0x + 10]
Simple Linear Regression (contd)
[Figure: tip amount vs. bill amount for the same six meals, with the line y = 0.08x + 6.2]
Simple Linear Regression (contd)
[Figure: tip amount vs. bill amount for the same six meals, with the line y = 0.11x + 1.8]
Simple Linear Regression (contd)
[Figure: tip amount vs. bill amount for the same six meals, with the line y = 0.14x – 0.81]
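One way to see why each successive line is a better fit is to compute the SSE of every candidate line from these slides. The sketch below uses the rounded slopes and intercepts exactly as printed, so the last value will differ slightly from an SSE computed with unrounded coefficients; the function name sse is chosen for this example.

```python
bills = [34, 108, 64, 88, 99, 51]
tips = [5, 17, 11, 8, 14, 5]

# (slope, intercept) pairs of the lines shown on the slides, in order
candidates = [(0.0, 10.0), (0.08, 6.2), (0.11, 1.8), (0.14, -0.81)]

def sse(slope, intercept):
    """Sum of squared errors of the line y = slope * x + intercept on the data."""
    return sum((t - (slope * b + intercept)) ** 2 for b, t in zip(bills, tips))

sse_values = [sse(m, b) for m, b in candidates]
for (m, b), value in zip(candidates, sse_values):
    print(f"y = {m}x + {b}: SSE = {value:.2f}")
```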
• By tuning the slope and intercept we find the best fit line for our data (SSE = 30.075)
• How do you tune? By using the Gradient Descent algorithm
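The tuning step can be sketched with a plain gradient descent loop on the mean squared error. This is an illustrative sketch, not the presenter's implementation: the learning rate, the iteration count, and the choice to standardize the bill amounts before fitting (so that one learning rate works for both parameters) are all assumptions.

```python
import math

bills = [34, 108, 64, 88, 99, 51]
tips = [5, 17, 11, 8, 14, 5]
n = len(bills)

# Standardize x (assumption: keeps gradient descent stable with a simple learning rate)
mean_x = sum(bills) / n
std_x = math.sqrt(sum((x - mean_x) ** 2 for x in bills) / n)
z = [(x - mean_x) / std_x for x in bills]

a, c = 0.0, 0.0        # slope and intercept in standardized units
learning_rate = 0.1    # assumed value
for _ in range(1000):  # assumed number of iterations
    errors = [a * zi + c - y for zi, y in zip(z, tips)]
    grad_a = 2 / n * sum(e * zi for e, zi in zip(errors, z))  # d(MSE)/da
    grad_c = 2 / n * sum(errors)                               # d(MSE)/dc
    a -= learning_rate * grad_a
    c -= learning_rate * grad_c

# Convert back to the original units of the bill amount
slope = a / std_x
intercept = c - a * mean_x / std_x
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(bills, tips))
print(f"y = {slope:.4f}x + {intercept:.4f}, SSE = {sse:.3f}")
```

For a model this small the same line can also be obtained in closed form with ordinary least squares; gradient descent is shown because it is the method the slide names.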

How do we interpret y = 0.14x – 0.81?

Logistic Regression (Linear Modelling technique for Classification)
Problem:
We have collected a sample dataset of people's ages and whether they subscribed to a magazine or not. Let's come up with a model that, given a person's age, predicts whether they will subscribe to the magazine or not.

Age in years Subscribed
18 0
22 0
27 1
31 1
24 0
42 1

Subscribed - 1, Not Subscribed - 0

Can I use the same technique of regression (fitting a line) that we learned so far to solve this?
No

Why?
• Data is categorical in nature
• Non-normal distribution [Binomial distribution]
• No linear relationship between age and subscription

But let's try
Logistic Regression
[Figure: subscribed vs. age for the six people above, with the fitted line y = mx + b]

Age in years Probability (p)
18 0.23
22 0.30
27 0.72
31 0.81
24 0.29
42 0.88
38 1.47 X (not a valid probability: greater than 1)
17 -0.20 X (not a valid probability: less than 0)

How do we solve this?
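The problem with a straight line can be reproduced on the six data points. The sketch below fits an ordinary least squares line to the 0/1 labels; this fitting method is an assumption, and its predictions will not match the illustrative probabilities in the table above, but it shows the same failure for ages 38 and 17.

```python
ages = [18, 22, 27, 31, 24, 42]
subscribed = [0, 0, 1, 1, 0, 1]
n = len(ages)

mean_x = sum(ages) / n
mean_y = sum(subscribed) / n

# Closed-form least squares estimates for the line y = mx + b
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, subscribed))
         / sum((x - mean_x) ** 2 for x in ages))
intercept = mean_y - slope * mean_x

def predict(age):
    """Value of the fitted line at a given age (not guaranteed to be a probability)."""
    return slope * age + intercept

for age in (38, 17):
    p = predict(age)
    print(f"age {age}: {p:.3f} (valid probability: {0 <= p <= 1})")
```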
Trick Intuition

[Figure: two panels of subscribed vs. age]
Trick 1
[Figure: two panels of subscribed vs. age. Left: y = mx + b, which ranges from -∞ to +∞. Right: y = e^(mx + b), which ranges from 0 to +∞.]

How do you ensure non-negativity of a number?

• Absolute value of a number: |-5| → +ve

• Squaring a number: (-5)² → +ve

• Exponential form of a number: e⁻⁵ → +ve
Trick 2
[Figure: two panels of subscribed vs. age, with 0.5 marked in the right panel. Left: y = e^(mx + b), which ranges from 0 to +∞. Right: y = e^(mx + b) / (1 + e^(mx + b)), which ranges from 0 to 1.]

How do you ensure any number is <= 1?
• By dividing it by a number that is greater than it: 5/(5+1) = 0.833 → <= 1

This is called a Sigmoid Function.

E(Y) => P = e^(mx + b) / (1 + e^(mx + b))
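Both tricks can be checked numerically. In this sketch the slope m and intercept b are hypothetical values chosen only for illustration; they are not fitted to the magazine data.

```python
import math

def sigmoid(z):
    """Trick 1 (exponentiate) followed by Trick 2 (divide by one more than itself)."""
    return math.exp(z) / (1 + math.exp(z))

m, b = 0.2, -5.0  # hypothetical slope and intercept, for illustration only

for age in (17, 25, 38, 60):
    z = m * age + b  # linear part: can be any real number
    print(f"age {age}: mx + b = {z:+.1f}, e^(mx + b) = {math.exp(z):.3f}, p = {sigmoid(z):.3f}")
```

Note that mx + b = 0 gives p = 0.5, and the same function is often written in the equivalent form 1 / (1 + e^-(mx + b)).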
Linear Model Constraint

Linear Modelling technique for Regression

• Normal Distribution
• E(Y) = mx + b

E(Y) = 0.14x – 0.81, i.e. we can explain the prediction: for every $1 the bill amount increases, we would expect the tip amount to increase by $0.14, or about 15 cents.

This is the most important constraint of a Linear model.

Linear Modelling technique for Classification

• Binomial Distribution
• E(Y) = e^(mx + b) / (1 + e^(mx + b))
• E(Y) ≠ mx + b

i.e. we cannot explain the prediction as a linear combination of independent variables.
Generalized Linear Model

Framework for Generalization

• Random Component: explains the distribution of our dependent variable
• Link Function: establishes the relationship between the random and systematic components
• Systematic Component: explains the dependent variable as a linear combination of independent variables
Solve Linear Model Constraint using GLM

Linear Modelling technique for Regression

• Normal Distribution
• E(Y) = mx + b

E(Y) = 0.14x – 0.81, i.e. we can explain the prediction: for every $1 the bill amount increases, we would expect the tip amount to increase by $0.14, or about 15 cents.

Link Function: Identity Function
I(E(Y)) = mx + b

Linear Modelling technique for Classification

• Binomial Distribution
• E(Y) ≠ mx + b
• E(Y) = e^(mx + b) / (1 + e^(mx + b))

i.e. we cannot explain the prediction as a linear combination of independent variables.

Link Function: Logit Function
Logit(E(Y)) = mx + b, where Logit(p) = log(p / (1 - p))
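The effect of the logit link can be illustrated numerically: the probability itself does not change by a constant amount per year of age, but its logit does. This is a sketch under assumed values; m and b below are hypothetical, not estimates from the magazine data.

```python
import math

def inverse_logit(eta):
    """Sigmoid: maps the linear predictor mx + b to a probability E(Y)."""
    return math.exp(eta) / (1 + math.exp(eta))

def logit(p):
    """Logit link: maps a probability back to the linear scale (log-odds)."""
    return math.log(p / (1 - p))

m, b = 0.2, -5.0  # hypothetical slope and intercept, for illustration only
ages = [18, 19, 20, 21, 22]

probs = [inverse_logit(m * age + b) for age in ages]
prob_steps = [q - p for p, q in zip(probs, probs[1:])]
logit_steps = [logit(q) - logit(p) for p, q in zip(probs, probs[1:])]

print("steps in E(Y):       ", [round(s, 4) for s in prob_steps])   # not constant
print("steps in Logit(E(Y)):", [round(s, 4) for s in logit_steps])  # constant, equal to m
```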

Generalized Linear Model

There are three components to a GLM :

1. Random Component

2. Systematic Component

3. Link Function
Some of the Generalized Linear Models

• Logistic Regression: Logit(E(Y)) = mx + b

• Probit Regression: Probit(E(Y)) = mx + b

• Poisson Regression: log(E(Y)) = mx + b

• Linear Regression: E(Y) = mx + b, i.e. I(E(Y)) = mx + b
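The four link functions listed above can be written side by side together with their inverses. This is a minimal sketch using only the Python standard library; the dictionary layout and the test value of the linear predictor are assumptions for illustration, and the probit link is taken to be the inverse of the standard normal cumulative distribution function.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# name -> (link applied to E(Y), inverse link applied to mx + b)
links = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda mu: math.log(mu / (1 - mu)), lambda eta: 1 / (1 + math.exp(-eta))),
    "probit": (std_normal.inv_cdf, std_normal.cdf),
    "log": (math.log, math.exp),
}

eta = 0.7  # an arbitrary value of the linear predictor mx + b
round_trips = {}
for name, (link, inverse) in links.items():
    mu = inverse(eta)             # expected value E(Y) implied by the linear predictor
    round_trips[name] = link(mu)  # applying the link recovers mx + b
    print(f"{name:8s} E(Y) = {mu:.4f}, link(E(Y)) = {round_trips[name]:.4f}")
```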
References

• http://www.statisticshowto.com/probability-and-statistics/normal-distributions/
• https://machinelearningmastery.com/simple-linear-regression-tutorial-for-machine-learning/
• https://www.analyticsvidhya.com/blog/2015/11/beginners-guide-on-logistic-regression-in-r/
• https://www.youtube.com/watch?v=zAULhNrnuL4
• https://www.youtube.com/watch?v=W3OaWyHEPv0
Thank You
