LECTURE 3
REGRESSION ANALYSIS
- MULTIPLE REGRESSION
AGENDA
Basic Concepts of Multiple Linear Regression
Data Analysis Using Multiple Regression Models
Measures of Variation and Statistical Inference
FORMULATION OF MULTIPLE REGRESSION MODEL
A multiple regression model relates one dependent variable to two or
more independent variables in a linear function
𝑌 = 𝛽₀ + 𝛽₁𝑋₁ + 𝛽₂𝑋₂ + ⋯ + 𝛽_K 𝑋_K + 𝜀
where 𝑌 is the dependent variable, 𝑋₁, …, 𝑋_K are the independent variables,
𝛽₀ is the population intercept, 𝛽₁, …, 𝛽_K are the population slope
coefficients, and 𝜀 is the random error
𝑏₀, 𝑏₁, 𝑏₂, …, 𝑏_K are used to represent the sample intercept and sample slope
coefficients
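The lecture estimates this model in Excel, but the least-squares fit can be sketched directly. A minimal stdlib-only illustration, fitting 𝑌 = 𝑏₀ + 𝑏₁𝑋₁ + 𝑏₂𝑋₂ via the normal equations (𝑋ᵀ𝑋)𝑏 = 𝑋ᵀ𝑌; the data and the function name `fit_ols` are made up for illustration, and the data follow an exact linear relation so the fit recovers the coefficients:

```python
# Minimal least-squares fit of Y = b0 + b1*X1 + b2*X2 via the normal
# equations (X'X)b = X'Y, using only the standard library.
def fit_ols(rows, y):
    X = [[1.0] + list(r) for r in rows]            # prepend intercept column
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(len(X))) for b in range(p)]
           for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(len(X))) for a in range(p)]
    # Gaussian elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(XtX[r][c]))
        XtX[c], XtX[piv] = XtX[piv], XtX[c]
        Xty[c], Xty[piv] = Xty[piv], Xty[c]
        for r in range(c + 1, p):
            f = XtX[r][c] / XtX[c][c]
            for k in range(c, p):
                XtX[r][k] -= f * XtX[c][k]
            Xty[r] -= f * Xty[c]
    # Back substitution
    b = [0.0] * p
    for r in range(p - 1, -1, -1):
        b[r] = (Xty[r] - sum(XtX[r][k] * b[k] for k in range(r + 1, p))) / XtX[r][r]
    return b

# Synthetic data generated from Y = 0.5 + 2.0*X1 + 1.5*X2 (no noise),
# so the estimated coefficients should match exactly.
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 2)]
y = [0.5 + 2.0 * x1 + 1.5 * x2 for x1, x2 in rows]
b0, b1, b2 = fit_ols(rows, y)
```

In practice one would use Excel's Regression tool (as in this lecture) or a statistics library rather than hand-rolling the linear algebra; the sketch only shows what the estimation computes.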
FORMULATION OF MULTIPLE REGRESSION MODEL
Coefficients in a multiple regression net out the impact of each independent
variable in the regression equation
The estimated slope coefficient, 𝑏ⱼ, measures the change in the average value of
𝑌 as a result of a one-unit increase in 𝑋ⱼ, holding all other independent variables
constant – the “ceteris paribus” effect
EXAMPLE
Recall the example in the last topic: we wish to find possible factors
affecting taxi tips in NYC. The relationship between the taxi fare and the size of
the tip was estimated using a 2-variable regression model.
Today we wish to include more factors that could possibly affect tips:
Area
Number of riders
Holiday seasons
……
MULTIPLE LINEAR REGRESSION
Fill in the pop-up box:
MULTIPLE LINEAR REGRESSION
Excel’s Output:
MULTIPLE LINEAR REGRESSION
The estimated multiple regression equation:
𝑌̂ = 1.2649 − 0.9216𝑋₁ + 0.0382𝑋₂ + 17.2675𝑋₃ + 0.0288𝑋₄ + 0.1496𝑋₅
where 𝑌̂ = Taxi tips in NYC in $
𝑋₁ = Area indicator (NYC = 1, JFK = 0)
𝑋₂ = Number of riders
𝑋₃ = High-tipper indicator (High = 1, Normal = 0)
𝑋₄ = New Year's Day indicator (Jan 1st = 1, Others = 0)
𝑋₅ = Pre-tip amount in $
INTERPRETATION OF ESTIMATES
The estimated slope coefficients:
𝑏₁ = −0.9216 says that the estimated average tip decreases by $0.9216 when the trip
area switches from JFK to NYC, holding all other things equal
𝑏₂ = 0.0382 says that the estimated average tip increases by $0.0382 for each
additional rider, holding all other things equal
𝑏₃ = 17.2675 says that the estimated average tip increases by $17.2675 if the rider is
categorized as a high tipper, holding all other things equal
𝑏₄ = 0.0288 says that the estimated average tip increases by $0.0288 if it is New
Year's Day, holding all other things equal
𝑏₅ = 0.1496 says that the estimated average tip increases by $0.1496 for each $1
increase in the pre-tip taxi fare, holding all other things equal
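Plugging a concrete trip into the estimated equation makes the interpretation tangible. The coefficients below are the slide's estimates; the trip scenario (a 2-rider NYC trip, normal tipper, not New Year's Day, $40 pre-tip fare) is made up for illustration:

```python
# Predicted tip from the estimated equation
# Yhat = 1.2649 - 0.9216*X1 + 0.0382*X2 + 17.2675*X3 + 0.0288*X4 + 0.1496*X5
def predicted_tip(area, riders, high_tipper, new_year, pre_tip):
    return (1.2649 - 0.9216 * area + 0.0382 * riders
            + 17.2675 * high_tipper + 0.0288 * new_year
            + 0.1496 * pre_tip)

# Hypothetical trip: NYC area (X1=1), 2 riders, normal tipper,
# an ordinary day, $40 pre-tip fare
tip = predicted_tip(area=1, riders=2, high_tipper=0, new_year=0, pre_tip=40)
# 1.2649 - 0.9216 + 0.0764 + 5.984 = about $6.40
```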
COMPARISON OF MODELS
Suppose we run another linear regression model using only the pre-tip taxi fare
and the number of riders as independent variables
EVALUATE THE MODEL
𝑟² and adjusted 𝑟²
F-test for overall model significance
t-test for the significance of a particular 𝑋-variable
MEASURES OF VARIATION
Total variation of the 𝑌-variable is made up of two parts: SST = SSR + SSE
where
SST = ∑(𝑌ᵢ − 𝑌̄)²  (total sum of squares)
SSR = ∑(𝑌̂ᵢ − 𝑌̄)²  (regression sum of squares)
SSE = ∑(𝑌ᵢ − 𝑌̂ᵢ)²  (error sum of squares)
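The decomposition SST = SSR + SSE holds exactly for any least-squares fit that includes an intercept. A quick numerical check on a small simple-regression example (the data are made up):

```python
# Verify SST = SSR + SSE on a small simple-regression example
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Least-squares slope and intercept (simple regression closed form)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

SST = sum((yi - ybar) ** 2 for yi in y)                 # total variation
SSR = sum((yh - ybar) ** 2 for yh in yhat)              # explained variation
SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))    # unexplained variation
# SST equals SSR + SSE (up to floating-point rounding)
```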
MEASURES OF VARIATION
[Venn diagram of Tips, Taxi-fare and # of riders]
The blue part: SSE, the variation in Tips attributable to factors other than
taxi fare and # of riders
The grey, orange and purple parts: SSR, the total variation of the 𝑌-variable that is
explained by the regression equation with the independent variables
MEASURES OF VARIATION
What is the net effect of adding a new 𝑋-variable?
𝑟² increases, even if the new 𝑋-variable explains an insignificant proportion of the
variation of the 𝑌-variable
Is it fair to use 𝑟² for comparing models with different numbers of 𝑋-variables?
A degree of freedom is lost, as a slope coefficient has to be estimated for the
new 𝑋-variable
Did the new 𝑋-variable add enough explanatory power to offset the loss of one degree of
freedom?
MEASURES OF VARIATION – ADJUSTED 𝑟²
Adjusted 𝑟² = 1 − (1 − 𝑟²) × (𝑛 − 1)⁄(𝑛 − 𝐾 − 1)
Measures the proportion of variation of the 𝑌 values that is explained by the
regression equation with the independent variables 𝑋₁, 𝑋₂, …, 𝑋_K, after
adjustment for the sample size (𝑛) and the number of 𝑋-variables used (𝐾)
Smaller than 𝑟², and can be negative
Penalizes the excessive use of 𝑋-variables
Useful for comparing models with different numbers of 𝑋-variables
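The adjusted-𝑟² formula can be checked against the taxi-tip comparison on the next slide (𝑛 = 197,103; 𝐾 = 1 for the pre-tip-only model with 𝑟² = 0.5533, 𝐾 = 5 for the full model with 𝑟² = 0.6075). With a sample this large the adjustment barely moves 𝑟²:

```python
# Adjusted r^2 = 1 - (1 - r^2) * (n - 1) / (n - K - 1)
def adjusted_r2(r2, n, K):
    return 1 - (1 - r2) * (n - 1) / (n - K - 1)

one_var  = adjusted_r2(0.5533, 197_103, 1)   # pre-tip-only model
five_var = adjusted_r2(0.6075, 197_103, 5)   # 5-variable model
# Both round to the unadjusted values (0.5533 and 0.6075): with n near
# 200,000, losing a handful of degrees of freedom costs almost nothing.
```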
EXAMPLE
Compare the model using only the pre-tip amount against the model using 5
independent variables: which one fits better?
Number of observations: 197,103 vs 197,103
Degrees of freedom (n − K − 1): 197,101 vs 197,097
𝑟²: 0.5533 vs 0.6075
Adjusted 𝑟²: 0.5533 vs 0.6075
The 5-variable model fits better, since its adjusted 𝑟² is higher (with such a
large sample, the degrees-of-freedom adjustment barely changes 𝑟²)
INFERENCE: OVERALL MODEL SIGNIFICANCE
F-test for the overall model significance
𝐻₀: 𝛽₁ = 𝛽₂ = ⋯ = 𝛽_K = 0
(none of the 𝑋-variables affects 𝑌)
𝐻₁: At least one 𝛽ⱼ ≠ 0
(at least one 𝑋-variable affects 𝑌)
F = (SSR⁄𝐾) ⁄ (SSE⁄(𝑛 − 𝐾 − 1)) with (𝐾, 𝑛 − 𝐾 − 1) d.f.
p-value = 𝑃(𝐹_{𝐾, 𝑛−𝐾−1} > F)
Reject 𝐻₀ if F > C.V. 𝐹_{𝛼, 𝐾, 𝑛−𝐾−1}, or p-value < 𝛼
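Since 𝑟² = SSR/SST, the F statistic can be computed directly from 𝑟² as (𝑟²⁄𝐾) ⁄ ((1 − 𝑟²)⁄(𝑛 − 𝐾 − 1)). Applying this to the 5-variable tip model (𝑟² = 0.6075, 𝑛 = 197,103, 𝐾 = 5) roughly reproduces the F value reported later in the slides; the small gap comes from 𝑟² being rounded to four decimals:

```python
# F = (SSR/K) / (SSE/(n-K-1)) rewritten in terms of r^2 = SSR/SST
def f_statistic(r2, n, K):
    return (r2 / K) / ((1 - r2) / (n - K - 1))

F = f_statistic(0.6075, 197_103, 5)
# About 6.1e4, matching the slides' F = 61015.62 up to r^2 rounding;
# far beyond any 5% critical value, so H0 is clearly rejected.
```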
INFERENCE: A PARTICULAR X-VARIABLE'S
SIGNIFICANCE
By rejecting the 𝐻₀ in the F-test, we still cannot distinguish which 𝑋-variable(s)
have significant impacts on the 𝑌-variable
t-test for a particular 𝑋-variable's significance
𝐻₀: 𝛽ⱼ = 0 (𝑋ⱼ has no linear relationship with 𝑌)
𝐻₁: 𝛽ⱼ ≠ 0 (𝑋ⱼ is linearly related to 𝑌)
t = 𝑏ⱼ ⁄ 𝑆_{𝑏ⱼ} with 𝑛 − 𝐾 − 1 d.f.
p-value = 𝑃(|𝑡_{𝑛−𝐾−1}| > |t|)
Reject 𝐻₀ if |t| > C.V. 𝑡_{𝛼⁄2, 𝑛−𝐾−1}, or p-value < 𝛼
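The t statistic and its two-sided p-value can be sketched as follows. With the taxi example's huge degrees of freedom (𝑛 − 𝐾 − 1 ≈ 197,097) the t distribution is essentially standard normal, so a normal approximation via `math.erf` is adequate; the coefficient/standard-error pairs below are hypothetical, not taken from the lecture's Excel output:

```python
import math

# Two-sided p-value for H0: beta_j = 0, using a normal approximation
# to the t distribution (valid when degrees of freedom are very large).
def two_sided_p(b, se_b):
    t = b / se_b                                   # t statistic: b_j / S_{b_j}
    phi = 0.5 * (1 + math.erf(abs(t) / math.sqrt(2)))   # standard normal CDF
    return 2 * (1 - phi)

p_significant   = two_sided_p(0.1496, 0.0012)  # hypothetical: tiny SE -> huge |t|
p_insignificant = two_sided_p(0.0288, 0.05)    # hypothetical: |t| < 1 -> large p
```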
EXAMPLE
For the model using 5 independent variables, is the overall model significant?
F = 61015.62, p-value < 0.00001
At the 5% significance level, 𝐻₀ is rejected
EXAMPLE
At the 5% level of significance, which 𝑋-variable(s) significantly affect 𝑌?
According to the t-test results, the p-values for all five independent variables are
smaller than 5%, indicating that all of them are significantly related to tips paid in
NYC.
VARIABLES SELECTION STRATEGIES
In case some of the independent variables are insignificant based on the t-test
results, one may consider eliminating them using the following methods:
All possible regressions
Backward elimination
Forward selection
Stepwise regression
ALL POSSIBLE REGRESSIONS
To develop all the possible regression models between the dependent variable
and all possible combinations of the independent variables
If there are 𝐾 𝑋-variables to consider using, there are 2^𝐾 − 1 possible
regression models to be developed
The criteria for selecting the best model may include
MSE
Adjusted 𝑟²
Disadvantages of all possible regressions
No unique conclusion: with different criteria, different conclusions will arise
Looks at overall model performance, but not individual variable significance
When there is a large number of potential 𝑋-variables, computational time can be long
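The 2^𝐾 − 1 count is just the number of non-empty subsets of the candidate 𝑋-variables. Enumerating them for the 5 variables of the taxi example (variable names as in the slides) gives 31 candidate models:

```python
from itertools import combinations

# All possible regressions = every non-empty subset of the K candidate
# X-variables; there are 2^K - 1 such subsets (31 when K = 5).
variables = ["area", "riders", "high_tipper", "new_year", "pre_tip"]
subsets = [c for k in range(1, len(variables) + 1)
           for c in combinations(variables, k)]
# Each subset would be fitted as its own model, then compared by MSE
# or adjusted r^2 -- hence the computational cost for large K.
```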
BACKWARD ELIMINATION
Evaluate individual variable significance
Step 1: Build a model by using all potential 𝑋-variables
Step 2: Identify the least significant 𝑋-variable using t-test
Step 3: Remove this 𝑋-variable if its p-value is larger than the specified level of
significance; otherwise terminate the procedure
Step 4: Develop a new regression model after removing this 𝑋-variable, repeat steps
2 and 3 until all remaining 𝑋-variables are significant
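The four steps of backward elimination can be sketched as a loop. The p-values below are mock numbers standing in for what refitting the regression and running t-tests would produce at each step; in a real application `refit_pvalues` would re-estimate the model on the remaining variables:

```python
ALPHA = 0.05  # specified level of significance

def refit_pvalues(variables):
    # Hypothetical stand-in: a real implementation would refit the
    # regression on `variables` and return each slope's t-test p-value.
    mock = {"area": 0.01, "riders": 0.30, "high_tipper": 0.001,
            "new_year": 0.60, "pre_tip": 0.0001}
    return {v: mock[v] for v in variables}

def backward_elimination(variables, alpha=ALPHA):
    vars_left = list(variables)            # Step 1: start with all X-variables
    while vars_left:
        pvals = refit_pvalues(vars_left)
        worst = max(pvals, key=pvals.get)  # Step 2: least significant X-variable
        if pvals[worst] <= alpha:          # Step 3: all remaining significant?
            break                          #   then terminate the procedure
        vars_left.remove(worst)            # Step 4: drop it, refit, repeat
    return vars_left

kept = backward_elimination(["area", "riders", "high_tipper",
                             "new_year", "pre_tip"])
# With the mock p-values, "new_year" then "riders" are eliminated.
```

Forward selection and stepwise regression follow the same control-flow pattern with the add/remove logic reversed or combined.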
FORWARD SELECTION
Evaluate individual variable significance
Step 1: Start with a model which contains only the intercept term
Step 2: Identify the most significant 𝑋-variable using t-test
Step 3: Add this 𝑋-variable if its p-value is smaller than the specified level of
significance; otherwise terminate the procedure
Step 4: Develop a new regression model after including this 𝑋-variable, repeat steps
2 and 3 until all significant 𝑋-variables are entered
STEPWISE REGRESSION
Evaluate individual variable significance
An 𝑋-variable that has entered can later leave; an 𝑋-variable that has been
eliminated can later go back in
Step 1: Start with a model which contains only the intercept term
Step 2: Identify the most significant 𝑋-variable; add this 𝑋-variable if its p-value is
smaller than the specified level of significance, otherwise terminate the procedure
Step 3: Identify the least significant 𝑋-variable in the model; remove this 𝑋-variable
if its p-value is larger than the specified level of significance
Step 4: Repeat steps 2 and 3 until all significant 𝑋-variables are entered and none of
them has to be removed
PRINCIPLE OF MODEL BUILDING
A good model should
Have few independent variables
Have high predictive power
Have low correlation between independent variables
Be easy to interpret