
Multiple Linear Regression

Intuition

• In the Simple Linear Regression model, y = β0 + β1 x + u, studied so far, we have had to make a strong assumption about the model, namely that y is solely a function of x and that all other unobservables belong in the error term, u. Inferring causality in this model is difficult because we have excluded many factors from the model.

• The Multiple Regression Model allows us to control for several variables in the analysis and therefore provides a ceteris paribus interpretation of the coefficients. Since many variables that might be correlated with the explanatory variable of interest can be controlled for in the regression, we can hope to infer causality, which may not be possible in a simple linear regression model.

• The simple idea behind multiple linear regression is that variation in y is explained by multiple factors, e.g. x1, x2, ..., xk, as opposed to just x1 in the case of a simple linear regression model. Therefore, in principle, we are building better models to predict the dependent variable.

• Another important reason for preferring a multiple linear regression model is the flexibility in functional forms. We can include both linear and non-linear terms (for example, squared terms or logarithms of the regressors), which makes the model more flexible.

Multiple Regression Model - OLS

Linear Regression with 2 Explanatory Variables

Consider the three variable regression model

y = β0 + β1 x1 + β2 x2 + u

where y is the dependent variable, x1 and x2 are the two independent variables and u is the error term. The assumptions of this model are the same as those of the classical linear regression model (CLRM), with one additional assumption: there must be no perfect collinearity between the explanatory variables.

MLR 1: Linear in Parameters

The dependent variable, y, is related to the independent variables, x1 and x2, and the error or unobserved term, u, as y = β0 + β1 x1 + β2 x2 + u, where β0 is the population intercept and β1 and β2 are the population slope parameters.

MLR 2: Random Sampling

We have a random sample of size n, {(x1i, x2i, yi) : i = 1, 2, ..., n}, drawn from the population. Thus the population regression model can be re-written as yi = β0 + β1 x1i + β2 x2i + ui.

MLR 3: Sampling Variation in Explanatory Variables, x1 and x2

The sample outcomes on the explanatory variables x1 and x2 are not all the same
values.

MLR 4: Zero Conditional Mean

The error u has an expected value of zero given any values of the explanatory variables. In other words, E(u|x1, x2) = 0.

MLR 5: Homoskedasticity

The error u has the same variance given any values of the explanatory variables. In other words, Var(u|x1, x2) = σ².

MLR 6: No Perfect Collinearity

No exact collinearity exists between the two variables x1 and x2 . In other words, there
is no exact linear relationship between x1 and x2 .

Interpretation of the OLS Estimates in Multiple Linear Regression

Consider the multiple regression model.

y = β0 + β1 x1 + β2 x2 + u

Then, taking the conditional expectation of both sides given x1 and x2 gives

E[y|x1 , x2 ] = β0 + β1 x1 + β2 x2

Therefore, similar to the case in simple linear regression, multiple regression analysis gives
the average value of y for given values of x1 and x2 . The above identity also provides a
way to interpret the OLS coefficients β0, β1 and β2. The intercept term, β0, as in the simple linear regression model, gives the average value of y when x1 and x2 are both equal to zero. The interpretation of β1 and β2 is best understood in terms of partial derivatives.
Consider the following identities

∂E[y|x1, x2] / ∂x1 = β1
∂E[y|x1, x2] / ∂x2 = β2

β1, thus, measures the change in the average value of y, i.e. E[y|x1, x2], for a unit change in x1, holding x2 fixed. Similarly, β2 measures the change in the average value of y for a unit change in x2, holding x1 fixed.

Average Marginal Effect

Another way to think about the interpretation of OLS coefficients in a multiple linear regression model is via average marginal effects. β1 and β2 give the marginal effect of x1 and x2 on y, respectively. Because the model is linear in x1 and x2, these marginal effects are constant across observations, so the average marginal effect of each variable coincides with its coefficient.
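
As an illustration, here is a minimal sketch, assuming Python with numpy and hypothetical simulated data, of fitting the two-regressor model by OLS and reading off the estimated partial effects:

import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)                   # x1 and x2 are correlated
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)   # hypothetical true betas: 1, 2, -3

X = np.column_stack([np.ones(n), x1, x2])            # design matrix with an intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # roughly [1, 2, -3]; beta_hat[1] is the effect of x1 holding x2 fixed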

"Partialling Out" Interpretation

Consider the population regression function as defined by the three-variable regression model. The interpretation of β1 and β2 is that of a partial effect or net effect, i.e. β1 is the effect of x1 on y after the effect of x2 has been partialled out or netted out. The following two-step procedure demonstrates this.

• Regress x1 on x2 and generate the residuals, r̂i1, from this regression.

• Regress y on r̂i1. The slope coefficient from this regression equals β̂1 from the multiple regression and is given by β̂1 = Σ_{i=1}^n r̂i1 yi / Σ_{i=1}^n r̂i1².

• The intuition is that the residuals r̂i1 are the part of xi1 that is not correlated with xi2. Put another way, r̂i1 is xi1 after the effects of xi2 have been partialled out or netted out.
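
A minimal numpy sketch (with hypothetical simulated data) can verify this partialling-out result by comparing the two-step coefficient with the one from the full regression:

import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)            # x1 is correlated with x2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Full regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 1: regress x1 on x2 (with an intercept) and keep the residuals r1
Z = np.column_stack([np.ones(n), x2])
gamma, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r1 = x1 - Z @ gamma

# Step 2: the slope of y on r1 equals beta_hat_1 from the full regression
beta1_partial = (r1 @ y) / (r1 @ r1)
print(beta_full[1], beta1_partial)            # the two numbers agree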

Linear Regression with k explanatory variables

Consider the multiple regression model with k-explanatory variables

y = β0 + β1 x1 + β2 x2 + β3 x3 + .... + βk xk + u

Then, the sample regression function (SRF) is

ŷ = β̂0 + β̂1 x1 + β̂2 x2 + .... + β̂k xk

• The Gauss-Markov or Classical Linear Regression Model (CLRM) assumptions MLR 1 - MLR 6 hold, now stated with respect to all k explanatory variables.

• The interpretation of the estimates, βˆ1 , βˆ2 , ....., βˆk is that they are partial effects of x1 ,
x2 ,....,xk on y, respectively (see appendix for derivation).

• The sample average of the residuals ûi is zero. This implies that ȳ equals the sample average of the fitted values ŷi.

• The sample covariance between each independent variable and the OLS residuals is
zero. Consequently, the sample covariance between the OLS fitted values and the OLS
residuals is zero.

• The point (x̄1, x̄2, x̄3, ....., x̄k, ȳ) is always on the OLS regression line: ȳ = β̂0 + β̂1 x̄1 + β̂2 x̄2 + .... + β̂k x̄k
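
These algebraic properties follow mechanically from the OLS first-order conditions; a small numpy check (on hypothetical simulated data) illustrates them:

import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # intercept column plus k regressors
y = X @ np.array([1.0, 0.5, -1.0, 2.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat

print(u_hat.mean())                          # ~0: the residuals average to zero
print(X[:, 1:].T @ u_hat)                    # ~0: residuals are uncorrelated with each regressor
print(y.mean(), y_hat.mean())                # equal: y-bar equals the mean of the fitted values
print(y.mean(), beta_hat @ X.mean(axis=0))   # equal: the point of means lies on the regression line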

Variance of OLS in the Multiple Linear Regression Model

If the assumptions MLR 1 through MLR 6 hold true, conditional on the sample values of
the independent variables, the variance of the OLS estimators is given by
Var(β̂j) = σ² / [ Σ_{i=1}^n (xij − x̄j)² (1 − R²j) ]

for j = 1, 2, ...., k, where R²j is the R-squared from regressing xj on all the other independent variables (including an intercept).

In the case where σ² is unknown, its unbiased estimator is σ̂² = Σ_{i=1}^n ûi² / (n − k − 1), where n − k − 1 is the degrees of freedom for an OLS regression with n observations, k explanatory variables and an intercept.
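
The sketch below (numpy, hypothetical simulated data) computes σ̂² and the variance of one slope estimate both from this formula and from the matrix expression σ̂²(X′X)⁻¹ derived in the appendix, to show that the two agree:

import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
k = X.shape[1] - 1                                  # number of explanatory variables
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat
sigma2_hat = u_hat @ u_hat / (n - k - 1)            # unbiased estimate of sigma^2

# Var(beta_1_hat) from the formula: sigma^2 / [ SST_1 * (1 - R^2_1) ]
Z = np.column_stack([np.ones(n), x2])               # regress x1 on the other regressor
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
resid1 = x1 - Z @ g
sst1 = np.sum((x1 - x1.mean()) ** 2)
R2_1 = 1 - (resid1 @ resid1) / sst1
var_beta1_formula = sigma2_hat / (sst1 * (1 - R2_1))

# The same quantity from the matrix expression sigma^2 (X'X)^{-1}
var_beta1_matrix = (sigma2_hat * np.linalg.inv(X.T @ X))[1, 1]
print(var_beta1_formula, var_beta1_matrix)          # the two agree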

Model Selection

The purpose of regression analysis is twofold. One, it enables us to estimate the value of a dependent variable (y) based on one or more independent variables (x). This is particularly useful for forecasting and decision-making in fields such as finance, economics, and machine learning. Two, it helps us analyze the relationship between variables, allowing us to assess whether changes in an independent variable (x) cause a change in the dependent variable (y). However, drawing causal conclusions requires careful model specification, in particular the treatment of the zero conditional mean assumption. Here, we focus on this second aspect of regression analysis.

Unbiasedness of the OLS Estimates

Consider the zero conditional mean assumption from the multiple linear regression model:

E[u|x1 , x2 , x3 , ...., xk ] = 0

Why is this assumption crucial? When does this assumption get violated?

• One way the zero conditional mean assumption fails is when the model is mis-specified. This includes using an incorrect functional form. For instance, if wages increase non-linearly with experience, i.e. w = β0 + β1 exper + β2 exper² + u, then a linear specification such as w = β0 + β1 exper + u would produce biased estimates.

• Another way the zero conditional mean assumption can fail is if there is measurement
error in either the dependent or explanatory variable.

• A further way the zero conditional mean assumption can fail is if the dependent and independent variables are jointly determined. For instance, in a regression of the price of a good on its quantity, the two variables are jointly determined by the intersection of the supply and demand curves.

• Finally, the zero conditional mean assumption can fail when certain variables that determine y are omitted from the regression model.

Including Irrelevant Variables

A question that often arises in practice is how many, and what kind of, variables to include in the regression model.

Consider the true model

y = β0 + β1 x1 + β2 x2 + β3 x3 + u

However, suppose that x3 in fact has no effect on y once x1 and x2 have been controlled for, i.e. β3 = 0. Because we do not know the true model, we end up including x3 in the estimation model. In other words, we have included an irrelevant variable in the model. What are the implications of including an irrelevant variable in our model?

• Firstly, note that including an irrelevant variable does not cause bias: β̂1 and β̂2 remain unbiased, and the expected value of β̂3 is zero.

• Secondly, note that the variance (and therefore the standard errors) of the OLS estimates are given by Var(β̂j) = σ² / [ Σ_{i=1}^n (xij − x̄j)² (1 − R²j) ], so including an irrelevant variable that is correlated with the other regressors raises R²j and can inflate the standard errors. This, in turn, can change the outcome of hypothesis tests about our estimates.

• Thirdly, the R² of the model is affected as we add more variables: R² never decreases and is therefore bound to (weakly) increase. To overcome this issue, we use a modified version of R², the adjusted R², where Adj R² = 1 − (1 − R²)(n − 1)/(n − k − 1). This measure corrects for overfitting in the model and penalizes any irrelevant variables that are included.
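
A small sketch (numpy, hypothetical simulated data) shows that adding an irrelevant regressor raises R² while the adjusted R² typically does not improve:

import numpy as np

def r2_and_adj_r2(X, y):
    # Return R^2 and adjusted R^2 for an OLS fit of y on X (X includes the intercept column)
    n, cols = X.shape
    k = cols - 1                                   # number of explanatory variables
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    r2 = 1 - (u @ u) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj_r2

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)                            # irrelevant: does not enter y
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, x3])
print(r2_and_adj_r2(X_small, y))
print(r2_and_adj_r2(X_big, y))                     # R^2 rises; adjusted R^2 usually does not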

Excluding Relevant Variables (Omitted Variable Bias)

Now, suppose you exclude a relevant variable from the model. That is, suppose the true
model is

y = β0 + β1 x1 + β2 x2 + u

but you estimate the model

y = α0 + α1 x1 + e

Estimating the above model by OLS will, in general, give a biased estimate of the effect of x1 (i.e. of β1). To see why, consider the regression of x2 on x1:

x2 = δ0 + δ1 x1 + η1

Then, the true model can be written as

y = β0 + β1 x1 + β2 (δ0 + δ1 x1 + η1 ) + u

y = (β0 + β2 δ0 ) + (β1 + β2 δ1 )x1 + (β2 η1 + u)

y = α0 + α1 x1 + e

Thus, α0 = β0 + β2 δ0 and α1 = β1 + β2 δ1. Taking expectations of the estimates from the short regression gives E(α̂0) = β0 + β2 δ0 and E(α̂1) = β1 + β2 δ1. Therefore, the bias in α̂0 as an estimator of β0 is β2 δ0, and the bias in α̂1 as an estimator of β1 is β2 δ1, the omitted variable bias. The bias vanishes only if β2 = 0 (x2 is irrelevant) or δ1 = 0 (x1 and x2 are uncorrelated).
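
A short simulation sketch (numpy, hypothetical parameter values) illustrates that the short-regression slope centres on β1 + β2 δ1 rather than on β1:

import numpy as np

rng = np.random.default_rng(5)
beta0, beta1, beta2 = 1.0, 2.0, -3.0
delta1 = 0.5                                      # x2 = 0.3 + delta1 * x1 + eta
n, reps = 200, 2000

alpha1_hats = []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.3 + delta1 * x1 + rng.normal(size=n)
    y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    X_short = np.column_stack([np.ones(n), x1])   # x2 is omitted
    a, *_ = np.linalg.lstsq(X_short, y, rcond=None)
    alpha1_hats.append(a[1])

print(np.mean(alpha1_hats))                       # close to beta1 + beta2*delta1 = 2 - 1.5 = 0.5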

Appendix

Derivation of the OLS Estimator in Matrix Form

The Linear Regression Model is given by

Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi3 + ..... + βk Xik + ϵi

The linear regression model expressed in matrix form is

      
[ Y1 ]   [ 1  X11  X12  ...  X1k ]   [ β0 ]   [ ϵ1 ]
[ Y2 ]   [ 1  X21  X22  ...  X2k ]   [ β1 ]   [ ϵ2 ]
[ Y3 ] = [ 1  X31  X32  ...  X3k ]   [ β2 ] + [ ϵ3 ]
[ .. ]   [ ..  ..   ..   ...  ..  ]   [ .. ]   [ .. ]
[ Yn ]   [ 1  Xn1  Xn2  ...  Xnk ]   [ βk ]   [ ϵn ]

Y = Xβ + ϵ

where

- Y is an n × 1 vector of dependent variables.

- X is an n × (k + 1) matrix containing a column of ones (for the intercept) and the k independent variables.

- β is a (k + 1) × 1 vector of coefficients.

- ϵ is an n × 1 vector of errors.

The sum of squared errors is given by

SSE = ϵ′ ϵ = (Y − Xβ)′ (Y − Xβ)

Taking the derivative with respect to β and setting it to zero:


∂/∂β [ (Y − Xβ)′(Y − Xβ) ] = −2X′Y + 2X′Xβ = 0

Solving for β:

X ′ Xβ = X ′ Y

β̂ = (X ′ X)−1 X ′ Y

Thus, the OLS estimator is β̂ = (X ′ X)−1 X ′ Y
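
As a quick check of this formula, here is a minimal numpy sketch (hypothetical simulated data) that computes β̂ from the normal equations; np.linalg.solve is used instead of forming the inverse explicitly, which is numerically preferable:

import numpy as np

rng = np.random.default_rng(6)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # n x (k+1) design matrix
beta_true = np.array([1.0, 2.0, -1.0, 0.5])                  # hypothetical coefficients
Y = X @ beta_true + rng.normal(size=n)

# OLS estimator: solve the normal equations X'X beta = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)                                              # close to beta_true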

Variance of OLS in Matrix Form

The OLS estimator is given by:

β̂ = (X ′ X)−1 X ′ Y

Substituting Y = Xβ + ϵ:

β̂ = β + (X ′ X)−1 X ′ ϵ

Taking the variance:

Var(β̂) = Var((X ′ X)−1 X ′ ϵ)

Using the property Var(Aϵ) = AVar(ϵ)A′ :

Var(β̂) = (X ′ X)−1 X ′ Var(ϵ)X(X ′ X)−1

Assuming homoskedastic errors (Var(ϵ) = σ 2 I):

Var(β̂) = σ 2 (X ′ X)−1
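
Under the homoskedasticity assumption, the standard errors of the OLS estimates follow from σ̂²(X′X)⁻¹, with σ² replaced by its unbiased estimate. A self-contained numpy sketch (hypothetical simulated data):

import numpy as np

rng = np.random.default_rng(7)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
u_hat = Y - X @ beta_hat
sigma2_hat = u_hat @ u_hat / (n - k - 1)             # unbiased estimate of sigma^2
var_beta_hat = sigma2_hat * np.linalg.inv(X.T @ X)   # estimated Var(beta_hat) under homoskedasticity
std_errors = np.sqrt(np.diag(var_beta_hat))
print(std_errors)                                    # one standard error per coefficient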
