Week 02: Regression with Panel Data (Part 2)
Osman DOGAN
Outline Fixed effects regression models Assumptions Random effects regression models Application References
Outline:
1 Fixed effects regression models,
2 Fixed effects regression assumptions,
3 Random effects regression models,
4 Application: Drunk driving laws and traffic deaths.
Readings:
1 Stock and Watson (2020, Chapter 10),
2 Hanck et al. (2021, Chapter 10).
Assumption 1
$u_{it}$ has conditional mean zero: $E(u_{it} \mid X_{i1}, X_{i2}, \ldots, X_{iT}, \alpha_i) = 0$.
Assumption 2
$(X_{i1}, X_{i2}, \ldots, X_{iT}, u_{i1}, u_{i2}, \ldots, u_{iT})$, $i = 1, 2, \ldots, n$, are i.i.d. draws from
their joint distribution.
Assumption 3
Large outliers are unlikely: $(X_{it}, u_{it})$ have finite fourth moments.
Assumption 4
There is no perfect multicollinearity (relevant when there are multiple X's).
Note that Assumption 2 requires that the variables are independent across
entities but makes no such restriction within an entity. For example, it allows
$X_{it}$ to be correlated over time within an entity.
Definition 1
If $X_{it}$ is correlated with $X_{is}$ for different values of $s$ and $t$, that is, if $X_{it}$ is
correlated over time for a given entity, then $X_{it}$ is said to be autocorrelated
(correlated with itself, at different dates) or serially correlated.
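Definition 1 can be illustrated with a small simulation. The sketch below is not from the lecture: the AR(1) process, the sample sizes, and the value of rho are illustrative assumptions. It generates a regressor that is independent across entities but correlated over time within each entity, then estimates the correlation between $X_{it}$ and $X_{i,t-1}$.

```r
# Simulated panel: n entities, T_len periods (all values are assumptions).
set.seed(1)
n <- 200
T_len <- 7
rho <- 0.8  # assumed degree of persistence within an entity

# Each row is one entity; columns are time periods.
X <- matrix(NA_real_, nrow = n, ncol = T_len)
X[, 1] <- rnorm(n)
for (t in 2:T_len) {
  # AR(1) update, scaled so that the variance stays equal to 1.
  X[, t] <- rho * X[, t - 1] + sqrt(1 - rho^2) * rnorm(n)
}

# Correlation between X_it and X_i,t-1, pooled over entities and periods.
lag1_cor <- cor(as.vector(X[, -1]), as.vector(X[, -T_len]))
print(lag1_cor)
```

The estimated correlation should be close to the assumed rho, while rows (entities) remain independent of one another, which is exactly the pattern Assumption 2 permits.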
Recall that $u_{it}$ consists of time-varying factors that are determinants of $Y_{it}$ but
are not included as regressors, and some of these omitted factors might be
autocorrelated:
1 For example, a downturn in the local economy might produce layoffs and
diminish commuting traffic, thus reducing traffic fatalities for 2 or more years.
2 Similarly, a major road improvement project might reduce traffic accidents not
only in the year of completion but also in future years.
Such omitted factors, which persist over multiple years, produce autocorrelated
regression errors.
Finally, the third and fourth assumptions for fixed effects regression are
analogous to the third and fourth least squares assumptions for the multiple
linear regression model.
Under the fixed effects regression assumptions, we have the following results:
(1) The fixed effects estimator is consistent and approximately normally distributed
when n is large.
(2) However, the usual OLS standard errors (both homoskedasticity-only and
heteroskedasticity-robust) will in general be wrong because they assume that $u_{it}$
is serially uncorrelated.
1 We need to use clustered standard errors. Clustered standard errors estimate the
variance of $\hat{\beta}_1$ when the variables are i.i.d. across entities but are potentially
autocorrelated within an entity.
2 The term “clustered” comes from allowing correlation within a “cluster” of
observations (within an entity), but not across clusters (entities).
3 In panel data regression, clustered SEs are valid whether or not there is
heteroskedasticity and/or serial correlation in $u_{it}$.
4 In R, for a model reg estimated with plm, vcovHC(reg, type = "HC1") returns
clustered standard errors (clustered by entity by default).
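The clustering idea can be sketched by hand in base R. This is a minimal illustration on simulated data, not the lecture's code: the data-generating process, the helper ar1, and all parameter values are assumptions. It omits any finite-sample correction, so the numbers may differ slightly from what vcovHC reports for a plm model.

```r
set.seed(42)
n <- 100      # entities (assumed)
T_len <- 6    # periods per entity (assumed)
id <- rep(1:n, each = T_len)

# Hypothetical helper: one AR(1) path of length T_len.
ar1 <- function(T_len, rho) {
  e <- rnorm(T_len)
  v <- numeric(T_len)
  v[1] <- e[1]
  for (t in 2:T_len) v[t] <- rho * v[t - 1] + e[t]
  v
}

alpha <- rnorm(n)                                        # entity effects
x <- alpha[id] + unlist(lapply(1:n, function(i) ar1(T_len, 0.7)))
u <- unlist(lapply(1:n, function(i) ar1(T_len, 0.7)))    # serially correlated errors
y <- 1 + 2 * x + alpha[id] + u

# Within (entity-demeaning) transformation removes alpha_i.
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
beta_hat <- sum(x_dm * y_dm) / sum(x_dm^2)
res <- y_dm - beta_hat * x_dm

# Clustered SE: sum the scores within each entity before squaring,
# which allows arbitrary correlation over time within an entity.
score_i <- tapply(x_dm * res, id, sum)
se_cluster <- sqrt(sum(score_i^2)) / sum(x_dm^2)

# Heteroskedasticity-robust SE: treats every observation as uncorrelated.
se_hc <- sqrt(sum((x_dm * res)^2)) / sum(x_dm^2)

print(c(beta_hat = beta_hat, se_cluster = se_cluster, se_hc = se_hc))
```

The only difference between the two standard errors is where the squaring happens: per observation for the heteroskedasticity-robust version, per entity for the clustered version.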
Recall that
1 $\alpha_1, \ldots, \alpha_n$ are the unobserved entity-specific effects,
2 $\lambda_1, \ldots, \lambda_T$ are the unobserved time fixed effects.
In the fixed effects regression models, we allow these unobserved fixed
effects to be arbitrarily correlated with the regressors.
If we assume that these unobserved fixed effects are uncorrelated with the
regressors, then we have random effects models.
Thus, in random effects models, $\alpha_i$ and $\lambda_t$ are i.i.d. random variables,
independent of $X_{it}$ and $u_{it}$.
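Why the correlation between the unobserved effects and the regressors matters can be seen in a short simulation. The sketch below rests on assumed values throughout (sample sizes, a true slope of 2, standard normal effects and errors). It compares the pooled OLS slope when $\alpha_i$ is independent of the regressor with the slope when $\alpha_i$ is part of the regressor.

```r
set.seed(7)
n <- 500      # entities (assumed)
T_len <- 5    # periods (assumed)
id <- rep(1:n, each = T_len)

alpha <- rnorm(n)            # unobserved entity effects
u <- rnorm(n * T_len)        # idiosyncratic errors

# Case A: regressor independent of alpha_i (random effects setting).
x_re <- rnorm(n * T_len)
y_re <- 1 + 2 * x_re + alpha[id] + u

# Case B: regressor correlated with alpha_i (fixed effects setting).
x_fe <- alpha[id] + rnorm(n * T_len)
y_fe <- 1 + 2 * x_fe + alpha[id] + u

# Pooled OLS ignores alpha_i in both cases.
b_re <- coef(lm(y_re ~ x_re))[["x_re"]]
b_fe <- coef(lm(y_fe ~ x_fe))[["x_fe"]]
print(c(pooled_slope_caseA = b_re, pooled_slope_caseB = b_fe))
```

In case A the omitted effect only adds noise, so the slope stays near the true value of 2. In case B the omitted effect biases the slope by cov(x, alpha) / var(x) = 1 / (1 + 1) = 0.5 under this design, which is the situation fixed effects regression is meant to handle.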
In addition to Assumptions 1-4, random effects models also require
Assumption 5. Thus, the random effects model is a special case of the fixed
effects model.
How should we decide between a fixed effects regression and a random effects
regression?
If we think that the unobserved fixed effects are arbitrarily correlated with the
regressors, then we should use the fixed effects regression.
However, if we think that the unobserved fixed effects are uncorrelated with the
regressors, then we should use the random effects regression.
We can also use the Hausman test to decide between fixed and random effects.
In this test, the null hypothesis is that the unobserved effects are uncorrelated
with the regressors (the random effects model is appropriate), and the alternative
hypothesis is that they are correlated with the regressors (the fixed effects
model should be used).
The plm package includes the phtest function for this test.
# Hausman test
# r1: fixed effects model and r2: random effects model
phtest(r1, r2)
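A self-contained run of the Hausman test might look like the sketch below. It assumes the plm package is installed and uses the Grunfeld investment data that ships with plm, which is not the data set used in this lecture. The object names fe, re, and ht are illustrative.

```r
library(plm)

# Grunfeld panel (firms observed over years) bundled with plm; an assumption
# made only so the example runs without the lecture's data.
data("Grunfeld", package = "plm")

# Fixed effects (within) and random effects fits of the same specification.
fe <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "within")
re <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "random")

# Hausman test: compares the two sets of coefficient estimates.
ht <- phtest(fe, re)
print(ht)
```

A small p-value is evidence against the random effects assumption and points toward the fixed effects model. A large p-value means the test does not reject the random effects specification.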
There are approximately 35,000 highway traffic fatalities each year in the
United States.
Approximately one-fourth of fatal crashes involve a driver who was drinking,
and this fraction rises during peak drinking periods.
Policy makers are interested in how effective various government policies
designed to discourage drunk driving actually are in reducing traffic deaths.
The panel data set contains variables related to traffic fatalities and alcohol,
including the number of traffic fatalities in each state in each year, the type of
drunk driving laws in each state in each year, and the tax on beer in each state.
□ Traffic fatality (# of traffic deaths in state i in year t)
□ Tax on a case of beer
□ Other: legal driving age, legal drinking age, drunk driving laws, vehicle miles per
driver, state socio-economic data, etc.
variable   description
state      state ID (FIPS) code
year       year
mrall      per capita vehicle fatality rate (number of traffic deaths per 10,000 people)
beertax    the tax on a case of beer
mlda       minimum legal drinking age
jaild      = 1 if state requires mandatory jail sentence for an initial drunk driving conviction
comserd    = 1 if state requires mandatory community service for an initial drunk driving conviction
vmiles     total vehicle miles traveled annually
unrate     unemployment rate
perinc     per capita personal income
The final regression follows the “before and after” approach and uses only data
from 1982 and 1988. In all other regressions, all time periods are used.
The fourth model serves as our base model, while additional regressors are
included in the other models for sensitivity checks.
Estimation
# Packages: plm (panel models and clustered vcovHC), sandwich (vcovHC for lm),
# stargazer (regression tables)
library(plm)
library(sandwich)
library(stargazer)
# Model 1: pooled OLS with heteroskedasticity-robust standard errors
r1 = lm(mrall ~ beertax, data = mydata)
r1_vcov = vcovHC(r1, type = "HC1")
r1_se = sqrt(diag(r1_vcov))
# Model 2: state fixed effects, standard errors clustered by state
r2 = plm(mrall ~ beertax, data = mydata, index = c("state", "year"),
         model = "within")
r2_vcov = vcovHC(r2, type = "HC1")
r2_se = sqrt(diag(r2_vcov))
# Model 3: state and time fixed effects
r3 = plm(mrall ~ beertax, data = mydata, index = c("state", "year"),
         model = "within", effect = "twoways")
r3_vcov = vcovHC(r3, type = "HC1")
r3_se = sqrt(diag(r3_vcov))
# Model 4: base model with drinking age, punishment, and economic controls
r4 = plm(mrall ~ beertax + drink18 + drink19 + drink20 + jaild +
           vmiles + unrate + log(perinc), data = mydata,
         index = c("state", "year"), model = "within",
         effect = "twoways")
r4_vcov = vcovHC(r4, type = "HC1")
r4_se = sqrt(diag(r4_vcov))
# Model 5: Model 4 without the economic controls
r5 = plm(mrall ~ beertax + drink18 + drink19 + drink20 + jaild +
           vmiles, data = mydata, index = c("state", "year"),
         model = "within", effect = "twoways")
r5_vcov = vcovHC(r5, type = "HC1")
r5_se = sqrt(diag(r5_vcov))
# Model 6
r6 = plm(mrall ~ beertax + mlda + jaild + unrate + vmiles + log(perinc),
         data = mydata, index = c("state", "year"),
         model = "within", effect = "twoways")
r6_vcov = vcovHC(r6, type = "HC1")
r6_se = sqrt(diag(r6_vcov))
# Model 7
r7 = plm(mrall ~ beertax + drink18 + drink19 + drink20 + jaild + vmiles +
           unrate + log(perinc), data = mydata,
         index = c("state", "year"),
         model = "within", effect = "twoways",
         subset = (year == 1982 | year == 1988))
r7_vcov = vcovHC(r7, type = "HC1")
r7_se = sqrt(diag(r7_vcov))
stargazer(r5, r6, r7, se = list(r5_se, r6_se, r7_se),
          title = "Estimation Results", type = "text",
          keep.stat = c("n", "rsq", "adj.rsq"),
          dep.var.labels.include = F,
          column.labels = c("(5)", "(6)", "(7)"),
          model.numbers = F)
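The option effect = "twoways" with model = "within" removes both entity and time effects. As a rough check of what that means, the base R sketch below uses a simulated balanced panel with assumed values throughout. It compares the slope from a regression with entity and time dummies against the slope from two-way demeaned data. The helper dm2 is hypothetical, and this demeaning formula is only valid for balanced panels.

```r
set.seed(3)
n <- 30       # entities (assumed)
T_len <- 5    # periods (assumed)
id <- rep(1:n, each = T_len)
tt <- rep(1:T_len, times = n)

alpha <- rnorm(n)         # entity effects
lambda <- rnorm(T_len)    # time effects
x <- alpha[id] + lambda[tt] + rnorm(n * T_len)
y <- 2 * x + alpha[id] + lambda[tt] + rnorm(n * T_len)

# Dummy variable regression: one dummy per entity and per period.
b_lsdv <- coef(lm(y ~ x + factor(id) + factor(tt)))[["x"]]

# Two-way within transformation for a balanced panel:
# subtract entity means and time means, add back the overall mean.
dm2 <- function(v) v - ave(v, id) - ave(v, tt) + mean(v)
b_within <- sum(dm2(x) * dm2(y)) / sum(dm2(x)^2)

print(c(lsdv = b_lsdv, within = b_within))
```

The two slopes coincide up to numerical precision, which is why the within estimator can be used instead of estimating a large number of dummy coefficients.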
In Model 4, consider a joint null hypothesis on several coefficients.
Recall that we should use the F-test for a joint null hypothesis. In R, we can
use the linearHypothesis function from the car package to compute the F-statistic.
The F-statistic is 35.01 with a very small p-value. Thus, we reject the null
hypothesis.
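The F-statistic for a joint hypothesis of the form $R\beta = r$ can also be sketched by hand. The example below uses simulated cross-sectional data and the default (homoskedasticity-only) covariance matrix from lm, both of which are assumptions made to keep it self-contained. For a panel model, the clustered covariance matrix would take the place of V. The variable names and the hypothesis tested are illustrative and are not the hypothesis from Model 4.

```r
set.seed(11)
N <- 300
x1 <- rnorm(N); x2 <- rnorm(N); x3 <- rnorm(N)
y <- 1 + 0.5 * x1 + 0.3 * x2 + 0 * x3 + rnorm(N)
fit <- lm(y ~ x1 + x2 + x3)

# H0: coefficient on x2 = 0 and coefficient on x3 = 0 (q = 2 restrictions).
# Columns of R follow the coefficient order: intercept, x1, x2, x3.
R <- rbind(c(0, 0, 1, 0),
           c(0, 0, 0, 1))
r <- c(0, 0)

b <- coef(fit)
V <- vcov(fit)   # for panel models, substitute a clustered covariance matrix
d <- R %*% b - r
q <- nrow(R)

# Wald form of the F-statistic and its p-value.
F_stat <- as.numeric(t(d) %*% solve(R %*% V %*% t(R)) %*% d) / q
p_val <- pf(F_stat, df1 = q, df2 = df.residual(fit), lower.tail = FALSE)

# Cross-check: with the default covariance matrix this equals the
# F-statistic from comparing the restricted and unrestricted models.
F_anova <- anova(lm(y ~ x1), fit)$F[2]
print(c(F_stat = F_stat, F_anova = F_anova, p_value = p_val))
```

The agreement with anova holds only for the homoskedasticity-only covariance matrix. With a robust or clustered covariance matrix, the Wald form above is the one to use.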
Model 5 omits the economic factors. The result supports the notion that
economic indicators should remain in the model as the coefficient on beer tax is
sensitive to the inclusion of the latter.
Results for model (6) demonstrate that the legal drinking age has little
explanatory power and that the coefficient of interest is not sensitive to changes
in the functional form of the relation between drinking age and traffic fatalities.
Specification (7) reveals that reducing the amount of available information (we
only use 95 observations from the years 1982 and 1988 here) inflates standard
errors but does not lead to drastic changes in coefficient estimates.