
Outline Fixed effects regression models Assumptions Random effects regression models Application References

Regression with Panel Data: Part II

Osman DOGAN


Outline:
1 Fixed effects regression models,
2 Fixed effects regression assumptions,
3 Random effects regression models,
4 Application: Drunk driving laws and traffic deaths.
Readings:
1 Stock and Watson (2020, Chapter 10),
2 Hanck et al. (2021, Chapter 10).


Fixed effects regression models

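There are two ways to write the fixed effects models:

1 Fixed effects form:
$$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}, \tag{1}$$
$$Y_{it} = \beta_1 X_{it} + \alpha_i + \lambda_t + u_{it}, \tag{2}$$
where $\alpha_1, \dots, \alpha_n$ are the entity fixed effects and $\lambda_1, \dots, \lambda_T$ are the time fixed effects.
2 Dummy variable form:
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \gamma_3 D3_i + \dots + \gamma_n Dn_i + u_{it}, \tag{3}$$
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \gamma_3 D3_i + \dots + \gamma_n Dn_i + \delta_2 B2_t + \delta_3 B3_t + \dots + \delta_T BT_t + u_{it}, \tag{4}$$
where $\beta_0, \beta_1, \gamma_2, \gamma_3, \dots, \gamma_n, \delta_2, \delta_3, \dots, \delta_T$ are the unknown coefficients, and $D2, \dots, Dn, B2, \dots, BT$ are dummy variables defined as
$$D2_i = \begin{cases} 1 & \text{if } i = 2, \\ 0 & \text{otherwise,} \end{cases} \qquad B2_t = \begin{cases} 1 & \text{if } t = 2, \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$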
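As a quick numerical check that the two forms lead to the same estimate of $\beta_1$, here is a minimal sketch in base R. The simulated panel, the variable names (`id`, `x_dm`, `y_dm`) and the true coefficient of 2 are assumptions made for illustration; nothing here uses the fatality data.

```r
# Simulated balanced panel (hypothetical data, not the fatality data set)
set.seed(1)
n <- 5        # number of entities
n_time <- 4   # number of time periods
id    <- rep(1:n, each = n_time)
alpha <- rep(rnorm(n), each = n_time)        # entity fixed effects
x     <- alpha + rnorm(n * n_time)           # regressor correlated with alpha
y     <- 2 * x + alpha + rnorm(n * n_time)   # true beta1 = 2 (assumed)

# Dummy variable form: factor(id) expands to an intercept plus n - 1 entity dummies
b_dummy <- unname(coef(lm(y ~ x + factor(id)))["x"])

# Fixed effects form: subtract entity means, then run OLS without an intercept
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
b_within <- unname(coef(lm(y_dm ~ x_dm - 1))["x_dm"])

print(c(dummy = b_dummy, within = b_within))
```

The two printed numbers should agree up to floating-point error, because regressing on entity dummies and subtracting entity means remove the same entity-level variation.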


If there are multiple regressors $X_1, X_2, \dots, X_k$, then the fixed effects model can be written in the following way.
1 Fixed effects form:
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + u_{it}, \tag{6}$$
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + \lambda_t + u_{it}. \tag{7}$$
2 Dummy variable form:
$$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \gamma_2 D2_i + \gamma_3 D3_i + \dots + \gamma_n Dn_i + u_{it}, \tag{8}$$
$$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \gamma_2 D2_i + \gamma_3 D3_i + \dots + \gamma_n Dn_i + \delta_2 B2_t + \delta_3 B3_t + \dots + \delta_T BT_t + u_{it}. \tag{9}$$


Alternative terminology for fixed effects models:

1 One-way fixed effects model:
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + u_{it}.$$
2 Two-way fixed effects model:
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + \lambda_t + u_{it}.$$

Terminology for fixed effects:

1 $\alpha_i$: entity fixed effect, individual fixed effect, time-invariant effect, or entity heterogeneity.
2 $\lambda_t$: time fixed effect, entity-invariant effect, or time heterogeneity.


Fixed effects regression assumptions

Consider the following fixed effects regression model:
$$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}. \tag{10}$$
We consider this model under the following assumptions.

Assumption 1
$u_{it}$ has conditional mean zero: $E(u_{it} \mid X_{i1}, X_{i2}, \dots, X_{iT}, \alpha_i) = 0$.

Assumption 2
$(X_{i1}, X_{i2}, \dots, X_{iT}, u_{i1}, u_{i2}, \dots, u_{iT})$, $i = 1, 2, \dots, n$, are i.i.d. draws from their joint distribution.

Assumption 3
Large outliers are unlikely: $(X_{it}, u_{it})$ have finite fourth moments.

Assumption 4
There is no perfect multicollinearity (relevant when there are multiple X's).


Consider Assumption 1: $E(u_{it} \mid X_{i1}, X_{i2}, \dots, X_{iT}, \alpha_i) = 0$.
We require that the conditional mean of $u_{it}$ does not depend on any of the values of X for that entity, whether past, present, or future.
This assumption is violated if the current $u_{it}$ is correlated with past, present, or future values of X.
The second assumption is that the variables for one entity are distributed
identically to, but independently of, the variables for another entity; that is, the
variables are i.i.d. across entities for i = 1, . . . , n.
Assumption 2 holds if entities are selected by simple random sampling from the
population.


Note that Assumption 2 requires that the variables are independent across entities but makes no such restriction within an entity. For example, it allows $X_{it}$ to be correlated over time within an entity.

Definition 1
If $X_{it}$ is correlated with $X_{is}$ for different values of s and t (that is, if $X_{it}$ is correlated over time for a given entity), then $X_{it}$ is said to be autocorrelated (correlated with itself, at different dates) or serially correlated.

Autocorrelation is a pervasive feature of time series data: what happens one year tends to be correlated with what happens the next year.


Recall that $u_{it}$ consists of time-varying factors that are determinants of $Y_{it}$ but are not included as regressors, and some of these omitted factors might be autocorrelated:
1 For example, a downturn in the local economy might produce layoffs and
diminish commuting traffic, thus reducing traffic fatalities for 2 or more years.
2 Similarly, a major road improvement project might reduce traffic accidents not
only in the year of completion but also in future years.

Such omitted factors, which persist over multiple years, produce autocorrelated
regression errors.
Finally, the third and fourth assumptions for fixed effects regression are
analogous to the third and fourth least squares assumptions for the multiple
linear regression model.


Under the fixed effects regression assumptions, we have the following results:
(1) The fixed effects estimator is consistent and is normally distributed when n is large.
(2) However, the usual OLS standard errors (both homoskedasticity-only and heteroskedasticity-robust) will in general be wrong because they assume that $u_{it}$ is serially uncorrelated.
1 We need to use clustered standard errors. Clustered standard errors estimate the variance of $\hat{\beta}_1$ when the variables are i.i.d. across entities but are potentially autocorrelated within an entity.
2 The term “clustered” comes from allowing correlation within a “cluster” of observations (within an entity), but not across clusters (entities).
3 In panel data regression, clustered SEs are valid whether or not there is heteroskedasticity and/or serial correlation in $u_{it}$.
4 In R, for a model reg fitted with plm, vcovHC(reg, type = "HC1") returns a covariance matrix clustered by entity, which is the default for plm objects. For a model fitted with lm, the same call is heteroskedasticity-robust only, not clustered.

We should use clustered standard errors for t-statistics and F-statistics.
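To make the idea of clustering concrete, the following base R sketch computes a clustered standard error by hand for a single regressor. It is an illustration under stated assumptions: the data are simulated with AR(1) errors within each entity, all names are hypothetical, and the formula omits the finite-sample corrections that vcovHC applies, so it will not reproduce the HC1 numbers exactly.

```r
set.seed(2)
n <- 50; n_time <- 6
id    <- rep(1:n, each = n_time)
alpha <- rep(rnorm(n), each = n_time)
x     <- alpha + rnorm(n * n_time)
# AR(1) errors within each entity (coefficient 0.7), independent across entities
u <- ave(rnorm(n * n_time), id,
         FUN = function(e) as.numeric(stats::filter(e, 0.7, method = "recursive")))
y <- 1.5 * x + alpha + u

# Within (entity-demeaned) regression
x_dm  <- x - ave(x, id)
y_dm  <- y - ave(y, id)
fit   <- lm(y_dm ~ x_dm - 1)
u_hat <- resid(fit)

# Clustered: sum x*u within each entity first, then square and add up
score_i    <- tapply(x_dm * u_hat, id, sum)
se_cluster <- sqrt(sum(score_i^2)) / sum(x_dm^2)

# Heteroskedasticity-robust only: square each observation's x*u separately
se_robust <- sqrt(sum((x_dm * u_hat)^2)) / sum(x_dm^2)

print(c(clustered = se_cluster, robust = se_robust))
```

The only difference between the two formulas is the order of summing and squaring: the clustered version keeps the cross-products between different years of the same entity, which is where serial correlation enters.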


Random effects regression models

Recall that
1 $\alpha_1, \dots, \alpha_n$ are the unobserved entity-specific effects,
2 $\lambda_1, \dots, \lambda_T$ are the unobserved time effects.

In the fixed effects regression models, we allow these unobserved effects to be arbitrarily correlated with the regressors.
If we instead assume that these unobserved effects are uncorrelated with the regressors, then we have random effects models.
Thus, in random effects models, $\alpha_i$ and $\lambda_t$ are i.i.d. random variables, independent of $X_{it}$ and $u_{it}$.

Assumption 5 (Random effects model)
The random effects $\alpha_i$ and $\lambda_t$ are i.i.d. random variables that are independent of $X_{it}$.


Terminology for the random effects models:

1 One-way random effects model:
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + u_{it}.$$
2 Two-way random effects model:
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + \lambda_t + u_{it}.$$
3 $\alpha_i$: entity random effect, individual random effect, or time-invariant random effect.
4 $\lambda_t$: time random effect or entity-invariant random effect.

In addition to Assumptions 1-4, the random effects models require Assumption 5. Thus, the random effects model is a special case of the fixed effects model.


The one-way random effects model is
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + u_{it}, \tag{11}$$
where $\alpha_i \sim N(0, \sigma_\alpha^2)$ for $i = 1, 2, \dots, n$.
The two-way random effects model is
$$Y_{it} = \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_k X_{kit} + \alpha_i + \lambda_t + u_{it}, \tag{12}$$
where $\alpha_i \sim N(0, \sigma_\alpha^2)$ for $i = 1, 2, \dots, n$, and $\lambda_t \sim N(0, \sigma_\lambda^2)$ for $t = 1, 2, \dots, T$.
In the random effects models, the variance terms $\sigma_\alpha^2$ and $\sigma_\lambda^2$ are unknown and must be estimated.
In R, we can estimate the random effects models by setting model = "random" in the plm function:

library(plm)
# One-way random effects model
r1 = plm(mrall ~ beertax, data = mydata, index = c("state", "year"),
         model = "random")
# Two-way random effects model
r2 = plm(mrall ~ beertax, data = mydata, index = c("state", "year"),
         model = "random", effect = "twoways")
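A short derivation may help explain why the variance terms matter. It relies on extra assumptions that are not stated on the slide: $u_{it}$ is homoskedastic with variance $\sigma_u^2$, serially uncorrelated, and independent of $\alpha_i$. The composite error $v_{it}$ is notation introduced here for the one-way model.

```latex
v_{it} = \alpha_i + u_{it}, \qquad
\operatorname{Var}(v_{it}) = \sigma_\alpha^2 + \sigma_u^2, \qquad
\operatorname{Cov}(v_{it}, v_{is}) = \sigma_\alpha^2 \quad (s \neq t),
\qquad
\operatorname{Corr}(v_{it}, v_{is}) = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_u^2} \quad (s \neq t).
```

Under these assumptions the composite error is correlated within an entity whenever $\sigma_\alpha^2 > 0$, which is why random effects estimation takes the form of feasible GLS and needs estimates of the variance terms.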


How should we decide between a fixed effects regression and a random effects regression?
If we think that the unobserved effects are arbitrarily correlated with the regressors, then we should use the fixed effects regression.
However, if we think that the unobserved effects are uncorrelated with the regressors, then we should use the random effects regression.
We can also use the Hausman test to decide between fixed and random effects. In this test, the null hypothesis is that the random effects assumption holds (the unobserved effects are uncorrelated with the regressors), and the alternative hypothesis is that it fails, so that only the fixed effects estimator is consistent.
The plm package includes the phtest function for this test.

# Hausman test
# fe: fixed effects fit, re: random effects fit of the same specification
fe = plm(mrall ~ beertax, data = mydata, index = c("state", "year"), model = "within")
re = plm(mrall ~ beertax, data = mydata, index = c("state", "year"), model = "random")
phtest(fe, re)
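For reference, the Hausman statistic in its standard textbook form is sketched below. The slides do not state it, so the notation is an assumption: $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are the fixed and random effects estimates of the coefficients that appear in both models, and $k$ is the number of such coefficients.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{\operatorname{Var}}(\hat{\beta}_{FE})
         - \widehat{\operatorname{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
\;\overset{a}{\sim}\; \chi^2_k \quad \text{under } H_0 .
```

A large value of $H$ (a small p-value) means that the two sets of estimates differ by more than sampling error would explain, which is evidence against the random effects assumption.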


Traffic deaths and alcohol taxes

There are approximately 35,000 highway traffic fatalities each year in the
United States.
Approximately one-fourth of fatal crashes involve a driver who was drinking,
and this fraction rises during peak drinking periods.
Policy makers are interested in how effective various government policies
designed to discourage drunk driving actually are in reducing traffic deaths.
The panel data set contains variables related to traffic fatalities and alcohol,
including the number of traffic fatalities in each state in each year, the type of
drunk driving laws in each state in each year, and the tax on beer in each state.
□ Traffic fatality (# of traffic deaths in state i in year t)
□ Tax on a case of beer
□ Other: legal driving age, legal drinking age, drunk driving laws, vehicle miles per
driver, state socio-economic data, etc.


We will work with the traffic fatality data set (fatality.csv).
The data cover 48 U.S. states (Alaska and Hawaii are excluded), annually for 1982 through 1988.
Observational unit: a state in a given year, with n = 48 states, T = 7 years, and n × T = 336 observations.
The main variables are described in the following table:

variable   description
state      state id (FIPS) code
year       year
mrall      per capita vehicle fatality (number of traffic deaths per 10,000 people)
beertax    the tax on a case of beer
mlda       minimum legal drinking age
jaild      = 1 if state requires mandatory jail sentence for an initial drunk driving conviction
comserd    = 1 if state requires mandatory community service for an initial drunk driving conviction
vmiles     total vehicle miles traveled annually
unrate     unemployment rate
perinc     per capita personal income


We will extend our preceding analysis by considering the following variables:

1 unrate: a numeric variable giving the state-specific unemployment rate.
2 log(perinc): the logarithm of real per capita income (in 1988 prices).
3 vmiles: the state average miles per driver.
4 mlda: the state-specific minimum legal drinking age.
5 drink18, drink19, drink20: a discretized version of mlda that classifies states into four categories of minimum legal drinking age: 18, 19, 20, and 21. The base category is a minimum legal drinking age of 21.
6 jaild: a dummy variable with levels yes and no that measures whether drunk driving is severely punished by mandatory jail time or mandatory community service (first conviction).
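The discretization of mlda can be sketched in base R as follows. The example values and the treatment of fractional ages (an age in the interval [18, 19) counts as category 18, and so on) are assumptions for illustration; the slides do not say how the dummies in fatality.csv were built.

```r
# Hypothetical minimum legal drinking ages, including fractional values
mlda <- c(18, 18.5, 19, 20, 20.5, 21)

# One dummy per category; a drinking age of 21 is the omitted base category
drink18 <- as.numeric(mlda >= 18 & mlda < 19)
drink19 <- as.numeric(mlda >= 19 & mlda < 20)
drink20 <- as.numeric(mlda >= 20 & mlda < 21)

print(data.frame(mlda, drink18, drink19, drink20))
```

A state with a drinking age of 21 has all three dummies equal to zero, so each coefficient on drink18, drink19, and drink20 is read relative to that base category.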


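We will extend our preceding analysis by considering the following models:

$$\text{FatalityRate}_{it} = \beta_0 + \beta_1 \text{BeerTax}_{it} + u_{it}, \tag{1}$$
$$\text{FatalityRate}_{it} = \beta_0 + \beta_1 \text{BeerTax}_{it} + \alpha_i + u_{it}, \tag{2}$$
$$\text{FatalityRate}_{it} = \beta_0 + \beta_1 \text{BeerTax}_{it} + \alpha_i + \lambda_t + u_{it}, \tag{3}$$
$$\begin{aligned} \text{FatalityRate}_{it} = {} & \beta_0 + \beta_1 \text{BeerTax}_{it} + \beta_2 \text{drink18}_{it} + \beta_3 \text{drink19}_{it} + \beta_4 \text{drink20}_{it} \\ & + \beta_5 \text{jaild}_{it} + \beta_6 \text{vmiles}_{it} + \beta_7 \text{unrate}_{it} + \beta_8 \log(\text{perinc}_{it}) + \alpha_i + \lambda_t + u_{it}, \end{aligned} \tag{4}$$
$$\begin{aligned} \text{FatalityRate}_{it} = {} & \beta_0 + \beta_1 \text{BeerTax}_{it} + \beta_2 \text{drink18}_{it} + \beta_3 \text{drink19}_{it} + \beta_4 \text{drink20}_{it} \\ & + \beta_5 \text{jaild}_{it} + \beta_6 \text{vmiles}_{it} + \alpha_i + \lambda_t + u_{it}, \end{aligned} \tag{5}$$
$$\begin{aligned} \text{FatalityRate}_{it} = {} & \beta_0 + \beta_1 \text{BeerTax}_{it} + \beta_2 \text{mlda}_{it} + \beta_3 \text{jaild}_{it} + \beta_4 \text{unrate}_{it} \\ & + \beta_5 \text{vmiles}_{it} + \beta_6 \log(\text{perinc}_{it}) + \alpha_i + \lambda_t + u_{it}, \end{aligned} \tag{6}$$
$$\begin{aligned} \text{FatalityRate}_{it} = {} & \beta_0 + \beta_1 \text{BeerTax}_{it} + \beta_2 \text{drink18}_{it} + \beta_3 \text{drink19}_{it} + \beta_4 \text{drink20}_{it} \\ & + \beta_5 \text{jaild}_{it} + \beta_6 \text{vmiles}_{it} + \beta_7 \text{unrate}_{it} + \beta_8 \log(\text{perinc}_{it}) + \alpha_i + \lambda_t + u_{it}. \end{aligned} \tag{7}$$

The final regression, Model (7), has the same specification as Model (4) but follows the “before and after” approach and uses only data from 1982 and 1988. All other regressions use all time periods.
The fourth model serves as our base model, while the other models change the set of regressors or the sample as sensitivity checks.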
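With only two time periods, the “before and after” approach amounts to a regression in differences. A sketch for the single-regressor case of Model (3), where the difference notation $\Delta$ is introduced here for illustration: subtracting the 1982 equation from the 1988 equation gives

```latex
\Delta Y_i = (\lambda_{1988} - \lambda_{1982}) + \beta_1 \, \Delta X_i + \Delta u_i,
\qquad
\Delta Y_i = Y_{i,1988} - Y_{i,1982}, \quad
\Delta X_i = X_{i,1988} - X_{i,1982}, \quad
\Delta u_i = u_{i,1988} - u_{i,1982}.
```

The intercept $\beta_0$ and the entity effect $\alpha_i$ cancel in the subtraction, and the difference in time effects plays the role of an intercept in the differenced regression.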


Estimation

# Packages: plm for panel models, sandwich for vcovHC on lm objects
library(plm)
library(sandwich)
# Model 1: pooled OLS (HC1 here is heteroskedasticity-robust, not clustered)
r1 = lm(mrall ~ beertax, data = mydata)
r1_vcov = vcovHC(r1, type = "HC1")
r1_se = sqrt(diag(r1_vcov))
# Model 2: state fixed effects (vcovHC on a plm object clusters by state)
r2 = plm(mrall ~ beertax, data = mydata, index = c("state", "year"),
         model = "within")
r2_vcov = vcovHC(r2, type = "HC1")
r2_se = sqrt(diag(r2_vcov))
# Model 3: state and time fixed effects
r3 = plm(mrall ~ beertax, data = mydata, index = c("state", "year"),
         model = "within", effect = "twoways")
r3_vcov = vcovHC(r3, type = "HC1")
r3_se = sqrt(diag(r3_vcov))
# Model 4: base specification
r4 = plm(mrall ~ beertax + drink18 + drink19 + drink20 + jaild +
           vmiles + unrate + log(perinc), data = mydata,
         index = c("state", "year"), model = "within",
         effect = "twoways")
r4_vcov = vcovHC(r4, type = "HC1")
r4_se = sqrt(diag(r4_vcov))
# Model 5: base specification without the economic variables
r5 = plm(mrall ~ beertax + drink18 + drink19 + drink20 + jaild +
           vmiles, data = mydata, index = c("state", "year"),
         model = "within", effect = "twoways")
r5_vcov = vcovHC(r5, type = "HC1")
r5_se = sqrt(diag(r5_vcov))


# Model 6: mlda enters linearly instead of as dummies
r6 = plm(mrall ~ beertax + mlda + jaild + unrate + vmiles + log(perinc),
         data = mydata, index = c("state", "year"),
         model = "within", effect = "twoways")
r6_vcov = vcovHC(r6, type = "HC1")
r6_se = sqrt(diag(r6_vcov))
# Model 7: base specification using only the years 1982 and 1988
r7 = plm(mrall ~ beertax + drink18 + drink19 + drink20 + jaild + vmiles +
           unrate + log(perinc), data = mydata,
         index = c("state", "year"),
         model = "within", effect = "twoways",
         subset = (year == 1982 | year == 1988))
r7_vcov = vcovHC(r7, type = "HC1")
r7_se = sqrt(diag(r7_vcov))

# Print estimation results
library(stargazer)
stargazer(r1, r2, r3, r4, se = list(r1_se, r2_se, r3_se, r4_se),
          title = "Estimation Results", type = "text",
          keep.stat = c("n", "rsq", "adj.rsq"),
          dep.var.labels.include = FALSE,
          model.names = FALSE)

stargazer(r5, r6, r7, se = list(r5_se, r6_se, r7_se),
          title = "Estimation Results", type = "text",
          keep.stat = c("n", "rsq", "adj.rsq"),
          dep.var.labels.include = FALSE,
          column.labels = c("(5)", "(6)", "(7)"),
          model.numbers = FALSE)


Table 1: Estimation Results for Models 1-4

Dependent variable: mrall
               (1)         (2)         (3)         (4)
beertax        0.365***    -0.656**    -0.640*     -0.415
               (0.053)     (0.289)     (0.350)     (0.288)
drink18                                            -0.040
                                                   (0.058)
drink19                                            0.001
                                                   (0.043)
drink20                                            0.076
                                                   (0.077)
jaild                                              0.038
                                                   (0.098)
vmiles                                             0.00001
                                                   (0.00001)
unrate                                             -0.062***
                                                   (0.013)
log(perinc)                                        1.943***
                                                   (0.577)
Constant       1.853***
               (0.047)
Observations   336         336         336         335
R2             0.093       0.041       0.036       0.363
Adjusted R2    0.091       -0.120      -0.149      0.221

Note: * p<0.1; ** p<0.05; *** p<0.01


Table 2: Estimation Results for Models 5-7

Dependent variable: mrall
               (5)         (6)         (7)
beertax        -0.661*     -0.456      -0.905***
               (0.347)     (0.301)     (0.340)
drink18        0.017                   -0.055
               (0.083)                 (0.098)
drink19        -0.023                  -0.075
               (0.064)                 (0.087)
drink20        -0.058                  -0.116
               (0.069)                 (0.122)
mlda                       -0.002
                           (0.021)
jaild          0.085       0.039       0.093
               (0.106)     (0.101)     (0.166)
unrate                     -0.063***   -0.091***
                           (0.013)     (0.020)
vmiles         0.00002     0.00001     0.0001**
               (0.00001)   (0.00001)   (0.0001)
log(perinc)                1.786***    1.096*
                           (0.631)     (0.630)
Observations   335         335         95
R2             0.052       0.357       0.660
Adjusted R2    -0.151      0.219       0.158

Note: * p<0.1; ** p<0.05; *** p<0.01


The estimation results are given in Tables 1 and 2.
Model 1 presents results for the OLS regression of the fatality rate on the real beer tax without state and time fixed effects. The coefficient on the real beer tax is positive (0.365).
However, Model 2, which includes state fixed effects, suggests that the positive coefficient in Model 1 is the result of omitted variable bias: the coefficient on the real beer tax becomes −0.656.
Little changes when time effects are added, as reported in Model 3.
The results in columns (1) through (3) are consistent with the omitted fixed factors (historical and cultural factors, general road conditions, population density, attitudes toward drinking and driving, and so forth) being important determinants of the variation in traffic fatalities across states.
The next four regressions include additional potential determinants of fatality rates along with state and time effects.


The base specification, reported in column (4), includes variables related to drunk driving laws plus variables that control for the amount of driving (vmiles) and overall state economic conditions (unrate and log(perinc)).
Including the additional variables reduces the magnitude of the estimated effect of the beer tax from −0.640 in column (3) to −0.415 in column (4). However, the estimate is quite imprecise.
The minimum legal drinking age is not estimated to have a statistically significant effect on traffic fatalities: none of the three dummy variables (drink18-drink20) is significantly different from zero at any common level of significance.


In Model 4, consider

$$H_0: \beta_2 = \beta_3 = \beta_4 = 0, \qquad H_1: \text{at least one coefficient is nonzero.}$$

Recall that we should use the F-test for the joint null hypothesis. In R, we can use the linearHypothesis function from the car package to compute the F-statistic.

# F test for H0: beta2 = beta3 = beta4 = 0 in Model 4
library(car)
linearHypothesis(r4, test = "F", c("drink18 = 0", "drink19 = 0", "drink20 = 0"),
                 vcov. = vcovHC, type = "HC1")
# Test result
  Res.Df Df      F Pr(>F)
1    276
2    273  3 0.5384 0.6564

The F-statistic is 0.5384 with a p-value of 0.6564. Thus, we fail to reject $H_0$.
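The reported p-value can be checked by hand from the F distribution. This is a small sketch using only base R; it takes the F-statistic and the degrees of freedom (3 restrictions, 273 residual degrees of freedom) from the test output above, and the variable names are illustrative.

```r
f_stat   <- 0.5384   # F-statistic from the test output
q        <- 3        # number of restrictions (Df)
df_resid <- 273      # residual degrees of freedom of the unrestricted model

# Upper-tail probability of the F(q, df_resid) distribution
p_value <- pf(f_stat, df1 = q, df2 = df_resid, lower.tail = FALSE)

# 5% critical value for comparison with the F-statistic
crit_5 <- qf(0.95, df1 = q, df2 = df_resid)

print(c(p_value = p_value, critical_value_5pct = crit_5))
```

Because the F-statistic is below the 5% critical value, the decision matches the one based on the p-value.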


The coefficient on the first offense punishment variable is also estimated to be small and is not significantly different from 0 at the 10% significance level.
The economic variables significantly explain traffic fatalities. We can check that the unemployment rate and per capita income are jointly significant. In Model 4, consider

$$H_0: \beta_7 = \beta_8 = 0, \qquad H_1: \text{at least one coefficient is nonzero.}$$

# F test for H0: beta7 = beta8 = 0 in Model 4
linearHypothesis(r4, test = "F",
                 c("unrate = 0", "log(perinc) = 0"),
                 vcov. = vcovHC, type = "HC1")
# Test result
  Res.Df Df     F    Pr(>F)
1    275
2    273  2 35.01 2.918e-14 ***

The F-statistic is 35.01 with a very small p-value. Thus, we reject the null hypothesis.


Model 5 omits the economic factors. The result supports the notion that
economic indicators should remain in the model as the coefficient on beer tax is
sensitive to the inclusion of the latter.
Results for model (6) demonstrate that the legal drinking age has little
explanatory power and that the coefficient of interest is not sensitive to changes
in the functional form of the relation between drinking age and traffic fatalities.
Specification (7) reveals that reducing the amount of available information (we only use 95 observations, from the years 1982 and 1988) inflates standard errors but does not lead to drastic changes in coefficient estimates.


Stock and Watson (2020) conclude that:


Taken together, these results present a provocative picture of measures
to control drunk driving and traffic fatalities. According to these
estimates, neither stiff punishments nor increases in the minimum legal
drinking age have important effects on fatalities. In contrast, there is
evidence that increasing alcohol taxes, as measured by the real tax on
beer, does reduce traffic deaths, presumably through reduced alcohol
consumption. The imprecision of the estimated beer tax coefficient
means, however, that we should be cautious about drawing policy
conclusions from this analysis and that additional research is warranted.


Bibliography I

Hanck, Christoph et al. (2021). Introduction to Econometrics with R. URL: https://www.econometrics-with-r.org/index.html.
Stock, James H. and Mark W. Watson (2020). Introduction to Econometrics. Fourth edition. Pearson.
