Lecture Notes 11
Review
What is the difference between
- robust estimators (White)
- HAC robust estimators (Newey-West: heteroscedasticity-and-autocorrelation consistent)
- the GLS estimator (Generalized Least Squares, i.e. modeling general heteroscedasticity)?
- White is for heteroscedasticity with no auto-correlation
- Newey-West is for auto-correlation and heteroscedasticity
- both calculate the correct V(b),
- which OLS regression packages do not do
- since OLS assumes V(ε|X) = σ²I
- making V(b) = σ²(X′X)⁻¹
- but with heteroscedasticity, V(b) = (X′X)⁻¹X′ΣX(X′X)⁻¹
- why not use GLS instead of OLS?
- after all, it is efficient
- for GLS you have to specify the exact structure of the heteroscedasticity
- the White and Newey-West robust estimators are especially useful for the case
where you don't think you have a heteroscedasticity problem
- check to see whether you have a problem
- OLS is still consistent
- but White and Newey-West allow correct inference
Testing IV assumptions
1. E(ε|X) ≠ 0 - Hausman test for endogeneity
test if b = β̂_2SLS
2. E(ε|Z) = 0 - Overidentification test (only possible if L > K)
test if β̂ is the same with or without the extra L−K instruments
3. E(Z′X) = Q_ZX ≠ 0 - Weak instrument test
test the correlation of Z and X from the first stage of 2SLS
Important because one is using IV in the first place because there is doubt about
endogeneity, and it is never obvious that instruments are both exogenous and highly
correlated with X.
- several variations on each of these tests
- also different versions if you allow robust standard errors (H or AC or HAC)
1. Hausman (and Wu, Durbin) test
Is there an endogeneity problem in the first place?
i.e. is E(ε|X) = 0?
Can't test E(ε|X) = 0 directly
OLS residuals e are constructed so that X′e = 0
If E(ε|X) = 0, then OLS is consistent, and so is IV
- because it is still true that E(ε|Z) = 0 and E(Z′X) = Q_ZX
But if E(ε|X) ≠ 0, then OLS is inconsistent, but IV is consistent
Hausman tests whether β̂_2SLS − b = 0
H_0: β̂_2SLS − b = 0
H_A: β̂_2SLS − b ≠ 0
Use Wald statistic
H = (β̂_2SLS − b)′ {Est.Asy.Var[β̂_2SLS − b]}⁻¹ (β̂_2SLS − b)
Asy.Var[β̂_2SLS − b] = Asy.Var[β̂_2SLS] + Asy.Var[b] − 2 Asy.Cov[β̂_2SLS, b]
But what is Asy.Cov[β̂_2SLS, b]?
First, Hausman noted that under H_0, OLS is efficient and IV is not
Asy.Var[β̂_2SLS] − Asy.Var[b] = (σ²/n) plim(X̂′X̂/n)⁻¹ − (σ²/n) plim(X′X/n)⁻¹
since X̂ is an estimate of X
- it is less correlated with X than X is with itself
- (unless the columns of Z perfectly predict the columns of X)
plim(X̂′X̂/n)⁻¹ > plim(X′X/n)⁻¹, so Asy.Var[β̂_2SLS] > Asy.Var[b]
Second, he proved that
the covariance between an efficient estimator (b)
and its difference from an inefficient estimator of the same parameter (β̂_2SLS)
is zero.
So Cov[b, β̂_2SLS − b] = Cov[b, β̂_2SLS] − Var[b] = 0
or Cov[b, β̂_2SLS] = Var[b]
so Asy.Var[β̂_2SLS − b] = Asy.Var[β̂_2SLS] − Asy.Var[b]
Est.Asy.Var[β̂_2SLS − b] = s²(X̂′X̂)⁻¹ − s²(X′X)⁻¹
so H = (β̂_2SLS − b)′ {Est.Asy.Var[β̂_2SLS − b]}⁻¹ (β̂_2SLS − b)
H = (β̂_2SLS − b)′ (V̂[β̂_2SLS] − V̂[b])⁻¹ (β̂_2SLS − b)
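A numerical sketch of this statistic, assuming a simulated endogenous regressor (the DGP, seed, and degrees-of-freedom choice here are illustrative assumptions, not from the notes):

```python
# Hausman statistic H = (b_2sls - b)' [V_2sls - V_ols]^(-1) (b_2sls - b)
# on simulated data with one endogenous regressor and one instrument.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 2000
z = rng.normal(size=n)                          # instrument
u = rng.normal(size=n)                          # common shock -> endogeneity
x = 0.8 * z + u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u + rng.normal(size=n)      # x is correlated with the error

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

b_ols = np.linalg.solve(X.T @ X, X.T @ y)
Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)    # first-stage fitted values
b_iv = np.linalg.solve(Xhat.T @ X, Xhat.T @ y)  # 2SLS

s2 = np.sum((y - X @ b_ols) ** 2) / n           # s^2 from OLS (efficient under H0)
V_diff = s2 * np.linalg.inv(Xhat.T @ Xhat) - s2 * np.linalg.inv(X.T @ X)

d = b_iv - b_ols
H = d @ np.linalg.pinv(V_diff) @ d              # pinv: the difference can be singular
print(H, stats.chi2.sf(H, df=1))                # df = number of endogenous regressors
```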
2. Overidentification test
- only possible if L > K
E(z_i ε_i) = 0   orthogonality condition
E(m̄) = E((1/n) Σ_i z_i ε_i) = 0, even though not exactly true in sample
So test whether (1/n) Σ_i z_i ε_i = 0 when L > K
i.e. test m̄ = 0
- use m̄ = (1/n) Σ_i z_i e_IV,i = (1/n) Σ_i z_i (y_i − x_i′β̂_IV)
- then m̄′[Var(m̄)]⁻¹ m̄ ∼ χ²_{L−K}
- only L−K degrees of freedom because β̂_IV already forces the first K moment
conditions to be exactly zero
- Est.Var(m̄) = (1/n²) Σ_i (z_i e_IV,i)(z_i e_IV,i)′ = (1/n²) Σ_i e²_IV,i z_i z_i′
- m̄ = (1/n) Σ_i z_i e_IV,i = (1/n) Z′e_IV
- so the Wald stat is χ² = e_IV′Z [Σ_i e²_IV,i z_i z_i′]⁻¹ Z′e_IV
- can view this as a test of whether the instruments give the same answer
as each other.
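A sketch of this statistic on simulated data (the DGP and all names are illustrative assumptions; here L = 3 instrument columns and K = 2 regressors, so L − K = 1):

```python
# Overidentification statistic m'[Est.Var(m)]^(-1) m ~ chi2(L-K).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 2000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
u = rng.normal(size=n)
x = 0.7 * z1 + 0.7 * z2 + u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])            # K = 2
Z = np.column_stack([np.ones(n), z1, z2])       # L = 3 (overidentified)

Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b_iv = np.linalg.solve(Xhat.T @ X, Xhat.T @ y)
e_iv = y - X @ b_iv

m = Z.T @ e_iv / n                              # (1/n) sum z_i e_IV,i
S = (Z * (e_iv ** 2)[:, None]).T @ Z / n ** 2   # (1/n^2) sum e_i^2 z_i z_i'
J = m @ np.linalg.inv(S) @ m
print(J, stats.chi2.sf(J, df=Z.shape[1] - X.shape[1]))   # df = L - K
```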
3. Test for weak instruments
- testing E(Z′X) ≠ 0: whether Z is sufficiently correlated with X
- if there is just one endogenous variable, then the first stage of 2SLS is the regression
x_i = z_i′γ + υ_i
- how to test correlation?
- just test that all γ = 0
- how would we carry this out? (an F test on the first stage; see the sketch below)
- more complicated if more than one endogenous variable
- if the correlation of X and Z is weak
- Asy.Var[β̂_IV] = σ²[X′Z(Z′Z)⁻¹Z′X]⁻¹
- if X′Z → 0, then Asy.Var[β̂_IV] → ∞
- the Godfrey test compares the variances of b and β̂_2SLS
- for just one endogenous x_k, with the ratio
R²_k = (X̂′X̂)_kk / (X′X)_kk ,
then R²_k(n − L) / [(1 − R²_k)(L − 1)] ∼ F
- more complicated with multiple endogenous xk
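For the single-endogenous-variable case, the first-stage F test is easy to run; a sketch on simulated, deliberately weak instruments follows. The DGP is an illustrative assumption, and the common "F > 10" rule of thumb is a heuristic from the weak-instruments literature, not from these notes.

```python
# First-stage F test: regress the endogenous x on the instruments and
# test that all instrument coefficients (gamma) are zero.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 1000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
x = 0.1 * z1 + 0.1 * z2 + rng.normal(size=n)     # weakly correlated with Z

Z = sm.add_constant(np.column_stack([z1, z2]))
first_stage = sm.OLS(x, Z).fit()
print(first_stage.fvalue, first_stage.f_pvalue)  # F that all gamma = 0
```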
Measurement Error
y*_i = βx*_i + ε_i
y_i = y*_i + υ_i
x_i = x*_i + u_i
- if there is error only in y_i, no problem
y_i = βx*_i + ε_i + υ_i = βx*_i + ε′_i
- if there is error in x_i, big problem
y_i = βx_i + ε_i − βu_i = βx_i + w_i
Cov[x_i, w_i] = Cov[x*_i + u_i, ε_i − βu_i] = −βσ²_u
- violates exogeneity of x
plim b = β / (1 + σ²_u/plim(x*′x*/n))   attenuation bias - b too small
- in the multivariate context, we don't know the direction of the bias
- even if just one x has measurement error, all the elements of b are biased
- to fix, use IV
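A small simulation (illustrative assumptions: true β = 2, measurement-error variance 1) makes the attenuation visible, and shows one standard IV fix: instrumenting the mismeasured x with a second independent measurement.

```python
# Attenuation bias from measurement error in x, and an IV fix.
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
x_star = rng.normal(size=n)                 # true regressor
y = 2.0 * x_star + rng.normal(size=n)
x = x_star + rng.normal(size=n)             # observed with error, var(u) = 1
x2 = x_star + rng.normal(size=n)            # second measurement, used as IV

b_ols = (x @ y) / (x @ x)                   # plim = 2 / (1 + 1/1) = 1.0
b_iv = (x2 @ y) / (x2 @ x)                  # consistent for beta = 2.0
print(b_ols, b_iv)
```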
Panel Data
Have cross section data on units (the “panel”) repeatedly measured over time
- AKA cross-section time-series data (“xt” in Stata)
Nothing inherently problematic, just allows you to correct for more issues
- an opportunity to make more precise estimates
- in particular, to control for all unchanging individual characteristics
- with an individual-specific constant term
Panel data typically expensive and difficult to collect
- attrition bias
- not typically random who drops out of panel over time
- important to have dedicated surveyors who track everyone down
Have both an individual subscript i and a time subscript t
y_it = x_it′β + ε_it
if T is the same for all individuals, then a “balanced panel”
if T_i is different for each individual, an "unbalanced panel"
- in general, just complicates the notation a bit
- rarely a substantive issue, unless you are programming estimators
How many observations?
nT if balanced, Σ_i T_i if unbalanced
Most important issues
1. How do we estimate individual effects?
- fixed effects or random effects models?
2. Do the coefficients (β_i) vary by individual?
- random coefficients model
3. How do we model autoregressive errors?
- Arellano-Bond GMM estimators
Fixed vs. Random effects
y_it = x_it′β + α_i + ε_it, where x_it doesn't have a column of ones (why not?)
i.e. why can't we estimate y_it = x_it′β + β_0 + α_i + ε_it?
y = Xβ + iβ_0 + d_1α_1 + d_2α_2 + ⋯ + d_nα_n + ε
Σ_i d_i = i,
so i is collinear with the d_i
the X matrix (including i and the d_i) will not be full rank
- this is just the usual dummy variable problem
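A tiny numeric check of this point (a sketch with invented dimensions): the n individual dummies sum to the ones column, so stacking both is rank deficient.

```python
# The constant column plus all n individual dummies is rank deficient.
import numpy as np

n, T = 4, 3
D = np.kron(np.eye(n), np.ones((T, 1)))      # dummies d_1, ..., d_n
i = np.ones((n * T, 1))                      # column of ones
M = np.hstack([i, D])
print(np.linalg.matrix_rank(M), M.shape[1])  # rank n, but n + 1 columns
```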
This is the big deal of most panel data estimation
- we can estimate an individual-specific constant term
- means we can control for all unchanging individual characteristics
- another tool for reducing endogeneity
- why don’t we estimate this for cross-sectional data?
yi = x i′β + α i + ε i
because we would be estimating n+K coefficients
- with n observations
- failure of identification
with A1-A4, we can estimate this with OLS
consistent and efficient
known as "fixed effects", but this doesn't mean the α_i are not random variables
- a misnomer
Issues:
1) α i not consistently estimated
- each α i just estimated from T observations
- imagine we just had data on 1 individual
- could still estimate that α i
- since T is typically small, too few obs for consistent estimate
- typically less than 25, almost certainly less than 100
- often said that “T is assumed fixed”
- not a good way to say it
- T is just too small for accurate estimates
- and for asymptotic approximations to be reliable
- therefore can’t trust value of α i
but we have controlled for all unchanging individual characteristics
Aside: sample size doesn't matter only for asymptotics
- even with exact small-sample statistics,
- estimates are still imprecise in small samples
- we just have more confidence that we know the true variance
2) cannot estimate effect of any other unchanging characteristic
- e.g. effect of ability on earnings
- can control for effect of ability if it is unchanging
- can’t independently estimate effect of education
- since unchanging among adults
- lack of identification
Estimating Fixed Effects:
- if 1000s of individuals, 1000s of individual effects
- each with its own dummy variable
- a regression with 1000s of independent variables
computationally inefficient,
especially since we don’t care about value of α i
instead subtract off individual means:
y_it = x_it′β + α_i + ε_it
ȳ_i ≡ (1/T) Σ_t y_it
ȳ_i = x̄_i′β + α_i + ε̄_i    n.b. ᾱ_i = α_i
y_it − ȳ_i = (x_it′ − x̄_i′)β + α_i − α_i + ε_it − ε̄_i
y_it − ȳ_i = (x_it′ − x̄_i′)β + ε_it − ε̄_i
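A sketch of this within estimator on a simulated balanced panel (the DGP is invented for illustration), showing that demeaning removes α_i while pooled OLS is biased by it:

```python
# Within (fixed-effects) transformation: demean y and x by individual.
import numpy as np

rng = np.random.default_rng(5)
n, T = 500, 5
alpha = rng.normal(size=(n, 1))                   # individual effects
x = rng.normal(size=(n, T)) + alpha               # x correlated with alpha
y = 2.0 * x + alpha + rng.normal(size=(n, T))

x_dm = x - x.mean(axis=1, keepdims=True)          # x_it - xbar_i
y_dm = y - y.mean(axis=1, keepdims=True)          # y_it - ybar_i

beta_fe = (x_dm.ravel() @ y_dm.ravel()) / (x_dm.ravel() @ x_dm.ravel())
beta_pooled = (x.ravel() @ y.ravel()) / (x.ravel() @ x.ravel())  # biased
print(beta_fe, beta_pooled)                       # ~2.0 vs ~2.5
```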
Are A1-A4 still met for regressing y_it − ȳ_i on x_it′ − x̄_i′?
- Is [x_it′ − x̄_i′] full column rank?
yes - subtracting off means doesn't change that
- Is E(ε_it − ε̄_i | X) = 0?
- yes - because E(ε_it) = 0 ∀ i,t, so E(ε̄_i) = 0
- Is V(ε_it − ε̄_i | X) = σ²?
V(ε_it − ε̄_i | X) = V(ε_it | X) + V(ε̄_i | X) − 2Cov(ε_it, ε̄_i | X)
V(ε̄_i | X) = V((ε_i1 + ⋯ + ε_iT)/T | X) = Tσ²/T² = σ²/T
Cov(ε_it, ε̄_i | X) = Cov(ε_it, (ε_i1 + ⋯ + ε_iT)/T | X) = Cov(ε_it, ε_it/T | X)
because Cov(ε_it, ε_is | X) = 0 ∀ t ≠ s
Cov(ε_it, ε̄_i | X) = σ²/T
so V(ε_it − ε̄_i | X) = σ² + σ²/T − 2σ²/T = (1 − 1/T)σ²
Variance no longer equal to σ 2 , but still homoscedastic
How about autocorrelation?
Cov(ε_it − ε̄_i, ε_is − ε̄_i | X) = Cov(ε_it, ε_is | X) − Cov(ε̄_i, ε_is | X) − Cov(ε_it, ε̄_i | X) + Cov(ε̄_i, ε̄_i | X)
Cov(ε_it, ε_is | X) = 0
Cov(ε̄_i, ε_is | X) = Cov(ε_it, ε̄_i | X) = σ²/T
Cov(ε̄_i, ε̄_i | X) = V(ε̄_i | X) = σ²/T, so
Cov(ε_it − ε̄_i, ε_is − ε̄_i | X) = −2σ²/T + σ²/T = −σ²/T
Let Σ_i = σ² ×
⎡ 1−1/T   −1/T   ⋯   −1/T  ⎤
⎢ −1/T   1−1/T   ⋯   −1/T  ⎥
⎢   ⋮       ⋮     ⋱     ⋮   ⎥
⎣ −1/T    −1/T   ⋯  1−1/T ⎦   (a T × T matrix),

and let [ε̄_i] = (ε̄_1, …, ε̄_1, …, ε̄_n, …, ε̄_n)′, each individual mean repeated T times; then

V[ε − [ε̄_i] | X] =
⎡ Σ_1   0   ⋯   0  ⎤
⎢  0   Σ_2  ⋯   0  ⎥
⎢  ⋮    ⋮   ⋱   ⋮  ⎥
⎣  0    0   ⋯  Σ_n ⎦
How big is this matrix?
nT × nT
Are our OLS assumptions met?
No - Autocorrelation within individual time series
Use GLS - easy to form the P matrices
because we just need an estimate of σ²/T, namely s²/T
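A quick simulation check of the two results above, variance (1 − 1/T)σ² and within-individual covariance −σ²/T (the sample sizes are arbitrary illustrative choices):

```python
# Numeric check: demeaned i.i.d. errors have V = (1 - 1/T) sigma^2 and
# within-individual Cov = -sigma^2 / T.
import numpy as np

rng = np.random.default_rng(6)
n, T, sigma2 = 200_000, 5, 1.0
e = rng.normal(scale=np.sqrt(sigma2), size=(n, T))
e_dm = e - e.mean(axis=1, keepdims=True)           # e_it - ebar_i

print(e_dm[:, 0].var())                            # ~ (1 - 1/T) sigma^2 = 0.8
print(np.cov(e_dm[:, 0], e_dm[:, 1])[0, 1])        # ~ -sigma^2 / T = -0.2
```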
Time and individual fixed effects:
y_it = x_it′β + α_i + δ_t + ε_it
if ȳ_t ≡ (1/n) Σ_i y_it, then regress y_it − ȳ_i − ȳ_t + ȳ on x_it′ − x̄_i′ − x̄_t′ + x̄′
(adding back the grand means ȳ and x̄ keeps the transformation exact in a balanced panel)
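A sketch of this two-way demeaning on a simulated balanced panel (all names and values are illustrative assumptions):

```python
# Two-way (individual and time) fixed effects via double demeaning.
import numpy as np

rng = np.random.default_rng(7)
n, T = 400, 8
alpha = rng.normal(size=(n, 1))                   # individual effects
delta = rng.normal(size=(1, T))                   # time effects
x = rng.normal(size=(n, T)) + alpha + delta
y = 2.0 * x + alpha + delta + rng.normal(size=(n, T))

def demean2(a):
    # a_it - abar_i - abar_t + abar (grand mean added back)
    return a - a.mean(1, keepdims=True) - a.mean(0, keepdims=True) + a.mean()

xd, yd = demean2(x), demean2(y)
print((xd.ravel() @ yd.ravel()) / (xd.ravel() @ xd.ravel()))  # ~ 2.0
```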