Pooled and Panel Data Analysis

This document provides an overview of pooled cross-section and panel data analysis. It defines pooled cross-section data as randomly sampled cross-sections of individuals at different points in time, while panel data observes the same cross-sections of individuals over time. The document discusses how pooled cross-section data can be used to study policy impacts through natural experiments and differences-in-differences estimations. It also introduces panel data analysis using an error components model to control for individual fixed effects and deal with endogeneity. An example analyzes the effect of unemployment rates on crime rates using a two-period panel data set on 46 cities.


Econ 582

Introduction to Pooled Cross Section and Panel Data
Eric Zivot

May 22nd, 2012

Outline

Pooled Cross Section and Panel Data

Analysis of Pooled Cross Section Data
Two Period Panel Data
Multi-period Panel Data

Pooled Cross Section and Panel Data

Definition 1 (Pooled cross-section data) Randomly sampled cross sections of individuals at different points in time

Example: Current Population Survey (CPS) in 1978 and 1988

Definition 2 (Panel data) Observe cross sections of the same individuals at different points in time

Example: National Longitudinal Survey of Youth (NLSY)

Pooled Cross Section Data

Pooling makes sense if cross sections are randomly sampled (like one big sample)

Time dummy variables can be used to capture structural change over time

Observations across different time periods allow for policy analysis

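Example: Women's fertility over time (Wooldridge)

National Opinion Research Center's General Social Survey for even years from 1972 to 1984

$$kids = \beta_0 + \delta_1\, y74 + \cdots + \delta_6\, y84 + \boldsymbol{\beta}'\mathbf{x} + u$$

$y74 = 1$ if year = 74, 0 otherwise (year dummy)

$\mathbf{x} = (educ, age, age^2, \ldots)'$

Q: After controlling for observable factors (educ etc.), what has happened to fertility over time?
A: Time effects on fertility are captured by the dummy variables

$$E[kids \mid \mathbf{x}, year = 72] = \beta_0 + \boldsymbol{\beta}'\mathbf{x}$$

$$E[kids \mid \mathbf{x}, year = 74] = \beta_0 + \delta_1 + \boldsymbol{\beta}'\mathbf{x}$$

$$E[kids \mid \mathbf{x}, year = 74] - E[kids \mid \mathbf{x}, year = 72] = \delta_1$$

Hence, $\delta_1$ = change in fertility between 1972 and 1974, controlling for $\mathbf{x}$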
Some complications:

The error variance $\mathrm{var}(u)$ may change over time. Best to use heteroskedasticity-consistent (HC) standard errors

Other coefficients may not be constant over time
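To make the point about HC standard errors concrete, here is a minimal sketch in plain Python comparing the classical standard error of an OLS slope with White's HC0 version, which is one common member of the HC family (the notes do not say which variant is intended). The data, the function names, and the one-regressor setup are all hypothetical and chosen only for illustration.

```python
import math

def ols_simple(x, y):
    """Return (intercept, slope, residuals) for y = b0 + b1*x by OLS."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return b0, b1, resid

def slope_standard_errors(x, y):
    """Classical and White (HC0) standard errors of the OLS slope."""
    n = len(x)
    _, _, resid = ols_simple(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # Classical: assumes one common error variance for every observation.
    sigma2 = sum(e ** 2 for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)
    # HC0: weights each squared residual by its own squared deviation in x.
    meat = sum(((xi - x_bar) ** 2) * (e ** 2) for xi, e in zip(x, resid))
    se_hc0 = math.sqrt(meat) / sxx
    return se_classical, se_hc0

# Hypothetical data: the error spread grows as x moves away from its mean.
x = [i / 20 for i in range(200)]
x_mean = sum(x) / len(x)
u = [(1 if i % 2 == 0 else -1) * 0.5 * (xi - x_mean) ** 2
     for i, xi in enumerate(x)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

se_classical, se_hc0 = slope_standard_errors(x, y)
print(f"classical SE = {se_classical:.4f}, HC0 SE = {se_hc0:.4f}")
```

In this constructed example the error variance is largest where x is far from its mean, so the classical formula understates the uncertainty about the slope and the HC0 standard error comes out larger.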

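Example cont'd

To allow the coefficients on $\mathbf{x}$ to vary over time, add interaction terms with the year dummy variables:

$$kids = \beta_0 + \delta_1\, y74 + \cdots + \delta_6\, y84 + \boldsymbol{\beta}'\mathbf{x} + \boldsymbol{\gamma}_1'(y74 \cdot \mathbf{x}) + \cdots + \boldsymbol{\gamma}_6'(y84 \cdot \mathbf{x}) + u$$

Then

$$E[kids \mid \mathbf{x}, year = 72] = \beta_0 + \boldsymbol{\beta}'\mathbf{x}$$

$$E[kids \mid \mathbf{x}, year = 74] = \beta_0 + \delta_1 + (\boldsymbol{\beta} + \boldsymbol{\gamma}_1)'\mathbf{x}$$

and

$$E[kids \mid \mathbf{x}, year = 74] - E[kids \mid \mathbf{x}, year = 72] = \delta_1 + \boldsymbol{\gamma}_1'\mathbf{x}$$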

Testing for Structural Change (Chow Test)

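$H_0$: (no structural change) $\delta_1 = \cdots = \delta_6 = 0$ and $\boldsymbol{\gamma}_1 = \cdots = \boldsymbol{\gamma}_6 = \mathbf{0}$

$H_1$: (structural change) some $\delta_j \neq 0$ and/or $\boldsymbol{\gamma}_j \neq \mathbf{0}$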

Use F-test or Wald test

Advisable to correct for possible heteroskedasticity
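As an illustration of the F-test version, the sketch below computes a Chow statistic for the simplest case: one regressor and two periods. It relies on the fact that the unrestricted model (period dummy plus dummy-regressor interaction) gives the same fit as running a separate regression in each period. The data and function names are hypothetical, and the statistic shown is the classical one that assumes homoskedastic errors, not a heteroskedasticity-robust Wald test.

```python
def ssr_simple_ols(x, y):
    """Sum of squared residuals from an OLS fit of y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def chow_f(x1, y1, x2, y2):
    """Chow F statistic for equal intercept and slope across two periods."""
    k = 2  # restrictions tested = parameters per period (intercept, slope)
    n = len(x1) + len(x2)
    # Restricted model: one pooled regression, no dummy or interaction.
    ssr_restricted = ssr_simple_ols(x1 + x2, y1 + y2)
    # Unrestricted model: separate intercept and slope in each period.
    ssr_unrestricted = ssr_simple_ols(x1, y1) + ssr_simple_ols(x2, y2)
    numerator = (ssr_restricted - ssr_unrestricted) / k
    denominator = ssr_unrestricted / (n - 2 * k)
    return numerator / denominator

# Hypothetical data: intercept and slope both shift between the periods.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.1, -0.3, 0.2]
x1 = [float(i) for i in range(1, 9)]
x2 = list(x1)
y1 = [1.0 + 2.0 * xi + e for xi, e in zip(x1, noise)]
y2 = [3.0 + 1.0 * xi + e for xi, e in zip(x2, noise)]

f_stat = chow_f(x1, y1, x2, y2)
print(f"Chow F statistic = {f_stat:.1f} with (2, {len(x1) + len(x2) - 4}) df")
```

A large F relative to the F(2, n - 4) critical value rejects the null of no structural change. With six year dummies and a vector of controls, as in the fertility example, the same restricted-versus-unrestricted comparison applies with more restrictions.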

Policy Analysis with Pooled Cross Section Data

Pooled cross-sections can be useful for evaluating the impact of certain events or policy interventions

Event or policy intervention must be a natural experiment, i.e., must be exogenously imposed on the data

Control variables must be exogenous (no endogenous regressors)

Example: Effect of Garbage Incinerator Location on House Values in North Andover, MA

Two-year pooled cross section of data for 1978 and 1981

New incinerator built in 1981 and online in 1985
The incinerator project was not known in 1978
Q: Did the values of houses near the incinerator decline?

Regression using 1981 data

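$$rprice = \gamma_0 + \gamma_1\, nearinc + u$$

$$\widehat{rprice} = \underset{(3{,}093)}{101{,}307} - \underset{(5{,}827)}{30{,}688}\; nearinc$$

$nearinc = 1$ if near incinerator, 0 otherwise

$n = 142$, $R^2 = 0.665$

Note

$$E[rprice \mid nearinc = 1 \text{ in } 1981] - E[rprice \mid nearinc = 0 \text{ in } 1981] = \gamma_1 = -30{,}688$$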

Regression using 1978 data

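$$\widehat{rprice} = \underset{(2{,}653)}{82{,}517} - \underset{(5{,}287)}{18{,}824}\; nearinc$$

$n = 142$, $R^2 = 0.665$

Note

$$E[rprice \mid nearinc = 1 \text{ in } 1978] - E[rprice \mid nearinc = 0 \text{ in } 1978] = -18{,}824$$

so it appears that the incinerator was built in a low income/house value area.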

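Difference in Differences (Diff-in-Diff) Estimate

To determine the impact of the incinerator on house values, we need to compare the differences between the treatment and control groups across the two time periods (compute the difference in the differences)

$$\begin{aligned}
&\left(E[rprice \mid nearinc = 1 \text{ in } 1981] - E[rprice \mid nearinc = 0 \text{ in } 1981]\right) \\
&\quad - \left(E[rprice \mid nearinc = 1 \text{ in } 1978] - E[rprice \mid nearinc = 0 \text{ in } 1978]\right) \\
&= -30{,}688 - (-18{,}824) \\
&\approx -11{,}863
\end{aligned}$$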
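The computation above amounts to four group means and two subtractions. The sketch below shows that arithmetic on a small made-up sample; the record layout `(value, treated, post)`, the function name, and the house values are hypothetical and are not the North Andover data.

```python
from statistics import mean

def diff_in_diff(records):
    """Diff-in-diff from a list of (value, treated, post) with 0/1 flags."""
    def cell_mean(treated, post):
        return mean(v for v, d, p in records if d == treated and p == post)

    # Treatment-minus-control gap after the event and before it.
    diff_after = cell_mean(1, 1) - cell_mean(0, 1)
    diff_before = cell_mean(1, 0) - cell_mean(0, 0)
    return diff_after - diff_before

# Hypothetical house values in $1,000s: (value, near incinerator, later year)
records = [
    (80.0, 0, 0), (84.0, 0, 0), (62.0, 1, 0), (66.0, 1, 0),
    (100.0, 0, 1), (104.0, 0, 1), (70.0, 1, 1), (74.0, 1, 1),
]
did = diff_in_diff(records)
print(f"diff-in-diff on the hypothetical sample: {did}")

# The same subtraction applied to the two rounded differences in the notes.
reported = -30688 - (-18824)
print(reported)  # -11864
```

With the rounded inputs the subtraction gives -11,864; the one-unit gap from the -11,863 reported in the notes presumably reflects rounding of the underlying estimates.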

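Dummy Variable Formulation of Diff-in-Diff Estimation

$$rprice = \beta_0 + \delta_0\, y81 + \beta_1\, nearinc + \delta_1\, (y81 \cdot nearinc) + u$$

Then

$$E[rprice \mid nearinc = 1, y81 = 1] = \beta_0 + \delta_0 + \beta_1 + \delta_1$$

$$E[rprice \mid nearinc = 0, y81 = 1] = \beta_0 + \delta_0$$

$$\Rightarrow \Delta_{81} = \beta_1 + \delta_1$$

$$E[rprice \mid nearinc = 1, y81 = 0] = \beta_0 + \beta_1$$

$$E[rprice \mid nearinc = 0, y81 = 0] = \beta_0$$

$$\Rightarrow \Delta_{78} = \beta_1$$

$$\Rightarrow \Delta_{81} - \Delta_{78} = \delta_1$$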

Dummy variable regression results

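$$\widehat{rprice} = \underset{(2{,}726)}{82{,}517} + \underset{(4{,}050)}{18{,}790}\; y81 - \underset{(4{,}875)}{18{,}824}\; nearinc - \underset{(7{,}456)}{11{,}863}\; (y81 \cdot nearinc)$$

$$\hat{\delta}_1 = -11{,}863 = \hat{\Delta}_{81} - \hat{\Delta}_{78}$$

$$t_{\delta_1 = 0} = \frac{-11{,}863}{7{,}456} = -1.59$$

Note: The dummy variable formulation allows the standard error of $\hat{\delta}_1$ to be computed.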
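To see why the interaction coefficient reproduces the diff-in-diff estimate, the sketch below fits the saturated dummy regression on a tiny made-up sample using a hand-written normal-equations solver. The `ols` helper, the variable names, and the data are hypothetical; in practice a regression package would also report the standard error, which this sketch does not compute.

```python
def ols(X, y):
    """OLS coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical records: (house value in $1,000s, nearinc, y81)
data = [
    (80.0, 0, 0), (84.0, 0, 0), (62.0, 1, 0), (66.0, 1, 0),
    (100.0, 0, 1), (104.0, 0, 1), (70.0, 1, 1), (74.0, 1, 1),
]
# Regressors: constant, y81, nearinc, y81 * nearinc
X = [[1.0, post, near, post * near] for _, near, post in data]
y = [value for value, _, _ in data]

b0, d0, b1, d1 = ols(X, y)
print(f"b0 = {b0:.1f}, d0 = {d0:.1f}, b1 = {b1:.1f}, d1 = {d1:.1f}")
```

On this sample the four group means are 82, 64, 102 and 72, so the difference of differences is (72 - 102) - (64 - 82) = -12, and the fitted interaction coefficient `d1` matches it up to floating-point error.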

Natural Experiment

Some exogenous event (e.g., change in government policy) changes the environment in which individuals, families, firms, cities, etc., operate

Control group is not affected by the policy change

Treatment group is thought to be affected by the policy change

No random assignment to control and treatment groups

Group comparison
Group      Period 1  Period 2  Difference
Control    before    after     Diff
Treatment  before    after     Diff
                               Diff in Diff

Two Period Panel Data

Observe cross section on the same individuals, cities, countries, etc., in two time periods $t = 1$ and $t = 2$

Panel data structure makes it possible to deal with certain types of endogeneity without the use of exogenous instruments

Extends the natural experiment framework to situations in which there may be endogeneity

Example: Determine the effect of the unemployment rate on crime rates (Wooldridge)
Data on crime rates and unemployment for 46 cities for 1982 and 1987
Regression for 1982

$$\widehat{crmrte} = \underset{(20.76)}{128.38} - \underset{(3.42)}{4.16}\; unemp$$

$n = 46$, $R^2 = 0.033$

It appears that an increase in unemp lowers the crime rate (but the effect is not significant)!

Bias likely due to omitted variables (unemp is endogenous)

Error Components Framework for Two Period Panel Data

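$$\begin{aligned}
y_{it} &= \beta_0 + \delta_0\, d2_t + \boldsymbol{\beta}'\mathbf{x}_{it} + v_{it}, \quad t = 1, 2 \\
&= \beta_0 + \delta_0\, d2_t + \boldsymbol{\beta}'\mathbf{x}_{it} + (a_i + u_{it})
\end{aligned}$$

$d2_t = 1$ if $t = 2$; 0 otherwise
$a_i$ = unobserved heterogeneity (fixed effect)
$u_{it}$ = idiosyncratic error

$a_i$ represents unobserved omitted variables that vary across individuals but stay fixed over time (e.g., race, gender, ability)

$\mathbf{x}_{it}$ is endogenous if it is correlated with $a_i$, in which case pooled OLS is biased and inconsistent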

Example: Pooled OLS estimates in crime rate regression

$$\widehat{crmrte} = \underset{(12.74)}{93.42} + \underset{(7.98)}{7.94}\; d87 + \underset{(1.188)}{0.427}\; unemp$$

$n = 92$ (46 × 2), $R^2 = 0.012$

unemp is not significant in pooled regression

It is likely that unemp is endogenous; e.g., correlated with omitted time-invariant, city-specific demographic variables like age, race, education levels, attitudes towards crime, etc.

Eliminating Endogeneity in Two Period Panel Data

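$$y_{it} = \beta_0 + \delta_0\, d2_t + \boldsymbol{\beta}'\mathbf{x}_{it} + a_i + u_{it}, \quad t = 1, 2$$

Then

$$\begin{aligned}
t = 1: \quad y_{i1} &= \beta_0 + \boldsymbol{\beta}'\mathbf{x}_{i1} + a_i + u_{i1} \\
t = 2: \quad y_{i2} &= \beta_0 + \delta_0 + \boldsymbol{\beta}'\mathbf{x}_{i2} + a_i + u_{i2} \\
\Delta: \quad \Delta y_{i2} &= \delta_0 + \boldsymbol{\beta}'\Delta\mathbf{x}_{i2} + \Delta u_{i2}
\end{aligned}$$

First differencing eliminates the unobserved fixed effect $a_i$!
OLS on first differenced data gives consistent estimates of $\boldsymbol{\beta}$ (provided $\Delta\mathbf{x}_{i2}$ is uncorrelated with $\Delta u_{i2}$)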
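The effect of differencing can be checked with a small simulation. The sketch below generates a two-period panel in which the regressor is correlated with the fixed effect, then compares pooled OLS (with a period dummy) against OLS on first differences. The sample size, parameter values, seed, and helper names are all hypothetical choices made for illustration.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on a constant and x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def demean(values):
    """Subtract the sample mean from every element."""
    m = sum(values) / len(values)
    return [v - m for v in values]

random.seed(582)
N, beta0, delta0, beta = 2000, 1.0, 0.5, 2.0

a = [random.gauss(0, 1) for _ in range(N)]    # fixed effect a_i
x1 = [ai + random.gauss(0, 1) for ai in a]    # x_i1, correlated with a_i
x2 = [ai + random.gauss(0, 1) for ai in a]    # x_i2, correlated with a_i
y1 = [beta0 + beta * x + ai + random.gauss(0, 1) for x, ai in zip(x1, a)]
y2 = [beta0 + delta0 + beta * x + ai + random.gauss(0, 1)
      for x, ai in zip(x2, a)]

# Pooled OLS with a period dummy: demeaning within each period partials out
# the constant and d2 (Frisch-Waugh-Lovell), leaving one regressor.
pooled_slope = slope(demean(x1) + demean(x2), demean(y1) + demean(y2))

# First differences: a_i cancels in y_i2 - y_i1; delta0 becomes the intercept.
dx = [later - earlier for earlier, later in zip(x1, x2)]
dy = [later - earlier for earlier, later in zip(y1, y2)]
fd_slope = slope(dx, dy)

print(f"true beta = {beta}, pooled OLS = {pooled_slope:.3f}, "
      f"first difference = {fd_slope:.3f}")
```

In this design the pooled estimate is pulled away from the true coefficient because the composite error contains the fixed effect, while the first-difference estimate stays close to it.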

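Example: First Difference Estimates in crime rate regression

$$\widehat{\Delta crmrte} = \underset{(4.70)}{15.40} + \underset{(0.88)}{2.22}\; \Delta unemp$$

$n = 46$, $R^2 = 0.127$

$$t_{\beta = 0} = \frac{2.22}{0.88} = 2.52$$

The coefficient on $\Delta unemp$ is of the expected sign and is significant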

Potential Problems with First Difference Regression

First differencing removes variables that don't vary with time (e.g., gender, race, etc.)

Effective sample size is reduced

Policy Analysis with Two-Period Panel Data

Two-period panel data is often used for program evaluation studies in which there is likely to be endogeneity

Example: Evaluation of Michigan Job Training Program

Data for two years (1987 and 1988) on the same manufacturing firms in Michigan

Some firms received job training grants in 1988 and some did not (training was available on a first-come, first-served basis)

Panel data regression

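$$scrap_{it} = \beta_0 + \delta_0\, y88_t + \beta_1\, grant_{it} + a_i + u_{it}$$

$scrap_{it}$ = scrap rate (% of items scrapped due to defects)

$grant_{it}$ = 1 if firm received a training grant in 1988
$a_i$ = unobserved firm fixed effects (e.g., worker productivity)
$\mathrm{cov}(grant_{it}, a_i) \neq 0$ (why?)

First Difference transformation

$$\begin{aligned}
\Delta scrap_i &= \delta_0 + \beta_1\, \Delta grant_i + \Delta u_i \\
&= \delta_0 + \beta_1\, grant_{i,88} + \Delta u_i
\end{aligned}$$

(the second line holds because grants were awarded only in 1988, so $\Delta grant_i = grant_{i,88}$)

Here, $\beta_1$ = average treatment effect

$$\Delta_{treat} = E[scrap_{88} \mid grant_{88} = 1] - E[scrap_{87} \mid grant_{88} = 1] = \delta_0 + \beta_1$$

$$\Delta_{control} = E[scrap_{88} \mid grant_{88} = 0] - E[scrap_{87} \mid grant_{88} = 0] = \delta_0$$

$$\Delta_{treat} - \Delta_{control} = \beta_1$$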

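Example: First Differences Regression

$$\widehat{\Delta scrap} = \underset{(0.405)}{-0.564} - \underset{(0.683)}{0.739}\; grant$$

$n = 54$, $R^2 = 0.022$

$$t_{\beta_1 = 0} = \frac{-0.739}{0.683} = -1.08$$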
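The t statistics quoted in these examples are simply the coefficient divided by its standard error. The short sketch below recomputes them from the reported figures; the `t_stat` helper and the use of 1.96 as a cutoff are illustrative assumptions (1.96 is the large-sample two-sided 5% critical value, and exact t critical values depend on the degrees of freedom).

```python
def t_stat(coef, se):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / se

# (coefficient, standard error) pairs as reported in the notes.
reported = {
    "incinerator diff-in-diff": (-11863.0, 7456.0),
    "crime rate, first difference": (2.22, 0.88),
    "scrap rate, first difference": (-0.739, 0.683),
}

results = {}
for name, (coef, se) in reported.items():
    t = t_stat(coef, se)
    results[name] = round(t, 2)
    # Normal approximation only; small samples call for t critical values.
    verdict = "significant" if abs(t) > 1.96 else "not significant"
    print(f"{name}: t = {t:.2f} ({verdict} at 5%, normal approximation)")
```

Under this rough cutoff only the unemployment coefficient in the first-differenced crime regression is significant, which agrees with the conclusions drawn in the notes.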

Panel Data with More than 2 Time Periods

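Suppose $t = 1, 2$ and $3$

$$y_{it} = \delta_1 + \delta_2\, d2_t + \delta_3\, d3_t + \boldsymbol{\beta}'\mathbf{x}_{it} + a_i + u_{it}$$

$d2_t = 1$ if $t = 2$; 0 otherwise
$d3_t = 1$ if $t = 3$; 0 otherwise

Then

$$\begin{aligned}
t = 1: \quad y_{i1} &= \delta_1 + \boldsymbol{\beta}'\mathbf{x}_{i1} + a_i + u_{i1} \\
t = 2: \quad y_{i2} &= \delta_1 + \delta_2 + \boldsymbol{\beta}'\mathbf{x}_{i2} + a_i + u_{i2} \\
t = 3: \quad y_{i3} &= \delta_1 + \delta_3 + \boldsymbol{\beta}'\mathbf{x}_{i3} + a_i + u_{i3}
\end{aligned}$$

First differencing gives

$$\Delta y_{it} = \delta_2\, \Delta d2_t + \delta_3\, \Delta d3_t + \boldsymbol{\beta}'\Delta\mathbf{x}_{it} + \Delta u_{it}, \quad t = 2, 3$$

That is,

$$\begin{aligned}
t = 2: \quad \Delta y_{i2} &= \delta_2 + \boldsymbol{\beta}'\Delta\mathbf{x}_{i2} + \Delta u_{i2} \\
t = 3: \quad \Delta y_{i3} &= -\delta_2 + \delta_3 + \boldsymbol{\beta}'\Delta\mathbf{x}_{i3} + \Delta u_{i3}
\end{aligned}$$

because

$$\Delta d2_3 = d2_3 - d2_2 = -1$$

Estimation is by pooled OLS on first differenced data

Error terms for a given $i$ are correlated across time (shown here for serially uncorrelated $u_{it}$)

$$\mathrm{cov}(\Delta u_{i3}, \Delta u_{i2}) = \mathrm{cov}(u_{i3} - u_{i2},\, u_{i2} - u_{i1}) = -\mathrm{var}(u_{i2})$$

Hence, Gauss-Markov assumptions are violated and OLS is not efficient.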
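The negative correlation between adjacent differenced errors can be verified numerically. The sketch below draws idiosyncratic errors that are independent across periods (an assumption of this illustration), differences them, and compares the sample covariance of the two differences with minus the sample variance of the middle-period error. The sample size, error standard deviation, seed, and helper name are hypothetical.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (n - 1)

random.seed(2012)
N, sigma = 20000, 2.0

# Idiosyncratic errors, independent across individuals and periods.
u1 = [random.gauss(0, sigma) for _ in range(N)]
u2 = [random.gauss(0, sigma) for _ in range(N)]
u3 = [random.gauss(0, sigma) for _ in range(N)]

du2 = [later - earlier for earlier, later in zip(u1, u2)]  # u_i2 - u_i1
du3 = [later - earlier for earlier, later in zip(u2, u3)]  # u_i3 - u_i2

cov_du = covariance(du3, du2)
var_u2 = covariance(u2, u2)
print(f"cov(du3, du2) = {cov_du:.3f}, -var(u2) = {-var_u2:.3f}")
```

Both printed numbers should be close to -4, the negative of the error variance used in the simulation, because the middle-period error enters the two differences with opposite signs.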
