Lecture Note 13

The document discusses the Double-Difference (DD) or Difference-in-Difference method for evaluating program impacts using panel data collected before and after program implementation. It outlines the methodology for calculating DD estimates through both simple comparisons and regression analysis, including fixed-effects models and the incorporation of covariates. Additionally, it addresses the application of DD in cross-sectional data and the refinement of the method using propensity score matching (PSM) to ensure comparability between treatment and control groups.

Double-Difference or Difference-in-Difference Method

Dr. Muhammad Shahadat Hossain Siddiquee

Professor, Department of Economics
University of Dhaka
Email: shahadat_eco@yahoo.com
Cell: +8801719397749
DD Method
• The matching methods discussed in previous lectures are meant to reduce bias
by choosing the treatment and comparison groups on the basis of observable
characteristics.
• They are usually implemented after the program has been operating for some
time and survey data have been collected.
• Another powerful way to measure the impact of a program is to use panel
data, collected from a baseline survey before the program was implemented
and from a follow-up survey after the program has been operating for some time.
• These two surveys should be comparable in the questions and survey
methods used and must be administered to both participants and
nonparticipants.
• Using panel data allows elimination of unobserved variable bias, provided
that the unobserved heterogeneity does not change over time.
DD Method (Contd..)
• This approach, the double-difference (DD, also commonly known as
difference-in-difference) method, has been popular in nonexperimental evaluations.
• The DD method estimates the difference in the outcome during the post-
intervention period between a treatment group and comparison group relative to
the outcomes observed during a pre-intervention baseline survey.
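In symbols, one way to write this comparison is sketched below. The notation is introduced here for illustration and is not taken from the slides: $Y_t^T$ and $Y_t^C$ denote the outcomes of the treatment and comparison groups at time $t$, with $t = 0$ for the baseline survey and $t = 1$ for the follow-up survey.

$$
DD = E\left(Y_1^T - Y_0^T\right) - E\left(Y_1^C - Y_0^C\right)
$$

The first term is the change over time for participants; the second removes the change that nonparticipants experienced over the same period.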
Simplest Implementation: Simple Comparison Using “ttest”
• The simplest way of calculating the DD estimator is to manually take the
difference in outcomes between treatment and control groups across the two
surveys.
• The panel data hh_9198.dta are used for this purpose.
• The following commands open the data file and create a new 1991-level outcome
variable (per capita expenditure) to make it available in observations of both
years.
• Then, only the 1998 observations are kept, and the log of per capita expenditure
is created for each year; finally, the difference between the 1998 and 1991 log per capita
expenditures is computed.
Simplest Implementation: Simple Comparison Using “ttest” (Contd..)
• use ..\data\hh_9198
• gen exptot0=exptot if year==0 // note: check using tab year
• egen exptot91=max(exptot0), by(nh) // note: check using command br
• keep if year==1 // note: only 1998 observations are kept
• gen lexptot91=ln(1+exptot91)
• gen lexptot98=ln(1+exptot)
• gen lexptot9891=lexptot98-lexptot91

The following command (“ttest”) takes the difference variable of outcomes created
earlier (“lexptot9891”) and compares it for microcredit participants and nonparticipants.
In essence, it creates a second difference of “lexptot9891” between those with dfmfd==1 and
those with dfmfd==0.
This second difference gives the estimate of the impact of females’ microcredit
program participation on per capita expenditure.
Simplest Implementation: Simple Comparison Using “ttest” (Contd..)
• ttest lexptot9891, by(dfmfd)

The result shows that microcredit program participation by females increases per
capita consumption by 11.1 percent and that this impact is significant at less than the 1
percent level.
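The two differencing steps can be tried without the survey file. The following is a minimal sketch on simulated data, not the lecture's data set: the variable names (id, treat, y0, y1, dy) and all parameter values are made up for illustration, with a true effect of 0.15 built in.

```stata
* Sketch on simulated data; every name and number here is hypothetical
clear
set seed 12345
set obs 400
gen id    = _n
gen treat = mod(_n, 2)                               // half treated, half comparison
gen y0    = 5 + 0.3*treat + rnormal(0, 0.2)          // baseline log outcome
gen y1    = y0 + 0.10 + 0.15*treat + rnormal(0, 0.2) // follow-up; built-in DD = 0.15
gen dy    = y1 - y0                                  // first difference (over time)
ttest dy, by(treat)                                  // second difference (across groups)
scalar dd_hat = r(mu_2) - r(mu_1)                    // treated mean minus comparison mean
display "DD estimate = " dd_hat
```

Note that the ttest output prints the difference as the mean of group 0 minus the mean of group 1, so the DD estimate is the negative of the printed diff; the scalar above takes the difference in the treated-minus-comparison direction.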
Results Obtained Using ‘ttest’ Command
Regression Implementation
• Instead of manually taking the difference of the outcomes, DD can be
implemented using a regression.
• On the basis of the discussion in Ravallion (2008), the DD estimate can be
calculated from the regression

$$Y_{it} = \alpha + \beta\,T_i + \delta\,t + DD\,(T_i \times t) + \varepsilon_{it}$$

where T is the treatment variable, t is the time dummy, and the coefficient of the
interaction of T and t (DD) gives the estimate of the impact of treatment on
outcome Y.
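To see why the interaction coefficient is the DD estimate, assume the regression has the standard two-group, two-period form $Y = \alpha + \beta T + \delta t + DD\,(T \times t) + \varepsilon$ with $E(\varepsilon \mid T, t) = 0$. The symbols $\alpha$, $\beta$, and $\delta$ are labels chosen here for the intercept and the two main effects. The four cell means are then:

$$
\begin{aligned}
E(Y \mid T=0,\ t=0) &= \alpha \\
E(Y \mid T=0,\ t=1) &= \alpha + \delta \\
E(Y \mid T=1,\ t=0) &= \alpha + \beta \\
E(Y \mid T=1,\ t=1) &= \alpha + \beta + \delta + DD
\end{aligned}
$$

The change over time is $\delta + DD$ for the treatment group and $\delta$ for the comparison group, so the difference between the two changes is exactly $DD$.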
Regression Implementation (Contd..)
• The following commands open the panel data file, create the log of the outcome variable,
and create a 1998-level participation variable available to both years; that is, those
who participate in microcredit programs in 1998 are the assumed treatment group.
• cd "C:\Users\Dept. of Economics\Desktop\panel_econometrics_2019\mss_lectures"
• use hh_9198, clear
• gen lexptot=ln(1+exptot)
• gen dfmfd1=dfmfd==1 & year==1
• egen dfmfd98=max(dfmfd1), by(nh)
The next command creates the interaction variable of treatment and time dummy
(year in this case, which is 0 for 1991 and 1 for 1998).
• gen dfmfdyr=dfmfd98*year
Regression Implementation (Contd..)

The next command runs the actual regression that implements the DD
method:
• reg lexptot year dfmfd98 dfmfdyr
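With no covariates, the interaction coefficient from this kind of regression is numerically the same as the double difference of the four group-by-period means. The sketch below checks that on simulated data; the names (nh, T, t, Tt, y) and parameter values are hypothetical stand-ins, not the survey variables.

```stata
* Sketch on simulated data; every name and number here is hypothetical
clear
set seed 2024
set obs 300
gen nh = _n                          // household id
gen T  = (nh <= 150)                 // treatment-group dummy
expand 2                             // two rows per household
bysort nh: gen t = _n - 1            // 0 = baseline, 1 = follow-up
gen Tt = T*t                         // interaction term
gen y  = 1 + 0.2*T + 0.1*t + 0.15*Tt + rnormal(0, 0.3)
reg y t T Tt
scalar dd_reg = _b[Tt]
* Same quantity computed from the four cell means
foreach g in 0 1 {
    foreach p in 0 1 {
        quietly summarize y if T==`g' & t==`p'
        scalar m`g'`p' = r(mean)
    }
}
scalar dd_means = (m11 - m10) - (m01 - m00)
display "regression DD = " dd_reg "   cell-means DD = " dd_means
```

The two displayed numbers agree up to rounding because the regression is saturated in T and t; once covariates or weights are added, the regression estimate no longer reduces to a simple difference of means.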
Regression Implementation (OUTPUT)
Regression Implementation (Contd..)
• The results show the same impact of female participation in microfinance
programs on households’ annual total per capita expenditures as obtained in the
earlier exercise.
• A basic assumption behind the simple implementation of DD is that other
covariates do not change across the years.
• But if those variables do vary, they should be controlled for in the regression
to get the net effect of program participation on the outcome.
• So the regression model needs to be extended by including other covariates that
may affect the outcomes of interest.
• Create a variable lnland using the following command
• gen lnland=ln(1+hhland/100) // note: converts decimals to acres
Regression Implementation (Contd..)
• reg lexptot year dfmfd98 dfmfdyr sexhead agehead educhead lnland vaccess ///
  pcirr rice wheat milk oil egg [pw=weight]

• Note that Stata offers four weighting options: frequency weights (fweight), analytic
weights (aweight), probability weights (pweight), and importance weights (iweight).

• Holding other factors constant, one sees that the impact of the microfinance
programs changes from significant to insignificant (t = 0.97). See the output
table reported below.
Regression Output
Checking Robustness of DD with Fixed-Effects Regression
• Another way to measure the DD estimate is to use a fixed-effects regression
instead of ordinary least squares (OLS).
• Fixed-effects regression controls for households’ unobserved and time-invariant
characteristics that may influence the outcome variable. The Stata “xtreg”
command is used to run fixed-effects regression; with the “fe” option, it fits
fixed-effects models.
• Following is the demonstration of fixed-effects regression using the simple
model:
• xtreg lexptot year dfmfd98 dfmfdyr, fe i(nh)
• The results again show a significant positive impact of female participation.
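A small simulation can show what the fixed-effects version does with a time-invariant treatment dummy. This is a sketch under stated assumptions: the data are simulated, the names (nh, T, t, Tt, a, y) are hypothetical, and the panel is declared with xtset rather than the i(nh) option used above.

```stata
* Sketch on simulated data; every name and number here is hypothetical
clear
set seed 99
set obs 300
gen nh = _n
gen T  = (nh <= 150)
gen a  = rnormal(0, 0.5) + 0.4*T     // household effect, correlated with treatment
expand 2                             // two rows per household
bysort nh: gen t = _n - 1            // 0 = baseline, 1 = follow-up
gen Tt = T*t
gen y  = a + 0.1*t + 0.15*Tt + rnormal(0, 0.3)
xtset nh t
xtreg y t T Tt, fe                   // T does not vary within nh, so it is omitted
scalar dd_fe = _b[Tt]
reg y t T Tt
scalar dd_ols = _b[Tt]
display "FE DD = " dd_fe "   OLS DD = " dd_ols
```

In a balanced two-period panel with no covariates, the two interaction coefficients coincide, because the household effect is differenced out either way. The estimates can diverge once time-varying covariates are added or the panel is unbalanced.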
Fixed-Effects Regression OUTPUT
Fixed-Effects Regression after Including Covariates
• By including other covariates in the regression, the fixed-effects model can be
extended in the following way:

xtreg lexptot year dfmfd98 dfmfdyr sexhead agehead educhead lnland ///
    vaccess pcirr rice wheat milk oil egg, fe i(nh)

• Results below show that, after controlling for the effects of time-invariant
unobserved factors, female participation in microcredit has a 9.1 percent positive
impact on households’ per capita consumption, and the impact is highly significant.
Fixed-Effects Regression OUTPUT after Considering Covariates
Applying the DD Method in Cross-Sectional Data
• DD can be applied to cross-sectional data, too, not just panel data.
• The idea is very similar to the one used in panel data.
• Instead of a comparison between years, program and non-program villages are
compared, and instead of a comparison between participants and nonparticipants,
target and non-target groups are compared.
• Accordingly, the 1991 data hh_91.dta are used.
• Create a dummy variable called “target” for those who are eligible to participate in
microcredit programs (that is, those who have less than 50 decimals of land).
Then, create a village program dummy (“progvill”) for villages that
belong to the program.
Applying the DD Method in Cross-Sectional Data (Contd..)
• use ..\data\hh_91, clear
• gen lexptot=ln(1+exptot)
• gen lnland=ln(1+hhland/100)
• gen target=hhland<50
• gen progvill=thanaid<25
Then, generate a variable interacting the program village and target:
• gen progtarget=progvill*target
Then, calculate the DD estimate by regressing log of total per capita expenditure on
program village, target, and their interaction:
• reg lexptot progvill target progtarget
The results show that the impact of microcredit program placement on the target group is
not significant (t = −0.61).
DD OUTPUT Using Cross-Sectional Data
DD OUTPUT Using Cross-Sectional Data (Contd..)

• The coefficient of the impact variable (“progtarget”), which is 0.053, does not
give the actual impact of microcredit programs; it has to be adjusted by
dividing by the proportion of target households in program villages. The
following command can be used to find the proportion:
• sum target if progvill==1
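The adjustment described above can be scripted so that the coefficient and the target share are combined in one step. The sketch below uses simulated data; only the variable names progvill, target, and progtarget follow the slides, while y and all parameter values are hypothetical.

```stata
* Sketch on simulated data; y and all numbers here are hypothetical
clear
set seed 31
set obs 1000
gen progvill   = (_n <= 600)                 // program-village dummy
gen target     = (runiform() < 0.7)          // eligibility dummy
gen progtarget = progvill*target
gen y = 4 + 0.05*progvill - 0.3*target + 0.08*progtarget + rnormal(0, 0.4)
quietly reg y progvill target progtarget
scalar b_pt = _b[progtarget]                 // raw interaction coefficient
quietly summarize target if progvill==1
scalar share_t = r(mean)                     // proportion of target households
scalar b_adj = b_pt/share_t                  // coefficient divided by that proportion
display "coefficient = " b_pt "   share = " share_t "   adjusted = " b_adj
```

Because the share lies between 0 and 1, the adjusted figure is always larger in absolute value than the raw coefficient and keeps its sign.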
DD OUTPUT Using Cross-Sectional Data (Contd..)
• Of the households in program villages, 68.9 percent belong to the target group.
Therefore, the regression coefficient of “progtarget” is divided by this value,
giving 0.077, which is the true impact of microcredit programs on the target
population, even though it is not significant.
• As before, the regression model can be specified adjusting for covariates that
affect the outcomes of interest:
• reg lexptot progvill target progtarget sexhead agehead educhead lnland ///
  vaccess pcirr rice wheat milk oil egg [pw=weight]

Holding other factors constant, one finds no change in the significance level of
microcredit impacts on households’ annual total per capita expenditures.
DD OUTPUT Using Cross-Sectional Data (Contd..)
Fixed-effects Regression for Cross-section
• Again, fixed-effects regression can be used instead of OLS to check the
robustness of the results.
• However, with cross-sectional data, household-level fixed effects cannot be run,
because each household appears only once in the data. Therefore, a village-level
fixed-effects regression is run using the following command.

xtreg lexptot progvill target progtarget, fe i(vill)

• This time there is a negative (insignificant) impact of microcredit programs on
household per capita expenditure.
Fixed-effects Regression Output for Cross-section
FER Output after Considering Covariates
xtreg lexptot progvill target progtarget sexhead agehead educhead lnland, fe i(vill)
Taking into Account Initial Conditions
• Even though DD implementation through regression (OLS or fixed effects) controls for
household- and community-level covariates, the initial conditions during the baseline survey
may have a separate influence on the subsequent changes in outcome or assignment to the
treatment.
• Ignoring the separate effect of initial conditions therefore may bias the DD estimates.
• Including the initial conditions in the regression is tricky.
• As the baseline observations in the panel sample already contain initial characteristics, extra
variables for initial conditions cannot be added directly.
• One way to add initial conditions is to use an alternative implementation of the
fixed-effects regression.
• In this implementation, difference variables are created for all variables (outcome and
covariates) between the years, and then these difference variables are used in the regression instead
of the original variables.
• In this modified data set, initial condition variables can be added as extra regressors
without a collinearity problem.
• The following commands create the difference variables from the panel data hh_9198.
Commands for Taking Into Account Initial Conditions
• sort nh year
• by nh: gen dlexptot=lexptot[2]-lexptot[1]
• by nh: gen ddfmfd98= dfmfd98[2]- dfmfd98[1]
• by nh: gen ddfmfdyr= dfmfdyr[2]- dfmfdyr[1]
• by nh: gen dsexhead= sexhead[2]- sexhead[1]
• by nh: gen dagehead= agehead[2]- agehead[1]
• by nh: gen deduchead= educhead[2]- educhead[1]
• by nh: gen dlnland= lnland[2]- lnland[1]
• by nh: gen dvaccess= vaccess[2]- vaccess[1]
• by nh: gen dpcirr= pcirr[2]- pcirr[1]
• by nh: gen drice= rice[2]- rice[1]
• by nh: gen dwheat= wheat[2]- wheat[1]
• by nh: gen dmilk= milk[2]- milk[1]
• by nh: gen dmustoil= oil[2]- oil[1]
• by nh: gen dhenegg= egg[2]- egg[1]
Stata creates these difference variables for both years. Then an OLS regression is
run with the difference variables plus the original covariates as additional
regressors, restricting the sample to the baseline year (year = 0). This is done
because the baseline year contains both the difference variables and the initial
condition variables.
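As a more compact alternative to one gen line per variable, the same within-household differences can be produced in a loop. The sketch below runs on a tiny made-up data set with two households; the values are hypothetical, and the loop names each new variable by prefixing d to the original name, which differs from the dmustoil and dhenegg names used above.

```stata
* Sketch on a tiny made-up data set; values are hypothetical
clear
set obs 4
gen nh   = ceil(_n/2)                // households 1, 1, 2, 2
gen year = mod(_n - 1, 2)            // 0, 1, 0, 1
gen rice = 10 + _n                   // 11, 12, 13, 14
gen oil  = 20 + 2*_n                 // 22, 24, 26, 28
sort nh year
foreach v of varlist rice oil {
    by nh: gen d`v' = `v'[2] - `v'[1]   // follow-up value minus baseline value
}
list nh year rice drice oil doil
```

Within each by-group, the subscripts [1] and [2] refer to the first and second rows of that household, so the sort on nh and year is what makes [1] the baseline and [2] the follow-up observation.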
Commands for Taking Into Account Initial Conditions (Contd..)

• reg dlexptot ddfmfd98 ddfmfdyr dsexhead dagehead deduchead dlnland dvaccess ///
  dpcirr drice dwheat dmilk dmustoil dhenegg sexhead agehead educhead lnland ///
  vaccess pcirr rice wheat milk oil egg if year==0 [pw=weight]

The results show that, after controlling for the initial conditions, the impact of
microcredit participation is no longer statistically significant (t = 1.42):
Output Taking Into Account Initial Conditions
The DD Method Combined with PSM
• The DD method can be refined in a number of ways.
• One is by using propensity score matching (PSM) with the baseline data to make
certain the comparison group is similar to the treatment group.
• Then, the double difference is applied to the matched sample.
• This way, the observable heterogeneity in the initial conditions can be dealt
with.
• Using the user-written “pscore” command, the participation variable in 1998/99 (which is
created here as “dfmfd98” for both years) is regressed on 1991/92 exogenous
variables to obtain propensity scores from the baseline data.
The DD Method Combined with PSM: Commands
• use ..\data\hh_9198,clear
• gen lnland=ln(1+hhland/100)
• gen dfmfd1=dfmfd==1 & year==1
• egen dfmfd98=max(dfmfd1), by(nh)
• gen dfmfdyr=dfmfd98*year
• keep if year==0
• pscore dfmfd98 sexhead agehead educhead lnland vaccess pcirr rice wheat milk ///
  oil egg [pw=weight], pscore(ps98) blockid(blockf1) comsup level(0.001)
The balancing property of the PSM has been satisfied, which means that
households with the same propensity scores have the same distributions of all
covariates for all five blocks. The region of common support is [.06030439,
.78893426], and 26 observations have been dropped.
The DD Method Combined with PSM: OUTPUT
PSM Results
PSM Results (Contd..)
PSM Results (Contd..)
Commands for DD with PSM (Contd..)
• The following commands keep the matched households in the baseline year
and merge them with the panel data to keep only the matched households in the
panel sample:
• keep if blockf1!=.
• keep nh
• sort nh
• merge 1:m nh using ..\data\hh_9198
• keep if _merge==3
The next step is to implement the DD method as before, after re-creating lexptot,
lnland, dfmfd98, and dfmfdyr in the merged data. For this exercise, only
the fixed-effects implementation is shown:
• xtreg lexptot year dfmfd98 dfmfdyr sexhead agehead educhead lnland ///
  vaccess pcirr rice wheat milk oil egg, fe i(nh)
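The overall sequence (estimate scores on baseline data, keep households on common support, then run fixed-effects DD on the retained panel) can be sketched with built-in commands only. This is not the pscore routine: it swaps in a plain logit with predict, skips the blocking and balancing tests, and takes common support to be the overlap of the two groups' score ranges, which is an assumption made here. The data are simulated and every name (nh, x, T, ps, t, Tt, y) is hypothetical.

```stata
* Sketch on simulated data; every name and number here is hypothetical
clear
set seed 7
set obs 500
gen nh = _n
gen x  = rnormal()                           // baseline covariate
gen T  = (0.8*x + rnormal() > 0)             // participation depends on x
logit T x                                    // propensity model on baseline data
predict ps, pr                               // estimated propensity score
* Common support taken here as the overlap of the two groups' score ranges
quietly summarize ps if T==1
scalar lo_T = r(min)
scalar hi_T = r(max)
quietly summarize ps if T==0
scalar cs_lo = max(lo_T, r(min))
scalar cs_hi = min(hi_T, r(max))
keep if ps >= cs_lo & ps <= cs_hi            // drop households off support
* Build the two-period panel for the retained households and run FE DD
expand 2
bysort nh: gen t = _n - 1                    // 0 = baseline, 1 = follow-up
gen Tt = T*t
gen y  = 0.5*x + 0.1*t + 0.15*Tt + rnormal(0, 0.3)
xtset nh t
xtreg y t Tt, fe
```

Trimming on the score only guarantees overlap; it does not check covariate balance within score blocks, which is the part the pscore output reports.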
PSM-DD with FE
