This repository contains the Global Learning Assessment Database (GLAD), a collection of harmonized learning assessment datasets at the student and country level.
All the code required to create this collection, starting from the raw microdata of each assessment, is available in this repository. We share it to encourage others to contribute to growing this collection.
For an example of analysis enabled by this collection, please check the Learning Poverty repo and its corresponding technical paper [1].
[1] Azevedo, J.P., and others. 2019. “Will Every Child Be Able to Read by 2030? Why Eliminating Learning Poverty Will Be Harder Than You Think, and What to Do About It.” World Bank Policy Research Working Paper series. Washington, DC: World Bank.
- Harmonization: harmonizes raw microdata of learning assessments into student-level datasets
- Indicators: consolidates harmonized data by subgroups into country-level outcomes
The harmonization module starts from the original datasets of each assessment (pulled from the eduraw collection in datalibweb, or from a local copy downloaded directly from the data publishers) and ends with the creation of the datasets GLAD_ALL and GLAD_ALL-BASE. Files receive a master vintage that reflects any updates to a surveyid (region_year_assessment).
These two modules of GLAD (ALL and ALL-BASE) are at the learner level, that is, one observation corresponds to one learner (student or pupil). Both modules contain the harmonized variables, but the module ALL-BASE additionally includes all the original variables from the raw data. Since the ALL-BASE file may be very large, we recommend using the module ALL whenever possible.
The output files are saved in the clone with adaptation vintage wrk_A, and corresponding markdown documents are generated with the same name. The assessments currently in the loop are (click on the links for each file's markdown documentation):
- GLAD_ALL for PIRLS: 2001 2006 2011 2016
- GLAD_ALL for TIMSS: 2003 2007 2011 2015
- GLAD_ALL for LLECE: 2006 2013
- GLAD_ALL for PASEC: 2014
- GLAD_ALL for SACMEQ: 2000 2007
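The surveyid pattern mentioned above (region_year_assessment) can be split into its three parts when organizing files. The following is only a minimal sketch of that idea, not code from this repository: the function name `parse_surveyid`, the `SurveyId` type, and the placeholder region code `XXX` are all hypothetical, and the sketch assumes the three parts are separated by underscores and that the year has four digits.

```python
from typing import NamedTuple


class SurveyId(NamedTuple):
    """Hypothetical container for the parts of a surveyid."""
    region: str
    year: int
    assessment: str


def parse_surveyid(surveyid: str) -> SurveyId:
    """Split a 'region_year_assessment' string into its three parts.

    Assumption: parts are underscore-separated and the year is four digits.
    """
    parts = surveyid.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected region_year_assessment, got {surveyid!r}")
    region, year, assessment = parts
    if not (year.isdigit() and len(year) == 4):
        raise ValueError(f"year must be four digits, got {year!r}")
    return SurveyId(region, int(year), assessment)


# 'XXX' is a placeholder region code, not a real one.
sid = parse_surveyid("XXX_2016_PIRLS")
```

Keeping the parts in a named tuple makes it easy to filter a list of surveys by assessment or year without repeated string handling.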
The indicators module starts from the GLAD_ALL datasets (pulled either from the GLAD collection in datalibweb or from the clone) and ends with the creation of the dataset GLAD_CLO for every survey specified in the loop. In the resulting CLO file, each country-grade has several observations, one for each of the subgroups all / male / female / urban / rural. The code can be extended to add other subgroups as needed.
The output files are saved in the clone with adaptation vintage wrk_A, and corresponding markdown documents are generated with the same name. The assessments currently in the loop are (click on the links for each file's markdown documentation):
- GLAD_CLO for PIRLS: 2001 2006 2011 2016
- GLAD_CLO for TIMSS: 2003 2007 2011 2015
- GLAD_CLO for LLECE: 2006 2013
- GLAD_CLO for PASEC: 2014
- GLAD_CLO for SACMEQ: 2000 2007
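To illustrate the shape of the CLO output described above (several rows per country-grade, one per subgroup), here is a minimal sketch of collapsing learner-level records into subgroup means. It is not the repository's implementation, which is written in Stata: the field names (`country`, `grade`, `male`, `urban`, `score`, `weight`), the function `collapse`, and the use of a simple weighted mean are assumptions made for the example, and the country code `AAA` is a placeholder.

```python
from collections import defaultdict

# Hypothetical subgroup definitions: name -> rule applied to a learner record.
# Adding another subgroup only requires adding another entry here.
SUBGROUPS = {
    "all": lambda r: True,
    "male": lambda r: r["male"] == 1,
    "female": lambda r: r["male"] == 0,
    "urban": lambda r: r["urban"] == 1,
    "rural": lambda r: r["urban"] == 0,
}


def collapse(learners, subgroups=SUBGROUPS):
    """Collapse learner-level rows into one row per country-grade-subgroup."""
    # key -> [weighted sum of scores, sum of weights]
    sums = defaultdict(lambda: [0.0, 0.0])
    for r in learners:
        for name, keep in subgroups.items():
            if keep(r):
                acc = sums[(r["country"], r["grade"], name)]
                acc[0] += r["score"] * r["weight"]
                acc[1] += r["weight"]
    return [
        {"country": c, "grade": g, "subgroup": s, "mean_score": ws / w}
        for (c, g, s), (ws, w) in sorted(sums.items())
    ]


# Made-up learner records for a placeholder country 'AAA'.
learners = [
    {"country": "AAA", "grade": 4, "male": 1, "urban": 1, "score": 500.0, "weight": 1.0},
    {"country": "AAA", "grade": 4, "male": 0, "urban": 1, "score": 520.0, "weight": 1.0},
    {"country": "AAA", "grade": 4, "male": 0, "urban": 0, "score": 480.0, "weight": 2.0},
]
rows = collapse(learners)
```

In this toy input, the single country-grade yields five rows, one per subgroup. Defining subgroups as a dictionary of rules mirrors the flexibility noted above: a new subgroup is a new entry, not a change to the aggregation logic.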
By default, the GLAD programs use data from datalibweb. Please see the guidelines for retrieving data from datalibweb. Note that datalibweb requires access to, and authentication on, the World Bank network.
The GLAD programs also make use of the edukit package. The latest version of edukit and installation instructions can be found in the EduAnalyticsToolkit repo.
The programs are set up to automatically generate a documentation .md file for each GLAD.dta created. This step only runs in Stata 15 or above, because it uses the dyntext command.
See the Contribution and Replication note for information on how to navigate this repository, how to contribute to the code, and how to replicate the numbers.
In short, external researchers who want to reproduce the collection can do so, provided they download the raw microdata from the assessments and change the default source from datalibweb to their locally downloaded copy of the raw microdata.
For the internal World Bank audience, all the datasets in this collection are readily accessible in datalibweb and Statistics On-Line (SOL).
This repository is maintained by the EduAnalytics team at the World Bank Education Global Practice.
The EduAnalytics team aims to provide internal and external clients timely access to high quality data, tools, and analytics that can be used to measure, monitor, and understand the education sector across regions.
The team can be reached at eduanalytics@worldbank.org.