Regional Educational Laboratory Central at Marzano Research
Program Evaluation Toolkit: Quick Start Guide
A Publication of the National Center for Education Evaluation and Regional Assistance at IES
Program Evaluation Toolkit: Quick Start Guide
Joshua Stewart, Jeanette Joyce, Mckenzie Haines, David Yanoski, Douglas Gagnon, Kyle Luke, Christopher Rhoads, and Carrie Germeroth
October 2021
Contents
Unpacking the Program Evaluation Toolkit
  What is the toolkit?
  What is program evaluation?
  Who should use the toolkit?
  Am I ready to use this toolkit?
  Where do I start?
  What is not included in the toolkit?
  How do I navigate the toolkit website?
  What is included in the toolkit?
References

Figures
  1 Guiding questions for the Program Evaluation Toolkit
  2 Tracker for the Program Evaluation Toolkit
  3 Opening page of the Program Evaluation Toolkit website
  4 Opening page of Module 1 on the Program Evaluation Toolkit website

Table
  1 Module selection checklist
Unpacking the Program Evaluation Toolkit
What is the toolkit?
The Program Evaluation Toolkit presents a step-by-step process for conducting a program evaluation. Program evaluation is important for assessing the implementation and outcomes of local, state, and federal programs. Designed to be used in a variety of education settings, the toolkit focuses on the practical application of program evaluation. The toolkit can also deepen your understanding of program evaluation so that you are better equipped to navigate the evaluation process and use evaluation practices.

The toolkit consists of this Quick Start Guide and a website with eight modules that begin at the planning stages of an evaluation and progress to the presentation of findings to stakeholders. Each module covers a critical step in the evaluation process.

The toolkit includes a screencast that provides an overview of each stage of the evaluation process. It also includes tools, handouts, worksheets, and a glossary of terms (see the appendix of this guide) to help you conduct your own evaluation. The toolkit resources will help you create a logic model, develop evaluation questions, identify data sources, develop data collection instruments, conduct basic analyses, and disseminate findings.
Where do I start?

You can progress through the toolkit modules either sequentially or selectively, reviewing only the modules that pertain directly to your current evaluation needs (figure 1). In each module the first chapter provides a basic introduction to the module topic, and the subsequent chapters increase in complexity and build on that introduction. For each module you can decide what level of complexity best meets your program evaluation needs. Modules 3, 4, and 7 require statistical knowledge. If you lack statistical expertise, consider working through them with a colleague who has it. You can use the toolkit tracker to document your progress (figure 2). In the tracker you can record when you start a module and which modules you have completed.

It is best to start with Module 1: Logic models, which focuses on developing a logic model for your program. A logic model is a graphical representation of the relationship between program components and desired outcomes. A well-crafted logic model will serve as the foundation for the other modules in the toolkit. You will draw on your logic model when developing measurable evaluation questions, identifying quality data sources, selecting appropriate analyses, and completing other key components of your evaluation. If you choose to progress through the toolkit selectively, the module selection checklist can help you identify which modules to prioritize (table 1).
Figure 1. Guiding questions for the Program Evaluation Toolkit

Module 1 — Logic models
• What is the purpose of a logic model?
• How do I describe my program using a logic model?

Module 2 — Evaluation questions
• How do evaluation questions relate to the logic model?
• How do I write high-quality evaluation questions for my program?

Module 3 — Evaluation design
• Which design will best meet my evaluation needs?
• What is the relationship between my evaluation design and Every Student Succeeds Act levels of evidence and What Works Clearinghouse design standards?

Module 5 — Data quality
• What available data can I identify that can be used to answer my evaluation questions?
• How do I assess the quality of my data?

Module 6 — Data collection
• What data collection instruments will best help me answer my evaluation questions?
• How do I develop a simple but effective data collection instrument?

Module 7 — Data analysis
• How do I move from analysis to recommendations?
• Which analysis method best meets my evaluation needs?
Figure 2. Tracker for the Program Evaluation Toolkit. For each module, the tracker provides space to record when you started and completed it.
Table 1. Module selection checklist
a. This module includes technical information and might require more advanced statistical knowledge.
Source: Authors' compilation.
Clicking on any of the eight module links will bring you to a webpage with information about the module content, organized into chapters (figure 4). You can use the chapters to engage with the module content in smaller sections. Each chapter includes a short video that explains the content and a link to the PowerPoint slides used in the video. In addition, each module webpage includes links to the tools, handouts, and worksheets used in the module. You can download and print these materials to use while watching the video, or you can use them while conducting your own evaluation.
Figure 4. Opening page of Module 1 on the Program Evaluation Toolkit website
What is included in the toolkit?

Module 1 guides you through developing a logic model for a program. The module contains four chapters that will help you do the following:
• Chapter 1: Understand the purpose and components of logic models.
• Chapter 2: Write a problem statement to better understand the problem that the program is designed to address.
• Chapter 3: Use the logic model to describe the program's resources, activities, and outputs.
• Chapter 4: Use the logic model to describe the short-term, mid-term, and long-term outcomes of the program.

Chapter 1 reviews the purpose of logic models and introduces the logic model components. Chapter 2 explains how to write a problem statement that describes the reason and context for implementing the program. Chapters 3 and 4 present the central logic model components: resources, activities, outputs, and short-term, mid-term, and long-term outcomes. These two chapters also explain how the components relate to and inform the overall logic model. In addition, the module highlights available resources on logic model development.
Module 2 guides you through writing measurable evaluation questions that are aligned to your logic model. The module contains three chapters that will help you do the following:
• Chapter 1: Learn the difference between process and outcome evaluation questions and understand how they relate to your logic model.
• Chapter 2: Use a systematic framework to write, review, and modify evaluation questions.
• Chapter 3: Prioritize questions to address in the evaluation.

Chapter 1 introduces the two main types of evaluation questions (process and outcome) and explains how each type aligns to the logic model. Chapter 2 presents a systematic framework for developing and revising evaluation questions and then applies that framework to sample evaluation questions. Chapter 3 describes and models a process for prioritizing evaluation questions. The module includes worksheets to help you write, review, and prioritize evaluation questions for your own program.
Module 3 reviews major considerations for designing an evaluation. The module contains three chapters that will help you understand the following:
• Chapter 1: The major categories of evaluation design, including when to use each design.
• Chapter 2: Threats to validity, including how to consider these threats when designing an evaluation.
• Chapter 3: The relationship between evaluation design and Every Student Succeeds Act (ESSA) tiers of evidence and What Works Clearinghouse (WWC) design standards.

Chapter 1 introduces four major categories of evaluation design: descriptive designs, correlational designs, quasi-experimental designs, and randomized controlled trials. The chapter explains considerations for when to use each category, including which is suited to the two types of evaluation questions (see module 2). Chapter 2 presents threats to internal and external validity and provides examples of common challenges in designing evaluations. Chapter 3 discusses the four tiers of evidence in ESSA and the three ratings of WWC design standards. The chapter explains how each tier or rating connects to evaluation design choices. The module includes activities to help you identify appropriate evaluation designs and links to resources from which you can learn more about the ESSA tiers of evidence and WWC design standards.
Module 5 provides an overview of data quality considerations. The module also covers aligning data to evaluation questions. The module contains three chapters that will help you do the following:
• Chapter 1: Identify the two major types of data and describe how to use them in an evaluation.
• Chapter 2: Evaluate the quality of your data, using six key criteria.
• Chapter 3: Connect data to your evaluation questions.

Chapter 1 discusses the two main types of data (quantitative and qualitative) and explains how to use both types of data to form a more complete picture of the implementation and outcomes of your program. Chapter 2 discusses the key elements of data quality: validity, reliability, timeliness, comprehensiveness, trustworthiness, and completeness. In addition, the chapter includes a checklist for assessing the quality of data. Chapter 3 covers the alignment of data to evaluation questions. The chapter introduces the evaluation matrix, a useful tool for planning your evaluation and the data you need to collect.
Module 6 presents best practices in developing data collection instruments and describes how to create quality instruments to meet data collection needs. The module contains three chapters that will help you do the following:
• Chapter 1: Plan and conduct interviews and focus groups.
• Chapter 2: Plan and conduct observations.
• Chapter 3: Design surveys.

Chapter 1 describes how to prepare for and conduct interviews and focus groups to collect data to answer evaluation questions. Chapter 2 covers developing and using observation protocols (for example, recording checklists and open field notes) to collect data. Chapter 3 focuses on survey development and implementation. Each chapter includes guiding documents, examples of data collection instruments, and a step-by-step process for choosing and developing an instrument that best meets your evaluation needs.
Module 7 reviews major considerations for analyzing data and making recommendations based on findings from the analysis. The module contains three chapters that will help you understand the following:
• Chapter 1: Common approaches to data preparation and analysis.
• Chapter 2: Basic analyses to build analytic capacity.
• Chapter 3: Implications of findings and how to make justifiable recommendations.

Chapter 1 reviews common techniques for data preparation, such as identifying data errors and cleaning data. It then introduces quantitative methods, including basic descriptive methods and linear regression. The chapter also reviews basic qualitative methods. Chapter 2 focuses on cleaning and analyzing quantitative and qualitative datasets, applying the methods from chapter 1. Chapter 3 presents a framework and guiding questions for moving from analysis to interpretation of the findings and then to making defensible recommendations based on the findings.
Module 8 presents best practices in disseminating and sharing the evaluation findings. The module contains two chapters that will help you do the following:
• Chapter 1: Learn how to develop a dissemination plan.
• Chapter 2: Explore best practices in data visualization.

Chapter 1 describes a dissemination plan and explains why a plan is helpful for sharing evaluation findings. It then outlines key considerations for developing a dissemination plan, such as the audience, the message, the best approach for communicating the message, and the best time to share the information with the audience. The chapter also includes important considerations for ensuring that dissemination products are accessible to all members of the audience. Chapter 2 reviews key considerations for visualizing data, including the audience, message, and approach. The chapter also presents examples of data visualizations, including graphs, charts, and tables, that can help make the data more easily understandable.
How did stakeholders collaborate in developing the toolkit?
The development of this toolkit arose in response to the Colorado Department of Education's need for tools and procedures to help districts systematically plan and conduct program evaluations related to locally implemented initiatives. The Regional Educational Laboratory Central partnered with the Colorado Department of Education to develop an evaluation framework and a set of curated resources that cover program evaluation from the planning stages to presentation of findings. The Program Evaluation Toolkit is an expansion of this collaborative work.
Appendix. Glossary of terms
This appendix provides definitions of key terms used in the Program Evaluation Toolkit. Terms are organized by module and listed in the order in which they are introduced in each module.
Problem statement: A description of the problem that the program is designed to address.

Resources: All the available means to address the problem, including investments, materials, and personnel. Resources can include human resources, monetary resources, facilities, expertise, curricula and materials, time, and any other contributions to implementing the program.

Activities: Actions taken to implement the program or address the problem. Activities can include professional development sessions, after-school programs, policy or procedure changes, use of a curriculum or teaching practice, mentoring or coaching, and development of new materials.

Outcomes: The anticipated results once you implement the program. Outcomes are divided into three types:

Short-term outcomes: The most immediate results for participants that can be attributed to program activities. Short-term outcomes are typically changes in knowledge or skills. Short-term outcomes are expected immediately following exposure to the program (or shortly thereafter).
Long-term outcomes: Broader results, such as systemic changes or changes in student outcomes. They might not be the sole result of the program, but they are associated with it and might manifest themselves after the program concludes.
Additional considerations: Important details or ideas that do not fit into the other components of the logic model. Additional considerations can include assumptions about the program, external factors not covered in the problem statement, and factors that might influence program implementation but are beyond the evaluation team's control.
Outcome questions: Questions about the impact of a program over time. They are also called summative questions.

PARSEC: A framework for creating quality evaluation questions. PARSEC is an acronym for pertinent, answerable, reasonable, specific, evaluative, and complete.

Specific: A question directly addresses a single component of the logic model. Specific questions are clearly worded and avoid broad generalizations.

Complete: The entire set of questions addresses all the logic model components that are of critical interest.
Correlational designs: Used to identify a relationship between two variables and determine whether that relationship is statistically meaningful. Correlational analyses do not demonstrate causality. They can find that X is related to Y, but they cannot find that X caused Y.

Comparison group: The group that does not receive the intervention and is used as the counterfactual to the intervention.

Validity: The extent to which the results of an evaluation are supportable, given the evaluation design and the methods used. Validity applies to the evaluation design, analytic methods, and data collection. Ultimately, valid claims are sound ones. There are two main types of validity:

Selection bias: When the treatment group differs from the comparison group in a meaningful way that is related to the outcomes of interest.
Hawthorne effect: When individuals act differently because they are aware that they are taking part in an evaluation.

The Every Student Succeeds Act (ESSA): A law that encourages state and local education agencies to use evidence-based programs. There are four ESSA tiers of evidence (U.S. Department of Education, 2016). These tiers fall under the Education Department General Administrative Regulations Levels of Evidence for research and evaluation design standards.

What Works Clearinghouse (WWC) design standards: The WWC is part of the U.S. Department of Education's Institute of Education Sciences. To provide educators with the information they need to make evidence-based decisions, the WWC reviews research on education programs, summarizes the findings of that research, and assigns evidence ratings to individual studies (What Works Clearinghouse, 2020).
The three WWC ratings correspond to the ESSA tiers of evidence. A study can be found:

To meet WWC standards without reservations: This rating corresponds to strong evidence under ESSA.

To meet WWC standards with reservations: This rating corresponds to moderate evidence under ESSA.

Not to meet WWC standards: A study with this rating can still provide promising evidence under ESSA.
Generalizability: The extent to which the results of an evaluation apply to different types of individuals and contexts.

Sample size: The number of participants needed in a sample to collect enough data to answer the evaluation questions.

Sampling frame: A list of all possible units (such as students enrolled in schools in a particular district) that can be sampled.

Random sampling: A sampling technique in which every individual within a population has a chance of being selected for the sample. There are three main types of random sampling:

Simple random sampling: Individuals in a population are selected with equal probabilities and without regard to any other characteristics.

Stratified random sampling: Individuals are first divided into groups based on known characteristics (such as gender or race/ethnicity). Then, separate random samples are taken from each group.

Clustered random sampling: Individuals are placed into specific groups, and these groups are randomly selected to be in the sample. Individuals cannot be in the sample if their groups are not selected.
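To make the distinction between simple and stratified random sampling concrete, the short Python sketch below draws both kinds of samples. It is illustrative only and not part of the toolkit; the roster, grade levels, and sample sizes are hypothetical.

```python
# Minimal sketch (not from the toolkit): simple vs. stratified random sampling
# from a hypothetical roster of 500 students.
import random

random.seed(42)  # fixed seed so the example is reproducible

# Hypothetical sampling frame: (student_id, grade_level)
roster = [(i, random.choice(["grade 6", "grade 7", "grade 8"])) for i in range(1, 501)]

# Simple random sampling: every student has the same chance of selection.
simple_sample = random.sample(roster, k=60)

# Stratified random sampling: group students by grade level (the strata),
# then draw a separate random sample from each group.
strata = {}
for student in roster:
    strata.setdefault(student[1], []).append(student)

stratified_sample = []
for grade, students in strata.items():
    stratified_sample.extend(random.sample(students, k=20))

print(len(simple_sample), len(stratified_sample))
```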
Nonrandom sampling: A sampling technique in which only some individuals have a chance of being selected for the sample. There are four main types of nonrandom sampling:

Consecutive sampling: Individuals meeting a criterion for eligibility (such as being math teachers) are recruited until the desired sample size is reached.

Convenience sampling: Individuals are selected who are readily available and from whom data can be easily collected.
Snowball sampling: Individuals are recruited through referrals from other participants.

Purposive sampling: Individuals are selected to ensure that certain characteristics are represented in the sample to meet the objectives of the evaluation.

Saturation: The point at which the data collected begin to yield no new information and data collection can be stopped.

Unit of measurement: The level at which data are collected (for example, student, classroom, school).

Confidence interval: A range of values for which there is a certain level of confidence that the true value for the population lies within it. The range of values will be wider or narrower depending on the desired level of confidence. Standard practice is to use a 95 percent confidence level, which means there is a 95 percent chance that the range of values contains the true value for the population.
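As an illustration only (not part of the toolkit), the sketch below computes a 95 percent confidence interval for a mean using hypothetical test scores and the normal-approximation critical value of 1.96.

```python
# Minimal sketch (not from the toolkit): 95 percent confidence interval for a mean.
import math
import statistics

scores = [72, 85, 78, 90, 66, 81, 77, 88, 93, 70, 84, 79]  # hypothetical test scores

mean = statistics.mean(scores)
std_error = statistics.stdev(scores) / math.sqrt(len(scores))  # standard error of the mean

# 1.96 is the critical value for a 95 percent confidence level (normal approximation);
# a t critical value would give a slightly wider interval for a sample this small.
margin = 1.96 * std_error
print(f"95% CI: ({mean - margin:.1f}, {mean + margin:.1f})")
```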
Null hypothesis: A statement that suggests there will be no difference between the treatment group and the comparison group involved in an evaluation.

Statistical power: The probability of rejecting the null hypothesis when a particular alternative hypothesis is true.
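A common practical use of statistical power is estimating how many participants an evaluation needs. The sketch below is illustrative only and not from the toolkit; it assumes the statsmodels library and a hypothetical medium effect size of 0.5.

```python
# Minimal sketch (not from the toolkit): sample size needed per group for a
# two-group comparison, assuming a medium effect size (0.5), a 5 percent
# significance level, and 80 percent power.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_group))  # roughly 64 participants per group under these assumptions
```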
Continuous data: Data that can take on a full range of possible values, such as student test scores, years of teaching experience, and schoolwide percentage of students eligible for the National School Lunch Program.

Binary data: Data that can take on only two values (yes or no), such as pass or fail scores on an exam, course completion, graduation, or college acceptance.

Standard deviation: A measure that indicates how spread out data are within a given sample.
Data quality: The extent to which data accurately and precisely capture the concepts they are intended to measure.

Validity: The extent to which an evaluation or instrument really measures what it is intended to measure. Validity applies to the evaluation design, methods, and data collection. There are two main types of validity:

External validity: The extent to which an instrument or evaluation findings can be generalized to different contexts, such as other populations or settings.

Reliability: The extent to which the data source yields consistent results. There are three common types of reliability:

Test–retest reliability: The extent to which the same individual would receive the same score on repeated administrations of an instrument.

Inter-rater reliability: The extent to which multiple raters or observers are consistent in coding or scoring.
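One common way to quantify inter-rater reliability is Cohen's kappa, which adjusts simple percent agreement for agreement expected by chance. The sketch below is illustrative only and not from the toolkit; the two raters' codes are hypothetical and the scikit-learn library is assumed.

```python
# Minimal sketch (not from the toolkit): Cohen's kappa for two raters coding
# the same ten classroom observations as "on task" or "off task".
from sklearn.metrics import cohen_kappa_score

rater_a = ["on", "on", "off", "on", "off", "on", "on", "off", "on", "on"]
rater_b = ["on", "on", "off", "off", "off", "on", "on", "off", "on", "off"]

kappa = cohen_kappa_score(rater_a, rater_b)  # 1.0 = perfect agreement, 0 = chance-level
print(round(kappa, 2))
```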
Timeliness: The extent to which data are current and the results of data analysis and interpretation are available when needed.

Trustworthiness: The extent to which data are free from manipulation and entry error. Trustworthiness is often addressed by training data collectors.

Completeness: Data are collected from all participants in the sample and are sufficient to answer the evaluation questions. Completeness also relates to the degree of missing data and the generalizability of the dataset to other contexts.
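A quick check of the degree of missing data is often the first step in judging completeness. The sketch below is illustrative only and not from the toolkit; the survey columns are hypothetical and the pandas library is assumed.

```python
# Minimal sketch (not from the toolkit): share of missing values per column
# in a hypothetical survey dataset.
import pandas as pd

survey = pd.DataFrame({
    "respondent_id": [1, 2, 3, 4, 5],
    "role": ["teacher", "teacher", None, "principal", "teacher"],
    "years_experience": [4, None, 12, 20, 7],
})

missing_share = survey.isna().mean()  # proportion of missing values in each column
print(missing_share)
```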
Triangulation: Reviewing multiple sources of data to look for similarities and differences.

Member checks: Establishing the validity of qualitative findings through key stakeholder and participant review.

Audit trail: A documented history of qualitative data collection and analysis. Careful documentation of data collection procedures, training of data collectors, and notes allows for findings to be cross-referenced with the conditions under which the data were collected.

Evaluation matrix: A planning tool to ensure that all necessary data are collected to answer the evaluation questions.
Focus group: Directly asking a group of participants questions to collect data to answer an evaluation question.

Survey: Administering a fixed set of questions to collect data in a short period. Surveys can be an inexpensive way to collect data on the characteristics of a sample in an evaluation, including behaviors, practices, skills, goals, intentions, aspirations, and perceptions.

Observable variable: Behaviors, practices, or skills that can be directly seen and measured. Also called a measurable variable. These data are collected in a variety of ways (for example, observations, surveys, interviews).

Open-ended question: A question that does not include fixed responses or scales but allows respondents to add information in their own words.

Close-ended question: A question that includes fixed responses such as yes or no, true or false, multiple choice, multiple selection, or rating scales.
Midpoint: The middle of a rating scale with an odd number of response options. Typically, respondents can select the midpoint to remain neutral or undecided on a question.

Double-barreled question: A question that asks two questions but forces respondents to provide only one answer. For example, “Was the professional development culturally and developmentally appropriate?”

Loaded question: A question that could lead respondents to answer in a way that does not represent their actual position on the topic or issue. For example, the wording of a question or its response options could suggest to respondents that a certain answer is correct or desirable.

Probing question: A follow-up question that helps gain more context about a particular response or helps participants think further about how to respond.

Recording checklist: A standardized form, with preset questions and responses, for observing specific behaviors or processes.

Observation guide: A form that lists behaviors or processes to observe, with space to record open-ended data.

Mutually exclusive: When two response options in a survey cannot be true at the same time.

Collectively exhaustive: When response options in a survey include all possible responses to a question.
Data error: The difference between an actual data value and the reported data value.

Outlier: A data value that is positioned an abnormal distance from the expected data range.

Data analysis: The process of examining and interpreting data to answer questions. There are two broad approaches to data analysis:

Standard deviation: A measure of how spread out data points are that describes how far the data are from the mean.

Range: The maximum and minimum observed values for a given variable.

Quartile: One of four even segments that divide up the range of values in a dataset.

Interquartile range: The spread of values between the 25th percentile and the 75th percentile.
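To show how these measures of spread relate to one another, here is a short Python sketch. It is illustrative only and not from the toolkit; the scores are hypothetical, and the 1.5 × IQR rule used to flag potential outliers is a common convention rather than a toolkit requirement.

```python
# Minimal sketch (not from the toolkit): standard deviation, range, quartiles,
# interquartile range, and a common rule of thumb for flagging outliers.
import statistics

scores = [62, 68, 71, 74, 75, 77, 79, 80, 83, 85, 88, 120]  # hypothetical scores

std_dev = statistics.stdev(scores)              # spread around the mean
value_range = (min(scores), max(scores))        # minimum and maximum
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
iqr = q3 - q1                                   # interquartile range

# Values more than 1.5 * IQR beyond the quartiles are often flagged as potential outliers.
outliers = [s for s in scores if s < q1 - 1.5 * iqr or s > q3 + 1.5 * iqr]
print(std_dev, value_range, iqr, outliers)
```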
t-test: A comparison of two means or standard deviations to determine whether they differ from each other.
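For example, a t-test might compare mean scores for a treatment group and a comparison group. The sketch below is illustrative only and not from the toolkit; the scores are hypothetical and the scipy library is assumed.

```python
# Minimal sketch (not from the toolkit): independent-samples t-test comparing
# hypothetical treatment and comparison group scores.
from scipy import stats

treatment = [78, 85, 82, 90, 88, 76, 84, 91]
comparison = [72, 80, 75, 83, 79, 70, 77, 81]

t_stat, p_value = stats.ttest_ind(treatment, comparison)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # a small p value suggests the means differ
```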
Correlation analysis: Analysis that generates correlation coefficients that indicate how differences in one variable correspond to differences in another.
A positive correlation coefficient indicates that the two variables either increase or decrease together. A negative correlation coefficient indicates that, as one variable increases, the other decreases.

Simple or linear regression analysis: Analysis that can show the relationship between two variables.

Multiple regression analysis: Analysis that can control for other factors by including additional variables.

Dependent variable: A variable that could be predicted or caused by one or more other variables.

Independent variable: A variable that has an influence on or association with the dependent variable.

Covariate: A variable that has a relationship to the dependent variable that should be considered but that is not directly related to a program. Examples of covariates are student race/ethnicity, gender, socioeconomic status, and prior achievement.
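To tie these terms together, the sketch below computes a correlation coefficient, fits a simple regression, and then fits a multiple regression that adds a covariate. It is illustrative only and not from the toolkit; the variables are simulated and the numpy and statsmodels libraries are assumed.

```python
# Minimal sketch (not from the toolkit): correlation, simple regression, and
# multiple regression with a covariate, using simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
hours_in_program = rng.uniform(0, 20, size=100)   # independent variable
prior_score = rng.normal(70, 10, size=100)        # covariate
post_score = (50 + 1.2 * hours_in_program + 0.3 * prior_score
              + rng.normal(0, 5, size=100))       # dependent variable

# Correlation: how differences in program hours correspond to differences in scores.
r = np.corrcoef(hours_in_program, post_score)[0, 1]

# Simple regression: post_score predicted from hours_in_program alone.
simple_model = sm.OLS(post_score, sm.add_constant(hours_in_program)).fit()

# Multiple regression: adding prior_score as a covariate.
predictors = sm.add_constant(np.column_stack([hours_in_program, prior_score]))
multiple_model = sm.OLS(post_score, predictors).fit()

print(round(r, 2))
print(simple_model.params)    # intercept and slope
print(multiple_model.params)  # intercept, program-hours slope, covariate slope
```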
Dissemination plan: Strategically planning dissemination activities to use time and other resources efficiently and to communicate effectively.

Audience: The group of people who need or want to hear the information that will be disseminated.

Message: The information that the audience needs to know about an evaluation and that the evaluators want to share.

Approach: The means used to disseminate the information to the audience. There are many dissemination approaches:

Blog: An online forum for sharing regular updates about a program and the evaluation process.

Data dashboard: A visual tool for organizing and sharing summaries of large amounts of data, especially quantitative data.

Media release: A write-up about an evaluation and its findings to be shared with media outlets.

Evaluation report: A formal, highly organized document describing the methods, measures, and findings of an evaluation.

Summary of findings: A short one- to two-paragraph piece that briefly describes what is happening and what was found in an evaluation.

Social media: Digital tools to quickly create and share information about an evaluation with a variety of audiences.

Webinar: A visual medium and way to reach large numbers of people, often at little or no cost.
Infographic: A one- or two-page document that graphically represents data and findings to tell a story.

Plain language: Using clear communication and writing so that it is easy for the audience to understand and use the findings of an evaluation.

Accessibility: Ensuring that dissemination products are available to all individuals, including people with disabilities, by meeting the requirements for Section 508 compliance.

Data visualization: Using graphical representations so that data are easier to understand.

Alternative text: A narrative description of a figure, illustration, or graphic for readers who might not be able to engage with the content in a visual form.
References
U.S. Department of Education. (2016). Non-regulatory guidance: Using evidence to strengthen education investments. https://www2.ed.gov/policy/elsec/leg/essa/guidanceuseseinvestment.pdf

What Works Clearinghouse. (2020). Standards handbook (Version 4.1). U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. https://ies.ed.gov/ncee/wwc/handbooks
Acknowledgments
The Program Evaluation Toolkit would not have been possible without the support and contributions of Trudy Cherasaro, Mike Siebersma, Charles Harding, Abby Laib, Joseph Boven, David Alexandro, and Nazanin Mohajeri-Nelson and her team at the Colorado Department of Education.
REL 2022–112
October 2021
This resource was prepared for the Institute of Education Sciences (IES) under Contract ED-IES-17-C-0005 by the Regional Educational Laboratory Central administered by Marzano Research. The content of the resource does not necessarily reflect the views or policies of IES or the U.S. Department of Education, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
This REL resource is in the public domain. While permission to reprint this publication is not necessary, it should be cited as:

Stewart, J., Joyce, J., Haines, M., Yanoski, D., Gagnon, D., Luke, K., Rhoads, C., & Germeroth, C. (2021). Program Evaluation Toolkit: Quick Start Guide (REL 2022–112). U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Central. http://ies.ed.gov/ncee/edlabs

This resource is available on the Regional Educational Laboratory website at http://ies.ed.gov/ncee/edlabs.