Chapter 4
Exercise 4.1.
Identifying Examples of Evaluation, Assessment, and Measurement.
For each of the classroom scenarios provided below, describe specific examples of evaluation,
assessment, and measurement.
1. At the end of a lesson on adding mixed fractions, Mr. Tong gives his fourth-grade students a
sheet of fifteen pairs of mixed fractions to add. Mr. Tong collects the completed sheets and
checks his students' answers. He notices that all of the students have at least thirteen correct
answers, except Ethan and Marcella, who both have answered only eight correctly. Mr. Tong
decides to provide extra help for Ethan and Marcella while the rest of the class has free reading
time.
a. Evaluation: Mr. Tong looks over the completed worksheets and decides that Ethan and
Marcella need additional support since they got fewer correct answers than their classmates. He
adjusts his teaching approach by offering them extra help while the rest of the class moves on to
independent reading.
b. Assessment: The worksheet helps Mr. Tong see how well his students understand adding
mixed fractions and spot who needs more practice.
c. Measurement: Counting how many problems each student got right out of 15 gives a clear way
to see their level of understanding.
2. A high school English teacher, Mrs. Goldbloom, gives students a writing assignment in which
they are to create an allegorical story. She reads the stories and gives each a score based on (a)
the correct use of allegory, 10 points; (b) creativity, 5 points; and (c) writing style, 5 points.
When handing back the papers, Mrs. Goldbloom announces that any students whose total scores
are less than 12 must schedule a conference with her and then write a second draft of their
stories.
a. Evaluation: Mrs. Goldbloom reviews the students' scores and decides that anyone scoring under
12 needs to meet with her to go over the work and revise it for improvement.
b. Assessment: This writing task allows Mrs. Goldbloom to see how well students understand
allegory while also considering their creativity and writing ability.
c. Measurement: The scores given for different aspects of the story—such as allegory (10
points), creativity (5 points), and writing style (5 points)—provide a way to gauge student
performance.
3. Ms. Fisk shows a videotape to her first-graders about being kind to other children. Curious
about the effect of the tape on her students, she watches them carefully for the next two hours.
During that time period, Ms. Fisk counts twenty-three specific instances of kindness that her
students exhibit toward each other. "Wow," she thinks to herself, "that tape really made an
impression. Since the tape was so effective with this group, I think I'll plan to show it to next
year's class too."
a. Evaluation: After watching her students for a while, Ms. Fisk decides that the video had a
positive effect, since she notices a lot more kindness among the kids.
b. Assessment: Ms. Fisk observes how her students interact to see if the video influenced their
behavior in any way.
c. Measurement: Ms. Fisk counts the 23 acts of kindness as a way to measure how much the
video impacted her students.
4. Ms. Rodriguez gives her students several practice opportunities to identify subjects and
predicates in sentences that she displays on an overhead projector. While students begin to
practice independently at their seats, she thinks to herself, "Miguel has not raised his hand to
participate in any of the exercises so far. Maybe he is confused. I'd better check to see how he's
doing on the written assignment."
a. Evaluation: Ms. Rodriguez notices that Miguel hasn’t participated in the exercises and decides
that she needs to check his progress to understand how well he’s grasping the material.
b. Assessment: Ms. Rodriguez is informally assessing Miguel’s understanding of the lesson by
deciding to check his written assignment after observing his lack of participation.
c. Measurement: The written assignment serves as a way to measure how well Miguel is doing
with identifying subjects and predicates.
5. Before beginning a new algebra unit, Mr. Thomas designs a twenty-item test that focuses on
prerequisite skills for the unit. None of his students score above 70% correct on the test. Looking
at the results, Mr. Thomas thinks to himself, "I guess I'd better not assume that all my students
have mastered the prerequisite skills. I should teach and review them throughout the unit."
a. Evaluation: Mr. Thomas reviews the test results and decides that the students haven’t mastered
the prerequisite skills, so he plans to review and teach them throughout the unit.
b. Assessment: The test serves as an assessment to determine how well students have mastered
the skills needed before starting the new algebra unit.
c. Measurement: The test scores, which show none of the students scoring above 70%, provide a
way to measure their current level of understanding of the prerequisite skills.
Exercise 4.2
Identifying Major Types of Evaluation and Assessment
Identify the type of evaluation and assessment best exemplified by the scenarios provided below
by placing one of the following letters in front of each:
P = pre-instructional, F = formative, S = summative, D = diagnostic.
1. Sally has been having trouble with three-digit multiplication, so Mr. Michaels calls her to his
desk to work a couple of three-digit multiplication exercises. As Sally works, Mr. Michaels watches
her closely and asks her questions about what she is doing.
D (Diagnostic) - Because Sally has already been having trouble, Mr. Michaels observes her
closely and questions her as she works in order to pinpoint exactly where her difficulty lies.
2. After demonstrating the procedure for taking a pulse, Mrs. Edwards watches as her students
take each other's pulses. Observing that some students are not placing their fingers on the correct
locations, she decides to get everyone's attention and demonstrate again.
F (Formative) - Mrs. Edwards notices that some students are having trouble with taking
pulses and decides to stop and show them the correct way again.
3. At the conclusion of a social studies unit on Native Americans, Ms. Jordan has groups of
students work together to write and produce plays to demonstrate some of the information they
have learned.
S (Summative) - The group project where students write and perform a play is a final task to
see how much they’ve learned about Native Americans during the unit.
4. Before beginning a new lesson on converting fractions to decimals, Mr. Fogel writes several
division sentences on the board and asks students to find the quotients. Because all of his
students find the correct quotients within five minutes, he decides to continue with teaching the
new skill.
P (Pre-instructional) - Mr. Fogel checks his students’ ability to handle division problems
before starting the lesson on fractions to make sure they’re ready for the new material.
5. In the middle of explaining how hydrogen and oxygen combine to form water, Ms. Lukas
notices a puzzled look on several students' faces. She decides to explain the process again, this
time more slowly and illustrating with molecule models.
F (Formative) - Ms. Lukas sees that a few students look confused, so she decides to explain
the concept more slowly and use a few more examples to clarify things.
6. While checking Calvin's math homework, Mrs. Clemens notices that he missed almost all of
the subtraction exercises. Analyzing Calvin's mistakes more closely, she discovers that he
sometimes adds digits rather than subtracting them.
D (Diagnostic) - Mrs. Clemens notices that Calvin is struggling with subtraction and takes a
closer look at his mistakes to figure out why he’s not getting it right.
7. Ms. Hernandez is planning a series of lessons on the system of checks and balances in the
U.S. federal government. Before proceeding with the lessons, she asks each student to write on a
piece of paper the primary functions of the executive, legislative, and judicial branches of
government.
P (Pre-instructional) - Ms. Hernandez has her students write about what they know about the
branches of government before starting her lessons, so she can see what they already
understand.
8. Several days after the students in a high school have heard a popular sports figure warn them
about the dangers of drug and alcohol abuse, the school's principal distributes a survey to 100
students selected at random. The survey asks questions about students' attitudes toward drug and
alcohol abuse. The results of the survey are compared to the results of a similar survey that was
administered earlier in the school year to 100 randomly selected students.
S (Summative) - The survey is used to see how students’ attitudes have changed after the
anti-drug talk, comparing their responses to an earlier survey.
9. As Casey reads a set of sight words presented on flashcards, her teacher has her first-grade
classmates either put their thumbs up if they agree with Casey or thumbs down if they disagree.
F (Formative) - As Casey reads sight words, her classmates give thumbs-up or thumbs-down
to show whether they agree with her, providing feedback on her progress.
10. Following a simulated frog dissection activity on the Internet, Mr. Andre gives students in his
biology class a blank diagram of a frog's anatomy and asks them to label all the major internal
organs.
F (Formative) - After the online frog dissection, Mr. Andre asks his students to label a frog
diagram, checking their understanding of the frog’s anatomy.
Exercise 4.3
Identifying Threats to Assessment Validity and Reliability
Read the following assessment scenarios. Below each scenario, describe any specific threats to
assessment validity or reliability that you identify. For each threat that you identify, briefly
describe a specific action that you could take to improve the assessment strategy.
1. Mrs. Franco sits down to begin grading her students' English essays. Noticing that the paper on
top of the stack belongs to Ashley, she says to herself, "Ashley is my best student. I'm glad I get to
read her paper first. I'm sure her essay will set the standard for the rest of the class."
a. Threat to reliability or validity:
Mrs. Franco's personal bias toward Ashley might affect how she grades her paper, leading to
inconsistent grading. This could make her evaluation unfair and unreliable.
b. Specific action to correct:
To prevent this, Mrs. Franco should grade the essays without knowing which student wrote each
one. This would ensure a more impartial and consistent grading process.
2. Mr. Glasser scores an algebra test while watching the seventh game of the World Series on
television. He must get the tests scored for tomorrow's class, but his favorite baseball team is
playing.
a. Threat to reliability or validity:
Scoring the test while distracted by the game could lead to inconsistent grading because Mr.
Glasser might not be giving each test the full attention it deserves. This could reduce the
reliability of the scores, as some tests may be graded more thoroughly than others due to the
distraction.
b. Specific action to correct:
Mr. Glasser should grade the tests without distractions, setting aside time to focus solely on
grading. This would ensure that each test is scored fairly and consistently.
3. One of Ms. Heath's instructional goals is for her students to be able to recognize imports and
exports. For assessment, Ms. Heath asks students to match each term with its correct definition.
a. Threat to reliability or validity:
The matching activity might not give a full picture of the students' understanding. It could be too
basic and might not assess whether students can apply the concepts of imports and exports in
different contexts, affecting the validity.
b. Specific action to correct:
Ms. Heath could add questions that ask students to explain the concepts or provide examples of
imports and exports in real-life situations to better measure their understanding.
4. Mrs. McDonald wants to find out how well her biology students can apply basic principles of
genetics. For her assessment, she gives students five of the same genetics scenarios that she used
in class and asks them to make predictions about hair color and eye color.
a. Threat to reliability or validity:
Since students have already seen these scenarios in class, they might remember the answers
instead of actually applying their knowledge. This could make the assessment less effective in
measuring true understanding.
b. Specific action to correct:
Mrs. McDonald could create new but similar scenarios that require students to think critically
rather than rely on memory. This would help ensure the assessment truly reflects their ability to
apply genetics principles.
5. For the final assessment for his unit on African history, Mr. James has each of his students
draw a question written on a slip of paper from a hat. The students have twenty minutes to
answer the questions they have drawn.
a. Threat to reliability or validity:
Because students randomly pick questions, some might get easier ones while others get harder
ones. This makes the assessment uneven, so it might not fairly measure each student’s
understanding.
b. Specific action to correct:
Mr. James could make sure all students answer the same set of questions or provide multiple
questions for them to choose from. This way, everyone has a fair chance to demonstrate what
they’ve learned.
6. At the end of a biology unit on the animal kingdom and the major classes of animals, Mrs.
Wilson sits down to prepare a test. Mrs. Wilson has a personal fondness for birds, and
approximately half of the test's fifty questions are related to this particular class of animals.
a. Threat to reliability or validity:
Since a large portion of the test focuses on birds, the assessment is imbalanced and does not
fairly cover all major animal classes. This affects the validity, as it does not accurately measure
students' overall understanding of the animal kingdom.
b. Specific action to correct:
Mrs. Wilson should revise the test to ensure that questions are evenly distributed across all major
animal groups. This way, the assessment more accurately reflects the entire unit's content.
7. Following a lesson on solving proportions, Mr. Watts gives his students a set of twenty
proportions to solve to assess their learning. He allows the students to work together in groups
for the remainder of the class and then tells them to finish the sheet at home.
a. Threat to reliability or validity:
Since students are working together and then completing the assignment at home, some might
get more help than others. This means the results may not accurately show each student's
individual understanding of proportions, affecting the validity of the assessment.
b. Specific action to correct:
Mr. Watts could have students complete at least part of the assignment independently in class
before working in groups or taking it home. This way, he can better gauge their actual skills
before they get outside help.
8. Throughout the entire school year, a science teacher, Mrs. Frederico, has tried to develop her
students' abilities to apply the scientific method. She concludes the year by asking students to list
and explain each of the major steps of the scientific method. She also asks them to indicate how
well they think they can use the scientific method and how useful they think it is.
a. Threat to reliability or validity:
The assessment mainly checks whether students can recite the steps of the scientific method rather
than demonstrate that they can use it. This means it may not truly reflect their ability to apply what
they've learned. Also, asking students to rate their own skills might not give an accurate picture
of their actual understanding.
b. Specific action to correct:
Instead of just listing steps, Mrs. Frederico could have students conduct a simple experiment
where they apply the scientific method. This way, she can better see if they understand how to
use it in practice.
Exercise 4.4
Demonstrating Instructional Goals for Chapter 4
Respond to each of the questions below to demonstrate your achievement of the chapter's
instructional goals.
GOAL 1: State definitions for the following terms and identify examples of each: evaluation,
assessment, and measurement.
1. In your own words, explain the meanings of the following terms so that their differences and
relationships are clear:
Evaluation is the process of judging the overall effectiveness of a course or program. It looks
at whether the course met its learning goals and if the teaching methods and materials were
successful. It's more about how well the entire learning experience worked.
Assessment is when teachers check how well students are learning. It includes things like
quizzes, projects, or assignments. It helps measure students' understanding of the material,
either during the course (formative) or at the end (summative).
Measurement is simply turning students' performance into numbers, like grades or test
scores. It helps track how well a student is doing and gives a way to compare their
performance.
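The relationship among the three terms can be tied together in a minimal sketch. All names,
scores, and function names below are hypothetical, chosen only to illustrate the distinction, not
drawn from the scenarios in the book:

```python
# Minimal sketch (hypothetical data): measurement turns performance into
# numbers, assessment gathers evidence of learning, and evaluation uses
# the numbers to make a judgment and decide on an action.

def measure(correct, total):
    """Measurement: quantify performance as a percentage score."""
    return 100 * correct / total

def evaluate(score, cutoff=70):
    """Evaluation: judge the score against a cutoff and choose an action."""
    return "needs review" if score < cutoff else "ready to move on"

# Assessment: the collected worksheet results (hypothetical students).
results = {"Ethan": 8, "Marcella": 8, "Priya": 14}

for name, correct in results.items():
    score = measure(correct, 15)      # measurement: raw count -> number
    decision = evaluate(score)        # evaluation: number -> judgment
    print(f"{name}: {score:.0f}% -> {decision}")
```

The point of the sketch is only that the three steps are separable: the same measurement (a
percentage) could feed a different evaluation rule (a different cutoff, or a different action).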
2. In the following scenario, identify a specific example of evaluation, assessment, and
measurement: At the end of eighth grade, all the students in a school district write a three-
paragraph essay. The essays are rated on a scale of 1 (poor) to 10 (superior) by all the eighth-
grade English teachers. Any student whose essay receives a score less than 5 will be assigned to
a remedial writing class for the next school year.
a. Evaluation:
At the end of the year, the overall effectiveness of the writing curriculum is being evaluated
based on the essays written by the students. The fact that students who score below 5 are
assigned to remedial writing suggests that the school district is assessing how well the writing
program has worked in preparing students.
b. Assessment:
The essays are used to assess the students' writing skills. The process of having students write a
three-paragraph essay helps the teachers gauge their abilities and identify areas where they might
need more support or instruction.
c. Measurement:
The essays are measured using a scale from 1 (poor) to 10 (superior). The score each student
receives is a numeric value that quantifies their performance and determines whether they will be
assigned to remedial classes.
GOAL 2: Explain the purposes of the following types of instructional evaluation: pre-
instructional, formative, summative, and diagnostic.
3. In your own words, describe the reasons why classroom teachers use pre-instructional,
formative, summative, and diagnostic evaluation.
a. Pre-instructional Evaluation:
Before starting a lesson, teachers check to see what students already know. This helps them plan
their teaching so they don’t waste time on things students already understand, or miss out on
concepts they haven’t learned yet.
b. Formative Evaluation:
During the lesson, teachers use formative evaluation to keep track of how students are doing. It
could be through quizzes, discussions, or small activities. This helps teachers adjust their
teaching if students need more help or if they’re ready to move on.
c. Summative Evaluation:
Summative evaluation happens at the end of a lesson or unit. It’s used to check how much
students have learned overall. This could be through final exams, essays, or big projects, and it
helps teachers decide if the learning goals were met.
d. Diagnostic Evaluation:
Diagnostic evaluation helps teachers figure out what specific areas a student is struggling with.
Before or during the lesson, teachers may use it to identify where students need extra help, so
they can focus on those areas and prevent bigger learning gaps.
GOAL 3: Recognize examples of pre-instructional, formative, summative, and diagnostic
assessment.
4. Identify the type of assessment best represented by the four scenarios provided below by using
the following letters:
P = pre-instructional, F = formative, S = summative, D = diagnostic.
F A third-grade teacher holds up cards with addition sentences written on them. The students
write the sum for each sentence on their own small chalkboards and hold them up for the teacher
to see. The teacher calls on students who have incorrect answers so she can give them further
help.
P Before beginning a lesson on capitalizing proper nouns, a teacher gives his students a list of
ten nouns and asks them to circle all the proper nouns.
D A teacher asks a student who has performed poorly on a test of physics principles to give
an oral explanation of the meaning of torque.
S After teaching students how to write business letters, a teacher asks each of her students to
write a business letter. Students whose letters receive a score of "8" on a ten-point scale are
considered to have mastered the skill.
GOAL 4: State and describe the two characteristics that all types of assessment data must possess:
reliability and validity.
5. In the space below, explain in your own words the meaning of reliability and validity. Explain
why all assessment data should be as reliable and valid as possible.
In assessment, reliability means that the results stay consistent over time. If you
give the same test again or have different people score it, you'd expect similar results. It ensures
that the data you get is dependable and not influenced by random factors.
Validity means that the assessment is actually measuring what it's supposed to measure. In other
words, if the test is meant to check certain skills or knowledge, it should do just that. A valid
assessment ensures that you're testing the right things.
Having both reliable and valid assessments is crucial because they help make sure the data we
collect is both consistent and accurate. Reliable assessments provide dependable results, while
valid assessments ensure the test is actually measuring what it should. If either of these is
lacking, the results could end up being misleading or unfair, leading to incorrect conclusions
about students' abilities.
GOAL 5: Explain and demonstrate how assessment reliability and validity are related.
6. In your own words, explain how assessment reliability and validity are related to each other. Is
it possible to have one without the other? Back up your answer with specific examples that are
different from those presented in the book.
Reliability and validity are both important, but they focus on different things. Reliability is about
whether the results stay the same every time you do the assessment. Validity is about whether the
test is actually measuring what it’s supposed to measure.
You can have a reliable test without it being valid. For example, if a quiz always gives the same
results but only tests memorization (not real understanding), it’s reliable but not valid. It’s
consistent, but it’s not testing the right things. The reverse, however, is not possible: a test cannot
be valid without being reliable. If a project’s grading is inconsistent, changing from one scorer or
occasion to the next, the scores cannot be trusted to measure anything consistently, so they cannot
be valid. Reliability is therefore necessary, but not sufficient, for validity.
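Rater inconsistency, the kind of unreliability described above, can be sketched numerically.
The scores below are hypothetical, and the mean absolute difference between raters is used here
only as a rough stand-in for a formal reliability coefficient:

```python
# Hypothetical scores from raters grading the same five essays on a
# 10-point scale. A rough consistency check: the average absolute
# disagreement between two raters' scores.

rater_a = [8, 6, 9, 5, 7]
rater_b = [8, 7, 9, 4, 7]   # close agreement -> scores look reliable
rater_c = [3, 9, 5, 8, 2]   # large disagreement -> scores look unreliable

def mean_abs_diff(x, y):
    """Average absolute difference between two raters' scores."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

print(mean_abs_diff(rater_a, rater_b))  # small: grading is consistent
print(mean_abs_diff(rater_a, rater_c))  # large: grade depends on the rater
```

When the disagreement is as large as it is between rater A and rater C, a student's grade says
more about who scored the paper than about the paper itself, which is exactly why such scores
cannot support valid conclusions.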
GOAL 6: Describe and demonstrate threats to assessment validity and reliability.
7. Explain how content inconsistency and action inconsistency threaten assessment validity.
Support your explanation with specific examples.
Content inconsistency and action inconsistency can both undermine the validity of an assessment,
meaning the test won’t really measure what it’s supposed to.
Content inconsistency happens when the test doesn't match what was actually taught in the
course. For instance, if a course focuses on writing but the exam has a section on math, the exam
no longer tests writing skills, which is what the course is meant to teach. It’s testing something
unrelated, so it’s not valid.
Action inconsistency occurs when the test is administered or graded in an unfair way. For
example, if one student gets extra time to finish an assignment while another doesn’t, the
assessment isn’t measuring their true abilities. It’s being affected by unequal conditions, which
makes the results invalid.
Both types of inconsistency weaken how accurately the test measures what it’s supposed to.
To keep an assessment valid, the content should match what was taught, and everyone should be
treated the same way when they take the test.
8. Explain how each of the variables listed below can threaten assessment reliability. Support
your explanations with specific examples.
a. Scoring procedures
When the grading isn’t consistent, it makes the results less reliable. For example, if one person
grades an essay strictly while another gives it a higher score, the final grade depends on who
grades it, which makes the test less predictable.
b. Item sampling
If the questions on a test don’t cover everything they’re supposed to, the results can be
inconsistent. For instance, if a math test only includes questions from one unit instead of the
whole course, a student could do well or poorly based on what questions they got, not their
overall understanding.
c. Item quality
Poorly worded questions can confuse students, which undermines the test's reliability. If a
question isn’t clear, students might get it wrong for the wrong reasons, leading to inconsistent
results.
d. Administration conditions
The environment where the test is taken can also affect how reliable the results are. For example,
if some students take the test in a quiet room and others in a noisy one, the distractions could
cause differences in scores, making the results less reliable.
e. Learner characteristics
Things like how tired or nervous a student is can affect how well they do on the test. For
example, if a student is anxious, they might not perform as well as they could, which means their
scores aren’t stable or reliable.
GOAL 7: Describe and demonstrate strategies for maximizing assessment validity and reliability.
9. Describe and illustrate (give examples) two specific ways to improve assessment validity.
a. Make sure assessments match course goals
One way to improve validity is by ensuring the assessment aligns with the course's learning
objectives. For instance, if the course teaches writing, then the assessment should focus on
writing tasks like essays or reports. Using an assessment that tests what you actually taught
ensures that students are being evaluated on the skills they were meant to learn.
b. Use different types of assessments
Another way to improve validity is by using different methods to assess students. For example, a
combination of exams, projects, and presentations can cover different areas of learning. This
helps get a fuller, more accurate view of a student’s knowledge and skills, rather than relying on
just one type of test. It ensures the assessment is measuring all the important aspects of the
course.
10. Describe and illustrate (give examples) three specific ways to improve assessment reliability.
a. Make grading clear and consistent
One way to improve reliability is to use clear grading guidelines. For example, creating a
detailed rubric that explains exactly what to look for in an essay (such as clarity, structure, and
grammar) will help ensure that everyone grades the same way, making the results more reliable.
b. Cover a range of topics
Make sure the assessment includes questions that reflect all the important parts of the course. For
instance, if a class teaches multiple topics, like history and economics, try to include questions
that test different parts of the subject. This ensures the assessment is fair and the results are more
reliable.
c. Ensure a fair testing environment
Having a consistent testing environment is also key to reliability. For example, all students
should have the same amount of time and take the test in similar conditions. If some students get
extra time or work in a quiet room while others don’t, the results might not reflect their actual
ability.
References:
1. Language Teaching Library. Syllabus Design. London: Centre for Information on Language
Teaching and Research, 1984.
2. Zook, Kevin. Instructional Design for Classroom Teaching and Learning. Boston: Houghton
Mifflin, 2001.