Chapter 1: Concepts, issues and principles of language assessment
CONTENT
Assessment and testing
Types and purposes of assessment
Issues in language assessment
Principles of language assessment
Understand the differences between assessment and testing
Understand the differences among basic assessment concepts and terms
Distinguish among five different types of language tests and apply
Grasp some major current issues related to language assessment
Understand the five major principles of language assessment
Analyze the importance of each principle
Apply each principle to classroom-based assessment instruments
Activity 1
• What are some key concepts in testing and assessment, to your knowledge?
Key concepts
– Does assessment always need testing?
– Is testing the only way to assess?
Ongoing process
assessment
Methodological techniques
Comments, appraisal, feedback, observations, etc.
TESTS
A subset of assessment: prepared administrative procedures
Measure performance => indicator of competence
Measure a given domain: proficiency, specific element
Accurate measure of the test-taker’s ability within a particular domain
Measurement and evaluation
Measurement:
• Quantifying the observed performance
• Rankings, letter grades
• Exact descriptions of students’ performance
• Compare one student to another
• Explicit specifications for scoring => more objective
Evaluation:
• Results of the test used for decision-making
• Interpretation of information
• Test scores are an example of measurement; conveying the meaning of those scores is evaluation
Example: a teacher’s evaluation of a student’s progress in her class
• Evaluation: Yes
(evaluating the student’s learning)
• Test: No
(No test involved)
• Measurement: No
(No number assigned)
• Evaluation without measurement
Activity 2:
Match the concepts and their examples
– Non-test evaluation
– Non-evaluative tests
– Non-test, evaluative measures
– Evaluative tests
– Non-test, non-evaluative activities
Test – measurement – evaluation – assessment
Types of assessment
• What kinds of assessment/tests do you use in your teaching context?
• At school (primary/lower and upper secondary school)?
• Any formative assessment?
Different types of assessment
Formal assessment
– Test
– Journal
– Portfolio
Informal assessment
– Incidental, unplanned comments and responses
– Coaching
– Impromptu feedback to students
– Good job! / Great! / Did you say can or can’t?
Summative vs. formative assessment
Purposes
– Summative: To measure student competency
– Formative: To improve instruction and provide student feedback; students recognise their progress
When
– Summative: End of course (or unit of a course)
– Formative: Ongoing throughout the unit
Types
– Summative: End-of-term test; national exams/qualifications; ....
– Formative: In-class activities; observations; projects; ......
Impacts
– Summative: Students: to gauge their progress toward course or grade-level goals/benchmarks. Teachers: for grades, promotion
– Formative: Students: to self-monitor understanding. Teachers: to check for understanding, adjust teaching methods, content areas..
• https://www.youtube.com/watch?v=SjnrI3ZO2tU
• https://www.youtube.com/watch?v=bTGnJnuVNt8
At school
Formative tests
– Oral test
– Quizzes (15 minutes, 30 minutes…)
– 45-minute test
Summative tests
– End-of-term test…
• Any formative/summative assessment?
Work in groups and fill in the table
Columns: Purposes; When; Types; Impacts/how results are effective for teachers and learners
Rows: Summative; Formative
Different types of tests
– Achievement test
– Diagnostic test
– Placement test
– Proficiency test
– Aptitude test
Purposes of achievement tests
To determine whether course objectives have been
met
To determine whether appropriate knowledge and
skills have been acquired
Often summative: at the end of a lesson, unit or a
term of study.
Formative: feedback about the quality of a learner’s
performance in subsets of the unit or course
Diagnostic tests
To diagnose aspects of a language that learners need to develop or that the course should include
To elicit information on what students need to work on in the future
To offer more detailed, subcategorized information on the learner
Placement tests
To place a student into a particular level or section
of a language curriculum or school
To indicate the point at which students find material
appropriately challenging
To provide diagnostic information on a student’s
performance
Proficiency tests
To test global competence in a language/overall ability: not
limited to any course, curriculum or single skill
Always summative and norm-referenced: results in the form of a
single score
Not equipped to provide diagnostic feedback
Aptitude tests
To measure capacity or general
ability to learn a foreign language a
priori and ultimate predicted success
in that undertaking
Designed to apply to the classroom
learning of any language
Assessment for learning
Assessment of learning
Assessment as learning
Assessment FOR learning
AFL is more commonly known as formative & diagnostic
assessments. Assessment FOR learning is the use of a task or
an activity for the purpose of determining student progress
during a unit or block of instruction. Teachers are now
afforded the chance to adjust classroom instruction based
upon the needs of the students. Similarly, students are
provided valuable feedback on their own learning.
(Kenji Takahashi's Resource Site)
http://www.tvdsb.ca/webpages/takahashid/techdia.cfm?subpage=128207
Assessment OF learning
AOL is the use of a task or an activity to measure, record and
report on a student's level of achievement in regards to
specific learning expectations. These are often known as
summative assessments.
(Kenji Takahashi's Resource Site)
http://www.tvdsb.ca/webpages/takahashid/techdia.cfm?subpage=128207
Assessment AS learning
AAL is the use of a task or an activity to allow students the
opportunity to use assessment to further their own learning.
Self and peer assessments allow students to reflect on their own
learning and identify areas of strength and need. These tasks
offer students the chance to set their own personal goals and
advocate for their own learning. These are often known as
formative assessments.
(Kenji Takahashi's Resource Site)
http://www.tvdsb.ca/webpages/takahashid/techdia.cfm?subpage=128207
Revision Activity:
Formative or summative?
• Look at these phrases and quickly say out loud whether
they are formative or summative.
– Ongoing assessment
– During period of study
– At the end of a period of study
– Looks back at the syllabus
– Responding to the evolving needs of the learners
– Scaffolding learning
– Outcomes relate to learner’s performance
– A kind of purpose (i.e. for teaching/learning)
– A kind of judgement
Principles of language assessment
1. Reliability
2. Validity
3. Practicality
4. Authenticity
5. Washback
Reliability
A reliable test:
is consistent in its conditions across two or more administrations
gives clear directions for scoring/evaluation
has uniform rubrics for scoring/ evaluation
lends itself to consistent application of those rubrics by the
scorer
contains items/ tasks that are unambiguous to the test-taker
Student-related reliability: illness, fatigue, anxiety, physical and psychological factors
Rater reliability: consistent scores of different scorers
Test administration reliability: conditions of the test administration
Test reliability: subjective tests (open-ended responses, essay response), objective tests
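Rater reliability ("consistent scores of different scorers") can be checked with a simple calculation. The sketch below is only an illustration: the slides do not name a statistic, so the use of percent agreement and Cohen's kappa here is an assumption, and the function names and the sample grades are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of scripts on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of scripts")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same category
    # if each assigned categories independently at their own observed rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical letter grades given by two teachers to the same six essays.
teacher_1 = ["A", "B", "B", "C", "A", "B"]
teacher_2 = ["A", "B", "C", "C", "A", "A"]

agreement = percent_agreement(teacher_1, teacher_2)
kappa = cohens_kappa(teacher_1, teacher_2)
```

With these made-up grades the two teachers agree on four of the six essays. Kappa is lower than the raw agreement because some of that agreement would be expected by chance alone.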
Validity
A valid test:
Measures exactly what it proposes to measure
Does not measure irrelevant or “contaminating” variables
Relies as much as possible on empirical evidence
(performance)
Involves performance that samples the test’s criterion
(objective)
Offers useful, meaningful information about a test-taker’s
ability
Is supported by a theoretical rationale or argument
Practicality
A practical test:
Stays within budgetary limits
Can be completed by the test-taker
within appropriate time constraints
Has clear directions for administrations
Appropriately utilizes available human
resources
Does not exceed available material
resources
Considers the time and effort involved
for both design and scoring
Authenticity
An authentic test:
contains language that is as natural as possible
has items that are contextualized rather than isolated
includes meaningful, relevant, interesting topics
provides some thematic organization to items, such as
through a story line or episode
offers tasks that replicate real-world tasks
Washback
Positively influences what and how teachers teach
Positively influences what and how learners learn
Offers learners a chance to adequately prepare
Gives learners feedback that enhances their language
development
Is more formative in nature than summative
Provides conditions for peak performance by the learner
Match one term to one definition and write your answer into the answer column.
Terms
1. Washback
2. A Reliable Test
3. A Valid Test
4. An Achievement Test
5. A Progress Test
6. A Proficiency Test
7. A Discrete-item Test
8. An Integrative Test
9. An Objective Test
10. A Subjective Test
Definitions
A. This test measures how much of the material taught in a whole course has actually been learned.
B. A test where there is only one correct answer and, therefore, no judgement required when marking.
C. A test which measures the overall language abilities of a student without referring to a particular course.
D. The effect (positive or negative) that a test has on teaching and learning.
E. This test involves whole pieces of discourse and tests a relatively wide range of language, e.g. a cloze test.
F. A test which actually tests what it is designed or intended to test.
G. This test consists of separate items, e.g. Another word for sea is ............
H. A test where there is a choice of how you express your answer and therefore some personal judgement involved in marking.
I. A test which produces consistent results when it is used on different occasions.
J. This is a type of achievement test, but for part of a course (one or more units) only.
Answer key
1. Washback (Backwash): D
2. A Reliable Test: I
3. A Valid Test: F
4. An Achievement Test: A
5. A Progress Test: J
6. A Proficiency Test: C
7. A Discrete-item Test: G
8. An Integrative Test: E
9. An Objective Test: B
10. A Subjective Test: H
APPLYING PRINCIPLES TO THE EVALUATION
OF CLASSROOM TESTS
1. Are the test procedures practical?
2. Is the test itself reliable?
3. Can you ensure rater reliability?
4. Does the procedure demonstrate
content validity?
5. Has the impact of the test been
carefully accounted for?
6. Is the procedure “biased for best”?
7. Are the test tasks as authentic as
possible?
8. Does the test offer beneficial washback
to the learner?
1. Are the test procedures practical?
1. Are administrative details all carefully
attended to before the test?
2. Can students complete the test
reasonably within the set time frame?
3. Can the test be administered smoothly,
without procedural “glitches”?
4. Are all printed materials accounted
for?
5. Has equipment been pre-tested?
6. Is the cost of the test within budgeted
limits?
7. Is the scoring/ evaluation system
feasible in the teacher’s time frame?
8. Are methods for reporting results
determined in advance?
2. Is the test itself reliable?
1. Does every student have a
cleanly photocopied test sheet?
2. Is sound amplification clearly
audible to everyone in the room?
3. Is video input clearly and
uniformly visible to all?
4. Are lighting, temperature, extraneous
noise and other classroom conditions equal
(and optimal) for all students?
5. For closed-ended responses, do
scoring procedures leave little debate
about correctness of an answer?
3. Can you ensure rater reliability?
1. Have you established consistent
criteria for correct responses?
2. Can you give uniform attention to those
criteria throughout the evaluation time?
3. Can you guarantee that scoring is based
only on the established criteria and not on
extraneous or subjective variables?
4. Have you read through tests at
least twice to check for consistency?
5. If you have made “midstream” modifications
of what you consider a correct response, did
you go back and apply the same standard to all?
6. Can you avoid fatigue by reading the tests
in several sittings, especially if the time
requirement is a matter of several hours?
4. Does the procedure demonstrate content
validity?
1. Are unit objectives
clearly identified?
2. Are unit objectives represented
in the form of test specifications?
3. Do the test specifications include
tasks that have already been performed
as part of the course procedures?
4. Do the test specifications include
tasks that represent all (or most) of
the objectives for the unit?
5. Do those tasks involve actual
performance of the target task?
5. Has the impact of the test been carefully
accounted for?
1. Have you offered students
appropriate review and
preparation for the test?
2. Have you suggested test-taking
strategies that will be beneficial?
3. Is the test structured so that, if possible,
the best students will be modestly
challenged and the weaker students will not
be overwhelmed?
4. Does the test lend itself to
your giving beneficial washback?
5. Are the students encouraged to
see the test as a learning
experience?
6. Is the procedure “biased for best”?
7. Are the test tasks as authentic as
possible?
1. Is the language in the test
as natural as possible?
2. Are items as contextualized as
possible rather than isolated?
3. Are topics and situations
interesting, enjoyable, and/or
humorous?
4. Is some thematic organization
provided, such as through a story
line or episode?
5. Do tasks represent, or closely
approximate, real-world tasks?
8. Does the test offer beneficial washback
to the learner?
1. Is the test designed in such a way that
you can give feedback that will be relevant
to the objectives of the unit being tested?
2. Have you given students sufficient pre-
test opportunities to review the subject
matter of the test?
3. In your written feedback to each student,
do you include comments that will contribute
to students’ formative development?
4. After returning tests, do you spend class time
“going over” the test and offering advice on
what students should focus on in the future?
5. After returning tests, do you
encourage questions from students?
6. If time and circumstances permit, do you
offer students (especially the weaker ones)
a chance to discuss results in an office hour?
Further references
• https://www.youtube.com/watch?v=2MbXbK_SXJ0
• https://www.youtube.com/watch?v=UtSeNH9PvHw