Natriello 1987
Educational Psychologist
To cite this article: Gary Natriello (1987) The Impact of Evaluation Processes on Students, Educational Psychologist, 22:2,
155-175, DOI: 10.1207/s15326985ep2202_4
EDUCATIONAL PSYCHOLOGIST, 22(2), 155-175
Copyright © 1987, Lawrence Erlbaum Associates, Inc.
Requests for reprints should be sent to Gary Natriello, Teachers College, Columbia University, Box 85, New York, NY 10027.
and classrooms by (a) briefly reviewing a conceptual framework for thinking
about the evaluation process; (b) examining research on the impact of fea-
tures of the evaluation process on students, with particular emphasis on alter-
able elements of the process; and (c) considering the limitations of previous
research and directions for future research.
of the evaluation process that must be attended to by evaluators and that may
have an impact on students. Brief consideration of each of these stages sug-
[Figure: the eight-stage model of the evaluation process for student performance; original artwork not recoverable.]
The first stage in the model is establishing purposes for evaluating students,
which suggests that there are multiple purposes for the evaluation of student
performance. Although there are a number of brief discussions of the pur-
poses of evaluation in many texts on measurement and evaluation (e.g.,
Ahmann & Glock, 1967; Lien, 1967; Remmers, Gage, & Rummel, 1960), the
purposes of evaluation receive scant attention in the literature.
Discussions of the purposes of evaluation of student performance suggest
that there are four generic functions that evaluation processes are thought to
serve: certification, selection, direction, and motivation. Certification refers
to the assurance that a student has attained a certain level of accomplishment
or mastery. Selection entails the identification of students or groups of stu-
The third stage moves beyond the general assignment of a task to provide in-
formation on the properties of the task that will be considered important in
the evaluation of performance. Although there is little discussion of task-
specific criteria for evaluation in the evaluation literature, attention has been
devoted to the types of criteria employed in the evaluation process. It is gen-
erally accepted that the achievement of students in a subject is the one crite-
rion common to all evaluation systems in schools and classrooms (Brown,
1971). There is little discussion as to the appropriateness of using achieve-
ment criteria, though in recent years there has been increased attention de-
voted to determining whether the evaluation process is actually linked to the
instructional process (Linn, 1983; Rudman et al., 1980) so that students are
not placed in a position where they are evaluated on things not covered in the
instructional program (Natriello, 1982). Although there is agreement that
types of criteria other than achievement criteria enter into evaluation proces-
ses in schools and classrooms (Thorndike, 1969), there is less agreement as to
which of these other types of criteria, such as participation, effort, and con-
duct (Natriello & McPartland, in press; Salganik, 1982; Schunk, 1983;
Weiner, 1979), may be appropriate.
The fourth stage in the evaluation process has received considerable atten-
tion amidst renewed calls for higher standards in U.S. schools (National
The fifth stage in the evaluation process involves the collection of partial in-
formation on student performance of assigned tasks and the outcomes of
those performances. The collection of such information requires a sampling
process because it would be impractical, if not impossible, to collect total in-
formation on student performance. Most of the important decisions about
the collection of performance information thus involve sampling decisions to
insure that the information collected provides a valid and reliable estimate of
performance appropriate to the purposes, tasks, criteria, and standards that
have already been determined.
By far the dominant technique for collecting information on student per-
formance is some form of testing. A number of analysts have contributed im-
portant observations about the relationship between testing practices and the
purposes, tasks, criteria, and standards for the evaluation of students. For
example, Deutsch (1979) objected to the overwhelming use of tests em-
ploying norm-referenced standards for the purpose of selection at the ex-
pense of student motivation and individual development. Others have re-
jected norm-referenced tests in favor of criterion-referenced tests for the
purpose of certification (Glaser, 1963; Hambleton, Swaminathan, Algina, &
Coulson, 1978; Popham & Husek, 1969). The relationship between tests and
assigned tasks and the biases that result when tests do not correspond to the
curriculum have also been given serious attention (Leinhardt & Seewald,
1981; National Institute of Education, 1979; Rudman et al., 1980). Descrip-
tive accounts of testing reveal a wide range of testing practices, and the use of
tests from various sources for multiple purposes together with evidence that
the level of expertise for test construction among teachers may be quite low
(Gullickson, 1982, 1984; Herman & Dorr-Bremme, 1984; Natriello, 1982).
Alternatives to traditional testing have also been examined, including
routine class and homework assignments, classroom interaction during
question-and-answer sessions, recitations, discussions, oral reading, prob-
lem solving at the chalkboard, special projects, presentations, and reports
(Gaston, 1976; Heller, 1978; Herman & Dorr-Bremme, 1984). Although such
practices appear to broaden the base of information on student performance,
there are serious questions about the quality of the information they provide
(Rudman et al., 1980).
The seventh stage of the model involves the communication of the results of
the evaluation to relevant parties, including the student, parents, school offi-
cials, and potential employers (Ahmann & Glock, 1967). Designating feed-
back as a distinct stage serves to underline the point that a good deal of
evaluative information is never communicated to performers or other rele-
vant parties.
The nature and extent of communications regarding student performance
have been the subject of various investigations and commentaries. Some of
these have focused on the various forms of feedback from traditional report
cards (Chansky, 1975; Jarrett, 1963; Walling, 1975) to checklists (Rudman,
1978) to graded tests (Gullickson, 1982) to conferences (Ediger, 1975; Natri-
ello, 1982). Other investigations have considered the relationship of feedback
techniques to other dimensions of the evaluation process. The relationship
between the type of feedback and the purpose of the evaluation process has
received the attention of numerous authors (Cross & Cross, 1980; Hansen,
1977; Lissman & Paetzold, 1983; Oren, 1983; Slavin, 1978). Several investigators have considered the relationship between task characteristics and the
nature of feedback (Lintner & Ducette, 1974; Lissman & Paetzold, 1983).
Finally, the eighth stage involves consideration of the impact of the evalua-
tion process in light of the original purposes of the process. The purposes of
certification, selection, direction, and motivation might suggest an analysis
of mastery, classification, progress, and continued engagement, respec-
tively. This eighth stage of the evaluation process leads back to the first stage
of establishing or reestablishing the purpose for evaluating students as the cy-
cle continues.
Of course, the eight stages of the model are an oversimplification of reality. It might be argued that later stages of the model have an impact on earlier
stages. For instance, some would argue that the criteria and standards set for
a task really define the task assignment or that the constraints of the sampling
process help to define the real criteria and standards. Moreover, although the
first six stages of the model are portrayed as having an impact on outcomes
only through the mediation of the feedback process designated as the seventh
stage, in reality each stage of the process may have direct effects on the out-
comes of the evaluation process, as is shown in the next section. Limitations
such as these notwithstanding, the model does highlight some key elements of
evaluation processes and provides a set of broad categories within which to
consider the impact of various features of evaluation processes on students.
Purposes of Evaluations
fined outcome (e.g., Lissman & Paetzold, 1983; Schunk, 1983; Williams,
Pollack, & Ferguson, 1975). As a result, little is known about the develop-
ment and implementation of evaluation systems in school and classroom
contexts where evaluation must serve multiple purposes.
Resolution of Tasks
other, and less student autonomy to choose tasks) there was higher concur-
rence among classmates, between self and classmates, between teacher and
classmates, and between self and teacher in ratings of reading ability.
Rosenholtz and Rosenholtz (1981) found that these same high-resolution
classroom structures led to more dispersed evaluations of reading ability by
students themselves, by classmates, and by teachers. In addition, they also
found that low-resolution classroom structures diminished the effect of
teacher evaluation on peer evaluations of an individual's reading ability. In a
study of third-grade classrooms, Simpson (1981) found that low levels of cur-
ricular differentiation led to " . . . a more nearly normal distribution of self-
reports of ability by increasing the proportion of students reporting ability
levels below average and far below average" (p. 127). Moreover, low curricu-
lar differentiation also appeared to lead to a more generalized view of aca-
demic ability, to greater peer consensus about students' performance levels,
and to greater influence for peers on an individual's self-reported ability.
Clarity of Criteria
Dornbusch and Scott (1975) made the point that criteria add to the
definition of the assigned task and direct the attention of performers to the
key elements of the task for which they will be held accountable. Schunk
(1983) reported on a study in which some children were offered rewards for
participating in a task, others were offered rewards for careful work on the
task, and still others were not offered rewards until they had completed the
task. The results indicated that the first group of children, those who had re-
ceived both a task assignment and information on the criteria for perform-
ance, showed the highest levels of skill, self-efficacy, and rapid problem
solving.
Natriello (1982) found that over 30% of the students in his study of four sub-
urban high schools reported that they had received unsatisfactory evalua-
tions because they had misunderstood the criteria by which they were to be
evaluated. Smith (1984) observed that clarity has been demonstrated to be an
important component of teaching in research on teaching effectiveness
(Rosenshine & Furst, 1973). In his study of the impact of teacher "use of un-
Demandingness of Standards
Referents of Standards
The impact of different types of standards has also been investigated. Per-
haps the most attention has been devoted to norm-referenced standards or
"grading on the curve." Michaels (1977) designated the reward structure associated with this practice as "individual competition, in which grades are assigned to students based on their performances relative to those of their classmates" and distinguished it from "individual reward contingencies, in which
grades are assigned to students on the basis of how much material each stu-
dent apparently masters" (p. 87). He considered the effects of these two re-
ward structures along with two other reward structures (group competition
and group reward contingencies) on student academic performance. In re-
viewing the relevant literature, he concluded that individual competition
consistently produces superior academic performance.
However, Michaels (1977) observed that the superior academic perform-
ance found to be associated with individual competition may be limited to the
top third of the class, to those students who are most responsive to the reward
structure. Deutsch (1979) presented a more critical analysis of individual
competition or grading on the curve, a situation he described as an artificially
created shortage of good grades. He argued that the "disappointing rewards,
induced by an artificial scarcity, are likely to hamper the development of edu-
cational merit and the sense of one's own value" (p. 394). Moreover, under
individual competition, "students are more anxious, they think less well of
themselves and of their work, they have less favorable attitudes toward their
classmates and less friendly relations with them, and they feel less of a sense
of responsibility toward them" (p. 399).
In considering the impact of individual competition and individual reward
contingencies on actual student performance, Deutsch disagreed with the
conclusions reached by Michaels. Examining the same studies examined by
Michaels, Deutsch (1979) concluded that a number of these studies were
flawed because they did not equate the objective probability of reward in the
reward structures being compared. Deutsch's reanalysis of these studies
showed "no systematic differences in performance on isolated work under
several different reward systems" (p. 398). This position was confirmed by
Williams et al. (1975), who found no significant differences between the
achievement and self-reported attitudes or school-related behavior of stu-
dents exposed to norm-referenced and criterion-referenced standards. How-
ever, they also found that criterion-referenced standards provided assurance
to students who performed poorly initially that enabled at least some of them
to increase their performance on later tests, and that criterion-referenced
standards allowed students who did well initially to become confident and
work less than students working under a norm-referenced system.
Norm-referenced standards have also been compared to individually refer-
enced standards for their effects on student performance. Slavin (1980)
found that students in classes in which evaluations were based on experimen-
tal individually referenced standards achieved more on a final standardized
test than students in control classes evaluated by norm-referenced standards.
However, Beady, Slavin, and Fennessey (1981) found no differences in the
effects of norm-referenced standards and individually referenced standards
Frequency of Sampling
Soundness of Appraisals
flect their effort and performance), they were less likely to consider these
evaluations important and less likely to devote effort to the associated tasks.
An interesting complication of these effects is found in work on the theory
of learned helplessness, which suggests that experiencing uncontrollable out-
comes should depress performance (Abramson, Seligman, & Teasdale, 1978;
Seligman, 1975), as well as in work that suggests that the experience of
uncontrollable outcomes facilitates increased performance by producing an
increased need for control (Roth & Bootzin, 1974; Thornton & Jacobs,
1972). An integrative model developed by Wortman and Brehm (1975) sug-
gests that brief exposure to uncontrollable outcomes will lead to improved
performance, whereas extended exposure will lead to decreased perform-
ance. Research involving high school students (Buys & Winfield, 1982) re-
veals only decreased student performance in reaction to the experience of
uncontrollable outcomes, a pattern the authors link to the relatively less self-
reliant and less self-confident nature of high school students compared to
Differentiation of Feedback
dom allow students to choose their tasks, there is a higher dispersion among
students' reported ability levels, higher generalization of students' reported
ability levels, higher peer consensus as to students' relative performance lev-
els, and greater peer influence over students' reported ability levels. Thus, the
use of less differentiated forms of feedback such as grades seems to lead to
more pronounced and more powerful ability stratification processes in the
classroom.
A similar effect on the distribution of attributional tendencies in class-
rooms was found by Oren (1983). Oren explored the effects of evaluation
feedback on the attributional tendencies of students. Results indicated that in
classrooms with differentiated, specific, and individualized feedback, the
attributional tendencies of low achievers were more like those of high achiev-
ers. Specifically, low achievers in such classrooms scored higher on internal
control than low achievers in classrooms with less differentiated feedback
systems.
The affective value of feedback has also been shown to influence attribu-
tions in classrooms. Meyer et al. (1979) reported on a series of six experimen-
tal studies that investigated the extent to which praise and criticism in re-
sponse to task performance provided information about others' perceptions
of a focal actor's ability. In these studies, subjects were presented with de-
scriptions of two students who had obtained identical results at a task. One of
the students received neutral feedback while the other was praised for success
or criticized for failure. Studies using adult subjects revealed that praise after
success and neutral feedback after failure led to the perception that the focal
actor's ability was low, and neutral feedback for success and criticism after
failure led to the perception that the focal actor's ability was high. However,
these findings varied by the age of the respondents. For example, third-grade
students believed that the student praised by the teacher was the brighter one;
students in Grades 4 to 7 selected the praised student and the student receiv-
ing neutral feedback in approximately equal numbers; and students in
Grades 8 and above believed that the student receiving neutral feedback was
brighter than the one receiving positive feedback following successful per-
formance. Although the effects of feedback in the classroom appear to be
powerful, they are multidimensional and complex. Simple injunctions to in-
crease feedback for one purpose or another are likely to set in motion a range
of processes that are in need of further examination.
Although the studies of the effects of features of the evaluation process just
noted have suggested some possible consequences for certain individual fea-
tures, little attention has been devoted to developing an understanding of en-
tire evaluation systems composed of purposes, tasks, criteria, standards,
samples, appraisals, and feedback. One of the key issues to be examined in
thinking about systems of evaluation is the relationship between various as-
pects of the process and the extent to which there is consistency among them.
For instance, evaluations and evaluation systems may differ in terms of the
consistency between task assignments and criteria set for the task. Some
teachers may take care that the performance criteria set for a task be appro-
priate to the nature of the task assignment but others may not. In the latter
case a teacher may designate a task as a creative opportunity when an assign-
ment is made but hold students accountable for a formulaic set of criteria. A
second instance might be the consistency between the criteria and standards
set for the task and the process of sampling student performances and out-
comes. For example, a teacher may specify criteria related to the actual per-
formance of the task (e.g., how to proceed to solve a math problem) but only
sample the outcome of the performance (e.g., the correctness of the answer).
Although little research has been conducted to examine the actual extent to
which teachers implement a consistent system of performance evaluation for
students, interviews conducted by Natriello (1982) with secondary school
teachers suggest that teachers vary widely in their ability to articulate a sys-
tematic approach to the evaluation of student performance. Examinations of
teacher preparation curricula, which indicate that prospective teachers re-
ceive little or no training in the evaluation of student performance (Mayo,
1967; Roeder, 1973), suggest that this finding may be widely applicable. The
effects of this lack of consistency could be quite negative. Natriello (1982) re-
ported that high school students who experienced more inconsistencies in the
evaluation system were also more likely to become disengaged from school.
suggest that such standards may not be used extensively by teachers at the
present time (Natriello & McPartland, in press).
Second, most of the effects studies concentrate on one or two aspects of
the evaluation process. As a result, they fail to consider the impact of other
key elements in determining the effects of evaluations. The conclusions drawn from such studies seldom consider the nature of the assigned tasks upon which students are being evaluated, yet it is clear that task differences condition the impact of evaluation processes (Doyle, 1983).
Third, few of the effects studies consider the multiple purposes for evalua-
tions in schools and classrooms. As a result, they often compare different
evaluation methods in terms of some outcome that has nothing to do with the
purpose for which one of the methods was developed. For instance, a study
demonstrating that differentiated feedback contributes more to directing fu-
ture student performance than a single letter grade may be doing nothing
more than showing that an evaluation system created for the purpose of pro-
viding direction to students does a better job of providing that direction than
another evaluation system created for the purpose of selecting students.
The limitations of previous studies of the impact of evaluation processes
on students suggest important directions for further research. Research is
needed on the basic patterns of evaluation practices in schools and class-
rooms. Investigators have typically begun with some common assumptions
about the current state of practice as they planned intervention studies of
evaluation processes. However, additional research is needed to provide a
better descriptive account of how students are currently evaluated in schools
and classrooms.
Research on evaluation practices in schools and classrooms will need to
consider explicitly which of the multiple purposes of evaluation processes can
be served by which combinations of practices. For example, previous re-
search suggests that the design of an evaluation system for the purpose of
enhancing student motivation might involve a differentiated task structure in
the classroom, a mix of more and less predictable tasks, clearly articulated criteria, challenging yet attainable self-referenced standards, relatively frequent collection of information on student performance, appraisals that
truly reflect student effort and performance, and differentiated and encour-
aging feedback. An evaluation system designed for purposes of certification
would look quite different. Researchers should be sensitive to the purposes of
evaluation systems when they examine existing evaluation arrangements,
which typically involve compromises among the competing demands of mul-
tiple purposes. They should also be aware of the multiple purposes served by
evaluation systems when they design interventions to achieve certain pur-
poses at the expense of neglecting other purposes that must be attended to in
operating schools and classrooms.
Research on evaluation practices might be improved considerably if inves-
REFERENCES
Abramson, L. Y., Seligman, M. E. P., & Teasdale, J. D. (1978). Learned helplessness in
humans: Critique and reformulation. Journal of Abnormal Psychology, 87, 49-74.
Ahmann, J. S., & Glock, M. D. (1967). Evaluating pupil growth: Principles of tests and measurements (4th ed.). Boston: Allyn & Bacon.
Armbruster, B. B., Stevens, R. J., & Rosenshine, B. (1977). Analyzing content coverage and emphasis: A study of three curricula and two tests (Tech. Rep. No. 26). Urbana: University of Illinois, Center for the Study of Reading.
Atkinson, J. W. (1958). Towards experimental analysis of human motivation in terms of motives, expectancies, and incentives. In J. W. Atkinson (Ed.), Motives in fantasy, action and society (pp. 273-306). Princeton, NJ: Van Nostrand.
Beady, C. J., Jr., Slavin, R. E., & Fennessey, G. M. (1981). Alternative student evaluation structures and a focused schedule of instruction in an inner-city junior high school. Journal of Educational Psychology, 75, 518-523.
Bidwell, C. E. (1965). The school as a formal organization. In J. G. March (Ed.), Handbook of organizations (pp. 972-1022). Chicago: Rand McNally.
Bolocofsky, D. N., & Mescher, S. (1984). Student characteristics: Using student characteristics
to develop effective grading practices. The Directive Teacher, 6, 11-23.
Bresee, C. W. (1976). On "grading on the curve." The Clearing House, 5, 108-110.
Brookover, W. B., & Schneider, J. M. (1975). Academic environments and elementary school achievement. Journal of Research and Development in Education, 9, 82-91.
Brophy, J., & Evertson, C. (1981). Student characteristics and teaching. New York: Longman.
Brown, D. J. (1971). Appraisal procedures in the secondary schools. Englewood Cliffs, NJ:
Prentice-Hall.
Buys, N., & Winfield, A. H. (1982). Learned helplessness in high school students following
experience of noncontingent rewards. Journal of Research in Personality, 6, 6-9.
Chansky, N. M. (1975). A critical examination of school report cards from K through 12. Reading Improvement, 12, 184-192.
Crano, W. D., & Mellon, P. M. (1978). Causal influence of teachers' expectations on children's
academic performance: A cross-lagged panel analysis. Journal of Educational Psychology,
70, 39-49.
Crooks, A. D. (1933). Marks and marking systems: A digest. Journal of Educational Research, 27, 259-272.
Cross, L. J., & Cross, C. M. (1980). Teachers' evaluative comments and pupil perception of control. Journal of Experimental Education, 49, 68-71.
Davis, R. G., & McKnight, C. (1976). Conceptual, heuristic, and S-algorithmic approaches in mathematics teaching. Journal of Children's Mathematical Behavior, 1(Suppl. 1), 271-286.
Deutsch, M. (1979). Education and distributive justice: Some reflections on grading systems.
American Psychologist, 34, 391-401.
Dornbusch, S. M., & Scott, W. R. (1975). Evaluation and the exercise of authority. San
Francisco: Jossey-Bass.
Doyle, W. (1983). Academic work. Review of Educational Research, 53, 159-199.
Ediger, M. (1975). Reporting pupil progress: Alternatives to grading. Educational Leadership,
32, 265-267.
Egan, O., & Archer, P. (1985). The accuracy of teachers' ratings of ability: A regression model. American Educational Research Journal, 22, 25-34.
Evans, E. D., & Engelberg, R. A. (1985, April). A developmental study of student perceptions of school grading. Paper presented at the biennial meeting of the Society for Research on Child Development, Toronto.
Feldhusen, J. F. (1964). Student perceptions of frequent quizzes and post-mortem discussions of tests. Journal of Educational Measurement, 1, 51-54.
Gaston, N. (1976). Evaluation in the affective domain. Journal of Business Education, 52,
134-136.
Glaser, R. (1963). Instructional technology and the measurement of learning outcomes. American Psychologist, 18, 519-521.
Glass, G. V. (1978). Standards and criteria. Journal of Educational Measurement, 15, 237-261.
Gullickson, A. R. (1982). The practice of testing in elementary and secondary schools. Unpub-
Natriello, G. (1985). Merit pay for teachers: The implications of theory for practice. In H. C. Johnson, Jr. (Ed.), Merit, money and teachers' careers (pp. 99-120). Lanham, MD: University Press of America.
Natriello, G., & Dornbusch, S. M. (1984). Teacher evaluative standards and student effort. New York: Longman.
Natriello, G., & McDill, E. L. (1986). Performance standards, student effort on homework and academic achievement. Sociology of Education, 59, 18-31.
Natriello, G., & McPartland, J. (in press). Adjustments in high school teachers' grading criteria: Accommodation or motivation? Baltimore: Johns Hopkins University, Center for the Social Organization of Schools.
Oren, D. L. (1983). Evaluation systems and attributional tendencies in the classroom: A socio-
logical approach. Journal of Educational Research, 76, 307-312.
Page, E. B. (1958). Teacher comments and student performance: A seventy-four classroom experiment in school motivation. Journal of Educational Psychology, 49, 173-181.
Peckham, P. D., & Roe, M. D. (1977). The effects of frequent testing. Journal of Research and Development in Education, 10, 40-50.
Popham, W. J., & Husek, T. R. (1969). Implications of criterion-referenced measurement.
Journal of Educational Measurement, 6, 1-9.
Purkey, S. C., & Smith, M. S. (1983). Effective schools: A review. The Elementary School Journal, 83, 427-452.
Remmers, H. H., Gage, N. L., & Rummel, J. F. (1960). A practical introduction to measurement and evaluation. New York: Harper & Brothers.
Rheinberg, F. (1983). Achievement evaluation: A fundamental difference and its motivational
consequences. Studies in Educational Evaluation, 9, 185-194.
Roeder, H. H. (1973). Teacher education curriculum-Your final grade is F. Journal of Educational Measurement, 10, 141-143.
Rosenholtz, S. J., & Rosenholtz, S. H. (1981). Classroom organization and the perception of ability. Sociology of Education, 54, 132-140.
Rosenholtz, S. J., & Wilson, B. (1980). The effect of classroom structure on shared perceptions of ability. American Educational Research Journal, 17, 75-82.
Rosenshine, B., & Furst, N. (1973). The use of direct observation to study teaching. In R. M. W.
Travers, (Ed.), Second handbook of research on teaching (pp. 122-183). Chicago: Rand
McNally.
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom. New York: Holt, Rinehart &
Winston.
Rosswork, S. G. (1977). Goal setting: The effects on an academic task with varying magnitudes
of incentive. Journal of Educational Psychology, 69, 710-715.
Roth, S., & Bootzin, R. R. (1974). The effects of experimentally induced expectancies of external
control: An investigation of learned helplessness. Journal of Personality and Social Psychol-
ogy, 28, 253-264.
Rudman, H. C., Kelly, J. L., Wanous, D. S., Mehrens, W. A., Clark, C. M., & Porter, A. C.
(1980). Integrating assessment with instruction: A review (1922-1980). East Lansing, MI:
Michigan State University, College of Education, Institute for Research on Teaching.
Rudman, M. K. (1978). Evaluating students: How to do it right. Learning, 7, 50-53.
Salganik, L. H. (1982, March). The effects of effort marks on report card grades. Paper pre-
sented at the annual meeting of the American Educational Research Association, Los
Angeles.
Schunk, D. H. (1983). Reward contingencies and the development of children's skills and self-
efficacy. Journal of Educational Psychology, 75, 511-518.
Seligman, M. E. P. (1975). Helplessness: On depression, development, and death. San
Francisco: Freeman.
Simpson, C. (1981). Classroom structure and the organization of ability. Sociology of Educa-
tion, 54, 120-132.
Slavin, R. E. (1977). Classroom reward structure: An analytical and practical review. Review of
Educational Research, 47, 633-650.
Slavin, R. E. (1978). Separating incentives, feedback, and evaluation: Toward a more effective
classroom system. Educational Psychologist, 13, 97-100.
Slavin, R. E. (1980). Effects of individual learning expectations on student achievement. Journal
of Educational Psychology, 72, 520-524.
Smith, L. R. (1984). Effect of teacher vagueness and use of lecture notes on student perform-
ance. Journal of Educational Research, 78, 68-74.
Stewart, L. G., & White, M. A. (1976). Teacher comments, letter grades and student perform-
ance: What do we really know? Journal of Educational Psychology, 68, 488-500.
Terwilliger, J. G. (1977). Assigning grades-Philosophical issues and practical recommenda-
tions. Journal of Research and Development in Education, 10, 21-39.
Thompson, J. D. (1967). Organizations in action. New York: McGraw-Hill.
Thorndike, R. L. (1969). Marks and marking systems. In R. L. Ebel (Ed.), Encyclopedia of edu-
cational research (pp. 759-766). New York: Macmillan.
Thornton, J. W., & Jacobs, P. D. (1972). The facilitating effects of prior inescapable/unavoida-
ble stress on intellectual performance. Psychonomic Science, 26, 265-271.
Varenne, H., & Kelly, M. (1976). Friendship and fairness: Ideological tensions in an American
high school. Teachers College Record, 77, 601-614.
Waller, W. (1932). The sociology of teaching. New York: Wiley.
Walling, D. R. (1975). Designing a "report card" that communicates. Educational Leadership,
32, 258-260.
Weiner, B. (1979). A theory of motivation for some classroom experiences. Journal of Educa-
tional Psychology, 71, 3-25.
Williams, R. G., Pollack, M. J., & Ferguson, N. A. (1975). Differential effects of two grading
systems on student performance. Journal of Educational Psychology, 67, 253-258.
Wilson, S. (1976). You can talk to teachers: Student-teacher relations in an alternative high
school. Teachers College Record, 78, 77-100.
Wise, R. I., & Newman, B. (1975). The responsibilities of grading. Educational Leadership, 32,
253-256.
Wortman, C. B., & Brehm, J. W. (1975). Responses to uncontrollable outcomes: An integration
of reactance theory and the learned helplessness model. In L. Berkowitz (Ed.), Advances in ex-
perimental social psychology (Vol. 8, pp. 278-336). New York: Academic Press.