Davison and Nevin 1999

JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 1999, 71, 439–482 NUMBER 3 (MAY)

STIMULI, REINFORCERS, AND BEHAVIOR: AN INTEGRATION

MICHAEL DAVISON AND JOHN A. NEVIN
UNIVERSITY OF AUCKLAND AND UNIVERSITY OF NEW HAMPSHIRE

We propose that a fundamental unit of behavior is the concurrent discriminated operant, and we
discuss in detail a quantitative model of the concurrent three-term contingency that is based on the
notion that an animal’s behavior is controlled to differing extents by both stimulus–behavior and
behavior–reinforcer relations. We show how this model can describe performance in a variety of
experimental procedures: conditional discrimination and matching to sample, both with and without
reinforcement for responses that are traditionally identified as errors; conditional discrimination
with more than two stimuli and choice alternatives; delayed matching to sample and delayed
reinforcement in matching to sample; second-order and complex conditional discrimination; and
multiple and concurrent schedules. Although the model is incomplete in its coverage, and may be
incorrect, we believe that this conceptual approach will bear fruit in the development of behavior
theory.
Key words: discriminated operant, conditional discrimination, stimulus control, reinforcement,
detection models, matching, discriminability

The order of authorship is alphabetical. We thank all those who have made empirical and theoretical contributions to our understanding of this area over many years. In particular, Michael Davison thanks Brent Alsop, Jenny Boldero, Doug Elliffe, Rebecca Godfrey, Peter Jenkins, Chris Jensen, Max Jones, Jeff Kennedy, Dianne McCarthy, and Don Tustin. Tony Nevin thanks Heather Cate, Charlotte Mandell, Stephen Whittaker, Peter Yarensky, and joins M.D. in thanking Brent Alsop, Peter Jenkins, and Dianne McCarthy. We both thank Barbara Wanchisen for providing facilities for us to collaborate on writing this manuscript at Baldwin-Wallace College. We also both thank three anonymous reviewers for laboring through the original version of this manuscript, and we hope that they will recognize the many ways in which their efforts have improved our work and ideas. Nevin’s work on this article was supported in part by Grant AG11459 from the National Institute on Aging to the University of New Hampshire.

Address correspondence to Michael Davison, Department of Psychology, University of Auckland, Private Bag 92019, Auckland, New Zealand (E-mail: m.davison@auckland.ac.nz) or Tony Nevin, RR2 Box 162, Vineyard Haven, Massachusetts 02568 (E-mail: tnevin@worldnet.att.net).

¹ The definition of this and other technical terms can be found in Appendix A.

An adequate formulation of the interaction between an organism and its environment must always specify three things: (1) the occasion upon which a response occurs, (2) the response itself, and (3) the reinforcing consequences. The interrelations among them are the contingencies of reinforcement. (Skinner, 1969, p. 7)

With the specification of the three-term contingency quoted above, Skinner defined the discriminated operant,¹ which we take to be a fundamental analytic unit for the science of behavior. Some experimenters have concentrated on “the occasion upon which a response occurs” (the antecedent stimulus). In general, they have arranged maximally different consequences in the presence of two stimuli, such as reinforcement versus extinction, and then varied some aspects of one or both stimuli. Conversely, those experimenters who have concentrated on “the reinforcing consequences” have usually explored the effects of various schedule contingencies within a single undifferentiated session, with no explicit antecedent stimuli. When two or more schedules have been studied within a single session, they have generally been correlated with highly distinctive stimuli. Although there are many exceptions to these overly simple generalizations, there have been few systematic efforts to study the joint effects of variations in reinforcing consequences and in accompanying stimuli. Neither is there much systematic information on the effects of variations in “the response itself,” and how different response definitions may interact with stimulus and reinforcer control. Here, we show that variations in each of the three terms of the discriminated operant may have functionally similar effects, and we present a
simple algebraic model that provides an economical summary of these effects in many standard experimental situations.

The model has two major components. The first component characterizes an organism’s history of reinforcement for different responses in the presence of different stimuli during prolonged exposure to an experimental condition. Two parameters are identified with the confusability of the relations among the stimuli, responses, and reinforcers defining two or more discriminated operants; these parameters are used to derive an algebraic expression of the effective allocation of reinforcers accruing to those operants. The second component characterizes the way in which the effective allocation of reinforcers determines steady-state behavior.

The model will be developed initially for experimental paradigms that explicitly define two discriminated operants, where Response 1 is reinforced in the presence of, or following, Stimulus 1, and Response 2 is reinforced in the presence of, or following, Stimulus 2, as shown in the matrix of Figure 1. The responses are available concurrently, and the stimuli are presented successively. In some standard paradigms, Response 2 may be unspecified, as in single-response “go/no-go” successive discriminations. However, Herrnstein (1970) rightly pointed out that when an experimenter arranges for a response (B1) to be reinforced (R1), there exists by definition a complementary (or extraneous) class of responses (not-B1, designated Be by Herrnstein) that is reinforced by a complementary (or extraneous) class of reinforcers (not-R1, designated Re). The term extraneous is to be understood only in relation to the experimenter’s arrangement of contingencies. Thus, all behavior occasioned by an antecedent stimulus occurs in a context of concurrent alternatives, whether measured or not. We therefore suggest that the fundamental unit of behavior is the concurrent discriminated operant and not (as suggested by Skinner, 1969) the single discriminated operant.

Fig. 1. The contingencies of the concurrent discriminated operant. Each of two different responses is reinforced in the presence of a different discriminative stimulus.

We begin by illustrating some qualitative similarities in the effects of the three terms defining concurrent discriminated operants within some standard experimental paradigms: (a) successive discriminations and multiple schedules, which correspond to one column of the matrix of Figure 1; (b) simultaneous discriminations and concurrent schedules, which correspond to collapsing the two rows of the matrix into a single row; and (c) conditional discriminations, which correspond to the full matrix.

The model that we develop here is based on previous modeling efforts by ourselves and our colleagues over a number of years, which are reviewed briefly. We then present the model for the basic conditional discrimination paradigm illustrated in Figure 1, and develop it for progressively more complex cases. The model’s predictions are compared with conditional discrimination data from a variety of paradigms, including signal recognition and matching to sample with two or more defined stimuli and responses. We also treat the effects of reinforcing responses that are conventionally construed as errors in these paradigms. We then return to cases in which extraneous responses and reinforcers must be considered, and discuss ways to incorporate the effects of reinforcer magnitude or quality in future models. Although all of the data that we compare with model predictions are from nonhuman subjects, primarily pigeons, we conclude by considering the relevance of our model to research and application with humans.

SOME EQUIVALENCES AMONG THE TERMS OF THE DISCRIMINATED OPERANT

A number of experiments have demonstrated that varying one of the three terms of the discriminated operant may be functionally equivalent to varying another term, thus suggesting the possibility of a unified descriptive account.

Free-Operant Successive Discrimination and Multiple Schedules

Perhaps the simplest arrangement for the study of discriminated operants is the successive go/no-go or S^D/S^Δ free-operant paradigm. For example, a pigeon is trained to peck a key for food reinforcers arranged by a variable-interval (VI) schedule of reinforcement in the presence of one stimulus (designated S+, S^D, or more generally S1), where S1 alternates with a second stimulus (designated S−, S^Δ, or more generally S2) in the presence of which responses never produce food. This procedure is known as a multiple VI extinction schedule of reinforcement, with its components defined by the stimuli and the schedules they accompany. As noted above, the procedure corresponds to the left column of the matrix in Figure 1. In effect, the procedure defines two successive discriminated operants: S1:(peck → VI food) and S2:(peck → no food). Our notation is intended to signify that the stimuli set the occasion for a specified outcome if a specified response occurs. We will subsequently refer to these relations by saying that a given stimulus signals a specified contingency of reinforcement.

The usual result of this procedure is a high rate of responding during S1 and a near-zero rate during S2. However, the response rate during S1 depends on the rate of reinforcement arranged by the VI schedule: The higher the rate of reinforcement, the higher the rate of responding. At the same time, the rate of responding during S2 depends on the physical difference between S1 and S2: The smaller the S1-S2 disparity, the higher the rate of S2 responding. In the limit, with zero S1-S2 difference, the response rates become identical, at least if the occurrence of reinforcement or the passage of time cannot serve as cues for the component in effect. This would require that the two components alternate irregularly, that S1 components end after each reinforcer, and that S2 components end at random times (Alsop & Davison, 1991).

Cumming (1955) systematically explored both the rate of reinforcement during S1 and the S1-S2 disparity with pigeons as subjects and with S1 and S2 defined by two luminance levels of a white keylight. He found that the ratio of response rates during S1 versus S2 increased with the S1-S2 disparity, and that for any given S1-S2 disparity, both response rates varied directly with the rate of reinforcement during S1 so that their ratio was approximately constant. An alternative approach to the study of discriminated operant performance in multiple schedules is to hold the stimuli constant and vary the rates of reinforcement during both S1 and S2. For example, Reynolds (1963) trained pigeons to peck at red (S1) and green (S2) keys, where S1 and S2 alternated every 3 min and independent VI schedules were arranged in their presence. Over successive conditions, the VI schedules were varied systematically. In general, response rates were positively related to the rate of reinforcement arranged by the VI schedule in each component, and the ratio of response rates was an orderly increasing function of the ratio of reinforcer rates. From this line of research, it is clear that response rates are equal under two sorts of conditions: first, when the schedules are different and the accompanying stimuli are the same; and second, when the accompanying stimuli are different but the schedules are the same. Colloquially, in the first of these the subject “cannot” discriminate because the stimuli are indiscriminable, whereas in the second the subject “will” not discriminate in the sense that, if the consequences are indiscriminable, equal responding is obligatory.

Clearly, there are two continua to be explored here: the difference between the stimuli, and the difference between the reinforcement schedules. However, research on multiple schedules exemplifies the point that analyses of the effects of the stimuli and of the reinforcement schedules have been largely independent. One body of literature has followed the tradition of Guttman and Kalish (1956), using multiple VI extinction schedules while varying one or more stimulus dimensions during a maintained generalization test (for review, see Heinemann & Chase, 1975). A separate literature has followed the early work of Reynolds (1961), exploring the effects of various reinforcement schedules and component durations with constant, highly distinctive stimuli (for review, see Davison & McCarthy, 1988). Only a few studies (e.g., White, Pipe, & McLean, 1984) have systematically examined the joint effects of both determiners of discriminated operant performance in multiple schedules.

Choice: Simultaneous Discrimination and Concurrent Schedules

Traditional studies of discrimination learning have often presented two stimuli simultaneously in discrete trials, with their spatial locations (typically left-right) varied irregularly, and with reinforcement available for a single response directed toward one stimulus (S1) but not to the other (S2). Performance is usually measured as percentage choices of S1, conventionally identified as “correct” responses. In most research, the stimuli differ substantially, and interest typically centers on the acquisition, transfer, or reversal of the discrimination as affected by other variables such as prior learning history or physiological intervention.

Clearly, acquisition and maintained accuracy will depend on the difference between the stimuli, and for this reason the simultaneous discrimination procedure has been used for psychophysical assessment of sensory sensitivity in well-trained animal subjects (e.g., Mentzer, 1966). In the limit, when the stimuli are identical, accuracy should fall to chance levels. Performance on a difficult luminance discrimination, under which accuracy was maintained at about 75% to 80%, was studied by Nevin (1967). He arranged discrete trials that ended after 2 s if no response occurred and found that the probability or schedule of reinforcement for correct choices affected the overall probability of response but not the ratio of S1 to S2 responses (and thus the percentage of correct responses). This result parallels Cumming’s (1955) findings with free-operant multiple schedules described above. Although most simultaneous discrimination research has allowed only one response per trial, as in Nevin’s (1967) study, free-operant simultaneous discrimination procedures with extended stimulus presentations, and with VI reinforcement of responses on one alternative and extinction of responses on the other, have also been studied (e.g., Honig, 1962).

Whether the procedure involves discrete trials or extended stimulus presentations, it may be construed as involving either two or four operants. The two-operant interpretation neglects response locations and emphasizes the stimuli: Respond to S1 → food, and respond to S2 → no food. In effect, this approach defines the responses by the stimuli toward which they are directed (where “to” is intended as shorthand for “directed toward”). The paradigm corresponds to collapsing the upper and lower rows of the matrix in Figure 1 into a single row, in which the contingencies are signaled by simultaneous presentation of S1 and S2. The four-operant interpretation refers to both stimulus and response locations: S1 left:(respond left → food); S1 left:(respond right → no food); S1 right:(respond right → food); and S1 right:(respond left → no food), thereby corresponding to the full matrix of Figure 1. There has been considerable theoretical debate over the correct interpretation (see Mackintosh, 1974, for review), with no generally accepted conclusion. In view of our interest in the response term, we opt for the four-operant approach.

A separate line of research has studied the effects of simultaneously available reinforcement schedules, known as concurrent schedules, which may be continuously available for free-operant responding or arranged in discrete trials (e.g., Herrnstein, 1961; Nevin, 1969a; see Davison & McCarthy, 1988, and Williams, 1988, for reviews). Because this work informs much of our thinking, it will be described in some detail.

In an early study of concurrent-schedule performance, Herrnstein (1961) trained pigeons to peck at either of two simultaneously available keys, with food reinforcers arranged by independent VI schedules. In one condition, for example, the average interval between reinforcers was 2.25 min for pecks on Key 1 and 4.5 min for pecks on Key 2. As a result, the birds could obtain about 27 reinforcers per hour on Key 1 and about 13 per hour on Key 2. In the course of a 60-reinforcer session, the birds made about 6,000 key pecks, with about 4,000 on Key 1 and 2,000 on Key 2. Thus, responding was distributed in about the same ratio as the obtained reinforcer rates. This result held for several other schedule combinations. The general result is expressed algebraically as

$$\frac{B_1}{B_2} = \frac{R_1}{R_2}, \tag{1}$$

where B1 and B2 are the numbers of responses emitted on the two keys, and R1 and R2 are
the numbers of reinforcers obtained by pecks to those keys.

It is important to note that these response ratios were not constrained by the procedure: For example, all reinforcers could have been obtained by simply alternating from one key to the other, in which case B1/B2 would be 1.0 regardless of the two schedule values. To reduce the likelihood of this pattern of responding, most conditions of the experiment involved a penalty for changes from one key to the other known as a changeover delay (COD). Specifically, Herrnstein arranged that pecks could not be reinforced until at least 1.5 s had elapsed since a changeover from one key to the other. This COD prevented immediate reinforcement of simple alternation, and may be interpreted as establishing the independence of the two operants: Peck Key 1 → food, and peck Key 2 → food. Herrnstein found that switching was much less frequent and response allocation more nearly approximated exact matching when the COD was in effect, suggesting that matching may be the normative result if the two operants are indeed independent.

Another method for arranging concurrent schedules had been described earlier by Findley (1958). His method involved correlating the two VI schedules with different stimuli on a main key, as in multiple schedules, but allowing the subject to change over from one to the other by pecking a second switching or changeover key. In this arrangement, the concurrent operants are defined by the explicit stimuli on the main key rather than topographically by key location, as in Herrnstein’s (1961) study. Despite these differences, the results were similar to Herrnstein’s, in that the ratio of responses, and of times spent in the presence of each stimulus, roughly equaled or matched the ratio of the reinforcer rates. However, there were some systematic deviations from matching (see Nevin, 1984, for reanalysis of Findley’s data), and Equation 1 must be modified to describe them. A simple modification that captures these and many other results very well is the generalized matching law (Baum, 1974, 1979):

$$\frac{B_1}{B_2} = c \left( \frac{R_1}{R_2} \right)^{a}, \tag{2}$$

where c represents a constant bias toward one or the other operant, evident in unequal responding when the reinforcer rates are equal, and a represents sensitivity to reinforcement. When c is 1.0 and a is 1.0, Equation 2 reduces to Equation 1 and describes strict matching, as found by Herrnstein (1961). When a is 0, response ratios are constant regardless of the reinforcer ratios, a result that would arise if the subject collected all reinforcers simply by pecking the two keys (or stimuli) in strict alternation, or in any other pattern that was independent of the two schedule values.

This discussion suggests that concurrent operants may be defined either by the response (e.g., by key location) or by the stimulus signaling the schedule in effect, with, as far as we know, roughly equivalent results. Moreover, some data suggest that sensitivity to reinforcement depends on the extent to which the two operants are differentiated with respect to reinforcement by the COD (see Davison & McCarthy, 1988, for a discussion of these findings). Another method for varying the difference between concurrent operants was described by Miller, Saunders, and Bourland (1980). They employed the Findley switching-key procedure and varied the relative rates of reinforcement. Also, across groups of birds they varied the similarity of the stimuli defining the two operants (lines of various orientations projected on the main key). For one group, the stimuli were lines of the same orientation; for the second group, the lines differed by 15°; and for the third group, the lines differed by 45°. The values of a were about 0.17, 0.33, and 0.99 for these three sets of stimulus disparities. (With 0° disparity, the value of a should have been 0; however, as Alsop & Davison, 1991, suggested, the reinforcer rates may themselves have provided cues to the different schedules.) The general conclusion is that sensitivity to reinforcement depends on the extent to which concurrent operants are differentiated by the variables that define them.

Conditional Discriminations

Conditional discriminations combine the successive stimulus presentations of multiple schedules and the simultaneous availability of two choices with their associated schedules, as in concurrent schedules. As shown in the matrix of Figure 1, reinforcement is conditional
upon the current or prior stimulus. The standard yes-no signal-detection experiment provides one example: If a signal is presented, “yes” is followed by a payoff and “no” is followed by a penalty; if the signal is not presented, “no” is followed by a payoff and “yes” is followed by a penalty. The well-known matching-to-sample procedure provides another example: If the sample color on the center key of a three-key chamber is lighted red, and the side keys are then lighted with red and green comparison colors, food is given for pecks to red; but if the sample is green, food is given for pecks to green. Although both of these examples employ discrete-trial presentations, conditional discriminations may also be arranged for free-operant behavior during extended stimulus presentations. For example, White (1986) trained pigeons in a two-key chamber with VI reinforcement of pecks on Key 1 and extinction of pecks on Key 2 when both keys had vertical lines projected on them. Conversely, he arranged VI reinforcement for pecks on Key 2 and extinction of pecks on Key 1 when both keys had horizontal lines projected on them. The four discriminated operants in these examples are summarized in Table 1. In each example, two discriminative relations are successive, determined by the stimulus presentation, as in multiple schedules, and two are simultaneous, as in concurrent schedules. Thus, a full account of performance in the conditional discrimination paradigm should encompass multiple- and concurrent-schedule performances as well.

Table 1
Summary examples of the four discriminated operants in various paradigms described in the text.

Paradigm             Stimulus          Response      Consequence
Signal detection     Signal + noise    “Yes”         Payoff
                     Signal + noise    “No”          Penalty
                     Noise             “Yes”         Penalty
                     Noise             “No”          Payoff
Matching to sample   Red sample        Peck red      Food
                     Red sample        Peck green    No food
                     Green sample      Peck red      No food
                     Green sample      Peck green    Food
Free operant         Vertical line     Peck Key 1    VI food
                     Vertical line     Peck Key 2    No food
                     Horizontal line   Peck Key 1    No food
                     Horizontal line   Peck Key 2    VI food

The results of several studies suggest that conditional discrimination performance depends on stimulus, response, and reinforcement terms in closely interrelated ways. The accuracy of performance obviously depends on the physical difference between the conditional stimuli. For example, Swets (1959) varied the signal-to-noise ratio in auditory signal detection with human subjects. With pigeons, McCarthy and Davison (1980a) varied signal duration, and Wright (1972) varied wavelength differences of lighted keys. All found that accuracy (i.e., the degree to which responses conformed to the experimenter’s definition of reinforceable responses) increased with stimulus disparity.

Not surprisingly, conditional discrimination performance also depends on the differentiation between the responses. For example, Eckerman (1970) trained pigeons to peck different locations along a lighted 25-cm strip in the presence of different wavelengths. Three groups differed according to the response definition on the strip key. For Group 1, both responses were defined near the center of the strip; for Group 2, responses about 4 cm to the right of center were reinforced on 506-nm trials, and responses 4 cm to the left of center were reinforced on 583-nm trials; and for Group 3, responses about 8 cm to the right of center were reinforced on 506-nm trials, and responses 8 cm to the left of center were reinforced on 583-nm trials. Group 1 made as many “errors” as correct responses; Group 2 made relatively few errors; and Group 3 made virtually none. This result complements the findings of Miller et al. (1980) with responses defined by their location rather than by the stimulus.

The reinforcer is, of course, the third term of the discriminated operant, and conditional discrimination accuracy also depends on whether the consequences of the two correct responses (i.e., those specified for reinforcement) in a standard two-stimulus two-response conditional discrimination are the same or different. For example, Peterson, Wheeler, and Trapold (1980) trained pigeons in a conditional discrimination problem in which a green center key signaled that a peck to the side key with vertical lines was correct, and a red center key signaled that a peck to the side key with horizontal lines was correct. One group of pigeons received food accompanied by a tone for both kinds of correct
responses; another group received food plus tone for one kind of correct response and tone alone (i.e., no food) for the other. The latter group was substantially more accurate than the former, especially when delays were introduced between the center-key color and the side-key choice. This exemplifies the differential outcome effect first reported by Trapold (1970), and is here interpreted as resulting from the larger difference between discriminated operants for the latter group. These results are entirely consistent with the dependency of multiple- and concurrent-schedule performances on the degree of differentiation between the two discriminated operants, as discussed above.

In the model that we develop below, we employ a theoretical parameter, dsb, that measures the distinctiveness of the relation between the conditional stimuli and the responses they occasion for one discriminated operant relative to another. The value of this parameter should be affected, for example, by the difference between the conditional stimuli and by the delay between the conditional stimuli and responses. We employ a second parameter, dbr, to represent the distinctiveness of the relation between behavior and reinforcement for one discriminated operant relative to another. The value of dbr reflects the joint effects of variables that influence response–reinforcer contingencies such as the qualities or delays of the outcomes and the topographical differentiation of responses. It is important to observe that response differentiation will be reflected in both parameters: For example, in Eckerman’s (1970) study, increasing the separation between correct responses would increase both dsb and dbr.

A MODEL OF DISCRIMINATED OPERANT BEHAVIOR

In view of the discussion above, an adequate model must include terms for the degree of differentiation between two operants based on the stimulus–response relation (what response goes with what stimulus) and, separately, the response–reinforcer contingencies that define those two operants (what reinforcer goes with what response). We begin with a brief review of earlier modeling efforts, partly to set the stage for the present model and partly because they will be referred to below.

Background

Nevin, Jenkins, Whittaker, and Yarensky (1977,² 1982) proposed a model of signal-detection performance based on the direct and generalized strengthening effects of reinforcers obtained in the cells of the matrix of Figure 2, which simply expands Figure 1 with notation for all four cells. The basic notion was that reinforcers for B1 on S1 trials (R11) would also strengthen B1 on S2 trials, to the extent that S1 and S2 are confusable. Thus, although R21 is actually zero, it may effectively be greater than zero. Likewise, reinforcers for B2 on S2 trials (R22) would also strengthen B2 on S1 trials. The subject was assumed to match the ratio of B1 and B2 to the ratio of direct and generalized reinforcers, separately on S1 and S2 trials (for full rationale and equations, see Nevin, 1981; Nevin et al., 1982).

Fig. 2. The conditional discrimination matrix. The stimuli are designated S and the responses B, and the cells of the matrix are designated as the stimulus–response combinations.

² Nevin, J. A., Jenkins, P., Whittaker, S., & Yarensky, P. (1977, November). Signal detection and matching. Paper presented at the meetings of the Psychonomic Society, Washington, DC.

Davison and Tustin (1978) arrived at a similar formulation by a different route. Taking the generalized matching law (Equation 2) as their starting point, they proposed that S1 and S2 could be construed as biasing response allocation toward B1 or B2, respectively. Because we will have several occasions to refer to their formulation below, we present their equations and measures here:

$$\frac{B_{11}}{B_{12}} = c\,d \left( \frac{R_{11}}{R_{22}} \right)^{a}, \tag{3a}$$

and

$$\frac{B_{21}}{B_{22}} = \frac{c}{d} \left( \frac{R_{11}}{R_{22}} \right)^{a}, \tag{3b}$$
where $c$ represents inherent bias that is constant with respect to the reinforcer ratio and $a$ represents the sensitivity of choice allocation to the reinforcer ratio, as in Equation 2. The parameter $d$ (stimulus bias) represents the discriminability between $S_1$ and $S_2$. If $d = 1$, signifying zero discriminability, response ratios are identical on $S_1$ and $S_2$ trials. To show that $d$ is predicted to be independent of reinforcement, Equation 3a is divided by Equation 3b and the reinforcers cancel out. Rearranging and taking square roots,

$$d = \left(\frac{B_{11}B_{22}}{B_{12}B_{21}}\right)^{0.5}. \tag{4}$$

Thus, $d$ is measured directly by the geometric mean of the ratios of correct to incorrect responses in the presence of Stimuli 1 and 2.

To show that sensitivity to reinforcement ($a$) is predicted to be independent of stimulus discriminability, Equation 3a is multiplied by Equation 3b and $d$ cancels out. Taking square roots,

$$b = \left(\frac{B_{11}B_{21}}{B_{12}B_{22}}\right)^{0.5} = c\left(\frac{R_{11}}{R_{22}}\right)^{a}, \tag{5}$$

where $b$, the geometric mean of the ratios of responses to Alternatives 1 and 2 given Stimuli $S_1$ and $S_2$, is an overall measure of behavior allocation. Empirically, of course, $d$ may depend on reinforcer scheduling, and $a$ may depend on the $S_1$-$S_2$ difference. Many experiments have explored these questions (e.g., McCarthy & Davison, 1979, 1980a, 1984) with no simple conclusion emerging (for review, see Alsop & Davison, 1991).

Although the model of Davison and Tustin (1978) has been successful as a descriptive framework, it does not address the processes that determine sensitivity to reinforcement: $a$ is simply a free parameter. With reference to the study by Miller et al. (1980), Davison and Jenkins (1985) suggested that if two concurrently available response alternatives were not well differentiated, reinforcers obtained by one response might have the effect of strengthening the other response. The idea is basically similar to the generalized strengthening effects across stimuli proposed by Nevin et al. (1977, Footnote 2). To characterize the discriminability of the response–reinforcer contingency, Davison and Jenkins introduced a second parameter, $d_r$. They suggested that conditional discrimination performance depended jointly on stimulus discriminability and contingency discriminability according to the following equations:

$$\frac{B_{11}}{B_{12}} = d_s\left(\frac{d_r R_1 + R_2}{d_r R_2 + R_1}\right), \tag{6a}$$

and

$$\frac{B_{21}}{B_{22}} = \frac{1}{d_s}\left(\frac{d_r R_1 + R_2}{d_r R_2 + R_1}\right), \tag{6b}$$

where $d_s$ represents stimulus discriminability, as does $d$ in the Davison-Tustin model, and $d_r$ represents contingency discriminability. Davison and Jenkins showed that the value of $d_r$ described the degree of undermatching in concurrent schedules and conditional discriminations (i.e., the extent to which $a < 1$ in Equations 2, 3a, and 3b). Moreover, $d_r$ could be identified with parameters of the experimental contingencies in the same way that $d_s$ could be identified with stimulus parameters.

The Davison-Jenkins (1985) model requires discrimination, as measured by $d_s$, to be unaffected by $d_r$. This may be seen by dividing Equation 6a by 6b, canceling out the reinforcer terms and showing that $d_s$, like Davison and Tustin's $d$, is given by Equation 4. This result leads to a problem. If $d_r = 1.0$, representing a total failure to discriminate which reinforcer goes with which response, the model predicts that response ratios will be constant and independent of reinforcer ratios. Thus, performance in $S_1$ equals $d_s$, and performance in $S_2$ equals $1/d_s$, for all reinforcer ratios. However, it is highly unlikely that such differential control by the stimuli could be effective in the absence of differential control by the reinforcement contingency. The situation is similar to that with identical multiple VI VI schedules signaled by red and green keylights discussed above: Although red and green may be highly discriminable by some other measure, equal response rates are forced if the contingencies of reinforcement are not discriminated.

The foregoing discussion suggests that it is essential to distinguish between stimulus discriminability as a theoretical parameter and stimulus control or discrimination as measured by Equation 4. We will have occasion to remind readers of this point as the argument proceeds.

Although stimulus discriminability and contingency discriminability were conceptualized similarly in the Davison-Jenkins (1985) model, they were not treated similarly in its equations; and it is the equations that do the work. A model that avoids the difficulty inherent in the Davison-Jenkins model, and which gives algebraic as well as conceptual equivalence to stimulus and contingency discriminability, was introduced jointly by Alsop (1987)³ and by Davison (1987)⁴ and first published by Alsop (1991) and Davison (1991b). It addresses steady-state behavior only, leaving for future development the consideration of transition states such as acquisition or extinction. The model will be reviewed and developed here for a simple conditional discrimination performance, and then will be extended to describe performance in related cases that include complex conditional discriminations, reinforcement for conventionally defined "errors" in conditional discriminations, and multiple and concurrent schedules.

3 Alsop, B. (1987, June). Choice models of signal detection and detection models of choice. Paper presented to the 10th Harvard Symposium on the Quantitative Analysis of Behavior, Boston.

4 Davison, M. (1987, June). Stimulus discriminability, contingency discriminability, and complex stimulus control. Paper presented to the 10th Harvard Symposium on the Quantitative Analysis of Behavior, Boston.

Initial Assumptions

We assume that behavioral allocation is based on strict matching of behavior ratios in the presence of (or following) conditional stimuli to the effective allocation of reinforcers for responses in the presence of these stimuli. In spirit, this assumption is similar to Killeen's (1994) argument that reinforcement acts on the effective response unit for the organism, which may not be the same as the unit specified by an experimental contingency. Our model is principally concerned with estimating the effective allocation of reinforcers when the stimulus–response and response–reinforcer contingencies defining the discriminated operants are confusable. Initially, a model will be developed for four discriminated operants, comprising two conditional stimulus conditions and two responses. Later, the model will be generalized to any number of discriminated operants.

As described above, the simplest conditional discrimination involves the successive and randomized presentation of one or the other of two stimuli, designated $S_1$ and $S_2$, where two response alternatives, $B_1$ and $B_2$, are simultaneously available. When $S_1$ is present, $B_1$ may be deemed correct and is reinforced according to some schedule, and when $S_2$ is present, $B_2$ may be deemed correct and is reinforced according to a separate schedule. The paradigm is summarized in the 2 × 2 matrix of Figure 2, which repeats Figure 1 with added notation. The four resulting discriminated operants are designated according to their stimulus and response identification.

For the simple case described in Figure 1, reinforcers can occur only in Cells 11 and 22, and are designated $R_{11}$ and $R_{22}$. These reinforcers are assumed to strengthen the responses that produce them, designated $B_{11}$ and $B_{22}$. However, to the extent that the stimuli are confusable, $R_{11}$ will also strengthen responding in Cell 21, designated $B_{21}$; likewise, $R_{22}$ will also strengthen responding in Cell 12, designated $B_{12}$, as suggested by Nevin et al. (1977, Footnote 2). Let us assume that the conditional stimuli, as identified with responses, are located on a dimension of psychometric space. We shall not endeavor here to locate the stimuli in an absolute sense, but just to measure their distance apart. Following Davison (1991b), we assume that the psychometric distance between two stimuli is given by $d_{sb\,i_1 i_2}$, where $i_1$ and $i_2$ designate the two stimulus conditions. Such a measure ranges from one (the stimuli are completely nondiscriminable) to infinity (the stimuli are perfectly discriminable). We assume that the distances between stimuli satisfy a ratio scale, so that it is meaningful to assert that, for example, $d_{sb13} = 2\,d_{sb12}$. This assumption implies a log interval scale, and we will sometimes report values of log $d_{sb}$ (with values between zero and infinity).

Further, again following Davison (1991b), we assume that the generalization of reinforcer effects from Stimulus $S_1$ to Stimulus $S_2$ decays inversely with $d_{sb12}$. Thus, the effective reinforcer contribution of $R_{11}$ to Response 1 in Stimulus 2 is $R_{11}/d_{sb12}$. The function is portrayed in Figure 3. It is similar to the exponential decay function conjectured by Shepard (1958) as the universal form of generalization gradients, but it falls off relatively more steeply at small values and less steeply at large values. Future modeling efforts may need to explore the exponential or other forms of the decay function.

Fig. 3. The assumed decrease in reinforcer value acting on a stimulus–response pair as a function of the psychometric distance between that stimulus–response pair and the pair that gained reinforcement.

As we noted above, $d_{sb}$ is conceptually similar, in determining the effective reinforcer allocation, to the measure of stimulus discriminability ($d$ or log $d$) offered by Davison and Tustin (1978). However, it is also important to note that these measures are not the same, because the mechanisms and equations are very different. The measure log $d$ is calculated from discrimination performance, which (as suggested above) may be affected by variables other than the discriminability of their respective stimulus–response pairs, such as the discriminability of their respective response–reinforcer contingencies. For example, in matching to sample, log $d$ decreases when a delay is inserted between the choice response and the reinforcer, even though the stimuli, the responses, and the relations between them (i.e., respond to the key with the same color as the sample) are unchanged (McCarthy & Davison, 1986). As we will show, this decrease in log $d$ is compatible with invariance in $d_{sb}$, which would be consistent with the unchanged relation between the samples and the choice responses.

Fig. 4. How the effects of a single reinforcing event in Cell 11 are generalized via Cell 12 to Cell 22 in the conditional discrimination matrix. Equivalent processes (not shown) will also generalize this event into Cells 12 and 21. The same process is assumed to occur for reinforcers delivered in any cell of the matrix.

By analogy to the treatment of stimulus–response confusability, we assume that response–reinforcer contingencies are also confusable. To the extent that they are confused, a reinforcer delivered for $B_{11}$ will also strengthen $B_{12}$, and a reinforcer delivered for $B_{22}$ will also strengthen $B_{21}$. Again following Davison (1991b), we assume that the generalization of reinforcer effects from Response $B_{11}$ to Response $B_{12}$ decays inversely with $d_{br12}$. Thus, the effective reinforcer contribution of $R_{11}$ to Response 2 in Stimulus 1 is $R_{11}/d_{br12}$. If the response–reinforcer contingencies are perfectly differentiable, $d_{br} = \infty$, and the reinforcer strengthens only the response that produced it. If the response–reinforcer contingencies are indistinguishable, $d_{br} = 1$, and the reinforcer strengthens both responses equally, regardless of which response produced it. As for $d_{sb}$, we will generally report values of log $d_{br}$, which ranges from zero to infinity. The parameter $d_{br}$ is affected by such variables as the differences between the response definitions (e.g., along a strip key, as in Eckerman's, 1970, study), the differences in the outcomes of the responses (as in Peterson et al., 1980), or the delay between responses and reinforcers (as in McCarthy & Davison, 1986, where $d_{br}$ is expected to decrease with increases in delay). Figure 4 shows how the values of log $d_{sb12}$ and log $d_{br12}$ affect the generalization of the effects of a single reinforcer obtained by $B_1$ in the presence of $S_1$ across stimuli and responses.
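The generalization rule described above can be illustrated numerically. The following Python sketch is an illustration only and is not taken from the original article: the function name `effective_reinforcers` and the list-of-lists layout (rows for stimuli, columns for responses) are hypothetical choices. It assumes, as the effective-reinforcer expressions given elsewhere in this article indicate, that a reinforcer is divided by $d_{sb}$ when the receiving cell differs in stimulus, by $d_{br}$ when it differs in response, by both when it differs in both, and that the contributions add within a cell.

```python
def effective_reinforcers(arranged, d_sb, d_br):
    """Spread arranged reinforcers over a 2 x 2 stimulus-by-response matrix.

    arranged[k][l] is the number of reinforcers obtained in the cell for
    stimulus k and response l (index 0 stands for S1 or B1, index 1 for
    S2 or B2). d_sb and d_br are the two discriminability parameters,
    each ranging from 1 (indistinguishable) to infinity (perfect).
    """
    if d_sb < 1 or d_br < 1:
        raise ValueError("discriminability parameters must be at least 1")

    effective = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):          # stimulus of the receiving cell
        for j in range(2):      # response of the receiving cell
            for k in range(2):      # stimulus of the reinforced cell
                for l in range(2):  # response of the reinforced cell
                    divisor = 1.0
                    if i != k:
                        divisor *= d_sb   # generalization across stimuli
                    if j != l:
                        divisor *= d_br   # generalization across responses
                    effective[i][j] += arranged[k][l] / divisor
    return effective


if __name__ == "__main__":
    # One hundred reinforcers obtained only in Cell 11 (B1 in S1).
    print(effective_reinforcers([[100, 0], [0, 0]], d_sb=10, d_br=2))
```

With these illustrative values, the reinforcers obtained in Cell 11 contribute 100 to Cell 11, 50 to Cell 12, 10 to Cell 21, and 5 to Cell 22. Passing `float("inf")` for either parameter removes the corresponding generalization, and setting both to 1 spreads every reinforcer equally over all four cells.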

To get an intuitive sense of the model, consider the subject as experiencing the variations and repetitions in its behavior within a stream of environmental events (lights, tones, and the like), some of which recur from time to time. When a reinforcer disrupts the flow of the stream, its strengthening effect is felt directly on the relation between current or recent environmental events and the response being emitted when the reinforcer arrived. The strengthening effect also generalizes to other environmental events and responses to the extent that they are confusable with those that occurred in close contiguity with the reinforcer, as measured by the inverse of $d_{sb}$ and $d_{br}$. We assume that this process operates following each reinforcer, as suggested in Figure 4, to increment the values of the cells in the matrix of Figure 2. The rate or probability of reinforcement correlated with a particular discriminated operant affects only the stable-state values assigned to the relevant cells. As noted earlier, we are not yet attempting to address acquisition or transition states. We assume that nonoccurrence of $R_1$ or $R_2$ does not reduce the effective reinforcement in any cell. Because it operates sequentially, reinforcer by reinforcer, the model is molecular and dynamic. However, the model is molar in the sense that, after prolonged experience under constant experimental conditions, the direct and generalized reinforcement values of the cells will settle into stable ratios, and it is these ratios that determine choice.

The effective numbers of reinforcers, direct or generalized, that have accumulated during steady-state performance are given by the expressions in the four cells in Figure 5. When the experimenter presents a particular stimulus repeatedly during prolonged exposure to the experimental conditions, the subject is assumed to emit one or the other of the measured responses in accordance with the ratio of effective reinforcers that have accrued to the two cells of the matrix that correspond to that stimulus. In its basic form, our model is concerned solely with reinforcer frequencies in the cells of the matrix and not with their values (as determined by magnitude or quality); we will discuss the extension of the model to encompass such factors in the section on concurrent schedules, below.

Fig. 5. The effective reinforcer allocations in the four cells of the conditional discrimination matrix under conditions of reinforcement for $B_1$ in the presence of $S_1$ and for $B_2$ in the presence of $S_2$.

More generally and formally, we assume that responses allocated to the cells of the matrix in Figure 2 match the "apparent," "perceived," or "effective" long-term allocation of reinforcers, which will deviate from their veridical (i.e., experimenter-measured) allocation to the extent that $d_{sb}$ and $d_{br}$ are less than infinite, as shown in Figure 5. We recognize that terms like apparent, perceived, and effective reinforcers may seem loose, but here they occur as technical terms that are defined quantitatively by equations describing how, through generalization engendered by confusion between stimulus–response relations and between response–reinforcer relations, the experimenter-measured reinforcer allocations are transformed into quantities that affect behavior. The terms are always used as a convenient shorthand for the operations of our proposed equations.

The resulting equations that predict responding in the presence of, or following, the two conditional stimuli are, for $S_1$,

$$\frac{B_{11}}{B_{12}} = c\,\frac{R_{11} + \dfrac{R_{22}}{d_{sb12}\,d_{br12}}}{\dfrac{R_{11}}{d_{br12}} + \dfrac{R_{22}}{d_{sb12}}} = c\,\frac{d_{sb12}\,d_{br12}\,R_{11} + R_{22}}{d_{sb12}\,R_{11} + d_{br12}\,R_{22}}, \tag{7a}$$

and for $S_2$,

$$\frac{B_{21}}{B_{22}} = c\,\frac{\dfrac{R_{11}}{d_{sb12}} + \dfrac{R_{22}}{d_{br12}}}{\dfrac{R_{11}}{d_{sb12}\,d_{br12}} + R_{22}} = c\,\frac{d_{br12}\,R_{11} + d_{sb12}\,R_{22}}{R_{11} + d_{sb12}\,d_{br12}\,R_{22}}. \tag{7b}$$

The values of $d_{sb12}$ and $d_{br12}$ can be estimated by a nonlinear optimization program (see Appendix B) from a set of data taken across conditions that vary the ratio of $R_{11}$ to $R_{22}$.

As in the generalized matching law (Baum, 1974), $c$ represents a constant proportional

preference (inherent bias; see Davison & Tustin, 1978) for one alternative response over the other. The value of $c$ should be unaffected by changes in the conditional stimuli or in the frequency of reinforcers for the two correct responses. It need not, however, remain constant when either the response topographies or the magnitudes of reinforcers are changed.

The assumption that behavior allocation between cells strictly matches (equals) the effective reinforcer frequencies in the cells has the benefits of simplicity. It might be objected that such an assumption is too simple because it is well known (e.g., Baum, 1974; see Davison & McCarthy, 1988, for review) that such strict matching seldom occurs, and that undermatching (less change in behavior ratios than in reinforcer ratios) is the norm. However, as we now show, undermatching arises naturally from this model when the discrimination between alternatives is less than perfect.

If Equations 7a and 7b are multiplied together, we can obtain a theoretical measure of overall response bias ($B$):

$$B^2 = \frac{B_{11}B_{21}}{B_{12}B_{22}} = c^2\,\frac{R_{11} + \dfrac{R_{22}}{d_{sb12}\,d_{br12}}}{\dfrac{R_{11}}{d_{br12}} + \dfrac{R_{22}}{d_{sb12}}} \cdot \frac{\dfrac{R_{11}}{d_{sb12}} + \dfrac{R_{22}}{d_{br12}}}{\dfrac{R_{11}}{d_{sb12}\,d_{br12}} + R_{22}}. \tag{8}$$

In this expression, $B$ is the geometric mean of the ratios of responses (i.e., $B_1/B_2$) taken across the two choice alternatives. (The factor $c^2$ is the product of the bias terms in Equations 7a and 7b.) When $d_{sb} = 1.0$, the four operants are effectively signaled by the same stimulus and the rows of the matrix in Figure 2 collapse into a single row, as in continuous free-operant two-key concurrent schedules. Equation 8 then becomes

$$B^2 = \frac{B_{11}B_{21}}{B_{12}B_{22}} = \left[c\,\frac{R_{11} + \dfrac{R_{22}}{d_{br}}}{R_{22} + \dfrac{R_{11}}{d_{br}}}\right]^2. \tag{9}$$

In this expression, $B$ is equivalent to $b$, the combination response bias measure used by Davison and Tustin (1978) to characterize differential responding with respect to reinforcement (see Equation 5) but only when $d_{sb} = 1.0$. Note, however, that the reinforcer term is not the same as in Equation 5. We use the upper case here because $B$ is not equivalent to $b$ when $d_{sb} > 1.0$. (Recall that stimulus-related parameters analogous to $d_{sb}$ canceled out when Equations 3a and 3b in the Davison-Tustin model or Equations 6a and 6b in the Davison-Jenkins model were multiplied. The same sort of cancellation does not occur when Equations 7a and 7b are multiplied unless $d_{sb} = 1.0$. The implication is that the relation between overall response bias and reinforcer allocation depends on the discriminability of stimulus–behavior relations; we return to this point below.)

Equation 9 is the same as the equation that follows Equation 11 of Davison and Jenkins (1985), who showed that it gave a good account of choice data normally construed as undermatching when fitted by the generalized matching law. When $d_{br}$ approaches infinity, Equation 9 simplifies to

$$B = c\,\frac{R_{11}}{R_{22}}, \tag{10}$$

showing that, apart from any inherent bias $c$, overall behavior allocation to Responses 1 and 2 strictly matches the ratio of obtained (and, in this case, perceived) reinforcers.

To obtain a theoretical measure of stimulus discrimination, we begin by dividing Equation 7a by 7b, which yields

$$D^2 = \frac{B_{11}B_{22}}{B_{12}B_{21}} = \frac{R_{11} + \dfrac{R_{22}}{d_{sb12}\,d_{br12}}}{\dfrac{R_{11}}{d_{br12}} + \dfrac{R_{22}}{d_{sb12}}} \cdot \frac{\dfrac{R_{11}}{d_{sb12}\,d_{br12}} + R_{22}}{\dfrac{R_{11}}{d_{sb12}} + \dfrac{R_{22}}{d_{br12}}}. \tag{11}$$

In this expression, $D$ is the geometric mean of the ratios of responses normally construed as "correct" responses and "errors." Setting $d_{br} = \infty$ and taking square roots, Equation 11 simplifies to

$$D = \left(\frac{B_{11}B_{22}}{B_{12}B_{21}}\right)^{0.5} = d_{sb}. \tag{12}$$

Note that reinforcer frequencies and bias have canceled out in this expression, implying that, if $d_{br} = \infty$, $D$ is independent of the

ratio of reinforcers, real or apparent. Equation 12 is the same as Equation 4, which specified the parameter $d$ in the Davison-Tustin model. This measure was, in their model, a pure measure of the effect of stimulus disparity because they ignored $d_{br}$, effectively setting it at infinity. The equivalence of $D$, $d$, and $d_{sb}$ holds only if $d_{br} = \infty$; accordingly, we use the upper-case $D$ here and note that when $d_{br}$ is less than infinite, $D$ will depend on the discriminability of response–reinforcer relations as well as stimulus–response relations.

So far, we have considered cases in which $d_{sb} = 1$ and $d_{br} = \infty$; we now explore some other cases involving extreme parameter values. When $d_{sb} = d_{br} = 1.0$, both response ratios are predicted to be equal to $c$, and there will be no effect of varying $R_{11}$ or $R_{22}$. By contrast, when both $d_{sb}$ and $d_{br}$ are very large, the logarithm of Equation 7a will approach $+\infty$ and that of Equation 7b will approach $-\infty$, implying errorless performance that is unaffected by varying $R_{11}$ or $R_{22}$.

When $d_{br} = 1.0$, the absence of any apparent differential reinforcement leads to the absence of control by changes in the reinforcer ratio for correct responses (Equation 9 = $c$), and no differential responding with respect to stimuli (Equation 11 = 1.0 regardless of the value of $d_{sb}$). The basic result, then, is that according to Equations 7a and 7b and their combinations, differential responding with respect to reinforcement can occur only if $d_{br} > 1.0$. Moreover, differential responding with respect to stimuli depends on effective (rather than arranged) differential reinforcement with respect to responding. There can be no stimulus control without effective differential reinforcement (cf. the model suggested by Davison and Jenkins, 1985, Equations 6a and 6b, in which stimulus control could occur without effective differential reinforcement).

More generally, the effects of $d_{sb}$ and $d_{br}$ on differential responding with respect to Stimuli $S_1$ and $S_2$ (measured by log $D$) and with respect to Responses $B_1$ and $B_2$ (measured by log $B$) as functions of $R_{11}/R_{22}$ are summarized by the examples in Figure 6. The upper panels show the effects of the log reinforcer ratio on log $B$ as predicted by Equation 8 for two representative values of $d_{sb}$: a fairly easy discrimination, $d_{sb} = 10$, in the left panel and a difficult discrimination, $d_{sb} = 2$, in the right panel. As $d_{br}$ varies parametrically in four steps from 1 to 1,000, the bias functions become progressively steeper in both panels, and will reach an asymptote at exact matching when $d_{br} = \infty$ (not shown). This predicted steepening parallels the changes that have been observed with concurrent VI VI schedules as a function of disparity between alternatives, as reported by Miller et al. (1980). Note that the functions are roughly linear over the center of the range, and thus conform approximately to the generalized matching law when it is restated in logarithms. Note also that the functions are steeper at each intermediate value of $d_{br}$ when $d_{sb}$ is 2 than when it is 10, and that the curvilinearity is more pronounced when $d_{sb}$ is smaller than when it is larger. Thus, choice allocation is predicted to be more sensitive to the reinforcer ratio when the stimuli are more confusable.

The lower panels of Figure 6 show how log $D$ depends on the reinforcer ratio at two values of $d_{br}$ when $d_{sb}$ varies parametrically in four steps, as predicted by Equation 11. When $d_{br} = 10$ and $d_{sb} = 1{,}000$, the function is nearly flat and log $D$ approaches its maximum of 1.0, the limiting value permitted by $d_{br}$. As $d_{sb}$ decreases, log $D$ decreases systematically and the functions assume an inverted-U shape until, at $d_{sb} = 1$, log $D$ falls to 0. When $d_{br} = 2$, the maximum value of log $D$ is 0.3 (i.e., log 2), and compared with the predictions for $d_{br} = 10$, log $D$ is less affected by variations in log reinforcer ratios. These examples illustrate the fact that the accuracy of a discrimination depends jointly on the discriminability of the stimulus–behavior relations and the behavior–reinforcer relations.

The parameters $d_{sb}$ and $d_{br}$ are exactly interchangeable in their effects on log $D$. For example, the values of log $D$ for $d_{sb} = 10$, $d_{br} = 2$ and for $d_{sb} = 2$, $d_{br} = 10$ are identical, as can be seen by inspection of Equation 11 and by comparison of the shallow inverted-U functions for these parameter values in the lower panels of Figure 6. By contrast, the upper panels show that the effects of $d_{sb}$ and $d_{br}$ on log $B$ are not interchangeable. For example, the function for $d_{sb} = 10$, $d_{br} = 2$ is substantially shallower than the function for $d_{sb} = 2$, $d_{br} = 10$. More generally, when $d_{br}$ approaches one, the effects of both the stimulus difference and the reinforcer ratio are progressively weakened until behavior allocation

is indifferent to all reinforcer ratios. However, when $d_{sb}$ approaches one, measured discrimination decreases at all values of the reinforcer ratio, but the sensitivity of choice allocation to the reinforcer ratio increases and approaches strict matching. In this sense, then, the discriminability of stimulus–response relations and the discriminability of the response–reinforcer contingency are predicted to have different behavioral effects even though they are conceived of, and treated, in parallel.

Fig. 6. Upper panels: log $B$ (Equation 8) as a function of the log reinforcer ratio for two values of stimulus–response discriminability ($d_{sb}$). Lower panels: log $D$ (Equation 11) as a function of the log reinforcer ratio for two values of response–reinforcer discriminability ($d_{br}$). Note that we use log $B$ and log $D$ to signify theoretical predictions (see Appendix A). Parameter values are for $d_{br}$ in the upper panels and for $d_{sb}$ in the lower panels.

Reinforcers for Errors with Two Stimuli and Two Responses

Although most conditional discrimination experiments employ the contingencies described above (reinforcement for $B_1$ only on $S_1$ trials, and reinforcement for $B_2$ only on $S_2$ trials), the effects of reinforcement for responses conventionally termed "errors" ($B_1$ given $S_2$, and $B_2$ given $S_1$) are of major interest. First, reinforcement of only "correct" responses is simply an extreme, and maybe unusual, point on the continuum of discriminated operant contingencies. Any serious model of contingencies of reinforcement must deal with the entire continuum, and must provide measures of $d_{sb}$ and $d_{br}$ that are unaffected by the distribution of reinforcers in the matrix. Second, the introduction of reinforcement for errors makes direct contact with multiple concurrent schedules, in which each of two simultaneously available responses may be reinforced on different schedules depending on the stimulus signaling them (e.g., McLean & White, 1983).

The approach taken above naturally generalizes to reinforcers delivered in any cell of the matrix, with no further assumptions. Reinforcers delivered in any cell of a matrix will have an influence on the effective reinforcer rate in other cells depending on the discriminability of both stimulus–response relations and response–reinforcer relations between those cells. The appropriate equations (for a 2 × 2 matrix, suppressing the further subscripting of $d_{sb}$ and $d_{br}$) are, for $S_1$,

$$\frac{B_{11}}{B_{12}} = c\,\frac{R_{11} + \dfrac{R_{12}}{d_{br}} + \dfrac{R_{21}}{d_{sb}} + \dfrac{R_{22}}{d_{sb}\,d_{br}}}{\dfrac{R_{11}}{d_{br}} + R_{12} + \dfrac{R_{21}}{d_{sb}\,d_{br}} + \dfrac{R_{22}}{d_{sb}}}, \tag{13a}$$

and, for $S_2$,

$$\frac{B_{21}}{B_{22}} = c\,\frac{\dfrac{R_{11}}{d_{sb}} + \dfrac{R_{12}}{d_{sb}\,d_{br}} + R_{21} + \dfrac{R_{22}}{d_{br}}}{\dfrac{R_{11}}{d_{sb}\,d_{br}} + \dfrac{R_{12}}{d_{sb}} + \dfrac{R_{21}}{d_{br}} + R_{22}}. \tag{13b}$$

More Than Two Stimuli and Two Responses

The conventional conditional discrimination procedure employs a pair of stimuli, $S_1$ and $S_2$. We now consider the case of more than two stimuli, some of which may share identical reinforcer contingencies. This situation requires additional stimulus–response discriminability parameters. Assuming symmetry (i.e., the discriminability of $S_1$ from $S_2$ is the same as the discriminability of $S_2$ from $S_1$), there will, for $N$ stimuli, be $N(N - 1)/2$ such parameters, one for each pair of stimuli. This number quickly gets out of hand, requiring an unachievable number of experimental conditions to provide accurate estimates of the parameter values. As a formal demonstration only, here we will take three stimuli and two responses and, rather than show the equations directly, we will show a matrix of effective reinforcer values. We will assume that reinforcers are available for $B_1$ in the presence of both $S_1$ and $S_3$ and for $B_2$ in the presence of $S_2$. The arranged stimulus–response–reinforcer matrix is shown in the top of Figure 7 and the effective stimulus–response–reinforcer matrix is shown at the bottom.

Fig. 7. Upper panel: the matrix of events in a three-stimulus two-response detection matrix in which reinforcers are available for $B_1$ in the presence of $S_1$ and $S_3$ and for $B_2$ in the presence of $S_2$. Lower panel: the effective reinforcer matrix for the events in the upper panel.

Note that apparent reinforcement for $S_1$ and $S_3$ is more similar when both stimuli signal reinforcement for the same response than when they signal reinforcement for different responses. The similarity of effective reinforcement for responses to otherwise distinctive stimuli may contribute to their membership in the same stimulus class in research on categorization or stimulus equivalence.

Fig. 8. The matrix of events in a three-stimulus three-response detection matrix in which $B_1$ is reinforced in the presence of $S_1$, $B_2$ in the presence of $S_2$, and $B_3$ in the presence of $S_3$.

N Responses, M Stimuli

The generalization of the present model to more than two stimuli and more than two responses is straightforward. For example, consider the conditional discrimination matrix shown in Figure 8. We require three pairwise stimulus–behavior discriminability parameters, $d_{sb12}$, $d_{sb13}$, and $d_{sb23}$. Likewise, we also require three behavior–reinforcement discriminability parameters, $d_{br12}$, $d_{br13}$, and $d_{br23}$. As in the two-stimulus two-response model, we assume that reinforcers delivered in one cell generalize to behavior in all other cells to an extent that depends on the discriminability of the stimulus–behavior relations within columns, on the discriminability of the response–reinforcer relations within rows, or on both of these. Thus, $R_{11}$ reinforcers affect responses in Cell 12 according to $R_{11}/d_{br12}$, in Cell 13 according to $R_{11}/d_{br13}$, and so on. This operation specifies the decremental effects of what are, essentially, stimulus–behavior and behavior–reinforcer distances to elucidate for the reader the ideas underlying the model. As in the basic two-stimulus two-response case developed above, the effects of reinforcers in

all cells, weighted by their psychological distances from that cell, add within a cell to provide an overall effective reinforcer value for that cell. Within each row of the matrix, response allocation is assumed to match the relative effective reinforcer value. The model for the three-stimulus example described here reduces directly to the two-stimulus case described above when $S_3$ is eliminated. The effective reinforcer matrix for the above 3 × 3 matrix (with reinforcers only as $R_{11}$, $R_{22}$, and $R_{33}$) is shown in Figure 9.

Fig. 9. The effective reinforcer matrix for the three-stimulus three-response detection matrix shown in Figure 8.

The expressions in the above matrix are tedious, rather than complicated. However, using the same idea of the generalization of reinforcer effects to other stimulus–behavior pairs, equations for any $N \times M$ matrix, with reinforcers in any or all cells, can be obtained. There is also no reason why, again using the same basic theory, we should not expand the model into three dimensions (e.g., the third dimension might be the discriminability between a set of second-order conditional cues or stimulus-choice delays).

Requirements for an Effective Model

An effective model for conditional discrimination and other performances (and indeed, any model for any system) has a number of immutable requirements. These can be summarized as follows: Any parameter that purports to be a measure of an independent-variable effect must remain unaffected by the variation of other independent variables that purport to affect other parameters. For instance, $d_{sb}$ should be affected, in an appropriate direction, by changes in conditional stimuli, but should not be affected by changes in reinforcer frequencies or, more particularly, by changes in response–reinforcer differentiation. Equally, $d_{br}$ should be affected in an appropriate direction by changes in response–reinforcer differentiation, but not by changes in conditional stimuli. In other words, derived parameters should show what Nevin (1984) termed parameter invariance.

Such parameter invariance is of prime importance, and an example is useful here. When Davison and McCarthy (1980) varied the frequency of reinforcers delivered for errors (i.e., they arranged $R_{12}$ and $R_{21}$ reinforcers), they found that the Davison-Tustin (1978) measure of stimulus discriminability, log $d$, decreased with increasing error-reinforcement probability. This reinforcer manipulation should, in theory, not have affected a conditional-stimulus measure. They, and Nevin et al. (1982), went to some pains to extend the Davison-Tustin model so that the invariance of this stimulus measure could be preserved. As it turned out, neither extension was satisfactory, but the model presented here naturally, and without modification, deals with reinforcement for errors, and so has the potential to deal directly with such data and preserve parameter invariance. We shall return to an analysis of these data later.

Parameter invariance requires that the terms of our model be measured in ways that permit unambiguous assignment of numerical values, and that these values behave in accordance with the principles of measurement theory. For example, from a series of conditions of a three-stimulus three-response procedure, one can estimate log $d_{sb12}$, log $d_{sb23}$, and log $d_{sb13}$ by a criterion of best fit. If the parameters characterize distances in psychometric space according to an interval scale, these three values of log $d_{sb}$ must be related by the expression log $d_{sb13}$ = log $d_{sb12}$ + log $d_{sb23}$. Moreover, the appropriate values must remain unchanged when Stimuli 1 and 2, 2
and 3, or 1 and 3 are employed in two-stimulus two-response procedures. These requirements can be confirmed (or disconfirmed) if our basic parameters are quantified on log interval scales, which we assume throughout and test wherever possible, for example, with temporal, color, and luminance discrimination (Davison, 1991b; Godfrey & Davison, 1998).

APPLICATION OF THE MODEL TO CONDITIONAL DISCRIMINATION DATA

2 × 2 Conditional Discrimination

Alsop (1988) and Alsop and Davison (1991) reported an experiment that measured response–reinforcer discriminability (log dbr) and stimulus–response discriminability (log dsb) for seven stimulus pairs ordered in terms of stimulus disparity. They used a standard signal-detection procedure with different light intensities as the conditional stimuli. These data constituted a detailed assessment of the current model: If the model is correct, the estimated value of dsb should be ordinally related to the stimulus disparity, and the estimated value of dbr should not change systematically with changes in stimulus disparity. Figure 10 (upper panel) shows that the first of these predictions was clearly supported. The situation with regard to the second prediction (lower panel) is less clear. There does appear to be a U-shaped relation between dbr and stimulus disparity. However, a Friedman test for a quadratic relationship fails to find a significant quadratic trend at p = .05, so the U-shaped function is more apparent than real, or is not statistically evident because of large variances in some measures over subjects (i.e., in the A, B, and G sets). There is a serious problem in estimating dbr when dsb is high (such as in Set G) because the few errors that are emitted have a very strong effect on dbr, and a difference of one or two responses in each error cell can radically change the value of dbr. For the present analysis, we used the Hautus (1995) correction on the data (see Appendix B) to try to eliminate problems of estimating the parameters when few responses are emitted in some cells of the matrix.

Reanalysis of Alsop’s (1988) data reveals the predicted changes in the slope of the function relating log B to the log reinforcer ratio: As dsb increases, the slope decreases (upper panels of Figure 6). However, the inverted-U form predicted for the function relating log D to the reinforcer ratio (lower panels of Figure 6) appears only at the lowest nonzero level of stimulus differences arranged in his experiment. This failure of prediction is not decisive because the reinforcer ratio was varied by less than ±1 log unit, thus capturing only the central, relatively flat part of the function. The same restriction of range applies to the data of McCarthy and Davison (1979, 1980a), who also failed to find the predicted inverted-U relation. More seriously, Whittaker (1977), using rats in a yes-no signal-detection procedure, varied the reinforcer ratio for the two choice responses over a much wider range than has been usual, and failed to find both the inverted-U relation between log D and the reinforcer ratio and the predicted change in the slope of the relation between log B and the reinforcer ratio when the disparity of the conditional stimuli was changed. The relation between log B and the log reinforcer ratio also failed to give any evidence of the predicted curvilinearity (Figure 6, top panels), and is thus incompatible with the results reported by Davison and Jones (1995). However, Whittaker used ratio schedules, so the obtained reinforcer ratio covaried with response allocation, and conducted only 8 to 15 sessions of 300 trials per session in each condition. Recent research (Davison & Jones, 1998) suggests that even when reinforcer ratios are controlled, stability takes a long time to occur at extreme ratios. Thus, Whittaker’s study probably needs to be replicated.

As noted above, our model predicts an asymmetry between the effects of dsb and dbr on the relation between B and the reinforcer ratio: B becomes more sensitive to the reinforcer ratio as dbr increases, but becomes less sensitive as dsb increases. Exactly this result was reported by Nevin, Cate, and Alsop (1993) in an experiment in which S1 and S2 were bright and dim keylights, and B1 and B2 were defined as short or long latencies of a single response. They varied the reinforcer ratio with large or small differences between S1 and S2, and with large or small differences between the criteria for short or long laten-

Fig. 10. Reanalyses of the signal-detection data reported by Alsop (1988) and by Alsop and Davison (1991). The
upper panel shows the value of log dsb as a function of the ordinally increasing disparity between the intensity of S1
and S2. The lower panel shows estimates of log dbr for each disparity level.
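The two parameters plotted in Figure 10 can be explored numerically. The following is a minimal sketch of the 2 × 2 case, not the authors' fitting code. It assumes that "weighted by their psychological distances" means dividing a cell's reinforcers by dsb when the stimulus differs and by dbr when the response differs (by both when both differ), and that log D and log B are the usual half-log products of the two within-row response ratios. The function names and the reinforcer counts are hypothetical.

```python
import math

def effective_matrix(r, d_sb, d_br):
    """Effective reinforcers per cell: every cell's reinforcers generalize to
    every other cell, divided by d_sb if the stimulus differs and by d_br if
    the response differs (assumed form of the distance weighting)."""
    e = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):              # target stimulus
        for j in range(2):          # target response
            for k in range(2):      # source stimulus
                for l in range(2):  # source response
                    weight = (d_sb if i != k else 1.0) * (d_br if j != l else 1.0)
                    e[i][j] += r[k][l] / weight
    return e

def log_d_and_log_b(r, d_sb, d_br):
    """Response ratios match effective reinforcer ratios within each row."""
    e = effective_matrix(r, d_sb, d_br)
    ratio_s1 = e[0][0] / e[0][1]    # B11/B12
    ratio_s2 = e[1][0] / e[1][1]    # B21/B22
    log_d = 0.5 * math.log10(ratio_s1 / ratio_s2)
    log_b = 0.5 * math.log10(ratio_s1 * ratio_s2)
    return log_d, log_b

def sensitivity(d_sb, d_br):
    """Two-point slope of log B against the log reinforcer ratio (9:1 vs. 1:9)."""
    _, hi = log_d_and_log_b([[90.0, 0.0], [0.0, 10.0]], d_sb, d_br)
    _, lo = log_d_and_log_b([[10.0, 0.0], [0.0, 90.0]], d_sb, d_br)
    return (hi - lo) / (2.0 * math.log10(9.0))

if __name__ == "__main__":
    for d_sb, d_br in ((10.0, 10.0), (10.0, 100.0), (100.0, 10.0)):
        print(d_sb, d_br, round(sensitivity(d_sb, d_br), 3))
```

Under these assumptions the sketch reproduces the asymmetry described in the text: the slope rises when dbr is raised and falls when dsb is raised.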

cies. They obtained similar values for log d, the Davison-Tustin measure of discrimination, in two conditions, one with a small S1-S2 difference and large B1-B2 difference and the other with a large S1-S2 difference and a small B1-B2 difference. In the first of these conditions, log b was a steep and orderly increasing function of the log reinforcer ratio, with a, the generalized-matching-law measure of sensitivity, about 0.75. In the second, log b was a more variable and shallower function of the log reinforcer ratio, with a values ranging from −0.14 to 0.40. Their results, which are shown in Figure 11, confirm a counterintuitive prediction of our model that is shown in Figure 6.

When Nevin et al. (1993) estimated the parameters of our model, dsb increased with the luminance difference between S1 and S2, and dbr increased with the difference between criteria for short and long latencies. However, the estimated value of dbr depended on the S1-S2 difference, violating the requirement of parameter invariance. This violation may have resulted from the fact that some responses were not scored because their latencies did not meet the experimenters’ criteria. For example, when B1 and B2 were defined as responses with latencies between 1.0 and 2.0 s and between 2.0 and 3.0 s, respectively, a latency of 0.9 s would not be counted even though it might belong to the functional class of “short” latencies. Moreover, obtained latencies depended on stimulus intensity as
well as the latency criteria, so the functional (as opposed to experimenter-defined) response classes were not independent of the conditional stimuli. In view of these problems, the failure of parameter invariance is perhaps not surprising.

Fig. 11. Differential responding to B1 or B2, measured by log b, as a function of the obtained log reinforcer ratio (redrawn from Nevin, Cate, & Alsop, 1993).

Godfrey (1997; Godfrey & Davison, 1998) avoided these problems by defining S1 and S2 by the luminance of a center key, and defining B1 and B2 by the luminances of the choice keys, with bright and dim lights presented irregularly on the left and right keys. Her procedure identifies the choice responses by the stimuli signaling them, as in matching to sample or in a switching-key concurrent schedule, and permits the differentiation between B1 and B2 to be specified on the same experimental continuum as S1 and S2, but to be varied independently of the difference between S1 and S2. She varied reinforcer ratios for various different levels of conditional stimulus disparity and choice stimulus disparity and found no significant effects of the former on measures of the latter, nor vice versa. In particular, she found that discriminability measures for any particular stimulus disparity were not different according to whether they were obtained for a conditional stimulus or a choice stimulus disparity. This, then, is very strong evidence for the model.

Value Transfer

Although we have emphasized steady-state conditional discrimination performance, our model is also consistent with choice data obtained with novel pairs of stimuli. For example, Zentall and Sherburne (1994) trained pigeons on randomly alternating simultaneous discriminations with red (100% reinforcement) versus yellow (0%) and green (50%) versus blue (0%), with color assignments counterbalanced across birds. After training to criterion, they conducted probe choice tests with yellow and blue, and obtained significantly more responses to yellow. This result follows from our model if we construe red and yellow as defining B1 and B2 on S1 (red-yellow) trials and green and blue as defining B1 and B2 on S2 (green-blue) trials. For any value of dsb greater than one (representing discrimination between the two kinds of simultaneous-discrimination trials) and dbr less than infinity (representing differentiation between the response–reinforcer contingencies within each simultaneous discrimination), yellow obtains greater generalized strength than blue, as shown in Figure 12. We suggest that this transfer of value from S+ to S− within a simultaneous discrimination, which has been used to explain transitive inference in pigeons (Fersen, Wynne, Delius, & Staddon, 1991), follows naturally from our model.

Reinforcement for Errors

Three experiments with animal subjects have systematically explored the effects of reinforcing responses that are conventionally designated “errors.” Davison and McCarthy (1980) trained pigeons to discriminate between S1 (a 5-s keylight) and S2 (a 10-s keylight) in a procedure that arranged reinforcement probabilistically for responses in each of the four cells of the matrix of Figure 2. Both B11 and B22 were reinforced with a probability of .7 throughout the experiment, and the probability of reinforcement for B12 and B21 was varied across conditions from 0 to .9.
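The value-transfer argument made for the Zentall and Sherburne (1994) arrangement (Figure 12) can be checked with a few lines of arithmetic. This is a sketch under an assumption about the form of the generalization: reinforcers for a cell are divided by dbr when they generalize to the other response and by dsb when they generalize to the other trial type. Reinforcer values are expressed as the training probabilities (1.0 for red, 0.5 for green), and the function name is hypothetical.

```python
def generalized_strength(d_sb, d_br, r_red=1.0, r_green=0.5):
    """Effective reinforcer values for the two never-reinforced stimuli.

    Red/yellow are B1/B2 on S1 trials; green/blue are B1/B2 on S2 trials.
    Yellow and blue receive no reinforcers directly, so their strength comes
    only from generalization (assumed to divide by the distance parameters).
    """
    yellow = r_red / d_br + r_green / (d_sb * d_br)
    blue = r_green / d_br + r_red / (d_sb * d_br)
    return yellow, blue

if __name__ == "__main__":
    for d_sb, d_br in ((1.0, 10.0), (2.0, 10.0), (10.0, 10.0), (10.0, 100.0)):
        yellow, blue = generalized_strength(d_sb, d_br)
        print(d_sb, d_br, yellow, blue)
```

With this form the difference between yellow and blue is (r_red − r_green)(1 − 1/dsb)/dbr, which is positive whenever dsb exceeds one and dbr is finite, and vanishes when dsb equals one, in line with the conditions stated in the text.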
Fig. 12. The effective reinforcer matrix for the experimental conditions arranged by Zentall and Sherburne (1994).

In this arrangement, the number of reinforcers obtained in each cell of the matrix depends directly on the frequency of responses in that cell. Nevin, Olson, Mandell, and Yarensky (1975) performed a closely comparable experiment with rats as subjects and bright or dim lights as S1 and S2. Although the numbers of reinforcers (the independent variables in the model) depended on the numbers and ratios of responses (the dependent variables) in both experiments, the model is applicable because it is based on obtained rather than scheduled reinforcers.

The third experiment (Nevin et al., 1982) used an alternative method for scheduling reinforcers to insure that the number of reinforcers obtained in a cell approximated the number programmed (Shimp, 1969; Stubbs & Pliskoff, 1969). The essence of the method is to arrange the availability of a reinforcer in a particular cell and withhold reinforcers in all other cells until that one has been obtained. This has come to be known as interdependent (or just dependent) scheduling, because reinforcement for one response depends on whether a reinforcer has been scheduled for and obtained by emitting another response. It has also become known, in the signal-detection literature (McCarthy & Davison, 1984), as a controlled reinforcer-ratio procedure, because the ratio of obtained reinforcers is specified in advance, by the experimenter, within the limits of statistical fluctuation. Nevin et al. employed pigeons as subjects with 2-s and 3-s keylights as S1 and S2 with interdependent scheduling to control obtained reinforcer ratios, and varied the ratio of reinforcers for B1 and B2 independently of the ratio of reinforcers for correct responses and errors, as traditionally defined.

Figures 13 and 14 show the results of fitting Equations 13a and 13b to Davison and McCarthy’s (1980) and Nevin et al.’s (1982) data. The predictions fit the data well, with values of dsb and dbr that are similar to those obtained for moderately confusable stimuli when only correct responses are reinforced. For Davison and McCarthy’s data, mean dsb values were about 10, but the dbr values obtained were poorly estimated because Davison and McCarthy did not explicitly vary the reinforcer ratios. Accordingly, we used a dbr value of 23 (the average for the Nevin et al. data) for the fits in Figure 13. For Nevin et al., the average dsb and dbr values were 8 and 23, respectively. As a further check on the consistency of application of this model, we looked at deviations of predictions from data both as a function of the percentage of R1 reinforcers and as a function of the percentage of reinforcers in the “correct” R11 and R22 cells. Fits to these deviations showed no significant deviations of slopes from a line of 0 slope except for Bird 60 of the Nevin et al. data.

Our approach has also been supported by a recent report by Hartl and Fantino (1996). In a conventional matching-to-sample procedure, they varied the probabilities of reinforcement for making one or the other choice response to a comparison stimulus that matched the sample, and independently, for responding to a particular comparison stimulus regardless of the sample. The latter variation effectively arranges reinforcers for errors, and their data were well explained by an earlier (but algebraically equivalent) version of our model (see their p. 23 for discussion). In conclusion, it appears that our model for the conventional two-stimulus two-response conditional discrimination extends naturally, and with the requisite invariance of dsb, to situations in which reinforcers occur in all four cells of the 2 × 2 matrix.

Matching to Sample and Its Variants

As we have noted above, the conventional three-key matching-to-sample paradigm is a two-stimulus two-response conditional discrimination in which the sample, presented on the center key, serves as the conditional cue and the side-key choices are defined by comparison stimuli. The procedure has been used intensively by researchers whose prima-

Fig. 13. Predicted and obtained relative distribution of responses for the data reported by Davison and McCarthy
(1980). The straight lines were fitted by the method of least squares.
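The logic behind fits of this kind can be illustrated with a short sketch. Equations 13a and 13b are not reproduced in this part of the text, so the expressions below are an assumed reading of the model in which reinforcers in all four cells generalize to each cell, divided by dsb across stimuli and by dbr across responses. The obtained reinforcer counts are hypothetical; only dsb = 10 and dbr = 23 are taken from the values quoted for the Davison and McCarthy fits.

```python
import math

def predicted_log_d(r11, r12, r21, r22, d_sb, d_br):
    """Davison-Tustin log d computed from predicted response ratios when
    reinforcers may occur in all four cells (assumed generalization form)."""
    e11 = r11 + r12 / d_br + r21 / d_sb + r22 / (d_sb * d_br)
    e12 = r12 + r11 / d_br + r22 / d_sb + r21 / (d_sb * d_br)
    e21 = r21 + r22 / d_br + r11 / d_sb + r12 / (d_sb * d_br)
    e22 = r22 + r21 / d_br + r12 / d_sb + r11 / (d_sb * d_br)
    # Response ratios within each stimulus match effective reinforcer ratios.
    return 0.5 * math.log10((e11 / e12) * (e22 / e21))

if __name__ == "__main__":
    # Hypothetical obtained reinforcers: 70 per "correct" cell, with an
    # increasing number obtained for "errors"; d_sb and d_br are held fixed.
    for errors in (0.0, 10.0, 30.0, 60.0):
        value = predicted_log_d(70.0, errors, errors, 70.0, d_sb=10.0, d_br=23.0)
        print(errors, round(value, 3))
```

In this sketch the computed log d falls as error reinforcers increase even though dsb never changes, which is the sense in which the model can keep its stimulus parameter invariant while the Davison-Tustin measure varies.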

ry interests are in cognitive processes such as coding, retrieval, and limited-capacity working memory, as well as in behavior-analytic research on equivalence classes. We will not attempt to review the massive literature in this area, but to the extent that our model can account for performance in this paradigm, it will provide an alternative to explanations that invoke cognitive processes.

Identity versus symbolic matching. The best known version of the matching-to-sample paradigm, termed identity matching, employs comparison stimuli that are physically similar to the samples. A related version, termed symbolic matching, employs comparison stimuli from an independent dimension to define the choice responses. In our model, the parameter dsb depends on the discriminability of the relations between the sample and choice-defining stimuli, whereas dbr depends on the discriminability of the relations between the choice responses and the reinforcer. According to our theory, then, in identity matching the values of dsb and dbr should be identical, despite the fact that the conditional stimuli occur successively in time and the stimuli signaling the choice alternatives occur simultaneously. This was shown to be the case in the research reported by Godfrey (1997) and Godfrey and Davison (1998) (see the section on 2 × 2 conditional discrimination, above).

Delayed matching and delayed reinforcement. A widely studied variant of the matching-to-sample paradigm that is of special interest in re-
search on short-term memory introduces a delay (or retention interval) between the offset of the conditional cue and the availability of the choice responses. We will distinguish two procedures that are commonly used in research on delay of choice. The first, termed a fixed-delay procedure, employs a single delay value between offset of the sample and onset of the comparison stimuli throughout an experimental condition, and varies the delay between conditions in order to determine a delay gradient or forgetting curve. The second, termed a mixed-delay procedure, arranges a number of different delays within a single condition. A well-nigh universal effect of lengthening the delay is a progressive decrease in accuracy of discrimination as measured by percentage correct or by the Davison-Tustin measure log d (e.g., Cumming, Berryman, & Nevin, 1963; Harnett, McCarthy, & Davison, 1984; White, 1991). In the terms of our model, this decrease results from stimulus–response discriminability (dsb) becoming degraded during the delay with response–reinforcer discriminability (dbr) remaining high. The present model, then, would predict that as dsb decreases with increasing delays, log d (the Davison-Tustin measure that has been used extensively in this area) should also decrease, but sensitivity to reinforcement as measured by a, the slope of the generalized-matching-law relation between log b and the log reinforcer ratio, should increase, as shown in Figure 6.

Fig. 14. Predicted and obtained relative distribution of responses for the data reported by Nevin, Jenkins, Whittaker, and Yarensky (1982). The straight lines were fitted by the method of least squares.

It is also well known that introducing a delay between choice responses and the reinforcer (delay of reinforcement) also decreases log d (e.g., McCarthy & Davison, 1991). However, log d values would fall not because of decreasing stimulus–response discriminability, but because of decreasing response–reinforcer discriminability. As the delay lengthens, the delay-of-reinforcement procedure will come to function as a reinforcement-for-errors procedure, and will produce the same effects. But when response–reinforcer discriminability is compromised, sensitivity to reinforcement, as measured by a, will also decrease.

Rather than attempting to estimate dsb and dbr from the voluminous data in this area, we will show some examples of effects that can be expected when values of log d and a (measures frequently reported in this area) are predicted from the present model using representative values of dsb and dbr. But first we develop the model. We shall assume, for convenience, that we can specify a dt value for each constant increment in delay time. What this means, in effect, is that in each (say) 1-s interval, a constant value of dt (a discriminability parameter just like dsb and dbr) operates in the usual way on the effective reinforcer matrix that exists at the start of this 1-s interval to produce a new effective reinforcer matrix. This simply has the effect of progressively moving the effective reinforcer matrix towards nondifferential reinforcement with respect to responses (i.e., equal reinforcer frequencies for the two responses in the presence of each stimulus). The selection of a 1-s step is, of course, arbitrary, and smaller steps will simply require a smaller dt value. Because effective reinforcer allocation

Fig. 15. Log discrimination and sensitivity to reinforcement, as measured using the Davison-Tustin (1978) model,
when the delay between stimulus presentation and choice is varied for some representative values of log dsb and log
dbr . The value of dt is constant across time.
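The successive application of dt for the delay-of-choice case can be sketched as follows. This is one possible reading of the verbal description rather than the authors' implementation: it assumes that dt "operates in the usual way" by letting effective reinforcers generalize across the two stimuli once per delay step, after dsb has been applied and before dbr is applied. All function names, reinforcer counts, and parameter values are hypothetical.

```python
import math

def mix_stimuli(m, d):
    """Generalize effective reinforcers across the two stimuli (rows)."""
    return [[m[0][j] + m[1][j] / d for j in range(2)],
            [m[1][j] + m[0][j] / d for j in range(2)]]

def mix_responses(m, d):
    """Generalize effective reinforcers across the two responses (columns)."""
    return [[row[0] + row[1] / d, row[1] + row[0] / d] for row in m]

def response_ratios(r11, r22, d_sb, d_br, d_t, steps):
    """Predicted B11/B12 and B21/B22 after `steps` delay increments."""
    m = mix_stimuli([[r11, 0.0], [0.0, r22]], d_sb)
    for _ in range(steps):          # one application of d_t per delay step
        m = mix_stimuli(m, d_t)
    m = mix_responses(m, d_br)      # applied when the choice is emitted
    return m[0][0] / m[0][1], m[1][0] / m[1][1]

def log_d(d_sb, d_br, d_t, steps):
    """Discrimination with equal reinforcers for the two correct responses."""
    s1, s2 = response_ratios(50.0, 50.0, d_sb, d_br, d_t, steps)
    return 0.5 * math.log10(s1 / s2)

def sensitivity(d_sb, d_br, d_t, steps):
    """Two-point slope of log b against the log reinforcer ratio (9:1 vs. 1:9)."""
    hi = response_ratios(90.0, 10.0, d_sb, d_br, d_t, steps)
    lo = response_ratios(10.0, 90.0, d_sb, d_br, d_t, steps)
    log_b_hi = 0.5 * math.log10(hi[0] * hi[1])
    log_b_lo = 0.5 * math.log10(lo[0] * lo[1])
    return (log_b_hi - log_b_lo) / (2.0 * math.log10(9.0))

if __name__ == "__main__":
    for steps in (0, 1, 2, 4, 8):
        print(steps,
              round(log_d(100.0, 100.0, 10.0, steps), 3),
              round(sensitivity(100.0, 100.0, 10.0, steps), 3))
```

Under these assumptions log d falls and a rises as the number of delay steps grows, the qualitative pattern the text attributes to the delay-of-choice predictions. Only ratios of matrix values are used, so the growth in their absolute size across steps does not matter.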

changes over the course of the delay, the model must be applied in a successive, rather than a simultaneous, manner. That is, for the delay-of-choice situation, we need to operate progressively on stimulus–response discriminability (dsb) using dt during this delay before applying response–reinforcer confusion when the choice is emitted. In the delay-of-reinforcement situation, the matrix results of stimulus–response confusion will be progressively operated on by dt over the delay before being operated on by response–reinforcer confusion. The particular matrix values that are current when the choice stimuli are presented, or when the reinforcer is delivered, will be used to predict response ratios and calculate log d.

Figure 15 shows how predicted measures of discrimination and sensitivity to reinforcement change with increasing delay of choice with some representative values of dsb and dbr. Discrimination, measured by log d, falls under all conditions, and sensitivity to reinforcement, measured by a, either increases or may appear to remain constant when dbr is relatively high and dsb is relatively low.

Figure 16 shows delay-of-reinforcement predictions. Both log d and a fall with increasing delay with sensitivity being generally higher when dbr is greater, though sensitivity is similar when dsb and dbr are both high and when they are both low.

In an extensive experiment, using mixed rather than fixed delays and quite high conditional stimulus discriminability values (log d values generally between 1.3 and 1.9), Jones and White (1992) reported a statistically significant increase in sensitivity to reinforcement with increasing stimulus–choice delay, confirming the predictions in the upper panels of Figure 15. In a related experiment with fixed rather than mixed delays, Harnett et al. (1984) found the usual decrement in log d with increasing delay, but no statistically significant change in sensitivity to reinforcement. Because their log d values were some-

Fig. 16. Log discrimination and sensitivity to reinforcement, as measured using the Davison-Tustin (1978) model, when the delay between choice and reinforcement is varied for some representative values of log dsb and log dbr. The value of dt is constant across time.

what lower than those of Jones and White, this lack of a statistically significant change accords reasonably with the shallow sensitivity function predicted in the lower left panel of Figure 15.

McCarthy and Davison (1986) reported that both delay of choice and delay of reinforcement decreased log d, but that the effect for delay of choice was greater than that for delay of reinforcement. This difference is predicted for cases in which dbr is high, as can be seen by comparing the left panels of Figures 15 and 16 (but note that the predicted decreases in log d are identical when dbr is moderate, as in the right panels). McCarthy and Davison (1991) replicated this effect, and also measured sensitivity to reinforcement (a) for each stimulus–choice and choice–reinforcer delay. Sensitivity to reinforcement decreased (though not greatly) when both delays were varied. Although our model predicts the decrease in a with increasing choice–reinforcer delay, the decrease in a with increasing stimulus–choice delay, given that log d values were around 1.0, is inconsistent with the predictions in Figure 15. McCarthy and Davison made no correction for zero cells, and such a correction would have had the effect of marginally decreasing the sensitivity (and log d) values at short delays. However, it remains puzzling that no increase in sensitivity was found. This last result, then, appears partially but not strongly to argue against the present model. In passing, it is important to note that McCarthy and Davison did fit Equations 7a and 7b to their data, and found some changes in dsb and dbr with increasing stimulus–choice and choice–reinforcer delays. This lack of invariance is not an argument against the full model, however, because they did not use the model with increased confusion with elapsing delays as presented above.

Finally, McCarthy and Voss (1995), using a fixed-delay procedure, provided clear evidence that sensitivity to reinforcement fell
with increasing stimulus–choice delay, both for small and large reinforcer durations, contrary to our prediction. Given the wide range of results reported on the relation between sensitivity to reinforcement and delay in delayed matching to sample, it is not surprising that we have some difficulty in modeling in this area. The majority of results seem to suggest that sensitivity falls with delay, which is incompatible with our model. However, we would argue that the final assessment of our model for delay of choice should await the empirical resolution of just what variables are critical to produce increases or decreases in the value of a with delay; until, in other words, we have a clear empirical result to model.5

5 White and Wixted (1999) recently described an inverse relation between discrimination and sensitivity to relative reinforcement in delayed matching to sample.

Second-order discrimination of mixed delays. Many researchers, but notably White and his associates (e.g., White, 1985), have varied stimulus–choice delays within sessions rather than across conditions. Recently, White and Cooney (1996) varied the reinforcer ratio separately following two delays that occurred in irregular order. For example, in one set of conditions, reinforcer probabilities for correct responses to red and green choice keys varied across conditions from .1 to .9 when the stimulus–choice delay was 0.1 s but were constant at .5 when the stimulus–choice delay was 4 s. In effect, the length of the delay may be construed as a second-order conditional cue signaling differential reinforcer probabilities. Overall, White and Cooney found that the reinforcer ratio arranged at one delay did not affect performance at the other delay with a different reinforcer ratio, and concluded that “performance at one retention interval is independent of factors that influence performance at another” (p. 55).

The two delays that White and Cooney (1996) used were highly discriminable (McCarthy & Davison, 1980a), but if their delays had been 3.8 and 4 s, they probably would have found interdependence of reinforcer-ratio effects. To address the generalization of the effects of reinforcers across different delays, as well as across stimuli and responses, we introduce the notion of the discriminability between stimulus–choice delays, dd. The model for mixed stimulus–choice delays is effected by degrading dsb initially by dt, as in the single-delay model above, and then degrading each of the resulting reinforcer values by the appropriate dd values. The development is done here for just two mixed delays, but may logically be applied to any number. With the loss of only a little generality (the effects of dt), we shall present the development for a second-order conditional discrimination in which two stimuli such as continuous versus flashing house lights, on a separate dimension from the first-order stimuli, signal different sets of first-order stimulus–behavior–reinforcer contingencies. These stimuli are designated SA and SB, and represent the two delays in White and Cooney’s procedure.

The reinforcer matrix we shall use is shown in Figure 17. Note that the B1 and B2 contingencies are reversed between the second-order stimuli, SA and SB. The discriminability of the second-order stimuli is d2, and we assume that the response–reinforcer discriminability (dbr) and the first-order stimulus–response discriminability (dsb) are the same under both second-order conditional stimuli.

Figure 18 shows the predicted effects of varying the reinforcer ratio (Ra11/Ra22) in the presence of SA (analogous to the 0.1-s delay in the example from White and Cooney, 1996) on responding with respect to S1 and S2 (log D) and with respect to B1 and B2 (log B) in the presence of SB (analogous to the 4-s delay) which, in our example, offers equal Rb12 and Rb21 reinforcer rates. The values of dsb and dbr are both 10, and the value of d2 is 2 (upper panel) and 10 (lower panel). As would be expected, when there is little discrimination between SA and SB, varying the reinforcer ratio in SA has a strong effect on responding in the presence of SB. The value of log D in SB is negative because of the reversed contingencies of reinforcement. These values are affected by the discriminability between SA and SB, and are smaller in an absolute sense when d2 is smaller and are nonlinear with respect to the SA reinforcer ratio.

This analysis shows that the mixed-delay procedure will affect measurements of stimulus discriminability more than the fixed-delay procedure will, and will provide lower estimates of discrimination (log d) than the single-delay procedure even when reinforcer ratios are the same at all delays. The exact

Fig. 17. Upper panel: the matrix of events in a second-order conditional discrimination in which B1 is reinforced
in the presence of S1 and B2 in the presence of S2, both when Stimulus A is presented. The contingencies of rein-
forcement are reversed when Stimulus B is presented. Lower panel: The effective reinforcer matrix for the events
shown in the upper panel.
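The second-order arrangement in Figure 17 can be explored with a three-dimensional version of the same generalization idea. The sketch below is an assumed reading rather than the authors' equations: each cell's reinforcers are taken to generalize to every other cell, divided by d2 when the second-order stimulus differs, by dsb when the first-order stimulus differs, and by dbr when the response differs. The function names and reinforcer counts are hypothetical; dsb = dbr = 10 and d2 = 2 or 10 follow the values quoted for Figure 18.

```python
import math
from itertools import product

def effective_3d(r, d2, d_sb, d_br):
    """r[a][i][j]: reinforcers under second-order stimulus a, first-order
    stimulus i, response j. Returns effective reinforcers in the same layout."""
    cells = list(product(range(2), repeat=3))
    e = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for a, i, j in cells:
        for b, k, l in cells:
            weight = ((d2 if a != b else 1.0)
                      * (d_sb if i != k else 1.0)
                      * (d_br if j != l else 1.0))
            e[a][i][j] += r[b][k][l] / weight
    return e

def measures_in_sb(ra11, ra22, d2, d_sb=10.0, d_br=10.0, rb=50.0):
    """log D and log B for responding in the presence of SB."""
    r = [[[ra11, 0.0], [0.0, ra22]],   # SA: B1 reinforced in S1, B2 in S2
         [[0.0, rb], [rb, 0.0]]]       # SB: reversed contingencies, equal rates
    sb = effective_3d(r, d2, d_sb, d_br)[1]
    ratio_s1 = sb[0][0] / sb[0][1]     # B1/B2 given S1, in SB
    ratio_s2 = sb[1][0] / sb[1][1]     # B1/B2 given S2, in SB
    log_d = 0.5 * math.log10(ratio_s1 / ratio_s2)
    log_b = 0.5 * math.log10(ratio_s1 * ratio_s2)
    return log_d, log_b

if __name__ == "__main__":
    for d2 in (2.0, 10.0):
        for ra11, ra22 in ((10.0, 90.0), (50.0, 50.0), (90.0, 10.0)):
            log_d, log_b = measures_in_sb(ra11, ra22, d2)
            print(d2, ra11, ra22, round(log_d, 3), round(log_b, 3))
```

With these assumptions, log D in SB comes out negative and is smaller in absolute value when d2 is smaller, and the SA reinforcer ratio moves log B in SB more when d2 is 2 than when it is 10, matching the qualitative description of Figure 18.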

pattern of results obtained will depend critically on the distribution and spacing of stimulus–choice delays. More generally, it shows that the discriminability of second-order cues correlated with different outcome matrices in conditional discriminations, as in Hobson (1978), will affect the estimation of model parameters.

Complex Stimulus Discrimination

An experiment reported by White et al. (1984) involved a free-operant conditional discrimination in which pecking the right key produced food according to one VI schedule when a vertical line was projected on both keys (S1) and pecking the left key produced food according to another VI schedule when the line was tilted 15, 30, 45, 60, or 75° (S2–S6), presented irregularly within a single experimental condition. This paradigm is like that discussed in the section on more than two stimuli and two responses, above, but with five stimuli assigned to one of the responses. Left-key pecks given the vertical line, and right-key pecks given any other orientation, were neither reinforced nor punished. The expanded matrix defining the operants for this case is shown in Figure 19.

When the VI schedules were varied, the relation between the ratio of responses at each S2–S6 orientation and the associated reinforcer ratio depended on the difference in orientation of the stimuli signaling the pairs under investigation. To characterize the results in terms of the generalized matching law: a (sensitivity to reinforcement) decreased as stimulus disparity increased, as predicted by our model. In a related study, White (1986) varied stimulus differences between conditions and found the same result: As predicted, sensitivity to reinforcer ratios in a free-operant conditional discrimination was inversely related to stimulus disparity across successive conditions.

This finding is not limited to free-operant procedures. Davison and McCarthy (1987) trained pigeons to peck left given a fixed duration of center-key illumination (either 5 s or 20 s in different parts of their study) and to peck right given any of 12 other durations ranging from 2.5 s to 57.5 s in 5-s steps. Again, the paradigm is like that examined in

the section on more than two stimuli and two responses, above, but with 12 different stimuli assigned to one of the responses. When they varied the reinforcer ratio, they found that sensitivity to reinforcement was inversely related to the discriminability of the duration examined, relative to the fixed duration.

Davison (1989) showed that the present model gave a good account of Davison and McCarthy’s (1987) temporal discrimination data, providing 12 rational dsb estimates and an appropriate estimate of dbr. More important, after best-fit discriminability estimates were obtained, the model predicted almost exactly the relation between sensitivity to reinforcement and discriminability that was found empirically. In addition, Davison (1991b) reported the analysis of a set of data on color discrimination in pigeons, in which they were required to peck the left or right side keys according to which of eight color stimuli (559 to 594 nm in steps of 5 nm) had been presented. Both the reinforcer ratio for correct responses and the stimuli signaling reinforcers for pecking left or pecking right were varied. This analysis, with seven dsb parameters, provided an excellent description of the data, and furthermore the dsb parameters were related to wavelength in the same way as has been found for generalization, discrimination, and color-naming functions for the pigeon over this wavelength range (Shepard, 1965; Wright, 1974; Wright & Cumming, 1971). Thus, our model provides convergent measurement of the distances between stimuli in psychometric space.

Fig. 18. The predicted effect of varying the reinforcer ratio in the presence of second-order Conditional Stimulus A on stimulus differentiation (log D, Equation 5) and response differentiation (log B, Equation 3) in the presence of second-order Conditional Stimulus B. As noted in the text, SA is analogous to a short delay correlated with varying reinforcer ratios, and SB is analogous to a long delay correlated with a constant reinforcer ratio. If the delays are confusable (d2 small, upper panel), reinforcer ratios in SA affect performance in SB, but there is little effect if the delays are highly discriminable (d2 large, lower panel).

Fig. 19. The matrix of events in the experiment reported by White, Pipe, and McLean (1984) showing how the cells of the matrix are subscripted. Response B1 was reinforced in the presence of a 0° slant, and B2 was reinforced in the presence of all other orientations.

Multiple Stimuli and Multiple Correct Responses

We next consider the classical recognition task of psychophysics, in which subjects are presented with one of N different stimuli in random order and asked to make one of N different responses to indicate which stimulus was presented. Interestingly, humans have trouble identifying more than seven (±2) different tone intensities even when the tones are highly discriminable in the sense that few, if any, errors occur when fewer than seven tones are presented (e.g., Pollack, 1952). The recognition experiment has been repeated by Chase (1983) with pigeons as subjects in a chamber equipped with nine keys. The stimuli were various luminances displayed on a single rectangular key above the nine response keys, presented in discrete trials in irregular order; a single peck at the key designated as correct for each luminance produced 2-s access to food. The paradigm is like that examined in the section on N responses and M stimuli, above, with the numbers of stimuli and responses varied across conditions. In one series of conditions, Chase compared performances involving luminance ranges of 0.8 log units, 1.8 log units, 3.0 log units, and 3.8 log units, with the number of stimulus–response pairs varying from three to nine. Average percentage correct increased with the range over which the stimuli were distributed and decreased as a function of number of stimulus–response pairs defined within that range, as shown in the left panel of Figure 20. We modeled Chase’s data by assuming that a 0.4 log-unit difference between stimuli corresponded to a dsb value of 10 (thus, a set of nine stimuli would span a range of 80 expressed as dsb). Because Chase’s data were similar for the two largest ranges, 3.0 and 3.8 log units, we treated both as 3.2 log units, or 80 units on the dsb scale. We also assumed a value of 10 for dbr for the peck–food relations between adjacent keys. The overall percentage correct predicted by our model for Chase’s conditions is shown in the right panel of Figure 20. Overall, the agreement between Chase’s average data and the predictions of our model is respectable. Our model also predicts that errors will be most frequent in the middle of the stimulus–response matrix, as found by Chase with pigeons and by Pollack (1952) with humans.

Fig. 20. Percentage correct responses for various ranges of log luminance levels (left panel) reported by Chase (1983) (represented by separate plots with distinct data points in the left panel) and model predictions assuming that a 0.4 log luminance difference corresponded to a dsb value of 10 (resulting in various assumed values of dsb, represented by separate plots with distinct data points in the right panel).

Recognition tasks usually arrange ordinal stimulus–response mapping: The least intense stimulus is identified with Response 1, the next with Response 2, and so on. With pigeons, Chase (1983) compared ordered and unordered stimulus–response identification and found that percentage correct was substantially lower with unordered identification. Our model does not predict a large decrement with unordered identification, but it does predict one aspect of Chase’s results: secondary modes in response probability. Chase suggested that “Secondary peaks can be accounted for if it is assumed that discrimination among key positions is imperfect” (1983, p. 45). This, of course, is exactly what our parameter dbr quantifies, and our model predicts secondary peaks at stimuli for which the correct key happened to be adjacent to the key being considered, as Chase reported. For example, when Response 9 was defined as correct for Stimulus 8, a secondary peak in the probability of pecking Key 9 appeared at Stimulus 3, which was correct for Response 8 in the unordered identification set we explored.

Godfrey (1997) reported an experiment that compared performance in 2 × 2, 3 × 3, and 4 × 4 conditional discrimination matrices. The conditional stimuli were intensities of yellow light displayed on a central key, and the choice responses consisted of pecking one of six red keys set around the central key (thus allowing the potential for 5 × 5 and 6 × 6 matrices, but these were not investigated). In one condition, two stimuli designated S2 and S5 were presented successively with reinforcement arranged for pecks at Keys B2 and B5. In a second condition, a stimulus of intermediate intensity (S4) was also presented, with reinforcement arranged for pecks at Key B4. In a third condition, another stimulus of intermediate intensity (S3) was included, with reinforcement arranged for pecks at Key B3. Thus, it was possible to compare the discriminability of a given pair of stimulus–behavior relations (e.g., dsb25) across conditions with different numbers of discriminated operants. In addition, it was possible to determine whether log dsb values combined additively, as we have assumed. The average parameter estimates and the additivity predictions are presented in Figure 21. Across conditions, there were no systematic changes in any dsb parameter estimates when further stimulus–response–reinforcement contingencies were added or subtracted. For example, log dsb for S2 versus S5 when studied alone (upper panel) is not substantially affected by adding one (middle panel) or two (lower panel) intervening stimuli. The model thus works well with complex stimulus–response–reinforcer relations. Further, as Figure 21 also shows, the additivity requirement (that, e.g., log dsb23 + log dsb34 = log dsb24) seems to be approximately true. However, in all four cases that could be investigated with these data, the predicted value of log dsb was somewhat less than the obtained value. If such findings are replicated, they could indicate that psychometric space is nonlinear in log terms (and may be linear in other terms). Clearly, more research is needed in this area.

One surprising aspect of the results was that the actual behavior ratios between pairs of three-term contingencies were unaffected by the addition of further contingencies, a result that appears to be incompatible with the present model. Consider a 2 × 2 matrix of the form S2:(B2 → R2) and S5:(B5 → R5). If a further contingency S4:(B4 → R4) is added, in which S4 is closer physically to S5 than to S2, we would expect that S5 performance would gain more reinforcer value than S2 performance, leading to a decrement in the S2/S5 behavior ratios. However, given reasonable discriminability between S4 and S5, the effect would be rather small, and may not be discernible in the general error variance. This effect also requires further investigation. Despite such uncertainties, it is clear that our model provides an effective account of stimulus control and choice in a wide variety of conditional discrimination paradigms.

MULTIPLE AND CONCURRENT SCHEDULES

We began this article by discussing some qualitative similarities that result when the differences among stimuli, responses, and reinforcers are varied. Our argument began with conventional multiple and concurrent schedules, but the remainder of our development dealt with experimental paradigms in which the reinforcement contingencies for two or more concurrently available choice responses depend on the values of two or more successively presented stimuli. In effect, all such paradigms can be characterized as multiple concurrent schedules. We now apply the general model that successfully accounted for a large array of findings in these paradigms to the presumably simpler cases: multiple and concurrent schedules.

Multiple Schedules

As we noted at the outset, all measured behavior occurs in a setting that includes unmeasured, extraneous behavior and reinforcers. A 2 × 2 matrix that incorporates extraneous behavior (Be) and reinforcers (Re)

for a standard single-key multiple schedule with distinctive stimuli S1 and S2 to signal the components would look like Figure 22, and the resulting equations, including discriminability parameters, would be, for S1,

$$\frac{B_{11}}{B_{1e}} = \frac{R_{11} + \dfrac{R_{21}}{d_{sb}} + \dfrac{R_{1e}}{d_{br}} + \dfrac{R_{2e}}{d_{sb}\,d_{br}}}{R_{1e} + \dfrac{R_{2e}}{d_{sb}} + \dfrac{R_{11}}{d_{br}} + \dfrac{R_{21}}{d_{sb}\,d_{br}}} \tag{14a}$$

and, for S2,

$$\frac{B_{21}}{B_{2e}} = \frac{R_{21} + \dfrac{R_{11}}{d_{sb}} + \dfrac{R_{2e}}{d_{br}} + \dfrac{R_{1e}}{d_{sb}\,d_{br}}}{R_{2e} + \dfrac{R_{1e}}{d_{sb}} + \dfrac{R_{21}}{d_{br}} + \dfrac{R_{11}}{d_{sb}\,d_{br}}}. \tag{14b}$$

Dividing these equations gives us an expression for the relation between the response ratio, B1/B2, and the reinforcer rate in each component, in Equation 15, below.

$$\frac{B_{11}}{B_{21}} = \frac{B_{1e}\left(R_{11} + \dfrac{R_{21}}{d_{sb}} + \dfrac{R_{1e}}{d_{br}} + \dfrac{R_{2e}}{d_{sb}\,d_{br}}\right)\left(R_{2e} + \dfrac{R_{1e}}{d_{sb}} + \dfrac{R_{21}}{d_{br}} + \dfrac{R_{11}}{d_{sb}\,d_{br}}\right)}{B_{2e}\left(R_{1e} + \dfrac{R_{2e}}{d_{sb}} + \dfrac{R_{11}}{d_{br}} + \dfrac{R_{21}}{d_{sb}\,d_{br}}\right)\left(R_{21} + \dfrac{R_{11}}{d_{sb}} + \dfrac{R_{2e}}{d_{br}} + \dfrac{R_{1e}}{d_{sb}\,d_{br}}\right)}. \tag{15}$$

Some simplifying assumptions are needed to reduce the number of free parameters in Equation 15. It may seem reasonable to assume that the discriminability of response–reinforcer contingencies such as (B1: key peck → R1: food) and (Be: scratch → Re: relief of itch) would be essentially infinite. However, key pecking for magazine grain and pecking at the chamber floor for spilled grain might be moderately confusable, but the frequencies of these events are unknown, and we do not wish to adopt the probably incorrect assumption that Re is the same in both components. If variable-time reinforcement is explicitly used to simulate Re, the extraneous reinforcers are countable, but dbr may range from one to near infinity, depending on the subject’s response rate, interresponse-time distribution, and reinforcement history. Related difficulties arise with Be: Unless we adopt Herrnstein’s (1970) restrictive (and probably incorrect) assumptions that B + Be = k, and that k is the same for both schedule components, the number of free parameters in Equations 1 and 2 is excessive.

Fig. 21. Left: data from Godfrey (1997) comparing obtained dsb in 2 × 2, 3 × 3, and 4 × 4 conditional discrimination tasks using the same stimuli. Right: predicted stimulus–behavior discriminability values for some pairwise combinations obtained by adding log dsb for component pairs (e.g., using log dsb23 + log dsb34 to predict log dsb24). The top panel shows data from Stimuli 2 and 5 in a 2 × 2 task, the center panel shows data from Stimuli 2, 4, and 5 in a 3 × 3 task, and the bottom panel shows data from Stimuli 2, 3, 4, and 5 in a 4 × 4 task.

It is possible to model multiple VI extinction schedules, which are common in the study of stimulus control, by assuming that dbr is infinite. Then, Equation 15 simplifies to

$$\frac{B_{1}}{B_{2}} = d_{sb} \cdot \frac{B_{1e}}{B_{2e}} \cdot \frac{R_{2e} + R_{1e}/d_{sb}}{R_{1e} + R_{2e}/d_{sb}}. \tag{16}$$

This expression implies that if the ratios of extraneous responses and their reinforcers in S1 and S2 remain constant, the ratio of measured responses depends on dsb but is independent of reinforcer rate in S1, as found by Cumming (1955).

Finally, if Re is simulated by reinforcement for a specified alternative response, the matrix for the resulting multiple concurrent schedule is just like that for reinforcement for “errors,” for which performance is readily modeled (Equations 13a and 13b). Research on the simulation of extraneous reinforcers in multiple schedules has been reported by Davison (1993), Lobb and Davison (1977), McLean and White (1983), and McLean (1992, 1995). The findings in general suggest that simulated extraneous reinforcers are reallocated between components, especially when they are arranged on ratio schedules. The present model cannot directly, without further assumptions, predict the degree of reallocation of Re between multiple-schedule components; it simply allows for it (but see Davison, 1993). One result is clearly predicted from Equations 13a and 13b: “Successive independence” (the independence of log response ratios in one component from the conditions of reinforcement in the other component; McLean & White, 1983) will occur when dsb12 is asymptotically large (as with, e.g., red vs. green signaled components; see Charman & Davison, 1983). However, successive independence will not hold when dsb12 is small because the effects of reinforcers in one component will generalize and contribute to effective reinforcement in the other component and vice versa, as explained above for 2 × 2 conditional discriminations.
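As an illustration of how the multiple-schedule equations behave, the following Python sketch evaluates Equation 14a and Equation 15 numerically. It is a minimal sketch under stated assumptions, not a fitted model: the function names, the default of 1 for the extraneous response ratio B1e/B2e, and all reinforcer rates in the usage lines are hypothetical values chosen only to show the two limiting cases discussed above (the VI extinction case of Equation 16, and successive independence when dsb is very large).

```python
import math


def effective(r_own, r_other_stim, r_other_resp, r_other_both, d_sb, d_br):
    """Effective reinforcers for one cell of the 2 x 2 matrix.

    Reinforcers from the other stimulus are divided by d_sb, those from the
    other response by d_br, and those differing on both by d_sb * d_br.
    """
    return (r_own
            + r_other_stim / d_sb
            + r_other_resp / d_br
            + r_other_both / (d_sb * d_br))


def component_ratio_s1(R11, R21, R1e, R2e, d_sb, d_br):
    """Equation 14a: measured-to-extraneous response ratio B11/B1e in S1."""
    return (effective(R11, R21, R1e, R2e, d_sb, d_br)
            / effective(R1e, R2e, R11, R21, d_sb, d_br))


def component_ratio_s2(R11, R21, R1e, R2e, d_sb, d_br):
    """Equation 14b: measured-to-extraneous response ratio B21/B2e in S2."""
    return (effective(R21, R11, R2e, R1e, d_sb, d_br)
            / effective(R2e, R1e, R21, R11, d_sb, d_br))


def multiple_schedule_ratio(R11, R21, R1e, R2e, d_sb, d_br, b1e_over_b2e=1.0):
    """Equation 15: B11/B21, obtained by dividing Equation 14a by 14b.

    b1e_over_b2e is the ratio of extraneous responding in the two components;
    setting it to 1 is an illustrative assumption, not a claim from the text.
    """
    return (b1e_over_b2e
            * component_ratio_s1(R11, R21, R1e, R2e, d_sb, d_br)
            / component_ratio_s2(R11, R21, R1e, R2e, d_sb, d_br))


if __name__ == "__main__":
    # Multiple VI extinction (R21 = 0) with d_br infinite: the Equation 16 case.
    # The ratio should not change when the S1 reinforcer rate R11 changes.
    for R11 in (30.0, 60.0, 120.0):
        print(R11, multiple_schedule_ratio(R11, 0.0, 10.0, 10.0, 20.0, math.inf))

    # Successive independence: with d_sb infinite, S1 performance ignores R21;
    # with a small d_sb, reinforcers in S2 generalize to S1.
    for d_sb in (math.inf, 2.0):
        print(d_sb,
              component_ratio_s1(60.0, 10.0, 5.0, 5.0, d_sb, 10.0),
              component_ratio_s1(60.0, 40.0, 5.0, 5.0, d_sb, 10.0))
```

Passing `math.inf` for a discriminability parameter makes the corresponding generalization terms vanish, which is how the limiting cases are obtained here without writing separate simplified formulas.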
Concurrent Schedules

Conventional two-key concurrent schedules are arranged in the presence of a single stimulus condition, so S1 and S2 are equivalent and its matrix (Figure 23) would collapse into a single row (which simplifies subscripting). The resulting equation for the B1/B2 ratio is

$$\frac{B_{1}}{B_{2}} = \frac{R_{1} + \dfrac{R_{2}}{d_{br12}} + \dfrac{R_{e}}{d_{br1e}}}{R_{2} + \dfrac{R_{1}}{d_{br12}} + \dfrac{R_{e}}{d_{br2e}}}. \tag{17}$$

If we assume that Be → Re is infinitely discriminable from B1 → R1 and B2 → R2, the equation simplifies to

$$\frac{B_{1}}{B_{2}} = \frac{R_{1} + \dfrac{R_{2}}{d_{br12}}}{R_{2} + \dfrac{R_{1}}{d_{br12}}}, \tag{18}$$

which, as noted above, provides a good account of concurrent VI VI schedule performance.

Fig. 22. A detection matrix for multiple schedules incorporating extraneous behavior and reinforcers.

Fig. 23. A detection matrix for concurrent schedules incorporating extraneous behavior and reinforcers.

The predictions of this approach are that the ratio of responses B1 and B2 will be unaffected by the value of Re if, and only if, dbr1e and dbr2e are infinite (no confusion). Such a situation is shown in the right graphs in Figure 24. Interpreting this situation as a three-alternative concurrent schedule, the constant-ratio rule (i.e., constancy of choice ratios between a pair of alternatives when a third is added or removed; Luce, 1959) will be correct under these conditions, that is, only when both differences between defined and extraneous response–reinforcer contingencies are highly discriminable. If they are less than highly discriminable, then because the effects of Re are additive, increasing extraneous reinforcer rates (or decreasing arranged R1 and R2 reinforcer rates) will decrease response differentiation between B1 and B2 (Figure 24, left and center panels). In generalized-matching terms, sensitivity to reinforcement is predicted to fall as arranged R1 and R2 reinforcer rates decrease, assuming a constant third-alternative reinforcer rate. If extraneous reinforcers are assumed to exist

Fig. 24. The predicted relation between log response ratios and log reinforcer ratios in concurrent VI VI sched-
ules. The parameters of each graph are, respectively, the value of the extraneous reinforcer rate (or the third-
alternative reinforcer rate) and the value of the response–reinforcer discriminability between the first two alternatives
and the third alternative. The data were simulated by distributing a total of 40 reinforcers between the first two
alternatives.

in their experiments, this result was reported by Alsop and Elliffe (1988) and by Elliffe and Alsop (1996). The present model predicts that overall reinforcer rate does not affect response ratios under two specific conditions: (a) if the extraneous reinforcer rate has a value of zero or (b) if there is perfect discrimination between the pair of response–reinforcer contingencies under consideration and any substantive extraneous reinforcer rate. The same conclusions follow for explicitly defined and reinforced third-alternative responses.

We have assumed in this discussion that if the two alternatives being examined are Alternatives 1 and 2, the response–reinforcer discriminability between Alternatives 1 and 3 and between Alternatives 2 and 3 are equal. However, the values of these discriminabilities may not be equal if B1 and B2 are topographically different.

As Davison and Jenkins (1985) pointed out, and as we illustrate in Figure 24, Equation 18 implies that the relation between the log response ratio and the log reinforcer ratio is nonlinear whenever dbr is less than infinity. The nonlinearity should become more apparent as reinforcer ratios become more extreme. Using a switching-key procedure with only moderately discriminable keylight luminances to signal the alternatives, Davison and Jones (1995) arranged reinforcer ratios that were as extreme as 160:1. They reported statistically significant nonlinearity between log response and reinforcer ratios in the direction predicted by our model.

A second implication of Equation 18 is that, if dbr12 is not infinite, responding will be maintained in the extinction component of concurrent VI extinction schedules, as has been reported by several researchers (e.g., Catania & Cutts, 1963; Davison & Hunter, 1976). Moreover, when R2 is zero, Equation 18 simplifies to B1/B2 = dbr12 (equivalently, B2/B1 = 1/dbr12); that is, the response ratio is constant and independent of R1. Davison and Jones (1998) reported a switching-key experiment supporting this prediction. Subsequent research in Davison’s laboratory, using extended training under each experimental condition, has provided further support, and it would be interesting to explore the effects of systematically varying the difference between the stimuli that define the alternatives. The generalized matching law, of course, cannot predict anything other than the absence of responding on the extinction alternative.

Choice-Controlling Variables

It is well known that many other variables in addition to reinforcer rate, such as reinforcer magnitude, delay, and quality, determine choice allocation in concurrent schedules (see Davison & McCarthy, 1988, for review). Such variables can be incorporated into our model, but not without raising some theoretical problems. When some variable such as reinforcer magnitude is arranged differentially for two or more choices, both the values of the reinforcers and the discriminability of the response–reinforcer relations are altered. This issue has not arisen previously because we have modeled only cases in which the same reinforcer was arranged for all discriminated operants, and only reinforcer frequency varied. We now explore several ways of addressing additional reinforcer variables, with reinforcer magnitude (M) as the relevant variable, and neglecting (for this purpose) extraneous behavior and reinforcers.

Research has suggested the following generalizations that should be accommodated by the model without adding outrageous numbers of parameters:

Result 1. When both reinforcer magnitudes are varied, log response ratios are a monotonic increasing, probably linear, function of log magnitude ratio. Linearity cannot be strongly asserted because researchers have not varied reinforcer magnitudes over a wide range of values. The most extensive data come from Schneider (1973), who used four magnitude combinations and found no obvious deviations from linearity. It is possible that Result 1a, below, is more general.

Result 1a. If one magnitude is held constant while the other varies, log response ratios are a monotonic increasing but nonlinear function of log magnitude ratios (Davison & Hogsden, 1984).

Result 2. When reinforcer magnitudes are different but constant for two alternatives and the ratio of reinforcer rates is varied, log response ratios are a biased linear function of log reinforcer-rate ratios. Again, the data supporting this rather widely accepted generalization are sparse, as pointed out by Davison and McCarthy (1988).

Result 3. When reinforcer magnitudes are different but constant for two alternatives and reinforcer rates are kept equal while their absolute value is varied, response ratios are a decreasing function of absolute reinforcer rate (Davison, 1988).

Result 3a. However, when reinforcer magnitudes are the same for two alternatives, sensitivity to variations in reinforcer rate is an increasing function of absolute reinforcer rate (Alsop & Elliffe, 1988; Elliffe & Alsop, 1996; Logue & Chavarro, 1987).

Result 4. When reinforcer magnitudes are kept in a constant ratio while their absolute value increases, response ratios decrease (Logue & Chavarro, 1987).

We consider several ways of dealing with this complex pattern of results. First, consider simply multiplying each reinforcer term in Equation 18 by its corresponding magnitude:

$$\frac{B_{1}}{B_{2}} = \frac{R_{1}M_{1} + \dfrac{R_{2}M_{2}}{d_{br}}}{R_{2}M_{2} + \dfrac{R_{1}M_{1}}{d_{br}}}. \tag{19}$$

Unfortunately, the apparent simplicity of this approach is undermined by the fact that the value of dbr will depend on the difference between M1 and M2, as in the differential outcome effect. Any positive ordinal relation predicts an inverse S-shaped curve for Result 1 and fails to predict the constancy of sensitivity in Result 2. Equation 19 can account for Result 3a by allowing for extraneous reinforcers, as noted above, but this approach predicts the reverse of Result 3. It is possible to account for Results 1a and 4 by assuming an appropriate function for dbr in relation to the absolute and relative values of M1 and M2, but this would not be satisfying unless the function was based on independent evaluation of the discriminability of different reinforcer magnitudes.

Second, consider writing separate, multiplicative terms for reinforcer frequency and amount:

$$\frac{B_{11}}{B_{12}} = c \cdot \frac{R_{11} + \dfrac{R_{12}}{d_{br12}}}{R_{12} + \dfrac{R_{11}}{d_{br12}}} \cdot \frac{M_{11} + \dfrac{M_{12}}{d_{bm12}}}{M_{12} + \dfrac{M_{11}}{d_{bm12}}}, \tag{20}$$

where dbm is the discriminability of the relation between responses and the values of the reinforcer magnitudes M1 and M2, and dbr is interpreted as the discriminability of the relation between responses and the frequency of reinforcers. In this expression, dbm will be some function of the difference or the ratio of the reinforcer magnitudes. Equation 20 will behave much like the concatenated generalized matching law for reinforcer frequency and magnitude, in that a given pair of unequal magnitudes M1 and M2 will establish a constant bias when reinforcer frequency is varied (Result 2). However, in all other respects, Equation 20 behaves much like Equation 19.

A second problem with both of the foregoing equations is that the value of a reinforcer (i.e., its direct and generalized strengthening effects) may not be linearly related to its physical magnitude (as specified, say, by duration of food access; e.g., Epstein, 1981). This leaves us in the uncomfortable position of having to conjecture both the value of each reinforcer as a function of M and the discriminability of the response–reinforcer relations as a function of M1 and M2. Exactly the same problems arise when other reinforcer parameters, such as delay or quality, are varied. If our model is to be extended to choice-controlling variables other than frequency of reinforcement, a substantial research effort will be required to identify independent functions for the discriminability and value of the consequences of choice.

Finally, in all fairness, it should be pointed out that the problems we have in determining an effective model for the interaction of reinforcer rates and magnitudes are just as much problems for the generalized matching law, unless, of course, one is willing to allow sensitivities to reinforcer rates and magnitudes to vary as a dependent variable in that description.

The Question of Overmatching

It is clear by inspection of Equations 17 and 18 that strict matching is the upper limit of sensitivity to reinforcement. However, even with this clear prediction, we must expect that the variance in estimating sensitivity to reinforcement in experimental situations will provide a distribution of values that may well

extend on occasion to sensitivities of greater than one, as reported by Baum (1979).

Overmatching, however, can also be produced experimentally in two, probably related, cases: First, overmatching occurs when concurrent VI VI performance is punished by contingent electric shock (de Villiers, 1980; Farley, 1980; Farley & Fantino, 1978). These researchers suggested that punishers obtained on a response alternative subtracted from the reinforcers obtained at that alternative. The value of an alternative, under their matching approach, was thus Ri − aPi, where P is the punishment frequency and a is a scaling parameter that equates one reinforcer to one punisher. Second, overmatching has been reported when subjects are required to work for a period of time to change over from one alternative to another (Baum, 1982; Davison, 1991a). A similar approach to that used for the quantification of punishment (a subtractive model) has proved effective here (Davison, 1991a), and recently Temple, Scown, and Foster (1995) have shown that Davison’s model predicts the effect of varying changeover delays in concurrent VI VI schedules. Naturally, then, in the present approach, overmatching will require a similar subtractive model.

It is relatively easy to add subtractive terms of the form −w (for a constant punisher caused by a changeover delay or a travel time) to the numerator and denominator of our concurrent-schedule equation (Equation 18) as suggested by Davison and Jenkins (1985). However, a further question arises: Should this factor be subtracted from the arranged reinforcer frequencies, or from the frequencies after generalization has occurred? We know of no data to guide us in this, but the logic of Davison’s (1991a) model suggests that the subtraction should be after generalization. Finally, in line with the logic of our approach and our treatment of reinforcer magnitude, an additional parameter might be necessary that captures the discriminability of the response–punishment contingency.

In summary, the concatenation of choice-affecting variables (e.g., reinforcer magnitude, reinforcer delay, and punishment) with reinforcer frequency to predict choice remains a challenge to this, and other, treatments of choice. The amount of research that has been reported in this area is considerable. However, some experimental results appear to be inconsistent with each other, and none of the experiments have been explicitly designed as tests of the present approach. Further detailed research on, for example, the interaction of reinforcer frequencies and magnitudes should be able to provide an unambiguous guide to the structure of the equation for concurrent-schedule behavior allocation.

DISCUSSION

We have presented a particular theoretical model of discriminated operant behavior that exemplifies a general perspective that we find valuable: unified treatment of the ways in which antecedent stimuli and reinforcers affect behavior, where stimulus and reinforcer effects are treated equivalently and interchangeably. This approach follows from qualitative similarities among the effects of experimental variations in stimulus–behavior relations and behavior–reinforcer relations, as specified by the experimenter’s definitions of these terms and the experimentally arranged contingencies linking them.

The process of translating our general approach into a particular model was guided by the more specific and quantitative approach of signal-detection theory. As initially put forward by Tanner, Swets, Green, and their associates in the 1950s and 1960s (Green & Swets, 1966), signal-detection theory specifies two independent model parameters, d′ and β, that correspond to two aspects of behavior: discrimination between successive stimulus presentations and bias toward one or the other choice response. Signal-detection research demonstrated that d′ was in fact independent of bias, with discrimination depending on the physical properties of the stimuli and bias depending on such nonstimulus variables as payoffs, costs, and instructions. That research did not, however, examine whether β was independent of discrimination. The demonstration of invariance of d′ across variations in these nonstimulus variables, and between procedures differing in how responses were defined and recorded (such as yes-no, forced choice, or ratings), permitted identification of d′ with stimulus discriminability. This was a major
contribution to psychophysics and a challenge to behavior theory. Could we do the same sort of thing across a yet wider range of procedures?

We have followed the detection-theory approach by defining two independent model parameters that reflect the discriminability of stimulus–behavior and behavior–reinforcer relations as specified for a particular discriminated operant relative to others arranged within the same experimental session. The model advanced here has the following general properties: First, stimulus–behavior and behavior–reinforcer relations are conceptualized similarly and their discriminabilities are quantified identically. Second, reinforcers are presumed to strengthen the particular experimentally defined response that produced them in the presence of the particular environmental stimulus that the experimenter has currently or recently presented. Third, the strengthening effect of the reinforcer generalizes to another discriminated operant to the extent that the stimulus–behavior relation and the behavior–reinforcer relation characterizing the second operant, taken separately, are similar to those of the reinforced operant. Fourth, similarity is expressed as the inverse of distance in a psychometric space with orthogonal axes corresponding to stimulus–behavior and behavior–reinforcer relations, within which all experimentally defined discriminated operants are located. The distance between any pair of operants is given by the city-block sum of distances on the stimulus–behavior and behavior–reinforcer axes.

Models and Model Domains

In this paper, we have applied a consistent conceptual model to a number of research areas. This model hinges on (a) the suggestion that the effects of stimuli and reinforcers on behavior are symmetrical, (b) that they act in a particular way in flat and orthogonal psychometric space, and (c) that generalization between points in psychometric space occurs according to a specific quantitative process. In many ways, only the first of these is critical to our thinking, and the model developments done here may have required a degree of specificity about the second and third points that is premature. Unfortunately, there are indefinitely many alternative psychometric spaces and quantitative generalization processes, and quantitative predictions derived from many of these are indiscriminable from those derived from many others, given the usual noise in the data. Over the course of our 20 years of informal collaboration, we have explored many of these. We are not firmly wedded to the specific model we have offered here, but naturally we have been guided towards this model by both theoretical considerations and fits to data.

What alternatives are there, in general, to the approach taken here? The first set of alternatives is in the nature of psychometric space. It seems to us exceedingly unlikely that the log d_sb − log d_br psychometric space is flat in the way we have assumed, as suggested in Figure 4. If it is not flat, then the distances between reinforcing events and the point at which they affect another response will be inaccurately measured in our approach. There are a number of different techniques of determining such spaces, but lacking sufficient data, we have made the simplest assumption.

Equally, as Davison (1991b) discussed, there are indefinitely many ways of specifying the effective distances in psychometric space. We have assumed city-block measurement (see Figure 4), in which effects are determined by the distance along the perimeter of the triangle in psychometric space. Straightforward alternatives to this distance measurement are the Euclidean (hypotenuse distance) measurement, and the supremum measurement (the effective distance is the largest of the d_sb or d_br distances). Davison suggested that the supremum could be better than the city-block measurement in predicting detection performance, but we have not followed his suggestion here because the gains were slight, and the equations were more difficult. Other measures remain to be investigated, and, of course, the appropriate measurement and the flatness or otherwise of the psychometric space interact considerably.

The specific quantitative process of generalization (we assumed a reciprocal process for d_sb and d_br in Equations 7a and 7b) could be modeled by almost any monotonically decreasing nonlinear or even linear function. Given the noise in the available data, it is not at all easy to choose among these, and our decision to use the reciprocal function is based on simplicity of equation form, together with the expectation of greater generalization decrements close to the reinforcing event than far from it.

Thus, taking all the above considerations into account, we can estimate the probability of our having lit upon the correct model as probably something rather less than 1 in 10^6. However, at a more general and conceptual level, we have, we hope, defined an approach to the problem of predicting how three-term contingencies in the context of other three-term contingencies will affect behavior.

What the Model Does Well

In procedures that arrange two stimuli and two responses, our model parameters exhibit independence and invariance when either stimulus–behavior or behavior–reinforcer relations are varied experimentally. In addition, our model correctly predicts the opposite effects of d_sb and d_br on sensitivity to variation in reinforcer ratios, and the effects of reinforcers for “errors.” The model extends naturally to procedures involving more than two stimuli or responses, and correctly predicts the effects of varying the numbers of discriminated operants and the differences among them. Moreover, the model parameters are well behaved in that they change monotonically with experimentally defined variables and roughly satisfy additivity within psychometric space. The model can account for many of the effects of delay of choice and delay of reinforcement on accuracy of discrimination (measured as log d) in delayed matching to sample and related procedures. And finally, it gives a rational account of undermatching on concurrent VI VI schedules. Thus, its effectiveness is not limited to a single paradigm.

What the Model Does Not Do Well

Despite its effectiveness across several different paradigms, the model encounters difficulties in some of them. In particular, the model predicts an inverse relation between the discriminability of stimulus–behavior relations and sensitivity to reinforcer ratios that is confirmed when stimulus–behavior discriminability is varied by changing the stimuli themselves. However, it is not always confirmed when stimulus–behavior discriminability is degraded by imposing delays between the conditional stimuli and the choice responses in delayed matching to sample. Although the available data are equivocal, our model's treatment of stimulus–choice delay may need to be reconsidered.

In addition, in conditional discrimination performance, the inverse-U relation between log D and the reinforcer ratio predicted by our model and shown in Figure 6 is not generally obtained, and the data of Whittaker (1977) do not support the predicted relations between log B and the reinforcer ratio.

A different sort of problem arises in the treatment of performance on concurrent VI VI schedules when variables other than reinforcer rate, such as reinforcer magnitude, are considered. First, there is some uncertainty about the best way to incorporate terms reflecting such variables in the model's basic equations; and second, the data now available are so complex as to defy effective modeling without ad hoc assumptions concerning application to each data set.

Finally, the application of our model to multiple VI VI schedules is complicated by the nature of the dependent variable. In the paradigms in which our model is generally successful, behavior is readily measured as a ratio of concurrently available responses. In standard multiple schedules, by contrast, the dependent variable is the rate of a single response. To accommodate multiple-schedule (and single-schedule) performance, some rational and empirically meaningful way to reexpress response rate will be required.

Things Not Modeled Here

We have simply ignored many variables that affect behavior in the paradigms we have considered. For example, in discrete-trial conditional discriminations, we ignore the interval between trials and its well-known effects. Stated more generally, we do not address the role of the context within which a given discriminated operant is defined, except for the generalized effects of other defined discriminated operants.

Another variable that we neglect for the present is whether the reinforcers for the two sorts of correct responses in a conditional discrimination differ in amount, delay, or quality. In the section on conditional discriminations, we cited the research of Peterson et al. (1980) showing that qualitatively different outcomes enhanced accuracy in symbolic matching to sample, especially when delays
were imposed between stimulus offset and choice, as an example of the effect of increased discriminability of the response–reinforcer relation. One way to model this differential outcome effect is to incorporate a parameter that characterizes reinforcer discriminability into our basic equations and allow it to increase with differences in reinforcer amount, delay, or quality. However, as described in the section on choice-controlling variables and summarized above, it is not clear how best to incorporate such changes in our basic model, and we leave the issue as a challenge for the future.

Another issue that we ignore for the purposes of this model is the role of the subject's history of reinforcement in conditions prior to the current condition, or stated otherwise, the length of the effective time window within which the direct and generalized effects of reinforcers accumulate. It seems likely that the length of the time window depends on the frequency with which experimental conditions are altered, and we simply assume sufficient exposure to ensure control by the conditions of reinforcement that are currently arranged, and no influence of prior conditions.

Relatedly, we do not attempt to model behavior during acquisition or transitions between experimental conditions despite the fact that our approach is readily translated into dynamic equations. This is clearly an important direction to pursue, but there are many different ways to treat dynamic processes, and we need to be sure that our model is as effective as possible for steady-state behavior before extending its approach to trial-by-trial or reinforcer-by-reinforcer changes.

Also relatedly, we do not treat the effects of reinforcement on resistance to change of discriminated operant behavior (e.g., Nevin, 1992). There are two reasons for postponing attempts to treat resistance to change within our general model. First, Nevin's analyses have been concerned almost entirely with changes in the rates of responding in multiple schedules, and response rates (as noted above) are not comfortably handled by our model without further assumptions. Second, Nevin has argued that resistance to change depends on the stimulus–reinforcer relation and not on the behavior–reinforcer relation. In our model, the discriminability of the stimulus–reinforcer relation is implied by the joint values of the stimulus–behavior and behavior–reinforcer discriminabilities, complicating direct application to Nevin's work.

Research and Applications with Humans

Finally, we consider some ways in which our model applies to human performance. First, of course, it models the general properties of human signal-detection performance; that is what inspired our approach in the first place (Nevin, 1969b). We should, however, note some specific aspects of detection performance that are not fully captured here. In the human psychophysics literature, the well-known ROC or isosensitivity curve is often reported to be linear in double-normal coordinates, as predicted by classical signal-detection theory (Green & Swets, 1966). Our model predicts an isosensitivity curve that is concave in such coordinates, with the degree of concavity being minimal at large values of d_br but increasing as d_br decreases. This is not a serious problem because d_br should be large when “yes” and “no” are well differentiated and are followed immediately by differential payoffs, penalties, or other explicit feedback. Thus, the isosensitivity curve should be nearly straight in double-normal coordinates, and the usual noise in the data precludes detection of slight curvilinearity. Second, many human isosensitivity curves are asymmetrical, with slopes less than 1.0, whereas our model predicts symmetry with an average slope of 1.0. Signal-detection theory accommodates asymmetry and nonunit slopes by adding a parameter that reflects the ratio of variances in the postulated distributions of sensory effects that account for detection performance (see Egan, 1975, for treatment of this and related approaches). We could do likewise by allowing d_sb to take different values for signal (S1) and noise (S2) trials, but for the present we refrain from this added degree of complexity (and parametric freedom).⁶

⁶ Many of these issues are discussed by Alsop (1998), whose approach is closely related to ours. In particular, he states that “Ultimately, signal-detection performance is the product of a variety of discriminations involving the sample stimuli, the response alternatives, and the feedback or outcomes for these choices” (p. 249). Our model attempts to formalize and quantify these determiners of detection performance.

More generally, our model may help to understand some sources of variation in human performance on conditional discriminations and concurrent schedules. For example, Baron and Surdy (1990) varied the magnitudes of payoffs and penalties in a continuous recognition task with young (age 18 to 26 years) and older (age 62 to 75 years) adults. The paradigm was a 2 × 2 conditional discrimination analogous to signal detection, in which the “signal” was prior exposure to an item. Recognition performance was generally less accurate for the older adults, although the difference decreased with extended practice. Interestingly, the older adults were also less sensitive to variations in payoff and penalty magnitudes. In terms of our model, the implication is that d_br was lower for the older subjects. Because measured discrimination depends on d_br, as shown in Figure 6, the age difference in recognition may have been overestimated. The general message here is that d_br must be equated across groups of subjects before differences in discrimination performances with the same stimuli can be compared.

It may be that the age difference in sensitivity to the magnitude of the payoffs and penalties reported by Baron and Surdy (1990) could be reduced by enhancing the distinctiveness of the response–reinforcer relations. Some suggestive data on this issue have been obtained by Stine-Morrow, Soederberg Miller, and Nevin (in press), who studied young and older adults in a lexical decision task. The paradigm was a 2 × 2 conditional discrimination employing spoken words and confusable nonwords as stimuli. Over several variations in the context of stimulus presentations, the older subjects were generally less accurate in distinguishing between words and nonwords when feedback and accumulated payoffs were given only at the end of a block of trials. However, when immediate feedback signaling money earned was provided for correct identifications of words and nonwords on each trial, discrimination increased markedly for the older adults and the age difference was eliminated. In terms of our model, the provision of immediate feedback may be construed as enhancing d_br.

Our model may also help to interpret performance deficits in patients with Korsakoff's syndrome. Oscar-Berman, Heyman, Bonner, and Ryder (1980) compared the performances of Korsakoff patients and normal subjects on concurrent VI VI schedules, and found that choice allocation was less sensitive to the reinforcer ratio for the Korsakoff patients. In fact, the Korsakoff group median estimated value of a in Equation 2 above was 0.03, suggesting a very low value of d_br. Korsakoff patients also exhibit deficits in various discrimination learning and delayed discrimination tasks. These stimulus-discrimination deficits may result, at least in part, from the patients' difficulty in distinguishing the response–reinforcer contingencies in these stimulus-control tasks. Interventions designed to enhance the discriminability of response–reinforcer contingencies might ameliorate some of the apparent deficits in stimulus discrimination. At the least, therapists must take the discriminability of both stimulus–behavior and behavior–reinforcer relations into account in the functional analysis and modification of behavior in applied settings. In this way, a quantitative approach to behavior therapy (Davison, 1992; McDowell, 1982) may be developed.

CONCLUSION

The basic approach behind the model we present here is based on ideas that have informed the experimental analysis of behavior for many years. For example, reinforcer rates have been treated as functionally equivalent to environmental stimuli in the generalization-decrement account of extinction: The richer the schedule during training, the greater the discriminability of nonreinforcement. We have tried to make the equivalence of the effects of stimuli and reinforcers explicit and to quantify the discriminability of their relations with behavior in concurrent discriminated operants. We have proposed some equations suggesting how their effects may combine and have applied them as broadly as we are able without ad hoc modification. The outcome, we believe, is favorable enough to encourage further efforts along these lines. We will continue to explore alternatives in the domain of integrative models suggested by our basic approach and apply them yet more broadly. It is our special hope that readers will do likewise.
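As a supplementary illustration of the three distance measures compared under Models and Model Domains, the following Python sketch computes the city-block, Euclidean, and supremum distances between two discriminated operants. It is only a sketch under stated assumptions: the function names and the arguments delta_sb and delta_br (the separations of the two operants along the stimulus–behavior and behavior–reinforcer axes of the psychometric space) are hypothetical labels introduced here, and the sketch does not reproduce the model's generalization equations.

```python
import math


def city_block(delta_sb: float, delta_br: float) -> float:
    """City-block distance: the sum of the separations on the two axes."""
    return abs(delta_sb) + abs(delta_br)


def euclidean(delta_sb: float, delta_br: float) -> float:
    """Euclidean (hypotenuse) distance between the two operants."""
    return math.hypot(delta_sb, delta_br)


def supremum(delta_sb: float, delta_br: float) -> float:
    """Supremum distance: the larger of the two axis separations."""
    return max(abs(delta_sb), abs(delta_br))


# Hypothetical separations chosen only to make the arithmetic easy to check.
delta_sb, delta_br = 3.0, 4.0
print(city_block(delta_sb, delta_br))  # 7.0
print(euclidean(delta_sb, delta_br))   # 5.0
print(supremum(delta_sb, delta_br))    # 4.0
```

For any pair of separations the supremum distance is no larger than the Euclidean distance, which in turn is no larger than the city-block distance, so the choice of measure changes how far apart two operants are taken to be and therefore how much generalization between them would be predicted.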
REFERENCES

Alsop, B. L. (1988). Detection and choice. Unpublished doctoral dissertation, University of Auckland, New Zealand.
Alsop, B. (1991). Behavioral models of signal detection and detection models of choice. In M. L. Commons, J. A. Nevin, & M. C. Davison (Eds.), Signal detection: Mechanisms, models, and applications (pp. 39–55). Hillsdale, NJ: Erlbaum.
Alsop, B. (1998). Receiver operating characteristics from nonhuman animals: Some implications and directions for research with humans. Psychonomic Bulletin & Review, 5, 239–252.
Alsop, B., & Davison, M. (1991). Effects of varying stimulus disparity and the reinforcer ratio in concurrent-schedule and signal-detection procedures. Journal of the Experimental Analysis of Behavior, 56, 67–80.
Alsop, B., & Elliffe, D. (1988). Concurrent-schedule performance: Effects of relative and overall reinforcer rate. Journal of the Experimental Analysis of Behavior, 49, 21–36.
Baron, A., & Surdy, T. M. (1990). Recognition memory in older adults: Adjustment to changing contingencies. Journal of the Experimental Analysis of Behavior, 54, 201–212.
Baum, W. M. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231–242.
Baum, W. M. (1979). Matching, undermatching, and overmatching in studies of choice. Journal of the Experimental Analysis of Behavior, 32, 269–281.
Baum, W. M. (1982). Choice, changeover, and travel. Journal of the Experimental Analysis of Behavior, 38, 35–49.
Catania, A. C., & Cutts, D. (1963). Experimental control of superstitious responding in humans. Journal of the Experimental Analysis of Behavior, 6, 203–208.
Charman, L., & Davison, M. (1983). Undermatching and stimulus discriminability in multiple schedules. Behaviour Analysis Letters, 3, 77–84.
Chase, S. (1983). Pigeons and the magical number seven. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analyses of behavior: Vol. 4. Discrimination processes (pp. 37–57). Cambridge, MA: Ballinger.
Cumming, W. W. (1955). Stimulus disparity and variable-interval reinforcement schedule as related to a behavioral measure of disparity. Unpublished doctoral dissertation, Columbia University.
Cumming, W. W., Berryman, R., & Nevin, J. A. (1963). Acquisition of delayed matching in the pigeon. Journal of the Experimental Analysis of Behavior, 6, 101–107.
Davison, M. (1988). Concurrent schedules: Interaction of reinforcer frequency and reinforcer duration. Journal of the Experimental Analysis of Behavior, 49, 339–349.
Davison, M. (1989). Effects of relative reinforcer frequency on complex color detection. Journal of the Experimental Analysis of Behavior, 51, 291–315.
Davison, M. (1991a). Choice, changeover, and travel: A quantitative model. Journal of the Experimental Analysis of Behavior, 55, 47–61.
Davison, M. C. (1991b). Stimulus discriminability, contingency discriminability, and complex stimulus control. In M. L. Commons, J. A. Nevin, & M. C. Davison (Eds.), Signal detection: Mechanisms, models, and applications (pp. 57–78). Hillsdale, NJ: Erlbaum.
Davison, M. (1992). Applied quantitative behavior analysis: A view from the laboratory. Journal of Behavioral Education, 2, 207–211.
Davison, M. (1993). On the dynamics of behavior allocation between simultaneously and successively available reinforcer sources. Behavioural Processes, 29, 49–65.
Davison, M., & Hogsden, I. (1984). Concurrent variable-interval schedule performance: Fixed versus mixed reinforcer durations. Journal of the Experimental Analysis of Behavior, 41, 169–182.
Davison, M. C., & Hunter, I. W. (1976). Performance on variable-interval schedules arranged singly and concurrently. Journal of the Experimental Analysis of Behavior, 25, 335–345.
Davison, M., & Jenkins, P. E. (1985). Stimulus discriminability, contingency discriminability, and schedule performance. Animal Learning & Behavior, 13, 77–84.
Davison, M., & Jones, B. M. (1995). A quantitative analysis of extreme choice. Journal of the Experimental Analysis of Behavior, 64, 147–162.
Davison, M., & Jones, B. M. (1998). Performance in concurrent variable-interval extinction schedules. Journal of the Experimental Analysis of Behavior, 69, 49–57.
Davison, M., & McCarthy, D. (1980). Reinforcement for errors in a signal-detection procedure. Journal of the Experimental Analysis of Behavior, 34, 35–47.
Davison, M., & McCarthy, D. (1987). The interaction of stimulus and reinforcer control in complex temporal discrimination. Journal of the Experimental Analysis of Behavior, 48, 97–116.
Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.
Davison, M. C., & Tustin, R. D. (1978). The relation between the generalized matching law and signal-detection theory. Journal of the Experimental Analysis of Behavior, 29, 331–336.
de Villiers, P. A. (1980). Toward a quantitative theory of punishment. Journal of the Experimental Analysis of Behavior, 33, 15–25.
Eckerman, D. A. (1970). Generalization and response mediation of a conditioned discrimination. Journal of the Experimental Analysis of Behavior, 13, 301–316.
Egan, J. P. (1975). Signal-detection theory and ROC analysis. New York: Academic Press.
Elliffe, D., & Alsop, B. (1996). Concurrent choice: Effects of overall reinforcer rate and the temporal distribution of reinforcers. Journal of the Experimental Analysis of Behavior, 65, 445–463.
Epstein, R. (1981). Amount consumed as a function of magazine-cycle duration. Behaviour Analysis Letters, 1, 63–66.
Farley, J. (1980). Reinforcement and punishment effects in concurrent schedules: A test of two models. Journal of the Experimental Analysis of Behavior, 33, 311–326.
Farley, J., & Fantino, E. (1978). The symmetrical law of effect and the matching relation in choice behavior. Journal of the Experimental Analysis of Behavior, 29, 37–60.
Fersen, L. von, Wynne, C. D. L., Delius, J. D., & Staddon, J. E. R. (1991). Transitive inference formation in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 17, 334–341.
Findley, J. D. (1958). Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior, 1, 123–144.
Godfrey, R. (1997). An investigation of behavioural models of detection. Unpublished doctoral dissertation, University of Auckland, New Zealand.
Godfrey, R., & Davison, M. (1998). Effects of varying sample- and choice-stimulus disparity on symbolic matching-to-sample performance. Journal of the Experimental Analysis of Behavior, 69, 311–326.
Green, D. M., & Swets, J. A. (1966). Signal-detection theory and psychophysics. New York: Wiley.
Guttman, N., & Kalish, H. I. (1956). Discriminability and stimulus generalization. Journal of Experimental Psychology, 51, 79–88.
Harnett, P., McCarthy, D., & Davison, M. (1984). Delayed signal detection, differential reinforcement, and short-term memory in the pigeon. Journal of the Experimental Analysis of Behavior, 42, 87–111.
Hartl, J. A., & Fantino, E. (1996). Choice as a function of reinforcement ratios in delayed matching to sample. Journal of the Experimental Analysis of Behavior, 66, 11–27.
Hautus, M. J. (1995). Corrections for extreme proportions and their biasing effects on estimated values of d′. Behavior Research Methods, Instrumentation, and Computers, 27, 46–51.
Heinemann, E. G., & Chase, S. (1975). Stimulus generalization. In W. K. Estes (Ed.), Handbook of learning and cognitive processes (pp. 305–349). Hillsdale, NJ: Erlbaum.
Herrnstein, R. J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267–272.
Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243–266.
Hobson, S. L. (1978). Discriminability of fixed-ratio schedules for pigeons: Effects of payoff values. Journal of the Experimental Analysis of Behavior, 30, 69–81.
Honig, W. K. (1962). Prediction of preference, transposition, and transposition-reversal from the generalization gradient. Journal of Experimental Psychology, 64, 239–248.
Jones, B. M., & White, K. G. (1992). Sample-stimulus discriminability and sensitivity to reinforcement in delayed matching to sample. Journal of the Experimental Analysis of Behavior, 58, 159–172.
Killeen, P. R. (1994). Mathematical principles of reinforcement. Behavioral and Brain Sciences, 17, 105–172.
Lobb, B., & Davison, M. C. (1977). Multiple and concurrent schedule performance: Independence from concurrent and successive schedule contexts. Journal of the Experimental Analysis of Behavior, 28, 27–39.
Logue, A. W., & Chavarro, A. (1987). Effect on choice of absolute and relative values of reinforcer delay, amount, and frequency. Journal of Experimental Psychology: Animal Behavior Processes, 13, 280–291.
Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York: Wiley.
Mackintosh, N. (1974). The psychology of animal learning. New York: Academic Press.
McCarthy, D., & Davison, M. (1979). Signal probability, reinforcement, and signal detection. Journal of the Experimental Analysis of Behavior, 32, 373–386.
McCarthy, D., & Davison, M. (1980a). Independence of sensitivity to relative reinforcement rate and discriminability in signal detection. Journal of the Experimental Analysis of Behavior, 34, 273–284.
McCarthy, D., & Davison, M. (1980b). On the discriminability of stimulus duration. Journal of the Experimental Analysis of Behavior, 33, 187–211.
McCarthy, D., & Davison, M. (1984). Isobias and alloiobias in animal psychophysics. Journal of Experimental Psychology: Animal Behavior Processes, 10, 390–409.
McCarthy, D., & Davison, M. (1986). Delayed reinforcement and delayed choice in symbolic matching to sample: Effects on stimulus discriminability. Journal of the Experimental Analysis of Behavior, 46, 293–303.
McCarthy, D. C., & Davison, M. (1991). The interaction between stimulus and reinforcer control on remembering. Journal of the Experimental Analysis of Behavior, 56, 51–66.
McCarthy, D. C., & Voss, P. (1995). Delayed matching-to-sample performance: Effects of relative reinforcer frequency and of signaled versus unsignaled reinforcer magnitudes. Journal of the Experimental Analysis of Behavior, 63, 33–51.
McDowell, J. J. (1982). The importance of Herrnstein's mathematical statement of the law of effect for behavior therapy. American Psychologist, 37, 771–779.
McLean, A. P. (1992). Contrast and reallocation of extraneous reinforcers between multiple-schedule components. Journal of the Experimental Analysis of Behavior, 58, 497–511.
McLean, A. P. (1995). Contrast and reallocation of extraneous reinforcers as a function of component duration and baseline rate of reinforcement. Journal of the Experimental Analysis of Behavior, 63, 203–224.
McLean, A. P., & White, K. G. (1983). Temporal constraint on choice: Sensitivity and bias in multiple schedules. Journal of the Experimental Analysis of Behavior, 39, 405–426.
Mentzer, T. L. (1966). Comparison of three methods for obtaining psychophysical thresholds from the pigeon. Journal of Comparative and Physiological Psychology, 61, 96–101.
Miller, J. T., Saunders, S. S., & Bourland, G. (1980). The role of stimulus disparity in concurrently available reinforcement schedules. Animal Learning & Behavior, 8, 635–641.
Nevin, J. A. (1967). Effects of reinforcement scheduling on simultaneous discrimination performance. Journal of the Experimental Analysis of Behavior, 10, 251–260.
Nevin, J. A. (1969a). Interval reinforcement of choice behavior in discrete trials. Journal of the Experimental Analysis of Behavior, 12, 875–885.
Nevin, J. A. (1969b). Signal detection theory and operant behavior: A review of David M. Green and John A. Swets' Signal Detection Theory and Psychophysics. Journal of the Experimental Analysis of Behavior, 12, 475–480.
Nevin, J. A. (1981). Psychophysics and reinforcement schedules: An integration. In M. L. Commons & J. A. Nevin (Eds.), Quantitative analyses of behavior: Vol. 1. Discriminative properties of reinforcement schedules (pp. 3–27). Cambridge, MA: Ballinger.
Nevin, J. A. (1984). Quantitative analysis. Journal of the Experimental Analysis of Behavior, 42, 421–434.
Nevin, J. A. (1992). An integrative model for the study of behavioral momentum. Journal of the Experimental Analysis of Behavior, 57, 301–316.
Nevin, J. A., Cate, H., & Alsop, B. (1993). Effects of differences between stimuli, responses, and reinforcer
rates on conditional discrimination performance. Journal of the Experimental Analysis of Behavior, 59, 147–161.
Nevin, J. A., Jenkins, P., Whittaker, S., & Yarensky, P. (1982). Reinforcement contingencies and signal detection. Journal of the Experimental Analysis of Behavior, 37, 65–79.
Nevin, J. A., Olson, K., Mandell, C., & Yarensky, P. (1975). Differential reinforcement and signal detection. Journal of the Experimental Analysis of Behavior, 24, 355–367.
Oscar-Berman, M., Heyman, G. M., Bonner, R. T., & Ryder, J. (1980). Human neuropsychology: Some differences between Korsakoffs and normal operant performance. Psychological Research, 41, 235–247.
Peterson, G. B., Wheeler, R. L., & Trapold, M. A. (1980). Enhancement of pigeons’ conditional discrimination performance by expectancies of reinforcement and nonreinforcement. Animal Learning & Behavior, 8, 22–30.
Pollack, I. (1952). The information of elementary auditory displays. Journal of the Acoustical Society of America, 24, 745–749.
Reynolds, G. S. (1961). Relativity of response rate and reinforcement frequency in a multiple schedule. Journal of the Experimental Analysis of Behavior, 4, 179–184.
Reynolds, G. S. (1963). Some limitations on behavioral contrast and induction during successive discrimination. Journal of the Experimental Analysis of Behavior, 6, 131–139.
Schneider, J. W. (1973). Reinforcer effectiveness as a function of reinforcer rate and magnitude: A comparison of concurrent performances. Journal of the Experimental Analysis of Behavior, 20, 461–471.
Shepard, R. N. (1958). Stimulus and response generalization: Deduction of the generalization gradient from a trace model. Psychological Review, 65, 243–256.
Shepard, R. N. (1965). Approximation to uniform gradients of generalization by monotone transformations of scale. In D. Mostofsky (Ed.), Stimulus generalization (pp. 94–110). Stanford, CA: Stanford University Press.
Shimp, C. P. (1969). The concurrent reinforcement of two inter-response times: The relative frequency of an inter-response time equals its relative harmonic length. Journal of the Experimental Analysis of Behavior, 12, 403–411.
Skinner, B. F. (1969). Contingencies of reinforcement. Englewood Cliffs, NJ: Prentice Hall.
Stine-Morrow, E. A. L., Soederberg Miller, L., & Nevin, J. A. (in press). The effects of context and feedback on age differences in spoken recognition. Journal of Gerontology: Psychological Sciences.
Stubbs, D. A., & Pliskoff, S. S. (1969). Concurrent responding with fixed relative rate of reinforcement. Journal of the Experimental Analysis of Behavior, 12, 887–895.
Swets, J. A. (1959). Indices of signal detectability obtained with various psychophysical procedures. Journal of the Acoustical Society of America, 31, 511–513.
Temple, W., Scown, J. M., & Foster, T. M. (1995). Changeover delay and concurrent-schedule performance in domestic hens. Journal of the Experimental Analysis of Behavior, 63, 71–95.
Trapold, M. A. (1970). Are expectancies based upon different positive reinforcing events discriminably different? Learning and Motivation, 1, 129–140.
Tustin, R. D., & Davison, M. C. (1978). Distribution of response ratios in concurrent variable-interval performance. Journal of the Experimental Analysis of Behavior, 29, 561–564.
White, K. G. (1985). Characteristics of forgetting functions in delayed matching to sample. Journal of the Experimental Analysis of Behavior, 44, 15–34.
White, K. G. (1986). Conjoint control of performance in conditional discriminations by successive and simultaneous stimuli. Journal of the Experimental Analysis of Behavior, 45, 161–174.
White, K. G. (1991). Psychophysics of direct remembering. In M. L. Commons, J. A. Nevin, & M. Davison (Eds.), Signal detection: Mechanisms, models, and applications (pp. 221–237). Hillsdale, NJ: Erlbaum.
White, K. G., & Cooney, E. B. (1996). Consequences of remembering: Independence of performance at different retention intervals. Journal of Experimental Psychology: Animal Behavior Processes, 22, 51–59.
White, K. G., Pipe, M. E., & McLean, A. P. (1984). Stimulus and reinforcer relativity in multiple schedules: Local and dimensional effects on sensitivity to reinforcement. Journal of the Experimental Analysis of Behavior, 41, 69–81.
White, K. G., & Wixted, J. T. (1999). Psychophysics of remembering. Journal of the Experimental Analysis of Behavior, 71, 91–113.
Whittaker, S. G. (1977). Scaling probability of reinforcement with a signal-detection procedure. Unpublished doctoral dissertation, University of New Hampshire.
Williams, B. A. (1988). Reinforcement, choice, and response strength. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, & R. D. Luce (Eds.), Stevens’ handbook of experimental psychology: Vol. 2. Learning and cognition (pp. 167–244). New York: Wiley.
Wright, A. A. (1972). Psychometric and psychophysical hue discrimination functions for the pigeon. Vision Research, 12, 1447–1464.
Wright, A. A. (1974). Psychometric and psychophysical theory within a framework of response bias. Psychological Review, 81, 332–347.
Wright, A. A., & Cumming, W. W. (1971). Color-naming functions for the pigeon. Journal of the Experimental Analysis of Behavior, 15, 7–17.
Zentall, T. R., & Sherburne, L. M. (1994). Transfer of value from S+ to S− in a simultaneous discrimination. Journal of Experimental Psychology: Animal Behavior Processes, 20, 176–183.

Received July 15, 1997
Final acceptance January 21, 1999

APPENDIX A

GLOSSARY

Si1,2,. . .n — a set of experimentally specified stimuli.
Bj1,2,. . .n — a set of experimentally specified responses.
Rij — an outcome contingent on Bj given Si.
Discriminated operant — a fundamental behavioral unit defined as Si:(Bj → Rij).
c — inherent bias in choice between B1 and
B2 that is independent of S1, S2, and R11/R22; estimated as the intercept of the least squares fit to data relating log b (see below) to log R11/R22.
a — sensitivity of the ratio B1/B2 to the ratio of reinforcers R1/R2 obtained by B1 and B2, estimated as the slope of the least squares fit to data relating log b (see below) to log R11/R22. Note that when S1 and S2 are indistinguishable or undefined, as in two-alternative concurrent schedules, c and a are estimated from the logarithmic form of the generalized matching law: log(B1/B2) = a log(R1/R2) + log c.
dsbij — a theoretical parameter characterizing the discriminability of the stimulus–behavior relations Si1:Bj1, Si2:Bj2.
dbrij — a theoretical parameter characterizing the discriminability of the behavior–reinforcer relations Bj1 → R11, Bj2 → R22.
d — stimulus discriminability in the model of Davison and Tustin (1978); calculated as the geometric mean of B11/B12, B22/B21. Frequently reported as log d, estimated as the difference between intercepts of least squares fits to data relating log B11/B12 and log B21/B22 to log R11/R22 over several conditions.
b — overall bias in the model of Davison and Tustin (1978); calculated as the geometric mean of B11/B12 and B21/B22. The model predicts that log b = a log (R11/R22) + log c.
ds — stimulus discriminability in the model of Davison and Jenkins (1985); conceptually and algebraically equivalent to d, and calculated as for d above.
dr — the discriminability of response–reinforcer relations in the model of Davison and Jenkins (1985); conceptually equivalent to dbr above, and estimated by nonlinear optimization.
B — overall bias in the present model, calculated as for b above. However, it is not predicted by the same equation and depends on both dsb and dbr (see text, Equation 8). The upper case is intended to distinguish its theoretical origin, and signifies the value predicted by our model rather than calculated or estimated from data.
D — measured discrimination in the present model, calculated as for d above. However, in the present model it depends on both dsb and dbr (see text, Equation 11). The upper case is intended to distinguish its theoretical origin, and signifies the value predicted by our model rather than calculated or estimated from data.
dt — a theoretical parameter characterizing the degradation of stimulus–behavior discriminability, dsb, over time separating Si and Bj, or equivalently, the degradation of behavior–reinforcer discriminability over time separating Bj and Rij.
dd — a theoretical parameter characterizing the discriminability of different delays intervening between Si and Bj.
d2 — a theoretical parameter characterizing the discriminability of different second-order stimuli signaling different first-order relations among Si, Bj, and Rij.
dbm — a theoretical parameter characterizing the discriminability of reinforcers R1 and R2 that differ in magnitude or quality.
Pij — punishers contingent on Si:Rj.
a — a scale factor equating the weakening effects of a given punisher with the strengthening effects of a given reinforcer.
d′ — a parameter in the theory of signal detection corresponding to the discriminability of a signal in a background of noise, interpreted as the difference between the means of two hypothetical distributions of sensory effects divided by their standard deviation.
β — a parameter of the theory of signal detection corresponding to response bias, interpreted as the location of a response criterion on the sensory-effect continuum.

APPENDIX B

Fitting the Model

A note is in order here about fitting the model we have presented. To estimate the parameters, the data could be used in several different ways. For example, we might fit a proportional model [Bi/(Bi + Bj)] to proportional predictions, using each cell as a proportion of the other cells in the presence of each stimulus. Such a proportional fit would have the effect of differentially weighting data that occur around a response proportion of .5 in comparison with those more extreme. This would produce parameter estimates that would be more accurate for wide generalization (small discriminability) values. The opposite weighting can be achieved if simple ratios are used, in which case small degrees of generalization
(large discriminabilities) would be more accurately estimated. The middle way, which may have the benefit of equalizing variances across all values of response measures, is to use log ratio fits. Tustin and Davison (1978) showed that log ratio measures were homoscedastic in concurrent VI VI performance, and perhaps this would also apply to the signal-detection situation. We recommend this procedure, at least as an interim measure.

A problem arises, however. If we use Equations 1 and 2 directly to obtain values of dsb and dbr, optimization programs will frequently keep increasing the values of both dsb and dbr without bound. This results from the nature of the equation, whereby perfect discrimination has an infinite value. Stable and sensible fits can, however, be achieved if the equations are algebraically converted to optimize for values of psb and pbr [defined as dsb/(1 + dsb) and dbr/(1 + dbr)]. Because these parameters have a range of 0.5 to 1.0, they can easily be constrained to fall within this range, or, more usually, to be less than or equal to 1. We must, however, remember that if such fitted parameter values are found to be either consistently above 1.0, or less than 0.5, these values contain important information on the adequacy (indeed, inadequacy) of the model under investigation.

It is still necessary to decide which data should be used in log ratios. Again, we recommend pairwise ratios of responses within each stimulus (e.g., in a three-response choice, B1/B2, B2/B3, and B3/B1). This provides a set of data that are well distributed between positive and negative values and also provides some data signal (systematic variance) for the relative performance (and hence, the parameter estimate) between B2 and B3 (dbr23). Notice, though, that the present models do allow ratios to be taken vertically in a matrix (e.g., B11/B21) when the frequency of presentation of all stimuli is the same (signal-presentation probability = .5 in a 2 × 2 matrix), or if the response and reinforcer numbers are normalized between stimuli.

A common problem in fitting data from conditional discrimination situations is low response numbers in some cells. If both dsb and dbr are very large, error responses may only occasionally be emitted. As a result, parameter estimates may be unattainable (if response counts are zero), or may be poor in accuracy. This can be overcome in a number of ways. First, and this is definitely not recommended, data can be collected until there is at least one response in each error cell. This procedure will bias the estimate of dsb and dbr, usually towards being too large. Second, the fits can be done as relative numbers rather than log ratios. Although this allows a fit to be carried out, parameter estimates of small confusions will be in error. Third, Hautus (1995) has published a theoretical analysis of the ways in which these problems can be overcome, showing that the procedure of simply adding 0.5 to response counts in all cells will usually provide a better estimate of parameters like dsb and dbr. We recommend that procedure.

Finally, and more technically, what is the best way of actually carrying out the fit? The best we have found is to use a spreadsheet that incorporates an optimizer. Quattro-Pro™ is particularly good in this regard, because it contains built-in statistical functions that avoid many columns of calculation. Excel™ is also satisfactory, and SigmaPlot™ is fast. Happily, they all seem to give very similar answers! The usual caveats, of course, apply: You should have considerably more data points than the number of parameters you need to fit. Also, when fitting large numbers of parameters, optimization can take some time, even on a fast computer.
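
As an illustration of the recommendation to add 0.5 to the response counts in every cell before forming log ratios, the following Python sketch computes the point estimates log d and log b defined in Appendix A from a 2 × 2 response matrix. It is a minimal sketch under stated assumptions, not the procedure we used: the function names and the response counts are hypothetical, and base-10 logarithms are assumed.

```python
import math

def corrected_counts(counts, k=0.5):
    """Add a constant k (0.5 by default) to every cell of a response matrix."""
    return [[cell + k for cell in row] for row in counts]

def log_d_and_log_b(counts):
    """Log discriminability and log bias from [[B11, B12], [B21, B22]].

    d is the geometric mean of B11/B12 and B22/B21;
    b is the geometric mean of B11/B12 and B21/B22 (Appendix A).
    """
    (b11, b12), (b21, b22) = counts
    log_d = 0.5 * math.log10((b11 / b12) * (b22 / b21))
    log_b = 0.5 * math.log10((b11 / b12) * (b21 / b22))
    return log_d, log_b

# Hypothetical counts with an empty error cell (B21 = 0): the raw ratios
# are undefined, but the corrected counts give finite estimates.
raw = [[98, 2], [0, 100]]
log_d, log_b = log_d_and_log_b(corrected_counts(raw))
print(f"log d = {log_d:.3f}, log b = {log_b:.3f}")
```

Without the correction, the zero in the error cell makes B22/B21 undefined, which is the situation described above in which parameter estimates are unattainable.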

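The conversion from an unbounded discriminability d to the bounded parameter p = d/(1 + d) can be sketched in Python as follows. This is only an illustration of the transformation and its inverse; the function names are hypothetical, and the model equations themselves (Equations 1 and 2 in the text) are not reproduced here.

```python
import math

def d_to_p(d):
    """Map a discriminability d in [1, inf) to p = d / (1 + d) in [0.5, 1)."""
    return d / (1.0 + d)

def p_to_d(p):
    """Inverse mapping, d = p / (1 - p); p = 1 corresponds to perfect discrimination."""
    if p >= 1.0:
        return math.inf
    return p / (1.0 - p)

def clamp_p(p, lower=0.5, upper=1.0):
    """Constrain a candidate p to the theoretically admissible range."""
    return min(max(p, lower), upper)

# No discrimination (d = 1) maps to p = 0.5; very large d approaches p = 1.
for d in (1.0, 9.0, 99.0, 1e6):
    p = d_to_p(d)
    print(d, p, p_to_d(p))
```

An optimizer that searches over p between 0.5 and 1 therefore works on a bounded interval, whereas the same search over d has no upper limit. As noted above, unconstrained fitted values of p that fall outside this interval are themselves informative about the adequacy of the model.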