Bayesian Cognition Explored
Indeed, the very title of Bernoulli's4 seminal book Ars Conjectandi, 'The Art of Conjecture', embodies the idea that probability captures how people actually make conjectures, as well as providing a calculus for helping people to make conjectures more accurately. Thus, one important strand in the development of probability theory viewed it directly as a theory of thought, as well as a helpful mathematical calculus.

The probabilistic approach can be adopted at three different levels, corresponding to Marr's5 three levels of explanation. Computational level explanation aims to specify the nature of the problem that the brain faces: the goals of the system and the structure of the environment in which these goals must be achieved. At the computational level, then, probabilistic methods are used to specify the problem that the brain faces. Thus, learning to control an arm, or to use a language, might be viewed as problems of probabilistic inference, given certain prior assumptions and in the light of data gleaned from experience. Modern engineering, machine learning, and artificial intelligence typically view a wide range of information processing problems faced by the brain, from motor control, to speech perception, to object recognition, from this probabilistic perspective.

Algorithmic level explanation requires specifying the representations, and the computational operations over those representations, that constitute cognition. Even if the brain faces probabilistic challenges, it may be that it solves them using some set of heuristics or approximations which do not involve actually carrying out probabilistic calculations. On the other hand, the modern technology of probabilistic inference, as explored in state-of-the-art engineering and artificial intelligence systems, does provide a rich set of hypotheses about human cognition. Cognitive science is, after all, a process of reverse engineering; and reverse engineering inevitably draws on the best engineering solutions to the information processing problems that the brain faces.

Finally, even if the brain is probabilistic at the computational and algorithmic levels, this does not necessarily imply that it is probabilistic at the third of Marr's levels of explanation, the implementational level. Indeed, probabilistic algorithms used in speech engineering or computer vision run on the binary logic of digital computers. But some neuroscientists have begun to conjecture that the brain may be probabilistic at its very foundations—that individual neurons may convey probabilistic information, that neural populations may capture probability distributions, and that basic neural processes might be understood as directly carrying out elementary probabilistic inference.6

After providing a brief overview of Bayesian inference, in the rest of this article we survey some of the burgeoning research applying Bayesian models to cognition and perception. Seven sections cover Bayesian models in Perception, Categorization, Learning and Causality, Language Processing, Inductive Reasoning, Deductive Reasoning, and Argumentation.

BAYESIAN INFERENCE

From a probabilistic standpoint, beliefs are a matter of degree. Each hypothesis, Hi, can be associated with a degree of belief P(Hi); and very modest consistency constraints require that these degrees of belief must obey the laws of probability. Thus, the probability distribution over the various Hi can be viewed as characterizing prior beliefs. Suppose that Hi has implications for the data we expect to encounter (e.g., Hi states that the floodlights are on, which, if true, makes some sensory inputs—roughly, the bright ones—more likely than others). These implications can be captured by P(D|Hi), the probability of the data, given the hypothesis. In the light of D, we need to update the priors P(Hi) to P(Hi|D), the probabilities of the hypotheses, given that the data is known. A simple identity of probability theory, Bayes' theorem, shows how this can be done:

    P(Hi|D) = P(D|Hi)P(Hi) / P(D)

The probability of the data P(D) is not, of course, known independently of the hypotheses that might generate that data—so in practice P(D) is typically expanded using the probabilistic identity:

    P(D) = Σj P(D|Hj)P(Hj)
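To make the two identities above concrete, here is a minimal numerical sketch (not from the article itself) of Bayesian updating over a small, discrete hypothesis space, using the floodlight example; the hypotheses, priors, and likelihoods are invented for illustration.

```python
# Minimal sketch of Bayesian updating over a discrete hypothesis space.
# The hypotheses, priors, and likelihoods below are invented for illustration.

def posterior(priors, likelihoods):
    """Return P(H_i | D) for each hypothesis, given priors P(H_i) and
    likelihoods P(D | H_i), using Bayes' theorem with
    P(D) = sum_j P(D | H_j) P(H_j)."""
    p_data = sum(likelihoods[h] * priors[h] for h in priors)
    return {h: likelihoods[h] * priors[h] / p_data for h in priors}

# Hypotheses about the scene: are the floodlights on or off?
priors = {"floodlights_on": 0.1, "floodlights_off": 0.9}

# D = "the sensory input is bright"; bright input is much more likely
# if the floodlights are on.
likelihoods = {"floodlights_on": 0.95, "floodlights_off": 0.05}

print(posterior(priors, likelihoods))
# {'floodlights_on': 0.678..., 'floodlights_off': 0.321...}
```

Even with a low prior, the hypothesis that the floodlights are on comes to dominate once the bright input is observed, which is just the updating pattern the two identities describe.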
Because of the centrality of the problem of updating beliefs in the light of new information, Bayes' theorem has very broad application, so much so, indeed, that the interpretation of degrees of belief in terms of probabilities is often known as the Bayesian approach.

If we quantify 'degrees of belief' numerically, as the Bayesian approach presupposes, why should the laws of probability theory, rather than some other principles, define the calculus of degrees of belief? From the point of view of cognitive science, there are two strong arguments for adopting a probabilistic approach. The first, mentioned above, is that violation of the laws of probability leads to paradoxical conclusions. Indeed, the laws of probability can be derived from a variety of plausible, modest, but very different assumptions concerning how degrees of belief should behave. Perhaps the best known such derivation is the Dutch book theorem,7 which shows that, under fairly general conditions, gamblers whose degrees of belief violate the laws of probability will happily accept a combination of bets which are, nonetheless, guaranteed to lose money, whatever their outcomes—which appears to be an unequivocally irrational choice. This type of argument suggests that, given that brains reason spectacularly well about uncertainty, they are unlikely to depart systematically from the norms of good probabilistic reasoning by too much: any good uncertain reasoner is, the argument might go, to some degree a good Bayesian, that is, probabilistic, reasoner.

In addition to this a priori line of argument, a second consideration, perhaps more persuasive from the point of view of the practicing neuroscientist and cognitive scientist, is that the Bayesian approach is widely used in engineering approaches to solving the types of problem faced by the brain. Thus, the fields of computer vision, speech recognition, computational linguistics, robotics, machine learning, information retrieval and expert systems, and many more, have seen a dramatic upsurge in the application of probabilistic methods. To the extent that the project of understanding the mind/brain is reverse engineering, that is, attempting to find the engineering principles that underpin neural and cognitive function, then any credible scientific theory has to be good engineering; and the Bayesian approach seems plausibly to pass this test.

Below, we briefly describe the Bayesian approach to cognition in a number of domains, ranging from perception to learning about causal relations, to Bayesian models of higher-level reasoning and argumentation.
PERCEPTION

From a computational level perspective, the problem of perception is that of inferring the structure of the world from sensory input. This problem may seem to be ill-posed, because any given sensory input may have been generated by an infinity of possible states of the world.8 From a probabilistic perspective, the infinity of possible interpretations is not in itself problematic. Rather, the challenge of probabilistic inference in perception is to assign probabilities to each of these possible interpretations, based not only on the sensory input itself, but on prior knowledge. This is a problem of Bayesian inference par excellence.

The Bayesian approach in perception has its beginnings in Helmholtz's9 notion of 'unconscious inference', although he did not explicitly use Bayes' rule.10 More recently, this perspective has become increasingly influential throughout the brain and cognitive sciences, as well as in computer vision. Moreover, the Bayesian approach is consistent with a broader tradition in perceptual research, the idea that perception is analysis-by-synthesis.11 That is, the perceptual data is presumed to be analyzed (i.e., calculating P(H|D)) from a knowledge of the perceptual data that would be generated by various possible scene interpretations (i.e., from a knowledge of P(D|H), and of course a prior distribution P(H) over the hypotheses concerning the scenes)—a transformation which requires the application of Bayes' theorem. In practice, the process of finding an interpretation from which the perceptual data can reasonably be generated requires a combination of bottom-up and top-down perceptual inferences,12 a process that can be captured computationally by recent methods such as Data-Driven Markov Chain Monte Carlo.13 Thus, the Bayesian approach to perception requires that the perceptual system is able to generate sensory input, as well as being able to perceive it; and it hence provides a natural explanation of the existence of imagery, consistent with some existing psychological theories,14 and with experimental data indicating the influence of top-down perceptual processes.15

Bayesian models of perception have been subjected to direct experimental test in a number of domains (e.g., the integration of sensory cues16). And a wide variety of computational models of empirical findings in perception have been put forward, ranging from low-level image interpretation,17 shape from shading,8,18 and shape from texture,19 to boundary interpolation.20,21 There has also been explosive growth in theories in the field of computational neuroscience which view specific neural mechanisms as carrying out probabilistic computations, from lateral inhibition in the retina,22 to the activity of single cells in the blow-fly,23 or to populations of neurons, including the accumulation of sensory evidence.6

Indeed, it turns out that a large class of apparently non-probabilistic models of perception can also be accommodated within the Bayesian framework. A long tradition in perception, often viewed as standing in direct opposition to the Bayesian approach, is based on simplicity: the perceptual system is assumed to choose an interpretation of sensory input that provides the briefest encoding of the sensory data. Here, the starting point for the perceiver is a coding language: a representational system in which scenes, and the sensory inputs that they deliver, can be represented. According to simplicity-based explanations, for example, Gestalt principles, such as common fate (grouping objects with the same movement together, such as a flock of birds) or good continuation (assuming that alignment between items, even when occluded, typically indicates they should be grouped, or perhaps are part of the same object, for example when the outline of an animal is seen through dense foliage), arise because of a preference for simple codes—codes which specify a single motion direction for the entire flock, rather than for each bird individually; or specify the position of a single occluded object, rather than independently coding the positions of each object fragment. Yet it turns out that simplicity-based approaches to perception24–31 are mathematically equivalent to the Bayesian approach, under mild conditions.32 The choice of coding language can be viewed as implicitly specifying a prior probability distribution—such that items that have a brief representation in the language have relatively high prior probability.
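The equivalence between simplicity and probability can be made concrete with a small sketch. Assuming, as in standard coding-theoretic treatments, that a hypothesis with a code of length L bits receives prior probability proportional to 2^(-L), choosing the interpretation with the shortest total code is the same as choosing the interpretation with the highest posterior; the toy code lengths below are invented for illustration.

```python
# Sketch of the simplicity/probability equivalence: if a hypothesis with a
# code of length L bits is given prior P(H) proportional to 2**-L, and the
# data cost given the hypothesis is L(D|H) bits with P(D|H) proportional to
# 2**-L(D|H), then minimising total code length L(H) + L(D|H) is the same as
# maximising the (unnormalised) posterior P(H)P(D|H).  Code lengths invented.

interpretations = {
    # hypothesis: (code length of hypothesis, code length of data given it)
    "one flock moving together":     (5.0, 2.0),   # brief: one shared motion
    "40 birds moving independently": (3.0, 40.0),  # data costly to encode
}

def total_code_length(h):
    l_h, l_d_given_h = interpretations[h]
    return l_h + l_d_given_h

def unnormalised_posterior(h):
    l_h, l_d_given_h = interpretations[h]
    return 2 ** -l_h * 2 ** -l_d_given_h   # prior * likelihood

best_by_simplicity = min(interpretations, key=total_code_length)
best_by_probability = max(interpretations, key=unnormalised_posterior)
assert best_by_simplicity == best_by_probability
print(best_by_simplicity)   # 'one flock moving together'
```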
CATEGORIZATION

Understanding perceptual input involves the creation of categories. Categorization allows generalization from one category member to another; and it also allows the formulation of abstract relations defined over categories, rather than concrete items. From a formal point of view, categorization is an aspect of high-level perception, where categorization of the items in the scene is just one of many pieces of information that must be recovered from sensory input. In cognitive psychology, early theories of categorization focused on supervised categorization—that is, learning a category from a set of examples, labeled with their category. The two main theoretical approaches both focused on the similarity of the item to be classified either to a prototypical category exemplar,33,34 or, alternatively, to one of a set of category exemplars.35 While not initially formulated in probabilistic terms, both types of theory have increasingly been formulated from a Bayesian point of view.36–40 Roughly speaking, the prototype view of categorization can be viewed as assuming that categories correspond to Gaussian (or similar) blobs, which may potentially overlap, in some feature space; and the problem of categorization is to work out, given an item, the probability distribution over the Gaussian blobs that may have generated it. According to the simplest formulation, we assume that the participant is certain that the new item is generated by one of the previously encountered categories; but in reality, of course, it is possible that a new item is generated by a category that has not been previously encountered. Thus one extension of the prototype approach, from a probabilistic point of view, is to allow that, in response to a new item, an agent may postulate a new category; and therefore that the number of categories may grow, perhaps unboundedly, as the number of items categorized increases. This type of 'nonparametric' categorization model is widely used in Bayesian models of categorization, from Anderson41 through to Griffiths et al.42 and Goodman et al.43 Exemplar models can then be seen as a limiting case of this class of model.44

Viewing the problem of categorization as a matter of probabilistic inference provides more than an interesting notational variant of the initial non-probabilistic formulations. On the one hand, it provides a fresh perspective on the explanation of classic psychological data. So, to take a simple example, the finding that people are usually able to classify more typical category members more rapidly than less typical category members34 has a natural interpretation: the features of prototypical items provide more unequivocal evidence for the specific category membership than do those of less prototypical items; and hence fewer such features need to be processed, on average, for a category judgment to be made reliably. Moreover, the probabilistic framework provides a starting point for a wide range of generalizations, which may take account of the fact, for example, that a single item may be a member of multiple categories45; that the prior assumptions that underpin categorization may be powerfully influenced by background theories46; or that the relative importance of different features, and even the choice of appropriate features, may itself depend on the category being considered, and may have to be learned.45
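As a concrete, deliberately simplified sketch of the prototype-as-Gaussian view and its nonparametric extension, the following code computes the posterior over two known one-dimensional Gaussian categories plus a residual 'new category' option that carries a small prior and a very broad likelihood. The parameter values are invented for illustration and this is not a re-implementation of any specific published model.

```python
import math

# One-dimensional sketch of categorization as probabilistic inference.
# Each known category is a Gaussian 'blob' in feature space; an extra
# 'new category' option gets a small prior and a very broad likelihood,
# roughly in the spirit of nonparametric models.  Numbers are invented.

def gaussian(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

categories = {
    "category_A":   {"prior": 0.55, "mean": 0.0, "sd": 1.0},
    "category_B":   {"prior": 0.40, "mean": 4.0, "sd": 1.0},
    "new_category": {"prior": 0.05, "mean": 0.0, "sd": 10.0},  # broad, vague
}

def classify(x):
    scores = {name: c["prior"] * gaussian(x, c["mean"], c["sd"])
              for name, c in categories.items()}
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}

print(classify(0.5))    # dominated by category_A
print(classify(20.0))   # far from both known blobs: 'new_category' wins
```

An item far from every known blob ends up assigned to the 'new category' option, which is the probabilistic counterpart of postulating a category that has not previously been encountered.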
LEARNING AND CAUSALITY

Conditioning in animals has traditionally been conceived as a matter of the formation of associations, which might be presumed to form on the basis of, for example, the constant conjunction of two events, or their spatial and temporal proximity. Nonetheless, a wide variety of empirical findings has indicated that the animal may be viewed as an intelligent problem solver,47 attempting to figure out the structure of the world from available contingency data. Thus, for example, the discovery of blocking48 (the finding that once an animal has learned that an outcome is predicted by one cue, it is less liable to associate that outcome with a second cue added later, however reliable that second cue may be) suggests that the animal already has an 'explanation' of the outcome, and hence that no further explanation, for example in terms of the second cue, is required. To the extent that
the animal is regarded as making inferences about the structure of the environment from observed data concerning the arrival of lights, tones, food pellets, or shocks, the problem that the animal faces appears closely analogous to the general problem of scientific inference, and hence to be naturally modeled within a Bayesian framework.49–51 From this point of view, well-known conditioning phenomena, such as the finding that a contingency that has been reliably reinforced is extinguished more rapidly than a contingency that has been partially reinforced, have a natural probabilistic explanation. If a contingency is typically reliable, then after a few 'extinction' trials there is already strong evidence that the state of the world has changed and that the contingency is no longer in operation; on the other hand, if the contingency is initially unreliable, then a few such trials are to be expected by chance in any case, and hence the animal will be slower to reach the conclusion that the world has changed, and that the contingency is no longer in operation. This type of phenomenon is difficult to account for on some mechanistic associative accounts, because the association formed by partial reinforcement is simply assumed to be weaker, and for this reason should be expected to be eliminated more rapidly.
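The extinction asymmetry can be illustrated with a toy calculation (not from the article): compare the posterior probability that the world has changed after a run of non-reinforced trials, when the contingency was previously reinforced on 90% versus 50% of trials. All numbers are invented.

```python
# Toy illustration of why a reliably reinforced contingency should be
# 'extinguished' faster: a run of non-reinforced trials is strong evidence
# of change if reinforcement used to be reliable, but is expected by chance
# if reinforcement was only partial.  All numbers are invented.

def p_changed(p_reinforce, n_unreinforced, prior_change=0.1):
    # P(data | world unchanged): every trial unreinforced despite contingency
    p_data_unchanged = (1 - p_reinforce) ** n_unreinforced
    # P(data | world changed): contingency gone, so non-reinforcement is certain
    p_data_changed = 1.0
    num = p_data_changed * prior_change
    den = num + p_data_unchanged * (1 - prior_change)
    return num / den

for p in (0.9, 0.5):                 # reliable vs. partial reinforcement
    for n in (1, 3, 5):              # number of extinction trials so far
        print(f"reinforced {p:.0%} of trials, {n} extinction trials:"
              f" P(changed) = {p_changed(p, n):.2f}")
```

After the same number of extinction trials, the posterior probability of change is far higher when the contingency had been reliable, which is the asymmetry described above.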
Similarly, a variety of probabilistic models have been put forward to explain human judgments of contingency and causality when learning from experience. Cheng,52 for example, has put forward a 'probabilistic contrast' model of human causal judgment, according to which the strength of a causal relationship is assumed to be measured by the contrast between the probability of the effect in the presence of the cause and the probability of the effect in the absence of the cause. Griffiths and Tenenbaum53 have proposed a Bayesian model in which the existence of, and the nature of, a potential causal relationship between events is itself inferred from the observed data. This account aims to explain empirical data concerning both how the structure of causal relationships can be learned, as well as the strength of those relationships, which is the primary concern of Cheng's model. Sloman and Lagnado,54 moreover, have directly studied the role of intervention in human causal judgments.
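For concreteness, the following sketch computes the probabilistic contrast, ΔP = P(e|c) − P(e|¬c), and, for a generative cause, the associated causal power ΔP / (1 − P(e|¬c)), from a small invented contingency table. It is an illustration of the quantities discussed above, not a re-implementation of any specific published model.

```python
# Probabilistic contrast (Delta-P) and causal power from contingency data.
# The contingency counts are invented for illustration.

def contingency_stats(e_with_c, n_with_c, e_without_c, n_without_c):
    p_e_given_c = e_with_c / n_with_c            # P(effect | cause present)
    p_e_given_not_c = e_without_c / n_without_c  # P(effect | cause absent)
    delta_p = p_e_given_c - p_e_given_not_c      # probabilistic contrast
    # Causal power for a generative cause (a Cheng-style quantity):
    power = delta_p / (1 - p_e_given_not_c) if p_e_given_not_c < 1 else float("nan")
    return delta_p, power

# e.g. the effect occurred on 18 of 20 trials with the cause present,
# and on 6 of 20 trials with the cause absent.
delta_p, power = contingency_stats(18, 20, 6, 20)
print(f"Delta-P = {delta_p:.2f}, causal power = {power:.2f}")
# Delta-P = 0.60, causal power = 0.86
```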
According to many standard philosophical accounts of causality, the existence of a causal relation between two events A and B depends on counterfactual claims about whether, for example, B would still have occurred even if A had been 'blocked', leaving everything else unchanged as far as possible. Thus, for example, pressing the 'alarm set' button on the alarm clock appears to be causally related to the alarm clock going off many hours later, in view of our belief that, had the button not been pressed, the alarm would not have sounded. On the other hand, we do not assume that the alarm clock's sounding is caused by the chiming of the church clock next door, even if this regularly occurs a few seconds before, because we know that if some intervention occurred to stop the church clock chiming, the alarm would sound nonetheless. It turns out that it is possible to construct a calculus of causal intervention within a probabilistic framework55,56; and there has been recent experimental work attempting to determine how far this framework can provide a useful model of human causality judgments when intervention is allowed.54
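The difference between observing and intervening can be sketched with a tiny two-variable example in the spirit of the causal framework cited above; the structure and numbers are invented. If the church clock's chiming and the alarm's sounding are correlated only via the time of day, then observing the chime predicts the alarm, but forcing (intervening on) the chime does not change the probability that the alarm sounds.

```python
# Observation vs. intervention in a tiny common-cause structure:
#     time_is_7am  -->  church_clock_chimes
#     time_is_7am  -->  alarm_sounds
# The structure and numbers are invented for illustration.

P_SEVEN = 0.01                      # P(it is exactly 7am) at a random minute
P_CHIME_GIVEN_SEVEN, P_CHIME_OTHERWISE = 1.0, 0.0
P_ALARM_GIVEN_SEVEN, P_ALARM_OTHERWISE = 1.0, 0.0

def p_alarm_given_observed_chime():
    # Condition on the chime: infer the time of day, then predict the alarm.
    p_chime = P_CHIME_GIVEN_SEVEN * P_SEVEN + P_CHIME_OTHERWISE * (1 - P_SEVEN)
    p_seven_given_chime = P_CHIME_GIVEN_SEVEN * P_SEVEN / p_chime
    return (P_ALARM_GIVEN_SEVEN * p_seven_given_chime
            + P_ALARM_OTHERWISE * (1 - p_seven_given_chime))

def p_alarm_given_do_chime():
    # Intervening on the chime cuts its link to the time of day,
    # so the alarm keeps its unconditional probability.
    return P_ALARM_GIVEN_SEVEN * P_SEVEN + P_ALARM_OTHERWISE * (1 - P_SEVEN)

print(p_alarm_given_observed_chime())  # 1.0  (observation is informative)
print(p_alarm_given_do_chime())        # 0.01 (intervention is not)
```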
Finally, there has been a very promising line of research in cognitive development, exploring Bayesian network models of contingency learning, causal learning, and learning from intervention, throughout development.57 For example, Gopnik et al.57 discuss a variety of experiments58,59 which demonstrate that pre-school children have the ability to learn causal structures. In particular, this knowledge can be revealed by the nature of the interventions children choose to perform on the experimental apparatus embodying the causal relationships. This knowledge is independent of the frequency information available in the experimental set-up and does not appear to be learnable within non-Bayesian frameworks.

Note, though, that contingency is a relatively weak source of information about causal relationships. In observing the relationship between an object and its shadow, for example, the fact that the shadow has roughly the same shape as the object that casts it, that the shadow moves predictably when the object moves, and that, in many cases at least, the shadow and object connect smoothly at the object's base, provides powerful indications of the existence of a relationship between the two; a trail of footprints in the sand can reasonably be causally attributed to the recent passage of feet purely in virtue of their shape and arrangement. Indeed, a variety of classic psychological demonstrations of 'perceptual' causality,60 and even of causal relations underpinned by social interactions,61 appear to be perceived essentially instantaneously, without requiring prior learning. A strength of the Bayesian approach is that it is, in principle, possible to build models which include richer representations of the physical structure of the environment, or prior knowledge about other aspects of the physical and social world, such that examples of this kind can readily be captured. Such work is at an early stage62; but, for example, there has already been significant progress in constructing computational models of the attribution of intentions to an agent, from observing the agent's behavior.63
LANGUAGE PROCESSING

Probabilistic approaches have also been influential in recent accounts of language processing and acquisition.64 Within linguistics, it has been standard to view probabilistic aspects of language as of marginal importance, particularly within the study of syntax. Language is often viewed as a set of well-formed strings, which are generated by a symbolic grammar, and associated, through systems of symbolic rules, with phonological and semantic representations. The mappings between phonology, syntax, and semantics can be fully described, according to this point of view, without reference to probabilities. Probability is, nonetheless, fundamentally involved in language processing and acquisition in a number of ways.

Notice, for example, that the problem of analog-to-digital conversion, that is, turning an extremely rich and complex acoustic waveform into a discrete phonological representation, is an enormously challenging problem of uncertain inference. The speech wave is typically highly locally ambiguous, and can only be disambiguated by piecing together large numbers of locally ambiguous cues, together with background knowledge concerning the speaker, the topic being discussed, and so on. Unsurprisingly, speech technology draws on a rich repertoire of probabilistic methods, including hidden Markov models and neural networks.65 Probability plays a similar role in helping to construct a globally coherent parse (and associated semantic representation), in the light of the notorious local ambiguity of natural language, whether such ambiguity is lexical (e.g., bank as financial institution or geographical feature), syntactic [e.g., I saw the man (with the telescope) vs. I saw (the man with the telescope)], or semantic (e.g., all the witnesses saw a burglar running from the scene, which might or might not be interpreted as implying that each witness saw the same burglar). Again, a globally coherent parse and interpretation of a sentence can only be achieved by integrating these locally ambiguous cues, together with relevant background knowledge; and, just as in the problem of perception, the natural framework in which to consider such integration is probabilistic inference.

Traditional theories of parsing have not, however, taken a probabilistic standpoint; indeed, such accounts have often, instead, focused purely on structural features of the competing parses.66 Research over the last decade and a half has, however, increasingly suggested that a probabilistic integration of multiple cues is used by the language processing system in order to determine the most probable parse and interpretation of the input.67–69

As with other aspects of learning, it is also natural to view the problem of acquiring a language as an example of uncertain inference. Any finite set of linguistic data available to the child will be compatible with an infinite number of languages; and the child must learn to generalize from the observed input to be able to successfully produce and understand linguistic material that has never previously been encountered.

From a non-probabilistic point of view, the problem of learning a language appears almost insuperably difficult; it will, for example, be extremely hard for the learner to distinguish between, say, normal English and a version of English with one additional constraint, for example, that it is not grammatically acceptable to begin and end a sentence with the word fish, to include more than five adjectives in a noun phrase, or to use a sentence whose sequence of words forms a palindrome (disallowing dogs chase dogs). These possible variants of English would be extremely difficult to rule out, because the structures that they disallow are extremely rare, and might not be expected to occur more than a few times, if at all, during childhood. From a probabilistic point of view, these variations need not be ruled out unequivocally, but rather assigned a very low prior probability (e.g., on the basis that prior probability should be inversely related to complexity); from a non-probabilistic point of view, such possibilities either need to be ruled out entirely, or they pose genuine problems for the learner. Note, though, that languages do exhibit numerous apparently arbitrary constraints, which learners are able to learn successfully. So, for example, the child must infer that, while it is acceptable to say I made the clock break, I broke the clock, and I made the clock disappear, it is not acceptable to say I disappeared the rabbit, even though the meaning of this string of words is entirely clear. Learning the absence of certain linguistic possibilities has often been viewed as posing 'logical' problems for language acquisition, however much data the child receives.70 From a probabilistic standpoint, it is possible to show that learning is possible in principle, given sufficient data.71 More important, perhaps, Bayesian analysis of language acquisition provides the tools to assess the prior information that the learner must possess, in order to learn these and other regularities, given realistic estimates of the data available to the child.72
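As a toy rendering (not from the article) of the idea that prior probability can be inversely related to complexity, the sketch below assigns each candidate 'grammar' a prior proportional to 2^(-description length), so contrived variants of English carry very little weight unless the data demand them; the description lengths are invented.

```python
# Toy illustration: prior probability inversely related to complexity.
# Each candidate language hypothesis is scored by an (invented) description
# length in bits; the prior is proportional to 2**-length, so contrived
# variants of English start out with very little probability mass.

hypotheses = {
    "English": 1000.0,
    "English + 'no sentence may begin and end with fish'": 1030.0,
    "English + 'no more than five adjectives per noun phrase'": 1025.0,
    "English + 'no palindromic word sequences'": 1040.0,
}

def complexity_prior(lengths):
    shortest = min(lengths.values())
    weights = {h: 2.0 ** -(l - shortest) for h, l in lengths.items()}
    z = sum(weights.values())
    return {h: w / z for h, w in weights.items()}

for hypothesis, prior in complexity_prior(hypotheses).items():
    print(f"{prior:.2e}  {hypothesis}")
# Plain English gets essentially all of the prior mass; the variants are
# not ruled out, just heavily discounted until data favour them.
```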
There has, moreover, been increasing interest in building statistical computational models, although not always within a strictly Bayesian framework, which can potentially model the acquisition of a variety of aspects of phonology, syntax, and semantics, ranging from the acquisition of morphology, to syntactic categories, and broad semantic classes73–76; and there has been substantial progress in developing computational models that are able to learn phrase structure and dependency relations from corpora of untagged text.77 From the point of view of a Bayesian analysis, the problem of language acquisition remains formidable indeed; but significant progress has been made both in developing specific models of learning, and in defining methods for determining what is learnable in principle.
INDUCTIVE REASONING

Inductive reasoning involves drawing conclusions that are probably true, given a set of premises. Consequently, a rational Bayesian approach seems uniquely suited to model induction. Inductive reasoning contrasts with deductive reasoning, in which the conclusion must necessarily follow from a set of premises. In contrast, two inductive arguments can each have some degree of inductive strength (Figure 1).

(a) Cows have sesamoid bones
    All mammals have sesamoid bones

(b) Ferrets have sesamoid bones
    All mammals have sesamoid bones

FIGURE 1 | Inductive arguments vary in strength. The conclusion in argument (a) may seem stronger, or more probable given the evidence, than the conclusion in (b).

There is now a well-documented set of empirical regularities on inductive reasoning (see Ref 78 for a more extensive review). These demonstrations all use inference patterns like that in Figure 1. Rips79 looked at how people project properties of one category of animals to another (Figure 2(a) and (b)). He found that the more similar the premise category is to the conclusion category, the stronger the inference (Figure 2(a)). He also found that the more typical the premise category [bluejays (typical) vs. geese (atypical)], the stronger the inference (Figure 2(b)). Using multiple regression analyses, Rips found distinct contributions of premise-conclusion similarity and premise typicality (see Ref 80 for further investigations of similarity and typicality effects).

(a) Rabbits have sesamoid bones
    Dogs (Bears) have sesamoid bones

(b) Bluejays (Geese) have sesamoid bones
    Blue tits have sesamoid bones

(c) This Barratos islander is obese
    All Barratos islanders are obese

(d) This Shreeble is blue
    All Shreebles are blue

(e) Cows require vitamin K for the liver to function
    Horses require vitamin K for the liver to function
    All mammals require vitamin K for the liver to function

(f) Cows require vitamin K for the liver to function
    Ferrets require vitamin K for the liver to function
    All mammals require vitamin K for the liver to function

FIGURE 2 | Empirical effects. (a) Similarity: when premise and conclusion are more similar (rabbits-dogs) inference is stronger than when they are less similar (rabbits-bears). (b) Typicality: typical categories (bluejays) lead to stronger inferences than less typical categories (geese). Variability: variable categories (c) lead to weaker inferences than less variable categories (d). Diversity: diverse categories (f) lead to stronger inferences than less diverse categories (e).

Using similar materials, Nisbett et al.81 found that participants were very sensitive to the perceived variability of the conclusion category. After just one case, variable categories (Figure 2(c)), for example, people on an imaginary island (Barratos) with respect to obesity, lead to weaker inferences than non-variable categories, such as imaginary birds (Shreebles) with respect to color (Figure 2(d)). Nisbett et al.81 also systematically varied the given number of observations. For example, participants were told that 1, 3, or 20 Shreebles had been observed. Inferences were stronger with increased sample size (see also Ref 80). Osherson et al.80 showed that diversity of cases also affects inductive strength, that is, Figure 2(f) is considered stronger than Figure 2(e). This diversity effect runs in the opposite direction to the typicality effect: whereas a typical premise category leads to a fairly strong inductive argument (Figure 2(b)), an argument with two typical premise categories (Figure 2(e)) is weaker than an argument with a typical premise and an atypical premise (Figure 2(f)).

A rational Bayesian model82 views evaluating an inductive argument as learning for which categories a property is true or false. In Figure 1(a), the goal is to learn which animals have sesamoid bones. For this novel property, hypotheses must be derived from prior knowledge about familiar properties. People know some facts that are true of all mammals (including cows), but they also know some facts that are true of cows but not of some other mammals. The question is which of these known kinds of properties the novel property, 'has sesamoid bones', resembles most: an all-mammal property, or a cow-only property? Crucially, it is assumed that novel properties follow the same distribution as known properties. Because many known properties of cows are also true of other mammals, argument Figure 1(a) seems fairly strong.

As well as typicality, a Bayesian model also addresses the other key results in inductive reasoning. Similarity effects arise because, given that rabbits have sesamoid bones, it is more likely that dogs have them than that bears do, because rabbits and dogs share more known properties than rabbits and bears. Diversity effects are also addressed. Figure 2(e) will access many idiosyncratic properties true just of large farm animals, and so a novel property of cows and horses may seem idiosyncratic to farm animals. In contrast, Figure 2(f) could not access familiar idiosyncratic properties true of just these two animals, so prior hypotheses must be derived from known properties that are true of all mammals or all animals. We have focused here on a narrow class of inductive inference problems that have been especially well-studied empirically. But recent Bayesian models have analyzed a wide range of inductive problems, which can be naturally formulated and modeled in probabilistic terms.83,84
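The flavor of the Bayesian treatment just described can be conveyed with a small sketch, loosely in the spirit of Ref 82 but not a re-implementation of it: hypotheses about which animals carry the novel property are generated from a handful of invented 'known properties', and the strength of an argument is the posterior probability that the conclusion category has the property, given that the premise categories do.

```python
# Toy Bayesian property induction.  Hypotheses about which animals have the
# novel property are drawn from invented 'known properties'; argument
# strength = P(conclusion has the property | premise categories have it).

ANIMALS = {"cow", "horse", "ferret", "rabbit", "dog", "bear"}

# Each known property defines the set of animals it is true of; the novel
# property is assumed to be distributed like one of these.
KNOWN_PROPERTIES = [
    ANIMALS,                # an all-mammal property
    ANIMALS,                # all-mammal properties are common
    {"cow", "horse"},       # a large-farm-animal property
    {"cow"},                # a cow-only property
    {"rabbit", "dog"},      # a small-mammal-ish property
]

def argument_strength(premises, conclusion):
    prior = 1.0 / len(KNOWN_PROPERTIES)        # uniform prior over hypotheses
    consistent = [h for h in KNOWN_PROPERTIES if premises <= h]
    p_premises = len(consistent) * prior
    p_premises_and_conclusion = sum(prior for h in consistent if conclusion <= h)
    return p_premises_and_conclusion / p_premises

# 'Cows and horses have it' -> 'all mammals have it' (non-diverse premises)
print(argument_strength({"cow", "horse"}, ANIMALS))   # 0.67
# 'Cows and ferrets have it' -> 'all mammals have it' (diverse premises)
print(argument_strength({"cow", "ferret"}, ANIMALS))  # 1.0
```

Diverse premises rule out the idiosyncratic hypotheses and leave only the all-mammal ones, so the diverse argument comes out stronger, as in the empirical diversity effect.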
DEDUCTIVE REASONING

Work on ostensibly deductive reasoning tasks reveals many apparent errors and biases when performance is compared to classical logical standards.85 The recent emergence of rational Bayesian models casts this performance in a better light by comparing performance to a probabilistic standard.86,87 Such models have been developed in all three main areas investigated in the psychology of reasoning: conditional inference,88 data selection,89 and syllogistic reasoning.90 The key idea behind them all is that the conditional probability, P(q|p), provides the meaning of conditional statements, if p then q (e.g., if you turn the key then the car starts), and so P(if p then q) = P(q|p). This latter identity is called The Equation.91,92 To illustrate the application of rational Bayesian models in this area, we concentrate on conditional inference, which is currently the most researched topic in the area.

Four inference patterns have mainly been studied: two which are logically valid, modus ponens (MP) and modus tollens (MT), and two fallacies, denying the antecedent (DA) and affirming the consequent (AC) (Figure 3). Classical logic predicts endorsement of the valid inferences and rejection of the fallacies. However, all four inferences are endorsed above 50% and in the characteristic order MP > MT > AC > DA,93 revealing a large discrepancy between performance and logical expectations.

(MP)  p ⇒ q, p   ∴ q
(MT)  p ⇒ q, ¬q  ∴ ¬p
(DA)  p ⇒ q, ¬p  ∴ ¬q
(AC)  p ⇒ q, q   ∴ p

FIGURE 3 | The valid inferences, modus ponens (MP) and modus tollens (MT), and the fallacies, denying the antecedent (DA) and affirming the consequent (AC), investigated in conditional inference. Each schema is to be read as saying that if the premises (listed before '∴') are true, so must be the conclusion (after '∴').

The core intuition behind a rational Bayesian model of conditional inference is that it must account for the non-monotonicity of everyday informal reasoning with conditionals.94,95 Classical logic is monotonic (Figure 4(a)) and hence is unable to account for the ability of additional information to defeat previously derived conclusions (Figure 4(b)). The only recourse is to question the premises, for example, in Figure 4(b), to suggest that birds fly is false. But surely, while defeasible, this is a very useful generalization that we would not want to reject as false.

The Bayesian approach is to adopt The Equation and to treat conditional inference as Bayesian conditionalization.87,88 That is, people are trying to determine the posterior probability of the conclusion, P1(flys(a)), given they now know that the categorical premise holds with certainty, P1(bird(a)) = 1 (Figure 4(a)). By Bayesian conditionalization, P1(flys(a)) = P0(flys(a)|bird(a)); that is, the posterior probability of the conclusion equals the prior conditional probability of the conclusion given the categorical premise. Note that this approach easily handles non-monotonicity: for example, P0(flys(a)|bird(a)) = 0.9 and P0(flys(a)|bird(a), Ostrich(a)) = 0 are perfectly probabilistically consistent (Figure 4(b)).

FIGURE 4 | Monotonic (a) and non-monotonic (b) conditional inference by MP. In (a), the additional information, that the particular triangle a is red, cannot override the original conclusion that, qua triangle, a has three sides. In contrast, in (b), the additional information, that the particular bird a is an Ostrich, does override the original conclusion that, qua bird, a can fly.
This approach cannot immediately apply to MT and the fallacies because, for example, DA requires knowledge of P0(¬flys(a)|¬bird(a)), and there is insufficient information in the premises to determine this probability. This is actually also true of P0(flys(a)|bird(a)) for MP, which, on the subjective view of probability (see Introductory text), must be determined by reference to global world knowledge via the Ramsey test: that is, add the antecedent, bird(a), to one's stock of beliefs, make minimal adjustments to incorporate it, and then read off the probability of the consequent, flys(a); this is P0(flys(a)|bird(a)) (Figure 5). To determine the conditional probabilities for DA, AC, and MT requires the assumption that the priors P0(flys(x)) and P0(bird(x)) are also available from global world knowledge. Figure 6 shows how well the Bayesian conditionalization model accounts for the principal data on conditional inference.

FIGURES 5 and 6 | (Figures not reproduced here; only fragments of their axis labels survive in the source text, including 'Prob (Conclusion)', 'P (Endorse)', and 'Data'/'Model' series labels.)
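A minimal sketch of this conditionalization account, in the spirit of Refs 87 and 88 though with invented parameter values: given P0(q|p) together with priors P0(p) and P0(q) drawn from world knowledge, each of the four inferences is scored by the relevant conditional probability. The published models add further assumptions, not included here, to capture the empirical MP > MT > AC > DA ordering.

```python
# Sketch of conditional inference as Bayesian conditionalization.  Each
# inference is scored by the relevant conditional probability computed from
# P0(p), P0(q) and P0(q|p); the parameter values below are invented, and the
# full published models add further assumptions (not included here) to
# capture the empirical MP > MT > AC > DA ordering.

def conditional_inferences(p_p, p_q, p_q_given_p):
    p_pq = p_p * p_q_given_p                  # P0(p and q)
    p_not_p_not_q = 1 - p_p - p_q + p_pq      # P0(not-p and not-q)
    return {
        "MP: q from p":          p_q_given_p,
        "MT: not-p from not-q":  p_not_p_not_q / (1 - p_q),
        "DA: not-q from not-p":  p_not_p_not_q / (1 - p_p),
        "AC: p from q":          p_pq / p_q,
    }

# 'If it is a bird, it flies', with birds and flying things both fairly rare.
for inference, prob in conditional_inferences(0.1, 0.2, 0.9).items():
    print(f"{inference}: {prob:.2f}")
```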
ARGUMENTATION

Reasoning and decision making often take place in the service of argumentation, that is, the attempt to persuade yourself or others of a particular, perhaps controversial, position.97 The rational Bayesian approach has been extended to at least some aspects of argumentation.98 On this view, concern centers on how the premises, P, of an argument affect the probability of the conclusion, C. If P(C|P) is high then the argument has high inductive strength.

This account has been applied most directly to reasoning fallacies, in the attempt to understand how some instances seem to be good arguments while others do not.99 For example, the classical so-called argument from ignorance, or argumentum ad ignorantiam, has many seemingly very weak exemplars:

    Ghosts exist, because nobody has proven that they don't. (1)

However, other exemplars of this argument form seem quite strong in scientific and everyday discourse:

    This drug is safe, because no-one has found any toxic effects. (2)
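The contrast between examples (1) and (2) can be given a toy Bayesian reading; the numbers are invented, and this is only an illustration of the P(C|P) idea, not a re-implementation of the model in Ref 100. A search that would very probably have found toxic effects if they existed makes 'no effects found' strong evidence for safety, whereas the absence of a disproof of ghosts carries almost no evidential weight.

```python
# Toy Bayesian reading of the argument from ignorance: how much does
# 'nothing was found' raise the probability of the conclusion?
# All numbers are invented for illustration.

def p_conclusion_given_nothing_found(prior, p_find_if_false, p_find_if_true=0.0):
    """P(conclusion | no disconfirming evidence found), where
    p_find_if_false = P(evidence would have been found | conclusion false)."""
    p_none_if_true = 1 - p_find_if_true
    p_none_if_false = 1 - p_find_if_false
    num = p_none_if_true * prior
    return num / (num + p_none_if_false * (1 - prior))

# (2) 'This drug is safe, because no-one has found any toxic effects':
#     extensive trials would very likely have detected toxicity if present.
print(p_conclusion_given_nothing_found(prior=0.5, p_find_if_false=0.95))   # ~0.95

# (1) 'Ghosts exist, because nobody has proven that they don't':
#     a disproof would be very unlikely to turn up even if ghosts do not exist.
print(p_conclusion_given_nothing_found(prior=0.01, p_find_if_false=0.02))  # ~0.01
```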
REFERENCES
1. Gregory R. The Intelligent Eye. London: Weidenfeld and Nicolson; 1970.
2. Johansson P, Hall L, Sikström S, Olsson A. Failure to detect mismatches between intention and outcome in a simple decision task. Science 2005, 310:116–119.
3. Quine WVO. Word and Object. Cambridge, MA: Harvard University Press; 1960.
4. Bernoulli J. Ars Conjectandi (The Art of Conjecture), 1713. (Translation and notes by Edith Dudley Sylla, Baltimore, MD: Johns Hopkins University Press, 2005.)
5. Marr D. Vision. San Francisco, CA: W. H. Freeman; 1982.
6. Beck J, Ma WJ, Kiani R, Hanks T, Churchland AK, et al. Probabilistic population codes for Bayesian decision making. Neuron 2008, 60:1142–1152.
7. Lehman R. On confirmation and rational betting. J Symbolic Logic 1955, 20:251–262.
8. Freeman WT. The generic viewpoint assumption in a framework for visual perception. Nature 1994, 368:542–545.
9. Helmholtz H. Treatise on Physiological Optics, vol. 3. New York: Dover; 1910/1962. (English translation by JPC Southall for the Optical Society of America (1925) from the 3rd German edition of Handbuch der physiologischen Optik, Hamburg: Voss, 1910; first published in 1867, Leipzig: Voss.)
10. Westheimer G. Was Helmholtz a Bayesian? Perception 2008, 37:642–650.
11. Liberman AM, Mattingly IG. The motor theory of speech perception revised. Cognition 1985, 21:1–36.
12. Yuille A, Kersten D. Vision as Bayesian inference: analysis by synthesis? Trends Cogn Sci 2006, 10:301–308.
13. Tu Z, Zhu S-C. Image segmentation by data-driven Markov chain Monte Carlo. IEEE Trans Pattern Anal Mach Intell 2002, 24:657–673.
14. Shepard RN. Ecological constraints on internal representation. Psychol Rev 1984, 91:417–447.
15. Bar M, Kassam KS, Ghuman AS, Boshyan J, Schmid AM. Top-down facilitation of visual recognition. Proc Natl Acad Sci USA 2006, 103:449–454.
16. Ernst MO, Banks MS. Humans integrate visual and haptic information in a statistically optimal fashion. Nature 2002, 415:429–433.
17. Weiss Y. Interpreting images by propagating Bayesian beliefs. In: Mozer MC, Jordan MI, Petsche T, eds. Advances in Neural Information Processing Systems 9. Cambridge, MA: MIT Press; 1997, 908–915.
18. Adelson EH, Pentland AP. The perception of shading and reflectance. In: Knill D, Richards W, eds. Perception as Bayesian Inference. Cambridge: Cambridge University Press; 1996, 409–423.
19. Blake A, Bulthoff HH, Sheinberg D. Shape from texture: ideal observers and human psychophysics. In: Knill D, Richards W, eds. Perception as Bayesian Inference. Cambridge: Cambridge University Press; 1996, 287–321.
20. Feldman J. Bayesian contour integration. Percept Psychophys 2001, 63:1171–1182.
21. Feldman J, Singh M. Information along curves and closed contours. Psychol Rev 2005, 112:243–252.
22. Barlow HB. Sensory mechanisms, the reduction of redundancy, and intelligence. In: The Mechanisation of Thought Processes. London: HMSO; 1959, 535–539.
23. Snippe HP, Poot L, van Hateren JH. A temporal model for early vision that explains detection thresholds for light pulses on flickering backgrounds. Vis Neurosci 2000, 17:449–462.
24. Attneave F. Some informational aspects of visual perception. Psychol Rev 1954, 61:183–193.
25. Hochberg JE, McAlister E. A quantitative approach to figural "goodness". J Exp Psychol 1953, 46:361–364.
26. Leeuwenberg ELJ. Quantitative specification of information in sequential patterns. Psychol Rev 1969, 216–220.
27. Leeuwenberg E. A perceptual coding language for perceptual and auditory patterns. Am J Community Psychol 1971, 84:307–349.
28. Leeuwenberg E, Boselie E. Against the likelihood principle in visual form perception. Psychol Rev 1988, 95:485–491.
29. Mach E. The Analysis of Sensations and the Relation of the Physical to the Psychical. New York: Dover Publications; 1959. (Original work published 1914.)
30. Restle F. Theory of serial pattern learning: structural trees. Psychol Rev 1970, 77:481–495.
31. Van der Helm PA, Leeuwenberg PA. Goodness of visual regularities: a non-transformational approach. Psychol Rev 1996, 103:429–496.
32. Chater N. Reconciling simplicity and likelihood principles in perceptual organisation. Psychol Rev 1996, 103:566–581.
33. Reed SK. Pattern recognition and categorization. Cognit Psychol 1972, 3:382–407.
34. Rosch E, Mervis CB. Family resemblances: studies in the internal structure of categories. Cognit Psychol 1975, 7:573–605.
35. Medin DL, Schaffer MM. Context theory of classification learning. Psychol Rev 1978, 85:207–238.
36. Ashby FG, Gott RE. Decision rules in the perception and categorization of multidimensional stimuli. J Exp Psychol Learn Mem Cogn 1988, 14:33–53.
37. Ashby FG, Townsend JT. Varieties of perceptual independence. Psychol Rev 1986, 93:154–179.
38. Fried LS, Holyoak KJ. Induction of category distributions: a framework for classification learning. J Exp Psychol Learn Mem Cogn 1984, 10:234–257.
39. Lamberts K. Information-accumulation theory of speeded categorization. Psychol Rev 2000, 107:227–260.
40. Nosofsky RM. Attention, similarity, and the identification-categorization relationship. J Exp Psychol Gen 1986, 115:39–57.
41. Anderson JR. The adaptive nature of human categorization. Psychol Rev 1991, 98:409–429.
42. Griffiths TL, Sanborn AN, Canini KR, Navarro DJ. Categorization as nonparametric Bayesian density estimation. In: Oaksford M, Chater N, eds. The Probabilistic Mind: Prospects for Rational Models of Cognition. Oxford: Oxford University Press; 2008.
43. Goodman ND, Tenenbaum JB, Griffiths TL, Feldman J. Compositionality in rational analysis: grammar-based induction for concept learning. In: Oaksford M, Chater N, eds. The Probabilistic Mind: Prospects for Rational Models of Cognition. Oxford: Oxford University Press; 2008.
44. Rosseel Y. Mixture models of categorization. J Math Psychol 2002, 46:178–210.
45. Heller KA, Sanborn A, Chater N. Hierarchical learning of dimensional biases in human categorization. Neural Inf Process Syst 2009.
46. Tenenbaum JB, Griffiths TL, Kemp C. Theory-based Bayesian models of inductive learning and reasoning. Trends Cogn Sci 2006, 10:309–318.
47. Dickinson A. Contemporary Animal Learning Theory. Cambridge: Cambridge University Press; 1980.
48. Kamin LJ. "Attention-like" processes in classical conditioning. In: Jones MR, ed. Miami Symposium on the Prediction of Behavior, 1967: Aversive Stimulation. Coral Gables, FL: University of Miami Press; 1968, 9–31.
49. Courville AC, Daw ND, Touretzky DS. Bayesian theories of conditioning in a changing world. Trends Cogn Sci 2006, 10:294–300.
50. Gallistel CR, Gibbon J. Time, rate, and conditioning. Psychol Rev 2000, 107:289–344.
51. Kakade S, Dayan P. Acquisition and extinction in autoshaping. Psychol Rev 2002, 109:533–544.
52. Cheng PW. From covariation to causation: a causal power theory. Psychol Rev 1997, 104:367–405.
53. Griffiths TL, Tenenbaum JB. Structure and strength in causal induction. Cognit Psychol 2005, 51:354–384.
54. Sloman SA, Lagnado DA. Do we "do"? Cogn Sci 2005, 29:5–39.
55. Spirtes P, Glymour C, Scheines R. Causation, Prediction and Search. Cambridge, MA: MIT Press; 1993.
56. Pearl J. Causality: Models, Reasoning and Inference. Cambridge: Cambridge University Press; 2000.
57. Gopnik A, Glymour C, Sobel DM, Schulz LE, Kushnir T, et al. A theory of causal learning in children: causal maps and Bayes nets. Psychol Rev 2004, 111:3–32.
58. Gopnik A, Sobel DM, Schulz LE, Glymour C. Causal learning mechanisms in very young children: two-, three-, and four-year-olds infer causal relations from patterns of variation and covariation. Dev Psychol 2001, 37:620–629.
59. Sobel D, Tenenbaum J, Gopnik A. Children's causal inferences from indirect evidence: backwards blocking and Bayesian reasoning in preschoolers. Cogn Sci 2004, 28:303–333.
60. Michotte A. The Perception of Causality. New York: Basic Books; 1963.
61. Heider F, Simmel M. An experimental study of apparent behavior. Am J Community Psychol 1944, 57:243–259.
62. Sanborn AN, Mansinghka VK, Griffiths TL. A Bayesian framework for modeling intuitive dynamics. In: Taatgen NA, van Rijn H, eds. Proceedings of the 31st Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society; 2009, 1145–1150.
63. Baker CL, Tenenbaum JB, Saxe RR. Action as inverse planning. Cognition, in press.
64. Chater N, Manning C. Probabilistic models of language processing and acquisition. Trends Cogn Sci 2006, 10:335–344.
65. Rabiner L, Juang L. Fundamentals of Speech Recognition. New York: Prentice Hall; 1993.
66. Frazier L. On Comprehending Sentences: Syntactic Parsing Strategies. PhD Dissertation, University of Connecticut, 1979.
67. Chater N, Crocker MJ, Pickering MJ. The rational analysis of inquiry: the case of parsing. In: Oaksford M, Chater N, eds. Rational Models of Cognition. Oxford: Oxford University Press; 1998, 441–468.
68. Narayanan S, Jurafsky D. A Bayesian model predicts human parse preference and reading time in sentence processing. In: Dietterich TG, Becker S, Ghahramani Z, eds. Advances in Neural Information Processing Systems, vol. 14. Cambridge, MA: MIT Press; 2002, 59–65.
69. McRae K, Spivey-Knowlton MJ, Tanenhaus MK. Modeling the influence of thematic fit (and other constraints) in online sentence comprehension. J Mem Lang 1998, 38:283–312.
70. Pinker S. Formal models of language learning. Cognition 1979, 7:217–283.
71. Chater N, Vitányi P. 'Ideal learning' of natural language: positive results about learning from positive evidence. J Math Psychol 2007, 51:135–163.
72. Foraker S, Regier T, Khetarpal N, Perfors A, Tenenbaum J. Indirect evidence and the poverty of the stimulus: the case of anaphoric one. Cogn Sci 2009, 33:287–300.
73. Goldsmith J. An algorithm for the unsupervised learning of morphology. Nat Lang Eng 2006, 12:353–371.
74. Landauer TK, Dumais ST. A solution to Plato's problem: the Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychol Rev 1997, 104:211–240.
75. Redington M, Chater N, Finch S. Distributional information: a powerful cue for acquiring syntactic categories. Cogn Sci 1998, 22:425–469.
76. Griffiths TL, Steyvers M. Finding scientific topics. Proc Natl Acad Sci USA 2004, 101:5228–5235.
77. Klein D, Manning C. A generative constituent-context model for improved grammar induction. In: Proceedings of the Annual Conference of the Association for Computational Linguistics (ACL 40), University of Pennsylvania, Philadelphia, PA, USA, 2002, 128–135.
78. Heit E. Properties of inductive reasoning. Psychon Bull Rev 2000, 7:569–592.
79. Rips LJ. Inductive judgments about natural categories. J Verbal Learn Verbal Behav 1975, 14:665–681.
80. Osherson DN, Smith EE, Wilkie O, Lopez A, Shafir E. Category-based induction. Psychol Rev 1990, 97:185–200.
81. Nisbett RE, Krantz DH, Jepson C, Kunda Z. The use of statistical heuristics in everyday inductive reasoning. Psychol Rev 1983, 90:339–363.
82. Heit E. A Bayesian analysis of some forms of inductive reasoning. In: Oaksford M, Chater N, eds. Rational Models of Cognition. Oxford: Oxford University Press; 1998, 248–274.
83. Kemp C, Jern A. A taxonomy of inductive problems. In: Taatgen N, van Rijn H, eds. Proceedings of the 31st Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates; 2009, 255–260.
84. Kemp C, Tenenbaum JB. Structured statistical models of inductive reasoning. Psychol Rev 2009, 116:20–58.
85. Evans J.St.B.T. Heuristics and analytic processes in reasoning. Br J Health Psychol 1984, 75:541–568.
86. Oaksford M, Chater N. The probabilistic approach to human reasoning. Trends Cogn Sci 2001, 5:349–357.
87. Oaksford M, Chater N. Bayesian Rationality: The Probabilistic Approach to Human Reasoning. Oxford: Oxford University Press; 2007.
88. Oaksford M, Chater N, Larkin J. Probabilities and polarity biases in conditional inference. J Exp Psychol Learn Mem Cogn 2000, 26:883–889.
89. Oaksford M, Chater N. A rational analysis of the selection task as optimal data selection. Psychol Rev 1994, 101:608–631.
90. Chater N, Oaksford M. The probability heuristics model of syllogistic reasoning. Cognit Psychol 1999, 38:191–258.
91. Adams EW. The utility of truth and probability. In: Weingartner P, Schurz G, Dorn G, eds. The Role of Pragmatics in Contemporary Philosophy. Vienna: Holder-Pichler-Tempsky; 1998, 176–194.
92. Edgington D. On conditionals. Mind 1995, 104:235–329.
93. Schroyens W, Schaeken W. A critique of Oaksford, Chater and Larkin's (2000) conditional probability model of conditional reasoning. J Exp Psychol Learn Mem Cogn 2003, 29:140–149.
94. Oaksford M, Chater N. Against logicist cognitive science. Mind Lang 1991, 6:1–38.
95. Oaksford M, Chater N. Rationality in an Uncertain World. Hove, England: Psychology Press; 1998.
96. Oaksford M, Chater N. Probability logic and the Modus Ponens-Modus Tollens asymmetry in conditional inference. In: Chater N, Oaksford M, eds. The Probabilistic Mind: Prospects for Bayesian Cognitive Science. Oxford: Oxford University Press; 2008, 97–120.
97. van Eemeren FH, Grootendorst R. Argumentation, Communication, and Fallacies: A Pragma-Dialectical Perspective. Mahwah, NJ: Lawrence Erlbaum Associates; 1992.
98. Hahn U, Oaksford M. The rationality of informal argumentation: a Bayesian approach to reasoning fallacies. Psychol Rev 2007, 114:704–732.
99. Hamblin CL. Fallacies. London: Methuen; 1970.
100. Oaksford M, Hahn U. A Bayesian analysis of the argument from ignorance. Can J Exp Psychol 2004, 58:75–85.