Representing Action and Intention
SEBO UITHOL
Donders series 94
Supervisor:
prof. dr. Harold Bekkering
Co-supervisors:
dr. Pim Haselager
dr. Iris van Rooij
Manuscript committee:
prof. dr. Günther Knoblich
prof. dr. Vittorio Gallese (Università degli studi di Parma)
prof. dr. Bernard Hommel (Universiteit Leiden)
The studies described in this thesis were carried out at the Donders Institute for Brain, Cognition, and Behaviour at
the Radboud University Nijmegen, the Netherlands.
ISBN/EAN: 978-94-91027-38-3
Printed by Ipskamp Drukkers, Enschede, the Netherlands
© Sebo Uithol, 2012
one introduction
two mirror neurons as representations
three motor resonance
four action hierarchies
five intentions in action
six action understanding in infants
seven discussion
references
summary
dutch summary
thank you
publications
curriculum vitae
donders series
one
representation
action
intention
introduction
The concept of ‘intentions’ plays a crucial role in our everyday explanations of our own and others’ behavior. Intentions are generally taken to
be a goal state combined with an action plan for how to reach that state,
which makes the concepts of actions and intentions deeply intertwined: Actions are something we do, intentions make us do it. Movements that are
unintentional, such as blinking, sneezing and spilling coffee, are not considered actions, and—vice versa—desires that do not involve actions are not
considered intentions. Moreover, the relation between intentions and actions seems to be hierarchical: A goal is achieved by means of one or more
actions, that in turn can consist of multiple sub-actions. For example, I can
form the intention to get coffee, and subsequently plan the actions needed
to reach my goal: getting up, walking to the hallway, locking the office door,
etc. Locking the office door, in turn, consists of taking the office key from
my pocket, inserting it in the lock, etc. How do we do that? What neural mechanisms could underlie this ability to structure our behavior in such a way that
a goal can be achieved?
In this thesis I will argue that the way we understand an action is not necessarily identical to the way we execute it. Therefore, the hierarchical
structure seemingly inherent in actions need not match a similar hierarchy
in the control of these actions. I will first analyze what we understand when
we understand an action. Next, I will argue that the hierarchy found in action control is not a straightforward causal hierarchy, in which elements
higher in the hierarchy cause or initiate elements lower in the hierarchy.
Instead, the hierarchical structure seems to emerge from multiple processes that run at different time scales. As our behavior is the result of such
an emergent hierarchical structure in which each of the elements jointly
contribute, the notion of ‘intention’, thought to occupy the highest regions
of the hierarchy, and be the primary cause of our actions, will change dramatically. This, in turn, impacts our conception of ‘action understanding’.
The result I present here is not so much a detailed theory of how our actions
are controlled in the brain, but rather an alternative framework that allows
for designing new experiments and formulating new theories on action execution and action understanding.
First, this introduction will give a brief overview of the main concepts
of this thesis: representation in cognitive neuroscience, intention, and action representation in the brain. In discussing these notions, I will further
specify the problems that will be the topic of the subsequent chapters.
Representation in cognitive neuroscience and philosophy

A representation consists of four elements: vehicle, content, user and object1.
The object, event, or neural state that carries the information is called the vehicle; the information that is carried by the vehicle is called the content. Each representation necessarily contains a vehicle and content. Representations are
commonly identified by their content (Uithol, Haselager, & Bekkering, 2008;
Uithol, van Rooij, Bekkering, & Haselager, 2011a), which means that two
representations are different when their content is different2. The content of
two representations is different when each representation plays a different role in the cognitive system. This is often called the functional discreteness
of representations (Haselager, 1997; Stich, 1983; see also Chapter 5).
The entity that is represented, which could be an object or an event, is
called the object. The content stands in some relation to the object that is represented, but content and object are not identical. The content contains only
certain aspects of the object. The fact that content and object are not the
same makes misrepresentation possible (often emphasized in theoretical
accounts, see for instance Cummins (1989) and Dretske (1988)), but also
appears to be useful for assessing representational claims about mirror neurons
(see Chapter 2). The fourth and final element of a representation is the user.
The user is the system or process that uses the representation to guide its
behavior, entailing that something is a representation only to some system
or process. The importance of a user is often emphasized in theoretical
accounts (Bechtel, 1998; Eliasmith, 2005; Haugeland & Rumelhart, 1991;
Millikan, 1984), but remains rather implicit in discussions in cognitive psychology and cognitive neuroscience. This is potentially problematic, as the
content of one and the same vehicle can vary, depending on the user that
reads the representation (see Eliasmith, 2005).
1 This set of core elements is different from—for instance—Harvey’s (1996) set. Harvey
places more emphasis on the communicational aspects of representation and deems user,
producer, representation and object the essential aspects. This emphasis on a representation
producer dismisses what Dretske (1988) calls ‘natural signs’ from the realm of representation,
because here there is no producer, only significant features in an environment. To encompass
non-communicative features as well, I will stick to the four elements mentioned in the text.
2 Consequently, two different representations can use the very same representational resources, or vehicle. This phenomenon is referred to as “superimposition” (van Gelder, 1999).
Throughout the history of cognitive science, the notion of representation has played a fundamental role as an explanatory construct. Cognition
and behavior were thought to be the result of computational processes that
use representations. The notion has been defined, operationalized and interpreted many times (Bechtel, 1998; Chemero, 2000; Clark, 1997; Dretske,
1988; Haugeland & Rumelhart, 1991; Millikan, 1984), but, as it turns out,
never entirely satisfactorily (see Cummins, 1989; Haselager, De Groot, & Van
Rappard, 2003). A particularly problematic issue seems to be characterizing the
relation between the object and the content. To illustrate: according to
Clark (1997) and Haugeland (1991), a system is “representation using just
in case: 1) It must coordinate its behaviors with environmental features that
are not always ‘reliably present to the system’, 2) It copes with such cases
by having something else (in place of a signal directly received from the
environment) “stand in” and guide behavior in its stead, and 3) That “something else” is part of a more general representational scheme that allows the
standing in to occur systematically and allows for a variety of related representational states” (Clark, 1997).
However, this “standing in”, which is supposed to capture the relation between content and object, is still open to interpretation. Some philosophers (e.g., Dretske (1988)) have tried to ground “standing in for” in a reliable covariance
of the representation and the feature it represents, but others have pointed
out that covariance is neither sufficient (Clark, 1997) nor necessary (Millikan, 1984) for standing in. Covariance is not a necessary condition because, for example, an alarm bell can represent a radiation leak even when
a leak has never occurred. Covariance is not sufficient either, because an environmental state can be continuously available to the system, so there is no
need for representation. For example, young sunflowers track the sun with
their heads, resulting in a covariance of the sun position and the orientation
of the flower heads. Yet, it seems awkward to describe the behavior of the
sunflowers in terms of internal representations of the position of the sun3.
Notwithstanding these conceptual difficulties, covariation seems to provide at least a first estimation or a rough approximation of the representational content, and remains the main basis for establishing representations in cognitive neuroscience. In a typical setup of a neurophysiological
experiment, an animal is presented with stimuli while the activity of single
neurons is measured. When a reliable covariation between the presence of
a stimulus and the activity in a cell is established, the cell is said to represent the stimulus. To illustrate, using this paradigm mirror neurons in the
premotor and parietal cortex (Di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992; Fogassi & Luppino, 2005; Gallese, Fadiga, Fogassi, & Rizzolatti,
1996; Rizzolatti, Fadiga, Gallese, & Fogassi, 1996), ‘edge detecting neurons’
in the primary visual areas (Hubel & Wiesel, 1959) and even ‘Jennifer Aniston neurons’ in the medial temporal cortex (Quian Quiroga, Reddy, Kreiman, Koch, & Fried, 2005) have been established.
3 Haugeland (1991) and Grush (1997) have argued that X is a representation of Y only when
Y is not always reliably present to the system. If Y is reliably present, Grush prefers using the
notion presentation instead of representation. Clark, on the other hand, prefers the notion representation as soon as this representational approach yields explanatory power, regardless of
whether the environmental feature is reliably present to the system (Clark, 1997).

Embodied representations

When cognition is claimed to be embodied, the dependency of cognition on
the body is emphasized. This dependency can take various forms (Ziemke,
2003), but in cognitive neuroscience, embodiment is usually taken to mean
that sensory and motor cortices are involved in tasks that do not directly
involve perception or action (Barsalou, 2008; de Vignemont & Haggard,
2008; Decety & Grezes, 2006; Glenberg, 1997; Mahon & Caramazza, 2008).
The framework of embodied cognition has important consequences for
our interpretation of both representational vehicle and content. For the
case of perceptual representations it entails that certain concepts, say a bird,
are not represented in an amodal, conceptual format, but in the sensorimotor areas and in a format that is similar to actual perceptions of a bird. This
representation is hypothesized to be a gradual build-up of all the neural
activity caused by encounters with a bird (Barsalou, 1999; 2008). Consequently, when thinking about birds, these perceptual areas are reactivated.
For action representations this means that actions are not stored as amodal
concepts, but as motor activations. Similarly, thinking about a certain action
activates those areas that would also be active upon performing that action.
In the action domain ‘embodiment’ is sometimes interpreted more radically, and used to denote the fact that certain features of the action are
not represented, but delegated, so to speak, to the body (Chiel & Beer,
1997; Van Dijk, Kerkhofs, van Rooij, & Haselager, 2008). As an example:
our gait is the result of a complex orchestra of movements in many joints.
The muscle activation responsible for a successful gait is hypothesized to
be controlled by a central pattern generator (Duysens & Van de Crommert,
1998). However, these neural patterns are not sufficient to generate a fluent
and efficient gait. Bodily components, such as muscle and tendon elasticity, are of crucial importance (Whittington, Silder, Heiderscheit, & Thelen,
2008). In other words, some particular stages or parts of an action are not
controlled by the neural patterns that activate muscles, but these stages are
accomplished by “neural silence”, and exploiting regularities of the body.
These features of an action are not part of the neural representation of the
action, but they are, nevertheless, part of the action4.
Due to the above-mentioned outsourcing of representational content to
bodily (and also environmental, in the case of ‘embedded cognition’) elements,
researchers in various fields started questioning the explanatory value of the
notion of representation altogether, and emphasized the constant dependency of cognition on a body and the world. This anti-representationalism
can be found in developmental psychology (Thelen & Smith, 1994), cognitive neuroscience (Beer, 2000; Edelman, Tononi, & Haier, 2003; Kelso,
1995), philosophy (Keijzer, 2001; van Gelder, 1995; 1998), and artificial intelligence (Brooks, 1991a). But the conceptual vagueness of the notion of representation proved problematic. Proposals that were considered
to be non-representational by the anti-representationalists were considered
representational by the representationalists (Haselager et al., 2003). The
debate around the ‘Watt governor’ is illustrative of the profundity of this
conceptual confusion. This relatively simple mechanism became the subject of
a debate about whether it is a representational system or not (Bechtel, 1998;
Dietrich & Markman, 2001; 2003; Nielsen, 2010; van Gelder, 1995; 1998).
In part, this thesis is able to sidestep this debate between representationalism and anti-representationalism. The first chapters—chapters 2
and 3—analyze claims that are made about the representational content
attributed to mirror neurons and motor resonance. As such, these chapters are written under the assumption of representationalism, meaning that these
chapters analyze scientific and conceptual claims that are made within this
framework, and do not question the framework itself.
This neutral stance is more difficult to maintain in chapters 4 and 5. In
chapter 4 the explicit action hierarchy, with functionally discrete elements
that engage in causal interaction, is argued to be untenable. In chapter
5 not only the straightforward causal relations between action representations are questioned, but so are the elements themselves. Specifically, here it
is argued that the notion of ‘intention’, conceived as a functionally discrete
representation, is deeply incompatible with action control processes. This
means that the function normally ascribed to an intention is performed by
various heterogeneous elements, both inside and outside of the brain (see
the discussion section). As representational content is tied to the function—as
explained above—the representational view needs to postulate a representation of an intention with a vehicle that consists of various processes and
elements. Some of these processes are dynamically coupled to each other,
or to external features. This dynamic character and distributed make-up of
the representational vehicle might in the end be reconcilable with the notion of representation, but demand such a stretch of the representational
framework that it is questionable whether the traditional notion still offers explanatory leverage.
4 Other embodied interpretations question the strict contrast between perceptual and motor representations. For example, Millikan’s (1995) “pushmi-pullyu representations”, and
Clark’s (1997) “action-oriented representations” code for a certain environmental feature (say
a predator) and at the same time the appropriate response to it (fleeing). More recently, Hommel’s ‘Theory of Event Coding’ (Hommel et al., 2001) posits a common representational medium for perceptual and action representations.

Intentions

The notion of ‘intention’ has had a long, turbulent and rambling past. In
classical psychology and philosophy, intentions are conceived of as mental
states (Hume, 1739; James, 1890). In James’ (1890) ideomotor theory, for
instance, an intention encompasses a belief about the perceptual consequences of an action and at the same time causes that action.
Halfway through the twentieth century, influenced by the linguistic turn in philosophy (Wittgenstein, 1953) and behaviorism (Skinner, 1953), intentions
were interpreted as restatements of an action, and exiled from the realm of
mental states. In her influential work, Anscombe (1957) argued that when
we explain intentional action, we give reasons for the action, not causes, thereby suggesting that intentions are not so much causes of actions as descriptions of the primary reasons of the agent. Although Davidson initially agreed
with Anscombe (Davidson, 1963), he later (1980) argued that this account
cannot account for the deliberation and planning aspects of intention. To
acknowledge the prospective (future-directed) and behavior-structuring aspects of intentions (e.g. the intention to play squash after work), and in
line with more cognitivist approaches to behavior, the notion of intention
was brought back to being a mental state.
In this refurbished interpretation of intention, the notion was adopted
in neuroscience. In his seminal experiment, Libet (1985) asked participants
to pay attention to the exact moment at which they had the intention (“felt
the urge”, p. 530) to press a button, and was, through EEG recording, able
to show that this moment was about 250 ms later than the onset of a ‘readiness potential’—a signal corresponding with the preparation of an action.
This finding created a volley of follow-up studies (Brass & Haggard, 2008;
Haynes & Rees, 2006; Lau, Rogers, Haggard, & Passingham, 2004), theoretical interpretations (Wegner, 2003), and criticism (Dennett, 1991). After Libet’s remarkable findings, not only has the timing of intentions been
studied; the emergence of fMRI as a research tool for cognitive neuroscience has made attempts to localize intentions possible as well (Burgess, Veitch,
de Lacy Costello, & Shallice, 2000; Hamilton & Grafton, 2006; Haynes et al.,
2007; Lau et al., 2004; Ouden, Frith, Frith, & Blakemore, 2005).
Although contemporary interpretations of ‘intention’ show slight variability, an intention is usually interpreted as a desired goal, combined with
an action plan to reach that goal (Bratman, 1987; Malle & Knobe, 1997;
Moses, 2001; Pacherie, 2000). These intentions are generally thought to
be functionally discrete (see above), propositional states. Yet, as will be explained in Chapter 4, an explicit and top-down control structure has proven
to be highly problematic. This means that the nature of intention in the
action hierarchy is unclear. After analyzing both the notion of intention,
with all the properties inherent to it, and the neural processes that cause
and control our actions, Chapter 5 concludes that the two frameworks are
incompatible. Intentions, it will be argued, have their origins in action explanations, not action control, and therefore fulfill an explanatory role, not
a causal one. This is not to deny the capacity of forming future-directed
plans that guide our behavior, but to show that such future-directed action
control is not best described in terms of intentions as the primary cause of
the observable behavior.
Action representation
Movements are controlled by the primary motor cortex (see Figure 1a; Kakei, Hoffman, & Strick, 1999). There seems to be an almost direct correlation between activity in these neurons and simple movements.
Electrical stimulation to neurons in the primary motor cortex will lead to
muscle twitches (Fritsch & Hitzig, 1870). It was thus found that different
parts of the primary motor cortex control different effectors (Penfield &
Rasmussen, 1950), resulting in the well-known ‘homunculus’ (Figure 1b).
Figure 1: a) the primary motor cortex, and b) the anatomical segregation of the projections of
the primary motor cortex (adapted from Penfield & Rasmussen, 1950).
What is represented in the primary motor cortex is still debated. For example, there is evidence that not only individual muscle forces, but also basic
movements (the ‘motor vocabulary’, Rizzolatti et al., 1988) are represented
here. Graziano (2007; 2002) stimulated individual neurons in the primary
motor cortex of a macaque monkey for a relatively long period (500 ms., as
opposed to the usual maximum of 50 ms). He found that this stimulation
evoked full movements, such as grasping, bringing to the mouth, or eating,
instead of the previously reported muscle twitches.
The premotor cortex and the supplementary motor area lie anterior to
the primary motor cortex. It is generally assumed that here more complex
actions, in the form of series of basic action chunks, are planned and prepared (in interaction with the basal ganglia), and that these complex representations are subsequently propagated to the primary motor cortex (Gentilucci et al., 2000; Goldberg, 1985; Grafton & Hamilton, 2007). The idea
is that actions start with action goals or intentions, often presumed to be
represented in the lateral and medial prefrontal cortex (Hamilton & Grafton, 2008), the anterior cingulate cortex (Haynes et al., 2007; Lau et al.,
2004), and limbic structures (Damasio, 1985). These intentions are posited
to be rather unspecified and context-independent (Haggard, 2005). They
are subsequently propagated to the premotor areas, where they are embedded in the current context and result in a concrete action plan (Fuster,
2004; Hamilton & Grafton, 2007). Finally, these action representations are
translated into a detailed set of movements in the primary motor area.
Although this image appeals to an intuitive logic, it seems to be an oversimplification of the neural control of actions. First, it is problematic to talk
about the origin of an intention, as this seems to suggest a brain area ‘where
it all comes together’—or a ‘Cartesian Theater’ in the words of Dennett
(1991)—and where decisions are made. Apart from the conceptual problems discussed by Dennett (the Homunculus problem, for example, see also
Bennett & Hacker (2003)), there seems to be empirical evidence that more
posterior areas are closely involved in action planning as well (Koechlin,
Ody, & Kouneiher, 2003). For example, the existence of canonical neurons
(neurons that associate certain objects to the appropriate action for that
object (Grezes, Armony, Rowe, & Passingham, 2003; Murata et al., 1997))
in the ventral premotor cortex, and mirror neurons (neurons that associate
one’s own and observed actions) in the left premotor and parietal cortex, suggests that action planning does not rely solely on anterior or medial
parts of the prefrontal regions, or on intentions that are created in absence
of context information.
Action hierarchies
The functional and anatomical segregation of different aspects of an action has widely been interpreted as supporting the idea of an action hierarchy (Byrne & Russon, 1998; Cooper & Shallice, 2006; Grafton & Hamilton, 2007; Hamilton, 2009; Hamilton & Grafton, 2007; Liepelt, Cramon, &
Brass, 2008; Saltzman, 1979; Van Elk, 2010; Van Elk, Van Schie, & Bekkering, 2008). This hierarchical view on actions is so commonly accepted that it
seems part of cognitive science’s ‘common sense’, and is as such hardly ever
explicated or questioned. Pfeifer and Scheier note that: “One of the main
reasons that the hierarchical view of behavior control has been (and still is)
so popular is that it is straightforward and easy to understand. Moreover, it
has a strong basis in folk psychology: It seems compatible with what we do in
our everyday activities” (Pfeifer & Scheier, 1999, p. 344).
The general idea of structuring an action into a hierarchy is highly similar to control structures found in classical AI systems (Good Old-Fashioned
AI, or GOFAI) and robotics. In these domains action planning is viewed as
a kind of problem solving. When the overall problem is too complicated to
solve at once, the system sets out to solve sub-problems first. This hierarchical control structure in robotics dates back to the 1960s. The robot Shakey is
controlled by a hierarchical planning structure that consists of five levels
(Nilsson, 1984). The bottom level may be thought of as defining the elementary physical capabilities of the system (i.e. the basic motor vocabulary). The
second level consists of what are called Low-Level Actions, or LLAs. Examples
of these LLAs are the robot’s physical capabilities, such as ‘roll’ and ‘tilt’.
The third level consists of a library of Intermediate-Level Actions, or ILAs.
These ILAs are preprogrammed packages of LLAs. According to Nilsson,
these ILAs are best thought of as: “instinctive abilities of the robot, analogous to such built-in complex animal abilities as ‘walk’ or ‘eat’” (Nilsson,
1984, p. 6). Above the level of ILAs there is a fourth level, which is concerned
with planning the solutions to problems. The fifth and top level consists of
the program that actually invokes and monitors executions of the ILAs.
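The layered decomposition described above can be sketched in a few lines of code. This is a toy illustration only: the goal, ILA and LLA names and all numbers are my own assumptions, not Nilsson's actual implementation.

```python
# Toy sketch of a Shakey-style action hierarchy (all names and numbers
# are illustrative, not Nilsson's code).

# Third level: a library of Intermediate-Level Actions (ILAs), each a
# preprogrammed package of Low-Level Actions (LLAs) such as 'roll' and 'tilt'.
ILA_LIBRARY = {
    "goto_door": [("roll", 3.0), ("tilt", 15), ("roll", 1.0)],
    "push_box": [("roll", 0.5), ("roll", 0.5)],
}

def plan(goal):
    """Fourth level: expand an abstract goal into a sequence of ILAs.
    In STRIPS this is a search over operator preconditions and effects;
    here it is a fixed lookup, for illustration only."""
    abstract_plans = {"move_box_to_door": ["goto_door", "push_box"]}
    return abstract_plans[goal]

def execute(goal):
    """Fifth (top) level: invoke and monitor the ILAs, refining each one
    down to the elementary LLAs of the lower levels."""
    trace = []
    for ila in plan(goal):
        for lla, arg in ILA_LIBRARY[ila]:
            trace.append((ila, lla, arg))  # bottom level would drive motors
    return trace

steps = execute("move_box_to_door")
```

The point of the sketch is the strictly top-down flow: each level only refines the output of the level above it, which is exactly the structure the next paragraphs compare to current conceptions of action hierarchies.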
Shakey’s basic planning mechanism, STRIPS, plans first in an abstract
space and then refines at successively more detailed levels. This makes its
action control structure highly similar to current conceptions of action hierarchies (Grafton & Hamilton, 2007; Hamilton & Grafton, 2007; see Chapter 4),
as well as to Pacherie’s model of intentional action (Pacherie, 2008; see Chapter 5).
However, these classical control structures have been shown to have severe limitations in dealing with real-world environments. They suffer from
various versions of the frame problem (i.e. it is impossible to assess which
consequences of an action are relevant, and which are irrelevant, see Fodor
(1983), Haselager (1997), and Pylyshyn (1987)), which makes their problem-solving routines potentially computationally intractable (as the computations needed for planning grow exponentially with every added element
in the world (van Rooij, 2008)). This makes these classical structures problematic to such an extent that Nolfi, Ikegami, and Tani deem classical approaches based on explicit design ‘hopeless’ (2008, p. 101). Strikingly, the
very same hierarchical structure that is recognized as causing the problems
in robotics, and therefore deemed hopeless, is posited to be the solution to
the problem of action planning in humans. Haggard posits that “The brain
must expand this task-level representation into an extremely detailed movement pattern specifying the precise kinematics of all participating muscles and
joints. Generating this information is computationally demanding. The
brain’s solution to the problem may lie in the hierarchical organization of
the motor system” (Haggard, 2005, p. 292). Chapter 4 will argue, however,
that the explicit representational and hierarchical structure is problematic—both conceptually and empirically—in neural action control.
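The intractability worry can be made concrete with a back-of-the-envelope count: in a STRIPS-like world described by n independent binary facts, the state space an exhaustive planner must in the worst case consider doubles with every added fact. The numbers below are purely illustrative.

```python
# Worst-case state-space growth for planning over n independent binary facts.
def state_space_size(n_facts: int) -> int:
    return 2 ** n_facts

sizes = [state_space_size(n) for n in (10, 20, 30)]
# every 10 extra facts multiply the space by 1024
```

This is of course only the crudest measure of planning cost, but it shows why adding elements to the world quickly overwhelms explicit problem-solving routines.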
As an alternative to the GOFAI control structures in robotics, Brooks
(1986; 1991a; 1991b) introduced the subsumption architecture. This architecture consists of various independent and parallel layers.
Lower layers implement simple forms of behavior, such as ‘walking’ or
‘avoiding objects’, and function as autonomous control units with their own input from the sensors and output to the effectors. Higher layers do not start
or stop activity in the lower layers, but merely modulate the output of the
lower layers. In such architectures, the resulting behavior is not the result
of top-down control, but emerges from the interaction between all layers
and the environment. The types of control at lower layers (e.g. ‘avoiding
an object’) operate on a smaller timescale than those at higher layers (e.g. ‘exploring the world’). Yamashita and Tani (2008) were able to have a similar
control structure emerge from a network in which the units operate on
different time scales. This result provides an interesting suggestion for how
the brain controls behavior in a hierarchical manner. Additionally, this suggestion seems compatible with recent models of action control (Koechlin et
al., 2003; Kouneiher, Charron, & Koechlin, 2009). In Chapter 4 I will discuss
action hierarchies, their problems and a possible solution.
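A minimal sketch can make the contrast with the top-down scheme concrete. The layer names, sensor fields and additive modulation below are my own illustrative assumptions, not Brooks's actual architecture; the essential point is that the higher layer only modulates the lower layer's output, it never commands it.

```python
# Sketch of a subsumption-style controller (after Brooks); all names
# and the modulation scheme are illustrative assumptions.

def avoid_obstacle(sensors):
    """Lower layer, fast timescale: steer away from nearby obstacles.
    Reads its own sensor input and proposes a motor output directly."""
    if sensors["obstacle_distance"] < 1.0:
        return {"turn": 0.8, "speed": 0.2}
    return {"turn": 0.0, "speed": 1.0}

def explore(sensors, lower_output):
    """Higher layer, slow timescale: bias heading toward unexplored space,
    but only by modulating the lower layer's proposal."""
    bias = 0.3 if sensors["unexplored_left"] else -0.3
    return {"turn": lower_output["turn"] + bias,
            "speed": lower_output["speed"]}

def control_step(sensors):
    low = avoid_obstacle(sensors)   # runs every step, autonomously
    return explore(sensors, low)    # modulates, never starts or stops it

out = control_step({"obstacle_distance": 0.5, "unexplored_left": True})
```

With an obstacle nearby, the avoidance layer dominates and exploration merely nudges the turn; the overall trajectory emerges from the interplay of both layers and the environment, with no top-level element issuing commands.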
Outline of this thesis
This thesis is organized as follows. In chapter 2 I will analyze claims that are
being made about the representational content of mirror neurons. I will
show that there is no limit to the level of abstraction of the content that can
be attributed to single mirror neurons. I will argue, however, that the higher
levels are less appropriate if one wants to maintain the idea of action mirroring as a form of direct matching.
As mirror neurons in macaques were found using invasive techniques,
it is difficult to establish mirror neurons in humans (Chong, Cunnington,
Williams, Kanwisher, & Mattingley, 2008; Kilner, Neal, Weiskopf, Friston, &
Frith, 2009; Lingnau, Gesierich, & Caramazza, 2009). Therefore, one usually does not speak of mirror neuron activity in humans, but more cautiously
of ‘motor resonance’—the phenomenon that the motor areas are activated
upon action observation. In chapter 3 I will discuss motor resonance, and its
putative relation to action understanding. I will show that there is great variability in the interpretations of these notions: two interpretations for motor
resonance, three interpretations for action understanding, and three interpretations for action goal. Consequently, a (ictive, but reasonable) claim
that “motor resonance contributes to understanding the goal of an action”
can have—apart from the vague “contributes to”—eighteen different meanings. An example of the misunderstanding stemming from this use of termi-
17
introduction
nology is discussed, and experiments on basis of the exposition of different
interpretations are suggested.
In chapter 4 it is argued that our intuitive conception of an action hierarchy is insufficient to capture the complexity of the neural control of our
behavior. We have an intuitive way of carving an action into sub-actions and
sub-sub-actions and implicitly assume that these components have a neural
counterpart. This need not be the case. Drawing upon recent AI (Yamashita
& Tani, 2008) and neurocognitive models (Koechlin et al., 2003; Kouneiher
et al., 2009), I will argue that there is indeed a hierarchical organization
present in neural circuits, but that it is based on differences in timescales at
which processes operate. Consequently, the straightforward causal connections between the elements in a hierarchy are untenable.
While in chapter 4 it is argued that the links between the elements in an
action hierarchy are untenable, chapter 5 continues the dissolution of the
discrete and explicit character of the action hierarchy, by arguing that even
some of the elements are not present in action control either. In this chapter the notion of ‘intention’ is analyzed and compared with neural control
of action. It is shown that the discrete character of intentions is incompatible with the dynamic nature of the control processes, and that therefore
intentions cannot play the role in guiding our actions that is commonly
assumed.
Chapter 6 serves as an example of the direct relevance of the previous
chapters to cognitive science. In this chapter the insights in action understanding (chapter 3), action control (chapter 4) and the nature of intention (chapter 5) are applied to infant action understanding and intention
attribution. It will be argued that infant studies tend to leave the notion of
‘action understanding’ underspecified, thereby paving the way for conflating action understanding and intention attribution, resulting in overly rich
interpretations of infants’ cognitive capacities.
Finally, in chapter 7 a general discussion of the preceding chapters is presented. It is concluded that our notions of action hierarchies and intentions need to be modified substantially in order to provide empirically fruitful concepts. Next, two examples of how the analyses of the preceding chapters could lead to a reinterpretation of current data will be given. Finally, I will discuss what the conclusions in this thesis mean for the concept of 'intention' itself, and how this could impact future research.
abstract
Single cell recordings in monkeys provide strong evidence for an important role of the motor system in action understanding. This evidence is backed up by data from studies of the
(human) mirror neuron system using neuroimaging or TMS techniques, and behavioral experiments. Although the data acquired from single cell recordings are generally considered
to be robust, several debates have shown that the interpretation of these data is far from
straightforward. We will show that research based on single-cell recordings allows for unlimited content attribution to mirror neurons. We will argue that a theoretical analysis of the
mirroring process, combined with behavioral and brain studies, can provide the necessary
limitations. A complexity analysis of the type of processing attributed to the mirror neuron
system can help formulate restrictions on what mirroring is and what cognitive functions
could, in principle, be explained by a mirror mechanism. We argue that the processing at
higher levels of abstraction needs assistance of non-mirroring processes to such an extent
that subsuming the processes needed to infer goals from actions under the label ‘mirroring’
is not warranted.
This chapter was published, in a slightly modified form, as: Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). What do mirror neurons mirror? Philosophical Psychology, 24(5), 607–623.
two mirror neurons as
representations
goal inference
mirror neurons
representation
theory of mind
computation
Mirroring in cognitive neuroscience
After their discovery in the early 1990s (Di Pellegrino et al., 1992; Gallese
et al., 1996; Rizzolatti et al., 1996) mirror neurons caused great excitement
in cognitive neuroscience, as these neurons seem to suggest a common
coding of action perception and action execution (Hommel, Müsseler,
Aschersleben, & Prinz, 2001; Prinz, 1997), or a shared representation between the observer and the executor of an action (de Vignemont & Haggard, 2008; Grezes & Decety, 2001). They were labeled ‘mirror neurons’, as
the observed action "seems to be 'reflected', like in a mirror, in the motor representation for the same action of the observer" (Buccino, Binkofski, & Riggio, 2004a). It is exactly this 'rock bottom' connotation of mirroring (i.e. direct reflection, or direct matching of action features) that has made it an attractive notion for explanations of cognitive functions, including action understanding (Fogassi & Luppino, 2005; Iacoboni et al., 2005; Rizzolatti, Fogassi, & Gallese, 2001), emotion understanding (de Vignemont & Singer, 2006; Keysers & Gazzola, 2006; Wicker et al., 2003), imitation (Brass &
Heyes, 2005; Buccino, Vogt, Ritzl, Fink, Zilles, Freund, & Rizzolatti, 2004b;
Iacoboni et al., 1999; Rizzolatti, 2005; Wohlschlager & Bekkering, 2002),
complementary action (Newman-Norlund, Van Schie, Van Zuijlen, & Bekkering, 2007), and communication (Ferrari, Gallese, Rizzolatti, & Fogassi,
2003; Gallese & Lakoff, 2005).
Evidence for the existence of mirroring processes is derived broadly from three types of experimental research: (i) single-cell recordings in monkeys, (ii) analyses of the entire (human) mirror neuron system (MNS) using imaging or TMS techniques, and (iii) behavioral experiments, using
interference effects and reaction times to probe properties of the MNS. The
received view is that observed actions are mapped onto the motor cortex of
the observer. When there is a matching motor representation available, the
action is recognized. This hypothesis is known as the direct-matching hypothesis (Rizzolatti et al., 2001).
In single cell research, the activity of mirror neurons is often conceptualized as a form of representation, coding for (categories of) actions or action
goals (Fogassi & Luppino, 2005; Gallese et al., 1996; Iacoboni et al., 1999;
Rizzolatti et al., 1996). In this type of research, the activity of a single neuron
is measured and related to the occurrence of an external event. When there is a reliable covariance between the neuronal activity and an external event, it is concluded that the neuronal activity represents the external event.
1 Technically, a consequence of this inability to measure individual neurons in humans is
that mirror neurons have not yet been unequivocally established in humans. There is indirect
evidence of mirror neurons in humans based on repetition suppression (Chong et al., 2008;
Kilner et al., 2009), but this result is not unequivocal (Lingnau et al., 2009).
In research based on imaging techniques or TMS and behavioral studies,
mirroring is generally viewed as a form of processing, mapping perceptual
representations of the observed action to motor representations of the observer’s own action repertoire (Buccino, Binkofski, & Riggio, 2004a; Iacoboni et al., 2005; Miyashita, 2005; Rizzolatti, 2005; Rizzolatti et al., 2001; Rizzolatti & Craighero, 2004). As this type of research is dependent on imaging
or TMS techniques, reaction times and error rates, which can only show or
influence activity in large groups of neurons, one can show the involvement
of a brain region as a whole in a certain task, but not the response or contribution of a single neuron.1
The data acquired from single cell recordings are generally regarded as robust and solid. Therefore, they are often used to guide research of the other two types, or to interpret the acquired data. For instance, Newman-Norlund et al. (2007) use the distribution of strictly and broadly congruent mirror neurons, as found by Gallese and colleagues (1996) in monkeys, to predict the BOLD signal in the MNS in two different conditions (imitative vs. complementary action). However, the existence of several debates about the function of mirror neurons (Csibra, 2007; Dinstein, Thomas, Behrmann, & Heeger, 2008; Jacob, 2008; Jacob & Jeannerod, 2005; Saxe, 2005a) indicates that although the data these single-cell experiments generate might be hard, the interpretation of these findings is far from straightforward.
In recent years several researchers have formulated criticisms of the received interpretation of the function of mirror neurons (the direct-matching hypothesis), pointing to the fact that this hypothesis cannot account for many important findings, and have formulated alternative theories (e.g. Csibra, 2007; de Vignemont & Haggard, 2008; Jacob, 2008). For example, both Csibra and Jacob argue that mirror neuron activity is not constitutive of action understanding, but only indicative of it (Csibra, 2007; Jacob, 2008; Jacob & Jeannerod, 2005). They argue that action understanding is an interpretative process that takes place outside the motor system and that mirror neurons are involved in the subsequent action prediction and planning. Although our analysis is different, the outcome can be interpreted as (partially) supporting their views. In this paper we want to analyze the paradigm that has led to many of the findings, i.e. the single cell recordings. By means of an analysis of the representational elements of mirror neurons, we
will show that this type of research allows for virtually unbounded content
attribution to individual neurons (see also Uithol et al., 2008). As a consequence, mirror neurons can be said to represent ever more abstract events,
from grip types to long-term intentions. However, by means of a complexity
analysis of goal inference, a task generally attributed to a mirroring process,
we can formulate a possible limitation on what representational output such
a process can produce. This analysis, combined with behavioral or brain
studies, provides a possible means of limiting the representational content
and can thereby help in interpreting the data acquired with single-cell measurements. We will argue that the recognition and understanding of goals and intentions need the assistance of non-mirroring processes to such an extent that subsuming these processes under the label 'mirroring' is no longer warranted. Our analysis is specifically aimed at a mirror mechanism and its alleged support for action understanding. Although there might be consequences for its support for other cognitive functions (see above), these fall outside the scope of this paper.
Mirror neurons representing actions and goals
The firing of mirror neurons is often characterized as a form of representation. The neurons are said to represent action means, action ends or goals,
and intentions. Examples are abundant: Gallese et al. (1996) propose that
a “possible function of mirror neuron movement representation is that this
representation is involved in the ‘understanding’ of motor events”; Rizzolatti et al. (1996) propose that “[mirror neuron’s] activity ‘represents’ the
observed action”; and Iacoboni and colleagues (1999) suggest that “F5 neurons code the general goal of a movement”.
What counts as an action means or an action goal is relative and a matter
of interpretation. To give an example: A precision grip can be a means to
the grasping of a cup. This cup grasping, however, can also be considered a
means to the end drinking. Drinking, in its turn, can be regarded as a means to maintaining homeostasis or engaging in social activity. There thus exists a
continuum from concrete, readily observable movements (e.g. the use of a
precision grip) to highly abstract goals and intentions (such as engaging in
social activity), and there is no a priori way to make a clear-cut and objective contrast between action means and action ends or goals. Several action
hierarchies and labels have been proposed to divide this continuum (Bekkering & Wohlschlager, 2002; Grafton & Hamilton, 2007; Jeannerod, 1994).

Figure 1. The basic elements of a representation: the representation proper (vehicle plus content), the object, and the user.

2 This contrast is also denoted with the terms intention in action and prior intentions (Searle, 1983). See also Chapter 5.
To clarify our terminology: we will speak of actions in relation to the level
of grips or simple actions (e.g. grasping with a precision grip), and of goals
when the behavior is interpreted more broadly, ranging from motor goals
(e.g. the goal of grasping a cup) to long-term intentions (cleaning the table
or spending your next holiday in Brazil)2.
As mirror neuron firing is commonly viewed as a form of representation, we will analyze it using the basic elements of a representation: vehicle,
content, object and user (Cummins, 1989; Dretske, 1988) (See Figure 1. See
also Bechtel (1998) and Shea (2007) for similar presentations of these elements.) The representation proper consists of a vehicle and a content. The
vehicle of a representation is the physical carrier (e.g. neural state) that represents. The information that is carried by the vehicle is called its content.
Content is not the same as the object that is represented. An object or event
in the outside world can be misrepresented and most of the time the content is of a more general or more abstract nature than the object represented (e.g. a sparrow can get represented as “bird”). It is important to note that
representational objects need not be physical objects. A representational
object can as well be a situation or an event, such as an action. The fourth
and final element of a representation is a user. The user is the system or
process that uses the representation to guide its behavior. In case of mirror
neurons, the user is likely to be another brain system. For a full understanding of the functionality of the mirror neuron system, one has to specify the
user of the information that mirror neurons are supposed to carry. Yet, in
most models on the working of mirror neurons the user remains unspecified. We will therefore analyze mirror neuron representations using just the
vehicle, content and object aspect.
Single-cell recording experiments are based on what we have elsewhere
called a vehicle-first approach (Uithol et al., 2008): One starts with a vehicle, in
this case a neuron, and then tries to identify the type of stimuli the vehicle
covaries with (viz. the neuron responds to), thereby establishing a characterization of its content. Not all mirror neurons are equally selective in their
responses. This has led Gallese et al. (1996) to discriminate three categories
of mirror neurons: strictly congruent, broadly congruent and non-congruent mirror neurons. Strictly congruent mirror neurons respond to observed and
executed movements that correspond both in terms of general action (e.g.
grasping) and in the way that action was executed (e.g. precision grip). During action observation, the object of the representation is the movement of
the experimenter or of another monkey. During action execution, the object
is the movement of the monkey. The content of the neuron's firing is assumed to be the shared feature of the two events that the neuron responds
to, in this case the particular action with a particular grip (e.g. a grasp with
a precision grip). When the motor and perceptual object share a common
feature that gets reflected in the activity of the vehicle, the neuron is said to
‘mirror.’
Raising levels of abstraction
Each type of broadly congruent mirror neurons responds to a variety of grips or actions (Gallese et al., 1996), and consequently no commonality in the response profile can be found at the level of grips. For instance, broadly congruent neurons of group 1 are highly specific to motor activity in terms of action and specific type of grip (e.g. a precision grip), but respond to the observation of various types of grips (e.g. both a precision grip and a full hand grip). See Table 1 for the various types of broadly congruent mirror neurons, their response profiles, and the lowest common property in the motor and visual response profile. Although it is not possible to specify a shared property on the level of grips, congruence can be found one level up, at the level of actions. Here the response profile is equally specific on the motor and perception side. The key property that mirror neurons owe their name to—the fact that the common property of a motor and a perceptual event gets reflected in the activity of one vehicle—can be preserved, but only by moving the description of the shared property from the level of
grips up to the level of actions. The representational content attributed to this neuron can then be formulated as 'grasping with the hand'.
In a similar vein, broadly congruent mirror neurons of group 2 can be taken to be congruent (and thereby representational) on the level of categories of actions, for instance hand actions versus non-hand actions. Neurons of group 3, in turn, can be considered congruent on the level of action goals, as these neurons appear to respond to the goal of an action and to be indifferent to the means by which this goal is achieved.
Non-congruent mirror neurons seem to show no clear-cut congruency between the observed action and the movement of the monkey. Hence, at first sight no common property seems available in their response profile. However, when the level of abstraction is raised to the level of object-related versus non-object-related actions, this neuron can be considered congruent and representational again, as mirror neurons only respond to object-related actions, and not to, for instance, mimed actions. The representational content of this type of neurons can thus be characterized as object-related actions. So even non-congruent mirror neurons can be made congruent by choosing the appropriate level of description.

type of mirror neuron      | response profile (m = motor, v = visual)          | lowest common property in motor and visual profile
non-congruent              | m: various actions; v: various actions            | object-related actions
broadly congruent group 3  | m: specific action; v: various actions            | specific goals (grasping to eat)
broadly congruent group 2  | m: specific hand action; v: various hand actions  | specific category of actions (e.g. hand actions)
broadly congruent group 1  | m: specific grip; v: various grips                | specific action (e.g. grasping with a hand)
strictly congruent         | m: specific grip; v: specific grip                | specific grip (e.g. grasping with precision grip)

Table 1. The various mirror neurons, their response profiles and their lowest common property in motor and visual response (i.e., their attributed content).

In sum, neurons can be made to mirror—in the sense of representing a common property of a motor and perceptual event—by invoking levels of description of increasing abstraction. As there exists an almost unlimited number of levels of abstraction, representational content can be attributed to any neuron that responds to both executed and perceived actions. It must be emphasized that this is not a problem only for broadly congruent or non-congruent mirror neurons. The same interpretational principles can render a neuron congruent or incongruent anywhere along the continuum. This problem of unbounded content attribution undermines the explanatory value of the notion of mirroring.
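The interpretational principle at work here can be made concrete with a small sketch. Assuming a toy abstraction hierarchy (the labels below are illustrative, not empirical response profiles), content attribution amounts to climbing the hierarchy until the motor and visual response profiles share a property; some shared level can almost always be found:

```python
# Toy abstraction hierarchy: each event label maps to its parent category.
# All labels are hypothetical, chosen only to mirror the examples in the text.
PARENT = {
    "precision grip": "grasping with hand",
    "full-hand grip": "grasping with hand",
    "grasping with hand": "hand action",
    "grasping with mouth": "mouth action",
    "hand action": "object-related action",
    "mouth action": "object-related action",
}

def ancestors(event):
    """Chain from the event up to the hierarchy's root."""
    chain = [event]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def attributed_content(motor_events, visual_events):
    """Lowest description level shared by all motor and visual events."""
    chains = [ancestors(e) for e in motor_events + visual_events]
    # Walk up the first chain; the first level present in every other
    # chain is the 'lowest common property'.
    for level in chains[0]:
        if all(level in c for c in chains[1:]):
            return level
    return None

# Strictly congruent: same grip on both sides -> content at grip level.
print(attributed_content(["precision grip"], ["precision grip"]))
# Broadly congruent (group 1): one grip motorically, various grips visually
# -> content one level up, at the action level.
print(attributed_content(["precision grip"],
                         ["precision grip", "full-hand grip"]))
# Non-congruent: no overlap until the object-related-action level.
print(attributed_content(["grasping with hand"], ["grasping with mouth"]))
```

Because the hierarchy can always be extended upward, the function virtually never fails to return some content, which is exactly the problem of unbounded content attribution at issue here.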
By raising the level of abstraction, one strays from the rock bottom connotation of mirroring, making it increasingly difficult to see how highly abstract properties can be 'reflected directly'. This problem can be overcome by imposing some principled restrictions on the level of abstraction at which mirroring can rightfully be said to occur. An analysis of the processes that are attributed to the MNS can offer such principled restrictions.
Action understanding in the mirror neuron system
The human MNS is assumed to consist of the rostral part of the inferior
parietal lobule, the lower part of the precentral gyrus and the posterior part
of the inferior frontal gyrus (Rizzolatti & Craighero, 2004). This mirroring
system is supposed to facilitate action understanding, goal understanding
and imitation by means of a mirroring process (Iacoboni et al., 1999; 2005;
Rizzolatti et al., 2001). Although the nature of the mirroring process is still
largely unknown, some claims about features of this process can be found
in the literature. The general idea is that perceptual representations of the
observed action are mapped to motor representations of the observer’s
own action repertoire.3 Importantly, the process is assumed to be direct (i.e.,
the mirror neuron representation is brought about without involvement of
higher, inferential processes, but by means of direct coupling, direct activation, direct association), or otherwise computationally simple. For example,
Rizzolatti and Craighero (2004) write: “The proposed mechanism is rather
simple. Each time an individual sees an action done by another individual,
neurons that represent that action are activated in the observer’s premotor cortex. […] Thus, the mirror system transforms visual information into
knowledge.” Similarly, Iacoboni (2008) writes: “[w]e do not have to draw
complex inferences or run complicated algorithms. Instead, we use mirror
neurons.”4
3 The activity of the mirror neuron system is often described as a form of resonance. This resonance is claimed to be either interpersonal, i.e. between parts of the premotor system of the observer and of the executor, or intrapersonal, i.e. between a visual and a motor representation in the observer. See Uithol et al. (2011b) for an elaboration on this distinction.
4 Strictly speaking, this statement reflects a category mistake in the sense that "we" can use mirror neurons. One may wonder what the "we" consists of if neurons are not part of it. However, we do not want to elaborate in this paper on this along the lines of Bennett & Hacker (2003). Rather, we see this statement (and many similar others) as a 'rough and ready' type of description that could be formulated more appropriately (e.g. by speaking of mirror neurons that implement our "capacity to…") when the occasion requires it.

Despite the general agreement on the simplicity of the mirroring process, there is diversity in the field when it comes to the capacity for abstraction of the mirroring process. It is claimed that the mirroring process produces representations of actions and action means (Buccino, Binkofski, & Riggio, 2004a; de Vignemont & Haggard, 2008; Fadiga, Craighero, & Olivier, 2005; Rizzolatti & Craighero, 2004), but at other places the scope of possible MNS output has been expanded to incorporate representations of the intentions behind actions (Gallese, Keysers, & Rizzolatti, 2004; Iacoboni, 2008; Iacoboni et al., 2005). For example, Rizzolatti & Sinigaglia (2010) claim that "through matching the goal of the observed motor act with a motor act that has the same goal, the observer is able to understand what the agent is doing".
The range in abstraction attributed to the output of the mirror neuron system exceeds the one depicted in Table 1 for individual mirror neurons (from action means to immediate action goals). We will argue that, on theoretical grounds, it is implausible that a direct or otherwise simple process has the capacity for reliably producing representational content at or above the level of action goals. In order to do so we will have to make minimal assumptions about what could possibly be meant by 'processing' in the context of mirroring. For our purposes it will suffice to assume that by processing one means a form of 'computation' in the broad sense of the word (Chalmers, 1995; Eliasmith, 2010; Piccinini, 2008). This would include non-traditional and non-symbolic forms of computation—such as the various forms of neural network computations—but it would exclude hypothetical mechanisms with presumed computational processing powers that have no possible physical implementation (see also Frixione (2001), Tsotsos (1990), and van Rooij (2008)).

Goal inference is context dependent

The recognition of an action alone is not sufficient for a reliable goal inference, as multiple goals can be achieved by a given action (e.g., picking up a cup for drinking, pouring, cleaning up) (de Vignemont & Haggard, 2008; Jacob & Jeannerod, 2005). Also, multiple actions can be performed to achieve a certain goal (e.g., the goal of drinking can be served by grasping a cup, ordering a beer, or opening the tap). In other words, there is not a one-to-one mapping between actions and goals, but a many-to-many mapping. Therefore, goal-action associations alone cannot produce a unique goal when observing a given action.
Which goal can be reached by an action is dependent on the context in
which the action is performed. For instance, hand waving can be a means
to shooing away mosquitoes as well as making a taxi stop. It is the context
of the action—the presence of taxis or mosquitoes, an urban environment
or a campground—that leads to a different interpretation of the observed
action. So in order to reliably infer goals from observed actions, both the
action and the context must be processed (De Ruiter, Noordzij, Newman-Norlund, Hagoort, & Toni, 2007; Jacob & Jeannerod, 2005; Kilner, Friston, & Frith, 2007a; Toni, Lange, Noordzij, & Hagoort, 2008; van Rooij, Haselager, & Bekkering, 2008). This context-dependency of goals is also present in Iacoboni and colleagues' tea-cup experiment (Iacoboni et al., 2005),
where grasping a cup from a neatly set table was used to suggest the goal
of ‘drinking’, and grasping it from a messy table to suggest ‘cleaning up’.
Typical mirror areas (the posterior part of the inferior frontal gyrus and the
adjacent sector of the ventral premotor cortex) were shown to be more active when an intention could be inferred from one of these contexts than
in cases of no context, which the authors took as evidence that a mirroring process is responsible for performing the context-dependent inferring
of goals (Iacoboni, 2008; Iacoboni et al., 2005). We argue, however, that
although it might be the case that parts of the inferior frontal cortex are
involved in context-dependent goal inference, it does not appear likely that
this is done by means of a mirroring mechanism. The degree of context
understanding required for reliable goal inferences seems, given the computational complexity of this task, to exceed by far the abilities of a direct or
otherwise simple mechanism.
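To make the point vivid, here is a deliberately naive sketch of context-dependent goal inference; all action, context, and goal labels are hypothetical, loosely based on the examples above:

```python
# Hypothetical many-to-many action/context/goal associations, purely
# to illustrate the argument; none of these labels are empirical.
GOALS = {
    ("hand wave", "campground"): "shoo away mosquitoes",
    ("hand wave", "city street"): "hail a taxi",
    ("grasp cup", "neatly set table"): "drink",
    ("grasp cup", "messy table"): "clean up",
}

def infer_goal(action: str, context: str) -> str:
    """Goal inference needs the (action, context) pair as input;
    the action alone does not determine a unique goal."""
    return GOALS.get((action, context), "goal undetermined")

# The same action yields different goals in different contexts:
print(infer_goal("hand wave", "campground"))   # shoo away mosquitoes
print(infer_goal("hand wave", "city street"))  # hail a taxi
```

Even this trivial lookup presupposes that the relevant context has already been identified and processed, which is precisely the burden that, as argued below, cannot be carried by a direct mapping.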
Goal inference is ‘non-direct’ and ‘complex’
Goal inference above the level of immediate goals cannot be based on a
direct or a simple mechanism. Although it remains somewhat implicit what
most researchers mean by direct, we will argue that it cannot mean that sensory representations are mapped onto the motor system without significant
aid of other, more inferential processes. Such a direct mapping mechanism
5 Evidently, real world situations can vary in many more than just 35 features, but to make our case we do not need to assume any more such features.
would need access to a mapping structure in which each possible action-context combination maps directly to a unique goal. But given the number
of possible actions and the number of possible relevant context aspects, this
strategy would, due to what is called a combinatorial explosion, very soon result in an unmanageable number of mappings. To illustrate, consider there
are, say, just 35 possible context features 5 (e.g., that the person in a white
coat is a surgeon (or a psychopath), the scalpel is sterile (or poisoned), the
setting is a hospital (or a movie set), the person being cut is a patient (or an
actor), the other people in the room are nurses (or medical students), etc.)
and there are several different goals (e.g., grasping the scalpel to cure, to
hurt, to cut wire, to clean, to put away, to give to a nurse, etc.). If we make
the simplifying assumption that any context feature is either 'present' or 'absent', then there already exist 2^35 (more than 34 billion) distinct possible contexts, of the same order of magnitude as the number of neurons in the entire human brain. Allowing values in between 'present' and 'absent' only serves to increase the number further. This unmanageable number
of combinations makes it impossible to solve goal inference using a direct
mapping solution. In everyday conditions, people can easily infer goals from observed actions in a certain context. This suggests that people do not instantiate all possible action-context combinations. Instead, it suggests that
action to goal mappings must be ‘non-direct’ or ‘inferential,’ in the sense
that they use some form of knowledge about the interaction between actions
and contexts to infer plausible goals.
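The arithmetic behind this combinatorial explosion is easy to verify. With n binary context features there are 2^n distinct contexts, and a direct mapping needs one entry per action-context pair (the 35 features follow the text; the count of 10 actions is an arbitrary illustration):

```python
# Size of a direct lookup table mapping (action, context) pairs to goals,
# where each context is one setting of n binary features. The 35 features
# follow the text; the 10 actions are an arbitrary illustrative number.
def table_size(n_actions: int, n_binary_features: int) -> int:
    """One table entry per possible action-context combination."""
    return n_actions * 2 ** n_binary_features

print(f"{2 ** 35:,} distinct contexts")         # 34,359,738,368
print(f"{table_size(10, 35):,} table entries")  # 343,597,383,680
```

Each additional binary feature doubles the table, so the intractability of the direct-mapping strategy does not hinge on the exact feature count chosen here.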
Besides statements about mirroring being a direct process, claims that
context-sensitive goal inferences could be achieved by a simple mechanism
can also be found in the literature (Iacoboni, 2008; Rizzolatti & Craighero,
2004). There are, however, reasons to believe that no simple mechanism
can support a general capacity for making context-sensitive goal inferences,
as these inferences belong to a class of inferences that are known to be notoriously difficult. Inferring goals from actions is a form of abduction, also
called inference to the best explanation (Baker et al., 2008; Charniak & Goldman, 1993; Haselager, 1997). In abduction, causes are hypothesized to explain observed effects. In the case of goal inference, the cause is a ‘goal’
and the observed effect is an ‘action’. Existing models of abduction of a
reasonable generality all belong to the class of so-called computationally intractable (or NP-hard) functions (Abdelbar & Hedetniemi, 1998; Bylander,
Allemang, Tanner, & Josephson, 1991; Eiter & Gottlob, 1995; Garey & Johnson, 1979; Nordh & Zanuttini, 2005; 2008; Thagard & Verbeurgt, 1998; van
Rooij, 2008). These are functions strongly conjectured by mathematicians
to defy efficient computation by any physically implementable mechanism
(see Garey & Johnson (1979) and van Rooij (2008) for details). This suggests that it is unlikely that a reasonably general capacity for making goal inferences is based solely on a mechanism that qualifies as 'simple'.
Taking complexity into account
Due to the tractability issues with abduction in its general form, it seems unlikely that humans can perform completely domain-general goal inferences.
Instead, when trying to figure out a goal behind an action, humans may be
performing this task against the background of a restricted domain of situations. This restricted domain must still be quite general if it is to account for
the variety of situations in which humans can infer goals. At the same time
it needs to be sufficiently constrained in nature to allow for tractable goal
inference. At present it is unclear how tractable models of abduction can
be formulated without rendering their domain of application too simple
for modeling real world domains (see e.g. Nordh & Zanuttini, 2005; 2008).
It is noteworthy that computational models of goal inference in cognitive
science that seem to work (i.e., that make plausible goal inferences without
running into tractability issues) severely restrict the possible contexts and
the number of possible actions and goals, keeping their application domain
far removed from realistically complex situations (see Baker, Tenenbaum,
Saxe, & Trafton, 2007; Cuijpers, Van Schie, Koppen, Erlhagen, & Bekkering, 2006; Erlhagen, Mukovskiy, & Bicho, 2006; Oztop, Wolpert, & Kawato,
2005). For example, Baker et al. (2007) modeled goal inferences made by
an observer viewing a point moving in the flat plane to one of three possible goal states, and Oztop et al. (2005) modeled goal inferences made by an observer viewing a reaching movement in the flat plane to one of eight possible goal states, and a grasping movement with three possible goals. The
availability of such successfully predictive, albeit highly restricted, models
seems to lead to an underestimation of the computational complexity inherent in more general domains.
6 These limitations, however, are as such not enough to provide a solid and indisputable bound for content attribution to the firing of individual mirror neurons. That would require a deep understanding of the functioning of the MNS at the neuronal level, as well as taking an ontological position on the nature of representation, which would be beyond the scope of this article.

A possible way of dealing with the limitations of a direct-matching mechanism and the fact that it cannot account for goal inference above the level of immediate goals could be to "upgrade" the notion of mirroring, by supplying it with more inferential capabilities. This upgraded process might
thereby be capable of explaining a substantial part of goal inference after
all. A problem with such an approach, however, is that calling this complex,
unknown form of processing a ‘mirroring process’ does not provide additional explanatory value over calling it just processing. Moreover, it further
increases the risk of underestimating the complexity of the computational
problem underlying goal inference.
Once we recognize the difficulties inherent in making context-sensitive
goal inferences, we can better address the question of how ‘mirroring’ can
contribute to an explanation of goal inference. To draw a parallel: line detection is likely an important sub-process of visual perception, yet we do
not think of object recognition as a case of mere line detection. Likewise,
mirroring may be an important sub-process in goal inference, but we would
not do well to assume that goal inference is a case of mere mirroring. In
this light it is informative that computational models that set out to connect to neuroanatomy place the major burden of processing context in goal
inferences outside the MNS (Kilner, Friston, & Frith, 2007b; 2007a; Oztop,
Kawato, & Arbib, 2006). This separation between context-processing and
mirror neuron processing makes sense given that context is such a multimodal and multifaceted construct; it is unlikely that a mirroring process is
by itself capable of processing context in all its complexity.
These considerations can help to interpret the firing characteristics of mirror neurons in the sense that when the attributed content becomes increasingly abstract, it becomes less plausible that the mirror neuron is the actual functional unit that is playing the key role in the cognitive function one is interested in.6 Instead, at higher levels of abstraction, such as higher-level action goals or intentions, it becomes increasingly plausible that less direct (more inferential) cognitive processes influence the response profile of mirror neurons.
An intuitive objection to our analysis is that we do not take the role of expectations into account. In our daily lives we are familiar with many different contexts and the appropriate actions in those contexts. Such associations could result in an expectation or prediction of what the goal of an observed action is, which can help goal inference on many occasions. Indeed, most of the time a reaching action has grasping as its goal, and most grasped cups are grasped for drinking. But it is important to realize that such default associations can only predict one goal given a certain action or context-action pair. People can easily ignore such default interpretations when necessary (say, when the waiter is grasping your empty cup). It is precisely this capacity to sometimes use a default association and sometimes use another context-sensitive process that makes direct matching an insufficient mechanism to account for our general capacity to infer goals from actions.
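The limits of a default association can be made concrete with a toy sketch (our illustration, not a model from the literature; all names and table entries are hypothetical): a lookup table maps each context-action pair to exactly one goal, so only a separate, context-sensitive process can recover the non-default reading of the waiter's grasp.

```python
# Toy sketch: a default association maps each (context, action) pair to
# exactly one goal, so it cannot revise its answer when extra cues
# demand a non-default interpretation.
DEFAULT_GOAL = {
    ("cafe", "grasp cup"): "drink",
    ("cafe", "reach"): "grasp",
}

def default_goal(context, action):
    """Return the single goal the default association predicts."""
    return DEFAULT_GOAL.get((context, action), "unknown")

def context_sensitive_goal(context, action, cues):
    """Override the default when cues demand it, e.g. a waiter
    grasping an empty cup is clearing the table, not drinking."""
    if cues.get("agent") == "waiter" and cues.get("cup_empty"):
        return "clear table"
    return default_goal(context, action)

# The default predicts "drink" no matter who grasps the empty cup;
# only the context-sensitive process recovers "clear table".
assert default_goal("cafe", "grasp cup") == "drink"
assert context_sensitive_goal(
    "cafe", "grasp cup", {"agent": "waiter", "cup_empty": True}
) == "clear table"
```

The point of the sketch is structural: however large the table grows, each entry still yields one fixed goal, whereas human observers can switch between the default and another reading on demand.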
Another objection to our arguments could be that there are data available that do show mirror neurons representing higher intentions. For instance, Fogassi et al. (2005) found mirror neurons in the monkey's Inferior Parietal Lobule that responded selectively for different intentions underlying the same actions. Monkeys were trained to grasp a piece of food and either place it in a container on their shoulder, or eat it. Some neurons responded differently for these two intentions. Importantly, in some neurons this difference in firing was preserved when the monkeys observed the experimenters perform the same actions. Fogassi et al. take this as evidence that mirror neurons are in fact capable of mirroring intentions.
At first sight these data seem to contradict our analysis, as we argued that a mirror mechanism cannot yield an action description at the level of intentions. However, as pointed out by Csibra (2007), among others, from the fact that this activity is indicative of intention understanding, one cannot conclude that it is constitutive of intention understanding. A mechanism in which low-level mirror processes are modulated by other, context-sensitive processes would produce similar neuronal responses in these neurons. The finding of neurons with mirror properties in medial frontal and temporal cortices, outside the classical mirror neuron area (Mukamel, Ekstrom, Kaplan, Iacoboni, & Fried, 2010), points in the direction that other regions are also involved in action recognition.
Also the activation of non-mirroring areas upon interpreting actions found in imaging studies, such as occipital, posterior parietal and frontal areas (Calvo-Merino, Glaser, Grezes, Passingham, & Haggard, 2005; Grezes & Decety, 2001; Iacoboni et al., 2005), hints at the modulating role of these brain systems. When Rizzolatti and Sinigaglia (2010) speculate on mirror neurons' capacity to generalize between various effectors and actions, some of which fall outside the observer's own motor repertoire, they too recognize that non-mirroring processes must aid action understanding. We think that, in addition to characterizing the nature of the mirrored motor representations (de Vignemont & Haggard, 2008), unraveling this interaction between different brain parts is a more promising way of addressing the problem of how people are capable of understanding observed actions, or of inferring the goals those actions serve, than subsuming complex processes under the label of 'mirroring'.

Conclusion

Research based on single-cell recordings is hindered by a conceptual indeterminacy that allows for content attribution of unbounded abstraction, troubling the interpretation of the data of these experiments. However, research into the MNS as a whole, together with an analysis of the computational complexity of the attributed task, can help to restrict the class of events that can be said to be mirrored. We have argued that, on theoretical grounds, a direct or otherwise simple (mirror-like) process cannot be used to infer action goals, as the context-dependency of goals defies a simple or direct solution to the task of goal inference. We therefore propose to restrict the use of the term mirroring to describing a simple reflective mechanism that is involved in relatively low-level action observation and recognition, such as grips or basic actions. This restriction can help in interpreting the data acquired by means of single-cell recordings.

It is of course obvious that we humans are very well capable of inferring intentions from observed actions, in spite of the presumed complexity of this task. Explaining how we solve or evade the apparent computational intractability inherent in context-dependent goal inference is a major question for cognitive neuroscience. However, we think that answering this question is not helped by heaping together potentially complex processes under the label of 'mirroring'. Instead, much work in various areas of cognitive science needs to be done before this question can be answered. How good, exactly, are we at goal inference? In what circumstances do we make mistakes? In what way is context relevant for goal attribution? What restrictions apply to the action domain that make goal inference computationally tractable enough for humans in everyday life? Answers to these questions can guide future theory development on goal inference. It is only when the complexity of the task is appreciated to the full extent that we can expect to get insight into how goal inference is achieved by brain mechanisms and how mirroring contributes to these mechanisms.
abstract
The discovery of mirror neurons in monkeys, and the finding of motor activity during action observation in humans, are generally regarded as supportive of motor theories of action understanding. These theories take motor resonance to be essential in the understanding of observed actions and the inference of action goals. However, the notions of 'resonance', 'action understanding' and 'action goal' appear to be used ambiguously in the literature. A survey of the literature on mirror neurons and motor resonance yields two different interpretations of the term resonance, three different interpretations of action understanding, and again three different interpretations of what the goal of an action is. This entails that, unless it is specified what interpretation is used, the meaning of any statement about the relation between these concepts can differ to a great extent. By discussing an experiment we will show that more precise definitions and use of the concepts will allow for better assessments of motor theories of action understanding and hence a more fruitful scientific debate. Lastly, we will provide an example of how the discussed experimental setup could be adapted to test other interpretations of the concepts.

This chapter has been published, in a slightly modified form, as: Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). Understanding motor resonance. Social Neuroscience, 6(4), 388–397.
three motor resonance
mirror neurons
action understanding
theory of mind
goals
Introduction
The discovery of mirror neurons in macaque monkeys (Di Pellegrino et al.,
1992; Gallese et al., 1996; Rizzolatti et al., 1996) has generally been greeted
as support for the idea that motor areas play an essential role in understanding observed actions and the inference of the pursued goals of these
actions, as these neurons fire upon both observing and executing actions, leading to the idea that the observer simulates the observed action (Gallese & Goldman, 1998). This suggestion was further backed up by the finding that the human motor system becomes activated during action observation (Buccino, Binkofski, & Riggio, 2004a; Buccino et al., 2001; Fadiga et al., 2005; Rizzolatti & Craighero, 2004). Due to the supposedly direct and non-inferential character of this process, this phenomenon is often referred to as "motor resonance".
Ever since the discovery of mirror neurons many fascinating findings have been reported. However, the explanatory power of mirror neurons regarding action understanding has fallen out of step with the continuous stream of experiments and accompanying findings. Theories on the mirror neuron system (MNS) and motor resonance have recently received criticism (Dinstein et al., 2008; Hickok, 2009; Jacob & Jeannerod, 2005). The general purport of this criticism is that mirror neurons cannot account for certain experimental findings (Hickok, 2009; Saxe, 2005a; 2009), or that the generalization from monkey data to the human mirror neuron system is not warranted (Dinstein et al., 2008; Lingnau et al., 2009). Also, theoretical concerns about the limitations of action understanding by means of direct matching have been raised (Csibra, 2007; Jacob & Jeannerod, 2005; Uithol, van Rooij, Bekkering, & Haselager, 2011a).
It is not the purpose of this paper to review the extensive body of research on mirror neurons and to argue for a specific framework in which the experimental findings are best explained. To a large extent we will remain neutral on these matters. Instead, we will show that the ongoing discussion often makes use of imprecise terminology. Due to the use of ambiguous concepts on both sides, the discussion between proponents and critics of motor resonance-based theories of action understanding advances only with great difficulty. By means of a careful analysis of the concepts of 'motor resonance', 'action understanding' and 'action goals', we aim to clarify the troubled debate around motor theories of action understanding and the role mirror neurons play.
notion | interpretation | explanation | example
resonance | intrapersonal | Resonance between visual and motor areas | Visual representation of grip type is propagated to motor areas
resonance | interpersonal | Resonance between observer and executor of action | Both observer and executor have a representation of the grasp action in motor areas
action goal | action | Action of higher abstraction than observed action | Drinking
action goal | object | Object at which the action is directed | A cup
action goal | world state | Desired world state that can be achieved by action | A full cup of coffee
action understanding | action recognition | Recognition of observed action | Recognize action as grasping
action understanding | goal recognition | Recognition of goal of an action | Recognize grasping action as serving drinking
action understanding | action anticipation | Generation of response to observed action | Prepare grasping action when offered a cup

Table 1. The possible interpretations of resonance, action goal and action understanding, as found in the literature.
The notion ‘motor resonance’ appears to be used ambiguously in the literature on the MNS. At least two fundamentally different interpretations of the
notion of resonance are used in neurocognitive explanations of the MNS,
which we will call intrapersonal and interpersonal resonance. Each interpretation has different elements taking part in the resonance process. Next we
will show that three qualitatively different interpretations can be found of
what the goal of an action is: The goal as a more abstract action, the goal
as a graspable object and the goal as a desired world state. We will discuss
these three interpretations. Finally, we will show that the notion of action
understanding can describe three different cognitive functions, which we
will label action recognition, goal recognition and action anticipation. An overview
of the different interpretations and our terminology is shown in Table 1.
The interpretations will be discussed in detail below.
It is important to note that none of these interpretations is in itself right or wrong, or better than another one. As long as it is specified precisely what is meant by a notion, any of the interpretations is valid and could fulfill a role in theories of action understanding.
A consequence of this variability in interpretations is that the exact meaning
of any claim about motor resonance, action goals and action understanding that does not specify which of the interpretations of these notions is
used can vary to a great extent. A careful analysis of these claims allows for
better interpretation of theories about the underlying neurocognitive matching mechanisms of action observation and action execution, and can help guide the design of future experiments. We will discuss an existing experiment from the literature, Umiltà et al.'s (2001) mirror neuron paper, as a case study, and illustrate how the experimental data and their interpretation have diverged as a result of the abovementioned indeterminacy of terminology. As an indication of the empirical applicability of the distinctions we propose, we will finish by presenting a concrete suggestion for how this study could be adapted so that other interpretations of the concepts presented in Table 1 can be tested.
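The extent of this variability can be made concrete by enumerating the interpretation space of Table 1 (a toy illustration of our own; the labels simply follow the table): a claim that relates the three notions without specifying any of them admits one reading per combination of interpretations.

```python
# Toy enumeration of Table 1: an unqualified claim relating "resonance",
# "action goal" and "action understanding" has one reading per
# combination of interpretations.
from itertools import product

RESONANCE = ["intrapersonal", "interpersonal"]
ACTION_GOAL = ["action", "object", "world state"]
UNDERSTANDING = ["action recognition", "goal recognition",
                 "action anticipation"]

readings = list(product(RESONANCE, ACTION_GOAL, UNDERSTANDING))
assert len(readings) == 18  # 2 x 3 x 3 distinct readings of one claim
```

Eighteen readings of a single unqualified sentence is exactly why, as argued below, authors need to state which interpretation they intend.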
resonance
In literature on the MNS the notion of resonance is used to describe the
activation of the motor system during action observation. The notion is adopted from physics and is used to describe the phenomenon that one (part
of a) system oscillates at the same frequency and in the same phase as another (part of the) system. In the neurocognitive domain it is not claimed
that the motor system is literally resonating in the sense that premotor neurons are firing at the same frequency and phase as neurons in other areas
(we will come to the question of what areas soon). These claims should thus
not be read as claims about neural synchrony (Damasio, 1989; Ward, 2003)
or neural oscillation (Fries, 2005). Instead, a more liberal sense of the notion is usually adopted. Rizzolatti et al. (2001) write: “we understand actions
when we map the visual representation of the observed action onto our
motor representation of the same action”. Elsewhere (Rizzolatti & Craighero, 2004), they explain: “The proposed mechanism is rather simple. Each
time an individual sees an action done by another individual, neurons that
represent that action are activated in the observer’s premotor cortex. […]
the motor ‘resonance’ translates the visual experience into an internal ‘personal knowledge’.” This process is often characterized as a form of simulation, in which the observer simulates the observed motor act, in order to
understand it (Decety & Grezes, 2006; Gallese & Goldman, 1998).
1 It is still debated whether the final action representation—provided that such a representation exists—resides in motor areas (as embodied approaches to cognition argue) or whether there are disembodied representations of actions. Here we choose not to take sides in this debate.
When examining the literature on mirror neurons and action understanding, two different meanings or interpretations of the notion can be discovered, each having different elements participating in the resonance process. We will call these two interpretations intrapersonal resonance and interpersonal resonance.
In the intrapersonal interpretation of resonance, it is claimed that the
motor system of the observer of an action resonates with her own perceptual system, so both brain areas taking part in the resonance process lie
within the same person. Examples of this kind of use can be found in, for
example, Rizzolatti et al. (2001; 2004), Buccino et al. (2004a), and Hommel
(Hommel, 2003; Hommel et al., 2001).
The idea is that the observation of an event leads to a representation
in the perceptual system of the observer. This perceptual representation is
thereupon propagated to the motor system. When the perceived event is an
action and a matching motor representation is available, the motor system
resonates, similar to a tuning fork that starts to resonate when a note of the
right pitch is played nearby (Jacob, 2009; Saxe, 2005b). Like the resonance
of the tuning fork can provide information about the pitch of the note
played, the resonance of the motor system provides information about the
action that is perceived. This is possible, according to the theory, because
the resonance is specific for different actions. For example, at the observation of a certain grasping action, e.g., a precision grip, a motor representation corresponding with that specific grasping action is activated in the motor system. The observer "recognizes" the activity in her motor system as being a representation of the specific grasping action, and she thereby recognizes the observed precision grip action. As the coupling of a perceptual
representation to a motor representation happens unmediated by higher
cognitive processes, this theory is also known as the direct-matching hypothesis
(Iacoboni et al., 1999; Rizzolatti et al., 2001; Rizzolatti & Sinigaglia, 2010).
Figure 1 depicts the presumed causal chain from a motor plan in the
executor to an action representation in the observer, and the place where
intrapersonal resonance occurs.1
Figure 1. The presumed causal path from action plan in the executor to action representation in the observer and the location of intrapersonal resonance. [Diagram: in the executor's motor system, an intention leads to a motor representation and then to an action; in the observer, a perceptual representation of that action leads to a motor representation and then to an action representation. Intrapersonal resonance links the observer's perceptual and motor representations.]
The strongest evidence for this theory comes from single cell recordings
in macaque monkeys. Neurons in the inferior premotor areas were shown to fire selectively for different actions and action means, like precision and power grips, both performed and observed (Di Pellegrino et al., 1992; Gallese et al., 1996; Rizzolatti et al., 1996). This has led to the conclusion that
these areas are involved in the recognition (and understanding) of actions.
These monkey data were backed up by imaging data that showed that the
human motor system is activated differently upon observations of different
actions (Buccino et al., 2001; Buccino, Binkofski, & Riggio, 2004a; Fadiga et
al., 2005; Rizzolatti & Craighero, 2004).
This theory can elegantly account for the finding that mirror neurons do not fire when the observed event is not an action (Gallese et al., 1996), or when the action is carried out by a non-biological effector (e.g. a robot
arm) (Kilner, Friston, & Frith, 2007a; Tai, Scherler, Brooks, Sawamoto, &
Castiello, 2004). Resonance occurs when a matching motor representation
is available, so when the perceived event is not an action or an action that is
carried out by a non-biological effector, there is no matching motor representation and the motor system remains silent.2
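The matching logic of this account can be sketched in a few lines (a toy illustration of our own, not a model from the literature; the repertoire entries are hypothetical): resonance is simply a successful lookup of the percept in the observer's motor repertoire, and the two failure cases above fall out of the two guard conditions.

```python
# Toy sketch of direct matching: the motor system "resonates" only when
# a perceived event finds a matching motor representation in the
# observer's own repertoire; non-actions and actions by non-biological
# effectors find no match and leave the system silent.
MOTOR_REPERTOIRE = {"precision grip", "full hand grip", "mouth grasp"}

def motor_resonance(percept, effector):
    """Return the activated motor representation, or None (silence)."""
    if effector != "biological":         # e.g. a robot arm
        return None
    if percept not in MOTOR_REPERTOIRE:  # not an action the observer can do
        return None
    return percept                       # matching representation activates

assert motor_resonance("precision grip", "biological") == "precision grip"
assert motor_resonance("precision grip", "robot arm") is None   # no resonance
assert motor_resonance("ball rolling", "biological") is None    # not an action
```

On this reading, the training effect in footnote 2 amounts to adding tool actions to the repertoire set, after which the lookup succeeds.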
In a second interpretation, the notion of resonance is used to denote
functional correspondence between the states in the motor system of the
observer and that of the executor of an action. This view is present in the
work of, for instance, Decety & Grezes (2006), Gallese (2001), Jacob (2008),
Fadiga (2005), de Vignemond & Haggard (2008) and Wilson & Knoblich
(2005). As the two systems taking part in the resonance process are situated
2 There are experiments, such as Fogassi et al. (2005) and Umiltà et al. (2008) that show
mirror neuron response to tool-based actions, but this was only after extensive training with
tools. A possible explanation is that, by training with tools, the monkey creates a motor representation of these actions.
in two different persons, we will call this form of resonance interpersonal resonance.
In the interpersonal interpretation of resonance, the notion is used in an even more metaphorical sense. It is assumed that there is a semantic or functional resemblance between the motor representation in the observer of an action and the motor representation of the executor of the action (e.g., both motor systems represent a grasping action at the same time). In a sense, the observer and the executor of an action share a representation (de Vignemont & Haggard, 2008). It is therefore stated that the observer's motor system resonates with that of the executor (Gallese, 2001; Gallese & Goldman, 1998; Goldman, 2009; Jacob, 2008; M. Wilson & Knoblich, 2005), or shorter, that the observer resonates with the executor (Fadiga et al., 2005). Figure 2 shows the presumed causal sequence from an action plan in the executor to a representation of that action in the observer. The two elements that take part in the interpersonal resonance are marked with an arrow.

Figure 2. The presumed causal path from action plan in the executor to action representation in the observer as presumed in motor theories of action understanding, and the two parts of the system that take part in interpersonal resonance. [Diagram: in the executor's motor system, an intention leads to a motor representation and then to an action; in the observer, a perceptual representation of that action leads to a motor representation and then to an action representation. Interpersonal resonance links the executor's and the observer's motor representations.]

Resonance in the interpersonal meaning is a higher-level description of the result of various processes from a motor representation in the executor to an activated motor system in the observer. It describes a resemblance between the two motor systems and it can be established without making claims about the underlying mechanism. This is evident from Figure 2: The resonance process covers multiple causal steps that can be accomplished by various underlying mechanisms. This interpretation of resonance is not committed to specific mechanisms bringing about these steps. Usually a form of intrapersonal resonance is presumed to establish interpersonal resonance, but this is not necessarily the only option: an inferential process could also result in interpersonal resonance.
Setting goals
It is often claimed that motor resonance allows the recognition of not only
the action as such, but also of the goal that is served by the action (Iacoboni
et al., 2005; Rizzolatti et al., 2001; Rizzolatti & Sinigaglia, 2010). Yet, like the
notion of motor resonance, the notion of goal allows for various interpretations. A survey of the literature on mirror neurons yields three qualitatively
different interpretations of the goal of an action.
First, the goal of an action is often interpreted as another, less specific action that is abstracted from execution specifics. For example, Gallese et al.
(1996) classify mirror neurons as broadly congruent when the neurons appear to be activated by the goal of the observed action, regardless of how it
was achieved. An example of such a goal could be “grasping”, and grasping
with a precision grip, grasping with a full hand grip and grasping with the
mouth all serve the goal grasping. The goal-as-an-action interpretation is
also present in the work of Ferrari (2005), Fogassi (2005) and Iacoboni
(2005) and is dominant in the early papers on mirror neurons (Gallese et
al., 1996; Rizzolatti et al., 1996).
The fact that the goal of an action is itself another action is potentially problematic, as nearly every action itself can be said to serve a new, higher goal.
To illustrate: The action “grasping a cup”, can serve the goal “drinking”.
Thus conceived, drinking is an action goal. “Drinking”, however, can also be
considered an action, having “quenching thirst” or “engage in social activity” as a goal. Quenching thirst serves the goal “maintaining homeostasis”,
which serves the goal “survival” and so on. There thus exists a continuum
from concrete, readily observable events (the use of a precision grip) to
highly abstract events (survival).3 Although individual preferences may be
possible, there seems to be no a priori level at which actions are located and
a level at which action goals are located.
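The continuum can be pictured as a chain in which every action's goal is itself an action one level up (a toy sketch of our own; the chain entries are illustrative): whether "drinking" counts as an action or as a goal depends only on where the analysis starts.

```python
# Toy sketch of the action-goal continuum: each action serves a goal
# that is itself an action at the next level of abstraction, so there
# is no principled cut-off between "actions" and "action goals".
SERVES = {
    "precision grip": "grasping cup",
    "grasping cup": "drinking",
    "drinking": "quenching thirst",
    "quenching thirst": "maintaining homeostasis",
    "maintaining homeostasis": "survival",
}

def goal_of(action):
    """The goal an action serves, itself an action one level up."""
    return SERVES.get(action)

def goal_chain(action):
    """Follow the continuum from a concrete movement upward."""
    chain = [action]
    while goal_of(chain[-1]) is not None:
        chain.append(goal_of(chain[-1]))
    return chain

# "drinking" is at once the goal of "grasping cup" and an action
# serving "quenching thirst".
assert goal_of("grasping cup") == "drinking"
assert goal_chain("precision grip")[-1] == "survival"
```

The sketch makes the indeterminacy visible: any link in the chain can be labeled "the action" and the next link "the goal", which is why a claim about "goal-selective" neurons needs to state the level of analysis.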
Umiltà et al. (2008) provide a clear example of goals and actions lying
on the same continuum. Macaque monkeys were trained to use normal and
reverse pliers to grasp objects. The researchers found that the same motor neurons that under normal conditions fire when an object is grasped, also fire when the object is grasped with reversed pliers, which means that
3 Besides actions and action goals, two more related notions can be found in the literature.
An “action means” is a particular way of performing an action. Action means also lie on the
same continuum as actions and goals, and can therefore, upon different interpretations, also
be actions themselves. The notion “movement” is often used to denote a movement that does
not serve a goal (see for instance Gallese & Goldman (1998) or Hommel (2003)). Action thus
conceived is a subclass of movements, i.e. those movements that serve a goal.
4 As said, in note 3, the difference between a movement and an action is often taken to be
that the latter serves a goal and the former doesn’t. This would entail that every action serves a
goal, making the term ‘goal-directed action’ a pleonasm for other interpretations of “goal”, as
non-goal-directed actions cannot exist, just non-goal-directed movements.
5 This quote illustrates how terminology can cause confusion. Apart from the personal/subpersonal violation, the claim that "mirror neurons infer" also departs from the initial claims that mirror neurons engage in direct reflection and that no inferential processes are needed. See Uithol et al. (2011a) for a more detailed discussion of direct reflection versus inferential processing with respect to mirror neurons.
the hand needs to be opened to grasp the object. This suggests that these
motor neurons respond to the act of grasping (an action higher in the continuum) and not the motor act of closing the hand (an action lower in
the continuum). Although not discussed in the paper, it is not difficult to see how grasping with pliers in its turn serves actions of even higher abstraction, such as eating. Fogassi and his colleagues, for instance, found different responses in mirror neurons, depending on whether the grasping action was part of an eating action or a placing action (Fogassi & Luppino, 2005). In all, because interpretations at all levels are possible, a clear indication of the level at which the analysis takes place can be helpful in interpreting the findings correctly.
A second interpretation of a goal of an action is a target object. It is this
interpretation that has given us the term ‘goal-directed action’, meaning a
transitive or object-directed action 4. Use of this interpretation can be found
in, for instance, Umiltà et al. (2001), who state that "mirror neurons have to infer and represent the occluded specific action in addition to the inferred object, which is the goal of the action."5 This interpretation of goals is also
often present in the early mirror neuron papers (Gallese et al., 1996; Rizzolatti et al., 1996), but also in later studies (Hamilton & Grafton, 2006).
Similar to this is the interpretation of a goal being a location, for instance a
cross on the desk (Wohlschlager & Bekkering, 2002) or the end location of
an action (Bekkering, Wohlschlager, & Gattis, 2000). At other places, a goal
as an object is contrasted with a goal as a location (Hamilton & Grafton,
2006; Woodward, 1998).
A third interpretation of goal is a desired state of the world. A possible
state could be “a full cup of coffee” and several actions—picking up the coffee pot, transferring it to the cup, tilting the coffee pot, etc.—are needed
in succession to reach that state. This interpretation can be found in, for
example, Csibra & Gergely (2007), Grafton & Hamilton (2007) or Sebanz,
Bekkering & Knoblich (2006).
These interpretations do not necessarily exclude each other. For example, "taking possession of an object" seems to have aspects of all three
interpretations. First, taking possession can be viewed as an action that can
be executed in different ways (grasping, ordering, buying). Second and obviously, this action is directed towards an object. Third, taking possession of
an object can be viewed as reaching a world state in which a certain object
is in my possession (in my hands, my mouth, my stomach). In general, the
difference between the interpretation of goal as another action and goal
as a desired world state seems to be a matter of emphasis. Sometimes one
of the interpretations is more natural or evident, sometimes the other. For
example, when one or two persons are carrying a table out of the room (Sebanz et al., 2006), it is generally not the action that one is interested in, it is
a state of the world in which the table is located outside the room. In other
cases, such as eating and drinking, it is not so much the world state that a
person is interested in, but the action itself: the person enjoys the action of
eating or drinking. Of course eating serves a purpose and is a mechanism
by which a species acquires necessary nutrients. So in a way one could say
that having the food in one's stomach is a desired world state, albeit often an unconscious one, but this seems a rather awkward way of phrasing a goal.
Notwithstanding the possible overlap, the differences can be crucial.
The meaning of the claim that mirror neurons respond selectively to goals
can differ to a great extent in the three different interpretations of ‘goal’.
For example, recognizing that an action is directed towards a cup and recognizing that this cup grasping contributes to getting a clean table are two
quite different capacities that require different experiments for testing the
nature of motor activation. As a consequence, experimental results that support a certain neuroscientific hypothesis (e.g. about neural mechanisms underlying goal understanding) under one interpretation of goal understanding do not automatically support that same hypothesis under other interpretations of goal understanding. Fogassi et al.'s (2005) study on parietal mirror neurons provides a good example of an experimental setup where precise terminology is crucial. The researchers found mirror neurons in the monkey's Inferior Parietal Lobule that responded selectively for different intentions underlying the same actions. Monkeys were trained to grasp a piece of food and either place it in a container on their shoulder, or eat it. Some neurons responded differently for these two intentions. Importantly, in some neurons this difference in firing was preserved when the monkeys observed the experimenters perform the same actions. Because Fogassi and
his colleagues use the unambiguous notions 'object' and 'intention' to denote the different interpretations of goal (although the latter is sometimes also referred to as goal), there is no confusion or conflation of the notion goal here. However, had Fogassi and his colleagues used the notion goal in both the meaning of object and intention—as can be found elsewhere in the literature, as shown above—then a circular statement about goal recognition causing goal recognition would have been the result.

Hamilton and Grafton (2007) provide an illustration of all three uses of this notion. In their introduction they discuss goals as being a desired world state (e.g. getting refreshment), they refer to goal-dependent mirror neuron firing in the meaning of a more abstract action, while their experiments are based on the object interpretation of goals. The authors themselves seem to be aware of the differences in interpretation when they write that "It is also important to note that the goals we have studied were defined by the identity of the object taken by the actor, contrasting between a 'take wine bottle' goal and a 'take dumbbell' goal. It remains to be seen if the same parietal regions encode other types of goal, for example manipulating the same object in different ways." Yet, the discussion of these other interpretations in the introduction, and the fact that they do not further specify their interpretation of goal throughout the paper, could easily entice other researchers into applying the results to the other interpretations as well. In the section entitled "Diverging concepts" we will discuss a case in which, upon systematic conceptual analysis, the original experimental setup no longer matches subsequent interpretations by other authors.

Understanding action

What is meant by 'action understanding' differs from paper to paper. The difficulty with the notion is that it consists of two elements, action and understanding, and the meaning of these elements is interdependent and open to different interpretations. To start with actions: We have seen that action means, actions and action goals can be placed on a continuum from specific, readily observable events (e.g. the use of a precision grip) to highly abstract events (maintaining homeostasis), and there seems to be no a priori way to make a clear-cut and objective contrast between action means, actions, and action goals.

Despite the lack of a priori considerations for contrasting actions with goals in this interpretation of goals, it seems that the capacity to understand grip types differs to such an extent from the capacity to understand homeostasis that differentiation is necessary. With the mirror neuron literature in mind, we will limit the use of the notion 'action' to movements that exist in the here and now and that serve a goal, like grasps. We use the label 'goals' for actions more abstract than the observed one, in the sense that they are either non-visible (like "maintaining homeostasis" or "keeping to one's diet") or involve future actions ("grasping in order to clean up the table"; cleaning up the table might be a visible action, but it is not yet at the time of picking up a cup).

The fact that actions can be found along a broad continuum of increasing abstraction has consequences for the interpretation of 'understanding'. Understanding can mean recognition (i.e. a form of classification: "That's a precision grip"), but also recognizing the goal that is served by an action ("that's grasping to eat"). However, as we have just seen, what is considered to be an action and what the goal of an action is, is liable to interpretation. This makes the difference between recognizing an action and recognizing the goal of an action also a matter of interpretation. To stick with the drinking example: When "grasping a cup" is interpreted as an action, the goal of the action can be "to drink". So the action can be recognized ("that's grasping") or its goal can be recognized ("that's drinking"). When, however, we see drinking as an action, and quenching thirst as the goal of an action, then "that's drinking" is a matter of action recognition, and "that's quenching thirst" is understanding the goal of the action.

Many authors seem to pitch their interpretation of action understanding somewhere along this continuum, but very few delimit or make their interpretation explicit. This makes it difficult to assess the exact claims that are made. For example, Rizzolatti and Craighero (2004) state that: "This automatically induced, motor representation of the observed action corresponds to that which is spontaneously generated during active action and whose outcome is known to the acting individual" [our italics]. Without specification, this 'outcome' can mean anything from a precision grip to maintaining homeostasis. However, the claim that the MNS detects grip types is quite different from (and more modest than) the claim that the MNS is capable of detecting long-term goals or intentions. The two claims presume different capacities of the system and demand different tests to verify them.

Besides recognizing the action and recognizing the goal an action serves, a third interpretation is that understanding an action is "knowing how to respond appropriately to an observed action" (Gallese et al., 1996; Rizzolatti et al., 2001). For example: Rizzolatti et al. (2001) write: "By action understanding, we mean the capacity to achieve the internal description of an action and to use it to organize appropriate future behavior" [our italics]. So in addition to "the capacity to achieve the internal description of an action", which is in line with the first interpretation, this definition adds that it should be used for generating an appropriate response.

Again, the different interpretations of action understanding refer to capacities that can differ to a large extent, so we will have to disentangle them. We will use the term action recognition when we mean the classification of an action and the ability to differentiate it from other actions. By goal recognition we mean classification of the goal of an action. This goal can be an action more abstract than the movement that takes place in the here and now, as discussed above, or another interpretation of goal, as discussed in the previous paragraph. Knowing how to respond appropriately to an action we will call action response. Table 1 presents an overview of these different interpretations.

To illustrate the empirical relevance of our conceptual discussion and terminological distinctions, we will analyze a well-known mirror neuron study by Umiltà and colleagues (2001) that produced fascinating results. We will show that a univocal interpretation of the experimental data is troubled by the use of indefinite terms. As a result their data are often interpreted as supporting mirror neuron involvement in forms of goal understanding, while, in our terminology, only action recognition is demonstrated.

Diverging concepts

Umiltà and her colleagues (2001) had monkeys watch grasping actions with the object to be grasped occluded from the monkey's sight. By means of single cell recordings, they showed that the monkey's mirror neurons that normally respond to the observation of a certain action also respond when the final, crucial part of that action is hidden. This shows that the build-up to the action (e.g. the opening of the hand and the reaching towards an object) is enough to trigger the mirror neuron response, and that observation of the actual action (the grasping of an object) is not necessary. The authors conclude that these findings support the idea that the goal of an action can be recognized even when the monkey is provided with an incomplete percept of an action, provided that the monkey knew that there was an object behind the occluder. They subsequently conclude that their findings
“further corroborate the previously suggested hypothesis that the mirror
neurons’ matching mechanism could underpin action understanding”; a
conclusion that is subsequently adopted by others (Ferrari et al., 2005; Rizzolatti & Sinigaglia, 2010).
However, interpretation of these findings is not straightforward. We have shown that three different interpretations of both the notions 'action understanding' and 'action goal' circulate (let alone the range of abstraction on which actions and goals can be formulated). Umiltà and colleagues showed that certain mirror neurons that fire upon observing a certain action also fire when the final part of the action is occluded. As the neuron exclusively fires upon viewing actions of this type, this is a form of what we would call action recognition: the recognition and classification of an action. Their interpretation of 'goal' is that of 'object', as becomes clear in sentences like "the inferred object, which is the goal of the action" (p. 161).

So, when we rephrase their findings in our systematic terminology (see Table 1), this experiment shows that the recognition of an action is dependent on knowledge of the presence of a graspable object. This suggests that the monkey understands that the observed movement is grasping only when it knows that it is directed towards an object. This finding is in line with early mirror neuron studies (Gallese et al., 1996; Rizzolatti et al., 1996), which also found that mirror neurons did not respond to mimed actions (i.e. actions not directed towards an object). These studies show that mirroring in order to recognize actions involves more than mirroring kinematic features, as these features in mimed actions are identical to those in object-directed actions, but do not evoke a mirror neuron response.

However, the findings of this experiment cannot be used to draw conclusions regarding goal understanding, i.e. inferring the goal that is served by a certain action from observation of that action alone, as the data show that the presence of a goal in the object sense is a prerequisite for the recognition of the action.

So the tenability of the claim that these findings "further corroborate the previously suggested hypothesis that the mirror neurons' matching mechanism could underpin action understanding" depends on what is meant by both the "previously suggested hypothesis" and "action understanding". Regarding the first: support for the direct-matching hypothesis (Rizzolatti et al., 2001) is problematic. This hypothesis states that the visual representation of the observed action (i.e. the kinematic features of the movement) is mapped onto our motor representation of the same action, and when a
matching motor representation exists, resonance occurs and the action is recognized. According to this hypothesis, action recognition thereby enables goal inference, as the observer of the action knows, from his own experience, which goal is (usually) served by the recognized action.

When we try to explain Umiltà et al.'s data within the framework of the direct-matching hypothesis, we seem to run into some circularity: goal recognition is a prerequisite for action recognition, yet, according to the direct-matching hypothesis, action recognition is a prerequisite for goal inference.

In their 2010 paper, Rizzolatti and Sinigaglia have reformulated the direct-matching hypothesis. In this formulation, action mirroring is rendered as a dual-route process, with one route directly matching movements and the other mapping the goal of the observed motor act onto the observer's own motor repertoire. When these routes are genuinely parallel, action recognition no longer is a prerequisite for goal recognition; instead, these two processes take place simultaneously and independently.

However, support for this revised direct-matching hypothesis is also problematic, and now it becomes crucial what is meant by action understanding. When action understanding is taken to mean action recognition, then these data can only provide support for half of the reformulated hypothesis. Umiltà and her colleagues found neurons that respond selectively to different actions, which can only support the already well-established part of the revised direct-matching hypothesis: the direct matching of actions. No evidence is provided for the second route: the direct matching of goals. When action understanding is taken to mean goal recognition, the findings cannot support the direct-matching hypothesis either, as only action recognition is established, and according to the revised formulation of the hypothesis action recognition does not underpin goal recognition; rather, goal recognition takes place independently along a different route.

In all, these findings seem more in line with competing hypotheses, such as Csibra's (2007) or Jacob's (2008), which state that action understanding is modulated by non-mirroring processes, such as processing of the presence of an object.
Based on careful distinctions of terms, as done in Table 1, we have been able to reveal difficulties in the interpretation of data in the literature. We have given an example of how our conceptual work can help analyze existing data, allowing for a more precise match between empirical results and conceptual interpretations. Next we will show that this conceptual analysis can also help guide the design of new experiments in such a way that conceptual confusion can be prevented. As an illustration of one such possible experiment, we will discuss how Umiltà et al.'s (2001) experiment can be modified in a way that it can test a different interpretation of the concepts in Table 1.
Let us interpret 'action understanding' as 'goal recognition' and let us stick to the interpretation of goal as object. In that case, 'goal recognition' means 'recognizing what object an action is directed at'. One way to test mirror neurons' contribution to goal recognition in this sense is to identify mirror neurons that fire differently upon grasping actions towards different objects. This could be done by placing two objects instead of one behind the occluder, each demanding a different grip type (say an apple and a peanut, demanding a full hand grip and a precision grip respectively). When the monkey knows that only one of the objects is placed behind the occluder, and this object is approached with the wrong grip type, mirror neurons that fire for that grip type should remain silent, as this action cannot have the object behind the occluder as its goal. For example, the monkey knows that there is only an apple, but observes a grasping action with a precision grip towards the occluder. When mirror neurons that respond only to actions performed with a precision grip remain silent (as they should when they fire selectively for different objects and there is no appropriate object behind the occluder), it could be considered further evidence that mirror neurons' firing characteristics are dependent on the object that an action is directed at. Failure to demonstrate the ability of mirror neurons to "recognize" the wrong grip for the object behind the occluder could be considered evidence against the idea that mirror neurons contribute to goal recognition when goal is interpreted as the target object of an action.
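The predicted response pattern in this modified setup can be written out as a small decision rule. This is a sketch of our own, not part of any published protocol; the object-to-grip mapping and all names are illustrative.

```python
# Illustrative decision rule for the modified occluder experiment: under
# the goal-as-object hypothesis, a grip-selective mirror neuron fires only
# when the observed grip matches both its preferred grip and the grip
# afforded by the object the monkey knows to be behind the occluder.

# Hypothetical mapping from object to the grip type it affords.
AFFORDED_GRIP = {"apple": "full hand grip", "peanut": "precision grip"}

def predicted_response(neuron_grip: str, observed_grip: str, known_object: str) -> bool:
    """Return True if the neuron is predicted to fire."""
    grip_matches_neuron = observed_grip == neuron_grip
    grip_fits_object = AFFORDED_GRIP[known_object] == observed_grip
    return grip_matches_neuron and grip_fits_object

# Key prediction: a precision-grip neuron stays silent when the monkey
# knows only an apple is behind the occluder.
print(predicted_response("precision grip", "precision grip", "apple"))   # False
print(predicted_response("precision grip", "precision grip", "peanut"))  # True
```

The rule simply conjoins the two conditions discussed above; finding firing where the rule predicts silence would count against the goal-as-object interpretation.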
In all, different interpretations of the concepts used in theories on action understanding demand different experimental setups. We have given an example of how Umiltà et al.'s (2001) experiment can be modified in such a way that another interpretation of the concept of action understanding could be tested. Other interpretations of action understanding and goal will each require a different setup, tuned specifically to the conceptualization and hypothesis that one intends to test.
Conclusion
The exact meaning of any statement involving action understanding, goal recognition and motor resonance can vary to a great extent, depending on the interpretation of the concepts used. In the cognitive neuroscience literature, it is often not explicated which of the multitude of possible interpretations is used. As a result, different sets of experimental data can be taken in mutual support of neuroscientific hypotheses, even though interpretations might diverge in ways that make the results in fact incompatible. By means of a careful conceptual analysis we aimed to disentangle the different possible interpretations of 'action understanding', 'action goals' and 'motor resonance'. The fine-grained distinctions we have proposed, exemplified in Table 1, allow for better interpretations of experimental data and more adequate design of experiments. We have shown that our proposed, systematic labeling scheme is empirically relevant in interpreting research data, by showing how the use of the scheme leads to a reinterpretation of existing experimental results in the cognitive neuroscience literature. Moreover, we have illustrated how our scheme can guide the design of experimental setups aimed to test different interpretations of action understanding.

The systematic use of well-defined concepts is an important aspect of the constructive and fruitful analysis of experimental data. In this paper we performed a conceptual analysis to arrive at more precise and unequivocal definitions of the terms 'action understanding', 'action goal' and 'motor resonance', terms that are central to the cognitive neuroscientific study of action and perception. We hope to have shown that conceptual analyses such as the one performed in this paper are not mere theoretical exercises, but a constructive contribution to empirical cognitive neuroscience.
abstract
In analyses of the motor system, two hierarchies are often posited: The first—the action hierarchy—is a decomposition of an action into sub-actions and sub-sub-actions. The second—the control hierarchy—is a postulated hierarchy in the neural control processes that are supposed to bring about the action. A general assumption in cognitive neuroscience is that these two hierarchies are internally consistent and provide complementary descriptions of neuronal control processes. In this essay, we suggest that neither offers a complete explanation and that they cannot be reconciled in a logical or conceptual way. Furthermore, neither pays proper attention to the dynamics and temporal aspects of neural control processes. We will explore an alternative hierarchical organization in which causality is inherent in the dynamics over time. Specifically, high levels of the hierarchy encode slower (goal-related) representations, while lower levels represent faster (action and motor act) kinematics. If employed properly, a hierarchy based on this principle is not subject to the problems that plague the traditional accounts.
This chapter was published, in a slightly modified version, as: Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2012). Hierarchies in Action and Motor Control. Journal of Cognitive Neuroscience, 24(5), 1077–1086.
four action hierarchies
Introduction
In motor control it is common to think of actions as hierarchically structured: a goal is served by an action, which, in turn, is served by multiple sub-actions. For example, when I want a glass of milk from the fridge, I have to get up from my chair, walk to the kitchen, open the door of the fridge, grasp the box of milk, and so on. I get up by means of placing my hands on the armrests, bending forward, stretching my legs and pushing off. I place my hands on the armrests by means of stretching my arms, grasping the rests, etc. (similar to Newell and Simon's (1972) means-end structure in problem solving; see also Byrne and Russon (1998)). When the goal of getting a glass of milk is placed on top, and the other aspects of the action are arranged below it, a hierarchy appears. When going down the hierarchy, the tree gets wider (more elements on one level) while the elements become less abstract, down to the level of individual muscle movements.
A general assumption of cognitive science is that such action hierarchies are mirrored in the neural representation underlying them (Bechtel & Richardson, 1993; Botvinick, 2008). In other words, there are two hierarchies: an action hierarchy, describing the action; and a control hierarchy, describing the neural processes that are presumed to bring the action about1. Cognitive scientists assume either implicitly (Hamilton & Grafton, 2007) or explicitly (Botvinick, 2008) that these two hierarchies match. However, as Badre notes: "the fact that a task can be represented hierarchically does not require that the action system itself consist of structurally distinct processes" (Badre, 2008, p. 193), so this assumption should be subject to testing. But whether these two hierarchies are identical is only partly an empirical matter. Before experiments to test this assumption can be designed, some important conceptual issues need to be addressed.
There are multiple ways to construct a hierarchy, but two hierarchical structures seem prevalent in the literature on action and motor control: one is a hierarchy based on constitutional or part-whole relations between the elements, the other is structured around a causal influence between the elements. When describing the action hierarchy, typically a part-whole structure is presumed, while the control hierarchy is usually explained using a causal framework. However, we will show that these two structuring principles are in fact mutually exclusive, which suggests that the action hierarchy need not be similar to the control hierarchy. We will discuss empirical evidence that these two hierarchies are indeed dissimilar.

1 To prevent confusion: In the literature on motor control, the notion 'action hierarchy' is used for a hierarchical structure both in the action and in the neural control of the action. Here we reserve the term for a hierarchical structure in the action or the behavior. Posited structures in the neural control of an action we will call 'control hierarchy'.

The remainder of this paper is organized as follows. We will start by briefly elucidating the relation between actions and goals. Next we will discuss the two main structuring principles of hierarchies in the motor domain and argue that they are incompatible and dissimilar. As an alternative account we will discuss models that use different time scales for different control processes. In these models structures can be found that can be seen as hierarchically structured, but in a different and much more implicit form. This interpretation of a hierarchy is not subject to the problems that plague the first two options, and might therefore be an interesting alternative for structuring elements in motor control. Understanding the nature of this hierarchical structure can guide empirical research into action control.

Actions and goals

The top-most level of a hierarchy in the motor domain is often labeled the 'goal level' (Hamilton & Grafton, 2006), 'desire level' (Grafton & Hamilton, 2007; Hamilton & Grafton, 2007), or 'intention level' (Pezzulo, Butz, & Castelfranchi, 2008), but other labels, such as 'superordinate action', can be found as well (Humphreys & Forde, 1998). Below that, there is usually at least one level for 'actions' (Hamilton & Grafton, 2006) or 'sub-goals' (Hamilton, 2009), and the bottom level is often labeled 'movements' or 'kinematics'. The exact labels of these levels may, of course, vary, as long as confusion is prevented. For reasons of clarity and consistency, we will call the elements on the highest level 'goals', the action features on lower levels 'actions', and we will call the elements on the lowest level 'motor acts'2, as can be seen in Figure 1.

2 When the ideomotor terminology is adopted, an action is a movement that serves a goal (Arbib & Rizzolatti, 1997). The elements in a hierarchy serve a goal by definition (otherwise they could not be accommodated in the hierarchy), so the elements on the lowest level cannot be 'mere' movements (i.e. not serving a goal), as they are sometimes referred to. Hence we choose the term 'motor act'.
Figure 1. A typical action hierarchy with one goal level and multiple levels for actions and motor acts. The goal 'getting a glass of milk' branches into actions such as 'walk to the kitchen', 'open fridge' and 'grasp milk box'; walking in turn branches into motor acts such as 'stand on left leg', 'flex knee', 'swing right leg' and 'move leg forward'. Note that this hierarchy is far from complete; for every action, only a few subactions are shown.
The idea of a hierarchical structure in actions has been applied both to
action execution and action observation or action understanding. The rationale behind this dual application is that there is evidence that the same
brain structures are used for action generation and action observation (see
the extensive body of literature on mirror neurons (Rizzolatti & Craighero, 2004; Rizzolatti & Sinigaglia, 2010) and motor resonance (Fadiga et
al., 2005; Uithol, van Rooij, Bekkering, & Haselager, 2011b)). Our analysis,
however, is based mainly on claims about hierarchies in the execution of an
action, but might have consequences for action observation as well.
In order to get a better understanding of what is actually claimed when it is proposed that action production is structured hierarchically, it is useful to formulate an answer to two questions: (1) What makes one level higher than another (what is the variable on the vertical axis)? And (2) what is the relation between features on different levels (what do the lines between the hierarchy elements in Figure 1 portray)? By answering these two questions, we will be able to compare the two different accounts of hierarchical structures in the motor domain.
Part-whole relations

We can interpret a hierarchy as portraying part-whole relations between the elements. Each level of the hierarchy comprises a set of subsystems, which are themselves composed of smaller units. For example, the action 'getting milk' consists of 'walking to the fridge', 'opening the door', 'grasping the box of milk' and so on. 'Opening the fridge', in turn, consists of 'grasping the handle', 'pulling', and so on. See Figure 2 for an example of a part-whole hierarchy. In such a hierarchy, 'getting milk' does not exist apart from 'walking to the fridge', 'opening the fridge' etc.; it is composed of these action features. In other words: When there is the right kind of reaching, opening of the hand and closing of the hand (and a milk box, of course), there is 'grasping the box of milk'. Likewise, when all the actions in the hierarchy are present, the goal of 'getting milk' is present. In this case the vertical axis denotes constitutive complexity. The higher up the axis, the more subparts in total a certain action element has. The lines in Figure 2 portray a 'part of' relation.

Figure 2. A hierarchy, again simplified, structured according to the part-whole principle, with a vertical axis denoting complexity. The structure is practically identical to the hierarchies found in the literature on action representation (see Figure 1).

Some important points need to be made with respect to a hierarchy based on a part-whole relationship between the elements. First, this hierarchy
can be postulated independently of an underlying cognitive mechanism. It is a description of an action, and a way of carving an action into smaller sub-actions and sub-sub-actions. The lower the hierarchical level, the more detailed the description is; the higher the level, the more encompassing the element is. 'Grasp handle' is just a label for the combination of 'reach towards handle' and 'full hand grip'. As such it provides a description of the explanandum, not an explanation. Similarly, one can describe a human being as consisting of a trunk, a head, two legs and two arms. The head consists of eyes, ears, a nose, a mouth, etc. This description does not directly offer a mechanical explanation of the functioning of the human body; it describes the elements that need to be explained. This descriptive nature becomes evident when one tries to imagine how the postulated hierarchy could be refuted. It is hard to imagine empirical evidence that could show that 'reaching' appears not to be part of 'grasping the milk box'. It seems that the kind of evidence that could refute this hierarchy would rather be conceptual in nature.
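The purely descriptive character of a part-whole hierarchy can be made concrete with a small sketch of our own (the class, labels and structure are an illustrative rendering of the milk example, not a modeling claim): a nested tree in which a higher element is nothing over and above its parts.

```python
# A part-whole action hierarchy rendered as a plain nested tree (an
# illustrative sketch of the milk example). A node is just a label plus its
# parts: the higher element is nothing over and above its sub-elements, so
# the structure describes the action without explaining its control.

from dataclasses import dataclass, field

@dataclass
class ActionNode:
    label: str
    parts: list["ActionNode"] = field(default_factory=list)

    def leaves(self) -> list[str]:
        """Collect the bottom-level elements (motor acts) of this node."""
        if not self.parts:
            return [self.label]
        return [leaf for part in self.parts for leaf in part.leaves()]

get_milk = ActionNode("getting a glass of milk", [
    ActionNode("walk to the kitchen"),
    ActionNode("open fridge", [
        ActionNode("grasp handle", [
            ActionNode("reach towards handle"),
            ActionNode("full hand grip"),
        ]),
        ActionNode("pull"),
    ]),
    ActionNode("grasp milk box"),
])

# 'Getting milk' is present exactly when its parts are present:
print(get_milk.leaves())
```

Nothing in this structure says what causes what; it only records which descriptions are parts of which larger descriptions, which is the point made above.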
Next, the part-whole hierarchy does not allow causal influence between the elements, as that would mean that an element would be the cause of its own parts, and, in general, nothing can be the cause of its own parts (Craver & Bechtel, 2007; Lewis, 2000)3. Likewise, the head is not the cause of the eyes or nose. In terms of actions this means that the reaching action cannot be the cause of the full-hand grip, but also that the goal of getting milk cannot be the cause of walking to the kitchen, which is at odds with most studies of goal-directed action. This suggests that the part-whole principle might not be the only principle at work in the general perception of a hierarchy in the motor domain.
Lastly, we have shown previously (Uithol, van Rooij, Bekkering, & Haselager, 2011b) that goals can be formulated as an action of a more abstract
form (grasping a cup serves the goal drinking), as a desired world state
(grasping the cup in order to have a clean table) or as an object (the cup is the
3 Circular causality is a much-debated concept within dynamical system theory (Bakker,
2005; Juarrero, 1999; Lewis, 2005), and means that elements on a lower level collectively contribute to a higher-level variable, which in turn modulates the behavior of elements at the lower
level. As it is still highly contentious whether the downward causation (required for genuine
circular causality) actually amounts to a causative force over and beyond the collective interactions of lower level elements (Kim, 1993, 2000). We do not wish to pursue this issue here. More
importantly for our purposes, even if downward causation in this strong sense would exist, the
claim still is not that the collective variable would actually cause its own parts (i.e. their existence as
parts), but instead that it would causally constrain their behavior, and would therefore fall under
the second principle to structure a hierarchy (see below).
59
Causal relations
An alternative principle to structure a hierarchy in the motor domain, not
based on part-whole relations, is a causal hierarchy in which parts higher on
the hierarchy are the cause of, or causally inluence parts lower on the hierarchy4. The goal of getting a glass of milk activates a ‘get up’ action, which
activates a ‘stretch legs’ and ‘bend trunk’ action. In a causal hierarchy, higher-level elements can modulate the activity of lower-level mechanisms.
This structure differs from the part-whole structure in four important
ways. First, the action features are not subparts of features higher up the
hierarchy, but necessarily exist independent of action elements higher in
the hierarchy. It is important to realize that this renders the part-whole hierarchy and the causal hierarchy incompatible. In the part-whole hierarchy, the higher elements consist of the lower elements, and therefore, by
deinition, do not exist independently. In the causal hierarchy, the causal
4 Although this is generally true for action hierarchies, in the perceptual hierarchy the order
is reversed: features low in the hierarchy, such as lines and colors, are thought to be the cause
of higher-level features, such as objects (Felleman & Van Essen, 1991; Hubel & Wiesel, 1959).
action hierarchies
goal of my grasping action). It is possible to construct a part-whole hierarchy
only when goals, actions and motor acts are of a similar nature, in this case
a type of action. Only goals formulated as a type of action have subparts that
can be accommodated in a hierarchy. When goals are rendered as desired
world states or objects, no relevant subparts of an action goal can be formulated and placed in a hierarchy. Objects of course have subparts (e.g. a
cup has an handle, a saucer etc.), but object parts have no place in an action
hierarchy, as actions cannot be subparts of an object. The same goes for a
desired world state: it has many (dissimilar) elements, such as objects and
relations or properties, but they cannot be arranged in an action hierarchy.
A part-whole hierarchy could be construed for a desired world state, but it
would describe the world state, not the action needed to bring it about.
In all, a hierarchy based strictly on a part-whole principle describes the
action and its structure. No causal inluence can be assumed between the
different elements in the hierarchy. Consequently, a hierarchy based strictly
on a part-whole principle may provide a characterization of an action but it
does not provide an explanation of actions or motor control. Also, a hierarchy of this type allows only one interpretation of a goal, viz., a goal formulated as an action of a higher abstraction.
influence between the elements necessitates independent existence of the
various elements.
Second, when goals exist independently of actions, it is no longer necessary that elements higher in the causal hierarchy are more complex than
elements lower in the hierarchy. A simple element can just as well be the
cause of a complex element. Indeed, goals and intentions are often posited
to be discrete, constitutionally simple and propositional states (Haggard,
2005; Pacherie, 2008; Uithol, Burnston, & Haselager, submitted).
Third, possible interpretations of the notion of goals are no longer restricted to abstract-action types of goals. The fact that parts need to be of a similar (ontological) nature as the whole entails that a part-whole hierarchy only allows goals defined in terms of an action. This restriction drops out
in a causal hierarchy, so that goals formulated as a desired world-state or an
object can also be the cause of an action. Additionally, elements such as ‘affordances’ (Gibson, 1979)—being a relation between an organism and an
object—can now be accommodated.
Fourth, unlike the part-whole relation, the causal structuring principle
does make claims about the underlying cognitive mechanisms. Effects and
causes are assigned to different elements, and for these elements to have a
physical reality they must be assumed to be related to physical causes and
effects, such as those that may hold in the brain.
To illustrate the nature of this hierarchy, let us assume that the goal of
‘getting milk’ is the cause of ‘walking to the kitchen’, ‘opening the fridge’
and ‘grasping the milk’. When we want to add further detail to this hierarchy, for example by further specifying ‘open fridge’ into ‘full hand grip’
and ‘pull’, we have to choose between simply replacing the element ‘open
fridge’ with this sequence of elements (Figure 3a), or adding an extra layer
below ‘open fridge’ (Figure 3b). The difference is not a mere difference in
visualization, but actually corresponds to two different claims about the control of the action. In the latter situation, we postulate an extra control layer,
which is ontologically independent of “full-hand grasp” and “pull handle”.
In this case it is claimed that "opening the fridge" exists as a separate entity
(a representation, or a command), independent of the lower level features.
In the causal hierarchy, the vertical axis denotes causal influence. Higher levels have causal influence on lower levels, but lower levels have no influence on higher levels. However, motor control is generally not believed
to be instantiated by unidirectional downward causation. More realistic
models of motor control implement feedback by means of reciprocal con-
nections (Kilner, Friston, & Frith, 2007a), and feedforward control with error prediction (Friston, 2005; Haruno, Wolpert, & Kawato, 2001).

Figure 3. Two different hierarchies structured according to the causal principle; the vertical axis denotes causal influence. In hierarchy (a) there is no extra control layer between "getting a glass of milk" and "full-hand grasp," whereas in (b) "open fridge" exists as an independent causal unit.
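To make the contrast concrete, the two hierarchies in Figure 3 can be sketched as simple trees. This is a minimal illustration of the structural claim only; the Node class and its method names are our own, not part of any motor-control model discussed here:

```python
class Node:
    """An action element that causally influences its children."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def leaves(self):
        """Motor-level elements reached by following causal influence downward."""
        if not self.children:
            return [self.label]
        return [leaf for child in self.children for leaf in child.leaves()]

# Hierarchy (a): 'open fridge' is simply replaced by its finer-grained sequence.
flat = Node("getting a glass of milk", [
    Node("walk to the kitchen"),
    Node("full-hand grasp"),
    Node("pull handle"),
    Node("grasp milk box"),
])

# Hierarchy (b): 'open fridge' is kept as an extra, ontologically independent layer.
layered = Node("getting a glass of milk", [
    Node("walk to the kitchen"),
    Node("open fridge", [Node("full-hand grasp"), Node("pull handle")]),
    Node("grasp milk box"),
])

# Both hierarchies bottom out in the same motor acts; the difference is whether
# 'open fridge' exists as a causal unit of its own.
assert set(flat.leaves()) == set(layered.leaves())
```

The choice between the two is thus not a matter of visualization but of ontology: only in (b) is there a state corresponding to "open fridge" that could, in principle, be found in the brain.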
However, feedback between action elements on different levels is problematic to accommodate in a hierarchy structured around causal influence, as feedback is also a form of causal influence. If motor acts can also influence actions, and actions can also influence goals, we seem to have lost the principled reason for placing goals at the top and means at the bottom of the hierarchy. In other words, there seems to be no principle for placing one level below or above another level, which means a departure from one of the main characteristics of the control hierarchy: its top-down organization.
To make things causally even more complex and interconnected, in addition to the aforementioned interlevel causal influence, there is evidence for intralevel causal influence as well: elements on a given level seem to influence each other. As an example, Cohen and Rosenbaum (2004) found what they call the 'hysteresis effect': during a grasping task, a previous grip location influences the location where an object is grasped next, even when this means that the well-known 'end-state comfort' principle (Rosenbaum & Jorgensen, 1992)—a presumably top-down process—has to be violated. As another example, Selen et al. (2009) found that the 'stiffness' used in pushing an object was not only an effect of the characteristics of the object that was being pushed, but also of the previous object. In other words, what the subject did before mattered for how the task was executed.
There is also evidence that what you will do next influences how you perform the current action or motor act. In speech articulation this effect is known as coarticulation (Rosenbaum, 2009). When pronouncing "tulip", for example, the lips already round before pronouncing the "t", to correctly pronounce the "u", but as a consequence the "t" is pronounced slightly differently.
When there seems to be mutual influence between elements on different levels as well as between elements on a single level, and we hold on to causality as the only principle for structuring the hierarchy, the image that emerges is more like a mesh of dynamically interconnected action features than a neat tree structure with an inherent top-down ordering of levels. In a tree with bidirectional causal influence, no unambiguous ordering of levels is implied by the causal relation alone.
To be clear, the conclusion of our analysis is not that the idea of an
action hierarchy is in itself wrong. We have argued that if such an action
hierarchy exists then it cannot be based on causal relations alone. Likewise,
we do not wish to deny the existence of causal relations between the action
elements, but framing the hierarchy entirely in terms of causal influence just does not seem to capture the complexity of influences present in the
neural control of an action.
Still, we, as well as many other species, are capable of organizing our
behavior in such ways that a predetermined goal is achieved. When I want
a glass of milk, I usually have this goal prior to initiating action. Also, I usually succeed, despite a few obstacles on my path, and when necessary,
I can adapt my behavior to unforeseen environmental demands and still
succeed. This must mean that the goal of getting a glass of milk in Figure 3
has a dominance of some sort over the other action features. A clue on how
this dominance could be achieved can be found in recent modeling work.
We will discuss this in the section entitled ‘temporal extension’. First we
will formulate the consequences of the incompatibility explained above for
cognitive research into motor control.
Different hierarchies for different parts of the
explanation
Both the part-whole structure and the causal structure can be found in the
literature on action representation and motor control. For example, Grafton and Hamilton (2007) provide much evidence for
                        action hierarchy        control hierarchy
structuring principle   part-whole              causality
location                in the action           in the neural control
nature                  decomposition of the    mechanism
                        explanandum

Table 1. The two types of hierarchies and their properties.
We have argued that the two structuring principles are not compatible. So when the action hierarchy is supposed to be mirrored in the control hierarchy, a structuring principle that is applicable to both hierarchies is needed. Unfortunately, neither the part-whole structure nor the causal structure seems to thrive outside its niche.
The causal structure makes little sense in the action hierarchy. We might
be able to explain that my walking to the fridge is caused by the goal of getting milk, but it does not make sense to state that my leg swinging is caused by my walking, as that would entail that my walking could exist independently of leg swinging.
Applying a part-whole structure to a control hierarchy is equally problematic. First, as explained above, a part-whole hierarchy would not relate
to a causal mechanism, but to a (complex) representation of an action at
best. Second, when one is looking for a part-whole hierarchy in neural structures, one assumes that the structure in the content of the representation is
mirrored in the structure of the vehicle of the representation, which means
a form of distributed representation of an action in which different action
elements are represented in different brain regions. They claim that this
distributed nature of action representation is evidence for a hierarchy in
motor control. They note that "control hierarchies should be reflected by differences in those areas that are recruited for preparation and execution" (p. 599), suggesting a causal influence between the various elements. Later,
however, (p. 605), they postulate an action hierarchy based on levels of complexity, suggesting a part-whole structure.
In general, each of the hierarchies seems to have found its own niche
within explanations of an action. When the action hierarchy is described, a hierarchy is often constructed on the basis of the part-whole structure. The action is carved into sub-actions, and sub-sub-actions, as explained above. On
the other hand, when the control hierarchy is described, a causal structure is
presumed. An overview of our conclusions thus far is presented in Table 1.
that one is looking for an action representation with a constituent structure
(Fodor, 1975) or a microfeature structure (van Gelder, 1999). In this form
of representation the vehicle (i.e. the neural state that carries the information) has identifiable subparts, and content can be attributed to these subparts. Moreover, the content of the overall representation is dependent on the content of the subparts. So in the case of action representation, the goal representation should consist of sub-representations that can be identified as actions. These sub-representations again have subparts with identifiable
content. For example, the representation of grasping the handle should
consist of two identifiable representations: reaching towards the handle and
a full-hand grip. This strong restriction renders much of the available neural
data insufficient to support a part-whole hierarchy, as not only do we have to find different representations for different subparts of an action, but these representations together also need to be correlated with the presence of a
goal. So, for example, goal-sensitive mirror neurons in the macaque’s premotor cortex (Gallese et al., 1996; Rizzolatti et al., 1996; 1987) cannot be
accommodated in a control hierarchy based on a part-whole relation. The
vehicle of this goal representation is simple in the sense that no functional
subparts are known to date 5 (Uithol, van Rooij, Bekkering, & Haselager,
2011a). In contrast, a goal representation that has a constituent structure
should be divisible into several sub-vehicles, representing sub-goals or actions.
In all, the two structures are not compatible, and neither structure is
transferable to the other side of the explanation. A direct consequence is
that the control hierarchy and the action hierarchy need not match. Both
the structure and the set of elements of the two hierarchies can differ. Apparently our intuition to divide an action into ever-smaller parts—our 'folk motor control', so to speak—might not be the best strategy for finding the neural correlates of action control6. Indeed, Dennett warns us against the uncritical acceptance of a seemingly (intuitively) reasonable task descrip-
5 Features such as spiking frequency or phase could play a functional role in the representational capacities of a neuron. To our knowledge, no study has investigated these properties of mirror neurons.
6 The fact that the action hierarchy might not map (perfectly) onto the control hierarchy
also has interesting consequences for theories on action understanding by means of motor
resonance or mirror neurons. In these theories it is assumed that, when observing action,
the same neural structures are recruited as when executing an action (Uithol et al., 2011b).
But when the features of action control do not match the action features we distinguish in an
observed action, the nature of the ‘shared representations’ (de Vignemont & Haggard, 2008)
needs to be subjected to further research.
Neural evidence for two different hierarchies
There are two ways in which the action hierarchy and the control hierarchy
can be dissimilar: the control hierarchy can contain elements that are absent in the action hierarchy, and—vice versa—the action hierarchy can contain elements that are absent in the control hierarchy. There seems to be
empirical evidence for both types of mismatches. To give an example of the
first: Graziano and Aflalo (2007) stimulated the premotor areas of macaque
monkeys for a relatively long duration (500-1000ms). They were thereby
able to evoke complex movement sequences to a certain end-location, for
instance a sequence consisting of grasping, bringing to the mouth, turning the head towards the hand and opening the mouth. Importantly, these
movements were complex, but 'dumb': when something blocked the trajectory of the bringing-to-the-mouth movement, the arm got stuck and did not move (Graziano, 2010, p. 461). These data seem to suggest that the behavioral repertoire of the monkey is represented by means of basic chunks, and modifications to these chunks, such as target localization and adaptation of the trajectory when an object is blocking the pathway. However, a straightforward decomposition of the action into an action hierarchy would not automatically lead to these basic action chunks, and therefore would not posit the additional modifying elements. This demonstrates that the control hierarchy contains elements that are absent in a straightforward action hierarchy.
Similarly, the most straightforward or intuitive decomposition of a grasping action is into the movements of the individual fingers and the thumb. However, there is evidence that, at the neural side, the control of the grip is
tion: “Marr’s more telling strategic point is that if you have a seriously mistaken view about what the computational-level description of your system is
[…], your attempts to theorize at lower levels will be confounded by spurious artifactual puzzles. What Marr underestimates, however, is the extent to
which computational level (or intentional stance) descriptions can also mislead the theorist who forgets just how idealized they are” (Dennett, 1989,
p. 108). Instead, a constant interplay between neural data gathering and
adapting the action hierarchy might be a more fruitful strategy.
Thus far we have based our conclusion that the action hierarchy need
not match the control hierarchy solely on conceptual grounds. In the next
paragraph we will discuss empirical evidence that there are in fact dissimilarities between these two hierarchies.
not decomposed into the movements of individual fingers, but into a base posture with the addition of refinements in finger and thumb position (Mason, Gomez, & Ebner, 2001). So a straightforward decomposition of a precision grip grasping action would lead to index finger and thumb movements as basic chunks, while the neural control hierarchy has a full-hand grasp and the suppression of three fingers as basic chunks. Again, our 'folk' decomposition of an action seems not to correspond to the control hierarchy: the neural representation can contain elements that, at first sight, do not seem to be part of the action.
There seems to be neurological evidence for the opposite possibility as
well: the control hierarchy can lack elements that do seem to be part of the
action hierarchy. The literature on embodied7, embedded cognition provides many examples of elements that can be considered part of an action,
but lack a neural correlate (see for instance Chiel & Beer, 1997). Clear examples can be found in the human gait. Our gait is a complex orchestra of
movements in many joints. The muscle activation responsible for a successful gait is hypothesized to be controlled by central pattern generators (Duysens & Van de Crommert, 1998). However, these neural patterns are not
suficient to generate a luent and eficient gate. Passive components, such
as muscle and tendon elasticity, and inertia of the upper and lower leg are of
crucial importance (Whittington et al., 2008). In other words, some particular stages or parts of an action are not controlled by the neural patterns that
activate muscles, but these stages are accomplished by “exploiting” regularities of the body, such as muscle and tendon elasticity, and the context, such
as inertia and gravity, and are in that sense not centrally controlled but via
self-organization. These important features of a normal gate are not part
of the action representation, but they are, nevertheless, part of the action.
The problems outlined above suggest that, in their purest form, the two
traditional principles for structuring a hierarchy might neither separately,
nor combined be the best candidates for a general theory on action representation. An interesting alternative for (or modification to) structuring the control hierarchy can be found in the temporal ordering of hierarchical elements or processes (Kelso, 1995; Kiebel, Daunizeau, & Friston, 2008; Koech-
7 The notion of ‘embodiment’ is used for various forms of dependency on a body. In cognitive science it can refer to something as modest as activation of the motor cortex (de Vignemont & Haggard, 2008), while usually in philosophy a more radical mutual dependency
between body and brain in the generation of behavior is meant (Clark, 1997; Haselager, van
Dijk, & van Rooij, 2008; van Dijk, Kerkhofs, van Rooij, & Haselager, 2008) (see Ziemke (2003)
for an overview of the various interpretations). Here we use the more radical interpretation.
Temporal extension
Yamashita and Tani (2008) modeled the motor system of a robot without using what they call "local representations": neural nodes dedicated to the representation of single action primitives in an explicitly represented hierarchical structure. Every one of the 180 units was connected to every other unit, including itself. The network was trained using 'backpropagation through time', which requires reciprocal message passing. They realized self-organization of a functional hierarchy through the use of two distinct types of neurons, each with different temporal properties. The first type of neuron is fast, in the sense that its activity can change quickly. The second type of neuron is slow. They found that after training, continuous sensorimotor flows are segmented into reusable motor primitives during repetitive execution of behavioral tasks. Moreover, these primitives could be flexibly integrated into new behavior sequences. The model accomplishes this without setting up an explicit sub-goal or function. In other words, without explicit instructions, representations of independent action elements emerge.
It is important for our analysis that the two types of neurons each developed a distinct activation profile. During the execution of a repetitive motor task, repetitions of similar patterns were observed in the activities of the fast context units. The activity in the slow units, in contrast, remained constant throughout the repetitive task. These results can be interpreted such that the fast units encoded reusable motor primitives that, due to their fast dynamics, were unable to preserve goal information over long trajectories. The slow context units, in contrast, encoded the switching between these primitives, and on account of their slow dynamics could contribute to more stable goal representations. It is important to realize that the behavior of the robot was the result of the interplay of the different units, and not of slow units controlling the faster ones.
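The multiple-timescale idea can be illustrated with a toy, fully connected leaky network in which 'fast' and 'slow' units differ only in their time constant. This sketch is our own simplification, not Yamashita and Tani's actual architecture or training procedure; the sizes, weights, time constants and input are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n_fast, n_slow = 8, 4
n = n_fast + n_slow
# The only difference between the two pools is the time constant tau:
# fast units update quickly, slow units integrate over many steps.
tau = np.concatenate([np.full(n_fast, 2.0), np.full(n_slow, 50.0)])

W = rng.normal(0.0, 0.3, (n, n))  # every unit connected to every unit, itself included
u = rng.normal(0.0, 0.1, n)       # internal states

def step(u, ext):
    """One leaky (continuous-time) update of all units."""
    return u + (1.0 / tau) * (-u + W @ np.tanh(u) + ext)

trace = [u.copy()]
for t in range(200):
    ext = np.zeros(n)
    ext[0] = np.sin(0.3 * t)      # a rhythmic 'sensorimotor' drive to one fast unit
    u = step(u, ext)
    trace.append(u.copy())
trace = np.array(trace)

# Mean per-step change: the fast pool fluctuates with the task rhythm, while
# the slow pool stays comparatively stable and so could carry goal-like
# information across the whole sequence.
change = np.abs(np.diff(trace, axis=0)).mean(axis=0)
print("fast:", change[:n_fast].mean(), "slow:", change[n_fast:].mean())
```

Even in this untrained toy network, the slow pool changes far less per step than the fast pool, which is the property that allows slow units to bridge long trajectories.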
This interpretation could provide us with another, less problematic structuring principle for a hierarchy: temporal extension. Elements higher in the hierarchy are represented longer or more stably than lower ones. As
lin et al., 2003). The fundamentals of such a hierarchy are best introduced
by discussing a recent model in robotics (Yamashita & Tani, 2008). After
this brief excursion we will return to neuroscience and discuss Koechlin's 'cascade model' of neural control (Koechlin et al., 2003), which seems to be structured around the same principle.
such, they are able to influence an action for a longer time interval, thereby accounting for our capacity to structure behavior around a goal. In a way, this reverses the general reasoning: Elements are not more influential because they are higher in the hierarchy, but elements are higher in the hierarchy because they have more influence (on account of being more persistent).
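This reversal can be made concrete with a small sketch: instead of stipulating levels, we read off an element's place in the hierarchy from how long its representation persists. The traces below are invented toy signals, not data from any of the models discussed:

```python
import numpy as np

t = np.linspace(0.0, 10.0, 500)
# Invented toy activation traces for three action elements.
traces = {
    "goal: get milk":        np.ones_like(t),                      # held for the whole action
    "action: open fridge":   (np.sin(0.5 * t) > 0).astype(float),  # held during one phase
    "motor act: grip force": np.sin(8.0 * t),                      # changes continuously
}

def persistence(x):
    """Higher value = more temporally extended (stable) representation."""
    return 1.0 / (np.abs(np.diff(x)).mean() + 1e-9)

# Rank elements by persistence instead of stipulating levels up front.
ranked = sorted(traces, key=lambda k: persistence(traces[k]), reverse=True)
print(ranked)
```

The familiar ordering—goal above action above motor act—falls out of the persistence measure alone, without assuming one-way top-down causation.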
Although it is related to causal influence, temporal extension is a different criterion for building a hierarchy. It is not assumed that the causal influence works in only one direction, from goal to action—remember that every unit in the network Yamashita and Tani used was connected to every other unit. Nor is it assumed that the causal influence in one direction is bigger than in the reverse direction. The difference between the types of influence is a difference in temporal extension: goals simply exert their influence longer than actions or motor acts do.
A control hierarchy structured on the basis of temporal extension is not committed to the direct causal influences found in the causal hierarchy. This means that although the overall structure—goals high in the hierarchy and action means low in the hierarchy—can be preserved, the hierarchy is much more implicit, as the orderly tree structure is lost. There is a simultaneous influence of a great many action features, at an unbounded number of levels.
The model built by Yamashita and Tani (2008) developed a functional hierarchy of only two layers, slow and fast, and the functional elements they found would still be located at the very bottom of the common action hierarchies. They suggest that "[t]he idea of functional hierarchy that self-organizes through multiple time-scales may as such contribute to providing an explanation for puzzling observations of functional hierarchy in the absence of an anatomical hierarchical structure" (p. 13). Indeed, human
action control seems to be hierarchically structured—as argued above—without a clear anatomical hierarchical structure (Miller & Cohen, 2001), so this model could help in interpreting an influential neurocognitive model of action control.
Koechlin, Basso, Pietrini, Panzer, & Grafman (1999) proposed a model
in which different types of action control are located along a rostro-caudal
axis in the lateral prefrontal cortex (PFC). In their hierarchical model four
types of control are discerned (Koechlin et al., 2003; Koechlin & Summerfield, 2007). Sensory control, located at the caudal end of the axis, is involved in selecting motor actions. A bit more anterior, contextual control is involved
in selecting premotor representations or stimulus-response associations.
Next, episodic control is involved in selecting task sets or sets of consistent
stimulus-response associations in the same context. Lastly, branching control,
implemented in the rostral end of the axis, the anterior and frontopolar regions of the PFC, involves controlling the activation of sub-episodes nested
in ongoing behavioral episodes.
The significance of the proposed model does not lie in the fact that exactly four different control layers are posited (it is, we believe, unlikely that human action control consists of a fixed, integer number of control layers), but in the suggestion that different control processes operate on different time scales. When going from sensory control to branching control,
the temporal extension of the types of control grows. Sensory control deals
with selecting immediate movements—analogous to Yamashita and Tani’s
fast neurons—and monitoring stimulus changes. The input to contextual
control is already more robust and less dynamic. Episodic control deals with entire sets of associations within one context, while branching control is involved in managing changes between different contexts. This means that
these control processes can be ordered in a hierarchy structured around stability, or temporal extension.
Once this hierarchy is established, it is compelling to interpret Koechlin et al.'s finding in terms of a more traditional, causal motor hierarchy, with an
action goal originating in the higher control processes, that is subsequently
propagated to the lower types of control to evoke the appropriate action.
By referring to their model as the ‘cascade model’, and by emphasizing the
downward modulation, Koechlin and colleagues are—perhaps unintentionally—feeding this compelling intuition, which is subsequently adopted by
other researchers (Badre & D’Esposito, 2007; Hamilton, 2009).
However, the data do not suggest such an interpretation. Koechlin and
colleagues (1999) show that when more temporally extended forms of control are needed, anterior and frontopolar areas are activated in addition, not
alternatively, suggesting that these control processes are not responsible for
the control task by themselves, but through interaction with the lower types
of control, just like all the units in Yamashita and Tani’s 2008 model contributed to the resulting behavior. This collective contribution is incompatible with the idea that goal directed behavior is the result of higher layers
propagating goal representations to lower layers. Goal directed behavior
emerges from the interaction between the different types of processes, not
from straightforward top-down modulation.
If we accept Koechlin's alternative hierarchy based on temporal extension, but continue to interpret this hierarchy as a straightforward causal hierarchy, we do not do justice to the complexity seemingly inherent in action control. Additionally, interpreting the proposed hierarchy in terms of
causal effects entails positing discrete states that, through interaction, bring
the action about. It is, however, highly unlikely that such discrete states with
these causal effects can be found in the prefrontal cortex (Uithol et al.,
submitted). Positing discrete states and causal interactions between them
seems to ignore the complex and intertwined nature of the dynamic control
processes emphasized by Koechlin and colleagues.
This insight could guide future research into action control. Instead of positing an anteriorly represented action goal and trying to locate the processes by which this representation is transformed into a motor program, the
analysis above suggests that research into action control is better served by
focusing on how goal directed behavior emerges from the interaction between the different control layers. Which sensory input is used on which
layer of control? How do lower control processes shape higher ones, and
vice versa? Koechlin and colleagues made an important step in shifting this
focus. This shift is hampered, however, if we allow the traditional views back
in to shape our analysis.
An important theoretical advantage of an implicit hierarchy based on temporal extension is that it rids us of the rather artificial constraint, present in the causal and part-whole hierarchies, that an action is associated with just one goal. At every moment one can be attributed many, maybe even an infinite number of goals: to breathe, to read, to maintain homeostasis, to be a good scientist, to maintain an upright posture, etc. Our behavior
is the result of the interplay of this multitude of goals (McFarland, 1989;
Uithol et al., 2012; submitted). These goals need not be represented on the
higher layers in Koechlin’s model, but can also be an emergent result of the
interaction of different control processes. To give a simple example: When
swimming using a front crawl, a typical pattern of strokes and breathing is
adopted. This pattern only makes sense when one realizes that two goals,
to swim as fast as possible and to breathe, are pursued at the same time. Of
course we know about a swimmer’s goal to breathe, and this goal is unlikely
to be represented in one of Koechlin’s control layers. In straightforward
cognitive descriptions, we are inclined to leave it “out of the equation”, and
treat it as a boundary condition. But making a distinction between variables
and boundary conditions in such an intuitive and implicit manner might
Conclusion
In theories on motor control, two hierarchies, the action hierarchy and the control hierarchy, are thought to match. We have presented both conceptual and empirical evidence suggesting that this assumption is unlikely to
be true. We have shown that, implicitly, two structuring principles are used
to construct a hierarchy, but that neither structure (nor the (impossible)
combination of the two structures) can provide an adequate framework for
explaining actions and motor control. The action hierarchy—constructed
using a part-whole hierarchy—is a description of the action that is to be
explained, but can be misleading in searching for a neural implementation
of the action. The control hierarchy—constructed using causal relations—
does not capture the complexity inherent in motor control. Our conclusion
is not that motor control is not structured hierarchically at all, but that the
traditional accounts of an action hierarchy do not capture the complex and
dynamic nature of motor control. Alternatively, dynamic accounts of motor
control can be interpreted as hierarchical as well. In these models, elements that are represented longer and more stably are higher in the
hierarchy. Although these alternative models are hierarchical in a much
more implicit way, and cannot straightforwardly be interpreted along the
not be the best approach to a general theory on motor control. Although cognitive scientists generally have good reasons not to put an infinite number of goals in a model of action representation, to assume that the number of goals is always limited to only one might in some cases be overly restrictive.
This influence of multiple simultaneous goals cannot be easily accommodated in an explicit control hierarchy. An element can be caused or modulated by multiple goals at the same time. It might not always be clear which goals influence a lower element, or to what extent. The result would be that the orderly tree-shaped hierarchy gets replaced by a dense mesh of interconnected action elements, which would seriously undermine the value of a hierarchy in explaining the realization of actions.
The more implicit hierarchy, structured around temporal extension, on the other hand, is not committed to the postulation of a single, explicit goal, nor to a direct and univocal relation between the higher and lower elements. Therefore the influence of multiple simultaneous goals does not undermine the hierarchical structure.
same lines as the more traditional accounts, they do not suffer from the conceptual and empirical issues discussed. Much work—both conceptual and empirical—is still needed to develop an implicit hierarchy structured around temporal extension into an insightful and coherent alternative to the current theories on action representation. But only if we approach the alternative hierarchy as a genuinely alternative structure, and avoid straightforward causal interpretations based on the traditional accounts, can we expect to find its true value.
abstract
Intentions are commonly conceived of as discrete mental states that are the direct cause of
actions. In the last several decades, neuroscientists have taken up the project of localizing
intentions in the brain, and a number of areas have been posited as implementing representations of intentions. We argue, however, that it is doubtful that the folk notion of ‘intention’
applies to any particular physical process by which the brain initiates actions. We will show
that the idea of a discrete state that causes an action is deeply incompatible with the dynamic organization of the prefrontal cortex, the agreed upon neural locus of the causation
and control of actions. Discrete representations can at best, we will claim, play a subsidiary,
stabilizing role in action planning. This role, however, is still incompatible with the folk notion of intention. We conclude by arguing that the prevalence of the folk notion, including
its intuitive appeal in neuroscientific explanations, stems from the central role intentions
play in constructing intuitive explanations of our own and others’ behavior.
A modified version of this chapter has been resubmitted for publication as: Uithol, S., Burnston,
D., & Haselager, W. F. G. (under review). Will intentions be found in the brain? Cognition.
five intentions in action
action
intention
prefrontal cortex
motor control
Introduction
Actions are generally thought to be the result of a preceding intention to
act. Tim intends to grasp the cup in front of him and subsequently, and
consequently, he grasps the cup. Intentions, in this ‘folk’ interpretation, are
conceived as discrete mental states that are the direct cause of actions. The
notion of intention plays an important role in a variety of contexts, ranging
from psychology (Meltzoff, 1995), to philosophical theories of action (Bratman, 1987; Davidson, 1963), to legal theory (Moore, 2011).
The folk conception of intention is often straightforwardly used in neuroscientific studies into willed action. Haggard summarizes the role the
notion plays in computational neuroscience as follows: “In computational
motor control, for example, actions begin with a relatively simple description of a goal (e.g. ‘I will stand up’). The brain must expand this task-level
representation into an extremely detailed movement pattern specifying the
precise kinematics of all participating muscles and joints. Generating this
information is computationally demanding. The brain’s solution to the
problem may lie in the hierarchical organization of the motor system. Details of movement are decided at the lowest level of the motor system possible” (Haggard, 2005, p. 292). The picture of intentions that emerges here
is of discrete and simple states, free of context-specific details, that are the
originating causes of subsequent action planning and motor movement.
Ever since Libet’s pioneering investigations into the neuroscience of
willed action (1985), continual attempts have been made to find the neural correlates of intentions. Largely based on functional MRI studies, these
attempts have resulted in a variety of proposed localizations for intentions.
For example, Lau, Rogers, Haggard & Passingham (2004) asked subjects to
attend to their own intentions while performing an act, and measured the
areas that showed modulated activity during an attended condition compared to an unattended condition. Based on its increased activation, they
argue that intentions to act are localized in the pre-SMA region of the medial prefrontal cortex. By stating, after James (1890), that “any intention
[…] has the tendency to cause the relevant movements” (Lau et al., 2004, p.
1208), these authors explicitly state their adherence to the causal aspect of
the folk notion of intentions.
Other projects attempt similar localizations, but end up with different
results. For instance, Haynes and colleagues (2007) report finding neural activation specific to subjects' intentions in the medial prefrontal cortex
(more anterior than Lau and colleagues reported), as well as lateral pre-
frontal cortex. In this study participants had to either subtract or add two
numbers that were presented after a short interval. Simply looking at the
differential brain activity in the two cases allowed for statistically significant
predictions of which action the subjects intended to perform, leading to
the hypothesis that the intentions for these actions were encoded in this
differential activity.
As a inal example, Hamilton and Grafton have conducted a variety of
studies attempting to delineate the brain areas responsible for understanding others’ intentions, and have argued that such processes make use of the
observer’s own goal representing system (Hamilton & Grafton, 2008). Citing
fMRI evidence that activation in the inferior parietal lobule is not specific to motor effectors, they argue that "human IPL [inferior parietal lobule] and IFG [inferior frontal gyrus] contain populations of neurons that encode the outcome of an observed action", and moreover that "these results
are concordant with previous data implicating IPL and IFG in goals and
intentions” (Hamilton & Grafton, 2008, p. 1164). Thus, they claim that
specific groups of neurons contain an explicit representation of a desired outcome, and are involved in preparing the appropriate action (Grafton &
Hamilton, 2007), thereby implementing an intention to bring about the
outcome. This interpretation, in addition to being clearly influenced by the
folk view, shapes Hamilton and Grafton’s view of the overall structure of the
motor system. They propose a hierarchical organization in which motor
plans are at lower levels and abstract goal and intention representations
are at the top (a “motor hierarchy” similar to the one described by Haggard above (Grafton & Hamilton, 2007; Hamilton & Grafton, 2007; see also
Uithol, van Rooij, Bekkering, & Haselager, 2012).
It is interesting that each of these projects posits a different location for
intentions. One might think that this means that the researchers are postulating alternative, competing hypotheses, and that further research will
help determine which of the hypotheses is correct. In this paper we will
argue, alternatively, that the folk view of intentions as discrete mental states
embraced by these theorists is not applicable to the neural processes that
actually generate actions, and that consequently none of these projects will
reveal a brain structure that instantiates states with the properties attributed to intentions by the folk interpretation.
In the next section we will discuss the properties of the folk notion of
intention—that intentions are functionally discrete, context-independent,
and cause actions—via a brief analysis of the role the notion plays in philosophical accounts of intentional action. We will specifically discuss Pacherie's theory of intentions (Pacherie, 2006; 2008; Pacherie & Haggard, 2011),
as we take her account to be the best attempt to make the folk notion empirically tractable, and because, importantly, a variety of investigators in the
neuroscience of action explicitly cite Pacherie’s account as being at least
compatible with their overall view of action generation (see, for instance,
Moore, Wegner & Haggard (2009); Pezzulo and Dindo (2011) and Hamilton and Grafton (2007)). However, we will then show that this account
is incompatible with a variety of results and models of action control in
the lateral prefrontal cortex (lPFC), the area that is most likely to implement causation and control of actions in the brain. This incompatibility, we
will show, stems directly from the adherence to the folk notion. As a consequence, any neuroscientific account informed by the folk notion will face
similar compatibility problems. Therefore, we will argue, neither the folk
notion nor its philosophical descendants will provide a fruitful conceptual
framework for guiding neuroscientiic investigation. Subsequently, we will
discuss an alternate possible role for discrete representations, as helping to
stabilize dynamical processing during action planning. This contribution,
we will argue, is still incompatible with the folk notion of intention. Finally,
in a more speculative section, we will discuss the possible origins of the folk
notion, and try to explain why it has been so overwhelmingly pervasive.
The folk notion and its philosophical descendants
The notion of ‘intention’ plays an important part in our folk psychology,
our everyday framework of explaining the behaviors of ourselves and others (Davies & Stone, 1995; Haselager, 1997; Stich, 1983). In this framework,
intentions are generally conceived as mental states similar to beliefs and
desires (Anscombe, 1957). Like a desire, an intention is characterized as
representing a potential outcome, but the two mental states differ in that an
intention consists of both a goal and an action plan to achieve it, whereas
a desire does not (Bratman, 1981). For instance, one can desire that the
sun will rise, but one cannot intend to make it rise, as there is no action that
could bring this about.
On philosophical interpretations of the folk framework, three important
and related features are generally attributed to intentions. First, the content
of intentions is independent of the context in which they occur. For instance, Pacherie (2008), Searle (1983) and Bratman (1987) stress that the content of a particular intention is independent of the perceptual, affective, and cognitive context in which it is implemented, and therefore that each particular intention needs to be subsequently embedded into a context in order to cause an appropriate action. Intentions share this context-independence with other mental states. Just as the belief that "France is a country" is independent of the color of the walls of the room in which one entertains this belief, the intention to grasp an apple is presumed to be the same regardless of the color or shape of the apples one intends to grasp, or the reason for grasping them. This context-independence allows one to form intentions about future actions in a different context, for instance to pick up groceries after work (Pacherie, 2008, p. 183).
Consequently, particular intentions retain their characteristics across different instances. An intention to grasp an apple today is the same as the intention to grasp a different apple one had last week, or last year. Note that this context-independence of the content of intentions does not mean that the occurrence of intentions is independent of context. Seeing fruit and vegetables at the supermarket can very well help to bring about my nevertheless context-independent intention to eat an apple.
Second, intentions are thought to be functionally discrete and simple states. Discreteness, for purposes of this discussion, means that intentions are believed to be cognitive units that play a clearly isolatable role (Haselager, 1997).¹ Given the discrete causal interactions posited for intentions, and their context-independence, intentions themselves are also relatively simple. Context-specific details are not part of the intention, which allows for a high degree of abstraction, and results in a relatively simple mental state. Complexity in the action generation system arises only either from interactions between simple, discrete states, or from the translation of these states into a non-discrete format further on (or "down," in a hierarchical view) in the action generation sequence.
¹ Our use of discreteness is about the role these mental states play, and is not to be confused with the discreteness of the representations themselves (see Maley (2010) for a discussion of the latter interpretation).
Philosophers often invoke these properties by talking about the “propositional” nature of intentions (see for instance Pacherie (2008), or Fodor
(1985)). Propositional states consist of explicit, structured semantic representations similar to those found in language. Thus, when one intends to
grasp an apple, the cause of the associated action is an attitude stemming
from a mental representation analogous in structure to the phrase “I intend
to get an apple.” Compared to motor plans or detailed motor representations, propositions are relatively simple, consisting of abstract representations of the intended object and the act. While neuroscientists may not subscribe to the explicitly linguistic, propositional characterization of discrete
states, quotes such as those from Haggard in the introduction clearly show
that many neuroscientists sign on for the discrete and simple rendering of
the nature of intentions.
Third, intentions are thought to cause actions, as opposed to simply covarying with them (Searle, 1983). On this view, an intention must be formed
prior to the planning and generation of an action, and the content of the
intention determines what actions are appropriate to generate. An episode
of action planning can only be successful if it achieves the outcome represented in the intention, e.g., to grasp an apple. Thus, in accordance with the
hierarchical models posited by, for instance, Hamilton and Grafton (Grafton & Hamilton, 2007; Hamilton & Grafton, 2007), on the folk view the
discrete intention is the primary causal and organizing factor in an episode
of action generation.
To account for thoughtless or unconscious actions, Searle (1983) posits two types of intentions: prior intentions and intentions in action.² Prior intentions are intended to capture the folk notion of intentions, and are supposed to account for the temporally extended and deliberative aspects of this notion. Intentions in action are of a different, unconscious and non-propositional format, which shares more features with motor plans than with propositional thought.
² Similarly, Bratman (1987) contrasts future-directed and present-directed intentions, and Pacherie & Haggard (2011) contrast immediate and prospective intentions.
As philosophy has further explored the types of effects associated with intentions, as many as seven functions have emerged. Intentions are posited to: 1) terminate deliberation, 2) prompt practical reasoning, 3) coordinate action, 4) initiate action, 5) sustain action, 6) guide action, and 7) monitor action (see Pacherie (2000) for a more detailed discussion of the functions of intentions and the theories proposed to capture them). However, providing a specific causal theory of intention that both holds on to the core folk notion and accounts for all seven functions has proven difficult.
Pacherie has done admirable work to create a framework that is based on these philosophical theories of action and is also suitable for empirical investigation. She distinguishes three types of intentions: D-intentions, P-intentions and M-intentions (Pacherie, 2006; 2008). D-intentions (distal intentions) are located at the top of the action hierarchy. Since D-intentions are the outcomes of propositional reasoning processes, Pacherie explicitly states that they are propositional and discrete (p. 192), as well as context-insensitive (2008, p. 183). D-intentions therefore seem to be highly similar to the folk conception of intention.
The context of the action becomes pertinent when D-intentions are
translated to P-intentions (proximal intentions). P-intentions contain a plan
for the action within the current context, for example, getting up from my
chair, walking to the fruit bowl and grasping the apple. While the details of
P-intentions are not clear on Pacherie’s account, they are supposed to play
an intermediate role, aiding in the transition from a discrete, simple state
to detailed motor plans.
Lastly, these intentions are translated to an M-intention (motor intention) that specifies the exact motor representations needed to perform an
action, and contain detailed programs for, e.g., how to get up from a chair,
how to balance one’s body, etc. M-intentions are no longer propositional in
nature, but instead consist of a set of motor representations that together
cause and control a complex series of movements. So what starts as a discrete, context-free state with propositional content gets translated to a complex and highly specific motor command that is entirely adapted to the
current context.
Pacherie's account has an additional layer of complexity beyond what we have discussed here, in that, while the initial causation of the action proceeds in the way we have described, she also allows for causal influence to propagate back up the action hierarchy (see Figure 1). Specifically, types of intentions at lower levels can, after being caused by intentions at higher levels, in turn modify these higher-level intentions. There is, however, a tension in the idea of discrete D-intentions starting the causal cascade (p. 188), and subsequently being modified by more dynamic processing at lower levels.
While functionally discrete states may be in principle possible in a dynamic
system, positing this sort of architecture does raise issues about the nature of
the feedback to higher levels. It is unclear how a functionally discrete state
is modulated by continuous and dynamic feedback, since the structure and
form of these two types of processes are radically different. In what follows,
we will expand on this point, arguing that this tension arises from adhering
to the folk notion of intentions, while also attempting to do justice to the
dynamic nature of action generation. It is this adherence, we will claim, that
renders Pacherie’s framework incompatible with the brain processes that
control our actions.
Figure 1. An overview of Pacherie’s “causal cascade” theory of intentional action, taken from
Pacherie (2008).
Action control in the prefrontal cortex
Based on a variety of evidence from imaging studies, neurophysiology, and
neuropsychology, the prefrontal cortex (PFC) is generally recognized to
be the locus of action generation and control in the brain (see Miller and
Cohen (2001) for an early review of this evidence). Various models of prefrontal action control have been proposed (see Badre (2008) and Ramnani and Owen (2004) for reviews), and while the details of
different models of control differ (see more on this below), the notion of a
posterior-anterior axis on the lateral PFC (lPFC) for implementing different
kinds of control is now widely recognized (Badre, 2008; Fuster, 2004).
The processes in the lPFC seem to exhibit the same temporally extended
and deliberative aspects as Pacherie’s D-intentions. For example, Koechlin
and colleagues (Koechlin et al., 2003; Koechlin & Summerfield, 2007) have
suggested and tested a model in which different types of action coordination are subserved by different areas of the lPFC. Based on differential activation found in imaging studies, they posit that posterior areas of PFC are
in charge of selecting between different “sensorimotor associations,” where
these consist of a learned connection between a stimulus property and a
specific motor act. When different sets of rules must be applied depending
on the nature of the stimulus sets presented, dorsolateral PFC (dlPFC) is
activated, performing a process that they call episodic control or contextual
control. When a task must be paused, and a new set of rules implemented,
due to the presentation of an interrupting stimulus, followed by a return
to the first task, the lateral frontopolar cortex (the most rostral part of the
PFC) is activated, a process they call branching control.
Like D-intentions, the processes in branching control are temporally extended in the sense that they must keep track of information longer than
other types of control—they involve maintaining a variety of task rules in
order to act appropriately in response to the presented stimulus. They also
seem to be genuinely deliberative, since they must keep track of a variety of
different considerations, and discern the appropriate relationships between
them, in order to complete tasks effectively.
These studies and models suggest that the functions attributed to the
regions along the rostro-caudal axis seem to be similar to the functions that
D-, P-, and M-intentions are supposed to perform. The posterior prefrontal cortex is thought to be involved in concrete action responses (Badre,
2008), just like M-intentions. More anterior and rostral regions are involved
in more temporally extended and deliberative action planning (Badre,
2008; Christoff & Gabrieli, 2000; Koechlin et al., 2003), just like P- and
D-intentions. So, in behavior requiring branching control, anterior areas
would store the required rules, and produce processes to apply the new
rules in the appropriate setting. Several of the proponents of the models
discussed here interpret their results in a similar way. Koechlin and colleagues suggest that a broad "goal representation" is maintained in aPFC
during the performance of sub-tasks, and activates the appropriate actions
at the appropriate times (2003). Burgess, Veitch, Costello & Shallice (2000)
interpret the deficits in patients with lesions in aPFC as an inability to form
and carry out intentions. Strikingly, Ramnani and Owen (2004) describe the
relevant sort of intention along the lines of such plans as to “Meet John at
5”: an obvious parallel to Pacherie’s D-intentions.
In all, if D-, P- and M-intentions are to be found in the brain, the most
likely location would be along the anterior-posterior axis on the lPFC. However, despite the similarities between functions attributed to the different
types of intentions and the processes along the anterior-posterior axis, we
will argue in the next section that the context-independent, discrete, and
causal rendering of intentions is deeply incompatible with the type of information processing that occurs in lPFC.
Complex control in the lPFC
We have shown that in Pacherie’s model, D-intentions are simple, discrete
and independent of the context in which the actions they cause are embedded. In this section we will show that there is convincing empirical evidence
that neural activity in the anterior regions of the lPFC is informationally as
well as dynamically complex. These types of complexity render prefrontal control processes incompatible with the properties and causal role of intentions
attributed by the folk view.
Let us start with informational complexity. A system’s operations are
informationally complex, in our view, if, during its normal operation, the
system has access to and makes use of a variety of different sources of information. The anterior regions of the lPFC receive input from the medial
temporal lobe, the thalamus, as well as multiple sensory areas, such as ventral visual areas, somatosensory areas, auditory cortex, and the rostral superior temporal sulcus, which is itself known to be a multimodal area (Miller
& Cohen, 2001; Ongür & Price, 2000). The existence of these pathways
suggests that activity in the anterior parts of the lPFC can, in principle, be
modulated by a variety of types of information.
That this sort of modulation occurs during action control is supported
by data from single cell studies in monkeys, showing that neurons of the
anterior regions of the PFC are sensitive to a variety of perceptual information about the stimuli that cause actions. Some early studies by Fuster,
Bauer & Jervey (1982) show evidence that anterior dorsolateral and orbital regions of the PFC had specific responses to spatial
elements, as well as perceptual features of the action context, and to the
relations between them. The researchers conclude that these data “provide
further evidence for the hypothesis that the prefrontal cortex is essential
for the temporal integration of sensory data and motor acts in sequential
behavioral structures” (p. 690). Following on these results, Fuster, Bodner
& Kroger (2000) trained monkeys to form an association between a tone and a color—specifically, the monkeys had to press
a certain color on a touch screen following the presentation of a certain
frequency sound. Single cell recordings were conducted in over 300 neurons spanning areas in the anterior parts of the dorsolateral and anterior
prefrontal cortex. In accordance with the previous findings, they discovered
that the vast majority of the cells increased activation in the interim between
the presentation of the tone and the colors, suggesting that these encode
the presence of a relation between tone and color.
Fuster and colleagues contend that the function of these PFC cells is to
facilitate the integration of perceptual information and, moreover, to do so
over an extended period (Fuster et al., 1982, p. 690). Thus, the cross-temporal
aspects of these neurons suggest an action control process that seems to be
in line with the cross-temporal role that intentions are supposed to play, yet
the process seems highly sensitive to specific perceptual information. It is
the presence of the tone in the perceptual environment that mediates processing in the lPFC, and this processing is directly relevant to performing
the proper action (via the learned perceptual association).
One could maintain that the cells that associate a tone and a color are
encoding a discrete intention, such as “I will press the appropriate color.”
This would be to invoke the folk notion that perceptual information is
turned into a context free state (e.g., “the tone is present”), which in turn
forms a discrete intention. An initial reason to doubt this sort of claim is
that these neurons are tightly interspersed with ones with more specific responses—i.e., responding to just a tone or a color—suggesting a highly integrated network. Also recall that the hierarchical views of actions which are
based on discrete intentions, including Pacherie’s theory, posit that discrete
intentions occur at a higher level—both causally and anatomically (Grafton
& Hamilton, 2007)—than non-discrete ones. The interspersed physiology revealed by Fuster et al.'s studies, then, speaks against this objection.
Even more challenging to the suggestion of discrete states in the anterior PFC is the dynamical complexity of these areas. We define dynamical complexity as the continuous modification of the state of a system in accordance
with processing and task demands. Amongst the neurons studied by Fuster et al. (2000) that showed feature-specific associative behavior, different behavior was found at different stages of the action sequence. Some neurons maintained correlated firing only through certain epochs of stimulus presentation—for instance, during auditory presentation and in the delay between stimuli—while others fired consistently across different stages. Moreover, even the neurons that seem to represent the association between the stimulus features—as opposed to neurons that fired for just one of the features—did not demonstrate a constant response; instead their activity varied in its temporal relation to stimulus onset. So the activation
of the anterior lPFC does not remain consistent during the course of planning and executing an action, but instead changes dynamically during the
progression.
Notice here the complementary natures of informational and dynamical
complexity: the dynamic activity shifts in the lPFC are due to the temporal
relation of the current situation to the perceived stimulus—i.e., whether it
is currently being perceived, or the length of time between the perception
and the action. The presence of different types of context-specific information
in the lPFC thus has continuous effects on its activity.
While it is hard to establish the presence of dynamical complexity in
imaging studies, due to the coarse temporal resolution of MRI data, the
single cell data we have discussed seem to be in line with the findings of
imaging studies regarding the systems-level properties of both the posterior
and anterior regions of human lPFC. For instance, more posterior areas
of the lPFC have been implicated in contexts where associations between visual stimuli and particular actions need to be recalled (Passingham, Toni, & Rushworth, 2000; see also Koechlin et al. (2003), Kouneiher et al.
(2009) and Rushworth (2008)). This suggests that activity in these areas is
modified depending on whether there is perceptual information needed
to complete a task. Importantly, these effects are not isolated in posterior
regions of the lPFC, as would be expected if representations became less
context-sensitive as one moved anteriorly along the anterior-posterior axis.
Prabhakaran, Narayanan, Zhao & Gabrieli (2000) compared two conditions involving tasks cued by verbal and spatial information, one in which
the two types of information were presented separately, and one in which
the verbal and spatial cues coincided. In the “non-integrated” (irst) condition, activation was higher in more posterior areas of the lPFC, whereas in
the “integrated” (second) condition, activation was considerably higher in
more anterior regions, particularly in the right hemisphere. The difference
between posterior and anterior regions, then, is not in whether information
about perceptual context is processed in anterior areas, but what kind of
context information is processed—i.e., information about specific environmental features more posteriorly, and information about relations between
environmental features more anteriorly (see also Christoff, Ream, Geddes,
& Gabrieli, 2003). Similarly, Ranganath, Johnson & D'Esposito (2000) found that left anterior lPFC shows increased
activation during tasks where more specific perceptual information had
to be recalled, suggesting that informational complexity increases as one
moves more anteriorly along the lPFC axis. Both the imaging and the single
cell results we have discussed support the informational complexity of the
lPFC, and the studies by Fuster et al. strongly support its dynamical complexity.
These types of complexity are irreconcilable with each of the three properties of the folk notion of intention discussed above. First, context-independence: The informational complexity of the PFC shows that even
the most anterior parts of the lPFC are not context-insensitive in the way
demanded by the core folk notion of intentions. If the role that Fuster et
al. (1982; 2000; 2001) attribute to lPFC neurons of integrating perceptual
information over a period of time is correct, then this context-sensitivity is
vital for the functioning of these areas. The view suggested by these results
is that branching and episodic control (Koechlin & Summerfield, 2007)
implement their functions by tracking information relevant to a particular
task or task set in the actor’s perceived environment over time. Branching
control, for instance, attempts to mediate shifting task demands with changes in context, and does so by keeping track of relations between a variety
of different informational sources. On the view suggested by informational
and dynamic complexity, this sort of procedure can produce temporally extended behaviors without relying on previously represented, discrete intentions, simply through dynamic interaction with a perceptual context.
Second, the two kinds of complexity also undermine the notion of intentions as functionally discrete and comparatively simple states. Dynamical complexity alone would be problematic for an attempt to isolate a discrete state
from ongoing processes, since it shows that activation in the PFC is continuously modified from the earliest stages of the task. This speaks against the
element of Pacherie’s view in which a discrete (i.e., not in a continuous
functional relationship with other elements of the system) state begins the
action coordination process, and is subsequently modiied by feedback from
non-discrete lower levels. The additional fact that these processes are embedded in context up to the highest levels means that it is very unlikely that
one will be able to isolate a state whose processing or content is dissociable
from the context in which it occurs, and therefore which is consistent across
changes in context. Of course, some elements of the context may remain
similar across instances of apple-grasping (e.g., apples tend to have roughly
the same shape and size across instances), and therefore some elements of
the apple-grasping process may be similar across most instances. However,
this does not mean that there is a state somewhere in this process that is
discrete and context-insensitive. As Fuster and colleagues suggest, these can simply be due to the influence of contextual elements that are similar across those instances, suggesting that apparent discreteness here is accidental.
Moreover, the informational and dynamical complexity of lPFC control
processes undermine the idea of diminishing complexity when one moves
up to more deliberative and temporally extended aspects of actions, which
is presumed by both Pacherie and by neuroscientists like Haggard. Instead,
the data discussed above suggests that there is a different kind of complexity
present at higher levels. At low-level motor control, there is complexity with
respect to detailed information regarding motor states—e.g., proprioceptive and sensorimotor feedback from the muscles involved—and detailed
plans for how to perform suites of motor effector movement. At higher,
deliberative, levels, there is a complexity of types of information and the relations between them. So unlike low-level control, in which there is complex
and detailed information within a type (effector-specific), higher levels of
control are complex in that they can process relations between a variety of
stimulus types, rules, outcomes, and temporal contexts. This is far removed
from the folk notion of a simple and discrete state.
Third and finally, causation: According to Pacherie, the types of intentions
involved in causation and control of actions are distinct, with D-intentions
starting the process and not being intimately involved in lower-level, situational and motor control, which depends on the other types of intentions.
We take the PFC data discussed above to show that even the most anterior
parts of the lPFC play a role in control as well as causation of actions. Koechlin et al.’s (2003) data, for instance, show that anterior lPFC must be actively
engaged in monitoring more posterior areas, since it is involved in recognizing the interrupting stimuli and implementing the new set of action rules
at the appropriate time.3 Indeed, Koechlin and colleagues suggest that the
areas that are involved in “selecting” actions (in their terms, appropriate
“stimulus-response associations”) at lower levels are also involved in controlling them. This clearly suggests the conjoined functioning of causation and
control in the PFC. Similarly, Ridderinkhof, van den Wildenberg, Segalowitz, & Carter (2004) state that the PFC has the capacity for both “evaluating” contexts and “regulating” activities, which is bound up in its ability to represent both associations between contextual features and rules for selecting the appropriate associations.

3 Of course, it is possible that anterior regions have no access to processes occurring at more posterior ones. It might be that anterior lPFC simply cuts off whatever processing is occurring in more posterior regions when episodic or branching control is required. However, we consider this unlikely. First of all, the large number of connections from posterior to anterior regions suggests the availability of information regarding posterior region activity to anterior regions. Moreover, presumably the most efficient way to produce the correct modification of more posterior processes will involve modifying their particular current processing in accordance with the needed form of control. In keeping with Koechlin et al.’s view of selection between different action associations, it is likely that anterior lPFC and orbital PFC areas are influenced by processes occurring at lower levels.

In all, we have argued that the functions posited for D-intentions are exhibited by the lPFC, but that the processes within these brain regions do not exhibit the properties to which the folk notion and Pacherie’s view of D-intentions are committed. This suggests that although Pacherie is right in emphasizing the dynamical nature of action control, her adoption of the folk interpretation of intention, and her emphasis on top-down causation (D-intentions starting the causal cascade) make her model irreconcilable with empirical data on prefrontal action control.

A possible objection to our perspective on these results would be to claim that the empirical data we have discussed so far correspond to relatively low-level intentions—i.e., intentions to perform specific motor acts in relation to concrete stimuli—but that there are more future-directed aspects of action planning and generation that are not accounted for by this picture. Perhaps there are further sorts of intentions that are discrete, and perhaps these originate outside of the lPFC. We will discuss this objection in detail in the next section.

‘Kicking it upstairs’
One could object to our claim that the deliberative, future-directed aspects of action planning are implemented by the lPFC by arguing as follows: Pacherie’s D-intentions span a big range, from specific, say the intention to eat an apple, to very broad and future-directed, such as the intention to spend your next holiday in Spain. We have exclusively focused on those D-intentions that could be accounted for by lPFC processes, but more distantly future-directed intentions may fall outside of the capacities of these processes. So, a different system might need to be invoked to account for intentions that are directed at activity in the far future. Since a part of the set of D-intentions can already be accounted for by lPFC processes, we have to split the set. Let us call the long-term D-intentions “S-intentions” (for “Spain intentions”). The suggestion then is that, in addition to the lPFC control processes, a separate system is needed to account for these S-intentions. This way, the propositional, discrete character of intentions is saved by placing it another step up the stairs in a hierarchical model of action
generation. Discrete S-intentions, on this view, would be capable of causing
a series of D-intentions; D-intentions would account for extended temporal
associations between context-specific information; P-intentions would plan actions within that current context; and finally M-intentions would recruit
the corresponding motor representations.
Although we cannot rule out something like S-intentions across the
board, they bring with them a serious threat of explanatory vacuity. Going
to Spain does not consist of a single action. This means that an S-intention
must cause and coordinate a series of more specific D-intentions (to go to the travel agency, to buy a Spanish dictionary, to renew your passport, etc.), which, due to their more direct link with an action, would more plausibly employ the PFC processes we have discussed. The problem is that intentions are supposed to fulfill the causal aspect of the folk notion, but these S-intentions do not cause a single specific action. If we allow intentions that do not directly and immediately cause actions into our scheme, it becomes extremely unclear what intentions are causing what actions. To illustrate, suppose going to Spain is part of John’s intention to live a happy life. This H-intention
(‘Happy-intention’) does not directly select a single intention on a lower
level, for example the S-intention (after all, a holiday in Spain is usually not
sufficient for a happy life), but causes multiple intentions, just like the S-intention causes multiple D-intentions. Maybe going to Spain is part of John’s
intention to impress his neighbors, which makes him happy. In that case
the impressing-neighbors intention sits between the H-intention and the
S-intention, but perhaps only partly, as John might have gone to Spain irrespective of his neighbors’ opinion. Allowing intentions that do not directly
cause actions, but cause other intentions in turn, opens the door to a virtually unlimited number of intentions for any specific action. Since these sorts
of intentions, on the folk notion, are supposed to be the primary explanatory construct for each action, saving discrete intentions by placing them at
further and further distances—both conceptually and anatomically—from
actual actions undermines the explanatory leverage.
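The regress can be made concrete with a toy causal graph (entirely our own construction; every node and edge below is hypothetical):

```python
# Toy illustration (ours, not a model from the text's sources): once
# intentions are allowed to cause other intentions, the set of candidate
# "explaining" intentions for one concrete action grows with every level.

causes = {
    # intention           -> intentions/actions it causes (all hypothetical)
    "live-happy-life":     ["impress-neighbors", "go-to-spain"],
    "impress-neighbors":   ["go-to-spain"],
    "go-to-spain":         ["book-trip", "renew-passport", "buy-dictionary"],
    "book-trip":           ["walk-to-travel-agency"],
}

def ancestors(action):
    """All intentions from which a chain of causation reaches `action`."""
    result, frontier = set(), [action]
    while frontier:
        node = frontier.pop()
        for intention, effects in causes.items():
            if node in effects and intention not in result:
                result.add(intention)
                frontier.append(intention)
    return result
```

Even in this four-level toy, the single act of walking to the travel agency already has four candidate intentions as causal ancestors, and inserting the impress-neighbors level multiplied the chains without explaining the act any better.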
Furthermore, the idea of propositional intentions causing the processes
of action control we have discussed faces empirical problems. In order
to explain extended action generation with discrete intentions, not only
must a location for these intentions be posited, but it must be made clear
how these discrete intentions cause specific action-control processes—i.e.,
by what neural pathways and in what temporal order. One possible locus for
attempting to find discrete inputs to the lPFC would be medial prefrontal cortex, due to its relative lack of sensory input and considerable projection to lPFC (Öngür & Price, 2000), and due to its recognized function of underlying the motivational elements of action generation (Egner, 2009). However, mPFC does not provide a univocal input to lPFC, but instead influences lPFC processing in several places (Kouneiher et al., 2009). Kouneiher and colleagues (2009) argue that motivational influence affects both contextual and episodic levels of control—i.e., the posterior PFC as well as midPFC. This means that the motivational factors that prime actions cannot be parceled as discrete and unitary input states. Instead, they appear to be multi-faceted processes that modulate activity at different anatomical locations and on different levels of neural control, implemented by multiple heterogeneous and dynamically interacting structures. This kind of difficulty, we contend, makes it unlikely that a pathway can be found via which a discrete intention (of the S- or H-type), formed independently of the control processes, could be propagated to the lateral PFC. See Figure 2 for an overview of the connectivity we have discussed.

Figure 2. Action control (based on Koechlin and Summerfield (2007)) and informational complexity in the prefrontal cortex. There is interconnectivity between the different regions of the PFC, as well as input on each level from sensory areas (Miller & Cohen, 2001). Moreover, there is motivational input on different levels from the dorsal Anterior Cingulate Cortex (ACC) and the pre-SMA (Kouneiher et al., 2009). Note that this is just a graphical overview of the control processes we have discussed, and the combined functional connectivity that underlies them, not an alternative theory of action generation or control.
Relatedly, functional connectivity studies into aPFC suggest that there is massive influence from more posteriorly located (“bottom-up”) PFC processes,
as well as from mPFC and from sensory and subcortical areas (Miller &
Cohen, 2001). So, the information processing leading to action generation
in the aPFC is not likely to be caused primarily by another system producing
discrete intentions elsewhere in the brain, but is most likely due, at least
in large part, to other processes (such as the motivational ones discussed
above) occurring within the PFC. This is, of course, necessary when one
realizes that deliberation does not take place “in a vacuum”, but is to a large
extent influenced by the context an agent is in, the motor capabilities of the
agent, the background state of the agent, etc. Figure 2 clearly shows that the
folk notion with its discrete, context-independent and causal character does
not match the neural processes that cause and control our actions.
Cognitive outsourcing
Thus far we have argued that the types of intentions posited by the folk notion cannot be the primary causes of our behavior. However, this does not
exclude the possibility that discrete representations—for instance certain
memory traces or linguistic representations—might play some other, subsidiary role. Here, we speculatively suggest such an account, following Clark
and others (Clark, 1997; 2006; Elman, 2004), on which discrete representations are a means for cognitive outsourcing, or scaffolding. On this type of
view, discrete representations can serve as stabilizing factors for the more
dynamic processes that actually control actions (Clark, 2006, p. 372). Specifically, the idea is that, while actions are initiated and controlled via a dynamic process centered in the PFC, additional representations can occasionally play a role as stable elements around which the dynamic processes
can organize themselves.
So, while it may be highly problematic to hold that discrete representations directly cause actions, due to the conceptual and empirical problems
we have discussed, it may be that the presence of stable representations
provides a “cognitive resource” that allows the PFC to update its actions in accordance with the representation. This would be akin to writing
“go to Spain” on a blackboard or in a calendar, and looking at it occasionally
when one gets distracted. Such an outsourcing process could enable some
processes that would not be possible otherwise. For instance, it might be
that for actions that demand long deliberation periods and are highly
complex, such as a trip to Spain or planning a party, the PFC by itself would
not have—and this is a speculative suggestion—the storage capacity to coordinate all of the appropriate actions on its own. Clark gives a compelling argument that some distinctively human processes depend on such outsourcing procedures, and complex and long-range planning might well be one
of them (Clark, 2006).
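The scaffolding idea can be caricatured in a few lines of code (a speculative sketch of our own; the “blackboard” token, the drift model, and all parameters are assumptions for illustration only):

```python
# Speculative toy (our own caricature of the outsourcing idea): an ongoing,
# noisy process drifts with context, and an occasional glance at a stable
# external token ("the blackboard") re-anchors it. All names, the drift
# model, and the numeric parameters are assumptions.
import random

def pursue(goal_token, steps=50, check_every=10, seed=0):
    random.seed(seed)
    focus = goal_token
    distractions = ["check-email", "chat", "get-coffee", goal_token]
    rechecks = 0
    for t in range(1, steps + 1):
        focus = random.choice(distractions)   # ongoing dynamics: focus drifts
        if t % check_every == 0:              # glance at the external token
            focus = goal_token
            rechecks += 1
    return focus, rechecks
```

With the defaults, `pursue("go to Spain")` ends with the process focused on the goal after five glances at the token. The token never causes a particular action; it only stabilizes the dynamics, which is exactly the division of labor the outsourcing view proposes.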
The process of cognitive outsourcing is emphatically a different type of
process than the ones attributed to the folk interpretation of intention.
The stable discrete or linguistic representations serve only as an occasional
guidepost by which a structure like the PFC can orient itself in the course
of an extended process. As Clark notes, there is no outside governing system that uses the representations to direct the process in question. Instead,
the outsourcing view posits that stable representations are tools for the self-manipulation of cognitive systems (Clark, 2006, p. 373), not a gateway for the influence of outside systems. Thus, when applied to action, the outsourcing
view posits that the causation and control of particular actions is still centered
squarely within the PFC, and this still undermines the causal properties associated with discrete intentions by the folk notion.
Of course, this proposal is speculative, and the details of the relationship between the dynamic and continuous control processes and processes
operating on larger time-scales or more stable representations demand a
thorough investigation. However, we believe that research into intentional
action is better served by focusing on the complex interaction of different
control processes, and perhaps their link with additional stable representations, than it is by attempts to localize intentions of the folk variety, or to base elaborate hierarchical models of action and motor systems on such a notion. In our
view, this does not do justice to the complexity involved in action generation
and control. The important questions in understanding a complex system
involve investigating how structure emerges from the continuous interaction of the components involved (Bechtel, 2008). In this particular case,
we contend that understanding action control will involve understanding
(i) how multiple types of information are integrated, both across modalities
and across time, (ii) how informationally complex processes at the anterior
regions of the PFC are translated into motorically complex information at
more posterior areas, and (iii) how these processes are shaped by the context of the action. The folk framework has no explanatory resources for
addressing these questions, which we claim are vital for understanding how
cognitive beings interact with the world.
In summary, we have argued that the discrete-intention view to which
the folk notion is committed is not a suitable framework in which to accommodate the complexity and details of action generation in the brain. How,
then, can we account for the compelling intuition underlying the ubiquity
of the folk notion? In the next section we will sketch the outlines of the
answer to that question, on which the characteristics of the folk interpretation of intentions stem from the use of the concept of intention in social
explanation and communication.
The social origin of intentions
To take a first step towards understanding our strong intuitions about intentions being the cause of our behavior, we have to look at how we go about
explaining behavior. Suppose we see Mary in the hallway in her department,
and we want to explain why she is walking to the coffee machine. In posing this question, a crucial step has already been taken implicitly: we
have framed Mary’s behavior in terms of a single action—i.e., walking to the
coffee machine. We have already zoomed in on one aspect of Mary’s behavior, and ignored other aspects. McFarland (1989) calls this the “teleological
hypothesis,” the idea that a piece of behavior can be considered in isolation from the rest of the behavioral repertoire (p.39). In reality, he claims,
actions are never singular. In a normal episode of walking to the coffee
machine, Mary in fact performs a variety of actions: she retains an upright
posture; she tries to avoid making too much noise; she hums, or perhaps
mumbles; she greets her colleague; she stretches her legs, etc. Describing
behavior in terms of a single action is in many ways an artificial strategy (McFarland, 1989; Uithol et al., 2012).
Moreover, walking to the coffee machine is but one level at which we
can describe the action (Uithol, van Rooij, Bekkering, & Haselager, 2011b).
What Mary does at one moment can be described at many levels: flexing her knee, putting one leg in front of the other, walking, getting coffee, relieving herself of drowsiness, attempting to work on her paper more effectively, pursuing a fruitful scientific career, etc. In most cases an explanation such
as “Mary is getting coffee” is intuitively the most reasonable, but not in all
cases. Which level we pick to describe her behavior is dependent on what
we want to explain (Vallacher & Wegner, 1987), but also, and importantly, to whom. For instance, if we are explaining Mary’s perambulations to her boss,
we might focus more on her desire to be alert and productive, than on the
action of getting coffee per se.
So, in general everyday explanation of actions, we tend to pick only a
sub-section of the behavior (we focus on the obtaining of coffee, and momentarily forget about the humming, the posture, the leg stretching, etc.),
and focus on one level of action description (‘walking’ is the action, and not
‘placing one leg in front of the other’, or ‘working efficiently on a paper’).
But which action and which level we pick, is dependent on the context and
the nature of the action as well as the audience to which we explain the action.
While this strategy is artificial in several ways, it serves vitally important
roles in social explanation. Since we do not have access to the complex
interaction of the various control processes that shape Mary’s behavior, we
posit a single intention that explains just that part of Mary’s behavior that
we are interested in. So not only do we adopt what Dennett calls an “intentional stance” towards Mary (Dennett, 1987), it is a highly contextualized
stance, focused on only one aspect of behavior, at one level of description.
This means that one possible way of accounting for the origin of the folk
notion of intention is by stressing its role in explanations of behavior. Since
these explanations are largely a social tool (i.e., we explain Mary’s actions
to someone), the positing of discrete intentions serves an important communicative function.
Interestingly, we seem to be in a similar situation with regard to explaining our own actions as we are in explaining Mary’s actions. Just as we have no direct access to the goings-on in Mary’s prefrontal cortex, our conscious access to our own control processes is limited as well (Dennett, 1991; Sellars,
1963; Wegner, 2003). So when asked what we are doing or why we are doing
it, the best we can do is give a shorthand version, or a rough approximation
of the expected result of the various control processes. This also holds for
contexts where we explain our actions to ourselves (see also De Ruiter et al.,
2007). Suppose John is trying to figure out why he went to Spain. It is well
beyond his introspective abilities to recall all of the complex factors involved
in producing the necessary actions, which, as we have argued, would be
highly mediated by activity in the PFC. What he can do, perhaps, is describe
some primary motivations that contributed to his action (for example, the desire to enjoy warm weather after a disappointing summer, or the desire to practice Spanish conversation, etc.). This linguistic phrasing helps
us (re)construct the motivations behind our own actions, thus making our
actions comprehensible (Nisbett & DeCamp Wilson, 1977), but it would be
a mistake to take this approximation to be the key causal factor at work in
producing the actions themselves.
With this in mind, it is easy to understand why the folk notion attributes
to intentions the properties we have discussed. We have shown that, explicitly for philosophers, and often implicitly for neuroscientists, intentions in
the folk interpretation are thought of as propositional states, usually framed
as a small sentence. This is, of course, exactly the format one would expect
when a complex of neuronal and bodily activity is summarized as a linguistic representation, used for communication purposes.
Moreover, language is a representational format that (unlike, for example, certain pictorial formats) facilitates, or even necessitates, abstraction
and generalization. Short linguistic descriptions of objects, events, states,
etc., cannot accommodate a high degree of detail.
Thus, assuming that representations of this sort are both implemented in the brain and generate actions leads to the commitments of context-insensitivity and simplicity that are inherent in the folk view. Since the processing of the PFC in action generation does not match these properties, we conclude that the nature of intentions is social, not biological, and therefore that frameworks based on this notion are unlikely to significantly illuminate the processes by which the brain generates actions.
Conclusion
We have argued that the prefrontal cortex is not likely to implement states
that match the folk notion of intention. We have focused on the PFC due
to its known contribution to action control, and the similarity between its
established functions and those that Pacherie posits for the different types
of intentions. There is still much to discover about action control, and of
course the brain is bigger than the PFC. Thus, one could maintain that,
despite the conceptual and empirical problems laid out above, and despite
what we know about neural processes of action control, discrete intentions
do exist, and that through further exploration some structure or process
will be found that does not exhibit informational and dynamical complexity, and therefore can be seen as implementing discrete intentions. We think
that at this point the burden of proof lies with those who would entertain
such a view. This burden encompasses not only finding the brain structure that could be responsible for the generation and processing of intentions, but also specifying what types of behavior these states can account for. Unless such
states are found, we believe that it is a more fruitful strategy to concentrate
on the nature and interactions of the dynamic processes that shape our actions, instead of holding on to a folk interpretation of intention for which
there seems to be little empirical motivation.
abstract
Intention reading and action understanding have been reported in ever-younger infants.
However, the notions of intention attribution and action understanding, as well as their
relation to each other, are surrounded by much confusion, making it difficult to assess the meaning and value of such findings. In this paper we set out to clarify the notions of ‘action
understanding’ and ‘intention attribution’, and discuss their relation. We will show that what
is commonly referred to as ‘action understanding’ in fact encompasses various heterogeneous association and prediction mechanisms. In general, these forms of action understanding do not result in the attribution of an intention to an observed actor. By disentangling
intention attribution from action understanding, and by exposing the latter as an umbrella
notion, we provide a novel theoretical framework that prevents conceptual confusion and
allows for better comparison of findings from different experimental paradigms, and a much
more fruitful approach to comparative questions.
This chapter is submitted for publication as: Uithol, S., & Paulus, M. (submitted). What do
infants understand of others’ action? A theoretical account of early social cognition.
six action understanding
in infants
action understanding
intention
infancy
social cognition
Introduction
Research in recent decades has provided ample evidence that young infants process information about other people’s actions in a particular way, and that they use this information to understand and predict others’ behavior, as well as react adequately to it (e.g. Barresi & Moore, 1996; Bigelow &
Birch, 1999; Carpendale & Lewis, 2004; Elsner, 2007; Elsner & Aschersleben, 2003; Falck-Ytter, Gredebäck, & Hofsten, 2006; Hauf, 2006; Kochukhova & Gredebäck, 2010; C. Moore, 2006; Nyström, Ljunghammar, Rosander,
& Hofsten, 2011; Paulus, 2011; Phillips, Wellman, & Spelke, 2002; V. Reddy,
2010; Reid et al., 2009; Ruffman, Taumoepeau, & Perkins, 2011; Sodian,
2011; Southgate, Johnson, Karoui, & Csibra, 2010; Tomasello, 1999; Woodward, Sommerville, & Guajardo, 2001). Initially, these findings were mainly
based on the application of traditional preferential looking paradigms (e.g.
Woodward, 1998), but more recent research methods such as eye-tracking
and EEG recordings have partly confirmed and extended these findings
(e.g. Falck-Ytter et al., 2006; Paulus, Hunnius, & Bekkering, 2011a; Reid et
al., 2009; Reid, Csibra, Belsky, & Johnson, 2007). For example, by measuring
oscillations of the gamma-band over the frontal cortex, Reid and colleagues
(2007) provided evidence that 8-month-old infants differentiate between
complete and incomplete actions. Employing eye-tracking technology,
Falck-Ytter and colleagues (2006) examined infants’ eye-movements during
the observation of a grasping and transport action, and provided evidence
that 12-month-old infants were able to visually anticipate the target of the
ongoing actions. Moreover, it has been shown that infants at the end of
their first year of life differentiate between different kinds of intentional actions: they react more impatiently when an adult is unwilling to hand them
a toy compared to when he is unable to do so (Behne, Carpenter, Call, &
Tomasello, 2005). Taken together, these studies provide rich evidence for
the claim that from early in life human infants possess some ability to understand other people’s actions.
A number of influential theoretical accounts have interpreted these data as evidence for infants’ ability to read others’ intentions (Baron-Cohen,
1995; Luo & Baillargeon, 2010; Meltzoff & Brooks, 2001; Tomasello, 1999;
Tomasello, Carpenter, Call, Behne, & Moll, 2005; Woodward, 2009). More
precisely, it has been argued that action understanding depends on the ability to read intentions. For example, Luo and Baillargeon (2010) start their
review by stating that “[o]ur ability to make sense of others’ intentional
actions rests primarily on our ability to understand the mental states that
underlie these actions” (p. 301). Baldwin and Baird (2001) claim that “[o]
ur everyday, common-sense ability to interpret and predict others’ behavior
hinges crucially on judgments about the intentionality of others’ actions”
(p.171), and Woodward (2009) agrees that “infants understand intentions
as existing independently of particular concrete actions and as residing
within the individual” (p. 55).
Interestingly, others—while still adhering to the notion of intention understanding—have reversed this relation and claim that intention reading
is the product (i.e. the result) of a more low-level form of action processing.
For instance, Gallese and colleagues (2009) stress the impact of infants’ mirroring processes (motor simulation of observed actions), and posit that this
functional mechanism forms the basis of their capacity to understand intentions. Similarly, Gallese and Goldman (1998) state that “humans’ mindreading abilities rely on the capacity to adopt a simulation routine. This capacity might have evolved from an action execution/observation matching
system” (p. 493). These simulation routines allow that others’ “intentions
can be directly grasped” (Gallese et al., 2009, p. 105).
It is not only the role of intention attribution that is subject to debate; the
notion itself is as well. For example, in a cautious note, Baldwin and Baird
(2001) acknowledge that “our everyday notions of intention and intentionality may turn out to be invalid as characterizations of the genuine content
or processes of the mind/brain” (p. 172). That is, even though in daily life
we generally speak about others’ intentions, it is questionable whether the
concept of ‘intentions’ is a valid scientific construct that is useful for guiding neuroscientific research (see also Uithol et al., submitted).
Taken together, notwithstanding the volley of publications on the early development and role of intention reading in action understanding, there seems to be no general agreement on the notions of intention
attribution and action understanding, the relation between the two processes, and their role in cognitive functioning (see also Perner & Doherty,
2005). We will argue that much of the discussion is the result of imprecise
use of terminology. By showing that several distinct processes are subsumed
under the label of ‘action understanding’, and detaching these processes
from intention attribution, we aim at helping this discussion forward. We
will argue that in everyday situations these processes only rarely result in the
attribution of an intention. Consequently, framing these various forms of
infants’ action understanding in terms of ‘intention attribution’ can be confusing and easily lead to over-estimation of infants’ capacities. Importantly,
our claim is more encompassing than a suggestion for terminology; we will
show that our more precise terminology allows for reinterpretation of current findings, and can aid in designing new experiments.
What do we understand when we
understand an action?
There seems to be a great variety in the interpretation of the notion ‘action understanding’ in the social cognition literature (see Uithol, van Rooij,
Bekkering, & Haselager, 2011b for a detailed overview). First, it can mean
classifying an action, e.g. recognizing a grasp as a grasping action (examples
will be discussed below). Second, it can mean recognizing the goal behind
an action, i.e. recognizing a grasping action as being aimed at a particular
target. This goal, in turn, also allows for various interpretations. In its most
general form, a goal is framed as a desired world state, but in cognitive and
neuroscientiic experiments, goal recognition is generally operationalized
as target prediction or super-ordinate action prediction. In target prediction, the
notion of ‘goal’ is interpreted as the object or end location towards which
an action is directed: We see a grasping action, and we understand that the
grasp is directed at this speciic object. Next, super-ordinate action prediction
means that the goal of an action is interpreted as another action, but one of
a higher abstraction. A grasping action towards a cup can be interpreted as
a cue that the actor wants to drink. Drinking is also an action, but of a higher abstraction (i.e. grasping the cup is a means for the goal of drinking).
Fourth and inally, action understanding can mean generating an appropriate response to an observed action. We will discuss the different forms
of understanding in more detail, and relate them to research into infant
action understanding.
Action classification

When we classify an action, we recognize that it belongs to a certain category of actions. Through this categorization, the observed action is recognized to have the properties that belong to the category to which the action is allocated. That is, we do not perceive another person’s behavior as unrelated movements, but rather as a particular action that follows a particular pattern. Such knowledge allows us to make sense of the observed movements, as we are able to embed them into structures of actions, and enables us to make predictions about how the action will continue or end. For example, by recognizing an action as a grasping action, the observer is able to predict that the hand will follow a particular movement pattern and continue its trajectory towards an object and take hold of it. This recognizing and predicting should emphatically not be interpreted as a conscious process, but rather as a process of pattern completion (Buffart, Leeuwenberg, & Restle, 1981). This is something our mind does all the time, just as we recognize a chair as something we can sit on without reflecting on its use.

Only a few studies have examined infants’ developing ability of action classification (Baldwin, Andersson, Saffran, & Meyer, 2008; Friend & Pace, 2011; Loucks & Baldwin, 2006; Saylor, Baldwin, Baird, & LaBounty, 2007). For example, Baldwin and colleagues (2001) presented 10- to 11-month-old infants with videos of series of ongoing actions. They showed that infants were more surprised when the video was paused in the middle of an action than when this happened in between two actions, which suggests that infants recognized the beginning of an action as part of a particular type of action, and that they thus classified the action. Another example is provided by Behne and colleagues (2005), who showed that 9- to 18-month-old (but not 6-month-old) infants correctly classified an action either as a failed attempt to hand them a desired toy or as deliberately not handing the toy (i.e. teasing).

The mechanisms underlying action classification in infants are still unclear, but work with adults suggests that plain familiarization and the detection of statistical regularities may account for much of this capacity (Baldwin et al., 2008). Infants observe thousands and thousands of grasps, for instance, so generalization across these instances is likely to occur. Yet, it remains an open question for future research whether there are other mechanisms involved, and whether there are also top-down influences on action classification1.

1 Mirror neuron studies, for example, suggest such an effect. Umiltà and colleagues (2001) showed that monkeys were able to recognize a partly occluded grasping action only when the monkey knew that there was a graspable object behind the occluder.

Target prediction

Anticipating the target of an action upon observing a not yet completed action is—like action recognition—a matter of having an expectation about what will happen when the action is continued or how the action will finish. Target prediction has been a major topic of interest in infancy research over the past decade, so there is now a considerable body of research on infants’ target anticipations (Cannon & Woodward, 2012; Falck-Ytter et al., 2006; Paulus, Hunnius, & Bekkering, 2011a; Woodward, 1998). Two variations can be discerned: the target is an object (Woodward, 1998) or the target is a certain location (Falck-Ytter et al., 2006; Kochukhova & Gredebäck, 2010)2. In simple experimental setups, with for example only one object around the target location, this is highly similar to action recognition.
In both action classification and target prediction, action understanding is related to a prediction of how the action will end. Recognizing an action as a grasping action involves recognizing that there is a target of the grasping action. Predicting the target is more difficult, however, when there are multiple action targets available. In these cases plain associations between a graspable object and an action no longer provide enough information for forming an expectation about the action. Additional information needs to be incorporated. This information could be based on the recognition of statistical regularities (Ruffman et al., 2011) or the integration of context information (Perner & Ruffman, 2005; see also Uithol, van Rooij, Bekkering, & Haselager, 2011a). Paulus et al. (2011b), for example, showed that 9-month-olds visually anticipate which path (out of two) an agent is going to take based on previous visual experience with this agent and his behavior.

In addition to statistical regularities of the observed actor, infants can employ cues such as grip type to predict the correct target. One-year-old infants expect an actor to grasp a particular object depending on the actor’s grip size (Daum, Vuori, Prinz, & Aschersleben, 2009), and 20-month-old infants visually anticipate the correct action target of a tool-use action depending on how the actor grasps the tool (Paulus, Hunnius, & Bekkering, 2011a). Furthermore, it has been shown that infants rely on their own action experiences to predict others’ action goals (Sommerville, Woodward, & Needham, 2005) and also take affordances and functional properties of the objects included in the task into account (Gredebäck & Melinder, 2010; Gredebäck, Stasiewicz, Falck-Ytter, Rosander, & Hofsten, 2009).
2 Note that in Woodward’s (1998) influential paradigm both interpretations of target are present, although only target objects are referred to as ‘goals’, while target locations are dubbed ‘locations’.

Super-ordinate Action Recognition

In super-ordinate action recognition, it is understood to which higher-level action the observed action contributes. For instance, it is understood that a cup is grasped for drinking. This type of goal recognition is usually tested by means of prediction of the subsequent action (e.g. moving the cup towards the mouth). Super-ordinate actions can be more difficult to recognize, as often there is no one-to-one mapping between observed actions and the super-ordinate actions. Not every cup is grasped for drinking, for example; the cup can also be grasped in order to be placed in the dishwasher (see Iacoboni et al., 2005). This means that additional cues are needed to form the right expectation about the next action. For example, a grasp towards a full cup is more likely to be followed by drinking, a grasp towards an empty cup by placing it in the dishwasher. So next to straightforward action-object and person-object associations (Paulus, 2011; Perner & Ruffman, 2005), context information and additional information about the object is needed to successfully predict these actions.

Super-ordinate action recognition has, to our knowledge, not been a topic of wide interest in infant studies. The only study that examined this issue comes from Woodward and Sommerville (2000), who showed that 12-month-old infants understood a single action as embedded in a sequence of other actions (i.e. as predictive for a subsequent action), when the actions were causally connected. One reason for this lack of research interest might be that infants’ super-ordinate action recognition is methodologically more difficult to assess than target anticipation, as the latter deals with concrete objects that stay in the same place. Further research is necessary to understand the ontogenesis of super-ordinate action recognition, and the mechanisms that could underlie this capacity.

Response selection

A final form of action understanding we want to discuss is being able to perform an appropriate response in social interactions (Bernieri, Reznick, & Rosenthal, 1988; Eckerman & Peterman, 2001). The recognition of response selection as a form of action understanding is theoretically rooted in embodied cognition (Clark, 1997) and ecological approaches to social interaction (Knoblich & Sebanz, 2008; K. L. Marsh, Richardson, Baron, & Schmidt, 2006). For example, infants in their first year of life learn to react adequately to their caregivers’ attempts to feed them by opening their mouth in advance (van Dijk, Hunnius, & van Geert, 2009; Young & Drewett, 2000). Were a third person to observe these interactions, and subsequently be asked why the infant reacted the way it did, she would likely say that the infant recognized the intent of the other (i.e. to feed) and reacted with the appropriate action. That is, one is inclined to think that in order to perform an appropriate response, one needs to first understand an action, making response selection a consequence of action understanding, not itself a form of understanding. This need not be the case. It might be enough that infants recognize a particular context (e.g., a particular daily routine, a room) and that they learn to act on the perception of a particular signal (e.g., the approaching spoon) with the opening of their mouth. Such behaviors do not necessarily entail a prediction or a full classification, but rather the initiation of a “correct” response (without awareness that it is a correct response); even so, they show some form of action understanding (De Jaegher, Di Paolo, & Gallagher, 2010; Mead, 1934). This form of action understanding might rely on habitual learning and the acquisition of perception-action associations (Bargh & Chartrand, 1999; Heyes, in press; Van Schie, van Waterschoot, & Bekkering, 2008).
Summary
What the exposition above shows is that not every form of action understanding has received an equal amount of attention within developmental psychology. Target prediction seems by far the most studied, which enables discussing candidate mechanisms that provide for this capacity. Other forms of action understanding, such as super-ordinate action recognition and response selection, are largely overlooked, so that suggestions for the underlying mechanisms are much more speculative. The developmental pathways of these abilities remain subject for future research.

Note that the order in which we presented the various forms of action understanding does not suggest an increase in complexity. Recognizing a scooping action as belonging to ‘feeding’ is likely to occur before complex actions such as uncorking a bottle can be recognized. Neither does the order suggest that the latter forms of understanding depend on the former. Recognizing a target object may very well be a prerequisite for classifying an action as a grasping action (see footnote 1). Future empirical research is needed to investigate the heterogeneous roots of infants’ action understanding, and the interaction of the various capacities.
We discussed ways of understanding actions, and how infants might accomplish this. The review of the literature shows that action understanding is a quite heterogeneous and multifaceted concept that subsumes many different forms, which are only partly related to each other. At this point it is important to note that the forms of action understanding discussed can all be interpreted as grasping something from the other’s intentional action, but none of them involve attributing an intention (see also Perner, 2010; Povinelli, 2001). This is not to argue that intention attribution does not play a role at all in infant and adult action understanding, but that this role might be smaller than is generally assumed. In the next section we will discuss how intentions are traditionally construed, and what they are posited to do or explain.

What is in an intention?

Rooted firmly in folk psychology (Haselager, 1997; Stich, 1983; Stich & Ravenscroft, 1994), the notion of intention plays a key role in psychological explanations. Intentions to act are generally thought to consist of a belief about the physical environment, a desire to change the environment, and an action plan to realize that change (Bratman, 1987; Malle & Knobe, 1997; Moses, 2001).

Since intentions are intimately related to beliefs, both mental states have similar properties (cf. Uithol et al., submitted). First, both are thought of as functionally discrete mental states, meaning that they play a functional role that can be clearly contrasted with the role other beliefs and intentions play. For example, the intention to buy a pear can be contrasted with the intention to buy an apple, as each of the intentions would result in a different action. Second, both beliefs and intentions have context-independent content, allowing one to form beliefs or intentions about situations and contexts other than the current one, for example the intention to pick up groceries on your way home after work, even though you are still in the office. Finally, both mental states are often (although not necessarily) assumed to have a propositional structure, meaning that their structure or syntax resembles the structure of linguistic representations.

In the literature on intention understanding, it is generally and implicitly assumed that an intention in the actor is responsible for causing and guiding his or her action (Baldwin & Baird, 2001; Luo & Baillargeon, 2010; Tomasello et al., 2005). Intention attribution is correct, it is assumed, when the attributed intention matches the intention that was responsible for the observed behavior. For example, Baldwin and Baird (2001) equate “what others intend” with “judgments about the specific content of the intentions guiding others’ actions” and claim that this is crucial for predicting behavior (p. 171).
However, there is increasing evidence that this is a highly problematic assumption. At least three problems make the idea of intentions being the cause of our actions an unfortunate framework for studying action understanding. First, actions are never performed in isolation, but always in parallel and in interaction with many other actions (McFarland, 1989; Uithol et al., 2012; submitted). One cannot perform a grasping action without also breathing, making saccades, balancing one’s body, etc. At the same time, one might also hum, visually target the object, or make eye contact with another person. A different intention can be attributed to each of these simultaneous actions. As a consequence, at each moment we can attribute an indefinite number of intentions to others. The objection that only one of these actions was intended, and that the others are either unconscious or necessary side effects, will be addressed below, when we discuss how our behavior is caused and controlled.
Second, every action can be described at a virtually unlimited number of levels, with a different intention for every new action description (Uithol et al., 2012; submitted). A grasping action can be described as “stretching one’s elbow”, “grasping an object”, “drinking”, “maintaining homeostasis”, “survival”, etc. Of course, it is unlikely that we infer “maintaining homeostasis” from an observed action, but there seems to be no principled reason to choose one level over another as the level at which an action should be described and an intention should be attributed. Empirical research has shown that intention attribution depends on a number of factors, such as whether the observed person is liked or disliked, or whether the performed actions are positively or negatively valued (Kozak, Marsh, & Wegner, 2006). Also, explicit instruction to take the other person’s mental perspective does not affect the probability of intention attribution, but merely influences the level at which the actions are described. The fact that there is not a single level at which actions are described makes any claim about what particular intention is inferred in a particular situation underdetermined, unless one has specified which level one is interested in.
Third, it is highly questionable that intentions conceived as mental states are the actual key causal factors in generating behavior. Much of our
behavior (and most of the actions used in experimental setups) is generated and controlled via (interaction of) dynamic control processes (Fuster,
2001; Koechlin et al., 2003; Kouneiher et al., 2009; Petrides, 2005; Smith,
Thelen, Titzer, & McLin, 1999; see also E. Thelen & Smith, 1994; E. Thelen,
Schöner, Scheier, & Smith, 2001). For example, Koechlin, Ody and Kouneiher (2003) posit a model of action control in which behavior is the result
of four interacting control layers, each functioning on its own time scale. Goal-directed behavior is not the result of an action goal that is propagated from one (higher) control layer to lower ones, but emerges from the interaction of these layers (see also Uithol, van Rooij, Bekkering, & Haselager, 2012). This complex and dynamic interaction is deeply incompatible with the prevalent understanding of the notion of intention, discussed above, as a functionally discrete state with context-independent content (Uithol, Burnston, & Haselager, submitted). Of course, it often seems as if we first form an intention, after which we start our action. We decide to get coffee, and subsequently we get up from our desk, walk down the hallway and head for the coffee machine. But it is important not to confuse this personal-level, phenomenological description with the actual causes of our behavior. Our postulated intention—even if we bother to explicitly formulate one—is more likely to be a rough summary of the outcome of the complex interaction of the control processes than an actual mental state that is responsible for the subsequent actions.
If our behavior is indeed the result of dynamic and online processes, and there is no discrete intention that causes our actions, intention attribution cannot be a matter of matching the attributed intention with a behavior-causing one. Alternatively, we would argue that when you attribute an intention, you isolate from an ongoing and continuous stream of behavior a selection that you deem worth explaining. For example, when a colleague offers you a cup of coffee, you focus on the hand that passes you the coffee cup, and momentarily ignore her breathing, her balancing her own cup, her saccades, etc. As you do not have access to the complex processes that control the observed behavior, you postulate an intention that explains just the part you are interested in, usually the most salient perceptual effect (Elsner, 2007; Paulus, in press), i.e. the passing of the coffee.
Like the other forms of action understanding discussed above, this attributed intention is an expectation of how the action will continue or end,
or what action is likely to follow the current one, but there are important
differences. Attributed intentions have an explicit propositional format
while the low-level action predictions discussed above do not (see also Moses, 2001). This explicit propositional format allows for the attribution of highly complex intentions, which can make intelligible complex actions that are directed at a distant future, something that would not be possible relying solely on low-level action predictions. For example, not only can one attribute the
intention of grasping a cup to an observed action, but also grasping a cup
in order to clean the table, or cleaning the table because the actor expects
visitors, and so on. The level of abstraction of the attributed intentions is
virtually unlimited. However, there are additional constraints. Intentions,
for example, need to correspond to beliefs an actor has. An actor cannot
intend to get coffee unless she also has the belief that there is a coffee machine down the hallway. This makes belief attribution also necessary for attributing intentions (Moses, 2001), thereby making intention attribution a
cognitively demanding and often conscious practice (see also van Rooij et
al., 2011), and one we do not often engage in, as we will explain in the next
section.
We rarely attribute intentions
In everyday life we only seldom seem to engage in intention attribution. We
thoughtlessly grasp a cup of coffee when someone offers one, without attributing the intention to offer coffee to the other (just like we can thoughtlessly sip our coffee without ever explicitly forming the intention to take a sip).
Work by social and cognitive psychologists has provided a compelling body of evidence that large parts of our everyday social interaction are not guided by explicit reflections and conscious considerations (Bargh, 2006; Bargh & Chartrand, 1999; Bargh & Ferguson, 2000; Bargh, Chen, & Burrows, 1996; Bernieri & Rosenthal, 1991; Cesario, Plaks, Hagiwara, Navarrete, & Higgins, 2010; Langer, 1978), and that the intentional control of action is often quite limited (for developmental findings see also Kenward, Folke, Holmberg, Johansson, & Gredebäck, 2009; Klossek & Dickinson, 2012; Klossek, Russell, & Dickinson, 2008), with behavior consisting rather of automatized action routines (Aarts & Dijksterhuis, 2000; Dijksterhuis & Nordgren, 2006).
At this point, we can think of two objections to our claim that action understanding and intention attribution are distinct processes and that the latter is a relatively rare phenomenon. First, one might claim that we are using an overly rich and high-level interpretation of the notion of ‘intention’. Perhaps our analyses hold for high-level intentions, such as the intention to pick up groceries after work, but not for simpler ones, such as the intention to pick up an object. Searle (1983), for example, contrasts prior intentions with intentions in action (see also Bratman (1987) and Pacherie (2000; 2008) for similar distinctions). Prior intentions represent the goal of an action, and the means to achieve it, prior to the action, while intentions in action are the immediate causes of the movements needed to perform the action.
With this contrast in mind, one could object that our analysis might hold for
prior intentions, but not for intentions in action, and claim that the low-level forms of action understanding that we have discussed are instances of attributing intentions in action. This is, we believe, a problematic suggestion,
as intentions in action cannot play a role in action understanding. What this
suggestion entails is that for each of the observed actions (or sub-actions) an intention in action is posited, and that action understanding is the process of inferring these intentions in action. However, the only access one has to these hidden states is through observation of the actions. As every intention in action corresponds to one action or sub-action, inferring the intention in action from an observed action provides no additional information at all (e.g. one does not learn anything new from inferring the intention to grasp from a grasping action). The fact that a second processing step of inferring unobservable and unverifiable mental entities from observed actions is needed, and that this extra step does not provide a deeper form of understanding, renders the positing of intentions in action explanatorily idle with respect to action understanding.
A second objection could be that we do attribute intentions for every action we observe and understand, in all their propositional glory, but that we do this automatically and unconsciously. We see no way to test this claim empirically; however, given that preverbal infants engage in action prediction but do not yet grasp the typical propositional properties of intentions (what Moses (2001) calls the ‘epistemic aspect of intention’: the understanding that an intention of an actor is tied to his or her beliefs and desires), and given the various possible mechanisms underlying action understanding we have discussed above, we deem this highly unlikely.
In all, when intentions are understood as propositional mental states, intention attribution is a cognitively demanding and relatively rare form of action understanding. When intentions are understood as non-propositional states that are uniquely related to an action, intention attribution does not provide any additional information over the forms of action understanding we have discussed above. Claiming that action understanding in preverbal infants is best characterized as ‘attributing intentions’ seems to assume that there is a third option, in between the two possibilities laid out above: one that is non-propositional but still provides extra information. We see no way to turn such a third option into an intelligible concept.
Understanding actions and attributing intentions
We have explained that intention attribution is a special and relatively rare form of understanding observed actions. Of course, no one could be prevented from using the notion of ‘intention attribution’ for every form of action understanding as well, but we believe this to be highly problematic. It entails that the notion of ‘intention attribution’ is stretched to such an extent that it encompasses various heterogeneous capacities, from low-level, unconscious forms of understanding to highly cognitive, conscious and deliberative acts of attribution. This stretch creates a serious risk of losing explanatory leverage. First, under such an account various automated prediction processes count as instances of genuine intention attribution. For example, a word processor’s auto-correct function can be said to attribute the intention to write “the” when confronted with the input “teh”. Similarly, a soccer ball kicked towards the goal can be attributed the intention of entering the goal (see Luo, 2011). In the end the notion of intention attribution becomes so broad and comprehensive that without further specification, it is not clear how to interpret a claim regarding intention attribution3. The capacity to associate a grasping movement and a target object is qualitatively different from the capacity to infer that an actor wants to clean up the table because he wants to make a good impression on his visitors. Consequently, data collected under one interpretation of intention attribution cannot straightforwardly be related to data collected under a different interpretation. So when action understanding and intention attribution are not carefully contrasted, and it is not specified exactly what interpretation of intention attribution is used, confusion and under- or overestimation of infants’ capacities is easily created.
Alternatively, when low-level action understanding and intention attribution are properly contrasted, there are three ways in which the two notions
can be related. First, one could claim that action understanding and intention attribution are entirely distinct and independent processes. While this
is a theoretical possibility, we are not aware of accounts that posit such independence.
Second, one could hold that intention attribution is required for successful action understanding. Although this has been suggested (Baldwin & Baird, 2001; Luo & Baillargeon, 2010), the cognitive complexity of intention attribution, combined with the capacity of predicting targets in very young infants, as well as findings of powerful statistical learning capacities in infants (Kirkham, Slemmer, & Johnson, 2002; Paulus, 2011; Ruffman et al., 2011; Saffran, Aslin, & Newport, 1996), suggest that this option is unlikely (see also de Bruin, Strijbos, & Slors, 2011). We have discussed several mechanisms that could account for the various forms of action understanding, and none of these mechanisms relied on attributing an intention to an actor. Moreover, Luo (2011) showed that infants as young as 3 months old also form predictions about non-human agents. Although one could maintain that this shows that infants already attribute intentions at the age of 3 months, even to non-human agents, a far more plausible and parsimonious explanation would be that infants do not need to attribute intentions in order to predict these movements.

3 Haselager, De Groot & Van Rappard (2003) describe a highly similar problem with the notion of representation.
Third and finally, low-level action-object associations and action expectations could be interpreted as contributing to intention attribution. In this case the observer uses the various forms of low-level expectations to come to a judgment about the intention behind an observed action. As an example, Gallese and colleagues (2009) recently suggested that before or below explicit propositional intention attribution lies a prereflexive form of action understanding that relies on the observer’s own motor system. Through motor resonance, the actor’s intentions “can be directly grasped without the need of representing them in propositional form” (p. 105), and a crucial role is played “by the motor system in providing the building blocks upon which more sophisticated social cognitive abilities can be built” (p. 108).

We agree that the attribution of an explicit intention can be informed by the other forms of action understanding, and we believe that Gallese and colleagues (2009) made an important step in stripping away the propositional format that is generally attributed to action understanding, but we have two important objections to their theory. First, we believe that their emphasis on motor cognition overstates the types of action understanding that can be based exclusively on this mechanism. For example, when observing an action that involves multiple target objects, or actions that can serve multiple super-ordinate actions, plain motor resonance often cannot make one goal more likely than another (see also Jacob, 2008; Jacob & Jeannerod, 2005; Uithol, van Rooij, Bekkering, & Haselager, 2011a).
Second, while in some complex cases it might be correct to frame action-predicting mechanisms as contributing to intention attribution, or intention understanding, this framing does not capture what we believe to be regular action understanding, because, as we have shown above, much of everyday action understanding does not involve attributing intentions to observed agents, but relies on low-level prediction mechanisms. Below we will sketch the contours of a framework that, we believe, best accommodates action understanding, intention attribution, and their relation.
Infants’ action understanding: towards a new theoretical framework
As outlined above, an action intention contains a belief about a particular
state of the world. Consequently, an attribution of an intention to another
person is closely tied to an understanding of the other’s belief. For a long
time it was undisputed that infants do not attribute beliefs to other agents until about four years of age (Perner, 1991; Wimmer & Perner, 1983). Recently, some have questioned this 'common sense' and claimed to have shown earlier false-belief attribution (Kovács, Téglás, & Endress, 2010; Onishi & Baillargeon, 2005). However, several theoretical models posit that these earlier forms are most likely subserved not by belief attribution, but by other mechanisms (e.g. Apperly & Butterfill, 2009; De Bruin & Newen, 2012; Perner & Ruffman, 2005; Rakoczy, 2011). Apperly and Butterfill (2009) argued that there is a difference between the explicit attribution of beliefs and other forms of understanding. The first is heavily dependent on language use and executive function, and seems to be absent until the age of four (Wellman, Cross, & Watson, 2001). The latter can only be tested implicitly (e.g. by means of a looking-time paradigm), but seems to be present at a younger age. The first empirical results that support these models have been reported (Thoermer, Sodian, Vuori, Perst, & Kristen, 2011).
Based on the analysis above, and the dependence of explicit intentions
on beliefs, we can now formulate a corresponding account for action understanding and intention attribution. In this new framework, infants can
acquire several forms of action understanding at a very young, preverbal
age, but explicit intention attribution does not occur until much later,
when cognitive capacities such as language use are adequately developed.
But whereas Apperly and Butterfill argue for a two-system model of belief attribution (although they do not exclude the possibility of more systems; see footnote 1, p. 953), we have argued that action understanding relies on
many interacting mechanisms.
There seems to be empirical evidence for a connection between explicit
intention attribution and language understanding (Ruffman, Slade, Rowlandson, Rumsey, & Garnham, 2003). Children’s social understanding is reported to be correlated with maternal talk about the child’s emotions and
desires (Ruffman, Slade, & Crowe, 2002; Taumoepeau & Ruffman, 2008),
suggesting that intention attribution might in the first instance be more like a language game (Wittgenstein, 1953) that children learn to play than a necessity for understanding everyday actions (cf. Astington, 2006;
Montgomery, 1997; Nelson, 2007; Olson, 1988).
The recognition of action understanding as an umbrella notion, and the insight that action understanding does not rely on, nor necessarily contributes to, intention attribution, have important consequences for future research. Instead of trying to establish intention attribution in ever younger infants (e.g. Luo, 2011), we believe that research into the ontogenesis of social cognition is better served by considering the various processes and mechanisms that underlie the different forms of action understanding, and studying how they interact to enable the observer to act successfully in the social world.
In older children, the interaction between low-level action prediction
and explicit intention attribution can be studied. Influence might very well work in both directions. As Gallese and colleagues (2009) argue, explicit mentalizing might use information acquired by means of low-level processes. But having an idea about the actor's intention might influence how we observe or predict an action as well. There is evidence that activity in the mirror neuron system is modulated by additional parameters, such as context (Iacoboni et al., 2005) and task specifics (Lingnau & Petris, submitted), but to our knowledge the influence of presumed intentions on action processing has not been tested directly.
The relation could be even more complicated when learning is taken into account. Actions that at first need to be understood by means of explicit intention attribution could, after familiarization, no longer require explicit inference, but rely on associations. For example, within the theory-of-mind research tradition it has been shown that for younger children false-belief understanding involves more frontal (i.e. executive control) processes, whereas adults do not rely to the same extent on executive functions in evaluating others' beliefs (Meinhardt, Sodian, Thoermer, Döhnel, & Sommer, 2011), suggesting a more automatized processing of others' false beliefs. This points to developmental changes in the neurocognitive mechanisms underlying the same kind of social understanding, which become more automatized once they are acquired.
By acknowledging that there are different forms of action understanding
we can shed new light on various seemingly contrasting hypotheses. For example, Luo (2011) presents two views on early goal attribution: humans-first theories, which claim that infants' psychological reasoning is at first restricted to humans, and all-agents theories, which claim that infants also attribute goals to non-human agents. However, this contrast only emerges when action understanding or goal attribution is framed as an 'all-or-nothing' capacity that either starts with humans or is applied to all types of agents. By subdividing action understanding and detaching it from intention attribution, as done above, the contrast seems to disappear. Several forms of associations and predictions are at work, some of which might be stronger for humans, perhaps due to higher exposure, while others are more universal mechanisms. Woodward (1998) found longer looking times in 6-month-olds for a novel target object than for a novel target location, suggesting that these infants more strongly associate a human grasping action with an object than with a location, and that this capacity to associate is not present for poking actions with a tool. This does not seem to be at odds with Luo and Baillargeon's (2005) findings, which show that infants are also able to associate a self-propelled box with a certain target object. Luo and Baillargeon also found that target objects allow for stronger associations than target locations, and that this association is also formed with non-human agents. However, there is no ground for assuming that a 'psychological reasoning system' (Luo, 2011, p. 454) is at work here. These findings can be explained in terms of low-level associations, without relying on any psychological or mental states. Only when operating on a conflated understanding of goal or intention attribution and action understanding is one enticed to interpret these data in terms of 'psychological reasoning', and the contrast between humans-first and all-agents emerges.
Conclusion
We have argued that it is not insightful to frame findings of action processing in infants solely in terms of 'action understanding'. What is commonly referred to as action understanding in fact consists of various heterogeneous processes of action prediction and anticipation. As a consequence, action understanding is not an 'all-or-nothing' capacity that infants of a certain age do or do not have, or that is applied either to humans only or to all agents. Instead, it appears to be a multi-faceted notion, based on various mechanisms that develop gradually. Importantly, these processes of action prediction and action anticipation do not involve the attribution of an intention. As intention attribution seems to be dependent on language capacities, it does not emerge until language is sufficiently developed. Intention attribution is cognitively effortful, but it can account for the understanding of far more complex or future-directed actions. Most of our everyday actions, however, are understood without intention attribution.
seven
action hierarchies
simulation
intention
discussion
The previous chapters discuss the notions of ‘representation’, ‘action’, and
‘intention’ relatively independently. In this discussion I will highlight the
connections between the notions, and show that when our interpretation of
one of these notions changes, the others will change too. At the same time,
I will illustrate how the analyses of the previous chapters can impact current
and future research into the cognitive and neural mechanisms of action
generation and action understanding. I will do so by first explaining the mutual dependency of the notions of 'action hierarchy' and 'intention' and discussing how the insights of Chapters 4 and 5 impact current interpretations of experimental data on imitation. Next, I will use the conclusions about action understanding (Chapters 3 and 6) and the framework of embodied cognition (introduction) to assess the claim that mirror neurons 'simulate'. After having discussed claims about current research, I will discuss what the conclusions mean for the notion of intention itself, and how future research into intentional action can build on the reinterpretation of the notion I have proposed.
Action hierarchies and imitation
I have defined intentions loosely as a desired world state, or goal, combined with an action plan for how to reach it. The action hierarchies I have discussed and criticized in Chapter 4 consist of exactly these elements: an action goal on top, and the actions needed to accomplish it below. In that chapter I have argued that a hierarchy with a goal on top that causes or initiates lower action features is conceptually problematic. Chapter 5 continues along these lines by arguing that intentions as discrete representations do not correspond to neural states. As in Chapter 4, the framework is replaced by a multitude of interacting control processes. If this framing of action control indeed corresponds better to the actual neural basis of intentional action than the traditional frameworks, there are far-reaching consequences for studying intentional action. To illustrate, I will discuss an imitation experiment by Bekkering and colleagues (Bekkering et al., 2000; Bekkering & Wohlschlager, 2002), and show how a more fine-grained notion of action understanding, and the abandonment of intentions as the primary cause of actions, change the interpretation of the data.
Bekkering and colleagues (2002) discuss several experiments that are generally interpreted as supporting the hypothesis that goals are represented hierarchically. In one (Bekkering et al., 2000), children had to imitate an adult model touching his or her own ear. The most common error was the so-called contra-ipsi error, in which the adult model used the contralateral hand in the demonstration, while children imitated this action using the ipsilateral hand, nevertheless touching the correct ear. Having the adult model touch only one ear (always right, or always left, using the ipsilateral or contralateral hand randomly) significantly reduced imitation errors. The received explanation is that eliminating the necessity of keeping track of the main goal (either the left or the right ear) enabled children to reproduce goals or actions lower in the hierarchy of the to-be-imitated action, such as using the correct hand or the correct movement path. They conclude that children tend to imitate the goal of a movement rather than the specific means, which is in line with the general idea of 'rational imitation' (Csibra & Gergely, 2007; Gergely, Bekkering, & Király, 2002).
These interesting findings are often interpreted as evidence for a hierarchy in action observation and imitation (Bekkering & Wohlschlager, 2002; Hamlin, Hallinan, & Woodward, 2008; Sebanz & Knoblich, 2009), as well as a hierarchy in action control (Grafton & Hamilton, 2007; Hamilton, 2009). By applying the insights from the previous chapters, I will show that the first interpretation is far from straightforward and the second highly problematic. To start with the second: what is studied in Bekkering et al.'s experiment is imitation, which relies on action observation, so for the findings to apply to the control of an action as well, one needs to assume that the hierarchy in the observed action matches the hierarchy in action control. But it is exactly this assumption that is shown to be problematic in Chapter 4.
Then the evidence for a hierarchy in imitation or observation: the multitude of goals that can be attributed to a single action (Chapters 3, 4, 5, and 6) complicates a straightforward translation to an action hierarchy. Which goal will be attributed depends on the level of description, the embedding in a super-ordinate action, the experimental task, etc. The objection that only one goal is 'the right one', namely the one that initiated the observed action in the actor, is dismissed in Chapter 5. So even in such a highly controlled experimental setup, many action interpretations and levels of description are possible. Was it, for instance, the intention of the actor to "make a movement with his or her right arm", "touch a random ear", "touch his or her left ear", "make a clear stimulus", "do as the experimenter instructed", "make a few euros", etc.?
Next, the fact that both the movement and the target location can be interpreted as a "goal" (Chapters 3, 4, and 6) means that we no longer have reasons to assume that the choice of target is higher in the hierarchy than the choice to use the left or the right arm. The suggestion that the data show that target locations are higher in the hierarchy than effectors begs the question, as these data were supposed to show that higher goals are better or more often imitated than lower goals. To see this, suppose a (highly flexible) actor were to touch his ear with either his hand or his foot. In that case it is to be expected that the effector would be much more salient, and the target location less so. If target locations and effectors are indeed part of an action hierarchy, the hierarchy could easily flip, depending on the experimental paradigm used.1 The data Bekkering and colleagues presented might therefore reveal a hierarchy of saliency in the stimulus, meaning that in the current setup the target location is a more salient action feature than the specific means to reach the target. Differences in saliency can provide insight into which aspects of an action children pay attention to, or in other words, which characteristics of an action are used in action prediction, action understanding, and imitation (Chapter 6). See also Elsner (2007) for a review of saliency in imitation tasks.
A more general underlying problem, not so much with the original study and presentation of the findings, but rather with subsequent interpretations of the data, is that the findings are often interpreted and cited at a more general and abstract level (Grafton & Hamilton, 2007; Hamilton, 2009). In exact wording, Bekkering and colleagues found that for an ear-touching action, children pay more attention to the target (ear) than to whether the left or right hand is used. This is an interesting and valuable finding in itself. However, Chapter 6 argues that action understanding consists of various types of associations and mechanisms. A consequence of this heterogeneous nature is that the current findings might not generalize to other types of actions (e.g. the hand-versus-foot action mentioned above), other types of goals (e.g. a superordinate action, Chapter 6), or other forms of action understanding.
In all, a straightforward interpretation of this imitation task in terms of an action hierarchy appears to be problematic. The acknowledgement that goals and intentions are not 'out there in the stimulus', ready to be picked up, but arise in the interpretation of an action, combined with a more fine-grained terminology with respect to action understanding, shows that other interpretations are not only possible, but sometimes even more likely (e.g. imitating the most salient aspect of an action). Follow-up studies are needed to gain a deeper understanding of the exact mechanisms that are exploited in acts of imitation and action understanding. The discrimination of various ways of understanding an action presented in Chapters 3 and 6, and the mechanisms I hypothesized to underlie these different forms of understanding in Chapter 6, could assist in designing these studies.

1 Multiple realizability (i.e. there are different means to one end) cannot fix the orientation of the hierarchy either, as we may indeed use different arms to reach a single target, but the reverse is also true: we can use one and the same arm to reach various targets.

Motor resonance, simulation, and action understanding

Mirror neuron activity, or motor resonance, during action observation is often interpreted as a form of simulation of the observed action (Calvo-Merino et al., 2005; Calvo-Merino, Grezes, Glaser, Passingham, & Haggard, 2006; Decety & Grezes, 2006; Gallese & Goldman, 1998; Gallese & Sinigaglia, 2011; Goldman, 2009; Grezes & Decety, 2001), thereby claiming support for the simulation theory of mindreading (Goldman, 2006; 1989; Gordon, 1986; see also Chapter 3). This support assumes that the observer of an action uses his or her own motor system to form a prediction of the mental states responsible for the observed action (Gallese, 2007; 2009; Gallese & Goldman, 1998; Glenberg, 2006; Iacoboni, 2008; Keysers & Gazzola, 2007; Rizzolatti & Craighero, 2004; Rizzolatti & Sinigaglia, 2010). By drawing on the previous chapters, Barsalou's theory of 'Perceptual Symbol Systems', and Hommel's Theory of Event Coding, I will argue that this support is not straightforward, and that its persuasiveness is largely based on an overly rich interpretation of the notion of 'simulation' and on a strict contrast between perceptual and motor representations (a contrast that is not in line with empirical evidence, among which the very finding of mirror neurons itself).

By arguing that conceptual knowledge is based on perceptual systems, Barsalou (1999; 2008; Barsalou, Simmons, Barbey, & Wilson, 2003) presented a perceptual theory of knowledge. In contrast to classical theories, which assume that knowledge resides in a modular and semantic system that is separate from episodic memory (Barsalou et al., 2003, p. 84), this theory posits that knowledge is represented in the same systems that produced the knowledge through perception. For example, perceiving a car activates my perceptual areas. After many perceptions of many cars, a concept of car is formed that is no longer tied to one specific perception,
but that is still represented in the perceptual areas (and therefore still perceptual in nature). A perceptual symbol is thus created.2
These perceptual symbols can be activated through perception (e.g.
perceiving a car), or through imagery (thinking about cars). Activations of
these perceptual symbols are called ‘simulations’. It is important to realize
that what is simulated here is a previous encounter with the perceived entity.
This type of simulation is therefore a form of recognition (when a stimulus is
present) or memory (in the absence of a stimulus, i.e. in the case of imagery).
Even though Barsalou speaks of 'sensory-motor areas', his theory is mainly built around sensory or perceptual representations. If we were to sketch an account of motor representations analogous to Barsalou's perceptual symbols, we would posit that actions are represented modally, and therefore in the motor and perhaps premotor areas. Consequently, these representations are active during action execution (the motor analogue to Barsalou's perception) but also during motor imagery. In the latter case, it can be said that the imagined action is 'simulated', but it is important to realize that this is in the sense of remembering the action.
For the sake of the argument I have posited a strict contrast between perceptual and motor processes. Such a strict contrast is increasingly questioned, however. Hommel's Theory of Event Coding (TEC: Hommel, 2003; 2004; Hommel et al., 2001), for example, argues that the representations underlying perception and action are coded together in a common representational medium. As a consequence, the activity of a neuron or a group of neurons in an area involved in this common coding can be related to both perceptual and motor events. For our 'motor symbols' this means that, in addition to action execution and action imagery, action observation also activates these multimodal action representations.
This is, of course, exactly in line with the activation characteristics of mirror neurons. Consequently, assuming this common coding scheme, mirror neuron firing corresponds to the activation of a representation that underlies both perceptual and motor features of an action, or, analogous to Barsalou and colleagues (2003), the re-enactment of a modal action representation. This means that mirror neurons can be said to simulate, but this is simulation in Barsalou's re-enactment sense.
2 Some (e.g. Aydede, 1999; Adams & Campbell, 1999; Zwaan, Stanfield, & Madden, 1999) have argued that Barsalou might be right in claiming that the representations are accommodated in perceptual areas, but that this does not necessarily make them modal. This, however, makes no difference for my argument here.
For motor resonance to support the simulation theory of mindreading, we have to assume a different or additional form of simulation: simulation of the observed actor's mental processes (Gallese & Goldman, 1998; see also Chapter 3). The contrast between the two types of simulation I posited here maps onto the distinction between intrapersonal and interpersonal resonance presented in Chapter 3. Simulation as re-enactment can be considered a form of intrapersonal resonance, while simulation in the mindreading sense is a form of interpersonal resonance. However, interpreting mirroring processes as a form of simulation in the mindreading sense relies on some remarkable additional assumptions.

First, only if we interpret mirror neuron firing as 'motor representations that also become active during action observation' are we inclined to interpret this activity in terms of simulation in the mindreading sense. This means that we have to assume a strict contrast between perceptual and motor representations, and between the perceptual and motor areas that accommodate these representations, and we have to assume that mirror neurons are motor representations. After all, if we considered them to be perceptual representations, there would be no reason to call them simulations in the mindreading sense.3

3 Following this reasoning, one could just as well flip the received interpretation, interpret mirror neuron firing as perceptual representations, and claim that the remarkable finding of mirror neuron activity during action execution demands extra explanation, attributing it to 'simulating the perceptual effect of the action'. Although this suggestion bears similarity to ideomotor theories (James, 1890), few researchers interpret mirror neurons this way.

The assumption of a strict contrast between perceptual and motor areas, and the classification of the inferior frontal gyrus (IFG)—the area in which mirror neurons were first found—as belonging to the second category, is understandable from a historical perspective. The IFG has been designated a premotor area due to its important role in the planning and execution of actions (Rizzolatti et al., 1988). It is remarkable, however, that the firing of mirror neurons—the incarnation of multimodal representations—is still implicitly assumed to be related to motor representations, and hence to simulation of an actor's motor decisions.

Following TEC's common coding, areas or neurons with a dual response profile are not only to be expected, they are a necessary consequence of the representational organization.4 If the dual response profile of mirror neurons is indeed a consequence of the representational organization of the brain, then interpreting the activation of mirror neurons upon action observation as a form of covert simulation in the mindreading sense is taking a teleological stance towards a phenomenon that might be the result of plain neural organization. The discussion about the function of mirror neurons should therefore be preceded by a discussion about the origin of these neurons.

4 If the dual response profile is a result of the embodied nature of cognition, one would expect it to be a rather widespread phenomenon, present in multiple cortical areas and all across the animal kingdom. Indeed, Mukamel and colleagues (2010) established neurons with mirror properties in various (human) cortical regions, and mirror neurons have been reported in humans (Chong, Cunnington, Williams, Kanwisher, & Mattingley, 2008; Kilner, Neal, Weiskopf, Friston, & Frith, 2009; Mukamel et al., 2010), monkeys (Gallese et al., 1996; Rizzolatti et al., 1996), and songbirds (Prather, Peters, Nowicki, & Mooney, 2008). It is too soon, however, to determine whether this is as widespread as one would expect based on common coding principles.

Yet, even if mirror neurons are a direct consequence of the modal, embodied representational structure of the brain, it could still be the case that the brain makes use of this dual response profile, that mirror neurons can be attributed a function after all, and that this function has to do with action understanding. Even so, the notion of 'mindreading' is still out of place. Chapter 5 argues that the idea of discrete intentions causing our actions is highly problematic. Consequently, the idea of mindreading as 'inferring the mental states that were the cause of the observed actions' is equally problematic. Chapter 6 argues that action understanding is a multi-faceted process, and that it is plausible that mirroring processes play a role in the mechanisms underlying some of the discussed forms of action understanding. These processes, however, are better described as forms of action prediction than of 'mindreading'.

The future of intentions

An overall theme that connects the previous chapters is the straightforward application of psychological and folk-psychological notions in neuroscientific research, and the problems this creates. Of course, I am not the first to object to such application; eliminative materialism (Churchland, 1981; Lycan & Pappas, 1972) has been around for a while, and argues that a mature neuroscience will in the end replace folk-psychological notions, and that, consequently, these folk notions will be eliminated. However, being primarily a philosophical enterprise, eliminativism is usually formulated on a rather abstract level, discussing 'mental states' and 'brain states'. By contrast, I have tried to descend from this abstract level by studying particular types of mental states and concepts. In Chapter 5 the possible neural implementations of intentions are analyzed in detail, and it is concluded that the folk interpretation of 'intention' does not apply to any particular physical process by which the brain initiates actions.
The conclusion of Chapter 5 can be considered to be eliminative to some
extent, as I have argued that the notion of intentions does not correspond
to a particular brain state. However, I do not expect the notion to disappear
with a developing neuroscience. I have suggested that the notion plays an
important role in social and communicative acts, which makes it unlikely
that it will ever be replaced. As Dennett (1978; 1987) explains, the concepts of 'beliefs' and 'desires' provide an easy and accurate explanation of ongoing behavior, even of systems that can be argued to have no such mental states, like thermostats and chess computers. Similarly, intentions will continue to provide an explanation or prediction of an ongoing action that is accurate enough in a daily context. What I have been arguing,
though, is that this notion does not describe a neural or psychological state,
and that it therefore does not provide a fruitful starting point for empirical
research.
I have speculated that for more complex and temporally extended behavior, the neural processes that guide our actions seem to interact with various inputs that function as stabilizing elements, and I have specifically emphasized linguistic structures (Clark, 2006). A linguistically structured representation can function as a reminder for future actions, similar to the way that reciting a memorized grocery list in the supermarket helps one to find the items one needs.
The type of processes that the brain can rely on need not be restricted to
linguistic structures, however. The brain can also rely on external stabilizing
elements, such as artifacts and other agents. For example, a stop sign is a
control factor in our ‘safe driving behavior’, and therefore accounts for part
of the function attributed to the intention to drive safely. After all, if intentions are not brain states but a social construction, as Chapter 5 argues, we have no principled reasons to include only neural processes among the processes that contribute to behavior.
This means that the function we commonly attribute to an intention is
in fact performed by a number of processes and elements, of which some
are internal to the brain (neural control processes), some have external
origins but are internalized (linguistic structures), and some are external
(stop sign). This is emphatically not a list of types of intentions—a stop sign cannot be considered an intention on its own—but a list of elements that collectively contribute to the functions generally attributed to intentions.5
If one wants to maintain the causal role of intentions, and yet acknowledge the various processes that shape and control our behavior, one is
forced to embrace a radical version of the ‘extended mind thesis’ (Adams
& Aizawa, 2001; Chemero, Silberstein, & Sloutsky, 2008; Clark, 2003; 2008;
Sterelny, 2010), which states that cognition is not bound by brain processes,
but involves external tokens and processes as well. Within this framework,
one can hold that the representation of an intention has a distributed vehicle, consisting of heterogeneous elements. However, this interpretation is rather remote from traditional interpretations of Causal Action Theory (Bratman, 1987; Davidson, 1963; Pacherie, 2000), and it is unclear what explanatory leverage is gained by maintaining intentions as the primary cause of our actions. See Figure 1 for a graphic overview of the differences between the folk psychological framework and the alternative sketched here.
Figure 1. Diagram a) shows the traditional ‘folk’ framework of action causation; b) shows the proposed alternative.
I have argued above and in Chapter 6 that this reinterpretation of the notion of intention has important consequences for studying action understanding. It involves shifting the focus from how an intention can be inferred from an observed movement, which is argued to be highly problematic (Jacob, 2008) and potentially computationally intractable (Chapter 2; van Rooij, 2008), to how the interaction between context, objects, and motor resonance contributes to action understanding. A first step in mapping these various processes has been taken in Chapter 6.
The proposed framework also has consequences for studying joint action. Generally, joint action is conceived as the result of a shared intention (Bratman, 1993; Butterfill & Sebanz, 2011; Tomasello et al., 2005), which is considered a special, and often problematic, case of the common individual intention (Pacherie, 2011), one in which the role of context is not easily accounted for. By contrast, in the proposed framework action control processes are already distributed over the brain, context, and other agents. This means that in a similar context the "intentions" of two or more actors already overlap to a certain extent. Also, as action initiation and control are deeply intertwined at the neural level, the process of movement synchronization (Chartrand & Bargh, 1999; Lakin, Jefferis, Cheng, & Chartrand, 2003) further shapes the joint action. Objects can play a role as well: the movement of the table that I am carrying with my colleague, for example, can become a stabilizing constituent, shaping my action control (Sebanz et al., 2006). The focus in studying joint action should therefore lie not only on how people form shared intentions, but also on how the interplay between the described elements and control processes results in stable goal representations, and thereby in successful joint action.

5 See Frank, van Rooij, and Haselager (2009) for an interesting example of how a model can acquire systematicity by relying on the systematicity in the external world. A similar process could account for the discreteness of intentions.

Additionally, reinterpreting the notion of intention might have implications well outside of the fields discussed here, including neurotechnology (e.g., Brain-Computer Interfaces that "read" the intentions of patients; see Haselager, submitted) and legal theory, in which the notion plays a pivotal role as well.

Final remarks

The main conclusions of this thesis can be summarized as follows: 1) actions are not generated by straightforward top-down causation of intentions, and 2) action understanding is a heterogeneous concept that encompasses various forms of recognition and prediction. The framework sketched in this thesis is, of course, by no means a full-blown theory or a fully-fledged alternative to existing theories of action generation. How, for instance, can a hierarchy based on temporal extension account for difficulties such as the exact timing of actions or the order in which sub-actions are to be performed? How do dynamic action control processes interact with the stabilizing factors I have mentioned? A theory that is to replace the models and theories I have criticized will have to answer such questions. I hope, however, to have made clear that the proposed framework offers a promising alternative, one in which the complexity and the dynamical nature of both the neural processes that control our actions and the mechanisms that underlie action understanding can be accounted for.
references
132
a
Aarts, H., & Dijksterhuis, A. (2000). The auto- Baker, C., Tenenbaum, J. B., Saxe, R., &
matic activation of goal-directed behaviour:
Trafton, J. (2007). Goal inference as inverse
The case of travel habit. Journal of Environplanning. Proceedings of the 29th Annual Cognimental Psychology, 20(1), 75–82.
tive Science Society. (pp. 779–784).
Abdelbar, A., & Hedetniemi, S. (1998). Approximating MAPs for belief networks is
NP-hard and other theorems. Artiicial Intelligence, 102(1), 21–38.
Adams, F., & Campbell, K. (1999). Modality
and abstract concepts. Behavioral and Brain
Sciences, 22(04), 610.
Bakker, B. (2005). The concept of circular
causality should be discarded. Behavioral and
Brain Sciences, 28, 195–196.
Baldwin, D. A., & Baird, J. A. (2001). Discerning intentions in dynamic human action.
Trends in Cognitive Sciences, 5(4), 171–178.
doi:10.1016/S1364-6613(00)01615-6
Adams, F., & Aizawa, K. (2001). The bounds
of cognition. Philosophical Psychology, 14(1),
43–64.
Baldwin, D. A., Andersson, A., Saffran, J. R.,
& Meyer, M. (2008). Segmenting dynamic
human action via statistical structure. Cognition, 106(3), 1382–1407. doi:10.1016/j.
Anscombe, G. E. M. (1957). Intention. Oxford: cognition.2007.07.005
Basil Blackwell.
Baldwin, D. A., Baird, J. A., Saylor, M. M., &
Apperly, I. A., & Butterill, S. A. (2009). Do
Clark, M. A. (2001). Infants parse dynamic
humans have two systems to track beliefs
action. Child development, 72(3), 708–717.
and belief-like states? Psychological Review,
116(4), 953–970. doi:10.1037/a0016923
Bargh, J. A. (2006). What have we been priming all these years? On the development,
Arbib, M. A., & Rizzolatti, G. (1997). Neural
mechanisms, and ecology of nonconscious
expectations: a possible evolutionary path
social behavior. European journal of social
from manual skills to language. Communicapsychology, 36(2), 147–168. doi:10.1002/
tion and Cognition, 29, 393–424.
ejsp.336
Astington, J. W. (2006). The developmental
Bargh, J. A., & Chartrand, T. L. (1999). The
interdependence of theory of mind and
unbearable automaticity of being. American
language. In S. C. Levinson & N. J. Enield
psychologist, 54(7), 462.
(Eds.), The roots of human sociality: Culture, cognition, and human interaction (pp.
Bargh, J. A., & Ferguson, M. J. (2000).
179–206). Oxford, UK: Berg.
Beyond behaviorism: On the automaticity
of higher mental processes. Psychological BulAydede, M. (1999). What makes perceptual
letin, 126(6), 925.
symbols perceptual? Behavioral and Brain
Sciences, 22(04), 610–611.
Bargh, J. A., Chen, M., & Burrows, L. (1996).
Automaticity of social behavior: Direct efBadre, D. (2008). Cognitive control, hierfects of trait construct and stereotype action
archy, and the rostro–caudal organization
on construct accessibility. Journal of personalof the frontal lobes. Trends in Cognitive
ity and social psychology, 50, 869–878.
Sciences, 12(5), 193–200. doi:10.1016/j.
tics.2008.02.004
Baron-Cohen, S. (1995). Mindblindness: An essay on autism and theory of mind. Cambridge,
Badre, D., & D’Esposito, M. (2007). FunctionMA: MIT Press.
al magnetic resonance imaging evidence for
a hierarchical organization of the prefronBarresi, J., & Moore, C. (1996). Intentional
tal cortex. Journal of Cognitive Neuroscience,
relations and social understanding. Behav19(12), 2082–2099.
ioral and Brain Sciences, 19, 107–154.
b
Baker, C., Goodman, N., Tenenbaum, J. B.,
Love, B., McRae, K., & Sloutsky, V. (2008).
Theory-based Social Goal Inference (pp.
1447–1452). Presented at the Proceedings
of the 30th Annual Conference of the Cognitive Science Society.
Barsalou, L. W. (1999). Perceptual Symbol
Systems. Behavioral and Brain Sciences, 22,
577–660.
Barsalou, L. W. (2008). Grounded cognition.
Annual Review of Psychology, 59, 617–645.
133
Bechtel, W. (1998). Representations and
Cognitive Explanations: Assessing the Dynamicist’s Challenge in Cognitive Science.
Cognitive Science, 22(3), 295–318.
Bechtel, W., & Richardson, R. C. (1993).
Discovering Complexity. Decomposition and
Localization as Strategies in Scientiic
Research (p. 286). Princeton: Princeton
University Press.
Infant Behavior and Development, 22(3),
367–382.
Botvinick, M. (2008). Hierarchical models of
behavior and prefrontal function. Trends in
Cognitive Sciences, 12(5), 201–208.
Brass, M., & Haggard, P. (2008). The What,
When, Whether Model of Intentional
Action. The Neuroscientist, 14(4), 319–325.
doi:10.1177/1073858408317417
Brass, M., & Heyes, C. M. (2005). Imitation:
Is cognitive neuroscience solving the correspondence problem? Trends in Cognitive
Sciences, 9(10), 489–495.
Beer, R. (2000). Dynamical approaches to
cognitive science. Trends in Cognitive Sciences, Bratman, M. E. (1981). Intention and meansend reasoning. The Philosophical Review,
4(3), 91–99.
90(2), 252–265.
Behne, T., Carpenter, M., Call, J., & TomaBratman, M. E. (1987). Intention, plans, and
sello, M. (2005). Unwilling Versus Unable:
practical reason. Cambridge, MA: Harvard
Infants’ Understanding of Intentional
University Press.
Action. Developmental Psychology, 41(2),
328–337. doi:10.1037/0012-1649.41.2.328
Bratman, M. E. (1993). Shared intention. EthBekkering, H., & Wohlschlager, A. (2002).
Action perception and imitation: A tutorial.
In W. Prinz & B. Hommel (Eds.), Attention
& Performance XIX: Common mechanisms in
perception and action (pp. 294–314). Oxford,
UK: Oxford University Press.
ics, 104(1), 97*113.
Brooks, Rodney. (1986). A Robust Layered
Control System for a Mobile Robot. IEEE
Journal of Robotics and Automation, 2(1),
14–23.
Brooks, Rodney. (1991a). Intelligence withBekkering, H., Wohlschlager, A., & Gattis, M.
out representation. Artiicial Intelligence, 47,
(2000). Imitation of Gestures in Children is
139–159.
Goal-directed. The Quarterly Journal of Experimental Psychology Section A: Human Experimen- Brooks, Rodney. (1991b). New approaches to
robotics. Science, 253(5025 ), 1227–1233.
tal Psychology, 53(1), 153–164.
Bennett, M., & Hacker, P. (2003). Philosophical Buccino, G., Binkofski, F., & Riggio, L.
(2004a). The mirror neuron system and acfoundations of neuroscience (pp. XVII, 461).
tion recognition. Brain and Language, 89(2),
Malden, MA: Blackwell Publishing.
370–376.
Bernieri, F. J., & Rosenthal, R. (1991). InterBuccino, G., Binkofski, F., Fink, G., Fadiga,
personal coordination: Behavior matchL., Fogassi, L., Gallese, V., Seitz, R., et al.
ing and interactional synchrony. In R. S.
(2001). Action observation activates premoFeldman & B. Rime (Eds.), Fundamentals of
tor and parietal areas in a somatotopic
nonverbal behavior (pp. 401–432). New York:
manner: an fMRI study. European Journal of
Cambridge University Press.
Neuroscience, 13(2), 400–404.
Bernieri, F. J., Reznick, J. S., & Rosenthal,
Buccino, G., Vogt, S., Ritzl, A., Fink, G., Zilles,
R. (1988). Synchrony, pseudosynchrony,
K., Freund, H.-J., & Rizzolatti, G. (2004b).
and dissynchrony: Measuring the entrainNeural circuits underlying imitation learnment process in mother-infant interactions.
ing of hand actions: An event-related fMRI
Journal of personality and social psychology,
study. Neuron, 42(2), 323–334.
54(2), 243.
Bigelow, A. E., & Birch, S. A. J. (1999). The ef- Buffart, H., Leeuwenberg, E., & Restle, F.
(1981). Coding theory of visual pattern
fects of contingency in previous interactions
completion. Journal of Experimental Psycholon infants’ preference for social partners.
references
Barsalou, L. W., Simmons, K., Barbey, A., &
Wilson, C. (2003). Grounding conceptual
knowledge in modality-speciic systems.
Trends in Cognitive Sciences, 7(2), 84–91.
134
Chartrand, T. L., & Bargh, J. A. (1999). The
Chameleon Effect: The Perception-Behavior
Link and Social Interaction. Journal of personBurgess, P. W. W., Veitch, E., de Lacy Costello,
ality and social psychology, 76(6), 893–910.
A., & Shallice, T. (2000). The cognitive and
neuroanatomical correlates of multitasking. Chemero, T. (2000). Anti-RepresentationalNeuropsychologia, 38(6), 848–863.
ism and the Dynamical Stance. Philosophy of
Science, 67(4), 625–647.
Butterill, S. A., & Sebanz, N. (2011). Editorial: Joint Action: What Is Shared? Review of Chemero, T., Silberstein, M., & Sloutsky, V.
Philosophy and Psychology, 2, 137–146.
(2008). Defending Extended Cognition.
In B. Love, K. McRae, & V. Sloutsky (Eds.),
Bylander, T., Allemang, D., Tanner, M., &
Proceedings of the 30th Annual Conference of
Josephson, J. (1991). The Computationalthe Cognitive Science Society (pp. 129–134).
Complexity of Abduction. Artiicial IntelProceedings of the 30th Annual Meeting of
ligence, 49(1-3), 25–60.
the Cognitive Science Society.
Byrne, R. W., & Russon, A. (1998). Learning
Chiel, H., & Beer, R. (1997). The brain has
by imitation: A hierarchical approach. Bea body: Adaptive behavior emerges from
havioral and Brain Sciences, 21(05), 667–684.
interactions of nervous system, body and
doi:doi:null
environment. Trends in Neurosciences, 20(12),
ogy - Human Perception and Performance, 7(2),
241.
c
553–556.
Calvo-Merino, B., Glaser, D., Grezes, J., Passingham, R. E., & Haggard, P. (2005). Action Chong, T., Cunnington, R., Williams, M.,
Observation and Acquired Motor Skills: An
Kanwisher, N., & Mattingley, J. (2008).
fMRI Study with Expert Dancers. Cerebral
fMRI Adaptation Reveals Mirror Neurons
Cortex, 15(8), 1243–1249. doi:10.1093/cerin Human Inferior Parietal Cortex, 18(20),
cor/bhi007
1576–1580.
Calvo-Merino, B., Grezes, J., Glaser, D., PassChristoff, K., & Gabrieli, J. D. E. (2000). The
ingham, R. E., & Haggard, P. (2006). Seeing
frontopolar cortex and human cognition:
or Doing? Inluence of Visual and Motor
Evidence for a rostrocaudal hierarchical
Familiarity in Action Observation. Current
organization within the human prefrontal
Biology, 16, 1905–1910.
cortex. Psychobiology, 28(2), 168–186.
Cannon, E. N., & Woodward, A. L. (2012).
Infants generate goal-based action predictions. Developmental science, 15(2), 292–298.
doi:10.1111/j.1467-7687.2011.01127.x
Christoff, K., Ream, J. M., Geddes, L. P. T.,
& Gabrieli, J. D. E. (2003). Evaluating SelfGenerated Information: Anterior Prefrontal Contributions to Human Cognition.
Behavioral Neuroscience, 117(6), 1161–1168.
doi:10.1037/0735-7044.117.6.1161
Carpendale, J. I. M., & Lewis, C. (2004).
Constructing an understanding of mind:
the development of children’s social under- Churchland, P. M. (1981). Eliminative Matestanding within social interaction. Behavioral
rialism and the Propositional Attitudes. The
and Brain Sciences, 27(1), 79–151.
Journal of Philosophy, 78(2), 67–90.
Cesario, J., Plaks, J. E., Hagiwara, N., NaClark, A. (1997). Being there: Putting body,
varrete, C. D., & Higgins, E. T. (2010). The
brain, and world together again. Cambridge,
ecology of automaticity. How situational
MA: MIT Press.
contingencies shape action semantics and
social behavior. Psychological Science, 21(9),
Clark, A. (2003). Natural-born cyborgs: minds,
1311–1317. doi:10.1177/0956797610378685
technologies, and the future of human intelligence
(p. 229 blz.).
Chalmers, D. (1995). On implementing a
computation. Minds and Machines, 4(4),
Clark, A. (2006). Language, embodiment,
391–402.
and the cognitive niche. Trends in Cognitive
Sciences, 10(8), 370–374.
Charniak, E., & Goldman, A. (1993). A Bayesian Model of Plan Recognition. Artiicial
Clark, A. (2008). Supersizing the Mind: EmbodiIntelligence, 64(1), 53–79.
ment, Action, and Cognitive Extension (pp.
135
Davies, M., & Stone, T. (1995). Folk Psychology:
The Theory of Mind Debate (p. 350). Cambridge, MA: Wiley-Blackwell.
Cohen, R. G., & Rosenbaum, D. A. (2004).
Where grasps are made reveals how grasps
De Bruin, L. C., & Newen, A. (2012). An asare planned: generation and recall of motor
sociation account of false belief understandplans. Experimental Brain Research, 157(4),
ing. Cognition.
486–495. doi:10.1007/s00221-004-1862-9
de Bruin, L., Strijbos, D., & Slors, M. (2011).
Cooper, R., & Shallice, T. (2006). HierarchiEarly Social Cognition: Alternatives to
cal Schemas and Goals in the Control of
Implicit Mindreading. Review of Philosophy
Sequential Behavior. Psychological Review,
and Psychology, 2(3), 499–517. doi:10.1007/
113(4), 887–916.
s13164-011-0072-1
Craver, C. F., & Bechtel, W. (2007). Top-down De Jaegher, H., Di Paolo, E., & Gallagher, S.
causation without top-down causes. Biology
(2010). Can social interaction constitute
and Philosophy, 22(4), 547–563.
social cognition? Trends in Cognitive Sciences,
14(10), 441–447.
Csibra, G. (2007). Action mirroring and action interpretation: An alternative account. De Ruiter, J., Noordzij, M. L., Newman-NorIn P. Haggard, Y. Rossetti, & M. Kawato
lund, S., Hagoort, P., & Toni, I. (2007). On
(Eds.), Sensorimotor Foundations of Higher
the origin of intentions. In P. Haggard & Y.
Cognition. Attention and Performance XXII (pp. Rossetti (Eds.), Attention & Performance 22.
427–451). Oxford: Oxford University Press.
Sensorimotor Foundations of Higher Cognition
Attention and Performance (pp. 593–609).
Csibra, G., & Gergely, G. (2007). “Obsessed
Oxford: Oxford University Press.
with goals”: functions and mechanisms of
teleological interpretation of actions in
de Vignemont, F., & Haggard, P. (2008).
humans. Acta Psychologica, 124(1), 60–78.
Action observation and execution: What is
doi:10.1016/j.actpsy.2006.09.007
shared? Social Neuroscience, 3(3), 421–433.
Cuijpers, R., Van Schie, H. T., Koppen, M.,
Erlhagen, W., & Bekkering, H. (2006).
Goals and means in action observation: A
computational approach. Neural Networks.
Cummins, R. (1989). Meaning and mental
representation (pp. VIII, 180).
d
Damasio, A. (1989). Time-locked multiregional retroactivation: A systems-level
proposal for the neural substrates of recall
and recognition. Cognition, 33, 25–62.
Damasio, A. R. (1985). Understanding the
mind’s will. Behavioral and Brain Sciences, 8,
589. doi:10.1017/S0140525X00045180
Daum, M. M., Vuori, M. T., Prinz, W., &
Aschersleben, G. (2009). Inferring the size
of a goal object from an actor’s grasping
movement in 6- and 9-month-old infants.
Developmental science, 12(6), 854–862.
doi:10.1111/j.1467-7687.2009.00831.x
Davidson, D. (1963). Actions, Reasons,
and Causes. Journal of Philosophy, 60(23),
685–700.
Davidson, D. (1980). Essays on actions and
events. New York: Oxford University Press.
de Vignemont, F., & Singer, T. (2006). The
empathic brain: how, when and why? Trends
in Cognitive Sciences, 10(10), 435–441.
Decety, J., & Grezes, J. (2006). The power
of simulation: Imagining one“s own and
other”s behavior. Brain Research, 1079, 4–14.
Dennett, D. C. (1978). Brainstorms, Philosophical Essays on the Mind and Psychology. Montgomery, VT: Bradford Books.
Dennett, D. C. (1987). The Intentional Stance
(p. 388). Cambridge, MA: MIT Press.
Dennett, D. C. (1989). Cognitive Ethology:
Hunting for bargains or a wild goose chase?
In A. Monteiore & C. Noble (Eds.), Goals,
no-goals, and own goals (pp. 101–116). London: Unwin Hyman.
Dennett, D. C. (1991). Consciousness Explained.
Boston: Little, Brown and Co.
Di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: a neurophysiological
study. Experimental Brain Research, 91(1),
176–180.
references
XXIX, 286). New York: Oxford University
Press.
136
Dietrich, E., & Markman, A. (2001). Dynamical description versus dynamical modeling.
Trends in Cognitive Sciences, 5(8), 332.
Dietrich, E., & Markman, A. (2003). Discrete
Thoughts: Why Cognition Must Use Discrete Representations. Mind & Language,
18(1), 95–119.
Dijksterhuis, A., & Nordgren, L. F. (2006). A
theory of unconscious thought. Perspectives
on Psychological Science, 1(2), 95–109.
Dinstein, I., Thomas, C., Behrmann, M., &
Heeger, D. (2008). A mirror up to nature.
Current Biology, 18(3), R13–R18.
Elsner, B. (2007). Infants’ imitation of goaldirected actions: The role of movements
and action effects. Acta Psychologica, 124(1),
44–59.
Elsner, B., & Aschersleben, G. (2003). Do I
get what you get? Learning about the effects
of self-performed and observed actions in
infancy. Consciousness and Cognition, 12(4),
732–751.
Erlhagen, W., Mukovskiy, A., & Bicho, E.
(2006). A dynamic model for action understanding and goal-directed imitation. Brain
Research, 1083, 174–188.
Dretske, F. (1988). Explaining behavior: reasons
in a world of causes (pp. XI, 165). Cambridge,
MA: MIT Press.
f
Duysens, J., & Van de Crommert, H. (1998).
Neural control of locomotion; Part 1. The
central pattern generator from cats to humans. Gait & Posture, 7(2), 131–141.
Falck-Ytter, T., Gredebäck, G., & Hofsten, von,
C. (2006). Infants predict other people’s
action goals. Nature Neuroscience, 9(7),
878–879. doi:10.1038/nn1729
Eckerman, C. O., & Peterman, K. (2001).
Peers and infant social/communicative
development. In G. Bremner & A. Fogel
(Eds.), Blackwell Handbook of Infant Development (pp. 326–350). Malden, MA: Blackwell
Publishers.
Felleman, D., & Van Essen, D. (1991). Distributed hierarchical processing in the primate
cerebral cortex. Cerebral Cortex, 1(1), 1–47.
e
Edelman, G., Tononi, G., & Haier, R. (2003).
A Universe of Consciousness: How Matter
Becomes Imagination. Contemporary psychology, 48(1), 92–93.
Egner, T. (2009). Prefrontal cortex and cognitive control: motivating functional hierarchies. Nature Neuroscience, 12(7), 821–822.
doi:10.1038/nn0709-821
Fadiga, L., Craighero, L., & Olivier, E. (2005).
Human motor cortex excitability during the
perception of others’ action. Current Opinion
in Neurobiology, 15(2), 213–218.
Ferrari, P. F., Gallese, V., Rizzolatti, G., & Fogassi, L. (2003). Mirror neurons responding
to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. The European journal of
neuroscience, 17(8), 1703–1714.
Ferrari, P., Rozzi, S., & Fogassi, L. (2005).
Mirror Neurons Responding to Observation of Actions Made with Tools in Monkey
Ventral Premotor Cortex. Journal of Cognitive
Neuroscience, 17(2), 212–226.
Eiter, T., & Gottlob, G. (1995). The Complex- Fodor, J. (1975). The language of thought (pp.
ity of Logic-Based Abduction. Journal of the
x, 214). New York: Crowell.
Association for Computing Machinery, 42(1),
3–42.
Fodor, J. A. (1983). The modularity of mind. an
essay on faculty psychology (p. 145). The
Eliasmith, C. (2005). A new perspective on
MIT Press.
representational problems. Journal of Cognitive Science, 6, 97–123.
Fogassi, L., & Luppino, G. (2005). Motor
functions of the parietal lobe. Current OpinEliasmith, C. (2010). How we ought to deion in Neurobiology.
scribe computation in the brain. Studies In
History and Philosophy of Science Part A, 41(3), Frank, S., Haselager, W. F. G., & van Rooij, I.
313–320.
(2009). Connectionist semantic systematicity. Cognition, 110(3), 358–379.
Elman, J. L. (2004). An alternative view of
the mental lexicon. Trends in Cognitive
Friend, M., & Pace, A. (2011). Beyond event
Sciences, 8(7), 301–306. doi:10.1016/j.
segmentation: Spatial-and social-cognitive
tics.2004.05.003
processes in verb-to-action mapping. Develop-
137
mental Psychology, 47(3), 867–876.
Gallese, V., & Sinigaglia, C. (2011). What
is so special about embodied simulation?
Trends in Cognitive Sciences, 15(11), 512–519.
doi:10.1016/j.tics.2011.09.003
Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal
Society B-Biological Sciences, 360(1456), 815.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the
Fritsch, G., & Hitzig, E. (1870). Ueber die elepremotor cortex. Brain, 119(2), 593–610.
ktrische Erregbarkeit des Grosshirns. Archiv
fuer Anatomie, Physiologie und wissenschaftliche Gallese, V., Keysers, C., & Rizzolatti, G.
Medicin, 300–332.
(2004). A unifying view of the basis of social
cognition. Trends in Cognitive Sciences, 8(9),
Frixione, M. (2001). Tractable competence.
396–403.
Minds and Machines, 11(379-397).
Gallese, V., Rochat, M., Cossu, G., & SinigaFuster, J. M. (2001). The Prefrontal Cortex—
glia, C. (2009). Motor Cognition and Its
An Update: Time Is of the Essence. Neuron,
Role in the Phylogeny and Ontogeny of
30, 319–333.
Action Understanding. Developmental Psychology, 45(1), 103–113.
Fuster, J. M. (2004). Upper processing stages
of the perception-action cycle. Trends in
Garey, M., & Johnson, D. (1979). Computers
Cognitive Sciences, 8(4), 143–145.
and intractability: a guide to the theory of NPcompleteness (p. 338 blz.).
Fuster, J. M., Bauer, R. H., & Jervey, J. P.
(1982). Cellular discharge in the dorsoGentilucci, M., Bertolani, L., Benuzzi, F.,
lateral prefrontal cortex of the monkey in
Negrotti, A., Pavesi, G., & Gangitano, M.
cognitive tasks. Experimental Neurology, 77(3),
(2000). Impaired control of an action after
679–694. doi:10.1016/0014-4886(82)90238supplementary motor area lesion: a case
2
study. Neuropsychologia, 38(10), 1398–1404.
Fuster, J. M., Bodner, M., & Kroger, J. K.
Gergely, G., Bekkering, H., & Király, I. (2002).
K. (2000). Cross-modal and cross-temRational imitation in preverbal infants.
poral association in neurons of frontal
Nature, 415(6873 ), 755.
cortex. Nature, 405(6784), 347–351.
doi:10.1038/35012613
Gibson, J. (1979). The ecological approach to
visual perception (pp. XVI, 332).
Gallese, V. (2001). The “Shared Manifold”
Hypothesis: From Mirror Neurons To EmGlenberg, A. (1997). What memory is for.
pathy. Journal of Consciousness Studies, 8(5-7),
Behavioral and Brain Sciences, 20(1), 1–18.
33–50.
Glenberg, A. (2006). Naturalizing cognition:
Gallese, V. (2007). Before and below “theory
The integration of cognitive science and
of mind”: embodied simulation and the
biology. Current Biology, 16(18), R802–R804.
neural correlates of social cognition.
Goldberg, G. (1985). Supplementary Motor
Philosophical Transactions of the Royal Society
Area Structure and Function - Review and
B-Biological Sciences, 362(1480), 659–669.
Hypotheses. Behavioral and Brain Sciences,
Gallese, V. (2009). Mirror Neurons, Embod8(4), 567–588.
ied Simulation, and the Neural Basis of
Social Identiication. Psychoanalytic Dialogues, Goldman, A. (2006). Simulating minds: the
philosophy, psychology, and neuroscience of
19(5), 519–536.
mindreading. Philosophy of mind series (pp. IX,
Gallese, V., & Goldman, A. (1998). Mirror
364).
neurons and the simulation theory of mindGoldman, A. (2009). Mirroring, Simulating
reading. Trends in Cognitive Sciences, 2(12),
and Mindreading. Mind & Language, 24(2),
493–501.
235–252.
g
references
Fries, P. (2005). A mechanism for cognitive dynamics: neuronal communication
through neuronal coherence. Trends in
Cognitive Sciences, 9(10), 474–480.
Gallese, V., & Lakoff, G. (2005). The Brain’s
concepts. Cognitive Neuropsychology, 22(3),
455–479.
138
Goldman, A. I. (1989). Interpretation
Psychologized. Mind & Language, 4(3),
161–185.
Gordon, R. M. (1986). Folk Psychology
as Simulation. Mind & Language, 1(2),
158–171.
Grafton, S. T., & Hamilton, A. (2007). Evidence for a distributed hierarchy of action
representation in the brain. Human Movement Science, 26(4), 590–616.
Graziano, M. S. A. (2010). Ethologically relevant movements mapped onto the motor
cortex. In A. Ghazanfar & M. Platt (Eds.),
Primate Neuroethology (pp. 454–470.). New
York: Oxford University Press.
Haken, H., Kelso, J. A. S., & Bunz, H. (1985).
A theoretical model of phase transitions
in human hand movements. Biological
Cybernetics, 51(5), 347–356. doi:10.1007/
BF00336922
Hamilton, A. (2009). Research review: Goals,
intentions and mental states: challenges for
theories of autism. Journal of Child Psychology
and Psychiatry, 50(8), 881–892.
Hamilton, A., & Grafton, S. T. (2006). Goal
representation in human anterior intraparietal sulcus. Journal of Neuroscience, 26(4),
1133–1137.
Hamilton, A., & Grafton, S. T. (2007). The
motor hierarchy: from kinematics to goals
and intentions. In P. Haggard, Y. Rossetti, &
Graziano, M. S. A., & Alalo, T. (2007). MapM. Kawato (Eds.), Attention & Performance
ping Behavioral Repertoire onto the Cortex. 22. Sensorimotor Foundations of Higher CogniNeuron, 56(2), 239–251.
tion Attention and Performance (pp. 381–408).
Oxford: Oxford University Press.
Graziano, M. S. A., Taylor, C., & Moore, T.
(2002). Complex Movements Evoked by Mi- Hamilton, A., & Grafton, S. T. (2008). Action
crostimulation of Precentral Cortex. Neuron,
outcomes are represented in human infe34(5), 841–851.
rior frontoparietal cortex. Cerebral Cortex,
Gredebäck, G., & Melinder, A. (2010).
Infants’ understanding of everyday social
interactions: A dual process account. Cognition, 114, 197–206. doi:10.1016/j.cognition.2009.09.004
18(5), 1160–1168.
Hamlin, J. K., Hallinan, E. V., & Woodward, A.
L. (2008). Do as I do: 7‐month‐old infants
selectively reproduce others’ goals. Developmental science, 11(4), 487–494.
Gredebäck, G., Stasiewicz, D. D., Falck-Ytter,
Haruno, M., Wolpert, D. M., & Kawato,
T., Rosander, K., & Hofsten, von, C. (2009).
M. (2001). MOSAIC Model for SensoAction type and goal type modulate goalrimotor Learning and Control. Neural
directed gaze shifts in 14-month-old infants.
Computation, 13(10), 2201–2220. doi:d
Developmental Psychology, 45(4), 1190–1194.
oi:10.1162/089976601750541778
doi:10.1037/a0015667
Harvey, I. (1996). Untimed and misrepreGrezes, J., & Decety, J. (2001). Functional
sented: connectionism and the computer
anatomy of execution, mental simulation,
metaphor. AISB Quarterly, 96, 20–27.
observation, and verb generation of actions:
A meta-analysis. Human Brain Mapping,
Haselager, W. F. G. (1997). Cognitive science
12(1), 1–19.
and folk psychology: the right frame of mind (pp.
viii, 165). London, etc.: Sage.
Grezes, J., Armony, J., Rowe, J., & Passingham,
R. E. (2003). Activations related to “mirror” Haselager (submitted) Did I do that? Brainand “canonical” neurones in the human
Computer Interfacing and the sense of
brain: an fMRI study. NeuroImage, 18(4),
agency.
928–937.
Haselager, W. F. G., De Groot, A., & Van RapGrush, R. (1997). The architecture of reppard, H. (2003). Representationalism vs.
resentation. Philosophical Psychology, 10(1),
anti-representationalism: a debate for the
5–24.
sake of appearance. Philosophical Psychology,
16(1), 5–23.
Haggard, P. (2005). Conscious intention and
motor cognition. Trends in Cognitive Sciences, Haselager, W. F. G., van Dijk, J., & van Rooij, I.
9(6), 290–295.
(2008). A Lazy Brain? Embodied Embedded
h
Haugeland, J., & Rumelhart, D. (1991).
Iacoboni, M., Woods, R., Brass, M., BekkerRepresentational Genera. In W. Ramsey, S.
ing, H., Mazziotta, J. C., & Rizzolatti, G.
Stich, & D. Rumelhart (Eds.), Philosophy and
(1999). Cortical mechanisms of human
Connectionist Theory. Hillsdale, N.J.: Erlbaum. imitation. Science, 286(5449), 2526–2528.
Haynes, J.-D., & Rees, G. (2006). Decoding
mental states from brain activity in humans.
Nature Reviews: Neuroscience, 7, 523–534.
Haynes, J.-D., Sakai, K., Rees, G., Gilbert, S.,
Frith, C. D., & Passingham, R. E. (2007).
Reading Hidden Intentions in the Human
Brain. Current Biology, 17(4), 323–328.
doi:10.1016/j.cub.2006.11.072
Heyes, C. M. (in press). What can imitation
do for cooperation? In B. Calcott, R. Joyce,
& K. Sterelny (Eds.), Signalling, Commitment
and Emotion. Cambridge, MA: MIT Press.
j
Jacob, P. (2008). What Do Mirror Neurons
Contribute to Human Social Cognition?
Mind & Language, 23(2), 190–223.
Jacob, P. (2009). A Philosopher’s Relections
on the Discovery of Mirror Neurons. Topics
in Cognitive Science, 1(3), 570–595.
Jacob, P., & Jeannerod, M. (2005). The motor
theory of social cognition: a critique. Trends
in Cognitive Sciences, 9(1), 21–25.
James, W. (1890). The principles of psychology
(p. 2 v.). New York: H. Holt and company.
Jeannerod, M. (1994). The representing
Hickok, G. (2009). Eight Problems for the
brain: Neural correlates of motor intention
Mirror Neuron Theory of Action Understanding in Monkeys and Humans. Journal of and imagery. Behavioral and Brain Sciences,
17(2), 187–201.
Cognitive Neuroscience, 21(7), 1229–1243.
Hommel, B. (2003). Planning and Represent- Juarrero, A. (2002). Dynamics in Action. Intentional Behavior As a Complex System (p.
ing Intentional Action. The Scientiic World,
300). Cambridge, MA: The MIT Press.
3, 593–608.
Hommel, B. (2004). Event iles: feature binding in and across perception and action.
Trends in Cognitive Sciences, 8(11), 494–500.
k
Kakei, S., Hoffman, D. S., & Strick, P. L.
(1999). Muscle and movement representations in the primary motor cortex. Science,
285(5436), 2136–2139.
Hommel, B., Müsseler, J., Aschersleben, G.,
Keijzer, F. (2001). Representation and behavior
& Prinz, W. (2001). The Theory of Event
(pp. viii, 276).
Coding (TEC): A framework for perception
and action planning. Behavioral and Brain
Kelso, J. A. S. (1995). Dynamic patterns: the selfSciences, 24 (5), 849–877.
organization of brain and behavior (pp. XVII,
334). Cambridge, MA: MIT Press.
Hubel, D., & Wiesel, T. (1959). Receptive
ields of single neurones in the cat’s striate
cortex. Journal of Physiology, 148, 574–591.
Kenward, B., Folke, S., Holmberg, J., Johansson, A., & Gredebäck, G. (2009). Goal directedness and decision making in infants.
Hume, D. (1739). A Treatise on Human Nature.
Developmental Psychology, 45(3), 809–819.
(P. H. Nidditch, Ed.) (2nd ed.). Oxford:
doi:10.1037/a0014076
Clarendon Press.
Keysers, C., & Gazzola, V. (2006). Towards a
Humphreys, G., & Forde, E. (1998). Disordered action schema and action disorganisa- unifying neural theory of social cognition.
Progress in brain research, 156, 379–402.
tion syndrome. Cognitive Neuropsychology, 15,
771–812.
139
references
…Cognition and Neuroscience. In P. Calvo & T. Gomila (Eds.), Handbook of Cognitive Science: An Embodied Approach (pp. 273–290). Oxford: Elsevier.
Hauf, P. (2006). Infants' perception and production of intentional actions. Progress in Brain Research, 164, 285–301. doi:10.1016/S0079-6123(07)64016-3
i
Iacoboni, M. (2008). Mirroring People: The New Science of How We Connect with Others. New York: Farrar, Straus and Giroux.
Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C., & Rizzolatti, G. (2005). Grasping the intentions of others with one's own mirror neuron system. PLoS Biology, 3(3), e79.
Keysers, C., & Gazzola, V. (2007). Integrating simulation and theory of mind. Trends in Cognitive Sciences, 11(5), 194–196.
Kiebel, S. J., Daunizeau, J., & Friston, K. J. (2008). A hierarchy of time-scales and the brain. PLoS Computational Biology, 4(11), e1000209. doi:10.1371/journal.pcbi.1000209
Kilner, J., Friston, K., & Frith, C. D. (2007a). Predictive coding: an account of the mirror neuron system. Cognitive Processing, 8, 159–166.
Kilner, J., Friston, K., & Frith, C. D. (2007b). The mirror-neuron system: a Bayesian perspective. NeuroReport, 18(6), 619–623.
Kilner, J., Neal, A., Weiskopf, N., Friston, K., & Frith, C. D. (2009). Evidence of Mirror Neurons in Human Inferior Frontal Gyrus. Journal of Neuroscience, 29(32), 10153–10159. doi:10.1523/jneurosci.2668-09.2009
Kim, J. (1993). Supervenience and Mind: Selected Philosophical Essays (p. 400). Cambridge: Cambridge University Press.
Kim, J. (2000). Mind in a Physical World. Cambridge, MA: MIT Press.
Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. (2002). Visual statistical learning in infancy: evidence for a domain general learning mechanism. Cognition, 83(2), B35–B42. doi:10.1016/S0010-0277(02)00004-5
Klossek, U. M. H., & Dickinson, A. (2012). Rational action selection in 1½- to 3-year-olds following an extended training experience. Journal of Experimental Child Psychology, 111(2), 197–211. doi:10.1016/j.jecp.2011.08.008
Klossek, U. M. H., Russell, J., & Dickinson, A. (2008). The control of instrumental action following outcome devaluation in young children aged between 1 and 4 years. Journal of Experimental Psychology: General, 137(1), 39–51. doi:10.1037/0096-3445.137.1.39
Knoblich, G., & Sebanz, N. (2008). Evolving intentions for social interaction: from entrainment to joint action. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1499), 2021–2031.
Kochukhova, O., & Gredebäck, G. (2010). Preverbal infants anticipate that food will be brought to the mouth: an eye tracking study of manual feeding and flying spoons. Child Development, 81(6), 1729–1738. doi:10.1111/j.1467-8624.2010.01506.x
Koechlin, E., Basso, G., Pietrini, P., Panzer, S., & Grafman, J. (1999). The role of the anterior prefrontal cortex in human cognition. Nature, 399(6732), 148–151. doi:10.1038/20178
Koechlin, E., Ody, C., & Kouneiher, F. (2003). The Architecture of Cognitive Control in the Human Prefrontal Cortex. Science, 302(5648), 1181–1185. doi:10.1126/science.1088545
Koechlin, E., & Summerfield, C. (2007). An information theoretical approach to prefrontal executive function. Trends in Cognitive Sciences, 11(6), 229–235.
Kouneiher, F., Charron, S., & Koechlin, E. (2009). Motivation and cognitive control in the human prefrontal cortex. Nature Neuroscience, 12(7), 939–945. doi:10.1038/nn.2321
Kovács, A. M., Téglás, E., & Endress, A. D. (2010). The social sense: susceptibility to others' beliefs in human infants and adults. Science, 330(6012), 1830–1834. doi:10.1126/science.1190792
Kozak, M. N., Marsh, A. A., & Wegner, D. M. (2006). What do I think you're doing? Action identification and mind attribution. Journal of Personality and Social Psychology, 90(4), 543–555. doi:10.1037/0022-3514.90.4.543
l
Lakin, J. L., Jefferis, V. E., Cheng, C. M., & Chartrand, T. L. (2003). The Chameleon Effect as Social Glue: Evidence for the Evolutionary Significance of Nonconscious Mimicry. Journal of Nonverbal Behavior, 27(3), 145–162. doi:10.1023/A:1025389814290
Langer, E. (1978). Rethinking the role of thought in social interaction. In I. H. Harvey, W. I. Ickes, & R. F. Kidd (Eds.), New directions in attribution research (pp. 36–58). Hillsdale, NJ: Lawrence Erlbaum Associates.
Lau, H. C., Rogers, R. D. D., Haggard, P., & Passingham, R. E. (2004). Attention to Intention. Science, 303(5661), 1208–1210. doi:10.1126/science.1090973
Lewis, D. (2000). Causation as influence. The Journal of Philosophy, 97(4), 182–197.
Libet, B. (1985). Unconscious cerebral initiative and the role of conscious will in voluntary action. Behavioral and Brain Sciences, 8, 529–566.
Liepelt, R., von Cramon, D., & Brass, M. (2008). What is matched in direct matching? Intention attribution modulates motor priming. Journal of Experimental Psychology: Human Perception and Performance, 34(3), 578–591. doi:10.1037/0096-1523.34.3.578
Lingnau, A., & Petris, S. (2012). Action understanding inside and outside the motor system: the role of task difficulty. Cerebral Cortex.
Lingnau, A., Gesierich, B., & Caramazza, A. (2009). Asymmetric fMRI adaptation reveals no evidence for mirror neurons in humans. Proceedings of the National Academy of Sciences of the United States of America, 106(24), 9925–9930.
Loucks, J., & Baldwin, D. A. (2006). When is a grasp a grasp? Characterizing some basic components of human action processing. In K. Hirsh-Pasek & R. Golinkoff (Eds.), Action meets words: How children learn verbs (pp. 228–261). New York: Oxford University Press.
Luo, Y. (2011). Three-month-old infants attribute goals to a non-human agent. Developmental Science, 14(2), 453–460.
Luo, Y., & Baillargeon, R. (2005). Can a self-propelled box have a goal? Psychological reasoning in 5-month-old infants. Psychological Science, 16(8), 601–608. doi:10.1111/j.1467-9280.2005.01582.x
Luo, Y., & Baillargeon, R. (2010). Toward a mentalistic account of early psychological reasoning. Current Directions in Psychological Science, 19(5), 301–307.
Lycan, W. G., & Pappas, G. S. (1972). What is eliminative materialism? Australasian Journal of Philosophy, 50(2), 149–159. doi:10.1080/00048407212341181
m
Mahon, B., & Caramazza, A. (2008). A critical look at the embodied cognition hypothesis and a new proposal for grounding conceptual content. Journal of Physiology-Paris, 102(1-3), 59–70.
Maley, C. J. (2010). Analog and digital, continuous and discrete. Philosophical Studies, 155(1), 117–131. doi:10.1007/s11098-010-9562-8
Malle, B. F., & Knobe, J. (1997). The folk concept of intentionality. Journal of Experimental Social Psychology, 33, 101–121.
Marsh, K. L., Richardson, M. J., Baron, R. M., & Schmidt, R. C. (2006). Contrasting Approaches to Perceiving and Acting With Others. Ecological Psychology, 18(1), 1–38. doi:10.1207/s15326969eco1801_1
Mason, C., Gomez, J., & Ebner, T. (2001). Hand synergies during reach-to-grasp. Journal of Neurophysiology, 86(6), 2896.
McFarland, D. (1989). Goals, no-goals and own goals. In A. Montefiore & D. Noble (Eds.), Goals, no-goals and own goals: A debate on goal-directed and intentional behaviour (pp. 39–57). London: Unwin Hyman.
Mead, G. H. (1934). Mind, Self, and Society. (C. W. Morris, Ed.) (p. 401). Chicago: University of Chicago Press.
Meinhardt, J., Sodian, B., Thoermer, C., Döhnel, K., & Sommer, M. (2011). True- and false-belief reasoning in children and adults: An event-related potential study of theory of mind. Developmental Cognitive Neuroscience, 1(1), 67–76. doi:10.1016/j.dcn.2010.08.001
Meltzoff, A. N. (1995). Understanding the intentions of others: Re-enactment of intended acts by 18-month-old children. Developmental Psychology, 31(5), 838–850.
Meltzoff, A. N., & Brooks, R. (2001). "Like Me" as a Building Block for Understanding Other Minds: Bodily Acts, Attention, and Intention. In B. F. Malle, L. J. Moses, & D. A. Baldwin (Eds.), Intentions and intentionality: Foundations of social cognition (pp. 171–191). Cambridge, MA: MIT Press.
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Millikan, R. (1984). Language, thought, and other biological categories: New foundations for realism (p. 355). Cambridge, MA: MIT Press.
Millikan, R. G. (1995). Pushmi-Pullyu representations. Philosophical Perspectives, 9, 185–200.
Miyashita, Y. (2005). Understanding Intentions: Through the Looking Glass. Science, 308(5722), 644–645.
Montgomery, D. (1997). Wittgenstein's Private Language Argument and Children's Understanding of the Mind. Developmental Review, 17(3), 291–320. doi:10.1006/drev.1997.0436
Moore, C. (2006). The development of commonsense psychology (p. 241). Mahwah, NJ: Lawrence Erlbaum.
Moore, J. W., Wegner, D. M., & Haggard, P. (2009). Modulating the sense of agency with external cues. Consciousness and Cognition, 18(4), 1056–1064. doi:10.1016/j.concog.2009.05.004
Moore, M. S. (2011). Philosophical foundations of criminal law. In R. A. Duff (Ed.), Philosophical foundations of criminal law (pp. 179–205). New York: Oxford University Press.
Moses, L. J. (2001). Some Thoughts on Ascribing Complex Intentional Concepts to Young Children. In B. F. Malle, L. J. Moses, & D. A. Baldwin (Eds.), Intentions and Intentionality (pp. 69–83). Cambridge, MA: MIT Press.
Mukamel, R., Ekstrom, A., Kaplan, J., Iacoboni, M., & Fried, I. (2010). Single-Neuron Responses in Humans during Execution and Observation of Actions. Current Biology, 20(8), 750–756.
Murata, A., Fadiga, L., Fogassi, L., Gallese, V., Raos, V., & Rizzolatti, G. (1997). Object Representation in the Ventral Premotor Cortex (Area F5) of the Monkey. Journal of Neurophysiology, 78(4), 2226–2230.
n
Nelson, K. (2007). Young minds in social worlds: Experience, meaning, and memory (p. 315). Cambridge, MA: Harvard University Press.
Newell, A., & Simon, H. A. (1972). Human Problem Solving (p. 784). Prentice Hall.
Newman-Norlund, R. D., Van Schie, H. T., Van Zuijlen, A., & Bekkering, H. (2007). The mirror neuron system is more active during complementary compared with imitative action. Nature Neuroscience, 10(7), 817–818.
Nielsen, K. S. (2010). Representation and dynamics. Philosophical Psychology, 23(6), 759–773.
Nilsson, N. (1984). Shakey the Robot (Technical Note 323).
Nisbett, R. E., & DeCamp Wilson, T. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84(3), 231–259.
Nolfi, S., Ikegami, T., & Tani, J. (2008). Editorial: Behavior and Mind as a Complex Adaptive System. Adaptive Behavior, 16(2-3), 101–103. doi:10.1177/1059712308090150
Nordh, G., & Zanuttini, B. (2005). Propositional abduction is almost always hard. Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-2005), Edinburgh, Scotland, UK, 534–539.
Nordh, G., & Zanuttini, B. (2008). What makes propositional abduction tractable. Artificial Intelligence, 172(10), 1245–1284.
Nyström, P., Ljunghammar, T., Rosander, K., & von Hofsten, C. (2011). Using mu rhythm desynchronization to measure mirror neuron activity in infants. Developmental Science, 14(2), 327–335.
o
Olson, D. (1988). On the origin of beliefs and other intentional states in children. In J. W. Astington, P. Harris, & D. R. Olson (Eds.), Developing theories of mind (pp. 414–426). Cambridge, UK: Cambridge University Press.
Ongür, D., & Price, J. L. (2000). The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cerebral Cortex, 10(3), 206–219.
Onishi, K. H., & Baillargeon, R. (2005). Do 15-Month-Old Infants Understand False Beliefs? Science, 308(5719), 255–258. doi:10.1126/science.1107621
den Ouden, H., Frith, U., Frith, C. D., & Blakemore, S.-J. (2005). Thinking about intentions. NeuroImage, 28(4), 787–796.
Oztop, E., Kawato, M., & Arbib, M. A. (2006). Mirror neurons and imitation: A computationally guided review. Neural Networks, 19(3), 254–271.
Oztop, E., Wolpert, D. M., & Kawato, M. (2005). Mental state inference using visual control parameters. Cognitive Brain Research, 22(2), 129–151.
p
Pacherie, E. (2000). The Content of Intentions. Mind & Language, 15(4), 400–432.
Pacherie, E. (2006). Towards a dynamic theory of intentions. In S. Pocket, W. P. Banks, & S. Gallagher (Eds.), Does Consciousness Cause Behavior? An Investigation of the Nature of Volition (pp. 145–167). Cambridge, MA: MIT Press.
Pacherie, E. (2008). The phenomenology of action: A conceptual framework. Cognition, 107(1), 179–217.
Pacherie, E., & Haggard, P. (2011). What are Intentions? In W. Sinnott-Armstrong & L. Nadel (Eds.), Conscious Will and Responsibility: A Tribute to Benjamin Libet (pp. 70–84). New York: Oxford University Press.
Passingham, R. E., Toni, I., & Rushworth, M. F. S. (2000). Specialisation within the prefrontal cortex: the ventral prefrontal cortex and associative learning. Experimental Brain Research, 133(1), 103–113.
Paulus, M. (2011). How infants relate looker and object: evidence for a perceptual learning account of gaze following in infancy. Developmental Science, 14(6), 1301–1310.
Paulus, M. (in press). Action mirroring and action understanding: an ideomotor and attentional account. Psychological Research. doi:10.1007/s00426-011-0385-9
Paulus, M., Hunnius, S., & Bekkering, H. (2011a). Can 14- to 20-month-old children learn that a tool serves multiple purposes? A developmental study on children's action goal prediction. Vision Research, 51(8), 955–960. doi:10.1016/j.visres.2010.12.012
Paulus, M., Hunnius, S., van Wijngaarden, C., Vrins, S., van Rooij, I., & Bekkering, H. (2011b). The Role of Frequency Information and Teleological Reasoning in Infants' and Adults' Action Prediction. Developmental Psychology, 47(4), 976–983.
Penfield, W., & Rasmussen, T. (1950). The cerebral cortex of man: A clinical study of localization of function. New York: Macmillan.
Perner, J. (1991). Understanding the representational mind (p. 348). Cambridge, MA: The MIT Press.
Perner, J. (2010). Who took the cog out of Cognitive Science? Mentalism in an era of anti-cognitivism. In P. A. Frensch & R. Schwarzer (Eds.), Cognition and Neuropsychology: International Perspectives on Psychological Science (Vol. 1, pp. 241–261). Hove: Psychology Press.
Perner, J., & Doherty, M. (2005). Do infants understand that external goals are internally represented? Behavioral and Brain Sciences, 28(5), 710–711.
Perner, J., & Ruffman, T. (2005). Infants' insight into the mind: how deep? Science, 308(5719), 214–216. doi:10.1126/science.1111656
Petrides, M. (2005). The Rostral-Caudal Axis of Cognitive Control within the Lateral Frontal Cortex. In S. Dehaene, J.-R. Duhamel, M. D. Hauser, & G. Rizzolatti (Eds.), From Monkey Brain to Human Brain: A Fyssen Foundation Symposium (pp. 293–314). Cambridge, MA: MIT Press.
Pezzulo, G., & Dindo, H. (2011). What should I do next? Using shared representations to solve interaction problems. Experimental Brain Research, 211(3-4), 613–630. doi:10.1007/s00221-011-2712-1
Pezzulo, G., Butz, M. V., & Castelfranchi, C. (2008). The Anticipatory Approach: Definitions and Taxonomies. In G. Pezzulo, M. V. Butz, C. Castelfranchi, & R. Falcone (Eds.), The Challenge of Anticipation: A Unifying Framework for the Analysis and Design of Artificial Cognitive Systems (pp. 23–43). Berlin, Heidelberg: Springer-Verlag.
Pfeifer, R., & Scheier, C. (1999). Understanding intelligence (pp. xx, 697).
Phillips, A. T., Wellman, H. M., & Spelke, E. S. (2002). Infants' ability to connect gaze and emotional expression to intentional action. Cognition, 85(1), 53–78.
Piccinini, G. (2008). Computation without representation. Philosophical Studies, 137(2), 205–241.
Povinelli, D. J. (2001). On the Possibility of Detecting Intentions Prior to Understanding Them. In B. F. Malle, L. J. Moses, & D. A. Baldwin (Eds.), Intentions and Intentionality (pp. 225–248). Cambridge: MIT Press.
Prather, J. F., Peters, S., Nowicki, S., & Mooney, R. (2008). Precise auditory–vocal mirroring in neurons for learned vocal communication. Nature, 451(7176), 305–310. doi:10.1038/nature06492
Prinz, W. (1997). Perception and Action Planning. European Journal of Cognitive Psychology, 9(2), 129–154.
Pylyshyn, Z. W. (1987). The Robot's Dilemma: The Frame Problem in Artificial Intelligence. Norwood, NJ: Ablex Publishing.
q
Quian Quiroga, R., Reddy, L., Kreiman, G., Koch, C., & Fried, I. (2005). Invariant visual representation by single neurons in the human brain. Nature, 435, 1102–1107.
r
Rakoczy, H. (in press). Do infants have a theory of mind? British Journal of Developmental Psychology.
Ramnani, N., & Owen, A. (2004). Anterior prefrontal cortex: insights into function from anatomy and neuroimaging. Nature Reviews Neuroscience, 5(3), 184–194.
Ranganath, C., Johnson, M. K., & D'Esposito, M. (2000). Left anterior prefrontal activation increases with demands to recall specific perceptual information. Journal of Neuroscience, 20(22), RC108.
Reddy, V. (2010). Engaging Minds in the first year: The developing awareness of attention and intention. In G. Bremner & T. Wachs (Eds.), Handbook of Infant Development (2nd ed.). Chichester: Wiley-Blackwell.
Reid, V. M., Csibra, G., Belsky, J., & Johnson, M. H. (2007). Neural correlates of the perception of goal-directed action in infants. Acta Psychologica, 124(1), 129–138. doi:10.1016/j.actpsy.2006.09.010
Reid, V. M., Hoehl, S., Grigutsch, M., Groendahl, A., Parise, E., & Striano, T. (2009). The neural correlates of infant and adult goal prediction: evidence for semantic processing systems. Developmental Psychology, 45(3), 620–629. doi:10.1037/a0015209
Ridderinkhof, K. R., van den Wildenberg, W. P. M., Segalowitz, S. J., & Carter, C. S. (2004). Neurocognitive mechanisms of cognitive control: The role of prefrontal cortex in action selection, response inhibition, performance monitoring, and reward-based learning. Brain and Cognition, 56(2), 129–140. doi:10.1016/j.bandc.2004.09.016
Rizzolatti, G. (2005). The Mirror Neuron System and Imitation. In S. Hurley & N. Chater (Eds.), Perspectives on Imitation (pp. 55–76). Cambridge, MA: MIT Press.
Rizzolatti, G., & Craighero, L. (2004). The Mirror-Neuron System. Annual Review of Neuroscience, 27, 169–192.
Rizzolatti, G., & Sinigaglia, C. (2010). The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretations. Nature Reviews Neuroscience, 11(4), 264–274.
Rizzolatti, G., Gentilucci, M., Fogassi, L., Luppino, G., Matelli, M., & Ponzoni-Maggi, S. (1987). Neurons related to goal-directed motor acts in inferior area 6 of the macaque monkey. Experimental Brain Research, 67(1), 220–224.
Rizzolatti, G., Camarda, R., Fogassi, L., Gentilucci, M., Luppino, G., & Matelli, M. (1988). Functional organization of inferior area 6 in the macaque monkey. II. Area F5 and the control of distal movements. Experimental Brain Research, 71(3), 475–490.
Rizzolatti, G., Fadiga, L., Gallese, V., & Fogassi, L. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3(2), 131–142.
Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience, 2, 661–670.
Rosenbaum, D. A. (2009). Human motor control (2nd ed., p. 505). San Diego: Academic Press.
Rosenbaum, D. A., & Jorgensen, M. J. (1992). Planning macroscopic aspects of manual control. Human Movement Science, 11(1-2), 61–69. doi:10.1016/0167-9457(92)90050-L
Ruffman, T., Slade, L., & Crowe, E. (2002). The relation between children's and mothers' mental state language and theory-of-mind understanding. Child Development, 73(3), 734–751.
Ruffman, T., Slade, L., Rowlandson, K., Rumsey, C., & Garnham, A. (2003). How language relates to belief, desire, and emotion understanding. Cognitive Development, 18(2), 139–158.
Rushworth, M. F. S. (2008). Intention, Choice, and the Medial Frontal Cortex. Annals of the New York Academy of Sciences, 1124(1), 181–207. doi:10.1196/annals.1440.014
Ruffman, T., Taumoepeau, M., & Perkins, C. (2011). Statistical learning as a basis for social understanding in children. British Journal of Developmental Psychology.
s
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical Learning by 8-Month-Old Infants. Science, 274(5294), 1926–1928. doi:10.1126/science.274.5294.1926
Saltzman, E. (1979). Levels of sensorimotor representation. Journal of Mathematical Psychology, 20(2), 91–163.
Saxe, R. (2005a). Against simulation: the argument from error. Trends in Cognitive Sciences, 9(4), 174–179.
Saxe, R. (2005b). Tuning forks in the mind: Reply to Goldman and Sebanz. Trends in Cognitive Sciences, 9(7), 321.
Saxe, R. (2009). The neural evidence for simulation is weaker than I think you think it is. Philosophical Studies, 144(3), 447–456.
Saylor, M. M., Baldwin, D. A., Baird, J. A., & LaBounty, J. (2007). Infants' On-line Segmentation of Dynamic Human Action. Journal of Cognition and Development, 8(1), 113–128. doi:10.1080/15248370709336996
Searle, J. (1983). Intentionality: An Essay in the Philosophy of Mind. Cambridge: Cambridge University Press.
Sebanz, N., & Knoblich, G. (2009). Prediction in joint action: What, when, and where. Topics in Cognitive Science, 1(2), 353–367.
Sebanz, N., Bekkering, H., & Knoblich, G. (2006). Joint action: bodies and minds moving together. Trends in Cognitive Sciences, 10(2), 70–76.
Selen, L. P. J., Franklin, D. W., & Wolpert, D. M. (2009). Impedance control reduces instability that arises from motor noise. Journal of Neuroscience, 29(40), 12606–12616. doi:10.1523/JNEUROSCI.2826-09.2009
Sellars, W. (1963). Science, Perception and Reality. New York: Humanities Press.
Shea, N. (2007). Content and Its Vehicles in Connectionist Systems. Mind & Language, 22(3), 246–269.
Skinner, B. (1953). Science and human behavior. New York: Macmillan.
Smith, L. B., Thelen, E., Titzer, R., & McLin, D. (1999). Knowing in the context of acting: The task dynamics of the A-not-B error. Psychological Review, 106(2), 235.
Sodian, B. (2011). Theory of mind in infancy. Child Development Perspectives, 5(1), 39–43.
Sommerville, J. A., Woodward, A. L., & Needham, A. (2005). Action experience alters 3-month-old infants' perception of others' actions. Cognition, 96(1), B1–B11.
Southgate, V., Johnson, M. H., Karoui, I. E., & Csibra, G. (2010). Motor system activation reveals infants' on-line prediction of others' goals. Psychological Science, 21(3), 355.
Sterelny, K. (2010). Minds: extended or scaffolded? Phenomenology and the Cognitive Sciences, 9(4), 465–481. doi:10.1007/s11097-010-9174-y
Stich, S. P. (1983). From folk psychology to cognitive science: The case against belief (p. 266). Cambridge, MA: The MIT Press.
Stich, S. P., & Ravenscroft, I. (1994). What is folk psychology? Cognition, 50(1-3), 447–468. doi:10.1016/0010-0277(94)90040-X
t
Tai, Y., Scherfler, C., Brooks, D., Sawamoto, N., & Castiello, U. (2004). The Human Premotor Cortex Is "Mirror" Only for Biological Actions. Current Biology, 14(2), 117–120.
Taumoepeau, M., & Ruffman, T. (2008). Stepping Stones to Others' Minds: Maternal Talk Relates to Child Mental State Language and Emotion Understanding at 15, 24, and 33 Months. Child Development, 79(2), 284–302. doi:10.1111/j.1467-8624.2007.01126.x
Thagard, P., & Verbeurgt, A. (1998). Coherence as constraint satisfaction. Cognitive Science, 22(1), 1–24.
Thelen, E., & Smith, L. (1994). A dynamic systems approach to the development of cognition and action (pp. XXIII, 376).
Thelen, E., Schöner, G., Scheier, C., & Smith, L. (2001). The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences, 24(1), 1–86.
Thoermer, C., Sodian, B., Vuori, M., Perst, H., & Kristen, S. (2011). Continuity from an implicit to an explicit understanding of false belief from infancy to preschool age. British Journal of Developmental Psychology.
Tomasello, M. (1999). The cultural origins of human cognition. Cambridge, MA: Harvard University Press.
Tomasello, M., Carpenter, M., Call, J., Behne, T., & Moll, H. (2005). Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences, 28, 675–735.
Toni, I., Lange, F. P., Noordzij, M. L., & Hagoort, P. (2008). Language beyond action. Journal of Physiology-Paris, 102(1-3), 71–79.
Tsotsos, J. (1990). Analyzing vision at the complexity level. Behavioral and Brain Sciences, 13, 423–469.
u
Uithol, S., Burnston, D., & Haselager, W. F. G. (submitted). Will intentions be found in the brain? Cognition.
Uithol, S., Haselager, W. F. G., & Bekkering, H. (2008). When Do We Stop Calling Them Mirror Neurons? In B. Love, K. McRae, & V. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 1783–1788).
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011a). What do mirror neurons mirror? Philosophical Psychology, 24(5), 607–623.
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011b). Understanding motor resonance. Social Neuroscience, 6(4), 388–397.
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2012). Hierarchies in Action and Motor Control. Journal of Cognitive Neuroscience, 24(5), 1077–1086.
Umiltà, M. A., Escola, L., Intskirveli, I., Grammont, F., Rochat, M., Caruana, F., Jezzini, A., et al. (2008). When pliers become fingers in the monkey motor system. Proceedings of the National Academy of Sciences, 105(6), 2209–2213. doi:10.1073/pnas.0705985105
Umiltà, M. A., Kohler, E., Gallese, V., Fogassi, L., Fadiga, L., Keysers, C., & Rizzolatti, G. (2001). I Know What You Are Doing: A Neurophysiological Study. Neuron, 31(1), 155–165.
v
Vallacher, R. R., & Wegner, D. M. (1987). What do people think they're doing? Action identification and human behavior. Psychological Review.
van Dijk, J., Kerkhofs, R., van Rooij, I., & Haselager, W. F. G. (2008). Can There Be Such a Thing as Embodied Embedded Cognitive Neuroscience? Theory & Psychology, 18(3), 297.
van Dijk, M., Hunnius, S., & van Geert, P. (2009). Variability in eating behavior throughout the weaning period. Appetite, 52(3), 766–770. doi:10.1016/j.appet.2009.02.001
Van Elk, M. (2010). Action semantics: Functional and neural dynamics (p. 235). Radboud University Nijmegen.
Van Elk, M., Van Schie, H. T., & Bekkering, H. (2008). Conceptual knowledge for understanding others' actions is organized primarily around action goals. Experimental Brain Research, 189(1), 99–108.
van Gelder, T. (1992). Defining 'distributed representation'. Connection Science, 4(3/4), 175–191.
van Gelder, T. (1995). What Might Cognition Be, If Not Computation? Journal of Philosophy, 91(7), 345–381.
van Gelder, T. (1998). The Dynamical Hypothesis in Cognitive Science. Behavioral and Brain Sciences, 21, 615–665.
van Gelder, T. (1999). Distributed versus local representation. In The MIT Encyclopedia of the Cognitive Sciences (pp. 236–238). Cambridge, MA: MIT Press.
van Rooij, I. (2008). The tractable cognition thesis. Cognitive Science, 32(6), 939–984.
van Rooij, I., Haselager, W. F. G., & Bekkering, H. (2008). Goals are not implied by actions, but inferred from actions and contexts. Behavioral and Brain Sciences, 31, 38–39.
van Rooij, I., Kwisthout, J., Blokpoel, M., Szymanik, J., Wareham, T., & Toni, I. (2011). Intentional communication: Computationally easy or difficult? Frontiers in Human Neuroscience, 5. doi:10.3389/fnhum.2011.00052
Van Schie, H. T., van Waterschoot, B. M., & Bekkering, H. (2008). Understanding action beyond imitation: reversed compatibility effects of action observation in imitation and joint action. Journal of Experimental Psychology: Human Perception and Performance, 34(6), 1493–1500. doi:10.1037/a0011750
w
Ward, L. (2003). Synchronous neural oscillations and cognitive processes. Trends in Cognitive Sciences, 7(12), 553–559.
Wegner, D. M. (2003). The Illusion of Conscious Will (p. 419). Cambridge, MA: The MIT Press.
Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-Analysis of Theory-of-Mind Development: The Truth about False Belief. Child Development, 72(3), 655–684.
Whittington, B., Silder, A., Heiderscheit, B., & Thelen, D. (2008). The contribution of passive-elastic mechanisms to lower extremity joint kinetics during human walking. Gait & Posture, 27(4), 628–634.
Wicker, B., Keysers, C., Plailly, J., Royet, J.-P., Gallese, V., & Rizzolatti, G. (2003). Both of Us Disgusted in My Insula: The Common Neural Basis of Seeing and Feeling Disgust. Neuron, 40(3), 655–664.
Wilson, M., & Knoblich, G. (2005). The case for motor involvement in perceiving conspecifics. Psychological Bulletin, 131(3), 460–473.
Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13(1), 103–128. doi:10.1016/0010-0277(83)90004-5
Wittgenstein, L. (1953). Philosophical Investigations. Oxford: Blackwell.
Wohlschlager, A., & Bekkering, H. (2002). Is human imitation based on a mirror-neuron system? Some behavioural evidence. Experimental Brain Research, 143(3), 335–341.
Woodward, A. L. (1998). Infants selectively encode the goal object of an actor's reach. Cognition, 69(1), 1–34.
Woodward, A. L. (2009). Infants' Grasp of Others' Intentions. Current Directions in Psychological Science, 18(1), 53–57. doi:10.1111/j.1467-8721.2009.01605.x
Woodward, A. L., & Sommerville, J. A. (2000). Twelve-month-old infants interpret action in context. Psychological Science, 11(1), 73–77.
Woodward, A. L., Sommerville, J. A., & Guajardo, J. J. (2001). How Infants Make Sense of Intentional Action. In B. F. Malle, L. J. Moses, & D. A. Baldwin (Eds.), Intentions and Intentionality. Cambridge, MA: MIT Press.
y
Yamashita, Y., & Tani, J. (2008). Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment. PLoS Computational Biology, 4(11), 1–18.
Young, B., & Drewett, R. (2000). Eating behaviour and its variability in 1-year-old children. Appetite, 35(2), 171–177. doi:10.1006/appe.2000.0346
z
Ziemke, T. (2003). What's that thing called embodiment? In R. Alterman & D. Kirsh (Eds.), Proceedings of the 25th Annual Conference of the Cognitive Science Society (pp. 1134–1139). Mahwah, NJ: Lawrence Erlbaum.
Zwaan, R. A., Stanfield, R. A., & Madden, C. J. (1999). Perceptual symbols in language comprehension: Can an empirical case be made? Behavioral and Brain Sciences, 22(4), 636–637.
Summary
The notion of ‘intention’ plays an important role in our everyday thinking about action. Yet little is known about how intentions are represented in
the brain, and how they initiate actions. This thesis investigates how actions
arise, and what role intentions play. After introducing the notions of ‘representation’, ‘action’ and ‘intention’ in Chapter 1, in Chapter 2 I investigate
the process of content attribution to the firing of single mirror neurons.
Single cell recordings in monkeys provide strong evidence for an important role of the motor system in action understanding, but although the
data acquired from single cell recordings are generally considered to be
robust, several debates have shown that the interpretation of these data is
far from straightforward. Chapter 2 argues that, without principled restrictions, research based on single-cell recordings allows for unlimited content
attribution to single mirror neurons. A theoretical analysis of the type of
processing attributed to the mirror neuron system can help formulate restrictions on what mirroring is and what cognitive functions could, in principle, be explained by a mirror mechanism. It is argued that the processing
at higher levels of abstraction needs assistance of non-mirroring processes
to such an extent that subsuming the processes needed to infer goals from
actions under the label ‘mirroring’ is not warranted.
In humans, single-cell recordings are problematic. Therefore, activation of motor areas upon action observation is studied using fMRI or EEG. The finding of this so-called ‘motor resonance’ is generally regarded as support for motor theories of action understanding. These theories take
motor resonance to be essential in the understanding of observed actions
and the inference of action goals. However, Chapter 3 shows that the notions
of ‘resonance’, ‘action understanding’ and ‘action goal’ appear to be used
ambiguously in the literature. A survey of the literature on mirror neurons
and motor resonance yields two different interpretations of the term ‘resonance’, three different interpretations of ‘action understanding’, and again
three different interpretations of what the ‘goal’ of an action is. This entails
that, unless it is specified which interpretation is used, the meaning of any
statement about the relation between these concepts can differ to a great
extent. By discussing Umiltà et al.’s (2001) well-known experiment on mirror neurons, I show that more precise definitions and use of the concepts
will allow for better assessments of motor theories of action understanding and hence a more fruitful scientific debate. Lastly, I provide an example of
how the discussed experimental setup could be adapted, based on the preceding analysis, to test other interpretations of the concepts.
Actions are commonly thought of as structured hierarchically. Chapter 4
analyzes such hierarchies. In the literature two hierarchies are often posited:
The first—the action hierarchy—is a decomposition of an action into sub-actions and sub-sub-actions. The second—the control hierarchy—is a postulated
hierarchy in the neural control processes that are supposed to bring about
the action. A general assumption in cognitive neuroscience is that these two
hierarchies are internally consistent and provide complementary descriptions of neuronal control processes. In this chapter I show that neither
hierarchy offers a complete explanation and that they cannot be reconciled
in a logical or conceptual way. Furthermore, neither pays proper attention
to the dynamics and temporal aspects of neural control processes. I explore
an alternative hierarchical organization in which causality is inherent in the
dynamics over time. Specifically, high levels of the hierarchy encode slower (goal-related) representations, while lower levels represent faster kinematics (actions and motor acts). If employed properly, a hierarchy based on this
principle is not subject to the problems that plague the traditional accounts.
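This timescale principle can be given a minimal computational illustration. The sketch below is my own, not code from the thesis; the leaky-integrator form, the chaining of levels, and all constants are assumptions made only to show the idea:

```python
# Illustrative sketch of a timescale-based control hierarchy: every level
# performs the same leaky integration, and levels differ only in their
# time constant. Higher levels are slow and stable (goal-like); lower
# levels are fast and volatile (kinematics-like).

def step(state, drive, tau):
    """One leaky-integrator update; a larger tau means slower change."""
    return state + (drive - state) / tau

taus = [50.0, 10.0, 2.0]      # slow (goal) -> fast (movement) levels
states = [0.0, 0.0, 0.0]

for t in range(100):
    task_input = 1.0           # constant external drive to the top level
    for i, tau in enumerate(taus):
        # each level is driven by the level above it, the top by the task
        drive = task_input if i == 0 else states[i - 1]
        states[i] = step(states[i], drive, tau)
```

Because each level merely lags the one above it on its own timescale, there is no discrete goal state that is handed down and then executed; the causal work is done by the dynamics themselves, which is the point of the alternative organization.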
Chapter 5 analyzes the neural applicability of the notion of ‘intention’.
Intentions are commonly conceived of as discrete mental states that are
the direct cause of actions. In the last several decades, neuroscientists have
taken up the project of localizing intentions in the brain, and a number of
areas have been posited as implementing representations of intentions. I
argue, however, that it is doubtful that the folk notion of ‘intention’ applies
to any particular physical process by which the brain initiates actions. Drawing on the analysis of Chapter 4, Pacherie’s account of intentions (Pacherie,
2006, 2008), and Koechlin’s model of action control (Koechlin et al., 1999,
2003) I show that the idea of a discrete state that causes an action is deeply
incompatible with the dynamic organization of the prefrontal cortex, the
presumed neural locus of the causation and control of actions. Discrete
representations can at best, I will claim, play a subsidiary, stabilizing role in
action planning, but this role is still incompatible with the folk notion of
intention. This chapter concludes by arguing that the prevalence of the folk
notion, including its intuitive appeal in neuroscientific explanations, stems
from the central role intentions play in constructing intuitive explanations
of our own and others’ behavior. Some future directions based on the presented analysis are sketched.
In Chapter 6, the ideas, results, and analyses of the previous chapters are applied to the field of developmental psychology. Intention reading
and action understanding have been reported in ever-younger infants, but
these findings are highly debated. In this chapter I set out to clarify the
notions of ‘action understanding’ and ‘intention attribution’ and discuss
their relation. I use the various forms of ‘action understanding’ from Chapter 3 and speculate on the mechanisms that could underlie these capacities.
Based on Chapter 5 I argue that these forms of action understanding do not
generally result in the attribution of an intention to an observed actor. By
disentangling intention attribution from action understanding, and by exposing the latter as an umbrella notion, I provide a framework that allows
for better comparison of findings from different experimental paradigms.
Finally, in Chapter 7 I discuss the implications of previous chapters for
the notions of ‘action understanding’ and ‘intention’, and for our conception of action hierarchies. The most important conclusions are: 1) that the
evidence for the presence of an action hierarchy is partly circular, 2) that
motor resonance in itself does not provide support for mindreading, and 3)
that a reinterpretation of the concept of ‘intention’ can aid our understanding of how we understand each other’s actions, and how joint action
is possible.
Samenvatting (Dutch summary)
The notion of ‘intention’ plays an important role in our everyday thinking about actions. Yet little is known about how intentions are represented in the brain and how they bring about our actions. In this thesis I investigate how actions come about, and what role intentions play in this. After a short introduction of the notions ‘representation’, ‘action’ and ‘intention’ in Chapter 1, I describe in Chapter 2 the way in which representational content is attributed to individual mirror neurons. Recordings from individual neurons in the monkey brain seem to indicate that the motor system plays an important role in understanding actions. How these findings should be interpreted, however, is highly controversial. In this chapter I show that the representational content that can be attributed to individual mirror neurons is unbounded, for anyone willing to choose an ever higher level of abstraction. This content attribution can be constrained by means of a theoretical analysis of the type of processes that are supposed to take place in the mirror neuron system. I conclude that cognitive processes of a higher abstraction, such as inferring intentions, require so much assistance from other processes that the label ‘mirroring’ is misplaced.
The brain activity that occurs in humans during action observation (the so-called ‘motor resonance’) is not measured in individual neurons, but with techniques such as fMRI and EEG. This activity is usually taken as evidence for motor theories of action understanding. According to these theories, motor resonance plays an essential role in understanding the actions of others. In Chapter 3, however, I show that the notions ‘motor resonance’, ‘action understanding’ and ‘action goal’ are interpreted in widely different ways within the cognitive sciences. Because of this multitude of interpretations, a claim about the relation between these notions can have strongly diverging meanings, unless it is made explicit which interpretation is being used. A more precise definition and a more careful use of the concepts can greatly improve the testability of the motor theories, and thereby the scientific debate, as I show by analyzing an experiment by Umiltà et al. (2001). Finally, I show how the paradigm used by Umiltà can be adapted, on the basis of my analysis, to test other interpretations of the concepts.
Actions are usually ascribed a hierarchical structure. In Chapter 4 I analyze this structure. In the literature two hierarchies are frequently postulated: the first, the action hierarchy, is a decomposition of an action into sub-actions and sub-sub-actions; the second, the control hierarchy, is a hierarchy that is supposed to be present in the neural processes that coordinate actions. Within cognitive neuroscience it is generally assumed that these two hierarchies are coherent and offer complementary descriptions of the neural processes. In this chapter I argue that neither hierarchy can offer a complete explanation, and that they cannot be integrated in a logical or conceptual way. Moreover, neither pays sufficient attention to the dynamic character of the neural processes involved. I discuss an alternative hierarchical structure in which causality is inherent in the temporal dynamics: slower levels drive goal-related processes, and faster levels are related to more transient aspects of actions, such as movements. When this alternative hierarchical structure is applied carefully, the problems that beset the other hierarchies can be avoided.
In Chapter 5 I investigate the applicability of the concept ‘intention’ in neuroscience. Intentions are usually seen as mental states that are the direct cause of our actions. In recent decades neuroscientists have made numerous attempts to locate these mental states in the brain, and have designated a number of areas where intentions would be represented. I argue, however, that it is doubtful that the notion of ‘intention’ corresponds to a single neural process that generates actions. By combining the results of Chapter 4 with Pacherie’s model of intentions (Pacherie, 2006, 2008) and Koechlin’s model of action planning (Koechlin et al., 1999, 2003), I show that the idea of a single, distinct mental state is incompatible with the complex and dynamic character of the processes in the prefrontal cortex, the brain area that is assumed to initiate and control actions. Discrete representations can at most play a stabilizing role in the planning of actions. I then argue that both the ubiquity of the notion of intention and the properties ascribed to it can be explained by looking at the role the notion plays in describing our own and other people’s actions. Finally, I sketch how the studies discussed and the insights gained dictate which processes and properties an alternative interpretation of action planning has to take into account.
In Chapter 6 the ideas, analyses and results of the previous chapters are applied to the field of developmental psychology. The understanding of intentions and actions is being attributed to ever-younger children, while at the same time there is an ongoing debate about the extent to which young children actually possess these capacities. In this chapter I clarify the concepts ‘action understanding’ and ‘intention attribution’, and discuss their relation. In doing so I use the various forms of action understanding from Chapter 3, and speculate about which mechanisms could be responsible for the various facets of understanding actions. On the basis of the conclusions of Chapter 5 I argue that these forms of action understanding generally do not lead to the attribution of an intention. I show that decoupling intention attribution from action understanding, together with the insight that the latter is an umbrella notion, makes it possible to sketch a new framework for studying social cognition in young children.
Finally, in Chapter 7 I discuss, in more general terms, the consequences of the preceding chapters for experimental research on action understanding, and for our interpretation of motor hierarchies, motor simulation and the concept ‘intention’. The most important conclusions are 1) that the evidence for the existence of a hierarchy in actions is partly circular, 2) that motor resonance cannot simply be assumed to contribute to ‘mindreading’, and 3) that a reinterpretation of the notion of ‘intention’ can contribute to a better understanding of how we understand other people’s actions, and how joint action is possible.
Thank you
Thank you Pim, for being my supervisor. 5 years ago I googled for possible collaborations,
and I’m still glad you came up first. Thank you for never lowering your expectations for my
writings, my posters, my presentations, and even my attitude, which could be frustrating
at times, but definitely improved our project (and perhaps even my attitude slightly). Your
capacity of getting to the core of a problem in a matter of seconds has never ceased to
amaze me.
Thank you Harold, for being my promotor. Your experience and your knowledge of the field,
your intuition of how to pitch an argument and what roads to pursue have been crucial for
the success of our project. Thank you for your wonderful way of managing a lab, a centre,
and an institute.
Thank you Iris, for being my copromotor, and for your contribution to the first chapters of this thesis. Thank you for our endless discussions: our difficulties in finding agreement (on
nearly everything) greatly sharpened the argumentation and formulation in our papers.
Thank you Bill, for hosting me in San Diego. Your department and your courses have been
of great inspiration.
Thank you Dan M.F. Burnston, for our adventure in intentions and the prefrontal cortex, and
for the good times we had in San Diego, Mexico, and Nijmegen.
Thank you Markus for our endeavour to unravel infant action understanding. The speed at
which we worked suggests that this thesis could have easily contained 20 chapters.
Thank you my colleagues at the Donders Centre for Cognition, for making the DCC a great
place to be a PhD student, and a great place to be.
Thank you Maaike, for being my paranymph, for the countless salads we ate and cappuccinos
we drank together. Thank you for the fun we had sailing (bailing water), running, playing
squash, and everything else.
Thank you Pascal, for being my paranymph, for the great fun we had exercising, running,
drinking, and everything else.
Finally, thank you Jacinthe and Daphne, for making the world outside of my PhD-project
the best possible.
Publications
Uithol, S., & Paulus, M. (under review). What do infants understand of others’ action? A
theoretical account of early social cognition.
Uithol, S., Burnston, D., & Haselager, W. F. G. (resubmitted). Will intentions be found in
the brain?
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2012). Hierarchies in Action
and Motor Control. Journal of Cognitive Neuroscience, 24(5), 1077–1086.
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). Understanding motor
resonance. Social Neuroscience, 6(4), 388–397.
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). What do mirror neurons
mirror? Philosophical Psychology, 24(5), 607–623.
Uithol, S., Haselager, W. F. G., & Bekkering, H. (2008). When Do We Stop Calling Them
Mirror Neurons? Proceedings of the 30th Annual Conference of the Cognitive Science
Society (pp. 1783–1788).
Curriculum vitae
Sebo was born February 16th, 1977, in Langerak. After finishing high school in Heerenveen, he went to study mechanical engineering at the University of Twente, Enschede. Unsatisfied there, he switched to philosophy of science, technology and society. During the final part of this master’s program he became interested in the mind-body problem, and wrote his master’s thesis on the possibility of finding mental representations using imaging techniques. After graduating he contacted Pim Haselager of Radboud University Nijmegen, and together they submitted a research proposal on the notion of representation, which received an internal NICI graduation grant. Simultaneously, Sebo took on a small teaching assignment in the psychology program at the University of Twente. After finishing his PhD thesis, Sebo was offered an extension of his contract at the Donders Institute, to develop a number of ideas.
Donders graduate school for cognitive neuroscience series
1. van Aalderen-Smeets, S.I. (2007). Neural dynamics of visual selection. Maastricht University, Maastricht, the Netherlands.
2. Schoffelen, J.M. (2007). Neuronal communication through coherence in the human motor system. Radboud University Nijmegen, Nijmegen, the Netherlands.
3. de Lange, F.P. (2008). Neural mechanisms of motor imagery. Radboud University Nijmegen, Nijmegen, the Netherlands.
4. Grol, M.J. (2008). Parieto-frontal circuitry in visuomotor control. Utrecht University, Utrecht, the Netherlands.
5. Bauer, M. (2008). Functional roles of rhythmic neuronal activity in the human visual and somatosensory system. Radboud University Nijmegen, Nijmegen, the Netherlands.
6. Mazaheri, A. (2008). The Influence of Ongoing Oscillatory Brain Activity on Evoked Responses and Behaviour. Radboud University Nijmegen, Nijmegen, the Netherlands.
7. Hooijmans, C.R. (2008). Impact of nutritional lipids and vascular factors in Alzheimer’s Disease. Radboud University Nijmegen, Nijmegen, the Netherlands.
8. Gaszner, B. (2008). Plastic responses to stress by the rodent urocortinergic Edinger-Westphal nucleus. Radboud University Nijmegen, Nijmegen, the Netherlands.
9. Willems, R.M. (2009). Neural reflections of meaning in gesture, language and action. Radboud University Nijmegen, Nijmegen, the Netherlands.
10. van Pelt, S. (2009). Dynamic neural representations of human visuomotor space. Radboud University Nijmegen, Nijmegen, the Netherlands.
11. Lommertzen, J. (2009). Visuomotor coupling at different levels of complexity. Radboud University Nijmegen, Nijmegen, the Netherlands.
12. Poljac, E. (2009). Dynamics of cognitive control in task switching: Looking beyond the switch cost. Radboud University Nijmegen, Nijmegen, the Netherlands.
13. Poser, B.A. (2009). Techniques for BOLD and blood volume weighted fMRI. Radboud University Nijmegen, Nijmegen, the Netherlands.
14. Baggio, G. (2009). Semantics and the electrophysiology of meaning. Tense, aspect, event structure. Radboud University Nijmegen, Nijmegen, the Netherlands.
15. van Wingen, G.A. (2009). Biological determinants of amygdala functioning. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
16. Bakker, M. (2009). Supraspinal control of walking: lessons from motor imagery. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
17. Aarts, E. (2009). Resisting temptation: the role of the anterior cingulate cortex in adjusting cognitive control. Radboud University Nijmegen, Nijmegen, the Netherlands.
18. Prinz, S. (2009). Waterbath stunning of chickens – Effects of electrical parameters on the electroencephalogram and physical reflexes of broilers. Radboud University Nijmegen, Nijmegen, the Netherlands.
19. Knippenberg, J.M.J. (2009). The N150 of the Auditory Evoked Potential from the rat amygdala: In search for its functional significance. Radboud University Nijmegen, Nijmegen, the Netherlands.
20. Dumont, G.J.H. (2009). Cognitive and physiological effects of 3,4-methylenedioxymethamphetamine (MDMA or ’ecstasy’) in combination with alcohol or cannabis in humans. Radboud University Nijmegen, Nijmegen, the Netherlands.
21. Pijnacker, J. (2010). Defeasible inference in autism: a behavioral and electrophysiological approach. Radboud University Nijmegen, Nijmegen, the Netherlands.
22. de Vrijer, M. (2010). Multisensory integration in spatial orientation. Radboud University Nijmegen, Nijmegen, the Netherlands.
23. Vergeer, M. (2010). Perceptual visibility and appearance: Effects of color and form. Radboud University Nijmegen, Nijmegen, the Netherlands.
24. Levy, J. (2010). In Cerebro Unveiling Unconscious Mechanisms during Reading. Radboud University Nijmegen, Nijmegen, the Netherlands.
25. Treder, M.S. (2010). Symmetry in (inter)action. Radboud University Nijmegen, Nijmegen, the Netherlands.
26. Horlings, C.G.C. (2010). A Weak balance; balance and falls in patients with neuromuscular disorders. Radboud University Nijmegen, Nijmegen, the Netherlands.
27. Snaphaan, L.J.A.E. (2010). Epidemiology of post-stroke behavioural consequences. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
28. Dado-Van Beek, H.E.A. (2010). The regulation of cerebral perfusion in patients with Alzheimer’s disease. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
29. Derks, N.M. (2010). The role of the non-preganglionic Edinger-Westphal nucleus in sex-dependent stress adaptation in rodents. Radboud University Nijmegen, Nijmegen, the Netherlands.
30. Wyczesany, M. (2010). Covariation of mood and brain activity. Integration of subjective self-report data with quantitative EEG measures. Radboud University Nijmegen, Nijmegen, the Netherlands.
31. Beurze, S.M. (2010). Cortical mechanisms for reach planning. Radboud University Nijmegen, Nijmegen, the Netherlands.
32. van Dijk, J.P. (2010). On the Number of Motor Units. Radboud University Nijmegen, Nijmegen, the Netherlands.
33. Lapatki, B.G. (2010). The Facial Musculature – Characterization at a Motor Unit Level. Radboud University Nijmegen, Nijmegen, the Netherlands.
34. Kok, P. (2010). Word Order and Verb Inflection in Agrammatic Sentence Production. Radboud University Nijmegen, Nijmegen, the Netherlands.
35. van Elk, M. (2010). Action semantics: Functional and neural dynamics. Radboud University Nijmegen, Nijmegen, the Netherlands.
36. Majdandzic, J. (2010). Cerebral mechanisms of processing action goals in self and others. Radboud University Nijmegen, Nijmegen, the Netherlands.
37. Snijders, T.M. (2010). More than words – neural and genetic dynamics of syntactic unification. Radboud University Nijmegen, Nijmegen, the Netherlands.
38. Grootens, K.P. (2010). Cognitive dysfunction and effects of antipsychotics in schizophrenia and borderline personality disorder. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
39. Nieuwenhuis, I.L.C. (2010). Memory consolidation: A process of integration – Converging evidence from MEG, fMRI and behavior. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
40. Menenti, L.M.E. (2010). The right language: differential hemispheric contributions to language production and comprehension in context. Radboud University Nijmegen, Nijmegen, the Netherlands.
41. van Dijk, H.P. (2010). The state of the brain, how alpha oscillations shape behaviour and event related responses. Radboud University Nijmegen, Nijmegen, the Netherlands.
42. Meulenbroek, O.V. (2010). Neural correlates of episodic memory in healthy aging and Alzheimer’s disease. Radboud University Nijmegen, Nijmegen, the Netherlands.
43. Oude Nijhuis, L.B. (2010). Modulation of human balance reactions. Radboud University Nijmegen, Nijmegen, the Netherlands.
44. Qin, S. (2010). Adaptive memory: imaging medial temporal and prefrontal memory systems. Radboud University Nijmegen, Nijmegen, the Netherlands.
45. Timmer, N.M. (2011). The interaction of heparan sulfate proteoglycans with the amyloid Beta-protein. Radboud University Nijmegen, Nijmegen, the Netherlands.
46. Crajé, C. (2011). (A)typical motor planning and motor imagery. Radboud University Nijmegen, Nijmegen, the Netherlands.
47. van Grootel, T.J. (2011). On the role of eye and head position in spatial localisation behaviour. Radboud University Nijmegen, Nijmegen, the Netherlands.
48. Lamers, M.J.M. (2011). Levels of selective attention in action planning. Radboud University Nijmegen, Nijmegen, the Netherlands.
49. van Leeuwen, T.M. (2011). ‘How one can see what is not there’: Neural mechanisms of grapheme-colour synaesthesia. Radboud University Nijmegen, Nijmegen, the Netherlands.
50. Scheeringa, R. (2011). On the relation between oscillatory EEG activity and the BOLD signal. Radboud University Nijmegen, Nijmegen, the Netherlands.
51. Ossewaarde, L. (2011). The mood cycle: hormonal influences on the female brain. Radboud University Nijmegen, Nijmegen, the Netherlands.
52. Kuribara, M. (2011). Environment-induced activation and growth of pituitary melanotrope cells of Xenopus laevis. Radboud University Nijmegen, Nijmegen, the Netherlands.
53. Helmich, R.C.G. (2011). Cerebral reorganization in Parkinson’s disease. Radboud University Nijmegen, Nijmegen, the Netherlands.
54. Boelen, D. (2011). Order out of chaos? Assessment and treatment of executive disorders in brain-injured patients. Radboud University Nijmegen, Nijmegen, the Netherlands.
55. Koopmans, P.J. (2011). fMRI of cortical layers. Radboud University Nijmegen, Nijmegen, the Netherlands.
56. van der Linden, M.H. (2011). Experience-based cortical plasticity in object category representation. Radboud University Nijmegen, Nijmegen, the Netherlands.
57. Kleine, B.U. (2011). Motor unit discharges – Physiological and diagnostic studies in ALS. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
58. Paulus, M. (2011). Development of action perception: Neurocognitive mechanisms underlying children’s processing of others’ actions. Radboud University Nijmegen, Nijmegen, the Netherlands.
59. Tieleman, A.A. (2011). Myotonic dystrophy type 2. A newly diagnosed disease in the Netherlands. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
60. Van der Werf, J. (2011). Cortical oscillatory activity in human visuomotor integration. Radboud University Nijmegen, Nijmegen, the Netherlands.
61. van Tilborg, I.A.D.A. (2011). Procedural learning in cognitively impaired patients and its application in clinical practice. Radboud University Nijmegen, Nijmegen, the Netherlands.
62. Bögels, S. (2011). The role of prosody in language comprehension: when prosodic breaks and pitch accents come into play. Radboud University Nijmegen, Nijmegen, the Netherlands.
63. Bruinsma, I.B. (2011). Amyloidogenic proteins in Alzheimer’s disease and Parkinson’s disease: interaction with chaperones and inflammation. Radboud University Nijmegen, Nijmegen, the Netherlands.
64. Voermans, N. (2011). Neuromuscular features of Ehlers-Danlos syndrome and Marfan syndrome; expanding the phenotype of inherited connective tissue disorders and investigating the role of the extracellular matrix in muscle. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
65. Reelick, M. (2011). One step at a time. Disentangling the complexity of preventing falls in frail older persons. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
66. Buur, P.F. (2011). Imaging in motion. Applications of multi-echo fMRI. Radboud University Nijmegen, Nijmegen, the Netherlands.
67. Schaefer, R.S. (2011). Measuring the mind’s ear: EEG of music imagery. Radboud University Nijmegen, Nijmegen, the Netherlands.
68. Xu, L. (2011). The non-preganglionic Edinger-Westphal nucleus: an integration center for energy balance and stress adaptation. Radboud University Nijmegen, Nijmegen, the Netherlands.
69. Schellekens, A.F.A. (2011). Gene-environment interaction and intermediate phenotypes in alcohol dependence. Radboud University Nijmegen, Nijmegen, the Netherlands.
70. van Marle, H.J.F. (2011). The amygdala on alert: A neuroimaging investigation into amygdala function during acute stress and its aftermath. Radboud University Nijmegen, Nijmegen, the Netherlands.
71. De Laat, K.F. (2011). Motor performance in individuals with cerebral small vessel disease: an MRI study. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
72. Mädebach, A. (2011). Lexical access in speaking: Studies on lexical selection and cascading activation. Radboud University Nijmegen, Nijmegen, the Netherlands.
73. van Norden, A.G.W. (2011). Cognitive function in elderly individuals with cerebral small vessel disease. An MRI study. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
74. Jansen, E.J.R. (2011). New insights into V-ATPase functioning: the role of its accessory subunit Ac45 and a novel brain-specific Ac45 paralog. Radboud University Nijmegen, Nijmegen, the Netherlands.
75. Haaxma, C.A. (2011). New perspectives on preclinical and early stage Parkinson’s disease. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
76. Haegens, S. (2012). On the functional role of oscillatory neuronal activity in the somatosensory system. Radboud University Nijmegen, Nijmegen, the Netherlands.
77. van Barneveld, D.C.P.B.M. (2012). Integration of exteroceptive and interoceptive cues in spatial localization. Radboud University Nijmegen, Nijmegen, the Netherlands.
78. Spies, P.E. (2012). The reflection of Alzheimer disease in CSF. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
79. Helle, M. (2012). Artery-specific perfusion measurements in the cerebral vasculature by magnetic resonance imaging. Radboud University Nijmegen, Nijmegen, the Netherlands.
80. Egetemeir, J. (2012). Neural correlates of real-life joint action. Radboud University Nijmegen, Nijmegen, the Netherlands.
81. Janssen, L. (2012). Planning and execution of (bi)manual grasping. Radboud University Nijmegen, Nijmegen, the Netherlands.
82. Vermeer, S. (2012). Clinical and genetic characterisation of Autosomal Recessive Cerebellar Ataxias. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
83. Vrins, S. (2012). Shaping object boundaries: contextual effects in infants and adults. Radboud University Nijmegen, Nijmegen, the Netherlands.
84. Poelmans, G.J.V. (2011). Genes and protein networks for neurodevelopmental disorders. Radboud University Nijmegen, Nijmegen, the Netherlands.
85. Weber, K.M. (2012). The language learning brain: Evidence from second language and bilingual studies of syntactic processing. Radboud University Nijmegen, Nijmegen, the Netherlands.
86. Verhagen, L. (2012). How to grasp a ripe tomato. Utrecht University, Utrecht, the Netherlands.
87. Nonkes, L.J.P. (2012). Serotonin transporter gene variance causes individual differences in rat behaviour: for better and for worse. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
88. Joosten-Weyn Banningh, L.W.A. (2012). Learning to live with Mild Cognitive Impairment: development and evaluation of a psychological intervention for patients with Mild Cognitive Impairment and their significant others. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
89. Xiang, H.D. (2012). The language networks of the brain. Radboud University Nijmegen, Nijmegen, the Netherlands.
90. Snijders, A.H. (2012). Tackling freezing of gait in Parkinson’s disease. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
91. Rouwette, T.P.H. (2012). Neuropathic Pain and the Brain – Differential involvement of corticotropin-releasing factor and urocortin 1 in acute and chronic pain processing. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
92. Van de Meerendonk, N. (2012). States of indecision in the brain: Electrophysiological and hemodynamic reflections of monitoring in visual language perception. Radboud University Nijmegen, Nijmegen, the Netherlands.
93. Sterrenburg, A. (2012). The stress response of forebrain and midbrain regions: neuropeptides, sex-specificity and epigenetics. Radboud University Nijmegen, Nijmegen, the Netherlands.
94. Uithol, S. (2012). Representing Action and Intention. Radboud University Nijmegen, Nijmegen, the Netherlands.