Representing Action and Intention

Sebo Uithol

Donders Series 94

Promotor: prof. dr. Harold Bekkering
Copromotoren: dr. Pim Haselager, dr. Iris van Rooij
Manuscriptcommissie: prof. dr. Günther Knoblich, prof. dr. Vittorio Gallese (Università degli studi di Parma), prof. dr. Bernard Hommel (Universiteit Leiden)

The studies described in this thesis were carried out at the Donders Institute for Brain, Cognition, and Behaviour at the Radboud University Nijmegen, the Netherlands.
ISBN/EAN: 978-94-91027-38-3
Printed by Ipskamp Drukkers, Enschede, the Netherlands
© Sebo Uithol, 2012

Contents
one: introduction
two: mirror neurons as representations
three: motor resonance
four: action hierarchies
five: intentions in action
six: action understanding in infants
seven: discussion
references
summary
dutch summary
thank you
publications
curriculum vitae
donders series

one
introduction
keywords: representation, action, intention

The concept of 'intentions' plays a crucial role in our everyday explanations of our own and others' behavior. Intentions are generally taken to be a goal state combined with an action plan for how to reach that state, which makes the concepts of actions and intentions deeply intertwined: actions are something we do, intentions make us do it. Movements that are unintentional, such as blinking, sneezing and spilling coffee, are not considered actions, and—vice versa—desires that do not involve actions are not considered intentions. Moreover, the relation between intentions and actions seems to be hierarchical: a goal is achieved by means of one or more actions, which in turn can consist of multiple sub-actions. For example, I can form the intention to get coffee, and subsequently plan the actions needed to reach my goal: getting up, walking to the hallway, locking the office door, etc. Locking the office door, in turn, consists of taking the office key from my pocket, inserting it in the lock, etc. How do we do that? What neural mechanisms could underlie this ability to structure our behavior in such a way that a goal can be achieved?

In this thesis I will argue that the way we understand an action is not necessarily identical to the way we execute it. Therefore, the hierarchical structure seemingly inherent in actions need not match a similar hierarchy in the control of these actions. I will first analyze what we understand when we understand an action. Next, I will argue that the hierarchy found in action control is not a straightforward causal hierarchy, in which elements higher in the hierarchy cause or initiate elements lower in the hierarchy. Instead, the hierarchical structure seems to emerge from multiple processes that run at different time scales. As our behavior is the result of such an emergent hierarchical structure to which all of the elements jointly contribute, the notion of 'intention', thought to occupy the highest regions of the hierarchy and to be the primary cause of our actions, will change dramatically. This, in turn, impacts our conception of 'action understanding'. The result I present here is not so much a detailed theory of how our actions are controlled in the brain, but rather an alternative framework that allows for designing new experiments and formulating new theories on action execution and action understanding.
First, this introduction will give a brief overview of the main concepts of this thesis: representation in cognitive neuroscience, intention, and action representation in the brain. In discussing these notions, I will further specify the problems that will be the topic of the subsequent chapters.

Representation in cognitive neuroscience and philosophy

A representation consists of four elements: vehicle, content, user and object.1 The object, event, or neural state that carries the information is called the vehicle; the information that is carried by the vehicle is called the content. Each representation necessarily contains a vehicle and content. Representations are commonly identified by their content (Uithol, Haselager, & Bekkering, 2008; Uithol, van Rooij, Bekkering, & Haselager, 2011a), which means that two representations are different when their content is different.2 The content of two representations is different when each representation plays a different role in the cognitive system. This is often called the functional discreteness of representations (Haselager, 1997; Stich, 1983; see also Chapter 5). The entity that is represented, which could be an object or an event, is called the object. The content stands in some relation to the object that is represented, but content and object are not identical: the content contains only certain aspects of the object. The fact that content and object are not the same makes misrepresentation possible (often emphasized in theoretical accounts, see for instance Cummins (1989) and Dretske (1988)), but also appears to be useful for assessing representational claims about mirror neurons (see Chapter 2). The fourth and final element of a representation is the user. The user is the system or process that uses the representation to guide its behavior, entailing that something is a representation only to some system or process. The importance of a user is often emphasized in theoretical accounts (Bechtel, 1998; Eliasmith, 2005; Haugeland & Rumelhart, 1991; Millikan, 1984), but remains rather implicit in discussions in cognitive psychology and cognitive neuroscience. This is potentially problematic, as the content of one and the same vehicle can vary depending on the user that reads the representation (see Eliasmith, 2005).

1 This set of core elements is different from—for instance—Harvey's (1996) set. Harvey places more emphasis on the communicational aspects of representation and deems user, producer, representation and object the essential aspects. This emphasis on a representation producer dismisses what Dretske (1988) calls 'natural signs' from the realm of representation, because here there is no producer, only significant features in an environment. To encompass non-communicative features as well, I will stick to the four elements mentioned in the text.
2 Consequently, two different representations can use the very same representational resources, or vehicle. This phenomenon is referred to as "superimposition" (van Gelder, 1999).

Throughout the history of cognitive science, the notion of representation has played a fundamental role as an explanatory construct. Cognition and behavior were thought to be the result of computational processes that use representations.
The notion has been defined, operationalized and interpreted many times (Bechtel, 1998; Chemero, 2000; Clark, 1997; Dretske, 1988; Haugeland & Rumelhart, 1991; Millikan, 1984), but, as it turns out, never entirely satisfactorily (see Cummins, 1989; Haselager, De Groot, & Van Rappard, 2003). A particularly problematic issue is characterizing the relation between the object and the content. To illustrate: according to Clark (1997) and Haugeland (1991) a system is "representation using just in case: 1) It must coordinate its behaviors with environmental features that are not always 'reliably present to the system', 2) It copes with such cases by having something else (in place of a signal directly received from the environment) 'stand in' and guide behavior in its stead, and 3) That 'something else' is part of a more general representational scheme that allows the standing in to occur systematically and allows for a variety of related representational states" (Clark, 1997). However, this "standing in", which is supposed to capture the relation between content and object, is still open to interpretation. Some philosophers (e.g. Dretske, 1988) have tried to ground "standing in for" in a reliable covariance of the representation and the feature it represents, but others have pointed out that covariance is neither sufficient (Clark, 1997) nor necessary (Millikan, 1984) for standing in. Covariance is not a necessary condition because, for example, an alarm bell can represent a radiation leak even when a leak has never occurred. Covariance is not sufficient either, because an environmental state can be continuously available to the system, so there is no need for representation. For example, young sunflowers track the sun with their heads, resulting in a covariance of the sun's position and the orientation of the flower heads. Yet it seems awkward to describe the behavior of the sunflowers in terms of internal representations of the position of the sun.3

3 Haugeland (1991) and Grush (1997) have argued that X is a representation of Y only when Y is not always reliably present to the system. If Y is reliably present, Grush prefers using the notion presentation instead of representation. Clark, on the other hand, prefers the notion representation as soon as this representational approach yields explanatory power, regardless of whether the environmental feature is reliably present to the system (Clark, 1997).

Notwithstanding these conceptual difficulties, covariation seems to provide at least a first estimation or a rough approximation of the representational content, and remains the main basis for establishing representations in cognitive neuroscience. In a typical setup of a neurophysiological experiment, an animal is presented with stimuli while the activity of single neurons is measured. When a reliable covariation between the presence of a stimulus and the activity in a cell is established, the cell is said to represent the stimulus. To illustrate, using this paradigm mirror neurons in the premotor and parietal cortex (Di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992; Fogassi & Luppino, 2005; Gallese, Fadiga, Fogassi, & Rizzolatti, 1996; Rizzolatti, Fadiga, Gallese, & Fogassi, 1996), 'edge detecting neurons' in the primary visual areas (Hubel & Wiesel, 1959) and even 'Jennifer Aniston neurons' in the medial temporal cortex (Quian Quiroga, Reddy, Kreiman, Koch, & Fried, 2005) have been established.
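To make the covariation criterion concrete, here is a minimal sketch of how such a covariation might be quantified, assuming fabricated trial data and a plain Pearson correlation as the measure; it illustrates the logic of the paradigm only and is not an analysis taken from any of the studies cited above.

```python
# Illustrative sketch (not an actual analysis from the cited studies):
# quantify covariation between stimulus presence and a cell's firing.
import numpy as np

rng = np.random.default_rng(0)

n_trials = 200
stimulus_present = rng.integers(0, 2, size=n_trials)           # 1 = stimulus shown, 0 = absent
baseline = rng.poisson(lam=5, size=n_trials)                    # spontaneous spike count
evoked = rng.poisson(lam=12, size=n_trials) * stimulus_present  # extra spikes on stimulus trials
spike_count = baseline + evoked

# Pearson correlation as a simple covariation measure
r = np.corrcoef(stimulus_present, spike_count)[0, 1]
print(f"correlation between stimulus presence and firing: r = {r:.2f}")

# On this logic, a reliably high r is what licenses the claim that the cell
# 'represents' the stimulus; which content is thereby established depends
# entirely on which stimulus dimension the experimenter chose to vary.
```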
Embodied representations

When cognition is claimed to be embodied, the dependency of cognition on the body is emphasized. This dependency can take various forms (Ziemke, 2003), but in cognitive neuroscience, embodiment is usually taken to mean that sensory and motor cortices are involved in tasks that do not directly involve perception or action (Barsalou, 2008; de Vignemont & Haggard, 2008; Decety & Grezes, 2006; Glenberg, 1997; Mahon & Caramazza, 2008). The framework of embodied cognition has important consequences for our interpretation of both representational vehicle and content. For the case of perceptual representations it entails that certain concepts, say a bird, are not represented in an amodal, conceptual format, but in the sensorimotor areas and in a format that is similar to actual perceptions of a bird. This representation is hypothesized to be a gradual build-up of all the neural activity caused by encounters with a bird (Barsalou, 1999; 2008). Consequently, when thinking about birds, these perceptual areas are reactivated. For action representations this means that actions are not stored as amodal concepts, but as motor activations. Similarly, thinking about a certain action activates those areas that would also be active upon performing that action.

In the action domain 'embodiment' is sometimes interpreted more radically, and used to denote the fact that certain features of the action are not represented, but delegated, so to speak, to the body (Chiel & Beer, 1997; Van Dijk, Kerkhofs, van Rooij, & Haselager, 2008). As an example: our gait is the result of a complex orchestra of movements in many joints. The muscle activation responsible for a successful gait is hypothesized to be controlled by a central pattern generator (Duysens & Van de Crommert, 1998). However, these neural patterns are not sufficient to generate a fluent and efficient gait. Bodily components, such as muscle and tendon elasticity, are of crucial importance (Whittington, Silder, Heiderscheit, & Thelen, 2008). In other words, some particular stages or parts of an action are not controlled by the neural patterns that activate muscles, but are accomplished by "neural silence" and by exploiting regularities of the body. These features of an action are not part of the neural representation of the action, but they are, nevertheless, part of the action.4

4 Other embodied interpretations question the strict contrast between perceptual and motor representations. For example, Millikan's (1995) "pushmi-pullyu representations" and Clark's (1997) "action-oriented representations" code for a certain environmental feature (say, a predator) and at the same time the appropriate response to it (fleeing). More recently, Hommel's 'Theory of Event Coding' (Hommel et al., 2001) posits a common representational medium for perceptual and action representations.

Due to the above-mentioned outsourcing of representational content to bodily (and also environmental, in the case of 'embedded cognition') elements, researchers in various fields started questioning the explanatory value of the notion of representation altogether, and emphasized the constant dependency of cognition on a body and the world. This anti-representationalism can be found in developmental psychology (Thelen & Smith, 1994), cognitive neuroscience (Beer, 2000; Edelman, Tononi, & Haier, 2003; Kelso, 1995), philosophy (Keijzer, 2001; van Gelder, 1995; 1998), and artificial intelligence (Brooks, 1991a). But the conceptual vagueness of the notion of representation proved problematic. Proposals that were considered to be non-representational by the anti-representationalists were considered representational by the representationalists (Haselager et al., 2003). The debate around the 'Watt governor' is illustrative of the profundity of this conceptual confusion. This relatively simple mechanism became the subject of a debate about whether or not it is a representational system (Bechtel, 1998; Dietrich & Markman, 2001; 2003; Nielsen, 2010; van Gelder, 1995; 1998).
For a part, this thesis is able to sidestep this debate between representationalism and anti-representationalism. The first chapters—chapters 2 and 3—analyze claims that are made about the representational content attributed to mirror neurons and motor resonance. As such, these chapters are written under the assumption of representationalism, meaning that these chapters analyze scientific and conceptual claims that are made within this framework, and do not question the framework itself. This neutral stance is more difficult to maintain in chapters 4 and 5. In chapter 4 the explicit action hierarchy, with functionally discrete elements that engage in causal interaction, is argued to be untenable. In chapter 5 not only the straightforward causal relations between action representations are questioned, but so are the elements themselves. Specifically, here it is argued that the notion of 'intention', conceived as a functionally discrete representation, is deeply incompatible with action control processes. This means that the function normally ascribed to an intention is performed by various heterogeneous elements, both inside and outside (see discussion section) of the brain. As representational content is tied to the function—as explained above—the representational view needs to postulate a representation of an intention with a vehicle that consists of various processes and elements. Some of these processes are dynamically coupled to each other, or to external features. This dynamic character and distributed make-up of the representational vehicle might in the end be reconcilable with the notion of representation, but demands such a stretch of the representational framework that it is questionable whether the traditional notion still offers explanatory leverage.

Intentions

The notion of 'intention' has had a long, turbulent and rambling past. In classical psychology and philosophy, intentions are conceived of as mental states (Hume, 1739; James, 1890). In James' (1890) ideomotor theory, for instance, an intention encompasses a belief about the perceptual consequences of an action and at the same time causes that action. Halfway through the twentieth century, influenced by the linguistic turn in philosophy (Wittgenstein, 1953) and behaviorism (Skinner, 1953), intentions were interpreted as restatements of an action, and exiled from the realm of mental states. In her influential work, Anscombe (1957) argued that when we explain intentional action, we give reasons for the action, not causes, thereby suggesting that intentions are not so much causes of actions as descriptions of primary reasons of the agent. Although Davidson initially agreed with Anscombe (Davidson, 1963), he later (1980) argued that this account cannot account for the deliberation and planning aspects of intention. To acknowledge the prospective (future-directed) and behavior-structuring aspects of intentions (e.g. the intention to play squash after work), and in line with more cognitivist approaches to behavior, the notion of intention was brought back to being a mental state. In this refurbished interpretation of intention, the notion was adopted in neuroscience.
In his seminal experiment, Libet (1985) asked participants to pay attention to the exact moment at which they had the intention ("felt the urge", p. 530) to press a button, and was, through EEG recording, able to show that this moment was about 250 ms later than the onset of a 'readiness potential'—a signal corresponding with the preparation of an action. This finding created a volley of follow-up studies (Brass & Haggard, 2008; Haynes & Rees, 2006; Lau, Rogers, Haggard, & Passingham, 2004), theoretical interpretations (Wegner, 2003), and criticism (Dennett, 1991). After Libet's remarkable findings not only the timing of intentions has been studied; the emergence of fMRI as a research tool for cognitive neuroscience made attempts to localize intentions possible as well (Burgess, Veitch, de Lacy Costello, & Shallice, 2000; Hamilton & Grafton, 2006; Haynes et al., 2007; Lau et al., 2004; Ouden, Frith, Frith, & Blakemore, 2005).

Although contemporary interpretations of 'intention' show slight variability, an intention is usually interpreted as a desired goal, combined with an action plan to reach that goal (Bratman, 1987; Malle & Knobe, 1997; Moses, 2001; Pacherie, 2000). These intentions are generally thought to be functionally discrete (see above), propositional states. Yet, as will be explained in Chapter 4, an explicit and top-down control structure has proven to be highly problematic. This means that the nature of intention in the action hierarchy is unclear. After analyzing both the notion of intention, with all the properties inherent to it, and the neural processes that cause and control our actions, Chapter 5 concludes that the two frameworks are incompatible. Intentions, it will be argued, have their origins in action explanations, not action control, and therefore fulfill an explanatory role, not a causal one. This is not to deny the capacity of forming future-directed plans that guide our behavior, but to show that such future-directed action control is not best described in terms of intentions as the primary cause of the observable behavior.

Action representation

Movements are controlled by the primary motor cortex (see Figure 1a) (Kakei, Hoffman, & Strick, 1999). There seems to be an almost direct correspondence between the activity of neurons in this area and simple movements: electrical stimulation of neurons in the primary motor cortex leads to muscle twitches (Fritsch & Hitzig, 1870). It was thus found that different parts of the primary motor cortex control different effectors (Penfield & Rasmussen, 1950), resulting in the well-known 'homunculus' (Figure 1b).

Figure 1. a) The primary motor cortex, and b) the anatomical segregation of the projections of the primary motor cortex (adapted from Penfield & Rasmussen, 1950).

What is represented in the primary motor cortex is still debated. For example, there is evidence that not only individual muscle forces, but also basic movements (the 'motor vocabulary', Rizzolatti et al., 1988) are represented here. Graziano (2002; 2007) stimulated individual neurons in the primary motor cortex of a macaque monkey for a relatively long period (500 ms, as opposed to the usual maximum of 50 ms).
He found that this stimulation evoked full movements, such as grasping, bringing to the mouth, and eating, instead of the previously reported muscle twitches.

The premotor cortex and the supplementary motor area lie anterior to the primary motor cortex. It is generally assumed that here more complex actions, in the form of series of basic action chunks, are planned and prepared (in interaction with the basal ganglia), and that these complex representations are subsequently propagated to the primary motor cortex (Gentilucci et al., 2000; Goldberg, 1985; Grafton & Hamilton, 2007). The idea is that actions start with action goals or intentions, often presumed to be represented in the lateral and medial prefrontal cortex (Hamilton & Grafton, 2008), the anterior cingulate cortex (Haynes et al., 2007; Lau et al., 2004), and limbic structures (Damasio, 1985). These intentions are posited to be rather unspecified and context-independent (Haggard, 2005). They are subsequently propagated to the premotor areas, where they are embedded in the current context and result in a concrete action plan (Fuster, 2004; Hamilton & Grafton, 2007). Finally, these action representations are translated into a detailed set of movements in the primary motor area.

Although this image appeals to an intuitive logic, it seems to be an oversimplification of the neural control of actions. First, it is problematic to talk about the origin of an intention, as this seems to suggest a brain area 'where it all comes together'—or a 'Cartesian Theater' in the words of Dennett (1991)—and where decisions are made. Apart from the conceptual problems discussed by Dennett (the Homunculus problem, for example; see also Bennett & Hacker (2003)), there seems to be empirical evidence that more posterior areas are closely involved in action planning as well (Koechlin, Ody, & Kouneiher, 2003). For example, the existence of canonical neurons (neurons that associate certain objects with the appropriate action for that object (Grezes, Armony, Rowe, & Passingham, 2003; Murata et al., 1997)) in the ventral premotor cortex, and mirror neurons (neurons that associate one's own and observed actions) in the left premotor and parietal cortex, suggests that action planning does not rely solely on anterior or medial parts of the prefrontal regions, or on intentions that are created in the absence of context information.

Action hierarchies

The functional and anatomical segregation of different aspects of an action has widely been interpreted as supporting the idea of an action hierarchy (Byrne & Russon, 1998; Cooper & Shallice, 2006; Grafton & Hamilton, 2007; Hamilton, 2009; Hamilton & Grafton, 2007; Liepelt, Cramon, & Brass, 2008; Saltzman, 1979; Van Elk, 2010; Van Elk, Van Schie, & Bekkering, 2008). This hierarchical view on actions is so commonly accepted that it seems part of cognitive science's 'common sense', and is as such hardly ever explicated or questioned. Pfeifer and Scheier note that: "One of the main reasons that the hierarchical view of behavior control has been (and still is) so popular is that it is straightforward and easy to understand. Moreover, it has a strong basis in folk psychology: It seems compatible with what we do in our everyday activities" (Pfeifer & Scheier, 1999, p. 344).

The general idea of structuring an action into a hierarchy is highly similar to control structures found in classical AI systems (Good Old Fashioned AI, or GOFAI) and robotics. In these domains action planning is viewed as a kind of problem solving.
When the overall problem is too complicated to solve at once, the system sets out to solve sub-problems first. This hierarchical control structure in robotics dates back to the 1960s. The robot Shakey was controlled by a hierarchical planning structure that consists of five levels (Nilsson, 1984). The bottom level may be thought of as defining the elementary physical capabilities of the system (i.e. the basic motor vocabulary). The second level consists of what are called Low-Level Actions, or LLAs. Examples of these LLAs are the robot's physical capabilities, such as 'roll' and 'tilt'. The third level consists of a library of Intermediate-Level Actions, or ILAs. These ILAs are preprogrammed packages of LLAs. According to Nilsson, these ILAs are best thought of as "instinctive abilities of the robot, analogous to such built-in complex animal abilities as 'walk' or 'eat'" (Nilsson, 1984, p. 6). Above the level of ILAs there is a fourth level, which is concerned with planning the solutions to problems. The fifth and top level consists of the program that actually invokes and monitors executions of the ILAs. Shakey's basic planning mechanism, STRIPS, plans first in an abstract space and then refines the plan at successively more detailed levels. This makes its action control structure highly similar to current conceptions of action hierarchies (Grafton & Hamilton, 2007; Hamilton & Grafton, 2007; see Chapter 4), as well as to Pacherie's model of intentional action (Pacherie, 2008; see Chapter 5).

However, these classical control structures have been shown to have severe limitations in dealing with real world environments. They suffer from various versions of the frame problem (i.e. it is impossible to assess which consequences of an action are relevant and which are irrelevant; see Fodor (1983), Haselager (1997), and Pylyshyn (1987)), which makes their problem solving routines potentially computationally intractable (as the computations needed for planning grow exponentially with every added element in the world (van Rooij, 2008)). This makes these classical structures problematic to such an extent that Nolfi, Ikegami, and Tani deem classical approaches based on explicit design 'hopeless' (2008, p. 101). Strikingly, the very same hierarchical structure that is recognized as causing the problems in robotics, and therefore deemed hopeless, is posited to be the solution to the problem of action planning in humans. Haggard posits that "The brain must expand this task-level representation into an extremely detailed movement pattern specifying the precise kinematics of all participating muscles and joints. Generating this information is computationally demanding. The brain's solution to the problem may lie in the hierarchical organization of the motor system" (Haggard, 2005, p. 292). Chapter 4 will argue, however, that the explicit representational and hierarchical structure is problematic—both conceptually and empirically—in neural action control.
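As a purely illustrative sketch of the explicit, Shakey-style decomposition just described, the following toy planner expands a goal through a hand-written library of intermediate-level actions (ILAs) into low-level actions (LLAs); the action names and decompositions are invented and are not Nilsson's code.

```python
# Toy illustration of an explicit action hierarchy in the GOFAI style sketched
# above: a goal expands into intermediate-level actions (ILAs), which expand
# into low-level actions (LLAs). Names and decompositions are invented.

ILA_LIBRARY = {
    "get coffee":   ["stand up", "leave office", "walk to machine", "operate machine"],
    "leave office": ["walk to door", "lock door"],
    "lock door":    ["take key", "insert key", "turn key"],
}

LLAS = {"stand up", "walk to door", "take key", "insert key", "turn key",
        "walk to machine", "operate machine"}

def expand(goal):
    """Recursively expand a goal into a flat sequence of low-level actions."""
    if goal in LLAS:
        return [goal]
    plan = []
    for sub in ILA_LIBRARY[goal]:
        plan.extend(expand(sub))
    return plan

print(expand("get coffee"))
# ['stand up', 'walk to door', 'take key', 'insert key', 'turn key',
#  'walk to machine', 'operate machine']
```

Even this toy version shows why the approach scales badly: every decomposition has to be spelled out in advance, and any change in the environment requires revising the library, which is where the frame problem and the exponential cost of planning enter.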
As an alternative to the GOFAI control structures in robotics, Brooks (1986; 1991a; 1991b) introduced the subsumption architecture. This architecture consists of various independent and parallel layers. Lower layers implement simple forms of behavior, such as 'walking' or 'avoiding objects', and function as autonomous control units with their own input from the sensors and output to the effectors. Higher layers do not start or stop activity in the lower layers, but merely modulate the output of the lower layers. In such architectures, the resulting behavior is not the result of top-down control, but emerges from the interaction between all layers and the environment. The types of control at lower layers (e.g. 'avoiding an object') operate on a smaller timescale than those at higher layers (e.g. 'exploring the world'). Yamashita and Tani (2008) were able to have a similar control structure emerge from a network in which the units operate on different time scales. This result provides an interesting suggestion for how the brain controls behavior in a hierarchical manner. Additionally, this suggestion seems compatible with recent models of action control (Koechlin et al., 2003; Kouneiher, Charron, & Koechlin, 2009).
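For contrast, here is an equally simplified sketch of a subsumption-style, timescale-layered controller; the layer names, update periods and the modulation rule are invented for illustration and do not reproduce Brooks' robots or Yamashita and Tani's network.

```python
# Simplified sketch of a subsumption-style controller: each layer has its own
# sensing and output; a slower, higher layer does not issue commands to the
# lower one but merely modulates (here: scales) its output. All details are
# invented for illustration.

class AvoidObstacles:
    """Fast, low-level layer: turn away from nearby obstacles."""
    def output(self, distance_to_obstacle):
        return -1.0 if distance_to_obstacle < 0.5 else 0.0   # steering command

class Explore:
    """Slow, high-level layer: occasionally biases how strongly avoidance acts."""
    period = 10                                # updates only every 10 ticks
    def __init__(self):
        self.gain = 1.0
    def update(self, tick):
        if tick % self.period == 0:
            self.gain = 0.5 if (tick // self.period) % 2 else 1.0

avoid, explore = AvoidObstacles(), Explore()
for tick in range(30):
    explore.update(tick)                       # slow timescale
    distance = 0.3 if tick % 7 == 0 else 2.0   # toy sensor reading
    steering = explore.gain * avoid.output(distance)   # modulation, not command
    # The observable 'hierarchy' is a description of these two interacting
    # timescales, not an explicit goal issuing sub-goals.
```

The point of the sketch is that nothing in it issues sub-goals: the apparent hierarchy is a description of two processes running at different timescales and modulating one another.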
In Chapter 4 I will discuss action hierarchies, their problems, and a possible solution.

Outline of this thesis

This thesis is organized as follows. In chapter 2 I will analyze claims that are being made about the representational content of mirror neurons. I will show that there is no limit to the level of abstraction of the content that can be attributed to single mirror neurons. I will argue, however, that the higher levels are less appropriate if one wants to maintain the idea of action mirroring as a form of direct matching. As mirror neurons in macaques were found using invasive techniques, it is difficult to establish mirror neurons in humans (Chong, Cunnington, Williams, Kanwisher, & Mattingley, 2008; Kilner, Neal, Weiskopf, Friston, & Frith, 2009; Lingnau, Gesierich, & Caramazza, 2009). Therefore, one usually does not speak of mirror neuron activity in humans, but more cautiously of 'motor resonance'—the phenomenon that the motor areas are activated upon action observation. In chapter 3 I will discuss motor resonance and its putative relation to action understanding. I will show that there is great variability in the interpretations of these notions: two interpretations of motor resonance, three interpretations of action understanding, and three interpretations of action goal. Consequently, a (fictive, but reasonable) claim that "motor resonance contributes to understanding the goal of an action" can have—apart from the vague "contributes to"—eighteen different meanings. An example of the misunderstanding stemming from this use of terminology is discussed, and experiments on the basis of the exposition of the different interpretations are suggested. In chapter 4 it is argued that our intuitive conception of an action hierarchy is insufficient to capture the complexity of the neural control of our behavior. We have an intuitive way of carving an action into sub-actions and sub-sub-actions, and implicitly assume that these components have a neural counterpart. This need not be the case. Drawing upon recent AI (Yamashita & Tani, 2008) and neurocognitive models (Koechlin et al., 2003; Kouneiher et al., 2009), I will argue that there is indeed a hierarchical organization present in neural circuits, but that it is based on differences in the timescales at which processes operate. Consequently, the straightforward causal connections between the elements in a hierarchy are untenable. While chapter 4 argues that the links between the elements in an action hierarchy are untenable, chapter 5 continues the dissolution of the discrete and explicit character of the action hierarchy by arguing that some of the elements themselves are not present in action control either. In this chapter the notion of 'intention' is analyzed and compared with the neural control of action. It is shown that the discrete character of intentions is incompatible with the dynamic nature of the control processes, and that therefore intentions cannot play the role in guiding our actions that is commonly assumed. Chapter 6 serves as an example of the direct relevance of the previous chapters to cognitive science. In this chapter the insights in action understanding (chapter 3), action control (chapter 4) and the nature of intention (chapter 5) are applied to infant action understanding and intention attribution. It will be argued that infant studies tend to leave the notion of 'action understanding' underspecified, thereby paving the way for conflating action understanding and intention attribution, resulting in overly rich interpretations of infants' cognitive capacities. Finally, in chapter 7 a general discussion of the preceding chapters is presented. It is concluded that our notions of action hierarchies and intentions need to be modified substantially in order to provide empirically fruitful concepts. Next, two examples of how the analyses of the preceding chapters could lead to a reinterpretation of current data are given. Finally, I will discuss what the conclusions in this thesis mean for the concept of 'intention' itself, and how this could impact future research.

two
mirror neurons as representations
keywords: goal inference, mirror neurons, representation, theory of mind, computation

abstract
Single cell recordings in monkeys provide strong evidence for an important role of the motor system in action understanding. This evidence is backed up by data from studies of the (human) mirror neuron system using neuroimaging or TMS techniques, and by behavioral experiments. Although the data acquired from single cell recordings are generally considered to be robust, several debates have shown that the interpretation of these data is far from straightforward. We will show that research based on single-cell recordings allows for unlimited content attribution to mirror neurons. We will argue that a theoretical analysis of the mirroring process, combined with behavioral and brain studies, can provide the necessary limitations. A complexity analysis of the type of processing attributed to the mirror neuron system can help formulate restrictions on what mirroring is and what cognitive functions could, in principle, be explained by a mirror mechanism. We argue that processing at higher levels of abstraction needs the assistance of non-mirroring processes to such an extent that subsuming the processes needed to infer goals from actions under the label 'mirroring' is not warranted.

This chapter was published, in a slightly modified form, as: Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). What do mirror neurons mirror? Philosophical Psychology, 24(5), 607–623.

Mirroring in cognitive neuroscience

After their discovery in the early 1990s (Di Pellegrino et al., 1992; Gallese et al., 1996; Rizzolatti et al., 1996) mirror neurons caused great excitement in cognitive neuroscience, as these neurons seem to suggest a common coding of action perception and action execution (Hommel, Müsseler, Aschersleben, & Prinz, 2001; Prinz, 1997), or a shared representation between the observer and the executor of an action (de Vignemont & Haggard, 2008; Grezes & Decety, 2001). They were labeled 'mirror neurons', as the observed action "seems to be 'reflected', like in a mirror, in the motor representation for the same action of the observer" (Buccino, Binkofski, & Riggio, 2004a).
It is exactly this 'rock bottom' connotation of mirroring (i.e. direct reflection, or direct matching of action features) that has made it an attractive notion for explanations of cognitive functions, including action understanding (Fogassi & Luppino, 2005; Iacoboni et al., 2005; Rizzolatti, Fogassi, & Gallese, 2001), emotion understanding (de Vignemont & Singer, 2006; Keysers & Gazzola, 2006; Wicker et al., 2003), imitation (Brass & Heyes, 2005; Buccino, Vogt, Ritzl, Fink, Zilles, Freund, & Rizzolatti, 2004b; Iacoboni et al., 1999; Rizzolatti, 2005; Wohlschlager & Bekkering, 2002), complementary action (Newman-Norlund, Van Schie, Van Zuijlen, & Bekkering, 2007), and communication (Ferrari, Gallese, Rizzolatti, & Fogassi, 2003; Gallese & Lakoff, 2005).

Evidence for the existence of mirroring processes is derived from broadly three types of experimental research: (i) single-cell recordings in monkeys, (ii) analyses of the entire (human) mirror neuron system (MNS) using imaging or TMS techniques, and (iii) behavioral experiments, using interference effects and reaction times to probe properties of the MNS. The received view is that observed actions are mapped onto the motor cortex of the observer. When there is a matching motor representation available, the action is recognized. This hypothesis is known as the direct-matching hypothesis (Rizzolatti et al., 2001).

In single cell research, the activity of mirror neurons is often conceptualized as a form of representation, coding for (categories of) actions or action goals (Fogassi & Luppino, 2005; Gallese et al., 1996; Iacoboni et al., 1999; Rizzolatti et al., 1996). In this type of research, the activity of a single neuron is measured and related to the occurrence of an external event. When there is a reliable covariance between the neuronal activity and an external event, it is concluded that the neuronal activity represents the external event. In research based on imaging techniques or TMS, and in behavioral studies, mirroring is generally viewed as a form of processing, mapping perceptual representations of the observed action to motor representations of the observer's own action repertoire (Buccino, Binkofski, & Riggio, 2004a; Iacoboni et al., 2005; Miyashita, 2005; Rizzolatti, 2005; Rizzolatti et al., 2001; Rizzolatti & Craighero, 2004). As this type of research depends on imaging or TMS techniques, reaction times and error rates, which can only show or influence activity in large groups of neurons, one can show the involvement of a brain region as a whole in a certain task, but not the response or contribution of a single neuron.1

1 Technically, a consequence of this inability to measure individual neurons in humans is that mirror neurons have not yet been unequivocally established in humans. There is indirect evidence of mirror neurons in humans based on repetition suppression (Chong et al., 2008; Kilner et al., 2009), but this result is not unequivocal (Lingnau et al., 2009).

The data acquired from single cell recordings are generally regarded as robust and solid. Therefore, they are often used to guide research of the other two types, or to interpret the acquired data. For instance, Newman-Norlund et al. (2007) use the distribution of strictly and broadly congruent mirror neurons, as found in monkeys by Gallese and colleagues (1996), to predict the BOLD signal in the MNS in two different conditions (imitative vs. complementary action).
However, the existence of several debates about the function of mirror neurons (Csibra, 2007; Dinstein, Thomas, Behrmann, & Heeger, 2008; Jacob, 2008; Jacob & Jeannerod, 2005; Saxe, 2005a) indicates that although the data these single-cell experiments generate might be hard, the interpretation of these findings is far from straightforward. In recent years several researchers have formulated criticisms of the received interpretation of the function of mirror neurons (the direct-matching hypothesis), pointing to the fact that this hypothesis cannot account for many important findings, and have formulated alternative theories (e.g. Csibra, 2007; de Vignemont & Haggard, 2008; Jacob, 2008). For example, both Csibra and Jacob argue that mirror neuron activity is not constitutive of action understanding, but only indicative of it (Csibra, 2007; Jacob, 2008; Jacob & Jeannerod, 2005). They argue that action understanding is an interpretative process that takes place outside the motor system, and that mirror neurons are involved in the subsequent action prediction and planning. Although our analysis is different, the outcome can be interpreted as (partially) supporting their views.

In this paper we want to analyze the paradigm that has led to many of the findings, i.e. the single cell recordings. By means of an analysis of the representational elements of mirror neurons, we will show that this type of research allows for virtually unbounded content attribution to individual neurons (see also Uithol et al., 2008). As a consequence, mirror neurons can be said to represent ever more abstract events, from grip types to long-term intentions. However, by means of a complexity analysis of goal inference, a task generally attributed to a mirroring process, we can formulate a possible limitation on what representational output such a process can produce. This analysis, combined with behavioral or brain studies, provides a possible means of limiting the representational content and can thereby help in interpreting the data acquired with single-cell measurements. We will argue that the recognition and understanding of goals and intentions need the assistance of non-mirroring processes to such an extent that subsuming these processes under the label 'mirroring' is no longer warranted. Our analysis is specifically aimed at a mirror mechanism and its alleged support for action understanding. Although there might be consequences for its support for other cognitive functions (see above), these fall outside the scope of this paper.

Mirror neurons representing actions and goals

The firing of mirror neurons is often characterized as a form of representation. The neurons are said to represent action means, action ends or goals, and intentions. Examples are abundant: Gallese et al. (1996) propose that a "possible function of mirror neuron movement representation is that this representation is involved in the 'understanding' of motor events"; Rizzolatti et al. (1996) propose that "[mirror neuron's] activity 'represents' the observed action"; and Iacoboni and colleagues (1999) suggest that "F5 neurons code the general goal of a movement". What counts as an action means or an action goal is relative and a matter of interpretation. To give an example: a precision grip can be a means to the grasping of a cup. This cup grasping, however, can also be considered a means to the end of drinking. Drinking, in its turn, can be regarded as a means to maintaining homeostasis or engaging in social activity.
There thus exists a continuum from concrete, readily observable movements (e.g. the use of a precision grip) to highly abstract goals and intentions (such as engaging in social activity), and there is no a priori way to make a clear-cut and objective contrast between action means and action ends or goals. Several action hierarchies and labels have been proposed to divide this continuum (Bekkering & Wohlschlager, 2002; Grafton & Hamilton, 2007; Jeannerod, 1994). To clarify our terminology: we will speak of actions in relation to the level of grips or simple actions (e.g. grasping with a precision grip), and of goals when the behavior is interpreted more broadly, ranging from motor goals (e.g. the goal of grasping a cup) to long-term intentions (cleaning the table or spending your next holiday in Brazil).2

2 This contrast is also denoted with the terms intention in action and prior intentions (Searle, 1983). See also Chapter 5.

As mirror neuron firing is commonly viewed as a form of representation, we will analyze it using the basic elements of a representation: vehicle, content, object and user (Cummins, 1989; Dretske, 1988) (see Figure 1; see also Bechtel (1998) and Shea (2007) for similar presentations of these elements).

Figure 1. The basic elements of a representation: a vehicle carrying content (together the representation proper), the object that is represented, and a user.

The representation proper consists of a vehicle and a content. The vehicle of a representation is the physical carrier (e.g. a neural state) that represents. The information that is carried by the vehicle is called its content. Content is not the same as the object that is represented. An object or event in the outside world can be misrepresented, and most of the time the content is of a more general or more abstract nature than the object represented (e.g. a sparrow can get represented as "bird"). It is important to note that representational objects need not be physical objects. A representational object can just as well be a situation or an event, such as an action. The fourth and final element of a representation is a user. The user is the system or process that uses the representation to guide its behavior. In the case of mirror neurons, the user is likely to be another brain system. For a full understanding of the functionality of the mirror neuron system, one has to specify the user of the information that mirror neurons are supposed to carry. Yet, in most models of the working of mirror neurons the user remains unspecified. We will therefore analyze mirror neuron representations using just the vehicle, content and object aspects.

Single-cell recording experiments are based on what we have elsewhere called a vehicle-first approach (Uithol et al., 2008): one starts with a vehicle, in this case a neuron, and then tries to identify the type of stimuli the vehicle covaries with (viz. the neuron responds to), thereby establishing a characterization of its content. Not all mirror neurons are equally selective in their responses. This has led Gallese et al. (1996) to discriminate three categories of mirror neurons: strictly congruent, broadly congruent and non-congruent mirror neurons. Strictly congruent mirror neurons respond to observed and executed movements that correspond both in terms of general action (e.g. grasping) and in the way that action was executed (e.g. precision grip). During action observation, the object of the representation is the movement of the experimenter or of another monkey.
During action execution, the object is the movement of the monkey. The content of the neuron's firing is assumed to be the shared feature of the two events that the neuron responds to, in this case the particular action with a particular grip (e.g. a grasp with a precision grip). When the motor and perceptual object share a common feature that gets reflected in the activity of the vehicle, the neuron is said to 'mirror'.

Raising levels of abstraction

Each type of broadly congruent mirror neuron responds to a variety of grips or actions (Gallese et al., 1996), and consequently no commonality in the response profile can be found at the level of grips. For instance, broadly congruent neurons of group 1 are highly specific to motor activity in terms of action and specific type of grip (e.g. a precision grip), but respond to the observation of various types of grips (e.g. both a precision grip and a full hand grip). See Table 1 for the various types of broadly congruent mirror neurons, their response profiles, and the lowest common property in the motor and visual response profile. Although it is not possible to specify a shared property on the level of grips, congruence can be found one level up, at the level of actions. Here the response profile is equally specific on the motor and perception side. The key property that mirror neurons owe their name to—the fact that the common property of a motor and a perceptual event gets reflected in the activity of one vehicle—can be preserved, but only by moving the description of the shared property from the level of grips up to the level of actions. The representational content attributed to this neuron can then be formulated as 'grasping with the hand'. In a similar vein, broadly congruent mirror neurons of group 2 can be taken to be congruent (and thereby representational) on the level of categories of actions, for instance hand actions versus non-hand actions. Neurons of group 3, in turn, can be considered congruent on the level of action goals, as these neurons appear to respond to the goal of an action and to be indifferent to the means by which this goal is achieved.
Non-congruent mirror neurons seem to show no clear-cut congruency between the observed action and the movement of the monkey. Hence, at first sight no common property seems available in their response profile. However, when the level of abstraction is raised to the level of object-related versus non-object-related actions, this neuron can be considered congruent and representational again, as mirror neurons only respond to object-related actions, and not to, for instance, mimed actions. The representational content of this type of neuron can thus be characterized as object-related actions. So even non-congruent mirror neurons can be made congruent by choosing the appropriate level of description.

Table 1. The various mirror neurons, their response profiles (m = motor, v = visual) and their lowest common property in the motor and visual response (i.e., their attributed content).

type of mirror neuron       | response profile                                  | lowest common property in motor and visual profile
non-congruent               | m: various actions; v: various actions           | object-related actions
broadly congruent, group 3  | m: specific action; v: various actions           | specific goals (e.g. grasping to eat)
broadly congruent, group 2  | m: specific hand action; v: various hand actions | specific category of actions (e.g. hand actions)
broadly congruent, group 1  | m: specific grip; v: various grips               | specific action (e.g. grasping with a hand)
strictly congruent          | m: specific grip; v: specific grip               | specific grip (e.g. grasping with a precision grip)

In sum, neurons can be made to mirror—in the sense of representing a common property of a motor and a perceptual event—by invoking levels of description of an increasing abstraction. As there exists an almost unlimited number of levels of abstraction, representational content can be attributed to any neuron that responds to both executed and perceived actions. It must be emphasized that this is not a problem only for broadly congruent or non-congruent mirror neurons. The same interpretational principles can render a neuron congruent or incongruent anywhere along the continuum. This problem of unbounded content attribution undermines the explanatory value of the notion of mirroring. By raising the level of abstractness, one strays from the rock bottom connotation of mirroring, making it increasingly difficult to see how highly abstract properties can be 'reflected directly'. This problem can be overcome by imposing some principled restrictions on the level of abstraction at which mirroring can rightfully be said to occur. An analysis of the processes that are attributed to the MNS can offer such principled restrictions.
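The interpretational move just described can be made explicit in a small sketch; the ladder of descriptions and the response profiles below are invented stand-ins for the categories in Table 1, not data. Given any pair of motor and visual response profiles, one can always climb to some level of description at which a single shared label appears, which is exactly the unbounded content attribution at issue.

```python
# Sketch of the 'raising abstraction' move discussed above. The ladder of
# descriptions and the example profiles are invented for illustration.

# Each concrete action maps to increasingly abstract descriptions.
LADDER = {
    "precision grip": ["precision grip", "hand grasp",   "hand action",  "object-directed act"],
    "full-hand grip": ["full-hand grip", "hand grasp",   "hand action",  "object-directed act"],
    "mouth grasp":    ["mouth grasp",    "mouth action", "ingestion",    "object-directed act"],
}

def lowest_common_description(motor_profile, visual_profile):
    """Return the least abstract description shared by all responses."""
    n_levels = len(next(iter(LADDER.values())))
    for level in range(n_levels):
        labels = {LADDER[a][level] for a in motor_profile | visual_profile}
        if len(labels) == 1:          # a single shared label: call it the 'content'
            return level, labels.pop()
    return None

# A 'broadly congruent' case: specific motor response, varied visual responses.
print(lowest_common_description({"precision grip"}, {"precision grip", "full-hand grip"}))
# (1, 'hand grasp')
print(lowest_common_description({"precision grip"}, {"mouth grasp"}))
# (3, 'object-directed act')  -- some shared label can always be found high enough
```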
Action understanding in the mirror neuron system

The human MNS is assumed to consist of the rostral part of the inferior parietal lobule, the lower part of the precentral gyrus and the posterior part of the inferior frontal gyrus (Rizzolatti & Craighero, 2004). This mirroring system is supposed to facilitate action understanding, goal understanding and imitation by means of a mirroring process (Iacoboni et al., 1999; 2005; Rizzolatti et al., 2001). Although the nature of the mirroring process is still largely unknown, some claims about features of this process can be found in the literature. The general idea is that perceptual representations of the observed action are mapped to motor representations of the observer's own action repertoire.3 Importantly, the process is assumed to be direct (i.e., the mirror neuron representation is brought about without the involvement of higher, inferential processes, but by means of direct coupling, direct activation, direct association), or otherwise computationally simple. For example, Rizzolatti and Craighero (2004) write: "The proposed mechanism is rather simple. Each time an individual sees an action done by another individual, neurons that represent that action are activated in the observer's premotor cortex. […] Thus, the mirror system transforms visual information into knowledge." Similarly, Iacoboni (2008) writes: "[w]e do not have to draw complex inferences or run complicated algorithms. Instead, we use mirror neurons."4

3 The activity of the mirror neuron system is often described as a form of resonance. This resonance is claimed to be either interpersonal, i.e. between parts of the premotor system of the observer and of the executor, or intrapersonal, i.e. between a visual and a motor representation in the observer. See Uithol et al. (2011b) for an elaboration on this distinction.
4 Strictly speaking, this statement reflects a category mistake in the sense that "we" can use mirror neurons. One may wonder what the "we" consists of if neurons are not part of it. However, we do not want to elaborate on this in this paper along the lines of Bennett & Hacker (2003). Rather, we see this statement (and many similar others) as a 'rough and ready' type of description that could be formulated more appropriately (e.g. by speaking of mirror neurons that implement our "capacity to…") when the occasion requires it.

Despite the general agreement on the simplicity of the mirroring process, there is diversity in the field when it comes to the capacity for abstraction of the mirroring process. It is claimed that the mirroring process produces representations of actions and action means (Buccino, Binkofski, & Riggio, 2004a; de Vignemont & Haggard, 2008; Fadiga, Craighero, & Olivier, 2005; Rizzolatti & Craighero, 2004), but in other places the scope of possible MNS output has been expanded to incorporate representations of the intentions behind actions (Gallese, Keysers, & Rizzolatti, 2004; Iacoboni, 2008; Iacoboni et al., 2005). For example, Rizzolatti & Sinigaglia (2010) claim that "through matching the goal of the observed motor act with a motor act that has the same goal, the observer is able to understand what the agent is doing". The range in abstraction attributed to the output of the mirror neuron system exceeds the one depicted in Table 1 for individual mirror neurons (from action means to immediate action goals). We will argue that, on theoretical grounds, it is implausible that a direct or otherwise simple process has the capacity for reliably producing representational content at or above the level of action goals. In order to do so we will have to make minimal assumptions about what could possibly be meant by 'processing' in the context of mirroring. For our purposes it will suffice to assume that by processing one means a form of 'computation' in the broad sense of the word (Chalmers, 1995; Eliasmith, 2010; Piccinini, 2008). This would include non-traditional and non-symbolic forms of computation—such as the various forms of neural network computations—but it would exclude hypothetical mechanisms with presumed computational processing powers that have no possible physical implementation (see also Frixione (2001), Tsotsos (1990), and van Rooij (2008)).

Goal inference is context dependent

The recognition of an action alone is not sufficient for a reliable goal inference, as multiple goals can be achieved by a given action (e.g., picking up a cup for drinking, pouring, or cleaning up) (de Vignemont & Haggard, 2008; Jacob & Jeannerod, 2005). Also, multiple actions can be performed to achieve a certain goal (the goal to drink by grasping a cup, ordering a beer, or opening the tap). In other words, there is not a one-to-one mapping between actions and goals, but a many-to-many mapping. Therefore, goal-action associations alone cannot produce a unique goal when observing a given action. Which goal can be reached by an action is dependent on the context in which the action is performed. For instance, hand waving can be a means to shooing away mosquitoes as well as to making a taxi stop. It is the context of the action—the presence of taxis or mosquitoes, an urban environment or a campground—that leads to a different interpretation of the observed action.
So in order to reliably infer goals from observed actions, both the action and the context must be processed (De Ruiter, Noordzij, Newman-Norlund, Hagoort, & Toni, 2007; Jacob & Jeannerod, 2005; Kilner, Friston, & Frith, 2007a; Toni, Lange, Noordzij, & Hagoort, 2008; van Rooij, Haselager, & Bekkering, 2008). This context-dependency of goals is also present in Iacoboni and colleagues' tea-cup experiment (Iacoboni et al., 2005), where grasping a cup from a neatly set table was used to suggest the goal of 'drinking', and grasping it from a messy table to suggest 'cleaning up'. Typical mirror areas (the posterior part of the inferior frontal gyrus and the adjacent sector of the ventral premotor cortex) were shown to be more active when an intention could be inferred from one of these contexts than in cases of no context, which the authors took as evidence that a mirroring process is responsible for performing the context-dependent inferring of goals (Iacoboni, 2008; Iacoboni et al., 2005). We argue, however, that although it might be the case that parts of the inferior frontal cortex are involved in context-dependent goal inference, it does not appear likely that this is done by means of a mirroring mechanism. The degree of context understanding required for reliable goal inferences seems, given the computational complexity of this task, to exceed by far the abilities of a direct or otherwise simple mechanism.

Goal inference is 'non-direct' and 'complex'

Goal inference above the level of immediate goals cannot be based on a direct or a simple mechanism. Although it remains somewhat implicit what most researchers mean by direct, we will argue that it cannot mean that sensory representations are mapped onto the motor system without significant aid of other, more inferential processes. Such a direct mapping mechanism would need access to a mapping structure in which each possible action-context combination maps directly to a unique goal. But given the number of possible actions and the number of possible relevant context aspects, this strategy would, due to what is called a combinatorial explosion, very soon result in an unmanageable number of mappings. To illustrate, consider that there are, say, just 35 possible context features5 (e.g., that the person in a white coat is a surgeon (or a psychopath), the scalpel is sterile (or poisoned), the setting is a hospital (or a movie set), the person being cut is a patient (or an actor), the other people in the room are nurses (or medical students), etc.) and there are several different goals (e.g., grasping the scalpel to cure, to hurt, to cut wire, to clean, to put away, to give to a nurse, etc.). If we make the simplifying assumption that any context feature is either 'present' or 'absent', then there already exist 2^35, that is more than 34 billion, distinct possible contexts (a number of the same order as the number of neurons in the entire human brain). Allowing values in between 'present' and 'absent' only serves to increase the number further. This unmanageable number of combinations makes it impossible to solve goal inference using a direct mapping solution.

5 Evidently, real world situations can vary in many more than just 35 features, but to make our case we do not need to assume any more such features.
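The arithmetic behind this explosion can be checked directly; the short snippet below uses the 35 binary features assumed in the example and shows how quickly the lookup table grows as features are added.

```python
# Size of a direct (action, context) -> goal lookup table under the assumptions above.
n_features = 35                      # binary context features, as in the example
n_contexts = 2 ** n_features
print(f"{n_contexts:,}")             # 34,359,738,368 distinct contexts

# Every additional binary feature doubles the table; graded features are worse still.
for extra in (5, 10, 15):
    print(n_features + extra, "features:", f"{2 ** (n_features + extra):,}")
```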
Rather, the ease with which people infer goals suggests that action-to-goal mappings must be 'non-direct' or 'inferential', in the sense that they use some form of knowledge about the interaction between actions and contexts to infer plausible goals.

Besides statements about mirroring being a direct process, claims that context-sensitive goal inferences could be achieved by a simple mechanism can also be found in the literature (Iacoboni, 2008; Rizzolatti & Craighero, 2004). There are, however, reasons to believe that no simple mechanism can support a general capacity for making context-sensitive goal inferences, as these inferences belong to a class of inferences that are known to be notoriously difficult. Inferring goals from actions is a form of abduction, also called inference to the best explanation (Baker et al., 2008; Charniak & Goldman, 1993; Haselager, 1997). In abduction, causes are hypothesized to explain observed effects. In the case of goal inference, the cause is a 'goal' and the observed effect is an 'action'. Existing models of abduction of a reasonable generality all belong to the class of so-called computationally intractable (or NP-hard) functions (Abdelbar & Hedetniemi, 1998; Bylander, Allemang, Tanner, & Josephson, 1991; Eiter & Gottlob, 1995; Garey & Johnson, 1979; Nordh & Zanuttini, 2005; 2008; Thagard & Verbeurgt, 1998; van Rooij, 2008). These are functions strongly conjectured by mathematicians to defy efficient computation by any physically implementable mechanism (see Garey & Johnson (1979), and van Rooij (2008) for details). This thus suggests that it is unlikely that a reasonably general capacity for making goal inferences is based solely on a mechanism that qualifies as 'simple'.

Taking complexity into account

Due to the tractability issues with abduction in its general form, it seems unlikely that humans can perform completely domain-general goal inferences. Instead, when trying to figure out the goal behind an action, humans may be performing this task against the background of a restricted domain of situations. This restricted domain must still be quite general if it is to account for the variety of situations in which humans can infer goals. At the same time it needs to be sufficiently constrained in nature to allow for tractable goal inference. At present it is unclear how tractable models of abduction can be formulated without rendering their domain of application too simple for modeling real-world domains (see e.g. Nordh & Zanuttini, 2005; 2008). It is noteworthy that computational models of goal inference in cognitive science that seem to work (i.e., that make plausible goal inferences without running into tractability issues) severely restrict the possible contexts and the number of possible actions and goals, keeping their application domain far removed from realistically complex situations (see Baker, Tenenbaum, Saxe, & Trafton, 2007; Cuijpers, Van Schie, Koppen, Erlhagen, & Bekkering, 2006; Erlhagen, Mukovskiy, & Bicho, 2006; Oztop, Wolpert, & Kawato, 2005). For example, Baker et al. (2007) modeled goal inferences made by an observer viewing a point moving in the flat plane to one of three possible goal states, and Oztop et al. (2005) modeled goal inferences made by an observer viewing a reaching movement in the flat plane to one of eight possible goal states, and a grasping movement with three possible goals. The availability of such successfully predictive, albeit highly restricted, models seems to lead to an underestimation of the computational complexity inherent in more general domains.
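For contrast, the sketch below shows how trivial goal inference becomes once the domain is restricted in the way these models require. It is a toy Bayesian illustration written for this purpose, not a reimplementation of Baker et al. (2007) or any other cited model; the goals, contexts and probabilities are invented. The easy inversion over three goals says nothing about tractability in realistically open-ended domains.

```python
# Toy goal inference in a severely restricted domain (illustrative numbers only).
# With three candidate goals and two hand-coded contexts, Bayesian inversion
# from an observed action to a goal is trivial; scaling this up to open-ended
# contexts is exactly where the intractability discussed above sets in.

PRIOR = {"drink": 0.5, "clean up": 0.3, "pour": 0.2}

LIKELIHOOD = {  # P(action | goal, context), hand-specified for one action
    ("grasp cup", "neatly set table"): {"drink": 0.8, "clean up": 0.1, "pour": 0.4},
    ("grasp cup", "messy table"):      {"drink": 0.2, "clean up": 0.9, "pour": 0.1},
}

def infer_goal(action: str, context: str) -> dict:
    """Posterior over the (restricted) goal set, by Bayes' rule."""
    unnormalised = {g: PRIOR[g] * LIKELIHOOD[(action, context)][g] for g in PRIOR}
    z = sum(unnormalised.values())
    return {g: round(p / z, 2) for g, p in unnormalised.items()}

print(infer_goal("grasp cup", "neatly set table"))  # 'drink' dominates
print(infer_goal("grasp cup", "messy table"))       # 'clean up' dominates
```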
A possible way of dealing with the limitations of a direct-matching mechanism, and the fact that it cannot account for goal inference above the level of immediate goals, could be to "upgrade" the notion of mirroring by supplying it with more inferential capabilities. This upgraded process might thereby be capable of explaining a substantial part of goal inference after all. A problem with such an approach, however, is that calling this complex, unknown form of processing a 'mirroring process' does not provide additional explanatory value over calling it just processing. Moreover, it further increases the risk of underestimating the complexity of the computational problem underlying goal inference.

Once we recognize the difficulties inherent in making context-sensitive goal inferences, we can better address the question of how 'mirroring' can contribute to an explanation of goal inference. To draw a parallel: line detection is likely an important sub-process of visual perception, yet we do not think of object recognition as a case of mere line detection. Likewise, mirroring may be an important sub-process in goal inference, but we would not do well to assume that goal inference is a case of mere mirroring. In this light it is informative that computational models that set out to connect to neuroanatomy place the major burden of processing context in goal inferences outside the MNS (Kilner, Friston, & Frith, 2007b; 2007a; Oztop, Kawato, & Arbib, 2006). This separation between context processing and mirror neuron processing makes sense given that context is such a multimodal and multifaceted construct; it is unlikely that a mirroring process is by itself capable of processing context in all its complexity.

These considerations can help to interpret the firing characteristics of mirror neurons, in the sense that when the attributed content becomes increasingly abstract, it becomes less plausible that the mirror neuron is the actual functional unit that is playing the key role in the cognitive function one is interested in. (These limitations, however, are as such not enough to provide a solid and undisputable bound for content attribution to the firing of individual mirror neurons. That would require a deep understanding of the functioning of the MNS at the neuronal level, as well as taking an ontological position on the nature of representation, which would be beyond the scope of this article.) Instead, at higher levels of abstraction, such as higher-level action goals or intentions, it becomes increasingly plausible that less direct (more inferential) cognitive processes influence the response profile of mirror neurons.

An intuitive response to our analysis is that we do not take the role of expectations into account. In our daily lives we are familiar with many different contexts and the appropriate actions in those contexts. Such associations could result in an expectation or prediction of what the goal of an observed action is, which can help goal inference on many occasions. Indeed, most of the time a reaching action has grasping as its goal, and most grasped cups are grasped for drinking. But it is important to realize that such default associations can only predict one goal given a certain action or context–action pair. People can easily ignore such default interpretations when necessary (say, when the waiter is grasping your empty cup).
It is precisely this capacity to sometimes use a default association and sometimes use another, context-sensitive process that makes direct matching an insufficient mechanism to account for our general capacity to infer goals from actions.

Another objection to our arguments could be that there are data available that do show mirror neurons representing higher intentions. For instance, Fogassi et al. (2005) found mirror neurons in the monkey's Inferior Parietal Lobule that responded selectively for different intentions underlying the same actions. Monkeys were trained to grasp a piece of food and either place it in a container on their shoulder, or eat it. Some neurons responded differently for these two intentions. Importantly, in some neurons this difference in firing was preserved when the monkeys observed the experimenters perform the same actions. Fogassi et al. take this as evidence that mirror neurons are in fact capable of mirroring intentions. At first sight these data seem in contradiction with our analysis, as we argued that a mirror mechanism cannot yield an action description at the level of intentions. However, as is pointed out by (among others) Csibra (2007), from the fact that this activity is indicative of intention understanding, one cannot conclude that it is constitutive of intention understanding. A mechanism in which low-level mirror processes are modulated by other, context-sensitive processes would produce similar neuronal responses in these neurons. The finding of neurons with mirror properties in medial frontal and temporal cortices outside the classical mirror neuron area (Mukamel, Ekstrom, Kaplan, Iacoboni, & Fried, 2010) points in the direction that other regions are also involved in action recognition. Also, the activation of non-mirroring areas upon interpreting actions found in imaging studies, such as occipital, posterior parietal and frontal areas (Calvo-Merino, Glaser, Grezes, Passingham, & Haggard, 2005; Grezes & Decety, 2001; Iacoboni et al., 2005), hints at the modulating role of these brain systems. When Rizzolatti and Sinigaglia (2010) speculate on mirror neurons' capacity to generalize between various effectors and actions, some of which fall outside the observer's own motor repertoire, they too recognize the necessity of the aid of non-mirroring processes in action understanding.

We think that, in addition to characterizing the nature of the mirrored motor representations (de Vignemont & Haggard, 2008), unraveling this interaction between different brain parts will do more to address the problem of how people are capable of understanding observed actions, or of inferring the goals those actions serve, than subsuming complex processes under the label of 'mirroring'.

Conclusion

Research based on single-cell recordings is hindered by a conceptual indetermination that allows for content attributions of unbounded abstraction, troubling the interpretation of the data from these experiments. However, research into the MNS as a whole, together with an analysis of the computational complexity of the attributed task, can help to restrict the class of events that can be said to be mirrored. We have argued that, on theoretical grounds, a direct or otherwise simple (mirror-like) process cannot be used to infer action goals, as the context-dependency of goals defies a simple or direct solution to the task of goal inference. We therefore propose to restrict the use of the term mirroring to describe a simple reflective mechanism that is involved in relatively low-level action observation and recognition, such as the grips of basic actions. This restriction can help in interpreting the data acquired by means of single-cell recordings. It is of course obvious that we, humans, are very well capable of attributing intentions from observed actions, in spite of the presumed complexity of this task.
Explaining how we solve or evade the apparent computational intractability inherent in context-dependent goal inference is a major question for cognitive neuroscience. However, we think that answering this question is not helped by heaping together potentially complex processes under the label of 'mirroring'. Instead, much work in various areas of cognitive science needs to be done before this question can be answered. How good, exactly, are we at goal inference? In what circumstances do we make mistakes? In what way is context relevant for goal attribution? What restrictions apply to the action domain that make goal inference computationally tractable enough for humans in everyday life? Answers to these questions can guide future theory development on goal inference. It is only when the complexity of the task is appreciated to the full extent that we can expect to gain insight into how goal inference is achieved by brain mechanisms and how mirroring contributes to these mechanisms.

abstract

The discovery of mirror neurons in monkeys, and the finding of motor activity during action observation in humans, are generally regarded as supportive of motor theories of action understanding. These theories take motor resonance to be essential in the understanding of observed actions and the inference of action goals. However, the notions of 'resonance', 'action understanding' and 'action goal' appear to be used ambiguously in the literature. A survey of the literature on mirror neurons and motor resonance yields two different interpretations of the term resonance, three different interpretations of action understanding, and again three different interpretations of what the goal of an action is. This entails that, unless it is specified which interpretation is used, the meaning of any statement about the relation between these concepts can differ to a great extent. By discussing an experiment we will show that more precise definitions and use of the concepts will allow for better assessments of motor theories of action understanding and hence a more fruitful scientific debate. Lastly, we will provide an example of how the discussed experimental setup could be adapted to test other interpretations of the concepts.

This chapter has been published, in a slightly modified form, as: Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). Understanding motor resonance. Social Neuroscience, 6(4), 388–397.

three
motor resonance
mirror neurons, action understanding, theory of mind, goals

Introduction

The discovery of mirror neurons in macaque monkeys (Di Pellegrino et al., 1992; Gallese et al., 1996; Rizzolatti et al., 1996) has generally been greeted as support for the idea that motor areas play an essential role in understanding observed actions and the inference of the pursued goals of these actions, as these neurons fire upon both observing and executing actions, leading to the idea that the observer simulates the observed action (Gallese & Goldman, 1998).
This suggestion was further backed up by the finding that the human motor system becomes activated during action observation (Buccino, Binkofski, & Riggio, 2004a; Buccino et al., 2001; Fadiga et al., 2005; Rizzolatti & Craighero, 2004). Due to the supposedly direct and non-inferential character of this process, this phenomenon is often referred to as "motor resonance".

Ever since the discovery of mirror neurons many fascinating findings have been reported. However, the explanatory power of mirror neurons regarding action understanding has fallen out of step with the continuous stream of experiments and accompanying findings. Theories on the mirror neuron system (MNS) and motor resonance have recently received criticism (Dinstein et al., 2008; Hickok, 2009; Jacob & Jeannerod, 2005). The general purport of this criticism is that mirror neurons cannot account for certain experimental findings (Hickok, 2009; Saxe, 2005a; 2009), or that the generalization from monkey data to the human mirror neuron system is not warranted (Dinstein et al., 2008; Lingnau et al., 2009). Also, theoretical concerns about the limitations of action understanding by means of direct matching have been raised (Csibra, 2007; Jacob & Jeannerod, 2005; Uithol, van Rooij, Bekkering, & Haselager, 2011a).

It is not the purport of this paper to review the extensive body of research on mirror neurons and to argue for a specific framework in which the experimental findings are best explained. To a large extent we will remain neutral on these matters. Instead, we will show that the ongoing discussion often makes use of imprecise terminology. Due to the use of ambiguous concepts on both sides, the discussion between proponents and critics of motor resonance-based theories of action understanding advances only with great difficulty. By means of a careful analysis of the concepts of 'motor resonance', 'action understanding' and 'action goals', we aim to clarify the troubled debate around motor theories of action understanding and the role mirror neurons play.

Table 1. The possible interpretations of resonance, action goal and action understanding, as found in the literature.

resonance
  intrapersonal: resonance between visual and motor areas (example: a visual representation of grip type is propagated to motor areas)
  interpersonal: resonance between observer and executor of an action (example: both observer and executor have a representation of the grasp action in motor areas)

action goal
  action: an action of higher abstraction than the observed action (example: drinking)
  object: the object at which the action is directed (example: a cup)
  world state: a desired world state that can be achieved by the action (example: a full cup of coffee)

action understanding
  action recognition: recognition of the observed action (example: recognize the action as grasping)
  goal recognition: recognition of the goal of an action (example: recognize a grasping action as serving drinking)
  action anticipation: generation of a response to an observed action (example: prepare a grasping action when offered a cup)

The notion 'motor resonance' appears to be used ambiguously in the literature on the MNS. At least two fundamentally different interpretations of the notion of resonance are used in neurocognitive explanations of the MNS, which we will call intrapersonal and interpersonal resonance. Each interpretation has different elements taking part in the resonance process.
Next we will show that three qualitatively different interpretations can be found of what the goal of an action is: the goal as a more abstract action, the goal as a graspable object, and the goal as a desired world state. We will discuss these three interpretations. Finally, we will show that the notion of action understanding can describe three different cognitive functions, which we will label action recognition, goal recognition and action anticipation. An overview of the different interpretations and our terminology is shown in Table 1. The interpretations will be discussed in detail below. It is important to note that none of these interpretations is in itself right or wrong, or better than another one. As long as it is specified what is precisely meant by a notion, any of the interpretations is valid and could fulfill a role in theories on action understanding.

A consequence of this variability in interpretations is that the exact meaning of any claim about motor resonance, action goals and action understanding that does not specify which of the interpretations of these notions is used can vary to a great extent. A careful analysis of these claims allows for a better interpretation of theories about the neurocognitive matching mechanisms underlying action observation and action execution, and can help guide the design of future experiments. We will discuss an existing experiment from the literature, Umiltà et al.'s (2001) mirror neuron paper, as a case study and illustrate how the experimental data and the interpretation of them have diverged as a result of the abovementioned indeterminacy of terminology. As an indication of the empirical applicability of the distinctions we propose, we will finish by presenting a concrete suggestion for how this study could be adapted so that other interpretations of the concepts presented in Table 1 can be tested.

resonance

In the literature on the MNS the notion of resonance is used to describe the activation of the motor system during action observation. The notion is adopted from physics and is used to describe the phenomenon that one (part of a) system oscillates at the same frequency and in the same phase as another (part of the) system. In the neurocognitive domain it is not claimed that the motor system is literally resonating in the sense that premotor neurons are firing at the same frequency and phase as neurons in other areas (we will come to the question of what areas soon). These claims should thus not be read as claims about neural synchrony (Damasio, 1989; Ward, 2003) or neural oscillation (Fries, 2005). Instead, a more liberal sense of the notion is usually adopted. Rizzolatti et al. (2001) write: "we understand actions when we map the visual representation of the observed action onto our motor representation of the same action". Elsewhere (Rizzolatti & Craighero, 2004), they explain: "The proposed mechanism is rather simple. Each time an individual sees an action done by another individual, neurons that represent that action are activated in the observer's premotor cortex. […] the motor 'resonance' translates the visual experience into an internal 'personal knowledge'." This process is often characterized as a form of simulation, in which the observer simulates the observed motor act in order to understand it (Decety & Grezes, 2006; Gallese & Goldman, 1998).
When examining the literature on mirror neurons and action understanding, two different meanings or interpretations of the notion can be discovered, each having different elements participate in the resonance process. We will call these two interpretations intrapersonal resonance and interpersonal resonance.

In the intrapersonal interpretation of resonance, it is claimed that the motor system of the observer of an action resonates with her own perceptual system, so both brain areas taking part in the resonance process lie within the same person. Examples of this kind of use can be found in, for example, Rizzolatti et al. (2001; 2004), Buccino et al. (2004a), and Hommel (Hommel, 2003; Hommel et al., 2001). The idea is that the observation of an event leads to a representation in the perceptual system of the observer. This perceptual representation is thereupon propagated to the motor system. When the perceived event is an action and a matching motor representation is available, the motor system resonates, similar to a tuning fork that starts to resonate when a note of the right pitch is played nearby (Jacob, 2009; Saxe, 2005b). Just as the resonance of the tuning fork can provide information about the pitch of the note played, the resonance of the motor system provides information about the action that is perceived. This is possible, according to the theory, because the resonance is specific for different actions. For example, at the observation of a certain grasping action, e.g., a precision grip, a motor representation corresponding to that specific grasping action is activated in the motor system. The observer "recognizes" the activity in her motor system as being a representation of the specific grasping action, and she thereby recognizes the observed precision grip action. As the coupling of a perceptual representation to a motor representation happens unmediated by higher cognitive processes, this theory is also known as the direct-matching hypothesis (Iacoboni et al., 1999; Rizzolatti et al., 2001; Rizzolatti & Sinigaglia, 2010). Figure 1 depicts the presumed causal chain from a motor plan in the executor to an action representation in the observer, and the place where intrapersonal resonance occurs. (It is still debated whether the final action representation—provided that such a representation exists—resides in motor areas, as embodied approaches to cognition argue, or whether there are disembodied representations of actions; here we choose not to take sides in this debate.)

Figure 1. The presumed causal path from action plan in the executor to action representation in the observer, and the location of intrapersonal resonance.

The strongest evidence for this theory comes from single-cell recordings in macaque monkeys. Neurons in inferior premotor areas were shown to fire selectively for different actions and action means, like precision and power grips, both performed and observed (Di Pellegrino et al., 1992; Gallese et al., 1996; Rizzolatti et al., 1996). This has led to the conclusion that these areas are involved in the recognition (and understanding) of actions.
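The direct-matching idea can be caricatured as a simple lookup: resonance occurs exactly when the observer's motor repertoire already contains a representation matching the observed kinematics, and otherwise the motor system stays silent. The sketch below makes that logic explicit; the repertoire entries, feature labels and function names are invented for illustration and make no claim about how the MNS is actually implemented.

```python
# A deliberately schematic caricature of direct matching: observation activates a
# motor representation only if a matching one already exists in the repertoire.
# All entries and feature labels are invented for illustration.

MOTOR_REPERTOIRE = {
    ("hand", "reach", "precision grip"): "motor representation: grasp small object",
    ("hand", "reach", "whole-hand grip"): "motor representation: grasp large object",
}

def observe(perceptual_features):
    """Direct matching: return the resonating motor representation, or None (silence)."""
    return MOTOR_REPERTOIRE.get(tuple(perceptual_features))

print(observe(("hand", "reach", "precision grip")))       # matching representation -> resonance
print(observe(("robot arm", "reach", "precision grip")))  # no matching representation -> silent
print(observe(("leaf", "falling")))                       # not an action at all -> silent
```

Note that nothing in this lookup is sensitive to context: the observed event either finds a matching motor representation or it does not.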
These monkey data were backed up by imaging data showing that the human motor system is activated differently upon observation of different actions (Buccino et al., 2001; Buccino, Binkofski, & Riggio, 2004a; Fadiga et al., 2005; Rizzolatti & Craighero, 2004). This theory can elegantly account for the finding that mirror neurons do not fire when the observed event is not an action (Gallese et al., 1996), or when the action is carried out by a non-biological effector (e.g. a robot arm) (Kilner, Friston, & Frith, 2007a; Tai, Scherler, Brooks, Sawamoto, & Castiello, 2004). Resonance occurs when a matching motor representation is available, so when the perceived event is not an action, or is an action carried out by a non-biological effector, there is no matching motor representation and the motor system remains silent. (There are experiments, such as Fogassi et al. (2005) and Umiltà et al. (2008), that show mirror neuron responses to tool-based actions, but this was only after extensive training with tools. A possible explanation is that, by training with tools, the monkey creates a motor representation of these actions.)

In a second interpretation, the notion of resonance is used to denote a functional correspondence between the states of the motor system of the observer and that of the executor of an action. This view is present in the work of, for instance, Decety & Grezes (2006), Gallese (2001), Jacob (2008), Fadiga (2005), de Vignemont & Haggard (2008) and Wilson & Knoblich (2005). As the two systems taking part in the resonance process are situated in two different persons, we will call this form of resonance interpersonal resonance. In the interpersonal interpretation of resonance, the notion is used in an even more metaphorical sense. It is assumed that there is a semantic or functional resemblance between the motor representation in the observer of an action and the motor representation in the executor of the action (e.g., both motor systems represent a grasping action at the same time). In a sense, the observer and the executor of an action share a representation (de Vignemont & Haggard, 2008). It is therefore stated that the observer's motor system resonates with that of the executor (Gallese, 2001; Gallese & Goldman, 1998; Goldman, 2009; Jacob, 2008; M. Wilson & Knoblich, 2005), or, more briefly, that the observer resonates with the executor (Fadiga et al., 2005). Figure 2 shows the presumed causal sequence from an action plan in the executor to a representation of that action in the observer. The two elements that take part in the interpersonal resonance are marked with an arrow.

Figure 2. The presumed causal path from action plan in the executor to action representation in the observer as presumed in motor theories of action understanding, and the two parts of the system that take part in interpersonal resonance.

Resonance in the interpersonal meaning is a higher-level description of the result of various processes from a motor representation in the executor to an activated motor system in the observer. It describes a resemblance between the two motor systems, and it can be established without making claims about the underlying mechanism. This is evident from Figure 2: the resonance process covers multiple causal steps that can be accomplished by various underlying mechanisms. This interpretation of resonance is not committed to specific mechanisms bringing about these steps. Usually a form of intrapersonal resonance is presumed to establish interpersonal resonance, but this is not necessarily the only option: an inferential process could also result in interpersonal resonance.
Setting goals

It is often claimed that motor resonance allows the recognition of not only the action as such, but also of the goal that is served by the action (Iacoboni et al., 2005; Rizzolatti et al., 2001; Rizzolatti & Sinigaglia, 2010). Yet, like the notion of motor resonance, the notion of goal allows for various interpretations. A survey of the literature on mirror neurons yields three qualitatively different interpretations of the goal of an action.

First, the goal of an action is often interpreted as another, less specific action that is abstracted from execution specifics. For example, Gallese et al. (1996) classify mirror neurons as broadly congruent when the neurons appear to be activated by the goal of the observed action, regardless of how it was achieved. An example of such a goal could be "grasping": grasping with a precision grip, grasping with a full-hand grip and grasping with the mouth all serve the goal of grasping. The goal-as-an-action interpretation is also present in the work of Ferrari (2005), Fogassi (2005) and Iacoboni (2005) and is dominant in the early papers on mirror neurons (Gallese et al., 1996; Rizzolatti et al., 1996). The fact that the goal of an action is itself another action is potentially problematic, as nearly every action can itself be said to serve a new, higher goal. To illustrate: the action "grasping a cup" can serve the goal "drinking". Thus conceived, drinking is an action goal. "Drinking", however, can also be considered an action, having "quenching thirst" or "engaging in social activity" as a goal. Quenching thirst serves the goal "maintaining homeostasis", which serves the goal "survival", and so on. There thus exists a continuum from concrete, readily observable events (the use of a precision grip) to highly abstract events (survival). (Besides actions and action goals, two more related notions can be found in the literature. An "action means" is a particular way of performing an action. Action means also lie on the same continuum as actions and goals, and can therefore, on different interpretations, also be actions themselves. The notion "movement" is often used to denote a movement that does not serve a goal (see for instance Gallese & Goldman (1998) or Hommel (2003)); an action thus conceived is a subclass of movements, i.e. those movements that serve a goal.) Although individual preferences may be possible, there seems to be no a priori level at which actions are located and a level at which action goals are located.

Umiltà et al. (2008) provide a clear example of goals and actions lying on the same continuum. Macaque monkeys were trained to use normal and reverse pliers to grasp objects. The researchers found that the same motor neurons that under normal conditions fire when an object is grasped also fire when the object is grasped with reversed pliers, which means that the hand needs to be opened to grasp the object.
This suggests that these motor neurons respond to the act of grasping (an action higher in the continuum) and not to the motor act of closing the hand (an action lower in the continuum). Although not discussed in the paper, it is not difficult to see how the grasping with pliers in turn serves actions of even higher abstraction, such as eating. Fogassi and his colleagues, for instance, found different responses in mirror neurons depending on whether the grasping action was part of an eating action or a placing action (Fogassi & Luppino, 2005). In all, because interpretations at all levels are possible, a clear indication of the level at which the analysis takes place can be helpful in interpreting the findings correctly.

A second interpretation of the goal of an action is a target object. It is this interpretation that has given us the term 'goal-directed action', meaning a transitive or object-directed action. (As noted above, the difference between a movement and an action is often taken to be that the latter serves a goal and the former does not. This would entail that every action serves a goal, making the term 'goal-directed action' a pleonasm under other interpretations of 'goal', as non-goal-directed actions cannot exist, only non-goal-directed movements.) Use of this interpretation can be found in, for instance, Umiltà et al. (2001), who state that "mirror neurons have to infer and represent the occluded specific action in addition to the inferred object, which is the goal of the action." (This quote illustrates how terminology can cause confusion. Apart from the personal/subpersonal violation, the claim that "mirror neurons infer" also departs from the initial claims that mirror neurons engage in direct reflection and that no inferential processes are needed. See Uithol et al. (2011a) for a more detailed discussion of direct reflection versus inferential processing with respect to mirror neurons.) This interpretation of goals is also often present in the early mirror neuron papers (Gallese et al., 1996; Rizzolatti et al., 1996), but also in later studies (Hamilton & Grafton, 2006). Similar to this is the interpretation of a goal being a location, for instance a cross on the desk (Wohlschlager & Bekkering, 2002) or the end location of an action (Bekkering, Wohlschlager, & Gattis, 2000). At other places, a goal as an object is contrasted with a goal as a location (Hamilton & Grafton, 2006; Woodward, 1998).

A third interpretation of goal is a desired state of the world. A possible state could be "a full cup of coffee", and several actions—picking up the coffee pot, transferring it to the cup, tilting the coffee pot, etc.—are needed in succession to reach that state. This interpretation can be found in, for example, Csibra & Gergely (2007), Grafton & Hamilton (2007) or Sebanz, Bekkering & Knoblich (2006).

These interpretations do not necessarily exclude each other. For example, "taking possession of an object" seems to have aspects of all three interpretations. First, taking possession can be viewed as an action that can be executed in different ways (grasping, ordering, buying). Second, and obviously, this action is directed towards an object. Third, taking possession of an object can be viewed as reaching a world state in which a certain object is in my possession (in my hands, my mouth, my stomach). In general, the difference between the interpretation of the goal as another action and the goal as a desired world state seems to be a matter of emphasis. Sometimes one of the interpretations is more natural or evident, sometimes the other.
For example, when one or two persons are carrying a table out of the room (Sebanz et al., 2006), it is generally not the action that one is interested in; it is a state of the world in which the table is located outside the room. In other cases, such as eating and drinking, it is not so much the world state that a person is interested in, but the action itself: the person enjoys the action of eating or drinking. Of course eating serves a purpose and is a mechanism by which a species acquires necessary nutrients. So in a way one could say that having the food in one's stomach is a desired world state, albeit often an unconscious one, but this seems a rather awkward way of phrasing a goal.

Notwithstanding the possible overlap, the differences can be crucial. The meaning of the claim that mirror neurons respond selectively to goals can differ to a great extent under the three different interpretations of 'goal'. For example, recognizing that an action is directed towards a cup and recognizing that this cup grasping contributes to getting a clean table are two quite different capacities that require different experiments for testing the nature of motor activation. As a consequence, experimental results that support a certain neuroscientific hypothesis (e.g. about neural mechanisms underlying goal understanding) under one interpretation of goal understanding do not automatically support that same hypothesis under other interpretations of goal understanding.

Fogassi et al.'s (2005) study on parietal mirror neurons provides a good example of an experimental setup where precise terminology is crucial. The researchers found mirror neurons in the monkey's Inferior Parietal Lobule that responded selectively for different intentions underlying the same actions. Monkeys were trained to grasp a piece of food and either place it in a container on their shoulder, or eat it. Some neurons responded differently for these two intentions. Importantly, in some neurons this difference in firing was preserved when the monkeys observed the experimenters perform the same actions. Because Fogassi and his colleagues use the unambiguous notions 'object' and 'intention' to denote the different interpretations of goal (although the latter is sometimes also referred to as goal), there is no confusion or conflation of the notion of goal here. However, if Fogassi and his colleagues had used the notion of goal in both the meaning of object and that of intention—as can be found elsewhere in the literature, as shown above—then a circular statement about goal recognition causing goal recognition would be the result. Hamilton & Grafton (2007) provide an illustration of all three uses of this notion.
In their introduction they discuss goals as being a desired world state (e.g. getting refreshment), they refer to goal-dependent mirror neuron firing in the meaning of a more abstract action, while their experiments are based on the object interpretation of goals. The authors themselves seem to be aware of the differences in interpretation when they write that "It is also important to note that the goals we have studied were defined by the identity of the object taken by the actor, contrasting between a 'take wine bottle' goal and a 'take dumb bell' goal. It remains to be seen if the same parietal regions encode other types of goal, for example manipulating the same object in different ways." Yet, the discussion of these other interpretations in the introduction, and the fact that they do not further specify their interpretation of goal throughout the paper, could easily entice other researchers into applying the results to the other interpretations as well. In the section entitled "Diverging concepts" we will discuss a case in which, upon systematic conceptual analysis, the original experimental setup no longer matches subsequent interpretations by other authors.

Understanding action

What is meant by 'action understanding' differs from paper to paper. The difficulty with the notion is that it consists of two elements, action and understanding, and the meaning of these elements is interdependent and open to different interpretations. To start with actions: we have seen that action means, actions and action goals can be placed on a continuum from specific, readily observable events (e.g. the use of a precision grip) to highly abstract events (maintaining homeostasis), and there seems to be no a priori way to make a clear-cut and objective contrast between action means, actions, and action goals. Despite the lack of a priori considerations for contrasting actions with goals in this interpretation of goals, it seems that the capacity to understand grip types differs to such an extent from the capacity to understand homeostasis that differentiation is necessary. With the mirror neuron literature in mind, we will limit the use of the notion 'action' to movements that exist in the here and now and that serve a goal, like grasps. We use the label 'goals' for actions more abstract than the observed one, in the sense that they are either non-visible (like "maintaining homeostasis" or "keeping to one's diet") or involve future actions ("grasping in order to clean up the table"; cleaning up the table might be a visible action, but it is not yet visible at the time of picking up a cup).

The fact that actions can be found along a broad continuum of increasing abstraction has consequences for the interpretation of 'understanding'. Understanding can mean recognition (i.e. a form of classification: "that's a precision grip"), but also recognizing the goal that is served by an action ("that's grasping to eat"). However, as we have just seen, what is considered to be an action and what the goal of an action is, is liable to interpretation. This makes the difference between recognizing an action and recognizing the goal of an action also a matter of interpretation. To stick with the drinking example: when "grasping a cup" is interpreted as an action, the goal of the action can be "to drink". So the action can be recognized ("that's grasping") or its goal can be recognized ("that's drinking"). When, however, we see drinking as an action, and quenching thirst as the goal of an action, then "that's drinking" is a matter of action recognition, and "that's quenching thirst" is understanding the goal of the action. Many authors seem to pitch their interpretation of action understanding somewhere along this continuum, but very few delimit or make their interpretation explicit. This makes it difficult to assess the exact claims that are made. For example, Rizzolatti and Craighero (2004) state that: "This automatically induced, motor representation of the observed action corresponds to that which is spontaneously generated during active action and whose outcome is known to the acting individual" [our italics]. Without specification, this 'outcome' can mean anything from a precision grip to maintaining homeostasis.
However, the claim that the MNS detects grip types is quite different from (and more modest than) the claim that the MNS is capable of detecting long-term goals or intentions. The two claims presume different capacities of the system and demand different tests to verify them.

Besides recognizing the action and recognizing the goal an action serves, a third interpretation is that understanding an action is "knowing how to respond appropriately to an observed action" (Gallese et al., 1996; Rizzolatti et al., 2001). For example, Rizzolatti et al. (2001) write: "By action understanding, we mean the capacity to achieve the internal description of an action and to use it to organize appropriate future behavior" [our italics]. So in addition to "the capacity to achieve the internal description of an action", which is in line with the first interpretation, this definition adds that it should be used for generating an appropriate response. Again, the different interpretations of action understanding refer to capacities that can differ to a large extent, so we will have to disentangle them. We will use the term action recognition when we mean the classification of an action and the ability to differentiate it from other actions. By goal recognition we mean the classification of the goal of an action. This goal can be an action more abstract than the movement that takes place in the here and now, as discussed above, or another interpretation of goal, as discussed in the previous paragraph. Knowing how to respond appropriately to an action we will call action anticipation. Table 1 presents an overview of these different interpretations.

To illustrate the empirical relevance of our conceptual discussion and terminological distinctions, we will analyze a well-known mirror neuron study by Umiltà and colleagues (2001) that produced fascinating results. We will show that a univocal interpretation of the experimental data is troubled by the use of indefinite terms. As a result, their data are often interpreted as supporting mirror neurons' involvement in forms of goal understanding, while, in our terminology, only action recognition is demonstrated.

Diverging concepts

Umiltà and her colleagues (2001) had monkeys watch grasping actions with the object to be grasped occluded from the monkey's sight. By means of single-cell recordings, they showed that the monkey's mirror neurons that normally respond to the observation of a certain action also respond when the final, crucial part of that action was hidden. This shows that the build-up to the action (e.g. the opening of the hand and the reaching towards an object) is enough to trigger the mirror neuron response, and that observation of the actual action (the grasping of an object) is not necessary. The authors conclude that these findings support the idea that the goal of an action can be recognized even when the monkey is provided with an incomplete percept of an action, provided that the monkey knew that there was an object behind the occluder. They subsequently conclude that their findings "further corroborate the previously suggested hypothesis that the mirror neurons' matching mechanism could underpin action understanding"; a conclusion that is subsequently adopted by others (Ferrari et al., 2005; Rizzolatti & Sinigaglia, 2010). However, interpretation of these findings is not straightforward.
We have shown that three different interpretations of both the notion 'action understanding' and the notion 'action goal' circulate (let alone the range of abstraction at which actions and goals can be formulated). Umiltà and colleagues showed that certain mirror neurons that fire upon observing a certain action also fire when the final part of the action was occluded. As the neuron fires exclusively upon viewing actions of this type, this is a form of what we would call action recognition: the recognition and classification of an action. Their interpretation of 'goal' is that of 'object', as becomes clear in sentences like "the inferred object, which is the goal of the action" (p. 161). So, when we rephrase their findings in our systematic terminology (see Table 1), this experiment shows that the recognition of an action is dependent on knowledge of the presence of a graspable object. This suggests that the monkey understands that the observed movement is grasping only when it knows that it is directed towards an object. This finding is in line with early mirror neuron studies (Gallese et al., 1996; Rizzolatti et al., 1996), which also found that mirror neurons did not respond to mimed actions (i.e. actions not directed towards an object). These studies show that mirroring in order to recognize actions involves more than mirroring the kinematic features, as these features are identical in mimed and object-directed actions, yet mimed actions do not evoke a mirror neuron response. However, the findings of this experiment cannot be used to draw conclusions regarding goal understanding, i.e. inferring the goal that is served by a certain action from observation of that action alone, as the data show that the presence of a goal in the object sense is a prerequisite for the recognition of the action.

So the tenability of the claim that these findings "further corroborate the previously suggested hypothesis that the mirror neurons' matching mechanism could underpin action understanding" depends on what is meant by both the "previously suggested hypothesis" and "action understanding". Regarding the first: support for the direct-matching hypothesis (Rizzolatti et al., 2001) is problematic. This hypothesis states that the visual representation of the observed action (i.e. the kinematic features of the movement) is mapped onto our motor representation of the same action and, when a matching motor representation exists, resonance occurs and the action is recognized. According to this hypothesis, action recognition thereby enables goal inference, as the observer of the action knows, from his own experience, which goal is (usually) served by the recognized action. When we try to explain Umiltà et al.'s data within the framework of the direct-matching hypothesis, we seem to run into some circularity: goal recognition is a prerequisite for action recognition, yet, according to the direct-matching hypothesis, action recognition is a prerequisite for goal inference.

In their 2010 paper, Rizzolatti and Sinigaglia have reformulated the direct-matching hypothesis. In this formulation, action mirroring is rendered as a dual-route process, with one route directly matching movements and the other mapping the goal of the observed motor act onto the observer's own motor repertoire. When these routes are genuinely parallel, action recognition is no longer a prerequisite for goal recognition; these two processes take place simultaneously and independently.
However, support for this revised direct-matching hypothesis is also problematic, and now it becomes crucial what is meant by action understanding. When action understanding is taken to mean action recognition, then these data can only provide support for half of the reformulated hypothesis. Umiltà and her colleagues found neurons that respond selectively to different actions, which can only support the already well-established part of the revised direct-matching hypothesis: the direct matching of actions. No evidence is provided for the second route: the direct matching of goals. When action understanding is taken to mean goal recognition, the findings cannot support the direct-matching hypothesis either, as only action recognition is established, and according to the revised formulation of the hypothesis action recognition does not underpin goal recognition: goal recognition takes place independently along a different route. In all, these findings seem more in line with competing hypotheses, such as Csibra's (2007) or Jacob's (2008), which state that action understanding is modulated by non-mirroring processes, such as the processing of the presence of an object.

Based on careful distinctions between terms, as made in Table 1, we have been able to reveal difficulties in the interpretation of data in the literature. We have given an example of how our conceptual work can help in analyzing existing data, allowing for a more precise match between empirical results and conceptual interpretations. Next we will show that this conceptual analysis can also help guide the design of new experiments in such a way that conceptual confusion can be prevented. As an illustration of one such possible experiment, we will discuss how Umiltà et al.'s (2001) experiment can be modified so that it can test a different interpretation of the concepts in Table 1.

Let us interpret 'action understanding' as 'goal recognition' and let us stick to the interpretation of the goal as an object. In that case, 'goal recognition' means 'recognizing what object an action is directed at'. One way to test mirror neurons' contribution to goal recognition in this sense is to identify mirror neurons that fire differently upon grasping actions towards different objects. This could be done by placing two objects instead of one behind the occluder, each demanding a different grip type (say an apple and a peanut, demanding a full-hand grip and a precision grip, respectively). When the monkey knows that only one of the objects is placed behind the occluder, and this object is approached with the wrong grip type, mirror neurons that fire for that grip type should remain silent, as this action cannot have the object behind the occluder as its goal. For example, suppose the monkey knows that there is only an apple behind the occluder, but observes a grasping action with a precision grip towards the occluder. When mirror neurons that respond only to actions performed with a precision grip remain silent (as they should when they fire selectively for different objects and there is no appropriate object behind the occluder), it could be considered further evidence that mirror neurons' firing characteristics are dependent on the object that an action is directed at. Failure to demonstrate the ability of mirror neurons to "recognize" the wrong grip for the object behind the occluder could be considered evidence against the idea that mirror neurons contribute to goal recognition when the goal is interpreted as the target object of an action.
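To spell out the predicted outcome pattern of this modified setup, the sketch below tabulates when a grip-selective mirror neuron should fire under the goal-as-object hypothesis described above. The objects, grips and the simple rule are illustrative assumptions, not a model of real neuronal data.

```python
# Predicted firing in the modified occluder experiment under the goal-as-object
# hypothesis: a grip-selective mirror neuron fires only if the observed grip both
# matches its preference and is appropriate for the object known to be behind the
# occluder. Objects, grips and the rule itself are illustrative assumptions.

GRIP_FOR_OBJECT = {"apple": "whole-hand grip", "peanut": "precision grip"}

def predicted_firing(preferred_grip, observed_grip, object_behind_occluder):
    grip_matches_preference = observed_grip == preferred_grip
    grip_fits_object = GRIP_FOR_OBJECT[object_behind_occluder] == observed_grip
    return grip_matches_preference and grip_fits_object

# A precision-grip-selective neuron, observed precision grip towards the occluder:
print(predicted_firing("precision grip", "precision grip", "apple"))   # False: wrong grip for the apple -> silent
print(predicted_firing("precision grip", "precision grip", "peanut"))  # True: appropriate grip -> fires
```

If the neuron fired in the first case as well, its response would track grip kinematics rather than the target object, which is the contrast the proposed experiment is designed to detect.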
In all, different interpretations of the concepts used in theories on action understanding demand different experimental setups. We have given an example of how Umiltà et al.'s (2001) experiment can be modified in such a way that other interpretations of the concept of action understanding could be tested. Other interpretations of action understanding and goal will each require a different setup, tuned specifically to the conceptualization and hypothesis that one intends to test.

Conclusion

The exact meaning of any statement involving action understanding, goal recognition and motor resonance can vary to a great extent, depending on the interpretation of the concepts used. In the cognitive neuroscience literature, it is often not explicated which of the multitude of possible interpretations is used. As a result, different sets of experimental data can be taken in mutual support of neuroscientific hypotheses, even though interpretations might diverge in ways that make the results in fact incompatible. By means of a careful conceptual analysis we aimed to disentangle the different possible interpretations of 'action understanding', 'action goals' and 'motor resonance'. The fine-grained distinctions we have proposed, exemplified in Table 1, allow for better interpretations of experimental data and more adequate design of experiments. We have shown that our proposed, systematic labeling scheme is empirically relevant in interpreting research data, by showing how the use of the scheme leads to a reinterpretation of existing experimental results in the cognitive neuroscience literature. Moreover, we have illustrated how our scheme can guide the design of experimental setups aimed at testing different interpretations of action understanding.

The systematic use of well-defined concepts is an important aspect of the constructive and fruitful analysis of experimental data. In this paper we performed a conceptual analysis to arrive at more precise and unequivocal definitions of the terms 'action understanding', 'action goal' and 'motor resonance', terms that are central to the cognitive neuroscientific study of action and perception. We hope to have shown that conceptual analyses such as the one performed in this paper are not mere theoretical exercises, but a constructive contribution to empirical cognitive neuroscience.

abstract

In analyses of the motor system, two hierarchies are often posited. The first—the action hierarchy—is a decomposition of an action into sub-actions and sub-sub-actions. The second—the control hierarchy—is a postulated hierarchy in the neural control processes that are supposed to bring about the action. A general assumption in cognitive neuroscience is that these two hierarchies are internally consistent and provide complementary descriptions of neuronal control processes. In this essay, we suggest that neither offers a complete explanation and that they cannot be reconciled in a logical or conceptual way. Furthermore, neither pays proper attention to the dynamics and temporal aspects of neural control processes. We will explore an alternative hierarchical organization in which causality is inherent in the dynamics over time. Specifically, high levels of the hierarchy encode slower (goal-related) representations, while lower levels represent faster (action and motor act) kinematics. If employed properly, a hierarchy based on this principle is not subject to the problems that plague the traditional accounts.
This chapter was published, in a slightly modified version, as: Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2012). Hierarchies in Action and Motor Control. Journal of Cognitive Neuroscience, 24(5), 1077–1086.

four
action hierarchies
motor control, action, intention

Introduction

In motor control it is common to think of actions as hierarchically structured: a goal is served by an action, which, in turn, is served by multiple sub-actions. For example, when I want a glass of milk from the fridge, I have to get up from my chair, walk to the kitchen, open the door of the fridge, grasp the box of milk, and so on. I get up by means of placing my hands on the armrests, bending forward, stretching my legs and pushing off. I place my hands on the armrests by means of stretching my arms, grasping the rests, etc. (similar to Newell and Simon's (1972) means-end structure in problem solving; see also Byrne & Russon (1998)). When the goal of getting a glass of milk is placed on top, and the other aspects of the action are arranged below it, a hierarchy appears. When going down the hierarchy, the tree gets wider (more elements on one level) while the elements become less abstract, down to the level of individual muscle movements.

A general assumption of cognitive science is that such action hierarchies are mirrored in the neural representations underlying them (Bechtel & Richardson, 1993; Botvinick, 2008). In other words, there are two hierarchies: an action hierarchy, describing the action; and a control hierarchy, describing the neural processes that are presumed to bring the action about. (To prevent confusion: in the literature on motor control, the notion 'action hierarchy' is used for a hierarchical structure both in the action and in the neural control of the action. Here we reserve the term for a hierarchical structure in the action or the behavior; posited structures in the neural control of an action we will call 'control hierarchy'.) Cognitive scientists assume either implicitly (Hamilton & Grafton, 2007) or explicitly (Botvinick, 2008) that these two hierarchies match. However, as Badre notes, "the fact that a task can be represented hierarchically does not require that the action system itself consist of structurally distinct processes" (Badre, 2008, p. 193), so this assumption should be subject to testing. But whether these two hierarchies are identical is only partly an empirical matter. Before experiments to test this assumption can be designed, some important conceptual issues need to be addressed.

There are multiple ways to construct a hierarchy, but two hierarchical structures seem prevalent in the literature on action and motor control: one is a hierarchy based on constitutional or part-whole relations between the elements, the other is structured around a causal influence between the elements. When describing the action hierarchy, typically a part-whole structure is presumed, while the control hierarchy is usually explained using a causal framework. However, we will show that these two structuring principles are in fact mutually exclusive, which suggests that the action hierarchy need not be similar to the control hierarchy. We will discuss empirical evidence that these two hierarchies are indeed dissimilar.
Below that, there is usually at least one level for ‘actions’ (Hamilton & Grafton, 2006) or ‘sub-goals’ (Hamilton, 2009) and the bottom level is often labeled ‘movements’ or ‘kinematics’. The exact labels of these levels may, of course, vary, as long as confusion is prevented. For reasons of clarity and consistency, we will call the elements on the highest level ‘goals’, the action features on lower levels ‘actions’, and we will call the elements on the lowest level ‘motor acts’2, as can be seen in Figure 1. 2 When the ideomotor terminology is adopted, an action is a movement that serves a goal (Arbib & Rizzolatti, 1997). The elements in a hierarchy serve a goal by deinition (otherwise they could not be accommodated in the hierarchy), so the elements on the lowest level cannot be ‘mere’ movements (i.e. not serving a goal), as they are sometimes referred to. Hence we choose the term ‘motor act’. action hierarchies in fact mutually exclusive, which suggests that the action hierarchy need not be similar to the control hierarchy. We will discuss empirical evidence that these two hierarchies are indeed dissimilar. The remainder of this paper is organized as follows. We will start by briefly elucidating the relation between actions and goals. Next we will discuss the two main structuring principles of hierarchies in the motor domain and argue that they are incompatible and dissimilar. As an alternative account we will discuss models that use different time scales for different control processes. In these models structures can be found that can be seen as hierarchically structured, but in a different and much more implicit form. This interpretation of a hierarchy is not subject to the problems that plague the irst two options, and might therefore be an interesting alternative for structuring elements in motor control. Understanding the nature of this hierarchical structure can guide empirical research into action control. 56 getting a glass of milk goals walk to the kitchen open fridge grasp milk box actions stand on left leg motor acts flex knee swing right leg move leg forward Figure 1. A typical action hierarchy with one goal level and multiple levels for actions and motor acts. Note that this hierarchy is far from complete; for every action, only a few subactions are shown. The idea of a hierarchical structure in actions has been applied both to action execution and action observation or action understanding. The rationale behind this dual application is that there is evidence that the same brain structures are used for action generation and action observation (see the extensive body of literature on mirror neurons (Rizzolatti & Craighero, 2004; Rizzolatti & Sinigaglia, 2010) and motor resonance (Fadiga et al., 2005; Uithol, van Rooij, Bekkering, & Haselager, 2011b)). Our analysis, however, is based mainly on claims about hierarchies in the execution of an action, but might have consequences for action observation as well. In order to get a better understanding of what is actually claimed when it is proposed that action production is structured hierarchically, it is useful to formulate an answer to two questions: (1) What makes one level higher than another (what is the variable on the vertical axis); and (2) what is the relation is between features on different levels (what do the lines between the hierarchy elements in Figure 1 portray)? By answering these two questions, we will be able to compare the two different accounts of hierarchical structures in the motor domain. 
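To make these two questions concrete, the hierarchy of Figure 1 can be written down as a simple nested structure. The sketch below is purely illustrative and is not taken from any of the models discussed in this chapter; the labels and the helper function are arbitrary choices. It fixes one possible reading of the two questions in advance, with nesting depth as the vertical axis and 'part of' as the relation portrayed by the lines, and the two structuring principles discussed next can be seen as two different answers to what such a structure corresponds to.

```python
# Purely illustrative sketch (not taken from any model discussed in this
# chapter): the Figure 1 hierarchy written down as a nested part-whole
# structure. Non-leaf labels such as "get a glass of milk" or "open fridge"
# are nothing over and above their parts; only the leaves name motor acts.
action_hierarchy = {
    "get a glass of milk": {            # goal level
        "walk to the kitchen": {        # action level
            "stand on left leg": {},    # motor acts (leaves)
            "swing right leg": {},
        },
        "open fridge": {
            "reach towards handle": {},
            "full hand grip": {},
            "pull": {},
        },
        "grasp milk box": {},           # left undecomposed, as in Figure 1
    }
}

def leaves(node):
    """Collect the lowest-level elements that jointly constitute a higher one."""
    collected = []
    for label, parts in node.items():
        collected.extend(leaves(parts) if parts else [label])
    return collected

# On a strictly part-whole reading, a higher-level label is exhausted by its
# parts: listing the leaves is all there is to "executing" the goal.
print(leaves(action_hierarchy))
```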
Part-whole relations

We can interpret a hierarchy as portraying part-whole relations between the elements. Each level of the hierarchy comprises a set of subsystems, which are themselves composed of smaller units. For example, the action 'getting milk' consists of 'walking to the fridge', 'opening the door', 'grasping the box of milk' and so on. 'Opening the fridge', in turn, consists of 'grasping the handle', 'pulling', and so on. See Figure 2 for an example of a part-whole hierarchy. In such a hierarchy, 'getting milk' does not exist apart from 'walking to the fridge', 'opening the fridge' etc.; it is composed of these action features. In other words: when there is the right kind of reaching, opening of the hand and closing of the hand (and a milk box, of course), there is 'grasping the box of milk'. Likewise, when all the actions in the hierarchy are present, the goal of 'getting milk' is present. In this case the vertical axis denotes constitutive complexity. The higher up the axis, the more subparts in total a certain action element has. The lines in Figure 2 portray a 'part of' relation.

Figure 2. A hierarchy, again simplified, structured according to the part-whole principle. The structure is practically identical to the hierarchies found in the literature on action representation (see Figure 1).

Some important points need to be made with respect to a hierarchy based on a part-whole relationship between the elements. First, this hierarchy can be postulated independently of an underlying cognitive mechanism. It is a description of an action, and a way of carving an action into smaller sub-actions and sub-sub-actions. The lower the hierarchical level, the more detailed the description is; the higher the level, the more encompassing the element is. 'Grasp handle' is just a label for the combination of 'reach towards handle' and 'full hand grip'. As such it provides a description of the explanandum, not an explanation. Similarly, one can describe a human being as consisting of a trunk, a head, two legs and two arms. The head consists of eyes, ears, a nose, a mouth etc. This description does not directly offer a mechanical explanation of the functioning of the human body; it describes the elements that need to be explained. This nature becomes evident when one tries to imagine how the postulated hierarchy could be refuted. It is hard to imagine empirical evidence that could show that 'reaching' appears not to be part of 'grasping the milk box'. It seems that the kind of evidence that could refute this hierarchy would rather be conceptual in nature.

Next, the part-whole hierarchy does not allow causal influence between the elements, as that would mean that an element would be the cause of its own parts, and, in general, nothing can be the cause of its own parts (Craver & Bechtel, 2007; Lewis, 2000)3. Likewise, the head is not the cause of the eyes or nose. In terms of actions this means that the reaching action cannot be the cause of the full-hand grip, but also that the goal of getting milk cannot be the cause of walking to the kitchen, which is at odds with most studies into goal-directed action. This suggests that the part-whole principle might not be the only principle at work in the general perception of a hierarchy in the motor domain.
3 Circular causality is a much-debated concept within dynamical systems theory (Bakker, 2005; Juarrero, 1999; Lewis, 2005), and means that elements on a lower level collectively contribute to a higher-level variable, which in turn modulates the behavior of elements at the lower level. It is still highly contentious whether the downward causation required for genuine circular causality actually amounts to a causative force over and beyond the collective interactions of lower-level elements (Kim, 1993, 2000), and we do not wish to pursue this issue here. More importantly for our purposes, even if downward causation in this strong sense were to exist, the claim still is not that the collective variable would actually cause its own parts (i.e. their existence as parts), but instead that it would causally constrain their behavior, and it would therefore fall under the second principle for structuring a hierarchy (see below).

Lastly, we have shown previously (Uithol, van Rooij, Bekkering, & Haselager, 2011b) that goals can be formulated as an action of a more abstract form (grasping a cup serves the goal of drinking), as a desired world state (grasping the cup in order to have a clean table) or as an object (the cup is the goal of my grasping action). It is possible to construct a part-whole hierarchy only when goals, actions and motor acts are of a similar nature, in this case a type of action. Only goals formulated as a type of action have subparts that can be accommodated in a hierarchy. When goals are rendered as desired world states or objects, no relevant subparts of an action goal can be formulated and placed in a hierarchy. Objects of course have subparts (e.g. a cup has a handle, a saucer etc.), but object parts have no place in an action hierarchy, as actions cannot be subparts of an object. The same goes for a desired world state: it has many (dissimilar) elements, such as objects and relations or properties, but they cannot be arranged in an action hierarchy. A part-whole hierarchy could be construed for a desired world state, but it would describe the world state, not the action needed to bring it about.

In all, a hierarchy based strictly on a part-whole principle describes the action and its structure. No causal influence can be assumed between the different elements in the hierarchy. Consequently, a hierarchy based strictly on a part-whole principle may provide a characterization of an action, but it does not provide an explanation of actions or motor control. Also, a hierarchy of this type allows only one interpretation of a goal, viz., a goal formulated as an action of a higher abstraction.

Causal relations

An alternative principle for structuring a hierarchy in the motor domain, not based on part-whole relations, is a causal hierarchy in which parts higher in the hierarchy are the cause of, or causally influence, parts lower in the hierarchy4. The goal of getting a glass of milk activates a 'get up' action, which activates a 'stretch legs' and a 'bend trunk' action. In a causal hierarchy, higher-level elements can modulate the activity of lower-level mechanisms. This structure differs from the part-whole structure in four important ways.

4 Although this is generally true for action hierarchies, in the perceptual hierarchy the order is reversed: features low in the hierarchy, such as lines and colors, are thought to be the cause of higher-level features, such as objects (Felleman & Van Essen, 1991; Hubel & Wiesel, 1959).

First, the action features are not subparts of features higher up the hierarchy, but necessarily exist independently of action elements higher in the hierarchy. It is important to realize that this renders the part-whole hierarchy and the causal hierarchy incompatible. In the part-whole hierarchy, the higher elements consist of the lower elements, and therefore, by definition, do not exist independently. In the causal hierarchy, the causal influence between the elements necessitates the independent existence of the various elements.

Second, when goals exist independently of actions, it is no longer necessary that elements higher in the causal hierarchy are more complex than elements lower in the hierarchy. A simple element can just as well be the cause of a complex element. Indeed, goals and intentions are often posited to be discrete, constitutionally simple and propositional states (Haggard, 2005; Pacherie, 2008; Uithol, Burnston, & Haselager, submitted).

Third, possible interpretations of the notion of goals are no longer restricted to abstract-action types of goals. The fact that parts need to be of a similar (ontological) nature as the whole entails that a part-whole hierarchy only allows goals defined in terms of an action. This restriction drops out in a causal hierarchy, so that goals formulated as a desired world state or an object can also be the cause of an action. Additionally, elements such as 'affordances' (Gibson, 1979)—being a relation between an organism and an object—can now be accommodated.

Fourth, unlike the part-whole relation, the causal structuring principle does make claims about the underlying cognitive mechanisms. Effects and causes are assigned to different elements, and for these elements to have a physical reality they must be assumed to be related to physical causes and effects, such as those that may hold in the brain. To illustrate the nature of this hierarchy, let us assume that the goal of 'getting milk' is the cause of 'walking to the kitchen', 'opening the fridge' and 'grasping the milk'. When we want to add further detail to this hierarchy, for example by further specifying 'open fridge' into 'full hand grip' and 'pull', we have to choose between simply replacing the element 'open fridge' with this sequence of elements (Figure 3a), or adding an extra layer below 'open fridge' (Figure 3b). The difference is not a mere difference in visualization, but actually corresponds to two different claims about the control of the action. In the latter situation, we postulate an extra control layer, which is ontologically independent of 'full-hand grasp' and 'pull handle'. In this case it is claimed that 'opening the fridge' exists as a separate entity (a representation, or a command), independent of the lower-level features.

Figure 3. Two different hierarchies structured according to the causal principle. In hierarchy (a) there is no extra control layer between "getting a glass of milk" and "full-hand grasp," whereas in (b) "open fridge" exists as an independent causal unit.

In the causal hierarchy, the vertical axis denotes causal influence. Higher levels have causal influence on lower levels, but lower levels have no influence on higher levels. However, motor control is generally not believed to be instantiated by unidirectional downward causation. More realistic models of motor control implement feedback by means of reciprocal connections (Kilner, Friston, & Frith, 2007a), or feedforward and error prediction (Friston, 2005; Haruno, Wolpert, & Kawato, 2001). However, feedback between action elements on different levels is problematic to accommodate in a hierarchy structured around causal influence, as feedback is also a form of causal influence. If motor acts can also influence actions, and actions can also influence goals, we seem to have lost the principled reason for placing goals at the top and means at the bottom of the hierarchy. In other words, there seems to be no principle for placing one level below or above another level, which means a departure from one of the main characteristics of the control hierarchy: its top-down organization.

To make things causally even more complex and interconnected, in addition to the aforementioned interlevel causal influence, there is evidence for intralevel causal influence as well: elements on a given level seem to influence each other. As an example, Cohen and Rosenbaum (2004) found what they call the 'hysteresis effect'. This effect shows that during a grasping task, a previous grip location influences the location where an object is grasped next, even when this means that the well-known 'end-state comfort' principle (Rosenbaum & Jorgensen, 1992)—a presumably top-down process—has to be violated. As another example, Selen et al. (2009) found that the 'stiffness' used in pushing an object was not only an effect of the characteristics of the object that was being pushed, but also of the previous object. In other words, what the subject did before mattered for how the task was executed. There is also evidence that what you will do next influences how you perform the current action or motor act. In speech articulation this effect is known as coarticulation (Rosenbaum, 2009). When, for example, pronouncing "tulip", the lips already round before pronouncing the "t" in order to correctly pronounce the "u"; as a consequence, the "t" is pronounced slightly differently.

When there seems to be mutual influence between elements on different levels as well as between elements on a single level, and we hold on to causality as the only principle for structuring the hierarchy, the image that emerges is more like a mesh of dynamically interconnected action features than a neat tree structure with an inherent top-down ordering of levels. In a tree with bidirectional causal influence no unambiguous ordering of levels is implied by the causal relation alone.

To be clear, the conclusion of our analysis is not that the idea of an action hierarchy is in itself wrong. We have argued that if such an action hierarchy exists, then it cannot be based on causal relations alone. Likewise, we do not wish to deny the existence of causal relations between the action elements, but framing the hierarchy entirely in terms of causal influence just does not seem to capture the complexity of influences present in the neural control of an action. Still, we, as well as many other species, are capable of organizing our behavior in such a way that a predetermined goal is achieved. When I want a glass of milk, I usually have this goal prior to initiating action. Also, I usually succeed, regardless of a few obstacles on my path, and when necessary, I can adapt my behavior to unforeseen environmental demands and still succeed.
This must mean that the goal of getting a glass of milk in Figure 3 has a dominance of some sort over the other action features. A clue to how this dominance could be achieved can be found in recent modeling work. We will discuss this in the section entitled 'Temporal extension'. First we will formulate the consequences of the incompatibility explained above for cognitive research into motor control.

Different hierarchies for different parts of the explanation

Both the part-whole structure and the causal structure can be found in the literature on action representation and motor control. For example, Grafton and Hamilton (2007) provide much evidence for a form of distributed representation of an action in which different action elements are represented in different brain regions. They claim that this distributed nature of action representation is evidence for a hierarchy in motor control. They note that "control hierarchies should be reflected by differences in those areas that are recruited for preparation and execution" (p. 599), suggesting a causal influence between the various elements. Later, however (p. 605), they postulate an action hierarchy based on levels of complexity, suggesting a part-whole structure. In general, each of the hierarchies seems to have found its own niche within explanations of an action. When describing the action hierarchy, a hierarchy is often constructed on the basis of the part-whole structure. The action is carved into sub-actions, and sub-sub-actions, as explained above. On the other hand, when the control hierarchy is described, a causal structure is presumed. An overview of our conclusions thus far is presented in Table 1.

                        action hierarchy                      control hierarchy
structuring principle   part-whole                            causality
location                in the action                         in the neural control
nature                  decomposition of the explanandum      mechanism

Table 1. The two types of hierarchies and their properties.

We have argued that the two structuring principles are not compatible. So when the action hierarchy is supposed to be mirrored in the control hierarchy, a structuring principle that is applicable to both hierarchies is needed. Unfortunately, neither the part-whole structure nor the causal structure seems to thrive outside its niche. The causal structure makes little sense in the action hierarchy. We might be able to explain that my walking to the fridge is caused by the goal of getting milk, but it does not make sense to state that my leg swinging is caused by my walking, as that would entail that my walking could exist independently of leg swinging. Applying a part-whole structure to a control hierarchy is equally problematic. First, as explained above, a part-whole hierarchy would not relate to a causal mechanism, but to a (complex) representation of an action at best. Second, when one is looking for a part-whole hierarchy in neural structures, one assumes that the structure in the content of the representation is mirrored in the structure of the vehicle of the representation, which means that one is looking for an action representation with a constituent structure (Fodor, 1975) or a microfeature structure (van Gelder, 1999). In this form of representation the vehicle (i.e. the neural state that carries the information) has identifiable subparts, and content can be attributed to these subparts. Moreover, the content of the overall representation is dependent on the content of the subparts.
So in the case of action representation, the goal representation should consist of sub-representations that can be identified as actions. These sub-representations again have subparts with identifiable content. For example, the representation of grasping the handle should consist of two identifiable representations: reaching towards the handle and a full-hand grip. This strong restriction renders much of the available neural data insufficient to support a part-whole hierarchy, as not only do we have to find different representations for different subparts of an action, but these representations together also need to be correlated with the presence of a goal. So, for example, goal-sensitive mirror neurons in the macaque's premotor cortex (Gallese et al., 1996; Rizzolatti et al., 1996; 1987) cannot be accommodated in a control hierarchy based on a part-whole relation. The vehicle of this goal representation is simple in the sense that no functional subparts are known to date5 (Uithol, van Rooij, Bekkering, & Haselager, 2011a). In contrast, a goal representation that has a constituent structure should be divisible into several sub-vehicles, representing sub-goals or actions.

5 Features such as spiking frequency or phase could play a functional role in the representational capacities of a neuron. To our knowledge no study has investigated these properties of mirror neurons.

In all, the two structures are not compatible, and neither structure is transferable to the other side of the explanation. A direct consequence is that the control hierarchy and the action hierarchy need not match. Both the structure and the set of elements of the two hierarchies can differ. Apparently our intuition to divide an action into ever-smaller parts—our 'folk motor control', so to speak—might not be the best strategy for finding the neural correlates of action control6. Indeed, Dennett warns us against the uncritical acceptance of a seemingly (intuitively) reasonable task description: "Marr's more telling strategic point is that if you have a seriously mistaken view about what the computational-level description of your system is […], your attempts to theorize at lower levels will be confounded by spurious artifactual puzzles. What Marr underestimates, however, is the extent to which computational level (or intentional stance) descriptions can also mislead the theorist who forgets just how idealized they are" (Dennett, 1989, p. 108). Instead, a constant interplay between neural data gathering and adapting the action hierarchy might be a more fruitful strategy.

6 The fact that the action hierarchy might not map (perfectly) onto the control hierarchy also has interesting consequences for theories on action understanding by means of motor resonance or mirror neurons. In these theories it is assumed that, when observing an action, the same neural structures are recruited as when executing an action (Uithol et al., 2011b). But when the features of action control do not match the action features we distinguish in an observed action, the nature of the 'shared representations' (de Vignemont & Haggard, 2008) needs to be subjected to further research.

Thus far we have based our conclusion that the action hierarchy need not match the control hierarchy solely on conceptual grounds. In the next paragraph we will discuss empirical evidence that there are in fact dissimilarities between these two hierarchies.

Neural evidence for two different hierarchies

There are two ways in which the action hierarchy and the control hierarchy can be dissimilar: the control hierarchy can contain elements that are absent in the action hierarchy, and—vice versa—the action hierarchy can contain elements that are absent in the control hierarchy. There seems to be empirical evidence for both types of mismatches. To give an example of the first: Graziano and Aflalo (2007) stimulated the premotor areas of macaque monkeys for a relatively long duration (500-1000 ms). They were thereby able to evoke complex movement sequences towards a certain end-location, for instance a sequence consisting of grasping, bringing to the mouth, turning the head towards the hand and opening the mouth. Importantly, these movements were complex, but 'dumb': when something blocked the trajectory of the bringing-to-the-mouth movement, the arm got stuck and did not move (Graziano, 2010, p. 461). These data seem to suggest that the behavioral repertoire of the monkey is represented by means of basic chunks, plus modifications to these chunks, such as target localization and adaptation of the trajectory when an object is blocking the pathway. However, a straightforward decomposition of the action into an action hierarchy would not automatically lead to these basic action chunks, and therefore would not posit the additional modifying elements. This demonstrates that the control hierarchy contains elements that are absent in a straightforward action hierarchy.

Similarly, the most straightforward or intuitive decomposition of a grasping action is into the movements of the individual fingers and the thumb. However, there is evidence that, on the neural side, the control of the grip is not decomposed into the movements of individual fingers, but into a base posture with the addition of refinements in finger and thumb position (Mason, Gomez, & Ebner, 2001). So a straightforward decomposition of a precision-grip grasping action would lead to index finger and thumb movements as basic chunks, while the neural control hierarchy has a full hand grasp and the suppression of three fingers as basic chunks. Again, our 'folk' decomposition of an action seems not to correspond to the control hierarchy: the neural representation can contain elements that, at first sight, do not seem to be part of the action.

There seems to be neurological evidence for the opposite possibility as well: the control hierarchy can lack elements that do seem to be part of the action hierarchy. The literature on embodied7, embedded cognition provides many examples of elements that can be considered part of an action, but lack a neural correlate (see for instance Chiel & Beer, 1997). Clear examples can be found in the human gait. Our gait is a complex orchestration of movements in many joints. The muscle activation responsible for a successful gait is hypothesized to be controlled by central pattern generators (Duysens & Van de Crommert, 1998). However, these neural patterns are not sufficient to generate a fluent and efficient gait. Passive components, such as muscle and tendon elasticity, and the inertia of the upper and lower leg, are of crucial importance (Whittington et al., 2008). In other words, some particular stages or parts of an action are not controlled by the neural patterns that activate muscles; these stages are accomplished by "exploiting" regularities of the body, such as muscle and tendon elasticity, and of the context, such as inertia and gravity, and are in that sense not centrally controlled but realized via self-organization.
These important features of a normal gait are not part of the action representation, but they are, nevertheless, part of the action.

7 The notion of 'embodiment' is used for various forms of dependency on a body. In cognitive science it can refer to something as modest as activation of the motor cortex (de Vignemont & Haggard, 2008), while usually in philosophy a more radical mutual dependency between body and brain in the generation of behavior is meant (Clark, 1997; Haselager, van Dijk, & van Rooij, 2008; van Dijk, Kerkhofs, van Rooij, & Haselager, 2008) (see Ziemke (2003) for an overview of the various interpretations). Here we use the more radical interpretation.

The problems outlined above suggest that, in their purest form, the two traditional principles for structuring a hierarchy might, neither separately nor combined, be the best candidates for a general theory of action representation. An interesting alternative for (or modification to) structuring the control hierarchy can be found in the temporal ordering of hierarchical elements or processes (Kelso, 1995; Kiebel, Daunizeau, & Friston, 2008; Koechlin et al., 2003). The fundamentals of such a hierarchy are best introduced by discussing a recent model in robotics (Yamashita & Tani, 2008). After this brief excursion we will return to neuroscience and discuss Koechlin's 'cascade model' of neural control (Koechlin et al., 2003), which seems to be structured around the same principle.

Temporal extension

Yamashita and Tani (2008) modeled the motor system of a robot without using what they call "local representations": neural nodes dedicated to the representation of single action primitives in an explicitly represented hierarchical structure. Every one of the 180 units was connected to every other unit, including itself. The network was trained using 'backpropagation through time', which requires reciprocal message passing. They realized self-organization of a functional hierarchy through the use of two distinct types of neurons, each with different temporal properties. The first type of neuron is fast, in the sense that its activity can change quickly. The second type of neuron is slow. They found that after training, continuous sensorimotor flows are segmented into reusable motor primitives during repetitive execution of behavioral tasks. Moreover, these primitives could be flexibly integrated into new behavior sequences. The model accomplishes this without setting up an explicit sub-goal or function. In other words, without explicit instructions, representations of independent action elements emerge.

It is important for our analysis that the two types of neurons each developed a distinct activation profile. During the execution of a repetitive motor task, repetitions of similar patterns were observed in the activities of the fast context units. The activity in the slow units, in contrast, remained constant throughout the repetitious task. These results can be interpreted such that the fast units encoded reusable motor primitives that, due to their fast dynamics, were unable to preserve goal information over long trajectories. The slow context units, in contrast, encoded the switching between these primitives, and on account of their slow dynamics could contribute to more stable goal representations. It is important to realize that the behavior of the robot was the result of the interplay of the different units, and not of slow units controlling the faster ones.
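The principle at work in this model, namely that the only difference between the two populations is the rate at which their activity can change, can be illustrated with a small simulation. The sketch below is not Yamashita and Tani's implementation; the network size, the random weights and the time constants are arbitrary values chosen only for illustration. Every unit is connected to every other unit, and the fast and slow populations differ solely in their time constant, yet the slow units change only gradually and can therefore carry information across many cycles of the fast dynamics, without any explicit top-down command being defined.

```python
import numpy as np

# A minimal sketch in the spirit of a two-timescale recurrent network. This is
# not Yamashita and Tani's (2008) implementation: the network size, the random
# weights and the time constants below are arbitrary values chosen only to show
# that the two unit types differ solely in how quickly their activity changes.
rng = np.random.default_rng(0)

n_fast, n_slow = 8, 4
n = n_fast + n_slow
tau = np.array([2.0] * n_fast + [50.0] * n_slow)  # per-unit time constants
W = rng.normal(0.0, 0.4, size=(n, n))             # every unit connects to every unit
u = rng.normal(0.0, 0.1, size=n)                  # internal states

def step(u, external_input, dt=1.0):
    """Leaky-integrator update: du/dt = (-u + W @ tanh(u) + input) / tau."""
    return u + dt * (-u + W @ np.tanh(u) + external_input) / tau

activity = []
for t in range(200):
    inp = np.zeros(n)
    inp[0] = np.sin(t / 5.0)   # a fast, repetitive 'sensorimotor' drive
    u = step(u, inp)
    activity.append(np.tanh(u).copy())

activity = np.array(activity)
# The fast units follow the repetitive drive almost immediately, while the
# slow units change only gradually, so they can carry information across
# many cycles of the fast dynamics without any explicit top-down command.
print("mean variance, fast units:", activity[:, :n_fast].var(axis=0).mean())
print("mean variance, slow units:", activity[:, n_fast:].var(axis=0).mean())
```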
This interpretation could provide us with another and less problematic structuring principle for a hierarchy: temporal extension. Elements higher in the hierarchy are represented longer, or more stably, than lower ones. As such, they are able to influence an action for a longer time interval, thereby accounting for our capacity to structure behavior around a goal. In a way, this reverses the general reasoning: elements are not more influential because they are higher in the hierarchy, but elements are higher in the hierarchy because they have more influence (on account of being more persistent).

Although it is related to causal influence, temporal extension is a different criterion for building a hierarchy. It is not assumed that the causal influence works in only one direction, from goal to action—remember that every unit in the network Yamashita and Tani used was connected to every other unit. Nor is it assumed that the causal influence in one direction is bigger than in the reverse direction. The difference between the types of influence is a difference in temporal extension: goals simply exert their influence longer than the actions or motor acts. A control hierarchy structured on the basis of temporal extension is not committed to the direct causal influences found in the causal hierarchy. This means that although the overall structure—goals high in the hierarchy and action means low in the hierarchy—can be preserved, the hierarchy is much more implicit, as the orderly tree structure is lost. There is a simultaneous influence of a great many action features, at an unbounded number of levels.

The model built by Yamashita and Tani (2008) developed a functional hierarchy of only two layers, slow and fast, and the functional elements they found would still be located at the very bottom of the common action hierarchies. They suggest that "[t]he idea of functional hierarchy that self-organizes through multiple time-scales may as such contribute to providing an explanation for puzzling observations of functional hierarchy in the absence of an anatomical hierarchical structure" (p. 13). Indeed, human action control seems to be hierarchically structured—as argued above—without a clear anatomical hierarchical structure (Miller & Cohen, 2001), so this model could help in interpreting an influential neurocognitive model of action control.

Koechlin, Basso, Pietrini, Panzer, & Grafman (1999) proposed a model in which different types of action control are located along a rostro-caudal axis in the lateral prefrontal cortex (PFC). In their hierarchical model four types of control are discerned (Koechlin et al., 2003; Koechlin & Summerfield, 2007). Sensory control, located at the caudal end of the axis, is involved in selecting motor actions. A bit more anterior, contextual control is involved in selecting premotor representations or stimulus-response associations. Next, episodic control is involved in selecting task sets or sets of consistent stimulus-response associations in the same context. Lastly, branching control, implemented at the rostral end of the axis, the anterior and frontopolar regions of the PFC, involves controlling the activation of sub-episodes nested in ongoing behavioral episodes. The significance of the proposed model does not lie in the fact that exactly four different control layers are posited (it is, we believe, unlikely that human action control consists of a fixed and integer number of control layers), but in the suggestion that different control processes operate on different time scales.
When going from sensory control to branching control, the temporal extension of the types of control grows. Sensory control deals with selecting immediate movements—analogous to Yamashita and Tani's fast neurons—and monitoring stimulus changes. The input to contextual control is already more robust and less dynamic. Episodic control deals with entire sets of associations within one context, while branching control is involved in managing changes between different contexts. This means that these control processes can be arranged in a hierarchy structured around stability, or temporal extension.

Once this hierarchy is established, it is compelling to interpret Koechlin et al.'s findings in terms of a more traditional, causal motor hierarchy, with an action goal originating in the higher control processes that is subsequently propagated to the lower types of control to evoke the appropriate action. By referring to their model as the 'cascade model', and by emphasizing the downward modulation, Koechlin and colleagues are—perhaps unintentionally—feeding this compelling intuition, which is subsequently adopted by other researchers (Badre & D'Esposito, 2007; Hamilton, 2009). However, the data do not suggest such an interpretation. Koechlin and colleagues (1999) show that when more temporally extended forms of control are needed, anterior and frontopolar areas are activated in addition, not alternatively, suggesting that these control processes are not responsible for the control task by themselves, but through interaction with the lower types of control, just as all the units in Yamashita and Tani's (2008) model contributed to the resulting behavior. This collective contribution is incompatible with the idea that goal-directed behavior is the result of higher layers propagating goal representations to lower layers. Goal-directed behavior emerges from the interaction between the different types of processes, not from straightforward top-down modulation.

If we accept Koechlin's alternative hierarchy based on temporal extension, but continue to interpret this hierarchy as a straightforward causal hierarchy, we do not do justice to the complexity seemingly inherent in action control. Additionally, interpreting the proposed hierarchy in terms of causal effects entails positing discrete states that, through interaction, bring the action about. It is, however, highly unlikely that such discrete states with these causal effects can be found in the prefrontal cortex (Uithol et al., submitted). Positing discrete states and causal interactions between them seems to ignore the complex and intertwined nature of the dynamic control processes emphasized by Koechlin and colleagues.

This insight could guide future research into action control. Instead of positing an anteriorly represented action goal and trying to locate the processes by which this representation is transformed into a motor program, the analysis above suggests that research into action control is better served by focusing on how goal-directed behavior emerges from the interaction between the different control layers. Which sensory input is used at which layer of control? How do lower control processes shape higher ones, and vice versa? Koechlin and colleagues made an important step in shifting this focus. This shift is hampered, however, if we allow the traditional views back in to shape our analysis.
An important theoretical advantage of an implicit hierarchy based on temporal extension is that it rids us of the rather artificial constraint, present in the causal and part-whole hierarchies, that an action is associated with just one goal. At every moment one can be attributed many, maybe even an infinite number of goals: to breathe, to read, to maintain homeostasis, to be a good scientist, to maintain an upright posture, etc. Our behavior is the result of the interplay of this multitude of goals (McFarland, 1989; Uithol et al., 2012; submitted). These goals need not be represented in the higher layers of Koechlin's model, but can also be an emergent result of the interaction of different control processes. To give a simple example: when swimming a front crawl, a typical pattern of strokes and breathing is adopted. This pattern only makes sense when one realizes that two goals, to swim as fast as possible and to breathe, are pursued at the same time. Of course we know about a swimmer's goal to breathe, and this goal is unlikely to be represented in one of Koechlin's control layers. In straightforward cognitive descriptions, we are inclined to leave it "out of the equation" and treat it as a boundary condition. But making a distinction between variables and boundary conditions in such an intuitive and implicit manner might not be the best approach to a general theory of motor control. Although cognitive scientists generally have good reasons not to put an infinite number of goals in a model of action representation, to assume that the number of goals is always limited to only one might in some cases be overly restrictive.

This influence of multiple simultaneous goals cannot be easily accommodated in an explicit control hierarchy. An element can be caused or modulated by multiple goals at the same time. It might not always be clear which goals influence a lower element, or to what extent. The result would be that the orderly, tree-shaped hierarchy gets replaced by a dense mesh of interconnected action elements, which would seriously undermine the value of a hierarchy in explaining the realization of actions. The more implicit hierarchy, on the other hand, structured around temporal extension, is not committed to the postulation of a single, explicit goal, nor to a direct and univocal relation between the higher and lower elements. Therefore the influence of multiple simultaneous goals does not undermine the hierarchical structure.

Conclusion

In theories of motor control two hierarchies, the action hierarchy and the control hierarchy, are thought to match. We have presented both conceptual and empirical evidence suggesting that this assumption is unlikely to be true. We have shown that, implicitly, two structuring principles are used to construct a hierarchy, but that neither structure (nor the (impossible) combination of the two structures) can provide an adequate framework for explaining actions and motor control. The action hierarchy—constructed using a part-whole hierarchy—is a description of the action that is to be explained, but can be misleading in searching for a neural implementation of the action. The control hierarchy—constructed using causal relations—does not capture the complexity inherent in motor control. Our conclusion is not that motor control is not structured hierarchically at all, but that the traditional accounts of an action hierarchy do not capture the complex and dynamic nature of motor control. Alternatively, dynamic accounts of motor control can be interpreted as hierarchical as well. In these models, elements that are represented longer and more stably are higher in the hierarchy. Although these alternative models are hierarchical in a much more implicit way, and cannot straightforwardly be interpreted along the same lines as the more traditional accounts, they do not suffer from the conceptual and empirical issues discussed. Much work—both conceptual and empirical—is still needed to develop an implicit hierarchy structured around temporal extension into an insightful and coherent alternative to the current theories of action representation. But only if we approach the alternative hierarchy as a genuinely alternative structure, and avoid straightforward causal interpretations based on the traditional accounts, can we expect to find its true value.

abstract

Intentions are commonly conceived of as discrete mental states that are the direct cause of actions. In the last several decades, neuroscientists have taken up the project of localizing intentions in the brain, and a number of areas have been posited as implementing representations of intentions. We argue, however, that it is doubtful that the folk notion of 'intention' applies to any particular physical process by which the brain initiates actions. We will show that the idea of a discrete state that causes an action is deeply incompatible with the dynamic organization of the prefrontal cortex, the agreed-upon neural locus of the causation and control of actions. Discrete representations can at best, we will claim, play a subsidiary, stabilizing role in action planning. This role, however, is still incompatible with the folk notion of intention. We conclude by arguing that the prevalence of the folk notion, including its intuitive appeal in neuroscientific explanations, stems from the central role intentions play in constructing intuitive explanations of our own and others' behavior.

A modified version of this chapter has been resubmitted for publication as: Uithol, S., Burnston, D., & Haselager, W. F. G. (under review). Will intentions be found in the brain? Cognition.

five intentions in action action intention prefrontal cortex motor control

Introduction

Actions are generally thought to be the result of a preceding intention to act. Tim intends to grasp the cup in front of him and subsequently, and consequently, he grasps the cup. Intentions, in this 'folk' interpretation, are conceived as discrete mental states that are the direct cause of actions. The notion of intention plays an important role in a variety of contexts, ranging from psychology (Meltzoff, 1995), to philosophical theories of action (Bratman, 1987; Davidson, 1963), to legal theory (Moore, 2011). The folk conception of intention is often straightforwardly used in neuroscientific studies of willed action. Haggard summarizes the role the notion plays in computational neuroscience as follows: "In computational motor control, for example, actions begin with a relatively simple description of a goal (e.g. 'I will stand up'). The brain must expand this task-level representation into an extremely detailed movement pattern specifying the precise kinematics of all participating muscles and joints. Generating this information is computationally demanding. The brain's solution to the problem may lie in the hierarchical organization of the motor system. Details of movement are decided at the lowest level of the motor system possible" (Haggard, 2005, p. 292).
The picture of intentions that emerges here is of discrete and simple states, free of context-specific details, that are the originating causes of subsequent action planning and motor movement. Ever since Libet's pioneering investigations into the neuroscience of willed action (1985), continuous attempts have been made to find the neural correlates of intentions. Largely based on functional MRI studies, these attempts have resulted in a variety of proposed localizations for intentions. For example, Lau, Rogers, Haggard & Passingham (2004) asked subjects to attend to their own intentions while performing an act, and measured the areas that showed modulated activity during an attended condition compared to an unattended condition. Based on its increased activation, they argue that intentions to act are localized in the pre-SMA region of the medial prefrontal cortex. By stating, after James (1890), that "any intention […] has the tendency to cause the relevant movements" (Lau et al., 2004, p. 1208), these authors explicitly state their adherence to the causal aspect of the folk notion of intentions.

Other projects attempt similar localizations, but end up with different results. For instance, Haynes and colleagues (2007) report finding neural activation specific to subjects' intentions in the medial prefrontal cortex (more anterior than Lau and colleagues reported), as well as in the lateral prefrontal cortex. In this study participants had to either subtract or add two numbers that were presented after a short interval. Simply looking at the differential brain activity in the two cases allowed for statistically significant predictions of which action the subjects intended to perform, leading to the hypothesis that the intentions for these actions were encoded in this differential activity.

As a final example, Hamilton and Grafton have conducted a variety of studies attempting to delineate the brain areas responsible for understanding others' intentions, and have argued that such processes make use of the observer's own goal-representing system (Hamilton & Grafton, 2008). Citing fMRI evidence that activation in the inferior parietal lobule is not specific to motor effectors, they argue that "human IPL [inferior parietal lobule] and IFG [inferior frontal gyrus] contain populations of neurons that encode the outcome of an observed action", and moreover that "these results are concordant with previous data implicating IPL and IFG in goals and intentions" (Hamilton & Grafton, 2008, p. 1164). Thus, they claim that specific groups of neurons contain an explicit representation of a desired outcome and are involved in preparing the appropriate action (Grafton & Hamilton, 2007), thereby implementing an intention to bring about the outcome. This interpretation, in addition to being clearly influenced by the folk view, shapes Hamilton and Grafton's view of the overall structure of the motor system. They propose a hierarchical organization in which motor plans are at lower levels and abstract goal and intention representations are at the top (a "motor hierarchy" similar to the one described by Haggard above; Grafton & Hamilton, 2007; Hamilton & Grafton, 2007; see also Uithol, van Rooij, Bekkering, & Haselager, 2012).

It is interesting that each of these projects posits a different location for intentions. One might think that this means that the researchers are postulating alternative, competing hypotheses, and that further research will help determine which of the hypotheses is correct.
In this paper we will argue, alternatively, that the folk view of intentions as discrete mental states embraced by these theorists is not applicable to the neural processes that actually generate actions, and that consequently none of these projects will reveal a brain structure that instantiates states with the properties attributed to intentions by the folk interpretation.

In the next section we will discuss the properties of the folk notion of intention—that intentions are functionally discrete, context-independent, and cause actions—via a brief analysis of the role the notion plays in philosophical accounts of intentional action. We will specifically discuss Pacherie's theory of intentions (Pacherie, 2006; 2008; Pacherie & Haggard, 2011), as we take her account to be the best attempt to make the folk notion empirically tractable, and because, importantly, a variety of investigators in the neuroscience of action explicitly cite Pacherie's account as being at least compatible with their overall view of action generation (see, for instance, Moore, Wegner & Haggard (2009); Pezzulo and Dindo (2011); and Hamilton and Grafton (2007)). However, we will then show that this account is incompatible with a variety of results and models of action control in the lateral prefrontal cortex (lPFC), the area that is most likely to implement causation and control of actions in the brain. This incompatibility, we will show, stems directly from the adherence to the folk notion. As a consequence, any neuroscientific account informed by the folk notion will face similar compatibility problems. Therefore, we will argue, neither the folk notion nor its philosophical descendants will provide a fruitful conceptual framework for guiding neuroscientific investigation. Subsequently, we will discuss an alternative possible role for discrete representations, as helping to stabilize dynamical processing during action planning. This contribution, we will argue, is still incompatible with the folk notion of intention. Finally, in a more speculative section, we will discuss the possible origins of the folk notion, and try to explain why it has been so overwhelmingly pervasive.

The folk notion and its philosophical descendants

The notion of 'intention' plays an important part in our folk psychology, our everyday framework for explaining the behaviors of ourselves and others (Davies & Stone, 1995; Haselager, 1997; Stich, 1983). In this framework, intentions are generally conceived as mental states similar to beliefs and desires (Anscombe, 1957). Like a desire, an intention is characterized as representing a potential outcome, but the two mental states differ in that an intention consists of both a goal and an action plan to achieve it, whereas a desire does not (Bratman, 1981). For instance, one can desire that the sun will rise, but one cannot intend to make it rise, as there is no action that could bring this about.

On philosophical interpretations of the folk framework, three important and related features are generally attributed to intentions. First, the content of intentions is independent of the context in which they occur. For instance,
Pacherie (2008), Searle (1983) and Bratman (1987) stress that the content of a particular intention is independent of the perceptual, affective, and cognitive context in which it is implemented, and therefore that each particular intention needs to be subsequently embedded into a context in order to cause an appropriate action. Intentions share this context-independence with other mental states. Just as the belief that "France is a country" is independent of the color of the walls of the room in which one entertains this belief, the intention to grasp an apple is presumed to be the same regardless of the color or shape of the apples one intends to grasp, or the reason for grasping them. This context-independence allows one to form intentions about future actions in a different context, for instance to pick up groceries after work (Pacherie, 2008, p. 183). Consequently, particular intentions retain their characteristics across different instances. An intention to grasp an apple today is the same as the intention to grasp a different apple one had last week, or last year. Note that this context-independence of the content of intentions does not mean that the occurrence of intentions is independent of context. Seeing fruit and vegetables at the supermarket can very well help to bring about my nevertheless context-independent intention to eat an apple.

Second, intentions are thought to be functionally discrete and simple states. Discreteness, for the purposes of this discussion, means that intentions are believed to be cognitive units that play a clearly isolatable role (Haselager, 1997)1. Given the discrete causal interactions posited for intentions, and their context independence, intentions themselves are also relatively simple. Context-specific details are not part of the intention, which allows for a high degree of abstraction, and results in a relatively simple mental state. Complexity in the action generation system arises only either from interactions between simple, discrete states, or from the translation of these states into a non-discrete format further on (or "down," in a hierarchical view) in the action generation sequence.

1 Our use of discreteness is about the role these mental states play, and is not to be confused with the discreteness of the representations themselves (see Maley (2010) for a discussion of the latter interpretation).

Philosophers often invoke these properties by talking about the "propositional" nature of intentions (see for instance Pacherie (2008), or Fodor (1985)). Propositional states consist of explicit, structured semantic representations similar to those found in language. Thus, when one intends to grasp an apple, the cause of the associated action is an attitude stemming from a mental representation analogous in structure to the phrase "I intend to get an apple." Compared to motor plans or detailed motor representations, propositions are relatively simple, consisting of abstract representations of the intended object and the act. While neuroscientists may not subscribe to the explicitly linguistic, propositional characterization of discrete states, quotes such as those from Haggard in the introduction clearly show that many neuroscientists sign on for the discrete and simple rendering of the nature of intentions.

Third, intentions are thought to cause actions, as opposed to simply covarying with them (Searle, 1983). On this view, an intention must be formed prior to the planning and generation of an action, and the content of the intention determines what actions are appropriate to generate. An episode of action planning can only be successful if it achieves the outcome represented in the intention, e.g., to grasp an apple.
Thus, in accordance with the hierarchical models posited by, for instance, Hamilton and Grafton (Grafton & Hamilton, 2007; Hamilton & Grafton, 2007), on the folk view the discrete intention is the primary causal and organizing factor in an episode of action generation.

To account for thoughtless or unconscious actions, Searle (1983) posits two types of intentions: prior intentions and intentions in action2. Prior intentions are intended to capture the folk notion of intentions, and are supposed to account for the temporally extended and deliberative aspects of this notion. Intentions in action are of a different, unconscious and non-propositional format, which shares more features with motor plans than with propositional thought. As philosophy has further explored the types of effects associated with intentions, as many as seven functions have emerged. Intentions are posited to: 1) terminate deliberation, 2) prompt practical reasoning, 3) coordinate action, 4) initiate action, 5) sustain action, 6) guide action, and 7) monitor action (see Pacherie (2000) for a more detailed discussion of the functions of intentions and the theories proposed to capture them). However, providing a specific causal theory of intention that both holds on to the core folk notion and accounts for all seven functions has proven difficult.

2 Similarly, Bratman (1987) contrasts future-directed and present-directed intentions, and Pacherie & Haggard (2011) contrast immediate and prospective intentions.

Pacherie has done admirable work to create a framework that is based on these philosophical theories of action and is also suitable for empirical investigation. She distinguishes three types of intentions: D-intentions, P-intentions and M-intentions (Pacherie, 2006; 2008). D-intentions (distal intentions) are located at the top of the action hierarchy. Since D-intentions are the outcomes of propositional reasoning processes, Pacherie explicitly states that they are propositional and discrete (p. 192), as well as context-insensitive (2008, p. 183). D-intentions therefore seem to be highly similar to the folk conception of intention. The context of the action becomes pertinent when D-intentions are translated into P-intentions (proximal intentions). P-intentions contain a plan for the action within the current context, for example, getting up from my chair, walking to the fruit bowl and grasping the apple. While the details of P-intentions are not clear on Pacherie's account, they are supposed to play an intermediate role, aiding in the transition from a discrete, simple state to detailed motor plans. Lastly, these intentions are translated into an M-intention (motor intention) that specifies the exact motor representations needed to perform an action, and contains detailed programs for, e.g., how to get up from a chair, how to balance one's body, etc. M-intentions are no longer propositional in nature, but instead consist of a set of motor representations that together cause and control a complex series of movements. So what starts as a discrete, context-free state with propositional content gets translated into a complex and highly specific motor command that is entirely adapted to the current context.

Pacherie's account has an additional layer of complexity beyond what we have discussed here, in that, while the initial causation of the action proceeds in the way we have described, she also allows for causal influence to propagate back up the action hierarchy (see Figure 1).
Specifically, types of intentions at lower levels can, after being caused by intentions at higher levels, in turn modify these higher-level intentions. There is, however, a tension in the idea of discrete D-intentions starting the causal cascade (p. 188) and subsequently being modified by more dynamic processing at lower levels. While functionally discrete states may in principle be possible in a dynamic system, positing this sort of architecture does raise issues about the nature of the feedback to higher levels. It is unclear how a functionally discrete state is modulated by continuous and dynamic feedback, since the structure and form of these two types of processes are radically different. In what follows, we will expand on this point, arguing that this tension arises from adhering to the folk notion of intentions, while also attempting to do justice to the dynamic nature of action generation. It is this adherence, we will claim, that renders Pacherie's framework incompatible with the brain processes that control our actions.

Figure 1. An overview of Pacherie's "causal cascade" theory of intentional action, taken from Pacherie (2008).

Action control in the prefrontal cortex

Based on a variety of evidence from imaging studies, neurophysiology, and neuropsychology, the prefrontal cortex (PFC) is generally recognized to be the locus of action generation and control in the brain (see Miller and Cohen (2001) for an early review of this evidence). Various models of prefrontal action control have been proposed (see Badre (2008) and Ramnani and Owen (2004) for reviews), and while the details of the different models of control differ (more on this below), the notion of a posterior-anterior axis on the lateral PFC (lPFC) for implementing different kinds of control is now widely recognized (Badre, 2008; Fuster, 2004). The processes in the lPFC seem to exhibit the same temporally extended and deliberative aspects as Pacherie's D-intentions. For example, Koechlin and colleagues (Koechlin et al., 2003; Koechlin & Summerfield, 2007) have suggested and tested a model in which different types of action coordination are subserved by different areas of the lPFC. Based on differential activation found in imaging studies, they posit that posterior areas of the PFC are in charge of selecting between different "sensorimotor associations," where these consist of a learned connection between a stimulus property and a specific motor act. When different sets of rules must be applied depending on the nature of the stimulus sets presented, dorsolateral PFC (dlPFC) is activated, performing a process that they call episodic control or contextual control. When a task must be paused and a new set of rules implemented, due to the presentation of an interrupting stimulus, followed by a return to the first task, the lateral frontopolar cortex (the most rostral part of the PFC) is activated, a process they call branching control. Like D-intentions, the processes in branching control are temporally extended in the sense that they must keep track of information longer than other types of control—they involve maintaining a variety of task rules in order to act appropriately in response to the presented stimulus. They also seem to be genuinely deliberative, since they must keep track of a variety of different considerations, and discern the appropriate relationships between them, in order to complete tasks effectively.
These studies and models suggest that the functions attributed to the regions along the rostro-caudal axis are similar to the functions that D-, P-, and M-intentions are supposed to perform. The posterior prefrontal cortex is thought to be involved in concrete action responses (Badre, 2008), just like M-intentions. More anterior and rostral regions are involved in more temporally extended and deliberative action planning (Badre, 2008; Christoff & Gabrieli, 2000; Koechlin et al., 2003), just like P- and D-intentions. So, in behavior requiring branching control, anterior areas would store the required rules, and produce processes to apply the new rules in the appropriate setting. Several proponents of the models discussed here interpret their results in a similar way. Koechlin and colleagues (2003) suggest that a broad "goal representation" is maintained in aPFC during the performance of sub-tasks, and activates the appropriate actions at the appropriate times. Burgess, Veitch, Costello & Shallice (2000) interpret the deficits in patients with lesions in aPFC as an inability to form and carry out intentions. Strikingly, Ramnani and Owen (2004) describe the relevant sort of intention along the lines of plans such as "Meet John at 5": an obvious parallel to Pacherie's D-intentions. In all, if D-, P-, and M-intentions are to be found in the brain, the most likely location would be along the anterior-posterior axis of the lPFC. However, despite the similarities between the functions attributed to the different types of intentions and the processes along the anterior-posterior axis, we will argue in the next section that the context-independent, discrete, and causal rendering of intentions is deeply incompatible with the type of information processing that occurs in the lPFC.

The complex control in the lPFC

We have shown that in Pacherie's model, D-intentions are simple, discrete and independent of the context in which the actions they cause are embedded. In this section we will show that there is convincing empirical evidence that neural activity in the anterior regions of the lPFC is informationally as well as dynamically complex. These types of complexity render prefrontal control processes incompatible with the properties and causal role that the folk view attributes to intentions.

Let us start with informational complexity. A system's operations are informationally complex, in our view, if, during its normal operation, the system has access to and makes use of a variety of different sources of information. The anterior regions of the lPFC receive input from the medial temporal lobe and the thalamus, as well as from multiple sensory areas, such as ventral visual areas, somatosensory areas, auditory cortex, and the rostral superior temporal sulcus, which is itself known to be a multimodal area (Miller & Cohen, 2001; Ongür & Price, 2000). The existence of these pathways suggests that activity in the anterior parts of the lPFC can, in principle, be modulated by a variety of types of information. That this sort of modulation occurs during action control is supported by data from single-cell studies in monkeys, showing that neurons in the anterior regions of the PFC are sensitive to a variety of perceptual information about the stimuli that cause actions.
Early studies by Fuster, Bauer and Jervey (1982) provide evidence that anterior dorsolateral and orbital regions of the PFC have specific responses to spatial elements, to perceptual features of the action context, and to the relations between them. The researchers conclude that these data "provide further evidence for the hypothesis that the prefrontal cortex is essential for the temporal integration of sensory data and motor acts in sequential behavioral structures" (p. 690). Following up on these results, Fuster, Bodner and Kroger (2000) trained monkeys to form an association between a tone and a color—specifically, the monkeys had to press a certain color on a touch screen following the presentation of a sound of a certain frequency. Single-cell recordings were conducted in over 300 neurons spanning areas in the anterior parts of the dorsolateral and anterior prefrontal cortex. In accordance with the previous findings, they discovered that the vast majority of the cells increased activation in the interim between the presentation of the tone and the colors, suggesting that these cells encode the presence of a relation between tone and color. Fuster and colleagues contend that the function of these PFC cells is to facilitate the integration of perceptual information and, moreover, to do so over an extended period (Fuster et al., 1982, p. 690). Thus, the cross-temporal aspects of these neurons suggest an action control process that seems to be in line with the cross-temporal role that intentions are supposed to play, yet the process seems highly sensitive to specific perceptual information. It is the presence of the tone in the perceptual environment that mediates processing in the lPFC, and this processing is directly relevant to performing the proper action (via the learned perceptual association).

One could maintain that the cells that associate a tone and a color are encoding a discrete intention, such as "I will press the appropriate color." This would be to invoke the folk notion that perceptual information is turned into a context-free state (e.g., "the tone is present"), which in turn forms a discrete intention. An initial reason to doubt this sort of claim is that these neurons are tightly interspersed with ones with more specific responses—i.e., those responding to just a tone or a color—suggesting a highly integrated network. Also recall that the hierarchical views of action that are based on discrete intentions, including Pacherie's theory, posit that discrete intentions occur at a higher level—both causally and anatomically (Grafton & Hamilton, 2007)—than non-discrete ones. The interspersed physiology revealed by Fuster et al.'s studies, then, is incompatible with this objection.

Even more challenging to the suggestion of discrete states in the anterior PFC is the dynamical complexity of these areas. We define dynamical complexity as the continuous modification of the state of a system in accordance with processing and task demands. Among the neurons studied by Fuster et al. (2000) that showed feature-specific associative behavior, different behavior was found at different stages of the action sequence. Some neurons maintained correlated firing only through certain epochs of stimulus presentation—for instance, during auditory presentation and in the delay between stimuli—while others fired consistently in different stages.
Moreover, even the neurons that seem to represent the association between the stimulus features—as opposed to neurons that fired for just one of the features—did not demonstrate a constant response; instead, their activity varied in its temporal relation to stimulus onset. So the activation of the anterior lPFC does not remain constant during the course of planning and executing an action, but instead changes dynamically as the action progresses.

Notice here the complementary natures of informational and dynamical complexity: the dynamic activity shifts in the lPFC are due to the temporal relation of the current situation to the perceived stimulus—i.e., whether it is currently being perceived, or the length of time between the perception and the action. The presence of different types of context-specific information in the lPFC thus has continuous effects on its activity.

While it is hard to establish the presence of dynamical complexity in imaging studies, due to the coarse temporal resolution of MRI data, the single-cell data we have discussed seem to be in line with the findings of imaging studies regarding the systems-level properties of both the posterior and anterior regions of human lPFC. For instance, more posterior areas of the lPFC have been implicated in contexts where associations between visual stimuli and particular actions need to be recalled (Passingham, Toni, & Rushworth, 2000; see also Koechlin et al. (2003), Kouneiher et al. (2009) and Rushworth (2008)). This suggests that activity in these areas is modified depending on whether perceptual information is needed to complete a task. Importantly, these effects are not isolated to posterior regions of the lPFC, as would be expected if representations became less context-sensitive as one moved anteriorly along the anterior-posterior axis. Prabhakaran, Narayanan, Zhao & Gabrieli (2000) compared two conditions involving tasks cued by verbal and spatial information, one in which the two types of information were presented separately, and one in which the verbal and spatial cues coincided. In the "non-integrated" (first) condition, activation was higher in more posterior areas of the lPFC, whereas in the "integrated" (second) condition, activation was considerably higher in more anterior regions, particularly in the right hemisphere. The difference between posterior and anterior regions, then, is not whether information about the perceptual context is processed, but what kind of context information is processed—i.e., information about specific environmental features more posteriorly, and information about relations between environmental features more anteriorly (see also Christoff, Ream, Geddes, & Gabrieli, 2003). Similarly, Ranganath, Johnson and D'Esposito (2000) found that left anterior lPFC shows increased activation during tasks where more specific perceptual information had to be recalled, suggesting that informational complexity increases as one moves more anteriorly along the lPFC axis. Both the imaging and the single-cell results we have discussed support the informational complexity of the lPFC, and the studies by Fuster et al. strongly support its dynamical complexity. These types of complexity are irreconcilable with each of the three properties of the folk notion of intention discussed in section 2.
First, context-independence: The informational complexity of the PFC shows that even the most anterior parts of the lPFC are not context-insensitive in the way demanded by the core folk notion of intentions. If the role that Fuster et al. (1982; 2000; 2001) attribute to lPFC neurons—integrating perceptual information over a period of time—is correct, then this context-sensitivity is vital for the functioning of these areas. The view suggested by these results is that branching and episodic control (Koechlin & Summerfield, 2007) implement their functions by tracking information relevant to a particular task or task set in the actor's perceived environment over time. Branching control, for instance, attempts to mediate shifting task demands with changes in context, and does so by keeping track of relations between a variety of different informational sources. On the view suggested by informational and dynamic complexity, this sort of procedure can produce temporally extended behaviors without relying on previously represented, discrete intentions, simply through dynamic interaction with a perceptual context.

Second, the two kinds of complexity also undermine the notion of intentions as functionally discrete and comparatively simple states. Dynamical complexity alone would be problematic for an attempt to isolate a discrete state from ongoing processes, since it shows that activation in the PFC is continuously modified from the earliest stages of the task. This speaks against the element of Pacherie's view in which a discrete state (i.e., one not in a continuous functional relationship with other elements of the system) begins the action coordination process, and is subsequently modified by feedback from non-discrete lower levels. The additional fact that these processes are embedded in context up to the highest levels means that it is very unlikely that one will be able to isolate a state whose processing or content is dissociable from the context in which it occurs, and which is therefore consistent across changes in context. Of course, some elements of the context may remain similar across instances of apple-grasping (e.g., apples tend to have roughly the same shape and size), and therefore some elements of the apple-grasping process may be similar across most instances. However, this does not mean that there is a state somewhere in this process that is discrete and context-insensitive. As Fuster and colleagues suggest, such similarities can simply be due to the influence of contextual elements that are shared across those instances, suggesting that any apparent discreteness here is accidental.

Moreover, the informational and dynamical complexity of lPFC control processes undermines the idea of diminishing complexity as one moves up to the more deliberative and temporally extended aspects of actions, an idea presumed both by Pacherie and by neuroscientists like Haggard. Instead, the data discussed above suggest that a different kind of complexity is present at higher levels. In low-level motor control, there is complexity with respect to detailed information about motor states—e.g., proprioceptive and sensorimotor feedback from the muscles involved—and detailed plans for how to perform suites of effector movements. At higher, deliberative levels, there is complexity in the types of information and the relations between them.
So unlike low-level control, in which there is complex and detailed information within a type (effector-specific), higher levels of control are complex in that they can process relations between a variety of stimulus types, rules, outcomes, and temporal contexts. This is far removed from the folk notion of a simple and discrete state.

Third and finally, causation: According to Pacherie, the types of intentions involved in the causation and control of actions are distinct, with D-intentions starting the process and not being intimately involved in lower-level, situational and motor control, which depends on the other types of intentions. We take the PFC data discussed above to show that even the most anterior parts of the lPFC play a role in the control as well as the causation of actions. Koechlin et al.'s (2003) data, for instance, show that anterior lPFC must be actively engaged in monitoring more posterior areas, since it is involved in recognizing the interrupting stimuli and implementing the new set of action rules at the appropriate time3. Indeed, Koechlin and colleagues suggest that the areas that are involved in "selecting" actions (in their terms, appropriate "stimulus-response associations") at lower levels are also involved in controlling them. This clearly suggests the conjoined functioning of causation and control in the PFC. Similarly, Ridderinkhof, van den Wildenberg, Segalowitz & Carter (2004) state that the PFC has the capacity for both "evaluating" contexts and "regulating" activities, which is bound up in its ability to represent both associations between contextual features and rules for selecting the appropriate associations.

3 Of course, it is possible that anterior regions have no access to processes occurring at more posterior ones. It might be that anterior lPFC simply cuts off whatever processing is occurring in more posterior regions when episodic or branching control is required. However, we consider this unlikely. First of all, the large number of connections from posterior to anterior regions suggests the availability of information regarding posterior region activity to anterior regions. Moreover, presumably the most efficient way to produce the correct modification of more posterior processes will involve modifying their particular current processing in accordance with the needed form of control. In keeping with Koechlin et al.'s view of selection between different action associations, it is likely that anterior lPFC and orbital PFC areas are influenced by processes occurring at lower levels.

In all, we have argued that the functions posited for D-intentions are exhibited by the lPFC, but that the processes within these brain regions do not exhibit the properties to which the folk notion and Pacherie's view of D-intentions are committed. This suggests that although Pacherie is right in emphasizing the dynamical nature of action control, her adoption of the folk interpretation of intention, and her emphasis on top-down causation (D-intentions starting the causal cascade), make her model irreconcilable with empirical data on prefrontal action control.

A possible objection to our perspective on these results would be to claim that the empirical data we have discussed so far correspond to relatively low-level intentions—i.e., intentions to perform specific motor acts in relation to concrete stimuli—but that there are more future-directed aspects of action planning and generation that are not accounted for by this picture. Perhaps there are further sorts of intentions that are discrete, and perhaps these originate outside of the lPFC. We will discuss this objection in detail in the next section.

'Kicking it upstairs'

One could object to our claim that the deliberative, future-directed aspects of action planning are implemented by the lPFC by arguing as follows: Pacherie's D-intentions span a big range, from specific, say the intention to eat an apple, to very broad and future-directed, such as the intention to spend your next holiday in Spain. We have exclusively focused on those D-intentions that could be accounted for by lPFC processes, but more distantly future-directed intentions may fall outside of the capacities of these processes. So, a different system might need to be invoked to account for intentions that are directed at activity in the far future. Since a part of the set of D-intentions can already be accounted for by lPFC processes, we have to split the set. Let us call the long-term D-intentions "S-intentions" (for "Spain intentions"). The suggestion then is that, in addition to the lPFC control processes, a separate system is needed to account for these S-intentions. This way, the propositional, discrete character of intentions is saved by placing it another step up the stairs in a hierarchical model of action generation. Discrete S-intentions, on this view, would be capable of causing a series of D-intentions; D-intentions would account for extended temporal associations between context-specific information; P-intentions would plan actions within the current context; and finally M-intentions would recruit the corresponding motor representations.

Although we cannot rule out something like S-intentions across the board, they bring with them a serious threat of explanatory vacuity. Going to Spain does not consist of a single action. This means that an S-intention must cause and coordinate a series of more specific D-intentions (to go to the travel agency, to buy a Spanish dictionary, to renew your passport, etc.), which, due to their more direct link with an action, would more plausibly employ the PFC processes we have discussed. The problem is that intentions are supposed to fulfill the causal aspect of the folk notion, but these S-intentions do not cause a single specific action. If we allow intentions that do not directly and immediately cause actions into our scheme, it becomes extremely unclear what intentions are causing what actions. To illustrate: going to Spain is part of John's intention to live a happy life. This H-intention ("Happy intention") does not directly select a single intention at a lower level, for example the S-intention (after all, a holiday in Spain is usually not sufficient for a happy life), but causes multiple intentions, just like the S-intention causes multiple D-intentions. Maybe going to Spain is part of John's intention to impress his neighbors, which makes him happy. In that case the impressing-neighbors intention sits between the H-intention and the S-intention, but perhaps only partly, as John might have gone to Spain irrespective of his neighbors' opinion. Allowing intentions that do not directly cause actions, but cause other intentions in turn, opens the door to a virtually unlimited number of intentions for any specific action.
Since these sorts of intentions, on the folk notion, are supposed to be the primary explanatory construct for each action, saving discrete intentions by placing them at further and further distances—both conceptually and anatomically—from actual actions undermines their explanatory leverage.

Furthermore, the idea of propositional intentions causing the processes of action control we have discussed faces empirical problems. In order to explain extended action generation with discrete intentions, not only must a location for these intentions be posited, but it must be made clear how these discrete intentions cause specific action-control processes—i.e., by what neural pathways and in what temporal order. One possible locus for attempting to find discrete inputs to the lPFC would be the medial prefrontal cortex, due to its relative lack of sensory input and considerable projection to lPFC (Ongür & Price, 2000), and due to its recognized function of underlying the motivational elements of action generation (Egner, 2009). However, mPFC does not provide a univocal input to lPFC, but instead influences lPFC processing in several places (Kouneiher et al., 2009). Kouneiher and colleagues (2009) argue that motivational influence affects both contextual and episodic levels of control—i.e., the posterior PFC as well as mid-PFC. This means that the motivational factors that prime actions cannot be parceled as discrete and unitary input states. Instead, they appear to be multi-faceted processes that modulate activity at different anatomical locations and on different levels of neural control, implemented by multiple heterogeneous and dynamically interacting structures. This kind of difficulty, we contend, makes it unlikely that a pathway can be found via which a discrete intention (of the S- or H-type), formed independently of the control processes, could be propagated to the lateral PFC. See Figure 2 for an overview of the connectivity we have discussed.

Figure 2. Action control (based on Koechlin and Summerfield (2007)) and informational complexity in the prefrontal cortex. There is interconnectivity between the different regions of the PFC, as well as input at each level from sensory areas (Miller & Cohen, 2001). Moreover, there is motivational input at different levels from the dorsal anterior cingulate cortex (ACC) and the pre-SMA (Kouneiher et al., 2009). Note that this is just a graphical overview of the control processes we have discussed, and the combined functional connectivity that underlies them, not an alternative theory of action generation or control.

Relatedly, functional connectivity studies into aPFC suggest that there is massive influence from more posteriorly located ("bottom-up") PFC processes, as well as from mPFC and from sensory and subcortical areas (Miller & Cohen, 2001). So, the information processing leading to action generation in the aPFC is not likely to be caused primarily by another system producing discrete intentions elsewhere in the brain, but is most likely due, at least in large part, to other processes (such as the motivational ones discussed above) occurring within the PFC. This is, of course, to be expected once one realizes that deliberation does not take place "in a vacuum", but is to a large extent influenced by the context an agent is in, the motor capabilities of the agent, the background state of the agent, etc.
Figure 2 clearly shows that the folk notion, with its discrete, context-independent and causal character, does not match the neural processes that cause and control our actions.

Cognitive outsourcing

Thus far we have argued that the type of intentions posited by the folk notion cannot be the primary causes of our behavior. However, this does not exclude the possibility that discrete representations—for instance certain memory traces or linguistic representations—might play some other, subsidiary role. Here, we speculatively suggest such an account, following Clark and others (Clark, 1997; 2006; Elman, 2004), on which discrete representations are a means for cognitive outsourcing, or scaffolding. On this type of view, discrete representations can serve as stabilizing factors for the more dynamic processes that actually control actions (Clark, 2006, p. 372). Specifically, the idea is that, while actions are initiated and controlled via a dynamic process centered in the PFC, additional representations can occasionally play a role as stable elements around which the dynamic processes can organize themselves. So, while it may be highly problematic to hold that discrete representations directly cause actions, due to the conceptual and empirical problems we have discussed, it may be that the presence of stable representations provides a "cognitive resource" which allows the PFC to update its actions in accordance with the representation. This would be akin to writing "go to Spain" on a blackboard or in a calendar, and looking at it occasionally when one gets distracted. Such an outsourcing process could enable processes that would not be possible otherwise. For instance, it might be that for actions that demand long deliberation periods and are highly complex, such as a trip to Spain or planning a party, the PFC by itself would not have—and this is a speculative suggestion—the storage capacity to coordinate all of the appropriate actions on its own. Clark gives a compelling argument that some distinctively human processes depend on such outsourcing procedures, and complex, long-range planning might well be one of them (Clark, 2006).

The process of cognitive outsourcing is emphatically a different type of process than the ones attributed to the folk interpretation of intention. The stable discrete or linguistic representations serve only as an occasional guidepost by which a structure like the PFC can orient itself in the course of an extended process. As Clark notes, there is no outside governing system that uses the representations to direct the process in question. Instead, the outsourcing view posits that stable representations are tools for the self-manipulation of cognitive systems (Clark, 2006, p. 373), not a gateway for the influence of outside systems. Thus, when applied to action, the outsourcing view posits that the causation and control of particular actions are still centered squarely within the PFC, and this still undermines the causal properties associated with discrete intentions by the folk notion. Of course, this proposal is speculative, and the details of the relationship between the dynamic and continuous control processes and processes operating on larger time scales or more stable representations demand thorough investigation.
However, we believe that research into intentional action is better served by focusing on the complex interaction of different control processes, and perhaps their link with additional stable representations, than by attempts to localize intentions of the folk variety, or to base elaborate hierarchical models of action and motor systems on such intentions. In our view, the latter approach does not do justice to the complexity involved in action generation and control. The important questions in understanding a complex system involve investigating how structure emerges from the continuous interaction of the components involved (Bechtel, 2008). In this particular case, we contend that understanding action control will involve understanding (i) how multiple types of information are integrated, both across modalities and across time, (ii) how informationally complex processes in the anterior regions of the PFC are translated into motorically complex information in more posterior areas, and (iii) how these processes are shaped by the context of the action. The folk framework has no explanatory resources for addressing these questions, which we claim are vital for understanding how cognitive beings interact with the world.

In summary, we have argued that the discrete-intention view to which the folk notion is committed is not a suitable framework in which to accommodate the complexity and details of action generation in the brain. How, then, can we account for the compelling intuition underlying the ubiquity of the folk notion? In the next section we will sketch the outlines of an answer to that question, on which the characteristics of the folk interpretation of intentions stem from the use of the concept of intention in social explanation and communication.

The social origin of intentions

To make a first step towards understanding our strong intuition that intentions are the cause of our behavior, we have to look at how we go about explaining behavior. Suppose we see Mary in the hallway of her department, and we want to explain why she is walking to the coffee machine. In posing this question, a crucial step has already been taken implicitly: we have framed Mary's behavior in terms of a single action—i.e., walking to the coffee machine. We have already zoomed in on one aspect of Mary's behavior, and ignored other aspects. McFarland (1989) calls this the "teleological hypothesis," the idea that a piece of behavior can be considered in isolation from the rest of the behavioral repertoire (p. 39). In reality, he claims, actions are never singular. In a normal episode of walking to the coffee machine, Mary in fact performs a variety of actions: she retains an upright posture; she tries to avoid making too much noise; she hums, or perhaps mumbles; she greets her colleague; she stretches her legs, etc. Describing behavior in terms of a single action is in many ways an artificial strategy (McFarland, 1989; Uithol et al., 2012). Moreover, walking to the coffee machine is but one level at which we can describe the action (Uithol, van Rooij, Bekkering, & Haselager, 2011b). What Mary does at one moment can be described at many levels: flexing her knee, putting one leg in front of the other, walking, getting coffee, relieving herself of drowsiness, attempting to work on her paper more effectively, pursuing a fruitful scientific career, etc. In most cases an explanation such as "Mary is getting coffee" is intuitively the most reasonable, but not in all cases.
Which level we pick to describe her behavior depends on what we want to explain (Vallacher & Wegner, 1987), but also, and importantly, on whom we are explaining it to. For instance, if we are explaining Mary's perambulations to her boss, we might focus more on her desire to be alert and productive than on the action of getting coffee per se. So, in everyday explanation of actions, we tend to pick only a sub-section of the behavior (we focus on the obtaining of coffee, and momentarily forget about the humming, the posture, the leg stretching, etc.), and focus on one level of action description ('walking' is the action, and not 'placing one leg in front of the other', or 'working efficiently on a paper'). But which action and which level we pick depends on the context and the nature of the action, as well as on the audience to which we explain the action. While this strategy is artificial in several ways, it serves vitally important roles in social explanation. Since we do not have access to the complex interaction of the various control processes that shape Mary's behavior, we posit a single intention that explains just that part of Mary's behavior that we are interested in. So not only do we adopt what Dennett calls an "intentional stance" towards Mary (Dennett, 1987), it is a highly contextualized stance, focused on only one aspect of behavior, at one level of description. This means that one possible way of accounting for the origin of the folk notion of intention is by stressing its role in explanations of behavior. Since these explanations are largely a social tool (i.e., we explain Mary's actions to someone), the positing of discrete intentions serves an important communicative function.

Interestingly, we seem to be in a similar situation with regard to explaining our own actions as we are in explaining Mary's. Just as we have no direct access to the goings-on in Mary's prefrontal cortex, our conscious access to our own control processes is limited as well (Dennett, 1991; Sellars, 1963; Wegner, 2003). So when asked what we are doing or why we are doing it, the best we can do is give a shorthand version, or a rough approximation of the expected result of the various control processes. This also holds for contexts in which we explain our actions to ourselves (see also De Ruiter et al., 2007). Suppose John is trying to figure out why he went to Spain. It is well beyond his introspective abilities to recall all of the complex factors involved in producing the necessary actions, which, as we have argued, would be highly mediated by activity in the PFC. What he can do, perhaps, is describe some primary motivations that contributed to his action (for example, the desire to enjoy warm weather after a disappointing summer, or the wish to practice Spanish conversation). This linguistic phrasing helps us (re)construct the motivations behind our own actions, thus making our actions comprehensible (Nisbett & DeCamp Wilson, 1977), but it would be a mistake to take this approximation to be the key causal factor at work in producing the actions themselves. With this in mind, it is easy to understand why the folk notion attributes to intentions the properties we have discussed. We have shown that, explicitly for philosophers, and often implicitly for neuroscientists, intentions in the folk interpretation are thought of as propositional states, usually framed as a small sentence.
This is, of course, exactly the format one would expect when a complex of neuronal and bodily activity is summarized as a linguistic representation used for communication purposes. Moreover, language is a representational format that (unlike, for example, certain pictorial formats) facilitates, or even necessitates, abstraction and generalization. Short linguistic descriptions of objects, events, states, etc., cannot accommodate a high degree of detail. Thus, assuming that representations of this sort are both implemented in the brain and generate actions leads to the commitments of context-insensitivity and simplicity that are inherent in the folk view. Since the processing of the PFC in action generation does not match these properties, we conclude that the nature of intentions is social, not biological, and therefore that frameworks based on this notion are unlikely to significantly illuminate the processes by which the brain generates actions.

Conclusion

We have argued that the prefrontal cortex is not likely to implement states that match the folk notion of intention. We have focused on the PFC due to its known contribution to action control, and the similarity between its established functions and those that Pacherie posits for the different types of intentions. There is still much to discover about action control, and of course the brain is bigger than the PFC. Thus, one could maintain that, despite the conceptual and empirical problems laid out above, and despite what we know about neural processes of action control, discrete intentions do exist, and that through further exploration some structure or process will be found that does not exhibit informational and dynamical complexity, and can therefore be seen as implementing discrete intentions. We think that at this point the burden of proof lies with those who would entertain such a view. This burden encompasses not only finding the brain structure that could be responsible for the generation and processing of intentions, but also specifying what types of behavior these states can account for. Unless such states are found, we believe that it is a more fruitful strategy to concentrate on the nature and interactions of the dynamic processes that shape our actions, instead of holding on to a folk interpretation of intention for which there seems to be little empirical motivation.

abstract

Intention reading and action understanding have been reported in ever-younger infants. However, the notions of intention attribution and action understanding, as well as their relation to each other, are surrounded by much confusion, making it difficult to assess the meaning and value of such findings. In this paper we set out to clarify the notions of 'action understanding' and 'intention attribution', and discuss their relation. We will show that what is commonly referred to as 'action understanding' in fact encompasses various heterogeneous association and prediction mechanisms. In general, these forms of action understanding do not result in the attribution of an intention to an observed actor. By disentangling intention attribution from action understanding, and by exposing the latter as an umbrella notion, we provide a novel theoretical framework that prevents conceptual confusion, allows for better comparison of findings from different experimental paradigms, and offers a much more fruitful approach to comparative questions.

This chapter is submitted for publication as: Uithol, S., & Paulus, M. (submitted).
What do infants understand of others' action? A theoretical account of early social cognition.

six
action understanding in infants
action understanding, intention, infancy, social cognition

Introduction

Research in recent decades has provided ample evidence that young infants process information about other people's actions in a particular way, and that they use this information to understand and predict others' behavior, as well as to react adequately to it (e.g. Barresi & Moore, 1996; Bigelow & Birch, 1999; Carpendale & Lewis, 2004; Elsner, 2007; Elsner & Aschersleben, 2003; Falck-Ytter, Gredebäck, & Hofsten, 2006; Hauf, 2006; Kochukhova & Gredebäck, 2010; C. Moore, 2006; Nyström, Ljunghammar, Rosander, & Hofsten, 2011; Paulus, 2011; Phillips, Wellman, & Spelke, 2002; V. Reddy, 2010; Reid et al., 2009; Ruffman, Taumoepeau, & Perkins, 2011; Sodian, 2011; Southgate, Johnson, Karoui, & Csibra, 2010; Tomasello, 1999; Woodward, Sommerville, & Guajardo, 2001). Initially, these findings were mainly based on the application of traditional preferential looking paradigms (e.g. Woodward, 1998), but more recent research methods such as eye-tracking and EEG recordings have partly confirmed and extended these findings (e.g. Falck-Ytter et al., 2006; Paulus, Hunnius, & Bekkering, 2011a; Reid et al., 2009; Reid, Csibra, Belsky, & Johnson, 2007). For example, by measuring gamma-band oscillations over the frontal cortex, Reid and colleagues (2007) provided evidence that 8-month-old infants differentiate between complete and incomplete actions. Employing eye-tracking technology, Falck-Ytter and colleagues (2006) examined infants' eye-movements during the observation of a grasping and transport action, and provided evidence that 12-month-old infants were able to visually anticipate the target of the ongoing actions. Moreover, it has been shown that infants at the end of their first year of life differentiate between different kinds of intentional actions: they react more impatiently when an adult is unwilling to hand them a toy compared to when he is unable to do so (Behne, Carpenter, Call, & Tomasello, 2005). Taken together, these studies provide rich evidence for the claim that from early in life human infants possess some ability to understand other people's actions.

A number of influential theoretical accounts have interpreted these data as evidence for infants' ability to read others' intentions (Baron-Cohen, 1995; Luo & Baillargeon, 2010; Meltzoff & Brooks, 2001; Tomasello, 1999; Tomasello, Carpenter, Call, Behne, & Moll, 2005; Woodward, 2009). More precisely, it has been argued that action understanding depends on the ability to read intentions. For example, Luo and Baillargeon (2010) start their review by stating that "[o]ur ability to make sense of others' intentional actions rests primarily on our ability to understand the mental states that underlie these actions" (p. 301). Baldwin and Baird (2001) claim that "[o]ur everyday, common-sense ability to interpret and predict others' behavior hinges crucially on judgments about the intentionality of others' actions" (p. 171), and Woodward (2009) agrees that "infants understand intentions as existing independently of particular concrete actions and as residing within the individual" (p. 55). Interestingly, others—while still adhering to the notion of intention understanding—have reversed this relation and claim that intention reading is the product (i.e. the result) of a more low-level form of action processing.
For instance, Gallese and colleagues (2009) stress the impact of infants' mirroring processes (motor simulation of observed actions), and posit that this functional mechanism forms the basis of their capacity to understand intentions. Similarly, Gallese and Goldman (1998) state that "humans' mindreading abilities rely on the capacity to adopt a simulation routine. This capacity might have evolved from an action execution/observation matching system" (p. 493). On this view, simulation routines ensure that others' "intentions can be directly grasped" (Gallese et al., 2009, p. 105).

It is not only the role of intention attribution that is subject to debate; the notion itself is as well. For example, in a cautious note, Baldwin and Baird (2001) acknowledge that "our everyday notions of intention and intentionality may turn out to be invalid as characterizations of the genuine content or processes of the mind/brain" (p. 172). That is, even though in daily life we generally speak about others' intentions, it is questionable whether the concept of 'intentions' is a valid scientific construct that is useful for guiding neuroscientific research (see also Uithol et al., submitted).

Taken together, notwithstanding the volley of publications on the early development of intention reading and its role in action understanding, there does not seem to be general agreement on the notions of intention attribution and action understanding, the relation between the two processes, or their role in cognitive functioning (see also Perner & Doherty, 2005). We will argue that much of the discussion is the result of imprecise use of terminology. By showing that several distinct processes are subsumed under the label of 'action understanding', and by detaching these processes from intention attribution, we aim to help this discussion forward. We will argue that in everyday situations these processes only rarely result in the attribution of an intention. Consequently, framing these various forms of infants' action understanding in terms of 'intention attribution' can be confusing and easily lead to over-estimation of infants' capacities. Importantly, our claim is more encompassing than a suggestion about terminology; we will show that our more precise terminology allows for reinterpretation of current findings, and can aid in designing new experiments.

What do we understand, when we understand an action?

There seems to be a great variety in the interpretation of the notion 'action understanding' in the social cognition literature (see Uithol, van Rooij, Bekkering, & Haselager, 2011b for a detailed overview). First, it can mean classifying an action, e.g. recognizing a grasp as a grasping action (examples will be discussed below). Next, it can mean recognizing the goal behind an action, i.e. recognizing a grasping action as being aimed at a particular target. This goal, in turn, also allows for various interpretations. In its most general form, a goal is framed as a desired world state, but in cognitive and neuroscientific experiments, goal recognition is generally operationalized as target prediction or super-ordinate action prediction. In target prediction, the notion of 'goal' is interpreted as the object or end location towards which an action is directed: we see a grasping action, and we understand that the grasp is directed at this specific object. Super-ordinate action prediction, in turn, means that the goal of an action is interpreted as another action, but one of a higher abstraction.
A grasping action towards a cup can be interpreted as a cue that the actor wants to drink. Drinking is also an action, but one of a higher abstraction (i.e. grasping the cup is a means to the goal of drinking). Fourth and finally, action understanding can mean generating an appropriate response to an observed action. We will discuss the different forms of understanding in more detail, and relate them to research into infant action understanding.

Action classification

When we classify an action, we recognize that it belongs to a certain category of actions. Through this categorization, the observed action is recognized to have the properties that belong to the category to which the action is allocated. That is, we do not perceive another person's behavior as unrelated movements, but rather as a particular action that follows a particular pattern. Such knowledge allows us to make sense of the observed movements, as we are able to embed them into structures of actions, and it enables us to make predictions about how the action will continue or end. For example, by recognizing an action as a grasping action, the observer is able to predict that the hand will follow a particular movement pattern and continue its trajectory towards an object and take hold of it. This recognizing and predicting should emphatically not be interpreted as a conscious process, but rather as a process of pattern completion (Buffart, Leeuwenberg, & Restle, 1981). This is something our mind does all the time: just as we recognize a chair as something we can sit on without reflecting on its use.

Only few studies have examined infants' developing ability of action classification (Baldwin, Andersson, Saffran, & Meyer, 2008; Friend & Pace, 2011; Loucks & Baldwin, 2006; Saylor, Baldwin, Baird, & LaBounty, 2007). For example, Baldwin and colleagues (2001) presented 10- to 11-month-old infants with videos of series of ongoing actions. They showed that infants were more surprised when the video was paused in the middle of an action than when this happened in between two actions, which suggests that infants recognized the beginning of an action as part of a particular type of action, and that they thus classified the action. Another example is provided by Behne and colleagues (2005), who showed that 9- to 18-month-old (but not 6-month-old) infants correctly classified an action either as a failed attempt to hand them a desired toy or as deliberately not handing the toy (i.e. teasing). The mechanisms underlying action classification in infants are still unclear, but work with adults suggests that plain familiarization and the detection of statistical regularities may account for much of this capacity (Baldwin et al., 2008). Infants observe thousands and thousands of grasps, for instance, so generalization across these instances is likely to occur. Yet, it remains an open question for future research to examine whether there are other mechanisms involved, and whether there are also top-down influences on action classification1.

1 Mirror neuron studies, for example, suggest such an effect. Umiltà and colleagues (2001) showed that monkeys were able to recognize a partly occluded grasping action only when the monkey knew that there was a graspable object behind the occluder.

Target prediction

Anticipating the target of an action upon observing a not yet completed action is—as with action recognition—having an expectation about what will happen when the action is continued or how the action will finish. Target prediction has been a major topic of interest in infancy research over the past decade, so there is now a considerable body of research on infants' target anticipations (Cannon & Woodward, 2012; Falck-Ytter et al., 2006; Paulus, Hunnius, & Bekkering, 2011a; Woodward, 1998). Two variations can be discerned: the target is an object (Woodward, 1998) or the target is a certain location (Falck-Ytter et al., 2006; Kochukhova & Gredebäck, 2010)2. In simple experimental setups, with for example only one object around the target location, this is highly similar to action recognition. In both action classification and target prediction, action understanding is related to a prediction of how the action will end. Recognizing an action as a grasping action involves recognizing that there is a target of the grasping action. Predicting the target is more difficult, however, when there are multiple action targets available. In these cases plain associations between a graspable object and an action no longer provide enough information for forming an expectation about the action. Additional information needs to be incorporated. This information could be based on the recognition of statistical regularities (Ruffman et al., 2011) or the integration of context information (Perner & Ruffman, 2005; see also Uithol, van Rooij, Bekkering, & Haselager, 2011a). Paulus et al. (2011b), for example, showed that 9-month-olds visually anticipate which path (out of two) an agent is going to take, based on previous visual experience with this agent and his behavior. In addition to statistical regularities of the observed actor, infants can employ cues such as grip types to predict the correct target. One-year-old infants expect an actor to grasp a particular object depending on the actor's grip size (Daum, Vuori, Prinz, & Aschersleben, 2009), and 20-month-old infants visually anticipate the correct action target of a tool-use action depending on how the actor grasps the tool (Paulus, Hunnius, & Bekkering, 2011a). Furthermore, it has been shown that infants rely on their own action experiences to predict others' action goals (Sommerville, Woodward, & Needham, 2005) and also take affordances and functional properties of objects included in the task into account (Gredebäck & Melinder, 2010; Gredebäck, Stasiewicz, Falck-Ytter, Rosander, & Hofsten, 2009).

2 Note that in Woodward's (1998) influential paradigm both interpretations of target are present, although only target objects are referred to as 'goals', while target locations are dubbed 'locations'.

Super-ordinate action recognition

In super-ordinate action recognition it is understood to which higher-level action the observed action contributes. For instance, it is understood that a cup is grasped for drinking. This type of goal recognition is usually tested by means of prediction of the subsequent action (e.g. moving the cup towards the mouth). Super-ordinate actions can be more difficult to recognize, as often there is not a one-to-one mapping between observed actions and the super-ordinate actions. Not every cup is grasped for drinking, for example; the cup can also be grasped in order to be placed in the dishwasher (see Iacoboni et al., 2005). This means that additional cues are needed to form the right expectation about the next action. For example, a grasp towards a full cup is more likely to be followed by drinking, an empty cup by placing it in the dishwasher. So next to straightforward action-object and person-object associations (Paulus, 2011; Perner & Ruffman, 2005), context information and additional information about the object are needed to successfully predict these actions. Super-ordinate action recognition has, to our knowledge, not been a topic of wide interest in infant studies. The only study that examined this issue comes from Woodward and Sommerville (2000), who showed that 12-month-old infants understood a single action as embedded in a sequence of other actions (i.e. as predictive for a subsequent action), when the actions were causally connected. One reason for this lack of research interest might be that infants' super-ordinate action recognition is methodologically more difficult to assess than target anticipation, as the latter deals with concrete objects that stay at the same place. Further research is necessary to understand the ontogenesis of super-ordinate action recognition, and the mechanisms that could underlie this capacity.

Response selection

A final form of action understanding we want to discuss is being able to perform an appropriate response in social interactions (Bernieri, Reznick, & Rosenthal, 1988; Eckerman & Peterman, 2001). The recognition of response selection as a form of action understanding is theoretically rooted in embodied cognition (Clark, 1997) and ecological approaches to social interaction (Knoblich & Sebanz, 2008, scenario 1; K. L. Marsh, Richardson, Baron, & Schmidt, 2006). For example, infants in their first year of life learn to react adequately to their caregivers' attempts to feed them by opening their mouth in advance (van Dijk, Hunnius, & van Geert, 2009; Young & Drewett, 2000). Were a third person to observe these interactions, and subsequently be asked why the infant reacted the way it did, she would likely say that the infant recognized the intent of the other (i.e. to feed) and reacted with the appropriate action. That is, one is inclined to think that in order to perform an appropriate response, one needs to first understand an action, making response selection a consequence of action understanding, not itself a form of understanding. This need not be the case. It might be enough that infants recognize a particular context (e.g., a particular daily routine, a room) and that they learn to act on the perception of a particular signal (e.g., the approaching spoon) with the opening of their mouth. Even though such behaviors do not necessarily entail a prediction or a full classification, but merely the initiation of a "correct" response (without awareness that it is a correct response), they do show some form of action understanding (De Jaegher, Di Paolo, & Gallagher, 2010; Mead, 1934). This form of action understanding might rely on habitual learning and the acquisition of perception-action associations (Bargh & Chartrand, 1999; Heyes, in press; Van Schie, van Waterschoot, & Bekkering, 2008).

Summary

What the exposition above shows is that not every form of action understanding has received an equal amount of attention within developmental psychology. Target prediction seems by far the most studied, which makes it possible to discuss candidate mechanisms underlying this capacity. Other forms of action understanding, such as super-ordinate action recognition and action response, are largely overlooked, so that suggestions for the underlying mechanisms are much more speculative.
The developmental pathways of these abilities remain a subject for future research. Note that the order in which we presented the various forms of action understanding does not suggest an increase in complexity. Recognizing a scooping action as belonging to 'feeding' is likely to occur before complex actions such as uncorking a bottle can be recognized. Neither does the order suggest that the latter forms of understanding depend on the former. Recognizing a target object may very well be a prerequisite for classifying an action as a grasping action (see footnote 1). Future empirical research is needed to investigate the heterogeneous roots of infants' action understanding, and the interaction of the various capacities.

We discussed ways of understanding actions, and how infants might accomplish this. The review of the literature shows that action understanding is a quite heterogeneous and multifaceted concept that subsumes many different forms, which are only partly related to each other. At this point it is important to note that the forms of action understanding discussed can all be interpreted as grasping something from the other's intentional action, but none of them involve attributing an intention (see also Perner, 2010; Povinelli, 2001). This is not to argue that intention attribution does not play a role at all in both infant and adult action understanding, but that this role might be smaller than is generally assumed. In the next section we will discuss how intentions are traditionally construed and what they are posited to do or explain.

What is in an intention?

Rooted firmly in folk psychology (Haselager, 1997; Stich, 1983; Stich & Ravenscroft, 1994), the notion of intention plays a key role in psychological explanations. Intentions to act are generally thought to consist of a belief about the physical environment, a desire to change the environment, and an action plan to realize that change (Bratman, 1987; Malle & Knobe, 1997; Moses, 2001). Since intentions are intimately related to beliefs, both mental states have similar properties (cf. Uithol et al., submitted). First, both are thought of as functionally discrete mental states, meaning that they play a functional role that can be clearly contrasted with the role other beliefs and intentions play. For example, the intention to buy a pear can be contrasted with the intention to buy an apple, as each of the intentions would result in a different action. Next, both beliefs and intentions have context-independent content, allowing one to form beliefs or intentions about situations and contexts other than the current one, for example the intention to pick up groceries on your way home after work, even though you are still in the office. Finally, both mental states are often (although not necessarily) assumed to have a propositional structure, meaning that their structure or syntax resembles the structure of linguistic representations.

In the literature on intention understanding, it is generally and implicitly assumed that an intention in the actor is responsible for causing and guiding his or her action (Baldwin & Baird, 2001; Luo & Baillargeon, 2010; Tomasello et al., 2005). Intention attribution is correct, it is assumed, when the attributed intention matches the intention that was responsible for the observed behavior. For example, Baldwin and Baird (2001) equate "what others intend" with "judgments about the specific content of the intentions guiding others' actions" and claim that this is crucial for predicting behavior (p. 171).
However, there is increasing evidence that this is a highly problematic framework for studying action understanding. At least three problems make the idea of intentions being the cause of our actions an unfortunate framework for studying action understanding.

First, actions are never performed in isolation, but always in parallel and in interaction with many other actions (McFarland, 1989; Uithol et al., 2012; submitted). One cannot perform a grasping action without also breathing, making saccades, balancing one's body, etc. At the same time, one might also hum, visually target the object, or make eye contact with another person. A different intention can be attributed to each of these simultaneous actions. As a consequence, at each moment we can attribute an indefinite number of intentions to others. The objection that only one of these actions was intended, and that the others are either unconscious or necessary side effects, will be addressed below, when we discuss how our behavior is caused and controlled.

Next, every action can be described at a virtually unlimited number of levels, with a different intention for every new action description (Uithol et al., 2012; submitted). A grasping action can be described as "stretching one's elbow", "grasping an object", "drinking", "maintaining homeostasis", "survival", etc. Of course, it is unlikely that we infer "maintaining homeostasis" from an observed action, but there seems to be no principled reason to choose one level over the other as the level at which an action should be described and an intention should be attributed. Empirical research has shown that intention attribution depends on a number of factors, such as whether the observed person is liked or disliked, or whether the performed actions are positively or negatively valued (Kozak, Marsh, & Wegner, 2006). Also, explicit instruction to take the other person's mental perspective does not affect the probability of intention attribution, but merely influences the level at which the actions are described. The fact that there is not a single level at which actions are described makes any claim about what particular intention is inferred in a particular situation underdetermined, unless one has specified which level one is interested in.

Third, it is highly questionable that intentions conceived as mental states are the actual key causal factor in generating behavior. Much of our behavior (and most of the actions used in experimental setups) is generated and controlled via (the interaction of) dynamic control processes (Fuster, 2001; Koechlin et al., 2003; Kouneiher et al., 2009; Petrides, 2005; Smith, Thelen, Titzer, & McLin, 1999; see also E. Thelen & Smith, 1994; E. Thelen, Schöner, Scheier, & Smith, 2001). For example, Koechlin, Ody and Kouneiher (2003) posit a model of action control in which behavior is the result of four interacting control layers, each functioning on its own time scale. Goal-directed behavior is not the result of an action goal that is propagated from one (higher) control layer to lower ones, but emerges from the interaction of these layers (see also Uithol, van Rooij, Bekkering, & Haselager, 2012).
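The flavor of such layered control can be conveyed with a deliberately minimal toy sketch in Python. This is an illustration of the general idea only, not an implementation of Koechlin and colleagues' model; the layer periods, weights, and the 'coffee machine' scenario are assumptions made up for the example. Several control processes are updated on different time scales, and the momentary motor output is a joint function of all of them; no single variable in the simulation corresponds to a discrete 'intention' that is handed down a hierarchy.

```python
import random

# Toy sketch of layered action control (illustrative only; not the model of
# Koechlin, Ody & Kouneiher, 2003). Three control processes are updated on
# different time scales; the motor output at every tick depends on all of
# them jointly, so no single layer "contains" the goal.

random.seed(1)

class Layer:
    def __init__(self, period, inertia):
        self.period = period      # how often (in ticks) this layer is updated
        self.inertia = inertia    # how slowly its state changes when updated
        self.state = 0.0

    def update(self, tick, drive):
        if tick % self.period == 0:
            # leaky integration of the layer's input ("drive")
            self.state = self.inertia * self.state + (1 - self.inertia) * drive
        return self.state

# slow "contextual" layer, intermediate "task" layer, fast "sensorimotor" layer
layers = [Layer(period=50, inertia=0.9),
          Layer(period=10, inertia=0.7),
          Layer(period=1,  inertia=0.2)]

for tick in range(200):
    context_cue = 1.0 if tick < 100 else -1.0            # e.g. coffee machine present / absent
    sensory_input = context_cue + random.gauss(0, 0.3)   # noisy momentary input
    drive = sensory_input
    states = []
    for layer in layers:
        drive = layer.update(tick, drive)   # each layer is driven by input and slower layers
        states.append(drive)
    motor_output = sum(states) / len(states)  # behavior emerges from the joint state
    if tick % 40 == 0:
        print(f"tick {tick:3d}  layer states {[round(s, 2) for s in states]}  "
              f"motor output {motor_output:.2f}")
```

Even in this caricature, describing the resulting behavior as caused by a single goal state would be a summary of the interaction between the layers, not a description of any one component of the system.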
This complex and dynamic interaction is deeply incompatible with the prevalent understanding of the notion of intention, discussed above, as a functionally discrete state with context-independent content (Uithol, Burnston, & Haselager, submitted). Of course, it often seems like we first form an intention, after which we start our action. We decide to get coffee, and subsequently we get up from our desk, walk down the hallway and head for the coffee machine. But it is important not to confuse this personal-level and phenomenological description with the actual causes of our behavior. Our postulated intention—even if we bother to explicitly formulate one—is more likely to be a rough summary of the outcome of the complex interaction of the control processes than an actual mental state that is responsible for the subsequent actions.

When our behavior is indeed the result of dynamic and online processes, and there is no discrete intention that causes our actions, intention attribution cannot be a matter of matching the attributed intention with a behavior-causing intention. Alternatively, we would argue that when you attribute an intention, you isolate from an ongoing and continuous stream of behavior a selection that you deem worth explaining. For example, when a colleague offers you a cup of coffee, you focus on the hand that passes you the coffee cup, and momentarily ignore her breathing, her balancing her own cup, her saccades, etc. As you do not have access to the complex processes that control the observed behavior, you postulate an intention that explains just the part you are interested in, usually the most salient perceptual effect (Elsner, 2007; Paulus, in press), i.e. the passing of the coffee.

Like the other forms of action understanding discussed above, this attributed intention is an expectation of how the action will continue or end, or what action is likely to follow the current one, but there are important differences. Attributed intentions have an explicit propositional format, while the forms of low-level action prediction discussed above do not (see also Moses, 2001). This explicit propositional format allows for the attribution of highly complex intentions, which can make complex actions that are directed at the distant future intelligible, which would not be possible relying solely on low-level action predictions. For example, not only can one attribute the intention of grasping a cup to an observed action, but also grasping a cup in order to clean the table, or cleaning the table because the actor expects visitors, and so on. The level of abstraction of the attributed intentions is virtually unlimited. However, there are additional constraints. Intentions, for example, need to correspond to beliefs an actor has. An actor cannot intend to get coffee unless she also has the belief that there is a coffee machine down the hallway. This makes belief attribution also necessary for attributing intentions (Moses, 2001), thereby making intention attribution a cognitively demanding and often conscious practice (see also van Rooij et al., 2011), and one we do not often engage in, as we will explain in the next section.

We rarely attribute intentions

In everyday life we only seldom seem to engage in intention attribution. We thoughtlessly grasp a cup of coffee when someone offers one, without attributing the intention to offer coffee to the other (just like we can thoughtlessly sip our coffee without ever explicitly forming the intention to take a sip).
Work by social and cognitive psychologists has provided a compelling body of evidence that large parts of our everyday social interaction are not guided by explicit reflections and conscious considerations (Bargh, 2006; Bargh & Chartrand, 1999; Bargh & Ferguson, 2000; Bargh, Chen, & Burrows, 1996; Bernieri & Rosenthal, 1991; Cesario, Plaks, Hagiwara, Navarrete, & Higgins, 2010; Langer, 1978), and that the intentional control of action is often quite limited (for developmental findings see also Kenward, Folke, Holmberg, Johansson, & Gredebäck, 2009; Klossek & Dickinson, 2012; Klossek, Russell, & Dickinson, 2008), consisting rather of automatized action routines (Aarts & Dijksterhuis, 2000; Dijksterhuis & Nordgren, 2006).

At this point, we can think of two objections to our claim that action understanding and intention attribution are distinct processes and that the latter is a relatively rare phenomenon. First, one might claim that we are using an overly rich and high-level interpretation of the notion of 'intention'. Perhaps our analyses hold for high-level intentions, such as the intention to pick up groceries after work, but not for simpler ones, such as the intention to pick up an object. Searle (1983), for example, contrasts prior intentions with intentions in action (see also Bratman (1987) and Pacherie (2000; 2008) for similar distinctions). Prior intentions represent the goal of an action, and the means to achieve it, prior to the action, while intentions in action are the immediate causes of the movements needed to perform the action. With this contrast in mind, one could object that our analysis might hold for prior intentions, but not for intentions in action, and claim that the low-level forms of action understanding that we have discussed are instances of attributing intentions in action. This is, we believe, a problematic suggestion, as intentions in action cannot play a role in action understanding. What this suggestion entails is that for each of the observed actions (or sub-actions) an intention in action is posited, and that action understanding is the process of inferring these intentions in action. However, the only access one has to these hidden states is through observation of the actions. As every intention in action corresponds to one action or sub-action, inferring the intention in action from an observed action provides no additional information at all (e.g. one does not learn anything new from inferring the intention to grasp from a grasping action). The fact that a second processing step of inferring unobservable and unverifiable mental entities from observed actions is needed, and that this extra step does not provide a deeper form of understanding, renders the positing of intentions in action explanatorily idle with respect to action understanding.

A second objection could be that we do attribute intentions for every action we observe and understand, in all their propositional glory, but that we do this automatically and unconsciously. We see no way to test this claim empirically, but given the fact that preverbal infants engage in action prediction, but do not yet grasp the typical propositional properties of intentions (what Moses (2001) calls the 'epistemic aspect of intention': the understanding that an intention of an actor is tied to his or her beliefs and desires), and given the various possible mechanisms underlying action understanding we have discussed above, we deem this highly unlikely.
In all, when intentions are understood as propositional mental states, intention attribution is a cognitively demanding and relatively rare form of action understanding. When intentions are understood as non-propositional states that are uniquely related to an action, intention attribution does not provide any additional information over the forms of action understanding we have discussed above. Claiming that action understanding in preverbal infants is best characterized as 'attributing intentions' seems to assume that there is a third option, in between the two possibilities laid out above: one that is non-propositional but still provides extra information. We see no way to turn such a third option into an intelligible concept.

Understanding actions and attributing intentions

We have explained that intention attribution is a special and relatively rare form of understanding observed actions. Of course no one could be prevented from using the notion of 'intention attribution' for every form of action understanding as well, but we believe this to be highly problematic. It entails that the notion of 'intention attribution' is stretched to such an extent that it encompasses various heterogeneous capacities, from low-level, unconscious forms of understanding to highly cognitive, conscious and deliberative acts of attribution. This stretch evokes a serious risk of losing explanatory leverage. First, under such an account various automated prediction processes count as instances of genuine intention attribution. For example, a word processor's auto-correct function can be said to attribute the intention to write "the" when confronted with the input "teh". Similarly, a soccer ball kicked towards the goal can be attributed the intention of entering the goal (see Luo, 2011). In the end the notion of intention attribution becomes so broad and comprehensive that without further specification, it is not clear how to interpret a claim regarding intention attribution3. The capacity to associate a grasping movement and a target object is qualitatively different from the capacity to infer that an actor wants to clean up the table because he wants to make a good impression on his visitors. Consequently, data collected on one interpretation of intention attribution cannot straightforwardly be related to data collected using a different interpretation. So when action understanding and intention attribution are not carefully contrasted, and it is not specified exactly what interpretation of intention attribution is used, confusion and under- or overestimation of infants' capacities are easily created.

3 Haselager, De Groot & Van Rappard (2003) describe a highly similar problem with the notion of representation.

Alternatively, when low-level action understanding and intention attribution are properly contrasted, there are three ways in which the two notions can be related. First, one could claim that action understanding and intention attribution are entirely distinct and independent processes. While this is a theoretical possibility, we are not aware of accounts that posit such independence. Second, one could hold that intention attribution is required for successful action understanding. Although this has been suggested (Baldwin & Baird, 2001; Luo & Baillargeon, 2010), the cognitive complexity of intention attribution,
combined with the capacity to predict targets in very young infants, as well as findings of powerful statistical learning capacities in infants (Kirkham, Slemmer, & Johnson, 2002; Paulus, 2011; Ruffman et al., 2011; Saffran, Aslin, & Newport, 1996), suggests that this option is unlikely (see also de Bruin, Strijbos, & Slors, 2011). We have discussed several mechanisms that could account for the various forms of action understanding, and none of these mechanisms relied on attributing an intention to an actor. Moreover, Luo (2011) showed that infants as young as 3 months old also form predictions about non-human agents. Although one could maintain that this shows that infants already attribute intentions at the age of 3 months, even to non-human agents, a far more plausible and parsimonious explanation would be that infants do not need to attribute intentions in order to predict these movements.

Third and finally, low-level action-object associations and action expectations could be interpreted as contributing to intention attribution. In this case the observer uses the various forms of low-level expectations to come to a judgment about the intention behind an observed action. As an example, Gallese and colleagues (2009) recently suggested that before or below explicit propositional intention attribution lies a prereflexive form of action understanding that relies on the observer's own motor system. Through motor resonance, the actor's intentions "can be directly grasped without the need of representing them in propositional form" (p. 105), and this crucial role is played "by the motor system in providing the building blocks upon which more sophisticated social cognitive abilities can be built" (p. 108). We agree that the attribution of an explicit intention can be informed by the other forms of action understanding, and we believe that Gallese and colleagues (2009) made an important step in stripping away the propositional format that is generally attributed to action understanding, but we have two important objections to their theory of action understanding. First, we believe that their emphasis on motor cognition outstrips the types of action understanding that can be based exclusively on this mechanism. For example, when observing an action that involves multiple target objects, or actions that can serve multiple super-ordinate actions, plain motor resonance often cannot make one goal more likely than the other (see also Jacob, 2008; Jacob & Jeannerod, 2005; Uithol, van Rooij, Bekkering, & Haselager, 2011a). Second, while in some complex cases it might be correct to frame action-predicting mechanisms as contributing to intention attribution, or intention understanding, this does not capture what we believe to be regular action understanding, because, as we have shown above, much of everyday action understanding does not involve attributing intentions to observed agents, but relies on low-level prediction mechanisms. Below we will sketch the contours of a framework that we believe best accommodates action understanding, intention attribution, and their relation.

Infants' action understanding: towards a new theoretical framework

As outlined above, an action intention contains a belief about a particular state of the world. Consequently, an attribution of an intention to another person is closely tied to an understanding of the other's belief.
For a long time it was undisputed that infants do not attribute beliefs to other agents until about four years old (Perner, 1991; Wimmer & Perner, 1983). Recently, some have questioned this 'common sense' and claimed to have shown earlier false-belief attribution (Kovács, Téglás, & Endress, 2010; Onishi & Baillargeon, 2005). However, several theoretical models posit that these earlier forms are most likely not subserved by belief attribution, but by other mechanisms (e.g. Apperly & Butterfill, 2009; De Bruin & Newen, 2012; Perner & Ruffman, 2005; Rakoczy, 2011). Apperly and Butterfill (2009) argued that there is a difference between the explicit attribution of beliefs and other forms of understanding. The first is heavily dependent on language use and executive function, and seems to be absent until the age of four (Wellman, Cross, & Watson, 2001). The latter can only be tested implicitly (e.g. by means of a looking-time paradigm), but seems to be present at a younger age. The first empirical results that support these models have been reported (Thoermer, Sodian, Vuori, Perst, & Kristen, 2011).

Based on the analysis above, and the dependence of explicit intentions on beliefs, we can now formulate a corresponding account for action understanding and intention attribution. In this new framework, infants can acquire several forms of action understanding at a very young, preverbal age, but explicit intention attribution does not occur until much later, when cognitive capacities such as language use are adequately developed. But whereas Apperly and Butterfill argue for a two-system model of belief attribution (although they do not exclude the possibility of more systems, see footnote 1, p. 953), we have argued that action understanding relies on many interacting mechanisms.

There seems to be empirical evidence for a connection between explicit intention attribution and language understanding (Ruffman, Slade, Rowlandson, Rumsey, & Garnham, 2003). Children's social understanding is reported to be correlated with maternal talk about the child's emotions and desires (Ruffman, Slade, & Crowe, 2002; Taumoepeau & Ruffman, 2008), suggesting that intention attribution might in the first instance be more like a language game (Wittgenstein, 1953) that children learn to play than a necessity for the understanding of everyday actions (cf. Astington, 2006; Montgomery, 1997; Nelson, 2007; Olson, 1988).

The exposure of action understanding as an umbrella notion, and the insight that action understanding does not rely on, nor necessarily contribute to, intention attribution, have important consequences for future research. Instead of trying to establish intention attribution in ever younger infants (e.g. Luo, 2011), we believe that research into the ontogenesis of social cognition is better served by considering the various processes and mechanisms that underlie the different forms of action understanding, and by studying how they interact to produce behavior that enables the observer to act successfully in the social world. In older children, the interaction between low-level action prediction and explicit intention attribution can be studied. Influence might very well work in both directions. As Gallese and colleagues (2009) argue, explicit mentalizing might use information acquired by means of low-level processes. But having an idea about the actor's intention might influence how we observe or predict an action as well.
There is evidence that activity in the mirror neuron system is modulated by additional parameters, such as context (Iacoboni et al., 2005) and task specifics (Lingnau & Petris, submitted), but to our knowledge the influence of presumed intentions on action processing has not been tested directly. The relation could be even more complicated when learning is taken into account. Actions that first need to be understood by means of explicit intention attribution could, after familiarization, no longer require explicit inference, but rely on associations. For example, within the theory-of-mind research tradition it has been shown that for younger children false-belief understanding involves more frontal (i.e. executive control) processes, whereas adults do not rely to the same extent on executive functions in evaluating others' beliefs (Meinhardt, Sodian, Thoermer, Döhnel, & Sommer, 2011), suggesting a more automatized processing of others' false beliefs. This points to developmental changes in the neurocognitive mechanisms underlying the same kind of social understanding, which become more automatized once they are acquired.

By acknowledging that there are different forms of action understanding we can shed new light on various seemingly contrasting hypotheses. For example, Luo (2011) presents two views on early goal attribution: humans-first theories, which claim that infants' psychological reasoning is at first restricted to humans, and all-agents theories, which claim that infants also attribute goals to non-human agents. However, this contrast only emerges when action understanding or goal attribution is framed as an 'all-or-nothing' capacity that either starts with humans or is applied to all types of agents. By subdividing action understanding and detaching it from intention attribution, as done above, the contrast seems to disappear. Several forms of associations and predictions are at work, some of which might be stronger for humans, perhaps due to higher exposure, while others are more universal mechanisms. Woodward (1998) found longer looking times in 6-month-olds for a novel target object than for a novel target location, suggesting that these infants more strongly associate a human grasping action with an object than with a location, and that this capacity to associate is not present for poking actions with a tool. This does not seem to be at odds with Luo and Baillargeon's (2005) findings, which show that infants are also able to associate a self-propelled box with a certain target object. Luo and Baillargeon also found that target objects allow for stronger associations than target locations, and that this association is also formed with non-human agents. However, there is no ground for assuming that a 'psychological reasoning system' (Luo, 2011, p. 454) is at work here. These findings can be explained in terms of low-level associations, without relying on any psychological or mental states. Only when operating on a conflated understanding of goal or intention attribution and action understanding is one enticed to interpret these data in terms of 'psychological reasoning', and the contrast between humans-first and all-agents emerges.
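To make concrete what such low-level associations could amount to, consider a deliberately minimal sketch (an illustration only; the feature weights, trial numbers, and 'surprise' measure are assumptions made up for the example, not a model taken from the cited studies). The observer merely stores co-occurrence counts between an observed reach and the object and location it ends at; 'surprise', a crude proxy for looking time, is one minus the learned association strength. Nothing in the sketch represents a goal or an intention, yet it reproduces the qualitative looking-time asymmetry once object features are weighted more heavily than location features.

```python
from collections import defaultdict

# Minimal associative sketch (illustrative only). An observer stores
# co-occurrence counts between an observed reach and the object / location it
# ends at. "Surprise" is one minus the normalized association strength -- a
# crude proxy for looking time. The heavier weighting of the object feature is
# an assumption put in by hand; the point is only that no goal or intention
# needs to be represented to obtain the qualitative pattern.

counts = defaultdict(int)
WEIGHTS = {"object": 0.7, "location": 0.3}  # assumed: objects bind more strongly

def observe(event):
    for feature, value in event.items():
        counts[(feature, value)] += 1

def surprise(event, n_observations):
    strength = sum(WEIGHTS[f] * counts[(f, v)] / n_observations
                   for f, v in event.items())
    return 1.0 - strength

# habituation: the hand repeatedly reaches for the ball on the left
habituation = [{"object": "ball", "location": "left"}] * 6
for event in habituation:
    observe(event)
n = len(habituation)

tests = {
    "same object, new location": {"object": "ball", "location": "right"},
    "new object, same location": {"object": "bear", "location": "left"},
}
for label, event in tests.items():
    print(f"{label}: surprise = {surprise(event, n):.2f}")
# new object -> higher surprise (longer looking) than new location, without
# any mental-state attribution in the system
```

Whether object features in fact carry more associative weight than location features is, of course, an empirical matter; the point of the sketch is only that such data patterns do not by themselves force a mentalistic interpretation.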
Conclusion

We have argued that it is not insightful to frame findings of action processing in infants solely in terms of 'action understanding'. What is commonly referred to as action understanding in fact consists of various heterogeneous processes of action prediction and anticipation. As a consequence, action understanding is not an 'all-or-nothing' capacity that infants of a certain age do or do not have, or that is applied to either humans only or to all agents. Instead, it appears to be a multifaceted notion, based on various mechanisms that develop gradually. Importantly, these processes of action prediction and action anticipation do not involve the attribution of an intention. As intention attribution seems to be dependent on language capacities, it does not emerge until language is sufficiently developed. Intention attribution is cognitively effortful, but it can account for understanding far more complex or future-directed actions. Most of our everyday actions, however, are understood without intention attribution.

seven

discussion

action hierarchies
simulation
intention

The previous chapters discuss the notions of 'representation', 'action', and 'intention' relatively independently. In this discussion I will highlight the connections between these notions, and show that when our interpretation of one of them changes, the others will change too. At the same time, I will illustrate how the analyses of the previous chapters can impact current and future research into the cognitive and neural mechanisms of action generation and action understanding. I will do so by first explaining the mutual dependency of the notions of 'action hierarchy' and 'intention' and discussing how the insights of Chapters 4 and 5 impact current interpretations of experimental data on imitation. Next I will use the conclusions about action understanding (Chapters 3 and 6) and the framework of embodied cognition (introduction) to assess the claim that mirror neurons 'simulate'. After having discussed claims about current research, I will discuss what the conclusions mean for the notion of intention itself, and how future research into intentional action can be built on the reinterpretation of the notion I have proposed.

Action hierarchies and imitation

I have defined intentions loosely as a desired world state, or goal, combined with an action plan for how to reach it. The action hierarchies I have discussed and criticized in Chapter 4 consist of exactly these elements: an action goal on top, and the actions needed to accomplish it below. In that chapter I have argued that a hierarchy with a goal on top that causes or initiates lower action features is conceptually problematic. Chapter 5 continues along these lines by arguing that intentions as discrete representations do not correspond to neural states. As in Chapter 4, the framework is replaced by a multitude of interacting control processes. If this framing of action control indeed corresponds better to the actual neural basis of intentional action than the traditional frameworks, there are far-reaching consequences for studying intentional action. To illustrate, I will discuss an imitation experiment by Bekkering and colleagues (Bekkering et al., 2000; Bekkering & Wohlschlager, 2002), and show how a more fine-grained notion of action understanding and the abandonment of intentions as the primary cause of actions change the interpretation of the data.

Bekkering and colleagues (2002) discuss several experiments that are generally interpreted as supporting the hypothesis that goals are represented hierarchically. In one (Bekkering et al., 2000), children had to imitate an adult model touching his or her own ear. The most common error was the so-called contra-ipsi error.
In that case the adult model used the contralateral hand in the demonstration, while children imitated this action using the ipsilateral hand, nevertheless touching the correct ear. Having the adult model touch only one ear (always right, or always left, using the ipsilateral or contralateral hand randomly) significantly reduced imitation errors. The received explanation is that eliminating the necessity of keeping track of the main goal (either left or right ear) enabled children to reproduce goals or actions lower in the hierarchy of the to-be-imitated action, such as using the correct hand or correct movement path. They conclude that children tend to imitate the goal of a movement rather than its specific means, which is in line with the general idea of 'rational imitation' (Csibra & Gergely, 2007; Gergely, Bekkering, & Király, 2002).

These interesting findings are often interpreted as evidence for a hierarchy in action observation and imitation (Bekkering & Wohlschlager, 2002; Hamlin, Hallinan, & Woodward, 2008; Sebanz & Knoblich, 2009), as well as a hierarchy in action control (Grafton & Hamilton, 2007; Hamilton, 2009). By applying the insights from the previous chapters, I will show that the first interpretation is far from straightforward and the second highly problematic. To start with the second: what is studied in Bekkering et al.'s experiment is imitation, which relies on action observation, so for this to apply to the control of an action as well, one needs to assume that the hierarchy in the observed action matches the hierarchy in action control, but it is exactly this assumption that is shown to be problematic in Chapter 4.

Then the evidence for a hierarchy in imitation or observation: the multitude of goals that can be attributed to a single action (Chapters 3, 4, 5, and 6) complicates a straightforward translation into an action hierarchy. Which goal will be attributed depends on the level of description, the embedding in a super-ordinate action, the experimental task, etc. The objection that only one goal is 'the right one', namely the one that initiated the observed action in the actor, is dismissed in Chapter 5. So even in such a highly controlled experimental setup, there are many action interpretations and levels of description possible. Was it, for instance, the intention of the actor to "make a movement with his or her right arm", "touch a random ear", "touch his or her left ear", "make a clear stimulus", "do as the experimenter instructed", "make a few euros", etc.?

Next, the fact that both the movement and the target location can be interpreted as a "goal" (Chapters 3, 4, and 6) means that we no longer have reasons to assume that the choice of target is higher in the hierarchy than the choice to use the left or the right arm. The suggestion that the data show that target locations are higher in the hierarchy than effectors begs the question, as these data were supposed to show that higher goals are better or more often imitated than lower goals. To see this: suppose a (highly flexible) actor were to touch his ear with either his hand or his foot. In that case it is to be expected that the effector would be much more salient, and the target location less so. If target locations and effectors are indeed part of an action hierarchy, the hierarchy could easily flip, depending on the experimental paradigm used1.
1 Multiple realizability (i.e. there are different means to one end) cannot fix the orientation of the hierarchy either, as we may indeed use different arms to reach a single target, but the reverse is also true: we can use one and the same arm in reaching various targets.

The data Bekkering and colleagues presented might therefore reveal a hierarchy of saliency in the stimulus, meaning that in the current setup, the target location is a more salient action feature than the specific means to reach the target. Differences in saliency can provide insight into what aspects of an action children pay attention to, or in other words, which characteristics of an action are used in action prediction, action understanding and imitation (Chapter 6). See also Elsner (2007) for a review of saliency in imitation tasks.

A more general underlying problem, not so much with the original study and presentation of the findings, but rather with subsequent interpretations of the data, is that the findings are often interpreted and cited at a more general and abstract level (Grafton & Hamilton, 2007; Hamilton, 2009). In exact wording, Bekkering and colleagues found that for an ear-touching action, children pay more attention to the target (ear) than to whether the left or right hand is used. This is an interesting and valuable finding in itself. However, Chapter 6 argues that action understanding consists of various types of associations and mechanisms. A consequence of this heterogeneous nature is that the current findings might not generalize to other types of actions (e.g. the hand versus foot action mentioned above), other types of goals (e.g. a super-ordinate action, Chapter 6), or other forms of action understanding.

In all, a straightforward interpretation of this imitation task in terms of an action hierarchy appears to be problematic. The acknowledgement that goals and intentions are not 'out there in the stimulus', ready to be picked up, but arise in interpreting the action, combined with a more fine-grained terminology with respect to action understanding, shows that other interpretations are not only possible, but sometimes even more likely (e.g. imitating the most salient aspect of an action). Follow-up studies are needed to gain a deeper understanding of the exact mechanisms that are exploited in acts of imitation and action understanding. The discrimination of various ways of understanding an action presented in Chapters 3 and 6, and the mechanisms I hypothesized to underlie these different forms of understanding in Chapter 6, could assist in designing these studies.

Motor resonance, simulation, and action understanding

Mirror neuron activity or motor resonance during action observation is often interpreted as a form of simulation of the observed action (Calvo-Merino et al., 2005; Calvo-Merino, Grezes, Glaser, Passingham, & Haggard, 2006; Decety & Grezes, 2006; Gallese & Goldman, 1998; Gallese & Sinigaglia, 2011; Goldman, 2009; Grezes & Decety, 2001), thereby claiming support for the simulation theory of mind reading (Goldman, 2006; 1989; Gordon, 1986; see also Chapter 3). This support assumes that the observer of an action uses his or her own motor system to form a prediction of the mental states responsible for the observed action (Gallese, 2007; 2009; Gallese & Goldman, 1998; Glenberg, 2006; Iacoboni, 2008; Keysers & Gazzola, 2007; Rizzolatti & Craighero, 2004; Rizzolatti & Sinigaglia, 2010). By drawing on the previous chapters, Barsalou's theory of Perceptual Symbol Systems, and Hommel's Theory of Event Coding, I will argue that this support is not straightforward, and that the persuasiveness of this support is largely based on an overly rich interpretation of the notion of 'simulation' and a strict contrast between perceptual and motor representations (a contrast that is not in line with empirical evidence, among which is the very finding of mirror neurons itself).
By arguing that conceptual knowledge is based on perceptual systems, Barsalou (1999; 2008; Barsalou, Simmons, Barbey, & Wilson, 2003) presented a perceptual theory of knowledge. In contrast to classical theories, which assume that knowledge resides in a modular and semantic system that is separate from episodic memory (Barsalou et al., 2003, p. 84), this theory posits that knowledge is represented in the same systems that produced the knowledge through perception. For example, perceiving a car activates my perceptual areas. After many perceptions of many cars, a concept of car is formed that is no longer tied to one specific perception, but that is still represented in the perceptual areas (and therefore still perceptual in nature). A perceptual symbol is thus created2. These perceptual symbols can be activated through perception (e.g. perceiving a car) or through imagery (thinking about cars). Activations of these perceptual symbols are called 'simulations'. It is important to realize that what is simulated here is a previous encounter with the perceived entity. This type of simulation is therefore a form of recognition (when a stimulus is present) or memory (in the absence of a stimulus, i.e. in the case of imagery).

2 Some (e.g. Aydede, 1999; Adams & Campbell, 1999; Zwaan, Stanfield, & Madden, 1999) have argued that Barsalou might be right in claiming that the representations are accommodated in perceptual areas, but that this does not make them necessarily modal. This, however, does not make a difference for my argumentation here.

Even though Barsalou speaks of 'sensory-motor areas', his theory is mainly built around sensory or perceptual representations. If we were to sketch an account of motor representations analogous to Barsalou's perceptual symbols, we would posit that actions are represented modally, and therefore in the motor and perhaps premotor areas. Consequently, these representations are active during action execution (the motor analogue of Barsalou's perception) but also during motor imagery. In the latter case, it can be said that the action is 'simulated', but it is important to realize that this is in the sense of memorizing the action.

For the sake of the argument I have posited a strict contrast between perceptual and motor processes. Such a strict contrast is more and more questioned, however. Hommel's Theory of Event Coding (TEC: Hommel, 2003; 2004; Hommel et al., 2001), for example, argues that the representations underlying perception and action are coded together in a common representational medium. As a consequence, the activity of a neuron or a group of neurons in an area involved in this common coding can be related to both perceptual and motor events. So for our 'motor symbols' this means that, next to action execution and action imagery, action observation also activates these multimodal action representations. This is, of course, exactly in line with the activation characteristics of mirror neurons. Consequently, assuming this common coding scheme, mirror neuron firing corresponds to the activation of a representation that underlies both perceptual and motor features of an action, or, analogous to Barsalou and colleagues (2003), the re-enactment of a modal action representation.
This means that mirror neurons can be said to simulate, but this is simulation in Barsalou's re-enactment sense.

For motor resonance to support the simulation theory of mind reading, we have to assume a different or additional form of simulation: simulation of the observed actor's mental processes (Gallese & Goldman, 1998; see also Chapter 3). The contrast between the two types of simulation I posited here maps onto the distinction between intrapersonal and interpersonal resonance presented in Chapter 3. Simulation as re-enactment can be considered a form of intrapersonal resonance, while simulation in the mindreading sense is a form of interpersonal resonance. However, interpreting mirroring processes as a form of simulation in the mindreading sense relies on some remarkable additional assumptions. First, only if we interpret mirror neuron firing as 'motor representations that also become active during action observation' are we inclined to interpret this activity in terms of simulation in the mindreading sense. This means that we have to assume a strict contrast between perceptual and motor representations, and between the perceptual and motor areas that accommodate these representations, and we have to assume that mirror neurons are motor representations. After all, if we were to consider them perceptual representations, there would be no reason to call them simulations in the mindreading sense3.

3 Following this reasoning, one could just as well flip the received interpretation, interpret mirror neuron firing as perceptual representations, claim that the remarkable finding of mirror neuron activity during action execution demands extra explanation, and attribute it to 'simulating the perceptual effect of the action'. Although this suggestion bears similarity to ideomotor theories (James, 1890), few researchers interpret mirror neurons this way.

The assumption of a strict contrast between perceptual and motor areas, and the classification of the inferior frontal gyrus (IFG)—the area in which mirror neurons were first found—as belonging to the second category, is understandable from a historical perspective. The IFG has been designated a premotor area due to its important role in the planning and execution of actions (Rizzolatti et al., 1988). It is remarkable, however, that the firing of mirror neurons—the incarnation of multimodal representations—is still implicitly assumed to be related to motor representations, and hence to simulation of an actor's motor decisions. Following TEC's common coding, areas or neurons with a dual response profile are not only to be expected, they are a necessary consequence of the representational organization4.

4 When the dual response profile is a result of the embodied nature of cognition, one would expect it to be a rather widespread phenomenon, present in multiple cortical areas and all across the animal kingdom. Indeed, Mukamel et al. (2010) established neurons with mirror properties in various (human) cortical regions, and mirror neurons have been reported in humans (Chong, Cunnington, Williams, Kanwisher, & Mattingley, 2008; Kilner, Neal, Weiskopf, Friston, & Frith, 2009; Mukamel et al., 2010), monkeys (Gallese et al., 1996; Rizzolatti et al., 1996), and songbirds (Prather, Peters, Nowicki, & Mooney, 2008). It is too soon, however, to determine whether this is as widespread as one would expect based on common coding principles.
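The logic of this last point can be made explicit with a small toy sketch (illustrative only; the 'event codes' and feature sets below are invented for the example and are not a model of TEC or of any neural data). If a stored event code is indexed by features that can be supplied either by executing an action or by observing one, then any readout of such codes exhibits a dual, mirror-like response profile for free.

```python
# Toy sketch of common coding (illustrative; not a model of TEC or of neural
# data). An "event code" is just a set of features. Activation is feature
# overlap, regardless of whether the features arrive via execution or via
# observation -- so a dual ("mirror-like") response profile falls out of the
# representational format itself.

EVENT_CODES = {
    "grasp-cup":   {"hand", "closing-grip", "cup", "toward-mouth-possible"},
    "push-button": {"finger", "extension", "button"},
}

def activation(input_features):
    return {name: len(code & input_features) / len(code)
            for name, code in EVENT_CODES.items()}

executing_grasp = {"hand", "closing-grip", "cup", "efferent-command"}
observing_grasp = {"hand", "closing-grip", "cup", "retinal-motion"}

print("during execution: ", activation(executing_grasp))
print("during observation:", activation(observing_grasp))
# The "grasp-cup" code is strongly active in both cases; nothing in the sketch
# simulates another agent's mental states.
```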
If the dual response profile of mirror neurons is indeed a consequence of the representational organization of the brain, then interpreting the activation of mirror neurons upon action observation as a form of covert simulation in the mindreading sense is taking a teleological stance towards a phenomenon that might be the result of plain neural organization. The discussion about the function of mirror neurons should therefore be preceded by a discussion about the origin of these neurons. Yet even if mirror neurons are a direct consequence of the modal, embodied representational structure of the brain, it could still be the case that the brain makes use of this dual response profile, that mirror neurons can be attributed a function after all, and that this function has to do with action understanding. Even then, the notion of 'mindreading' is still out of place. Chapter 5 argues that the idea of discrete intentions causing our actions is highly problematic. Consequently, the idea of mindreading as 'inferring the mental states that were the cause of the observed actions' is equally problematic. Chapter 6 argues that action understanding is a multifaceted process, and that it is plausible that mirroring processes play a role in the mechanisms underlying some of the discussed forms of action understanding. These processes, however, are better described as forms of action prediction than as 'mindreading'.

The future of intentions

An overall theme that connects the previous chapters is the straightforward application of psychological and folk-psychological notions in neuroscientific research, and the problems this creates. Of course I am not the first to object to such application; eliminative materialism (Churchland, 1981; Lycan & Pappas, 1972) has been around for a while, and argues that a mature neuroscience will in the end replace folk-psychological notions, and that, consequently, these folk notions will be eliminated. However, being primarily a philosophical enterprise, eliminativism is usually formulated at a rather abstract level, discussing 'mental states' and 'brain states'. By contrast, I have tried to descend from this abstract level by studying particular types of mental states and concepts. In Chapter 5 the possible neural implementations of intentions are analyzed in detail, and it is concluded that the folk interpretation of 'intention' does not apply to any particular physical process by which the brain initiates actions.

The conclusion of Chapter 5 can be considered eliminative to some extent, as I have argued that the notion of intentions does not correspond to a particular brain state. However, I do not expect the notion to disappear with a developing neuroscience. I have suggested that the notion plays an important role in social and communicative acts, which makes it unlikely that it will ever be replaced. As Dennett (1978; 1987) explains, the concepts of 'beliefs' and 'desires' provide an easy and accurate explanation of ongoing behavior, even of systems that can be argued to have no such mental states, like thermostats and chess computers.
Similarly, intentions will continue to provide an explanation or prediction of an ongoing action that is accurate enough in a daily context. What I have been arguing, though, is that this notion does not describe a neural or psychological state, and that it therefore does not provide a fruitful starting point for empirical research.

I have speculated that for more complex and temporally extended behavior, the neural processes that guide our actions seem to interact with various inputs that function as stabilizing elements, and I have specifically emphasized linguistic structures (Clark, 2006). A linguistically structured representation can function as a reminder for future actions, similar to the way that reciting the memorized grocery list in the supermarket helps one to find the items one needs. The types of processes that the brain can rely on need not be restricted to linguistic structures, however. The brain can also rely on external stabilizing elements, such as artifacts and other agents. For example, a stop sign is a control factor in our 'safe driving behavior', and therefore accounts for part of the function attributed to the intention to drive safely. After all, if intentions are not brain states but a social construction, as Chapter 5 argues, we do not have principled reasons to include only neural processes among the processes that contribute to behavior. This means that the function we commonly attribute to an intention is in fact performed by a number of processes and elements, of which some are internal to the brain (neural control processes), some have external origins but are internalized (linguistic structures), and some are external (the stop sign). This is emphatically not a list of types of intentions—a stop sign cannot be considered an intention on its own—but a list of elements that collectively contribute to the functions generally attributed to intentions5.

5 See Frank, van Rooij and Haselager (2009) for an interesting example of how a model can acquire systematicity by relying on the systematicity in the external world. A similar process could account for the discreteness of intentions.

If one wants to maintain the causal role of intentions, and yet acknowledge the various processes that shape and control our behavior, one is forced to embrace a radical version of the 'extended mind thesis' (Adams & Aizawa, 2001; Chemero, Silberstein, & Sloutsky, 2008; Clark, 2003; 2008; Sterelny, 2010), which states that cognition is not bound by brain processes, but involves external tokens and processes as well. Within this framework, one can hold that the representation of an intention has a distributed vehicle, consisting of heterogeneous elements. However, this interpretation is rather remote from traditional interpretations of Causal Action Theory (Bratman, 1987; Davidson, 1963; Pacherie, 2000), and it is unclear what explanatory leverage is gained by maintaining intentions as the primary cause of our actions. See Figure 1 for a graphic overview of the differences between the folk-psychological framework and the alternative sketched here.

Figure 1. Diagram a) shows the traditional 'folk' framework of action causation, in which the actor's deliberation yields an intention that causes the action; b) shows the proposed alternative, in which control processes, linguistic processes, context, objects, and other agents jointly shape the action.

I have argued above and in Chapter 6 that this reinterpretation of the notion of intention has important consequences for studying action understanding.
It involves shifting the focus from how an intention can be inferred from an observed movement, which is argued to be highly problematic (Jacob, 2008) and potentially computationally intractable (Chapter 2; van Rooij, 2008), to how the interaction between context, objects and motor resonance contributes to action understanding. A first step in mapping these various processes has been taken in Chapter 6.

The proposed framework also has consequences for studying joint action. Generally, joint action is conceived as being the result of a shared intention (Bratman, 1993; Butterfill & Sebanz, 2011; Tomasello et al., 2005), which is considered to be a special, and often problematic, case of the common individual intention (Pacherie, 2011), in which the role of context is not easily accounted for. By contrast, in the proposed framework, action control processes are already distributed over the brain, context, and other agents. This means that in a similar context, the "intentions" of two or more actors already overlap to a certain extent. Also, as action initiation and control are deeply intertwined at the neural level, the process of movement synchronization (Chartrand & Bargh, 1999; Lakin, Jefferis, Cheng, & Chartrand, 2003) further shapes the joint action. Objects can also play a role: for example, the movement of the table that I am carrying with my colleague can become a stabilizing constituent, shaping my action control (Sebanz et al., 2006). This means that the focus in studying joint action should not only lie on how people form shared intentions, but also on how the interplay between the described elements and control processes results in stable goal representations, and thereby successful joint action. Additionally, reinterpreting the notion of intention might have implications well outside of the discussed fields, including neurotechnology (e.g. Brain-Computer Interfaces "reading" the intentions of patients; see Haselager, submitted) and legal theory, in which the notion plays a pivotal role as well.

Final remarks

The main conclusions of this thesis can be summarized as follows: 1) actions are not generated by straightforward top-down causation of intentions, and 2) action understanding is a heterogeneous concept that encompasses various forms of recognition and prediction. Of course the framework sketched in this thesis is by no means a full-blown theory or a fully-fledged alternative to existing theories of action generation. How, for instance, can a hierarchy based on temporal extension account for difficulties such as the exact timing of actions, the order of the sub-actions to be performed, etc.? How do dynamic action control processes interact with the stabilizing factors I have mentioned? A theory that is to replace the models and theories I have criticized will have to formulate an answer to such questions. However, I hope to have made clear that the proposed framework offers a promising alternative in which the complexity and the dynamical nature of the neural processes that control our actions and the mechanisms that underlie action understanding can be accounted for.

references

a

Aarts, H., & Dijksterhuis, A. (2000). The automatic activation of goal-directed behaviour: The case of travel habit. Journal of Environmental Psychology, 20(1), 75–82.
Abdelbar, A., & Hedetniemi, S. (1998). Approximating MAPs for belief networks is NP-hard and other theorems. Artificial Intelligence, 102(1), 21–38.

Adams, F., & Aizawa, K. (2001). The bounds of cognition. Philosophical Psychology, 14(1), 43–64.

Adams, F., & Campbell, K. (1999). Modality and abstract concepts. Behavioral and Brain Sciences, 22(04), 610.

Anscombe, G. E. M. (1957). Intention. Oxford: Basil Blackwell.

Apperly, I. A., & Butterfill, S. A. (2009). Do humans have two systems to track beliefs and belief-like states? Psychological Review, 116(4), 953–970. doi:10.1037/a0016923

Arbib, M. A., & Rizzolatti, G. (1997). Neural expectations: a possible evolutionary path from manual skills to language. Communication and Cognition, 29, 393–424.

Astington, J. W. (2006). The developmental interdependence of theory of mind and language. In S. C. Levinson & N. J. Enfield (Eds.), The roots of human sociality: Culture, cognition, and human interaction (pp. 179–206). Oxford, UK: Berg.

Aydede, M. (1999). What makes perceptual symbols perceptual? Behavioral and Brain Sciences, 22(04), 610–611.

b

Badre, D. (2008). Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes. Trends in Cognitive Sciences, 12(5), 193–200. doi:10.1016/j.tics.2008.02.004

Badre, D., & D'Esposito, M. (2007). Functional magnetic resonance imaging evidence for a hierarchical organization of the prefrontal cortex. Journal of Cognitive Neuroscience, 19(12), 2082–2099.

Baker, C., Goodman, N., Tenenbaum, J. B., Love, B., McRae, K., & Sloutsky, V. (2008). Theory-based social goal inference. Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 1447–1452).

Baker, C., Tenenbaum, J. B., Saxe, R., & Trafton, J. (2007). Goal inference as inverse planning. Proceedings of the 29th Annual Cognitive Science Society (pp. 779–784).

Bakker, B. (2005). The concept of circular causality should be discarded. Behavioral and Brain Sciences, 28, 195–196.

Baldwin, D. A., & Baird, J. A. (2001). Discerning intentions in dynamic human action. Trends in Cognitive Sciences, 5(4), 171–178. doi:10.1016/S1364-6613(00)01615-6

Baldwin, D. A., Andersson, A., Saffran, J. R., & Meyer, M. (2008). Segmenting dynamic human action via statistical structure. Cognition, 106(3), 1382–1407. doi:10.1016/j.cognition.2007.07.005

Baldwin, D. A., Baird, J. A., Saylor, M. M., & Clark, M. A. (2001). Infants parse dynamic action. Child Development, 72(3), 708–717.

Bargh, J. A. (2006). What have we been priming all these years? On the development, mechanisms, and ecology of nonconscious social behavior. European Journal of Social Psychology, 36(2), 147–168. doi:10.1002/ejsp.336

Bargh, J. A., & Chartrand, T. L. (1999). The unbearable automaticity of being. American Psychologist, 54(7), 462.

Bargh, J. A., & Ferguson, M. J. (2000). Beyond behaviorism: On the automaticity of higher mental processes. Psychological Bulletin, 126(6), 925.

Bargh, J. A., Chen, M., & Burrows, L. (1996). Automaticity of social behavior: Direct effects of trait construct and stereotype action on construct accessibility. Journal of Personality and Social Psychology, 50, 869–878.

Baron-Cohen, S. (1995). Mindblindness: An essay on autism and theory of mind. Cambridge, MA: MIT Press.

Barresi, J., & Moore, C. (1996). Intentional relations and social understanding. Behavioral and Brain Sciences, 19, 107–154.

Barsalou, L. W. (1999). Perceptual Symbol Systems. Behavioral and Brain Sciences, 22, 577–660.

Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Grounded cognition. Annual Review of Psychology, 59, 617–645. 133 Bechtel, W. (1998). Representations and Cognitive Explanations: Assessing the Dynamicist’s Challenge in Cognitive Science. Cognitive Science, 22(3), 295–318. Bechtel, W., & Richardson, R. C. (1993). Discovering Complexity. Decomposition and Localization as Strategies in Scientiic Research (p. 286). Princeton: Princeton University Press. Infant Behavior and Development, 22(3), 367–382. Botvinick, M. (2008). Hierarchical models of behavior and prefrontal function. Trends in Cognitive Sciences, 12(5), 201–208. Brass, M., & Haggard, P. (2008). The What, When, Whether Model of Intentional Action. The Neuroscientist, 14(4), 319–325. doi:10.1177/1073858408317417 Brass, M., & Heyes, C. M. (2005). Imitation: Is cognitive neuroscience solving the correspondence problem? Trends in Cognitive Sciences, 9(10), 489–495. Beer, R. (2000). Dynamical approaches to cognitive science. Trends in Cognitive Sciences, Bratman, M. E. (1981). Intention and meansend reasoning. The Philosophical Review, 4(3), 91–99. 90(2), 252–265. Behne, T., Carpenter, M., Call, J., & TomaBratman, M. E. (1987). Intention, plans, and sello, M. (2005). Unwilling Versus Unable: practical reason. Cambridge, MA: Harvard Infants’ Understanding of Intentional University Press. Action. Developmental Psychology, 41(2), 328–337. doi:10.1037/0012-1649.41.2.328 Bratman, M. E. (1993). Shared intention. EthBekkering, H., & Wohlschlager, A. (2002). Action perception and imitation: A tutorial. In W. Prinz & B. Hommel (Eds.), Attention & Performance XIX: Common mechanisms in perception and action (pp. 294–314). Oxford, UK: Oxford University Press. ics, 104(1), 97*113. Brooks, Rodney. (1986). A Robust Layered Control System for a Mobile Robot. IEEE Journal of Robotics and Automation, 2(1), 14–23. Brooks, Rodney. (1991a). Intelligence withBekkering, H., Wohlschlager, A., & Gattis, M. out representation. Artiicial Intelligence, 47, (2000). Imitation of Gestures in Children is 139–159. Goal-directed. The Quarterly Journal of Experimental Psychology Section A: Human Experimen- Brooks, Rodney. (1991b). New approaches to robotics. Science, 253(5025 ), 1227–1233. tal Psychology, 53(1), 153–164. Bennett, M., & Hacker, P. (2003). Philosophical Buccino, G., Binkofski, F., & Riggio, L. (2004a). The mirror neuron system and acfoundations of neuroscience (pp. XVII, 461). tion recognition. Brain and Language, 89(2), Malden, MA: Blackwell Publishing. 370–376. Bernieri, F. J., & Rosenthal, R. (1991). InterBuccino, G., Binkofski, F., Fink, G., Fadiga, personal coordination: Behavior matchL., Fogassi, L., Gallese, V., Seitz, R., et al. ing and interactional synchrony. In R. S. (2001). Action observation activates premoFeldman & B. Rime (Eds.), Fundamentals of tor and parietal areas in a somatotopic nonverbal behavior (pp. 401–432). New York: manner: an fMRI study. European Journal of Cambridge University Press. Neuroscience, 13(2), 400–404. Bernieri, F. J., Reznick, J. S., & Rosenthal, Buccino, G., Vogt, S., Ritzl, A., Fink, G., Zilles, R. (1988). Synchrony, pseudosynchrony, K., Freund, H.-J., & Rizzolatti, G. (2004b). and dissynchrony: Measuring the entrainNeural circuits underlying imitation learnment process in mother-infant interactions. ing of hand actions: An event-related fMRI Journal of personality and social psychology, study. Neuron, 42(2), 323–334. 54(2), 243. Bigelow, A. E., & Birch, S. A. J. (1999). The ef- Buffart, H., Leeuwenberg, E., & Restle, F. (1981). 
Coding theory of visual pattern fects of contingency in previous interactions completion. Journal of Experimental Psycholon infants’ preference for social partners. references Barsalou, L. W., Simmons, K., Barbey, A., & Wilson, C. (2003). Grounding conceptual knowledge in modality-speciic systems. Trends in Cognitive Sciences, 7(2), 84–91. 134 Chartrand, T. L., & Bargh, J. A. (1999). The Chameleon Effect: The Perception-Behavior Link and Social Interaction. Journal of personBurgess, P. W. W., Veitch, E., de Lacy Costello, ality and social psychology, 76(6), 893–910. A., & Shallice, T. (2000). The cognitive and neuroanatomical correlates of multitasking. Chemero, T. (2000). Anti-RepresentationalNeuropsychologia, 38(6), 848–863. ism and the Dynamical Stance. Philosophy of Science, 67(4), 625–647. Butterill, S. A., & Sebanz, N. (2011). Editorial: Joint Action: What Is Shared? Review of Chemero, T., Silberstein, M., & Sloutsky, V. Philosophy and Psychology, 2, 137–146. (2008). Defending Extended Cognition. In B. Love, K. McRae, & V. Sloutsky (Eds.), Bylander, T., Allemang, D., Tanner, M., & Proceedings of the 30th Annual Conference of Josephson, J. (1991). The Computationalthe Cognitive Science Society (pp. 129–134). Complexity of Abduction. Artiicial IntelProceedings of the 30th Annual Meeting of ligence, 49(1-3), 25–60. the Cognitive Science Society. Byrne, R. W., & Russon, A. (1998). Learning Chiel, H., & Beer, R. (1997). The brain has by imitation: A hierarchical approach. Bea body: Adaptive behavior emerges from havioral and Brain Sciences, 21(05), 667–684. interactions of nervous system, body and doi:doi:null environment. Trends in Neurosciences, 20(12), ogy - Human Perception and Performance, 7(2), 241. c 553–556. Calvo-Merino, B., Glaser, D., Grezes, J., Passingham, R. E., & Haggard, P. (2005). Action Chong, T., Cunnington, R., Williams, M., Observation and Acquired Motor Skills: An Kanwisher, N., & Mattingley, J. (2008). fMRI Study with Expert Dancers. Cerebral fMRI Adaptation Reveals Mirror Neurons Cortex, 15(8), 1243–1249. doi:10.1093/cerin Human Inferior Parietal Cortex, 18(20), cor/bhi007 1576–1580. Calvo-Merino, B., Grezes, J., Glaser, D., PassChristoff, K., & Gabrieli, J. D. E. (2000). The ingham, R. E., & Haggard, P. (2006). Seeing frontopolar cortex and human cognition: or Doing? Inluence of Visual and Motor Evidence for a rostrocaudal hierarchical Familiarity in Action Observation. Current organization within the human prefrontal Biology, 16, 1905–1910. cortex. Psychobiology, 28(2), 168–186. Cannon, E. N., & Woodward, A. L. (2012). Infants generate goal-based action predictions. Developmental science, 15(2), 292–298. doi:10.1111/j.1467-7687.2011.01127.x Christoff, K., Ream, J. M., Geddes, L. P. T., & Gabrieli, J. D. E. (2003). Evaluating SelfGenerated Information: Anterior Prefrontal Contributions to Human Cognition. Behavioral Neuroscience, 117(6), 1161–1168. doi:10.1037/0735-7044.117.6.1161 Carpendale, J. I. M., & Lewis, C. (2004). Constructing an understanding of mind: the development of children’s social under- Churchland, P. M. (1981). Eliminative Matestanding within social interaction. Behavioral rialism and the Propositional Attitudes. The and Brain Sciences, 27(1), 79–151. Journal of Philosophy, 78(2), 67–90. Cesario, J., Plaks, J. E., Hagiwara, N., NaClark, A. (1997). Being there: Putting body, varrete, C. D., & Higgins, E. T. (2010). The brain, and world together again. Cambridge, ecology of automaticity. How situational MA: MIT Press. 
contingencies shape action semantics and social behavior. Psychological Science, 21(9), Clark, A. (2003). Natural-born cyborgs: minds, 1311–1317. doi:10.1177/0956797610378685 technologies, and the future of human intelligence (p. 229 blz.). Chalmers, D. (1995). On implementing a computation. Minds and Machines, 4(4), Clark, A. (2006). Language, embodiment, 391–402. and the cognitive niche. Trends in Cognitive Sciences, 10(8), 370–374. Charniak, E., & Goldman, A. (1993). A Bayesian Model of Plan Recognition. Artiicial Clark, A. (2008). Supersizing the Mind: EmbodiIntelligence, 64(1), 53–79. ment, Action, and Cognitive Extension (pp. 135 Davies, M., & Stone, T. (1995). Folk Psychology: The Theory of Mind Debate (p. 350). Cambridge, MA: Wiley-Blackwell. Cohen, R. G., & Rosenbaum, D. A. (2004). Where grasps are made reveals how grasps De Bruin, L. C., & Newen, A. (2012). An asare planned: generation and recall of motor sociation account of false belief understandplans. Experimental Brain Research, 157(4), ing. Cognition. 486–495. doi:10.1007/s00221-004-1862-9 de Bruin, L., Strijbos, D., & Slors, M. (2011). Cooper, R., & Shallice, T. (2006). HierarchiEarly Social Cognition: Alternatives to cal Schemas and Goals in the Control of Implicit Mindreading. Review of Philosophy Sequential Behavior. Psychological Review, and Psychology, 2(3), 499–517. doi:10.1007/ 113(4), 887–916. s13164-011-0072-1 Craver, C. F., & Bechtel, W. (2007). Top-down De Jaegher, H., Di Paolo, E., & Gallagher, S. causation without top-down causes. Biology (2010). Can social interaction constitute and Philosophy, 22(4), 547–563. social cognition? Trends in Cognitive Sciences, 14(10), 441–447. Csibra, G. (2007). Action mirroring and action interpretation: An alternative account. De Ruiter, J., Noordzij, M. L., Newman-NorIn P. Haggard, Y. Rossetti, & M. Kawato lund, S., Hagoort, P., & Toni, I. (2007). On (Eds.), Sensorimotor Foundations of Higher the origin of intentions. In P. Haggard & Y. Cognition. Attention and Performance XXII (pp. Rossetti (Eds.), Attention & Performance 22. 427–451). Oxford: Oxford University Press. Sensorimotor Foundations of Higher Cognition Attention and Performance (pp. 593–609). Csibra, G., & Gergely, G. (2007). “Obsessed Oxford: Oxford University Press. with goals”: functions and mechanisms of teleological interpretation of actions in de Vignemont, F., & Haggard, P. (2008). humans. Acta Psychologica, 124(1), 60–78. Action observation and execution: What is doi:10.1016/j.actpsy.2006.09.007 shared? Social Neuroscience, 3(3), 421–433. Cuijpers, R., Van Schie, H. T., Koppen, M., Erlhagen, W., & Bekkering, H. (2006). Goals and means in action observation: A computational approach. Neural Networks. Cummins, R. (1989). Meaning and mental representation (pp. VIII, 180). d Damasio, A. (1989). Time-locked multiregional retroactivation: A systems-level proposal for the neural substrates of recall and recognition. Cognition, 33, 25–62. Damasio, A. R. (1985). Understanding the mind’s will. Behavioral and Brain Sciences, 8, 589. doi:10.1017/S0140525X00045180 Daum, M. M., Vuori, M. T., Prinz, W., & Aschersleben, G. (2009). Inferring the size of a goal object from an actor’s grasping movement in 6- and 9-month-old infants. Developmental science, 12(6), 854–862. doi:10.1111/j.1467-7687.2009.00831.x Davidson, D. (1963). Actions, Reasons, and Causes. Journal of Philosophy, 60(23), 685–700. Davidson, D. (1980). Essays on actions and events. New York: Oxford University Press. de Vignemont, F., & Singer, T. (2006). 
The empathic brain: how, when and why? Trends in Cognitive Sciences, 10(10), 435–441. Decety, J., & Grezes, J. (2006). The power of simulation: Imagining one“s own and other”s behavior. Brain Research, 1079, 4–14. Dennett, D. C. (1978). Brainstorms, Philosophical Essays on the Mind and Psychology. Montgomery, VT: Bradford Books. Dennett, D. C. (1987). The Intentional Stance (p. 388). Cambridge, MA: MIT Press. Dennett, D. C. (1989). Cognitive Ethology: Hunting for bargains or a wild goose chase? In A. Monteiore & C. Noble (Eds.), Goals, no-goals, and own goals (pp. 101–116). London: Unwin Hyman. Dennett, D. C. (1991). Consciousness Explained. Boston: Little, Brown and Co. Di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: a neurophysiological study. Experimental Brain Research, 91(1), 176–180. references XXIX, 286). New York: Oxford University Press. 136 Dietrich, E., & Markman, A. (2001). Dynamical description versus dynamical modeling. Trends in Cognitive Sciences, 5(8), 332. Dietrich, E., & Markman, A. (2003). Discrete Thoughts: Why Cognition Must Use Discrete Representations. Mind & Language, 18(1), 95–119. Dijksterhuis, A., & Nordgren, L. F. (2006). A theory of unconscious thought. Perspectives on Psychological Science, 1(2), 95–109. Dinstein, I., Thomas, C., Behrmann, M., & Heeger, D. (2008). A mirror up to nature. Current Biology, 18(3), R13–R18. Elsner, B. (2007). Infants’ imitation of goaldirected actions: The role of movements and action effects. Acta Psychologica, 124(1), 44–59. Elsner, B., & Aschersleben, G. (2003). Do I get what you get? Learning about the effects of self-performed and observed actions in infancy. Consciousness and Cognition, 12(4), 732–751. Erlhagen, W., Mukovskiy, A., & Bicho, E. (2006). A dynamic model for action understanding and goal-directed imitation. Brain Research, 1083, 174–188. Dretske, F. (1988). Explaining behavior: reasons in a world of causes (pp. XI, 165). Cambridge, MA: MIT Press. f Duysens, J., & Van de Crommert, H. (1998). Neural control of locomotion; Part 1. The central pattern generator from cats to humans. Gait & Posture, 7(2), 131–141. Falck-Ytter, T., Gredebäck, G., & Hofsten, von, C. (2006). Infants predict other people’s action goals. Nature Neuroscience, 9(7), 878–879. doi:10.1038/nn1729 Eckerman, C. O., & Peterman, K. (2001). Peers and infant social/communicative development. In G. Bremner & A. Fogel (Eds.), Blackwell Handbook of Infant Development (pp. 326–350). Malden, MA: Blackwell Publishers. Felleman, D., & Van Essen, D. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1), 1–47. e Edelman, G., Tononi, G., & Haier, R. (2003). A Universe of Consciousness: How Matter Becomes Imagination. Contemporary psychology, 48(1), 92–93. Egner, T. (2009). Prefrontal cortex and cognitive control: motivating functional hierarchies. Nature Neuroscience, 12(7), 821–822. doi:10.1038/nn0709-821 Fadiga, L., Craighero, L., & Olivier, E. (2005). Human motor cortex excitability during the perception of others’ action. Current Opinion in Neurobiology, 15(2), 213–218. Ferrari, P. F., Gallese, V., Rizzolatti, G., & Fogassi, L. (2003). Mirror neurons responding to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. The European journal of neuroscience, 17(8), 1703–1714. Ferrari, P., Rozzi, S., & Fogassi, L. (2005). 
Mirror Neurons Responding to Observation of Actions Made with Tools in Monkey Ventral Premotor Cortex. Journal of Cognitive Neuroscience, 17(2), 212–226. Eiter, T., & Gottlob, G. (1995). The Complex- Fodor, J. (1975). The language of thought (pp. ity of Logic-Based Abduction. Journal of the x, 214). New York: Crowell. Association for Computing Machinery, 42(1), 3–42. Fodor, J. A. (1983). The modularity of mind. an essay on faculty psychology (p. 145). The Eliasmith, C. (2005). A new perspective on MIT Press. representational problems. Journal of Cognitive Science, 6, 97–123. Fogassi, L., & Luppino, G. (2005). Motor functions of the parietal lobe. Current OpinEliasmith, C. (2010). How we ought to deion in Neurobiology. scribe computation in the brain. Studies In History and Philosophy of Science Part A, 41(3), Frank, S., Haselager, W. F. G., & van Rooij, I. 313–320. (2009). Connectionist semantic systematicity. Cognition, 110(3), 358–379. Elman, J. L. (2004). An alternative view of the mental lexicon. Trends in Cognitive Friend, M., & Pace, A. (2011). Beyond event Sciences, 8(7), 301–306. doi:10.1016/j. segmentation: Spatial-and social-cognitive tics.2004.05.003 processes in verb-to-action mapping. Develop- 137 mental Psychology, 47(3), 867–876. Gallese, V., & Sinigaglia, C. (2011). What is so special about embodied simulation? Trends in Cognitive Sciences, 15(11), 512–519. doi:10.1016/j.tics.2011.09.003 Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society B-Biological Sciences, 360(1456), 815. Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the Fritsch, G., & Hitzig, E. (1870). Ueber die elepremotor cortex. Brain, 119(2), 593–610. ktrische Erregbarkeit des Grosshirns. Archiv fuer Anatomie, Physiologie und wissenschaftliche Gallese, V., Keysers, C., & Rizzolatti, G. Medicin, 300–332. (2004). A unifying view of the basis of social cognition. Trends in Cognitive Sciences, 8(9), Frixione, M. (2001). Tractable competence. 396–403. Minds and Machines, 11(379-397). Gallese, V., Rochat, M., Cossu, G., & SinigaFuster, J. M. (2001). The Prefrontal Cortex— glia, C. (2009). Motor Cognition and Its An Update: Time Is of the Essence. Neuron, Role in the Phylogeny and Ontogeny of 30, 319–333. Action Understanding. Developmental Psychology, 45(1), 103–113. Fuster, J. M. (2004). Upper processing stages of the perception-action cycle. Trends in Garey, M., & Johnson, D. (1979). Computers Cognitive Sciences, 8(4), 143–145. and intractability: a guide to the theory of NPcompleteness (p. 338 blz.). Fuster, J. M., Bauer, R. H., & Jervey, J. P. (1982). Cellular discharge in the dorsoGentilucci, M., Bertolani, L., Benuzzi, F., lateral prefrontal cortex of the monkey in Negrotti, A., Pavesi, G., & Gangitano, M. cognitive tasks. Experimental Neurology, 77(3), (2000). Impaired control of an action after 679–694. doi:10.1016/0014-4886(82)90238supplementary motor area lesion: a case 2 study. Neuropsychologia, 38(10), 1398–1404. Fuster, J. M., Bodner, M., & Kroger, J. K. Gergely, G., Bekkering, H., & Király, I. (2002). K. (2000). Cross-modal and cross-temRational imitation in preverbal infants. poral association in neurons of frontal Nature, 415(6873 ), 755. cortex. Nature, 405(6784), 347–351. doi:10.1038/35012613 Gibson, J. (1979). The ecological approach to visual perception (pp. XVI, 332). Gallese, V. (2001). The “Shared Manifold” Hypothesis: From Mirror Neurons To EmGlenberg, A. (1997). What memory is for. pathy. 
Journal of Consciousness Studies, 8(5-7), Behavioral and Brain Sciences, 20(1), 1–18. 33–50. Glenberg, A. (2006). Naturalizing cognition: Gallese, V. (2007). Before and below “theory The integration of cognitive science and of mind”: embodied simulation and the biology. Current Biology, 16(18), R802–R804. neural correlates of social cognition. Goldberg, G. (1985). Supplementary Motor Philosophical Transactions of the Royal Society Area Structure and Function - Review and B-Biological Sciences, 362(1480), 659–669. Hypotheses. Behavioral and Brain Sciences, Gallese, V. (2009). Mirror Neurons, Embod8(4), 567–588. ied Simulation, and the Neural Basis of Social Identiication. Psychoanalytic Dialogues, Goldman, A. (2006). Simulating minds: the philosophy, psychology, and neuroscience of 19(5), 519–536. mindreading. Philosophy of mind series (pp. IX, Gallese, V., & Goldman, A. (1998). Mirror 364). neurons and the simulation theory of mindGoldman, A. (2009). Mirroring, Simulating reading. Trends in Cognitive Sciences, 2(12), and Mindreading. Mind & Language, 24(2), 493–501. 235–252. g references Fries, P. (2005). A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends in Cognitive Sciences, 9(10), 474–480. Gallese, V., & Lakoff, G. (2005). The Brain’s concepts. Cognitive Neuropsychology, 22(3), 455–479. 138 Goldman, A. I. (1989). Interpretation Psychologized. Mind & Language, 4(3), 161–185. Gordon, R. M. (1986). Folk Psychology as Simulation. Mind & Language, 1(2), 158–171. Grafton, S. T., & Hamilton, A. (2007). Evidence for a distributed hierarchy of action representation in the brain. Human Movement Science, 26(4), 590–616. Graziano, M. S. A. (2010). Ethologically relevant movements mapped onto the motor cortex. In A. Ghazanfar & M. Platt (Eds.), Primate Neuroethology (pp. 454–470.). New York: Oxford University Press. Haken, H., Kelso, J. A. S., & Bunz, H. (1985). A theoretical model of phase transitions in human hand movements. Biological Cybernetics, 51(5), 347–356. doi:10.1007/ BF00336922 Hamilton, A. (2009). Research review: Goals, intentions and mental states: challenges for theories of autism. Journal of Child Psychology and Psychiatry, 50(8), 881–892. Hamilton, A., & Grafton, S. T. (2006). Goal representation in human anterior intraparietal sulcus. Journal of Neuroscience, 26(4), 1133–1137. Hamilton, A., & Grafton, S. T. (2007). The motor hierarchy: from kinematics to goals and intentions. In P. Haggard, Y. Rossetti, & Graziano, M. S. A., & Alalo, T. (2007). MapM. Kawato (Eds.), Attention & Performance ping Behavioral Repertoire onto the Cortex. 22. Sensorimotor Foundations of Higher CogniNeuron, 56(2), 239–251. tion Attention and Performance (pp. 381–408). Oxford: Oxford University Press. Graziano, M. S. A., Taylor, C., & Moore, T. (2002). Complex Movements Evoked by Mi- Hamilton, A., & Grafton, S. T. (2008). Action crostimulation of Precentral Cortex. Neuron, outcomes are represented in human infe34(5), 841–851. rior frontoparietal cortex. Cerebral Cortex, Gredebäck, G., & Melinder, A. (2010). Infants’ understanding of everyday social interactions: A dual process account. Cognition, 114, 197–206. doi:10.1016/j.cognition.2009.09.004 18(5), 1160–1168. Hamlin, J. K., Hallinan, E. V., & Woodward, A. L. (2008). Do as I do: 7‐month‐old infants selectively reproduce others’ goals. Developmental science, 11(4), 487–494. Gredebäck, G., Stasiewicz, D. D., Falck-Ytter, Haruno, M., Wolpert, D. M., & Kawato, T., Rosander, K., & Hofsten, von, C. (2009). M. 
(2001). MOSAIC Model for SensoAction type and goal type modulate goalrimotor Learning and Control. Neural directed gaze shifts in 14-month-old infants. Computation, 13(10), 2201–2220. doi:d Developmental Psychology, 45(4), 1190–1194. oi:10.1162/089976601750541778 doi:10.1037/a0015667 Harvey, I. (1996). Untimed and misrepreGrezes, J., & Decety, J. (2001). Functional sented: connectionism and the computer anatomy of execution, mental simulation, metaphor. AISB Quarterly, 96, 20–27. observation, and verb generation of actions: A meta-analysis. Human Brain Mapping, Haselager, W. F. G. (1997). Cognitive science 12(1), 1–19. and folk psychology: the right frame of mind (pp. viii, 165). London, etc.: Sage. Grezes, J., Armony, J., Rowe, J., & Passingham, R. E. (2003). Activations related to “mirror” Haselager (submitted) Did I do that? Brainand “canonical” neurones in the human Computer Interfacing and the sense of brain: an fMRI study. NeuroImage, 18(4), agency. 928–937. Haselager, W. F. G., De Groot, A., & Van RapGrush, R. (1997). The architecture of reppard, H. (2003). Representationalism vs. resentation. Philosophical Psychology, 10(1), anti-representationalism: a debate for the 5–24. sake of appearance. Philosophical Psychology, 16(1), 5–23. Haggard, P. (2005). Conscious intention and motor cognition. Trends in Cognitive Sciences, Haselager, W. F. G., van Dijk, J., & van Rooij, I. 9(6), 290–295. (2008). A Lazy Brain? Embodied Embedded h Haugeland, J., & Rumelhart, D. (1991). Iacoboni, M., Woods, R., Brass, M., BekkerRepresentational Genera. In W. Ramsey, S. ing, H., Mazziotta, J. C., & Rizzolatti, G. Stich, & D. Rumelhart (Eds.), Philosophy and (1999). Cortical mechanisms of human Connectionist Theory. Hillsdale, N.J.: Erlbaum. imitation. Science, 286(5449), 2526–2528. Haynes, J.-D., & Rees, G. (2006). Decoding mental states from brain activity in humans. Nature Reviews: Neuroscience, 7, 523–534. Haynes, J.-D., Sakai, K., Rees, G., Gilbert, S., Frith, C. D., & Passingham, R. E. (2007). Reading Hidden Intentions in the Human Brain. Current Biology, 17(4), 323–328. doi:10.1016/j.cub.2006.11.072 Heyes, C. M. (in press). What can imitation do for cooperation? In B. Calcott, R. Joyce, & K. Sterelny (Eds.), Signalling, Commitment and Emotion. Cambridge, MA: MIT Press. j Jacob, P. (2008). What Do Mirror Neurons Contribute to Human Social Cognition? Mind & Language, 23(2), 190–223. Jacob, P. (2009). A Philosopher’s Relections on the Discovery of Mirror Neurons. Topics in Cognitive Science, 1(3), 570–595. Jacob, P., & Jeannerod, M. (2005). The motor theory of social cognition: a critique. Trends in Cognitive Sciences, 9(1), 21–25. James, W. (1890). The principles of psychology (p. 2 v.). New York: H. Holt and company. Jeannerod, M. (1994). The representing Hickok, G. (2009). Eight Problems for the brain: Neural correlates of motor intention Mirror Neuron Theory of Action Understanding in Monkeys and Humans. Journal of and imagery. Behavioral and Brain Sciences, 17(2), 187–201. Cognitive Neuroscience, 21(7), 1229–1243. Hommel, B. (2003). Planning and Represent- Juarrero, A. (2002). Dynamics in Action. Intentional Behavior As a Complex System (p. ing Intentional Action. The Scientiic World, 300). Cambridge, MA: The MIT Press. 3, 593–608. Hommel, B. (2004). Event iles: feature binding in and across perception and action. Trends in Cognitive Sciences, 8(11), 494–500. k Kakei, S., Hoffman, D. S., & Strick, P. L. (1999). Muscle and movement representations in the primary motor cortex. 
Science, 285(5436), 2136–2139. Hommel, B., Müsseler, J., Aschersleben, G., Keijzer, F. (2001). Representation and behavior & Prinz, W. (2001). The Theory of Event (pp. viii, 276). Coding (TEC): A framework for perception and action planning. Behavioral and Brain Kelso, J. A. S. (1995). Dynamic patterns: the selfSciences, 24 (5), 849–877. organization of brain and behavior (pp. XVII, 334). Cambridge, MA: MIT Press. Hubel, D., & Wiesel, T. (1959). Receptive ields of single neurones in the cat’s striate cortex. Journal of Physiology, 148, 574–591. Kenward, B., Folke, S., Holmberg, J., Johansson, A., & Gredebäck, G. (2009). Goal directedness and decision making in infants. Hume, D. (1739). A Treatise on Human Nature. Developmental Psychology, 45(3), 809–819. (P. H. Nidditch, Ed.) (2nd ed.). Oxford: doi:10.1037/a0014076 Clarendon Press. Keysers, C., & Gazzola, V. (2006). Towards a Humphreys, G., & Forde, E. (1998). Disordered action schema and action disorganisa- unifying neural theory of social cognition. Progress in brain research, 156, 379–402. tion syndrome. Cognitive Neuropsychology, 15, 771–812. 139 references i Cognition and Neuroscience. In P. Calvo Iacoboni, M. (2008). Mirroring People. The new & T. Gomila (Eds.), Handbook of Cognitive science of how we connect with others. New York: Science. An Embodied Approach (pp. 273–290). Farrar, Straus and Giroux. Oxford: Elsevier. Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Hauf, P. (2006). Infants’ perception and Buccino, G., Mazziotta, J. C., & Rizzolatti, G. production of intentional actions. Progress (2005). Grasping the intentions of others in brain research, 164, 285–301. doi:10.1016/ with one’s owns mirror neuron system. PLoS S0079-6123(07)64016-3 Biology, 3(3), e79. 140 Keysers, C., & Gazzola, V. (2007). Integrating simulation and theory of mind. Trends in Cognitive Sciences, 11(5), 194–196. Kiebel, S. J., Daunizeau, J., & Friston, K. J. (2008). A hierarchy of time-scales and the brain. Plos Computational Biology, 4(11), e1000209. doi:10.1371/journal. pcbi.1000209 Kilner, J., Friston, K., & Frith, C. D. (2007a). Predictive coding: an account of the mirror neuron system. Cognitive Processing, 8, 159–166. Kilner, J., Friston, K., & Frith, C. D. (2007b). The mirror-neuron system: a Bayesian perspective. NeuroReport, 18(6), 619–623. of manual feeding and lying spoons. Child development, 81(6), 1729–1738. doi:10.1111/ j.1467-8624.2010.01506.x Koechlin, E., & Summerield, C. (2007). An information theoretical approach to prefrontal executive function. Trends in Cognitive Sciences, 11(6), 229–235. Koechlin, E., Basso, G., Pietrini, P., Panzer, S., & Grafman, J. (1999). The role of the anterior prefrontal cortex in human cognition. Nature, 399(6732), 148–151. doi:10.1038/20178 Koechlin, E., Ody, C., & Kouneiher, F. (2003). The Architecture of Cognitive Control in the Human Prefrontal Cortex. Science, 302(5648), 1181–1185. doi:10.1126/science.1088545 Kilner, J., Neal, A., Weiskopf, N., Friston, K., & Frith, C. D. (2009). Evidence of Mirror Neurons in Human Inferior Frontal Gyrus. Kouneiher, F., Charron, S., & Koechlin, E. Journal of Neuroscience, 29(32), 10153–10159. (2009). Motivation and cognitive control doi:10.1523/jneurosci.2668-09.2009 in the human prefrontal cortex. Nature Neuroscience, 12(7), 939–945. doi:10.1038/ Kim, J. (1993). Supervenience and Mind: Selected nn.2321 Philosophical Essays (p. 400). Cambridge: Cambridge University Press. Kovács, A. M., Téglás, E., & Endress, A. D. (2010). The social sense: susceptibility to Kim, J. 
(2000). Mind in a Physical World. Camothers’ beliefs in human infants and adults. bridge, Ma: MIT Press. Science, 330(6012), 1830–1834. doi:10.1126/ science.1190792 Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. (2002). Visual statistical learning in in- Kozak, M. N., Marsh, A. A., & Wegner, D. fancy: evidence for a domain general learnM. (2006). What do I think you’re doing? ing mechanism. Cognition, 83(2), B35–B42. Action identiication and mind attribution. doi:10.1016/S0010-0277(02)00004-5 Journal of personality and social psychol- Klossek, U. M. H., & Dickinson, A. (2012). Ra- ogy, 90(4), 543–555. doi:10.1037/00223514.90.4.543 tional action selection in 1½- to 3-year-olds following an extended training experience. Lakin, J. L., Jefferis, V. E., Cheng, C. M., & Journal of experimental child psychology, 111(2), Chartrand, T. L. (2003). The Chameleon 197–211. doi:10.1016/j.jecp.2011.08.008 Effect as Social Glue: Evidence for the Evolutionary Signiicance of Nonconscious Klossek, U. M. H., Russell, J., & Dickinson, A. Mimicry. Journal of Nonverbal Behavior, 27(3), (2008). The control of instrumental action 145–162. doi:10.1023/A:1025389814290 following outcome devaluation in young children aged between 1 and 4 years. Journal Langer, E. (1978). Rethinking the role of of experimental psychology. General, 137(1), thought in social interaction. In I. H. 39–51. doi:10.1037/0096-3445.137.1.39 Harvey, W. I. Ickes, & R. F. Kidd (Eds.), New l Knoblich, G., & Sebanz, N. (2008). Evolving intentions for social interaction: from entrainment to joint action. Philosophical Transactions of the Royal Society B-Biological Sciences, 363(1499), 2021–2031. directions in attribution research (pp. 36–58). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers. Lau, H. C., Rogers, R. D. D., Haggard, P., & Passingham, R. E. (2004). Attention to Intention. Science, 303(5661), 1208–1210. Kochukhova, O., & Gredebäck, G. (2010). doi:10.1126/science.1090973 Preverbal infants anticipate that food will be brought to the mouth: an eye tracking study Lewis, D. (2000). Causation as inluence. The 141 Journal of Philosophy, 97(4), 182–197. Liepelt, R., Cramon, Von, D., & Brass, M. (2008). What is matched in direct matching? Intention attribution modulates motor priming. Journal of Experimental Psychology - Human Perception and Performance, 34(3), 578–591. doi:10.1037/0096-1523.34.3.578 Malle, B. F., & Knobe, J. (1997). The folk concept of intentionality. Journal of Experimental Social Psychology, 33, 101–121. Marsh, K. L., Richardson, M. J., Baron, R. M., & Schmidt, R. C. (2006). Contrasting Approaches to Perceiving and Acting With Others. Ecological Psychology, 18(1), 1–38. doi:10.1207/s15326969eco1801_1 Lingnau, A., & Petris, S. (2012). Action understanding inside and outside the motor Mason, C., Gomez, J., & Ebner, T. (2001). system: the role of task dificulty. Cerebral Hand synergies during reach-to-grasp. JourCortex nal of Neurophysiology, 86(6), 2896. Lingnau, A., Gesierich, B., & Caramazza, A. McFarland, D. (1989). Goals, no-goals and (2009). Asymmetric fMRI adaptation reveals own goals. In A. Monteiore & D. Noble no evidence for mirror neurons in humans. (Eds.), Goals, no-goals and own goals: a debate Proceedings of the National Academy of Scion goal-directed and international behaviour ences of the United States of America, 106(24), (pp. 39–57). London: Unwin Hyman. 9925–9930. Mead, G. H. (1934). Mind, Self, and Society. (C. Loucks, J., & Baldwin, D. A. (2006). When is W. Morris, Ed.) (p. 401). 
Chicago, London: a grasp a grasp? Characterizing some basic University Of Chicago Press. components of human action processing. Meinhardt, J., Sodian, B., Thoermer, C., DöhIn K. Hirsh-Pasek & R. Golinkoff (Eds.), nel, K., & Sommer, M. (2011). True- and Action meets words: How children learn verbs false-belief reasoning in children and adults: (pp. 228–261). New York: Oxford University An event-related potential study of theory of Press. mind. Accident Analysis and Prevention, 1(1), Luo, Y. (2011). Three-month-old infants attri67–76. doi:10.1016/j.dcn.2010.08.001 bute goals to a non-human agent. DevelopMeltzoff, A. N. (1995). Understanding the mental science, 14(2), 453–460. intentions of others: Re-enactment of Luo, Y., & Baillargeon, R. (2005). Can a selfintended acts by 18-month-old children. propelled box have a goal? Psychological Developmental Psychology, 31(5), 838–850. reasoning in 5-month-old infants. PsychologiMeltzoff, A. N., & Brooks, R. (2001). “Like cal Science, 16(8), 601–608. doi:10.1111/ Me” as a Building Block for Understanding j.1467-9280.2005.01582.x Other Minds: Bodily Acts, Attention, and Luo, Y., & Baillargeon, R. (2010). Toward a Intention. In B. F. Malle, L. J. Moses, & D. A. mentalistic account of early psychological Baldwin (Eds.), Intentions and intentionality: reasoning. Current Directions in Psychological foundations of social cognition (pp. 171–191). Science, 19(5), 301–307. Cambridge, MA: MIT Press. Lycan, W. G., & Pappas, G. S. (1972). What is eliminative materialism? Australasian Journal of Philosophy, 50(2), 149–159. doi:10.1080/00048407212341181 m Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202. Millikan, R. (1984). Language, thought, and Mahon, B., & Caramazza, A. (2008). A critical other biological categories: new foundations for look at the embodied cognition hypothesis realism (p. 355). Cambridge, MA: MIT Press. and a new proposal for grounding conMillikan, R. G. (1995). Pushmi-Pullyu ceptual content. Journal of Physiology-Paris, representations. Philosophical Perspectives, 9, 102(1-3), 59–70. 185–200. references Libet, B. (1985). Unconscious cerebral initiative and the role of conscious will in voluntary action. Behavioral and Brain Sciences, 8, 529–566. Maley, C. J. (2010). Analog and digital, continuous and discrete. Philosophical Studies, 155(1), 117–131. doi:10.1007/s11098-0109562-8 142 Miyashita, Y. (2005). Understanding Intentions: Through the Looking Glass. Science, 308(5722), 644–645. dynamics. Philosophical Psychology, 23(6), 759–773. Nilsson, N. (1984). Shakey The Robot, Technical Montgomery, D. (1997). Wittgenstein“s Note 323. Private Language Argument and Children”s Nisbett, R. E., & DeCamp Wilson, T. (1977). Understanding of the Mind, . DevelopmenTelling more than we can know: Verbal tal Review, 17(3), 291–320. doi:10.1006/ reports on mental processes. Psychological drev.1997.0436 Review, 84(3), 231–259. Moore, C. (2006). The development of comNoli, S., Ikegami, T., & Tani, J. (2008). monsense psychology (p. 241). Mahah, NJ: Editorial: Behavior and Mind as a Complex Lawrence Erlbaum. Adaptive System. Adaptive Behavior, 16(2-3), Moore, J. W., Wegner, D. M., & Haggard, P. 101–103. doi:10.1177/1059712308090150 (2009). Modulating the sense of agency with Nordh, G., & Zanuttini, B. (2005). Propoexternal cues. Consciousness and Cognition, sitional abduction is almost always hard. 18(4), 1056–1064. 
doi:10.1016/j.conProceedings of the 19th International Joint Concog.2009.05.004 ference on Artiicial Intelligence (IJCAI-2005), Moore, M. S. (2011). Philosophical foundaEdinburgh, Scotland, UK, 534–539. tions of criminal law. In R. A. Duff (Ed.), Nordh, G., & Zanuttini, B. (2008). What Philosophical foundations of criminal law (pp. makes propositional abduction tractable. 179–205). New York: Oxford University Artiicial Intelligence, 172(10), 1245–1284. Press. Moses, L. J. (2001). Some Thoughts on Ascribing Complex Intentional Concepts to Young Children. In B. F. Malle, L. J. Moses, & D. A. Baldwin (Eds.), Intentions and Intentionality (pp. 69–83). Cambridge, MA: MIT Press. Mukamel, R., Ekstrom, A., Kaplan, J., Iacoboni, M., & Fried, I. (2010). Single-Neuron Responses in Humans during Execution and Observation of Actions. Current Biology, 20(8), 750–756. Murata, A., Fadiga, L., Fogassi, L., Gallese, V., Raos, V., & Rizzolatti, G. (1997). Object Representation in the Ventral Premotor Cortex (Area F5) of the Monkey. Journal of Neurophysiology, 78(4), 2226–2230. n Nelson, K. (2007). Young minds in social worlds. experience, meaning, and memory (p. 315). Cambridge, MA: Harvard University Press. Nyström, P., Ljunghammar, T., Rosander, K., & Hofsten, von, C. (2011). Using mu rhythm desynchronization to measure mirror neuron activity in infants. Developmental science, 14(2), 327–335. o Olson, D. (1988). On the origin of beliefs and other intentional states in childeren. In J. W. Astington, P. Harris, & D. R. Olson (Eds.), Developing theories of mind (pp. 414–426). Cambridge, UK: Cambridge University Press. Ongür, D., & Price, J. L. (2000). The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cerebral Cortex, 10(3), 206–219. Onishi, K. H., & Baillargeon, R. (2005). Do 15-Month-Old Infants Understand False Beliefs? Science, 308(5719), 255–258. doi:10.1126/science.1107621 Newell, A., & Simon, H. A. (1972). Human Problem Solving (p. 784). Prentice Hall. Ouden, den, H., Frith, U., Frith, C. D., & Blakemore, S.-J. (2005). Thinking about intentions. NeuroImage, 28(4), 787–796. Newman-Norlund, R. D., Van Schie, H. T., Van Zuijlen, A., & Bekkering, H. (2007). The mirror neuron system is more active during complementary compared with imitative action. Nature Neuroscience, 10(7), 817–818. Oztop, E., Kawato, M., & Arbib, M. A. (2006). Mirror neurons and imitation: A computationally guided review. Neural Networks, 19(3), 254–271. Nielsen, K. S. (2010). Representation and Oztop, E., Wolpert, D. M., & Kawato, M. (2005). Mental state inference using visual control parameters. Cognitive Brain Research, 143 22(2), 129–151. Pacherie, E. (2006). Towards a dynamic theory of intentions. In S. Pocket, W. P. Perner, J., & Doherty, M. (2005). Do infants Banks, & S. Gallagher (Eds.), Does Consciousunderstand that external goals are internalness Cause Behavior? An investigation of the ly represented? Behavioral and Brain Sciences, Nature of Volition (pp. 145–167). Cambridge, 28(5), 710–711. MA: MIT Press. Perner, J., & Ruffman, T. (2005). Psychology. Pacherie, E. (2008). The phenomenology of Infants’ insight into the mind: how deep? action: A conceptual framework. Cognition, Science, 308(5719), 214–216. doi:10.1126/ 107(1), 179–217. science.1111656 Pacherie, E., & Haggard, P. (2011). What are Petrides, M. (2005). The Rotral-Caudal Axis Intentions? In W. Sinnott-Armstrong & L. 
of Cognitive Control within the Lateral Nadel (Eds.), Conscious Will and ResponsibilFrontal Cortex. In S. Dehaene, J.-R. Duity: A Tribute to Benjamin Libet (pp. 70–84). hamel, M. D. Hauser, & G. Rizzolatti (Eds.), New York: Oxford University Press. From Monkey Brain to Human Brain. A Fyssen Passingham, R. E., Toni, I., & Rushworth, M. F. S. (2000). Specialisation within the prefrontal cortex: the ventral prefrontal cortex and associative learning. Experimental Brain Research, 133(1), 103–113. Paulus, M. (2011). How infants relate looker and object: evidence for a perceptual learning account of gaze following in infancy. Developmental science, 14(6), 1301–1310. Paulus, M. (in press). Action mirroring and action understanding: an ideomotor and attentional account. Psychological Research. doi:10.1007/s00426-011-0385-9 Paulus, M., Hunnius, S., & Bekkering, H. (2011a). Can 14- to 20-month-old children learn that a tool serves multiple purposes? A developmental study on children’s action goal prediction. Vision Research, 51(8), 955–960. doi:10.1016/j.visres.2010.12.012 Paulus, M., Hunnius, S., van Wijngaarden, C., Vrins, S., van Rooij, I., & Bekkering, H. (2011b). The Role of Frequency Information and Teleological Reasoning in Infants. Developmental Psychology, 47(4), 976–983. Foundation Symposium (pp. 293–314). Cambridge, MA: MIT Press. Pezzulo, G., & Dindo, H. (2011). What should I do next? Using shared representations to solve interaction problems. Experimental Brain Research, 211(3-4), 613–630. doi:10.1007/s00221-011-2712-1 Pezzulo, G., Butz, M. V., & Castelfranchi, C. (2008). The Anticipatory Approach: Deinitions and Taxonomies. In G. Pezzulo, M. V. Butz, C. Castelfranchi, & R. Falcone (Eds.), The Challenge of Anticipation. A unifying framework for the Analysis and Design of Artiicial Cognitive Systems (pp. 23–43). Berlin, Heidelberg: Springer-Verlag. Pfeifer, R., & Scheier, C. (1999). Understanding intelligence (pp. xx, 697 p.). Phillips, A. T., Wellman, H. M., & Spelke, E. S. (2002). Infants’ ability to connect gaze and emotional expression to intentional action. Cognition, 85(1), 53–78. Piccinini, G. (2008). Computation without representation. Philosophical Studies, 137(2), 205–241. Penield, W., & Rasmussen, T. (1950). The cere- Povinelli, D. J. (2001). On the Possibility of bral cortex of man; a clinical study of localization Detecting Intentions Prior to Understandof function. New York, Macmillan. ing Them. In B. F. Malle, L. O. J. Moses, & D. A. Baldinw (Eds.), Intentions and IntentionPerner, J. (1991). Understanding the representaality (pp. 225–248). Cambridge: MIT Press. tional mind (p. 348). Cambridge, MA: The MIT Press. Prather, J. F., Peters, S., Nowicki, S., & Perner, J. (2010). Who took the cog out of Cognitive Science? Mentalism in an era Mooney, R. (2008). Precise auditory–vocal mirroring in neurons for learned vocal com- references p Pacherie, E. (2000). The Content of Intentions. Mind & Language, 15(4), 400–432. of anti-cognitivism. In P. A. Frensch & R. Schwarzer (Eds.), Cognition and Neuropsychology: International Perspectives on Psychological Science (Vol. 1, pp. 241–261). Hove: Psychology Press. 144 munication. Nature, 451(7176), 305–310. doi:10.1038/nature06492 Prinz, W. (1997). Perception and Action Planning. The European journal of cognitive psychology, 9(2), 129–154. Pylyshyn, Z. W. (1987). The Robot’s Dilemma: The Frame Problem in Artiicial Intelligence. Norwood, NJ: Ablex Publishing. tem and Imitation. In S. Hurley & N. 
Chater (Eds.), Perspectives on Imitation (pp. 55–76). Cambridge, Ma: MIT Press. Rizzolatti, G., & Craighero, L. (2004). The Mirror-Neuron System. Annual Review of Neuroscience, 27, 169–192. Rizzolatti, G., & Sinigaglia, C. (2010). The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretaQuian Quiroga, R., Reddy, L., Kreiman, G., tions. Nature Reviews: Neuroscience, 11(4), Koch, C., & Fried, I. (2005). Invariant visual 264–274. representation by single neurons in the human brain. Nature, 435, 1102–1107. Rizzolatti, G., Camarda, R., Fogassi, L., Gentilucci, M., Luppino, G., & Matelli, M. (1988). Rakoczy, H. (in press). Do infants have a Functional organization of inferior area 6 theory of mind? British Journal of Developmenin the macaque monkey. II. Area F5 and the tal Psychology. control of distal movements. Experimental Brain Research, 71(3), 475–490. Ramnani, N., & Owen, A. (2004). Anterior prefrontal cortex: insights into function Rizzolatti, G., Fadiga, L., Gallese, V., & from anatomy and neuroimaging. Nature Fogassi, L. (1996). Premotor cortex and the Reviews: Neuroscience, 5(3), 184–194. recognition of motor actions. Cognitive Brain q r Ranganath, C., Johnson, M. K., & D’Esposito, M. (2000). Left anterior prefrontal activation increases with demands to recall speciic perceptual information. Journal of Neuroscience, 20(22), RC108. Research, 3(2), 131–142. Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews: Neuroscience, 2, 661–670. Reddy, V. (2010). Engaging Minds in the irst Rizzolatti, G., Gentilucci, M., Fogassi, L., year: The developing awareness of attention Luppino, G., Matelli, M., & Ponzoni-Maggi, and intention. In G. Bremner & T. Wachs S. (1987). Neurons related to goal-directed (Eds.), Handbook of Infant Development. 2nd motor acts in inferior area 6 of the macaque Edition. Chichester: Wiley-Blackwell. monkey. Experimental Brain Research, 67(1), 220–224. Reid, V. M., Csibra, G., Belsky, J., & Johnson, M. H. (2007). Neural correlates of Rosenbaum, D. A. (2009). Human motor conthe perception of goal-directed action in trol. Human perception and performance infants. Acta Psychologica, 124(1), 129–138. (2nd ed. p. 505). San Diego: Academic doi:10.1016/j.actpsy.2006.09.010 Press. Reid, V. M., Hoehl, S., Grigutsch, M., Groendahl, A., Parise, E., & Striano, T. (2009). The neural correlates of infant and adult goal prediction: evidence for semantic processing systems. Developmental Psychology, 45(3), 620–629. doi:10.1037/a0015209 Ridderinkhof, K. R., van den Wildenberg, W. P. M., Segalowitz, S. J., & Carter, C. S. (2004). Neurocognitive mechanisms of cognitive control: The role of prefrontal cortex in action selection, response inhibition, performance monitoring, and rewardbased learning. Brain and Cognition, 56(2), 129–140. doi:10.1016/j.bandc.2004.09.016 Rizzolatti, G. (2005). The Mirror Neuron Sys- Rosenbaum, D. A., & Jorgensen, M. J. (1992). Planning macroscopic aspects of manual control☆. Human Movement Science, 11(1-2), 61–69. doi:10.1016/0167-9457(92)90050-L Ruffman, T., Slade, L., & Crowe, E. (2002). The relation between children“s and mothers” mental state language and theory-ofmind understanding. Child development, 73(3), 734–751. Ruffman, T., Slade, L., Rowlandson, K., Rumsey, C., & Garnham, A. (2003). How language relates to belief, desire, and emotion understanding. Cognitive Development, 18(2), 139–158. 145 Rushworth, M. F. S. (2008). 
Intention, Choice, and the Medial Frontal Cortex. Annals of the New York Academy of Sciences, 1124(1), 181–207. doi:10.1196/annals.1440.014 s Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical Learning by 8-MonthOld Infants. Science, 274(5294), 1926–1928. doi:10.1126/science.274.5294.1926 Saltzman, E. (1979). Levels of sensorimotor representation. Journal of Mathematical Psychology, 20(2), 91–163. Saxe, R. (2005a). Against simulation: the argument from error. Trends in Cognitive Sciences, 9(4), 174–179. Saxe, R. (2005b). Tuning forks in the mind: Reply to Goldman and Sebanz. Trends in Cognitive Sciences, 9(7), 321. Saxe, R. (2009). The neural evidence for simulation is weaker than I think you think it is. Philosophical Studies, 144(3), 447–456. Saylor, M. M., Baldwin, D. A., Baird, J. A., & LaBounty, J. (2007). Infants’ On-line Segmentation of Dynamic Human Action. Journal of Cognition and Development, 8(1), 113–128. doi:10.1080/15248370709336996 Searle, J. (1983). Intentionality, An Essay in the Philosophy of Mind. Cambridge: Cambridge University Press. Connectionist Systems. Mind & Language, 22(3), 246–269. Skinner, B. (1953). Science and human behavior. New York, Macmillan. Smith, L. B., Thelen, E., Titzer, R., & McLin, D. (1999). Knowing in the context of acting: The task dynamics of the A-not-B error. Psychological Review, 106(2), 235. Sodian, B. (2011). Theory of mind in infancy. Child Development Perspectives, 5(1), 39–43. Sommerville, J. A., Woodward, A. L., & Needham, A. (2005). Action experience alters 3-month-old infants“ perception of others” actions. Cognition, 96(1), B1–B11. Southgate, V., Johnson, M. H., Karoui, I. E., & Csibra, G. (2010). Motor system activation reveals infants’ on-line prediction of others’ goals. Psychological Science, 21(3), 355. Sterelny, K. (2010). Minds: extended or scaffolded? Phenomenology and the Cognitive Sciences, 9(4), 465–481. doi:10.1007/s11097010-9174-y Stich, S. P. (1983). From folk psychology to cognitive science: The case against belief. (p. 266). Cambridge, MA: The MIT Press. Stich, S. P., & Ravenscroft, I. (1994). What is folk psychology? Cognition, 50(1-3), 447–468. doi:10.1016/0010-0277(94)90040-X t Tai, Y., Scherler, C., Brooks, D., Sawamoto, N., & Castiello, U. (2004). The Human Premotor Cortex Is “Mirror” Only for Biological Actions. Current Biology, 14(2), 117–120. Sebanz, N., & Knoblich, G. (2009). Prediction Taumoepeau, M., & Ruffman, T. (2008). Stepping Stones to Others’ Minds: Maternal Talk in joint action: What, when, and where. TopRelates to Child Mental State Language and ics in Cognitive Science, 1(2), 353–367. Emotion Understanding at 15, 24, and 33 Sebanz, N., Bekkering, H., & Knoblich, G. Months. Child development, 79(2), 284–302. (2006). Joint action: bodies and minds doi:10.1111/j.1467-8624.2007.01126.x moving together. Trends in Cognitive Sciences, Thagard, P., & Verbeurgt, A. (1998). Coher10(2), 70–76. ence as constraint satisfaction. Cognitive Selen, L. P. J., Franklin, D. W., & Wolpert, Science, 22(1), 1–24. D. M. (2009). Impedance control reduces Thelen, E., & Smith, L. (1994). A dynamic instability that arises from motor noise. systems approach to the development of cognition Journal of Neuroscience, 29(40), 12606–12616. and action (pp. XXIII, 376). doi:10.1523/JNEUROSCI.2826-09.2009 Sellars, W. (1963). Science, Perception and Reality. New York: Humanities Press. Shea, N. (2007). Content and Its Vehicles in Thelen, E., Schöner, G., Scheier, C., & Smith, L. (2001). 
The dynamics of embodiment: A ield theory of infant perseverative reaching. Behavioral and Brain Sciences, 24(1), 1–86. references Ruffman, T., Taumoepeau, M., & Perkins, C. (2011). Statistical learning as a basis for social understanding in children. British Journal of Developmental Psychology. 146 Thoermer, C., Sodian, B., Vuori, M., Perst, Neurophysiological Study. Neuron, 31(1), H., & Kristen, S. (2011). Continuity from an 155–165. implicit to an explicit understanding of false belief from infancy to preschool age. British Vallacher, R. R., & Wegner, D. M. (1987). What do people think they’re doing? Action Journal of Developmental Psychology. identiication and human behavior. PsychoTomasello, M. (1999). The cultural origins of logical Review. human cognition. Cambridge, MA; London: van Dijk, J., Kerkhofs, R., van Rooij, I., & Harvard University Press. Haselager, W. F. G. (2008). Can There Be Tomasello, M., Carpenter, M., Call, J., Behne, Such a Thing as Embodied Embedded CogT., & Moll, H. (2005). Understanding and nitive Neuroscience? Theory & Psychology, sharing intentions: The origins of cultural 18(3), 297. cognition. Behavioral and Brain Sciences, 28, van Dijk, M., Hunnius, S., & van Geert, 675–735. P. (2009). Variability in eating behavior Toni, I., Lange, F. P., Noordzij, M. L., & Hathroughout the weaning period. Apgoort, P. (2008). Language beyond action. petite, 52(3), 766–770. doi:10.1016/j.apJournal of Physiology-Paris, 102(1-3), 71–79. pet.2009.02.001 v Tsotsos, J. (1990). Analyzing vision at the complexity level. Behavioral and Brain Sciences, 13, 423–469. u Van Elk, M. (2010). Action semantics. functional and neural dynamics (p. 235). Radboud University Nijmegen. Uithol, S., Burnston, D., & Haselager, W. F. G. Van Elk, M., Van Schie, H. T., & Bekker(submitted). Will intentions be found in the ing, H. (2008). Conceptual knowledge for brain? Cognition. understanding other’s actions is organized primarily around action goals. Experimental Uithol, S., Haselager, W. F. G., & Bekkering, Brain Research, 189(1), 99–108. H. (2008). When Do We Stop Calling Them Mirror Neurons? (B. Love, K. McRae, & V. van Gelder, T. (1992). Deining `distributed Sloutsky, Eds.)Proceedings of the 30th Annual representation’. Connection Science, 4(3/4), Conference of the Cognitive Science Society (pp. 175–191. 1783–1788). van Gelder, T. (1995). What Might Cognition Uithol, S., van Rooij, I., Bekkering, H., & Be, If Not Computation? Journal of PhilosoHaselager, W. F. G. (2011a). What do mirror phy, 91(7), 345–381. neurons mirror? Philosophical Psychology, van Gelder, T. (1998). The Dynamical Hy24(5), 607–623. pothesis in Cognitive Science. Behavioral and Uithol, S., van Rooij, I., Bekkering, H., & Brain Sciences, 21, 615–665. Haselager, W. F. G. (2011b). Understanding van Gelder, T. (1999). Distributed versus motor resonance. Social Neuroscience, 6(4), local representation. In (pp. 236–238). 388–397. Cambridge, MA: The MIT Encyclopedia of Uithol, S., van Rooij, I., Bekkering, H., & Cognitive Sciences. Haselager, W. F. G. (2012). Hierarchies in Action and Motor Control. Journal of Cogni- van Rooij, I. (2008). The tractable cognition thesis. Cognitive Science, 32(6), 939–984. tive Neuroscience, 24(5), 1077–1086. Umiltà, M. A., Escola, L., Intskirveli, I., Gram- van Rooij, I., Haselager, W. F. G., & Bekmont, F., Rochat, M., Caruana, F., Jezzini, A., kering, H. (2008). Goals are not implied by actions, but inferred from actions and et al. (2008). When pliers become ingers contexts. 
Behavioral and Brain Sciences, 31, in the monkey motor system. Proceedings 38–39. of the National Academy of Sciences, 105(6), 2209–2213. doi:10.1073/pnas.0705985105 van Rooij, I., Kwisthout, J., Blokpoel, M., Szymanik, J., Wareham, T., & Toni, I. (2011). Umiltà, M. A., Kohler, E., Gallese, V., Fogassi, Intentional communication: ComputationL., Fadiga, L., Keysers, C., & Rizzolatti, G. ally easy or dificult? Frontiers in Human Neu(2001). I Know What You Are Doing - A roscience, 5. doi:10.3389/fnhum.2011.00052 147 w y z Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children’s understanding of deception. Cognition, 13(1), 103–128. doi:10.1016/00100277(83)90004-5 Wittgenstein, L. (1953). Philosophical Investigations. Oxford: Blackwell. Wohlschlager, A., & Bekkering, H. (2002). Is human imitation based on a mirror-neuron system? Some behavioural evidence. Experimental Brain Research, 143(3), 335–341. Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69(1), 1–34. Woodward, A. L. (2009). Infants“ Grasp of Others” Intentions. Current Directions in Psychological Science, 18(1), 53–57. doi:10.1111/ references Van Schie, H. T., van Waterschoot, B. M., j.1467-8721.2009.01605.x & Bekkering, H. (2008). Understanding Woodward, A. L., & Sommerville, J. A. (2000). action beyond imitation: reversed compatTwelve-month-old infants interpret action ibility effects of action observation in imitain context. Psychological Science, 11(1), 73–77. tion and joint action. Journal of Experimental Psychology - Human Perception and Performance, Woodward, A. L., Sommerville, J. A., & Gua34(6), 1493–1500. doi:10.1037/a0011750 jardo, J. J. (2001). How Infants Make Sense of Intentional Action. In B. F. Malle, L. O. Ward, L. (2003). Synchronous neural oscilJ. Moses, & D. A. Baldwin (Eds.), Intentions lations and cognitive processes. Trends in and Intentionality. Cambridge, MA: MIT Cognitive Sciences, 7(12), 553–559. Press. Wegner, D. M. (2003). The Illusion of Conscious Yamashita, Y., & Tani, J. (2008). Emergence Will (p. 419). Cambridge, MA: The MIT of Functional Hierarchy in a Multiple TimPress. escale Neural Network Model: A Humanoid Wellman, H. M., Cross, D., & Watson, J. Robot Experiment. Plos Computational Biol(2001). Meta-Analysis of Theory-of-Mind ogy, 4(11), 1–18. Development: The Truth about False Belief. Young, B., & Drewett, R. (2000). Eating beChild development, 72(3), 655–684. haviour and its variability in 1-year-old chilWhittington, B., Silder, A., Heiderscheit, B., dren. Appetite, 35(2), 171–177. doi:10.1006/ & Thelen, D. (2008). The contribution of appe.2000.0346 passive-elastic mechanisms to lower extremity joint kinetics during human walking. Gait Ziemke, T. (2003). What’s that thing called embodiment? In R. Alterman & D. Kirsh & Posture, 27(4), 628–634. (Eds.), Proceedings of the 25th Annual ConferWicker, B., Keysers, C., Plailly, J., Royet, J.-P., ence of the Cognitive Science Society (pp. 1134– Gallese, V., & Rizzolatti, G. (2003). Both of 1139). Mahwah, NJ: Lawrence Erlbaum. Us Disgusted in My Insula - The Common Neural Basis of Seeing and Feeling Disgust. Zwaan, R. A., Stanield, R. A., & Madden, C. J. (1999). Perceptual symbols in language Neuron, 40(3), 655–664. comprehension: Can an empirical case be Wilson, M., & Knoblich, G. (2005). The made? Behavioral and Brain Sciences, 22(04), case for motor involvement in perceiving 636–637. conspeciics. 
Psychological Bulletin, 131(3), 460–473.

Summary

The notion of ‘intention’ plays an important role in our everyday thinking about action. Yet little is known about how intentions are represented in the brain, and how they initiate actions. This thesis investigates how actions arise, and what role intentions play.

After introducing the notions of ‘representation’, ‘action’ and ‘intention’ in Chapter 1, in Chapter 2 I investigate the process of content attribution to the firing of single mirror neurons. Single-cell recordings in monkeys provide strong evidence for an important role of the motor system in action understanding, but although the data acquired from single-cell recordings are generally considered to be robust, several debates have shown that the interpretation of these data is far from straightforward. Chapter 2 argues that, without principled restrictions, research based on single-cell recordings allows for unlimited content attribution to single mirror neurons. A theoretical analysis of the type of processing attributed to the mirror neuron system can help formulate restrictions on what mirroring is and what cognitive functions could, in principle, be explained by a mirror mechanism. It is argued that processing at higher levels of abstraction needs the assistance of non-mirroring processes to such an extent that subsuming the processes needed to infer goals from actions under the label ‘mirroring’ is not warranted.

In humans, single-cell recordings are problematic. Therefore, activation of motor areas upon action observation is studied using fMRI or EEG. The finding of this so-called ‘motor resonance’ is generally regarded as support for motor theories of action understanding. These theories take motor resonance to be essential to the understanding of observed actions and the inference of action goals. However, Chapter 3 shows that the notions of ‘resonance’, ‘action understanding’ and ‘action goal’ are used ambiguously in the literature. A survey of the literature on mirror neurons and motor resonance yields two different interpretations of the term ‘resonance’, three different interpretations of ‘action understanding’, and again three different interpretations of what the ‘goal’ of an action is. This entails that, unless it is specified which interpretations are used, the meaning of any statement about the relation between these concepts can differ to a great extent. By discussing Umiltà et al.’s (2001) well-known experiment on mirror neurons, I show that more precise definitions and a more careful use of the concepts will allow for better assessments of motor theories of action understanding, and hence a more fruitful scientific debate. Lastly, I provide an example of how the discussed experimental setup could be adapted, based on the preceding analysis, to test other interpretations of the concepts.

Actions are commonly thought of as structured hierarchically. Chapter 4 analyzes such hierarchies. In the literature, two hierarchies are often posited: the first—the action hierarchy—is a decomposition of an action into sub-actions and sub-sub-actions; the second—the control hierarchy—is a postulated hierarchy in the neural control processes that are supposed to bring about the action. A general assumption in cognitive neuroscience is that these two hierarchies are internally consistent and provide complementary descriptions of neuronal control processes.
In this chapter I show that neither hierarchy offers a complete explanation and that they cannot be reconciled in a logical or conceptual way. Furthermore, neither pays proper attention to the dynamics and temporal aspects of neural control processes. I explore an alternative hierarchical organization in which causality is inherent in the dynamics over time. Specifically, high levels of the hierarchy encode slower, goal-related representations, while lower levels represent the faster kinematics of actions and motor acts. If employed properly, a hierarchy based on this principle is not subject to the problems that plague the traditional accounts.

Chapter 5 analyzes the neural applicability of the notion of 'intention'. Intentions are commonly conceived of as discrete mental states that are the direct cause of actions. In the last several decades, neuroscientists have taken up the project of localizing intentions in the brain, and a number of areas have been posited as implementing representations of intentions. I argue, however, that it is doubtful that the folk notion of 'intention' applies to any particular physical process by which the brain initiates actions. Drawing on the analysis of Chapter 4, Pacherie's account of intentions (Pacherie, 2006, 2008), and Koechlin's model of action control (Koechlin et al., 1999, 2003), I show that the idea of a discrete state that causes an action is deeply incompatible with the dynamic organization of the prefrontal cortex, the presumed neural locus of the causation and control of actions. Discrete representations can at best, I will claim, play a subsidiary, stabilizing role in action planning, but this role is still incompatible with the folk notion of intention. The chapter concludes by arguing that the prevalence of the folk notion, including its intuitive appeal in neuroscientific explanations, stems from the central role intentions play in constructing intuitive explanations of our own and others' behavior. Some future directions based on the presented analysis are sketched.

In Chapter 6 the ideas, results, and analyses of the previous chapters are applied to the field of developmental psychology. Intention reading and action understanding have been reported in ever-younger infants, but these findings are highly debated. In this chapter I set out to clarify the notions of 'action understanding' and 'intention attribution' and discuss their relation. I use the various forms of 'action understanding' from Chapter 3 and speculate on the mechanisms that could underlie these capacities. Based on Chapter 5, I argue that these forms of action understanding do not generally result in the attribution of an intention to an observed actor. By disentangling intention attribution from action understanding, and by exposing the latter as an umbrella notion, I provide a framework that allows findings from different experimental paradigms to be compared more adequately.

Finally, in Chapter 7 I discuss the implications of the previous chapters for the notions of 'action understanding' and 'intention', and for our conception of action hierarchies. The most important conclusions are: 1) that the evidence for the presence of an action hierarchy is partly circular, 2) that motor resonance in itself does not provide support for mindreading, and 3) that a reinterpretation of the concept of 'intention' can aid our understanding of how we understand each other's actions, and how joint action is possible.
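Although this thesis contains no simulations, the timescale-based hierarchy summarized above for Chapter 4 can be illustrated with a small numerical sketch, loosely inspired by the multiple-timescale network of Yamashita and Tani (2008) cited in the references. The sketch below is only a toy illustration under assumed parameters: the unit counts, time constants, and random, untrained weights are illustrative choices and not part of the thesis. It shows how a slowly changing 'goal' level can set the context within which a fast 'kinematic' level unfolds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two levels with different time constants: a slow "goal" level and a fast
# "kinematic" level (all sizes and constants are illustrative assumptions).
N_SLOW, N_FAST, N_OUT = 4, 20, 2
TAU_SLOW, TAU_FAST, DT = 50.0, 2.0, 1.0

# Fixed random weights; nothing is trained here, the point is only the dynamics.
W_ff = rng.normal(0, 0.5, (N_FAST, N_FAST))   # fast -> fast
W_fs = rng.normal(0, 0.5, (N_FAST, N_SLOW))   # slow -> fast: the slow level sets the context
W_sf = rng.normal(0, 0.1, (N_SLOW, N_FAST))   # fast -> slow: weak feedback
W_ss = rng.normal(0, 0.1, (N_SLOW, N_SLOW))   # slow -> slow
W_out = rng.normal(0, 0.5, (N_OUT, N_FAST))   # fast -> output ("kinematics")

def simulate(goal, steps=200):
    """Leaky-integrator dynamics, Euler-integrated: delta_u = dt/tau * (-u + W h)."""
    u_slow, u_fast = goal.astype(float).copy(), np.zeros(N_FAST)
    outputs, step_slow, step_fast = [], 0.0, 0.0
    for _ in range(steps):
        h_slow, h_fast = np.tanh(u_slow), np.tanh(u_fast)
        d_slow = DT / TAU_SLOW * (-u_slow + W_ss @ h_slow + W_sf @ h_fast)
        d_fast = DT / TAU_FAST * (-u_fast + W_ff @ h_fast + W_fs @ h_slow)
        u_slow, u_fast = u_slow + d_slow, u_fast + d_fast
        step_slow += np.abs(d_slow).mean() / steps   # average change per step, slow level
        step_fast += np.abs(d_fast).mean() / steps   # average change per step, fast level
        outputs.append(W_out @ np.tanh(u_fast))
    return np.array(outputs), step_slow, step_fast

traj_a, slow_a, fast_a = simulate(np.array([1.0, -1.0, 1.0, -1.0]))
traj_b, _, _ = simulate(np.array([-1.0, 1.0, -1.0, 1.0]))

# The slow level changes far less per step than the fast level, yet which slow
# ("goal") state is set determines the fast output trajectory.
print("avg change per step, slow vs fast:", slow_a, fast_a)
print("mean difference between the two output trajectories:", np.abs(traj_a - traj_b).mean())
```

The point of the sketch is merely the division of labour: the slow level changes an order of magnitude less per step than the fast level, yet setting it to a different state yields a different fast output trajectory. This echoes, in a deliberately minimal form, the role that Chapters 4 and 5 assign to slower, goal-related dynamics as opposed to discrete, action-causing intentions.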
Samenvatting (Dutch summary)

Het begrip 'intentie' speelt een belangrijke rol in ons alledaags denken over acties. Toch is er weinig bekend over hoe intenties in het brein gerepresenteerd zijn en onze acties teweegbrengen. In dit proefschrift onderzoek ik hoe acties tot stand komen, en welke rol intenties daarin spelen.

Na een korte introductie van de begrippen 'representatie', 'actie' en 'intentie' in Hoofdstuk 1, beschrijf ik in Hoofdstuk 2 de manier waarop representationele inhoud aan individuele spiegelneuronen wordt toegekend. Metingen aan individuele neuronen in het apenbrein lijken erop te duiden dat het motorsysteem een belangrijke rol speelt bij het begrijpen van acties. De vraag hoe deze bevindingen geïnterpreteerd moeten worden, zorgt echter voor veel controverse. In dit hoofdstuk laat ik zien dat de representationele inhoud die aan individuele spiegelneuronen kan worden toegeschreven onbegrensd is, voor wie bereid is een almaar hoger abstractieniveau te kiezen. Door middel van een theoretische analyse van het type processen dat in het spiegelneuronensysteem verondersteld wordt plaats te vinden, kan de inhoudattributie begrensd worden. Ik concludeer dat bij cognitieve processen van een hogere abstractie, zoals het achterhalen van intenties, dusdanig veel hulp van andere processen nodig is dat het predicaat 'spiegelen' misplaatst is.

Bij mensen wordt de hersenactiviteit die zich voordoet bij actieobservatie—de zogenoemde 'motorresonantie'—niet in individuele neuronen gemeten, maar met behulp van scantechnieken als fMRI en EEG. Deze activiteit wordt doorgaans als bewijs voor motortheorieën van actiebegrip gezien. Volgens deze theorieën speelt motorresonantie een essentiële rol bij het begrijpen van acties van anderen. In Hoofdstuk 3 laat ik echter zien dat de begrippen 'motorresonantie', 'actiebegrip' en 'actiedoel' binnen de cognitiewetenschappen op uiteenlopende manieren geïnterpreteerd worden. Door deze veelheid aan interpretaties kan een bewering over de relatie tussen deze begrippen sterk uiteenlopende betekenissen hebben, tenzij expliciet duidelijk gemaakt wordt welke interpretatie is gehanteerd. Door een preciezere definitie van de gehanteerde begrippen en een zorgvuldiger gebruik van de concepten kan de toetsbaarheid van de motortheorieën—en daarmee het wetenschappelijke debat—sterk verbeterd worden, zoals ik laat zien aan de hand van een analyse van een experiment van Umiltà et al. (2001). Ten slotte laat ik zien hoe het door Umiltà gebruikte paradigma op basis van mijn analyse kan worden aangepast om andere interpretaties van de concepten te toetsen.

Doorgaans wordt aan acties een hiërarchische structuur toegeschreven. In Hoofdstuk 4 analyseer ik deze structuur. In de literatuur worden veelvuldig twee hiërarchieën gepostuleerd: de eerste—de action hierarchy—is een verdeling van een actie in subacties en sub-subacties. De tweede—de control hierarchy—is een hiërarchie die verondersteld wordt aanwezig te zijn in de neurale processen die acties coördineren. Binnen de cognitieve neurowetenschappen wordt over het algemeen aangenomen dat deze twee hiërarchieën coherent zijn en complementaire beschrijvingen van de neurale processen bieden. In dit hoofdstuk betoog ik dat geen van beide hiërarchieën een complete verklaring kan bieden, en dat ze niet op een logische of conceptuele wijze te integreren zijn. Bovendien schenkt geen van beide voldoende aandacht aan het dynamische karakter van de betreffende neurale processen.
Ik bespreek een alternatieve hiërarchische structuur waarbij causaliteit inherent is aan de temporele dynamica. Langzamere niveaus sturen doelgerelateerde processen, en snellere niveaus zijn gerelateerd aan vluchtigere aspecten van acties, zoals bewegingen. Wanneer deze alternatieve hiërarchische structuur zorgvuldig wordt toegepast, zijn de problemen die zich bij de andere hiërarchieën voordoen te vermijden.

In Hoofdstuk 5 onderzoek ik de toepasbaarheid van het concept 'intentie' in de neurowetenschappen. Intenties worden doorgaans gezien als mentale toestanden die de directe oorzaak van onze acties zijn. De laatste decennia hebben neurowetenschappers talrijke pogingen gedaan om deze mentale toestanden in het brein te vinden, en een aantal gebieden aangewezen waar intenties gerepresenteerd zouden zijn. Ik betoog echter dat het twijfelachtig is dat het begrip 'intentie' overeenkomt met één enkel neuraal proces dat acties genereert. Door de resultaten van Hoofdstuk 4 te combineren met Pacherie's model van intenties (Pacherie, 2006, 2008) en Koechlin's model van actieplanning (Koechlin et al., 1999, 2003) laat ik zien dat het idee van een enkele en onderscheidbare mentale toestand onverenigbaar is met het complexe en dynamische karakter van de processen in de prefrontaalschors, het hersengebied waarvan wordt aangenomen dat het acties initieert en stuurt. Afzonderlijke representaties kunnen hoogstens een stabiliserende rol spelen bij het plannen van acties. Vervolgens betoog ik in dit hoofdstuk dat zowel de alomtegenwoordigheid van het begrip 'intentie' als de eigenschappen die aan het begrip worden toegeschreven, te verklaren zijn door te kijken naar de rol die het begrip speelt bij het beschrijven van onze eigen en andermans acties. Ten slotte schets ik hoe de aangehaalde studies en de verworven inzichten dicteren met welke processen en eigenschappen een alternatieve interpretatie van actieplanning rekening moet houden.

In Hoofdstuk 6 worden de ideeën, analyses en resultaten van de voorgaande hoofdstukken toegepast op het gebied van de ontwikkelingspsychologie. Het begrijpen van intenties en acties wordt aan steeds jongere kinderen toegeschreven. Tegelijkertijd is er een discussie gaande over de mate waarin jonge kinderen deze capaciteiten bezitten. In dit hoofdstuk verduidelijk ik de concepten 'acties begrijpen' en 'intenties toeschrijven', en bespreek ik hun relatie. Daarbij maak ik gebruik van de verschillende vormen van actiebegrip uit Hoofdstuk 3, en speculeer ik over welke mechanismen voor de verschillende facetten van het begrijpen van acties verantwoordelijk zouden kunnen zijn. Op grond van de conclusies in Hoofdstuk 5 betoog ik dat deze vormen van actiebegrip over het algemeen niet tot de attributie van een intentie leiden. Ik laat zien dat het loskoppelen van intentie-attributie en actiebegrip, en het inzicht dat dit laatste begrip een containerbegrip is, de mogelijkheid scheppen om een nieuw kader voor het bestuderen van sociale cognitie bij jonge kinderen te schetsen.

Ten slotte bespreek ik in Hoofdstuk 7 wat in meer algemene zin de consequenties zijn van de voorafgaande hoofdstukken voor experimenteel onderzoek naar actiebegrip, voor onze interpretatie van motorhiërarchieën, motorsimulatie en het concept 'intentie'.
De belangrijkste conclusies zijn: 1) dat het bewijs voor het bestaan van een hiërarchie in acties deels circulair is, 2) dat motorresonantie niet zonder meer verondersteld kan worden bij te dragen aan 'mindreading', en 3) dat een herinterpretatie van het begrip 'intentie' kan bijdragen aan een beter begrip van hoe we andermans acties begrijpen, en hoe samenwerking mogelijk is.

Thank you

Thank you Pim, for being my supervisor. Five years ago I googled for possible collaborations, and I'm still glad you came up first. Thank you for never lowering your expectations for my writings, my posters, my presentations, and even my attitude, which could be frustrating at times, but definitely improved our project (and perhaps even my attitude slightly). Your capacity for getting to the core of a problem in a matter of seconds has never ceased to amaze me.

Thank you Harold, for being my promotor. Your experience and your knowledge of the field, your intuition for how to pitch an argument and which roads to pursue have been crucial for the success of our project. Thank you for your wonderful way of managing a lab, a centre, and an institute.

Thank you Iris, for being my copromotor, and for your contribution to the first chapters of this thesis. Thank you for our endless discussions: our difficulties in finding agreement (on nearly everything) greatly sharpened the argumentation and formulation in our papers.

Thank you Bill, for hosting me in San Diego. Your department and your courses have been a great source of inspiration.

Thank you Dan M.F. Burnston, for our adventure in intentions and the prefrontal cortex, and for the good times we had in San Diego, Mexico, and Nijmegen.

Thank you Markus, for our endeavour to unravel infant action understanding. The speed at which we worked suggests that this thesis could easily have contained 20 chapters.

Thank you, my colleagues at the Donders Centre for Cognition, for making the DCC a great place to be a PhD student, and a great place to be.

Thank you Maaike, for being my paranymph, for the countless salads we ate and cappuccinos we drank together. Thank you for the fun we had sailing (bailing water), running, playing squash, and everything else.

Thank you Pascal, for being my paranymph, for the great fun we had exercising, running, drinking, and everything else.

Finally, thank you Jacinthe and Daphne, for making the world outside of my PhD project the best possible.

Publications

Uithol, S., & Paulus, M. (under review). What do infants understand of others' action? A theoretical account of early social cognition.
Uithol, S., Burnston, D., & Haselager, W. F. G. (resubmitted). Will intentions be found in the brain?
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2012). Hierarchies in Action and Motor Control. Journal of Cognitive Neuroscience, 24(5), 1077–1086.
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). Understanding motor resonance. Social Neuroscience, 6(4), 388–397.
Uithol, S., van Rooij, I., Bekkering, H., & Haselager, W. F. G. (2011). What do mirror neurons mirror? Philosophical Psychology, 24(5), 607–623.
Uithol, S., Haselager, W. F. G., & Bekkering, H. (2008). When Do We Stop Calling Them Mirror Neurons? Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 1783–1788).

Curriculum vitae

Sebo was born on February 16th, 1977, in Langerak. After finishing high school in Heerenveen, he went to study mechanical engineering at the University of Twente, Enschede.
Being unsatisfied there, he switched to philosophy of science, technology and society. During the final part of this master's program he became interested in the mind-body problem, and wrote his master's thesis on the possibility of finding mental representations using imaging techniques. After graduating he contacted Pim Haselager at Radboud University Nijmegen, and together they submitted a research proposal on the notion of representation, which received an internal NICI graduation grant. Simultaneously, Sebo took on a small teaching assignment in the psychology program at the University of Twente. After finishing his PhD thesis, Sebo was offered an extension of his contract at the Donders Institute to develop a number of ideas.

Donders Graduate School for Cognitive Neuroscience Series

1. van Aalderen-Smeets, S.I. (2007). Neural dynamics of visual selection. Maastricht University, Maastricht, the Netherlands.
2. Schoffelen, J.M. (2007). Neuronal communication through coherence in the human motor system. Radboud University Nijmegen, Nijmegen, the Netherlands.
3. de Lange, F.P. (2008). Neural mechanisms of motor imagery. Radboud University Nijmegen, Nijmegen, the Netherlands.
4. Grol, M.J. (2008). Parieto-frontal circuitry in visuomotor control. Utrecht University, Utrecht, the Netherlands.
5. Bauer, M. (2008). Functional roles of rhythmic neuronal activity in the human visual and somatosensory system. Radboud University Nijmegen, Nijmegen, the Netherlands.
6. Mazaheri, A. (2008). The Influence of Ongoing Oscillatory Brain Activity on Evoked Responses and Behaviour. Radboud University Nijmegen, Nijmegen, the Netherlands.
7. Hooijmans, C.R. (2008). Impact of nutritional lipids and vascular factors in Alzheimer's Disease. Radboud University Nijmegen, Nijmegen, the Netherlands.
8. Gaszner, B. (2008). Plastic responses to stress by the rodent urocortinergic Edinger-Westphal nucleus. Radboud University Nijmegen, Nijmegen, the Netherlands.
9. Willems, R.M. (2009). Neural reflections of meaning in gesture, language and action. Radboud University Nijmegen, Nijmegen, the Netherlands.
10. van Pelt, S. (2009). Dynamic neural representations of human visuomotor space. Radboud University Nijmegen, Nijmegen, the Netherlands.
11. Lommertzen, J. (2009). Visuomotor coupling at different levels of complexity. Radboud University Nijmegen, Nijmegen, the Netherlands.
12. Poljac, E. (2009). Dynamics of cognitive control in task switching: Looking beyond the switch cost. Radboud University Nijmegen, Nijmegen, the Netherlands.
13. Poser, B.A. (2009). Techniques for BOLD and blood volume weighted fMRI. Radboud University Nijmegen, Nijmegen, the Netherlands.
14. Baggio, G. (2009). Semantics and the electrophysiology of meaning. Tense, aspect, event structure. Radboud University Nijmegen, Nijmegen, the Netherlands.
15. van Wingen, G.A. (2009). Biological determinants of amygdala functioning. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
16. Bakker, M. (2009). Supraspinal control of walking: lessons from motor imagery. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
17. Aarts, E. (2009). Resisting temptation: the role of the anterior cingulate cortex in adjusting cognitive control. Radboud University Nijmegen, Nijmegen, the Netherlands.
18. Prinz, S. (2009). Waterbath stunning of chickens – Effects of electrical parameters on the electroencephalogram and physical reflexes of broilers. Radboud University Nijmegen, Nijmegen, the Netherlands.
19. Knippenberg, J.M.J. (2009). The N150 of the Auditory Evoked Potential from the rat amygdala: In search for its functional significance. Radboud University Nijmegen, Nijmegen, the Netherlands.
20. Dumont, G.J.H. (2009). Cognitive and physiological effects of 3,4-methylenedioxymethamphetamine (MDMA or 'ecstasy') in combination with alcohol or cannabis in humans. Radboud University Nijmegen, Nijmegen, the Netherlands.
21. Pijnacker, J. (2010). Defeasible inference in autism: a behavioral and electrophysiological approach. Radboud University Nijmegen, Nijmegen, the Netherlands.
22. de Vrijer, M. (2010). Multisensory integration in spatial orientation. Radboud University Nijmegen, Nijmegen, the Netherlands.
23. Vergeer, M. (2010). Perceptual visibility and appearance: Effects of color and form. Radboud University Nijmegen, Nijmegen, the Netherlands.
24. Levy, J. (2010). In Cerebro Unveiling Unconscious Mechanisms during Reading. Radboud University Nijmegen, Nijmegen, the Netherlands.
25. Treder, M.S. (2010). Symmetry in (inter)action. Radboud University Nijmegen, Nijmegen, the Netherlands.
26. Horlings, C.G.C. (2010). A Weak balance; balance and falls in patients with neuromuscular disorders. Radboud University Nijmegen, Nijmegen, the Netherlands.
27. Nieuwenhuis, I.L.C. (2010). Memory consolidation: A process of integration – Converging evidence from MEG, fMRI and behavior. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
28. Menenti, L.M.E. (2010). The right language: differential hemispheric contributions to language production and comprehension in context. Radboud University Nijmegen, Nijmegen, the Netherlands.
29. Derks, N.M. (2010). The role of the non-preganglionic Edinger-Westphal nucleus in sex-dependent stress adaptation in rodents. Radboud University Nijmegen, Nijmegen, the Netherlands.
30. Wyczesany, M. (2010). Covariation of mood and brain activity. Integration of subjective self-report data with quantitative EEG measures. Radboud University Nijmegen, Nijmegen, the Netherlands.
31. Oude Nijhuis, L.B. (2010). Modulation of human balance reactions. Radboud University Nijmegen, Nijmegen, the Netherlands.
32. Qin, S. (2010). Adaptive memory: imaging medial temporal and prefrontal memory systems. Radboud University Nijmegen, Nijmegen, the Netherlands.
33. Lapatki, B.G. (2010). The Facial Musculature – Characterization at a Motor Unit Level. Radboud University Nijmegen, Nijmegen, the Netherlands.
34. Kok, P. (2010). Word Order and Verb Inflection in Agrammatic Sentence Production. Radboud University Nijmegen, Nijmegen, the Netherlands.
35. van Elk, M. (2010). Action semantics: Functional and neural dynamics. Radboud University Nijmegen, Nijmegen, the Netherlands.
36. Majdandzic, J. (2010). Cerebral mechanisms of processing action goals in self and others. Radboud University Nijmegen, Nijmegen, the Netherlands.
37. Snijders, T.M. (2010). More than words – neural and genetic dynamics of syntactic unification. Radboud University Nijmegen, Nijmegen, the Netherlands.
38. Grootens, K.P. (2010). Cognitive dysfunction and effects of antipsychotics in schizophrenia and borderline personality disorder. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
39. Snaphaan, L.J.A.E. (2010). Epidemiology of post-stroke behavioural consequences. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
40. Dado-Van Beek, H.E.A. (2010). The regulation of cerebral perfusion in patients with Alzheimer's disease. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
41. van Dijk, H.P. (2010). The state of the brain, how alpha oscillations shape behaviour and event-related responses. Radboud University Nijmegen, Nijmegen, the Netherlands.
42. Meulenbroek, O.V. (2010). Neural correlates of episodic memory in healthy aging and Alzheimer's disease. Radboud University Nijmegen, Nijmegen, the Netherlands.
43. Beurze, S.M. (2010). Cortical mechanisms for reach planning. Radboud University Nijmegen, Nijmegen, the Netherlands.
44. van Dijk, J.P. (2010). On the Number of Motor Units. Radboud University Nijmegen, Nijmegen, the Netherlands.
45. Timmer, N.M. (2011). The interaction of heparan sulfate proteoglycans with the amyloid Beta-protein. Radboud University Nijmegen, Nijmegen, the Netherlands.
46. Crajé, C. (2011). (A)typical motor planning and motor imagery. Radboud University Nijmegen, Nijmegen, the Netherlands.
47. van Grootel, T.J. (2011). On the role of eye and head position in spatial localisation behaviour. Radboud University Nijmegen, Nijmegen, the Netherlands.
48. Lamers, M.J.M. (2011). Levels of selective attention in action planning. Radboud University Nijmegen, Nijmegen, the Netherlands.
49. van Leeuwen, T.M. (2011). 'How one can see what is not there': Neural mechanisms of grapheme-colour synaesthesia. Radboud University Nijmegen, Nijmegen, the Netherlands.
50. Scheeringa, R. (2011). On the relation between oscillatory EEG activity and the BOLD signal. Radboud University Nijmegen, Nijmegen, the Netherlands.
51. Bruinsma, I.B. (2011). Amyloidogenic proteins in Alzheimer's disease and Parkinson's disease: interaction with chaperones and inflammation. Radboud University Nijmegen, Nijmegen, the Netherlands.
52. Ossewaarde, L. (2011). The mood cycle: hormonal influences on the female brain. Radboud University Nijmegen, Nijmegen, the Netherlands.
53. Kuribara, M. (2011). Environment-induced activation and growth of pituitary melanotrope cells of Xenopus laevis. Radboud University Nijmegen, Nijmegen, the Netherlands.
54. Helmich, R.C.G. (2011). Cerebral reorganization in Parkinson's disease. Radboud University Nijmegen, Nijmegen, the Netherlands.
55. Boelen, D. (2011). Order out of chaos? Assessment and treatment of executive disorders in brain-injured patients. Radboud University Nijmegen, Nijmegen, the Netherlands.
56. Schaefer, R.S. (2011). Measuring the mind's ear: EEG of music imagery. Radboud University Nijmegen, Nijmegen, the Netherlands.
57. van der Linden, M.H. (2011). Experience-based cortical plasticity in object category representation. Radboud University Nijmegen, Nijmegen, the Netherlands.
58. Kleine, B.U. (2011). Motor unit discharges - Physiological and diagnostic studies in ALS. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
59. van Marle, H.J.F. (2011). The amygdala on alert: A neuroimaging investigation into amygdala function during acute stress and its aftermath. Radboud University Nijmegen, Nijmegen, the Netherlands.
60. De Laat, K.F. (2011). Motor performance in individuals with cerebral small vessel disease: an MRI study. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
61. van Tilborg, I.A.D.A. (2011). Procedural learning in cognitively impaired patients and its application in clinical practice. Radboud University Nijmegen, Nijmegen, the Netherlands.
62. Bögels, S. (2011). The role of prosody in language comprehension: when prosodic breaks and pitch accents come into play. Radboud University Nijmegen, Nijmegen, the Netherlands.
63. Voermans, N. (2011). Neuromuscular features of Ehlers-Danlos syndrome and Marfan syndrome; expanding the phenotype of inherited connective tissue disorders and investigating the role of the extracellular matrix in muscle. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
64. Reelick, M. (2011). One step at a time. Disentangling the complexity of preventing falls in frail older persons. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
65. Buur, P.F. (2011). Imaging in motion. Applications of multi-echo fMRI. Radboud University Nijmegen, Nijmegen, the Netherlands.
66. Koopmans, P.J. (2011). fMRI of cortical layers. Radboud University Nijmegen, Nijmegen, the Netherlands.
67. Xu, L. (2011). The non-preganglionic Edinger-Westphal nucleus: an integration center for energy balance and stress adaptation. Radboud University Nijmegen, Nijmegen, the Netherlands.
68. Schellekens, A.F.A. (2011). Gene-environment interaction and intermediate phenotypes in alcohol dependence. Radboud University Nijmegen, Nijmegen, the Netherlands.
69. Paulus, M. (2011). Development of action perception: Neurocognitive mechanisms underlying children's processing of others' actions. Radboud University Nijmegen, Nijmegen, the Netherlands.
70. Tieleman, A.A. (2011). Myotonic dystrophy type 2. A newly diagnosed disease in the Netherlands. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
71. Van der Werf, J. (2011). Cortical oscillatory activity in human visuomotor integration. Radboud University Nijmegen, Nijmegen, the Netherlands.
72. Mädebach, A. (2011). Lexical access in speaking: Studies on lexical selection and cascading activation. Radboud University Nijmegen, Nijmegen, the Netherlands.
73. Poelmans, G.J.V. (2011). Genes and protein networks for neurodevelopmental disorders. Radboud University Nijmegen, Nijmegen, the Netherlands.
74. van Norden, A.G.W. (2011). Cognitive function in elderly individuals with cerebral small vessel disease. An MRI study. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
75. Jansen, E.J.R. (2011). New insights into V-ATPase functioning: the role of its accessory subunit Ac45 and a novel brain-specific Ac45 paralog. Radboud University Nijmegen, Nijmegen, the Netherlands.
76. Haaxma, C.A. (2011). New perspectives on preclinical and early stage Parkinson's disease. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
77. Haegens, S. (2012). On the functional role of oscillatory neuronal activity in the somatosensory system. Radboud University Nijmegen, Nijmegen, the Netherlands.
78. van Barneveld, D.C.P.B.M. (2012). Integration of exteroceptive and interoceptive cues in spatial localization. Radboud University Nijmegen, Nijmegen, the Netherlands.
79. Spies, P.E. (2012). The reflection of Alzheimer disease in CSF. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
80. Helle, M. (2012). Artery-specific perfusion measurements in the cerebral vasculature by magnetic resonance imaging. Radboud University Nijmegen, Nijmegen, the Netherlands.
81. Van de Meerendonk, N. (2012). States of indecision in the brain: Electrophysiological and hemodynamic reflections of monitoring in visual language perception. Radboud University Nijmegen, Nijmegen, the Netherlands.
82. Janssen, L. (2012). Planning and execution of (bi)manual grasping. Radboud University Nijmegen, Nijmegen, the Netherlands.
83. Vermeer, S. (2012). Clinical and genetic characterisation of Autosomal Recessive Cerebellar Ataxias. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
84. Vrins, S. (2012). Shaping object boundaries: contextual effects in infants and adults. Radboud University Nijmegen, Nijmegen, the Netherlands.
85. Weber, K.M. (2012). The language learning brain: Evidence from second language and bilingual studies of syntactic processing. Radboud University Nijmegen, Nijmegen, the Netherlands.
86. Verhagen, L. (2012). How to grasp a ripe tomato. Utrecht University, Utrecht, the Netherlands.
87. Nonkes, L.J.P. (2012). Serotonin transporter gene variance causes individual differences in rat behaviour: for better and for worse. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
88. Joosten-Weyn Banningh, L.W.A. (2012). Learning to live with Mild Cognitive Impairment: development and evaluation of a psychological intervention for patients with Mild Cognitive Impairment and their significant others. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
89. Xiang, H.D. (2012). The language networks of the brain. Radboud University Nijmegen, Nijmegen, the Netherlands.
90. Snijders, A.H. (2012). Tackling freezing of gait in Parkinson's disease. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
91. Rouwette, T.P.H. (2012). Neuropathic Pain and the Brain - Differential involvement of corticotropin-releasing factor and urocortin 1 in acute and chronic pain processing. Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.
92. Egetemeir, J. (2012). Neural correlates of real-life joint action. Radboud University Nijmegen, Nijmegen, the Netherlands.
93. Sterrenburg, A. (2012). The stress response of forebrain and midbrain regions: neuropeptides, sex-specificity and epigenetics. Radboud University Nijmegen, Nijmegen, the Netherlands.
94. Uithol, S. (2012). Representing Action and Intention. Radboud University Nijmegen, Nijmegen, the Netherlands.