



Varieties of Paternalism and the Heterogeneity of Utility Structures

Glenn W. Harrison [1]
Don Ross [2]

[1] Department of Risk Management and Insurance and Center for the Economic Analysis of Risk, Robinson College of Business, Georgia State University, USA. Harrison is also affiliated with the School of Economics, University of Cape Town, South Africa. E-mail: gharrison@gsu.edu
[2] School of Sociology and Philosophy, University College Cork, Ireland; School of Economics, University of Cape Town, South Africa; and Center for the Economic Analysis of Risk, Robinson College of Business, Georgia State University, USA. E-mail: don.ross@uct.ac.za

Abstract

A principal source of interest in behavioral economics has been its advertised contributions to policies aimed at 'nudging' people away from allegedly natural but self-defeating behavior toward patterns of response thought more likely to improve their welfare. This has occasioned controversies among economists and philosophers around the normative limits of paternalism, especially by technical policy advisors. One recent suggestion has been that 'boosting,' in which interventions aim to enhance people's general cognitive skills and representational repertoires instead of manipulating their choice environments behind their backs, avoids the main normative challenges. A limitation in most of this literature is that it has focused on relatively sweeping policy recommendations and consequently on strong polar alternatives of general paternalism and strict laissez faire. We review a real instance, drawn from a consulting project we conducted for an investment bank, of a proposed intervention that is more typical of the kind that economists are more often actually called upon to offer. In this example, the sophistication of current tools for preference attribution, combined with philosophical externalism about the semantics of preferences, which makes it less plausible to attribute their literal self-conscious representation to people as propositional attitude content becomes more tightly refined, blocks applicability of the distinction between nudging and boosting. This seems to call for irreducible, context-specific ethical judgment in assessing the appropriateness of the forms of paternalism that economists must actually wrestle with in going about their everyday business.

Keywords: nudging, paternalism, applied economics, risk preferences, investment choices

JEL codes: A11, A13, B40, C44, C54, C93, D01, D14, D63, D81, D91

1. Nudging Versus Boosting

A principal source of interest in behavioral economics has been its advertised contributions to policies aimed at 'nudging' people away from allegedly natural but self-defeating behavior toward patterns of response thought more likely to improve their welfare. Leading early promotions of this kind of application of behavioral studies are Camerer et al (2003) and Sunstein & Thaler (2003a)(2003b). Grüne-Yanoff & Hertwig (2016) [GYH] have distinguished nudging, which is based on the heuristics-and-biases (H&B) branch of behavioral economics research associated with Kahneman & Tversky (1982) and Kahneman (2011), from policies aimed at 'boosting,' which apply the 'simple heuristics' (SH) research program of Gigerenzer et al (1999), Todd et al (2012) and Hertwig et al (2013). Nudging and boosting are contrasted as follows.
Nudges aim to change a decision-maker's (DM) ecological context and external cognitive affordances in such a way that the DM will be more likely to choose a welfare-improving option without having to think any differently than before. Nudging is thus open to the charge that it is manipulative: see Ashcroft (2011) and Conly (2013; p. 8). Its defenders point out that if people are naturally prone to systematic error, then any scaffolding built by any institution unavoidably involves manipulation, so the manipulation in question might as well be benevolent. Boosting, by contrast, involves endowing DMs with enhanced cognitive capacities by teaching them more effective decision principles [1], which they can choose to apply or not once they have been enlightened. Thus boosting, according to GYH, avoids manipulating the agents to whom the policies in question are applied, and is to that extent less paternalistic. [2]

[1] GYH assume that the principles in question should be effective heuristics in the sense of Gigerenzer et al (1999). This reflects the arguable assumption that any general reasoning principle that most people can adopt reliably across a range of decision contexts is by definition a heuristic.
[2] This motivation for boosting is similar to reasons given by John et al (2009) and John et al (2011) in favour of what they call a 'think' strategy for correcting people's reasoning errors. Such strategies are a special case of boosting that work through engaging the intended beneficiaries in collective deliberation. The form of boosting we will consider does not involve such deliberation. We share the concerns of Le Grand & New (2015, p. 142) concerning the general practicality and likely effectiveness of think strategies.

An additional contrast relevant to normative assessment is that a nudge would normally be expected to have effects only on the specific behavior to which it is applied, and only in the setting that the nudge adjusts. A boost, on the other hand, to the extent that it alters standing cognitive capacities and associated behavioral propensities across ranges of structurally similar choice problems, might be hoped to generate the 'rationality spillovers' discussed by Cherry et al (2003). Furthermore, boosting might plausibly capacitate people with defenses against non-benevolent nudging by narrowly self-interested parties such as marketers and demagogues.

The classic example of nudging is changing default options. If the policy maker thinks that workers ought to invest in retirement savings plans, then the policy maker can make participation the outcome if the DM is passive, so that the DM needs to take action only to act on a preference not to participate. The leading example of a boost discussed by GYH is teaching people to represent the alternatives in risky decisions as natural frequencies, even when they are presented as probabilities. This is thought to improve the quality of choices because evidence suggests that some people are more likely to use 'accuracy-promoting' heuristics when reasoning about the former than when reasoning about the latter.

Almost all examples in the literature on both nudges and boosts resemble these in taking the policy maker or the educator as the target community for whose consideration the policies are proposed. Though there is typically a general
presumption that members of these communities should prefer to avoid gratuitous paternalism, it is often assumed that their primary aim is to maximize the probability that DMs influenced by their policy choices or educational interventions will maximize their welfare. Examples are typically constructed in such a way that what is taken to be the welfare-maximizing behavior is transparent.

This frame will strike many economists as problematic. Economists are typically more reluctant than policy makers or pedagogues to help themselves to opinions about what constitutes an agent's welfare. There is a strong tradition in economics of treating preferences as summaries of, or statistical patterns in, actual choices, rather than as independent standards against which to try to regulate decisions. Clearly this is partly because mainstream economics descends historically and intellectually from utilitarian and classical liberal political and moral philosophies that view paternalism as more or less anathema. But suspicion about welfare judgments that aren't derived directly from the observed behavior of the people whose welfare is being judged also has other, more deliberative, sources.

First, economists are typically highly sensitive to prospects for unintended consequences of policies. They see these as mainly arising from the interactions of people with heterogeneous preferences, or differing resources, or both, and so are less sanguine than many policy makers about letting normative considerations that are not fully decentralized drive policy choices. A myriad of microscale decisions, economists often suppose, will tend toward equilibria in which each participant is making the best choice for herself that she can, given the choices of everyone else. Thus economists are often more comfortable making welfare assessments ex post rather than ex ante. But both nudging and boosting depend on ex ante evaluations. Second, economists distinguish welfare, a technical concept of their own construction that is by definition subjective, but for which they have a well-stocked and venerable analytical tool-kit, from well-being, a broader but vaguer idea on which philosophers have long tolerated and indeed fostered disagreement.

Economists who emphasize the 'positive' nature of their enterprise, such as Friedman (1953), might simply assert that the merits or downsides of nudging and boosting are none of their concern ex ante, just as with all other normative questions. However, over the past couple of decades this has become a minority stance within the discipline. Leamer (2012) stresses that most economists think that theirs is policy-driven inquiry, in the strong sense that the hierarchy of interesting problems largely derives from the practical requirements of the businesses, governments, and households that seek their advice. The majority of economic inquiry is not basic research but is commissioned by clients seeking assistance in policy selection and design.

A more common view is that intervention to modify a target person's behavior can be acceptable paternalism when it corrects (and merely corrects) for failures of the target's rationality, [3] while any proposal for intervention that imposes normative
judgments about the best way to live that the target might not share faces a prima facie obligation to morally justify the specific usurpation of the target's autonomy.

[3] Le Grand & New (2015) philosophically analyze government, as opposed to private, paternalism, and refer more broadly to corrections of 'judgment' rather than corrections of 'rationality.' We endorse their semantic preference. However, in the context where we are characterizing views common among, specifically, economists, 'rationality' is the more accurate term. Le Grand & New (2015) defend the normative thesis that justification of paternalism requires identification of a correctible judgment. We conjecture that most economists could be persuaded without much strain to agree that substituting the broader concept of judgment for a narrower concept of rationality would respect their normative concerns. However, incorporating that adjustment here would both require a distracting foray into wider issues in the philosophy of economics, and gratuitously complicate our focus on the interrelationship between economists' normative assumptions and the technical resources they use in welfare analyses.

This is the approach of some behavioral welfare theorists, such as Bernheim & Rangel (2008), who argue for appeal to psychological facts about targets to ensure that, when the economist's advice implies over-ruling a target's immediate preference, there is good reason to believe that the target's ex post preference will accord with the judgment implied by the advice. For example, if a person's behavior exhibits conflict between wanting to smoke and wanting to break the addiction, policy should side with the latter preference because, as a matter of psychological fact, few if any ex-smokers regret having quit, while most continuing smokers regret their recurrent lapses of willpower. [4] These kinds of situations involving intrapersonal conflict and ambivalence are sometimes thought to mark the generic enabling conditions for acceptable nudging. Where they do not apply, the view would elaborate, we should try to change people only by teaching (or transparently incentivizing) them, not by manipulating them: that is, we should boost (or hire), not nudge.

[4] The idea here is not that preferences over options arising later in time should generally be regarded as dominating preferences over options arising earlier in time. The proposal of Bernheim and Rangel (2008) is that the welfare analyst should search for a choice environment in which the target agent's preferences are consistent. Earlier or later time slices of the agent's biography, drawn from environments in which consistency is violated, are treated as preferences of other agents. The welfare analyst then recommends any Pareto-consistent policies, applied to the community of subagents, that she can find. This of course allows for, indeed predicts, situations in which no recommendation between some alternative policies is favored.

We are concerned with the distinction between nudging and boosting as it applies to what we believe to be a representative context of commissioned economic research. What we show is that the economist's need to operate with a technically precise model of the information built into the utility functions assigned to agents exposes problematic simplifications in the way in which the nudging versus boosting distinction is normatively interpreted. The behavioral welfare theorist's suggested meta-policy fails to give the economist helpful advice in the most common sorts of policy situations of practical interest.

We emphasize our methodological focus on practical issues that arise for applied economists, as opposed to philosophical issues that dominate abstract debates. Philosophical discussions, as in Hausman (2011), often proceed, for understandable reasons, by considering the implications of conceptual distinctions for idealized, general, or hypothetical cases, set up so as to push pragmatic 'side issues' into the background. We are not directly engaging the debate at that level of abstraction. Thus we should not be interpreted as trying to argue that nudging and boosting are conceptually indistinguishable. It is clear enough that changing people's behavior by altering its context and changing their behavior by teaching them new cognitive skills are not in
general the same kind of thing, and that this difference is significant where concerns about paternalism arise. Our point, instead, will be to illuminate complexities that arise for this philosophically clear-enough distinction when it is exported from its home territory in purely normative policy and meta-policy debates into an everyday domain of economic engineering. In this domain, normative and technical considerations are typically tightly entangled, as we illustrate. We argue that a meta-policy, according to which boosting is morally unproblematic, while nudging proposals must always be accompanied by responses to concerns about paternalism, is awkwardly adapted to the front line of applied economics. If we see economics as largely a policy science, a form of institutional engineering, then economists cannot simply refuse to engage with normative complexities. But Leamer (2012) also reminds us that philosophical distinctions developed in vivo need to be examined in situ if they are to be made fully relevant to economists.

We conduct this exercise by describing a recent consulting project we carried out for a large South African retailer of investment products, and asking whether what we were doing for our client was helping them nudge their customers or helping them boost those customers. We also ask where any potential moral issues of interest arise, and for which parties. Crucially, our exercise was not designed to be a test-bed for conceptual or normative issues. Equally importantly, the advice we based on it, if implemented by the client, will have real consequences for individuals and households.

In Section 2 we describe the commissioned experimental research that we conducted, and the advice we were asked to provide on the basis of it. Section 3 motivates the analyses we performed on the data, and the results we obtained. Section 4 pulls the preceding strands together and gives the argument for the main methodological conclusions. Although we have stressed that we are not engaged in first-order philosophical investigation into the idealized concepts of nudging and boosting, we believe that debates drawn from the philosophy of mind and agency can shed diagnostic light on the difficulties encountered in translating welfare theory into policy-focused practice. This diagnosis is outlined in our concluding Section 5.

2. Helping Investment Product Retailers Give Better Customer Advice

In 2014 we accepted a commission for research from a major South African retailer [5] of household investment products, which are primarily mutual funds in American terminology.
The company's motivation in commissioning the research began from its observation, nearly universal in the industry, of many clients buying products that were sensible investments given the clients' stated savings and earnings goals, but only assuming tolerance for pre-specifiable ranges and average durations of decline in net product value, and then selling back the products, or compounding losses by churning their portfolio elements, upon encountering the predicted episodes of decline.

[5] Our not naming the company is part of a general policy observed here of censoring information that explicitly or implicitly reveals commercially valuable results of our research furnished to our client. This precludes our describing any results in terms of monetary magnitudes.

The company hoped to reduce the extent of this behavior. In general, a company can seldom expect to maximize its sales volumes, customer base, or brand reputation if many of its customers systematically fail to derive full value from its products due to misuse. Investment portfolios can be unusual where this relationship is concerned, however, because volumes of commissions to providers and their agents are typically driven up, rather than down, when clients over-churn. This incentive to encourage, or not fully discourage, client over-activity is countered by losses of business when disappointed clients withdraw their funds altogether. Over-churning by large proportions of clients can in extreme cases disrupt the performance metrics on a company's funds. We had no access to our client's accounts, so we cannot comment on the mixture of self-interest and social responsibility in its motivations for wishing to see more of its customers behave in a way that optimized their expected returns. But given the prominence of our client's brand, we would be surprised if social responsibility were not a relevant factor.

The company hypothesized that its customers might show greater resilience during periods of portfolio value decline if, when they chose their portfolios, they were presented with richer information about the histories of net value movements in the set of alternative products, formatted in a way thought to correspond to widespread patterns of cognitive adaptedness. [6] The need for us to guard our client's intellectual property limits the extent of detail with which we can describe this informational intervention. However, we can say enough to locate the intervention in terms of the distinction between nudging and boosting.

[6] 'Adaptedness' refers in evolutionary psychology to pre-adapted dispositions a subject brings to a task.

The client's customers, when meeting with a broker to choose portfolios, were typically told only about options' probable long-run rates of return on initial investment, maximum expected 'drawdown' (the lowest value likely to be visited by the asset's value walk), and historical standard deviation. This allowed for a crude, qualitative operationalization of 'risk aversion': if a customer indicated discomfort with the maximum expected drawdown, they would be advised to opt for a portfolio with lower variance at the expense of a more modest expected long-run return. The client's 'education intervention,' which we were asked to experimentally test, provided clients with online charts showing full histories of the portfolios under consideration. These showed historical variance in the strict sense, along with skew and kurtosis in distributions of returns. Furthermore, the information site was interactive, so that the customer could retrieve definitions and brief explanations of the risk-related portfolio properties displayed.
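To make the informational content of the intervention concrete: the risk-related properties it displayed (historical standard deviation, skew, kurtosis, and maximum drawdown) can all be computed from a month-end price series. The following sketch is purely illustrative, assuming simple monthly returns; it is not the client's proprietary implementation, and the function name is ours.

    import numpy as np

    def risk_summary(prices):
        """Illustrative computation of the summary risk statistics the
        intervention displayed, from a month-end price series."""
        prices = np.asarray(prices, dtype=float)
        returns = prices[1:] / prices[:-1] - 1.0       # simple monthly returns

        # Maximum drawdown: largest peak-to-trough decline in the price path.
        running_peak = np.maximum.accumulate(prices)
        max_drawdown = np.min(prices / running_peak - 1.0)

        mean, sd = returns.mean(), returns.std(ddof=1)
        z = (returns - mean) / sd
        skew = np.mean(z**3)                           # third standardized moment
        excess_kurtosis = np.mean(z**4) - 3.0          # relative to the normal

        return {"mean": mean, "sd": sd, "skew": skew,
                "excess_kurtosis": excess_kurtosis, "max_drawdown": max_drawdown}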
The intervention included no simple heuristics or motivating messages of the kind which Ambuehl, Bernheim and Lusardi (2014) found can, under some circumstances, lead retail investors to choose less optimally (with respect to their subjective utility) than if they are provided with objective information only.

Our research consisted in designing, administering, and analyzing a controlled trial of a prototype of the intervention. The client believed that most customers they perceived as 'rational,' in the sense that they did not prematurely sell their portfolios or over-churn, would be annoyed and discouraged by the time involved in experiencing the intervention, and might find the explanatory notes condescending. The client therefore wanted to identify demographic characteristics of potential customers that could predict which subsets of the customer base were likely to benefit from the intervention. We brought to the client's attention that scientifically estimated risk preference structures, which we could elicit in an experiment, might prove to be at least as informative as demographic properties. The client agreed that our experiment should explore this aspect.

Our specific research design involved a sample of 193 subjects, who for reasons of convenience related to budget constraints were employees of the University of Cape Town (UCT). For each subject we estimated their aversion to risk, and then assigned them randomly to one of two investment treatments. Risk attitudes were measured by evaluating a series of choices by each subject between pairs of lotteries that had an average yield of 300 South African Rand (R300, which exchanged for about US$27 at the time of the experiment). In this Lottery Task, 50 pairs of lotteries were chosen at random from a set of 100 pairs and presented to the subjects sequentially on computer screens in the form of pie charts, illustrated in Figure 1. The subjects were asked to choose one lottery from each pair by clicking on the corresponding button below their preferred lottery. One of the 50 choices was selected at random for realization and payment.

[Figure 1 about here]

The data generated by performance of this task allowed us to estimate the structures of risk preferences for each subject. Lottery tasks similar to the ones employed here have been used to estimate risk preferences for individuals, typically using maximum likelihood estimation in the spirit of Hey & Orme (1994) and Harrison & Ng (2016), as sketched below.
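A minimal sketch of that estimation strategy, under assumptions we state explicitly: CRRA utility (defined in Section 3), a logistic (Fechner-style) link between utility differences and choice probabilities, and binary lottery pairs. The likelihood actually used in our analysis differs in details, and the names below are ours.

    import numpy as np
    from scipy.optimize import minimize

    def crra(x, r):
        """CRRA utility U(x) = x^(1-r)/(1-r), for r != 1."""
        return x**(1.0 - r) / (1.0 - r)

    def neg_log_likelihood(params, lotteries, choices):
        """EUT likelihood of observed binary lottery choices.
        lotteries: list of (probsL, prizesL, probsR, prizesR) tuples;
        choices: 1 if the right lottery was chosen, else 0."""
        r, mu = params                      # risk attitude and noise scale
        if mu <= 0:
            return np.inf
        ll = 0.0
        for (pL, xL, pR, xR), c in zip(lotteries, choices):
            euL = np.dot(pL, crra(np.asarray(xL, float), r))
            euR = np.dot(pR, crra(np.asarray(xR, float), r))
            p_right = 1.0 / (1.0 + np.exp(-(euR - euL) / mu))
            ll += np.log(p_right if c == 1 else 1.0 - p_right)
        return -ll

    # Per-subject fit over that subject's 50 observed choices:
    # fit = minimize(neg_log_likelihood, x0=[0.5, 0.1],
    #                args=(lotteries, choices), method="Nelder-Mead")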
In the Investment Task, each subject chose among simulated investment funds modeled on products available in the South African market, and received payment based on the simulated performance of the funds they chose. Subjects in the control treatment received names of investment funds with basic information on each fund: investment objective, return history, standard deviation, and maximum drawdown. Subjects in the treatment group were additionally provided with the 'education intervention.' To avoid uncontrolled interaction between laboratory objects and subjects' varying knowledge of real-world objects, we designed simulated funds based on the principles that informed the original design of mutual funds available to retail consumers in South Africa. We coined names for the simulated funds that mimic those used by their providers. The expected performance of each simulated fund was based stochastically on the historical performance and volatility of the real funds that furnished their models. The simulated market was designed to be moderately bullish, such that the average take-home per subject from this part of the study would be R250. [7]

[7] Subjects also made predictions of future events, indicating their degrees of confidence in their predictions, and were rewarded with cash payments of up to R100 when their predictions were correct, with rewards reduced commensurately with subjects' confidence levels. Analysis of the results of this task will not figure in the discussion here, so we pass over design details.

In the Investment Task subjects were endowed with R65 and presented with 8 possible simulated funds in which they could invest their endowment. Each of these 8 funds represented an approximation to a financial product to which subjects could potentially have access through a brokerage. Each simulated fund was a discretized lottery of the continuous distribution of historical returns associated with the real-life counterpart of the simulated fund in question. The 8 simulated funds were composed of 4 types: high equity, medium equity, low equity, and interest bearing. There were two simulated funds per type in the choice set, representing the existence of competing products in the actual marketplace.

Before the subjects made any choices in this task, it was explained that the task involved choosing an investment portfolio that would be played out against a simulated market. This market was represented by the 50,000 possible states of the world to which the real-world funds were mapped in discrete intervals. Subjects were told that, for practical reasons, one of these 50,000 states would be randomly selected, before they had made their choices for this task, to calculate their investment earnings for the experimental session. Die-rolling by subjects was used to select one of the simulated markets.

The task started with a screen explaining that a certain amount of money was to be invested in one or more funds. The different types of funds were explained, but without details on their potential returns. Those in the treatment group were then presented with the interactive 'education intervention' that allowed exploration of the histories of the funds, in formats hypothesized to be cognitively accessible. Subjects in both the treatment and control groups were then allowed to allocate their endowments to funds, and everyone saw some base level of information about the potential fund returns: the expected 3-year and 5-year returns, the standard deviation of yearly returns, and the maximum drawdown of each fund. Subjects were asked to invest in as many funds as they wanted. After each subject had completed all of their experimental tasks, a research assistant tallied their earnings on a record sheet and then privately paid them in cash.
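The mapping from a fund's historical performance to a large set of simulated 'states of the world' can be sketched as follows. The lognormal return model and the parameter values are our illustrative assumptions, not the actual simulation design, which we cannot disclose.

    import numpy as np

    rng = np.random.default_rng(seed=42)

    def simulate_fund_states(mean_annual, sd_annual, n_states=50_000):
        """Generate simulated states of the world for one fund from its
        historical mean return and volatility (lognormal gross returns
        are assumed here purely for illustration)."""
        mu = np.log1p(mean_annual) - 0.5 * sd_annual**2
        gross = rng.lognormal(mean=mu, sigma=sd_annual, size=n_states)
        return gross - 1.0                      # net return in each state

    # One state, chosen by the subjects' die rolls, determines payment:
    states = simulate_fund_states(mean_annual=0.12, sd_annual=0.18)
    realized = states[rng.integers(len(states))]
    payout = 65 * (1 + realized)                # e.g., the full R65 in one fund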
3. Analytical Methods and Results

Idealized discussions of welfare and of economic policy have, at least until the recent emergence of the behavioral literature, taken the Expected Utility Theory (EUT) of Savage (1954) to provide the basic technical apparatus for normatively comparing alternative states of the world for an agent. Binmore (2009) provides an authoritative updating of this theoretical landmark in the context of contemporary operationalization, with suitable cautions against hubristic over-extension.

Behavioral economists often interpret their work as motivating revisions to, or, for those who favor rhetorics of disruption, paradigm replacements for, EUT. Among the various formal models of choice under risk or uncertainty that are contrasted with EUT, the Cumulative Prospect Theory (CPT) of Tversky and Kahneman (1992) has received the most attention. A common strategy in both theoretical and applied behavioral economics has been to run 'horse races' between EUT and CPT or another alternative as rival models for estimating a specific data set, and to urge that the winner of the race, that is, the model that yields the best-fitting estimation, should then be used as the basis for empirical interpretation. Such horse races stack the deck against EUT when, as is almost always the case, the other horse has greater structural complexity and observed behavior is economically heterogeneous (Ross 2005, pp. 174-176). When an investigator following this approach concludes that EUT is the 'losing' contender with respect to empirical estimation, the question remains open about how to proceed to normative analysis. An economist who follows Savage (1954) in thinking that EUT is the normatively correct model of 'rational' decision, regardless of the extent to which real human choice conforms to it, might analyze agents' welfare against the outcomes they would have obtained had EUT correctly characterized their behavior. Alternatively, one might employ the latent utility function embedded in a more elaborate model of risk preferences, as proposed by Bleichrodt et al (2001).

Following recent theoretical advances summarized in Harrison and Rutström (2008), the technical apparatus used to analyze the experiment we discuss goes beyond this 'horse race' methodology. It reflects advances in understanding of the relationship between CPT and other alternatives to EUT as descriptive models of choice estimated at the level of individuals. Define the risk premium as the difference between the actuarial expected value of a risky prospect and the certain amount of money an individual would accept in exchange for giving it up. Assume there is no bargaining process causing the individual to strategically mis-state this certainty equivalent if asked for it directly or indirectly. We consider two core models of decision-making under objective risk. [8] One is EUT, which posits that the risk premium is explained solely by an aversion to variability of earnings from a prospect. The second is the Rank-Dependent Utility (RDU) model of Quiggin (1982), which further posits that decision-makers may be pessimistic or optimistic with respect to the probabilities of outcomes. RDU does not rule out aversion to variability of earnings, but augments it with an additional psychological process. The process may be 'latent' or 'virtual' in the sense associated with Dennett's (1987) intentional stance; [9] that is, it might not refer to a specific physical computation 'in a person's head,' but to an equivalence class of relationships between decision contexts and observed choices. Both EUT and RDU assume that individuals asset integrate, in the sense that they net out framed losses from some endowment.

[8] We consider decision making under objective risk because all of our methodological points can be made in that setting. An important extension would be to consider risk preferences under subjective risk, using either Subjective Expected Utility or models that allow for uncertainty aversion when individuals do not apply the Reduction of Compound Lotteries axiom to subjective probability distributions. In that latter case some aspect of the distribution, other than the average, matters for decisions: see Harrison (2011; §4). Models that allow for ambiguity aversion when individuals do not even have well-formed subjective probability distributions could also be considered, but this would raise many additional issues of positive and normative methodology well beyond our immediate remit.

[9] The intentional stance is discussed in Section 5.

We do not estimate our data using CPT. Our avoidance of CPT is based on analysis of its relationship to RDU, both theoretically and in application to empirical data.
Harrison & Swarthout (2016) provide an extensive literature review, which finds that most reported evidence for 'loss aversion' is actually evidence for probability weighting. They also report evidence of (at least local) asset integration in the laboratory, which is fatal for the empirical adequacy of CPT. Harrison and Ross (2017) review further evidence, and consider the implications for welfare assessment of the conjecture that the many reported 'horse race' victories of CPT over EUT were really wins for RDU in disguise, where CPT's successes stemmed from its allowance for probability weighting rather than 'utility' loss aversion relative to an idiosyncratic reference point. We thus focus on EUT and RDU.

We begin with EUT. Assume that utility of income is defined by a utility function U(x), where x is the lottery prize. Under EUT the probabilities for each outcome xj, p(xj), are those induced by the experimenter, so expected utility is the probability-weighted utility of each outcome in each lottery. Once the utility function is estimated, risk aversion is measured. The concept of risk aversion traditionally refers to 'diminishing marginal utility,' which is driven by the curvature of the utility function, given by its second derivative. Although loose, this can be viewed as characterizing individuals who are averse to mean-preserving increases in the variance of returns. We assume that utility of income reflects constant relative risk aversion (CRRA), defined by U(x) = x^(1-r)/(1-r), where x is a lottery prize and r ≠ 1 is a parameter to be estimated. Then r is the coefficient of CRRA for an EUT individual: r = 0 corresponds to risk neutrality, r < 0 to a risk-loving attitude, and r > 0 to risk aversion.

The RDU model extends EUT by allowing for decision weights on lottery outcomes. These decision weights reflect probability weights on objective probabilities. The decision weights are defined after ranking the prizes from largest to smallest. The largest prize receives a decision weight equal to the weighted probability for that prize: the decision weight reflects the probability weight of getting at least that prize. The decision weight on the second largest prize is the probability weight of getting at least that second largest prize, minus the decision weight of getting the highest prize. Similarly for other prizes.

Subjects' risk preferences were analysed based on the Lottery Task. Again, we conducted analysis based on the assumption that each subject's behavior was best characterized either by EUT or by RDU.
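The two functionals just described can be written compactly. The following sketch implements the ranking construction in the text, with the probability weighting function left as a pluggable argument (EUT is the special case ω(p) = p); the power weighting in the example call anticipates the functions defined below and is only illustrative.

    import numpy as np

    def crra(x, r):
        return x**(1.0 - r) / (1.0 - r)            # CRRA utility, r != 1

    def rdu_value(prizes, probs, r, weight=lambda p: p):
        """Rank-dependent utility of a lottery; reduces to EUT when the
        weighting function is the identity."""
        order = np.argsort(prizes)[::-1]           # rank prizes largest first
        x = np.asarray(prizes, float)[order]
        p = np.asarray(probs, float)[order]
        cum = weight(np.cumsum(p))                 # weighted P(at least this prize)
        dw = np.diff(np.concatenate(([0.0], cum))) # decision weights, as in the text
        return np.dot(dw, crra(x, r))

    lottery = ([100.0, 300.0], [0.5, 0.5])
    eut = rdu_value(*lottery, r=0.65)                             # EUT evaluation
    rdu = rdu_value(*lottery, r=0.65, weight=lambda p: p**1.4)    # 'pessimistic' RDU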
When a subject was estimated to be an RDU agent, we tested further to determine which of several probability weighting functions best characterized the pessimism or optimism about probabilities. We consider three popular probability weighting functions. The first is the 'power' probability weighting function with curvature parameter γ: ω(p) = p^γ. So γ ≠ 1 is consistent with a deviation from the conventional EUT representation. [10] The second probability weighting function is the 'inverse-S' function: ω(p) = p^γ / (p^γ + (1-p)^γ)^(1/γ). This function exhibits inverse-S probability weighting (optimism for small p, and pessimism for large p) for γ < 1, and S-shaped probability weighting (pessimism for small p, and optimism for large p) for γ > 1. The third probability weighting function is a general functional form proposed by Prelec (1998) that exhibits considerable flexibility. This function is ω(p) = exp{-η(-ln p)^φ} and is defined for 0 < p ≤ 1, η > 0 and φ > 0. The RDU agent is also assumed to have a CRRA utility function with parameter r.

[10] Convexity of the probability weighting function, when γ > 1, is said to reflect 'pessimism' and generates a risk premium if one assumes, for simplicity, a 'linear' utility function, since then ω(p) < p for all p and hence the RDU expected value (EV) weighted by ω(p) instead of p has to be less than the EV weighted by p.

We can use the results from a specific subject to illustrate the type of risk preferences estimated. Consider subject #22. We first determine whether subject #22 should be classified as an EUT or an RDU decision-maker. The log-likelihood value calculated for the best RDU model (-27.0) is better than the log-likelihood of the EUT model (-28.9), so by this metric the subject would be classified as RDU with the Prelec probability weighting function. The difference in log-likelihoods, however, is numerically quite small. Once we test for the subject being EUT, the null hypothesis that ω(p) = p cannot be rejected at the 5% or 1% significance level, since the p-value is 0.099; it would be rejected at the 10% level. Thus the classification of this subject depends on the significance level used, following Harrison and Ng (2016).

If the sole metric for deciding whether a subject was better characterised by EUT or RDU were the log-likelihood of the estimated model, then there would be virtually no subjects classified as EUT, since RDU nests EUT. But if we use metrics of 10%, 5% or 1% significance levels on the test of the EUT hypothesis, then we classify 50%, 57% or 67%, respectively, of our 193 subjects with valid estimates as being EUT-consistent. Figure 2 displays these results using the 5% significance level. The left panel shows a kernel density of the 193 p-values estimated, for each individual, for the EUT hypothesis test that ω(p) = p; we use the best-fitting RDU variant for each subject. The vertical lines show the 1%, 5% and 10% p-values, so that one can see that subjects to the right of these lines would be classified as being EUT-consistent: at the representative 5% threshold, 57% of the density lies to the right of the middle vertical line. The right panel shows the specific allocation using that 5% threshold.

[Figure 2 about here]
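Because RDU nests EUT, the classification logic can be sketched as a likelihood-ratio test. One caveat: the p-values reported above come from our test of the hypothesis ω(p) = p on the estimated weighting parameters, so this generic sketch will not reproduce them exactly (it yields roughly 0.15 rather than 0.099 for subject #22); it illustrates only the decision rule.

    from scipy.stats import chi2

    def classify_subject(ll_eut, ll_rdu, n_weighting_params, alpha=0.05):
        """Classify a subject as EUT-consistent unless EUT is rejected
        against the best-fitting RDU model at significance level alpha."""
        lr_stat = 2.0 * (ll_rdu - ll_eut)       # valid because RDU nests EUT
        p_value = chi2.sf(lr_stat, df=n_weighting_params)
        return ("RDU" if p_value < alpha else "EUT"), p_value

    # Subject #22: log-likelihoods -28.9 (EUT), -27.0 (RDU Prelec, 2 extra params).
    label, p = classify_subject(-28.9, -27.0, n_weighting_params=2)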
We now turn to the data generated by the Investment Task. Our aim in the analysis of subjects' investment choices was to identify whether the information provided under the treatment, our client's education intervention, had a significant effect in reducing what we refer to, and described to our client as, subjects' 'welfare loss.' The significance of this interpretation of the analysis will be critically revisited below. We made it explicit to our client that we viewed welfare loss as the difference between the certainty equivalent of the optimal portfolio conditional on risk preferences and the certainty equivalent of the actual portfolio chosen. The certainty equivalent (CE) is the certain, non-risky return that is equivalent, in terms of a subject's subjective utility, to the expected utility or (alternatively, depending on the subject) rank-dependent utility of the risky return. We used the estimated expected utility or rank-dependent functionals for each subject to calculate the CE. This approach to welfare evaluation follows Harrison and Ng (2016).

In estimating portfolio optima, we used a bootstrapping method, which we made less computationally intensive by optimizing over a grid of parameter values intended to map the range of feasible estimates, and then interpolating the bootstrapping procedure. Based on the distribution of point estimates of parameters, taking into account standard errors, we optimize portfolio allocations for the following parameter values: EUT: r = (0, 0.05, 0.1, ..., 2, 2.5, 3, 3.5); RDU Power: r = (-10, -5, -3, -2, -1, 0, 0.1, 0.2, ..., 1, 1.25, 1.5) and γ = (0.2, 0.7, 1.2, ..., 3.2, 4, 5); RDU Inverse-S: r = (-10, -5, -3, -2, -1, 0, 0.2, 0.4, ..., 1.6) and γ = (0.3, 0.4, 0.5, ..., 1.1); RDU Prelec: r = (-10, -5, -3, -2, -1, 0, 0.25, 0.5, ..., 2), η = (0.3, 0.8, 1.3, ..., 2.8), and φ = (0.5, 0.7, 0.9, 1.1, 2, 3).

Figure 3 displays the risk-return tradeoff from the simulated funds in the Investment Task. The return is the average of the annualized returns on the fund, and the risk is the standard deviation of the annualized returns on the fund. The returns here come from 50,000 simulations of fund performance, based on historical data on returns. We observe that for higher average returns the investor must be willing to take on greater risk, which is no surprise. But in some cases the extra return entails only a minimal increase in risk: for instance, compare the X123 Equity fund with the ABC Multi High fund. The evaluation of these increments in risk, exchanged for increments in return, depends on the attitude to risk of the investor, if we assume that the subjective risk perceptions of the investor match these historical returns.

For each of the high, medium and low equity asset classes, the historical performance of a mutual fund in each class was derived from returns for the whole asset class. [11] The second funds in each of the high and medium equity classes were simulations of real funds traded in the South African market. For the low equity fund, historical performance was equated to the historical inflation movement plus 5%. The interest bearing funds were derived from historical data on the interest bearing variable term funds and money market funds, respectively, also retailed in South Africa. Month-end price data from June 2001 to August 2014 were used to determine the funds' performance parameters, such as historical returns and standard deviation of returns. This period included the bull run of 2006/2007, the global financial crisis of 2007/2008, and the recovery period post-2008.

[11] We did not group asset classes based on subjective judgment. They were defined as per the Association for Savings and Investment South Africa categories used by financial advisors.
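The welfare-loss calculation defined at the start of this section can be sketched directly. For CRRA utility the certainty equivalent inverts the utility of the portfolio's evaluation; the function names and the equal weighting of simulated states are our illustrative assumptions.

    import numpy as np

    def certainty_equivalent(payouts, r, weight=lambda p: p):
        """CE of a risky portfolio over equally likely simulated states,
        under EUT (identity weight) or RDU (rank-dependent weights)."""
        x = np.sort(np.asarray(payouts, float))[::-1]   # rank outcomes, largest first
        p = np.full(x.size, 1.0 / x.size)
        cum = weight(np.cumsum(p))
        dw = np.diff(np.concatenate(([0.0], cum)))      # decision weights
        value = np.dot(dw, x**(1.0 - r) / (1.0 - r))    # CRRA evaluation
        return ((1.0 - r) * value)**(1.0 / (1.0 - r))   # invert CRRA utility

    def welfare_loss(actual_payouts, optimal_payouts, r, weight=lambda p: p):
        """Welfare loss = CE(optimal portfolio) - CE(actual portfolio)."""
        return (certainty_equivalent(optimal_payouts, r, weight)
                - certainty_equivalent(actual_payouts, r, weight))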
[Figure 3 about here]

Figure 4 shows the number of funds that received some allocation of the R65 subjects had available to invest. There is a clear mode at 2 funds, with very few subjects investing in more than 4 funds. Relatively few subjects chose to invest all of their money in one fund. Of course, this does not show us whether the funds invested in were optimal or how sub-optimal they were.

[Figure 4 about here]

The optimal allocation to equity funds was relatively easy to characterize. Using relative risk aversion (r) as a summary, descriptive measure of the risk premium, we found that 100% of the endowment of R65 would optimally have been allocated to the ABC Company Equity Fund for all values of r up to 0.62, with that fraction then declining to about 50% as r approaches 1. The residual is entirely the 123 Company Equity Fund. The vast bulk of estimates of relative risk aversion in the laboratory are around 0.65, with some variation of course: see Harrison and Rutström (2008) for a survey.

Figure 5 shows the average allocation to each fund, where the total that could be invested was R65. We show a vertical red line at the 50% mark for reference. [12] In this display the funds are ordered from smallest (average) allocation to largest, so one has to pay attention to the names of the funds. For the averages we see that the two equity funds received the highest average allocation, but that the 123 Company Equity Fund was only the third most popular in terms of median allocations. Figure 5 also displays the average allocations to all funds in comparison to the optimal allocations. Since we find that all optimal allocations should be to the two equity funds, we aggregate these funds and show the optimal allocation as R65, or 100% of the portfolio. The remaining funds should always receive a zero allocation. Viewed in this light, and ignoring the optimality of the allocation within equity funds, we can see that the average investor was making a qualitatively optimal investment, with the majority of allocations to the equity funds. However, the level of allocations falls short of the optimal amount of R65. The distance between the average observed allocations and the optimal allocations is what generates the welfare losses we reported. These distances only tell us that there will be a welfare loss on average: to evaluate the significance of that loss we evaluated the foregone CE from the observed portfolios compared to the CE of the optimal portfolio.

[12] The median allocations are close to the average allocations except for the Equity Fund. In that case the median is exactly R32.5, or 50% of the portfolio.

[Figure 5 about here]

Each CE calculation uses 50,000 draws from the multivariate normal distribution underlying the simulated funds. These CE are conditional on estimates of the parameters defining risk preferences, and the uncertainty of the estimates is allowed for by sampling 500 draws from the joint parameter distribution. This distribution has means equal to the parameter point estimates, based on the winning risk preference structure model for the individual at the 5% significance level, and covariance equal to the estimated covariance matrix between the parameter estimates. Multivariate normality of the joint parameter distribution is assumed, which is potentially problematic with large standard errors for some subjects: very high or low estimates of probability weighting parameters give rise to implausible decision weight schemes, and very high or low estimates of the relative risk aversion coefficient give rise to numerical overflow. Simulated values of risk preference parameters were accordingly constrained within the following bounds: EUT: r ∈ [-5,5]; RDU Power: r ∈ [-10,10], γ ∈ [0.2,5]; RDU Inverse-S: r ∈ [-10,10], γ ∈ [0.3,3]; RDU Prelec: r ∈ [-10,10], η ∈ [0.3,3], φ ∈ [0.3,3].

Welfare loss calculations could be performed for 174 of the 193 subjects. The remaining 19 were those for whom a winning model could not be assigned because the estimated coefficient of relative risk aversion was arbitrarily close to one. Negative welfare losses are calculated in several instances because of inaccuracies in the multilinear interpolation method, which can give rise to a nominally 'optimal' portfolio that is in fact sub-optimal, yielding a lower CE than the actual allocation chosen.

Each of the 500 simulations presents a set of risk preference parameters, conditional on which welfare loss can be calculated. For each of these simulations, a t-test can reveal whether the mean welfare loss is significantly lower for the treatment group than for the control group. We allow for the error with which risk preference parameters are estimated by performing the test for each simulation and examining the distribution of test results, as sketched below.
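The propagation of estimation error just described has three essential ingredients: multivariate normal parameter draws, clipping to the bounds above, and a one-sided t-test per simulation. The following is a minimal sketch (a Welch-style test is assumed); our actual implementation differs in detail.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)

    def parameter_draws(point_est, cov, bounds, n_sims=500):
        """Draw risk-preference parameter vectors from the (assumed)
        multivariate normal estimation distribution, clipped to the
        plausibility bounds given in the text."""
        draws = rng.multivariate_normal(point_est, cov, size=n_sims)
        lo, hi = np.array(bounds, float).T
        return np.clip(draws, lo, hi)

    def one_sided_pvalue(losses_treatment, losses_control):
        """P-value for H1: mean welfare loss is lower under treatment."""
        t, p_two = stats.ttest_ind(losses_treatment, losses_control,
                                   equal_var=False)
        return p_two / 2.0 if t < 0 else 1.0 - p_two / 2.0

    # e.g., for an RDU Prelec subject: bounds = [(-10, 10), (0.3, 3), (0.3, 3)],
    # with one p-value computed per simulated parameter draw.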
Figure 6 displays the average welfare loss, in Rand, for each subject for whom we could generate valid estimates of risk preferences and optimal portfolios conditional on those risk preferences. Truncating a small fraction of welfare losses greater than R300, we observe that the density of welfare losses is much smaller under the Education Intervention Treatment than under the Control. Hence we conclude that the Education Intervention Treatment leads to better decisions being made about investment in this setting, which was designed to mimic, under controlled conditions, the natural setting in which the intervention will be applied.

[Figure 6 about here]

Figure 7 shows that the Education Intervention Treatment did not generate a greater dispersion in welfare losses. This is useful to know, since greater dispersion might have mitigated the benefits of the reduction in the average of welfare losses.

[Figure 7 about here]

Figures 8 and 9 show that the Education Intervention Treatment had benefits for both EUT and RDU decision-makers, but that the benefits for the RDU decision-makers are much larger. In part, this is because the RDU decision-makers suffered greater welfare losses even in the Control.

It is easier to evaluate the total and marginal effects of various demographics and treatments using descriptive statistical methods such as a regression of average welfare loss. When the right-hand-side covariate is just the demographic characteristic or treatment dummy variable, we evaluate the 'total effect' of the covariate, which is the effect taking into account all of the correlated effects of covariates that also vary with the covariate of interest. For example, if women are younger than men in our sample, then the total effect of being a woman will also include any effect of being younger. When the right-hand-side covariates are all demographic characteristics and treatment dummy variables, we evaluate the 'marginal effect' of the covariate. Both total effects and marginal effects are of interest, and answer different questions, as the sketch below illustrates.
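The distinction between the two regressions takes only a few lines; the variable names are hypothetical, and ordinary least squares on per-subject average welfare loss is assumed.

    import pandas as pd
    import statsmodels.api as sm

    def total_and_marginal_effects(df, outcome, covariate, all_covariates):
        """Total effect: regress the outcome on one covariate alone.
        Marginal effect: include all demographics and treatment dummies."""
        y = df[outcome]
        total = sm.OLS(y, sm.add_constant(df[[covariate]])).fit()
        marginal = sm.OLS(y, sm.add_constant(df[all_covariates])).fit()
        return total.params[covariate], marginal.params[covariate]

    # e.g., total_and_marginal_effects(data, "welfare_loss", "treatment",
    #           ["treatment", "female", "age", "violates_eut"])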
Figure 10 displays the total effect of each characteristic and treatment, sorted by the size of the effect. The Education Intervention Treatment is shown in bold. Figure 11 displays the marginal effect of each characteristic and treatment. In both cases we see a significant effect of the Education Intervention Treatment in reducing welfare losses. We also see, in both cases, a significant effect, increasing welfare losses, of the subject being classified as violating EUT.

[Figure 8 about here]
[Figure 9 about here]
[Figure 10 about here]
[Figure 11 about here]

The average of the difference in mean welfare loss between control and treatment groups across the 500 simulations is R57.28 (median = R56.23), with standard deviation R17.98. Welfare loss was lower for the treatment group in all 500 simulations. A one-sided test, with the alternative hypothesis being that welfare loss is lower for the treatment group than for the control, yields a p-value < 0.05 in 392 of the 500 simulations. The p-value is < 0.1 for 460 simulations.

In our concluding advice to our client, we emphasized that the value of their Education Intervention, measured in terms of client welfare, would depend on the proportion of RDU agents in their customer population. As our experimental subject pool was not representative of this population, we suggested that they might wish to run the Lottery Task on a large, randomly selected sample drawn from their client demographic. Generalizing this advice, our policy-relevant opinion is that the expected presence of significant numbers of people in South Africa whose risk preferences are well characterized by an RDU structure is a main source of scope for investments in education about comparative details of portfolio risk structures to raise the frequency with which South Africans reach retirement with savings that better approximate available potentials.

4. Are we nudging or are we boosting?

At first glance, the recommendation we made to our client concerning application of their Investor Education Intervention, based on our experimental results, might look like a prime case of boosting. If our advice were followed, investors would be presented with information about historical fund performances, in a format that would increase the likelihood that their decisions would optimize their returns, reducing the probability that their savings goals would be frustrated. The intervention is thus intended to directly improve the decision-making resources of the investor, especially the investor with an RDU risk preference structure, and might plausibly create rationality spillovers as discussed earlier. In particular, people familiarized with the richer information might be motivated to seek it out when they make other financial decisions under risky conditions. The intervention does not manipulate the targets in the straightforward sense of altering their environments without their knowledge.

On deeper reflection, however, matters aren't so clear-cut. The first three columns of Table 1 are taken from the GYH discussion of the differences between nudging and boosting. In the fourth column we add our assessment of the fit of this taxonomy to the recommendation we made to our client concerning application of their Investor Education Intervention.
If we were to treat GYH's table as providing eight (non-exclusive) criteria for distinguishing a nudge from a boost, then our recommended policy would emerge as an exact hybrid, matching a nudge on four criteria and a boost on the other four.

Table 1: Eight assumptions of the nudge and boost approaches

Criterion | Nudge | Boost | Investor Education Intervention
Cognitive error awareness (Must the decision maker be able to detect the influence of error?) | No | Yes | No
Cognitive error controllability (Must the decision maker be able to stop or override the influence of the error?) | No | Yes | Yes
Information about goals (Must the designer know the specific goals of the target audience?) | Yes | No | Yes
Information about the goals' distribution (Must the designer know the distribution of goals in the target audience?) | Yes | No | Yes
Policy designer and cognitive error (Must experts be less error-prone than decision makers?) | Yes | No | Yes
Policy designer and benevolence (Must the designer be benevolent?) | Yes | No | No
Decision maker and minimal competence (Must the decision maker be able to acquire trained skills?) | No | Yes | Yes
Decision maker and sufficient motivation (Must the decision maker be motivated to use trained skills?) | No | Yes | Yes

Our assessments in the fourth column require some explanation and justification. Where the first row is concerned, the investors have historically not been able to infer that they decided in error until, arguably, well after the fact. Even then, according to our client, most did not attribute their early selling of their funds to any error made by them, though they sometimes expressed disappointment in the provider or advisor. But in general our advice does not rest on the assumption that any investors are ever aware of any errors. The suggestion is rather that information about historical distributions of fund values makes people who reveal RDU risk preference structures behave more like people with EUT risk preferences. With respect to the second row, clearly the intervention is motivated by the client's view that many investors choose in such a way as to undermine their own welfare, as attributed based on their observed behavior, but can be induced to alter their decisions in at least a significant proportion of instances. Concerning our assessments in the third and fourth rows, the main point of the further experimental evidence we urged our client to obtain is to gain richer knowledge of the structure of their customers' preferences (i.e., RDU or EUT), and of the distribution of non-EUT preferences. [13] Clearly this implies, as per the fifth row, that the experts are less error-prone than the investors, and it is far from clear that it would be generally efficacious for the experts to try to explain the differences between RDU and EUT preference structures to investors. Where the sixth row is concerned, as discussed earlier we suspect that our client is benevolent about investors' welfare to some extent, but this motivation is not necessary, as it is in the investment house's interest for customers to maintain their investments through market downturns. Finally, the intervention is only efficacious to the extent that investors are able and motivated to be influenced by carefully designed representations of more complete information to choose in ways that better approximate what they would choose were they expected utility optimizers.

[13] A referee objected that our client would not need to track the specific goals of any customers once the intervention had been administered, but would simply leave it to the 'educated' customers to reflect their new information in their choices or not. Thus the referee suggested that our assessments should be "no" in column 4 of rows 3 and 4. This suggestion depends on equivocation over what the intervention is: the client viewed administration of the education intervention as burdensome to customers. If the client company follows our advice, then, it will be selecting certain customers to be burdened on the basis of identifications made by it, not on the basis of self-identifications by customers of their own needs in light of enhanced knowledge.
The general diagnosis of the hybrid nature of the intervention, as between nudging and boosting, lies in the epistemic status and the normative presuppositions of the economic experts (i.e., us). With respect to the former, we have technical knowledge about the relationship between objective risk and subjective preference structures that investors lack, and that would be difficult to directly explain to most of them, let alone to directly inspire through exhortation (Ambuehl, Bernheim and Lusardi 2014). Concerning normative presuppositions, we assume that by revealing preferences in relatively simple decision contexts, choices between risky lotteries, people provide an informational basis for assessing the implications for their own welfare of decisions in more complicated circumstances.

This follows, in part, an approach exemplified and promoted in a similar problem context by Harrison and Ng (2016), when they evaluate the welfare gain 'introduced into the world' by a standard type of indemnity insurance product. They aim to reliably estimate the distribution of risk preferences among individuals, and the distribution of their subjective beliefs about loss contingencies and likelihood of payout, so as to identify a certainty equivalent of a risky insurance policy that can be compared to the certain insurance premium. This simple logic extends to non-standard models of risk preferences, such as RDU, in which some people exhibit 'optimism' or 'pessimism' about loss contingencies in their evaluation of the risky insurance policy. Harrison and Ng (2016) illustrate the application of these basic ideas about the welfare evaluation of insurance policies in a controlled laboratory experiment, just as we do in the case study reviewed here. They estimate the risk preferences of individuals from one task, and separately present each individual with a number of insurance policies in which loss contingencies are objective, so there is no issue about subjective beliefs being biased. They then estimate the expected consumer surplus gained or foregone from observed take-up decisions. There is striking evidence of foregone expected consumer surplus from incorrect take-up decisions. This motivates a highly relevant and general policy conclusion, namely, that the metric of take-up itself, widely used in welfare evaluations of insurance products, provides a qualitatively incorrect guide to the expected welfare effects of insurance.

Economists typically infer agents' subjective assessments of value from their actual choices. This need not be based on an analytic identification of preferences with choices, as in Samuelson's (1937)(1938) original version of revealed preference theory.
Ross (2014) argues that it is more defensibly based on the philosophical thesis of externalism about the contents of intentional attitude ascriptions, upon which we elaborate in Section 5 below. According to that thesis, such attitudes, which include beliefs as well as preferences, are ascribed by people to others and to themselves in such a way as to rationalize patterns of observed behavior (including utterances). Thus we do not take preferences to be internal psychological states. Intentional attitude ascription is holistic, taking account of all such behavior as is evident. We thus have no quarrel with the insistence of Hausman (2011) that preference ascriptions implicate assumptions about beliefs, but we add to this the claim that belief ascriptions likewise implicate assumptions about preferences. The co-dependence of belief ascription and preference ascription is not viciously circular. Intentional attitude ascription is recursive and always open to revision as more evidence arrives. With Binmore (2009) we regard it as misleading to say that a person’s preference for some X over some Y is a cause of their choosing X over Y; on the other hand, behavior that is rationalized by ascribing a preference for X’s over Y’s can be part of the information background for predicting or explaining a specific new instance of choice of X over Y. Furthermore, past behavior rationalized by this preference ascription can also be part of the explanatory background for a choice among other contingencies related to X and Y, and this can be crucial in motivating welfare judgments.

Let us apply this methodological point to the normative analysis given by Harrison and Ng (2016). Suppose we think that a person has chosen an insurance policy that will reduce their utility relative to the state in which they did not choose the policy. If we were forced by crude revealed preference dogma to say that the choice of the policy necessarily revealed a preference for having the policy over not having the policy, then it would be impossible for any such choice to ever be deemed welfare reducing. This would show that the concept had been drained of the content that makes it useful. If we can’t even say that a person reduces their welfare when they buy an actuarially unsound insurance policy (which people do), then we’ll never be able to say anything about welfare in an applied context. But it would be consistent with taking behavior as the informational basis for preference ascription to hold that the choice was a mistake, based on its inconsistency with a risk preference structure attributed on the basis of a run of the person’s other behavior. Lottery choices made under controlled experimental conditions, as in our case study, arguably provide a more direct and less noisy probe of risk preference structure than the choices among investment funds, also made in the lab, with which they are compared. Of course, attribution of the risk preferences derived from the lottery choices to the subjects choosing funds depends on the assumption that, to some specified extent, subjects’ risk preferences are stable across choice contexts. This is often, though not always, a reasonable assumption in policy contexts.14 This general methodological approach allows the economist to draw useful conclusions about what types of decisions led to welfare losses, and to identify demographics that are more likely to make those types of decisions.

14 Sugden (2004)(2009) denies, at least, that the assumption is viable generally enough to provide a sound methodology for normative economics. We take up his objection in Section 5.
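The welfare logic just described can be sketched minimally in Python. The numbers below are purely hypothetical, and for brevity the sketch uses EUT with CRRA utility; Harrison and Ng’s actual analysis estimates preferences econometrically and also accommodates RDU probability weighting.

```python
def u(x, r=0.5):
    # CRRA utility; in practice the curvature r would be estimated from lottery choices
    return x ** (1 - r) / (1 - r)

def certainty_equivalent(prospect, r=0.5):
    # prospect: list of (objective probability, final wealth) pairs
    expected_utility = sum(p * u(w, r) for p, w in prospect)
    return ((1 - r) * expected_utility) ** (1 / (1 - r))

# Hypothetical decision: wealth 100, a 10% chance of losing 60, premium of 8
wealth, loss, p_loss, premium = 100.0, 60.0, 0.10, 8.0
uninsured = [(p_loss, wealth - loss), (1 - p_loss, wealth)]
insured = [(1.0, wealth - premium)]       # full indemnity: loss covered for certain

surplus = certainty_equivalent(insured) - certainty_equivalent(uninsured)
print(f"Expected consumer surplus of purchase: {surplus:+.2f}")
# surplus > 0: declining the policy forgoes that amount for this agent;
# surplus < 0: purchasing it does. Aggregating such foregone amounts over
# observed take-up decisions is what can diverge from the raw take-up metric.
```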
To illustrate, again from the insurance policy choices considered by Harrison and Ng (2016): out of all purchase decisions made by the subjects in their experiment, 60% were associated with a welfare loss. Notably, female subjects had a 9.8 percentage point (pp) higher chance than men of making such excess purchase errors, with a 95% confidence interval between 0 pp and 20 pp. When Harrison & Ng (2016) consider the marginal effect of gender, controlling for other demographics, the estimated effect is 11.8 pp, with a 95% confidence interval between 1 pp and 23 pp. This type of information allows the economist to recommend structured interventions to improve decisions by targeting certain demographic groups and certain types of errors.

A further potential knowledge gain from welfare assessment based on sophisticated revealed preference experiments in lab and field is that one can rigorously identify which axioms of a normative model of risk preferences fail when one observes expected welfare losses. For instance, are the subjects who suffer losses when faced with an index insurance product those for whom the Reduction of Compound Lotteries axiom fails behaviorally? Precise characterizations of such failures can be identified in experiments (e.g., Harrison, Martínez-Correa and Swarthout 2015), just as the lottery battery employed in the Investor Education Intervention study allows us to structurally identify behavioral failures of the Compound Independence axiom.

It might seem that all of this amounts only to a modest, practical point that should be of limited interest to theorists. That is, we might seem to be saying only that, although the concepts of nudging and boosting are as clear as can reasonably be expected at the abstract level, consulting clients often frame the questions they assign to economists in terms that force the distinction to be elided in practice. In that case, it might be thought that the sole upshot is that economists could usefully bring the nudging/boosting distinction to clients’ attention while research briefs are being negotiated, so that clients will at least appreciate that presuppositions they bring to the framing of their policy options may embed normative blind spots. In fact, however, we think that lessons of deeper methodological, and indeed philosophical, significance can be taken from the main case study we have presented, and from its relationship to the Harrison and Ng (2016) case. We draw out these implications in the concluding section.

5. Welfare Analysis From the Intentional Stance

In their welfare analysis of insurance product choices, Harrison and Ng (2016) use the best descriptive model of risk preferences to make normative evaluations for their subjects. As they put it, periculum habitus non est disputandum (“risk habits are not to be disputed”). By contrast, Bleichrodt et al (2001) maintain that EUT is the appropriate normative model, and correctly note that if an individual is an RDU or CPT decision-maker, then recovering the utility function from observed lottery choices requires allowing for probability weighting and/or sign-dependence. They then implicitly propose using that utility function to infer the certainty equivalent under EUT. These are radically different normative positions. Some notation will help.
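The three candidate evaluations can be displayed compactly before being unpacked in prose. This is a sketch: the summation forms assume a finite-outcome policy, and the construction of the decision weights, left implicit in the prose, follows the standard rank-dependent recipe.

```latex
% Finite-outcome policy x paying x_i with objective probability p_i;
% decision weights w_i are built from the estimated weighting function.
\begin{align*}
\text{(i) Harrison and Ng (2016):} \quad
  & U_{RDU}(CE) = RDU(x) = \sum_i w_i\, U_{RDU}(x_i) \\
\text{(ii) Bleichrodt et al.\ (2001):} \quad
  & U_{RDU}(CE) = EUT(x) = \sum_i p_i\, U_{RDU}(x_i) \\
\text{(iii) consistent EUT:} \quad
  & U_{EUT}(CE) = \sum_i p_i\, U_{EUT}(x_i)
\end{align*}
```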
Let $RDU(x)$ denote the evaluation of an insurance policy $x$ in Harrison and Ng (2016) using the RDU risk preferences of the individual, including the probability weighting function. They calculate the certainty equivalent $CE$ by solving $U_{RDU}(CE) = RDU(x)$ for $CE$, where $U_{RDU}$ is the utility function from the RDU model of risk preferences for that individual. But Bleichrodt et al (2001) evaluate the $CE$ by solving $U_{RDU}(CE) = EUT(x)$, where $EUT(x)$ uses that utility function in an EUT manner, assuming no probability weighting. This strikes us as normatively illogical. The logical approach here would be to estimate the “best-fitting” EUT risk preferences for the individual from their observed lottery choices, and then use the utility function $U_{EUT}$ as the basis for evaluating the $CE$ by solving $U_{EUT}(CE) = EUT(x)$.

In our case study we follow the approach of Harrison and Ng (2016) as described above. The choice of this approach is evidently of direct relevance to the extent of paternalism involved in normative assessment. We think that it can be justified on deeper philosophical grounds. In our case study, although we recommended additional cognitive preparation for RDU choosers before they selected investment products, we did not recommend trying to teach them the concept of probability weighting so they could then apply this characterization to themselves. This is only partly motivated by the questionable practicality of the pedagogical task that would be required. It also reflects wariness about telling subjects a story about themselves that they would surely interpret as telling them that they possess a kind of internal psychological ‘defect’, when such a story would outrun our available data and is in any case doubtful according to sophisticated philosophy of mind. It is unlikely that most people choosing investment funds attempt to compute internally represented optima – either from EUT or RDU bases – and then make computational errors that could be pointed out to them. This echoes a point made by Infante, Lecouteux and Sugden (2016) (ILS) when they complain that behavioral welfare economists typically follow Hausman (2011) in ‘purifying’ empirically observed preferences. ILS argue that purification reflects an implicit philosophy according to which an ‘inner’ Savage-rational agent is ‘trapped within’ a psychological, irrational ‘shell’ from which best policy should try to rescue her.

ILS provide no general philosophical framework within which to motivate their skepticism about ‘inner rational agents’. However, such a framework is available. Dennett (1987) provides a rich account of the ontology of beliefs, preferences and other ‘propositional attitudes’ that relate behavioral and cognitive dispositions to different states of the world and to different representations of those states. Dennett (1987) argues at length that ascribing preferences and beliefs involves taking the intentional stance toward an agent. This consists in assuming that the agent’s behavior is guided by goals and is sensitive to information about means to the goals, and about the relative probabilities of achieving the goals given available means. Goals, like preferences and beliefs, are not internal states of agents, but are rather relationships between agents, environments, and ascribers. The baseline case for understanding such ascription is the effort by a third party to interpret and predict the agent’s actions by means of controlled speculation about the agent’s overall behavioral ecology and information-processing capacities.
Crucially, people are socially obliged, and trained during socialization while growing up, to adopt the intentional stance toward themselves. For the sake of coordination in both action and communication, agents’ self-ascriptions are made under constraint of at least approximate alignment with the ascriptions of others. These ascriptions and self-ascriptions are not guesses about ‘true’ beliefs and preferences hidden from direct view in people’s heads. Rather, constructed rationalizations of agents’ behavioral and cognitive ecologies are what beliefs and preferences are. Critics have sometimes misinterpreted this view as instrumentalism, a doctrine according to which beliefs and preferences are mere useful fictions. Dennett has consistently maintained, however, that there are facts of the matter about agents’ goals and access to information, and hence also facts about their propositional attitudes. It may be true that Carol goes to work because she believes that if she does she will get paid, and prefers having the paycheque to having the leisure she would gain if she bunked the job; but this truth status need not depend on there being discrete, recurring states of Carol’s nervous system that realize the belief and, separately, the preference. Beliefs and preferences are virtual states15 of whole intentional systems rather than particular physical states of brains; but being virtual is a way of being real, not a way of being fictitious.

15 One way of understanding virtual states is as reaction potentials coupled with environmental affordances in the sense of Gibson (1977), except that the affordances in question will frequently be features of social events rather than (only) features detectable directly by sensory transducers. Because intentional states are propensities inferred from patterns of behavior, they approximately correspond to what some psychologists call ‘latent’ tendencies. However, psychologists often suppose that latent states have discrete neural realizations that might be discoverable by brain probes or functional neuroimaging. The use of ‘virtual’ expresses the view among many current philosophers that intentional states generally do not have such realizations, because their semantic contents, what is believed or desired or preferred, vary partly with conditions external to the bodies of the agents whose states they are (Burge 1986; McClamrock 1995).

If a claim about intentional states is the sort of claim that can have a truth value, then it had better be possible to specify possible evidence that would undermine it. The holistic nature of intentional-stance ascription allows for error, but also complicates it. Suppose we did not know, in setting out to explain Carol’s behavior, that she has just won the lottery and so no longer needs the paycheque; but suppose further that we also did not know that she would be ashamed to pass on a half-finished project to the colleague who will succeed her. On this hypothetical scenario, we predicted correctly that Carol would go to work, because our two bits of ignorance cancelled one another out; but the error will reveal itself as we widen the sample of observations to include days beyond the completion of Carol’s current projects. It can also show up when we expand the range of behavior the intentional stance is called upon to rationalize – when we ask, for example, why Carol is no longer starting any new projects.
Nevertheless, the holism of intentional attitude ascription does leave room for interpretive slack that we would not expect if we embraced a naïve psychological realism associating beliefs and preferences with particular occurrent states in nervous systems. When we say that Carol prefers not to leave projects partly completed, do we refer to her conscientiousness, or to her fear of harm to her reputation? There might or might not be a fact of the matter here, and whether there is or isn’t might not be relevant to the accuracy of the preference ascription. Ross (2014) argues that this marks a main basis for the distinction between economics and psychology. Psychologists are professionally interested directly in how individuals process information, including information that influences decisions. Economists, by contrast, are concerned with this only derivatively. If a system of incentives will lead various people, through a heterogeneous set of psychological processes, to all make the same choice, then the people form, at least for an analysis restricted to that choice, an equivalence class of economic agents. But it is a strictly empirical matter when this psychological heterogeneity will and won’t matter economically. Economists, like all scientists, seek generalizations that support out-of-sample predictions. Different data-generating processes tend to produce, sooner or later, different data, including different economic data (that is, series of or patterns in incentivized choices). Economics is thus crucially informed by psychology in general, while not collapsing into the psychology of valuation as some behavioral economists have urged (Camerer et al 2005).

Applying this philosophy of mind and agency to our main case study, we assume the intentional stance to make sense of our experimental subjects’ overall behavioral patterns, and use the lottery choice experiment as a relatively direct source of constraint on the virtual preference structures we assign when we perform welfare assessment of their investment fund choices. Externalism about preference content blurs the distinction between ‘treating’ the subject and ‘treating’ the subject’s environment. Furthermore, the more precisely we specify the contents of propositional attitudes, especially in quantitative terms, the less weight in identification will rest on ‘inboard’ elements of data-generating processes relative to external aspects of the agents’ overall behavioral ecologies.16 Our technical tools allow us to identify virtual intentions that most subjects are not able to identify when they take the intentional stance toward themselves, and that they could not deliberately use to evaluate their own decisions. On the other hand, our experiment provides evidence that attention to certain informational patterns induces a significant number of subjects to act as if they were stochastically closer to expected value optimizers. These patterns therefore enter into a fully informed analyst’s specification of the subjects’ beliefs and preferences. In this philosophical framework, it makes sense to say that we boost the subjects’ informational access in a way that nudges their (sub-deliberative) cognition.

16 Clark (1998) refers to these external elements as ‘cognitive scaffolding.’ Ross (2005)(2014) develops the role of scaffolding in specifying and identifying utility functions using sophisticated revealed preference theory.
It helps to contextualize our approach to normative analysis to contrast it with the more radical revisionism advocated by Sugden (2004)(2009). He develops an insightful framework for normatively evaluating agents’ outcomes under alternative institutional arrangements in a way that privileges their autonomy as choosers (i.e., their consumer sovereignty) without depending on their specific preference orderings, and thus without requiring their preferences even to be consistently ordered, let alone fully EUT-compliant. According to Sugden (2004)(2009), agents are made better off to the extent that their opportunity sets are expanded, and worse off to the extent that their opportunity sets are contracted. Against this standard, ‘pure’ boosts will typically make agents better off and ‘pure’ nudges will typically make them worse off. We find this idea, which Sugden (2004)(2009) elegantly formalizes, attractive as a way of addressing normative questions in circumstances where welfare analysis in the technical sense is not possible due to preference reversals. Thus, for example, this approach can generate recommendations in cases where the method of Bernheim and Rangel (2008) would find Pareto indifference and therefore yield no guidance. But we should not abjure ever doing standard welfare analysis merely because it cannot be undertaken in every context. In both the Harrison and Ng (2016) case and in the situation presented to us by our consulting client, the complications arise from the existence of preferences that violate EUT but are nevertheless well-ordered. We suggest that this is the standard situation where the relevant utilities are expected monetary values.17

To summarize, the claimed normative advantage of boosting over nudging relies on the distinction between altering an agent’s inner and outer environments. This might seem relatively straightforward if we assume, as many behavioral economists do, that the utility functions on which welfare analysis is based are generally grounded in latent cognitive processes on the ‘inboard’ side of the agent/environment boundary. However, economists model utility in a way that is better captured by externalist/ascriptionist accounts of minds such as Dennett’s intentional stance (Ross 2014). This complicates, though it does not vitiate, attempts to apply the nudging/boosting distinction to practical economic welfare assessments.

17 Someone who thinks that even expected monetary payoffs are typically hyperbolically discounted by people would quarrel with this suggestion. Sugden (2004)(2009) implies this concern. But that hypothesis is rejected by the leading psychological theorist of hyperbolic discounting, Ainslie (1992), and is contrary to empirical findings reported by Andersen et al (2014).

References

Abdellaoui, M., Bleichrodt, H., & Paraschiv, C. (2007). Measuring loss aversion under prospect theory: A parameter-free approach. Management Science 53: 1659-1674.

Abdellaoui, M., l’Haridon, O., & Paraschiv, C. (2013). Individual vs. couple behavior: An experimental investigation of risk preferences. Theory and Decision 75: 175-191.

Ainslie, G. (1992). Picoeconomics. Cambridge: Cambridge University Press.

Ambuehl, S., Bernheim, B. D., & Lusardi, A. (2014). The effect of financial education on the quality of decision making. NBER Working Paper 20618. http://www.nber.org/papers/w20618

Andersen, S., Harrison, G. W., Lau, M. I., & Rutström, E. E. (2014). Discounting behavior: A reconsideration. European Economic Review 71: 15-33.
Ashcroft, R. (2011). Personal financial incentives in health promotion: Where do they fit in an ethic of autonomy? Health Expectations 14: 191-200.

Bernheim, B. D. (2009). Behavioral welfare economics. Journal of the European Economic Association 7: 267-319.

Bernheim, B. D., & Rangel, A. (2008). Choice-theoretic foundations for behavioral welfare economics. In A. Caplin & A. Schotter (Eds.), The Foundations of Positive and Normative Economics: A Handbook, pp. 155-192. Oxford: Oxford University Press.

Binmore, K. (2009). Rational Decisions. Princeton: Princeton University Press.

Bleichrodt, H., Pinto, J., & Wakker, P. (2001). Using descriptive findings of prospect theory to improve the prescriptive use of expected utility. Management Science 47: 1498-1514.

Booij, A., & van de Kuilen, G. (2009). A parameter-free analysis of the utility of money for the general population under prospect theory. Journal of Economic Psychology 30: 651-666.

Burge, T. (1986). Individualism and psychology. Philosophical Review 95: 3-45.

Camerer, C., Issacharoff, S., Loewenstein, G., O’Donoghue, T., & Rabin, M. (2003). Regulation for conservatives: Behavioral economics and the case for asymmetric paternalism. University of Pennsylvania Law Review 151: 1211-1254.

Camerer, C., Loewenstein, G., & Prelec, D. (2005). Neuroeconomics: How neuroscience can inform economics. Journal of Economic Literature 43: 9-64.

Cherry, T., Crocker, T., & Shogren, J. (2003). Rationality spillovers. Journal of Environmental Economics and Management 45: 63-84.

Clark, A. (1998). Being There. Cambridge, MA: MIT Press.

Conly, S. (2013). Against Autonomy: Justifying Coercive Paternalism. Cambridge: Cambridge University Press.

Cox, J., & Sadiraj, V. (2008). Risky decisions in the large and in the small: Theory and experiment. In J. Cox & G. Harrison (Eds.), Risk Aversion in Experiments. Bingley: Emerald.

Dennett, D. (1987). The Intentional Stance. Cambridge, MA: MIT Press.

Fishburn, P., & Kochenberger, G. (1979). Two-piece von Neumann-Morgenstern utility functions. Decision Sciences 10: 503-518.

Friedman, M. (1953). Essays in Positive Economics. Chicago: University of Chicago Press.

Gibson, J. J. (1977). The theory of affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, Acting, and Knowing: Toward an Ecological Psychology, pp. 67-82. Hillsdale, NJ: Lawrence Erlbaum.

Gigerenzer, G., Todd, P., & the ABC Research Group (1999). Simple Heuristics that Make Us Smart. Oxford: Oxford University Press.

Grüne-Yanoff, T., & Hertwig, R. (2016). Nudge versus boost: How coherent are policy and theory? Minds and Machines, forthcoming.

Harless, D. (1992). Predictions about indifference curves inside the unit triangle: A test of variants of expected utility. Journal of Economic Behavior and Organization 18: 391-414.

Harrison, G., & List, J. (2004). Field experiments. Journal of Economic Literature 42: 1009-1055.

Harrison, G., Martínez-Correa, J., & Swarthout, J. T. (2015). Reduction of compound lotteries with objective probabilities: Theory and evidence. Journal of Economic Behavior and Organization 119: 32-55.

Harrison, G., & Ng, J. M. (2016). Evaluating the expected welfare gain from insurance. Journal of Risk and Insurance 83: 91-120.

Harrison, G., & Ross, D. (2017). The empirical adequacy of cumulative prospect theory and its implications for normative assessment. CEAR Working Paper 2017-01, Center for Economic Analysis of Risk, Robinson College of Business, Georgia State University.

Harrison, G., & Rutström, E. (2008). Risk aversion in the laboratory. In J. Cox & G. Harrison (Eds.), Risk Aversion in Experiments, pp. 41-196. Bingley: Emerald.
Harrison, G., & Rutström, E. E. (2009). Expected utility and prospect theory: One wedding and a decent funeral. Experimental Economics 12: 133-158.

Harrison, G., & Swarthout, J. T. (2016). Cumulative prospect theory in the laboratory: A reconsideration. CEAR Working Paper 2016-05, Center for Economic Analysis of Risk, Robinson College of Business, Georgia State University.

Hausman, D. (2011). Preference, Value, Choice and Welfare. Cambridge: Cambridge University Press.

Hertwig, R., Hoffrage, U., & the ABC Research Group (2013). Simple Heuristics in a Social World. Oxford: Oxford University Press.

Hey, J. D., & Orme, C. (1994). Investigating generalizations of expected utility theory using experimental data. Econometrica 62: 1291-1326.

Infante, G., Lecouteux, G., & Sugden, R. (2016). Preference purification and the inner rational agent: A critique of the conventional wisdom of behavioral welfare economics. Journal of Economic Methodology 23: 1-25.

John, P., Cotterill, S., Moseley, A., Richardson, L., Smith, G., Stoker, G., & Wales, C. (2011). Nudge, Nudge, Think, Think: Experimenting With Ways to Change Civic Behaviour. London: Bloomsbury Academic.

John, P., Smith, G., & Stoker, G. (2009). Nudge nudge, think think: Two strategies for changing civic behavior. Political Quarterly 80: 361-370.

Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.

Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica 47: 263-292.

Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.

Köbberling, V., & Wakker, P. (2005). An index of loss aversion. Journal of Economic Theory 122: 119-131.

Leamer, E. (2012). The Craft of Economics. Cambridge, MA: MIT Press.

Le Grand, J., & New, B. (2015). Government Paternalism: Nanny State or Helpful Friend? Princeton: Princeton University Press.

Loomes, G., & Sugden, R. (1998). Testing different stochastic specifications of risky choice. Economica 65: 581-598.

McClamrock, R. (1995). Existential Cognition. Chicago: University of Chicago Press.

Pennings, J., & Smidts, A. (2003). The shape of utility functions and organizational behavior. Management Science 49: 1251-1263.

Prelec, D. (1998). The probability weighting function. Econometrica 66: 497-527.

Quiggin, J. (1982). A theory of anticipated utility. Journal of Economic Behavior and Organization 3: 323-343.

Ross, D. (2005). Economic Theory and Cognitive Science: Microexplanation. Cambridge, MA: MIT Press.

Ross, D. (2014). Philosophy of Economics. London: Palgrave Macmillan.

Samuelson, P. (1937). A note on measurement of utility. Review of Economic Studies 4: 154-161.

Samuelson, P. (1938). A note on the pure theory of consumer’s behavior. Economica 5: 61-72.

Savage, L. (1954). The Foundations of Statistics. New York: Wiley.

Schmidt, U., & Traub, S. (2002). An experimental test of loss aversion. Journal of Risk and Uncertainty 25: 233-249.

Schmidt, U., & Zank, H. (2008). Risk aversion in cumulative prospect theory. Management Science 54: 208-216.

Sugden, R. (2004). The opportunity criterion: Consumer sovereignty without the assumption of coherent preferences. American Economic Review 94: 1014-1033.

Sugden, R. (2009). Market simulation and the provision of public goods: A non-paternalistic response to anomalies in environmental evaluation. Journal of Environmental Economics and Management 57: 87-103.
Sunstein, C., & Thaler, R. (2003a). Libertarian paternalism. American Economic Review, Papers and Proceedings 93: 175-179.

Sunstein, C., & Thaler, R. (2003b). Libertarian paternalism is not an oxymoron. University of Chicago Law Review 70: 1159-1202.

Todd, P., Gigerenzer, G., & the ABC Research Group (2012). Ecological Rationality: Intelligence in the World. Oxford: Oxford University Press.

Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representations of uncertainty. Journal of Risk and Uncertainty 5: 297-323.

Wakker, P. (2010). Prospect Theory for Risk and Ambiguity. New York: Cambridge University Press.

Wilcox, N. (2011). Stochastically more risk averse: A contextual theory of stochastic discrete choice under risk. Journal of Econometrics 162: 89-104.