fMRI Evidence of a Hot-Cold Empathy Gap in Hypothetical and Real Aversive Choices

Hypothetical bias is the common finding that hypothetical monetary values for "goods" are higher than real values. We extend this research to the domain of "bads" such as consumer and household choices made to avoid aversive outcomes (e.g., insurance). Previous evidence of hot-cold empathy gaps suggests food disgust is likely to be strongly underestimated in hypothetical (cold) choice. Depending on relative underestimation of food disgust and pain of spending, the hypothetical bias for aversive bads can go in the typical direction for goods, disappear, or reverse in sign. We find that the bias is reversed in sign: subjects pay more to avoid bads when choice is real. fMRI shows that real choice more strongly activates striatum and medial prefrontal cortex (reward regions) and shows distinct activity in insula and amygdala (disgust, fear regions). The neural findings suggest ways to exogenously manipulate or record brain activity in order to create better forecasts of actual consumer choice.

Real choices are binding consequential commitments to a course of action, like undergoing surgery or moving to a hurricane-prone coastal town. Researchers in marketing and behavioral sciences seek to understand how real choices are made. However, in studying decisions, scientists and policy makers often have to settle for measuring hypothetical statements about what people would choose, rather than what they will actually choose. In marketing research, for example, hypothetical surveys are used to forecast sales of existing products, to test new products by asking consumers what they would buy, and to evaluate marketing programs (Chandon, Morwitz, and Reinartz 2004; Green and Srinivasan 1990; Infosino 1986; Jamieson and Bass 1989; Raghubir and Greenleaf 2006; Schlosser, White, and Lloyd 2006; Silk and Urban 1978; Urban et al. 1983). In public economics and political science, survey data are used to establish the dollar value of goods that are not traded in markets (such as clean air or the prevention of oil spills), and to poll likely voters before an election (Carson et al. 1996; Crespi 1989; Diamond and Hausman 1994; Mortimer and Segal 2008). Hypothetical choices are also necessary in some types of psychology and neuroscience experiments in which measuring real choices is impractical or unethical (Greene et al. 2001, 2004; Hariri et al. 2006; Kühberger, Schulte-Mecklenbeck, and Perner 2002; Monterosso et al. 2007).
The reliance on hypothetical choice data presumes either that hypothetical choices are a good and legitimate way to forecast real choices, or that there is some relationship between the two types of choices, such that the hypothetical data can be adjusted to match real choice data.
However, many studies in behavioral economics have shown a substantial, systematic gap: typically, hypothetical valuations are greater than real valuations (Ariely and Wertenbroch 2002; Blumenschein et al. 2007; Cummings, Harrison, and Rutstrom 1995; Johannesson, Liljas, and Johansson 1998; List and Gallet 2001; Little and Berrens 2004; Murphy et al. 2005; Tanner and Carlson 2009). To remedy this typical bias, in marketing research, conjoint analysis using hypothetical preference data has been extended and improved through hybrid incentive-aligned methods in which an inferred choice will be implemented for real (Ding 2007; Ding, Grewal, and Liechty 2005; Dong, Ding, and Huber 2010).
To our knowledge, all of these studies comparing hypothetical and real choices used appetitive goods, that is, goods to which people assign a positive value. Our paper is the first to compare hypothetical and real economic valuations of aversive "bads". We do so using functional magnetic resonance imaging (fMRI).
In our choice paradigm, subjects choose how much they would pay to avoid having to eat a food that most people find unpleasant (such as pigs' feet, canned oyster, or a large dollop of spicy wasabi; see Supplementary Table 1 and Plassmann, O'Doherty, and Rangel, 2010). Eating these foods is, of course, not as dramatic as some naturally-occurring bads that consumers must spend money or effort to avoid, such as regular colonoscopy screenings or protecting against identity theft. However, the advantage of using bad foods is that consenting subjects can actually make these unpleasant real choices in a lab environment. That is, at the end of the experiment they actually eat one food if they don't pay enough to avoid it. Having real choices is crucial, of course, for the comparative study of real and hypothetical choices. It is expected that initial clues from fMRI during unpleasant decisions about bad foods will provide some guidance regarding the neural valuation of more dramatic and unpleasant aversive experiences.
Hypothetical measures are often used to judge the value of aversive bads. One category of bads is environmental damage (Carson et al. 2003; Loureiro, Loomis, and Xose Vazquez 2009; Martin-Ortega, Brouwer, and Aiking 2011; von Stackelberg and Hammitt 2009). Another category includes a public good which benefits society but harms the host location, such as a prison or toxic waste dump. In studies of medical decision making, patients are often asked to choose between hypothetical medical treatments that could involve serious side effects (Levy and Baron 2005; Silvestri, Pritchard, and Welch 1998), or to express valuations of those procedures in numerical terms such as quality-adjusted life years (QALYs) (Zeckhauser and Shepard 1976). In all these cases, it is difficult or impossible to compare hypothetical choices with real ones.
How are choices among bads relevant to marketing? While marketers are most often focused on consumer choices of appetitive products, choices over aversive goods or services do also occur in marketing contexts. Some consumer choices require paying money, as well as effort, to avoid unpleasant and harmful events. Consider insurance. The purchase of a car or earthquake insurance policy, an AppleCare service package, or an alarm system does not have any appetitive value, per se; instead, it is a payment to avoid future aversive events (similar to paying to avoid unpleasant food). Similarly, going to the doctor or dentist, taking medicines with side effects, and dieting and exercise are (typically) aversive choices to prevent even worse future outcomes. Political campaigns also use marketing tactics to persuade voters to accept aversive tradeoffs (such as raising taxes to eliminate California's deficit, or cutting pensions in Greece). Finally, many household marketing purchases might be appetitive for one household member but aversive for others (e.g., one spouse suffering through a summer action movie, or a teenager dragged to a bed-and-breakfast with her parents). If the unfortunate spouse or teenager misforecasts, during hypothetical planning, how aversive the activity will really be, then our study could shed some light on how to market such mixed-valence family activities.
Using fMRI in our study establishes tentative findings about whether there are differences in neural circuitry in making hypothetical and real choices involving aversive bads.
Our data extend the only previous fMRI study on this topic, which used appetitive consumer goods (Kang et al. 2011). Our paper also adds to the small but exciting literature on consumer neuroscience (Knutson et al. 2007; Plassmann, Ramsøy, and Milosavljevic 2012; Plassmann et al. forthcoming; Yoon et al. forthcoming).
Behaviorally, comparing bads and goods could also illuminate the general mechanisms which create differences in hypothetical and real choices. Note that evaluating goods requires a comparison of the utility from a positively-valued appetitive good with an aversive payment of money. A hypothetical bias could result from either overvaluation of appetitive goods, or undervaluation of the distaste of paying money, or both, during hypothetical choice. Studying only appetitive goods cannot discriminate which type of biased evaluation is occurring.
For aversive bads, overly positive hypothetical evaluation (Tanner and Carlson 2009) leads to underestimation of two different kinds of disutility: disutility from eating aversive foods and disutility from paying money. If both are underestimated during hypothetical choice, it is unclear a priori which effect is likely to dominate, so the sign of the difference between hypothetical and real choices is unclear. In fact, there are three possible hypotheses about which effect could predominate and what the sign of the hypothetical-real bias will be.
First, suppose that in hypothetical choice, there is a general underestimation of how bad spending money is (as compared to real payment), and further, that this underestimation is more substantial than the error in the predicted disutility from the consumption of bad food. Then in the aversive bads domain that we study, real willingness-to-pay (WTP) will be lower (i.e., closer to zero payment) than hypothetical WTP. This result would unify the findings for appetitive goods and aversive goods; both could then be explained by an insufficient appreciation, in hypothetical situations, for the distasteful spending of money in real choices, the "pain of paying" (Prelec and Loewenstein 1998). According to this hypothesis 1, dollar values are always inflated in hypothetical choice and are deflated toward zero when choices are real (e.g., paying $100 is not too painful when it's not real spending, be it for goods or bads).
Second, an alternative and more plausible hypothesis 2 is that in hypothetical choices, people will underestimate the aversive experience of eating bad foods to a greater extent than they may underestimate the pain of paying. Consider the extreme example of eating a monkey brain (as in "Indiana Jones") for real. For many people, this would cause an immediate visceral response (e.g., nausea, feeling of disgust). And to be clear, disgust is indeed a visceral, as in physiological, response. For example, Harrison, Gray, Gianaros, and Critchley (2010) found that ratings of disgust after watching repulsive food videos led to stronger gastric stomach responses and neural activity in the insula and thalamus.
In contrast, losing an abstract secondary reward, such as money, is likely to elicit less visceral and more cognitive "pain". In hypothetical choices which are purely cognitive and have no binding consequences, the brain might make rapid and effortless decisions without fully taking any visceral factors into account. However, in real choices, visceral factors such as disgust are likely to be weighed more heavily, especially for food choices.
The reasoning laid out above is consistent with "hot-cold empathy gaps" (Kühberger, Schulte-Mecklenbeck, and Perner 2002). According to Loewenstein (1996, 2005), when making a decision, people underestimate or ignore the effect of visceral factors (generally aversive) such as thirst, fear, and craving for tobacco that are not currently experienced. More specifically, when people are in an affective cold state (e.g., not experiencing thirst or craving at the moment), they do not accurately estimate how much such visceral states (hot states) will change their preference and behavior, hence the term "hot-cold empathy gap." For example, smokers who were not having a craving for a cigarette underestimated how much they would value a cigarette when they were later in a high craving state (Sayette et al. 2008; see also Badger et al. 2007).
A bigger hot-cold empathy gap for food disgust than for money payment implies the opposite of the typical hypothetical > real bias for goods. For real bad-food choices, the aversion to eating a bad food will be strongly adjusted upward and the aversion to paying money will be adjusted upward, by a smaller amount. Then real valuations will be higher than hypothetical valuations (i.e., real WTP > hypothetical WTP), reversing the typical hypothetical bias.
Furthermore, we expect that during real choice, there will be stronger neural activations in affective areas implicated in disgust and fear processing, such as the insula and amygdala (Craig 2002, 2009; Whalen 1998), in addition to commonly known valuation areas. This is an important step because no previous study of hot-cold empathy gaps has compared biological activity in hot and cold states, and shown direct evidence of an affective (empathic, or emotional) difference.
The third and last hypothesis is that the aversion to bad food and the pain of paying are equally underestimated (or not underestimated at all). In this scenario, it is predicted that there will be no significant behavioral difference between real and hypothetical valuations, and fewer neural differences in the two conditions.
The elements of these hypotheses can be mathematically summarized as follows. Let $w$ be an individual's WTP to avoid bad food, $x$ the pain of paying, and $y$ the disutility from consuming bad food. Let $(w_r, x_r, y_r)$ and $(w_h, x_h, y_h)$ denote the levels of $w$, $x$, and $y$ in the real and hypothetical conditions, respectively. Suppose that $w$, $x$, and $y$ have a functional relationship, $w = f(x, y)$. Generally (e.g., common sense predicts), all other things being equal, a person will be less willing to pay to avoid aversive bads if he feels more pain of paying. Likewise, all other things being equal, he will be more willing to pay to avoid bad food if the disutility from consuming it increases (e.g., he feels more strongly disgusted by it). These relationships can be mathematically expressed as:
$$\frac{\partial f}{\partial x} < 0, \qquad \frac{\partial f}{\partial y} > 0. \tag{1}$$
Typical consumers are overly optimistic (Weinstein 1980, 1987, 1989; Tanner and Carlson 2009) and are thus expected, in hypothetical situations, to misforecast (i.e., underestimate) the real pain of paying as well as the real badness of consuming bad food. This can be expressed as:
$$dx = x_r - x_h > 0, \qquad dy = y_r - y_h > 0, \tag{2}$$
where $dx$ and $dy$ denote the changes in $x$ and $y$ between the hypothetical and real conditions.
Given the above formalism, the change between hypothetical and real WTP, denoted $dw$, can be written as:
$$dw = w_r - w_h = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy. \tag{3}$$
Note the signs of the terms on the right-hand side from equations (1) and (2); the sign of $dw$ (i.e., the relative size of hypothetical and real WTP) is determined by the relative size of the terms on the right, as summarized in Table 1.
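The sign logic behind the three hypotheses can be sketched in a few lines of Python. This is only an illustration, assuming a first-order approximation in which the real-minus-hypothetical change in WTP is the sum of a negative pain-of-paying term and a positive food-disutility term; the function name `wtp_change_sign` and its arguments are hypothetical and are not part of the study's analysis.

```python
def wtp_change_sign(f_x: float, f_y: float, dx: float, dy: float) -> str:
    """Classify the real-minus-hypothetical change in WTP (first-order sketch).

    f_x: marginal effect of pain of paying on WTP (assumed negative)
    f_y: marginal effect of food disutility on WTP (assumed positive)
    dx, dy: real-minus-hypothetical changes in pain of paying and food
            disutility (assumed non-negative, i.e., no overestimation)
    """
    if not (f_x < 0 < f_y and dx >= 0 and dy >= 0):
        raise ValueError("expected f_x < 0, f_y > 0, dx >= 0, dy >= 0")
    dw = f_x * dx + f_y * dy  # total differential of w = f(x, y)
    if dw < 0:
        return "hypothesis 1: real WTP < hypothetical WTP"
    if dw > 0:
        return "hypothesis 2: real WTP > hypothetical WTP"
    return "hypothesis 3: no difference"
```

With illustrative (not estimated) inputs, a larger correction for food disgust than for pain of paying, e.g. `wtp_change_sign(-1.0, 1.0, 0.2, 0.5)`, falls under hypothesis 2.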

Participants
Twenty-seven subjects participated in the fMRI experiment (10 females, 17 males; M ± SD age = 22.48 ± 8.96 years; age range = 18−65). Eight additional subjects were excluded for the following reasons: one subject could not finish the scanning due to nausea; three subjects were excluded because their behavioral data showed no variability; and four subjects were excluded due to excessive head movement. All subjects were right-handed; had normal or corrected-to-normal vision; had no history of psychiatric, neurological, or metabolic illnesses; and were not taking medications that interfere with the performance of fMRI. Since the study involved choice (and possible consumption) of foods, all subjects were screened on arrival for any dietary restrictions such as food allergies, diabetes, or any other medical condition or religious/ethical practices that may affect choice of foods in any way.
Stimuli
Fifty aversive food items were used in the current study. Our food stimulus set was based on the set generously shared by Plassmann et al. (2010). Some of the least aversive foods (e.g., pears; about 20% of the original list) were dropped and replaced by new items more consistently rated as unpleasant. The new items were available at local grocery stores and were chosen based on 'disgust ratings' provided by a group of independent evaluators (for a complete list of foods in our stimulus set, see Supplementary Table 1 in the supplementary materials).
The food stimuli were presented to the subjects using color pictures (72 dpi) on the computer screen during the pre- and post-scanning parts, and through MRI-compatible video goggles during scanning. Stimulus presentation and response recording were implemented in Matlab, using the Psychophysics Toolbox extensions (Brainard 1997; Kleiner, Brainard, and Pelli 2007; Pelli 1997).

Experiment Procedure
The experiment consisted of three parts: pre-scanning, scanning, and post-scanning (Figure 1A). At the beginning of the experiment, subjects were told that they would earn up to $50 for completing the experiment ($45 fixed plus $5 spending budget), and were informed that there were three experimental parts. Detailed instructions for each part were not given until each part began. We intentionally did not counter-balance the order of the hypothetical and real conditions in the scanner (following Kang et al. 2011). We discuss this design choice in the discussion.

[FIGURE 1 HERE]
In the pre-scanning part, subjects were shown the 50 different food images, one at a time and in random order. Subjects were asked to rate each food item on how familiar they were with it, using a scale from 0 through 3. We collected familiarity ratings for the following reason: it is possible that subjects might bid higher (not to eat) for the foods that they were less familiar with (e.g., because of ambiguity aversion; Hsu et al. 2005), so the familiarity rating was entered into the functional imaging data analysis to control for any potential familiarity effect on neural activity (e.g., people value familiar items more).
The scanning part had two blocks of bidding tasks, each block consisting of 50 trials, one for each food item. Both blocks were identical except that the first was hypothetical and the second was real. Within each block, subjects were shown the same 50 food items as in the pre-scanning part, one in each trial, in random order (Figure 1B).
At the start of each scanning block, subjects were instructed as follows. They were told (or, in the hypothetical block, told to imagine) that at the end of the experiment, one out of the 50 trials would be randomly selected by the computer and they would have to eat the food shown on that selected trial. The only way they could avoid eating the chosen food was to purchase the right not to eat it, and they had to make a bid in order to buy this right. The right to avoid the food was sold using the Becker-DeGroot-Marschak (BDM) auction mechanism (Becker et al. 1964; Plassmann et al. 2007, 2010; Kang et al. 2011). The auction worked as follows: Subjects bid one of $0, $1, $2, $3, or $4 in each trial. At the end of the experiment, the computer determined the price for the right and randomly selected one trial. Regarding the pricing, the computer would randomly draw one of the integers 0, 1, 2, 3, or 4 (each integer was equally likely), and this randomly generated integer, say p, would determine the price of the right to avoid the food. If the bid made by the subject for a given food item, say b, was greater than or equal to p (b ≥ p), then the subject paid $p to purchase the right, and did not have to eat the item. However, if b < p, the subject had to eat the food shown (2-3 spoonfuls), and did not have to pay anything (see Plassmann et al. 2007, 2010, for the characteristics and limitations of the BDM auction). Note the key difference between the hypothetical and real blocks: for hypothetical trials, subjects were told to decide how much to bid while imagining that they really might have to eat the food, whereas in real trials, they knew they would have to eat the food if they did not pay enough to avoid it.
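The resolution rule of the auction can be written out as a short Python sketch. It follows the description above (bids and prices are whole dollars from 0 to 4, one trial and one price are drawn uniformly at random, and the subject pays the price only when the bid is at least as large); the function names `bdm_outcome` and `resolve_block` are hypothetical, and the original experiment was implemented in Matlab, not with this code.

```python
import random

BIDS = (0, 1, 2, 3, 4)  # allowed dollar bids and possible prices

def bdm_outcome(bid: int, price: int) -> dict:
    """Resolve one BDM trial: pay the price and avoid the food if bid >= price."""
    if bid not in BIDS or price not in BIDS:
        raise ValueError("bid and price must be whole dollars from 0 to 4")
    if bid >= price:
        return {"eats_food": False, "payment": price}
    return {"eats_food": True, "payment": 0}

def resolve_block(bids_by_trial: list[int], rng: random.Random) -> dict:
    """Pick one trial and one price uniformly at random, then resolve that trial."""
    trial = rng.randrange(len(bids_by_trial))
    price = rng.choice(BIDS)
    outcome = bdm_outcome(bids_by_trial[trial], price)
    return {"trial": trial, "price": price, **outcome}
```

Under this rule a $4 bid always avoids the food, whereas a $0 bid avoids it only when the drawn price is $0; the amount paid is the drawn price, never the bid itself.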
After reading the instructions for the real scanning block and prior to starting the block, subjects in the scanner gave an additional consent to actually eat the food. Subjects were made aware that they could withdraw from the experiment anytime if they did not want to continue and that in this case, they would still collect whatever they had earned up until that point. All subjects asked to eat an aversive food, at the end of the experiment, did indeed do so.
In the post-scanning part outside of the scanner, subjects were asked to rate each of the same 50 food items on how appetitive or disgusting they were to them. Ratings were entered with a sliding scale from −3 through 3 (−3: very disgusting; 0: neutral or indifferent; 3: very appetitive). This number is henceforth referred to as a 'disgust rating'; note that lower values indicate a higher level of disgust.
The initial location of the anchor on the bidding scale was randomized for each trial and recorded during the scanning session. These data were used as a check for subjects' engagement in the task and possible anchoring effects. Correlations between subjects' bids and anchor positions were not significantly different from zero for most of the subjects (Supplementary  (a parametric modulator of R1 indicating familiarity rating), and R4 (a boxcar function denoting real bidding phase). The regressors were convolved with a canonical hemodynamic response function. The parameter estimates from this first-level analysis were then entered into a random-effects group analysis, and linear contrasts were generated to identify regions that responded differentially to bids between the hypothetical and real conditions.

Region-of-Interest (ROI) Analysis. Region of interest analyses of how activity in regions identified by the second-level group analysis scaled with bids were conducted by running an additional GLM. In this analysis, trials were grouped, within each condition, according to the three levels of bids for each subject, resulting in 'Low', 'Mid', and 'High' bid trials. The GLM thus had 8 regressors of interest, including food image presentations with 'Low', 'Mid', and 'High' bids, bidding phase for each of hypothetical and real conditions, and regressors of no interest. All regressors of interest were modeled as a boxcar function. The three levels of bids were defined as follows: for all of the subjects except for one, $0 was 'Low', $1 and $2 were 'Mid', and $3 and $4 were 'High' bid; for one subject, $2 was 'Low', $3 was 'Mid', and $4 was 'High' bid as this subject did not make any $0 or $1 bids in either condition. The β coefficients resulting from this post-hoc analysis were used to create the bar graphs shown in Figures 3 and 4.
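The grouping of bids into three levels can be expressed as a small lookup, sketched here in Python. The function name `bid_level` and the `shifted` flag are hypothetical conveniences for illustration; the mapping itself follows the definitions given above.

```python
def bid_level(bid: int, shifted: bool = False) -> str:
    """Map a dollar bid to 'Low', 'Mid', or 'High'.

    shifted=True applies the alternative grouping used for the one subject
    who never bid $0 or $1 ($2 Low, $3 Mid, $4 High).
    """
    if shifted:
        mapping = {2: "Low", 3: "Mid", 4: "High"}
    else:
        mapping = {0: "Low", 1: "Mid", 2: "Mid", 3: "High", 4: "High"}
    if bid not in mapping:
        raise ValueError(f"bid {bid} has no level under this grouping")
    return mapping[bid]
```

Applying this within each condition yields the three image-presentation regressors per condition (six in total), which together with the two bidding-phase regressors give the 8 regressors of interest.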

RESULTS
The first results below concern the comparison of behavior in hypothetical and real choices, and associated response times (RTs) which provide clues about information processing.
The second set of results below addresses differences in neural activity established using fMRI, in both hypothetical and real valuations.

The Behavioral Difference between Hypothetical and Real Conditions
First, we discuss the differences in bidding behavior in the hypothetical and real conditions. The average hypothetical bid and the average real bid for each food were significantly correlated with each other across foods (ρ = .91, p < .0001). Despite this very high correlation, there are systematic differences between the hypothetical and real bids; almost all of the average real bids (averaged across subjects) were higher than the average hypothetical bids for the same foods (Figure 2A).
Average bids by individual subject showed a similar pattern. The average hypothetical and real bids for each subject were highly correlated (ρ = .74, p < .0001), and the average real bid (M ± SD = 1.92 ± .87) was significantly greater than the average hypothetical bid (M ± SD = 1.63 ± .68) (t(26) = 2.60, p = .015, paired two-sample t-test, two-sided). On average, most of the subjects made real bids that were higher than their corresponding hypothetical bids (Figure 2B).
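For readers less familiar with the test reported here, the paired t statistic is the mean of the within-subject differences divided by its standard error. The following Python sketch shows the computation using only the standard library; the function name `paired_t` is hypothetical, no data from the study are reproduced, and the p-value lookup is omitted.

```python
import math
from statistics import mean, stdev

def paired_t(real: list[float], hypothetical: list[float]) -> tuple[float, int]:
    """Return (t statistic, degrees of freedom) for a paired two-sample t-test."""
    if len(real) != len(hypothetical) or len(real) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [r - h for r, h in zip(real, hypothetical)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD with n - 1 in the denominator).
    # Note: identical differences give SD = 0 and raise ZeroDivisionError.
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1
```

With 27 subjects the function returns 26 degrees of freedom, which matches the t(26) reported above.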
Overall, subjects found the presented foods to be slightly disgusting ( Controlling for either disgust or familiarity, real bids were approximately $.20 higher than hypothetical bids. This difference is not very large in magnitude, but it is highly significant across both foods (Figure 2A) and subjects (Figure 2B).

[FIGURE 2 HERE]
RTs during image presentation were significantly different between hypothetical and real conditions, possibly due to repeated exposure to the same stimuli (Hypothetical: M ± SD = 3.73 ± 1.41 sec; Real: 3.38 ± 1.73 sec; t(26) = 2.89, p = .008, paired two-sample t-test, two-sided). In order to rule out the possibility that any neural difference between the two conditions was due to a difference in RTs, we estimated an additional GLM that was identical to the primary GLM except that RT was entered as a modulator in addition to bid and familiarity rating. However, the results from the two models did not substantially differ, so we report only the results from a simpler model without RT.

The Neural Difference between Hypothetical and Real Conditions
The GLM using a simple treatment regressor (i.e., H1 − R1) showed that there is generally more activation during hypothetical bidding as compared to real bidding (see Supplementary Table 3). This is a surprising result and contrary to Kang et al. (2011), which reported more overall activity in real trials. However, since real bidding in the current study was deliberately confounded with experience (the real bids come later in the trial sequence), this deactivation could be due either to the real vs. hypothetical treatment, or to the general effect of stimulus experience and neural habituation reducing brain activity; the latter is a well-established effect (Fischer et al. 2003; Phan et al. 2003; Thompson and Spencer 1966; Wright et al. 2001; Yamaguchi et al. 2004).
The more diagnostic and interesting analysis therefore focuses on regions in which activity scales with bid amounts differentially in hypothetical and real conditions. To find these regions we looked for areas that correlated with bids in the real trials more strongly than in the hypothetical trials. The analysis uses the contrast R2 − H2 (denoted [Real*Bid − Hypothetical*Bid] below).
In this analysis, no brain region was identified as more positively correlated with real bids than with hypothetical bids, even at a lenient threshold of p < .01 (uncorrected) with a small extent threshold of 5 voxels. However, with the whole-brain analysis, we identified regions where the BOLD signal was more negatively correlated with real bids than with hypothetical bids (Table 2, Figures 3A and 4A, and Supplementary Figure 4). These areas include the ventromedial prefrontal cortex (vmPFC), amygdala, anterior cingulate cortex (ACC), thalamus, and insula.
Most of these areas, including the vmPFC, left amygdala, ACC, thalamus, and insula, are still significantly different after correction using a false-discovery rate (FDR) p < .05.
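The text does not state which FDR procedure was applied. As a reference point only, the widely used Benjamini-Hochberg step-up procedure is sketched below in Python; treating it as the procedure used here is an assumption, and the function name `benjamini_hochberg` is hypothetical. Neuroimaging packages apply such corrections over voxel-wise p-values, which this toy version does not attempt to reproduce.

```python
def benjamini_hochberg(p_values: list[float], q: float = 0.05) -> list[bool]:
    """Return, for each p-value, whether it is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected
```

Because the procedure is step-up, a p-value that misses its own rank threshold can still be rejected when a larger p-value meets a later threshold.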
We previously found that for appetitive goods (consumer products), the vmPFC is more strongly involved in valuation in real decision making compared to hypothetical decision making (Kang et al. 2011). That previous study, using appetitive goods, did not find the amygdala to be involved in valuation. However, the amygdala is thought to play a key role in processing of aversive stimuli and aversive conditioning (among other functions) (Johansen et al. 2010; Phelps 2006; Whalen 1998). Hence, we further explored how bids appeared to be encoded in both vmPFC and amygdala areas.
As Figures 3B and 4B show, the vmPFC and the amygdala areas show a significant negative linear trend across different levels of bids in real trials only; such a trend is not observed in hypothetical trials. That is, less aversive goods (which subjects pay less to avoid eating) activate the vmPFC and the amygdala more strongly than more aversive goods, but only in real trials.
When the food disgust rating was used in place of dollar bids, similar regions of brain activity were found (Supplementary Figure 5). This finding is important because it implies that economic valuation, per se, is not fundamentally different from judgments of disgust, at least for these types of aversive foods.

[FIGURE 4 HERE]
Lastly, we compared the areas that were parametrically modulated by bids in the current study with the areas modulated by decision values of appetitive goods in the previous study by Kang et al. (2011). Due to the lack of deactivation in the Hypothetical*Bid contrast in the current study, we overlaid the areas that were negatively correlated with real bids in the current study and the areas that were positively correlated with real decision values of appetitive goods in the previous study (Supplementary Figure 6). We found that the vmPFC, ACC, and ventral striatum (VStr) appeared in both studies, but the amygdala and the surrounding areas appear only in the current study of the aversive domain.

DISCUSSION
This study is the first to compare the willingness-to-pay to avoid aversive outcomes (unpleasant foods), in the two conditions of non-binding hypothetical decision and binding real decision. Previous studies with appetitive stimuli typically find that hypothetical valuations are higher than real valuations. We find the opposite result. Binding, real bids to avoid eating unpleasant foods were larger than hypothetical bids. The within-subject design provides good statistical power to show that the real > hypothetical bias is highly significant across both foods and subjects.
Before proceeding, it is useful at this point to squarely address potential threats to the validity of our scientific inference from the deliberate design choice to elicit both hypothetical and real valuations of the same foods in a fixed order (i.e., two exposures per food, hypothetical then real). We fixed this order out of concern that eliciting real bids first would lead to a mental state for the second hypothetical block fundamentally different than that in most lifelike situations where hypothetical judgments are made.
Many studies have presented the same stimuli multiple times (e.g., FitzGerald et al. 2009; Hare et al. 2009), and found consistent signals. The biggest threat to validity when stimuli are judged repeatedly is that the repeated judgments are artificially consistent. However, any such effect would lower the capacity of the design to detect a highly significant hypothetical < real difference in willingness-to-pay; and yet, we do find such a difference. (Therefore, it is likely that a between-subjects design could show a much larger difference, both behaviorally and neurally.) Furthermore, if there were habituation to the foods over time, we would expect neural activity to diminish or (perhaps) be less value-sensitive for the real trials that come after the hypothetical ones, but we found just the opposite. And behavioral experiments using appetitive goods by Kang et al. (2011) showed that there were real vs. hypothetical differences in both possible treatment orders. Thus, we argue that the potential risks from the within-subject design with fixed treatment order are not strongly evident in the data. In addition, there are many statistical benefits of repeating the same stimuli in the hypothetical and real conditions. Doing so controls for nuisance variables such as physical and psychological aspects of stimuli (e.g., color, shape, experience, memory) that might be correlated with stimulus value, but are not involved in the valuation of aversive stimuli per se.
Returning to the scientific contributions of the current study, the combination of results i.e., for both appetitive goods and aversive bads.
While this conclusion is tentative, it is important because a popular theory about hypothetical bias is that people underestimate the value of money when expressing hypothetical values. An important example is the influential report (Arrow et al. 1993) by a panel of academic economists on how to best elicit and use "contingent valuation" survey measures to establish reasonable prices for non-market-traded goods and services (such as clean air). They specifically "emphasize[d] the urgency of studying the sensitivity of willingness to pay responses to… reminders of other things on which respondents could spend their money." The panel's conclusion followed from a conjecture that opportunity cost reminders would lower hypothetical responses because disutility from monetary payment is underestimated in hypothetical choice, so that reminders would lead to better approximate real values (as shown, in fact, by Knoepfle et al. 2009).
Our results do not support the idea of a general strong devaluation of money during hypothetical valuation (for both appetitive and aversive objects). Instead, the results lend tentative support to a different hypothesis (hypothesis 2) mentioned in the introduction: aversive experience of visceral factors (i.e., disgust) is more strongly underestimated in hypothetical choice than aversive experience of more cognitive factors (i.e., paying money), because people in an affectively "cold" state easily fail to appreciate the influence that a "hot" visceral factor not currently experienced has on their preferences and behavior (Loewenstein 1996, 2000, 2005).
We call this the "visceral response underestimation" hypothesis.
In valuing appetitive goods hypothetically, "overly optimistic consumers" (Tanner and Carlson 2009) would probably overestimate the benefit from consuming goods and underestimate the pain of paying, leading to hypothetical values that are too high. In valuing aversive goods hypothetically, the disutilities from both spending money and eating unpleasant foods are generally expected to be underestimated. However, the disgust of eating unpleasant foods is likely to be stronger in real choice if there is a tendency to more strongly underestimate the influence of the more visceral factor. Our behavioral finding (higher WTP in real choice) is consistent with this account.
The brain imaging results reported here also support this account. Stronger encoding of "better" valuation (i.e., lower bids to avoid less aversive foods) during real choice is found in cortical regions that are well established as encoding value (vmPFC, ACC, and VStr). Notably, Plassmann et al. (2007, 2010) find that the vmPFC encodes both increased value for appetitive goods and decreased distaste (a positive improvement) for aversive goods. Tom et al. (2007) found a similar common encoding in the VStr and vmPFC for both increased potential money gains and decreased money losses.
Most importantly, we find more value-sensitive activity during real choice in the insula, amygdala, and hippocampus. The insula is thought to encode general emotional and visceral discomfort (Craig 2002, 2009), ingestive disgust (Harrison et al. 2010), risk (Preuschoff et al. 2008; Mohr et al. 2010), and empathy for pain (Bernhardt and Singer in press; Singer et al. 2004; Singer, Critchley and Preuschoff 2009). The amygdala responds rapidly to impending threat, creating neural vigilance (among other functions) (Adolphs, Tranel, and Damasio 1998; Whalen 1998). The visceral response underestimation hypothesis is consistent with these functional attributions, assuming that anticipation of actually eating unpleasant foods is viscerally uncomfortable or threatening, as compared to merely imagining doing so, as in hypothetical choice.
Keep in mind that insula and amygdala activity is stronger for low-WTP (i.e., less aversive) bad foods, which we interpret as evidence that when subjects expect to have to actually eat those foods (since they are not bidding their way out of eating) they have an aversive reaction, which is stronger in real choice. This direction is also consistent with Plassmann et al.'s (2010) findings of higher parahippocampal and insula activity in response to lower bids to avoid unpleasant foods.
To our knowledge, our fMRI evidence of differential insula and amygdala activity is the first direct evidence of a biological encoding of a hot (real choice)-cold (hypothetical choice) empathy gap in the brain.
Speculation about visceral response underestimation also suggests some potential ways to "de-bias" hypothetical choices in future research. For example, it is known that the amygdala responds to fearful or threatening stimuli such as a fearful face (and even fearful eye whites only) (Whalen et al. 2004) and electric shocks (Phelps et al. 2001). The insula also responds reliably to exogenous stimuli that are unpleasant. Hence, one future direction is to manipulate amygdala or insula activity by using such stimuli during hypothetical choice. The idea is that stimulation of such regions by external stimuli might 'fool' the neural circuitry into making judgments as if it were in a hot state. Inducing an artificial hot state could produce hypothetical choices that are better forecasts of actual real choices. Another direction is to tap visceral urges during hypothetical valuation by having subjects inspect and smell real aversive food items, helping them integrate visceral factors into their hypothetical valuations more easily.
Note that in our study hypothetical bids are (significantly different but) highly correlated with real bids, and that both types of bids are also significantly correlated with disgust ratings.
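A high correlation and a reliable mean difference can coexist, as the following minimal sketch shows. The bid values are invented for illustration and are not data from this study, and `pearson_r` is a hypothetical helper name. Correlation captures whether items keep the same relative ordering across conditions, while the mean shift captures the level difference between conditions.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented bids (arbitrary units) for the same six items in two conditions.
# These numbers are assumptions for illustration, not study data.
hypothetical_bids = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
real_bids = [0.9, 1.3, 2.0, 2.4, 3.0, 3.4]

r = pearson_r(hypothetical_bids, real_bids)
shift = mean(real_bids) - mean(hypothetical_bids)
print(f"r = {r:.3f}, mean shift (real - hypothetical) = {shift:.3f}")
```

Here every real bid exceeds its hypothetical counterpart, yet the item ranking is preserved, so the two series are almost perfectly correlated.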
However, despite such high correlation between the two behavioral measures, there is little overlap of neural activity during real and hypothetical valuations. This finding serves as a cautionary note to researchers in consumer neuroscience and to marketing practitioners who want to predict real choices using hypothetical contexts. There has been growing academic interest in applying machine learning techniques to neuroimaging data in order to predict purchase choices, using either a real choice or a hypothetical choice paradigm (Grosenick, Greer, and Knutson 2008; Tusche, Bode, and Haynes 2010; Smith et al. 2012). Although such efforts have been somewhat successful (in terms of accuracy rates), the findings of the current study suggest that predictive models estimated using hypothetical data might fail to accurately predict real purchase choice, because real and hypothetical choices use different kinds of neural circuitry (as well as some shared circuitry).
Likewise, marketing practitioners should be wary of using hypothetical choices if they want to predict real purchase intentions with neural data or other physiological measures, such as skin conductance response, which is closely linked to activity in a particular brain area (e.g., Williams et al. 2001).
Finally, we note that many data about how people evaluate aversive experiences are hypothetical, rather than real, particularly if the data are collected to pre-value or anticipate potential aversive events. A common domain in this regard is that of medical decision making. In studies of patients' medical choices, subjects typically read hypothetical scenarios regarding different stages of a disease and treatment toxicity and make hypothetical decisions about treatments (e.g., with end-stage cancer, choice between chemotherapy that could extend life by 4 months with severe side effects versus supportive care that could only alleviate symptoms) (Gurmankin et al. 2002; Levy and Baron 2005; Malenka et al. 1993; O'Connor et al. 1987; Silvestri et al. 1998; Yellen et al. 1994). Further, a physician who is giving medical recommendations may need to put herself in the hypothetical situation of being in her patient's mind (particularly for a minor or an incapacitated patient). Cancer patients often say that they would not want to receive grueling cancer treatments (e.g., chemotherapy) but then change their minds when they do get cancer (Loewenstein 2005). Further understanding of how the brain makes both hot and cold (real and hypothetical) decisions could guide people and societies to make these difficult decisions less painfully.
Note. Height threshold: t(26) = 3.67, p < .001 (uncorrected); extent threshold, k = 5 voxels. L: left; R: right; † survives whole-brain FDR correction at p < .05; * Part of a larger cluster.
were identical. All 50 food items were repeated across the two bidding blocks. During the food image presentation, subjects were asked to press any button as soon as they had decided how much to bid. Subjects submitted their bid using a sliding scale. The initial position of the bidding cursor (anchor) was randomized in every trial in order to avoid any potential anchoring effects (Tversky and Kahneman 1974).
Paired sample t-test, ** p < .01, * p < .02, no asterisk = not significant.
β values extracted from this amygdala mask. ** p = .015, * p = .034, no asterisk = not significant, paired two-sample t-test.

Supplementary Materials