1 Introduction: categories of behavioral failures
For about a century, economists have justified government intervention to address market failures that occur when individuals acting in their own self-interest lead to an inefficient market outcome. The traditional instances of market failures include the existence of a monopoly, public goods, or externalities. But for many decades, economists have also considered a role for government to address a market failure that occurs when people, due to cognitive limitations and psychological biases, fail to act in their own self-interest, leading them to cause themselves harm. For example, consumer protection regulations and job safety standards are justified to address possible shortcomings in individual decisions, such as consumers failing to understand the attendant product safety risks or workers being unaware of health hazards posed by their jobs. Market failures that stem from irrational actions in which informed consumers fail to behave in an economically efficient manner are similar to, but distinct from, cases that stem from inadequate information or differential access to information by the various market participants.
A recent overview of behavioral deviations from standard economic assumptions by Congdon, Kling and Mullainathan (2011) classified the different categories of behavioral failures in terms of imperfect optimization, bounded self-control, and nonstandard preferences. Examples of these phenomena include a lack of understanding of the levels of risk associated with dangerous activities, inadequate attention to one’s future health and financial well-being, and reference-dependence effects as reflected in the greater weight placed on losses as compared to comparable gains. Not all of these phenomena serve as rationales for stricter regulation. Consumer overestimation of the risks of airplane travel does not, for example, provide an impetus for more stringent airline safety standards.
Our focus here is on how to assess the benefits of addressing market failures that stem from systematic behavioral failings that lead people to irrationally cause themselves harm. The evidence of systematic irrational behavior creates a conflict between two core principles of benefit-cost analysis (BCA): the Kaldor–Hicks principle and the principle of consumer sovereignty. The Kaldor–Hicks principle instructs the analyst to attempt to identify the outcome that maximizes the net benefits to the people subject to the set of policy options, while the principle of consumer sovereignty instructs the analyst to respect the choices that the people would make in determining what is best for themselves. If consumers are believed to be acting irrationally, then an analyst must choose between incorporating the benefits of a policy that addresses the self-harm done by an individual and respecting consumer sovereignty and thus ignoring such benefits, leading to a violation of the Kaldor–Hicks principle.2
There is currently insufficient official guidance for estimating benefits in the presence of behavioral failures, which has led government agencies at times to adopt arbitrary and excessive benefit valuations. In this article, we propose some guiding principles for benefit assessment in the presence of behavioral failures, recognizing that such guidance will surely evolve as the evidence with respect to behavioral failures increases. While there are well-defined principles and standards for established components of BCAs, there has not been comparable development of standards for the application of the insights of the behavioral economics literature. This article proposes the development of the analog to a benefits transfer test, which we term a “behavioral transfer” test. More generally, we suggest that agencies adopt a more cautious approach to addressing behavioral market failures than to addressing traditional market failures. We suggest this cautious approach for three reasons: the evidence regarding behavioral failures tends to be based on narrowly defined contexts that might not be generalizable; there are real risks of a policy harming consumers by overriding their preferences and unnecessarily restricting their choices; and the regulators themselves are behavioral agents and therefore subject to psychological biases that could lead to poor decision making that distorts policies.
2 Benefits transfer and behavioral transfer
2.1 The rationale for a behavioral transfer test
Much of the evidence of behavioral failures is derived from laboratory experiments, stated preference studies, hypothetical classroom exercises, or narrowly defined decision contexts. Behavioral anomalies are not restricted to these narrowly defined contexts, but also have been evidenced in some case studies of market behavior. For example, DellaVigna (2009) summarizes “a growing list of recent papers that document aspects of behavior in the market settings that also deviate from the forecasts of the standard theory.” The examples he presents in which real world anomalies are observed include the purchase of health club memberships, credit card usage, retirement savings decisions, cab driver effort decisions, and consumers’ inattention to shipping costs. Although many of these studies are for specialized decision contexts, they provide evidence that behavioral deviations from the standard model are not always confined to laboratory settings. However, recognizing that some behavioral anomalies have affected actual decisions does not imply that all behaviors are subject to the same anomalies or that the empirical magnitudes that have been identified are generalizable to different types of choices and different classes of individuals.
Whether the results of behavioral economics studies have any applicability to benefit assessment in actual policy contexts is not self-evident.3 We refer to the practice of applying results from a behavioral study in one context to a broader application of policy as “behavioral transfer” to recognize its similarity to the long-acknowledged challenge of “benefits transfer,” in which the benefit estimation in one subpopulation is applied to another subpopulation being evaluated for a regulation. In the case of benefit valuation, there is recognition by economists and governmental administrators that benefits transfer efforts, while necessary, require scrutiny to determine whether the transfer provides accurate information. Do, for example, the preferences exhibited in a particular study characterize the willingness to pay (WTP) for the benefits by those affected by the policy? Government agencies and professionally accepted principles and standards for BCA indicate that these benefit transfers raise a class of concerns that should be addressed in the economic analyses prepared by government agencies.4
Benefit transfers are quite common in the valuation of mortality risks, which is the most prominent benefit component of U.S. government regulations. The U.S. Department of Transportation (2014) uses labor market estimates of the value of a statistical life (VSL) to monetize the safety benefits of transportation regulations. Similarly, various branches of the U.S. Environmental Protection Agency (EPA) use these labor market values to monetize the fatality risks of cancer and other fatal environmental illnesses, though there have been proposals that the agency make some provisional adjustment for potential relatively large morbidity effects associated with cancer deaths. These and other agencies must address the continuing concern of whether valuations derived primarily from traumatic labor market fatalities accurately reflect the WTP values for mortality risk reductions in quite different situations. The main underlying rationale for the benefits transfer in this instance is that the largest benefit component of the VSL is the loss of one’s life, not the associated morbidity effect (Gentry & Viscusi, 2016), so that there is a substantial common element to the VSL in different contexts.
Behavioral transfer raises an even more fundamental set of concerns. Consider the case of evidence drawn from laboratory experiments in which meaningful incentives are provided to the subjects, who are subsequently shown to act against their own self-interest. Such an experiment is more reliable and generalizable than behavioral studies that fail to rely on meaningful incentives. Nonetheless, there is a substantial difference between demonstrating a behavioral failing in the laboratory and showing it exists in the marketplace. To what extent, then, are the incentivized experiments in a laboratory setting reflective of actual market decisions? Are there important differences in the frequency of the decisions, opportunities for learning, the stakes involved, and the characteristics of the decision makers? Even fundamental economic phenomena do not have universal generalizability. One would not, for example, utilize the demand elasticity estimate from an incentivized student experiment involving candy bars and claim that this elasticity was a meaningful estimate of the demand elasticity for all consumer products and all consumer groups. Even if they are well designed, laboratory experiments may have implications that are limited to identifying the existence of a particular class of behavioral failures rather than indicating the empirical magnitudes of behavioral failures that will pertain to different kinds of choices in the market. The financial stakes, the nature of the decision, the frequency with which people make similar decisions, and the opportunity to learn from past mistakes may be quite different in the experimental context compared to the market context. 
Given these limitations, we believe a higher level of scrutiny is required for behavioral transfers than for traditional benefits transfer, and that many of the results of behavioral studies are most relevant for indicating the presence of a potential behavioral failure rather than for credibly estimating the empirical magnitude of the failure.
To establish appropriate guidance for behavioral transfer from studies of cognitive limitations and psychological biases to BCAs, we propose that agencies develop guidelines similar to the pertinent guidance that has been developed for stated preference studies.5 Is the sample of respondents reflective of the beliefs and preferences of those affected by the government policy? Do the underlying studies provide credible empirical values that are indicative of the particular policy contexts to which they are being applied? Do the respondents fully understand what is being valued? If the metric is not financial, as in the case where happiness measures are proposed rather than monetary WTP, what is the basis of comparison and the time frame that people should use in assessing their happiness? Do the respondents in these studies exhibit sufficient consistency in their responses and attention to the experimental task to lead one to extrapolate these experimental results to the policy situation? Is the nature of the decision comparable to that in the market in terms of the commodity involved, the financial stakes, and the consumer attributes that are pertinent to the decision? These concerns are not exhaustive, but they do highlight the fundamental importance of demonstrating well-documented and systematic behavioral failures that are pertinent to the specific policy context.
2.2 Applying a behavioral transfer test to WTA/WTP disparities
It is possible to illustrate the applicability of a behavioral transfer test by considering a behavioral phenomenon with potential direct applicability to benefit assessments – the discrepancy between willingness-to-accept (WTA) values and WTP values. The survey of the stated preference environmental literature by Horowitz and McConnell (2002) found a mean WTA–WTP ratio of 7.2, and Tunçel and Hammitt (2014) found a geometric mean WTA–WTP ratio of 3.3.6
To the extent that estimates of the VSL are WTA values for the wages that workers require to accept risky jobs, a straightforward application of the WTA–WTP discrepancy to mortality risk benefit assessment would be to deflate these benefits by a factor of 3 to 7. Although government agencies have been quick to adopt behavioral findings that boost benefit amounts, no government agencies have yet reduced their estimates of VSL to reflect the WTA–WTP discrepancy.
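The arithmetic behind such a naive deflation can be sketched as follows. The VSL figure is a hypothetical placeholder rather than any agency's official value, and the function name is ours; only the two ratios come from the studies cited above.

```python
# Naive deflation of a WTA-based VSL by published WTA-WTP ratios.
# HYPOTHETICAL_VSL is an illustrative placeholder, not an official agency figure.
HYPOTHETICAL_VSL = 9_000_000  # dollars

# Ratios reported in the text.
WTA_WTP_RATIOS = {
    "Horowitz and McConnell (2002), mean": 7.2,
    "Tunçel and Hammitt (2014), geometric mean": 3.3,
}


def deflate_vsl(vsl_wta: float, wta_wtp_ratio: float) -> float:
    """Return the WTP-equivalent value implied by dividing a WTA value by the ratio."""
    if wta_wtp_ratio <= 0:
        raise ValueError("ratio must be positive")
    return vsl_wta / wta_wtp_ratio


deflated = {
    source: deflate_vsl(HYPOTHETICAL_VSL, ratio)
    for source, ratio in WTA_WTP_RATIOS.items()
}

for source, value in deflated.items():
    print(f"{source}: ${value:,.0f}")
```

Under these assumed inputs, the deflated values fall between roughly one-seventh and one-third of the starting figure, which shows how consequential an untested behavioral transfer of this kind would be.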
Consistent with the benefits transfer concept articulated above, it is essential to assess the WTA–WTP gap for the specific context of labor market valuations of fatality risks. None of the many WTA–WTP studies are either actual field experiments or incentivized experiments involving fatality risks. Ideally, any evidence pertaining to such anomalies should be tied as closely as possible to the dominant governmental approach to setting the VSL level, which relies on labor market estimates of VSL. Consequently, the most meaningful test of whether a reduction in benefit values is warranted is to examine whether there is a discrepancy between WTA and WTP values for labor market estimates of VSL. To resolve this issue, Kniesner, Viscusi and Ziliak (2014) examined the VSL for job changers, considering workers who switched jobs and received greater wage compensation for increases in risk (WTA) and workers who received reduced compensation after moving to jobs that posed lower risks (WTP). Their analysis did not indicate any statistically significant gap in the values. For this pivotal benefit assessment component, application of a behavioral transfer test provides no empirical justification for deflating the benefit values to reflect the WTA–WTP gap.
2.3 Applying a behavioral transfer test to discounting anomalies
A behavioral anomaly that has played a prominent role in regulatory impact analyses is the possibility of intertemporal irrationalities. Consider the implications of incentivized classroom experiments suggesting the presence of hyperbolic discounting when students are offered rewards at different future dates. Will the same degree of hyperbolic discounting evident in fairly short-term experiments be applicable to assessing population-wide decisions involving pension saving, investment in more energy-efficient appliances, and decisions to smoke cigarettes for which the stakes are considerably greater and the time horizon is measured in decades rather than weeks or months? The longer time horizon alone may undermine the pertinence of the experimental findings since the person’s series of applicable discount rates used to value future outcomes extends over multiple years. Knowledge of apparent short-term temporal myopia may not have a huge bearing on the extent of the long-term deviation from rational behavior. Common hyperbolic discounting models suggest that people have inordinately high discount rates initially but that there is less of a bias thereafter. Suppose the individual’s long-term discount rate is 8% but that for decisions in the next year, the discount rate applied is 14%.7 Then for decisions involving a payoff one year from now, the intertemporal error caused by the initially high discount rate is to value the payoff at about 95% of the value it would have had if the person had used the 8% rate. However, for payoffs occurring over a 10-year period, as in the case of consumer durables and long-term government investments, the bias caused by an initially high discount rate is less pronounced.
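The one-year figure, and one way of gauging the ten-year case, can be checked with a short calculation. This is a minimal sketch that assumes the 14% rate applies only to the first year of delay and the 8% rate to every later year; that timing assumption and the function names are ours, not part of the original example.

```python
# Sketch of the discounting example: 14% in the first year, 8% thereafter.
# ASSUMPTION: the elevated rate applies only to the first year of delay.
LONG_RUN_RATE = 0.08
FIRST_YEAR_RATE = 0.14


def discount_factor(years: int, first_year_rate: float, later_rate: float) -> float:
    """Present value of $1 received after `years` whole years."""
    if years < 0:
        raise ValueError("years must be non-negative")
    if years == 0:
        return 1.0
    return 1.0 / ((1 + first_year_rate) * (1 + later_rate) ** (years - 1))


def implied_annual_rate(years: int) -> float:
    """Constant annual rate that reproduces the two-rate discount factor."""
    factor = discount_factor(years, FIRST_YEAR_RATE, LONG_RUN_RATE)
    return factor ** (-1.0 / years) - 1


# Valuation of a one-year-ahead payoff relative to using 8% throughout.
one_year_ratio = (
    discount_factor(1, FIRST_YEAR_RATE, LONG_RUN_RATE)
    / discount_factor(1, LONG_RUN_RATE, LONG_RUN_RATE)
)
rate_1 = implied_annual_rate(1)
rate_10 = implied_annual_rate(10)

print(f"one-year valuation ratio: {one_year_ratio:.3f}")
print(f"implied annual rate, 1-year horizon: {rate_1:.2%}")
print(f"implied annual rate, 10-year horizon: {rate_10:.2%}")
```

Under this assumption the average annual rate implied over a ten-year horizon lies much closer to 8% than to 14%, which is one way to read the claim that the bias is less pronounced for long-lived decisions.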
The more glaring disparity and potential intertemporal irrationality derives from use of a governmental interest rate of 3% as the rationality norm. That approach implies not only that the consumer’s initial discount rate is above the consumer’s longer-term rate, but also that, even in subsequent periods after the initial hyperbolic discounting in our example above has abated, consumers will be categorized as suffering from temporal myopia because their discount rate exceeds the low governmental discount rate of 3%. In this instance, government agencies may be remiss when they use a 3% discount rate as the universal rationality norm.
We recognize that there is diverse, real evidence of intertemporal irrationalities. However, these irrationalities do not appear to be so widespread and severe that government agencies should override all individual choices that have a multi-period component. As with behavioral transfer issues generally, there should be an estimate of the magnitude of the market failure in the particular context. There should also be an assessment of the welfare loss that is generated by policy mandates that assume all decisions should be guided by a uniform 3% interest rate.
3 Judging rationality in a behavioral transfer test
Assessments of the rationality of choices should recognize legitimate differences in preferences, beliefs, and financial resources. Particularly in policy situations involving energy utilization decisions in which efficient decisions are based on engineering models, there is a tendency both to homogenize the characteristics of the decision and to discard attributes that are unimportant to the regulator. Assuming that everyone has the same average preferences and beliefs may create the illusion that there is an energy-efficiency gap, whereas what is being observed are the consequences of different individual preferences. Similarly, ignoring car attributes such as acceleration and focusing on fuel efficiency as the paramount concern will lead to an overstatement of the private benefits derived from fuel economy standards.
The fundamental rationale guiding U.S. Department of Energy, U.S. EPA, and U.S. Department of Transportation regulations mandating energy-efficiency levels for consumer goods is the claim that consumers suffer from psychological biases that lead them to undervalue the long-term efficiency gains.8 The subsequent benefit assessment practice takes any deviation from the optimal energy choices derived from the agencies’ net present value models (which require assumptions for such things as capital costs, current and future energy prices, and duration and frequency of use, and which omit other relevant factors such as convenience) as evidence of consumer irrationality rather than modeling error. Under this rationale, overriding consumer preferences by mandating a restricted set of energy-efficient products generates net benefits because other valued attributes compromised by the energy-efficiency gains are excluded from consideration. The current benefits assessment approach assumes that people neglect future energy savings and that there are no unobserved competing concerns that make less energy-efficient durables attractive, such as different product attributes (e.g., power and acceleration for a vehicle), limited prospective time that the consumer will be using the product (e.g., because the appliance is for an apartment that the consumer will soon be leaving), and financial considerations that lead consumers to exhibit a rate of time preference higher than the 3% government discount rate.
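A stylized net present value comparison illustrates how sensitive the “efficient” choice is to the assumed discount rate and duration of use. Every number below is hypothetical and chosen only for illustration, and the function is a textbook annuity calculation rather than a reconstruction of any agency's model.

```python
# Stylized NPV of paying extra for a more energy-efficient appliance.
# All inputs are HYPOTHETICAL illustration values, not agency estimates.
EXTRA_PURCHASE_COST = 110.0   # added up-front cost of the efficient model ($)
ANNUAL_ENERGY_SAVING = 15.0   # energy cost saved each year of use ($)


def net_present_value(extra_cost: float, annual_saving: float,
                      years_of_use: int, discount_rate: float) -> float:
    """Discounted energy savings over the period of use, minus the extra cost."""
    savings = sum(
        annual_saving / (1 + discount_rate) ** year
        for year in range(1, years_of_use + 1)
    )
    return savings - extra_cost


scenarios = {
    "10 years of use, 3% rate": (10, 0.03),
    "10 years of use, 8% rate": (10, 0.08),
    "3 years of use, 3% rate": (3, 0.03),
}

results = {
    label: net_present_value(EXTRA_PURCHASE_COST, ANNUAL_ENERGY_SAVING, years, rate)
    for label, (years, rate) in scenarios.items()
}

for label, npv in results.items():
    verdict = "efficient model pays off" if npv > 0 else "efficient model does not pay off"
    print(f"{label}: NPV = {npv:7.2f} -> {verdict}")
```

With these assumed inputs, the efficient model pays off only for the consumer who keeps the appliance for ten years and discounts at 3%. A consumer with a higher rate of time preference or a shorter period of use rationally declines it, yet an analysis that imposes a single rate and duration would record that choice as a behavioral failure.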
The empirical evidence on whether there is in fact an energy-efficiency gap is decidedly mixed. Some studies suggest that consumers apply an inordinately high discount rate to energy savings of durable goods. However, other studies of consumers’ energy-related purchases highlight the important practical role of considerations that differ across consumers but are homogenized in the engineering studies that are used to identify the “efficient” consumer choice based on capital costs, current and future energy prices, individual discount rates, and the pattern and duration of the product’s use (McKinsey & Company, 2009). Engineering studies also may promise energy-efficiency gains that do not correspond to the realized benefits derived by a representative consumer (Metcalf & Hassett, 1999). Many pertinent costs and benefits also are omitted from the engineering studies, such as the effort and time-consuming nature of weatherizing one’s home (Allcott & Greenstone, 2012).
The practical significance of the assumed behavioral failures is to generate benefit values of addressing consumer irrationality that frequently account for the preponderance of the estimated regulatory benefits. For the U.S. Department of Transportation’s recent fuel economy mandates for passenger cars and light trucks, $440 billion of the $521 billion in benefits (based on a 3% interest rate and constant 2009 dollars) are the purported benefits of overcoming consumer irrationality (Gayer & Viscusi, 2013). Without this 85% share of benefits due to irrationality, the benefits decline to $81 billion, which is below the estimated $177 billion in costs. The U.S. EPA made similar assumptions in evaluating benefits of these fuel economy mandates, as it estimated that 87% of the total benefits of $613 billion would be due to addressing consumer irrationality (Gayer & Viscusi, 2013). Similar claims of substantial private benefits serve as principal drivers of the benefits of a wide range of energy-efficiency regulations, including incandescent light bulbs, clothes dryers, room air conditioners, and fuel economy standards for heavy-duty vehicles (Gayer & Viscusi, 2013).
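The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the rounded dollar amounts quoted above, so the computed share matches the reported 85% only approximately, and the EPA dollar amount is derived here from the reported share rather than taken from the source.

```python
# Cross-check of the fuel economy benefit figures quoted in the text.
# Dollar amounts are in billions, as reported (3% rate, constant 2009 dollars).
DOT_TOTAL_BENEFITS = 521.0
DOT_BEHAVIORAL_BENEFITS = 440.0   # attributed to overcoming consumer irrationality
DOT_COSTS = 177.0

EPA_TOTAL_BENEFITS = 613.0
EPA_BEHAVIORAL_SHARE = 0.87       # share reported in the text

dot_behavioral_share = DOT_BEHAVIORAL_BENEFITS / DOT_TOTAL_BENEFITS
dot_remaining_benefits = DOT_TOTAL_BENEFITS - DOT_BEHAVIORAL_BENEFITS
dot_net_with = DOT_TOTAL_BENEFITS - DOT_COSTS
dot_net_without = dot_remaining_benefits - DOT_COSTS

# Derived, not reported: dollar value implied by the EPA share.
epa_behavioral_benefits = EPA_BEHAVIORAL_SHARE * EPA_TOTAL_BENEFITS

print(f"DOT behavioral share: {dot_behavioral_share:.1%}")
print(f"DOT benefits without behavioral component: ${dot_remaining_benefits:.0f} billion")
print(f"DOT net benefits with behavioral component: ${dot_net_with:.0f} billion")
print(f"DOT net benefits without behavioral component: ${dot_net_without:.0f} billion")
print(f"EPA implied behavioral benefits: ${epa_behavioral_benefits:.0f} billion")
```

The sign of the net benefit estimate for the Department of Transportation rule flips from positive to negative once the behavioral component is removed, which is why the evidentiary basis for that component deserves scrutiny.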
Government agencies are sometimes struck by the inordinately large benefits that their analyses attribute to the behavioral failures, but these agencies seldom bring to bear empirical evidence to document the purported benefits as one would expect in any sound behavioral transfer practice. With respect to the fuel economy mandates, EPA (2011) observed that “it is a conundrum from an economic perspective that these large fuel economy savings have not been provided by automakers and purchased by consumers.” When an analysis generates such enormous apparent behavioral failures for not particularly complex decisions involving private benefits and costs, that is often a signal that the analysis has gone astray. It is usually advisable to explore whether there might be quite sensible reasons why the economic actors do not conform to the hypothesized behaviors, such as valuation of product attributes omitted from the analysis. However, rather than re-evaluating whether the considerable estimates of behavioral failures were warranted, EPA (2011) hypothesized that consumers were victims of intertemporal irrationality whereby “consumers put little weight on benefits from fuel economy in the future and show high discount rates.” Another EPA (2011) conjecture for which the agency provided no documentation of the existence or magnitude of any effect was that consumer uncertainty was the responsible factor: “Fuel savings in the future are uncertain, while at the time of purchase the increased costs of fuel-saving technologies are certain and immediate.” Other unpersuasive conjectures offered by EPA (2011) were that search costs were the driving factor (“Consumers may not be able to find the vehicles they want with improved fuel economy.”) and that “factors such as transaction costs and differences in quality may not be adequately measured.” The EPA and Department of Transportation struggled even more to explain the seeming irrationality of buyers of heavy-duty trucks because, compared to passenger cars, the vast majority of these vehicles are purchased and operated by businesses, which the agencies acknowledge have “narrow profit margins, and for which fuel costs represent a substantial operating expense” (EPA and Department of Transportation, 2011). The hallmark of these various explanations is that they are all conjectures with limited empirical support coupled with no meaningful attempt to map the purported behavioral failure into a credible benefit estimate.
4 Choosing the policy evaluation reference point
We take the fully informed, fully rational outcome as the preferred policy reference point. In particular, we advocate using rational choice models, such as the expected utility model, as the normative guide for policymaking, while recognizing that the behavioral literature finds ample evidence that people deviate from these norms. Indeed, there are a number of alternative models informed by behavioral studies, such as Kahneman and Tversky’s (1979) prospect theory, that describe the systematic deviations in behavior from the rational expected utility model. Any behavioral anomalies that lead to irrational self-harm suggest deviations from the fully rational outcome. Based on our recommended approach, the benefits of an appropriate policy response in such instances of behavioral failure would be the value that would pertain if people behaved in a manner that is consistent with the fully informed, fully rational model.
Thus, while we recognize that people may not behave in a manner that is consistent with frameworks, such as the expected utility model, we advocate reliance on such rational models as the normative reference point for policy assessment. This viewpoint is not shared universally by adherents to alternatives to the expected utility approach. For example, Kahneman (2011) urges a departure from the ex ante perspective of expected utility models and a shift in emphasis to an experienced utility model. As discussed below, we believe it may be feasible to incorporate experienced utility into conventional WTP rationality models at least in the case of nonstochastic decisions. Thus, the behavioral economics challenge involves not only a challenge to our understanding of how people make choices, but also may involve a challenge to the normative guidelines used for policy assessment. While we are sympathetic with the possibility that there may be behavioral failures that should be addressed by government policy, we are unwilling to jettison the reliance on rational economic choice models as the normative reference point.
5 Biases in risk beliefs
Behavioral studies consistently find evidence of people misperceiving risk, which could lead to suboptimal outcomes compared to what would occur in the fully informed, fully rational reference point. For example, if consumers overestimate the risks associated with using a product, they may underconsume the product. Likewise, if consumers underestimate the risks, they may overconsume the product and incur greater risks than they would if they acted on the basis of the true probabilities.
The comparison to our suggested reference point suggests two possible reasons for the misperception of risk: a person may lack information regarding the risk, or a fully informed person might suffer from a behavioral failing that leads to over- or underestimating the risk. Market failures that stem from irrational actions in which informed consumers fail to behave in an economically efficient manner are similar to, but distinct from, cases that stem from inadequate information or differential access to information by the various market participants. Inadequate information often is a matter of asymmetric information, such as the case of the drivers of GM vehicles who were unaware of the ignition switch defect.9 The behavioral failing occurs when people have received the pertinent information and are aware of the risk, but they did not fully incorporate the information when forming their risk beliefs. People might dismiss the information because it is inconsistent with their personal experiences or because they view other erroneous information as more credible. This second failure is a shortcoming of people’s beliefs even in the presence of risk awareness.
Systematic biases in risk perception are well documented. Studies consistently find that people tend to overestimate small risks that they face, such as the risk of botulism, and underestimate substantial risks, such as the chance of dying of heart disease.10 Sometimes there is specific empirical evidence that can be brought to bear on this general pattern of risk misperceptions. An interesting recent example is the level of risks posed by e-cigarettes, which have emerged as a lower-risk alternative to conventional cigarettes. Consumers greatly overestimate the associated mortality and lung cancer risks of e-cigarettes, as they apparently use their perception of the hazards of conventional cigarettes as their guide.11 The result of this misperception is that consumers considering a pairwise choice between e-cigarettes and conventional cigarettes will tend to choose conventional cigarettes to a greater extent than they would if their risk beliefs were accurate, exposing them to greater health risks.
The policy challenge posed by this new, safer product may be quite general. If consumers equate the risks of new, safer product alternatives with the risk levels of existing products, then market forces will discourage the emergence of new technologies and the adoption of safer products. The gap between people’s risk beliefs and the actual risks associated with the new product serves as a basis for calculating the benefits of fostering the new technology.
One behavioral phenomenon that can lead to risk overestimation is the availability heuristic. Vivid, recent events may play a disproportionate role in driving risk beliefs. Individuals should, of course, incorporate information based on their experiences in forming risk beliefs, but sometimes they do so to an excessive degree. After the 9/11 attack, there was a tendency to overestimate the number of people who would be killed in terrorist attacks on planes because people could readily imagine this fearful prospect. Other dramatic events, such as natural disasters, are subject to similar biases as they are salient and can be readily recalled, whereas more mundane but more prevalent risks such as motor-vehicle fatalities are less subject to these biases. A classic example of this phenomenon is that the public incorrectly believes that the likelihood of being killed in a terrorist attack while visiting Israel is greater than the chance of being killed there in a motor-vehicle accident (Kahneman, 2011).
Some behavioral model adherents support our approach of the full information rational decision reference point as a guiding principle for benefit assessment, such as in cases where there is an overestimation of risk stemming from the availability heuristic.12 But consider the following example inspired by the Happyville parable developed by Portney (1992). Suppose the government could completely clean up one of two hazardous waste sites. At site A, there is no actual risk, but people have very high perceptions of the risk. At site B, the risk is real and accurately assessed, but the perceived risk is lower than the perceived risk at site A. Cleaning up site A achieves a greater reduction in perceived risk, while cleaning up site B achieves a greater reduction in actual harm. Advocates of consumer sovereignty prefer that consumer choices be respected and that site A be cleaned up. Such analysts would assess benefits based on the reduction in the perceived risk levels achieved through the cleanup. However, using our approach based on actual risk levels, there are no benefits associated with cleaning up the phantom risks at site A, but there are demonstrable benefits associated with the cleanup at site B.
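The two-site comparison can be made concrete with a small numerical sketch. The risk levels, population, and value per statistical life below are hypothetical values chosen to match the qualitative setup of the example; they do not come from Portney (1992) or from any agency.

```python
from dataclasses import dataclass

# HYPOTHETICAL inputs chosen only to mirror the two-site example.
POPULATION_PER_SITE = 10_000
VALUE_PER_STATISTICAL_LIFE = 9_000_000  # dollars, illustrative placeholder


@dataclass(frozen=True)
class Site:
    name: str
    perceived_risk: float  # fatality risk per person as residents perceive it
    actual_risk: float     # objective fatality risk per person


def cleanup_benefit(site: Site, use_perceived_risk: bool) -> float:
    """Monetized benefit of eliminating the site's risk under a chosen risk measure."""
    risk = site.perceived_risk if use_perceived_risk else site.actual_risk
    return POPULATION_PER_SITE * risk * VALUE_PER_STATISTICAL_LIFE


site_a = Site("A", perceived_risk=1e-3, actual_risk=0.0)   # phantom risk
site_b = Site("B", perceived_risk=1e-4, actual_risk=1e-4)  # real, accurately perceived

perceived_basis = {s.name: cleanup_benefit(s, use_perceived_risk=True) for s in (site_a, site_b)}
actual_basis = {s.name: cleanup_benefit(s, use_perceived_risk=False) for s in (site_a, site_b)}

print("Benefits valued at perceived risk:", perceived_basis)
print("Benefits valued at actual risk:   ", actual_basis)
```

Valued at perceived risks, site A ranks first; valued at actual risks, site A generates zero benefits and site B ranks first. The ranking reversal is the point of the example and does not depend on the particular numbers assumed.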
We approach benefit estimation from the default position of respecting consumer sovereignty under the presumption that fully informed people make self-interested decisions, or at least are more able to make decisions that bear on their own well-being than are policymakers. However, where there are systematic and well-documented findings of behavioral biases, such as in the case of flawed risk perceptions, we advocate basing policies on achieving the outcome that would result from fully informed, fully rational decision making. Frequently, this policy outcome can be achieved through less intrusive regulations, such as correcting information asymmetries through providing information, or even correcting misperceptions of risk through informational nudges. Sometimes, more direct interventions are needed. Our approach establishes a consistent basis for government policy. Just as there is widespread support for policies that seek to discourage consumption of products for which consumers underestimate the health risks, erroneous individual choices based on risk overestimation should be assessed from the standpoint of the objective risk levels.
6 Ambiguity aversion
A long-established anomaly is that individuals are averse to imprecise probabilities. The Ellsberg (1961) Paradox observed that for any given mean probability level, people are averse to uncertain chances of winning a prize. Subsequent research has documented that people are often averse to ambiguous risks of losses as well. The relationships can become more complex as there also may be instances of ambiguity-seeking behavior. When people face a high probability of an adverse outcome, such as the risk of a major disaster, they may seek the comfort that uncertain probabilities provide, since such uncertainty suggests that the actual probability of the loss may not be as great as the mean value suggests. Unlike risk aversion, which is consistent with standard expected utility models, rational economic choice models regard ambiguity aversion and ambiguity-seeking behavior as evidence of individual irrationality.
As with many other documented forms of irrationality, these anomalies have spawned a cottage industry of economic models seeking to incorporate the role of the aberrant behavior into models of individual choice.13 If one were to accept such models as the appropriate normative reference point, there would be no ambiguity-related benefits that would be generated by altering choices influenced by risk ambiguity. However, using a fully informed, fully rational framework as the guide, the benefits can be calculated by the valuation derived when people are assumed to act on the objective probability levels, with no influence of the ambiguity of the risk on their decisions.
Benefit assessment for policies consequently will be quite different based on how ambiguity attitudes are treated. Suppose that people are averse to products containing nanoparticles because of imagined risks associated with the new, uncertain technology. Regulating or banning such uncertain technologies is often touted as being consistent with the precautionary principle. Treatment of ambiguity aversion as a legitimate preference to be accounted for in benefit assessment will generate potential benefits from banning products containing nanoparticles, whereas reliance on our approach would only attribute benefits based on the reduction in the mean objective risk levels.
7 Experienced utility
Policymakers make decisions prospectively based on their anticipated effects. Thus, conventional WTP approaches for these benefits comprise a natural framework for conceptualizing these effects. For policies with probabilistic effects, the guidance provided by the expected utility model is well suited to the task. Benefit values grounded in this approach, such as the VSL, are formulated in a manner that corresponds to the structure of the policy situation.
Recent work by psychologists has suggested that decision utility may differ from experienced utility levels.14 For the most important benefit assessment component, mortality risks, the role of experienced utility levels does not appear to be a pertinent concern except perhaps with respect to attendant morbidity effects. Death, which is an outcome that produces the absence of future utility levels, need not be experienced for people to appreciate its finality. To the extent that there are situations in which one can identify gaps between decision utility and experienced utility levels, perhaps because people lack the information to anticipate the welfare effects ex ante, even advocates of a neoclassical economics perspective might advocate that there be recognition of the difference. For example, patients with multiple sclerosis are less willing to accept a risk of death from a potential cure than are healthy people who are confronting the hypothetical disease (Sloan, Viscusi, Chesson, Conover & Whetten-Goldstein, 1998). Such disparities may result from inherent limitations of using stated preference studies to value prospective health outcomes, or it may be that adaptation results in less of a welfare loss than is anticipated. It is feasible to incorporate such effects in conventional WTP values for risk reduction without abandoning rational economic frameworks.
A more problematic suggestion is that current benefit valuation approaches be replaced by the use of happiness scales. More limited uses of happiness scales have also been proposed; Adler (2016), for example, suggests that happiness might be an argument in the person’s preference–utility function. Sunstein (2016) likewise takes a cautious but supportive view with respect to happiness studies. A typical happiness survey question asks the respondent to rate his or her happiness with life on scales such as from 0 to 10, 1 to 10, or 1 to 7 (Layard, Mayraz & Nickell, 2008). For example, Graham (2016) discusses the Cantril ladder question where respondents consider an imaginary ladder in which 0 corresponds to the worst life and 10 is the best possible life. Other approaches, such as 0–1 stress questions and whether the person smiled yesterday, serve as additional happiness measures. Dolan and Laffan (2016) report estimates of how air pollution affects life satisfaction on a 0 to 10 scale as well as similarly scaled measures of happiness, anxiety, and how worthwhile one’s life is. We believe that this approach is ill-suited to benefit assessment for the following five reasons.15 First, there is no reference point for conceptualizing the rating. After suffering a disability, should I rate my happiness today conditional on that disability, or should it be with respect to what my happiness would be in the absence of the disability? Second, the scales at best are person-specific ordinal rankings with no quantitative validity. A movement along the scale from 8 to 5 may not have the same welfare effect as a movement from 4 to 1, and may not be three times as bad as a drop from 8 to 7. Similarly, considering effects across respondents, the drop in welfare from 8 to 6 may be quite different for different people, making benefit assessments for policy problematic.
International comparisons involving happiness levels in countries at quite different stages of economic development and personal income are likely to be particularly meaningless. Third, the scales do not have a theoretical foundation that is suitable for dealing with policy choices involving probabilistic outcomes. Is, for example, a 50% chance of obtaining happiness levels of either 8 or 4 equal to a happiness value of 6? Given the rejection of expected utility theory that underlies many behavioral economics approaches, there would seem to be no consistent theoretical rationale for undertaking such an expected happiness value calculation. Fourth, the happiness measures are too coarse and narrowly focussed to be used for most policies. Seldom does any policy improve a person’s happiness score from 7 to 8. For example, complete elimination of the average 1/25,000 risk of death faced by American workers has a monetary value of $360 based on a VSL of $9 million, which is a very modest effect relative to worker income levels. Policies with personal impacts on the order of such amounts are not large enough to generate a perceptible change in a happiness scale on a hypothetical hedonometer. If a policy raises my happiness level from 8 to 9, will this be a permanent impact that continues for the life of the policy, or is it an ephemeral increase in well-being? Does it reflect an increase in my self-assessed well-being as I live my life, or does it reflect the judgment I make when I evaluate my life (Kahneman, 2011)? Fifth, and perhaps most fundamental, the overall objective of BCA is to put benefits and costs in comparable units to assist in being able to make policy decisions. Costs are already in monetary terms so that converting the benefits into a monetary value establishes the commensurability of the benefit and cost components.
The challenges above illustrate some of the difficulties that happiness studies face in arriving at monetized benefits for practical use in policy analysis.16
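The arithmetic behind two of the points above can be checked with a short sketch. The $9 million VSL, the 1/25,000 risk, and the 8-or-4 lottery come from the text; the two rescalings of the happiness scale (squaring and taking square roots) are our own hypothetical choices, used only to illustrate why an expected-value calculation on a merely ordinal scale has no stable meaning.

```python
from fractions import Fraction
import math

# Value of eliminating the average annual worker fatality risk.
VSL = 9_000_000                          # dollars
ANNUAL_FATALITY_RISK = Fraction(1, 25_000)
value_of_eliminating_risk = VSL * ANNUAL_FATALITY_RISK
print(f"Value of eliminating the risk: ${value_of_eliminating_risk}")

# A 50-50 lottery over happiness scores of 8 and 4, versus a sure score of 6.
lottery = [(Fraction(1, 2), 8), (Fraction(1, 2), 4)]
sure_score = 6


def expected_score(lottery, rescale=lambda s: s):
    """Probability-weighted average of (possibly rescaled) scores."""
    return sum(p * rescale(s) for p, s in lottery)


# On the raw scale the lottery and the sure score appear equivalent.
raw_tie = expected_score(lottery) == sure_score

# An ordinal scale is unchanged in meaning by any order-preserving
# rescaling, yet the comparison flips depending on the (assumed) rescaling.
lottery_wins_squared = expected_score(lottery, lambda s: s**2) > sure_score**2
lottery_wins_sqrt = expected_score(lottery, math.sqrt) > math.sqrt(sure_score)

print(raw_tie, lottery_wins_squared, lottery_wins_sqrt)
```

Because both rescalings preserve every respondent's ranking of the scores, an ordinal interpretation offers no basis for preferring one over the other, and hence no basis for the expected happiness calculation itself.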
Happiness and life satisfaction questions would not pass behavioral transfer tests analogous to the benefits transfer tests commonly applied to other economic evidence. A pivotal test of stated preference values is whether the responses vary in a manner that is consistent with basic aspects of economic rationality. Smith (2008) presents evidence indicating that happiness questions fail reasonable rationality tests, as adverse health shocks and losses of non-wage-related income do not have consistent adverse effects on reported happiness levels.
Application of a behavioral transfer test analogous to current benefits transfer tests would undermine the potential applicability of using happiness measures as an alternative to a WTP benefits assessment measure. Happiness studies generally would not satisfy other criteria applied to stated preference studies in the environmental literature. Indeed, some of the principal critiques of early contingent valuation studies of environmental goods have been by happiness study proponents, but these same critiques are also applicable to happiness studies. Kahneman and Knetsch (1992) observed that respondents in contingent valuation studies may be subject to embedding effects in which, instead of valuing the specific environmental good in question, they were expressing a more general support of environmental quality. Happiness questions are subject to similar criticisms if they do not establish quite clearly what respondents are valuing, on what dimensions, and over what time frame. Similarly, there have been long-standing critiques of stated preference studies suggesting that they do not elicit actual underlying preferences, but instead lead to the construction of what appear to be preferences as part of the survey task. Yet Kahneman and Krueger (2006) indicate that responses to happiness and life satisfaction questions are likewise constructed during the survey task and are influenced by the survey structure. Just as the stated preference literature evolved over time to incorporate more stringent validity tests, it is possible that the happiness literature will also develop to have a sounder basis. The contribution by Shogren and Thunström (2016) in this issue develops similar themes and suggests that some methodologies developed for stated preference studies could assist in addressing hypothetical bias concerns, such as the use of oaths to promote honest responses.
But such studies should be subject to the same kinds of scrutiny that other economic evidence receives before the results of these studies are adopted. And until happiness scales can be converted into monetary terms, they cannot play an instrumental role in BCAs.
For benefit estimation, we adopt the default position of respecting consumer sovereignty under the presumption that fully informed people are better able to make decisions that bear on their own well-being than are others. The basis for this revealed-preference approach, which is supported by much empirical evidence, is that in most contexts consumers are better equipped than analysts or policymakers to make market decisions that affect themselves. Consumers are typically better able to make decisions about which products they value and which goods they should purchase given the substantial heterogeneity in preferences, financial resources, and personal situations.
However, the insights of behavioral and psychological studies suggest that market failures indeed exist where people make self-harming decisions. We advocate estimating the benefits of correcting these actions relative to the outcome that would prevail if people were fully informed and fully rational actors. It is important that such an approach be grounded in systematic, well-documented, and context-specific findings of behavioral failings. What we have seen in the case of energy-efficiency regulations is that the agencies assume that findings of short-sightedness in some contexts provide sufficient rationale for overriding consumer preferences in other contexts and thus justify the use of heavy-handed mandates.
Frequently, the policy outcome to address behavioral shortfalls that lead to self-harming actions can be achieved through less intrusive regulations, such as correcting information asymmetries through providing information, or even correcting misperceptions of risk through informational nudges. This framework yields a consistent approach to government policy. Just as there is widespread support for policies that seek to discourage consumption of products for which consumers underestimate the health risks, erroneous individual choices based on risk overestimation should be assessed from the standpoint of the objective risk levels.
Our approach also limits the policy response to correct market failures that occur due to behavioral biases that lead people to self-harming actions. We differ with the advocacy position in some behavioral studies that policymakers should adjust the choice architecture faced by people, not to correct self-harming behavior, but instead to achieve other socially desired goals. For example, an implication of the tax salience literature is that net benefits increase as a tax becomes less salient (Chetty, Looney & Kroft, 2009). However, the increase in net benefits is composed of an increase in tax revenues that dominates a decrease in consumer surplus, since the less salient tax leads consumers to under-respond to it. In other words, tricking taxpayers into thinking a tax does not exist has two effects: it leads the taxpayers to make poor consumption choices, and it increases tax revenue because more transactions are taxed. Although net benefits are increased since the second effect is greater than the first, this policy of disguising taxes would be inconsistent with our approach of basing policies on fully informed, fully rational decision making.
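The decomposition described above can be sketched numerically. The example is a stylized illustration of our own, not the Chetty, Looney and Kroft (2009) model itself: it assumes linear demand, quasi-linear utility, a constant producer price, and a hypothetical salience parameter `theta` (1 means the tax is fully perceived, 0 means it is ignored). All parameter values are assumptions.

```python
def outcomes(theta, a=100.0, b=1.0, p=20.0, t=10.0):
    """Consumer surplus, tax revenue, and their sum for salience theta.

    Assumed setup: a fully attentive consumer demands x = a - b * price,
    which corresponds to gross utility (a*x - x**2 / 2) / b. A consumer
    who perceives only the share theta of the tax buys as if the price
    were p + theta * t but actually pays p + t.
    """
    x = a - b * (p + theta * t)                # quantity actually purchased
    gross_utility = (a * x - x**2 / 2) / b
    consumer_surplus = gross_utility - (p + t) * x
    revenue = t * x
    return consumer_surplus, revenue, consumer_surplus + revenue


cs_full, rev_full, net_full = outcomes(theta=1.0)   # fully salient tax
cs_low, rev_low, net_low = outcomes(theta=0.5)      # partially hidden tax

print(f"Change in consumer surplus: {cs_low - cs_full:+.1f}")
print(f"Change in tax revenue:      {rev_low - rev_full:+.1f}")
print(f"Change in net benefits:     {net_low - net_full:+.1f}")
```

With these assumed parameters, lowering salience reduces consumer surplus, because purchases exceed what the consumer would choose at the true price, while raising revenue by a larger amount, so measured net benefits rise even though consumers are choosing less well.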
Indeed, there is evidence that suggests that since opaque taxes dull consumer responses, they can also dull the political penalty associated with raising taxes. An approach of optimizing government coffers would then suggest opaque taxes would lead to tax rates higher than fully informed voters would like. Finkelstein (2009) finds that drivers are more aware of tolls paid at booths than paid electronically, and that switching from the former to the latter led to a 20 to 40% rate increase. In other words, as tax salience goes down, tax rates go up. She finds that after the adoption of electronic tolls, toll setting becomes less sensitive to the local election calendar, suggesting that reduced tax salience reduces the political costs of raising tolls.
There are other behavioral studies that suggest that behavioral anomalies can be exploited for achieving other goals rather than correcting the self-harming activity. For example, Engström, Nordblom, Ohlsson and Persson (2015) find that taxpayers in Sweden were more aggressive about claiming tax deductions when they owed additional taxes at the time of filing than when they expected a refund (which is consistent with predictions of prospect theory). According to Madrian (2014), this finding suggests that tax officials should adopt a strategy that relies on overwithholding taxes in order to provide more refunds at the time of tax filing, which would make taxpayers less likely to engage in tax avoidance strategies. This approach would increase tax revenues, but it does not address a market failure stemming from self-harming behavior.
As the many examples we have provided suggest, the concerns we have expressed with respect to the role of behavioral economics in benefit-cost assessments are not entirely hypothetical. Agencies have already begun to treat behavioral economics findings as providing carte blanche for laying claim to inordinately large and highly speculative benefit levels. Although behavioral failures may exist, they do not provide open-ended justifications for benefit estimates any more than do more traditional market failures, such as the presence of externalities. In this article, we have proposed that government agencies be subject to analytical discipline that is in many respects similar to the principles and standards that govern other forms of BCAs. Chief among our concerns are that there need to be formal guidelines for behavioral transfer practices and that choices of fully informed, rational consumers should serve as the normative reference point. The continued proliferation of behavioral anomalies, coupled with agencies’ incentive to justify policies based on their parochial interests, suggests that the concerns we have raised will not abate in the absence of establishing guidelines for BCAs that have adapted to the policy ramifications of this emerging behavioral economics literature.