“Switching” between fast and slow processes is just reward-based branching

George Ainslie

doi:10.1017/S0140525X22002990

Published online by Cambridge University Press: 18 July 2023

Affiliation: Department of Veterans Affairs, Coatesville, PA, USA
ga@picoeconomics.org
www.picoeconomics.org

Abstract

Shortcuts to goals are rewarded by faster attainment and punished by more frequent failure, so selection of the various kinds – heuristics, cached sequences (habits or macros), gut instincts – depends on reward history just like other kinds of choice. The speeds of shortcuts lie on continua along with speeds of deliberation, and these continua have no obvious separation points.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 46, 2023, e113

DOI: https://doi.org/10.1017/S0140525X22002990
Copyright: Copyright © The Author(s), 2023. Published by Cambridge University Press

This target article (TA) follows on De Neys's recent proposal “that trying to answer the core single vs dual process model debate is pointless for empirical scientists” so it is “time to move on” (De Neys, 2021, Introduction). What he proposes in the TA is “a more viable dual process architecture” (target article Abstract), which is “orthogonal [to] whether the difference between the two types of processing should be conceived as merely quantitative or qualitative.” Nevertheless, he argues for two qualitatively different processes, perhaps characterized by the 14 different properties he listed in the earlier article (fast, effortless, affective, automatic… vs. slow, effortful, affectless, controlled…; De Neys, 2021, p. 4), which he calls here simply fast and slow. He demonstrates the flaws in dual-process theories' usual assumptions: that the two processes must operate separately (“exclusivity”), and that there must be a “switch feature… by which a reasoner can decide to shift between more intuitive and deliberate processing”; but he pulls back from other authors' proposal that “we simply abandon the dual process enterprise.” His refutation of the authors who have favored a single, quantitatively based decision process is just to point out that “some responses require more deliberation than others,” which would not seem to require a dichotomy.

Dual-process models have admittedly been popular over the years, beginning with Plato's wild versus well-behaved chariot horses. In addition to De Neys's fast versus slow examples, choice making has been described as passionate versus reasonable, impulsive versus reflective, myopic versus far-sighted, hot versus cool, and model-free versus model-based, among others. De Neys also includes as fast the products of “automatization,” by which repeated sequences of choices “will be elicited intuitively.” In addition, brain imaging has found evidence for steep-discounting versus shallow-discounting brain centers (McClure et al., 2004; van den Bos & McClure, 2013).

However, as De Neys himself concludes, there is no operation that system 2 can perform that system 1 cannot, and “thinking always involves a continuous interaction between system 1 and system 2 application” (target article, sect. 3.4, para. 5). Other authors have pointed out obvious problems with the dual approach: If there are distinct systems, there must be more than two of them, because the properties attributed to the two systems do not reliably occur together (Zbrodoff & Logan, 1986); in particular, automatic processes may or may not be affectively arousing (Ainslie, 2021). Furthermore, the listed properties, such as effort, affect, and speed itself, are continua. If the two whole lists of properties really constitute discrete systems, there should be natural breaks in the continua from fast to slow, and the breaks should occur at equivalent levels in the n dimensions. As a negative example, the only obvious break in transparency would be too-fast-to-introspect versus not-too-fast-to-introspect, which would not define different kinds.

Most “type 1” processing in humans comprises sequences that have been automatized, macros (or habits) that call up other macros. In language, a squiggly line is interpreted as a letter, a sequence of letters is interpreted as a word, a series of words forms a concept (or cliché). All highly automatized; but if I was to find an anomaly – no, it should be “were to find an anomaly” – my ear would be quick to re-set it. This should not require a distinct system. Even if I stopped to ponder the use of the subjunctive, I would just be trying out sequences I had previously automatized. Likewise, as my calculation proceeds from 2 + 2 through, say 8 + 8, to 64 + 64, and so forth, at some points my mind will pause to find component automatizations; but is there a point where the pause divides two systems?

The strongest case for separate processes might be based on the activities of separate sites in the brain, but even here true separation is doubtful. The dorsolateral striatum (putamen) is differentially active when repeated connections have been cached to form macros, whereas the dorsomedial striatum (caudate) is more active during flexible behavior; but their functioning has been observed to be integrally combined (Dolan & Dayan, 2013; Keramati et al., 2016). Similarly, the existence of separate steep and shallow reward discount centers in the brain is controversial (Kable & Glimcher, 2007; Lempert et al., 2019). If there do exist anatomically separate response-selection systems in the brain, the best candidates would be those for motivational salience and (supposedly separate) reward, governing the attraction of attention and behavioral approach/avoidance, respectively (Berridge & Robinson, 1998). But even here, salience and behavior selection are correlated with activity in mostly the same brain regions (Kim et al., 2021); and when even threatening stimuli are voluntarily gated out, attention to them must have been weighed in the common marketplace of reward (see Ainslie, 2009).

The professed scope of the TA's model is universal, but except for its reference to cupcakes its examples are cognitive searches for correct solutions to puzzles, rather than choices among competing rewards. Accordingly, “the peak activation strength of an intuition reflects how automatized or instantiated the underlying knowledge structures are (i.e., how strongly it is tied to its eliciting stimulus).” This rather Pavlovian convention hampers the model's application to goal-directed activities. By contrast, it is feasible to model the selection of all learnable processes which can replace each other using the amount and timing of their contingent reward (Ainslie, 2017). The sources of reward – consumption goods, ethical goods, social cues, puzzle solutions, signal detections, emotions, the satisfaction of urges – as well as their speeds of onset, are miscellaneous. It should not matter that some of their subroutines involve particular parts of the brain (for instance the amygdala – Aquino et al., 2020 – or hippocampus – Gauthier & Tank, 2018), as long as their weights are ultimately comparable to each other. Likewise, the weighing process may or may not involve a specific site, such as the orbitofrontal cortex (Bartra et al., 2013; Levy & Glimcher, 2012), a set of interacting sites (Krönke et al., 2020), or no identifiable dwelling place (Dohmatob, Dumas, & Bzdok, 2020). In reward research, the adoption of millisecond-specific electroencephalography (for instance, Sambrook et al., 2018) promises to give precise evidence about branching to fast, slow, and intermediate processes.
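The idea of selecting among interchangeable processes by the amount and timing of their contingent reward can be illustrated with a toy calculation. The sketch below is not taken from the commentary or from Ainslie (2017); it assumes the standard hyperbolic discount form, value = amount / (1 + k × delay), and weights it by a success rate standing in for reward history. The process names, delays, success rates, and values of k are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """A candidate route to the same goal (all numbers are illustrative)."""
    name: str
    delay_s: float    # assumed time until the outcome arrives
    p_success: float  # assumed fraction of past attempts that succeeded


def discounted_value(amount: float, delay_s: float, k: float) -> float:
    """Hyperbolic discounting: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay_s)


def expected_value(proc: Process, amount: float, k: float) -> float:
    """Discounted reward weighted by how often the process has paid off."""
    return proc.p_success * discounted_value(amount, proc.delay_s, k)


def choose(processes, amount: float, k: float) -> Process:
    """Pick the process with the highest expected discounted reward."""
    return max(processes, key=lambda p: expected_value(p, amount, k))


# Faster shortcuts fail more often; slower deliberation fails less often.
CANDIDATES = [
    Process("gut instinct", delay_s=0.5, p_success=0.60),
    Process("heuristic", delay_s=2.0, p_success=0.80),
    Process("deliberation", delay_s=20.0, p_success=0.98),
]

if __name__ == "__main__":
    for k in (1.0, 0.05, 0.01):
        winner = choose(CANDIDATES, amount=10.0, k=k)
        print(f"k={k}: {winner.name}")
```

Under these assumed numbers the same comparison favors the fastest shortcut when delay is penalized steeply and the slowest process when it is penalized weakly, with the intermediate process winning in between. Nothing in the rule itself marks a boundary between a "fast" and a "slow" system.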

Financial support

This material is the result of work supported with resources and the use of facilities at the Department of Veterans Affairs Medical Center, Coatesville, PA, USA. The opinions expressed are not those of the Department of Veterans Affairs or of the US Government.

Competing interest

None.

References

Ainslie, G. (2009). Pleasure and aversion: Challenging the conventional dichotomy. Inquiry: An Interdisciplinary Journal of Philosophy, 52(4), 357–377. https://doi.org/10.1080/00201740903087342

Ainslie, G. (2017). De gustibus disputare: Hyperbolic delay discounting integrates five approaches to choice. Journal of Economic Methodology, 24(2), 166–189. https://doi.org/10.1080/1350178X.2017.1309748

Ainslie, G. (2021). Reply to commentaries to “willpower with and without effort.” Behavioral and Brain Sciences, 44, E57. https://doi.org/10.1017/s0140525x21000029

Aquino, T. G., Minxha, J., Dunne, S., Ross, I. B., Mamelak, A. N., Rutishauser, U., & O'Doherty, J. P. (2020). Value-related neuronal responses in the human amygdala during observational learning. Journal of Neuroscience, 40(24), 4761–4772. https://doi.org/10.1523/JNEUROSCI.2897-19.2020

Bartra, O., McGuire, J. T., & Kable, J. W. (2013). The valuation system: A coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. NeuroImage, 76, 412–427. https://doi.org/10.1016/j.neuroimage.2013.02.063

Berridge, K. C., & Robinson, T. (1998). What is the role of dopamine in reward: Hedonic impact, reward learning, or incentive salience. Brain Research Reviews, 28, 309–369. https://doi.org/10.1016/S0165-0173(98)00019-8

De Neys, W. (2021). On dual- and single-process models of thinking. Perspectives on Psychological Science, 16(6), 1412–1427. https://doi.org/10.1177/1745691620964172

Dohmatob, E., Dumas, G., & Bzdok, D. (2020). Dark control: The default mode network as a reinforcement learning agent. Human Brain Mapping, 41(12), 3318–3341. https://doi.org/10.1002/hbm.25019

Dolan, R. J., & Dayan, P. (2013). Goals and habits in the brain. Neuron, 80(2), 312–325. https://doi.org/10.1016/j.neuron.2013.09.007

Gauthier, J. L., & Tank, D. W. (2018). A dedicated population for reward coding in the hippocampus. Neuron, 99(1), 179–193. https://doi.org/10.1016/j.neuron.2018.06.008

Kable, J. W., & Glimcher, P. W. (2007). The neural correlates of subjective value during intertemporal choice. Nature Neuroscience, 10, 1625–1633. https://doi.org/10.1038/nn2007

Keramati, M., Smittenaar, P., Dolan, R. J., & Dayan, P. (2016). Adaptive integration of habits into depth-limited planning defines a habitual-goal-directed spectrum. Proceedings of the National Academy of Sciences, 113(45), 12868–12873. https://doi.org/10.1073/pnas.1609094113

Kim, H., Nanavaty, N., Ahmed, H., Mathur, V. A., & Anderson, B. A. (2021). Motivational salience guides attention to valuable and threatening stimuli: Evidence from behavior and functional magnetic resonance imaging. Journal of Cognitive Neuroscience, 33(12), 2440–2460. https://doi.org/10.1162/jocn_a_01769

Krönke, K. M., Wolff, M., Shi, Y., Kräplin, A., Smolka, M. N., Bühringer, G., & Goschke, T. (2020). Functional connectivity in a triple-network saliency model is associated with real-life self-control. Neuropsychologia, 149, 107667. https://doi.org/10.1016/j.neuropsychologia.2020.107667

Lempert, K. M., Steinglass, J. E., Pinto, A., Kable, J. W., & Simpson, H. B. (2019). Can delay discounting deliver on the promise of RDoC? Psychological Medicine, 49(2), 190–199. https://doi.org/10.1017/S0033291718001770

Levy, D. J., & Glimcher, P. W. (2012). The root of all value: A neural common currency for choice. Current Opinion in Neurobiology, 22(6), 1027–1038. https://doi.org/10.1016/j.conb.2012.06.001

McClure, S. M., Laibson, D. I., Loewenstein, G., & Cohen, J. D. (2004). The grasshopper and the ant: Separate neural systems value immediate and delayed monetary rewards. Science, 306, 503–507. https://doi.org/10.1126/science.1100907

Sambrook, T. D., Hardwick, B., Wills, A. J., & Goslin, J. (2018). Model-free and model-based reward prediction errors in EEG. NeuroImage, 178, 162–171. https://doi.org/10.1016/j.neuroimage.2018.05.023

van den Bos, W., & McClure, S. M. (2013). Towards a general model of temporal discounting. Journal of the Experimental Analysis of Behavior, 99(1), 58–73. https://doi.org/10.1002/jeab.6

Zbrodoff, N. J., & Logan, G. D. (1986). On the autonomy of mental processes: A case study of arithmetic. Journal of Experimental Psychology: General, 115(2), 118. https://doi.org/10.1037/0096-3445.115.2.118