The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity.

2002 ◽  
Vol 109 (4) ◽  
pp. 679-709 ◽  
Author(s):  
Clay B. Holroyd ◽  
Michael G. H. Coles

2014 ◽
Vol 26 (3) ◽  
pp. 635-644 ◽  
Author(s):  
Olav E. Krigolson ◽  
Cameron D. Hassall ◽  
Todd C. Handy

Our ability to make decisions is predicated upon our knowledge of the outcomes of the actions available to us. Reinforcement learning theory posits that actions followed by a reward or punishment acquire value through the computation of prediction errors: discrepancies between the predicted and the actual reward. A multitude of neuroimaging studies have demonstrated that rewards and punishments evoke neural responses that appear to reflect reinforcement learning prediction errors [e.g., Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. Journal of Cognitive Neuroscience, 21, 1833–1840, 2009; Bayer, H. M., & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129–141, 2005; O'Doherty, J. P. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–776, 2004; Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–709, 2002]. Here, we used the event-related brain potential (ERP) technique to demonstrate not only that rewards elicit a neural response akin to a prediction error but also that this signal rapidly diminishes and propagates to the time of choice presentation with learning. Specifically, in a simple, learnable gambling task, we show that novel rewards elicited a feedback error-related negativity that rapidly decreased in amplitude with learning. Furthermore, we demonstrate the existence of a reward positivity at choice presentation, a previously unreported ERP component with a timing and topography similar to those of the feedback error-related negativity, and one that increased in amplitude with learning. This pattern of results mirrored the output of a computational model that we implemented to compute reward prediction errors and the changes in their amplitude at the time of choice presentation and reward delivery. Our results provide further support that the computations underlying human learning and decision-making follow reinforcement learning principles.
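The propagation effect described in this abstract is the textbook temporal-difference (TD) signature. The sketch below is a minimal illustration of that dynamic, not the authors' actual model; the learning rate, discount factor, and trial count are assumed values.

```python
# A minimal TD sketch of the effect described above: with learning, the
# prediction error at reward delivery shrinks while a prediction error
# emerges at choice (cue) presentation. All parameter values are
# illustrative assumptions, not the authors' fitted model.

alpha, gamma = 0.2, 1.0    # learning rate and discount factor (assumed)
n_trials = 30
v_cue = 0.0                # learned value of the choice/cue state
reward = 1.0               # deterministic reward for the learnable option

for trial in range(n_trials):
    delta_cue = gamma * v_cue - 0.0   # PE at cue onset: grows with learning
    delta_rew = reward - v_cue        # PE at feedback: shrinks with learning
    v_cue += alpha * delta_rew        # TD(0) update of the cue value
    if trial in (0, n_trials - 1):
        print(f"trial {trial + 1:2d}: cue PE = {delta_cue:.2f}, "
              f"feedback PE = {delta_rew:.2f}")
```

On this account, the decaying feedback prediction error tracks the shrinking feedback error-related negativity, while the growing cue-locked prediction error tracks the reward positivity at choice presentation.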


2011 ◽  
Vol 23 (4) ◽  
pp. 936-946 ◽  
Author(s):  
Henry W. Chase ◽  
Rachel Swainson ◽  
Lucy Durham ◽  
Laura Benham ◽  
Roshan Cools

We assessed electrophysiological activity over the medial frontal cortex (MFC) during outcome-based behavioral adjustment using a probabilistic reversal learning task. During recording, participants were presented with two abstract visual patterns on each trial and had to select the stimulus rewarded on 80% of trials while avoiding the stimulus rewarded on 20% of trials. These contingencies were reversed frequently during the experiment. Previous EEG work has revealed a feedback-locked electrophysiological response over the MFC, the feedback-related negativity (FRN), which correlates with the negative prediction error [Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–709, 2002] and predicts outcome-based adjustment of decision values [Cohen, M. X., & Ranganath, C. Reinforcement learning signals predict future decisions. Journal of Neuroscience, 27, 371–378, 2007]. Unlike previous paradigms, ours enabled us to disentangle mechanisms related to the reward prediction error, derived from reinforcement learning (RL) modeling, from mechanisms related to explicit rule-based adjustment of actual behavior. Our results demonstrate greater FRN amplitudes with greater RL model-derived prediction errors. Conversely, expected negative outcomes that preceded rule-based behavioral reversal were not accompanied by an FRN. This pattern contrasted markedly with that of the P3 amplitude, which was significantly greater for expected negative outcomes that preceded rule-based behavioral reversal than for unexpected negative outcomes that did not precede behavioral reversal. These data suggest that the FRN reflects the prediction error and the associated RL-based adjustment of decision values, whereas the P3 reflects adjustment of behavior on the basis of explicit rules.
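The trial-wise prediction errors in such analyses are typically derived from a simple value-learning model fit to choices. The sketch below illustrates the general scheme under stated assumptions; the 80/20 contingencies follow the task description, but the learning rate, softmax temperature, trial counts, and reversal point are illustrative, not the authors' fitted parameters.

```python
import numpy as np

# A sketch of the kind of RL model used to derive trial-wise prediction
# errors in a probabilistic reversal task. Parameter values are assumed.

rng = np.random.default_rng(0)
alpha, beta = 0.3, 5.0             # learning rate, inverse temperature (assumed)
q = np.zeros(2)                    # decision values for the two stimuli
p_reward = np.array([0.8, 0.2])    # stimulus 0 is initially the rewarded one

for trial in range(200):
    if trial == 100:               # contingencies reverse during the task
        p_reward = p_reward[::-1]
    p_choose = np.exp(beta * q) / np.exp(beta * q).sum()  # softmax choice
    choice = rng.choice(2, p=p_choose)
    reward = float(rng.random() < p_reward[choice])
    delta = reward - q[choice]     # prediction error; on this account the
    q[choice] += alpha * delta     # FRN scales with negative delta
```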


2012 ◽  
Vol 71 (10) ◽  
pp. 864-872 ◽  
Author(s):  
Dan Foti ◽  
Roman Kotov ◽  
Evelyn Bromet ◽  
Greg Hajcak

PLoS Biology ◽  
2021 ◽  
Vol 19 (9) ◽  
pp. e3001119
Author(s):  
Joan Orpella ◽  
Ernest Mas-Herrero ◽  
Pablo Ripollés ◽  
Josep Marco-Pallarés ◽  
Ruth de Diego-Balaguer

Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental to the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have been investigated primarily with offline (post-familiarization) tests, which give limited insight into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online SL of simple syntactic structures, combined with computational modeling, to show that online SL follows reinforcement learning principles rooted in striatal function. Specifically, we demonstrate, in two different cohorts, that a temporal difference model, which relies on prediction errors, accounts for participants' online learning behavior. We then show that the trial-by-trial development of predictions through learning correlates strongly with activity in both the ventral and dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work therefore bridges the long-standing gap between language learning and reinforcement learning phenomena.
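The core of the temporal difference account can be stated very compactly: an initial element of a structure comes to predict the phrase-final element, with learning driven by the prediction error when that final element arrives. The sketch below is a minimal illustration under assumed parameters, not the authors' model.

```python
# A minimal sketch of TD learning of a predictive dependency (an initial
# element predicting a phrase-final element). The learning rate and number
# of exposures are illustrative assumptions.

alpha = 0.15
v = 0.0                        # strength of the initial -> final prediction
predictions = []
for exposure in range(40):
    predictions.append(v)      # anticipatory prediction on this exposure
    delta = 1.0 - v            # prediction error when the final element occurs
    v += alpha * delta         # TD/Rescorla-Wagner-style update

# Predictions build gradually across exposures; this trial-by-trial
# trajectory is the kind of quantity correlated with striatal activity.
print([round(p, 2) for p in predictions[::8]])
```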


2015 ◽  
Vol 113 (10) ◽  
pp. 3459-3461 ◽  
Author(s):  
Chong Chen

Our understanding of the neural basis of reinforcement learning and intelligence, two key factors contributing to human strivings, has progressed significantly in recent years. However, the overlap of these two lines of research, namely how intelligence affects neural responses during reinforcement learning, remains uninvestigated. A mini-review of three existing studies suggests that higher IQ (especially fluid IQ) may enhance the neural signal of the positive prediction error in the dorsolateral prefrontal cortex, dorsal anterior cingulate cortex, and striatum, several brain substrates of reinforcement learning and intelligence.


2021 ◽  
Vol 46 (6) ◽  
pp. E615-E627
Author(s):  
Miranda Christine Lutz ◽  
Rianne Kok ◽  
Ilse Verveer ◽  
Marcelo Malbec ◽  
Susanne Koot ◽  
...  

2021 ◽  
Vol 14 ◽  
Author(s):  
Elena Sildatke ◽  
Thomas Schüller ◽  
Theo O. J. Gründler ◽  
Markus Ullsperger ◽  
Veerle Visser-Vandewalle ◽  
...  

For successful goal-directed behavior, a performance monitoring system is essential. It detects behavioral errors and initiates behavioral adaptations to improve performance. Two electrophysiological potentials are known to follow errors in reaction time tasks: the error-related negativity (ERN), which is linked to error processing, and the error positivity (Pe), which is associated with subjective error awareness. Furthermore, the correct-related negativity (CRN) is linked to uncertainty about the response outcome. Here, we sought to identify the involvement of the nucleus accumbens (NAc) in these performance monitoring processes. To this end, we simultaneously recorded cortical activity (EEG) and local field potentials (LFPs) during a flanker task performed by four patients with severe opioid use disorder who had undergone electrode implantation in the NAc for deep brain stimulation. We observed significant accuracy-related modulations of the LFPs at the time of the ERN/CRN in two patients and at the time of the Pe in three patients. These modulations correlated with the ERN in 2/8, with the CRN in 5/8, and with the Pe in 6/8 recorded channels, respectively. Our results demonstrate the functional interrelation of striatal and cortical processes in performance monitoring, specifically in relation to error processing and subjective error awareness.
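The channel-wise correlations reported here are single-trial analyses relating depth and scalp signals in matched time windows. The sketch below is only a schematic of that analysis style; the random arrays, window bounds, and dimensions are hypothetical placeholders standing in for real recordings, not the authors' pipeline.

```python
import numpy as np

# Schematic single-trial EEG-LFP correlation in a fixed post-response
# window. All data and window choices below are hypothetical.

rng = np.random.default_rng(0)
t = np.arange(-200, 600)                  # peri-response time in ms (assumed 1 kHz)
eeg = rng.standard_normal((120, t.size))  # stand-in: frontocentral EEG, error trials
lfp = rng.standard_normal((120, t.size))  # stand-in: NAc LFP, same trials

win = (t >= 0) & (t <= 100)               # ERN-like window, 0-100 ms (assumed)
ern_amp = eeg[:, win].mean(axis=1)        # single-trial ERN amplitude
lfp_amp = lfp[:, win].mean(axis=1)        # single-trial NAc LFP amplitude

r = np.corrcoef(ern_amp, lfp_amp)[0, 1]   # trial-wise EEG-LFP correlation
print(f"r = {r:.2f}")
```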


2018 ◽  
Author(s):  
Tobias Morville ◽  
Karl Friston ◽  
Denis Burdakov ◽  
Hartwig R. Siebner ◽  
Oliver J. Hulme

Energy homeostasis depends on behavior to predictively regulate metabolic states within narrow bounds. Here, we review three theories of homeostatic control and ask how they provide insight into the circuitry underlying energy homeostasis. We offer two contributions. First, we detail how control theory and reinforcement learning are applied to homeostatic control. We show how these schemes rest on implausible assumptions: circular definitions, unprincipled drive functions, or the neglect of environmental volatility. We argue that active inference can elude these shortcomings while retaining the important features of each model. Second, we review the neural basis of energetic control. We focus on a subset of arcuate subpopulations that project directly to, and are thus in a privileged position to opponently modulate, dopaminergic cells as a function of energetic predictions over a spectrum of time horizons. We discuss how this circuit can be interpreted under these theories and how it can resolve paradoxes that have arisen. We propose that this circuit constitutes a homeostatic-reward interface that underwrites the conjoint optimization of physiological and behavioral homeostasis.
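One common way the reinforcement learning scheme is formalized for homeostatic control is to define reward as the reduction of a drive function over the internal state. The sketch below illustrates that formulation under stated assumptions; the quadratic drive and all values are illustrative, and such drive functions are precisely where the review argues the scheme is unprincipled.

```python
import numpy as np

# A sketch of drive-reduction reward for homeostatic control: reward is the
# decrease in drive (distance from a setpoint) produced by an action. The
# quadratic drive and all numbers are illustrative assumptions.

setpoint = np.array([80.0])             # desired internal state (e.g., glucose)

def drive(h):
    """Quadratic distance of the internal state from its setpoint."""
    return float(((setpoint - h) ** 2).sum())

h = np.array([60.0])                    # current, depleted internal state
actions = {"eat": 15.0, "wait": -2.0}   # each action's effect on the state

for name, effect in actions.items():
    reward = drive(h) - drive(h + effect)   # drive reduction as reward
    print(f"{name:>4}: reward = {reward:6.1f}")
```

Eating moves the state toward the setpoint and yields positive reward; waiting lets the state drift further away and yields negative reward, which is the circularity the review flags: the drive function must be stipulated before it can explain behavior.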

