Positive reward prediction errors during decision-making strengthen memory encoding

2019 ◽  
Vol 3 (7) ◽  
pp. 719-732 ◽  
Author(s):  
Anthony I. Jang ◽  
Matthew R. Nassar ◽  
Daniel G. Dillon ◽  
Michael J. Frank

2021 ◽  
Author(s):  
Joseph Heffner ◽  
Jae-Young Son ◽  
Oriel FeldmanHall

People make decisions based on deviations from expected outcomes, known as prediction errors. Past work has focused on reward prediction errors, largely ignoring violations of expected emotional experiences—emotion prediction errors. We leverage a new method to measure real-time fluctuations in emotion as people decide to punish or forgive others. Across four studies (N=1,016), we reveal that emotion and reward prediction errors have distinguishable contributions to choice, such that emotion prediction errors exert the strongest impact during decision-making. We additionally find that a choice to punish or forgive can be decoded in less than a second from an evolving emotional response, suggesting emotions swiftly influence choice. Finally, individuals reporting significant levels of depression exhibit selective impairments in using emotion—but not reward—prediction errors. Evidence for emotion prediction errors potently guiding social behaviors challenges standard decision-making models that have focused solely on reward.


NeuroImage ◽  
2019 ◽  
Vol 193 ◽  
pp. 67-74 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Guillaume Sescousse ◽  
Roshan Cools ◽  
Josep Marco-Pallarés

2017 ◽  
Vol 47 (7) ◽  
pp. 1246-1258 ◽  
Author(s):  
T. U. Hauser ◽  
R. Iannaccone ◽  
R. J. Dolan ◽  
J. Ball ◽  
J. Hättenschwiler ◽  
...  

Background: Obsessive–compulsive disorder (OCD) has been linked to functional abnormalities in fronto-striatal networks as well as impairments in decision making and learning. Little is known about the neurocognitive mechanisms causing these decision-making and learning deficits in OCD, and how they relate to dysfunction in fronto-striatal networks.
Method: We investigated neural mechanisms of decision making in OCD patients, including early and late onset of disorder, in terms of reward prediction errors (RPEs) using functional magnetic resonance imaging. RPEs index a mismatch between expected and received outcomes, encoded by the dopaminergic system, and are known to drive learning and decision making in humans and animals. We used reinforcement learning models and RPE signals to infer the learning mechanisms and to compare behavioural parameters and neural RPE responses of the OCD patients with those of healthy matched controls.
Results: Patients with OCD showed significantly increased RPE responses in the anterior cingulate cortex (ACC) and the putamen compared with controls. OCD patients also had a significantly lower perseveration parameter than controls.
Conclusions: Enhanced RPE signals in the ACC and putamen extend previous findings of fronto-striatal deficits in OCD. These abnormally strong RPEs suggest a hyper-responsive learning network in patients with OCD, which might explain their indecisiveness and intolerance of uncertainty.
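The RPE logic underlying the reinforcement learning models described above can be sketched with a generic Rescorla–Wagner delta rule (a minimal illustration, not the study's fitted model; function and variable names are hypothetical):

```python
def simulate_rpe_learning(rewards, alpha=0.5):
    """Track an expected value V and reward prediction errors (RPEs)
    with a delta rule: V <- V + alpha * RPE."""
    V = 0.0
    rpes = []
    for r in rewards:
        rpe = r - V          # mismatch between received and expected outcome
        V += alpha * rpe     # learning is driven by the RPE
        rpes.append(rpe)
    return V, rpes

# A rewarded, rewarded, omitted, rewarded sequence: the omission
# produces a negative RPE that pulls the value estimate back down.
V, rpes = simulate_rpe_learning([1, 1, 0, 1], alpha=0.5)
```

The learning rate `alpha` is the kind of behavioural parameter such models fit per subject; an abnormally strong neural RPE response, as reported for the ACC and putamen, corresponds to an exaggerated version of the `rpe` term here.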


2018 ◽  
Author(s):  
Anthony I. Jang ◽  
Matthew R. Nassar ◽  
Daniel G. Dillon ◽  
Michael J. Frank

Abstract: The dopamine system is thought to provide a reward prediction error signal that facilitates reinforcement learning and reward-based choice in corticostriatal circuits. While it is believed that similar prediction error signals are also provided to temporal lobe memory systems, the impact of such signals on episodic memory encoding has not been fully characterized. Here we develop an incidental memory paradigm that allows us to 1) estimate the influence of reward prediction errors on the formation of episodic memories, 2) dissociate this influence from other factors such as surprise and uncertainty, 3) test the degree to which this influence depends on temporal correspondence between prediction error and memoranda presentation, and 4) determine the extent to which this influence is consolidation-dependent. We find that when choosing to gamble for potential rewards during a primary decision-making task, people encode incidental memoranda more strongly even though they are not aware that their memory will be subsequently probed. Moreover, this strengthened encoding scales with the reward prediction error, and not overall reward, experienced selectively at the time of memoranda presentation (and not before or after). Finally, this strengthened encoding is identifiable within a few minutes and is not substantially enhanced after twenty-four hours, indicating that it is not consolidation-dependent. These results suggest a computationally and temporally specific role for putative dopaminergic reward prediction error signaling in memory formation.


2017 ◽  
Author(s):  
Jeroen P.H. Verharen ◽  
Johannes W. de Jong ◽  
Theresia J.M. Roelofs ◽  
Christiaan F.M. Huffels ◽  
Ruud van Zessen ◽  
...  

Abstract: Hyperdopaminergic states in mental disorders are associated with disruptive deficits in decision-making. However, the precise contribution of topographically distinct mesencephalic dopamine pathways to decision-making processes remains elusive. Here we show, using a multidisciplinary approach, how hyperactivity of ascending projections from the ventral tegmental area (VTA) contributes to faulty decision-making in rats. Activation of the VTA-nucleus accumbens pathway leads to insensitivity to loss and punishment due to impaired processing of negative reward prediction errors. In contrast, activation of the VTA-prefrontal cortex pathway promotes risky decision-making without affecting the ability to choose the economically most beneficial option. Together, these findings show how malfunction of ascending VTA projections affects value-based decision-making, providing a mechanistic understanding of the reckless behaviors seen in substance abuse, mania, and after dopamine replacement therapy in Parkinson’s disease.


2018 ◽  
Author(s):  
Joanne C. Van Slooten ◽  
Sara Jahfari ◽  
Tomas Knapen ◽  
Jan Theeuwes

Abstract: Pupil responses have been used to track cognitive processes during decision-making. Studies have shown that in these cases the pupil reflects the joint activation of many cortical and subcortical brain regions, also those traditionally implicated in value-based learning. However, how the pupil tracks value-based decisions and reinforcement learning is unknown. We combined a reinforcement learning task with a computational model to study pupil responses during value-based decisions, and decision evaluations. We found that the pupil closely tracks reinforcement learning both across trials and participants. Prior to choice, the pupil dilated as a function of trial-by-trial fluctuations in value beliefs. After feedback, early dilation scaled with value uncertainty, whereas later constriction scaled with reward prediction errors. Our computational approach systematically implicates the pupil in value-based decisions, and the subsequent processing of violated value beliefs. These dissociable influences provide an exciting possibility to non-invasively study ongoing reinforcement learning in the pupil.


2009 ◽  
Vol 21 (7) ◽  
pp. 1332-1345 ◽  
Author(s):  
Thorsten Kahnt ◽  
Soyoung Q Park ◽  
Michael X Cohen ◽  
Anne Beck ◽  
Andreas Heinz ◽  
...  

It has been suggested that the target areas of dopaminergic midbrain neurons, the dorsal (DS) and ventral striatum (VS), are differentially involved in reinforcement learning, notably as actor and critic. Whereas the critic learns to predict rewards, the actor maintains action values to guide future decisions. The different midbrain connections to the DS and the VS seem to play a critical role in this functional distinction. Here, subjects performed a dynamic, reward-based decision-making task during fMRI acquisition. A computational model of reinforcement learning was used to estimate the different effects of positive and negative reinforcements on future decisions for each subject individually. We found that activity in both the DS and the VS correlated with reward prediction errors. Using functional connectivity, we show that the DS and the VS are differentially connected to different midbrain regions (possibly corresponding to the substantia nigra [SN] and the ventral tegmental area [VTA], respectively). However, only functional connectivity between the DS and the putative SN predicted the impact of different reinforcement types on future behavior. These results suggest that connections between the putative SN and the DS are critical for modulating action values in the DS according to both positive and negative reinforcements to guide future decision making.
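The actor–critic division of labor mentioned above can be illustrated with a minimal tabular sketch on a two-armed bandit (a simplified textbook-style update, not the study's fitted model; all names are hypothetical). The critic maintains a reward prediction and emits the RPE; the actor uses that same RPE to adjust action preferences:

```python
import math
import random

def softmax(prefs):
    exps = [math.exp(p) for p in prefs]
    s = sum(exps)
    return [e / s for e in exps]

def run_actor_critic(probs, n_trials=2000, alpha_critic=0.1,
                     alpha_actor=0.1, seed=1):
    """Tabular actor-critic on a bandit with per-arm reward probabilities."""
    rng = random.Random(seed)
    V = 0.0                      # critic: predicted reward
    prefs = [0.0] * len(probs)   # actor: action preferences
    for _ in range(n_trials):
        pi = softmax(prefs)
        a = rng.choices(range(len(probs)), weights=pi)[0]
        r = 1.0 if rng.random() < probs[a] else 0.0
        rpe = r - V                    # critic's prediction error
        V += alpha_critic * rpe        # critic update (learns to predict reward)
        prefs[a] += alpha_actor * rpe  # actor update (maintains action values)
    return prefs

# The actor should come to prefer the 80%-rewarded arm over the 20% arm.
prefs = run_actor_critic([0.8, 0.2])
```

The single shared `rpe` term is the point of the architecture: one teaching signal, broadcast to two learners with different jobs, which is why differential midbrain connectivity to the VS (critic) and DS (actor) matters.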


2015 ◽  
Vol 114 (2) ◽  
pp. 781-792 ◽  
Author(s):  
Max-Philipp Stenner ◽  
Robb B. Rutledge ◽  
Tino Zaehle ◽  
Friedhelm C. Schmitt ◽  
Klaus Kopitzki ◽  
...  

Functional magnetic resonance imaging (fMRI), cyclic voltammetry, and single-unit electrophysiology studies suggest that signals measured in the nucleus accumbens (Nacc) during value-based decision making represent reward prediction errors (RPEs), the difference between actual and predicted rewards. Here, we studied the precise temporal and spectral pattern of reward-related signals in the human Nacc. We recorded local field potentials (LFPs) from the Nacc of six epilepsy patients during an economic decision-making task. On each trial, patients decided whether to accept or reject a gamble with equal probabilities of a monetary gain or loss. The behavior of four patients was consistent with choices being guided by value expectations. Expected value signals before outcome onset were observed in three of those patients, at varying latencies and with nonoverlapping spectral patterns. Signals after outcome onset were correlated with RPE regressors in all subjects. However, further analysis revealed that these signals were better explained as outcome valence rather than RPE signals, with gamble gains and losses differing in the power of beta oscillations and in evoked response amplitudes. Taken together, our results do not support the idea that postsynaptic potentials in the Nacc represent an RPE that unifies outcome magnitude and prior value expectation. We discuss the generalizability of our findings to healthy individuals and the relation of our results to measurements of RPE signals obtained from the Nacc with other methods.
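The key analytic move above, asking whether a signal is better explained by outcome valence than by a full RPE, amounts to comparing the fit of two competing regressors. A sketch on simulated data (hypothetical names; not the authors' LFP pipeline), where the signal is constructed to follow valence alone:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
expectation = rng.uniform(0.0, 1.0, size=n)   # prior value expectation
gain = rng.choice([0.0, 1.0], size=n)         # gamble outcome valence
rpe = gain - expectation                      # full RPE regressor

# Simulated neural signal driven by valence alone
signal = 1.2 * gain + rng.normal(0, 0.3, size=n)

def rss(X, y):
    """Residual sum of squares of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

ones = np.ones(n)
rss_rpe = rss(np.column_stack([ones, rpe]), signal)
rss_val = rss(np.column_stack([ones, gain]), signal)
# A valence-only signal fits the valence model better than the RPE model
```

Because valence and RPE regressors are strongly correlated (every gain implies a positive RPE under symmetric gambles), a naive correlation with the RPE regressor can look significant even when expectation contributes nothing, which is why the follow-up model comparison in the study matters.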


2017 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Guillaume Sescousse ◽  
Roshan Cools ◽  
Josep Marco-Pallarés

Abstract: Most studies that have investigated the brain mechanisms underlying learning have focused on the ability to learn simple stimulus-response associations. However, in everyday life, outcomes are often obtained through complex behavioral patterns involving a series of actions. In such scenarios, parallel learning systems are important to reduce the complexity of the learning problem, as proposed in the framework of hierarchical reinforcement learning (HRL). One of the key features of HRL is the computation of pseudo-reward prediction errors (PRPEs), which allow the reinforcement of actions that led to a sub-goal before the final goal itself is achieved. Here we wanted to test the hypothesis that, despite not carrying any rewarding value per se, pseudo-rewards might generate a bias in choice behavior when reward contingencies are not well-known or uncertain. Second, we also hypothesized that this bias might be related to the strength of PRPE striatal representations. In order to test these ideas, we developed a novel decision-making paradigm to assess reward prediction errors (RPEs) and PRPEs in two studies (fMRI study: n = 20; behavioural study: n = 19). Our results show that overall participants developed a preference for the most pseudo-rewarding option throughout the task, even though it did not lead to more monetary rewards. fMRI analyses revealed that this preference was predicted by individual differences in the relative striatal sensitivity to PRPEs vs RPEs. Together, our results indicate that pseudo-rewards generate learning signals in the striatum and subsequently bias choice behavior despite their lack of association with actual reward.
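How a pseudo-reward can bias choice without ever paying out can be shown with a toy value-learning loop (a deliberately simplified sketch that applies the PRPE and RPE updates within one trial; names and the task structure are hypothetical, not the study's paradigm):

```python
def learn_with_pseudo_rewards(trials, alpha=0.3, pseudo_reward=1.0):
    """Toy HRL sketch: reaching a sub-goal emits a pseudo-reward, producing
    a pseudo-reward prediction error (PRPE) that reinforces the chosen
    option even when no monetary reward follows.
    `trials` is a list of (option, reached_subgoal, final_reward) tuples."""
    Q = {}
    for option, subgoal, reward in trials:
        q = Q.get(option, 0.0)
        prpe = (pseudo_reward if subgoal else 0.0) - q  # PRPE at sub-goal
        q += alpha * prpe
        rpe = reward - q                                # standard RPE at outcome
        q += alpha * rpe
        Q[option] = q
    return Q

# Option "A" reaches the sub-goal but never pays out; "B" does neither.
# Despite equal (zero) monetary reward, "A" ends up with the higher value.
Q = learn_with_pseudo_rewards([("A", True, 0.0), ("B", False, 0.0)] * 20)
```

The residual positive value on the pseudo-rewarding option, despite every monetary RPE being negative, is the bias the behavioural results describe; the relative weighting of the `prpe` and `rpe` terms plays the role of the striatal PRPE-vs-RPE sensitivity measured in the fMRI analysis.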

