Beyond Reward Prediction Errors: Human Striatum Updates Rule Values During Learning


2017 ◽  
Vol 28 (11) ◽  
pp. 3965-3975 ◽  
Author(s):  
Ian Ballard ◽  
Eric M Miller ◽  
Steven T Piantadosi ◽  
Noah D Goodman ◽  
Samuel M McClure

Abstract: Humans naturally group the world into coherent categories defined by membership rules. Rules can be learned implicitly by building stimulus-response associations using reinforcement learning (RL) or by using explicit reasoning. We tested whether the striatum, in which activation reliably scales with reward prediction error, would track prediction errors in a task that required explicit rule generation. Using functional magnetic resonance imaging during a categorization task, we show that striatal responses to feedback scale with a “surprise” signal derived from a Bayesian rule-learning model and are inconsistent with an RL prediction error. We also find that the striatum and caudal inferior frontal sulcus (cIFS) are involved in updating the likelihood of discriminative rules. We conclude that the striatum, in cooperation with the cIFS, is involved in updating the values assigned to categorization rules when people learn using explicit reasoning.
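The dissociation tested here, between an RL prediction error and a Bayesian "surprise" over candidate rules, can be made concrete in a few lines. The following Python toy is our own illustration under assumed quantities (a three-rule hypothesis space with made-up probabilities), not the authors' actual model:

```python
import numpy as np

# Toy contrast between the two feedback signals compared in the paper.
# The three-rule hypothesis space and all probabilities are assumptions
# for illustration, not the authors' model.

def rl_update(value, reward, alpha=0.1):
    """Rescorla-Wagner: prediction error delta = r - V, then V <- V + alpha * delta."""
    delta = reward - value
    return delta, value + alpha * delta

def rule_surprise(prior, p_correct_given_rule, feedback_correct):
    """Shannon surprise of feedback under the current belief over rules,
    plus the Bayesian update of the rule posterior ("rule values")."""
    evidence = p_correct_given_rule if feedback_correct else 1.0 - p_correct_given_rule
    p_outcome = float(np.dot(prior, evidence))  # marginal likelihood of the feedback
    surprise = -np.log(p_outcome)               # large when feedback is unexpected
    posterior = prior * evidence / p_outcome    # updated likelihood of each rule
    return surprise, posterior

prior = np.array([0.5, 0.3, 0.2])      # belief over three candidate rules
p_correct = np.array([0.9, 0.5, 0.1])  # P(correct feedback | rule) on this trial
surprise, posterior = rule_surprise(prior, p_correct, feedback_correct=False)
```

On the same feedback, the two signals can diverge: the RL error depends only on a running value estimate, while the Bayesian surprise depends on how strongly the current belief over rules predicted the outcome.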


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Maya G. Mosner ◽  
R. Edward McLaurin ◽  
Jessica L. Kinard ◽  
Shabnam Hakimi ◽  
Jacob Parelman ◽  
...  

Few studies have explored the neural mechanisms of reward learning in ASD, despite behavioral evidence of impaired predictive abilities in ASD. To investigate the neural correlates of reward prediction errors in ASD, 16 adults with ASD and 14 typically developing controls performed a prediction error task during fMRI scanning. Results revealed greater activation in the ASD group in the left paracingulate gyrus during signed prediction errors, and in the left insula and right frontal pole during thresholded unsigned prediction errors. Findings support atypical neural processing of reward prediction errors in ASD in frontostriatal regions critical for prediction coding and reward learning. Results provide a neural basis for impairments in reward learning that may contribute to traits common in ASD (e.g., intolerance of unpredictability).
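For readers unfamiliar with the two regressors contrasted here, a short sketch may help. This is our own illustration; the threshold and trial values are assumptions, not parameters from the study:

```python
import numpy as np

# Signed PEs carry direction (better or worse than expected); unsigned
# PEs carry only surprise magnitude; thresholding keeps only salient
# surprises. The threshold of 0.5 is an assumed value for illustration.

def prediction_errors(expected, received, threshold=0.5):
    signed = received - expected                  # direction and size of the error
    unsigned = np.abs(signed)                     # magnitude of surprise only
    thresholded = np.where(unsigned >= threshold, unsigned, 0.0)
    return signed, unsigned, thresholded

expected = np.array([1.0, 1.0, 0.0, 0.5])         # predicted reward per trial
received = np.array([1.0, 0.0, 1.0, 0.6])         # delivered reward per trial
signed, unsigned, salient = prediction_errors(expected, received)
```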


2014 ◽  
Vol 26 (3) ◽  
pp. 447-458 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Josep Marco-Pallarés

In decision-making, the relevance of the information yielded by outcomes varies across time and situations. It increases when previous predictions are inaccurate and in contexts with high environmental uncertainty. Previous fMRI studies have shown an important role of medial pFC in coding both reward prediction errors and the impact of this information on future decisions. However, it is unclear whether these two processes are dissociated in time or occur simultaneously, which would suggest that a common mechanism is engaged. In the present work, we studied how two electrophysiological responses associated with outcome processing (the feedback-related negativity ERP and frontocentral theta oscillatory activity) are modulated by the reward prediction error and the learning rate. Twenty-six participants performed two learning tasks differing in the degree of outcome predictability: a reversal learning task and a probabilistic learning task with multiple blocks of novel cue–outcome associations. We implemented a reinforcement learning model to obtain the single-trial reward prediction error and learning rate for each participant and task. Our results indicated that midfrontal theta activity and the feedback-related negativity increased linearly with the unsigned prediction error. In addition, variations in frontal theta oscillatory activity predicted the learning rate across tasks and participants. These results support the existence of a common brain mechanism for computing the unsigned prediction error and the learning rate.
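The model-derived quantities here, a single-trial prediction error and a learning rate that adapts to outcome predictability, are standard in the RL literature. A minimal sketch, assuming a Pearce-Hall-style update (the abstract does not specify the authors' exact formulation):

```python
import numpy as np

# Pearce-Hall-style learner: the learning rate alpha tracks the recent
# unsigned prediction error, so it stays high in unpredictable blocks
# (e.g., after a reversal) and decays as outcomes become predictable.

def pearce_hall(rewards, v0=0.5, alpha0=0.5, eta=0.3):
    v, alpha, trace = v0, alpha0, []
    for r in rewards:
        delta = r - v                                  # signed prediction error
        v += alpha * delta                             # value update
        alpha = eta * abs(delta) + (1 - eta) * alpha   # alpha follows |delta|
        trace.append({"delta": delta, "unsigned": abs(delta), "alpha": alpha})
    return trace

rng = np.random.default_rng(0)
trace = pearce_hall(rng.binomial(1, 0.8, size=40))     # fairly predictable block
```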


2018 ◽  
Author(s):  
Anthony I. Jang ◽  
Matthew R. Nassar ◽  
Daniel G. Dillon ◽  
Michael J. Frank

Abstract: The dopamine system is thought to provide a reward prediction error signal that facilitates reinforcement learning and reward-based choice in corticostriatal circuits. While it is believed that similar prediction error signals are also provided to temporal lobe memory systems, the impact of such signals on episodic memory encoding has not been fully characterized. Here we develop an incidental memory paradigm that allows us to 1) estimate the influence of reward prediction errors on the formation of episodic memories, 2) dissociate this influence from other factors such as surprise and uncertainty, 3) test the degree to which this influence depends on temporal correspondence between prediction error and memoranda presentation, and 4) determine the extent to which this influence is consolidation-dependent. We find that when choosing to gamble for potential rewards during a primary decision-making task, people encode incidental memoranda more strongly even though they are not aware that their memory will subsequently be probed. Moreover, this strengthened encoding scales with the reward prediction error, and not overall reward, experienced selectively at the time of memoranda presentation (and not before or after). Finally, this strengthened encoding is identifiable within a few minutes and is not substantially enhanced after twenty-four hours, indicating that it is not consolidation-dependent. These results suggest a computationally and temporally specific role for putative dopaminergic reward prediction error signaling in memory formation.
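The key claim, that encoding strength tracks the prediction error at the moment the memorandum appears rather than reward per se, can be written down in a line or two. A toy formalization with assumed parameters (base and gain are ours, not fitted values from the paper):

```python
import numpy as np

# Encoding strength scales with the signed RPE experienced when the
# memorandum is shown: a surprising win boosts encoding more than an
# expected win of the same size. All parameter values are assumptions.

def encoding_strength(p_win, won, base=0.4, gain=0.3):
    rpe = (1.0 if won else 0.0) - p_win   # signed RPE at memoranda presentation
    return float(np.clip(base + gain * rpe, 0.0, 1.0))

encoding_strength(p_win=0.2, won=True)    # surprising win -> stronger trace
encoding_strength(p_win=0.9, won=True)    # expected win   -> weaker trace, same reward
```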


F1000Research ◽  
2014 ◽  
Vol 3 ◽  
pp. 313
Author(s):  
Chao-gan Yan ◽  
Qingyang Li ◽  
Lei Gao

Sharing drafts of scientific manuscripts on preprint hosting services for early exposure and pre-publication feedback is a well-accepted practice in fields such as physics, astronomy, and mathematics. The field of neuroscience, however, has yet to adopt the preprint model. This reluctance may stem partly from the lack of a central preprint service for neuroscience. To address this issue, we announce the launch of Preprints of the R-fMRI Network (PRN), a community-funded preprint hosting service. PRN provides free submission and free hosting of manuscripts for resting-state functional magnetic resonance imaging (R-fMRI) and neuroscience-related studies. Submissions will be peer viewed and receive feedback from readers and a panel of invited consultants of the R-fMRI Network. All manuscripts and feedback will be freely available online with a citable permanent URL for open access. The goal of PRN is to supplement the “peer reviewed” journal publication system by communicating the latest research achievements throughout the world more rapidly. We hope PRN will help the field embrace the preprint model and thus further accelerate R-fMRI and neuroscience-related studies, eventually enhancing human mental health.


PLoS Biology ◽  
2020 ◽  
Vol 18 (10) ◽  
pp. e3000899
Author(s):  
Jan Grohn ◽  
Urs Schüffelgen ◽  
Franz-Xaver Neubert ◽  
Alessandro Bongioanni ◽  
Lennart Verhagen ◽  
...  

Animals learn from the past to make predictions. These predictions are adjusted after prediction errors, i.e., after surprising events. Most models of reward prediction errors learn the average expected amount of reward. Here, however, we demonstrate the existence of distinct mechanisms for detecting other types of surprising events. Six macaques learned to respond to visual stimuli to receive varying amounts of juice rewards. Most trials ended with the delivery of either 1 or 3 juice drops, so animals learned to expect 2 juice drops on average even though instances of precisely 2 drops were rare. To encourage learning, we also included sessions during which the ratio between 1 and 3 drops changed. Additionally, in all sessions, the stimulus sometimes appeared in an unexpected location. Thus, 3 types of surprising events could occur: reward amount surprise (i.e., a scalar reward prediction error), rare reward surprise, and visuospatial surprise. Importantly, we can dissociate scalar reward prediction errors (rewards that deviate from the average expected amount) from rare reward events (rewards that accord with the average reward expectation but rarely occur). We linked each type of surprise to a distinct pattern of neural activity using functional magnetic resonance imaging. Activity in the vicinity of the dopaminergic midbrain reflected only surprise about the amount of reward. Lateral prefrontal cortex had a more general role in detecting surprising events. Posterior lateral orbitofrontal cortex specifically detected rare reward events regardless of whether they followed average reward amount expectations, but only in learnable reward environments.
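The dissociation between a scalar reward prediction error and rare-reward surprise is easy to make concrete. A worked toy with assumed outcome probabilities (the true session statistics are the paper's, not reproduced here):

```python
import numpy as np

# Assumed distribution: mostly 1 or 3 drops, exactly 2 drops is rare,
# yet the expected amount works out to 2 drops on average.

p = {1: 0.45, 2: 0.10, 3: 0.45}
expected = sum(k * q for k, q in p.items())   # = 2.0 drops

def surprises(drops):
    scalar_rpe = drops - expected             # deviation from expected amount
    rarity = -np.log(p[drops])                # infrequency of this exact outcome
    return scalar_rpe, rarity

surprises(2)  # scalar RPE = 0.0 (matches expectation), rarity surprise high
surprises(3)  # scalar RPE = +1.0, rarity surprise low (common outcome)
```

Receiving exactly 2 drops thus produces no scalar prediction error at all, yet is a highly surprising event, which is what lets the two signals be mapped to different regions.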


2021 ◽  
Author(s):  
Rachit Dubey ◽  
Mark K Ho ◽  
Hermish Mehta ◽  
Tom Griffiths

Psychologists have long been fascinated with understanding the nature of Aha! moments, the moments when we transition from not knowing to suddenly realizing the solution to a problem. In this work, we present a theoretical framework that explains when and why we experience Aha! moments. Our theory posits that during problem-solving, in addition to solving the problem, people maintain a meta-cognitive model of their ability to solve it, along with a prediction of how long solving it will take. Aha! moments arise when we experience a positive error in this meta-cognitive prediction, i.e., when we solve a problem much faster than we expected to. We posit that this meta-cognitive error is analogous to a positive reward prediction error, thereby explaining why we feel so good after an Aha! moment. A large-scale pre-registered experiment on anagram solving supports this theory, showing that people's time prediction errors are strongly correlated with their ratings of the Aha! experience while solving anagrams. A second experiment provides further evidence for the theory by demonstrating a causal link between time prediction errors and the Aha! experience. These results highlight the importance of meta-cognitive prediction errors and deepen our understanding of human meta-reasoning.
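The proposed account reduces to a simple quantity. A minimal formalization in our own notation (the rectification at zero is our assumption about how a "positive error" would be operationalized):

```python
# Aha! signal as a positive meta-cognitive time prediction error:
# positive when a problem is solved much faster than predicted,
# analogous to a positive reward prediction error.

def aha_signal(predicted_solve_time_s, actual_solve_time_s):
    """Rectified time prediction error (assumed operationalization)."""
    return max(0.0, predicted_solve_time_s - actual_solve_time_s)

aha_signal(120.0, 15.0)   # solved far sooner than expected -> strong Aha!
aha_signal(30.0, 60.0)    # slower than expected -> 0.0, no Aha!
```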

