Beyond Reward Prediction Errors: Human Striatum Updates Rule Values During Learning


2017 ◽  
Vol 28 (11) ◽  
pp. 3965-3975 ◽  
Author(s):  
Ian Ballard ◽  
Eric M Miller ◽  
Steven T Piantadosi ◽  
Noah D Goodman ◽  
Samuel M McClure

Abstract: Humans naturally group the world into coherent categories defined by membership rules. Rules can be learned implicitly by building stimulus-response associations using reinforcement learning (RL) or by using explicit reasoning. We tested whether the striatum, in which activation reliably scales with reward prediction error, would track prediction errors in a task that required explicit rule generation. Using functional magnetic resonance imaging during a categorization task, we show that striatal responses to feedback scale with a “surprise” signal derived from a Bayesian rule-learning model and are inconsistent with an RL prediction error. We also find that the striatum and caudal inferior frontal sulcus (cIFS) are involved in updating the likelihood of discriminative rules. We conclude that the striatum, in cooperation with the cIFS, is involved in updating the values assigned to categorization rules when people learn using explicit reasoning.
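The dissociation tested here, between an RL prediction error and a Bayesian "surprise" over candidate rules, can be made concrete in a few lines. The following Python toy is our own illustration under assumed quantities (a three-rule hypothesis space with made-up probabilities), not the authors' actual model:

```python
import numpy as np

# Toy contrast between the two feedback signals compared in the paper.
# The three-rule hypothesis space and all probabilities are assumptions
# for illustration, not the authors' model.

def rl_update(value, reward, alpha=0.1):
    """Rescorla-Wagner: prediction error delta = r - V, then V <- V + alpha * delta."""
    delta = reward - value
    return delta, value + alpha * delta

def rule_surprise(prior, p_correct_given_rule, feedback_correct):
    """Shannon surprise of feedback under the current belief over rules,
    plus the Bayesian update of the rule posterior ("rule values")."""
    evidence = p_correct_given_rule if feedback_correct else 1.0 - p_correct_given_rule
    p_outcome = float(np.dot(prior, evidence))  # marginal likelihood of the feedback
    surprise = -np.log(p_outcome)               # large when feedback is unexpected
    posterior = prior * evidence / p_outcome    # updated likelihood of each rule
    return surprise, posterior

prior = np.array([0.5, 0.3, 0.2])      # belief over three candidate rules
p_correct = np.array([0.9, 0.5, 0.1])  # P(correct feedback | rule) on this trial
surprise, posterior = rule_surprise(prior, p_correct, feedback_correct=False)
```

On the same feedback, the two signals can diverge: the RL error depends only on a running value estimate, while the Bayesian surprise depends on how strongly the current belief over rules predicted the outcome.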


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Maya G. Mosner ◽  
R. Edward McLaurin ◽  
Jessica L. Kinard ◽  
Shabnam Hakimi ◽  
Jacob Parelman ◽  
...  

Few studies have explored the neural mechanisms of reward learning in ASD, despite behavioral evidence of impaired predictive abilities in ASD. To investigate the neural correlates of reward prediction errors in ASD, 16 adults with ASD and 14 typically developing controls performed a prediction error task during fMRI scanning. Results revealed greater activation in the ASD group in the left paracingulate gyrus during signed prediction errors, and in the left insula and right frontal pole during thresholded unsigned prediction errors. Findings support atypical neural processing of reward prediction errors in ASD in frontostriatal regions critical for prediction coding and reward learning. Results provide a neural basis for impairments in reward learning that may contribute to traits common in ASD (e.g., intolerance of unpredictability).
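For readers unfamiliar with the two regressors contrasted here, a short sketch may help. This is our own illustration; the threshold and trial values are assumptions, not parameters from the study:

```python
import numpy as np

# Signed PEs carry direction (better or worse than expected); unsigned
# PEs carry only surprise magnitude; thresholding keeps only salient
# surprises. The threshold of 0.5 is an assumed value for illustration.

def prediction_errors(expected, received, threshold=0.5):
    signed = received - expected                  # direction and size of the error
    unsigned = np.abs(signed)                     # magnitude of surprise only
    thresholded = np.where(unsigned >= threshold, unsigned, 0.0)
    return signed, unsigned, thresholded

expected = np.array([1.0, 1.0, 0.0, 0.5])         # predicted reward per trial
received = np.array([1.0, 0.0, 1.0, 0.6])         # delivered reward per trial
signed, unsigned, salient = prediction_errors(expected, received)
```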


2014 ◽  
Vol 26 (3) ◽  
pp. 447-458 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Josep Marco-Pallarés

In decision-making, the relevance of the information yielded by outcomes varies across time and situations. It increases when previous predictions are inaccurate and in contexts with high environmental uncertainty. Previous fMRI studies have shown an important role of medial pFC in coding both reward prediction errors and the impact of this information on future decisions. However, it is unclear whether these two processes are dissociated in time or occur simultaneously, which would suggest that a common mechanism is engaged. In the present work, we studied how two electrophysiological responses associated with outcome processing (the feedback-related negativity ERP and frontocentral theta oscillatory activity) are modulated by the reward prediction error and the learning rate. Twenty-six participants performed two learning tasks differing in the degree of outcome predictability: a reversal learning task and a probabilistic learning task with multiple blocks of novel cue–outcome associations. We implemented a reinforcement learning model to obtain the single-trial reward prediction error and learning rate for each participant and task. Our results indicated that midfrontal theta activity and the feedback-related negativity increased linearly with the unsigned prediction error. In addition, variations in frontal theta oscillatory activity predicted the learning rate across tasks and participants. These results support the existence of a common brain mechanism for computing the unsigned prediction error and the learning rate.
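The model-derived quantities here, a single-trial prediction error and a learning rate that adapts to outcome predictability, are standard in the RL literature. A minimal sketch, assuming a Pearce-Hall-style update (the abstract does not specify the authors' exact formulation):

```python
import numpy as np

# Pearce-Hall-style learner: the learning rate alpha tracks the recent
# unsigned prediction error, so it stays high in unpredictable blocks
# (e.g., after a reversal) and decays as outcomes become predictable.

def pearce_hall(rewards, v0=0.5, alpha0=0.5, eta=0.3):
    v, alpha, trace = v0, alpha0, []
    for r in rewards:
        delta = r - v                                  # signed prediction error
        v += alpha * delta                             # value update
        alpha = eta * abs(delta) + (1 - eta) * alpha   # alpha follows |delta|
        trace.append({"delta": delta, "unsigned": abs(delta), "alpha": alpha})
    return trace

rng = np.random.default_rng(0)
trace = pearce_hall(rng.binomial(1, 0.8, size=40))     # fairly predictable block
```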


2018 ◽  
Author(s):  
Anthony I. Jang ◽  
Matthew R. Nassar ◽  
Daniel G. Dillon ◽  
Michael J. Frank

Abstract: The dopamine system is thought to provide a reward prediction error signal that facilitates reinforcement learning and reward-based choice in corticostriatal circuits. While it is believed that similar prediction error signals are also provided to temporal lobe memory systems, the impact of such signals on episodic memory encoding has not been fully characterized. Here we develop an incidental memory paradigm that allows us to 1) estimate the influence of reward prediction errors on the formation of episodic memories, 2) dissociate this influence from other factors such as surprise and uncertainty, 3) test the degree to which this influence depends on temporal correspondence between prediction error and memoranda presentation, and 4) determine the extent to which this influence is consolidation-dependent. We find that when choosing to gamble for potential rewards during a primary decision-making task, people encode incidental memoranda more strongly even though they are not aware that their memory will subsequently be probed. Moreover, this strengthened encoding scales with the reward prediction error, and not overall reward, experienced selectively at the time of memoranda presentation (and not before or after). Finally, this strengthened encoding is identifiable within a few minutes and is not substantially enhanced after twenty-four hours, indicating that it is not consolidation-dependent. These results suggest a computationally and temporally specific role for putative dopaminergic reward prediction error signaling in memory formation.
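The key claim, that encoding strength tracks the prediction error at the moment the memorandum appears rather than reward per se, can be written down in a line or two. A toy formalization with assumed parameters (base and gain are ours, not fitted values from the paper):

```python
import numpy as np

# Encoding strength scales with the signed RPE experienced when the
# memorandum is shown: a surprising win boosts encoding more than an
# expected win of the same size. All parameter values are assumptions.

def encoding_strength(p_win, won, base=0.4, gain=0.3):
    rpe = (1.0 if won else 0.0) - p_win   # signed RPE at memoranda presentation
    return float(np.clip(base + gain * rpe, 0.0, 1.0))

encoding_strength(p_win=0.2, won=True)    # surprising win -> stronger trace
encoding_strength(p_win=0.9, won=True)    # expected win   -> weaker trace, same reward
```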


F1000Research ◽  
2014 ◽  
Vol 3 ◽  
pp. 313
Author(s):  
Chao-gan Yan ◽  
Qingyang Li ◽  
Lei Gao

Sharing drafts of scientific manuscripts on preprint hosting services for early exposure and pre-publication feedback is a well-accepted practice in fields such as physics, astronomy, and mathematics. The field of neuroscience, however, has yet to adopt the preprint model. This reluctance may stem partly from the lack of a central preprint service for neuroscience. To address this issue, we announce the launch of Preprints of the R-fMRI Network (PRN), a community-funded preprint hosting service. PRN provides free submission and free hosting of manuscripts for resting-state functional magnetic resonance imaging (R-fMRI) and neuroscience-related studies. Submissions will be peer viewed and receive feedback from readers and a panel of invited consultants of the R-fMRI Network. All manuscripts and feedback will be freely available online with a citable permanent URL for open access. The goal of PRN is to supplement the “peer reviewed” journal publication system by communicating the latest research achievements throughout the world more rapidly. We hope PRN will help the field embrace the preprint model and thus further accelerate R-fMRI and neuroscience-related studies, eventually enhancing human mental health.


PLoS Biology ◽  
2020 ◽  
Vol 18 (10) ◽  
pp. e3000899
Author(s):  
Jan Grohn ◽  
Urs Schüffelgen ◽  
Franz-Xaver Neubert ◽  
Alessandro Bongioanni ◽  
Lennart Verhagen ◽  
...  

Animals learn from the past to make predictions. These predictions are adjusted after prediction errors, i.e., after surprising events. Most models of reward prediction errors learn the average expected amount of reward. Here, however, we demonstrate the existence of distinct mechanisms for detecting other types of surprising events. Six macaques learned to respond to visual stimuli to receive varying amounts of juice rewards. Most trials ended with the delivery of either 1 or 3 juice drops, so animals learned to expect 2 juice drops on average even though instances of precisely 2 drops were rare. To encourage learning, we also included sessions during which the ratio between 1 and 3 drops changed. Additionally, in all sessions, the stimulus sometimes appeared in an unexpected location. Thus, 3 types of surprising events could occur: reward amount surprise (i.e., a scalar reward prediction error), rare reward surprise, and visuospatial surprise. Importantly, we can dissociate scalar reward prediction errors (rewards that deviate from the average expected amount) from rare reward events (rewards that accord with the average reward expectation but rarely occur). We linked each type of surprise to a distinct pattern of neural activity using functional magnetic resonance imaging. Activity in the vicinity of the dopaminergic midbrain reflected only surprise about the amount of reward. Lateral prefrontal cortex had a more general role in detecting surprising events. Posterior lateral orbitofrontal cortex specifically detected rare reward events regardless of whether they followed average reward amount expectations, but only in learnable reward environments.
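The dissociation between a scalar reward prediction error and rare-reward surprise is easy to make concrete. A worked toy with assumed outcome probabilities (the true session statistics are the paper's, not reproduced here):

```python
import numpy as np

# Assumed distribution: mostly 1 or 3 drops, exactly 2 drops is rare,
# yet the expected amount works out to 2 drops on average.

p = {1: 0.45, 2: 0.10, 3: 0.45}
expected = sum(k * q for k, q in p.items())   # = 2.0 drops

def surprises(drops):
    scalar_rpe = drops - expected             # deviation from expected amount
    rarity = -np.log(p[drops])                # infrequency of this exact outcome
    return scalar_rpe, rarity

surprises(2)  # scalar RPE = 0.0 (matches expectation), rarity surprise high
surprises(3)  # scalar RPE = +1.0, rarity surprise low (common outcome)
```

Receiving exactly 2 drops thus produces no scalar prediction error at all, yet is a highly surprising event, which is what lets the two signals be mapped to different regions.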


2021 ◽  
Author(s):  
Rachit Dubey ◽  
Mark K Ho ◽  
Hermish Mehta ◽  
Tom Griffiths

Psychologists have long been fascinated with understanding the nature of Aha! moments, the moments when we transition from not knowing to suddenly realizing the solution to a problem. In this work, we present a theoretical framework that explains when and why we experience Aha! moments. Our theory posits that during problem-solving, in addition to solving the problem, people maintain a meta-cognitive model of their ability to solve it, along with a prediction of how long solving it will take. Aha! moments arise when we experience a positive error in this meta-cognitive prediction, i.e., when we solve a problem much faster than we expected to. We posit that this meta-cognitive error is analogous to a positive reward prediction error, thereby explaining why we feel so good after an Aha! moment. A large-scale pre-registered experiment on anagram solving supports this theory, showing that people's time prediction errors are strongly correlated with their ratings of the Aha! experience while solving anagrams. A second experiment provides further evidence for the theory by demonstrating a causal link between time prediction errors and the Aha! experience. These results highlight the importance of meta-cognitive prediction errors and deepen our understanding of human meta-reasoning.
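The proposed account reduces to a simple quantity. A minimal formalization in our own notation (the rectification at zero is our assumption about how a "positive error" would be operationalized):

```python
# Aha! signal as a positive meta-cognitive time prediction error:
# positive when a problem is solved much faster than predicted,
# analogous to a positive reward prediction error.

def aha_signal(predicted_solve_time_s, actual_solve_time_s):
    """Rectified time prediction error (assumed operationalization)."""
    return max(0.0, predicted_solve_time_s - actual_solve_time_s)

aha_signal(120.0, 15.0)   # solved far sooner than expected -> strong Aha!
aha_signal(30.0, 60.0)    # slower than expected -> 0.0, no Aha!
```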

