The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty

Neuron ◽  
2018 ◽  
Vol 98 (3) ◽  
pp. 616-629.e6 ◽  
Author(s):  
Clara Kwon Starkweather ◽  
Samuel J. Gershman ◽  
Naoshige Uchida
2019 ◽  
Vol 31 (1) ◽  
pp. 8-23 ◽  
Author(s):  
José J. F. Ribas-Fernandes ◽  
Danesh Shahnazian ◽  
Clay B. Holroyd ◽  
Matthew M. Botvinick

A longstanding view of the organization of human and animal behavior holds that behavior is hierarchically organized—in other words, directed toward achieving superordinate goals through the achievement of subordinate goals or subgoals. However, most research in neuroscience has focused on tasks without hierarchical structure. In past work, we have shown that negative reward prediction error (RPE) signals in medial prefrontal cortex (mPFC) can be linked not only to superordinate goals but also to subgoals. This suggests that mPFC tracks impediments in the progression toward subgoals. Using fMRI of human participants engaged in a hierarchical navigation task, here we found that mPFC also processes positive prediction errors at the level of subgoals, indicating that this brain region is sensitive to advances in subgoal completion. However, when subgoal RPEs were elicited alongside goal-related RPEs, mPFC responses reflected only the goal-related RPEs. These findings suggest that information from different levels of hierarchy is processed selectively, depending on the task context.
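The subgoal/goal dissociation described above can be sketched as two one-step temporal-difference errors, one per level of the hierarchy, in the style of hierarchical reinforcement learning. The function name and all numeric values below are hypothetical illustrations, not the study's task parameters.

```python
# Hedged sketch: hierarchical reward prediction errors as two one-step TD
# errors, one for the top-level goal and one "pseudo-reward" error for the
# subgoal. All names and values here are hypothetical illustrations.
def td_error(reward, v_next, v_current, gamma=1.0):
    """One-step TD error: delta = reward + gamma * V(next) - V(current)."""
    return reward + gamma * v_next - v_current

# Imagine a task event that suddenly brings the subgoal closer (its value
# jumps from 0.2 to 0.6) while the distance to the overall goal is unchanged.
subgoal_pe = td_error(reward=0.0, v_next=0.6, v_current=0.2)  # positive subgoal PE
goal_pe = td_error(reward=0.0, v_next=0.4, v_current=0.4)     # zero goal-level PE
```

On this toy account, an event can generate a prediction error at the subgoal level while leaving the goal-level error at zero, which is the kind of dissociation the study probes in mPFC.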


2020 ◽  
Vol 22 (8) ◽  
pp. 849-859
Author(s):  
Julian Macoveanu ◽  
Hanne L. Kjærstad ◽  
Henry W. Chase ◽  
Sophia Frangou ◽  
Gitte M. Knudsen ◽  
...  

Author(s):  
Benjamin Voloh ◽  
Mariann Oemisch ◽  
Thilo Womelsdorf

Abstract
The prefrontal cortex and striatum form a recurrent network whose spiking activity encodes multiple types of learning-relevant information. This spike-encoded information is evident in average firing rates, but finer temporal coding might allow multiplexing and enhanced readout across the connected network. We tested this hypothesis in the fronto-striatal network of nonhuman primates during reversal learning of feature values. We found that neurons encoding current choice outcomes, outcome prediction errors, and outcome history in their firing rates also carried significant information in their phase-of-firing at a 10-25 Hz beta frequency at which they synchronized across lateral prefrontal cortex, anterior cingulate cortex and striatum. The phase-of-firing code exceeded information that could be obtained from firing rates alone, was strong for inter-areal connections, and multiplexed information at three different phases of the beta cycle that were offset from the preferred spiking phase of neurons. Taken together, these findings document the multiplexing of three different types of information in the phase-of-firing at an interareally shared beta oscillation frequency during goal-directed behavior.

Highlights
Lateral prefrontal cortex, anterior cingulate cortex and striatum show phase-of-firing encoding for outcome, outcome history and reward prediction errors.
Neurons with a phase-of-firing code synchronize long-range at 10-25 Hz.
Spike phases encoding reward prediction errors deviate from preferred synchronization phases.
Anterior cingulate cortex neurons show the strongest long-range effects.
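As a toy illustration of the phase-of-firing idea (not the authors' analysis pipeline), one can estimate the instantaneous phase of a band-limited signal from its analytic signal and tag each spike with the oscillation phase at which it fired. The sampling rate, the idealized 20 Hz "beta" sinusoid, and the spike times below are all made-up example values; real data would first be band-pass filtered.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via FFT (same construction as scipy.signal.hilbert)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:n // 2] = 2.0        # double the positive frequencies
    h[n // 2] = 1.0          # Nyquist bin (n is even here)
    return np.fft.ifft(spectrum * h)

fs = 2000.0                             # sampling rate (Hz), illustrative
t = np.arange(0, 1.0, 1.0 / fs)         # 1 s of time, 2000 samples
lfp = np.sin(2 * np.pi * 20 * t)        # idealized 20 Hz "beta" oscillation

phase = np.angle(analytic_signal(lfp))  # instantaneous phase in [-pi, pi]

spike_times = np.array([0.0125, 0.5, 0.5125])        # spike times (s), made up
spike_phases = phase[(spike_times * fs).astype(int)]  # phase-of-firing per spike
```

A phase-of-firing analysis would then ask whether `spike_phases` carries information about task variables (outcome, outcome history, RPE) beyond what the firing rate alone provides.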


2017 ◽  
Author(s):  
Jeroen P.H. Verharen ◽  
Johannes W. de Jong ◽  
Theresia J.M. Roelofs ◽  
Christiaan F.M. Huffels ◽  
Ruud van Zessen ◽  
...  

Abstract
Hyperdopaminergic states in mental disorders are associated with disruptive deficits in decision-making. However, the precise contribution of topographically distinct mesencephalic dopamine pathways to decision-making processes remains elusive. Here we show, using a multidisciplinary approach, how hyperactivity of ascending projections from the ventral tegmental area (VTA) contributes to faulty decision-making in rats. Activation of the VTA-nucleus accumbens pathway leads to insensitivity to loss and punishment due to impaired processing of negative reward prediction errors. In contrast, activation of the VTA-prefrontal cortex pathway promotes risky decision-making without affecting the ability to choose the economically most beneficial option. Together, these findings show how malfunction of ascending VTA projections affects value-based decision-making, providing a mechanistic understanding of the reckless behaviors seen in substance abuse, mania, and after dopamine replacement therapy in Parkinson’s disease.


2010 ◽  
Vol 30 (22) ◽  
pp. 7749-7753 ◽  
Author(s):  
S. Q. Park ◽  
T. Kahnt ◽  
A. Beck ◽  
M. X. Cohen ◽  
R. J. Dolan ◽  
...  

2021 ◽  
Author(s):  
Patrick Wiegel ◽  
Meaghan Elizabeth Spedden ◽  
Christina Ramsenthaler ◽  
Mikkel Malling Beck ◽  
Jesper Lundbye-Jensen

Abstract
The history of our actions and their outcomes carries important information that can inform choices and efficiently guide future behaviour. While unsuccessful (S-) outcomes are expected to lead to more explorative motor states and increased behavioural variability, successful (S+) outcomes lead to reinforcement of the previous action and thus exploitation. Here, we show that during reinforcement motor learning, humans attribute different values to previous actions when they experience S- vs. S+ outcomes. Behavioural variability after S- outcomes is influenced more by the previous outcomes than what is observed after S+ outcomes. Using electroencephalography, we show that neural oscillations of the prefrontal cortex encode the level of reinforcement (high beta frequencies) and reflect the detection of reward prediction errors (theta frequencies). The results suggest that S+ experiences ‘overwrite’ previous motor states to a greater extent than S- experiences and that modulations in neural oscillations in the prefrontal cortex play a potential role in encoding the (changes in) movement variability state during reinforcement motor learning.


2017 ◽  
Vol 29 (4) ◽  
pp. 718-727 ◽  
Author(s):  
Sara Garofalo ◽  
Christopher Timmermann ◽  
Simone Battaglia ◽  
Martin E. Maier ◽  
Giuseppe di Pellegrino

The medial prefrontal cortex (mPFC) and ACC have been consistently implicated in learning predictions of future outcomes and signaling prediction errors (i.e., unexpected deviations from such predictions). A computational model of ACC/mPFC posits that these prediction errors should be modulated by outcomes occurring at unexpected times, even if the outcomes themselves are predicted. However, unexpectedness per se is not the only variable that modulates ACC/mPFC activity, as studies reported its sensitivity to the salience of outcomes. In this study, mediofrontal negativity, a component of the event-related brain potential generated in ACC/mPFC and coding for prediction errors, was measured in 48 participants performing a Pavlovian aversive conditioning task, during which aversive (thus salient) and neutral outcomes were unexpectedly shifted (i.e., anticipated or delayed) in time. Mediofrontal ERP signals of prediction error were observed for outcomes occurring at unexpected times but were specific for salient (shock-associated), as compared with neutral, outcomes. These findings have important implications for the theoretical accounts of ACC/mPFC and suggest a critical role of timing and salience information in prediction error signaling.


2019 ◽  
Author(s):  
John G. Mikhael ◽  
HyungGoo R. Kim ◽  
Naoshige Uchida ◽  
Samuel J. Gershman

Abstract
Reinforcement learning models of the basal ganglia map the phasic dopamine signal to reward prediction errors (RPEs). Conventional models assert that, when a stimulus reliably predicts a reward with fixed delay, dopamine activity during the delay period and at reward time should converge to baseline through learning. However, recent studies have found that dopamine exhibits a gradual ramp before reward in certain conditions even after extensive learning, such as when animals are trained to run to obtain the reward, thus challenging the conventional RPE models. In this work, we begin with the limitation of temporal uncertainty (animals cannot perfectly estimate time to reward), and show that sensory feedback, which reduces this uncertainty, will cause an unbiased learner to produce RPE ramps. On the other hand, in the absence of feedback, RPEs will be flat after learning. These results reconcile the seemingly conflicting data on dopamine behaviors under the RPE hypothesis.
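The "conventional" account this abstract refers to can be illustrated with a minimal tabular TD(0) simulation: a cue predicts a reward at a fixed delay, and after learning the RPE at reward time converges to baseline. The delay, learning rate, and trial count below are arbitrary illustrative choices, not parameters from the paper.

```python
import numpy as np

# Minimal tabular TD(0) sketch of the conventional RPE account: a cue at t=0
# reliably predicts a reward delivered T steps later with a fixed delay.
T = 10                 # cue-to-reward delay in time steps (illustrative)
alpha = 0.1            # learning rate (illustrative)
V = np.zeros(T + 1)    # value estimate for each time step within a trial

for trial in range(2000):
    for t in range(T):
        r = 1.0 if t == T - 1 else 0.0   # reward only at the final step
        delta = r + V[t + 1] - V[t]      # TD error (no discounting)
        V[t] += alpha * delta

# After extensive learning, the RPE at reward time has converged to ~0:
delta_reward = 1.0 + V[T] - V[T - 1]
print(round(delta_reward, 3))            # prints 0.0
```

The paper's point is that this flat, fully converged profile only follows when the animal can time the reward perfectly; adding temporal uncertainty that is progressively reduced by sensory feedback makes the same learner produce a pre-reward RPE ramp.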


2018 ◽  
Author(s):  
José J. F. Ribas Fernandes ◽  
Danesh Shahnazian ◽  
Clay B. Holroyd ◽  
Matthew M. Botvinick

Abstract
A longstanding view of the organization of human and animal behavior holds that behavior is hierarchically organized, meaning that it can be understood as directed towards achieving superordinate goals through subordinate goals, or subgoals. For example, the superordinate goal of making coffee can be broken down into a series of subgoals, namely boiling water, grinding coffee, pouring cream, etc. Learning and behavioral adaptation depend on prediction-error signals, which have been observed in ventral striatum (VS) and medial prefrontal cortex (mPFC). In past work, we have shown that prediction error signals (PEs) can be linked not only to superordinate goals, but also to subgoals. Here we present two functional magnetic resonance imaging experiments that replicate and extend these findings. In the first experiment, we replicated the finding that mPFC signals subgoal-related PEs, independently of goal PEs. Together with our past work, this experiment reveals that BOLD responses to PEs in mPFC are unsigned. In the second experiment, we showed that when a task involves both goal and subgoal PEs, mPFC shows only goal-related PEs, suggesting that context or attention can strongly impact hierarchical PE coding. Furthermore, we observed a dissociation between the coding of PEs in mPFC and VS. These experiments suggest that the mPFC selectively attends to information at different levels of hierarchy depending on the task context.

