Acute stress blunts prediction error signals in the dorsal striatum during reinforcement learning

2021 ◽  
Author(s):  
Joana Carvalheiro ◽  
Vasco A. Conceição ◽  
Ana Mesquita ◽  
Ana Seara-Cardoso

Reinforcement learning, which involves learning from the rewarding and punishing outcomes of our choices, is critical for adaptive behaviour. Acute stress seems to affect this ability, but the neural mechanisms by which it disrupts this type of learning are still poorly understood. Here, we investigate whether and how acute stress blunts neural signalling of prediction errors during reinforcement learning using model-based functional magnetic resonance imaging. Male participants completed a well-established reinforcement learning task involving monetary gains and losses whilst under stress and control conditions. Acute stress impaired participants’ behavioural performance towards obtaining monetary gains, but not towards avoiding losses. Importantly, acute stress blunted signalling of prediction errors during gain and loss trials in the dorsal striatum, with subsidiary analyses suggesting that acute stress preferentially blunted signalling of positive prediction errors. Our results thus reveal a neurocomputational mechanism by which acute stress may impair reward learning.
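
A minimal sketch of the model-based approach described here, assuming a simple Rescorla-Wagner learner (the learning rate, function names, and pipeline comments below are illustrative assumptions, not the authors' code): trial-wise prediction errors are generated from the fitted model and then entered into the fMRI analysis as parametric modulators of outcome onsets.

```python
import numpy as np

def rescorla_wagner_pes(choices, outcomes, n_options=2, alpha=0.3):
    """Trial-wise reward prediction errors under a Rescorla-Wagner model.
    alpha is an illustrative learning rate; in practice it is fitted
    per participant from the choice data."""
    q = np.zeros(n_options)            # expected value of each option
    pes = np.empty(len(choices))
    for t, (c, r) in enumerate(zip(choices, outcomes)):
        pes[t] = r - q[c]              # prediction error: outcome minus expectation
        q[c] += alpha * pes[t]         # value update toward the outcome
    return pes

# The resulting PE series would be convolved with a haemodynamic response
# function and used as a parametric regressor at outcome onset; blunting
# under stress would show up as a weaker fit of this regressor in the striatum.
```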

2020 ◽  
Author(s):  
Joana Carvalheiro ◽  
Vasco A. Conceição ◽  
Ana Mesquita ◽  
Ana Seara-Cardoso

Acute stress is ubiquitous in everyday life, but the extent to which it affects how people learn from the outcomes of their choices is still poorly understood. Here, we investigate how acute stress impacts reward and punishment learning in men using a reinforcement-learning task. Sixty-two male participants performed the task whilst under stress and control conditions. We observed that acute stress impaired participants’ choice performance towards monetary gains, but not losses. To unravel the mechanism(s) underlying this impairment, we fitted a reinforcement-learning model to participants’ trial-by-trial choices. Computational modelling indicated that under acute stress participants learned more slowly from positive prediction errors, that is, when outcomes were better than expected, consistent with stress-induced dopamine disruptions. Such mechanistic understanding of how acute stress impairs reward learning is particularly important given the pervasiveness of stress in our daily life and the impact that stress can have on our wellbeing and mental health.
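
The asymmetry the modelling uncovered (slower learning specifically from positive prediction errors) is usually formalised with separate learning rates for better- and worse-than-expected outcomes. A hedged sketch of that dual-learning-rate update and the likelihood used to fit it (parameter and function names are assumptions, not the authors' implementation):

```python
import numpy as np

def q_update_asymmetric(q, choice, reward, alpha_pos, alpha_neg):
    """One Q-learning step with separate learning rates for positive
    (alpha_pos) and negative (alpha_neg) prediction errors; a
    stress-induced drop in alpha_pos would slow reward learning."""
    pe = reward - q[choice]
    alpha = alpha_pos if pe > 0 else alpha_neg
    q = q.copy()
    q[choice] += alpha * pe
    return q, pe

def choice_loglik(q, choice, beta):
    """Log-probability of the observed choice under a softmax rule with
    inverse temperature beta; summed over trials, this is the quantity
    maximised when fitting the model to trial-by-trial behaviour."""
    logits = beta * q
    return logits[choice] - np.log(np.sum(np.exp(logits)))
```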


PLoS Biology ◽  
2021 ◽  
Vol 19 (9) ◽  
pp. e3001119
Author(s):  
Joan Orpella ◽  
Ernest Mas-Herrero ◽  
Pablo Ripollés ◽  
Josep Marco-Pallarés ◽  
Ruth de Diego-Balaguer

Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental to the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have been investigated primarily with offline (post-familiarization) tests, which give only limited insight into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online SL of simple syntactic structures, combined with computational modeling, to show that online SL responds to reinforcement learning principles rooted in striatal function. Specifically, we demonstrate, on 2 different cohorts, that a temporal difference model, which relies on prediction errors, accounts for participants’ online learning behavior. We then show that the trial-by-trial development of predictions through learning strongly correlates with activity in both ventral and dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work, therefore, bridges the long-standing gap between language learning and reinforcement learning phenomena.
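
The abstract does not reproduce the model equations; the core of any temporal difference account is the TD(0) update sketched below, under assumed state coding and illustrative parameters rather than the authors' exact formulation:

```python
import numpy as np

def td0_learn(state_seq, rewards, n_states, alpha=0.1, gamma=0.9):
    """TD(0) over a sequence of states: each state's value is nudged
    toward the immediate reward plus the discounted value of the next
    state, and the size of the nudge is the prediction error that the
    striatal activity is proposed to track."""
    v = np.zeros(n_states)
    pes = np.empty(len(state_seq) - 1)
    for t in range(len(state_seq) - 1):
        s, s_next = state_seq[t], state_seq[t + 1]
        pes[t] = rewards[t] + gamma * v[s_next] - v[s]   # TD prediction error
        v[s] += alpha * pes[t]
    return v, pes
```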


2018 ◽  
Author(s):  
Samuel D. McDougle ◽  
Peter A. Butcher ◽  
Darius Parvin ◽  
Faisal Mushtaq ◽  
Yael Niv ◽  
...  

Decisions must be implemented through actions, and actions are prone to error. As such, when an expected outcome is not obtained, an individual should be sensitive not only to whether the choice itself was suboptimal, but also to whether the action required to indicate that choice was executed successfully. The intelligent assignment of credit to action execution versus action selection has clear ecological utility for the learner. To explore this scenario, we used a modified version of a classic reinforcement learning task in which feedback indicated whether negative prediction errors were, or were not, associated with execution errors. Using fMRI, we asked whether prediction error computations in the human striatum, a key substrate in reinforcement learning and decision making, are modulated when a failure in action execution results in the negative outcome. Participants were more tolerant of non-rewarded outcomes when these resulted from execution errors than when execution was successful but the reward was withheld. Consistent with this behavior, a model-driven analysis of neural activity revealed an attenuation of the signal associated with negative reward prediction errors in the striatum following execution failures. These results converge with other lines of evidence suggesting that prediction errors in the mesostriatal dopamine system integrate high-level information during the evaluation of instantaneous reward outcomes.
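
One way to express the reported attenuation computationally is to gate negative prediction errors when feedback attributes the failure to execution rather than to the choice. A sketch under that assumption (the gate parameter is purely illustrative, not a fitted quantity from the paper):

```python
def gated_update(q, choice, reward, exec_error, alpha=0.3, gate=0.5):
    """Value update that discounts negative prediction errors caused by
    execution failures: gate = 1 treats them like ordinary choice
    errors, gate = 0 assigns them no credit at all."""
    pe = reward - q[choice]
    if exec_error and pe < 0:
        pe *= gate            # the motor system, not the choice, failed
    q = list(q)
    q[choice] += alpha * pe
    return q, pe
```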


2018 ◽  
Author(s):  
Joanne C. Van Slooten ◽  
Sara Jahfari ◽  
Tomas Knapen ◽  
Jan Theeuwes

Pupil responses have been used to track cognitive processes during decision-making. Studies have shown that in these cases the pupil reflects the joint activation of many cortical and subcortical brain regions, including those traditionally implicated in value-based learning. However, how the pupil tracks value-based decisions and reinforcement learning is unknown. We combined a reinforcement learning task with a computational model to study pupil responses during value-based decisions and decision evaluations. We found that the pupil closely tracks reinforcement learning both across trials and across participants. Prior to choice, the pupil dilated as a function of trial-by-trial fluctuations in value beliefs. After feedback, early dilation scaled with value uncertainty, whereas later constriction scaled with reward prediction errors. Our computational approach systematically implicates the pupil in value-based decisions and in the subsequent processing of violated value beliefs. These dissociable influences provide an exciting possibility to non-invasively study ongoing reinforcement learning via the pupil.
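
The abstract does not specify the learning model; one common way to obtain trial-wise value-belief and uncertainty regressors of the kind correlated with pupil size is Bayesian updating of a Beta belief over an option's reward probability. The sketch below assumes that formulation and is not necessarily the authors' model:

```python
import numpy as np

def beta_belief_regressors(outcomes):
    """Per-trial value belief (posterior mean) and uncertainty
    (posterior variance) for one option under a Beta(1, 1) prior
    updated with binary outcomes; these trial series are the kind of
    quantities regressed against pre-choice and post-feedback pupil
    responses."""
    a, b = 1.0, 1.0
    belief, uncertainty = [], []
    for r in outcomes:
        belief.append(a / (a + b))
        uncertainty.append(a * b / ((a + b) ** 2 * (a + b + 1)))
        a, b = a + r, b + (1 - r)   # conjugate update with the outcome
    return np.array(belief), np.array(uncertainty)
```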


2021 ◽  
Author(s):  
J Orpella ◽  
E Mas-Herrero ◽  
P Ripollés ◽  
J Marco-Pallarés ◽  
R de Diego-Balaguer

Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental to the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have been investigated primarily with offline (post-familiarization) tests, which give only limited insight into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online statistical learning of language rules, combined with computational modelling, to show that online SL responds to reinforcement learning principles rooted in striatal function. Specifically, we demonstrate, on two different cohorts, that a Temporal Difference model, which relies on prediction errors, accounts for participants’ online learning behavior. We then show that the trial-by-trial development of predictions through learning strongly correlates with activity in both ventral and dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work, therefore, bridges the longstanding gap between language learning and reinforcement learning phenomena.


2021 ◽  
Author(s):  
Bianca Westhoff ◽  
Neeltje E. Blankenstein ◽  
Elisabeth Schreuders ◽  
Eveline A. Crone ◽  
Anna C. K. van Duijvenvoorde

Learning which of our behaviors benefit others contributes to social bonding and to being liked by others. An important period for the development of (pro)social behavior is adolescence, when peers become more salient and relationships intensify. It is, however, unknown how learning to benefit others develops across adolescence and what the underlying cognitive and neural mechanisms are. In this functional neuroimaging study, we assessed learning for self and others (i.e., prosocial learning) and the concurrent neural tracking of prediction errors across adolescence (ages 9-21, N=74). Participants performed a two-choice probabilistic reinforcement learning task in which outcomes resulted in monetary consequences for themselves, an unknown other, or no one. Participants of all ages were able to learn for themselves and for others, but learning for others showed a more protracted developmental trajectory. Prediction errors for self were observed in the ventral striatum and showed no age-related differences. However, prediction error coding for others was specifically observed in the ventromedial prefrontal cortex and showed age-related increases. These results provide insight into the computational mechanisms of learning for others across adolescence and highlight that learning for self and for others shows different age-related patterns.
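
Learning for self versus others can be captured by fitting condition-specific learning rates to the same prediction-error update; a minimal sketch along those lines (condition labels and parameter values are illustrative assumptions, not the authors' fitted estimates):

```python
import numpy as np

def update_for_beneficiary(q, choice, reward, who, alphas):
    """One prediction-error update of a per-beneficiary Q-table, with a
    separate learning rate per condition so that learning for 'self'
    and 'other' can develop at different speeds."""
    pe = reward - q[who][choice]
    q[who][choice] += alphas[who] * pe
    return q, pe

q = {who: np.zeros(2) for who in ("self", "other", "noone")}
alphas = {"self": 0.4, "other": 0.2, "noone": 0.1}   # illustrative values
q, pe = update_for_beneficiary(q, choice=0, reward=1, who="other", alphas=alphas)
```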


2019 ◽  
Vol 9 (7) ◽  
pp. 174 ◽  
Author(s):  
Burak Erdeniz ◽  
John Done

Reinforcement learning studies in rodents and primates demonstrate that goal-directed and habitual choice behaviors are mediated through different fronto-striatal systems, but the evidence is less clear in humans. In this study, functional magnetic resonance imaging (fMRI) data were collected whilst participants (n = 20) performed a conditional associative learning task in which blocks of novel conditional stimuli (CS) required a deliberate choice and blocks of familiar CS required an intuitive choice. A standard subtraction analysis for fMRI event-related designs showed that activation shifted from the dorso-fronto-parietal network, which involves the dorsolateral prefrontal cortex (DLPFC), for deliberate choice of novel CS to the ventro-medial prefrontal cortex (VMPFC) and anterior cingulate cortex for intuitive choice of familiar CS. Supporting this finding, a psycho-physiological interaction (PPI) analysis, using the peak active areas within the PFC for novel and familiar CS as seed regions, showed functional coupling between the caudate and the DLPFC when processing novel CS and between the caudate and the VMPFC when processing familiar CS. These findings demonstrate separable systems for deliberate and intuitive processing, in keeping with rodent and primate reinforcement learning studies, although in humans they operate in a dynamic, possibly synergistic, manner, particularly at the level of the striatum.
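
For readers unfamiliar with PPI: the interaction regressor is the product of the seed region's timecourse and the psychological contrast, and its fit elsewhere in the brain indexes condition-dependent coupling with the seed. A bare-bones sketch of the design-matrix construction (omitting the deconvolution step that full pipelines perform; names and coding are assumptions):

```python
import numpy as np

def ppi_design(seed_ts, condition):
    """Three-column PPI design matrix: physiological (seed timecourse),
    psychological (e.g. +1 for novel-CS blocks, -1 for familiar-CS
    blocks), and their product, whose coefficient tests whether
    coupling with the seed differs between conditions."""
    seed = (seed_ts - seed_ts.mean()) / seed_ts.std()
    return np.column_stack([seed, condition, seed * condition])
```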


2021 ◽  
pp. 100412
Author(s):  
Joana Carvalheiro ◽  
Vasco A. Conceição ◽  
Ana Mesquita ◽  
Ana Seara-Cardoso

2018 ◽  
Author(s):  
Andre Chevrier ◽  
Mehereen Bhaijiwala ◽  
Jonathan Lipszyc ◽  
Douglas Cheyne ◽  
Simon Graham ◽  
...  

ADHD is associated with altered dopamine-regulated reinforcement learning on prediction errors. Despite evidence of categorically altered error processing in ADHD, neuroimaging advances have largely been used to investigate models of normal reinforcement learning in greater detail. Further, although reinforcement learning critically relies on the ventral striatum exerting error-magnitude-related thresholding influences on the substantia nigra (SN) and dorsal striatum, these thresholding influences have never been identified with neuroimaging. To identify such thresholding influences, we propose that error-magnitude-related activities must first be separated from opposite activities in overlapping neural regions during error detection. Here we separate error detection from magnitude-related adjustment (post-error slowing) during inhibition errors in the stop signal task in typically developing (TD) and ADHD adolescents using fMRI. In TD, we predicted that: 1) deactivation of the dorsal striatum on error detection interrupts ongoing processing, and should be proportional to the right frontoparietal response-phase activity that has been observed in the SST; 2) deactivation of the ventral striatum on post-error slowing exerts thresholding influences on, and should be proportional to, activity in the dorsal striatum. In ADHD, we predicted that the ventral striatum would instead correlate with heightened amygdala responses to errors. We found that deactivation of the dorsal striatum on error detection correlated with response-phase activity in both groups. In TD, post-error slowing deactivation of the ventral striatum correlated with activation of the dorsal striatum. In ADHD, the ventral striatum correlated with heightened amygdala activity. Further, heightened activities in the locus coeruleus (norepinephrine), raphe nucleus (serotonin) and medial septal nuclei (acetylcholine), which all compete for control of dopamine and are altered in ADHD, exhibited altered correlations with the SN. All correlations in TD were replicated in healthy adults. The results in TD are consistent with dopamine-regulated reinforcement learning on post-error slowing. In ADHD, the results are consistent with heightened activities in the amygdala and non-dopaminergic neurotransmitter nuclei preventing reinforcement learning.
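
Post-error slowing itself, the behavioural quantity these imaging contrasts build on, has a standard computation: mean reaction time on trials that follow an error minus mean reaction time on trials that follow a correct response. A minimal sketch (function name and trial coding are assumptions):

```python
import numpy as np

def post_error_slowing(rts, correct):
    """Mean RT after error trials minus mean RT after correct trials;
    positive values indicate slowing after errors. Assumes at least
    one error and one correct response before the final trial."""
    rts = np.asarray(rts, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    return rts[1:][~correct[:-1]].mean() - rts[1:][correct[:-1]].mean()
```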

