Language statistical learning responds to reinforcement learning principles rooted in the striatum

PLoS Biology ◽  
2021 ◽  
Vol 19 (9) ◽  
pp. e3001119
Author(s):  
Joan Orpella ◽  
Ernest Mas-Herrero ◽  
Pablo Ripollés ◽  
Josep Marco-Pallarés ◽  
Ruth de Diego-Balaguer

Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental to the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have primarily been investigated using offline (post-familiarization) tests, which give limited insight into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online SL of simple syntactic structures, combined with computational modeling, to show that online SL responds to reinforcement learning principles rooted in striatal function. Specifically, we demonstrate, on 2 different cohorts, that a temporal difference model, which relies on prediction errors, accounts for participants' online learning behavior. We then show that the trial-by-trial development of predictions through learning strongly correlates with activity in both ventral and dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work, therefore, bridges the long-standing gap between language learning and reinforcement learning phenomena.
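As a concrete illustration of the temporal difference principle invoked here, the sketch below implements a generic TD(0) learner on a toy three-element sequence. It is not the authors' fitted model; the sequence, reward placement, and parameters are hypothetical.

```python
import numpy as np

# Toy TD(0) sketch: learning to predict the outcome of a fixed
# three-element sequence A -> B -> C, with "reward" (e.g., a correct
# prediction) delivered after C. Parameters are illustrative only.
alpha, gamma = 0.2, 1.0      # learning rate, discount factor
V = np.zeros(3)              # predicted value of elements A, B, C
for trial in range(50):
    for s in range(3):
        r = 1.0 if s == 2 else 0.0           # reward only after C
        v_next = V[s + 1] if s < 2 else 0.0  # sequence ends after C
        delta = r + gamma * v_next - V[s]    # prediction error
        V[s] += alpha * delta                # TD update
print(np.round(V, 2))  # predictions propagate backward toward 1.0
```

The property the abstract appeals to is visible here: prediction errors at later elements gradually transfer predictive value to earlier elements, so predictions develop trial by trial.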

2021 ◽  
Author(s):  
Julie M. Schneider ◽  
Yi-Lun Weng ◽  
Anqi Hu ◽  
Zhenghan Qi

Statistical learning, the process of tracking distributional information and discovering embedded patterns, is traditionally regarded as a form of implicit learning. However, recent studies have proposed that both implicit (attention-independent) and explicit (attention-dependent) learning systems are involved in statistical learning. To understand the role of attention in statistical learning, the current study investigates the cortical processing of prediction errors in speech based on either local or global distributional information. We then ask how these cortical responses relate to statistical learning behavior in a word segmentation task. We found ERP evidence of pre-attentive processing of both local distributional information (mismatch negativity) and global distributional information (late discriminative negativity). However, as speech elements became less frequent and more surprising, some participants showed an involuntary attentional shift, reflected in a P3a response. Individuals who displayed attentive neural tracking of distributional information showed faster learning in a speech statistical learning task. These results provide important neural evidence elucidating the facilitatory role of attention in statistical learning.
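For readers unfamiliar with the word segmentation paradigm, the toy sketch below computes transitional probabilities between adjacent syllables, the local distributional statistic such tasks typically manipulate. The syllable stream is invented for illustration and is not the study's stimulus set.

```python
from collections import Counter

# Toy sketch: transitional probabilities P(next | current) over a
# syllable stream built from the made-up words "tupiro", "golabu",
# and "pabiku". Dips in probability mark candidate word boundaries.
stream = "tu-pi-ro-go-la-bu-tu-pi-ro-pa-bi-ku-go-la-bu".split("-")
pairs = Counter(zip(stream, stream[1:]))
firsts = Counter(stream[:-1])
for (a, b), n in sorted(pairs.items()):
    print(f"P({b}|{a}) = {n / firsts[a]:.2f}")
# In this toy stream, word-internal transitions (tu->pi, go->la, ...)
# score 1.0, while the boundary transitions out of "ro" drop to 0.5;
# a longer stream would separate the two classes more cleanly.
```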


2014 ◽  
Vol 26 (3) ◽  
pp. 635-644 ◽  
Author(s):  
Olav E. Krigolson ◽  
Cameron D. Hassall ◽  
Todd C. Handy

Our ability to make decisions is predicated upon our knowledge of the outcomes of the actions available to us. Reinforcement learning theory posits that actions followed by a reward or punishment acquire value through the computation of prediction errors: discrepancies between the predicted and the actual reward. A multitude of neuroimaging studies have demonstrated that rewards and punishments evoke neural responses that appear to reflect reinforcement learning prediction errors [e.g., Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. Journal of Cognitive Neuroscience, 21, 1833–1840, 2009; Bayer, H. M., & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129–141, 2005; O'Doherty, J. P. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–776, 2004; Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–709, 2002]. Here, we used the event-related brain potential (ERP) technique to demonstrate not only that rewards elicit a neural response akin to a prediction error but also that, with learning, this signal rapidly diminishes and propagates to the time of choice presentation. Specifically, in a simple, learnable gambling task, we show that novel rewards elicited a feedback error-related negativity that rapidly decreased in amplitude with learning. Furthermore, we demonstrate the existence of a reward positivity at choice presentation, a previously unreported ERP component with a timing and topography similar to those of the feedback error-related negativity, whose amplitude increased with learning. The pattern of results we observed mirrored the output of a computational model that we implemented to compute reward prediction errors and the changes in amplitude of these prediction errors at the time of choice presentation and reward delivery. Our results provide further support that the computations that underlie human learning and decision-making follow reinforcement learning principles.
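The reported migration of the prediction-error signal from feedback to choice falls out of any simple value-learning scheme. The sketch below is a minimal illustration under generic Rescorla-Wagner-style assumptions, not the computational model the authors implemented.

```python
# Minimal sketch: as the value V of the rewarded choice is learned,
# surprise at reward delivery (the feedback prediction error) shrinks,
# while the reward prediction evoked at choice presentation grows.
alpha, reward = 0.3, 1.0   # illustrative learning rate and reward
V = 0.0
for trial in range(1, 11):
    prediction_at_choice = V         # analog of the reward positivity
    pe_at_feedback = reward - V      # analog of the feedback ERN
    V += alpha * pe_at_feedback      # value update
    print(f"trial {trial:2d}: choice={prediction_at_choice:.2f}  "
          f"feedback PE={pe_at_feedback:.2f}")
```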


2021 ◽  
Author(s):  
Joana Carvalheiro ◽  
Vasco A. Conceição ◽  
Ana Mesquita ◽  
Ana Seara-Cardoso

Reinforcement learning, which involves learning from the rewarding and punishing outcomes of our choices, is critical for adaptive behaviour. Acute stress seems to affect this ability, but the neural mechanisms by which it disrupts this type of learning are still poorly understood. Here, we investigate whether and how acute stress blunts neural signalling of prediction errors during reinforcement learning using model-based functional magnetic resonance imaging. Male participants completed a well-established reinforcement learning task involving monetary gains and losses whilst under stress and control conditions. Acute stress impaired participants' behavioural performance towards obtaining monetary gains, but not towards avoiding losses. Importantly, acute stress blunted signalling of prediction errors during gain and loss trials in the dorsal striatum, with subsidiary analyses suggesting that acute stress preferentially blunted signalling of positive prediction errors. Our results thus reveal a neurocomputational mechanism by which acute stress may impair reward learning.
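A common way to express differential sensitivity to positive versus negative prediction errors, such as the stress effect reported here, is a value update with separate learning rates. This is a generic sketch with invented parameters, not the model fitted in the study.

```python
import random

# Generic sketch: value learning with asymmetric learning rates.
# Lowering alpha_pos (relative to a control condition) is one way to
# model blunted signalling of positive prediction errors under stress.
alpha_pos, alpha_neg = 0.3, 0.1
random.seed(0)
Q = 0.0                               # value of a 70%-gain option
for trial in range(100):
    outcome = 1.0 if random.random() < 0.7 else -1.0
    delta = outcome - Q               # prediction error
    Q += (alpha_pos if delta > 0 else alpha_neg) * delta
print(f"learned value after 100 trials: {Q:.2f}")
```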


2018 ◽  
Author(s):  
Andre Chevrier ◽  
Mehereen Bhaijiwala ◽  
Jonathan Lipszyc ◽  
Douglas Cheyne ◽  
Simon Graham ◽  
...  

ADHD is associated with altered dopamine-regulated reinforcement learning on prediction errors. Despite evidence of categorically altered error processing in ADHD, neuroimaging advances have largely investigated models of normal reinforcement learning in greater detail. Further, although reinforcement learning critically relies on the ventral striatum exerting error-magnitude-related thresholding influences on the substantia nigra (SN) and dorsal striatum, these thresholding influences have never been identified with neuroimaging. To identify such thresholding influences, we propose that error-magnitude-related activities must first be separated from opposite activities in overlapping neural regions during error detection. Here we separate error detection from magnitude-related adjustment (post-error slowing) during inhibition errors in the stop signal task in typically developing (TD) and ADHD adolescents using fMRI. In TD, we predicted that: (1) deactivation of dorsal striatum on error detection interrupts ongoing processing and should be proportional to the right frontoparietal response-phase activity that has been observed in the SST; (2) deactivation of ventral striatum on post-error slowing exerts thresholding influences on, and should be proportional to, activity in dorsal striatum. In ADHD, we predicted that ventral striatum would instead correlate with heightened amygdala responses to errors. We found that deactivation of dorsal striatum on error detection correlated with response-phase activity in both groups. In TD, post-error slowing deactivation of ventral striatum correlated with activation of dorsal striatum. In ADHD, ventral striatum correlated with heightened amygdala activity. Further, heightened activities in the locus coeruleus (norepinephrine), raphe nucleus (serotonin), and medial septal nuclei (acetylcholine), which all compete for control of dopamine and are altered in ADHD, exhibited altered correlations with SN. All correlations in TD were replicated in healthy adults. Results in TD are consistent with dopamine-regulated reinforcement learning on post-error slowing. In ADHD, results are consistent with heightened activities in the amygdala and non-dopaminergic neurotransmitter nuclei preventing reinforcement learning.


2015 ◽  
Vol 113 (9) ◽  
pp. 3056-3068 ◽  
Author(s):  
Kentaro Katahira ◽  
Yoshi-Taka Matsuda ◽  
Tomomi Fujimura ◽  
Kenichi Ueno ◽  
Takeshi Asamizuya ◽  
...  

Emotional events resulting from a choice influence an individual's subsequent decision making. Although the relationship between emotion and decision making has been widely discussed, previous studies have mainly investigated decision outcomes that can easily be mapped to reward and punishment, including monetary gain/loss, gustatory stimuli, and pain. These studies regard emotion as a modulator of decision making that could otherwise be made rationally in the absence of emotions. In our daily lives, however, we often encounter various emotional events that affect decisions by themselves, and mapping the events to a reward or punishment is often not straightforward. In this study, we investigated the neural substrates of how such emotional decision outcomes affect subsequent decision making. Using functional magnetic resonance imaging (fMRI), we measured the brain activity of human participants during a stochastic decision-making task in which various emotional pictures were presented as decision outcomes. We found that pleasant pictures differentially activated the midbrain, fusiform gyrus, and parahippocampal gyrus, whereas unpleasant pictures differentially activated the ventral striatum, compared with neutral pictures. We assumed that emotional decision outcomes affect subsequent decisions by updating the value of the options, a process captured by reinforcement learning models, and that the brain regions representing the prediction error that drives this learning are involved in guiding subsequent decisions. We found that some regions of the striatum and the insula were separately correlated with the prediction error for either pleasant or unpleasant pictures, whereas the precuneus was correlated with prediction errors for both pleasant and unpleasant pictures.


Author(s):  
Aline Godfroid ◽  
Kathy MinHye Kim

This study addresses the role of domain-general mechanisms in second-language learning and knowledge using an individual differences approach. We examine the predictive validity of implicit-statistical learning aptitude for implicit second-language knowledge. Participants (n = 131) completed a battery of four aptitude measures and nine grammar tests. Structural equation modeling revealed that only the alternating serial reaction time task (a measure of implicit-statistical learning aptitude) significantly predicted learners' performance on timed, accuracy-based language tests, but not their performance on reaction-time measures. These results inform ongoing debates about the nature of implicit knowledge in SLA: they lend support to the validity of timed, accuracy-based language tests as measures of implicit knowledge. Auditory and visual statistical learning were correlated with medium strength, while the remaining implicit-statistical learning aptitude measures were not correlated, highlighting the multicomponential nature of implicit-statistical learning aptitude and the corresponding need for a multitest approach to assess its different facets.


2020 ◽  
Author(s):  
Dongjae Kim ◽  
Jaeseung Jeong ◽  
Sang Wan Lee

The goal of learning is to maximize future rewards by minimizing prediction errors. Evidence has shown that the brain achieves this by combining model-based and model-free learning. However, prediction error minimization is challenged by a bias-variance tradeoff, which imposes constraints on each strategy's performance. We provide new theoretical insight into how this tradeoff can be resolved through the adaptive control of model-based and model-free learning. The theory predicts that baseline correction of the prediction error reduces the lower bound of the bias-variance error by factoring out irreducible noise. Using a Markov decision task with context changes, we showed behavioral evidence of adaptive control. Model-based behavioral analyses show that the prediction error baseline signals context changes to improve adaptability. Critically, the neural results support this view, demonstrating multiplexed representations of the prediction error baseline within the ventrolateral and ventromedial prefrontal cortex, key brain regions known to guide model-based and model-free learning.

One-sentence summary: A theoretical, behavioral, computational, and neural account of how the brain resolves the bias-variance tradeoff during reinforcement learning is described.
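The baseline-correction idea can be illustrated by subtracting a slowly updated running average from raw prediction errors, so that a constant offset (irreducible noise or bias) is factored out while genuine context changes still stand out. This is a toy sketch of the principle, not the paper's model.

```python
import numpy as np

# Toy sketch: baseline-corrected prediction errors. A constant offset
# in the raw errors is absorbed by the running baseline, so the
# corrected signal spikes only when the context actually changes.
rng = np.random.default_rng(0)
raw_pe = rng.normal(loc=0.5, scale=0.2, size=200)  # constant offset
raw_pe[100:] += 1.0                                # context change at t=100
baseline, eta = 0.0, 0.05                          # slow baseline update
for t, pe in enumerate(raw_pe):
    corrected = pe - baseline
    baseline += eta * (pe - baseline)
    if t in (99, 100, 150):
        print(f"t={t}: raw={pe:.2f}  corrected={corrected:.2f}")
```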


2019 ◽  
Author(s):  
A. Wiehler ◽  
K. Chakroun ◽  
J. Peters

Gambling disorder is a behavioral addiction associated with impairments in decision-making and reduced behavioral flexibility. Decision-making in volatile environments requires a flexible trade-off between exploitation of options with high expected values and exploration of novel options to adapt to changing reward contingencies. This classical problem is known as the exploration-exploitation dilemma. We hypothesized gambling disorder to be associated with a specific reduction in directed (uncertainty-based) exploration compared to healthy controls, accompanied by changes in brain activity in a fronto-parietal exploration-related network.

Twenty-three frequent gamblers and nineteen matched controls performed a classical four-armed bandit task during functional magnetic resonance imaging. Computational modeling revealed that choice behavior in both groups contained signatures of directed exploration, random exploration, and perseveration. Gamblers showed a specific reduction in directed exploration, while random exploration and perseveration were similar between groups.

Neuroimaging revealed no evidence for group differences in neural representations of expected value and reward prediction errors. Likewise, our hypothesis of attenuated fronto-parietal exploration effects in gambling disorder was not supported. However, during directed exploration, gamblers showed reduced parietal and substantia nigra / ventral tegmental area activity. Cross-validated classification analyses revealed that connectivity in an exploration-related network was predictive of clinical status, suggesting alterations in network dynamics in gambling disorder.

In sum, we show that reduced flexibility during reinforcement learning in volatile environments in gamblers is attributable to a reduction in directed exploration rather than an increase in perseveration. Neuroimaging findings suggest that patterns of network connectivity might be more diagnostic of gambling disorder than univariate value and prediction error effects. We provide a computational account of flexibility impairments in gamblers during reinforcement learning that might arise as a consequence of dopaminergic dysregulation in this disorder.
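Directed and random exploration are commonly formalized as, respectively, an uncertainty bonus on each option's value and a softmax choice rule. The sketch below pairs both with Kalman-filter value estimates in a four-armed bandit; it is a generic textbook-style formulation, not the authors' fitted model.

```python
import numpy as np

# Sketch: four-armed bandit with Kalman-style value estimates, an
# uncertainty bonus (directed exploration, weight phi) and a softmax
# temperature (random exploration, beta). Stationary arms and all
# parameter values are illustrative; phi -> 0 would mimic the reduced
# directed exploration reported in gamblers.
rng = np.random.default_rng(1)
true_means = np.array([0.2, 0.4, 0.6, 0.8])
mu, sigma2 = np.zeros(4), np.ones(4)   # posterior means and variances
phi, beta, obs_noise = 1.0, 3.0, 0.1
for t in range(200):
    util = mu + phi * np.sqrt(sigma2)          # value + uncertainty bonus
    p = np.exp(beta * (util - util.max()))     # numerically stable softmax
    p /= p.sum()
    a = rng.choice(4, p=p)                     # sample a choice
    r = rng.normal(true_means[a], np.sqrt(obs_noise))
    k = sigma2[a] / (sigma2[a] + obs_noise)    # Kalman gain
    mu[a] += k * (r - mu[a])                   # prediction-error update
    sigma2[a] *= 1 - k                         # uncertainty shrinks
print(np.round(mu, 2))                         # estimates near true means
```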

