Statistical learning as reinforcement learning phenomena


PLoS Biology ◽  
2021 ◽  
Vol 19 (9) ◽  
pp. e3001119
Author(s):  
Joan Orpella ◽  
Ernest Mas-Herrero ◽  
Pablo Ripollés ◽  
Josep Marco-Pallarés ◽  
Ruth de Diego-Balaguer

Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental in the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have been primarily investigated using offline (post-familiarization) tests, which give limited insights into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online SL of simple syntactic structures combined with computational modeling to show that online SL responds to reinforcement learning principles rooted in striatal function. Specifically, we demonstrate—on 2 different cohorts—that a temporal difference model, which relies on prediction errors, accounts for participants’ online learning behavior. We then show that the trial-by-trial development of predictions through learning strongly correlates with activity in both ventral and dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work, therefore, bridges the long-standing gap between language learning and reinforcement learning phenomena.
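The temporal difference mechanism this abstract invokes can be sketched minimally as follows. This is an illustrative TD(0) update, not the authors' fitted model; the states, reward coding, and parameter values are assumptions:

```python
# Illustrative TD(0) sketch: the value of a predictive element is updated by
# a reward prediction error (delta). States, reward coding, and parameters
# (alpha, gamma) are assumptions made for illustration.
def td_update(V, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) step: V[state] <- V[state] + alpha * delta."""
    delta = reward + gamma * V.get(next_state, 0.0) - V.get(state, 0.0)
    V[state] = V.get(state, 0.0) + alpha * delta
    return delta

V = {}
# A hypothetical element "A" reliably predicts its dependent element "X" (reward = 1).
deltas = [td_update(V, "A", 1.0, "X") for _ in range(50)]
# Prediction errors shrink as the regularity is learned, as in online SL.
```

The key property exploited in the paper's model-based analysis is visible here: early trials produce large prediction errors, and these decay as predictions become accurate.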


2021 ◽  
Author(s):  
Julie M. Schneider ◽  
Yi-Lun Weng ◽  
Anqi Hu ◽  
Zhenghan Qi

Statistical learning, the process of tracking distributional information and discovering embedded patterns, is traditionally regarded as a form of implicit learning. However, recent studies have proposed that both implicit (attention-independent) and explicit (attention-dependent) learning systems are involved in statistical learning. To understand the role of attention in statistical learning, the current study investigates the cortical processing of prediction errors in speech based on either local or global distributional information. We then ask how these cortical responses relate to statistical learning behavior in a word segmentation task. We found ERP evidence of pre-attentive processing of both local (mismatch negativity) and global (late discriminative negativity) distributional information. However, as speech elements became less frequent and more surprising, some participants showed an involuntary attentional shift, reflected in a P3a response. Individuals who displayed attentive neural tracking of distributional information showed faster learning in a speech statistical learning task. These results provide important neural evidence elucidating the facilitatory role of attention in statistical learning.
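"Less frequent and more surprising" speech elements are commonly quantified via transitional probabilities and surprisal over a syllable stream. A minimal sketch (the syllable stream and its word boundaries are invented for illustration):

```python
from collections import Counter
import math

# Illustrative syllable stream: "words" tu-pi-ro and go-la-bu recur, so
# within-word transitions are frequent and across-word transitions rarer.
stream = "tu-pi-ro-go-la-bu-tu-pi-ro-pa-do-ti-go-la-bu".split("-")
pairs = Counter(zip(stream, stream[1:]))   # bigram counts
firsts = Counter(stream[:-1])              # counts of each leading syllable

def transitional_prob(a, b):
    """P(next syllable = b | current syllable = a), estimated from the stream."""
    return pairs[(a, b)] / firsts[a]

def surprisal(a, b):
    """Surprisal in bits: rare transitions (word boundaries) are more surprising."""
    return -math.log2(transitional_prob(a, b))

# Within-word transition tu -> pi is certain; across-word ro -> go is not.
```

Low-surprisal transitions mark word interiors and high-surprisal ones mark candidate word boundaries, which is the distributional cue that word-segmentation tasks probe.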


Author(s):  
Patricia L Lockwood ◽  
Miriam C Klein-Flügge

Abstract Social neuroscience aims to describe the neural systems that underpin social cognition and behaviour. Over the past decade, researchers have begun to combine computational models with neuroimaging to link social computations to the brain. Inspired by approaches from reinforcement learning theory, which describes how decisions are driven by the unexpectedness of outcomes, accounts of the neural basis of prosocial learning, observational learning, mentalizing and impression formation have been developed. Here we provide an introduction for researchers who wish to use these models in their studies. We consider both theoretical and practical issues related to their implementation, with a focus on specific examples from the field.


2014 ◽  
Vol 26 (3) ◽  
pp. 635-644 ◽  
Author(s):  
Olav E. Krigolson ◽  
Cameron D. Hassall ◽  
Todd C. Handy

Our ability to make decisions is predicated upon our knowledge of the outcomes of the actions available to us. Reinforcement learning theory posits that actions followed by a reward or punishment acquire value through the computation of prediction errors—discrepancies between the predicted and the actual reward. A multitude of neuroimaging studies have demonstrated that rewards and punishments evoke neural responses that appear to reflect reinforcement learning prediction errors [e.g., Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. Journal of Cognitive Neuroscience, 21, 1833–1840, 2009; Bayer, H. M., & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129–141, 2005; O'Doherty, J. P. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–776, 2004; Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–709, 2002]. Here, we used the event-related brain potential (ERP) technique to demonstrate not only that rewards elicit a neural response akin to a prediction error but also that this signal rapidly diminishes and propagates to the time of choice presentation with learning. Specifically, in a simple, learnable gambling task, we show that novel rewards elicited a feedback error-related negativity that rapidly decreased in amplitude with learning. Furthermore, we demonstrate the existence of a reward positivity at choice presentation, a previously unreported ERP component with a timing and topography similar to the feedback error-related negativity but that increased in amplitude with learning.
The pattern of results we observed mirrored the output of a computational model that we implemented to compute reward prediction errors and the changes in amplitude of these prediction errors at the time of choice presentation and reward delivery. Our results provide further support that the computations that underlie human learning and decision-making follow reinforcement learning principles.
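The learning-related shift the authors report, shrinking feedback prediction errors alongside a growing expectation at choice onset, can be reproduced with a simple Rescorla-Wagner simulation. The learning rate and the reward schedule (a deterministic stand-in for an 80% win rate) are assumptions, not the authors' model:

```python
# Illustrative Rescorla-Wagner simulation of the reported ERP pattern:
# with learning, the prediction error at feedback shrinks while the value
# expectation available at choice presentation grows.
alpha, V = 0.2, 0.0
rewards = [1, 1, 1, 1, 0] * 20          # deterministic stand-in for p(win) = 0.8
choice_signal, feedback_pe = [], []
for r in rewards:
    choice_signal.append(V)             # expectation expressed at choice onset
    pe = r - V                          # prediction error at feedback
    feedback_pe.append(abs(pe))
    V += alpha * pe                     # value update

early = sum(feedback_pe[:10]) / 10      # large feedback errors before learning
late = sum(feedback_pe[-10:]) / 10      # smaller errors after learning
```

In this sketch the feedback-locked error magnitude falls across trials while the choice-locked value signal rises, the same qualitative crossover the ERP amplitudes showed.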


2021 ◽  
Author(s):  
Joana Carvalheiro ◽  
Vasco A. Conceição ◽  
Ana Mesquita ◽  
Ana Seara-Cardoso

Abstract Reinforcement learning, which involves learning from the rewarding and punishing outcomes of our choices, is critical for adjusted behaviour. Acute stress seems to affect this ability, but the neural mechanisms by which it disrupts this type of learning are still poorly understood. Here, we investigate whether and how acute stress blunts neural signalling of prediction errors during reinforcement learning using model-based functional magnetic resonance imaging. Male participants completed a well-established reinforcement learning task involving monetary gains and losses whilst under stress and control conditions. Acute stress impaired participants’ behavioural performance towards obtaining monetary gains, but not towards avoiding losses. Importantly, acute stress blunted signalling of prediction errors during gain and loss trials in the dorsal striatum, with subsidiary analyses suggesting that acute stress preferentially blunted signalling of positive prediction errors. Our results thus reveal a neurocomputational mechanism by which acute stress may impair reward learning.
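Preferential blunting of positive prediction errors is commonly modelled with asymmetric learning rates, one rate for positive and one for negative errors. A sketch under that assumption (the functional form and all parameter values are hypothetical, not necessarily the authors' fitted model):

```python
# Illustrative sketch: asymmetric learning rates for positive vs. negative
# prediction errors, a common way to model blunted positive-PE signalling.
def update(q, reward, alpha_pos=0.3, alpha_neg=0.3):
    """Update value q using separate learning rates for +/- prediction errors."""
    pe = reward - q
    alpha = alpha_pos if pe > 0 else alpha_neg
    return q + alpha * pe, pe

# Blunting under stress mimicked by a lower alpha_pos (hypothetical values).
q_control, q_stress = 0.0, 0.0
for r in [1, 0, 1, 1, 0, 1]:            # hypothetical gain-trial feedback
    q_control, _ = update(q_control, r)
    q_stress, _ = update(q_stress, r, alpha_pos=0.1)
# The "stressed" learner accumulates less value from the same rewards.
```

This captures the behavioural signature in the abstract: impaired learning towards gains (driven by positive errors) with loss avoidance relatively spared.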


2018 ◽  
Author(s):  
Andre Chevrier ◽  
Mehereen Bhaijiwala ◽  
Jonathan Lipszyc ◽  
Douglas Cheyne ◽  
Simon Graham ◽  
...  

Abstract ADHD is associated with altered dopamine regulated reinforcement learning on prediction errors. Despite evidence of categorically altered error processing in ADHD, neuroimaging advances have largely investigated models of normal reinforcement learning in greater detail. Further, although reinforcement learning critically relies on ventral striatum exerting error magnitude related thresholding influences on substantia nigra (SN) and dorsal striatum, these thresholding influences have never been identified with neuroimaging. To identify such thresholding influences, we propose that error magnitude related activities must first be separated from opposite activities in overlapping neural regions during error detection. Here we separate error detection from magnitude related adjustment (post-error slowing) during inhibition errors in the stop signal task in typically developing (TD) and ADHD adolescents using fMRI. In TD, we predicted that: 1) deactivation of dorsal striatum on error detection interrupts ongoing processing, and should be proportional to right frontoparietal response phase activity that has been observed in the SST; 2) deactivation of ventral striatum on post-error slowing exerts thresholding influences on, and should be proportional to, activity in dorsal striatum. In ADHD, we predicted that ventral striatum would instead correlate with heightened amygdala responses to errors. We found deactivation of dorsal striatum on error detection correlated with response-phase activity in both groups. In TD, post-error slowing deactivation of ventral striatum correlated with activation of dorsal striatum. In ADHD, ventral striatum correlated with heightened amygdala activity. Further, heightened activities in locus coeruleus (norepinephrine), raphe nucleus (serotonin) and medial septal nuclei (acetylcholine), which all compete for control of DA, and are altered in ADHD, exhibited altered correlations with SN. 
All correlations in TD were replicated in healthy adults. Results in TD are consistent with dopamine regulated reinforcement learning on post-error slowing. In ADHD, results are consistent with heightened activities in the amygdala and non-dopaminergic neurotransmitter nuclei preventing reinforcement learning.


2015 ◽  
Vol 113 (9) ◽  
pp. 3056-3068 ◽  
Author(s):  
Kentaro Katahira ◽  
Yoshi-Taka Matsuda ◽  
Tomomi Fujimura ◽  
Kenichi Ueno ◽  
Takeshi Asamizuya ◽  
...  

Emotional events resulting from a choice influence an individual's subsequent decision making. Although the relationship between emotion and decision making has been widely discussed, previous studies have mainly investigated decision outcomes that can easily be mapped to reward and punishment, including monetary gain/loss, gustatory stimuli, and pain. These studies regard emotion as a modulator of decision making that could otherwise proceed rationally in the absence of emotions. In our daily lives, however, we often encounter various emotional events that affect decisions by themselves, and mapping the events to a reward or punishment is often not straightforward. In this study, we investigated the neural substrates of how such emotional decision outcomes affect subsequent decision making. By using functional magnetic resonance imaging (fMRI), we measured brain activities of humans during a stochastic decision-making task in which various emotional pictures were presented as decision outcomes. We found that pleasant pictures differentially activated the midbrain, fusiform gyrus, and parahippocampal gyrus, whereas unpleasant pictures differentially activated the ventral striatum, compared with neutral pictures. We assumed that the emotional decision outcomes affect the subsequent decision by updating the value of the options, a process modeled by reinforcement learning models, and that the brain regions representing the prediction error that drives the reinforcement learning are involved in guiding subsequent decisions. We found that some regions of the striatum and the insula were separately correlated with the prediction error for either pleasant pictures or unpleasant pictures, whereas the precuneus was correlated with prediction errors for both pleasant and unpleasant pictures.
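The assumed mechanism, emotional outcomes updating option values that then guide choice, can be sketched as a prediction-error update with softmax action selection. The ±1 outcome coding and all parameters are illustrative assumptions, not the authors' fitted model:

```python
import math
import random

random.seed(7)  # for reproducibility of this illustration

# Illustrative sketch: emotional outcomes act as signed reinforcers
# (+1 pleasant, 0 neutral, -1 unpleasant) that update option values,
# and the next choice is a softmax over the current values.
alpha, beta = 0.2, 3.0
q = {"A": 0.0, "B": 0.0}

def softmax_choice():
    """Choose option A with probability increasing in q[A] - q[B]."""
    p_a = 1.0 / (1.0 + math.exp(-beta * (q["A"] - q["B"])))
    return "A" if random.random() < p_a else "B"

outcome_value = {"pleasant": 1.0, "neutral": 0.0, "unpleasant": -1.0}
# Hypothetical contingency: option A yields pleasant pictures, B unpleasant ones.
for _ in range(30):
    choice = softmax_choice()
    r = outcome_value["pleasant" if choice == "A" else "unpleasant"]
    q[choice] += alpha * (r - q[choice])   # value update by prediction error
```

The learner comes to prefer the option paired with pleasant outcomes, which is the behavioural consequence the model-based fMRI analysis presupposes.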


2020 ◽  
Author(s):  
Alessandra D. Nostro ◽  
Kalliopi Ioumpa ◽  
Riccardo Paracampo ◽  
Selene Gallo ◽  
Laura Fornari ◽  
...  

Abstract Learning to predict how our actions result in conflicting outcomes for self and others is essential for social functioning, but remains poorly understood. We test whether Reinforcement Learning Theory captures how participants learn to choose between two symbols that define a moral conflict between financial gain to self and pain for others. Computational modelling and fMRI show that participants have dissociable representations for self-gain and pain to others. Signals in dorsal rostral cingulate and insulae track more closely with outcomes than prediction errors, while the opposite is true for the ventral rostral cingulate. Cognitive computational models estimated a valuational preference parameter that captured individual variability of choice in this moral conflict task. Participants’ valuational preferences predicted how much they chose to spend to reduce another person’s pain in an independent task. Learning separate representations for self and others allows participants to rapidly adapt to changes in contingencies during conflicts.
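A valuational preference parameter of this kind is commonly formalized as a weighted trade-off between gain to self and pain to the other. A sketch under that common assumption (the function, symbol values, and parameter names are hypothetical, not taken from the paper):

```python
# Illustrative sketch: a preference parameter theta in [0, 1] weighs money
# for self against pain for the other when the two outcome representations
# are combined into one subjective value. Names and values are hypothetical.
def subjective_value(gain_self, pain_other, theta):
    """theta = 1: fully selfish; theta = 0: fully other-regarding."""
    return theta * gain_self - (1.0 - theta) * pain_other

# The same high-gain/high-pain symbol is attractive to a selfish learner
# but aversive to an other-regarding one.
selfish = subjective_value(gain_self=10, pain_other=8, theta=0.9)
altruistic = subjective_value(gain_self=10, pain_other=8, theta=0.2)
```

Because the two outcome streams are represented separately, only theta needs to be re-weighted when contingencies change, consistent with the rapid adaptation the abstract describes.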


Author(s):  
Xinqi Zhou ◽  
Ting Xu ◽  
Yixu Zeng ◽  
Ran Zhang ◽  
Ziyu Qi ◽  
...  

Background
Social deficits and dysregulations in dopaminergic midbrain-striato-frontal circuits represent transdiagnostic symptoms across psychiatric disorders. Animal models suggest that modulating interactions between the dopamine and renin-angiotensin systems with the angiotensin receptor antagonist Losartan (LT) can modulate learning and reward-related processes. We have therefore determined the behavioral and neural effects of LT on social reward and punishment processing in humans.

Methods
A pre-registered, randomized, double-blind, placebo-controlled, between-subject pharmacological design was combined with a social incentive delay fMRI paradigm during which subjects could avoid social punishment or gain social reward. Healthy volunteers received a single dose of LT (50 mg, n = 43) or placebo (n = 44). Reaction times and emotional ratings served as behavioral outcomes; on the neural level, activation, connectivity, and social feedback prediction errors were modelled.

Results
Relative to placebo, LT switched reaction times and arousal away from prioritizing punishment towards social reward. On the neural level, the LT-enhanced motivational salience of social rewards was accompanied by stronger ventral striatum-prefrontal connectivity during reward anticipation, and by attenuated activity in the ventral tegmental area (VTA) and associated connectivity with the bilateral insula in response to punishment during the outcome phase. Computational modelling further revealed an LT-enhanced social reward prediction error signal in the VTA and dorsal striatum.

Conclusions
LT shifted motivational and emotional salience away from social punishment towards social reward by modulating distinct core nodes of the midbrain-striato-frontal circuits. The findings document a modulatory role of the renin-angiotensin system in these circuits and associated social processes, suggesting a promising treatment target to alleviate social dysregulations.

