From Feedback- to Response-based Performance Monitoring in Active and Observational Learning

2014 · Vol 26 (9) · pp. 2111–2127
Author(s): Christian Bellebaum, Marco Colosio

Humans can adapt their behavior by learning from the consequences of their own actions or by observing others. Gradual active learning of action–outcome contingencies is accompanied by a shift from feedback- to response-based performance monitoring. This shift is reflected by complementary learning-related changes of two ACC-driven ERP components, the feedback-related negativity (FRN) and the error-related negativity (ERN), which have both been suggested to signal events “worse than expected,” that is, a negative prediction error. Although recent research has identified comparable components for observed behavior and outcomes (observational ERN and FRN), it is as yet unknown whether these components are similarly modulated by prediction errors and thus also reflect behavioral adaptation. In this study, two groups of 15 participants learned action–outcome contingencies either actively or by observation. In active learners, FRN amplitude for negative feedback decreased and ERN amplitude in response to erroneous actions increased with learning, whereas the observational ERN and FRN in observational learners did not exhibit learning-related changes. Learning performance, assessed in test trials without feedback, was comparable between groups, as was the ERN following actively performed errors during test trials. In summary, the results show that action–outcome associations can be learned similarly well actively and by observation. The mechanisms involved appear to differ, however, with the FRN in active learning reflecting the integration of information about one's own actions and the accompanying outcomes.

2020
Author(s): Pieter Verbeke, Kate Ergo, Esther De Loof, Tom Verguts

In recent years, several hierarchical extensions of well-known learning algorithms have been proposed. For example, when stimulus-action mappings vary across time or context, the brain may learn two or more stimulus-action mappings in separate modules, and additionally (at a hierarchically higher level) learn to appropriately switch between those modules. However, how the brain mechanistically coordinates neural communication to implement such hierarchical learning remains unknown. Therefore, the current study tests a recent computational model that proposed how midfrontal theta oscillations implement such hierarchical learning via the principle of binding by synchrony (Sync model). More specifically, the Sync model employs bursts at theta frequency to flexibly bind appropriate task modules by synchrony. A 64-channel EEG signal was recorded while 27 human subjects (21 female, 6 male) performed a probabilistic reversal learning task. In line with the Sync model, post-feedback theta power showed a linear relationship with negative prediction errors, but not with positive prediction errors. This relationship was especially pronounced for subjects with a better behavioral fit (measured via AIC) of the Sync model. Also consistent with Sync model simulations, theta phase-coupling between midfrontal electrodes and temporo-parietal electrodes was stronger after negative feedback. Our data suggest that the brain uses theta power and synchronization for flexibly switching between task rule modules, as is useful, for example, when multiple stimulus-action mappings must be retained and used.

Significance Statement
Everyday life requires flexibility in switching between several rules. A key question in understanding this ability is how the brain mechanistically coordinates such switches. The current study tests a recent computational framework (Sync model) that proposed how midfrontal theta oscillations coordinate activity in hierarchically lower task-related areas. In line with predictions of this Sync model, midfrontal theta power was stronger when rule switches were most likely (strong negative prediction error), especially in subjects who obtained a better model fit. Additionally, theta phase connectivity between midfrontal and task-related areas was increased after negative feedback. Thus, the data support the hypothesis that the brain uses theta power and synchronization for flexibly switching between rules.
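The hierarchical switching idea can be made concrete with a short sketch. The code below is a deliberately simplified stand-in for the Sync model, with made-up parameters: two stimulus-action modules learn from prediction errors, and a hypothetical higher-level accumulator of negative prediction errors triggers a module switch. The actual model implements this gating via theta-band bursts and synchrony rather than an explicit accumulator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simplified, hypothetical stand-in for the Sync model's hierarchy:
# two stimulus-action modules learn from prediction errors; a higher
# level accumulates negative prediction errors and switches modules.

n_trials, alpha, switch_threshold = 400, 0.3, 2.0
Q = np.zeros((2, 2))            # Q[module, action]
active, neg_pe_sum = 0, 0.0
rule = 0                        # currently rewarded action; reverses midway

for t in range(n_trials):
    if t == n_trials // 2:
        rule = 1 - rule                           # probabilistic reversal
    action = int(np.argmax(Q[active]))
    reward = float(rng.random() < (0.8 if action == rule else 0.2))
    pe = reward - Q[active, action]
    Q[active, action] += alpha * pe               # lower-level learning
    if pe < 0:                                    # higher-level arbitration
        neg_pe_sum += -pe
        if neg_pe_sum > switch_threshold:
            active, neg_pe_sum = 1 - active, 0.0  # switch task module
    else:
        neg_pe_sum = 0.0
```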


2016 · Vol 18 (1) · pp. 23–32

Reward prediction errors consist of the differences between received and predicted rewards. They are crucial for basic forms of learning about rewards and make us strive for more rewards—an evolutionarily beneficial trait. Most dopamine neurons in the midbrain of humans, monkeys, and rodents signal a reward prediction error; they are activated by more reward than predicted (positive prediction error), remain at baseline activity for fully predicted rewards, and show depressed activity with less reward than predicted (negative prediction error). The dopamine signal increases nonlinearly with reward value and codes formal economic utility. Drugs of addiction generate, hijack, and amplify the dopamine reward signal and induce exaggerated, uncontrolled dopamine effects on neuronal plasticity. The striatum, amygdala, and frontal cortex also show reward prediction error coding, but only in subpopulations of neurons. Thus, the important concept of reward prediction errors is implemented in neuronal hardware.
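Stated as an equation (a standard textbook formalization consistent with the description above, not anything specific to this article; the utility function u is made explicit here only to capture the nonlinear utility coding mentioned later in the paragraph):

```latex
\delta_t = u(r_t) - \mathbb{E}\left[u(r_t)\right], \qquad
\begin{cases}
\delta_t > 0 & \text{more reward than predicted (phasic activation)}\\
\delta_t = 0 & \text{fully predicted reward (baseline activity)}\\
\delta_t < 0 & \text{less reward than predicted (depressed activity)}
\end{cases}
```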


2014 · Vol 26 (3) · pp. 635–644
Author(s): Olav E. Krigolson, Cameron D. Hassall, Todd C. Handy

Our ability to make decisions is predicated upon our knowledge of the outcomes of the actions available to us. Reinforcement learning theory posits that actions followed by a reward or punishment acquire value through the computation of prediction errors—discrepancies between the predicted and the actual reward. A multitude of neuroimaging studies have demonstrated that rewards and punishments evoke neural responses that appear to reflect reinforcement learning prediction errors [e.g., Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. Journal of Cognitive Neuroscience, 21, 1833–1840, 2009; Bayer, H. M., & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129–141, 2005; O'Doherty, J. P. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–776, 2004; Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–709, 2002]. Here, we used the event-related brain potential (ERP) technique to demonstrate that rewards not only elicit a neural response akin to a prediction error but also that this signal rapidly diminished and propagated to the time of choice presentation with learning. Specifically, in a simple, learnable gambling task, we show that novel rewards elicited a feedback error-related negativity that rapidly decreased in amplitude with learning. Furthermore, we demonstrate the existence of a reward positivity at choice presentation, a previously unreported ERP component with a timing and topography similar to those of the feedback error-related negativity, which increased in amplitude with learning. The pattern of results we observed mirrored the output of a computational model that we implemented to compute reward prediction errors and the changes in amplitude of these prediction errors at the time of choice presentation and reward delivery. Our results provide further support that the computations that underlie human learning and decision-making follow reinforcement learning principles.
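A minimal sketch of the value-transfer idea, assuming a single learnable predictor and a fixed reward (an illustration of the standard reinforcement-learning account, not the authors' actual computational model): as the value V of the choice display is learned, the prediction error at reward delivery shrinks while the error elicited by the unpredicted choice presentation grows.

```python
# Minimal value-transfer sketch (illustrative, not the authors' model):
# one predictor (the choice display) followed by a fixed reward.
alpha, reward = 0.2, 1.0
V = 0.0                        # learned value of the choice display
for t in range(60):
    pe_choice = V              # choice onset is unpredicted: PE = learned value
    pe_feedback = reward - V   # shrinks as the reward becomes predicted
    V += alpha * pe_feedback
    if t in (0, 5, 30):
        print(t, round(pe_choice, 2), round(pe_feedback, 2))
```

With learning, pe_feedback decays toward zero while pe_choice approaches the reward value, mirroring the decreasing feedback error-related negativity and the growing reward positivity reported above.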


2015 · Vol 114 (3) · pp. 1628–1640
Author(s): Kelly M. J. Diederen, Wolfram Schultz

Effective error-driven learning requires individuals to adapt learning to environmental reward variability. The adaptive mechanism may involve decays in learning rate across subsequent trials, as shown previously, and rescaling of reward prediction errors. The present study investigated the influence of prediction error scaling and, in particular, the consequences for learning performance. Participants explicitly predicted reward magnitudes that were drawn from different probability distributions with specific standard deviations. By fitting the data with reinforcement learning models, we found scaling of prediction errors, in addition to the learning rate decay shown previously. Importantly, the prediction error scaling was closely related to learning performance, defined as accuracy in predicting the mean of reward distributions, across individual participants. In addition, participants who scaled prediction errors relative to the standard deviation also showed more similar performance across different standard deviations, indicating that increases in standard deviation did not substantially decrease “adapters'” accuracy in predicting the means of reward distributions. However, exaggerated scaling beyond the standard deviation resulted in impaired performance. Thus, efficient adaptation makes learning more robust to changing variability.
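One way to see why rescaling stabilizes accuracy across reward variability is the sketch below (a single illustrative update rule with made-up parameters, not the fitted models from the study; the learning-rate decay reported in the paper is omitted for clarity). The unscaled learner's error grows roughly in proportion to the standard deviation, while the SD-scaled learner's error stays far more similar across conditions.

```python
import numpy as np

rng = np.random.default_rng(7)

def final_error(sigma, scale_pe, alpha=0.3, n_trials=200, mean=50.0, n_runs=300):
    """Mean absolute error in predicting the distribution mean."""
    errs = []
    for _ in range(n_runs):
        V = rng.normal(mean, sigma)           # initialize at a first sample
        for _ in range(n_trials):
            pe = rng.normal(mean, sigma) - V  # reward prediction error
            V += alpha * (pe / sigma if scale_pe else pe)
        errs.append(abs(V - mean))
    return float(np.mean(errs))

for sigma in (5.0, 10.0, 15.0):
    print(sigma,
          round(final_error(sigma, scale_pe=False), 2),  # unscaled learner
          round(final_error(sigma, scale_pe=True), 2))   # SD-scaled learner
```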


eLife · 2020 · Vol 9
Author(s): Loreen Hertäg, Henning Sprekeler

Sensory systems constantly compare external sensory information with internally generated predictions. While neural hallmarks of prediction errors have been found throughout the brain, the circuit-level mechanisms that underlie their computation are still largely unknown. Here, we show that a well-orchestrated interplay of three interneuron types shapes the development and refinement of negative prediction-error neurons in a computational model of mouse primary visual cortex. By balancing excitation and inhibition in multiple pathways, experience-dependent inhibitory plasticity can generate different variants of prediction-error circuits, which can be distinguished by simulated optogenetic experiments. The experience-dependence of the model circuit is consistent with that of negative prediction-error circuits in layer 2/3 of mouse primary visual cortex. Our model makes a range of testable predictions that may shed light on the circuitry underlying the neural computation of prediction errors.
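As a pointer to what such a circuit computes, here is a drastically reduced sketch of a balanced negative prediction-error neuron (a toy reduction of the paper's three-interneuron model, showing only the end state that inhibitory plasticity is supposed to reach, not the published implementation):

```python
# End-state sketch of a balanced negative prediction-error (nPE) neuron:
# top-down prediction excites, bottom-up sensory input inhibits (via an
# interneuron), and rectification keeps the cell silent unless sensation
# falls short of the prediction.

def npe_rate(prediction: float, sensory: float) -> float:
    return max(0.0, prediction - sensory)   # rectified E/I mismatch

for prediction, sensory in [(1.0, 1.0), (1.0, 0.4), (1.0, 1.6)]:
    print(prediction, sensory, npe_rate(prediction, sensory))
# match -> 0; less input than predicted -> active; more than predicted -> 0
```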


2021 · Vol 15
Author(s): Jessica A. Mollick, Luke J. Chang, Anjali Krishnan, Thomas E. Hazy, Kai A. Krueger, ...

Compared to our understanding of the positive prediction error signals that occur due to unexpected reward outcomes, less is known about the neural circuitry in humans that drives negative prediction errors during omission of expected rewards. While classical learning theories such as Rescorla–Wagner or temporal difference learning suggest that both types of prediction errors result from a simple subtraction, there has been recent evidence suggesting that different brain regions provide input to dopamine neurons that contributes to specific components of this prediction error computation. Here, we focus on the brain regions that respond to negative prediction error signals, which animal studies have well established to involve a distinct pathway through the lateral habenula. We examine the activity of this pathway in humans, using a conditioned inhibition paradigm with high-resolution functional MRI. First, participants learned to associate a sensory stimulus with reward delivery. Then, reward delivery was omitted whenever this stimulus was presented simultaneously with a different sensory stimulus, the conditioned inhibitor (CI). Both reward presentation and the reward-predictive cue activated midbrain dopamine regions, insula, and orbitofrontal cortex. While we found significant activity for the CI in the habenula at an uncorrected threshold, consistent with our predictions, it did not survive correction for multiple comparisons and awaits further replication. Additionally, the pallidum and putamen regions of the basal ganglia showed modulations of activity for the inhibitor that did not survive the corrected threshold.
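The conditioned-inhibition logic maps directly onto the Rescorla–Wagner subtraction mentioned above. Below is a minimal simulation sketch (illustrative learning rate and trial counts, not the study's design parameters) showing how the inhibitor acquires negative associative strength when reward is omitted from the compound:

```python
# Rescorla-Wagner sketch of conditioned inhibition (illustrative
# parameters). Phase 1: cue A is rewarded. Phase 2: the compound A+X is
# unrewarded, so the inhibitor X acquires negative associative strength,
# the substrate of the negative prediction error at reward omission.
alpha, lam = 0.2, 1.0
V = {"A": 0.0, "X": 0.0}

for _ in range(50):                 # phase 1: A -> reward
    V["A"] += alpha * (lam - V["A"])

for _ in range(50):                 # phase 2: A+X -> no reward
    pe = 0.0 - (V["A"] + V["X"])    # summed prediction vs. omission
    V["A"] += alpha * pe
    V["X"] += alpha * pe

print(round(V["A"], 2), round(V["X"], 2))  # X ends negative (about -0.5)
```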


2016
Author(s): Stefano Palminteri, Germain Lefebvre, Emma J. Kilford, Sarah-Jayne Blakemore

Previous studies suggest that factual learning, that is, learning from obtained outcomes, is biased, such that participants preferentially take into account positive, as compared to negative, prediction errors. However, whether or not the prediction error valence also affects counterfactual learning, that is, learning from forgone outcomes, is unknown. To address this question, we analysed the performance of two cohorts of participants on reinforcement learning tasks using a computational model that was adapted to test whether prediction error valence influences learning. Concerning factual learning, we replicated previous findings of a valence-induced bias, whereby participants learned preferentially from positive, relative to negative, prediction errors. In contrast, for counterfactual learning, we found the opposite valence-induced bias: negative prediction errors were preferentially taken into account relative to positive ones. When considering valence-induced bias in the context of both factual and counterfactual learning, it appears that people tend to preferentially take into account information that confirms their current choice.
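A compact sketch of the asymmetric-update idea tested here (the learning rates are illustrative, not the paper's fitted values):

```python
# Valence-asymmetric updating (illustrative parameters): separate learning
# rates for positive vs. negative prediction errors, with the asymmetry
# reversed for counterfactual (forgone-outcome) updates, so both biases
# favour information that confirms the current choice.

def update(value, outcome, lr_pos, lr_neg):
    pe = outcome - value
    return value + (lr_pos if pe > 0 else lr_neg) * pe

v_chosen = v_forgone = 0.5
# factual: positive PEs for the chosen option weigh more
v_chosen = update(v_chosen, outcome=1.0, lr_pos=0.4, lr_neg=0.1)
# counterfactual: negative PEs for the forgone option weigh more
v_forgone = update(v_forgone, outcome=0.0, lr_pos=0.1, lr_neg=0.4)
print(v_chosen, v_forgone)   # 0.7 and 0.3
```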


2019
Author(s): Ido Toren, Kristoffer Aberg, Rony Paz

The brain updates its internal representation of the environment by using the mismatch between the predicted state/outcome and the actual one, termed the prediction error. In parallel, time perception in the sub-second range is crucial for many behaviors such as movement, learning, memory, attention, and speech. Both time perception and prediction errors are essential to an organism's everyday functioning, and interestingly, the striatum has been shown to be independently involved in both functions. We therefore hypothesized that the putative shared circuitry might induce a behavioral interaction, namely that prediction errors might bias time perception. To examine this, participants performed a time-duration discrimination task in the presence of positive and negative prediction errors that were irrelevant to and independent of the main task. We find that positive and negative prediction errors bias time perception by increasing and decreasing the perceived duration, respectively. Using functional imaging, we identify an interaction in putamen activity between the encoding of prediction errors and performance in the discrimination task. A model that accounts for the behavioral and physiological observations confirms that the interaction in regional activations for prediction errors and time estimation underlies the observed bias. Our results demonstrate that these two presumably independent roles of the striatum can interfere with or aid one another in specific scenarios.
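As a purely descriptive sketch of the reported effect (the gain parameter k is hypothetical, and the paper's actual model additionally links regional activations, which this toy omits):

```python
# Descriptive sketch only: task-irrelevant prediction errors shift the
# perceived duration up (positive PE) or down (negative PE). The gain k
# is a hypothetical illustrative parameter.

def perceived_ms(true_ms: float, prediction_error: float, k: float = 0.05) -> float:
    return true_ms * (1.0 + k * prediction_error)

print(perceived_ms(500, +1.0))   # 525.0: lengthened
print(perceived_ms(500, -1.0))   # 475.0: shortened
```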


2021 · Vol 21 (1)
Author(s): Yibing Zhang, Tingyang Li, Aparna Reddy, Nambi Nallasamy

Objectives: To evaluate gender differences in optical biometry measurements and lens power calculations.
Methods: 8,431 eyes of 5,519 patients who underwent cataract surgery at the University of Michigan's Kellogg Eye Center were included in this retrospective study. Data including age, gender, optical biometry, postoperative refraction, implanted intraocular lens (IOL) power, and IOL formula refraction predictions were gathered and/or calculated using the Sight Outcomes Research Collaborative (SOURCE) database and analyzed.
Results: Every optical biometry measure differed significantly between genders. Despite lens constant optimization, mean signed prediction errors (SPEs) of modern IOL formulas differed significantly between genders, with predictions skewed more hyperopic for males and more myopic for females for all 5 of the modern IOL formulas tested. Optimization of lens constants by gender significantly decreased prediction error for 2 of the 5 modern IOL formulas tested.
Conclusions: Gender was found to be an independent predictor of refraction prediction error for all 5 formulas studied. Optimization of lens constants by gender can decrease refraction prediction error for certain modern IOL formulas.
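To make the outcome measure concrete, here is a minimal sketch of the signed prediction error and a per-gender zero-mean adjustment in the spirit of lens-constant optimization (the data are made-up placeholders, and real optimization tunes each formula's lens constant rather than adding a refraction offset):

```python
import numpy as np

# Hypothetical postoperative and formula-predicted refractions (diopters),
# grouped by gender; values are illustrative placeholders only.
postop = {"M": np.array([-0.10, 0.25, 0.05]),
          "F": np.array([-0.30, -0.15, 0.00])}
pred   = {"M": np.array([-0.25, 0.00, -0.10]),
          "F": np.array([-0.10, 0.05, 0.10])}

for g in ("M", "F"):
    spe = postop[g] - pred[g]      # signed prediction error (SPE)
    offset = spe.mean()            # gender-specific mean SPE
    adjusted = pred[g] + offset    # zero the mean SPE within the group
    print(g, round(float(offset), 3),
          round(float((postop[g] - adjusted).mean()), 3))
```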

