Scaling prediction errors to reward variability benefits error-driven learning in humans

2015 ◽  
Vol 114 (3) ◽  
pp. 1628-1640 ◽  
Author(s):  
Kelly M. J. Diederen ◽  
Wolfram Schultz

Effective error-driven learning requires individuals to adapt learning to environmental reward variability. The adaptive mechanism may involve decays in learning rate across subsequent trials, as shown previously, and rescaling of reward prediction errors. The present study investigated the influence of prediction error scaling and, in particular, the consequences for learning performance. Participants explicitly predicted reward magnitudes that were drawn from different probability distributions with specific standard deviations. By fitting the data with reinforcement learning models, we found scaling of prediction errors, in addition to the learning rate decay shown previously. Importantly, the prediction error scaling was closely related to learning performance, defined as accuracy in predicting the mean of reward distributions, across individual participants. In addition, participants who scaled prediction errors relative to standard deviation also presented with more similar performance for different standard deviations, indicating that increases in standard deviation did not substantially decrease “adapters'” accuracy in predicting the means of reward distributions. However, exaggerated scaling beyond the standard deviation resulted in impaired performance. Thus efficient adaptation makes learning more robust to changing variability.
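
The two mechanisms named in this abstract, a decaying learning rate and prediction errors scaled to the standard deviation, can be illustrated with a minimal sketch. It is not the authors' fitted model: the division of the error by `sd`, the decay schedule, and the names `rw_update` and `track_mean` are all assumptions chosen for illustration.

```python
import random

def rw_update(prediction, reward, alpha, sd=None):
    """One Rescorla-Wagner-style update; divide the error by sd when given."""
    error = reward - prediction          # reward prediction error
    if sd is not None:
        error = error / sd               # assumed scaling rule (illustrative only)
    return prediction + alpha * error

def track_mean(rewards, alpha0=0.5, decay=0.05, sd=None, start=0.0):
    """Run the update over a reward sequence with a decaying learning rate."""
    prediction = start
    for t, reward in enumerate(rewards):
        alpha = alpha0 / (1.0 + decay * t)   # hypothetical decay schedule
        prediction = rw_update(prediction, reward, alpha, sd)
    return prediction

if __name__ == "__main__":
    rng = random.Random(0)
    for sd in (5.0, 10.0, 15.0):
        # Rewards drawn around a mean of 50 with the given standard deviation.
        rewards = [rng.gauss(50.0, sd) for _ in range(200)]
        plain = track_mean(rewards, start=50.0)
        scaled = track_mean(rewards, sd=sd, start=50.0)
        print(f"sd={sd}: unscaled error={abs(plain - 50.0):.2f}, "
              f"scaled error={abs(scaled - 50.0):.2f}")
```

Dividing by `sd` shrinks each update in proportion to the noise level, which is one simple way to keep the final estimate from degrading as variability grows; dividing by more than the standard deviation would slow learning further, in the spirit of the "exaggerated scaling" the abstract mentions.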

2014 ◽  
Vol 26 (3) ◽  
pp. 447-458 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Josep Marco-Pallarés

In decision-making processes, the relevance of the information yielded by outcomes varies across time and situations. It increases when previous predictions are not accurate and in contexts with high environmental uncertainty. Previous fMRI studies have shown an important role of medial pFC in coding both reward prediction errors and the impact of this information to guide future decisions. However, it is unclear whether these two processes are dissociated in time or occur simultaneously, suggesting that a common mechanism is engaged. In the present work, we studied the modulation of two electrophysiological responses associated with outcome processing—the feedback-related negativity ERP and frontocentral theta oscillatory activity—with the reward prediction error and the learning rate. Twenty-six participants performed two learning tasks differing in the degree of predictability of the outcomes: a reversal learning task and a probabilistic learning task with multiple blocks of novel cue–outcome associations. We implemented a reinforcement learning model to obtain the single-trial reward prediction error and the learning rate for each participant and task. Our results indicated that midfrontal theta activity and feedback-related negativity increased linearly with the unsigned prediction error. In addition, variations of frontal theta oscillatory activity predicted the learning rate across tasks and participants. These results support the existence of a common brain mechanism for the computation of unsigned prediction error and learning rate.


2020 ◽  
Vol 15 (6) ◽  
pp. 695-707 ◽  
Author(s):  
Lei Zhang ◽  
Lukas Lengersdorff ◽  
Nace Mikus ◽  
Jan Gläscher ◽  
Claus Lamm

Recent years have witnessed a dramatic increase in the use of reinforcement learning (RL) models in social, cognitive and affective neuroscience. This approach, in combination with neuroimaging techniques such as functional magnetic resonance imaging, enables quantitative investigations into latent mechanistic processes. However, increased use of relatively complex computational approaches has led to potential misconceptions and imprecise interpretations. Here, we present a comprehensive framework for the examination of (social) decision-making with the simple Rescorla–Wagner RL model. We discuss common pitfalls in its application and provide practical suggestions. First, with simulation, we unpack the functional role of the learning rate and pinpoint what could easily go wrong when interpreting differences in the learning rate. Then, we discuss the inevitable collinearity between outcome and prediction error in RL models and provide suggestions of how to justify whether the observed neural activation is related to the prediction error rather than outcome valence. Finally, we suggest posterior predictive check is a crucial step after model comparison, and we articulate employing hierarchical modeling for parameter estimation. We aim to provide simple and scalable explanations and practical guidelines for employing RL models to assist both beginners and advanced users in better implementing and interpreting their model-based analyses.
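
Two points from this abstract, the role of the learning rate and the collinearity between outcome and prediction error, can be made concrete with a small simulation. This is a minimal sketch of the standard Rescorla–Wagner update, not the authors' code; the reward probability of 0.7, the trial count, and the helper names `rescorla_wagner` and `pearson` are assumptions made for illustration.

```python
import random

def rescorla_wagner(outcomes, alpha, v0=0.5):
    """Return per-trial values and prediction errors for a single cue."""
    value, values, errors = v0, [], []
    for outcome in outcomes:
        error = outcome - value        # prediction error on this trial
        values.append(value)           # value held before the update
        errors.append(error)
        value += alpha * error         # Rescorla-Wagner update
    return values, errors

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical task: binary reward delivered with probability 0.7.
    outcomes = [1.0 if rng.random() < 0.7 else 0.0 for _ in range(100)]
    for alpha in (0.1, 0.5):
        _, errors = rescorla_wagner(outcomes, alpha)
        print(f"alpha={alpha}: corr(outcome, PE) = {pearson(outcomes, errors):.2f}")
```

Because the prediction error is the outcome minus a slowly changing value, the two series are strongly correlated by construction, which is why a neural correlate of one cannot be attributed to the other without further checks.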


2014 ◽  
Vol 26 (9) ◽  
pp. 2111-2127 ◽  
Author(s):  
Christian Bellebaum ◽  
Marco Colosio

Humans can adapt their behavior by learning from the consequences of their own actions or by observing others. Gradual active learning of action–outcome contingencies is accompanied by a shift from feedback- to response-based performance monitoring. This shift is reflected by complementary learning-related changes of two ACC-driven ERP components, the feedback-related negativity (FRN) and the error-related negativity (ERN), which have both been suggested to signal events “worse than expected,” that is, a negative prediction error. Although recent research has identified comparable components for observed behavior and outcomes (observational ERN and FRN), it is as yet unknown, whether these components are similarly modulated by prediction errors and thus also reflect behavioral adaptation. In this study, two groups of 15 participants learned action–outcome contingencies either actively or by observation. In active learners, FRN amplitude for negative feedback decreased and ERN amplitude in response to erroneous actions increased with learning, whereas observational ERN and FRN in observational learners did not exhibit learning-related changes. Learning performance, assessed in test trials without feedback, was comparable between groups, as was the ERN following actively performed errors during test trials. In summary, the results show that action–outcome associations can be learned similarly well actively and by observation. The mechanisms involved appear to differ, with the FRN in active learning reflecting the integration of information about own actions and the accompanying outcomes.


2019 ◽  
Author(s):  
Lei Zhang ◽  
Lukas Lengersdorff ◽  
Nace Mikus ◽  
Jan Gläscher ◽  
Claus Lamm

Recent years have witnessed a dramatic increase in the use of reinforcement learning (RL) models in social, cognitive and affective neuroscience. This approach, in combination with neuroimaging techniques such as functional magnetic resonance imaging, enables quantitative investigations into latent mechanistic processes. However, increased use of relatively complex computational approaches has led to potential misconceptions and imprecise interpretations. Here, we present a comprehensive framework for the examination of (social) decision-making with the simple Rescorla-Wagner RL model. We discuss common pitfalls in its application and provide practical suggestions. First, with simulation, we unpack the functional role of the learning rate and pinpoint what could easily go wrong when interpreting differences in the learning rate. Then, we discuss the inevitable collinearity between outcome and prediction error in RL models and provide suggestions of how to justify whether the observed neural activation is related to the prediction error rather than outcome valence. Finally, we suggest posterior predictive check is a crucial step after model comparison, and we articulate employing hierarchical modeling for parameter estimation. We aim to provide simple and scalable explanations and practical guidelines for employing RL models to assist both beginners and advanced users in better implementing and interpreting their model-based analyses.


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Nina Rouhani ◽  
Yael Niv

Memory helps guide behavior, but which experiences from the past are prioritized? Classic models of learning posit that events associated with unpredictable outcomes as well as, paradoxically, predictable outcomes, deploy more attention and learning for those events. Here, we test reinforcement learning and subsequent memory for those events, and treat signed and unsigned reward prediction errors (RPEs), experienced at the reward-predictive cue or reward outcome, as drivers of these two seemingly contradictory signals. By fitting reinforcement learning models to behavior, we find that both RPEs contribute to learning by modulating a dynamically changing learning rate. We further characterize the effects of these RPE signals on memory, and show that both signed and unsigned RPEs enhance memory, in line with midbrain dopamine and locus-coeruleus modulation of hippocampal plasticity, thereby reconciling separate findings in the literature.


2017 ◽  
Author(s):  
N. Rouhani ◽  
K. A. Norman ◽  
Y. Niv

The extent to which rewards deviate from learned expectations is tracked by a signal known as a “reward prediction error”, but it is unclear how this signal interacts with episodic memory. Here, we investigated whether learning in a high-risk environment, with frequent large prediction errors, gives rise to higher fidelity memory traces than learning in a low-risk environment. In Experiment 1, we showed that higher magnitude prediction errors, positive or negative, improved recognition memory for trial-unique items. Participants also increased their learning rate after large prediction errors. In addition, there was an overall higher learning rate in the low-risk environment. Although unsigned prediction errors enhanced memory and increased learning rate, we did not find a relationship between learning rate and memory, suggesting that these two effects were due to separate underlying mechanisms. In Experiment 2, we replicated these results with a longer task that posed stronger memory demands and allowed for more learning. We also showed improved source and sequence memory for high-risk items. In Experiment 3, we controlled for the difficulty of learning in the two risk environments, again replicating the previous results. Moreover, equating the range of prediction errors in the two risk environments revealed that learning in a high-risk context enhanced episodic memory above and beyond the effect of prediction errors to individual items. In summary, our results across three studies showed that (absolute) prediction error magnitude boosted both episodic memory and incremental learning, but the two effects were not correlated, suggesting distinct underlying systems.


2020 ◽  
Vol 38 (3) ◽  
Author(s):  
Ainhoa Fernández-Pérez ◽  
María de las Nieves López-García ◽  
José Pedro Ramos Requena

In this paper we present a non-conventional statistical arbitrage technique based on varying the number of standard deviations used to carry out the trading strategy. We show that values of 1 and 1.2 standard deviations provide better results than the classic strategy of Gatev et al. (2006). An empirical application is performed using data from the FST100 index for the period 2010 to June 2019.
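
The idea of varying the number of standard deviations can be sketched as a threshold rule on the spread between two paired assets. This is a simplified illustration, not the paper's strategy: the function name `pair_signals`, the fixed `mean` and `sd` inputs, and the sample spread values are assumptions, and the sketch ignores position holding, closing rules and transaction costs. The classic rule is commonly described as opening a trade at two standard deviations.

```python
def pair_signals(spread, mean, sd, k):
    """Return +1 (long spread), -1 (short spread) or 0 for each observation.

    A position is signalled when the spread leaves the band mean +/- k * sd.
    """
    signals = []
    for s in spread:
        if s > mean + k * sd:
            signals.append(-1)   # spread unusually wide: short it
        elif s < mean - k * sd:
            signals.append(1)    # spread unusually narrow: go long
        else:
            signals.append(0)    # inside the band: no signal
    return signals

if __name__ == "__main__":
    # Hypothetical normalized spread with historical mean 0 and sd 1.
    spread = [0.0, 1.5, -2.5, 0.9, 1.1]
    for k in (1.0, 1.2, 2.0):
        print(f"k={k}: {pair_signals(spread, 0.0, 1.0, k)}")
```

Lowering `k` makes the rule trigger on smaller divergences, so it opens more trades; whether that improves results is the empirical question the paper addresses.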


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yibing Zhang ◽  
Tingyang Li ◽  
Aparna Reddy ◽  
Nambi Nallasamy

Objectives: To evaluate gender differences in optical biometry measurements and lens power calculations. Methods: A total of 8,431 eyes of 5,519 patients who underwent cataract surgery at University of Michigan’s Kellogg Eye Center were included in this retrospective study. Data including age, gender, optical biometry, postoperative refraction, implanted intraocular lens (IOL) power, and IOL formula refraction predictions were gathered and/or calculated utilizing the Sight Outcomes Research Collaborative (SOURCE) database and analyzed. Results: Every optical biometry measure differed statistically between genders. Despite lens constant optimization, mean signed prediction errors (SPEs) of modern IOL formulas differed significantly between genders, with predictions skewed more hyperopic for males and myopic for females for all 5 of the modern IOL formulas tested. Optimization of lens constants by gender significantly decreased prediction error for 2 of the 5 modern IOL formulas tested. Conclusions: Gender was found to be an independent predictor of refraction prediction error for all 5 formulas studied. Optimization of lens constants by gender can decrease refraction prediction error for certain modern IOL formulas.


2012 ◽  
Vol 6-7 ◽  
pp. 428-433
Author(s):  
Yan Wei Li ◽  
Mei Chen Wu ◽  
Tung Shou Chen ◽  
Wien Hong

We propose a reversible data hiding technique to improve Hong and Chen’s (2010) method. Hong and Chen divide the cover image into pixel groups and use reference pixels to predict other pixel values. Data are then embedded by modifying the prediction errors. However, when solving the overflow and underflow problems, they employ a location map to record the positions of saturated pixels, and these pixels are not used to carry data. In their method, if the image has many saturated pixels, the payload decreases significantly because those pixels do not take part in the embedding. We improve Hong and Chen’s method so that the saturated pixels can be used to carry data. The positions of these saturated pixels are then recorded in a location map, and the location map is embedded together with the secret data. The experimental results show that the proposed method achieves a higher payload while providing comparable image quality.


2018 ◽  
Vol 8 (12) ◽  
pp. 228 ◽  
Author(s):  
Akiko Mizuno ◽  
Maria Ly ◽  
Howard Aizenstein

Subjective Cognitive Decline (SCD) is possibly one of the earliest detectable signs of dementia, but we do not know which mental processes lead to elevated concern. In this narrative review, we summarize the previous literature on the biomarkers and functional neuroanatomy of SCD. To extend the prevailing theory of SCD, compensatory hyperactivation, we introduce a new model: the breakdown of homeostasis in the prediction error minimization system. A cognitive prediction error is a discrepancy between an implicit cognitive prediction and the corresponding outcome. Experiencing frequent prediction errors may be a primary source of elevated subjective concern. Our homeostasis breakdown model provides an explanation for the progression both from normal cognition to SCD and from SCD to advanced dementia stages.

