Multiple systems in macaques for tracking prediction errors and other types of surprise

PLoS Biology ◽  
2020 ◽  
Vol 18 (10) ◽  
pp. e3000899
Author(s):  
Jan Grohn ◽  
Urs Schüffelgen ◽  
Franz-Xaver Neubert ◽  
Alessandro Bongioanni ◽  
Lennart Verhagen ◽  
...  

Animals learn from the past to make predictions. These predictions are adjusted after prediction errors, i.e., after surprising events. Generally, most reward prediction error models learn the average expected amount of reward. However, here we demonstrate the existence of distinct mechanisms for detecting other types of surprising events. Six macaques learned to respond to visual stimuli to receive varying amounts of juice rewards. Most trials ended with the delivery of either 1 or 3 juice drops so that animals learned to expect 2 juice drops on average even though instances of precisely 2 drops were rare. To encourage learning, we also included sessions during which the ratio between 1 and 3 drops changed. Additionally, in all sessions, the stimulus sometimes appeared in an unexpected location. Thus, 3 types of surprising events could occur: reward amount surprise (i.e., a scalar reward prediction error), rare reward surprise, and visuospatial surprise. Importantly, we can dissociate scalar reward prediction errors (rewards that deviated from the average reward amount expected) and rare reward events (rewards that accorded with the average reward expectation but that rarely occurred). We linked each type of surprise to a distinct pattern of neural activity using functional magnetic resonance imaging. Activity in the vicinity of the dopaminergic midbrain only reflected surprise about the amount of reward. Lateral prefrontal cortex had a more general role in detecting surprising events. Posterior lateral orbitofrontal cortex specifically detected rare reward events regardless of whether they followed average reward amount expectations, but only in learnable reward environments.
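
The dissociation between a scalar reward prediction error and a rare reward event can be illustrated with a minimal sketch. This is not the authors' analysis: the delta-rule learner, the Laplace-smoothed outcome counts, the learning rate `alpha`, and all function names are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def scalar_rpe(reward, expected):
    """Signed reward prediction error: delivered amount minus expected amount."""
    return reward - expected


def rarity_surprise(reward, counts, outcomes=(1, 2, 3)):
    """Shannon surprise (-log2 p) of a reward amount, using Laplace-smoothed counts."""
    total = sum(counts[o] for o in outcomes) + len(outcomes)
    p = (counts[reward] + 1) / total
    return -math.log2(p)


def run(rewards, alpha=0.1, initial_expectation=2.0):
    """Track the average reward (delta rule) and outcome frequencies across trials.

    alpha and initial_expectation are illustrative values, not fitted parameters.
    Returns the final expectation and a per-trial list of (reward, rpe, surprise).
    """
    expected = initial_expectation
    counts = Counter()
    trace = []
    for r in rewards:
        rpe = scalar_rpe(r, expected)
        surprise = rarity_surprise(r, counts)
        trace.append((r, rpe, surprise))
        expected += alpha * rpe   # update the average reward expectation
        counts[r] += 1            # update how often this amount has occurred
    return expected, trace


if __name__ == "__main__":
    # Mostly 1 or 3 drops, then a single rare 2-drop outcome.
    _, trace = run([1, 3] * 50 + [2])
    reward, rpe, surprise = trace[-1]
    print(f"reward={reward} rpe={rpe:.3f} rarity_surprise={surprise:.2f} bits")
```

In this toy setting the final 2-drop trial produces a scalar prediction error close to zero, because it matches the learned average, while its frequency-based surprise is high, because the outcome has rarely occurred.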


2017 ◽  
Author(s):  
Ian Ballard ◽  
Eric M. Miller ◽  
Steven T. Piantadosi ◽  
Noah Goodman ◽  
Samuel M. McClure

Abstract Humans naturally group the world into coherent categories defined by membership rules. Rules can be learned implicitly by building stimulus-response associations using reinforcement learning (RL) or by using explicit reasoning. We tested if the striatum, in which activation reliably scales with reward prediction error, would track prediction errors in a task that required explicit rule generation. Using functional magnetic resonance imaging during a categorization task, we show that striatal responses to feedback scale with a “surprise” signal derived from a Bayesian rule-learning model and are inconsistent with RL prediction error. We also find that striatum and caudal inferior frontal sulcus (cIFS) are involved in updating the likelihood of discriminative rules. We conclude that the striatum, in cooperation with the cIFS, is involved in updating the values assigned to categorization rules when people learn using explicit reasoning.



eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Nina Rouhani ◽  
Yael Niv

Memory helps guide behavior, but which experiences from the past are prioritized? Classic models of learning posit that events associated with unpredictable outcomes as well as, paradoxically, predictable outcomes, deploy more attention and learning for those events. Here, we test reinforcement learning and subsequent memory for those events, and treat signed and unsigned reward prediction errors (RPEs), experienced at the reward-predictive cue or reward outcome, as drivers of these two seemingly contradictory signals. By fitting reinforcement learning models to behavior, we find that both RPEs contribute to learning by modulating a dynamically changing learning rate. We further characterize the effects of these RPE signals on memory, and show that both signed and unsigned RPEs enhance memory, in line with midbrain dopamine and locus-coeruleus modulation of hippocampal plasticity, thereby reconciling separate findings in the literature.
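
One generic way to let an unsigned RPE modulate a dynamically changing learning rate is a Pearce-Hall-style associability term. The sketch below is only an illustration of that idea and is not the model fitted in the study; the parameter names `kappa` and `eta`, their default values, and the function name are assumptions.

```python
def hybrid_update(value, associability, reward, kappa=0.5, eta=0.3):
    """One trial of a Pearce-Hall-style value update (illustrative sketch).

    value:         current reward expectation
    associability: dynamic learning-rate term that tracks recent |RPE|
    kappa:         fixed scaling of the learning rate (assumed value)
    eta:           how quickly associability follows |RPE| (assumed value)
    """
    rpe = reward - value  # signed RPE drives the direction of the update
    new_value = value + kappa * associability * rpe
    # The unsigned RPE raises associability after surprising outcomes
    # and lets it decay after well-predicted ones.
    new_associability = (1 - eta) * associability + eta * abs(rpe)
    return new_value, new_associability, rpe


if __name__ == "__main__":
    value, assoc = 0.5, 0.5
    for reward in [1.0, 1.0, 0.0, 1.0]:
        value, assoc, rpe = hybrid_update(value, assoc, reward)
        print(f"reward={reward} rpe={rpe:+.3f} value={value:.3f} assoc={assoc:.3f}")
```

With rewards scaled to the range 0 to 1, the associability term stays in the same range, so the effective learning rate `kappa * associability` never exceeds `kappa`.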



2017 ◽  
Vol 28 (11) ◽  
pp. 3965-3975 ◽  
Author(s):  
Ian Ballard ◽  
Eric M Miller ◽  
Steven T Piantadosi ◽  
Noah D Goodman ◽  
Samuel M McClure

Abstract Humans naturally group the world into coherent categories defined by membership rules. Rules can be learned implicitly by building stimulus-response associations using reinforcement learning or by using explicit reasoning. We tested if the striatum, in which activation reliably scales with reward prediction error, would track prediction errors in a task that required explicit rule generation. Using functional magnetic resonance imaging during a categorization task, we show that striatal responses to feedback scale with a “surprise” signal derived from a Bayesian rule-learning model and are inconsistent with RL prediction error. We also find that striatum and caudal inferior frontal sulcus (cIFS) are involved in updating the likelihood of discriminative rules. We conclude that the striatum, in cooperation with the cIFS, is involved in updating the values assigned to categorization rules when people learn using explicit reasoning.



2017 ◽  
Vol 69 (5) ◽  
pp. 486-502 ◽  
Author(s):  
Wei Quan ◽  
Bikun Chen ◽  
Fei Shu

Purpose: The purpose of this paper is to present the landscape of the cash-per-publication reward policy in China and reveal its trend since the late 1990s. Design/methodology/approach: This study is based on the analysis of 168 university documents regarding the cash-per-publication reward policy at 100 Chinese universities. Findings: Chinese universities offer cash rewards from USD30 to USD165,000 for papers published in journals indexed by Web of Science, and the average reward amount has been increasing for the past ten years. Originality/value: The cash-per-publication reward policy in China has never been systematically studied and investigated before except for in some case studies. This is the first paper that reveals the landscape of the cash-per-publication reward policy in China.



2020 ◽  
Vol 28 (2) ◽  
pp. 298-318
Author(s):  
Roman Girma Teshome

The effectiveness of human rights adjudicative procedures partly, if not most importantly, hinges upon the adequacy of the remedies they grant and the implementation of those remedies. This assertion also holds water with regard to the international and regional monitoring bodies established to receive individual complaints related to economic, social and cultural rights (hereinafter ‘ESC rights’ or ‘socio-economic rights’). Remedies can serve two major functions: they are meant, first, to rectify the pecuniary and non-pecuniary damage sustained by the particular victim, and second, to resolve systematic problems existing in the state machinery in order to ensure the non-repetition of the act. Hence, the role of remedies is not confined to correcting the past but also shaping the future by providing reforming measures a state has to undertake. The adequacy of remedies awarded by international and regional human rights bodies is also assessed based on these two benchmarks. The present article examines these issues in relation to individual complaint procedures that deal with the violation of ESC rights, with particular reference to the case laws of the three jurisdictions selected for this work, i.e. the United Nations, Inter-American and African Human Rights Systems.



2020 ◽  
Author(s):  
Kate Ergo ◽  
Luna De Vilder ◽  
Esther De Loof ◽  
Tom Verguts

Recent years have witnessed a steady increase in the number of studies investigating the role of reward prediction errors (RPEs) in declarative learning. Specifically, in several experimental paradigms RPEs drive declarative learning, with larger and more positive RPEs enhancing declarative learning. However, it is unknown whether this RPE must derive from the participant’s own response, or whether instead any RPE is sufficient to obtain the learning effect. To test this, we generated RPEs in the same experimental paradigm where we combined an agency and a non-agency condition. We observed no interaction between RPE and agency, suggesting that any RPE (irrespective of its source) can drive declarative learning. This result holds implications for declarative learning theory.



Author(s):  
Rowland W Pettit ◽  
Jordan Kaplan ◽  
Matthew M Delancy ◽  
Edward Reece ◽  
Sebastian Winocour ◽  
...  

Abstract Background: The Open Payments Program, as designated by the Physician Payments Sunshine Act, is the single largest repository of industry payments made to licensed physicians within the United States. Though sizeable in its dataset, the database and user interface are limited in their ability to permit expansive data interpretation and summarization. Objectives: We sought to comprehensively compare industry payments made to plastic surgeons with payments made to all surgeons and all physicians to elucidate industry relationships since implementation. Methods: The Open Payments Database was queried between 2014 and 2019, and inclusion criteria were applied. These data were evaluated in aggregate and for yearly totals, payment type, and geographic distribution. Results: 61,000,728 unique payments totaling $11,815,248,549 were identified over the six-year study period. 9,089 plastic surgeons, 121,151 surgeons, and 796,260 total physicians received these payments. Plastic surgeons annually received significantly less payment than all surgeons (p = 0.0005). However, plastic surgeons did not receive significantly more payment than all physicians (p = 0.0840). Cash and cash equivalents proved to be the most common form of payment; stock and stock options were least commonly transferred. Plastic surgeons in Tennessee received the most in payments between 2014 and 2019 (mean $76,420.75). California had the greatest number of plastic surgeons to receive payments (1,452 surgeons). Conclusions: Plastic surgeons received more in industry payments than the average of all physicians but received less than all surgeons. The most common payment was cash transactions. Over the past six years, geographic trends in industry payments have remained stable.





2017 ◽  
Vol 210 (4) ◽  
pp. 307-308 ◽  
Author(s):  
Derek K. Tracy ◽  
Dan W. Joyce ◽  
Sukhwinder S. Shergill

Quitting smoking isn't easy, even with the advent of e-cigarettes. The NHS Stop Smoking Services (SSSs) were established in 2000, and have shown superior results to nicotine replacement alone, but are characterised by low, and dropping, attendance rates. Beneath the highlight figure of a halving of UK smoking prevalence over the past 40 years lies a direct £6 billion cost to the NHS and 80000 deaths each year, as well as recent concern that clinical commissioning groups are not renewing service funding. Given that the ‘health belief model’ is based upon a trigger changing behaviour, what will encourage attendance at SSSs, especially with evidence that smokers underestimate their own personal risk? Gilbert et al randomised over 4000 smokers across almost 100 general practices to receive either a standard generic advertisement of the SSS clinic, or an individually tailored risk letter and invitation to a no-commitment introductory SSS session. The hosting general practitioners (GPs) and SSS advisors were masked to the allocation. The personalised letter more than doubled the odds of attending the SSS, showing that a more proactive approach can help engagement. Interestingly, the intervention was more effective with men, who are typically less likely to attend and set quit dates.



2021 ◽  
Author(s):  
Joseph Heffner ◽  
Jae-Young Son ◽  
Oriel FeldmanHall

People make decisions based on deviations from expected outcomes, known as prediction errors. Past work has focused on reward prediction errors, largely ignoring violations of expected emotional experiences (emotion prediction errors). We leverage a new method to measure real-time fluctuations in emotion as people decide to punish or forgive others. Across four studies (N=1,016), we reveal that emotion and reward prediction errors have distinguishable contributions to choice, such that emotion prediction errors exert the strongest impact during decision-making. We additionally find that a choice to punish or forgive can be decoded in less than a second from an evolving emotional response, suggesting emotions swiftly influence choice. Finally, individuals reporting significant levels of depression exhibit selective impairments in using emotion (but not reward) prediction errors. Evidence for emotion prediction errors potently guiding social behaviors challenges standard decision-making models that have focused solely on reward.


