reward prediction error
Recently Published Documents

TOTAL DOCUMENTS: 127 (five years: 38)
H-INDEX: 34 (five years: 5)

2021
Author(s): Karel Kieslich, Vincent Valton, Jonathan Paul Roiser

In order to develop effective treatments for anhedonia we need to understand its underlying neurobiological mechanisms. Anhedonia is conceptually strongly linked to reward processing, which involves a variety of cognitive and neural operations. This article reviews the evidence for impairments in experiencing hedonic response (pleasure), reward valuation, and reward learning based on outcomes (commonly conceptualised in terms of “reward prediction error”). Synthesizing behavioural and neuroimaging findings, we examine case-control studies of patients with depression and schizophrenia, including those focusing specifically on anhedonia. Overall, there is reliable evidence that depression and schizophrenia are associated with disrupted reward processing. In contrast to the historical definition of anhedonia, there is surprisingly limited evidence for impairment in the ability to experience pleasure in depression and schizophrenia. There is some evidence that learning about reward and reward prediction error signals are impaired in depression and schizophrenia, but the literature is inconsistent. The strongest evidence is for impairments in the representation of reward value and how this is used to guide action. Future studies would benefit from focusing on impairments in reward processing specifically in anhedonic samples, including transdiagnostically, and from using designs separating different components of reward processing, formulating them in computational terms, and moving beyond cross-sectional designs to provide an assessment of causality.
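The outcome-based learning the abstract refers to ("reward prediction error") can be sketched with a minimal delta-rule update, in which the value estimate moves toward each observed outcome in proportion to the error. The function name, learning rate, and reward values below are illustrative assumptions, not taken from any of the reviewed studies.

```python
# Minimal sketch of reward learning driven by a reward prediction error
# (a Rescorla-Wagner / delta-rule update). All values are illustrative.

def update_value(value, reward, alpha=0.1):
    """One learning step: move the value estimate toward the outcome."""
    prediction_error = reward - value        # the "reward prediction error"
    return value + alpha * prediction_error

value = 0.0
for _ in range(200):                         # repeated rewarded trials
    value = update_value(value, reward=1.0)

# With enough trials the value estimate converges toward the true reward,
# so the prediction error shrinks toward zero.
```

Impaired reward learning, in these terms, would show up as an abnormal learning rate or a blunted prediction-error signal rather than an inability to experience the reward itself.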


2021
Author(s): Anthony M.V. Jakob, John G. Mikhael, Allison E. Hamilos, John A. Assad, Samuel J. Gershman

The role of dopamine as a reward prediction error signal in reinforcement learning tasks has been well established over the past decades. Recent work has shown that the reward prediction error interpretation can also account for the effects of dopamine on interval timing by controlling the speed of subjective time. According to this theory, the timing of the dopamine signal relative to reward delivery dictates whether subjective time speeds up or slows down: early dopamine signals speed up subjective time and late signals slow it down. To test this bidirectional prediction, we reanalyzed measurements of dopaminergic neurons in the substantia nigra pars compacta of mice performing a self-timed movement task. Using the slope of ramping dopamine activity as a read-out of subjective time speed, we found that trial-by-trial changes in the slope could be predicted from the timing of dopamine activity on the previous trial. This result provides a key piece of evidence supporting a unified computational theory of reinforcement learning and interval timing.


2021
Vol 11 (1)
Author(s): Harry J. Stewardson, Thomas D. Sambrook

Abstract: Reinforcement learning in humans and other animals is driven by reward prediction errors: deviations between the amount of reward or punishment initially expected and that which is obtained. Temporal difference methods of reinforcement learning generate this reward prediction error at the earliest time at which a revision in reward or punishment likelihood is signalled, for example by a conditioned stimulus. Midbrain dopamine neurons, believed to compute reward prediction errors, generate this signal in response to both conditioned and unconditioned stimuli, as predicted by temporal difference learning. Electroencephalographic recordings of human participants have suggested that a component named the feedback-related negativity (FRN) is generated when this signal is carried to the cortex. If this is so, the FRN should be expected to respond equivalently to conditioned and unconditioned stimuli. However, very few studies have attempted to measure the FRN's response to unconditioned stimuli. The present study attempted to elicit the FRN in response to a primary aversive stimulus (electric shock) using a design that varied reward prediction error while holding physical intensity constant. The FRN was strongly elicited, but earlier and more transiently than typically seen, suggesting that it may incorporate processes other than the midbrain dopamine system.
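The temporal-difference property the abstract describes, where the prediction error migrates to the earliest predictive cue, can be sketched with a tabular TD(0) update over a single trial structure: a conditioned stimulus (CS) arrives unpredictably and a reward follows a fixed number of steps later. The trial length, learning rate, and the clamped zero baseline before CS onset are illustrative assumptions.

```python
# Hedged sketch of temporal-difference (TD) learning in which the
# prediction error shifts from the reward to the conditioned stimulus.
# Parameters and trial structure are illustrative, not from the study.

ALPHA = 0.2
N = 5                         # time steps from CS onset to reward
V = [0.0] * (N + 1)           # V[0] = CS state ... V[N] = terminal state

def run_trial(V):
    """One pass through the trial; returns the TD error at each step."""
    # The CS arrives unpredictably, so the pre-CS prediction is 0 and
    # the TD error at CS onset equals the learned value of the CS state.
    deltas = [V[0] - 0.0]
    for t in range(N):
        r = 1.0 if t == N - 1 else 0.0        # reward on the final step
        delta = r + V[t + 1] - V[t]           # TD prediction error (gamma = 1)
        V[t] += ALPHA * delta
        deltas.append(delta)
    return deltas

for _ in range(500):
    deltas = run_trial(V)

# After training: a large TD error at CS onset (the earliest predictor),
# and a near-zero TD error at reward delivery, which is fully predicted.
```

This is the sense in which a TD-like FRN should respond to conditioned as well as unconditioned stimuli: the same signal appears wherever reward likelihood is first revised.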


2021
Vol 67, pp. 123-130
Author(s): Talia N. Lerner, Ashley L. Holloway, Jillian L. Seiler

eLife
2020
Vol 9
Author(s): Bastien Blain, Robb B. Rutledge

Subjective well-being or happiness is often associated with wealth. Recent studies suggest that momentary happiness is associated with reward prediction error, the difference between experienced and predicted reward, a key component of adaptive behaviour. We tested subjects in a reinforcement learning task in which reward size and probability were uncorrelated, allowing us to dissociate between the contributions of reward and learning to happiness. Using computational modelling, we found convergent evidence across stable and volatile learning tasks that happiness, like behaviour, is sensitive to learning-relevant variables (i.e. probability prediction error). Unlike behaviour, happiness is not sensitive to learning-irrelevant variables (i.e. reward prediction error). Increasing volatility reduces how many past trials influence behaviour but not happiness. Finally, depressive symptoms reduce happiness more in volatile than stable environments. Our results suggest that how we learn about our world may be more important for how we feel than the rewards we actually receive.
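Models of momentary happiness in this literature typically express happiness as a baseline plus an exponentially decaying weighted sum of recent task variables, such as prediction errors. The sketch below illustrates that general form only; the weights, decay rate, and inputs are illustrative assumptions, not the model fitted in the study.

```python
# Illustrative sketch of a momentary-happiness model: baseline plus a
# geometrically decaying sum of recent prediction errors. All parameter
# values are made up for illustration.

def momentary_happiness(prediction_errors, w0=50.0, w=10.0, decay=0.6):
    """Happiness after each trial, on an arbitrary 0-100-style scale."""
    happiness = []
    for t in range(len(prediction_errors)):
        # Recent prediction errors dominate; older ones decay geometrically.
        influence = sum(decay ** (t - j) * prediction_errors[j]
                        for j in range(t + 1))
        happiness.append(w0 + w * influence)
    return happiness

h = momentary_happiness([1.0, -0.5, 0.0, 1.0])
```

Under this form, the study's contrast amounts to asking which inputs (probability prediction errors versus reward prediction errors) carry weight in the decaying sum.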


2020
Vol 11 (1)
Author(s): Alexandre Y. Dombrovski, Beatriz Luna, Michael N. Hallquist

Abstract: When making decisions, should one exploit known good options or explore potentially better alternatives? Exploration of spatially unstructured options depends on the neocortex, striatum, and amygdala. In natural environments, however, better options often cluster together, forming structured value distributions. The hippocampus binds reward information into allocentric cognitive maps to support navigation and foraging in such spaces. Here we report that human posterior hippocampus (PH) invigorates exploration while anterior hippocampus (AH) supports the transition to exploitation on a reinforcement learning task with a spatially structured reward function. These dynamics depend on differential reinforcement representations in the PH and AH. Whereas local reward prediction error signals are early and phasic in the PH tail, global value maximum signals are delayed and sustained in the AH body. AH compresses reinforcement information across episodes, updating the location and prominence of the value maximum and displaying goal cell-like ramping activity when navigating toward it.
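The explore/exploit trade-off the abstract opens with is often formalised as softmax (temperature-controlled) action selection: a high temperature spreads choice across options (exploration), while a low temperature concentrates it on the best-known option (exploitation). This is a generic illustration of the trade-off, not the model used in the study; all values are made up.

```python
# Generic sketch of the explore/exploit trade-off via softmax choice.
# High temperature -> near-uniform choice; low temperature -> near-greedy.

import math
import random

def softmax_choice(values, temperature):
    """Pick an option index; also return the choice probabilities."""
    weights = [math.exp(v / temperature) for v in values]
    total = sum(weights)
    probs = [w / total for w in weights]
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i, probs
    return len(values) - 1, probs

values = [0.2, 0.8, 0.5]                                     # learned values
_, explore_probs = softmax_choice(values, temperature=5.0)   # exploration
_, exploit_probs = softmax_choice(values, temperature=0.05)  # exploitation
```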

