Phasic dopamine enhances the distinct decoding and perceived salience of stimuli

2019
Author(s): Lars-Lennart Oettl, Max Scheller, Sebastian Wieland, Franziska Haag, David Wolf, et al.

Abstract: Subjects learn to assign value to stimuli that predict outcomes. Novelty, reward, or punishment evokes reinforcing phasic dopamine release from midbrain neurons to the ventral striatum, which mediates the expected value and salience of stimuli in humans and animals. It is not clear, however, whether phasic dopamine release is sufficient to form distinct engrams that encode salient stimuli within these circuits. We addressed this question in awake mice. Evoked phasic dopamine selectively induced plasticity in the population encoding of coincidentally presented stimuli and increased their distinctness from other stimuli. Phasic dopamine thereby enhanced the decoding of previously paired stimuli and increased their perceived salience. This dopamine-induced plasticity mimicked the population coding dynamics of conditioned stimuli during reinforcement learning. These findings provide a network coding mechanism for how dopaminergic learning signals promote value assignment to stimulus representations.
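The central readout here is population decoding. A minimal sketch of the idea, on simulated data only: model dopamine-paired plasticity as a selective gain change on the paired stimulus's population response and measure how that changes a linear decoder's accuracy. The tuning values, noise model, and gain factor below are illustrative assumptions, not the paper's analysis pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_neurons, n_trials = 50, 200

# Illustrative population tuning for two stimuli with overlapping coding.
tuning_a = rng.normal(1.0, 0.5, n_neurons)
tuning_b = rng.normal(1.0, 0.5, n_neurons)

def responses(tuning, gain=1.0):
    """Trial-by-trial population responses with independent Gaussian noise."""
    return gain * tuning + rng.normal(0.0, 1.0, (n_trials, n_neurons))

def decoding_accuracy(gain_a):
    """Cross-validated accuracy of a linear decoder for stimulus identity."""
    X = np.vstack([responses(tuning_a, gain_a), responses(tuning_b)])
    y = np.r_[np.zeros(n_trials), np.ones(n_trials)]
    return cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()

print(f"decoding before pairing: {decoding_accuracy(1.0):.2f}")
# Dopamine-paired plasticity modeled (hypothetically) as a selective gain
# change to the paired stimulus, increasing its distinctness from the other.
print(f"decoding after pairing:  {decoding_accuracy(1.5):.2f}")
```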

2009, Vol. 21(7), pp. 1332-1345
Author(s): Thorsten Kahnt, Soyoung Q Park, Michael X Cohen, Anne Beck, Andreas Heinz, et al.

It has been suggested that the target areas of dopaminergic midbrain neurons, the dorsal striatum (DS) and ventral striatum (VS), are differentially involved in reinforcement learning, specifically in their roles as actor and critic. Whereas the critic learns to predict rewards, the actor maintains action values to guide future decisions. The distinct midbrain connections to the DS and the VS seem to play a critical role in this functional division. Here, subjects performed a dynamic, reward-based decision-making task during fMRI acquisition. A computational model of reinforcement learning was used to estimate, for each subject individually, the different effects of positive and negative reinforcements on future decisions. We found that activity in both the DS and the VS correlated with reward prediction errors. Using functional connectivity, we show that the DS and the VS are differentially connected to distinct midbrain regions (possibly corresponding to the substantia nigra [SN] and the ventral tegmental area [VTA], respectively). However, only functional connectivity between the DS and the putative SN predicted the impact of different reinforcement types on future behavior. These results suggest that connections between the putative SN and the DS are critical for modulating action values in the DS according to both positive and negative reinforcements to guide future decision making.
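A minimal sketch of this class of model, assuming a two-armed bandit: Q-learning with softmax choice and separate learning rates for positive and negative reinforcement, producing the trial-by-trial reward prediction errors that such fMRI analyses regress against striatal activity. Parameter values and names are illustrative, not taken from the paper.

```python
import numpy as np

def simulate_choices(rewards, alpha_pos=0.3, alpha_neg=0.1, beta=3.0, seed=0):
    """Q-learning on a two-armed bandit with asymmetric learning rates.

    rewards: (n_trials, 2) array of payoffs (+1 reward, -1 punishment).
    Returns the chosen arms and the trial-by-trial reward prediction
    errors, the quantity regressed against striatal activity.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros(2)
    choices, rpes = [], []
    for r in rewards:
        p = np.exp(beta * Q) / np.exp(beta * Q).sum()  # softmax over values
        a = rng.choice(2, p=p)
        delta = r[a] - Q[a]                            # reward prediction error
        alpha = alpha_pos if delta > 0 else alpha_neg  # asymmetric update
        Q[a] += alpha * delta
        choices.append(a)
        rpes.append(delta)
    return np.array(choices), np.array(rpes)

outcomes = np.random.default_rng(1).choice([-1, 1], size=(200, 2))
choices, rpes = simulate_choices(outcomes)
print(f"chose arm 0 on {(choices == 0).mean():.0%} of trials; mean RPE {rpes.mean():+.2f}")
```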


2018, Vol. 115(52), pp. E12398-E12406
Author(s): Craig A. Taswell, Vincent D. Costa, Elisabeth A. Murray, Bruno B. Averbeck

Adaptive behavior requires animals to learn from experience. Ideally, learning should both promote choices that lead to rewards and reduce choices that lead to losses. Because the ventral striatum (VS) contains neurons that respond to aversive stimuli, and aversive stimuli can drive dopamine release in the VS, it is possible that the VS contributes to learning about aversive outcomes, including losses. However, other work suggests that the VS may play a specific role in learning to choose among rewards, with other systems mediating learning from aversive outcomes. To examine the role of the VS in learning from gains and losses, we compared the performance of macaque monkeys with VS lesions and unoperated controls on a reinforcement learning task. In the task, the monkeys gained or lost tokens, which were periodically cashed out for juice, as outcomes for their choices. Over trials, they learned to choose cues associated with gains and to avoid cues associated with losses. We found that monkeys with VS lesions had a deficit in learning to choose between cues that differed in reward magnitude. By contrast, monkeys with VS lesions performed as well as controls when choices involved a potential loss. We also fit reinforcement learning models to the behavior and compared learning rates between groups. Relative to controls, the monkeys with VS lesions had reduced learning rates for gain cues. Therefore, in this task, the VS plays a specific role in learning to choose between rewarding options.
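Comparing learning rates between groups presupposes fitting the model to each animal's choices. Below is a hedged sketch of maximum-likelihood estimation of separate gain and loss learning rates; the data, parameter bounds, and starting values are hypothetical, and the actual study's parameterization may differ.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, choices, outcomes):
    """-log p(choices | Q-learning with separate gain/loss learning rates)."""
    alpha_gain, alpha_loss, beta = params
    Q = np.zeros(2)
    nll = 0.0
    for a, r in zip(choices, outcomes):
        p = np.exp(beta * Q) / np.exp(beta * Q).sum()  # softmax choice rule
        nll -= np.log(p[a] + 1e-12)
        delta = r - Q[a]
        alpha = alpha_gain if r > 0 else alpha_loss    # token gain vs. loss
        Q[a] += alpha * delta
    return nll

# Hypothetical data: chosen cue (0/1) and token outcome (+1/-1) per trial.
rng = np.random.default_rng(0)
choices = rng.integers(0, 2, 300)
outcomes = rng.choice([1, -1], 300)

fit = minimize(neg_log_likelihood, x0=[0.3, 0.3, 2.0],
               args=(choices, outcomes),
               bounds=[(1e-3, 1.0), (1e-3, 1.0), (1e-2, 10.0)])
alpha_gain, alpha_loss, _ = fit.x
print(f"alpha_gain = {alpha_gain:.2f}, alpha_loss = {alpha_loss:.2f}")
```

A group comparison then tests the fitted alpha_gain and alpha_loss across lesioned and control animals.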


Author(s): Valery N. Mukhin, Ivan R. Borovets, Vadim V. Sizov, Konstantin I. Pavlov, Victor M. Klimenko

Author(s): Qi Wang, Jianmin Liu, Katia Jaffres-Runser, Yongqing Wang, Chentao He, et al.

2013, Vol. 125(3), pp. 373-385
Author(s): Alicia J. Avelar, Steven A. Juliano, Paul A. Garris

2010, Vol. 122(4), pp. 326-333
Author(s): J. Linnet, E. Peterson, D. J. Doudet, A. Gjedde, A. Møller

2016, Vol. 115(6), pp. 3195-3203
Author(s): Simon Dunne, Arun D'Souza, John P. O'Doherty

A major open question is whether computational strategies thought to be used during experiential learning, specifically model-based and model-free reinforcement learning, also support observational learning. Furthermore, the question of how observational learning occurs when observers must learn about the value of options from observing outcomes in the absence of choice has not been addressed. In the present study we used a multi-armed bandit task that encouraged human participants to employ both experiential and observational learning while they underwent functional magnetic resonance imaging (fMRI). We found evidence for the presence of model-based learning signals during both observational and experiential learning in the intraparietal sulcus. However, unlike during experiential learning, model-free learning signals in the ventral striatum were not detectable during this form of observational learning. These results provide insight into the flexibility of the model-based learning system, implicating this system in learning during observation as well as from direct experience, and further suggest that the model-free reinforcement learning system may be less flexible with regard to its involvement in observational learning.
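To make the model-based/model-free distinction concrete, here is an illustrative contrast between the two update rules applied to observed (rather than experienced) outcomes: a model-free temporal-difference update that caches values directly, versus a model-based learner that updates an outcome model and derives values from it. The task parameters below are assumptions for the sketch, not the study's task.

```python
import numpy as np

def model_free_update(Q, a, r, alpha=0.2):
    """Model-free TD update: cache the value of the observed action directly."""
    Q[a] += alpha * (r - Q[a])
    return Q

def model_based_values(counts):
    """Model-based values: derive expected reward from a learned outcome model."""
    return counts[:, 1] / counts.sum(axis=1)

rng = np.random.default_rng(0)
true_p = np.array([0.8, 0.3])   # hypothetical arm reward probabilities
Q = np.zeros(2)                 # model-free cached values
counts = np.ones((2, 2))        # outcome model: (no-reward, reward) tallies

# Observational trials: the observer sees another agent's choice and outcome.
for _ in range(200):
    a = rng.integers(2)                  # demonstrator's choice
    r = float(rng.random() < true_p[a])  # observed outcome
    Q = model_free_update(Q, a, r)
    counts[a, int(r)] += 1

print("model-free Q:      ", np.round(Q, 2))
print("model-based values:", np.round(model_based_values(counts), 2))
```

Both rules can in principle learn from observation; the study's question is which of the corresponding signals actually appear in the brain during observational learning.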


2018
Author(s): Xian Zhang, Bo Li

Abstract: The basolateral amygdala (BLA) plays an important role in associative learning, both by representing conditioned stimuli (CSs) and unconditioned stimuli (USs) of positive and negative valence and by forming associations between CSs and USs. However, how such associations are formed and updated during learning remains unclear. Here we show that associative learning driven by reward and punishment profoundly alters BLA neuronal responses at the population level, reducing noise correlations and transforming the representations of CSs to resemble the distinctive valence-specific representations of USs. This transformation is accompanied by the emergence of prevalent inhibitory CS and US responses and by plasticity of CS responses in individual BLA neurons. During reversal learning, in which the expected valences are reversed, BLA population CS representations are remapped onto ensembles representing the opposite valences and track the switch in valence-specific behavioral actions. Our results reveal how signals predictive of opposing valences in the BLA evolve during reward and punishment learning, and how these signals might be updated and used to guide flexible behavior.
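Two of the population-level measures named here, noise correlations and the similarity of CS to US representations, can be illustrated with a short simulation. The pre/post response patterns and noise structure below are assumptions chosen only to mimic the described effect (lower noise correlations, CS representations moving toward the US pattern), not the paper's data or analysis code.

```python
import numpy as np

def noise_correlation(trials):
    """Mean pairwise correlation of trial-to-trial fluctuations around the
    mean evoked response (rows = trials, columns = neurons)."""
    residuals = trials - trials.mean(axis=0)
    corr = np.corrcoef(residuals.T)
    return corr[np.triu_indices_from(corr, k=1)].mean()

def cosine_similarity(u, v):
    """Alignment of two mean population response vectors (e.g., CS vs. US)."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

rng = np.random.default_rng(0)
n_trials, n_neurons = 100, 40
us_pattern = rng.normal(0, 1, n_neurons)  # valence-specific US representation
cs_pattern = rng.normal(0, 1, n_neurons)  # initial, unrelated CS representation
shared = rng.normal(0, 1, (n_trials, 1))  # shared noise source across neurons

def indep_noise():
    return rng.normal(0, 0.5, (n_trials, n_neurons))

cs_pre = cs_pattern + shared + indep_noise()         # before learning
cs_post = us_pattern + 0.2 * shared + indep_noise()  # after learning (assumed)

for label, trials in (("pre ", cs_pre), ("post", cs_post)):
    print(label,
          f"noise corr = {noise_correlation(trials):+.2f}",
          f"CS-US similarity = {cosine_similarity(trials.mean(0), us_pattern):+.2f}")
```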


2020
Author(s): Hiroshi Yamada, Yuri Imaizumi, Masayuki Matsumoto

Abstract: Computation of expected value, i.e., probability times magnitude, appears to be a dynamic integrative process performed by the brain for efficient economic behavior. However, the neural dynamics underlying this computation remain largely unknown. We examined (1) whether four core reward-related regions detect and integrate the probability and magnitude cued by numerical symbols and (2) whether these regions show distinct dynamics in the integrative process. Extraction of the mechanistic structure of the neural population signal demonstrated that expected-value signals arose simultaneously in the central part of the orbitofrontal cortex (cOFC, area 13m) and the ventral striatum (VS). These expected-value signals were remarkably stable, in contrast to the weak and/or fluctuating signals in the dorsal striatum and medial OFC. Notably, the temporal dynamics of these stable expected-value signals were clearly distinct: sharp and gradual signal evolution in the cOFC and VS, respectively. These distinct dynamics suggest that the cOFC and VS compute expected values with unique time constants, as distinct, partially overlapping processes.
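The computation itself is simple, expected value = probability times magnitude; the analytic question is whether a region's signal tracks the product rather than either attribute alone. A toy sketch with simulated data: regress a synthetic firing rate on probability, magnitude, and their product, so that an expected-value signal shows up as weight on the interaction term. All values here are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cue attributes, as if cued by numerical symbols on each trial.
p = rng.choice([0.1, 0.3, 0.5, 0.7, 0.9], size=500)  # reward probability
m = rng.choice([0.1, 0.3, 0.5, 0.7, 0.9], size=500)  # reward magnitude
ev = p * m                                           # expected value

# Synthetic firing rate of a neuron that integrates the two attributes.
rate = 2.0 + 5.0 * ev + rng.normal(0.0, 0.5, 500)

# Regress rate on probability, magnitude, and their product; integration
# appears as weight on the interaction term rather than on p or m alone.
X = np.column_stack([np.ones_like(p), p, m, p * m])
coef, *_ = np.linalg.lstsq(X, rate, rcond=None)
print(np.round(coef, 2))  # approximately [2, 0, 0, 5]
```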

