Computational Characteristics of the Striatal Dopamine System Described by Reinforcement Learning With Fast Generalization

2020 ◽  
Vol 14 ◽  
Author(s):  
Yoshihisa Fujita ◽  
Sho Yagishita ◽  
Haruo Kasai ◽  
Shin Ishii

Abstract Generalization enables applying past experience to similar but nonidentical situations, and may therefore be essential for adaptive behaviors. Recent neurobiological observations indicate that the striatal dopamine system achieves generalization and subsequent discrimination by updating corticostriatal synaptic connections in differential response to reward and punishment. To analyze how the computational characteristics of this system affect behaviors, we proposed a novel reinforcement learning model with multilayer neural networks in which only the synaptic weights of the last layer are updated according to the prediction error. We set fixed connections between the input and hidden layers so as to maintain the similarity of inputs in the hidden-layer representation. This network enabled fast generalization, and thereby facilitated safe and efficient exploration in reinforcement learning tasks, compared to algorithms that do not show generalization. However, disturbance in the network induced aberrant valuation. In conclusion, the unique computation suggested by corticostriatal plasticity has the advantage of providing safe and quick adaptation to unknown environments, but it also carries a potential defect that can induce maladaptive behaviors like the delusional symptoms of psychiatric disorders.

Author summary The brain has an ability to generalize knowledge obtained from reward- and punishment-related learning. Animals that have been trained to associate a stimulus with subsequent reward or punishment respond not only to the same stimulus but also to similar stimuli. How does generalization affect behaviors in situations where individuals are required to adapt to unknown environments? It may enable efficient learning and promote adaptive behaviors, but inappropriate generalization may disrupt behaviors by associating reward or punishment with irrelevant stimuli. The effect of generalization here should depend on the computational characteristics of its underlying biological basis in the brain, namely, the striatal dopamine system. In this research, we built a novel computational model based on the characteristics of the striatal dopamine system. Our model enabled fast generalization and showed its advantage of providing safe and quick adaptation to unknown environments. By contrast, disturbance of our model induced abnormal behaviors. The results suggest both the advantage and the shortcoming of generalization by the striatal dopamine system.
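
The core idea in the abstract (a fixed input-to-hidden mapping, with only the last layer updated by the prediction error) can be illustrated with a minimal sketch. This is not the authors' implementation: the random Gaussian projection, the tanh hidden units, the learning rate, and all function names here are assumptions chosen for illustration, and the differential handling of reward and punishment is not modeled.

```python
import math
import random

def make_fixed_projection(n_in, n_hidden, seed=0):
    """Input-to-hidden weights that are never trained.
    A random Gaussian projection is an assumption of this sketch."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_hidden)]

def hidden_activity(W, x):
    """Similar inputs yield similar hidden vectors because W is fixed and smooth."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def predict(v, h):
    """Predicted value: linear readout of the hidden layer."""
    return sum(vi * hi for vi, hi in zip(v, h))

def update_last_layer(v, h, outcome, lr=0.02):
    """Update only the readout weights in proportion to the prediction error."""
    delta = outcome - predict(v, h)
    for i, hi in enumerate(h):
        v[i] += lr * delta * hi
    return delta

W = make_fixed_projection(n_in=4, n_hidden=50)
v = [0.0] * 50  # last-layer weights, the only plastic part

trained = [1.0, 0.0, 0.0, 0.0]    # stimulus paired with reward
similar = [0.9, 0.1, 0.0, 0.0]    # never seen, but close to the trained one
unrelated = [0.0, 0.0, 0.0, 1.0]  # never seen, orthogonal to the trained one

h_trained = hidden_activity(W, trained)
for _ in range(200):
    update_last_layer(v, h_trained, outcome=1.0)

value_trained = predict(v, h_trained)
value_similar = predict(v, hidden_activity(W, similar))
value_unrelated = predict(v, hidden_activity(W, unrelated))
print(value_trained, value_similar, value_unrelated)
```

In this sketch the value learned for the trained stimulus carries over to the similar stimulus without any further training, while the unrelated stimulus receives only a small spurious value. That residual value hints at how a disturbed hidden representation could attach value to irrelevant inputs.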


2018 ◽  
Vol 33 (4) ◽  
pp. 652-654 ◽  
Author(s):  
Gian Pal ◽  
Bichun Ouyang ◽  
Leo Verhagen ◽  
Geidy Serrano ◽  
Holly A. Shill ◽  
...  

2020 ◽  
Vol 30 (6) ◽  
pp. 3573-3589 ◽  
Author(s):  
Rick A Adams ◽  
Michael Moutoussis ◽  
Matthew M Nour ◽  
Tarik Dahoun ◽  
Declan Lewis ◽  
...  

Abstract Choosing actions that result in advantageous outcomes is a fundamental function of nervous systems. All computational decision-making models contain a mechanism that controls the variability of (or confidence in) action selection, but its neural implementation is unclear—especially in humans. We investigated this mechanism using two influential decision-making frameworks: active inference (AI) and reinforcement learning (RL). In AI, the precision (inverse variance) of beliefs about policies controls action selection variability—similar to decision ‘noise’ parameters in RL—and is thought to be encoded by striatal dopamine signaling. We tested this hypothesis by administering a ‘go/no-go’ task to 75 healthy participants, and measuring striatal dopamine 2/3 receptor (D2/3R) availability in a subset (n = 25) using [11C]-(+)-PHNO positron emission tomography. In behavioral model comparison, RL performed best across the whole group but AI performed best in participants performing above chance levels. Limbic striatal D2/3R availability had linear relationships with AI policy precision (P = 0.029) as well as with RL irreducible decision ‘noise’ (P = 0.020), and this relationship with D2/3R availability was confirmed with a ‘decision stochasticity’ factor that aggregated across both models (P = 0.0006). These findings are consistent with occupancy of inhibitory striatal D2/3Rs decreasing the variability of action selection in humans.
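
The role of a decision "noise" parameter in RL can be illustrated with a minimal sketch. It assumes a softmax choice rule with an inverse temperature `beta` plus a lapse term standing in for irreducible noise; this is a common parametrization and not necessarily the one fitted in the study, and the function and parameter names are hypothetical.

```python
import math

def action_probabilities(q_values, beta, lapse=0.0):
    """Softmax over action values, mixed with uniform random choice.
    beta  : inverse temperature (higher means less variable choices)
    lapse : probability mass assigned to value-independent random choice
    """
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    n = len(q_values)
    return [(1.0 - lapse) * e / total + lapse / n for e in exps]

q = [1.0, 0.0]  # illustrative action values, e.g. 'go' and 'no-go'
low_precision = action_probabilities(q, beta=0.5)
high_precision = action_probabilities(q, beta=5.0)
pure_noise = action_probabilities(q, beta=5.0, lapse=1.0)
print(low_precision, high_precision, pure_noise)
```

Raising `beta` concentrates choice on the higher-valued action, whereas the lapse term caps how deterministic choices can become regardless of the learned values.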


2008 ◽  
Vol 22 (10) ◽  
pp. 3736-3746 ◽  
Author(s):  
Irene Brunk ◽  
Christian Blex ◽  
Carles Sanchis‐Segura ◽  
Jan Sternberg ◽  
Stephanie Perreau‐Lenz ◽  
...  
