Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model

2005 ◽  
Vol 18 (9) ◽  
pp. 1163-1171 ◽  
Author(s):  
Adam Johnson ◽  
A. David Redish

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 121922-121930 ◽  
Author(s):  
Xiali Li ◽  
Zhengyu Lv ◽  
Song Wang ◽  
Zhi Wei ◽  
Licheng Wu




2020 ◽  
Author(s):  
Ben Lonnqvist ◽  
Micha Elsner ◽  
Amelia R. Hunt ◽  
Alasdair D F Clarke

Experiments on the efficiency of human search sometimes reveal large differences between individual participants. We argue that reward-driven task-specific learning may account for some of this variation. In a computational reinforcement learning model of this process, a wide variety of strategies emerge, despite all simulated participants having the same visual acuity. We conduct a visual search experiment, and replicate previous findings that participant preferences about where to search are highly varied, with a distribution comparable to the simulated results. Thus, task-specific learning is an under-explored mechanism by which large inter-participant differences can arise.



2018 ◽  
Author(s):  
Minryung R. Song ◽  
Sang Wan Lee

Dopamine activity may transition between two patterns: phasic responses to reward-predicting cues and ramping activity that arises as an agent approaches the reward. However, when and why dopamine activity transitions between these modes is not understood. We hypothesize that the transition between ramping and phasic patterns reflects resource allocation that addresses the task dimensionality problem during reinforcement learning (RL). By parsimoniously modifying a standard temporal difference (TD) learning model to accommodate a mixed presentation of both experimental and environmental stimuli, we simulated dopamine transitions and compared them with experimental data from four different studies. The results suggest that dopamine transitions from ramping to phasic patterns as the agent narrows down candidate stimuli for the task; the opposite occurs when the agent needs to re-learn candidate stimuli due to a value change. These results lend insight into how dopamine deals with the tradeoff between cognitive resources and task dimensionality during RL.
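For readers unfamiliar with the standard TD model that the abstract above takes as its starting point, the following is a minimal sketch of tabular TD(0) value learning on a short chain of states ending in a reward. It is not the authors' modified model: the function name `td0_chain`, the chain layout, and all parameter values are assumptions chosen only for illustration. The quantity `delta` is the TD error, the term usually compared with phasic dopamine responses in this literature.

```python
import numpy as np


def td0_chain(n_states=5, reward=1.0, alpha=0.1, gamma=0.9, episodes=500):
    """Tabular TD(0) on a deterministic chain (hypothetical toy task).

    The agent moves 0 -> 1 -> ... -> n_states-1 -> terminal and receives
    `reward` on the final transition only.
    """
    # One extra entry for the terminal state, whose value stays at 0.
    values = np.zeros(n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            r = reward if s == n_states - 1 else 0.0
            # TD error: reward plus discounted next-state value, minus current estimate.
            delta = r + gamma * values[s + 1] - values[s]
            values[s] += alpha * delta
    return values[:n_states]


if __name__ == "__main__":
    print(td0_chain())
```

Under these assumptions the learned values approach `gamma ** k`, where `k` is the number of steps remaining before the rewarded transition, so states closer to the reward carry higher value.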




