Parallel Computing of Spatio-Temporal Model Based on Deep Reinforcement Learning

2021, pp. 391-403
Author(s): Zhiqiang Lv, Jianbo Li, Zhihao Xu, Yue Wang, Haoran Li

2021
Author(s): Nora Harhen, Catherine A. Hartley, Aaron Bornstein

Foraging has been suggested to provide a more ecologically valid context for studying decision-making. However, the environments used in human foraging tasks fail to capture the structure of real world environments which contain multiple levels of spatio-temporal regularities. We ask if foragers detect these statistical regularities and use them to construct a model of the environment that guides their patch-leaving decisions. We propose a model of how foragers might accomplish this, and test its predictions in a foraging task with a structured environment that includes patches of varying quality and predictable transitions. Here, we show that human foraging decisions reflect ongoing, statistically-optimal structure learning. Participants modulated decisions based on the current and potential future context. From model fits to behavior, we can identify an individual’s structure learning ability and relate it to decision strategy. These findings demonstrate the utility of leveraging model-based reinforcement learning to understand foraging behavior.


2019
Author(s): Leor M Hackel, Jeffrey Jordan Berg, Björn Lindström, David Amodio

Do habits play a role in our social impressions? To investigate the contribution of habits to the formation of social attitudes, we examined the roles of model-free and model-based reinforcement learning in social interactions—computations linked in past work to habit and planning, respectively. Participants in this study learned about novel individuals in a sequential reinforcement learning paradigm, choosing financial advisors who led them to high- or low-paying stocks. Results indicated that participants relied on both model-based and model-free learning, such that each independently predicted choice during the learning task and self-reported liking in a post-task assessment. Specifically, participants liked advisors who could provide large future rewards as well as advisors who had provided them with large rewards in the past. Moreover, participants varied in their use of model-based and model-free learning strategies, and this individual difference influenced the way in which learning related to self-reported attitudes: among participants who relied more on model-free learning, model-free social learning related more to post-task attitudes. We discuss implications for attitudes, trait impressions, and social behavior, as well as the role of habits in a memory systems model of social cognition.

