Humans are primarily model-based learners in the two-stage task

2019
Author(s): Carolina Feher da Silva, Todd A. Hare

Abstract Distinct model-free and model-based learning processes are thought to drive both typical and dysfunctional behaviours. Data from two-stage decision tasks have seemingly shown that human behaviour is driven by both processes operating in parallel. However, in this study, we show that more detailed task instructions lead participants to make primarily model-based choices that have little, if any, simple model-free influence. We also demonstrate that behaviour in the two-stage task may falsely appear to be driven by a combination of simple model-free and model-based learning if purely model-based agents form inaccurate models of the task because of misconceptions. Furthermore, we report evidence that many participants do misconceive the task in important ways. Overall, we argue that humans formulate a wide variety of learning models. Consequently, the simple dichotomy of model-free versus model-based learning is inadequate to explain behaviour in the two-stage task and connections between reward learning, habit formation, and compulsivity.

Author(s): Xiaomei Wang, Kit-Hang Lee, Denny K. C. Fu, Ziyang Dong, Kui Wang, ...

2020, Vol 10 (1)
Author(s): Alexander Steinke, Florian Lange, Bruno Kopp

Abstract The Wisconsin Card Sorting Test (WCST) is considered a gold standard for the assessment of cognitive flexibility. On the WCST, repeating a sorting category following negative feedback is typically treated as indicating reduced cognitive flexibility. Therefore, such responses are referred to as ‘perseveration’ errors. Recent research suggests that the propensity for perseveration errors is modulated by response demands: they occur less frequently when their commitment repeats the previously executed response. Here, we propose parallel reinforcement-learning models of card sorting performance, which assume that card sorting performance can be conceptualized as resulting from model-free reinforcement learning at the level of responses that occurs in parallel with model-based reinforcement learning at the categorical level. We compared parallel reinforcement-learning models with purely model-based reinforcement learning, and with the state-of-the-art attentional-updating model. We analyzed data from 375 participants who completed a computerized WCST. Parallel reinforcement-learning models showed the best predictive accuracies for the majority of participants. Only parallel reinforcement-learning models accounted for the modulation of perseveration propensity by response demands. In conclusion, parallel reinforcement-learning models provide a new theoretical perspective on card sorting and offer a suitable framework for discerning individual differences in latent processes that subserve behavioral flexibility.


Author(s): A. Ross Otto, Candace M. Raio, Elizabeth A. Phelps, Nathaniel Daw
