The comparable strategic flexibility of model-free and model-based learning

Author(s):  
Alexandre L. S. Filipowicz ◽  
Jonathan Levine ◽  
Eugenio Piasini ◽  
Gaia Tavoni ◽  
Joseph W. Kable ◽  
...  

Abstract
Different learning strategies are thought to fall along a continuum that ranges from simple, inflexible, and fast “model-free” strategies to more complex, flexible, and deliberative “model-based” strategies. Here we show that, contrary to this proposal, strategies at both ends of this continuum can be equally flexible, effective, and time-intensive. We analyzed the behavior of adult human subjects performing a canonical learning task used to distinguish between model-free and model-based strategies. Subjects using either strategy showed similarly high information complexity, a measure of strategic flexibility, and comparable accuracy and response times. This similarity was apparent despite the generally higher computational complexity of model-based algorithms and fundamental differences in how each strategy learned: model-free learning was driven primarily by observed past responses, whereas model-based learning was driven primarily by inferences about latent task features. Thus, model-free and model-based learning differ in the information they use to learn but can support comparably flexible behavior.

Statement of Relevance
The distinction between model-free and model-based learning is an influential framework that has been used extensively to understand individual- and task-dependent differences in learning by both healthy and clinical populations. A common interpretation of this distinction is that model-based strategies are more complex and therefore more flexible than model-free strategies. However, this interpretation conflates computational complexity, which relates to processing resources and is generally higher for model-based algorithms, with information complexity, which reflects flexibility but has rarely been measured. Here we use a metric of information complexity to demonstrate that, contrary to this interpretation, model-free and model-based strategies can be equally flexible, effective, and time-intensive, and are better distinguished by the nature of the information from which they learn. Our results counter common interpretations of model-free versus model-based learning and demonstrate the general usefulness of information complexity for assessing different forms of strategic flexibility.
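Information complexity in this sense can be operationalized as the mutual information between a subject's past observations and their current responses. As a rough illustration of the idea (a generic plug-in estimator, not the authors' actual analysis pipeline):

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) samples.

    If x encodes recent task history and y the subject's response, higher
    values indicate a more flexible (history-dependent) strategy. This is a
    sketch of the concept, not the paper's exact complexity metric.
    """
    n = len(pairs)
    pxy = Counter(pairs)                  # joint counts of (x, y)
    px = Counter(x for x, _ in pairs)     # marginal counts of x
    py = Counter(y for _, y in pairs)     # marginal counts of y
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), in count form
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi
```

A strategy whose responses perfectly track a binary history feature yields 1 bit; a history-independent strategy yields 0 bits.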

2019 ◽  
Author(s):  
Leor M Hackel ◽  
Jeffrey Jordan Berg ◽  
Björn Lindström ◽  
David Amodio

Do habits play a role in our social impressions? To investigate the contribution of habits to the formation of social attitudes, we examined the roles of model-free and model-based reinforcement learning in social interactions—computations linked in past work to habit and planning, respectively. Participants in this study learned about novel individuals in a sequential reinforcement learning paradigm, choosing financial advisors who led them to high- or low-paying stocks. Results indicated that participants relied on both model-based and model-free learning, such that each independently predicted choice during the learning task and self-reported liking in a post-task assessment. Specifically, participants liked advisors who could provide large future rewards as well as advisors who had provided them with large rewards in the past. Moreover, participants varied in their use of model-based and model-free learning strategies, and this individual difference influenced the way in which learning related to self-reported attitudes: among participants who relied more on model-free learning, model-free social learning related more to post-task attitudes. We discuss implications for attitudes, trait impressions, and social behavior, as well as the role of habits in a memory systems model of social cognition.


Mechatronics ◽  
2014 ◽  
Vol 24 (8) ◽  
pp. 1008-1020 ◽  
Author(s):  
Abhishek Dutta ◽  
Yu Zhong ◽  
Bruno Depraetere ◽  
Kevin Van Vaerenbergh ◽  
Clara Ionescu ◽  
...  

2019 ◽  
Author(s):  
Sara Ershadmanesh ◽  
Mostafa Miandari ◽  
Abdol-hossein Vahabie ◽  
Majid Nili Ahmadabadi

Abstract
Many studies on humans and animals have provided evidence for the contribution of goal-directed and habitual valuation systems to learning and decision-making. These two systems can be modeled using model-based (MB) and model-free (MF) algorithms in the Reinforcement Learning (RL) framework. Here, we study the link between the contribution of these two learning systems to behavior and meta-cognitive capabilities. Using computational modeling, we showed that in a highly variable environment, where both learning strategies perform at chance level, model-free learning predicts higher confidence in decisions than the model-based strategy. Our experimental results showed that the subjects’ meta-cognitive ability is negatively correlated with the contribution of the model-free system to their behavior, while having no correlation with the contribution of the model-based system. The over-confidence of the model-free system explains this counter-intuitive result. This offers a new explanation for individual differences in learning style.
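A common way to operationalize the decision confidence discussed here is the softmax probability assigned to the chosen (highest-valued) option; under this reading, a system whose value estimates are spuriously separated (as MF estimates can be in a volatile environment) reports higher confidence. A minimal sketch, assuming this softmax definition and an illustrative inverse temperature beta (neither is specified by the abstract):

```python
import math

def choice_confidence(q_values, beta=3.0):
    """Confidence as the softmax probability of the best option.

    Generic sketch: beta is an assumed inverse-temperature parameter, and
    this is one common confidence readout, not the paper's exact model.
    """
    m = max(q_values)
    exps = [math.exp(beta * (q - m)) for q in q_values]  # shifted for stability
    return max(exps) / sum(exps)
```

With well-separated values the readout is near 1; with identical values it falls to chance (0.5 for two options).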


2018 ◽  
Author(s):  
S Ritter ◽  
JX Wang ◽  
Z Kurth-Nelson ◽  
M Botvinick

Abstract
Recent research has placed episodic reinforcement learning (RL) alongside model-free and model-based RL on the list of processes centrally involved in human reward-based learning. In the present work, we extend the unified account of model-free and model-based RL developed by Wang et al. (2018) to further integrate episodic learning. In this account, a generic model-free “meta-learner” learns to deploy and coordinate among all of these learning algorithms. The meta-learner learns through brief encounters with many novel tasks, so that it learns to learn about new tasks. We show that when equipped with an episodic memory system inspired by theories of reinstatement and gating, the meta-learner learns to use the episodic and model-based learning algorithms observed in humans in a task designed to dissociate among the influences of various learning strategies. We discuss implications and predictions of the model.


2020 ◽  
Vol 9 (5) ◽  
pp. 1453 ◽  
Author(s):  
Julia Berghäuser ◽  
Wiebke Bensmann ◽  
Nicolas Zink ◽  
Tanja Endrass ◽  
Christian Beste ◽  
...  

Frequent alcohol binges shift behavior from goal-directed to habitual processing modes. This shift in reward-associated learning strategies plays a key role in the development and maintenance of alcohol use disorders and seems to persist during (early stages of) sobriety in at-risk drinkers. Yet it has remained unclear whether this phenomenon might be associated with alcohol hangover and thus also be found in social drinkers. In an experimental crossover design, n = 25 healthy young male participants performed a two-step decision-making task once sober and once hungover (i.e., when reaching sobriety after consuming 2.6 g of alcohol per estimated liter of total body water). This task allows the separation of effortful model-based and computationally less demanding model-free learning strategies. The experimental induction of alcohol hangover was successful, but we found no significant hangover effects on model-based and model-free learning scores, the balance between model-free and model-based valuation (ω), or perseveration tendencies (π). Bayesian analyses provided positive evidence for the null hypothesis for all measures except π (anecdotal evidence for the null hypothesis). Taken together, alcohol hangover, which results from a single binge drinking episode, does not impair the application of effortful and computationally costly model-based learning strategies or increase reliance on model-free learning strategies. This supports the notion that the behavioral deficits observed in at-risk drinkers are most likely not caused by the immediate aftereffects of individual binge drinking events.
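The ω and π parameters mentioned here come from the standard hybrid model fit to two-step-task data: first-stage values are an ω-weighted mix of model-based and model-free estimates, with a perseveration bonus π for repeating the previous choice. A minimal sketch of that valuation step (parameter names follow the abstract; everything else is a generic illustration):

```python
import math

def hybrid_stage1_probs(q_mf, q_mb, omega, pi, prev_choice, beta=1.0):
    """Softmax choice probabilities over two first-stage actions.

    q_mf, q_mb : model-free and model-based value estimates per action
    omega      : weight on model-based values (0 = pure MF, 1 = pure MB)
    pi         : perseveration bonus added to the previously chosen action
    beta       : assumed inverse temperature (not specified in the abstract)
    """
    logits = []
    for a in range(2):
        q = omega * q_mb[a] + (1 - omega) * q_mf[a]
        if prev_choice is not None and a == prev_choice:
            q += pi  # stickiness toward repeating the last choice
        logits.append(beta * q)
    m = max(logits)                           # shift for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]
```

With ω = 1 the model-free values are ignored entirely; with π > 0 the previously chosen action gains probability even when values are tied.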


2015 ◽  
Author(s):  
Thomas Akam ◽  
Rui Costa ◽  
Peter Dayan

The recently developed ‘two-step’ behavioural task promises to differentiate model-based or goal-directed from model-free or habitual reinforcement learning, while generating neurophysiologically-friendly decision datasets with parametric variation of decision variables. These desirable features have prompted widespread adoption of the task. However, the signatures of model-based control can be elusive – here, we investigate model-free learning methods that, depending on the analysis strategy, can masquerade as being model-based. We first show that unadorned model-free reinforcement learning can induce correlations between action values at the start of the trial and the subsequent trial events in such a way that analysis based on comparing successive trials can lead to erroneous conclusions. We also suggest a correction to the analysis that can alleviate this problem. We then consider model-free reinforcement learning strategies based on different state representations from those envisioned by the experimenter, which generate behaviour that appears model-based under these, and also more sophisticated, analyses. The existence of such strategies is of particular relevance to the design and interpretation of animal studies using the two-step task, as extended training and a sharp contrast between good and bad options are likely to promote their use.
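The carry-over of first-stage action values that the authors describe can be seen in a minimal model-free TD learner on a simplified two-step task. The transition and reward probabilities below are illustrative assumptions, not the paper's simulation settings:

```python
import random

def run_two_step_mf(n_trials=1000, alpha=0.5, p_common=0.7, eps=0.1, seed=0):
    """Minimal model-free TD learner on a simplified two-step task (sketch).

    Stage 1: choose action a in {0, 1}; with probability p_common the
    'common' transition leads to second-stage state a, otherwise to the
    other state. Stage 2: state s pays reward 1 with fixed probability
    reward_prob[s] (illustrative values).
    """
    rng = random.Random(seed)
    q1 = [0.0, 0.0]            # first-stage action values
    q2 = [0.0, 0.0]            # second-stage state values
    reward_prob = [0.8, 0.2]
    choices = []
    for _ in range(n_trials):
        # epsilon-greedy first-stage choice
        a = rng.randrange(2) if rng.random() < eps else (0 if q1[0] >= q1[1] else 1)
        s2 = a if rng.random() < p_common else 1 - a
        r = 1.0 if rng.random() < reward_prob[s2] else 0.0
        # TD updates: second stage toward reward, first stage toward q2[s2];
        # q1 thus carries value across trials, inducing the correlations
        # between start-of-trial values and subsequent events noted above
        q2[s2] += alpha * (r - q2[s2])
        q1[a] += alpha * (q2[s2] - q1[a])
        choices.append(a)
    return q1, q2, choices
```

Because the first-stage update uses the second-stage value rather than a transition model, this agent is purely model-free, yet its trial-by-trial behavior can resemble model-based patterns under stay/switch analyses.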


2020 ◽  
Vol 43 ◽  
Author(s):  
Peter Dayan

Abstract Bayesian decision theory provides a simple formal elucidation of some of the ways that representation and representational abstraction are involved with, and exploit, both prediction and its rather distant cousin, predictive coding. Both model-free and model-based methods are involved.

