Executive Control and Model-Based Decision Making

Author(s):  
Andreas Heinz

Within instrumental behavior, more complex goal-directed decision making can be distinguished from habitual responding. This is illustrated by comparing model-based vs. model-free decision making and by explaining its relevance in addictive disorders. Model-based decision making aims at constructing a map of the world, while habitual decisions prune a “decision tree” and thus facilitate rather automatic responding.

2015 ◽  
Vol 22 (2) ◽  
pp. 188-198 ◽  
Author(s):  
Patricia Gruner ◽  
Alan Anticevic ◽  
Daeyeol Lee ◽  
Christopher Pittenger

Decision making in a complex world, characterized both by predictable regularities and by frequent departures from the norm, requires dynamic switching between rapid habit-like, automatic processes and slower, more flexible evaluative processes. These strategies, formalized as “model-free” and “model-based” reinforcement learning algorithms, respectively, can lead to divergent behavioral outcomes, requiring a mechanism to arbitrate between them in a context-appropriate manner. Recent data suggest that individuals with obsessive-compulsive disorder (OCD) rely excessively on inflexible habit-like decision making during reinforcement-driven learning. We propose that inflexible reliance on habit in OCD may reflect a functional weakness in the mechanism for context-appropriate dynamic arbitration between model-free and model-based decision making. Support for this hypothesis derives from emerging functional imaging findings. A deficit in arbitration in OCD may help reconcile evidence for excessive reliance on habit in rewarded learning tasks with an older literature suggesting inappropriate recruitment of circuitry associated with model-based decision making in unreinforced procedural learning. The hypothesized deficit and corresponding circuitry may be a particularly fruitful target for interventions, including cognitive remediation.
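
The distinction between cached model-free values, planned model-based values, and an arbitration step between them can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the two-action, two-state layout, the transition probabilities in `TRANSITIONS`, the function names, and the use of a single fixed mixing weight `w` in place of a dynamic, context-sensitive arbitrator.

```python
import math

# Hypothetical two-stage layout: TRANSITIONS[a][s] is the assumed probability
# that first-stage action a leads to second-stage state s.
TRANSITIONS = [[0.7, 0.3], [0.3, 0.7]]


def model_based_values(stage2_values, transitions=TRANSITIONS):
    """Plan one step ahead: expected second-stage value under the transition model."""
    return [sum(p * v for p, v in zip(row, stage2_values)) for row in transitions]


def model_free_update(q, action, reward, alpha=0.1):
    """Cache values from experienced reward only; no transition model is consulted."""
    q = list(q)
    q[action] += alpha * (reward - q[action])
    return q


def arbitrate(q_mb, q_mf, w):
    """Weighted mixture: w=1 is purely model-based, w=0 purely model-free."""
    return [w * mb + (1.0 - w) * mf for mb, mf in zip(q_mb, q_mf)]


def softmax(values, beta=5.0):
    """Turn values into choice probabilities (beta is an assumed inverse temperature)."""
    exps = [math.exp(beta * v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    q_mb = model_based_values([1.0, 0.0])       # state 0 currently looks rewarding
    q_mf = model_free_update([0.0, 0.0], 1, 1.0)  # action 1 was just rewarded
    for w in (0.0, 0.5, 1.0):
        print(w, softmax(arbitrate(q_mb, q_mf, w)))
```

In this sketch, a weight near zero lets the most recently rewarded action dominate even when the transition model favours the other one, which is one simple way to picture inflexible reliance on habit-like valuation.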


2019 ◽  
Author(s):  
Sara Ershadmanesh ◽  
Mostafa Miandari ◽  
Abdol-hossein Vahabie ◽  
Majid Nili Ahmadabadi

Many studies in humans and animals have provided evidence for the contribution of goal-directed and habitual valuation systems to learning and decision-making. These two systems can be modeled using model-based (MB) and model-free (MF) algorithms in the Reinforcement Learning (RL) framework. Here, we study the link between the contribution of these two learning systems to behavior and meta-cognitive capabilities. Using computational modeling, we showed that in a highly variable environment, where both learning strategies perform at chance level, model-free learning predicts higher confidence in decisions than the model-based strategy. Our experimental results showed that subjects’ meta-cognitive ability is negatively correlated with the contribution of the model-free system to their behavior, while showing no correlation with the contribution of the model-based system. Over-confidence of the model-free system explains this counter-intuitive result. This is a new explanation for individual differences in learning style.


2020 ◽  
Vol 46 (Supplement_1) ◽  
pp. S91-S92
Author(s):  
Felix Brandl ◽  
Mihai Avram ◽  
Jorge Cabello ◽  
Mona Mustafa ◽  
Claudia Leucht ◽  
...  

Background: Human decision-making ranges between the extremes of automatic and fast model-free behavior (i.e., relying only on previous outcomes) and more flexible, but computationally demanding model-based behavior (i.e., implementing cognitive models). Model-based/model-free decision-making can be investigated using sequential decision tasks and has been shown to be associated with presynaptic striatal dopamine synthesis. During phases of psychotic remission in schizophrenia, dopamine synthesis in the dorsal striatum is reduced. We hypothesized that particularly model-free decision-making is impaired in schizophrenia during psychotic remission and is associated with (i) abnormal dopamine synthesis in dorsal striatum, (ii) aberrant task-activation in dorsal striatum, and (iii) cognitive difficulties in patients (e.g., reduced speed).
Methods: 26 patients with chronic schizophrenia, currently in psychotic remission, and 22 healthy controls (matched by age and gender) were enrolled in the study. Model-based/model-free decision-making was evaluated with a two-stage Markov decision task, followed by computational modeling of subjects’ learning behavior. Presynaptic dopamine synthesis was assessed by 18F-DOPA positron emission tomography and subsequent graphical Patlak analysis. Task-activation was measured by functional magnetic resonance imaging. Cognitive impairments were quantified by Trail-Making-Test A (among others). Associations between decision-making parameters, dopamine synthesis, task-activation, and cognitive impairments were tested by correlation analyses.
Results: Patients with schizophrenia showed selectively impaired model-free decision-making. 18F-DOPA uptake (i.e., presynaptic dopamine synthesis capacity) in the dorsal striatum was decreased in patients. Impaired model-free decision-making in patients correlated with (i) decreased dopamine synthesis in dorsal striatum, (ii) abnormal task-activation in dorsal striatum, and (iii) lower speed in Trail-Making-Test A.
Discussion: Results demonstrate an association of reduced dorsal striatal dopamine synthesis and brain activity with impaired model-free decision-making in schizophrenia, which potentially contributes to cognitive difficulties.


2021 ◽  
Author(s):  
Maaike M.H. van Swieten ◽  
Rafal Bogacz ◽  
Sanjay G. Manohar

Human decisions can be reflexive or planned, being governed respectively by model-free and model-based learning systems. These two systems might differ in their responsiveness to our needs. Hunger drives us to specifically seek food rewards, but here we ask whether it might have more general effects on these two decision systems. On one hand, the model-based system is often considered flexible and context-sensitive, and might therefore be modulated by metabolic needs. On the other hand, the model-free system’s primitive reinforcement mechanisms may have closer ties to biological drives. Here, we tested participants on a well-established two-stage sequential decision-making task that dissociates the contribution of model-based and model-free control. Hunger enhanced overall performance by increasing model-free control, without affecting model-based control. These results demonstrate a generalised effect of hunger on decision-making that enhances reliance on primitive reinforcement learning, which in some situations translates into adaptive benefits.

Significance statement: The prevalence of obesity and eating disorders is steadily increasing. To counteract problems related to eating, people need to make rational decisions. However, appetite may switch us to a different decision mode, making it harder to achieve long-term goals. Here we show that planned and reinforcement-driven actions are differentially sensitive to hunger. Hunger specifically affected reinforcement-driven actions, and did not affect the planning of actions. Our data show that people behave differently when they are hungry. We also provide a computational model of how the behavioural changes might arise.


2021 ◽  
Vol 17 (6) ◽  
pp. e1009070
Author(s):  
He A. Xu ◽  
Alireza Modirshanechi ◽  
Marco P. Lehmann ◽  
Wulfram Gerstner ◽  
Michael H. Herzog

Classic reinforcement learning (RL) theories cannot explain human behavior in the absence of external reward or when the environment changes. Here, we employ a deep sequential decision-making paradigm with sparse reward and abrupt environmental changes. To explain the behavior of human participants in these environments, we show that RL theories need to include surprise and novelty, each with a distinct role. While novelty drives exploration before the first encounter of a reward, surprise increases the rate of learning of a world-model as well as of model-free action-values. Even though the world-model is available for model-based RL, we find that human decisions are dominated by model-free action choices. The world-model is only marginally used for planning, but it is important to detect surprising events. Our theory predicts human action choices with high probability and allows us to dissociate surprise, novelty, and reward in EEG signals.
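
The idea that surprise raises the learning rate of a world-model can be sketched as follows. This is not the model fitted in the study: the ratio-based surprise measure, the saturating rate function, the parameter `m`, and all function names are assumptions chosen only to make the mechanism concrete.

```python
def surprise(prob_under_belief, prob_under_prior, floor=1e-12):
    """Assumed surprise measure: how much less expected an observation is under
    the current belief than under a naive prior (ratio > 1 means surprising)."""
    return prob_under_prior / max(prob_under_belief, floor)


def adaptive_rate(s, m=0.1):
    """Assumed saturating map from surprise to a learning rate in [0, 1)."""
    return m * s / (1.0 + m * s)


def update_transition_belief(belief, observed, prior):
    """Move the believed next-state distribution toward the observed state,
    faster when the observation was surprising."""
    rate = adaptive_rate(surprise(belief[observed], prior[observed]))
    return [
        (1.0 - rate) * p + rate * (1.0 if i == observed else 0.0)
        for i, p in enumerate(belief)
    ]


if __name__ == "__main__":
    belief = [0.9, 0.1]   # world-model: state 0 almost always follows
    prior = [0.5, 0.5]    # naive prior over next states
    print(update_transition_belief(belief, 0, prior))  # expected event, small step
    print(update_transition_belief(belief, 1, prior))  # surprising event, large step
```

The same rate could in principle scale a model-free value update as well, which would mirror the abstract's statement that surprise speeds up learning of both the world-model and the model-free action-values.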


2020 ◽  
Vol 30 (15) ◽  
pp. R860-R865 ◽  
Author(s):  
Nicole Drummond ◽  
Yael Niv

Author(s):  
Roni Stern ◽  
Brendan Juba

In this paper we explore the theoretical boundaries of planning in a setting where no model of the agent's actions is given. Instead of an action model, a set of successfully executed plans is given and the task is to generate a plan that is safe, i.e., guaranteed to achieve the goal without failing. To this end, we show how to learn a conservative model of the world in which actions are guaranteed to be applicable. This conservative model is then given to an off-the-shelf classical planner, resulting in a plan that is guaranteed to achieve the goal. However, this reduction from model-free planning to model-based planning is not complete: in some cases a plan will not be found even when one exists. We analyze the relation between the number of observed plans and the likelihood that our conservative approach will indeed fail to solve a solvable problem. Our analysis shows that the number of trajectories needed scales gracefully.
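
One way to picture a conservative action model learned from successful plans is sketched below. It is a simplified illustration and not necessarily the paper's algorithm: it assumes states are sets of true facts, ignores negative preconditions, and uses hypothetical names (`learn_conservative_model`, `is_safely_applicable`) and a made-up `move` action. An action's preconditions are taken to be every fact that held each time the action was observed, so the learned model only permits the action in situations at least as constrained as those already seen to work.

```python
def learn_conservative_model(trajectories):
    """Learn a cautious action model from successfully executed plans.

    trajectories: iterable of lists of (pre_state, action, post_state) triples,
    where each state is a frozenset of facts that are true.
    """
    model = {}
    for trajectory in trajectories:
        for pre, action, post in trajectory:
            if action not in model:
                # First sighting: assume every fact in the state was required.
                model[action] = {"pre": set(pre), "add": set(), "delete": set()}
            entry = model[action]
            entry["pre"] &= pre            # keep facts true in every observed use
            entry["add"] |= post - pre     # facts the action made true
            entry["delete"] |= pre - post  # facts the action made false
    return model


def is_safely_applicable(model, action, state):
    """An action counts as safe only if it was observed and all learned preconditions hold."""
    entry = model.get(action)
    return entry is not None and entry["pre"] <= set(state)


if __name__ == "__main__":
    plans = [
        [(frozenset({"at_a", "door_open"}), "move", frozenset({"at_b", "door_open"}))],
        [(frozenset({"at_a"}), "move", frozenset({"at_b"}))],
    ]
    learned = learn_conservative_model(plans)
    print(learned["move"])
    print(is_safely_applicable(learned, "move", {"at_a"}))
```

With only the first plan, `move` would require both `at_a` and `door_open`; the second plan relaxes that to `at_a` alone. This mirrors the incompleteness noted in the abstract: with few observed plans the model is so restrictive that a planner may fail on solvable problems, and more trajectories loosen it.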


2021 ◽  
Vol 17 (1) ◽  
pp. e1008552
Author(s):  
Rani Moran ◽  
Mehdi Keramati ◽  
Raymond J. Dolan

Dual-reinforcement learning theory proposes that behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB) system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated self-reflective MB planner incorporates an anticipation of the influences its own MF-proclivities exert on the execution of its planned future actions. Using a novel bandit task, wherein subjects were periodically allowed to design their environment, we show that reward-assignments were constructed in a manner consistent with a MB system taking account of its MF propensities. Thus, in the task participants assigned higher rewards to bandits that were momentarily associated with stronger MF tendencies. Our findings have implications for a range of decision making domains that includes drug abuse, pre-commitment, and the tension between short and long-term decision horizons in economics.


2019 ◽  
Author(s):  
Florian Bolenz ◽  
Wouter Kool ◽  
Andrea M.F. Reiter ◽  
Ben Eppinger

When making decisions, humans employ different strategies which are commonly formalized as model-free and model-based reinforcement learning. While previous research has reported reduced model-based control with aging, it remains unclear whether this is due to limited cognitive capacities or a reduced willingness to engage in an effortful strategy. Moreover, it is not clear how aging affects the metacontrol of decision making, i.e. the dynamic adaptation of decision-making strategies to varying situational demands. To this end, we tested younger and older adults in a sequential decision-making task that dissociates model-free and model-based control. In contrast to previous research, in this study we applied a task in which model-based control led to higher payoffs in terms of monetary reward. Moreover, we manipulated the costs and benefits associated with model-based control by varying reward magnitude as well as the stability of the task structure. Compared to younger adults, older adults showed reduced reliance on model-based decision making and less adaptation of decision-making strategies to varying costs and benefits of model-based control. Our findings suggest that aging affects the dynamic metacontrol of decision-making strategies and that reduced model-based control in older adults is due to limited cognitive abilities to represent the structure of the task.

