Revealing neuro-computational mechanisms of reinforcement learning and decision-making with the hBayesDM package

2017 ◽  
Vol 1 ◽  
pp. 24-57 ◽  
Author(s):  
Woo-Young Ahn ◽  
Nathaniel Haines ◽  
Lei Zhang

Reinforcement learning and decision-making (RLDM) provide a quantitative framework and computational theories with which we can disentangle psychiatric conditions into the basic dimensions of neurocognitive functioning. RLDM offer a novel approach to assessing and potentially diagnosing psychiatric patients, and there is growing enthusiasm for both RLDM and computational psychiatry among clinical researchers. Such a framework can also provide insights into the brain substrates of particular RLDM processes, as exemplified by model-based analysis of data from functional magnetic resonance imaging (fMRI) or electroencephalography (EEG). However, researchers often find the approach too technical and have difficulty adopting it for their research. Thus, a critical need remains to develop a user-friendly tool for the wide dissemination of computational psychiatric methods. We introduce an R package called hBayesDM (hierarchical Bayesian modeling of Decision-Making tasks), which offers computational modeling of an array of RLDM tasks and social exchange games. The hBayesDM package offers state-of-the-art hierarchical Bayesian modeling, in which both individual and group parameters (i.e., posterior distributions) are estimated simultaneously in a mutually constraining fashion. At the same time, the package is extremely user-friendly: users can perform computational modeling, output visualization, and Bayesian model comparisons, each with a single line of coding. Users can also extract the trial-by-trial latent variables (e.g., prediction errors) required for model-based fMRI/EEG. With the hBayesDM package, we anticipate that anyone with minimal knowledge of programming can take advantage of cutting-edge computational-modeling approaches to investigate the underlying processes of and interactions between multiple decision-making (e.g., goal-directed, habitual, and Pavlovian) systems. In this way, we expect that the hBayesDM package will contribute to the dissemination of advanced modeling approaches and enable a wide range of researchers to easily perform computational psychiatric research within different populations.
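
The package itself is written in R (with Stan performing the sampling), so the snippet below is only a language-agnostic sketch of the kind of trial-by-trial latent variable the package exposes for model-based fMRI/EEG: Rescorla-Wagner reward prediction errors for a single subject under a fixed learning rate. The function name and data layout are hypothetical; this is not the hBayesDM API.

```python
import numpy as np

def rescorla_wagner_prediction_errors(choices, rewards, alpha=0.2, n_options=2):
    """Trial-by-trial reward prediction errors under a simple delta-rule model.

    choices: chosen option index (0-based) on each trial.
    rewards: obtained reward on each trial.
    alpha:   learning rate, fixed here; hBayesDM estimates such parameters
             hierarchically, per subject, constrained by group-level priors.
    """
    q = np.zeros(n_options)          # initial expected values
    pe = np.empty(len(choices))      # one prediction error per trial
    for t, (c, r) in enumerate(zip(choices, rewards)):
        pe[t] = r - q[c]             # reward prediction error
        q[c] += alpha * pe[t]        # delta-rule update of the chosen option
    return pe

# Hypothetical two-armed bandit session for one subject.
print(rescorla_wagner_prediction_errors([0, 0, 1, 0, 1, 1], [1, 0, 1, 1, 0, 1]))
```

In the hierarchical setting described above, each subject's learning rate is drawn from a group-level distribution, so individual estimates are shrunk toward the group mean; regressors extracted in this way are what typically enter model-based fMRI/EEG analyses.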


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to a lack of state coverage or to distribution mismatch, that is, when the learner's goal deviates from the demonstrated behaviors. Moreover, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations, querying the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is greatest. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation-learning approaches on most of the tasks in terms of exploration efficiency and average scores.
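
The abstract does not spell out the prioritization rule, so the following is only a rough sketch of the idea of querying the demonstrator where expert and policy disagree most: goals are sampled with probability proportional to a disagreement score. The names (disagreement, sample_goal) and the distance-based score are assumptions, not the authors' implementation.

```python
import numpy as np

def disagreement(expert_action, policy_action):
    """Toy disagreement score: Euclidean distance between the actions
    the expert and the current policy would take for the same goal."""
    return float(np.linalg.norm(np.asarray(expert_action, dtype=float)
                                - np.asarray(policy_action, dtype=float)))

def sample_goal(goals, expert_actions, policy_actions, rng=None):
    """Sample a goal with probability proportional to expert/policy disagreement,
    so hard-to-learn or uncertain goals are queried (and demonstrated) more often."""
    if rng is None:
        rng = np.random.default_rng()
    scores = np.array([disagreement(e, p)
                       for e, p in zip(expert_actions, policy_actions)])
    if scores.sum() == 0:
        probs = np.full(len(goals), 1.0 / len(goals))   # fall back to uniform sampling
    else:
        probs = scores / scores.sum()
    return goals[rng.choice(len(goals), p=probs)]

# Hypothetical usage: three candidate goals with cached expert/policy actions.
goals = ["goal_a", "goal_b", "goal_c"]
expert = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
policy = [[0.0, 1.0], [0.2, 0.9], [0.5, 0.5]]
print(sample_goal(goals, expert, policy))
```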


2020 ◽  
Author(s):  
Gabriel Weindel ◽  
Royce Anders ◽  
F.-Xavier Alario ◽  
Boris Burle

Decision-making models based on evidence accumulation processes (the most prolific being the drift-diffusion model, or DDM) are widely used to draw inferences about latent psychological processes from chronometric data. While the observed goodness of fit in a wide range of tasks supports the model's validity, the derived interpretations have yet to be sufficiently cross-validated against other measures that also reflect cognitive processing. To do so, we recorded electromyographic (EMG) activity along with response times (RT) and used it to decompose every RT into two components: a pre-motor time (PMT) and a motor time (MT). These measures were mapped onto the DDM's parameters, allowing a test, beyond quality of fit, of the validity of the model's assumptions and their usual interpretation. In two perceptual decision tasks performed within a canonical task setting, we manipulated stimulus contrast, speed-accuracy trade-off, and response force, and assessed their effects on PMT, MT, and RT. Contrary to common assumptions, all three factors consistently affected MT. Estimates of the DDM's non-decision parameters, which are thought to include motor execution processes, were globally linked to the recorded MT; however, the link was weaker in the fastest trials, where the assumption of independence between decision and non-decision processes was not met. Overall, the results show fair concordance between model-based and EMG-based decompositions of RT, but they also establish limits on the interpretability of decision-model parameters linked to response execution.
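
As a minimal sketch of the decomposition described above (not the authors' analysis code), the snippet below splits each RT at the EMG onset into a pre-motor time and a motor time; the variable names and units are assumptions.

```python
import numpy as np

def decompose_rt(rt_ms, emg_onset_ms):
    """Split each response time at the EMG onset:
    PMT = stimulus onset to EMG onset (premotor processes),
    MT  = EMG onset to the overt response (motor execution)."""
    rt = np.asarray(rt_ms, dtype=float)
    onset = np.asarray(emg_onset_ms, dtype=float)
    pmt = onset          # time elapsed before the response muscle starts firing
    mt = rt - onset      # time from EMG burst onset to the key press
    return pmt, mt

# Hypothetical trials (ms), both measures taken from stimulus onset.
pmt, mt = decompose_rt([450, 520, 610], [370, 430, 500])
print(pmt.mean(), mt.mean())   # mean PMT and MT, e.g. for comparison with DDM non-decision time
```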


2021 ◽  
Vol 17 (1) ◽  
pp. e1008552 ◽  
Author(s):  
Rani Moran ◽  
Mehdi Keramati ◽  
Raymond J. Dolan

Dual-system reinforcement learning theory proposes that behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB) system. This architecture raises the question of the degree to which, when devising a plan, an MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated, self-reflective MB planner incorporates an anticipation of the influences its own MF proclivities exert on the execution of its planned future actions. Using a novel bandit task, in which subjects were periodically allowed to design their environment, we show that reward assignments were constructed in a manner consistent with an MB system taking account of its MF propensities. Thus, in the task, participants assigned higher rewards to bandits that were momentarily associated with stronger MF tendencies. Our findings have implications for a range of decision-making domains, including drug abuse, pre-commitment, and the tension between short- and long-term decision horizons in economics.
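
One way to read "higher rewards to bandits that were momentarily associated with stronger MF tendencies" is that the planner weights its reward placement by the MF system's anticipated (softmax) choice probabilities, so the planned reward lands where habits will steer future choices. The sketch below only illustrates that reading; it is not the authors' model.

```python
import numpy as np

def softmax(values, beta=3.0):
    """Softmax choice probabilities with inverse temperature beta."""
    z = beta * (np.asarray(values, dtype=float) - np.max(values))
    e = np.exp(z)
    return e / e.sum()

def place_reward(mf_values):
    """A self-reflective MB planner assigns the higher reward to the bandit its
    own MF system is currently most likely to choose, anticipating that MF
    proclivities will partly drive execution of the plan."""
    return int(np.argmax(softmax(mf_values)))

# Hypothetical cached MF values for three bandits.
print(place_reward([0.2, 0.8, 0.5]))   # -> 1, the momentarily strongest MF tendency
```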


2014 ◽  
Vol 369 (1655) ◽  
pp. 20130480 ◽  
Author(s):  
Matthew Botvinick ◽  
Ari Weinstein

Recent work has reawakened interest in goal-directed or ‘model-based’ choice, where decisions are based on prospective evaluation of potential action outcomes. Concurrently, there has been growing attention to the role of hierarchy in decision-making and action control. We focus here on the intersection between these two areas of interest, considering the topic of hierarchical model-based control. To characterize this form of action control, we draw on the computational framework of hierarchical reinforcement learning, using this to interpret recent empirical findings. The resulting picture reveals how hierarchical model-based mechanisms might play a special and pivotal role in human decision-making, dramatically extending the scope and complexity of human behaviour.
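
The hierarchical reinforcement learning framework the authors draw on is usually formalized with temporally extended "options", each bundling an initiation set, a sub-policy, and a termination condition. The sketch below is a generic illustration of that data structure, not code from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Option:
    """A temporally extended action: it may start in states of its initiation
    set, follows its own sub-policy, and stops when termination fires."""
    initiation_set: Set[int]
    policy: Callable[[int], int]        # state -> primitive action
    termination: Callable[[int], bool]  # state -> should the option end?

def run_option(option, state, step, max_steps=100):
    """Execute an option from `state` with a transition function
    step(state, action) -> next_state, returning the terminating state."""
    assert state in option.initiation_set
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        if option.termination(state):
            break
    return state

# Toy example on a 1-D corridor: an option that walks right until reaching state 5.
walk_right = Option(initiation_set={0, 1, 2, 3, 4},
                    policy=lambda s: 1,
                    termination=lambda s: s >= 5)
print(run_option(walk_right, 0, step=lambda s, a: s + a))   # -> 5
```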


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Florent Wyckmans ◽  
A. Ross Otto ◽  
Miriam Sebold ◽  
Nathaniel Daw ◽  
Antoine Bechara ◽  
...  

Compulsive behaviors (e.g., addiction) can be viewed as an aberrant decision process in which inflexible reactions automatically evoked by stimuli (habits) take control over decision making to the detriment of a more flexible, goal-directed behavioral learning system. These two modes of behavior are commonly formalized as "model-based" and "model-free" reinforcement learning algorithms. Gambling disorder, a form of addiction without the confound of the neurotoxic effects of drugs, has been associated with impaired goal-directed control, but the way in which problem gamblers (PG) orchestrate model-based and model-free strategies has not been evaluated. Forty-nine PG and 33 healthy control participants (CP) completed a two-step sequential choice task for which model-based and model-free learning have distinct and identifiable trial-by-trial learning signatures. The influence of common psychopathological comorbidities on these two forms of learning was also investigated. PG showed impaired model-based learning, particularly after unrewarded outcomes. In addition, PG exhibited faster reaction times than CP following unrewarded decisions. Mood disturbance, higher impulsivity (i.e., positive and negative urgency), and current and chronic stress, as reported via questionnaires, did not account for these results. These findings demonstrate specific reinforcement learning and decision-making deficits in a behavioral addiction; they advance our understanding of such disorders and may point to important dimensions for designing effective interventions.
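
A standard readout of the "distinct and identifiable trial-by-trial learning signatures" in the two-step task is the stay-probability analysis: how often the first-stage choice is repeated as a function of whether the previous trial was rewarded and whether its transition was common or rare. The sketch below is a generic version with an assumed data layout, not the study's analysis code.

```python
import numpy as np

def stay_probabilities(choices, rewarded, common):
    """P(repeat previous first-stage choice), split by previous reward (True/False)
    x previous transition (common/rare). A purely model-free learner stays after
    reward regardless of transition; model-based staying depends on the
    reward-by-transition interaction."""
    choices, rewarded, common = map(np.asarray, (choices, rewarded, common))
    stay = choices[1:] == choices[:-1]           # did the first-stage choice repeat?
    prev_rew = rewarded[:-1].astype(bool)
    prev_com = common[:-1].astype(bool)
    out = {}
    for r in (True, False):
        for c in (True, False):
            mask = (prev_rew == r) & (prev_com == c)
            out[(r, c)] = stay[mask].mean() if mask.any() else float("nan")
    return out

# Hypothetical trial sequence: first-stage choice, reward, transition type (1 = common).
choices  = [0, 0, 1, 1, 0, 0, 1, 0]
rewarded = [1, 0, 1, 1, 0, 1, 0, 0]
common   = [1, 1, 0, 1, 1, 0, 1, 1]
print(stay_probabilities(choices, rewarded, common))
```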


2022 ◽  
Vol 12 ◽  
Author(s):  
Miriam Sebold ◽  
Hao Chen ◽  
Aleyna Önal ◽  
Sören Kuitunen-Paul ◽  
Negin Mojtahedzadeh ◽  
...  

Background: Prejudices against minorities can be understood as habitually negative evaluations that are maintained in spite of evidence to the contrary. Individuals with strong prejudices might therefore be dominated by habitual or "automatic" reactions at the expense of more controlled reactions. Computational theories suggest individual differences in the balance between habitual/model-free and deliberative/model-based decision-making. Methods: 127 subjects performed the two-step task and completed the blatant and subtle prejudice scales. Results: Analyses of choices and reaction times, combined with computational modeling, showed that subjects with stronger blatant prejudices exhibited a shift away from model-based control. There was no association between these decision-making processes and subtle prejudices. Conclusion: These results support the idea that blatant prejudices toward minorities are related to a relative dominance of habitual decision-making. This finding has important implications for developing interventions that aim to change prejudices across societies.
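
The "balance between habitual/model-free and deliberative/model-based decision-making" is commonly captured by a single weighting parameter w in a hybrid learner for the two-step task. The sketch below shows that mixture in its simplest form; it is an illustration, not the fitted model from the study.

```python
import numpy as np

def hybrid_values(q_mb, q_mf, w):
    """Hybrid first-stage action values: w = 1 is purely model-based (deliberative),
    w = 0 is purely model-free (habitual), as in standard hybrid two-step models."""
    return w * np.asarray(q_mb, dtype=float) + (1 - w) * np.asarray(q_mf, dtype=float)

def choice_probabilities(q_net, beta=4.0):
    """Softmax over the hybrid values with inverse temperature beta."""
    z = beta * (q_net - q_net.max())
    e = np.exp(z)
    return e / e.sum()

# Hypothetical values where the MB and MF systems disagree about the better option.
q_mb, q_mf = [0.7, 0.3], [0.2, 0.8]
for w in (0.2, 0.8):   # habit-dominated vs. goal-directed agent
    print(w, choice_probabilities(hybrid_values(q_mb, q_mf, w)))
```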


Author(s):  
Andreas Heinz

While dopaminergic neurotransmission has largely been implicated in reinforcement learning and in model-based versus model-free decision making, serotonergic neurotransmission has been implicated in encoding aversive outcomes. Accordingly, serotonin dysfunction has been observed in disorders characterized by negative affect, including depression, anxiety, and addiction. Serotonin dysfunction in these mental disorders is described, and its association with negative affect is discussed.


2021 ◽  
Author(s):  
Sarah M. Tashjian ◽  
Toby Wise ◽  
Dean Mobbs

Protection, or the mitigation of harm, often involves the capacity to prospectively plan the actions needed to combat a threat. The computational architecture of decisions involving protection remains unclear, as does whether these decisions differ from other positive prospective actions. Here we examine the effects of valence and context by comparing protection to reward, which occurs in a different context but is also positively valenced, and to punishment, which also occurs in an aversive context but differs in valence. We applied computational modeling across three independent studies (total N = 600), using five iterations of a 'two-step' behavioral task, to examine model-based reinforcement learning for protection, reward, and punishment in humans. Decisions motivated by acquiring safety via protection evoked a higher degree of model-based control than decisions aimed at acquiring reward or avoiding punishment, with no significant differences in learning rate. The context-valence asymmetry characteristic of protection increased the deployment of flexible decision strategies, suggesting that model-based control depends on the context in which outcomes are encountered as well as on the valence of the outcome.

