The Straw That Broke the Camel’s Back: Natural Variations in 17β-Estradiol and COMT-Val158Met Genotype Interact in the Modulation of Model-Free and Model-Based Control

2021 ◽  
Vol 15 ◽  
Author(s):  
Esther K. Diekhof ◽  
Andra Geana ◽  
Frederike Ohm ◽  
Bradley B. Doll ◽  
Michael J. Frank

The sex hormone estradiol has recently gained attention in human decision-making research. Animal studies have already shown that estradiol promotes dopaminergic transmission and thus supports reward-seeking behavior and aspects of addiction. In humans, natural variations of estradiol across the menstrual cycle modulate the ability to learn from direct performance feedback (“model-free” learning). However, it remains unclear whether estradiol also influences more complex “model-based” contributions to reinforcement learning. Here, 41 women were tested twice (once in the low and once in the high estradiol state of the follicular phase of their menstrual cycle) with a Two-Step decision task designed to separate model-free from model-based learning. The results showed that in the high estradiol state women relied more heavily on model-free learning and achieved smaller performance gains, particularly during the more volatile periods of the task that demanded increased learning effort. In contrast, across the group, model-based control was not altered by hormonal state. Yet, when accounting for individual differences in the genetic proxy of the COMT-Val158Met polymorphism (rs4680), we observed that only the participants homozygous for the methionine allele (n = 12; with putatively higher prefrontal dopamine) experienced a decline in model-based control when facing volatile reward probabilities. This group also showed the increase in suboptimal model-free control, while the carriers of the valine allele remained unaffected by the rise in endogenous estradiol. Taken together, these preliminary findings suggest that endogenous estradiol may affect the balance between model-based and model-free control, particularly in women with a high prefrontal baseline dopamine capacity and in situations of increased environmental volatility.
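The abstract above does not give the computational model, so the following is only a minimal sketch of how two-step analyses commonly express the balance between the two systems: a weight `w` mixes model-based values (computed from a transition model) with cached model-free values. The 0.7/0.3 transition probabilities, the parameter names `w` and `beta`, and all numeric values are illustrative assumptions, not taken from the study.

```python
import math

# Assumed task structure (not stated in the abstract): each first-stage action
# leads to one of two second-stage states with probability 0.7 or 0.3.
TRANSITIONS = {
    0: {0: 0.7, 1: 0.3},  # action 0 usually leads to state 0
    1: {0: 0.3, 1: 0.7},  # action 1 usually leads to state 1
}


def model_based_values(q_stage2):
    """Expected best second-stage value for each first-stage action."""
    return [
        sum(p * max(q_stage2[s]) for s, p in TRANSITIONS[a].items())
        for a in (0, 1)
    ]


def hybrid_values(q_mf, q_stage2, w):
    """Mix model-based and model-free values; w=1 is purely model-based."""
    q_mb = model_based_values(q_stage2)
    return [w * mb + (1 - w) * mf for mb, mf in zip(q_mb, q_mf)]


def softmax(values, beta):
    """Choice probabilities with inverse temperature beta."""
    exps = [math.exp(beta * v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values for illustration only.
q_mf = [0.7, 0.3]                    # cached (model-free) first-stage values
q_stage2 = [[0.2, 0.3], [0.8, 0.5]]  # learned values in each second-stage state
probs = softmax(hybrid_values(q_mf, q_stage2, w=0.5), beta=3.0)
```

In this framing, a shift toward model-free control corresponds to a smaller fitted `w`: the cached values then dominate choice even when the transition model favors the other action.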

2016 ◽  
Author(s):  
Λεωνίδας Δρούκας

As with the human hand, so too with the fingers of a robotic hand: rolling motion of the fingertips is decisive for achieving stable grasping and smooth manipulation of an object. In contrast to sliding over the surface that the robotic finger touches, rolling helps position the fingertip more accurately on that surface and thus contributes to smoother overall manipulation of the object, for example when translating or rotating it. In the existing literature, most proposed control laws are designed on the basis of an idealized model of the robotic finger and contact surface system. This model has the fingertip rolling constraints built in, so that rolling motion is essentially taken as given and guaranteed in advance. In reality this does not hold, since the prevailing friction conditions, which depend on the materials of the fingertip and the surface, may favor rolling of the former on the latter only slightly or not at all. In such cases, these control methodologies are very likely to drive the fingertip into a motion that combines rolling and sliding, or even consists of sliding alone. In this doctoral thesis, rolling of the fingertip on a contact surface is not assumed in advance but is treated as an additional control objective. The proposed controllers ensure rolling of the spherical tip of a robotic finger on an arbitrary surface with unknown or even unfavorable friction conditions, while also achieving control of the fingertip position and of the magnitude of the normal force it exerts, an objective that is widely used in robotics.
Controllers are designed that either rely on knowledge of the robotic finger model (model-based control) or require no such information (model-free control). The latter are designed using the Prescribed Performance Control Methodology. For all proposed controllers, the theoretical design and a simulation-based evaluation are presented, demonstrating that the main control objective, namely rolling of the robotic fingertip on the contact surface, is ensured; in some cases, experimental results are also presented that further support the effectiveness of the corresponding controllers. Finally, the rolling ensured by one of the proposed controllers is exploited to apply an appropriate tangential force so as to achieve the desired non-prehensile manipulation of an object by a robotic finger. In particular, simulations show the successful translation and/or rotation of a planar and a cylindrical object through non-prehensile manipulation by a robotic finger that exploits the rolling of its tip on the object.
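To illustrate why friction conditions decide whether a fingertip rolls or slides, here is a minimal sketch that assumes a simple Coulomb friction model; the thesis abstract does not state which friction model it uses. The function name `contact_mode` and its parameters are hypothetical. Under this assumption, rolling without slipping is possible only while the tangential force stays within the friction limit set by the normal force.

```python
def contact_mode(tangential_force, normal_force, friction_coeff):
    """Classify a point contact under an assumed Coulomb friction model.

    Returns "no contact" when the normal force is not compressive,
    "rolling" when static friction can hold the contact point (no slip),
    and "sliding" when the tangential force exceeds the friction limit.
    """
    if normal_force <= 0.0:
        return "no contact"
    if abs(tangential_force) <= friction_coeff * normal_force:
        return "rolling"
    return "sliding"


# Hypothetical forces in newtons and an illustrative friction coefficient.
low_friction = contact_mode(tangential_force=3.0, normal_force=5.0, friction_coeff=0.5)
high_friction = contact_mode(tangential_force=3.0, normal_force=5.0, friction_coeff=0.8)
```

The same applied forces give different contact modes for different friction coefficients, which is why a controller that takes rolling for granted can end up sliding on an unfavorable surface.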


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Meltem I. Kasal ◽  
Lutfullah Besiroglu ◽  
Nabi Zorlu ◽  
Nur Dikmeer ◽  
Aslıhan Bilge ◽  
...  

Recent theories suggest a shift from model-based goal-directed to model-free habitual decision-making in obsessive–compulsive disorder (OCD). However, it remains unclear whether this shift in the decision process is heritable. We investigated 32 patients with OCD, 27 unaffected siblings (SIBs) and 31 healthy controls (HCs) using the two-step task. We ran behavioral and reaction-time analyses and fitted a computational model to assess the balance between model-based and model-free control. Eighty subjects also underwent structural imaging. We observed a significant ordered effect for the shift towards model-free control in the direction OCD > SIB > HC in our computational parameter of interest. However, less directed analyses revealed no shift towards model-free control in patients with OCD. Nonetheless, second-stage reaction-time analyses provided evidence for reduced model-based control in patients with OCD compared to HCs and SIBs. In this measure, SIBs also showed higher levels of model-based control than HCs. Across all subjects, these effects were associated with the surface area of the left medial/right dorsolateral prefrontal cortex. Moreover, correlations between bilateral putamen/right caudate volumes and these effects varied as a function of group: they were negative in SIBs and patients with OCD, but positive in HCs. Associations between fronto-striatal regions and model-based reaction-time effects point to a potential endophenotype for OCD.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Lieneke K. Janssen ◽  
Florian P. Mahner ◽  
Florian Schlagenhauf ◽  
Lorenz Deserno ◽  
Annette Horstmann

Consuming more energy than is expended may reflect a failure of control over eating behaviour in obesity. Behavioural control arises from a balance between two dissociable strategies of reinforcement learning: model-free and model-based. We hypothesized that weight status relates to an imbalance in reliance on model-based and model-free control, and that it may do so in a linear or quadratic manner. To test this, 90 healthy participants in a wide BMI range [normal-weight (n = 31), overweight (n = 29), obese (n = 30)] performed a sequential decision-making task. The primary analysis indicated that obese participants relied less on model-based control than overweight and normal-weight participants, with no difference between overweight and normal-weight participants. In line with this, secondary continuous analyses revealed a negative linear, but not quadratic, relationship between BMI and model-based control. Computational modelling of choice behaviour suggested that a mixture of both strategies was shifted towards less model-based control in obese participants. Our findings suggest that obesity may indeed be related to an imbalance in behavioural control, expressed as a phenotype of less model-based control that potentially results from enhanced reliance on model-free computations.


2019 ◽  
Author(s):  
Lieneke Katharina Janssen ◽  
Florian Paul Mahner ◽  
Florian Schlagenhauf ◽  
Lorenz Deserno ◽  
Annette Horstmann

Consuming more energy than is expended may reflect a failure of control over eating behaviour in obesity. Behavioural control arises from a balance between two dissociable strategies of reinforcement learning: model-free and model-based. We hypothesized that weight status relates to an imbalance in reliance on model-based and model-free control, and that it may do so in a linear or quadratic manner. To test this, 90 healthy participants in a wide BMI range [normal-weight (n = 31), overweight (n = 29), obese (n = 30)] performed a sequential decision-making task. The primary analysis indicated that obese participants relied less on model-based control than overweight and normal-weight participants, with no difference between overweight and normal-weight participants. In line with this, secondary continuous analyses revealed a negative linear, but not quadratic, relationship between BMI and model-based control. Computational modelling of choice behaviour suggested that a mixture of both strategies was shifted towards less model-based control in obese participants. Furthermore, exploratory analyses of separate weights for model-free and model-based control showed stronger reliance on model-free control with increased BMI. Our findings suggest that obesity may indeed be related to an imbalance in behavioural control, expressed as a phenotype of less model-based control that potentially results from enhanced reliance on model-free computations.


2019 ◽  
Author(s):  
Edward Patzelt ◽  
Wouter Kool ◽  
Samuel J. Gershman

The tension between habits and plans is reflected in everyday decision-making. Habits are computationally cheap, but fail to flexibly adapt to changes in the environment. Planning is a flexible decision-making strategy, but requires greater resources. Arbitration between habits and plans has been formalized using reinforcement learning algorithms that distinguish between model-free control (habits) and model-based control (plans). Evidence about these two decision-making approaches suggests model-based control follows a developmental trajectory, emerging during adolescence, strengthening during young adulthood, and declining in older adulthood. The normative decline in planning (model-based control) presents the opportunity to develop interventions to increase flexible decision-making. Therefore, we asked if incentives could be used to increase model-based control in older adults. We expected older adults would fail to increase model-based control in response to incentives. This prediction was based upon prior research suggesting older adulthood is associated with deficits in representing and updating the expected value of rewards. Contrary to our expectations, in Experiment 1 we found that incentives could be used to boost model-based control in older adults sampled from an online population. We hypothesized this may be due to previous experience with the task (or with similar tasks). In Experiment 2, a naïve sample of older adults did not boost model-based control in response to incentives. These results suggest that incentives may be a useful intervention to increase model-based planning in older adulthood, but this may require extensive experience with the incentive structure.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Lieneke K. Janssen ◽  
Florian P. Mahner ◽  
Florian Schlagenhauf ◽  
Lorenz Deserno ◽  
Annette Horstmann

An amendment to this paper has been published and can be accessed via a link at the top of the paper.


Author(s):  
Javier Loranca ◽  
Jonathan Carlos Mayo Maldonado ◽  
Gerardo Escobar ◽  
Carlos Villarreal-Hernandez ◽  
Thabiso Maupong ◽  
...  

2020 ◽  
Author(s):  
Dongjae Kim ◽  
Jaeseung Jeong ◽  
Sang Wan Lee

The goal of learning is to maximize future rewards by minimizing prediction errors. Evidence has shown that the brain achieves this by combining model-based and model-free learning. However, prediction error minimization is challenged by a bias-variance tradeoff, which imposes constraints on each strategy’s performance. We provide new theoretical insight into how this tradeoff can be resolved through the adaptive control of model-based and model-free learning. The theory predicts that baseline correction of the prediction error reduces the lower bound of the bias-variance error by factoring out irreducible noise. Using a Markov decision task with context changes, we showed behavioral evidence of adaptive control. Model-based behavioral analyses show that the prediction error baseline signals context changes to improve adaptability. Critically, the neural results support this view, demonstrating multiplexed representations of the prediction error baseline within the ventrolateral and ventromedial prefrontal cortex, key brain regions known to guide model-based and model-free learning.
One-sentence summary: A theoretical, behavioral, computational, and neural account of how the brain resolves the bias-variance tradeoff during reinforcement learning is described.
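For readers unfamiliar with the term, the textbook bias-variance decomposition of squared prediction error is given below. This is the standard statistical identity, not the paper's own derivation, and the symbols are assumptions introduced here: an outcome $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and a learned predictor $\hat{f}$.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^{2}\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^{2}}_{\text{bias}^{2}}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^{2}\right]}_{\text{variance}}
  + \underbrace{\sigma^{2}}_{\text{irreducible noise}}
```

The last term is the irreducible noise that no choice of predictor can remove; the first two terms are the ones that trade off against each other.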


2015 ◽  
Vol 114 (3) ◽  
pp. 1577-1592 ◽  
Author(s):  
Barbara La Scaleia ◽  
Myrka Zago ◽  
Francesco Lacquaniti

Two control schemes have been hypothesized for the manual interception of fast visual targets. In the model-free on-line control, extrapolation of target motion is based on continuous visual information, without resorting to physical models. In the model-based control, instead, a prior model of target motion predicts the future spatiotemporal trajectory. To distinguish between the two hypotheses in the case of projectile motion, we asked participants to hit a ball that rolled down an incline at 0.2 g and then fell in air at 1 g along a parabola. By varying starting position, ball velocity and trajectory differed between trials. Motion on the incline was always visible, whereas parabolic motion was either visible or occluded. We found that participants were equally successful at hitting the falling ball in both visible and occluded conditions. Moreover, in different trials the intersection points were distributed along the parabolic trajectories of the ball, indicating that subjects were able to extrapolate an extended segment of the target trajectory. Remarkably, this trend was observed even at the very first repetition of movements. These results are consistent with the hypothesis of model-based control, but not with on-line control. Indeed, ball path and speed during the occlusion could not be extrapolated solely from the kinematic information obtained during the preceding visible phase. The only way to extrapolate ball motion correctly during the occlusion was to assume that the ball would fall under gravity and air drag when hidden from view. Such an assumption had to be derived from prior experience.
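The two motion phases described above can be sketched with elementary kinematics. This is a minimal illustration under stated assumptions: the ball starts from rest, air drag is ignored (the abstract notes that drag also acted on the ball), and the incline length and exit angle are hypothetical values, not the experimental ones. Only the accelerations of 0.2 g and 1 g come from the abstract.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def speed_at_incline_end(distance, accel=0.2 * G):
    """Speed after starting from rest and covering `distance` metres
    at constant acceleration (0.2 g on the incline, per the abstract)."""
    return math.sqrt(2.0 * accel * distance)


def free_fall_position(v0, exit_angle_deg, t):
    """Position (x, y) in metres t seconds after leaving the incline.

    The exit angle is measured below the horizontal and y is positive
    downward. Air drag is ignored, which is a simplifying assumption.
    """
    angle = math.radians(exit_angle_deg)
    x = v0 * math.cos(angle) * t
    y = v0 * math.sin(angle) * t + 0.5 * G * t * t
    return x, y


# Hypothetical geometry: two starting positions and a 20-degree exit angle.
for start_distance in (0.5, 1.0):
    v0 = speed_at_incline_end(start_distance)
    x, y = free_fall_position(v0, exit_angle_deg=20.0, t=0.3)
    print(f"start {start_distance:.1f} m: exit speed {v0:.2f} m/s, "
          f"position after 0.3 s = ({x:.2f}, {y:.2f}) m")
```

Because acceleration changes from 0.2 g to 1 g at the end of the incline, extrapolating the visible incline kinematics alone would mispredict the fall, which is the core of the abstract's argument for a prior model of gravity.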


Author(s):  
Xiaomei Wang ◽  
Kit-Hang Lee ◽  
Denny K. C. Fu ◽  
Ziyang Dong ◽  
Kui Wang ◽  
...  
