A learning mechanism shaping risk preferences and a preliminary test of its relationship with psychopathic traits

2021
Vol 11 (1)
Author(s):
Takeyuki Oba
Kentaro Katahira
Hideki Ohira

Abstract
People tend to avoid risk in the domain of gains but take risks in the domain of losses; this is called the reflection effect. Formal theories of decision-making have provided important perspectives on risk preferences, but how individuals acquire risk preferences through experience remains unknown. In the present study, we used reinforcement learning (RL) models to examine the learning processes that can shape attitudes toward risk in both domains. In addition, relationships between learning parameters and personality traits were investigated. Fifty-one participants performed a learning task, and we examined learning parameters and risk preference in each domain. Our results revealed that an RL model that included a nonlinear subjective utility parameter and differential learning rates for positive and negative prediction errors exhibited better fit than other models, and that these parameters independently predicted risk preferences and the reflection effect. Regarding personality traits, although the sample may be too small for a conclusive test, increased primary psychopathy scores could be linked with decreased learning rates for positive prediction errors in loss conditions among participants with low trait anxiety. The present findings not only contribute to understanding how decision-making under risk is influenced by past experiences but also provide insights into certain psychiatric problems.
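The best-fitting model described above, a nonlinear subjective utility plus separate learning rates for positive and negative prediction errors, can be sketched as a single update rule. This is an illustrative sketch, not the authors' code; the parameter names and values (alpha_pos, alpha_neg, rho) are assumptions:

```python
import numpy as np

def update_value(q, outcome, alpha_pos=0.3, alpha_neg=0.1, rho=0.8):
    """One value update with nonlinear subjective utility and asymmetric
    learning rates. u(x) = sign(x) * |x|**rho maps the objective outcome to
    subjective utility; separate alphas weight positive vs. negative
    prediction errors."""
    utility = np.sign(outcome) * np.abs(outcome) ** rho
    pe = utility - q                       # prediction error in utility units
    alpha = alpha_pos if pe > 0 else alpha_neg
    return q + alpha * pe

# a gain of 10 against an expectation of 5 nudges the value upward
q_new = update_value(5.0, 10.0)
```

With rho < 1 utility is concave over gains and convex over losses, which is one standard way such a parameter can produce the reflection effect.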

2018
Author(s):
Nura Sidarus
Stefano Palminteri
Valérian Chambon

Abstract
Value-based decision-making involves trading off the cost associated with an action against its expected reward. Research has shown that both physical and mental effort constitute such subjective costs, biasing choices away from effortful actions and discounting the value of obtained rewards. Facing conflicts between competing action alternatives is considered aversive, as recruiting cognitive control to overcome conflict is effortful. Yet, it remains unclear whether conflict is also perceived as a cost in value-based decisions. The present study investigated this question by embedding irrelevant distractors (flanker arrows) within a reversal-learning task with intermixed free and instructed trials. Results showed that participants learned to adapt their choices to maximize rewards, but were nevertheless biased to follow the suggestions of irrelevant distractors. Thus, the perceived cost of being in conflict with an external suggestion could sometimes trump internal value representations. By adapting computational models of reinforcement learning, we assessed the influence of conflict at both the decision and learning stages. Modelling the decision showed that conflict was avoided when evidence for either action alternative was weak, demonstrating that the cost of conflict was traded off against expected rewards. During the learning phase, we found that learning rates were reduced in instructed, relative to free, choices. Learning rates were further reduced by conflict between an instruction and subjective action values, whereas learning was not robustly influenced by conflict between one's actions and external distractors. Our results show that the subjective cost of conflict factors into value-based decision-making, and highlight that different types of conflict may have different effects on learning about action outcomes.
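One way to formalize the decision-stage finding, that the cost of conflict is traded off against expected reward, is a softmax choice rule with a bonus for the distractor-congruent action. This is a hypothetical sketch, not the authors' model; beta and kappa are assumed parameters:

```python
import numpy as np

def choice_probs(q_values, distractor, beta=3.0, kappa=0.5):
    """Softmax over action values, biased toward the distractor-congruent
    action: choosing against the distractor incurs a conflict cost, modeled
    here as a bonus kappa for the congruent option. When value evidence is
    weak (similar q_values), the conflict term dominates the choice."""
    logits = beta * np.asarray(q_values, dtype=float)
    logits[distractor] += kappa            # avoid conflict with the distractor
    exp = np.exp(logits - logits.max())    # stable softmax
    return exp / exp.sum()

# weak value evidence: the distractor sways the choice
p = choice_probs([0.51, 0.49], distractor=1)
```

With strong value evidence the q-value term dominates and the distractor bias largely disappears, matching the idea that conflict cost matters most when either alternative is weakly supported.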


2018
Author(s):
Samuel D. McDougle
Peter A. Butcher
Darius Parvin
Faisal Mushtaq
Yael Niv
...  

Abstract
Decisions must be implemented through actions, and actions are prone to error. As such, when an expected outcome is not obtained, an individual should be sensitive not only to whether the choice itself was suboptimal, but also to whether the action required to indicate that choice was executed successfully. The intelligent assignment of credit to action execution versus action selection has clear ecological utility for the learner. To explore this scenario, we used a modified version of a classic reinforcement learning task in which feedback indicated whether negative prediction errors were, or were not, associated with execution errors. Using fMRI, we asked whether prediction error computations in the human striatum, a key substrate in reinforcement learning and decision making, are modulated when a failure in action execution results in the negative outcome. Participants were more tolerant of non-rewarded outcomes when these resulted from execution errors than when execution was successful but the reward was withheld. Consistent with this behavior, a model-driven analysis of neural activity revealed an attenuation of the signal associated with negative reward prediction errors in the striatum following execution failures. These results converge with other lines of evidence suggesting that prediction errors in the mesostriatal dopamine system integrate high-level information during the evaluation of instantaneous reward outcomes.
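The core idea, attenuating the negative reward prediction error when the failure was one of execution rather than selection, can be written as a small gating rule. The sketch and its gate parameter are illustrative assumptions, not the authors' model:

```python
def striatal_rpe(expected, reward, execution_error, gate=0.4):
    """Reward prediction error, attenuated when a miss was caused by a
    failure of action execution rather than a bad choice. gate in [0, 1]
    scales down the teaching signal on execution-error trials, so the
    chosen option's value is punished less for a motor slip."""
    pe = reward - expected
    if execution_error and pe < 0:
        pe *= gate                         # "it wasn't the choice's fault"
    return pe

# a miss caused by a motor slip teaches less than a genuine bad choice
pe_slip = striatal_rpe(1.0, 0.0, execution_error=True)
pe_choice = striatal_rpe(1.0, 0.0, execution_error=False)
```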


2018
Author(s):
Joanne C. Van Slooten
Sara Jahfari
Tomas Knapen
Jan Theeuwes

Abstract
Pupil responses have been used to track cognitive processes during decision-making. Studies have shown that the pupil reflects the joint activation of many cortical and subcortical brain regions, including those traditionally implicated in value-based learning. However, how the pupil tracks value-based decisions and reinforcement learning is unknown. We combined a reinforcement learning task with a computational model to study pupil responses during value-based decisions and decision evaluations. We found that the pupil closely tracks reinforcement learning both across trials and across participants. Prior to choice, the pupil dilated as a function of trial-by-trial fluctuations in value beliefs. After feedback, early dilation scaled with value uncertainty, whereas later constriction scaled with reward prediction errors. Our computational approach systematically implicates the pupil in value-based decisions and in the subsequent processing of violated value beliefs. These dissociable influences provide an exciting possibility to non-invasively study ongoing reinforcement learning via the pupil.


2020
Author(s):
Jil Humann
Adrian Georg Fischer
Markus Ullsperger

Research suggests that working memory (WM) has an important role in instrumental learning in changeable environments when reinforcement histories of multiple options must be tracked. Working memory capacity (WMC) not only reflects the ability to maintain items, but also to update and shield items against interference in a context-dependent manner; functions conceivably also essential to instrumental learning. To address the relationship of WMC and instrumental learning, we studied choice behavior and EEG of participants performing a probabilistic reversal learning task. Their separately measured WMC positively correlated with reversal learning performance. Computational modeling revealed that low-capacity participants modulated learning rates less dynamically around value reversals. Their choices were more stochastic and less guided by learnt values, resulting in less stable performance and higher susceptibility to misleading probabilistic feedback. Single-trial model-based EEG analysis revealed that prediction errors and learning rates were less strongly represented in cortical activity of low-capacity participants, while the centroparietal positivity, a general correlate of adaptation, was independent of WMC. In conclusion, cognitive functions tackled by WMC tasks are also necessary in instrumental learning. We suggest that noisier representations render items held in WM as well as tracked values in instrumental learning less stable and more susceptible to distractors.
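Dynamic modulation of learning rates around value reversals, which the low-capacity participants showed less of, is often formalized with a Pearce-Hall associability term in which recent surprise scales the effective learning rate. The abstract does not specify the authors' model, so this is a generic sketch with assumed parameters:

```python
def pearce_hall_update(q, assoc, reward, eta=0.5, kappa=0.3):
    """Value update with a dynamic learning rate (Pearce-Hall associability).
    The effective learning rate kappa * assoc rises after surprising
    outcomes (e.g. around a reversal) and decays when predictions are
    accurate, yielding fast relearning without constant volatility."""
    pe = reward - q
    q_new = q + kappa * assoc * pe
    assoc_new = (1 - eta) * assoc + eta * abs(pe)   # track recent surprise
    return q_new, assoc_new

# a reversal (expected 1.0, got 0.0) boosts associability for the next trial
q2, a2 = pearce_hall_update(1.0, 0.2, 0.0)
```

A flatter, less dynamic associability trace would mimic the low-capacity pattern: more stochastic values around reversals and higher susceptibility to misleading feedback.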


2021
Vol 17 (7)
pp. e1009213
Author(s):
Moritz Moeller
Jan Grohn
Sanjay Manohar
Rafal Bogacz

Reward prediction errors (RPEs) and risk preferences have two things in common: both can shape decision-making behavior, and both are commonly associated with dopamine. RPEs drive value learning and are thought to be represented in the phasic release of striatal dopamine. Risk preferences bias choices towards or away from uncertainty; they can be manipulated with drugs that target the dopaminergic system. Based on this common neural substrate, we hypothesize that RPEs and risk preferences are linked at the level of behavior as well. Here, we develop this hypothesis theoretically and test it empirically. First, we apply a recent theory of learning in the basal ganglia to predict how RPEs influence risk preferences. We find that positive RPEs should cause increased risk seeking, while negative RPEs should cause risk aversion. We then test our behavioral predictions using a novel bandit task in which value and risk vary independently across options. Critically, conditions are included in which options vary in risk but are matched for value. We find that our prediction was correct: participants become more risk-seeking if choices are preceded by positive RPEs, and more risk-averse if choices are preceded by negative RPEs. These findings cannot be explained by other known effects, such as nonlinear utility curves or dynamic learning rates.
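The behavioral prediction, that positive RPEs increase risk seeking even on value-matched options, can be sketched by letting the most recent RPE shift the weight placed on outcome variance. The tanh modulation and all parameter values are assumptions for illustration, not the authors' theory:

```python
import numpy as np

def risk_sensitive_choice(means, variances, last_rpe, beta=3.0, w=0.5):
    """Softmax choice over options whose subjective value combines expected
    reward and risk; the risk weight is shifted by the most recent reward
    prediction error, so positive RPEs promote risk seeking and negative
    RPEs promote risk aversion."""
    risk_weight = w * np.tanh(last_rpe)    # bounded modulation by the RPE
    values = np.asarray(means) + risk_weight * np.asarray(variances)
    exp = np.exp(beta * (values - values.max()))
    return exp / exp.sum()

# value-matched options that differ only in risk (option 1 is the risky one)
p_after_gain = risk_sensitive_choice([1.0, 1.0], [0.0, 1.0], last_rpe=+1.0)
p_after_loss = risk_sensitive_choice([1.0, 1.0], [0.0, 1.0], last_rpe=-1.0)
```

Because the options are matched for mean value, any preference shift here is attributable to the RPE-driven risk term, mirroring the critical value-matched conditions of the task.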


2020
Vol 46 (Supplement_1)
pp. S255-S255
Author(s):
James Waltz
Dennis Hernaus
Robert Wilson
Elliot Brown
Michael Frank
...  

Abstract Background We have found that measures of reinforcement learning (RL) performance correlate with negative symptom severity in adult schizophrenia patients, as well as in adolescents and young adults seeking psychiatric services. Most of these tasks assess reinforcement learning in stable environments, however. In unstable, or volatile, environments, adaptive learning and decision making depend on the ability to use one's own uncertainty to modulate attention to feedback. In stable RL environments, parameters called learning rates (signified by ⍺) capture the impact of prediction errors on changes in association strength, with each subject having a single learning rate for a given kind of prediction error (e.g., positive vs. negative). In volatile environments, learning rates might be more appropriately modeled as dynamic, modulated by uncertainty. Furthermore, uncertainty is known to guide what is called "the explore/exploit trade-off": the threshold for choosing more informative options, potentially at the expense of options with higher expected value. Methods We have examined the contribution of uncertainty processing to the emergence of negative symptoms in people along the schizophrenia spectrum in several ways. First, in conjunction with fMRI, we administered a 3-choice version of a probabilistic reversal learning task to 26 patients with schizophrenia (PSZ) and 23 healthy volunteers (HV); the task required participants to resolve uncertainty and determine the new best option after sudden, sporadic contingency shifts. Second, we assessed the role of uncertainty in driving decision making under ambiguity, using two distinct tasks in cohorts of schizophrenia patients and healthy volunteers. Motivational symptoms were assessed in PSZ using the Scale for the Assessment of Negative Symptoms (SANS), from which we computed scores for Avolition/Role-Functioning, Anhedonia/Asociality, and an Avolition/Anhedonia/Asociality (AAA) factor.
Results In the context of the 3-choice version of a probabilistic reversal learning task, we found that SZ patients with more severe anhedonia and avolition show a reduced ability to dynamically modulate learning rates in a volatile environment. A follow-up psychophysiological interaction analysis revealed decreased dmPFC-VS connectivity concurrent with learning rate modulation, most prominently in individuals with the most severe motivational deficits. Finally, in the context of decision making under ambiguity, we have found that SZ patients with more severe anhedonia and avolition, as measured by the SANS, show a reduced tendency to explore contingencies in the service of reducing uncertainty. Furthermore, we found that mean negative symptom scores correlated negatively with change in information weight, a model-based measure of directed exploration. Discussion These results indicate that multiple potential mechanisms underlie motivational deficits in schizophrenia spectrum disorders, including processes related to the ability to flexibly modulate learning and decision making according to one's level of certainty about contingencies in the environment. That is, beyond deficits in reward-seeking behavior, a reduced ability to use uncertainty to modulate learning rates and a reduced tendency to engage in information-seeking behavior may make substantial contributions to negative symptoms in people with psychotic illness and people at risk for psychotic illness. The ability to dynamically value actions in terms of both prospective reward and information is likely to contribute to deficits in motivation across diagnoses.
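Directed exploration of the kind indexed by the "information weight" measure above is commonly modeled by adding an uncertainty-scaled bonus to each option's value. This is a generic sketch under assumed parameters, not the authors' task model:

```python
import numpy as np

def choose_with_information_bonus(means, uncertainties,
                                  info_weight=0.3, beta=3.0):
    """Directed exploration: each option's decision value is its expected
    reward plus an information bonus proportional to the uncertainty about
    it. A reduced info_weight yields less uncertainty-driven exploration,
    as reported for patients with more severe motivational deficits."""
    values = np.asarray(means) + info_weight * np.asarray(uncertainties)
    exp = np.exp(beta * (values - values.max()))
    probs = exp / exp.sum()
    return int(np.argmax(probs)), probs

# with a healthy information weight the uncertain option is preferred;
# with info_weight = 0 the agent falls back on expected value alone
choice_hi, _ = choose_with_information_bonus([0.5, 0.45], [0.0, 0.5])
choice_lo, _ = choose_with_information_bonus([0.5, 0.45], [0.0, 0.5],
                                             info_weight=0.0)
```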


Author(s):  
Christina E. Wierenga
Erin Reilly
Amanda Bischoff-Grethe
Walter H. Kaye
Gregory G. Brown

Abstract Objectives: Anorexia nervosa (AN) is associated with altered sensitivity to reward and punishment. Few studies have investigated whether this results in aberrant learning. The ability to learn from rewarding and aversive experiences is essential for flexibly adapting to changing environments, yet individuals with AN tend to demonstrate cognitive inflexibility, difficulty with set-shifting, and altered decision-making. Deficient reinforcement learning may contribute to repeated engagement in maladaptive behavior. Methods: This study investigated learning in AN using a probabilistic associative learning task that separated learning of stimuli via reward from learning via punishment. Forty-two individuals with Diagnostic and Statistical Manual of Mental Disorders (DSM)-5 restricting-type AN were compared to 38 healthy controls (HCs). We applied computational models of reinforcement learning to assess group differences in learning, thought to be driven by violations in expectations, or prediction errors (PEs). Linear regression analyses examined whether learning parameters predicted BMI at discharge. Results: Individuals with AN had lower learning rates than HCs following both positive and negative PEs (p < .02), and were less likely to exploit what they had learned. Negative PEs on punishment trials predicted lower discharge BMI (p < .001), suggesting that individuals with more negative expectancies about avoiding punishment had the poorest outcome. Conclusions: This is the first study to show lower rates of learning in AN following both positive and negative outcomes, with worse punishment learning predicting less weight gain. An inability to modify expectations about avoiding punishment might explain persistence of restricted eating despite negative consequences, and suggests that treatments that modify negative expectancies might be effective in reducing food avoidance in AN.
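The finding that participants "were less likely to exploit what they had learned" is typically captured, when fitting such models, by a lower softmax inverse temperature. A minimal sketch of the per-trial choice log-likelihood used in fits of this kind (illustrative, not the authors' code):

```python
import numpy as np

def trial_loglik(q_chosen, q_unchosen, beta):
    """Log-likelihood of a single two-option choice under a softmax policy
    with inverse temperature beta; lower beta means choices are more
    stochastic and exploit learned values less."""
    # log softmax probability of the chosen option, written stably
    return -np.log1p(np.exp(-beta * (q_chosen - q_unchosen)))

# the same value-consistent choice is more likely under a more
# exploitative (higher-beta) policy
ll_exploit = trial_loglik(1.0, 0.0, beta=5.0)
ll_random = trial_loglik(1.0, 0.0, beta=0.0)
```

Summing such terms over trials and maximizing with respect to the learning rates and beta is the standard route to the group comparisons reported above.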


2020
Author(s):
Moritz Moeller
Jan Grohn
Sanjay Manohar
Rafal Bogacz

Abstract
Reinforcement learning theories propose that humans choose based on the estimated values of available options, and that they learn from rewards by reducing the difference between the experienced and expected value. In the brain, such prediction errors are broadcast by dopamine. However, choices are influenced not only by expected value, but also by risk. Like reinforcement learning, risk preferences are modulated by dopamine: enhanced dopamine levels induce risk seeking. Learning and risk preferences have so far been studied independently, even though it is commonly assumed that they are (partly) regulated by the same neurotransmitter. Here, we use a novel learning task to look for prediction-error-induced risk seeking in human behavior and pupil responses. We find that prediction errors are positively correlated with risk preferences in imminent choices. Physiologically, this effect is indexed by pupil dilation: only participants whose pupil responses indicate that they experienced the prediction error also show the behavioral effect.

