Ageing disrupts reinforcement learning whilst learning to help others is preserved

2020 · Author(s): Jo Cutler, Marco Wittmann, Ayat Abdurahman, Luca Hargitai, Daniel Drew, ...

Abstract: Reinforcement learning is a fundamental mechanism displayed by many species from mice to humans. However, adaptive behaviour depends not only on learning associations between actions and outcomes that affect ourselves but, critically, also on outcomes that affect other people. Existing studies suggest that reinforcement learning ability declines across the lifespan and that self-relevant learning can be computationally separated from learning about rewards for others, yet how older adults learn which actions reward others is unknown. Here, using computational modelling of a probabilistic reinforcement learning task, we tested whether young (age 18-36) and older (age 60-80, total n=152) adults can learn to gain rewards for themselves, another person (prosocial), or neither individual (control). Detailed model comparison showed that a computational model with separate learning rates best explained how people learn associations for different recipients. Young adults were faster to learn when their actions benefitted themselves compared to when they helped others. Strikingly, however, older adults showed reduced self-bias, with a relative increase in the rate at which they learnt about actions that helped others compared to themselves. Moreover, we find evidence that these group differences are associated with changes in psychopathic traits over the lifespan. In older adults, psychopathic traits were significantly reduced and negatively correlated with prosocial learning rates. Importantly, older people with the lowest levels of psychopathy had the highest prosocial learning rates. These findings suggest that learning how our actions help others is preserved across the lifespan, with implications for our understanding of reinforcement learning mechanisms and theoretical accounts of healthy ageing.

2021 · Vol 12 (1) · Author(s): Jo Cutler, Marco K. Wittmann, Ayat Abdurahman, Luca D. Hargitai, Daniel Drew, ...

Abstract: Reinforcement learning is a fundamental mechanism displayed by many species. However, adaptive behaviour depends not only on learning about actions and outcomes that affect ourselves, but also those that affect others. Using computational reinforcement learning models, we tested whether young (age 18–36) and older (age 60–80, total n = 152) adults learn to gain rewards for themselves, another person (prosocial), or neither individual (control). Detailed model comparison showed that a model with separate learning rates for each recipient best explained behaviour. Young adults learned faster when their actions benefitted themselves, compared to others. Compared to young adults, older adults showed reduced self-relevant learning rates but preserved prosocial learning. Moreover, levels of subclinical self-reported psychopathic traits (including lack of concern for others) were lower in older adults and the core affective-interpersonal component of this measure negatively correlated with prosocial learning. These findings suggest learning to benefit others is preserved across the lifespan with implications for reinforcement learning and theories of healthy ageing.
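The "separate learning rates" model that both versions of this abstract describe can be illustrated with a standard Rescorla-Wagner update and a softmax choice rule. The sketch below is a minimal illustration under those assumptions; the function and variable names are ours, not the authors'.

```python
import numpy as np

def recipient_rl_loglik(choices, outcomes, recipients, alphas, beta,
                        n_options=2):
    """Log-likelihood of choices under a Rescorla-Wagner model with a
    separate learning rate per recipient (self / prosocial / control).

    choices    : chosen option index on each trial
    outcomes   : reward received on each trial (0 or 1)
    recipients : recipient label on each trial, a key of `alphas`
    alphas     : dict mapping recipient label -> learning rate
    beta       : softmax inverse temperature (shared across recipients)
    """
    Q = {r: np.zeros(n_options) for r in alphas}  # one value table per recipient
    log_lik = 0.0
    for c, o, r in zip(choices, outcomes, recipients):
        p = np.exp(beta * Q[r])
        p /= p.sum()                              # softmax choice probabilities
        log_lik += np.log(p[c])
        Q[r][c] += alphas[r] * (o - Q[r][c])      # prediction-error update
    return log_lik
```

Fitting would maximise this log-likelihood per participant; the reported model comparison then asks whether recipient-specific learning rates explain choices better than a single shared rate.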


2020 · Author(s): Jonathan W. Kanen, Qiang Luo, Mojtaba R. Kandroodi, Rudolf N. Cardinal, Trevor W. Robbins, ...

Abstract: The non-selective serotonin 2A (5-HT2A) receptor agonist lysergic acid diethylamide (LSD) holds promise as a treatment for some psychiatric disorders. Psychedelic drugs such as LSD have been suggested to have therapeutic actions through their effects on learning. The behavioural effects of LSD in humans, however, remain largely unexplored. Here we examined how LSD affects probabilistic reversal learning in healthy humans. Conventional measures assessing sensitivity to immediate feedback (“win-stay” and “lose-shift” probabilities) were unaffected, whereas LSD increased the impact of the strength of initial learning on perseveration. Computational modelling revealed that the most pronounced effect of LSD was enhancement of the reward learning rate. The punishment learning rate was also elevated. Increased reinforcement learning rates suggest LSD induced a state of heightened plasticity. These results indicate a potential mechanism through which revision of maladaptive associations could occur.
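A common way to formalise the quantities this abstract mentions is a Q-update with separate reward and punishment learning rates, alongside the model-free win-stay/lose-shift probabilities. The sketch below assumes that model class and a ±1 outcome coding; it is not the authors' exact code.

```python
import numpy as np

def update_q_dual_rate(q, choice, outcome, alpha_reward, alpha_punish):
    """One trial of a Q-update with separate reward and punishment
    learning rates. `outcome` is +1 for reward, -1 for punishment."""
    delta = outcome - q[choice]                      # prediction error
    alpha = alpha_reward if outcome > 0 else alpha_punish
    q[choice] += alpha * delta
    return q

def win_stay_lose_shift(choices, outcomes):
    """Conventional model-free measures: P(stay | previous win) and
    P(shift | previous loss)."""
    choices, outcomes = np.asarray(choices), np.asarray(outcomes)
    stay = choices[1:] == choices[:-1]               # repeated the last choice?
    wins, losses = outcomes[:-1] > 0, outcomes[:-1] <= 0
    return stay[wins].mean(), (~stay[losses]).mean()
```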


Symmetry · 2021 · Vol 13 (3) · pp. 471 · Author(s): Jai Hoon Park, Kang Hoon Lee

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space that involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. Compared to evolving both the structure and the behavior simultaneously, the size of the design space is reduced significantly by evolving only the robotic structure and performing behavioral optimization with a separate training algorithm. Mutual dependence between evolution and learning is achieved by regarding the mean cumulative reward of a candidate structure in the reinforcement learning as its fitness in the genetic algorithm. Therefore, our method searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in the process of experimenting with an actual modular robotics kit.
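A minimal sketch of this nested optimization follows, assuming hypothetical helpers `train_and_score` (RL training that returns a structure's mean cumulative reward), `mutate`, and `crossover`; none of these names come from the paper.

```python
import random

def evolve_robot_structures(population, n_generations,
                            train_and_score, mutate, crossover, n_elite=4):
    """Outer genetic-algorithm loop over robot structures; each candidate's
    fitness is the mean cumulative reward its controller reaches after
    reinforcement-learning training (the inner loop)."""
    best_score, best_structure = float('-inf'), None
    for _ in range(n_generations):
        # Inner optimization: train each structure's controller with RL
        scored = sorted(((train_and_score(s), s) for s in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_structure = scored[0]
        elite = [s for _, s in scored[:n_elite]]
        # Outer optimization: breed the next generation from the elite
        offspring = [mutate(crossover(*random.sample(elite, 2)))
                     for _ in range(len(population) - n_elite)]
        population = elite + offspring
    return best_structure, best_score
```

Scoring a structure by its post-training reward is what makes evolution favour designs that are easy to learn with, not merely designs that happen to start out well.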


2019 · Author(s): Erdem Pulcu

Abstract: We are living in a dynamic world in which stochastic relationships between cues and outcome events create different sources of uncertainty [1] (e.g. the fact that not all grey clouds bring rain). Living in an uncertain world continuously probes learning systems in the brain, guiding agents to make better decisions. This is a type of value-based decision-making which is very important for survival in the wild and long-term evolutionary fitness. Consequently, reinforcement learning (RL) models describing cognitive/computational processes underlying learning-based adaptations have been pivotal in behavioural [2,3] and neural sciences [4–6], as well as machine learning [7,8]. This paper demonstrates the suitability of novel update rules for RL, based on a nonlinear relationship between prediction errors (i.e. the difference between the agent's expectation and the actual outcome) and learning rates (i.e. a coefficient with which agents update their beliefs about the environment), that can account for learning-based adaptations in the face of environmental uncertainty. These models illustrate how learners can flexibly adapt to dynamically changing environments.
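As an illustration of the idea, the update below makes the learning rate a nonlinear (here sigmoidal) function of the absolute prediction error, so larger surprises drive faster belief revision. The specific sigmoid and its parameters are our assumptions, not the paper's exact update rule.

```python
import numpy as np

def nonlinear_rate_update(v, outcome, kappa=4.0, alpha_max=0.8):
    """One belief update in which the learning rate itself depends
    nonlinearly on the size of the prediction error."""
    delta = outcome - v                                   # prediction error
    # Sigmoidal mapping |delta| -> learning rate (illustrative assumption)
    alpha = alpha_max / (1.0 + np.exp(-kappa * (abs(delta) - 0.5)))
    return v + alpha * delta
```

Under this rule, small errors in a stable environment barely move the estimate, while large errors, the signature of a changed environment, produce near-maximal updating.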


2021 · Author(s): Joana Carvalheiro, Vasco A. Conceição, Ana Mesquita, Ana Seara-Cardoso

2020 · Vol 30 (6) · pp. 3573-3589 · Author(s): Rick A. Adams, Michael Moutoussis, Matthew M. Nour, Tarik Dahoun, Declan Lewis, ...

Abstract: Choosing actions that result in advantageous outcomes is a fundamental function of nervous systems. All computational decision-making models contain a mechanism that controls the variability of (or confidence in) action selection, but its neural implementation is unclear—especially in humans. We investigated this mechanism using two influential decision-making frameworks: active inference (AI) and reinforcement learning (RL). In AI, the precision (inverse variance) of beliefs about policies controls action selection variability—similar to decision ‘noise’ parameters in RL—and is thought to be encoded by striatal dopamine signaling. We tested this hypothesis by administering a ‘go/no-go’ task to 75 healthy participants, and measuring striatal dopamine 2/3 receptor (D2/3R) availability in a subset (n = 25) using [11C]-(+)-PHNO positron emission tomography. In behavioral model comparison, RL performed best across the whole group but AI performed best in participants performing above chance levels. Limbic striatal D2/3R availability had linear relationships with AI policy precision (P = 0.029) as well as with RL irreducible decision ‘noise’ (P = 0.020), and this relationship with D2/3R availability was confirmed with a ‘decision stochasticity’ factor that aggregated across both models (P = 0.0006). These findings are consistent with occupancy of inhibitory striatal D2/3Rs decreasing the variability of action selection in humans.
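The two "stochasticity" parameters the study relates to dopamine can be sketched in a single choice rule: a precision-weighted softmax (AI-style policy precision) mixed with an irreducible uniform component (RL-style decision noise). The parameterization below is an illustrative assumption, not the study's fitted model.

```python
import numpy as np

def choice_probabilities(q_values, precision, noise):
    """Action-selection probabilities combining a precision-weighted
    softmax with an irreducible uniform 'noise' component.

    precision : inverse variance of the policy; higher = less variable
    noise     : probability mass assigned uniformly, regardless of value
    """
    q = np.asarray(q_values, dtype=float)
    soft = np.exp(precision * (q - q.max()))     # numerically stable softmax
    soft /= soft.sum()
    uniform = np.ones_like(soft) / soft.size     # irreducible stochasticity
    return (1.0 - noise) * soft + noise * uniform
```

In this framing, higher `precision` and lower `noise` both reduce action-selection variability, which is why a shared "decision stochasticity" factor can aggregate across the two models.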


NeuroImage · 2017 · Vol 146 · pp. 626-641 · Author(s): Anne-Marike Schiffer, Kayla Siletti, Florian Waszak, Nick Yeung

2013 · Vol 28 (1) · pp. 35-46 · Author(s): Nichole R. Lighthall, Marissa A. Gorlick, Andrej Schoeke, Michael J. Frank, Mara Mather

2003 · Vol 06 (03) · pp. 405-426 · Author(s): Paul Darbyshire

Distillations utilize multi-agent based modeling and simulation techniques to study warfare as a complex adaptive system at the conceptual level. The focus is placed on the interactions between the agents to facilitate study of cause and effect between individual interactions and overall system behavior. Current distillations do not utilize machine-learning techniques to model the cognitive abilities of individual combatants but employ agent control paradigms to represent agents as highly instinctual entities. For a team of agents implementing a reinforcement-learning paradigm, the rate of learning is not sufficient for the agents to adapt to this hostile environment. However, by allowing the agents to communicate their respective rewards for actions performed as the simulation progresses, the rate of learning can be increased sufficiently to significantly increase the team's chances of survival. This paper presents the results of trials to measure the success of a team-based approach to the reinforcement-learning problem in a distillation, using reward communication to increase learning rates.
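One way to realise the reward communication described above is for the acting agent to broadcast its reward so that teammates apply a down-weighted Q-update for the same state-action pair. This is a sketch under that assumption; the `share` weighting and table layout are ours, not the paper's.

```python
def team_q_update(q_tables, agent_id, state, action, reward, next_state,
                  alpha=0.1, gamma=0.9, share=0.5):
    """One transition's worth of team Q-learning with reward communication:
    the acting agent does a full Q-update, and every teammate applies the
    same update at a reduced weight, so the whole team learns from a single
    agent's experience.

    q_tables : dict mapping agent id -> {state: {action: value}}
    share    : weight teammates give to the communicated reward (assumed)
    """
    for aid, q in q_tables.items():
        weight = 1.0 if aid == agent_id else share
        target = reward + gamma * max(q[next_state].values())
        q[state][action] += weight * alpha * (target - q[state][action])
```

Because every transition now updates all team members' tables, the effective learning rate of the team scales with its size, which is the mechanism the trials measure.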


eLife · 2017 · Vol 6 · Author(s): Sven Collette, Wolfgang M. Pauli, Peter Bossaerts, John O'Doherty

In inverse reinforcement learning, an observer infers the reward distribution available for actions in the environment solely by observing the actions of another agent. To address whether this computational process is implemented in the human brain, participants underwent fMRI while learning about slot machines yielding hidden preferred and non-preferred food outcomes with varying probabilities, through observing the repeated slot choices of agents with similar and dissimilar food preferences. Using formal model comparison, we found that participants implemented inverse RL, as opposed to a simple imitation strategy in which the actions of the other agent are copied rather than used to infer the underlying reward structure of the decision problem. Our computational fMRI analysis revealed that anterior dorsomedial prefrontal cortex encoded inferences about action values within the value space of the agent, as opposed to that of the observer, demonstrating that inverse RL is an abstract cognitive process divorceable from the values and concerns of the observer him/herself.
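The model comparison contrasts two observer strategies, which can be caricatured as follows: imitation copies the demonstrator's choice frequencies, whereas inverse RL inverts an assumption of (softmax-)rational choice to recover the options' relative value and then maps it into the observer's own value space. The softmax inversion and the ±1 `similarity` coding below are illustrative assumptions, not the paper's full Bayesian model.

```python
import numpy as np

def imitation_policy(observed_choices, n_options=2):
    """Imitation baseline: copy the demonstrator's choice frequencies."""
    counts = np.bincount(observed_choices, minlength=n_options) + 1  # smoothing
    return counts / counts.sum()

def inverse_rl_policy(observed_choices, similarity, beta=3.0, n_options=2):
    """Inverse-RL caricature: assume the demonstrator chooses via a softmax
    over the options' true values, invert that assumption to infer relative
    value, then translate into the observer's value space (+1 = similar
    food preferences, -1 = dissimilar)."""
    freq = np.bincount(observed_choices, minlength=n_options) + 1
    freq = freq / freq.sum()
    inferred_value = np.log(freq) / beta      # invert the softmax link
    own_value = similarity * inferred_value   # map into observer's space
    p = np.exp(beta * own_value)
    return p / p.sum()
```

The key behavioural signature separating the two: for a dissimilar demonstrator, inverse RL predicts the observer should avoid the options the demonstrator favours, whereas imitation predicts copying them regardless.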

