A Social Reinforcement Learning Hypothesis of Mutual Reward Preferences in Rats

Author(s): Julen Hernandez-Lallement, Marijn van Wingerden, Sandra Schäble, Tobias Kalenscher
2020
Author(s): Marie Levorsen, Ayahito Ito, Shinsuke Suzuki, Keise Izuma

2014 ◽ Vol 25 (3) ◽ pp. 711-719
Author(s): Björn Lindström, Ida Selbing, Tanaz Molapour, Andreas Olsson

2019 ◽ Vol 7 (6) ◽ pp. 1372-1388
Author(s): Miranda L. Beltzer, Stephen Adams, Peter A. Beling, Bethany A. Teachman

Adaptive social behavior requires learning probabilities of social reward and punishment and updating these probabilities when they change. Given prior research on aberrant reinforcement learning in affective disorders, this study examines how social anxiety affects probabilistic social reinforcement learning and dynamic updating of learned probabilities in a volatile environment. Two hundred and twenty-two online participants completed questionnaires and a computerized ball-catching game with changing probabilities of reward and punishment. Dynamic learning rates were estimated to assess the relative importance ascribed to new information in response to volatility. Mixed-effects regression was used to analyze throw patterns as a function of social anxiety symptoms. Higher social anxiety predicted fewer throws to the previously punishing avatar and different learning rates after certain role changes, suggesting that social anxiety may be characterized by difficulty updating learned social probabilities. Socially anxious individuals may miss the chance to learn that a once-punishing situation no longer poses a threat.
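The dynamic-learning-rate idea in this abstract can be illustrated with a minimal sketch. This is not the authors' fitted model; it is a generic Pearce-Hall-style update, with all names and parameter values (`eta`, `kappa`) assumed for illustration, in which surprise transiently raises the learning rate so that probability estimates are revised faster around a role change in a volatile environment.

```python
# Illustrative sketch (not the study's model): a Pearce-Hall-style update
# in which the learning rate tracks recent surprise, so estimates of an
# avatar's reward probability are revised faster in volatile periods.
# All names and parameter values here are assumptions for illustration.

def update_estimate(p_est, alpha, outcome, eta=0.3, kappa=1.0):
    """One trial of reward-probability learning with a dynamic rate.

    p_est   : current estimate of the avatar's reward probability
    alpha   : current (dynamic) learning rate, in [0, 1]
    outcome : 1 if the interaction was rewarding, 0 if punishing
    eta     : how quickly alpha itself adapts to surprise
    kappa   : scaling of the value update
    """
    prediction_error = outcome - p_est
    # Surprise (|PE|) pushes the learning rate up; calm periods pull it down.
    alpha = (1 - eta) * alpha + eta * abs(prediction_error)
    p_est = p_est + kappa * alpha * prediction_error
    return p_est, alpha

# A reward-to-punishment role change: the estimate should fall while the
# learning rate rises, because outcomes have become surprising.
p, a = 0.8, 0.2
for outcome in [0, 0, 0, 0]:  # avatar suddenly stops rewarding
    p, a = update_estimate(p, a, outcome)
```

The abstract's finding that socially anxious individuals update slowly after certain role changes would correspond, in a model of this family, to a dampened response of `alpha` to surprise.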


2014 ◽ Vol 14 (2) ◽ pp. 683-697
Author(s): Rebecca M. Jones, Leah H. Somerville, Jian Li, Erika J. Ruberry, Alisa Powers, ...

2011 ◽ Vol 31 (37) ◽ pp. 13039-13045
Author(s): R. M. Jones, L. H. Somerville, J. Li, E. J. Ruberry, V. Libby, ...

2017
Author(s): James C Thompson, Margaret L Westwater

Socially appropriate behavior involves learning which actions are valued by others and which carry a social cost. Facial expressions are one way that others can signal the social value of our actions. The rewarding or aversive properties of signals such as smiles or frowns also evoke automatic approach or avoidance behaviors in receivers, and a Pavlovian system learns cues that predict rewarding or aversive outcomes. In this study, we examined the computational and neural mechanisms underlying interactions between Pavlovian and Instrumental systems during social reinforcement learning. We found that Pavlovian biases to approach cues predicting social reward and to avoid cues predicting social punishment interfered with Instrumental learning from social feedback. While the computations underlying Pavlovian-Instrumental interactions remained the same as when learning from monetary feedback, Pavlovian biases to approach or withdraw driven by social outcomes were not significantly correlated with biases driven by money. Trial-by-trial measures of alpha (8-14 Hz) EEG power were associated with suppression of the Pavlovian bias to social outcomes, while suppression of the bias from money was associated with theta (4-7 Hz) EEG power. Our findings demonstrate how emotional reactions to feedback from others are balanced against the instrumental value of that feedback to guide social behavior.

Significance statement
A smile from another can be a signal to continue what we are doing, while an angry scowl is a sure sign to stop. Feedback such as this plays an important role in shaping social behavior. The rewarding nature of a smile (or the aversive nature of a scowl) can also lead to automatic tendencies to approach (or avoid), and we can learn which situations predict positive or negative social outcomes.
In this study, we examined the brain mechanisms that come into play when the instrumental demands of a situation conflict with our automatic biases to approach or withdraw, such as when we have to approach someone who is scowling at us or withdraw from someone who is smiling at us.
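The Pavlovian-Instrumental interaction described above is commonly modeled by adding a state-value-scaled bias to the instrumental value of approach before the softmax choice. Below is a minimal sketch of that standard form, not the paper's fitted model; the function names and parameter values (`pi`, `beta`) are assumptions for illustration.

```python
import math

# Illustrative sketch (assumed names/values, not the paper's fitted model):
# a standard approach/avoid model in which a Pavlovian bias, scaled by pi,
# pushes toward approach in states with positive value and toward
# avoidance in states with negative value, on top of instrumental Q-values.

def approach_probability(q_approach, q_avoid, state_value, pi=0.5, beta=3.0):
    """Softmax choice with a Pavlovian bias on the approach action."""
    w_approach = q_approach + pi * state_value  # Pavlovian bias acts here
    w_avoid = q_avoid
    return 1.0 / (1.0 + math.exp(-beta * (w_approach - w_avoid)))

# A smiling face (positive state value) inflates approach probability even
# when the instrumental values are equal -- the conflict described above
# arises whenever the task requires withdrawing from such a cue.
p_neutral = approach_probability(0.0, 0.0, state_value=0.0)
p_smile = approach_probability(0.0, 0.0, state_value=1.0)
```

In this family of models, "suppression of the Pavlovian bias" corresponds to shrinking `pi` toward zero, which is the quantity the abstract links to trial-by-trial alpha and theta EEG power.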


PLoS Biology ◽ 2020 ◽ Vol 18 (12) ◽ pp. e3001028
Author(s): Anis Najar, Emmanuelle Bonnet, Bahador Bahrami, Stefano Palminteri

While there is no doubt that social signals affect human reinforcement learning, there is still no consensus about how this process is computationally implemented. To address this issue, we compared three psychologically plausible hypotheses about the algorithmic implementation of imitation in reinforcement learning. The first hypothesis, decision biasing (DB), postulates that imitation consists in transiently biasing the learner’s action selection without affecting their value function. According to the second hypothesis, model-based imitation (MB), the learner infers the demonstrator’s value function through inverse reinforcement learning and uses it to bias action selection. Finally, according to the third hypothesis, value shaping (VS), the demonstrator’s actions directly affect the learner’s value function. We tested these three hypotheses in two experiments (N = 24 and N = 44) featuring a new variant of a social reinforcement learning task. We show through model comparison and model simulation that VS provides the best explanation of learners’ behavior. These results were replicated in a third independent experiment featuring a larger cohort and a different design (N = 302). In our experiments, we also manipulated the quality of the demonstrators’ choices and found that learners adapted their imitation rate, so that only skilled demonstrators were imitated. We propose and test an efficient meta-learning process to account for this effect, in which imitation is regulated by the agreement between the learner and the demonstrator. In sum, our findings provide new insights into the computational mechanisms underlying adaptive imitation in human reinforcement learning.
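The contrast between two of the hypotheses can be sketched in a few lines: decision biasing adds a transient bump to action weights at choice time and leaves the value function untouched, whereas value shaping writes the demonstrator's choice directly into the values. This is an illustrative sketch under assumed names and parameters (`omega`, `bias`, `beta`), not the authors' implementation.

```python
import math

# Illustrative sketch (assumed implementation, not the authors' code) of
# the DB vs VS contrast: DB perturbs action selection while leaving
# Q-values untouched; VS makes the demonstrator's action persist in the
# learner's value function.

def softmax2(q, beta=5.0):
    """Probability of choosing action 0 over action 1."""
    return 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))

def vs_update(q, demonstrated_action, omega=0.2):
    """Value shaping: the demonstrated action's value is boosted lastingly."""
    q = list(q)
    q[demonstrated_action] += omega
    return q

def db_choice_prob(q, demonstrated_action, bias=0.2, beta=5.0):
    """Decision biasing: transient bump at choice time; q is unchanged."""
    w = list(q)
    w[demonstrated_action] += bias
    return softmax2(w, beta)

q = [0.0, 0.0]
q_vs = vs_update(q, demonstrated_action=0)       # persistent change to values
p_db = db_choice_prob(q, demonstrated_action=0)  # q itself stays [0.0, 0.0]
```

The behavioral signature separating the two is persistence: under VS the imitation effect carries forward through later value updates, while under DB it vanishes once the demonstration stops, which is the kind of difference the model comparison in the paper exploits.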

