Testing a micro-genesis account of longer-form reinforcement learning (win-calmness and loss-restlessness)

2021 ◽  
Author(s):  
Ahad Asad ◽  
Ben Dyson

Fundamental reinforcement learning principles such as win-stay and lose-shift represent outcome-action associations between consecutive trials (trial n-1 and n). Longer-form expressions of the tendency to continually repeat previous actions following positive outcomes, and the tendency to continually change previous actions following negative outcomes, have been identified as win-calmness and lose-restlessness, respectively. Across 10 experiments, we tested a micro-genesis account of these phenomena by examining sequential contingencies across trial n-2, n-1 and n using simple game spaces. At a group level, we found no evidence of win-calmness and lose-restlessness when wins could not be maximized (unexploitable opponent). Similarly, we found no evidence of win-calmness and lose-restlessness when the threat of win minimization was presented (exploiting opponent). In contrast, we found evidence of win-calmness (but not lose-restlessness) when win maximization was made possible (exploitable opponent). At a participant level, we confirm that individual win rates determined the degree of win-calmness and lose-restlessness only in contexts where win rates could be maximized. The data identify the mechanisms that allow for the development of longer-form reinforcement learning principles and demonstrate the relative flexibility in decision-space afforded by positive outcomes, and the relative inflexibility in decision-space following negative outcomes.
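
As an illustration of the first-order contingencies described above (trial n-1 to trial n), the following is a minimal sketch of how win-stay and lose-shift proportions could be computed from a sequence of actions and outcomes. It is not the authors' analysis: the function name `wsls_rates`, the outcome labels "win" and "loss", and the treatment of other outcomes (ignored) are all assumptions made for this example.

```python
import math
from typing import Sequence, Tuple


def wsls_rates(actions: Sequence[str], outcomes: Sequence[str]) -> Tuple[float, float]:
    """Return (win-stay, lose-shift) proportions across consecutive trials.

    Hypothetical helper: outcomes are assumed to be labelled "win" or "loss";
    any other label (e.g. "draw") is ignored.
    """
    if len(actions) != len(outcomes):
        raise ValueError("actions and outcomes must have the same length")

    win_n = win_stay = lose_n = lose_shift = 0
    for prev in range(len(actions) - 1):
        # Did the player repeat the previous action on the next trial?
        stayed = actions[prev + 1] == actions[prev]
        if outcomes[prev] == "win":
            win_n += 1
            win_stay += int(stayed)
        elif outcomes[prev] == "loss":
            lose_n += 1
            lose_shift += int(not stayed)

    # A rate is undefined (NaN) if the relevant outcome never occurred.
    ws = win_stay / win_n if win_n else math.nan
    ls = lose_shift / lose_n if lose_n else math.nan
    return ws, ls


rates = wsls_rates(["R", "R", "P", "S", "S"], ["win", "loss", "win", "loss", "draw"])
```

Extending the same counting to triplets of trials (n-2, n-1, n) would give one possible operationalization of the longer-form tendencies, though the exact definition used in the study is not given here.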

2021 ◽  
Author(s):  
Kate Nussenbaum ◽  
Juan A. Velez ◽  
Bradli T. Washington ◽  
Hannah E. Hamling ◽  
Catherine A. Hartley

Optimal integration of positive and negative outcomes during learning varies depending on an environment’s reward statistics. The present study investigated the extent to which children, adolescents, and adults (N = 142; ages 8-25 years; 55% female, 42% White, 31% Asian, 17% mixed race, and 8% Black) adapt their weighting of better-than-expected and worse-than-expected outcomes when learning from reinforcement. Participants made a series of choices across two contexts: one in which weighting positive outcomes more heavily than negative outcomes led to better performance, and one in which the reverse was true. Reinforcement learning modeling revealed that across age, participants shifted their valence biases in accordance with the structure of the environment. Exploratory analyses revealed increases in context-dependent flexibility with age.
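
One common way to express a valence bias in reinforcement learning models is to use separate learning rates for positive and negative prediction errors. The sketch below shows that general idea only; the abstract does not specify the model used, and the names `update_value`, `alpha_pos`, and `alpha_neg` are assumptions made for this example.

```python
def update_value(value: float, reward: float, alpha_pos: float, alpha_neg: float) -> float:
    """Update a value estimate with a valence-dependent learning rate.

    Hypothetical helper: alpha_pos applies to better-than-expected outcomes,
    alpha_neg to worse-than-expected (or exactly expected) outcomes.
    """
    prediction_error = reward - value
    alpha = alpha_pos if prediction_error > 0 else alpha_neg
    return value + alpha * prediction_error


# A learner that weights positive outcomes more heavily than negative ones.
after_gain = update_value(0.5, 1.0, alpha_pos=0.4, alpha_neg=0.1)
after_loss = update_value(0.5, 0.0, alpha_pos=0.4, alpha_neg=0.1)
```

With `alpha_pos` larger than `alpha_neg`, the estimate moves further after a better-than-expected outcome than after a worse-than-expected one of the same size; swapping the two rates reverses the bias.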


2018 ◽  
Vol 71 (7) ◽  
pp. 1584-1595 ◽  
Author(s):  
Steven Di Costa ◽  
Héloïse Théro ◽  
Valérian Chambon ◽  
Patrick Haggard

The sense of agency refers to the feeling that we control our actions and, through them, effects in the outside world. Reinforcement learning provides an important theoretical framework for understanding why people choose to make particular actions. Few previous studies have considered how reinforcement and learning might influence the subjective experience of agency over actions and outcomes. In two experiments, participants chose between two action alternatives, which differed in reward probability. Occasional reversals of action–reward mapping required participants to monitor outcomes and adjust action selection processing accordingly. We measured shifts in the perceived times of actions and subsequent outcomes (‘intentional binding’) as an implicit proxy for sense of agency. In the first experiment, negative outcomes showed stronger binding towards the preceding action, compared to positive outcomes. Furthermore, negative outcomes were followed by increased binding of actions towards their outcome on the following trial. Experiment 2 replicated this post-error boost in action binding and showed that it only occurred when people could learn from their errors to improve action choices. We modelled the post-error boost using an established quantitative model of reinforcement learning. The post-error boost in action binding correlated positively with participants’ tendency to learn more from negative outcomes than from positive outcomes. Our results suggest a novel relation between sense of agency and reinforcement learning, in which sense of agency is increased when negative outcomes trigger adaptive changes in subsequent action selection processing.


Author(s):  
Raymond L. Higgins ◽  
Matthew W. Gallagher

This chapter presents an overview of the development and status of the reality negotiation construct and relates it to a variety of coping processes. The reality negotiation construct follows from the social constructionist tradition and first appeared in discussions of how excuses protect self-images by decreasing the causal linkage to negative outcomes. The reality negotiation construct was later expanded to include a discussion of how the process of hoping may be used to increase perceived linkage to positive outcomes. In the two decades since these constructs were first introduced, four individual differences measures have been developed, and the effects of these reality negotiation techniques have been studied extensively. Reality negotiation techniques can be both maladaptive and adaptive and have been shown to be associated with coping and social support in a variety of populations. The chapter concludes by highlighting a few areas in which reality negotiation research could expand to further its relevance and applicability to the field of positive psychology.


2019 ◽  
Author(s):  
Arunima Sarin ◽  
David Lagnado ◽  
Paul Burgess

Knowledge of intention and outcome is integral to making judgments of responsibility, blame, and causality. Yet, little is known about the effect of conflicting intentions and outcomes on these judgments. In a series of four experiments, we combine good and bad intentions with positive and negative outcomes, presenting these through everyday moral scenarios. Our results demonstrate an asymmetry in responsibility, causality, and blame judgments for the two incongruent conditions: well-intentioned agents are regarded more morally and causally responsible for negative outcomes than ill-intentioned agents are held for positive outcomes. This novel effect of an intention-outcome asymmetry identifies an unexplored aspect of moral judgment and is partially explained by extra inferences that participants make about the actions of the moral agent.


PLoS Biology ◽  
2021 ◽  
Vol 19 (9) ◽  
pp. e3001119
Author(s):  
Joan Orpella ◽  
Ernest Mas-Herrero ◽  
Pablo Ripollés ◽  
Josep Marco-Pallarés ◽  
Ruth de Diego-Balaguer

Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental in the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have been primarily investigated using offline (post-familiarization) tests, which give limited insights into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online SL of simple syntactic structures combined with computational modeling to show that online SL responds to reinforcement learning principles rooted in striatal function. Specifically, we demonstrate, in 2 different cohorts, that a temporal difference model, which relies on prediction errors, accounts for participants’ online learning behavior. We then show that the trial-by-trial development of predictions through learning strongly correlates with activity in both ventral and dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work, therefore, bridges the long-standing gap between language learning and reinforcement learning phenomena.


Games ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 25
Author(s):  
Vincent Srihaput ◽  
Kaylee Craplewe ◽  
Benjamin James Dyson

Predictability is a hallmark of poor-quality decision-making during competition. One source of predictability is the strong association between current outcome and future action, as dictated by the reinforcement learning principles of win–stay and lose–shift. We tested the idea that predictability could be reduced during competition by weakening the associations between outcome and action. To do this, participants completed a competitive zero-sum game in which the opponent from the current trial was either replayed (opponent repeat), thereby strengthening the association, or replaced (opponent change) by a different competitor, thereby weakening the association. We observed that win–stay behavior was reduced during opponent change trials but lose–shift behavior remained reliably predictable. Consistent with the group data, the number of individuals who exhibited predictable behavior following wins decreased for opponent change relative to opponent repeat trials. Our data show that future actions are more under internal control following positive relative to negative outcomes, and that externally breaking the bonds between outcome and action via opponent association also allows us to become less prone to exploitation.


2019 ◽  
Vol 31 (2) ◽  
pp. 309-331 ◽  
Author(s):  
Eric G. Lambert ◽  
Linda D. Keena ◽  
Stacy H. Haynes ◽  
David May ◽  
Matthew C. Leone

Job stress is a problem in corrections. Although the very nature of correctional work is stressful, workplace variables also contribute to correctional staff job stress. The job demands-resources model holds that job demands increase negative outcomes (e.g., job stress) and decrease positive outcomes (e.g., job satisfaction), whereas job resources help increase positive outcomes and decrease negative outcomes. An ordinary least squares regression analysis of self-reported survey data from 322 staff at a Southern prison indicated that input into decision-making and quality supervision had statistically significant negative effects on job stress, whereas role overload and fear of victimization had significant positive effects. Instrumental communication, views of training, and role clarity all had nonsignificant associations with stress from the job in the multivariate analysis. The results partially supported the job demands-resources model; however, the specific work environment variables varied in terms of their statistical significance. Correctional administrators need to be aware of the contribution that workplace variables make to job stress and make changes to reduce staff job stress.


2017 ◽  
Vol 42 (6) ◽  
pp. 932-952 ◽  
Author(s):  
Laurel D. Sarfan ◽  
Peter Gooch ◽  
Elise M. Clerkin

Emotion regulation strategies have been conceptualized as adaptive or maladaptive, but recent evidence suggests emotion regulation outcomes may be context-dependent. The present study tested whether the adaptiveness of a putatively adaptive emotion regulation strategy (problem solving) varied across contexts of high and low controllability. The present study also tested rumination, suggested to be one of the most putatively maladaptive strategies, which was expected to be associated with negative outcomes regardless of context. Participants completed an in vivo speech task, in which they were randomly assigned to a controllable (n = 65) or an uncontrollable (n = 63) condition. Using moderation analyses, we tested whether controllability interacted with emotion regulation use to predict negative affect, avoidance, and perception of performance. Partially consistent with hypotheses, problem solving was associated with certain positive outcomes (i.e., reduced behavioral avoidance) in the controllable (vs. uncontrollable) condition. Consistent with predictions, rumination was associated with negative outcomes (i.e., desired avoidance, negative affect, negative perception of performance) in both conditions. Overall, findings partially support contextual models of emotion regulation, insofar as the data suggest that the effects of problem solving may be more adaptive in controllable contexts for certain outcomes, whereas rumination may be maladaptive regardless of context.


Subject: AI in the workplace.

Significance: Positive use cases for artificial intelligence (AI) systems are rising, but misuse means the number of negative examples is also rising, drawing attention to how to regulate it.

Impacts:
Effective use of AI within appropriate contexts will improve business performance in many sectors.
Current law is not suitable for some emerging forms of AI, but to gain competitiveness, some regions may prioritise efficiency over safety.
Misuse of AI will become a major source of negative outcomes at work, likely outweighing the positive outcomes.
Future uses of AI will become increasingly hard to manage or regulate.
Firms expanding their ‘ethical’ activities and then arguing that more regulation would limit them will raise fears of ‘ethical washing’.


1982 ◽  
Vol 4 (1) ◽  
pp. 81-91
Author(s):  
Decky Fiedler ◽  
Lee Roy Beach

This study uses a Decision/Expectancy model to examine factors contributing to sports preference for college men and women at three levels of participation. Subjects rated the utility of outcomes for nine sports (a sampling of team and nonteam, competitive and recreational activities) and their expectations that each outcome would occur given that they participated in each sport. Subjects were divided into six groups according to current and recent participation in sports activities. A relationship was found between current level of participation and age of earliest participation. Subjective Expected Utilities (SEUs) for groups suggest that differences between groups having different levels of participation were not in their assessments of the utilities of outcomes of participation, because all subjects found positive outcomes equally favorable and negative outcomes equally unfavorable. The groups did differ, however, in their assessment of the probabilities that positive or negative outcomes would occur as a result of their participation. Men and women were very similar in their evaluations. There was a high correlation for groups between SEU and stated preference for the nine sports.
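
For readers unfamiliar with the term, a Subjective Expected Utility is conventionally the sum, over outcomes, of each outcome's rated utility multiplied by the subjective probability that it occurs. The sketch below shows that standard formula only; the rating scales and scoring procedure of the study are not described here, and the function name and example numbers are assumptions made for this illustration.

```python
from typing import Sequence


def subjective_expected_utility(utilities: Sequence[float], probabilities: Sequence[float]) -> float:
    """Sum of utility x subjective probability over a set of outcomes.

    Hypothetical helper: both sequences are assumed to list the same
    outcomes in the same order.
    """
    if len(utilities) != len(probabilities):
        raise ValueError("utilities and probabilities must have the same length")
    return sum(u * p for u, p in zip(utilities, probabilities))


# Illustrative values only: one positive outcome (utility 3, judged 50% likely)
# and one negative outcome (utility -2, judged 25% likely).
seu = subjective_expected_utility([3.0, -2.0], [0.5, 0.25])
```

Under this formula, two groups can assign identical utilities to the outcomes of a sport and still arrive at different SEUs if they judge the outcomes to be differently likely, which is the pattern the abstract reports.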

