reward contingencies
Recently Published Documents


TOTAL DOCUMENTS

65
(FIVE YEARS 25)

H-INDEX

17
(FIVE YEARS 2)

2021 ◽  
Author(s):  
Valentina Glück ◽  
Katharina Zwosta ◽  
Uta Wolfensteller ◽  
Hannes Ruge ◽  
Andre Pittig

Avoidance habits potentially contribute to maintaining maladaptive, costly avoidance behaviors that persist in the absence of threat. However, experimental evidence about costly habitual avoidance is scarce. In two experiments, we tested whether extensively trained avoidance impairs the subsequent goal-directed approach of rewards. Healthy participants were extensively trained to avoid an aversive outcome by performing simple responses to distinct full-screen color stimuli. After the subsequent devaluation of the aversive outcome, participants received monetary rewards for correct responses to neutral object pictures, which were presented on top of the same full-screen colors. These approach responses were either compatible or incompatible with habitual avoidance responses. Notably, the full-screen colors were not relevant to inform approach responses. In Experiment 1, participants were not instructed about post-devaluation stimulus-response-reward contingencies. Accuracy was lower in habit-incompatible than in habit-compatible trials, indicating costly avoidance, whereas reaction times did not differ. In Experiment 2, contingencies were explicitly instructed. Accuracy differences disappeared, but reaction times were slower in habit-incompatible than in habit-compatible trials, indicating low-cost habitual avoidance tendencies. These findings suggest a small but consistent impact of habitual avoidance tendencies on subsequent goal-directed approach. Costly habitual responding could, however, be inhibited when competing goal-directed approach was easily realizable.


2021 ◽  
Author(s):  
Galina Kozunova ◽  
Ksenia Sayfulina ◽  
Andrey Prokofyev ◽  
Vladimir Medvedev ◽  
Anna Rytikova ◽  
...  

This study examined whether pupil size and response time would distinguish directed exploration from random exploration and exploitation. Eighty-nine participants performed a two-choice probabilistic learning task while their pupil size and response time were continuously recorded. Using linear mixed model (LMM) analysis, we estimated differences in pupil size and response time between advantageous and disadvantageous choices as a function of learning success, i.e., whether or not a participant had learned the probabilistic contingency between choices and their outcomes. We proposed that before the true value of each choice became known to a decision-maker, both advantageous and disadvantageous choices represented a random exploration of the two options with an equally uncertain outcome, whereas the same choices after learning manifested exploitation and directed exploration strategies, respectively. We found that disadvantageous choices were associated with increases both in response time and pupil size, but only after the participants had learned the choice-reward contingencies. For pupil size, this effect was strongly amplified for those disadvantageous choices that immediately followed gains as compared to losses in the preceding choice. Pupil size modulations were evident during the behavioral choice rather than during the pretrial baseline. These findings suggest that occasional disadvantageous choices, which violate the acquired internal utility model, represent directed exploration. This exploratory strategy shifts choice priorities in favor of information seeking, and its autonomic and behavioral concomitants are mainly driven by the conflict between the behavioral plan of the intended exploratory choice and its strong alternative, which has already proven to be more rewarding.


PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0255430
Author(s):  
Arthur Prével ◽  
Ruth M. Krebs ◽  
Nanne Kukkonen ◽  
Senne Braem

Motivation signals have been shown to influence the engagement of cognitive control processes. However, most studies focus on the invigorating effect of reward prospect, rather than the reinforcing effect of reward feedback. The present study aimed to test whether people strategically adapt conflict processing when confronted with condition-specific congruency-reward contingencies in a manual Stroop task. Results show that the size of the Stroop effect can be affected by selectively rewarding responses following incongruent versus congruent trials. However, our findings also suggest important boundary conditions. Our first two experiments only show a modulation of the Stroop effect in the first half of the experimental blocks, possibly due to our adaptive threshold procedure demotivating adaptive behavior over time. The third experiment showed an overall modulation of the Stroop effect, but did not find evidence for a similar modulation on test items, leaving open whether this effect generalizes to the congruency conditions, or is stimulus-specific. More generally, our results are consistent with computational models of cognitive control and support contemporary learning perspectives on cognitive control. The findings also offer new guidelines and directions for future investigations on the selective reinforcement of cognitive control processes.


Author(s):  
Naotoshi Abekawa ◽  
Hiroaki Gomi ◽  
Jörn Diedrichsen

When reaching for an object with the hand, the gaze is usually directed at the target. In a laboratory setting, fixation is strongly maintained at the reach target until the reach is completed, a phenomenon known as "gaze-anchoring". While conventional accounts of such tight eye-hand coordination have often emphasized the internal synergetic linkage between both motor systems, more recent optimal control theories regard motor coordination as the adaptive solution to task requirements. Here we investigated to what degree gaze control during reaching is modulated by task demands. We adopted a gaze-anchoring paradigm in which participants had to reach for a target location. During the reach, they additionally had to make a saccadic eye movement to a salient visual cue presented at locations other than the target. We manipulated the task demands by independently changing reward contingencies for saccade reaction time (RT) and reaching accuracy. On average, both saccade RTs and reach error varied systematically with reward condition, with reach accuracy improving when the saccade was delayed. The distribution of the saccade RTs showed two types of eye movements: fast saccades with short RTs, and voluntary saccades with longer RTs. Increased reward for high reach accuracy reduced the probability of reflexive fast saccades, but left their latency unchanged. The results suggest that gaze-anchoring acts through a suppression of fast saccades, a mechanism that can be adaptively adjusted to the current task demands.


2021 ◽  
Author(s):  
Monika Laschober ◽  
Roger Mundry ◽  
Ludwig Huber ◽  
Raoul Schwing

The midsession reversal paradigm confronts an animal with a two-choice discrimination task where the reward contingencies are reversed at the midpoint of the session. Species react to the reversal with either win-stay/lose-shift, using local information of reinforcement, or reversal estimation, using global information, e.g. time, to estimate the point of reversal. Besides pigeons, only mammalian species have been tested in this paradigm so far, and analyses were conducted on pooled data, not considering possible individually different responses. We tested twelve kea parrots with a 40-trial midsession reversal test and additional shifted reversal tests with a variable point of reversal. Birds were tested in two groups on a touchscreen, with the discrimination task having either only visual or additional spatial information. We used Generalized Linear Mixed Models to control for individual differences when analysing the data. Our results demonstrate that kea can use win-stay/lose-shift independently of local information. The predictors group, session, and trial number as well as their interactions had a significant influence on the response. Furthermore, we discovered notable individual differences not only between birds but also between sessions of individual birds, including the ability to quite accurately estimate the reversal position in alternation to win-stay/lose-shift. Our findings of the kea’s quick and flexible responses contribute to the knowledge of diversity in avian cognitive abilities and emphasize the need to consider individuality as well as the limitation of pooling the data when analysing midsession reversal data.
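
As an illustration of the win-stay/lose-shift strategy named in this abstract, the following is a minimal sketch of an idealized, deterministic agent in a 40-trial session with the reversal at the midpoint. The function name, the option labels "A" and "B", and the noise-free reward rule are assumptions made for this example; they are not the authors' procedure or analysis.

```python
def win_stay_lose_shift(n_trials=40, reversal_at=20, first_choice="A"):
    """Simulate a deterministic win-stay/lose-shift agent (hypothetical sketch).

    Option "A" is rewarded before the reversal, option "B" from the
    reversal trial onwards. Returns a list of
    (trial_index, choice, rewarded) tuples.
    """
    choice = first_choice
    history = []
    for t in range(n_trials):
        correct = "A" if t < reversal_at else "B"
        rewarded = choice == correct
        history.append((t, choice, rewarded))
        # Win-stay: keep the choice after a reward.
        # Lose-shift: switch to the other option after a non-reward.
        if not rewarded:
            choice = "B" if choice == "A" else "A"
    return history


def error_trials(history):
    """Return the indices of unrewarded trials."""
    return [t for t, _, rewarded in history if not rewarded]


if __name__ == "__main__":
    print(error_trials(win_stay_lose_shift()))
```

Under these idealized assumptions the agent relies only on the outcome of the previous trial, so its only error after a correct start falls on the first trial after the reversal. An agent that instead estimates the reversal point from global cues such as elapsed time could be expected to err on both sides of the reversal, which is the contrast the abstract draws.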


2021 ◽  
Author(s):  
Julia Leonard ◽  
Skyler R. Cordrey ◽  
Hunter Z. Liu ◽  
Allyson Mackey

Learners must constantly decide which challenges are worth pursuing. When making this decision, one important signal may be whether performance has improved over time. Here, across four pre-registered experiments (N = 360, ages 4 to 6), we found that young children who were given evidence that their performance was improving were more likely to persist on a challenging task than children who were given evidence that their performance was not changing, even when final performance was matched. This effect was robust to differing reward contingencies and across in-person and online testing contexts. Older children made more accurate predictions about their performance, and in some contexts were more likely to choose a challenge, than younger children. Our results suggest that children will persist more if they are provided with clear feedback about their progress over time.


2021 ◽  
Author(s):  
Arthur Prével ◽  
Ruth Krebs ◽  
Nanne Kukkonen ◽  
Senne Braem

Motivation signals have been shown to influence the engagement of cognitive control processes. However, most studies focus on the invigorating effect of reward prospect, rather than the reinforcing effect of reward feedback. The present study aimed to test whether people strategically adapt conflict processing when confronted with condition-specific congruency-reward contingencies in a manual Stroop task. Results show that the size of the Stroop effect can be affected by selectively rewarding responses following incongruent versus congruent trials. However, our findings also suggest important boundary conditions. Our first two experiments only show a modulation of the Stroop effect in the first half of the experimental blocks, possibly due to our adaptive threshold procedure demotivating adaptive behavior over time. The third experiment showed an overall modulation of the Stroop effect, but did not find evidence for a similar modulation on test items, leaving open whether this effect generalizes to the congruency conditions, or is stimulus-specific. More generally, our results are consistent with computational models of cognitive control and support contemporary learning perspectives on cognitive control. The findings also offer new guidelines and directions for future investigations on the selective reinforcement of cognitive control processes.


2020 ◽  
Author(s):  
Tahra Eissa ◽  
Joshua I Gold ◽  
Krešimir Josić ◽  
Zachary P Kilpatrick

Decisions based on rare events are challenging because rare events alone can be both informative and unreliable as evidence. How humans should and do overcome this challenge is not well understood. Here we present results from a preregistered study of 200 online participants performing a simple inference task in which the evidence was rare and asymmetric but the priors were symmetric. Consistent with a Bayesian ideal observer, most participants exhibited choice asymmetries that reflected a tendency to rationally interpret a rare event as evidence for the alternative likely to produce slightly more events, even when the two alternatives were equally likely a priori. A subset of participants exhibited additional biases based on an under-weighting of rare events. The results provide new quantitative and theoretically grounded insights into rare-event inference, which is relevant to both real-world problems like predicting stock-market crashes and common laboratory tasks like predicting changes in reward contingencies.
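
The asymmetry described in this abstract can be made concrete with a small Bayesian calculation. The sketch below assumes two hypothetical alternatives that generate a rare event with per-sample probabilities `p_high` and `p_low`; these values, the function name, and the independent-samples model are illustrative assumptions and are not taken from the study.

```python
from math import log


def log_posterior_odds(n_events, n_samples, p_high=0.10, p_low=0.05,
                       prior_high=0.5):
    """Log posterior odds for the 'high-rate' alternative (hypothetical sketch).

    Each of n_samples independent observations is a rare event with
    probability p_high under one alternative and p_low under the other.
    The binomial coefficient cancels in the likelihood ratio.
    """
    n_none = n_samples - n_events
    log_likelihood_ratio = (
        n_events * log(p_high / p_low)
        + n_none * log((1 - p_high) / (1 - p_low))
    )
    log_prior_odds = log(prior_high / (1 - prior_high))
    return log_prior_odds + log_likelihood_ratio


if __name__ == "__main__":
    # One observation containing a rare event versus one containing none.
    print(log_posterior_odds(n_events=1, n_samples=1))
    print(log_posterior_odds(n_events=0, n_samples=1))
```

With symmetric priors the prior term is zero, so the sign of the result is set by the evidence alone. Under these assumed rates a single rare event shifts the odds toward the higher-rate alternative far more than a single non-event shifts them the other way, which is one way to see why an ideal observer's choices look asymmetric.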


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Chang-Hao Kao ◽  
Sangil Lee ◽  
Joshua I Gold ◽  
Joseph W Kable

Effective learning requires using errors in a task-dependent manner, for example adjusting to errors that result from unpredicted environmental changes but ignoring errors that result from environmental stochasticity. Where and how the brain represents errors in a task-dependent manner and uses them to guide behavior are not well understood. We imaged the brains of human participants performing a predictive-inference task with two conditions that had different sources of errors. Their performance was sensitive to this difference, including more choice switches after fundamental changes versus stochastic fluctuations in reward contingencies. Using multi-voxel pattern classification, we identified task-dependent representations of error magnitude and past errors in posterior parietal cortex. These representations were distinct from representations of the resulting behavioral adjustments in dorsomedial frontal, anterior cingulate, and orbitofrontal cortex. The results provide new insights into how the human brain represents errors in a task-dependent manner and guides subsequent adaptive behavior.


2020 ◽  
Author(s):  
Johan Alsiö ◽  
Olivia Lehmann ◽  
Colin McKenzie ◽  
David E Theobald ◽  
Lydia Searle ◽  
...  

Cross-species studies have identified an evolutionarily conserved role for serotonin in flexible behavior including reversal learning. The aim of the current study was to investigate the contribution of serotonin within the orbitofrontal cortex (OFC) and medial prefrontal cortex (mPFC) to visual discrimination and reversal learning. Male Lister Hooded rats were trained to discriminate between a rewarded (A+) and a nonrewarded (B−) visual stimulus to receive sucrose rewards in touchscreen operant chambers. Serotonin was depleted using surgical infusions of 5,7-dihydroxytryptamine (5,7-DHT), either globally by intracerebroventricular (i.c.v.) infusions or locally by microinfusions into the OFC or mPFC. Rats that received i.c.v. infusions of 5,7-DHT before initial training were significantly impaired during both visual discrimination and subsequent reversal learning during which the stimulus–reward contingencies were changed (A− vs. B+). Local serotonin depletion from the OFC impaired reversal learning without affecting initial discrimination. After mPFC depletion, rats were unimpaired during reversal learning but slower to respond to the stimuli during all the stages; the mPFC group was also slower to learn during discrimination than the OFC group. These findings extend our understanding of serotonin in cognitive flexibility by revealing differential effects within two subregions of the prefrontal cortex in visual discrimination and reversal learning.

