Confirmatory Bias
Recently Published Documents


Total documents: 51 (14 in the last five years)
H-index: 8 (3 in the last five years)

2021, pp. 1-31
Author(s): Germain Lefebvre, Christopher Summerfield, Rafal Bogacz

Abstract: Reinforcement learning involves updating estimates of the value of states and actions on the basis of experience. Previous work has shown that in humans, reinforcement learning exhibits a confirmatory bias: when the value of a chosen option is being updated, estimates are revised more radically following positive than negative reward prediction errors, but the converse is observed when updating the unchosen option value estimate. Here, we simulate performance on a multi-arm bandit task to examine the consequences of a confirmatory bias for reward harvesting. We report a paradoxical finding: that confirmatory biases allow the agent to maximize reward relative to an unbiased updating rule. This principle holds over a wide range of experimental settings and is most influential when decisions are corrupted by noise. We show that this occurs because on average, confirmatory biases lead to overestimating the value of more valuable bandits and underestimating the value of less valuable bandits, rendering decisions overall more robust in the face of noise. Our results show how apparently suboptimal learning rules can in fact be reward maximizing if decisions are made with finite computational precision.
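As a rough, hypothetical sketch of the updating scheme the abstract describes (not the authors' actual code), the following Python simulation runs a two-armed bandit with full feedback and an asymmetric, confirmation-biased learning rule. All parameter values (learning rates, reward probabilities, softmax temperature) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def run_bandit(alpha_confirm, alpha_disconfirm, n_trials=1000, beta=5.0):
    # Two-armed Bernoulli bandit with full feedback: rewards of both arms
    # are observed each trial, so chosen and unchosen values get updated.
    true_means = np.array([0.25, 0.75])  # assumed reward probabilities
    q = np.full(2, 0.5)                  # initial value estimates
    earned = 0.0
    for _ in range(n_trials):
        # Softmax choice; beta controls decision noise (finite precision).
        p = np.exp(beta * q)
        p /= p.sum()
        choice = rng.choice(2, p=p)
        rewards = (rng.random(2) < true_means).astype(float)
        earned += rewards[choice]
        for arm in range(2):
            delta = rewards[arm] - q[arm]  # reward prediction error
            if arm == choice:  # chosen option: larger step on good news
                lr = alpha_confirm if delta > 0 else alpha_disconfirm
            else:              # unchosen option: mirrored asymmetry
                lr = alpha_disconfirm if delta > 0 else alpha_confirm
            q[arm] += lr * delta
    return earned / n_trials

# Confirmatory-biased agent vs. unbiased agent with the same mean rate:
print(run_bandit(alpha_confirm=0.3, alpha_disconfirm=0.1))
print(run_bandit(alpha_confirm=0.2, alpha_disconfirm=0.2))

Per the abstract, the biased rule should tend to out-harvest the unbiased one, with the advantage largest when decision noise is high (low beta).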


2021, Vol 25, pp. 101155
Author(s): Alhassan I.H. Mohmed, Ashwin Kumaria, Barrie White

2021, Vol 51 (2), pp. 231-256
Author(s): Luka Mandić, Ksenija Klasnić

It is often assumed that survey results reflect only the quality of the sample and of the measuring instruments used. However, various other phenomena can affect the results, and these influences are often neglected when conducting surveys. This study tested the influence of four method effects on survey results: item wording, confirmatory bias, careless responding, and acquiescence bias. Using a split-ballot design with online questionnaires, we collected data from 791 participants and tested whether these method effects affected mean values, item correlations, construct correlations, model fit, and construct measurement invariance. The instruments used were from the domains of personality and gender inequality, and their items were adapted according to the method effect being tested. All tested method effects except careless responding had a statistically significant effect on at least one component of the analysis. Item wording and confirmatory bias affected mean values, model fit, and measurement invariance. Controlling for acquiescence bias improved model fit. This paper confirms that the tested method effects should be carefully considered when using surveys in research, and it suggests guidelines on how to do so.
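As a purely hypothetical illustration of one of these method effects (not the study's analysis or data), the short Python simulation below shows how a shared acquiescence tendency attenuates the correlation between a positively worded item and a reverse-coded negatively worded item; the sample size of 791 is borrowed from the study only for flavor.

import numpy as np

rng = np.random.default_rng(1)
n = 791  # matches the study's sample size, purely illustrative

trait = rng.normal(size=n)  # latent construct the items should measure

for acq_sd in (0.0, 0.8):  # no acquiescence vs. strong acquiescence
    acq = acq_sd * rng.normal(size=n)  # per-respondent yea-saying tendency
    item_pos = trait + acq + 0.5 * rng.normal(size=n)   # positively worded
    item_neg = -trait + acq + 0.5 * rng.normal(size=n)  # negatively worded
    # After reverse-coding, both items should index the same trait;
    # the shared acquiescence variance pulls the observed correlation down.
    r = np.corrcoef(item_pos, -item_neg)[0, 1]
    print(f"acquiescence sd = {acq_sd}: r = {r:.2f}")

Modeling such a yea-saying component explicitly (for example, as a method factor) is one way of controlling for it, consistent with the finding that controlling for acquiescence improved model fit.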


Author(s): Germain Lefebvre, Christopher Summerfield, Rafal Bogacz

Abstract: Reinforcement learning involves updating estimates of the value of states and actions on the basis of experience. Previous work has shown that in humans, reinforcement learning exhibits a confirmatory bias: when updating the value of a chosen option, estimates are revised more radically following positive than negative reward prediction errors, but the converse is observed when updating the unchosen option value estimate. Here, we simulate performance on a multi-arm bandit task to examine the consequences of a confirmatory bias for reward harvesting. We report a paradoxical finding: that confirmatory biases allow the agent to maximise reward relative to an unbiased updating rule. This principle holds over a wide range of experimental settings and is most influential when decisions are corrupted by noise. We show that this occurs because on average, confirmatory biases overestimate the value of more valuable bandits, and underestimate the value of less valuable bandits, rendering decisions overall more robust in the face of noise. Our results show how apparently suboptimal learning policies can in fact be reward-maximising if decisions are made with finite computational precision.


2020, Vol 70, pp. 102284
Author(s): Mengcen Qian, Shin-Yi Chou, Ernest K. Lai

2020, Vol 123 (1), pp. 517-533
Author(s): J. A. Garcia, Rosa Rodriguez-Sánchez, J. Fdez-Valdivia
