The role of feedback in risk-sensitivity
Over the last century, myriad versions of the bandit task have been used to study operant conditioning in humans and other animals. However, the overwhelming majority of these variations used one of two types of feedback: partial feedback, which reveals to participants only the outcome of the chosen alternative, or full feedback, which reveals the outcomes of all alternatives. While both are ecologically relevant, restricting the feedback to these two methods means that observed behavioral phenomena could be confounded with effects that the feedback method itself might induce, for example on attitudes towards risk. Here we introduce a new form of feedback. In a 2-armed bandit task, reverse feedback reveals to participants only the outcome of the unchosen alternative. In a behavioral experiment, human participants were incentivized to maximize their per-trial reward while exploring the reward distributions associated with two alternatives. Randomly assigning participants to a specific type of feedback, we find that participants in the partial and reverse feedback conditions demonstrated behavior consistent with risk aversion and risk seeking, respectively. This result is intriguing for two reasons. First, in the gains domain, humans are considered risk-averse, and it is hence surprising to observe a robust demonstration of risk seeking. Second, we present risk-sensitivity as a causal outcome of the feedback used. Since in most ecological and lab environments humans receive partial feedback and demonstrate risk aversion, our finding sheds new light on the common perception of risk-sensitivity as an inherent, rather than induced, characteristic. Using a simple reinforcement learning model, we explain the emergent risk preference as an outcome of learning in the specific environment we use.
We present the relation of our paradigm to prospect theory, relate our findings to the existing literature, and discuss the new light our novel feedback mechanism sheds on conclusions drawn from previous paradigms.
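To make the three feedback types concrete, the following is a minimal simulation sketch and not the model fitted in this work. It assumes a delta-rule learner with softmax choice, a safe arm that always pays 0.5, and a risky arm that pays 0 or 1 with equal probability. The payoffs, the learning rate `alpha`, the inverse temperature `beta`, and the function names are all hypothetical choices made for illustration.

```python
import math
import random

SAFE, RISKY = 0, 1


def draw_outcome(arm, rng):
    """Assumed payoffs: safe arm pays 0.5 for sure; risky arm pays 0 or 1
    with equal probability, so both arms have the same mean."""
    if arm == SAFE:
        return 0.5
    return 1.0 if rng.random() < 0.5 else 0.0


def simulate(feedback, n_agents=200, n_trials=200, alpha=0.3, beta=10.0, seed=0):
    """Return the overall fraction of risky choices made by delta-rule agents
    under 'partial', 'full', or 'reverse' feedback."""
    if feedback not in ("partial", "full", "reverse"):
        raise ValueError(f"unknown feedback type: {feedback}")
    rng = random.Random(seed)
    risky_choices = 0
    for _ in range(n_agents):
        q = [0.5, 0.5]  # value estimates, initialised at the true mean
        for _ in range(n_trials):
            # Softmax (logistic) choice between the two arms.
            p_risky = 1.0 / (1.0 + math.exp(-beta * (q[RISKY] - q[SAFE])))
            choice = RISKY if rng.random() < p_risky else SAFE
            risky_choices += choice
            unchosen = 1 - choice
            # The feedback type decides which outcomes the agent observes.
            if feedback == "partial":
                observed = [choice]
            elif feedback == "reverse":
                observed = [unchosen]
            else:
                observed = [choice, unchosen]
            # Delta-rule update for every observed arm.
            for arm in observed:
                q[arm] += alpha * (draw_outcome(arm, rng) - q[arm])
    return risky_choices / (n_agents * n_trials)


if __name__ == "__main__":
    for kind in ("partial", "full", "reverse"):
        print(kind, round(simulate(kind), 3))
```

Under these assumptions, the partial-feedback agents choose the risky arm less than half the time, because a low estimate of the risky arm is rarely corrected once the arm is avoided. The reverse-feedback agents show the mirror-image pattern, because a high estimate of the risky arm is rarely corrected while the arm keeps being chosen.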