Measuring Trial-wise Choice Difficulty in Multi-feature Reinforcement Learning
Real-world reinforcement learning (RL) requires learning about stimuli composed of multiple features, only some of which are relevant to reinforcement. We investigated RL in a multi-feature task known as the Dimensions Task. Past work developed a computational model of this task in which the expected value of a stimulus comprises weights assigned to the stimulus’s features, so the weights estimate the importance of each feature. We studied these weights and how they relate to human behavior. We found that a sparse subset of features accrued substantial weight and that just 2 of 9 features exerted a significant influence on reaction time (RT), suggesting that choice is influenced mainly by this pair of features. These findings clarify that the Dimensions Task requires selectively attending to a sparse subset of features while ignoring numerous irrelevant ones, which distinguishes it from other recent multi-feature RL tasks that require either attending to all features or learning to treat feature conjunctions as objects. We next examined whether the feature weights could be used to develop a trial-wise marker of choice difficulty that relates to individual differences. We found that high-performing (vs. low-performing) participants were better able to calibrate their responses to variation in the standard deviation (SD) of the 2 features influencing RT. This suggests that better-performing participants may be more responsive to the difference between these features. We discuss how this measure of trial-wise choice difficulty could be applied in experimental and translational research.
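To make the two quantities in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a stimulus's expected value is the plain sum of its feature weights, and that the trial-wise difficulty marker is the population SD of the weights of the two RT-relevant features. The function names, feature indices, and weight values are all hypothetical and chosen only for illustration.

```python
from statistics import pstdev


def stimulus_value(weights, feature_ids):
    """Expected value of a stimulus, assumed here to be the sum of its feature weights."""
    return sum(weights[i] for i in feature_ids)


def two_feature_sd(weights, pair):
    """Population SD of two feature weights (an assumed form of the difficulty marker).

    For two values a and b this equals |a - b| / 2, so it tracks the size of
    the difference between the two weights.
    """
    a, b = pair
    return pstdev([weights[a], weights[b]])


# Hypothetical weights for 9 features; the values are invented, not taken from data.
weights = [0.9, 0.1, 0.0, 0.5, 0.0, 0.1, 0.0, 0.2, 0.0]

# A stimulus described by three of the nine features (indices are illustrative).
value = stimulus_value(weights, (0, 4, 7))

# SD across the two features assumed to influence RT (indices are illustrative).
sd = two_feature_sd(weights, (0, 3))
```

Under these assumptions, the SD of two weights is proportional to their absolute difference, which is one way to read the suggestion that better-performing participants are more responsive to the difference between the two features.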