Inference and Search on Graph-Structured Spaces

Author(s):  
Charley M. Wu ◽  
Eric Schulz ◽  
Samuel J. Gershman

Abstract How do people learn functions on structured spaces? And how do they use this knowledge to guide their search for rewards in situations where the number of options is large? We study human behavior on structures with graph-correlated values and propose a Bayesian model of function learning to describe and predict their behavior. Across two experiments, one assessing function learning and one assessing the search for rewards, we find that our model captures human predictions and sampling behavior better than several alternatives, generates human-like learning curves, and also captures participants’ confidence judgements. Our results extend past models of human function learning and reward learning to more complex, graph-structured domains.

2020 ◽  


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Björn Lindström ◽  
Martin Bellander ◽  
David T. Schultner ◽  
Allen Chang ◽  
Philippe N. Tobler ◽  
...  

Abstract Social media has become a modern arena for human life, with billions of daily users worldwide. The intense popularity of social media is often attributed to a psychological need for social rewards (likes), portraying the online world as a Skinner Box for the modern human. Yet despite such portrayals, empirical evidence for social media engagement as reward-based behavior remains scant. Here, we apply a computational approach to directly test whether reward learning mechanisms contribute to social media behavior. We analyze over one million posts from over 4000 individuals on multiple social media platforms, using computational models based on reinforcement learning theory. Our results consistently show that human behavior on social media conforms qualitatively and quantitatively to the principles of reward learning. Specifically, social media users spaced their posts to maximize the average rate of accrued social rewards, in a manner subject to both the effort cost of posting and the opportunity cost of inaction. Results further reveal meaningful individual difference profiles in social reward learning on social media. Finally, an online experiment (n = 176), mimicking key aspects of social media, verifies that social rewards causally influence behavior as posited by our computational account. Together, these findings support a reward learning account of social media engagement and offer new insights into this emergent mode of modern human behavior.


2009 ◽  
Vol 32 (1) ◽  
pp. 87-88 ◽  
Author(s):  
Wim De Neys

Abstract Oaksford & Chater (O&C) rely on a data fitting approach to show that a Bayesian model captures the core reasoning data better than its logicist rivals. The problem is that O&C's modeling has focused exclusively on response output data. I argue that this exclusive focus is biasing their conclusions. Recent studies that focused on the processes that resulted in the response selection are more positive for the role of logic.


2019 ◽  
Author(s):  
M Alizadeh Asfestani ◽  
V Brechtmann ◽  
J Santiago ◽  
J Born ◽  
GB Feld

Abstract Sleep enhances memories, especially if they are related to future rewards. Although dopamine has been shown to be a key determinant during reward learning, the role of dopaminergic neurotransmission for amplifying reward-related memories during sleep remains unclear. In the present study, we scrutinize the idea that dopamine is needed for the preferential consolidation of rewarded information. We blocked dopaminergic neurotransmission, thereby aiming to wipe out preferential sleep-dependent consolidation of high over low rewarded memories during sleep. Following a double-blind, balanced, crossover design, 20 young healthy men received the dopamine D2-like receptor blocker Sulpiride (800 mg) or placebo after learning a Motivated Learning Task. The task required participants to memorize 80 highly and 80 lowly rewarded pictures. Half of them were presented for a short duration (750 ms) and half for a long duration (1500 ms), which made it possible to dissociate the effects of reward on sleep-associated consolidation from those of mere encoding depth. Retrieval was tested after a retention interval of 20 h that included 8 h of nocturnal sleep. As expected, at retrieval, highly rewarded memories were remembered better than lowly rewarded memories under placebo. However, there was no evidence for an effect of blocking dopaminergic neurotransmission with Sulpiride during sleep on this differential retention of rewarded information. This result indicates that dopaminergic activation is not required for the preferential consolidation of reward-associated memory. Rather, it appears that dopaminergic activation only tags such memories at encoding for intensified reprocessing during sleep.


2017 ◽  
Author(s):  
Eric Schulz ◽  
Charley M. Wu ◽  
Quentin J. M. Huys ◽  
Andreas Krause ◽  
Maarten Speekenbrink

Abstract How do people pursue rewards in risky environments, where some outcomes should be avoided at all costs? We investigate how participants search for spatially correlated rewards in scenarios where one must avoid sampling rewards below a given threshold. This requires not only the balancing of exploration and exploitation, but also reasoning about how to avoid potentially risky areas of the search space. Within risky versions of the spatially correlated multi-armed bandit task, we show that participants’ behavior is aligned well with a Gaussian process function learning algorithm, which chooses points based on a safe optimization routine. Moreover, using leave-one-block-out cross-validation, we find that participants adapt their sampling behavior to the riskiness of the task, although the underlying function learning mechanism remains relatively unchanged. These results show that participants can adapt their search behavior to the adversity of the environment and enrich our understanding of adaptive behavior in the face of risk and uncertainty.


2020 ◽  
Vol 32 (9) ◽  
pp. 1688-1703 ◽  
Author(s):  
Marjan Alizadeh Asfestani ◽  
Valentin Brechtmann ◽  
João Santiago ◽  
Andreas Peter ◽  
Jan Born ◽  
...  

Sleep enhances memories, especially if they are related to future rewards. Although dopamine has been shown to be a key determinant during reward learning, the role of dopaminergic neurotransmission for amplifying reward-related memories during sleep remains unclear. In this study, we scrutinize the idea that dopamine is needed for the preferential consolidation of rewarded information. We impaired dopaminergic neurotransmission, thereby aiming to wipe out preferential sleep-dependent consolidation of high- over low-rewarded memories during sleep. Following a double-blind, balanced, crossover design, 17 young healthy men received the dopamine D2-like receptor blocker sulpiride (800 mg) or placebo, after learning a motivated learning task. The task required participants to memorize 80 highly and 80 lowly rewarded pictures. Half of them were presented for a short (750 msec) and a long (1500 msec) duration, respectively, which permitted dissociation of the effects of reward on sleep-associated consolidation from those of mere encoding depth. Retrieval was tested after a retention interval of approximately 22 hr that included 8 hr of nocturnal sleep. As expected, at retrieval, highly rewarded memories were remembered better than lowly rewarded memories, under placebo. However, there was no evidence for an effect of reducing dopaminergic neurotransmission with sulpiride during sleep on this differential retention of rewarded information. This result indicates that dopaminergic activation likely is not required for the preferential consolidation of reward-associated memory. Rather, it appears that dopaminergic activation only tags such memories at encoding for intensified reprocessing during sleep.


2020 ◽  
Vol 24 (23) ◽  
pp. 17771-17785
Author(s):  
Antonio Candelieri ◽  
Riccardo Perego ◽  
Ilaria Giordani ◽  
Andrea Ponti ◽  
Francesco Archetti

Abstract Modelling human function learning has been the subject of intense research in cognitive sciences. The topic is relevant in black-box optimization where information about the objective and/or constraints is not available and must be learned through function evaluations. In this paper, we focus on the relation between the behaviour of humans searching for the maximum and the probabilistic model used in Bayesian optimization. As surrogate models of the unknown function, both Gaussian processes and random forests have been considered: the Bayesian learning paradigm is central in the development of active learning approaches balancing exploration/exploitation in uncertain conditions towards effective generalization in large decision spaces. In this paper, we analyse experimentally how Bayesian optimization compares to humans searching for the maximum of an unknown 2D function. A set of controlled experiments with 60 subjects, using both surrogate models, confirms that Bayesian optimization provides a general model to represent individual patterns of active learning in humans.


2007 ◽  
Vol 8 (3) ◽  
pp. 135-142 ◽  
Author(s):  
Danilo Fum ◽  
Fabio Del Missier ◽  
Andrea Stocco

2015 ◽  
Vol 45 (8) ◽  
pp. 1103-1121 ◽  
Author(s):  
Roger McHaney ◽  
Joey F. George ◽  
Manjul Gupta

Deception is a pervasive problem often found in human behavior. This study investigates why past deception studies have found that groups perform no better than individuals in detection, using time-interaction-performance theory, which suggests teams are not immediately effective. Only after establishing relational links is their potential reached. Established groups spend less time building relational links and instead focus on task-oriented activities more effectively. We sought to determine whether groups with prior history of interaction outperform individuals in deception detection. First, participants were randomly assigned to an individual or ad hoc group role. Later, additional preexisting work groups were recruited. Participants were instructed to identify deception in online video interviews. The experiment tested theoretical explanations regarding cohesion, interaction, and satisfaction as components of relational links and relationships to deception detection. Results indicated that groups which exhibited higher levels of relational links, that is, established groups, were more accurate in deception detection than ad hoc groups.


2019 ◽  
Author(s):  
Eric Schulz ◽  
Charley M Wu

How do people generalize and explore structured spaces? We study human behavior on a multi-armed bandit task, where rewards are influenced by the connectivity structure of a graph. A detailed predictive model comparison shows that a Gaussian Process regression model using a diffusion kernel is able to best describe participant choices, and also predict judgments about expected reward and confidence. This model unifies psychological models of function learning with the Successor Representation used in reinforcement learning, thereby building a bridge between different models of generalization.
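
To make the idea of Gaussian Process regression with a diffusion kernel on a graph concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the five-node path graph, the diffusion rate `alpha`, the noise variance `noise_var`, and the function names are all assumptions chosen for illustration. The kernel is taken as K = exp(-alpha * L), where L is the graph Laplacian, and the posterior follows the standard GP regression equations.

```python
import numpy as np

def diffusion_kernel(adjacency, alpha=1.0):
    """Diffusion kernel K = exp(-alpha * L), with L = D - A the graph Laplacian.

    `alpha` (an assumed free parameter) controls how far correlations spread.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # L is symmetric, so the matrix exponential follows from its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(L)
    return (eigvecs * np.exp(-alpha * eigvals)) @ eigvecs.T

def gp_posterior(K, observed_nodes, rewards, noise_var=0.01):
    """Posterior mean and variance over all nodes given noisy observed rewards."""
    obs = np.asarray(observed_nodes)
    y = np.asarray(rewards, dtype=float)
    K_oo = K[np.ix_(obs, obs)] + noise_var * np.eye(len(obs))
    K_ao = K[:, obs]
    mean = K_ao @ np.linalg.solve(K_oo, y)
    cov = K - K_ao @ np.linalg.solve(K_oo, K_ao.T)
    return mean, np.diag(cov)

# Hypothetical example: a path graph 0-1-2-3-4 with one reward observed at node 0.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

K = diffusion_kernel(A, alpha=1.0)
mean, var = gp_posterior(K, observed_nodes=[0], rewards=[1.0])
print(np.round(mean, 3))
print(np.round(var, 3))
```

In this toy setting the predicted reward falls off with graph distance from the observed node, while the posterior variance (a stand-in for low confidence) is smallest at the observed node and larger at distant nodes.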

