Modelling Virtual Bargaining using Logical Representation Change

2021 ◽  
pp. 68-90
Author(s):  
Alan Bundy ◽  
Eugene Philalithis ◽  
Xue Li

We discuss work in progress on the computational modelling of virtual bargaining: inference-driven human coordination under severe communicative constraints. For this initial work we model variants of a two-player coordination game of item selection and avoidance taken from the current virtual bargaining literature. In this range of games, human participants collaborate to select items (e.g. bananas) or avoid items (e.g. scorpions), based on signalling conventions constructed and updated from shared assumptions, with minimal information exchange. We model behaviours in these games using logic programs interpretable as logical theories. From an initial theory composed of rules, background assumptions and a basic signalling convention, we use automated theory repair to jointly adapt that basic signalling convention to novel contexts, with no explicit coordination between players. Our ABC system for theory repair delivers spontaneous adaptation, using reasoning failures to replace established conventions with better alternatives, matching human players’ own reasoning across several games.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
M. Herrojo Ruiz ◽  
T. Maudrich ◽  
B. Kalloch ◽  
D. Sammler ◽  
R. Kenville ◽  
...  

Abstract The frontopolar cortex (FPC) contributes to tracking the reward of alternative choices during decision making, as well as their reliability. Whether this FPC function extends to reward gradients associated with continuous movements during motor learning remains unknown. We used anodal transcranial direct current stimulation (tDCS) over the right FPC to investigate its role in reward-based motor learning. Nineteen healthy human participants practiced novel sequences of finger movements on a digital piano with corresponding auditory feedback. Their aim was to use trialwise reward feedback to discover a hidden performance goal along a continuous dimension: timing. We additionally modulated the contralateral motor cortex (left M1) activity, and included a control sham stimulation. Right FPC-tDCS led to faster learning compared to lM1-tDCS and sham through regulation of motor variability. Bayesian computational modelling revealed that in all stimulation protocols, an increase in the trialwise expectation of reward was followed by greater exploitation, as shown previously. Yet, this association was weaker in lM1-tDCS suggesting a less efficient learning strategy. The effects of frontopolar stimulation were dissociated from those induced by lM1-tDCS and sham, as motor exploration was more sensitive to inferred changes in the reward tendency (volatility). The findings suggest that rFPC-tDCS increases the sensitivity of motor exploration to updates in reward volatility, accelerating reward-based motor learning.


2021 ◽  
Author(s):  
Maria Herrojo Ruiz ◽  
Tom Maudrich ◽  
Benjamin Kalloch ◽  
Daniela Sammler ◽  
Rouven Kenville ◽  
...  

Abstract The frontopolar cortex (FPC) contributes to tracking the reward of alternative choices during decision making, as well as their reliability. Whether this FPC function extends to reward gradients associated with continuous movements during motor learning remains unknown. We used anodal transcranial direct current stimulation (tDCS) over the right FPC to investigate its role in reward-based motor learning. Nineteen healthy human participants completed a motor sequence learning task using trialwise reward feedback to discover a hidden goal along a continuous dimension: timing. As additional conditions, we modulated the contralateral motor cortex (left M1) activity, and included a control sham stimulation. Right FPC-tDCS led to faster learning compared to lM1-tDCS and sham through regulation of motor variability. Computational modelling revealed that in all stimulation protocols, an increase in the trialwise expectation of reward was followed by greater exploitation, as shown previously. Yet, this association was weaker in lM1-tDCS suggesting a less efficient learning strategy. The effects of frontopolar stimulation were dissociated from those induced by lM1-tDCS and sham, as motor exploration was more sensitive to inferred changes in the reward tendency (volatility). The findings suggest that rFPC-tDCS increases the sensitivity of motor exploration to updates in reward volatility, accelerating reward-based motor learning.


2009 ◽  
Vol 9 (3) ◽  
pp. 245-308 ◽  
Author(s):  
JOOST VENNEKENS ◽  
MARC DENECKER ◽  
MAURICE BRUYNOOGHE

Abstract This paper develops a logical language for representing probabilistic causal laws. Our interest in such a language is two-fold. First, it can be motivated as a fundamental study of the representation of causal knowledge. Causality has an inherent dynamic aspect, which has been studied at the semantical level by Shafer in his framework of probability trees. In such a dynamic context, where the evolution of a domain over time is considered, the idea of a causal law as something which guides this evolution is quite natural. In our formalization, a set of probabilistic causal laws can be used to represent a class of probability trees in a concise, flexible and modular way. In this way, our work extends Shafer's by offering a convenient logical representation for his semantical objects. Second, this language also has relevance for the area of probabilistic logic programming. In particular, we prove that the formal semantics of a theory in our language can be equivalently defined as a probability distribution over the well-founded models of certain logic programs, rendering it formally quite similar to existing languages such as ICL or PRISM. Because we can motivate and explain our language in a completely self-contained way as a representation of probabilistic causal laws, this provides a new way of explaining the intuitions behind such probabilistic logic programs: we can say precisely which knowledge such a program expresses, in terms that are equally understandable by a non-logician. Moreover, we also obtain an additional piece of knowledge representation methodology for probabilistic logic programs, by showing how they can express probabilistic causal laws.


Author(s):  
Maxine T. Sherman ◽  
Zafeirios Fountas ◽  
Anil K. Seth ◽  
Warrick Roseboom

Abstract Human experience of time exhibits systematic, context-dependent deviations from objective clock time; for example, time is experienced differently at work than on holiday. However, leading explanations of time perception are not equipped to explain these deviations. Here we test the idea that these deviations arise because time estimates are constructed by accumulating the same quantity that guides perception: salient events. To test this, healthy human participants watched naturalistic, silent videos and estimated their duration while fMRI was acquired. Using computational modelling, we show that accumulated events in visual, auditory and somatosensory cortex all predict ‘clock time’, but duration biases reflecting human experience of time could only be predicted from the region involved in modality-specific sensory processing: visual cortex. Our results reveal that human subjective time is based on information arising during the processing of our dynamic sensory environment, providing a computational basis for an end-to-end account of time perception.


2019 ◽  
Author(s):  
Wojciech Zajkowski ◽  
Dominik Krzemiński ◽  
Jacopo Barone ◽  
Lisa Evans ◽  
Jiaxiang Zhang

Abstract Choosing between equally valued options can be a conundrum, for which classical decision theories predicted a prolonged response time (RT). Paradoxically, a rational decision-maker would need no deliberative thinking in this scenario, as outcomes of alternatives are indifferent. How individuals choose between equal options remains unclear. Here, we characterized the neurocognitive processes underlying such voluntary decisions, by integrating advanced cognitive modelling and EEG recording in a probabilistic reward task, in which human participants chose between pairs of cues associated with identical reward probabilities at different levels. We showed that higher reward certainty accelerated RT. At each certainty level, participants preferred to choose one cue faster and more frequently over the other. The behavioral effects on RT persisted in simple reactions to reward cues. By using hierarchical Bayesian parameter estimation for an accumulator model, we showed that the certainty and preference effects were independently associated with the rate of evidence accumulation during decisions, but not with visual encoding or motor execution latencies. Time-resolved multivariate pattern classification of EEG evoked response identified significant representations of reward certainty and preference choices as early as 120 ms after stimulus onset, with spatial relevance patterns maximal in middle central and parietal electrodes. Furthermore, EEG-informed computational modelling showed that the rate of change between N100 and P300 event-related potentials reflected changes in the model-derived rate of evidence accumulation on a trial-by-trial basis. Our findings suggested that reward certainty and preference collectively shaped voluntary decisions between equal options, providing a mechanism to prevent indecision or random behavior.


Author(s):  
Enrico Simetti ◽  
Giuseppe Casalino ◽  
Ninad Manerikar ◽  
Alessandro Sperinde ◽  
Sandro Torelli ◽  
...  

2021 ◽  
Author(s):  
Leonie Glitz ◽  
Keno Juechems ◽  
Christopher Summerfield ◽  
Neil P Garrett

Effective planning involves knowing where different actions will take us. However, natural environments are rich and complex, leading to a "curse of dimensionality" - an exponential increase in memory demand as a plan grows in depth. One potential solution to this problem is to generalise the neural state transition functions used for planning between similar contexts. Here, we asked human participants to perform a sequential decision-making task designed so that knowledge could be shared between some contexts but not others. Computational modelling showed that participants generalise a transition model between contexts where appropriate. fMRI data identified the medial temporal lobe as a locus for learning of state transitions, and within the same region, correlated BOLD patterns were observed in contexts where state transition information was shared. Finally, we show that the transition model is updated more strongly following the receipt of positive compared to negative outcomes, a finding that challenges conventional theories of planning which assume knowledge about our environment is updated independently of outcomes received. Together, these findings propose a computational and neural account of how information relevant for planning can be shared between contexts.


2020 ◽  
Vol 10 (2) ◽  
pp. 89-91
Author(s):  
Naresh Manandhar ◽  
Sunil Kumar Joshi

Research involving human participants needs to be scientifically valid and should be conducted according to accepted ethical standards. Research ethics provides guidelines for the responsible conduct of research on human participants. It primarily protects the human participants of research, and it also educates and monitors researchers conducting health research to ensure a high ethical standard. Consent is a process of information exchange between the researcher and the human participants of research. Information provided to research participants should be adequate and clearly understood by a participant with decision-making capacity, and the participant should decide voluntarily whether to take part. Respect for persons requires that research participants be allowed to choose whether or not to participate in the research.

