Neural Correlates of Sequence Learning with Stochastic Feedback

2011, Vol. 23 (6), pp. 1346-1357
Author(s): Bruno B. Averbeck, James Kilner, Christopher D. Frith

Although much is known about decision making under uncertainty when only a single step is required in the decision process, less is known about sequential decision making. We carried out a stochastic sequence learning task in which subjects had to use noisy feedback to learn sequences of button presses. We compared flat and hierarchical behavioral models and found that although both models predicted the choices of the group of subjects equally well, only the hierarchical model correlated significantly with learning-related changes in the magneto-encephalographic response. The significant modulations in the magneto-encephalographic signal occurred 83 msec before button press and 67 msec after button press. We also localized the sources of these effects and found that the early effect localized to the insula, whereas the late effect localized to the premotor cortex.
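The core computational problem in this task, learning which action is correct from feedback that is only sometimes valid, can be illustrated with a minimal Bayesian update over candidate buttons at a single sequence position. The button count, feedback-validity level, and update rule below are illustrative assumptions, not the paper's task parameters or its flat/hierarchical models.

```python
def update_belief(belief, choice, positive_feedback, p_valid=0.8):
    """One Bayesian update of the belief over which of len(belief)
    buttons is correct at a sequence position, given feedback that
    is valid only with probability p_valid (an assumed value)."""
    posterior = []
    for button, prior in enumerate(belief):
        # If `button` were the correct one, positive feedback is
        # expected exactly when it was the button pressed.
        consistent = positive_feedback == (choice == button)
        posterior.append(prior * (p_valid if consistent else 1 - p_valid))
    z = sum(posterior)
    return [p / z for p in posterior]

# Repeated noisy positive feedback for button 2 concentrates belief there.
belief = [0.25, 0.25, 0.25, 0.25]
for _ in range(5):
    belief = update_belief(belief, choice=2, positive_feedback=True)
```

Because feedback is stochastic, a single observation moves the belief only partway; the learner must accumulate evidence over trials, which is what makes the task a sequence *learning* problem rather than a lookup.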

2018
Author(s): Ziqiang Wei, Hidehiko Inagaki, Nuo Li, Karel Svoboda, Shaul Druckmann

Abstract
Animals are not simple input-output machines. Their responses to even very similar stimuli are variable. A key, long-standing question in neuroscience is understanding the neural correlates of such behavioral variability. To reveal these correlates, behavior and neural population activity must be related to one another on single trials. Such analysis is challenging due to the dynamical nature of brain function (e.g., decision making), neuronal heterogeneity, and signal-to-noise difficulties. By analyzing population recordings from mouse frontal cortex in perceptual decision-making tasks, we show that an analysis approach tailored to the coarse-grained features of the dynamics was able to reveal previously unrecognized structure in the organization of population activity. This structure was similar on error and correct trials, suggesting candidate underlying circuit mechanisms; it predicted multiple aspects of behavioral variability and revealed long-timescale modulation of population activity.


2020, Vol. 34 (05), pp. 7358-7366
Author(s): Mahtab Ahmed, Robert E. Mercer

Learning sentence representations is a fundamental task in Natural Language Processing. Most existing sentence pair modelling architectures focus only on extracting and using the rich sentence pair features. Utilizing all of these features, however, makes the learning process much harder. In this study, we propose a reinforcement learning (RL) method to learn a sentence pair representation when performing tasks like semantic similarity, paraphrase identification, and question-answer pair modelling. We formulate this learning problem as a sequential decision-making task in which the decision made in the current state strongly affects the following decisions. We address this decision making with a policy gradient RL method that chooses the irrelevant words to delete by looking at the sub-optimal representation of the sentences being compared. With this policy, extensive experiments show that our model achieves on-par performance when learning task-specific representations of sentence pairs without needing any further knowledge, such as parse trees. We suggest that the simplicity of the inference our RL model performs for each task makes its decisions easier to explain.
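The word-deletion policy described above can be sketched as a one-step REINFORCE update. The single stopword feature, the Jaccard-overlap reward, and the learning rate here are stand-in assumptions for illustration; the paper's actual policy network, reward, and features are not specified in this abstract.

```python
import math
import random

STOPWORDS = {"the", "a", "an", "of", "to", "is"}  # toy stopword list

def delete_feature(word):
    # Single hypothetical feature: 1.0 if the word is a stopword.
    return 1.0 if word.lower() in STOPWORDS else 0.0

def delete_prob(word, w, b):
    # Bernoulli deletion policy: sigmoid of a linear score.
    return 1.0 / (1.0 + math.exp(-(w * delete_feature(word) + b)))

def overlap_reward(kept_a, kept_b):
    # Toy reward standing in for a task objective: Jaccard overlap
    # of the two pruned sentences.
    sa, sb = set(kept_a), set(kept_b)
    return len(sa & sb) / max(1, len(sa | sb))

def reinforce_step(sent_a, sent_b, w, b, lr=0.5, seed=0):
    """One policy-gradient (REINFORCE) step: sample delete/keep per
    word, observe the reward, and move the parameters along
    reward * grad log-probability of the sampled actions."""
    rng = random.Random(seed)
    grad_w = grad_b = 0.0
    kept = ([], [])
    for i, sent in enumerate((sent_a, sent_b)):
        for word in sent:
            p = delete_prob(word, w, b)
            delete = rng.random() < p
            if not delete:
                kept[i].append(word.lower())
            # d log pi(action) / d logit for a Bernoulli policy
            dlogit = (1 - p) if delete else -p
            grad_w += dlogit * delete_feature(word)
            grad_b += dlogit
    r = overlap_reward(*kept)
    return w + lr * r * grad_w, b + lr * r * grad_b, r
```

The sequential-decision framing is visible in the sampling loop: each delete/keep choice changes which words survive, and hence the reward every later choice is judged against.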


Author(s): Lilla Horvath, Stanley Colcombe, Michael Milham, Shruti Ray, Philipp Schwartenbeck, ...

Abstract
Humans often face sequential decision-making problems, in which information about the environmental reward structure is detached from rewards for a subset of actions. In the current exploratory study, we introduce an information-selective symmetric reversal bandit task to model such situations and obtained choice data on this task from 24 participants. To arbitrate between different decision-making strategies that participants may use on this task, we developed a set of probabilistic agent-based behavioral models, including exploitative and explorative Bayesian agents, as well as heuristic control agents. Upon validating the model and parameter recovery properties of our model set and summarizing the participants’ choice data in a descriptive way, we used a maximum likelihood approach to evaluate the participants’ choice data from the perspective of our model set. In brief, we provide quantitative evidence that participants employ a belief state-based hybrid explorative-exploitative strategy on the information-selective symmetric reversal bandit task, lending further support to the finding that humans are guided by their subjective uncertainty when solving exploration-exploitation dilemmas.
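One way to picture a belief state-based hybrid explorative-exploitative strategy is to value each action as its expected reward under the current belief plus an uncertainty bonus that accrues only when the action is informative, mirroring the task's detachment of information from reward for some actions. The reward probabilities, bonus weight `beta`, and the assumption that only arm 0 returns feedback are all illustrative, not the study's task settings or fitted agents.

```python
import math

def action_values(b, p_high=0.85, p_low=0.15, beta=0.3,
                  informative=(True, False)):
    """Hybrid valuation for a two-armed symmetric reversal bandit.
    b is the belief that arm 0 is currently the lucky arm. Each arm's
    value is its expected reward plus an exploration bonus, scaled by
    belief entropy, that only accrues if the arm returns feedback."""
    if b in (0.0, 1.0):
        entropy = 0.0  # no uncertainty left, no bonus
    else:
        entropy = -(b * math.log2(b) + (1 - b) * math.log2(1 - b))
    expected = [b * p_high + (1 - b) * p_low,
                (1 - b) * p_high + b * p_low]
    return [expected[a] + beta * entropy * informative[a] for a in (0, 1)]
```

At b = 0.5 the agent is maximally uncertain, so the informative arm wins even though both arms promise the same expected reward; at a confident belief such as b = 0.1, exploitation of the uninformative but likely lucky arm dominates.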

