Hippocampal Contribution to Probabilistic Feedback Learning: Modeling Observation- and Reinforcement-based Processes

2021
Author(s):
Virginie Patt
Daniela Palombo
Michael Esterman
Mieke Verfaellie

Simple probabilistic reinforcement learning is recognized as a striatum-based learning system, but in recent years it has also been associated with hippocampal involvement. The present study examined whether such involvement may be attributed to observation-based learning processes running in parallel to striatum-based reinforcement learning. A computational model of observation-based learning (OL), mirroring classic models of reinforcement-based learning (RL), was constructed and applied to the neuroimaging dataset of Palombo, Hayes, Reid, and Verfaellie (2019; "Hippocampal contributions to value-based learning: Converging evidence from fMRI and amnesia," Cognitive, Affective & Behavioral Neuroscience, 19(3), 523–536). Results suggested that observation-based learning processes may indeed take place concomitantly with reinforcement learning and involve activation of the hippocampus and central orbitofrontal cortex (cOFC). However, rather than indicating independent mechanisms running in parallel, the brain correlates of the OL and RL prediction errors pointed to collaboration between the two systems, directly implicating the hippocampus in computing the discrepancy between the expected and actual reinforcing values of actions. These findings are consistent with previous accounts of a role for the hippocampus in encoding the strength of observed stimulus-outcome associations, with such associations updated through striatal reinforcement-based computations. Additionally, enhanced negative prediction error signaling was found in the anterior insula when OL processes were favored over RL processes. This result may point to an additional mode of collaboration between the OL and RL systems, implicating the error monitoring network.
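
To make the modeling approach concrete, here is a minimal simulation sketch, not the authors' fitted model: two mirror-image learners update in parallel from the same probabilistic feedback, an RL learner over reinforcement values and an OL learner over stimulus-outcome association strengths, with choices driven by a weighted mixture of the two. The learning rates, mixture weight, and 80/20 contingencies are illustrative assumptions (Python):

import numpy as np

rng = np.random.default_rng(0)
alpha, beta, w = 0.2, 3.0, 0.5   # assumed learning rate, inverse temperature, mixture weight
q_rl = np.zeros(2)               # RL system: reward-driven action values
p_ol = np.full(2, 0.5)           # OL system: observed stimulus-outcome association strengths

for trial in range(200):
    v = w * q_rl + (1 - w) * p_ol                            # combined decision variable
    p_choose_1 = 1.0 / (1.0 + np.exp(-beta * (v[1] - v[0])))
    a = int(rng.random() < p_choose_1)                       # softmax choice between two options
    outcome = float(rng.random() < (0.8 if a == 0 else 0.2))
    q_rl[a] += alpha * (outcome - q_rl[a])                   # reinforcement prediction error update
    p_ol[a] += alpha * (outcome - p_ol[a])                   # observational prediction error update

print(q_rl.round(2), p_ol.round(2))

In this sketch the two prediction errors coincide numerically; the substantive difference lies in what is represented (reinforcing value versus observed contingency) and in how each quantity would be mapped onto behavioral and neural data.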

2020
Author(s):
Dongjae Kim
Jaeseung Jeong
Sang Wan Lee

Abstract
The goal of learning is to maximize future rewards by minimizing prediction errors. Evidence has shown that the brain achieves this by combining model-based and model-free learning. However, prediction error minimization is challenged by a bias-variance tradeoff, which imposes constraints on each strategy's performance. We provide new theoretical insight into how this tradeoff can be resolved through the adaptive control of model-based and model-free learning. The theory predicts that baseline correction of the prediction error reduces the lower bound of the bias-variance error by factoring out irreducible noise. Using a Markov decision task with context changes, we found behavioral evidence of such adaptive control. Model-based behavioral analyses show that the prediction error baseline signals context changes, improving adaptability. Critically, the neural results support this view, demonstrating multiplexed representations of the prediction error baseline within the ventrolateral and ventromedial prefrontal cortex, key brain regions known to guide model-based and model-free learning.

One-sentence summary
A theoretical, behavioral, computational, and neural account of how the brain resolves the bias-variance tradeoff during reinforcement learning is described.
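
A minimal sketch of the baseline-correction idea as described, with assumed functional forms and constants: a slow running average of recent prediction errors is subtracted from the error that drives learning, factoring out irreducible noise, and a sustained departure of the baseline from zero flags a context change (Python):

import numpy as np

rng = np.random.default_rng(1)
alpha_q, alpha_b = 0.3, 0.05    # fast value update vs. slow baseline update (assumed)
q, baseline = 0.0, 0.0
trace = []

for t in range(400):
    p_reward = 0.8 if t < 200 else 0.2        # unsignaled context change at t = 200
    r = float(rng.random() < p_reward)
    pe = r - q                                # raw prediction error
    baseline += alpha_b * (pe - baseline)     # slowly tracked prediction error baseline
    q += alpha_q * (pe - baseline)            # learning driven by the corrected error
    trace.append(baseline)

print(round(min(trace), 3), round(max(trace), 3))  # baseline swings negative after the change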


Author(s):  
Mitsuo Kawato
Aurelio Cortese

Abstract
In several papers published in Biological Cybernetics in the 1980s and 1990s, Kawato and colleagues proposed computational models explaining how internal models are acquired in the cerebellum. These models were later supported by neurophysiological experiments in monkeys and neuroimaging experiments in humans. These early studies influenced neuroscience from basic sensorimotor control to higher cognitive functions. One of the most perplexing enigmas related to internal models is understanding the neural mechanisms that enable animals to learn large-dimensional problems from so few trials. Consciousness and metacognition, the ability to monitor one's own thoughts, may be part of the solution to this enigma. Based on literature reviews of the past 20 years, here we propose a computational neuroscience model of metacognition. The model comprises a modular, hierarchical reinforcement-learning architecture of parallel and layered generative-inverse model pairs. In the prefrontal cortex, a distributed executive network called the "cognitive reality monitoring network" (CRMN) orchestrates conscious involvement of generative-inverse model pairs in perception and action. Based on mismatches between the computations of generative and inverse models, as well as reward prediction errors, the CRMN computes a "responsibility signal" that gates the selection and learning of pairs in perception, action, and reinforcement learning. A high responsibility signal is given to pairs that best capture the external world, that are competent in movements (small mismatch), and that are capable of reinforcement learning (small reward prediction error). The CRMN selects pairs with higher responsibility signals as objects of metacognition, and consciousness is determined by the entropy of responsibility signals across all pairs. This model could lead to a new generation of AI that exhibits metacognition, consciousness, dimension reduction, selection of modules and corresponding representations, and learning from small samples. It may also lead to a new scientific paradigm that enables the causal study of consciousness by combining the CRMN and decoded neurofeedback.
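
The responsibility computation lends itself to a compact illustration. Below is a minimal sketch under assumed functional forms (a softmax gate over combined error, a standard choice in Kawato's modular architectures, though the abstract gives no single equation to copy): each generative-inverse pair earns responsibility in proportion to how small its mismatch and reward prediction error are, and the entropy of the responsibility distribution is the proposed consciousness-related index (Python):

import numpy as np

def responsibilities(mismatch, reward_pe, beta=1.0):
    # smaller combined error -> larger responsibility (softmax gating)
    cost = mismatch + np.abs(reward_pe)
    w = np.exp(-beta * cost)
    return w / w.sum()

mismatch = np.array([0.1, 0.8, 0.5])    # generative vs. inverse model mismatch per pair
reward_pe = np.array([0.05, 0.4, 0.3])  # reward prediction error per pair
lam = responsibilities(mismatch, reward_pe)
entropy = -np.sum(lam * np.log(lam))    # proposed index across all pairs
print(lam.round(3), round(entropy, 3))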


2021
Author(s):
Bianca Westhoff
Neeltje E. Blankenstein
Elisabeth Schreuders
Eveline A. Crone
Anna C. K. van Duijvenvoorde

Abstract
Learning which of our behaviors benefit others contributes to social bonding and to being liked by others. An important period for the development of (pro)social behavior is adolescence, in which peers become more salient and relationships intensify. It is, however, unknown how learning to benefit others develops across adolescence, or what the underlying cognitive and neural mechanisms are. In this functional neuroimaging study, we assessed learning for self and for others (i.e., prosocial learning) and the concurrent neural tracking of prediction errors across adolescence (ages 9-21, N=74). Participants performed a two-choice probabilistic reinforcement learning task in which outcomes resulted in monetary consequences for themselves, an unknown other, or no one. Participants of all ages were able to learn for themselves and for others, but learning for others showed a more protracted developmental trajectory. Prediction errors for self were observed in the ventral striatum and showed no age-related differences. Prediction error coding for others, however, was specifically observed in the ventromedial prefrontal cortex and showed age-related increases. These results provide insight into the computational mechanisms of learning for others across adolescence and highlight that learning for self and learning for others follow different age-related patterns.
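
A schematic of the model class typically fit to such two-choice probabilistic tasks: a Rescorla-Wagner learner with softmax choice. The idea that beneficiary conditions differ only in learning rate is an illustrative assumption, not the study's fitted result (Python):

import numpy as np

def simulate(alpha, beta, n_trials=60, p_correct=0.75, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros(2)                       # value estimates for the two choice options
    better_choices = 0
    for _ in range(n_trials):
        p1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))   # softmax choice rule
        a = int(rng.random() < p1)
        r = float(rng.random() < (p_correct if a == 1 else 1 - p_correct))
        q[a] += alpha * (r - q[a])        # prediction error update
        better_choices += a
    return better_choices / n_trials

print(simulate(alpha=0.3, beta=3.0),      # e.g., faster learning for self
      simulate(alpha=0.1, beta=3.0))      # e.g., slower learning for an unknown other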


2020
Author(s):
Ceyda Sayalı
David Badre

Abstract
People balance the benefits of cognitive work against the costs of cognitive effort. Models that incorporate prospective estimates of the costs of cognitive effort into decision making require a mechanism by which these costs are learned. However, it remains an open question which brain systems are important for this learning, particularly when learning is not tied explicitly to a decision about which task to perform. In this fMRI experiment, we parametrically manipulated the level of effort a task requires by increasing task-switching frequency across six task contexts. In a scanned learning phase, participants implicitly learned about the task-switching frequency in each context. In a subsequent test phase outside the scanner, participants made selections between pairs of these task contexts. Notably, during learning, participants were not aware of this later choice phase. Nonetheless, participants avoided task contexts requiring more task switching. We modeled learning within a reinforcement learning framework and found that effort expectations derived from task-switching probability and response time (RT) during learning were the best predictors of later choice behavior. Interestingly, prediction errors (PE) from these two models were associated with separate brain networks during distinct learning epochs. Specifically, PE derived from expected RT correlated most strongly with the cingulo-opercular network early in learning, whereas PE derived from expected task-switching frequency correlated with the fronto-parietal network late in learning. These observations are discussed in relation to the contribution of cognitive control systems to new task learning and how this may bear on effort-based decisions.

Significance Statement
On a daily basis, we make decisions about cognitive effort expenditure. It has been argued that we avoid cognitively effortful tasks to the degree that their subjective costs outweigh their benefits. Here, we investigate the brain systems that learn about task demands for use in later effort-based decisions. Using reinforcement learning models, we find that learning about both expected response time and expected task-switching frequency affects later effort-based decisions, and that these signals are differentially tracked by distinct brain networks during different epochs of learning. The results indicate that the brain uses more than one signal to associate effort costs with a given task.
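
The two learning models can be sketched side by side: one tracks an expected response time per task context, the other an expected task-switching frequency, and each generates its own trial-wise prediction error. The forms and constants below are illustrative assumptions, not the fitted models (Python):

import numpy as np

alpha = 0.1                                 # assumed learning rate
exp_rt = {c: 0.8 for c in range(6)}         # expected RT in seconds, per task context
exp_switch = {c: 0.5 for c in range(6)}     # expected switch probability, per task context

def update(context, rt, switched):
    pe_rt = rt - exp_rt[context]                    # RT prediction error
    exp_rt[context] += alpha * pe_rt
    pe_sw = float(switched) - exp_switch[context]   # switch-frequency prediction error
    exp_switch[context] += alpha * pe_sw
    return pe_rt, pe_sw

print(update(context=0, rt=1.1, switched=True))     # both error signals for one trial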


eLife
2018
Vol 7
Author(s):
Ida Momennejad
A Ross Otto
Nathaniel D Daw
Kenneth A Norman

Making decisions in sequentially structured tasks requires integrating distally acquired information. The extensive computational cost of such integration challenges planning methods that integrate online, at decision time. Furthermore, it remains unclear whether ‘offline’ integration during replay supports planning, and if so which memories should be replayed. Inspired by machine learning, we propose that (a) offline replay of trajectories facilitates integrating representations that guide decisions, and (b) unsigned prediction errors (uncertainty) trigger such integrative replay. We designed a 2-step revaluation task for fMRI, whereby participants needed to integrate changes in rewards with past knowledge to optimally replan decisions. As predicted, we found that (a) multi-voxel pattern evidence for off-task replay predicts subsequent replanning; (b) neural sensitivity to uncertainty predicts subsequent replay and replanning; (c) off-task hippocampus and anterior cingulate activity increase when revaluation is required. These findings elucidate how the brain leverages offline mechanisms in planning and goal-directed behavior under uncertainty.
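
One way to picture proposals (a) and (b) together is a Dyna-style sketch in which unsigned prediction error gates the amount of offline replay of stored transitions; the priority rule and constants below are assumptions for illustration, not the study's model (Python):

import numpy as np

gamma, alpha = 0.9, 0.5
q = np.zeros((3, 2))                     # small state-action value table
memory = []                              # episodic store of (s, a, r, s') transitions
rng = np.random.default_rng(0)

def learn_online(s, a, r, s_next):
    pe = r + gamma * q[s_next].max() - q[s, a]
    q[s, a] += alpha * pe
    memory.append((s, a, r, s_next))
    return abs(pe)                       # unsigned prediction error (uncertainty)

def offline_replay(unsigned_pe, base=2):
    n_sweeps = base + int(10 * unsigned_pe)          # more surprise -> more replay
    for _ in range(n_sweeps):
        s, a, r, s_next = memory[rng.integers(len(memory))]
        q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])

learn_online(0, 1, 0.0, 1)               # first step: no reward yet
surprise = learn_online(1, 0, 1.0, 2)    # revalued reward downstream -> large |PE|
offline_replay(surprise)                 # replay propagates the change back to step 1
print(q.round(2))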


2016
Vol 224 (4)
pp. 240-246
Author(s):
Mélanie Bédard
Line Laplante
Julien Mercier

Abstract. Dyslexia is a phenomenon whose brain correlates have been studied since the beginning of the 20th century. Simultaneously, the field of education has also been studying dyslexia and its remediation, mainly through behavioral data. The last two decades have seen a growing interest in integrating neuroscience and education. This article provides a brief overview of the pertinent scientific literature on neurophysiological evidence of functional brain differences in dyslexia and discusses its very limited influence on the development of reading remediation for dyslexic individuals. Nevertheless, it appears that if certain conditions are met, related to the key elements of educational neuroscience and to the nature of the research questions, conceivable benefits can be expected from integrating neurophysiological data with educational research. When neurophysiological data can be employed to overcome the limits of using behavioral data alone, researchers can both unravel phenomena otherwise impossible to document and raise new questions.


Author(s):  
Riitta Salmelin
Jan Kujala
Mia Liljeström

When seeking to uncover the brain correlates of language processing, timing and location are of the essence. Magnetoencephalography (MEG) offers them both, with the highest sensitivity to cortical activity. MEG has shown its worth in revealing cortical dynamics of reading, speech perception, and speech production in adults and children, in unimpaired language processing as well as developmental and acquired language disorders. The MEG signals, once recorded, provide an extensive selection of measures for examination of neural processing. Like all other neuroimaging tools, MEG has its own strengths and limitations of which the user should be aware in order to make the best possible use of this powerful method and to generate meaningful and reliable scientific data. This chapter reviews MEG methodology and how MEG has been used to study the cortical dynamics of language.


PLoS Biology
2021
Vol 19 (9)
e3001119
Author(s):
Joan Orpella
Ernest Mas-Herrero
Pablo Ripollés
Josep Marco-Pallarés
Ruth de Diego-Balaguer

Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental to the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have been investigated primarily with offline (post-familiarization) tests, which give limited insight into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online SL of simple syntactic structures, combined with computational modeling, to show that online SL responds to reinforcement learning principles rooted in striatal function. Specifically, we demonstrate, in 2 different cohorts, that a temporal difference model, which relies on prediction errors, accounts for participants' online learning behavior. We then show that the trial-by-trial development of predictions through learning correlates strongly with activity in both the ventral and the dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work therefore bridges the long-standing gap between language learning and reinforcement learning phenomena.
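
As a concrete instance of the model class named here, the following is a generic temporal difference sketch in which predictions develop through trial-wise prediction errors; the three-position frame and the end-of-phrase "reward" (a successful prediction) are invented solely to make the example run (Python):

import numpy as np

alpha, gamma = 0.2, 0.9
v = np.zeros(3)                         # prediction strength at each position in a frame

def td_episode(reward_at_end=1.0):
    for pos in range(3):
        r = reward_at_end if pos == 2 else 0.0
        v_next = v[pos + 1] if pos < 2 else 0.0
        pe = r + gamma * v_next - v[pos]          # temporal difference prediction error
        v[pos] += alpha * pe

for _ in range(50):
    td_episode()
print(v.round(2))                       # earlier positions come to predict the outcome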


2003
Vol 39 (7)
pp. 699-701
Author(s):
Kosuke UMESAKO
Masanao OBAYASHI
Kunikazu KOBAYASHI

2019
Author(s):
A. Wiehler
K. Chakroun
J. Peters

Abstract
Gambling disorder is a behavioral addiction associated with impairments in decision-making and reduced behavioral flexibility. Decision-making in volatile environments requires a flexible trade-off between the exploitation of options with high expected values and the exploration of novel options to adapt to changing reward contingencies. This classical problem is known as the exploration-exploitation dilemma. We hypothesized that gambling disorder would be associated with a specific reduction in directed (uncertainty-based) exploration compared to healthy controls, accompanied by changes in brain activity in a fronto-parietal exploration-related network.

Twenty-three frequent gamblers and nineteen matched controls performed a classical four-armed bandit task during functional magnetic resonance imaging. Computational modeling revealed that choice behavior in both groups contained signatures of directed exploration, random exploration, and perseveration. Gamblers showed a specific reduction in directed exploration, while random exploration and perseveration were similar between groups.

Neuroimaging revealed no evidence for group differences in neural representations of expected value or reward prediction errors. Likewise, our hypothesis of attenuated fronto-parietal exploration effects in gambling disorder was not supported. However, during directed exploration, gamblers showed reduced parietal and substantia nigra / ventral tegmental area activity. Cross-validated classification analyses revealed that connectivity in an exploration-related network was predictive of clinical status, suggesting alterations in network dynamics in gambling disorder.

In sum, we show that reduced flexibility during reinforcement learning in volatile environments in gamblers is attributable to a reduction in directed exploration rather than an increase in perseveration. The neuroimaging findings suggest that patterns of network connectivity might be more diagnostic of gambling disorder than univariate value and prediction error effects. We provide a computational account of flexibility impairments in gamblers during reinforcement learning that might arise as a consequence of dopaminergic dysregulation in this disorder.
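
The three choice-model components named here (directed exploration, random exploration, perseveration) are commonly parameterized as below, for example on top of a Kalman-filter bandit; the exact parameterization fit by the authors may differ, and the numbers are placeholders (Python):

import numpy as np

def choice_probs(mean, sd, prev_choice, phi=1.0, beta=0.3, rho=0.5):
    util = mean + phi * sd              # phi: uncertainty bonus = directed exploration
    util[prev_choice] += rho            # rho: bonus for repeating = perseveration
    w = np.exp(beta * util)             # beta: softmax temperature = random exploration
    return w / w.sum()

mean = np.array([50.0, 48.0, 45.0, 40.0])   # posterior means for the four bandit arms
sd = np.array([2.0, 8.0, 3.0, 10.0])        # posterior uncertainties for the four arms
print(choice_probs(mean, sd, prev_choice=0).round(3))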

