Model-Based Influences on Humans' Choices and Striatal Prediction Errors

Neuron ◽  
2011 ◽  
Vol 69 (6) ◽  
pp. 1204-1215 ◽  
Author(s):  
Nathaniel D. Daw ◽  
Samuel J. Gershman ◽  
Ben Seymour ◽  
Peter Dayan ◽  
Raymond J. Dolan
2020 ◽  
Author(s):  
Dongjae Kim ◽  
Jaeseung Jeong ◽  
Sang Wan Lee

Abstract: The goal of learning is to maximize future rewards by minimizing prediction errors. Evidence has shown that the brain achieves this by combining model-based and model-free learning. However, prediction error minimization is challenged by a bias-variance tradeoff, which imposes constraints on each strategy’s performance. We provide new theoretical insight into how this tradeoff can be resolved through the adaptive control of model-based and model-free learning. The theory predicts that baseline correction for prediction error reduces the lower bound of the bias-variance error by factoring out irreducible noise. Using a Markov decision task with context changes, we showed behavioral evidence of adaptive control. Model-based behavioral analyses show that the prediction error baseline signals context changes to improve adaptability. Critically, the neural results support this view, demonstrating multiplexed representations of the prediction error baseline within the ventrolateral and ventromedial prefrontal cortex, key brain regions known to guide model-based and model-free learning.

One sentence summary: A theoretical, behavioral, computational, and neural account of how the brain resolves the bias-variance tradeoff during reinforcement learning is described.


2002 ◽  
Vol 14 (6) ◽  
pp. 1347-1369 ◽  
Author(s):  
Kenji Doya ◽  
Kazuyuki Samejima ◽  
Ken-ichi Katagiri ◽  
Mitsuo Kawato

We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state prediction model and a reinforcement learning controller. The “responsibility signal,” which is given by the softmax function of the prediction errors, is used to weight the outputs of multiple modules, as well as to gate the learning of the prediction models and the reinforcement learning controllers. We formulate MMRL for both the discrete-time, finite-state case and the continuous-time, continuous-state case. The performance of MMRL was demonstrated for the discrete case in a nonstationary hunting task in a grid world and for the continuous case in a nonlinear, nonstationary control task of swinging up a pendulum with variable physical parameters.
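The responsibility signal described in this abstract can be sketched in a few lines of Python. The abstract says only that it is a softmax of the prediction errors; the Gaussian-style score (negative squared error scaled by a hypothetical width parameter `sigma`) and the function names `responsibility_signal` and `mix_outputs` are assumptions made for illustration, not the authors' implementation.

```python
import math


def responsibility_signal(prediction_errors, sigma=1.0):
    """Softmax over modules: a smaller prediction error gives a larger weight.

    Assumption: each module's score is -(error^2) / (2 * sigma^2).
    """
    scores = [-(e ** 2) / (2.0 * sigma ** 2) for e in prediction_errors]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def mix_outputs(module_outputs, responsibilities):
    """Weight each module's (scalar) output by its responsibility."""
    return sum(w * out for w, out in zip(responsibilities, module_outputs))


if __name__ == "__main__":
    errors = [0.1, 1.0, 2.0]  # hypothetical per-module prediction errors
    weights = responsibility_signal(errors)
    print(weights)
    print(mix_outputs([0.5, -0.2, 1.0], weights))
```

In this sketch the same weights could also scale each module's learning update, which is one way to read the "gating" role mentioned in the abstract.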


Author(s):  
Ramon Fraga Pereira ◽  
Mor Vered ◽  
Felipe Meneguzzi ◽  
Miquel Ramírez

This paper revisits probabilistic, model-based goal recognition to study the implications of using nominal models to estimate the posterior probability distribution over a finite set of hypothetical goals. Existing model-based approaches rely on expert knowledge to produce symbolic descriptions of the dynamic constraints that domain objects are subject to, and these are assumed to produce correct predictions. We abandon this assumption to consider the use of nominal models that are learnt from observations of transitions of systems with unknown dynamics. Leveraging existing work on the acquisition of domain models via learning for Hybrid Planning, we adapt and evaluate existing goal recognition approaches to analyze how prediction errors, inherent to system dynamics identification and model learning techniques, affect recognition error rates.


2014 ◽  
Vol 104 (5) ◽  
pp. 566-575 ◽  
Author(s):  
V. Ortega-López ◽  
M. Amo-Salas ◽  
A. Ortiz-Barredo ◽  
A.M. Díez-Navajas

Abstract: Lobesia botrana is the most significant pest of grape berries in Spain. Further knowledge of its phenology would enable wine growers to decide on an optimal treatment schedule. The aim of this study is, therefore, to predict the flight peaks of L. botrana in seven wine-growing regions of Spain. The main goal is to provide a prediction model based on meteorological data records. A logistic function model, based on temperature and humidity records, together with an exhaustive statistical analysis, was used to compare the wine-growing regions in which the male flight phenology of L. botrana displays similar patterns and to sort them into groups. By doing so, a joint study of the dynamics of the moth is possible in the regions within each group. A comparison of the prediction errors before and after applying the Touzeau model confirmed that the fit of the latter model is not sufficiently accurate for the regions under study. Moth flight predictions with the logistic function model are good, but accuracy may still be improved by evaluating other non-biotic and biotic factors.


2018 ◽  
Author(s):  
E. Kayhan ◽  
L. Heil ◽  
J. Kwisthout ◽  
I. van Rooij ◽  
S. Hunnius ◽  
...  

Abstract: From early on in life, children are able to use information from their environment to form predictions about events. For instance, they can use statistical information about a population to predict the sample drawn from that population and infer an agent’s preferences from systematic violations of random sampling. We investigated how young children build and update models of an agent’s sampling actions over time, and whether a computational model based on the causal Bayesian network formalization of predictive processing can explain this process.

We formalized three hypotheses about how different explanatory variables (i.e., prior probabilities, current observations, and agent characteristics) are used to build predictive models of others’ actions. We measured pupillary responses as a behavioral marker of ‘prediction errors’ (i.e., the perceived mismatch between what one’s model of an agent predicts and what the agent actually does), as described in the predictive processing framework. Pupillary responses of 24-month-olds, but not 18-month-olds, showed that young children integrated information about current observations, priors and agents to generate predictive models of agents and their actions.

These findings shed light on the mechanisms behind toddlers’ inferences about agent-caused events. To our knowledge, this is the first study in which young children’s pupillary responses are used as markers of prediction errors, and explained by a computational model based on the causal Bayesian network formalization of predictive processing. We argue that the predictive processing framework provides a promising explanation of the way in which young children process other persons’ actions.

Highlights:
We present three formalized hypotheses on how young children generate predictive models of others’ sampling actions.
We measured pupillary responses of children as a behavioral marker of prediction errors as described in the predictive processing framework.
Results showed that young children integrated information about current observations, prior probabilities and agents to generate predictive models about others’ actions.
A computational model based on the causal Bayesian network formalization of predictive processing can explain this process.


2017 ◽  
Author(s):  
R. Keiflin ◽  
H.J. Pribut ◽  
N.B. Shah ◽  
P.H. Janak

Abstract: Dopamine (DA) neurons in the ventral tegmental area (VTA) and substantia nigra (SNc) encode reward prediction errors (RPEs) and are proposed to mediate error-driven learning. However, the learning strategy engaged by DA-RPEs remains controversial. Model-free associations imbue cues/actions with pure value, independently of representations of their associated outcome. In contrast, model-based associations support detailed representation of anticipated outcomes. Here we show that although both VTA and SNc DA neuron activation reinforces instrumental responding, only VTA DA neuron activation during consumption of expected sucrose reward restores error-driven learning and promotes formation of a new cue→sucrose association. Critically, expression of VTA DA-dependent Pavlovian associations is abolished following sucrose devaluation, a signature of model-based learning. These findings reveal that activation of VTA- or SNc-DA neurons engages largely dissociable learning processes, with VTA-DA neurons capable of participating in model-based predictive learning, while the role of SNc-DA neurons appears limited to reinforcement of instrumental responses.


2021 ◽  
Vol 17 (2) ◽  
pp. e1008738
Author(s):  
Kentaro Katahira ◽  
Asako Toyama

Computational modeling has been applied for data analysis in psychology, neuroscience, and psychiatry. One of its important uses is to infer the latent variables underlying behavior, against which researchers can evaluate corresponding neural, physiological, or behavioral measures. This feature is especially crucial for computational psychiatry, in which altered computational processes underlying mental disorders are of interest. For instance, several studies employing model-based fMRI (a method for identifying brain regions correlated with latent variables) have shown that patients with mental disorders (e.g., depression) exhibit diminished neural responses to reward prediction errors (RPEs), which are the differences between experienced and predicted rewards. Such model-based analysis has the drawback that the parameter estimates and inference of latent variables are not necessarily correct; rather, they usually contain some errors. A previous study theoretically and empirically showed that the error in model-fitting does not necessarily cause a serious error in model-based fMRI. However, the study did not deal with certain situations relevant to psychiatry, such as group comparisons between patients and healthy controls. We developed a theoretical framework to explore such situations. We demonstrate that parameter misspecification can critically affect the results of group comparison. We demonstrate that even if the RPE response in patients is completely intact, a spurious difference from healthy controls is observable. Such a situation occurs when the ground-truth learning rate differs between groups but a common learning rate is used, as per previous studies. Furthermore, even if the parameters are appropriately fitted to individual participants, spurious group differences in RPE responses are observable when the model lacks a component that differs between groups. These results highlight the importance of appropriate model-fitting and the need for caution when interpreting the results of model-based fMRI.
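To make the learning-rate issue concrete, the sketch below computes an RPE series with a simple delta-rule update and compares the series obtained under two learning rates. The update rule, the function name `rpe_series`, the reward sequence, and the values 0.8 and 0.2 are all assumptions chosen for illustration; the abstract does not specify the model used in the paper.

```python
def rpe_series(rewards, learning_rate, initial_value=0.0):
    """Return the reward prediction error on each trial.

    RPE = experienced reward - predicted reward; the prediction is then
    moved toward the reward by learning_rate * RPE (delta rule, assumed).
    """
    value = initial_value
    rpes = []
    for reward in rewards:
        rpe = reward - value
        rpes.append(rpe)
        value += learning_rate * rpe
    return rpes


if __name__ == "__main__":
    rewards = [1, 1, 0, 1, 1, 0]  # hypothetical outcome sequence
    true_rpes = rpe_series(rewards, learning_rate=0.8)    # assumed ground truth
    common_rpes = rpe_series(rewards, learning_rate=0.2)  # assumed common rate
    for t, (a, b) in enumerate(zip(true_rpes, common_rpes), start=1):
        print(f"trial {t}: true={a:+.3f} common={b:+.3f}")
```

The two series agree on the first trial and diverge afterwards, so a regressor built with the common rate need not track the RPE generated under a participant's own rate.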


2011 ◽  
Vol 17 (4) ◽  
pp. 529-539 ◽  
Author(s):  
Ming-Chiao Lin ◽  
Hui Ping Tserng ◽  
Shih-Ping Ho ◽  
Der-Liang Young

Delays in large building projects remain a common problem. The situation is especially severe for steel reinforced concrete (SRC) building projects, which continue to promote a new structural system in Taiwan's construction industry. The aim of this study is to develop a feasible contract duration model based upon a few SRC building cases. A logical approach is employed to select and assure the “good” regression model identified when project characteristics were known and external uncertainties were reasonably estimated. Various necessary diagnostics were adopted to examine the aptness of the model before inference. Cross-validation is used to validate the appropriateness of the variables selected and the magnitudes of the regression coefficients. The mean of the square prediction errors (MSPR) is selected to measure the predictive ability of the proposed model, and the result shows that the predictive ability of the selected regression model could be adequate. Finally, several cases are taken to test the predictive accuracy of the proposed model, and the result shows that the actual necessary construction duration is considerably close to the duration predicted by the model. It is concluded that the proposed predictive duration model could be applicable to SRC construction projects with reasonable reliability.

Summary: Delays in carrying out large construction projects remain a common problem. The situation is especially serious for reinforced concrete construction projects, which promote a new structural system in Taiwan's construction sector. This study aims to develop a suitable contract duration model based on a few reinforced concrete construction cases. A logical approach was applied to select and ensure a “good” regression model when project characteristics are known and external uncertainties are adequately estimated. Before drawing conclusions, the adequacy of the model was examined using various necessary diagnostics. Cross-validation is used to justify the suitability of the selected variables and the values of the regression coefficients. To assess the predictive ability of the proposed model, the mean of the square prediction errors (MSPR) was calculated. The result shows that the selected regression model predicts fairly well. Finally, several cases are selected to test the accuracy of the proposed model's predictions. The result shows that the actual required construction duration is fairly close to the duration predicted by the model. It is concluded that the proposed duration prediction model can be applied to reinforced concrete construction projects, and its results will be fairly reliable.
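As a small illustration of the MSPR measure named in this abstract, the sketch below averages the squared differences between actual and predicted durations over held-out cases. The function name `mspr` and the example durations are hypothetical; the abstract does not report the paper's data or its exact computation.

```python
def mspr(actual, predicted):
    """Mean of the squared prediction errors over held-out cases."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Hypothetical durations in days for three validation projects.
    actual_days = [620, 710, 540]
    predicted_days = [600, 735, 550]
    print(mspr(actual_days, predicted_days))
```

A common reading is that an MSPR close to the regression's own mean squared error suggests the model predicts new cases about as well as it fits the cases used to build it.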

