Parameter and model recovery of reinforcement learning models for restless bandit problems

2021 ◽  
Author(s):  
Ludwig Danwitz ◽  
David Mathar ◽  
Elke Smith ◽  
Deniz Tuzsus ◽  
Jan Peters

Multi-armed restless bandit tasks are regularly applied in psychology and cognitive neuroscience to assess exploration and exploitation behavior in structured environments. These models are also readily applied to examine effects of (virtual) brain lesions on performance, and to infer neurocomputational mechanisms using neuroimaging or pharmacological approaches. However, to infer individual, psychologically meaningful parameters from such data, computational cognitive modeling is typically applied. Recent studies indicate that softmax (SM) decision rule models that include a representation of environmental dynamics (e.g. the Kalman Filter) and additional parameters for modeling exploration and perseveration (Kalman SMEP) fit human bandit task data better than competing models. Parameter and model recovery are two central requirements for computational models: parameter recovery refers to the ability to recover true data-generating parameters; model recovery refers to the ability to correctly identify the true data generating model using model comparison techniques. Here we comprehensively examined parameter and model recovery of the Kalman SMEP model as well as nested model versions, i.e. models without the additional parameters, using simulation and Bayesian inference. Parameter recovery improved with increasing trial numbers, from around .8 for 100 trials to around .93 for 300 trials. Model recovery analyses likewise confirmed acceptable recovery of the Kalman SMEP model. Model recovery was lower for nested Kalman filter models as well as delta rule models with fixed learning rates. Exploratory analyses examined associations of model parameters with model-free performance metrics. Random exploration, captured by the inverse softmax temperature, was associated with lower accuracy and more switches. 
For the exploration bonus parameter modeling directed exploration, we confirmed an inverse U-shaped association with accuracy, such that both an excess and a lack of directed exploration reduced accuracy. Taken together, these analyses underline that the Kalman SMEP model fulfills basic requirements of a cognitive model.
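As a rough illustration of the kind of model this abstract describes, the sketch below pairs a Kalman-filter update for the chosen arm with a softmax rule that adds an exploration bonus and a perseveration bonus. The abstract does not give the model equations, so the function names, the parameter names (`beta`, `phi`, `rho`, `obs_var`) and the exact form of the bonuses are assumptions. The between-trial diffusion step of a restless bandit is omitted for brevity.

```python
import math

def kalman_update(mean, var, reward, obs_var=16.0):
    """Update the chosen arm's posterior mean and variance after one reward.

    obs_var is an assumed observation-noise variance (hypothetical default).
    """
    gain = var / (var + obs_var)              # Kalman gain in [0, 1]
    new_mean = mean + gain * (reward - mean)  # move toward the observed reward
    new_var = (1.0 - gain) * var              # uncertainty shrinks after observing
    return new_mean, new_var

def smep_choice_probs(means, variances, prev_choice, beta, phi, rho):
    """Softmax over value + exploration bonus + perseveration bonus.

    beta: inverse temperature (random exploration)
    phi:  weight on posterior standard deviation (directed exploration)
    rho:  bonus for repeating the previous choice (perseveration)
    """
    utils = []
    for arm, (m, v) in enumerate(zip(means, variances)):
        bonus = phi * math.sqrt(v)
        stay = rho if arm == prev_choice else 0.0
        utils.append(beta * (m + bonus + stay))
    top = max(utils)                          # subtract the max for numerical stability
    exps = [math.exp(u - top) for u in utils]
    total = sum(exps)
    return [e / total for e in exps]
```

In a parameter recovery analysis, choices would be simulated from such a model with known parameter values, the model refit to the simulated data, and the fitted values compared with the generating ones.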

2019 ◽  
Author(s):  
Harhim Park ◽  
Jaeyeong Yang ◽  
Jasmin Vassileva ◽  
Woo-Young Ahn

The Balloon Analogue Risk Task (BART) is a popular task used to measure risk-taking behavior. To identify cognitive processes associated with choice behavior on the BART, a few computational models have been proposed. However, the extant models are either too simplistic or fail to show good parameter recovery performance. Here, we propose a novel computational model, the exponential-weight mean-variance (EWMV) model, which addresses the limitations of existing models. Using multiple model comparison methods, including post hoc model fit criteria and parameter recovery, we showed that the EWMV model outperforms the existing models. In addition, we applied the EWMV model to BART data from healthy controls and substance-using populations (patients with past opiate and stimulant dependence). The results suggest that (1) the EWMV model addresses the limitations of existing models and (2) heroin-dependent individuals show lower risk preference than other groups in the BART.


2018 ◽  
Author(s):  
Romain Ligneul

The Iowa Gambling Task (IGT) is one of the most common paradigms used to assess decision-making and executive functioning in neurological and psychiatric disorders. Several reinforcement-learning (RL) models were recently proposed to refine the qualitative and quantitative inferences that can be made about these processes based on IGT data. Yet, these models do not account for the complex exploratory patterns which characterize participants’ behavior in the task. Using a dataset of more than 500 subjects, we demonstrate the existence of such patterns and we describe a new computational architecture (Explore-Exploit) disentangling exploitation, random exploration and directed exploration in this large population of participants. The EE architecture provided a better fit to the choice data on multiple metrics. Parameter recovery and simulation analyses confirmed the superiority of the EE scheme over alternative schemes. Furthermore, using the EE model, we were able to replicate the reduction in directed exploration across lifespan, as previously reported in other paradigms. Finally, we provide a user-friendly toolbox enabling researchers to easily fit computational models on the IGT data, hence promoting reanalysis of the numerous datasets acquired in various populations of patients.


2018 ◽  
pp. 690-713
Author(s):  
Kamalanand Krishnamurthy

Parameter estimation is a central issue in mathematical modelling of biomedical systems and for the development of patient specific models. The technique of estimating parameters helps in obtaining diagnostic information from computational models of biological systems. However, in most of the biomedical systems, the estimation of model parameters is a challenging task due to the nonlinearity of mathematical models. In this chapter, the method of estimation of nonlinear model parameters from measurements of state variables, using the extended Kalman filter, is extensively explained using an example of the three-dimensional model of the HIV/AIDS system.


2021 ◽  
Vol 12 ◽  
Author(s):  
Marco Ragni ◽  
Daniel Brand ◽  
Nicolas Riesterer

In the last few decades, cognitive theories for explaining human spatial relational reasoning have increased. Few of these theories have been implemented as computational models, and even fewer have been compared computationally with each other. A computational model comparison requires, among other things, a still missing quantitative benchmark of core spatial relational reasoning problems. By presenting a new evaluation approach, this paper addresses: (1) developing a benchmark including raw data of participants, (2) reimplementation, adaptation, and extension of existing cognitive models to predict individual responses, and (3) a thorough evaluation of the cognitive models on the benchmark data. The paper shifts the research focus of cognitive modeling from reproducing aggregated response patterns toward assessing the predictive power of models for the individual reasoner. It demonstrates that not all psychological effects can discern theories. We discuss implications for modeling spatial relational reasoning.


Author(s):  
Maxim Ziatdinov ◽  
Ayana Ghosh ◽  
Sergei V Kalinin

Both experimental and computational methods for the exploration of structure, functionality, and properties of materials often necessitate the search across broad parameter spaces to discover optimal experimental conditions and regions of interest in the image space or parameter space of computational models. The direct grid search of the parameter space tends to be extremely time-consuming, leading to the development of strategies balancing exploration of unknown parameter spaces and exploitation towards required performance metrics. However, classical Bayesian optimization strategies based on the Gaussian process (GP) do not readily allow for the incorporation of the known physical behaviors or past knowledge. Here we explore a hybrid optimization/exploration algorithm created by augmenting the standard GP with a structured probabilistic model of the expected system’s behavior. This approach balances the flexibility of the non-parametric GP approach with a rigid structure of physical knowledge encoded into the parametric model. The fully Bayesian treatment of the latter allows additional control over the optimization via the selection of priors for the model parameters. The method is demonstrated for a noisy version of the classical objective function used to evaluate optimization algorithms and further extended to physical lattice models. This methodology is expected to be universally suitable for injecting prior knowledge in the form of physical models and past data in the Bayesian optimization framework.


Author(s):  
Sascha Meyen ◽  
Dorothee M. B. Sigg ◽  
Ulrike von Luxburg ◽  
Volker H. Franz

Background: It has repeatedly been reported that, when making decisions under uncertainty, groups outperform individuals. Real groups are often replaced by simulated groups: Instead of performing an actual group discussion, individual responses are aggregated by a numerical computation. While studies have typically used unweighted majority voting (MV) for this aggregation, the theoretically optimal method is confidence weighted majority voting (CWMV), provided that independent and accurate confidence ratings from the individual group members are available. To determine which simulations (MV vs. CWMV) reflect real group processes better, we applied formal cognitive modeling and compared simulated group responses to real group responses. Results: Simulated group decisions based on CWMV matched the accuracy of real group decisions, while simulated group decisions based on MV showed lower accuracy. CWMV predicted the confidence that groups put into their group decisions well. However, real groups treated individual votes to some extent more equally weighted than suggested by CWMV. Additionally, real groups tend to put lower confidence into their decisions compared to CWMV simulations. Conclusion: Our results highlight the importance of taking individual confidences into account when simulating group decisions: We found that real groups can aggregate individual confidences in a way that matches statistical aggregations given by CWMV to some extent. This implies that research using simulated group decisions should use CWMV instead of MV as a benchmark to compare real groups to.
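The difference between the two aggregation rules can be sketched in a few lines. The abstract does not state the weighting formula, so the log-odds weights below are an assumption (a common choice for independent voters), and the function names are hypothetical. Votes are coded as +1 or -1, and each confidence is read as the voter's probability of being correct, strictly between 0 and 1.

```python
import math

def majority_vote(votes):
    """Unweighted majority voting (MV). Returns +1, -1, or 0 on a tie."""
    s = sum(votes)
    return (s > 0) - (s < 0)

def confidence_weighted_vote(votes, confidences):
    """Confidence weighted majority voting (CWMV), assuming log-odds weights.

    Each vote is weighted by log(c / (1 - c)), so a voter at c = 0.5
    contributes nothing and highly confident voters dominate.
    """
    s = sum(v * math.log(c / (1.0 - c)) for v, c in zip(votes, confidences))
    return (s > 0) - (s < 0)

# One confident voter against two unsure voters:
votes = [+1, -1, -1]
confidences = [0.95, 0.60, 0.60]
mv_decision = majority_vote(votes)                             # follows the head count
cwmv_decision = confidence_weighted_vote(votes, confidences)   # follows the confident voter
```

In this toy case the two rules disagree: MV sides with the two unsure voters, while CWMV sides with the single confident one.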


2021 ◽  
Vol 11 (7) ◽  
pp. 2898
Author(s):  
Humberto C. Godinez ◽  
Esteban Rougier

Simulation of fracture initiation, propagation, and arrest is a problem of interest for many applications in the scientific community. There are a number of numerical methods used for this purpose, and among the most widely accepted is the combined finite-discrete element method (FDEM). To model fracture with FDEM, material behavior is described by specifying a combination of elastic properties, strengths (in the normal and tangential directions), and energy dissipated in failure modes I and II, which are modeled by incorporating a parameterized softening curve defining a post-peak stress-displacement relationship unique to each material. In this work, we implement a data assimilation method to estimate key model parameter values with the objective of improving the calibration processes for FDEM fracture simulations. Specifically, we implement the ensemble Kalman filter assimilation method in the Hybrid Optimization Software Suite (HOSS), an FDEM-based code which was developed for the simulation of fracture and fragmentation behavior. We present a set of assimilation experiments to match the numerical results obtained for a Split Hopkinson Pressure Bar (SHPB) model with experimental observations for granite. We achieved this by calibrating a subset of model parameters. The results show a steady convergence of the assimilated parameter values towards observed time/stress curves from the SHPB observations. In particular, both tensile and shear strengths seem to be converging faster than the other parameters considered.


Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1054
Author(s):  
Kuo Yang ◽  
Yugui Tang ◽  
Zhen Zhang

With the development of new energy vehicle technology, battery management systems used to monitor the state of the battery have been widely researched. The accuracy of the battery status assessment depends to a great extent on the accuracy of the battery model parameters. This paper proposes an improved method for parameter identification and state-of-charge (SOC) estimation for lithium-ion batteries. Using a second-order equivalent circuit model, the battery model is divided into two parts based on fast dynamics and slow dynamics. The recursive least squares method is used to identify parameters of the battery, and then the SOC and the open-circuit voltage of the model are estimated with the extended Kalman filter. The two-module voltages are calculated using the estimated open-circuit voltage and initial parameters, and model parameters are constantly updated during iteration. The proposed method can estimate the parameters and the SOC in real time without requiring the SOC or the open-circuit voltage to be known in advance. The method is tested using data from dynamic stress tests: the root mean squared error of the prediction model is about 0.01 V, and the average SOC estimation error is 0.0139. Results indicate that the method has higher accuracy in offline parameter identification and online state estimation than traditional recursive least squares methods.
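For readers unfamiliar with recursive least squares, the following is a minimal generic sketch of one update step for a model that is linear in its parameters. It is not the paper's battery model: the function name, the forgetting factor `lam`, and the toy regression in the usage lines are all assumptions made for illustration.

```python
def rls_step(theta, P, x, y, lam=1.0):
    """One recursive least squares update for the model y ~ theta . x.

    theta: current parameter estimate (list of floats)
    P:     current inverse-correlation matrix (list of lists, symmetric)
    x:     regressor vector for this sample
    y:     measured output for this sample
    lam:   forgetting factor in (0, 1]; 1.0 means no forgetting
    """
    n = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [p / denom for p in Px]                     # update gain vector
    err = y - sum(theta[i] * x[i] for i in range(n))   # prediction error
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * x^T P) / lam, using symmetry of P so that x^T P = (P x)^T
    P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P

# Toy usage: recover slope 2.0 and offset 0.5 from noise-free samples.
theta = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]   # large initial P encodes a weak prior
for k in range(10):
    theta, P = rls_step(theta, P, [float(k), 1.0], 2.0 * k + 0.5)
```

Because each step uses only the newest sample and the previous estimate, the same update can run online as measurements arrive, which is what makes the method attractive for real-time identification.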


Signals ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 434-455
Author(s):  
Sujan Kumar Roy ◽  
Kuldip K. Paliwal

Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in Kalman filter (KF) gain and degrade speech enhancement performance. The existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in real-life noise conditions. First, we estimate noise from each noisy speech frame using a speech presence probability (SPP) method to compute the noise variance. Then, we construct a whitening filter (with its coefficients computed from the estimated noise) to pre-whiten each noisy speech frame prior to computing the speech LPC parameters. We then construct the KF with the estimated parameters, where the robustness metric offsets the bias in KF gain during speech absence of noisy speech to that of the sensitivity metric during speech presence to achieve better noise reduction. The noise variance and the speech model parameters are adopted as a speech activity detector. The reduced-biased Kalman gain enables the KF to minimize the noise effect significantly, yielding the enhanced speech. Objective and subjective scores on the NOIZEUS corpus demonstrate that the enhanced speech produced by the proposed method exhibits higher quality and intelligibility than some benchmark methods.

