Bayesian optimization of generalized data

2018 · Vol 4 · pp. 30 · Author(s): Goran Arbanas, Jinghua Feng, Zia J. Clifton, Andrew M. Holcomb, Marco T. Pigni, ...

Direct application of Bayes' theorem to generalized data yields a posterior probability distribution function (PDF) that is a product of a prior PDF of generalized data and a likelihood function, where generalized data consists of model parameters, measured data, and model defect data. The prior PDF of generalized data is defined by prior expectation values and a prior covariance matrix of generalized data that naturally includes covariance between any two components of generalized data. A set of constraints imposed on the posterior expectation values and covariances of generalized data via a given model is formally solved by the method of Lagrange multipliers. Posterior expectation values of the constraints and their covariance matrix are conventionally set to zero, leading to a likelihood function that is a Dirac delta function of the constraining equation. It is shown that setting constraints to values other than zero is analogous to introducing a model defect. Since posterior expectation values of any function of generalized data are integrals of that function over all generalized data weighted by the posterior PDF, all elements of generalized data may be viewed as nuisance parameters marginalized by this integration. One simple form of posterior PDF is obtained when the prior PDF and the likelihood function are normal PDFs. For linear models without a defect, this PDF becomes equivalent to the constrained least squares (CLS) method, that is, the χ2 minimization method.
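For the linear, defect-free case described above, the posterior expectation values and covariances take the familiar generalized-least-squares form. The sketch below is illustrative only: the variable names are ours, and the prior covariance between parameters and measured data is assumed to vanish.

    import numpy as np

    # Illustrative CLS / chi-square update for a linear model d = T p with no
    # model defect. p0, d0 are prior expectation values; Cp, Cd are the prior
    # covariance blocks of the generalized data (cross-covariance assumed zero).
    def cls_update(p0, d0, Cp, Cd, T):
        r = d0 - T @ p0                      # prior residual of the constraint
        S = T @ Cp @ T.T + Cd                # covariance of that residual
        K = Cp @ T.T @ np.linalg.inv(S)      # gain matrix
        p_post = p0 + K @ r                  # posterior expectation of parameters
        Cp_post = Cp - K @ T @ Cp            # posterior parameter covariance
        return p_post, Cp_post

    # toy usage: two parameters, three measurements
    rng = np.random.default_rng(0)
    T = rng.normal(size=(3, 2))
    d0 = T @ np.array([1.0, -0.5]) + 0.01 * rng.normal(size=3)
    p_post, Cp_post = cls_update(np.zeros(2), d0, np.eye(2), 1e-4 * np.eye(3), T)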

2011 · Vol 23 (11) · pp. 2833-2867 · Author(s): Yi Dong, Stefan Mihalas, Alexander Russell, Ralph Etienne-Cummings, Ernst Niebur

When a neuronal spike train is observed, what can we deduce from it about the properties of the neuron that generated it? A natural way to answer this question is to make an assumption about the type of neuron, select an appropriate model for this type, and then choose the model parameters as those that are most likely to generate the observed spike train. This is the maximum likelihood method. If the neuron obeys simple integrate-and-fire dynamics, Paninski, Pillow, and Simoncelli (2004) showed that its negative log-likelihood function is convex and that, at least in principle, its unique global minimum can thus be found by gradient descent techniques. Many biological neurons are, however, known to generate a richer repertoire of spiking behaviors than can be explained in a simple integrate-and-fire model. For instance, such a model retains only an implicit (through spike-induced currents), not an explicit, memory of its input; an example of a physiological situation that cannot be explained is the absence of firing if the input current is increased very slowly. Therefore, we use an expanded model (Mihalas & Niebur, 2009), which is capable of generating a large number of complex firing patterns while still being linear. Linearity is important because it maintains the distribution of the random variables and still allows maximum likelihood methods to be used. In this study, we show that although convexity of the negative log-likelihood function is not guaranteed for this model, the minimum of this function yields a good estimate for the model parameters, in particular if the noise level is treated as a free parameter. Furthermore, we show that a nonlinear function minimization method (r-algorithm with space dilation) usually reaches the global minimum.
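As a rough illustration of the general recipe (not the authors' neuron model or their r-algorithm implementation), the sketch below fits a toy spike-train statistic by minimizing a negative log-likelihood in which the noise level is a free parameter; the data, model, and optimizer are placeholders.

    import numpy as np
    from scipy.optimize import minimize

    # Toy example: inter-spike intervals assumed log-normal; the log-mean and the
    # noise level (log-sigma) are estimated jointly by maximum likelihood.
    rng = np.random.default_rng(1)
    isi = rng.lognormal(mean=np.log(0.05), sigma=0.2, size=200)  # synthetic intervals (s)

    def neg_log_likelihood(theta, isi):
        log_mu, log_sigma = theta            # log-parameterization keeps sigma > 0
        sigma = np.exp(log_sigma)
        z = (np.log(isi) - log_mu) / sigma
        return np.sum(np.log(sigma) + np.log(isi) + 0.5 * z**2)  # up to a constant

    res = minimize(neg_log_likelihood, x0=[np.log(0.1), 0.0], args=(isi,),
                   method="Nelder-Mead")     # generic stand-in for the r-algorithm
    print(res.x)                             # estimated log-mean and log-noise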


2017 · Vol 14 (18) · pp. 4295-4314 · Author(s): Dan Lu, Daniel Ricciuto, Anthony Walker, Cosmin Safta, William Munger

Abstract. Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. Calibration with DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions, while AM identifies only one mode. The application suggests that DREAM is well suited to calibrating complex terrestrial ecosystem models, where the number of uncertain parameters is usually large and the existence of local optima is always a concern. In addition, residual analysis is used to justify the assumptions of the error model used in the Bayesian calibration. The results indicate that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the likelihood function constructed from it can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
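A minimal sketch of such an error model (illustrative only, not the DALEC/DREAM code) is a heteroscedastic, lag-1 autocorrelated Gaussian likelihood of the NEE residuals; the standard deviation grows with the predicted magnitude and phi is the residual autocorrelation.

    import numpy as np

    def log_likelihood(obs, sim, sigma0, sigma1, phi):
        # heteroscedastic standard deviation and standardized residuals
        sigma = sigma0 + sigma1 * np.abs(sim)
        eta = (obs - sim) / sigma
        # exact Gaussian AR(1) log-likelihood of the standardized residuals
        s2 = 1.0 - phi**2
        ll = -0.5 * (np.log(2 * np.pi) + eta[0]**2)
        innov = eta[1:] - phi * eta[:-1]
        ll += -0.5 * np.sum(np.log(2 * np.pi * s2) + innov**2 / s2)
        ll -= np.sum(np.log(sigma))          # Jacobian of the standardization
        return ll

    # toy usage with a placeholder daily NEE series
    t = np.arange(365)
    sim = np.sin(2 * np.pi * t / 365)
    obs = sim + 0.1 * np.random.default_rng(2).normal(size=t.size)
    print(log_likelihood(obs, sim, sigma0=0.05, sigma1=0.1, phi=0.5))

In an MCMC calibration, sigma0, sigma1, and phi would be sampled alongside the ecosystem-model parameters.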


2017 · Vol 146 · pp. 13010 · Author(s): Juan Pablo Scotta, Gilles Noguere, José Ignacio Marquez Damian

2016 · Author(s): Kassian Kobert, Alexandros Stamatakis, Tomáš Flouri

The phylogenetic likelihood function is the major computational bottleneck in several applications of evolutionary biology such as phylogenetic inference, species delimitation, model selection, and divergence time estimation. Given the alignment, a tree, and the evolutionary model parameters, the likelihood function computes the conditional likelihood vectors for every node of the tree. Vector entries for which all input data are identical result in redundant likelihood operations which, in turn, yield identical conditional values. Such operations can be omitted to improve run-time and, using appropriate data structures, to reduce memory usage. We present a fast, novel method for identifying and omitting such redundant operations in phylogenetic likelihood calculations, and assess the performance improvement and memory savings attained by our method. Using empirical and simulated data sets, we show that a prototype implementation of our method yields up to 10-fold speedups and uses up to 78% less memory than one of the fastest and most highly tuned implementations of the phylogenetic likelihood function currently available. Our method is generic and can seamlessly be integrated into any phylogenetic likelihood implementation.
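The core idea can be sketched as follows (illustrative only, not the authors' implementation): at each internal node, sites whose left and right child columns carry identical identifiers must yield identical conditional likelihood vectors, so each unique pair is computed once and reused.

    # compute_clv(l, r) stands in for one conditional-likelihood-vector computation
    def node_clv(left_ids, right_ids, compute_clv):
        cache = {}      # maps a (left, right) identifier pair to a vector index
        site_ids = []   # identifier of this node's column for every site
        clvs = []       # unique conditional likelihood vectors, computed once each
        for l, r in zip(left_ids, right_ids):
            key = (l, r)
            if key not in cache:
                cache[key] = len(clvs)
                clvs.append(compute_clv(l, r))
            site_ids.append(cache[key])
        return site_ids, clvs

    # toy usage: 5 sites, only 4 distinct (left, right) patterns
    ids, unique = node_clv([0, 1, 0, 0, 2], [3, 3, 3, 1, 3], lambda l, r: (l, r))
    # ids == [0, 1, 0, 2, 3]; one redundant computation is skipped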


2021 · Author(s): Kevin J. Wischnewski, Simon B. Eickhoff, Viktor K. Jirsa, Oleksandr V. Popovych

Abstract Simulating the resting-state brain dynamics via mathematical whole-brain models requires an optimal selection of parameters, which determine the model’s capability to replicate empirical data. Since the parameter optimization via a grid search (GS) becomes unfeasible for high-dimensional models, we evaluate several alternative approaches to maximize the correspondence between simulated and empirical functional connectivity. A dense GS serves as a benchmark to assess the performance of four optimization schemes: Nelder-Mead Algorithm (NMA), Particle Swarm Optimization (PSO), Covariance Matrix Adaptation Evolution Strategy (CMAES) and Bayesian Optimization (BO). To compare them, we employ an ensemble of coupled phase oscillators built upon individual empirical structural connectivity of 105 healthy subjects. We determine optimal model parameters from two- and three-dimensional parameter spaces and show that the overall fitting quality of the tested methods can compete with the GS. There are, however, marked differences in the required computational resources and stability properties, which we also investigate before proposing CMAES and BO as efficient alternatives to a high-dimensional GS. For the three-dimensional case, these methods generated similar results as the GS, but within less than 6% of the computation time. Our results contribute to an efficient validation of models for personalized simulations of brain dynamics.
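A stripped-down version of such a fitting loop (our own illustrative sketch with placeholder connectivity data, here using the Nelder-Mead scheme as the optimizer) maximizes the Pearson correlation between simulated and empirical functional connectivity of a coupled phase-oscillator model.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)
    N = 10
    SC = np.abs(rng.normal(size=(N, N)))            # placeholder structural connectivity
    SC = (SC + SC.T) / 2
    np.fill_diagonal(SC, 0)
    FC_emp = np.corrcoef(rng.normal(size=(N, 200))) # placeholder empirical FC

    def simulate_fc(coupling, noise, steps=1000, dt=0.01, seed=0):
        local = np.random.default_rng(seed)         # fixed seed: deterministic objective
        theta = local.uniform(0, 2 * np.pi, N)
        omega = local.normal(1.0, 0.1, N)           # natural frequencies
        series = np.empty((steps, N))
        for t in range(steps):
            diff = np.sin(theta[None, :] - theta[:, None])   # sin(theta_j - theta_i)
            theta = theta + dt * (omega + coupling * (SC * diff).sum(axis=1)) \
                    + np.sqrt(dt) * noise * local.normal(size=N)
            series[t] = np.sin(theta)
        return np.corrcoef(series.T)                # simulated functional connectivity

    def negative_fit(params):
        fc_sim = simulate_fc(*np.abs(params))
        iu = np.triu_indices(N, 1)
        return -np.corrcoef(fc_sim[iu], FC_emp[iu])[0, 1]    # minus Pearson correlation

    res = minimize(negative_fit, x0=[0.5, 0.1], method="Nelder-Mead")

CMAES or BO would replace the call to minimize; the goodness-of-fit function itself stays the same.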


Genes · 2019 · Vol 10 (8) · pp. 608 · Author(s): Yan Li, Junyi Li, Naizheng Bian

Identifying associations between lncRNAs and diseases can help understand disease-related lncRNAs and facilitate disease diagnosis and treatment. The dual-network integrated logistic matrix factorization (DNILMF) model has been used for drug–target interaction prediction, where good results have been achieved. We first applied DNILMF to lncRNA–disease association prediction (DNILMF-LDA). We combined different similarity kernel matrices of lncRNAs and diseases by nonlinear fusion to extract the most important information in the fused matrices. Then, lncRNA–disease association networks and similarity networks were built simultaneously. Finally, the Gaussian process mutual information (GP-MI) algorithm of Bayesian optimization was adopted to optimize the model parameters. The 10-fold cross-validation results showed that the area under the receiver operating characteristic (ROC) curve (AUC) of DNILMF-LDA was 0.9202, and the area under the precision–recall (PR) curve (AUPR) was 0.5610. Compared with LRLSLDA, SIMCLDA, BiwalkLDA, and TPGLDA, the AUC value of our method increased by 38.81%, 13.07%, 8.35%, and 6.75%, respectively, and the AUPR value increased by 52.66%, 40.05%, 37.01%, and 44.25%, respectively. These results indicate that DNILMF-LDA is an effective method for predicting the associations between lncRNAs and diseases.
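The final prediction step of a logistic matrix factorization can be sketched as follows (illustrative only; the latent factors and labels are random placeholders rather than trained DNILMF-LDA outputs): the association probability of lncRNA i and disease j is the logistic function of the inner product of their latent vectors, and AUC/AUPR summarize the resulting ranking.

    import numpy as np
    from sklearn.metrics import roc_auc_score, average_precision_score

    rng = np.random.default_rng(4)
    n_lnc, n_dis, k = 50, 30, 8
    U = rng.normal(scale=0.1, size=(n_lnc, k))   # lncRNA latent factors (placeholder)
    V = rng.normal(scale=0.1, size=(n_dis, k))   # disease latent factors (placeholder)

    def predict(U, V):
        # association probabilities: logistic function of latent inner products
        return 1.0 / (1.0 + np.exp(-(U @ V.T)))

    P = predict(U, V)                            # n_lnc x n_dis score matrix
    labels = rng.integers(0, 2, size=P.size)     # placeholder known associations
    print(roc_auc_score(labels, P.ravel()),      # AUC of the ranking
          average_precision_score(labels, P.ravel()))  # AUPR of the ranking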


2017 · Vol 6 (3) · pp. 75 · Author(s): Tiago V. F. Santana, Edwin M. M. Ortega, Gauss M. Cordeiro, Adriano K. Suzuki

A new regression model based on the exponentiated Weibull distribution and the structure of the generalized linear model, called the generalized exponentiated Weibull linear model (GEWLM), is proposed. The GEWLM is composed of three important structural parts: the random component, characterized by the distribution of the response variable; the systematic component, which includes the explanatory variables in the model by means of a linear structure; and a link function, which connects the systematic and random parts of the model. Explicit expressions for the logarithm of the likelihood function, the score vector, and the observed and expected information matrices are presented. The method of maximum likelihood and a Bayesian procedure are adopted for estimating the model parameters. To detect influential observations in the new model, we use diagnostic measures based on local influence and Bayesian case influence diagnostics. We also show that the estimates of the GEWLM are robust to the presence of outliers in the data. Additionally, to check whether the model supports its assumptions, to detect atypical observations, and to verify the goodness-of-fit of the regression model, we define residuals based on the quantile function and perform a Monte Carlo simulation study to construct confidence bands from the generated envelopes. We apply the new model to a dataset from the insurance area.
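For the maximum-likelihood part, a minimal sketch (our own parameterization and link choice, not necessarily the authors') writes the exponentiated Weibull log-density and links the scale parameter to a covariate through a log link.

    import numpy as np
    from scipy.optimize import minimize

    def ew_logpdf(y, alpha, gamma, sigma):
        # exponentiated Weibull log-density with shape parameters alpha, gamma and
        # scale sigma: f(y) = a*g/s * (y/s)^(g-1) * exp(-z) * (1 - exp(-z))^(a-1)
        z = (y / sigma) ** gamma
        return (np.log(alpha) + np.log(gamma) - np.log(sigma)
                + (gamma - 1) * np.log(y / sigma) - z
                + (alpha - 1) * np.log1p(-np.exp(-z)))

    def neg_log_lik(theta, y, x):
        la, lg, b0, b1 = theta                   # log-shapes keep alpha, gamma > 0
        sigma = np.exp(b0 + b1 * x)              # log link for the scale parameter
        return -np.sum(ew_logpdf(y, np.exp(la), np.exp(lg), sigma))

    # toy data: Weibull responses (the alpha = 1 special case) with a covariate effect
    rng = np.random.default_rng(5)
    x = rng.uniform(size=300)
    y = rng.weibull(1.5, size=300) * np.exp(0.2 + 0.5 * x)
    fit = minimize(neg_log_lik, x0=[0.0, 0.0, 0.0, 0.0], args=(y, x), method="Nelder-Mead")
    print(fit.x)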


Author(s): Byeng D. Youn, Byung C. Jung, Zhimin Xi, Sang Bum Kim

As the role of predictive models has increased, the fidelity of computational results has been of great concern to engineering decision makers. Often our limited understanding of complex systems leads to building inappropriate predictive models. To address a growing concern about the fidelity of predictive models, this paper proposes a hierarchical model validation procedure with two validation activities: (1) validation planning (top-down) and (2) validation execution (bottom-up). In validation planning, engineers define either the physics-of-failure (PoF) mechanisms or the system performances of interest. The engineering system is then decomposed into subsystems or components whose computer models are partially valid in terms of the PoF mechanisms or system performances of interest. Validation planning identifies vital tests and predictive models along with both known and unknown model parameters. Validation execution takes a bottom-up approach, improving the fidelity of the computer model at any hierarchical level using a statistical calibration technique. This technique compares the observed test results with the predicted results from the computer model, using a likelihood function as the comparison metric. In the statistical calibration, an optimization technique is employed to maximize the likelihood function while determining the unknown model parameters. As the predictive model at a lower hierarchy level becomes valid, the valid model is fused into the model at the next higher hierarchy level, and validation execution continues at that level. A cellular phone is used to demonstrate the hierarchical validation of predictive models presented in this paper.
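A toy version of the statistical calibration step (illustrative only; the model, the test data, and the Gaussian noise assumption are placeholders) determines one unknown model parameter by maximizing the likelihood of the observed test results.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def model(x, theta):
        return theta * x**2                      # stand-in computer model

    x_test = np.linspace(0.1, 1.0, 10)           # test conditions
    y_obs = 2.0 * x_test**2 + 0.05 * np.random.default_rng(6).normal(size=x_test.size)

    def neg_log_likelihood(theta, noise_std=0.05):
        # Gaussian likelihood comparing observed tests with model predictions
        resid = y_obs - model(x_test, theta)
        return 0.5 * np.sum((resid / noise_std) ** 2) + resid.size * np.log(noise_std)

    theta_hat = minimize_scalar(neg_log_likelihood, bounds=(0.0, 10.0),
                                method="bounded").x   # calibrated parameter, near 2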

