Emulation-accelerated Hamiltonian Monte Carlo algorithms for parameter estimation and uncertainty quantification in differential equation models

2021 ◽  
Vol 32 (1) ◽  
Author(s):  
L. Mihaela Paun ◽  
Dirk Husmeier

Abstract
We propose to accelerate Hamiltonian and Lagrangian Monte Carlo algorithms by coupling them with Gaussian processes for emulation of the log unnormalised posterior distribution. We provide proofs of detailed balance with respect to the exact posterior distribution for these algorithms, and validate the correctness of the samplers’ implementation by Geweke consistency tests. We implement these algorithms in a delayed acceptance (DA) framework, and investigate whether the DA scheme can offer computational gains over the standard algorithms. A comparative evaluation study is carried out to assess the performance of the methods on a series of models described by differential equations, including a real-world application of a 1D fluid-dynamics model of the pulmonary blood circulation. The aim is to identify the algorithm which gives the best trade-off between accuracy and computational efficiency, to be used in nonlinear DE models, which are computationally onerous due to repeated numerical integrations in a Bayesian analysis. Results showed no advantage of the DA scheme over the standard algorithms with respect to several efficiency measures based on the effective sample size for most methods and DE models considered. These gradient-driven algorithms register a high acceptance rate, so the number of expensive forward model evaluations is not significantly reduced by the first emulator-based stage of DA. Additionally, the Lagrangian Dynamical Monte Carlo and Riemann Manifold Hamiltonian Monte Carlo tended to register the highest efficiency (in terms of effective sample size normalised by the number of forward model evaluations), followed by the Hamiltonian Monte Carlo, and the No U-turn sampler tended to be the least efficient.
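The delayed-acceptance idea evaluated above can be sketched compactly: a cheap emulator of the log posterior screens each proposal before any expensive exact evaluation, and a second-stage correction preserves the exact target. This is a minimal random-walk illustration under a symmetric proposal (the paper couples DA with Hamiltonian and Lagrangian dynamics and a Gaussian-process emulator; here `log_post_emul` stands in for any cheap surrogate, and all names are placeholders):

```python
import numpy as np

def da_metropolis_step(x, log_post_exact, log_post_emul, proposal_std, rng):
    """One delayed-acceptance step with a symmetric random-walk proposal:
    screen with the emulated log posterior, then correct with the exact one
    so that detailed balance holds with respect to the exact posterior."""
    y = x + proposal_std * rng.standard_normal(x.shape)
    # Stage 1: cheap screening with the emulator; a rejection here costs no
    # expensive forward-model (e.g. ODE) evaluation.
    d_emul = log_post_emul(y) - log_post_emul(x)
    if rng.random() >= np.exp(min(0.0, d_emul)):
        return x, False
    # Stage 2 (Christen-Fox correction): the exact-posterior ratio divided
    # by the emulator ratio already "spent" in stage 1.
    d_exact = log_post_exact(y) - log_post_exact(x)
    if rng.random() < np.exp(min(0.0, d_exact - d_emul)):
        return y, True
    return x, False
```

When the emulator equals the exact log posterior, the stage-2 ratio is one and the scheme reduces to plain Metropolis-Hastings, which is a convenient sanity check.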

Algorithms ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 351
Author(s):  
Wilson Tsakane Mongwe ◽  
Rendani Mbuvha ◽  
Tshilidzi Marwala

Markov chain Monte Carlo (MCMC) techniques are usually used to infer model parameters when closed-form inference is not feasible, with one of the simplest MCMC methods being the random walk Metropolis–Hastings (MH) algorithm. The MH algorithm suffers from random walk behaviour, which results in inefficient exploration of the target posterior distribution. This method has been improved upon, with algorithms such as Metropolis Adjusted Langevin Monte Carlo (MALA) and Hamiltonian Monte Carlo being examples of popular modifications to MH. In this work, we revisit the MH algorithm to reduce the autocorrelations in the generated samples without adding significant computational time. We present: (1) the Stochastic Volatility Metropolis–Hastings (SVMH) algorithm, which is based on using a random scaling matrix in the MH algorithm, and (2) the Locally Scaled Metropolis–Hastings (LSMH) algorithm, in which the scaling matrix depends on the local geometry of the target distribution. For both these algorithms, the proposal distribution is still Gaussian centred at the current state. The empirical results show that these minor additions to the MH algorithm significantly improve the effective sample rates and predictive performance over the vanilla MH method. The SVMH algorithm produces similar effective sample sizes to the LSMH method, with SVMH outperforming LSMH on an execution time normalised effective sample size basis. The performance of the proposed methods is also compared to MALA and the current state-of-the-art method, the No-U-Turn sampler (NUTS). The analysis is performed using a simulation study based on Neal’s funnel and multivariate Gaussian distributions and using real-world data modeled using jump diffusion processes and Bayesian logistic regression. Although both MALA and NUTS outperform the proposed algorithms on an effective sample size basis, the SVMH algorithm has similar or better predictive performance when compared to MALA and NUTS across the various targets. In addition, the SVMH algorithm outperforms the other MCMC algorithms on a normalised effective sample size basis on the jump diffusion processes datasets. These results indicate the overall usefulness of the proposed algorithms.
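A minimal sketch of the SVMH idea: a Gaussian proposal centred at the current state whose diagonal scaling is redrawn at random each iteration. The lognormal scale law below is an illustrative assumption, not necessarily the paper's exact choice; because the scales are drawn independently of the state, the marginal proposal remains symmetric and the plain Metropolis ratio applies:

```python
import numpy as np

def svmh_step(x, log_post, rng, scale_sigma=1.0):
    """One Stochastic-Volatility-MH style step (sketch): Gaussian proposal
    centred at the current state with a randomly drawn diagonal scaling."""
    # Fresh random scales each iteration (lognormal law is illustrative).
    scales = rng.lognormal(mean=0.0, sigma=scale_sigma, size=x.shape)
    y = x + scales * rng.standard_normal(x.shape)
    # Scales are independent of the state, so the marginal proposal is
    # symmetric in (x, y) and the standard Metropolis ratio suffices.
    if np.log(rng.random()) < log_post(y) - log_post(x):
        return y, True
    return x, False
```

The appeal is that the random scaling occasionally produces large, cheap jumps without the per-step cost of gradient or curvature evaluations.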


Geophysics ◽  
2020 ◽  
Vol 85 (3) ◽  
pp. R177-R194 ◽  
Author(s):  
Mattia Aleardi ◽  
Alessandro Salusti

A reliable assessment of the posterior uncertainties is a crucial aspect of any amplitude versus angle (AVA) inversion due to the severe ill-conditioning of this inverse problem. To accomplish this task, numerical Markov chain Monte Carlo algorithms are usually used when the forward operator is nonlinear. The downside of these algorithms is the considerable number of samples needed to attain stable posterior estimations, especially in high-dimensional spaces. To overcome this issue, we assessed the suitability of the Hamiltonian Monte Carlo (HMC) algorithm for nonlinear target- and interval-oriented AVA inversions for the estimation of elastic properties and associated uncertainties from prestack seismic data. The target-oriented approach inverts the AVA responses of the target reflection by adopting the nonlinear Zoeppritz equations, whereas the interval-oriented method inverts the seismic amplitudes along a time interval using a 1D convolutional forward model still based on the Zoeppritz equations. HMC uses an artificial Hamiltonian system in which a model is viewed as a particle moving along a trajectory in an extended space. In this context, the inclusion of the derivative information of the misfit function makes possible long-distance moves with a high probability of acceptance from the current position toward a new independent model. In our application, we adopt a simple Gaussian a priori distribution that allows for an analytical inclusion of geostatistical constraints into the inversion framework, and we also develop a strategy that replaces the numerical computation of the Jacobian with a matrix operator analytically derived from a linearization of the Zoeppritz equations. Synthetic and field data inversions demonstrate that HMC is a very promising approach for Bayesian AVA inversion that guarantees an efficient sampling of the model space and retrieves reliable estimations and accurate uncertainty quantifications with an affordable computational cost.
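The HMC mechanics the abstract relies on (Hamiltonian dynamics on an extended state with a Metropolis correction) can be sketched generically. This is textbook leapfrog HMC, not the AVA-specific implementation with the Zoeppritz-based forward operator; all names are placeholders:

```python
import numpy as np

def hmc_step(x, log_post, grad_log_post, eps, n_leap, rng):
    """One generic HMC step: simulate Hamiltonian dynamics with a leapfrog
    integrator, then accept/reject on the change in total energy."""
    p = rng.standard_normal(x.shape)            # auxiliary momentum
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * eps * grad_log_post(x_new)   # initial momentum half step
    for _ in range(n_leap):
        x_new += eps * p_new                    # full position step
        p_new += eps * grad_log_post(x_new)     # full momentum step
    p_new -= 0.5 * eps * grad_log_post(x_new)   # undo the surplus half step
    # Metropolis correction on the Hamiltonian H = -log_post + kinetic energy
    log_accept = (log_post(x_new) - 0.5 * p_new @ p_new) \
               - (log_post(x) - 0.5 * p @ p)
    if np.log(rng.random()) < log_accept:
        return x_new, True
    return x, False
```

The gradient term is exactly where the paper's analytical Jacobian substitution pays off: each leapfrog step needs the derivative of the misfit, so a cheap analytical operator replaces repeated numerical differentiation.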


2018 ◽  
Author(s):  
R.L. Harms ◽  
A. Roebroeck

Abstract
In diffusion MRI analysis, advances in biophysical multi-compartment modeling have gained popularity over the conventional Diffusion Tensor Imaging (DTI), because they possess greater specificity in relating the dMRI signal to underlying cellular microstructure. Biophysical multi-compartment models require parameter estimation, typically performed using either Maximum Likelihood Estimation (MLE) or Markov chain Monte Carlo (MCMC) sampling. Whereas MLE provides only a point estimate of the fitted model parameters, MCMC recovers the entire posterior distribution of the model parameters given the data, providing additional information such as parameter uncertainty and correlations. MCMC sampling is currently not routinely applied in dMRI microstructure modeling because it requires adjustments and tuning specific to each model, particularly in the choice of proposal distributions, burn-in length, thinning and the number of samples to store. In addition, sampling often takes at least an order of magnitude more time than non-linear optimization. Here we investigate the performance of MCMC algorithm variations over multiple popular diffusion microstructure models to see whether a single well performing variation could be applied efficiently and robustly to many models. Using an efficient GPU-based implementation, we show that run times can be removed as a prohibitive constraint for sampling of diffusion multi-compartment models. Using this implementation, we investigated the effectiveness of different adaptive MCMC algorithms, burn-in, initialization and thinning. Finally, we apply the theory of Effective Sample Size to diffusion multi-compartment models as a way of determining a relatively general target for the number of samples needed to characterize parameter distributions for different models and datasets. We conclude that robust and fast sampling is achieved in most diffusion microstructure models with the Adaptive Metropolis-Within-Gibbs (AMWG) algorithm initialized with an MLE point estimate, in which case 100 to 200 samples are sufficient as a burn-in and thinning is mostly unnecessary. As a relatively general target for the number of samples, we recommend a multivariate Effective Sample Size of 2200.
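The recommended multivariate Effective Sample Size can be computed with the batch-means estimator of Vats, Flegal and Jones: mESS = n * (det(Λ)/det(Σ))^(1/p), where Λ is the sample covariance and Σ the batch-means estimate of the Monte Carlo covariance. A compact sketch follows (assumed to match the estimator the authors use, which the abstract does not spell out):

```python
import numpy as np

def multivariate_ess(chain):
    """Multivariate effective sample size of an (n, p) MCMC chain via the
    batch-means estimator: mESS = n * (det(Lambda) / det(Sigma))**(1/p)."""
    n, p = chain.shape
    b = int(np.floor(np.sqrt(n)))                # batch size ~ sqrt(n)
    a = n // b                                   # number of batches
    batch_means = chain[:a * b].reshape(a, b, p).mean(axis=1)
    lam = np.cov(chain, rowvar=False)            # sample covariance Lambda
    sig = b * np.cov(batch_means, rowvar=False)  # batch-means estimate Sigma
    return n * (np.linalg.det(lam) / np.linalg.det(sig)) ** (1.0 / p)
```

For a chain of essentially independent draws the estimate is close to n; positive autocorrelation inflates Σ relative to Λ and shrinks the estimate accordingly.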


2020 ◽  
Vol 5 ◽  
pp. 53
Author(s):  
Guy Baele ◽  
Mandev S. Gill ◽  
Philippe Lemey ◽  
Marc A. Suchard

Nonparametric coalescent-based models are often employed to infer past population dynamics over time. Several of these models, such as the skyride and skygrid models, are equipped with a block-updating Markov chain Monte Carlo sampling scheme to efficiently estimate model parameters. The advent of powerful computational hardware along with the use of high-performance libraries for statistical phylogenetics has, however, made the development of alternative estimation methods feasible. We here present the implementation and performance assessment of a Hamiltonian Monte Carlo gradient-based sampler to infer the parameters of the skygrid model. The skygrid is a popular and flexible coalescent-based model for estimating population dynamics over time and is available in BEAST 1.10.5, a widely-used software package for Bayesian phylogenetic and phylodynamic analysis. Taking into account the increased computational cost of gradient evaluation, we report substantial increases in effective sample size per time unit compared to the established block-updating sampler. We expect gradient-based samplers to assume an increasingly important role for different classes of parameters typically estimated in Bayesian phylogenetic and phylodynamic analyses.


2016 ◽  
Vol 28 (2) ◽  
pp. 382-444 ◽  
Author(s):  
Motonobu Kanagawa ◽  
Yu Nishiyama ◽  
Arthur Gretton ◽  
Kenji Fukumizu

This letter addresses the problem of filtering with a state-space model. Standard approaches for filtering assume that a probabilistic model for observations (i.e., the observation model) is given explicitly or at least parametrically. We consider a setting where this assumption is not satisfied; we assume that the knowledge of the observation model is provided only by examples of state-observation pairs. This setting is important and appears when state variables are defined as quantities that are very different from the observations. We propose the kernel Monte Carlo filter, a novel filtering method that is focused on this setting. Our approach is based on the framework of kernel mean embeddings, which enables nonparametric posterior inference using the state-observation examples. The proposed method represents state distributions as weighted samples, propagates these samples by sampling, estimates the state posteriors by kernel Bayes’ rule, and resamples by kernel herding. In particular, the sampling and resampling procedures are novel in being expressed using kernel mean embeddings, so we theoretically analyze their behaviors. We reveal the following properties, which are similar to those of corresponding procedures in particle methods: the performance of sampling can degrade if the effective sample size of a weighted sample is small, and resampling improves the sampling performance by increasing the effective sample size. We first demonstrate these theoretical findings by synthetic experiments. Then we show the effectiveness of the proposed filter by artificial and real data experiments, which include vision-based mobile robot localization.
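The kernel herding resampling step can be sketched as a greedy pursuit of the weighted kernel mean embedding: each new point is chosen so the running empirical kernel mean best matches the weighted one. This minimal version uses a Gaussian kernel; the bandwidth, the kernel choice, and all names are illustrative, not the letter's exact formulation:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_herding_resample(samples, weights, m, sigma=1.0):
    """Greedily pick m points (with possible repeats) whose empirical kernel
    mean tracks the weighted kernel mean embedding of (samples, weights)."""
    K = gaussian_kernel(samples, samples, sigma)
    mean_embed = K @ weights          # mu(x_i) = sum_j w_j k(x_i, x_j)
    running = np.zeros(len(samples))  # sum of k(x_i, chosen points)
    chosen = []
    for t in range(1, m + 1):
        # Herding objective: mu(x) - (1/t) * sum over already chosen points
        scores = mean_embed - running / t
        idx = int(np.argmax(scores))
        chosen.append(idx)
        running += K[:, idx]
    return samples[chosen]
```

Unlike multinomial resampling, the greedy correction term discourages piling up duplicates, which is one intuition for why herding-based resampling can raise the effective sample size.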


Geophysics ◽  
2019 ◽  
Vol 84 (5) ◽  
pp. M1-M13 ◽  
Author(s):  
Leandro Passos de Figueiredo ◽  
Dario Grana ◽  
Mauro Roisenberg ◽  
Bruno B. Rodrigues

One of the main objectives in reservoir characterization is to estimate the rock properties based on seismic measurements. We have developed a stochastic sampling method for the joint prediction of facies and petrophysical properties, assuming a nonparametric mixture prior distribution and a nonlinear forward model. The proposed methodology is based on a Markov chain Monte Carlo (MCMC) method specifically designed for multimodal distributions for nonlinear problems. The vector of model parameters includes the facies sequence along the seismic trace as well as the continuous petrophysical properties, such as porosity, mineral fractions, and fluid saturations. At each location, the distribution of petrophysical properties is assumed to be multimodal and nonparametric with as many modes as the number of facies; therefore, along the seismic trace, the distribution is multimodal with the number of modes being equal to the number of facies raised to the power of the number of samples. Because of the nonlinear forward model, the large number of modes, and, as a consequence, the large dimension of the model space, the analytical computation of the full posterior distribution is not feasible. We then numerically evaluate the posterior distribution by using an MCMC method in which we iteratively sample the facies, by moving from one mode to another, and the petrophysical properties, by sampling within the same mode. The method is extended to multiple seismic traces by applying a first-order Markov chain that accounts for the lateral continuity of the model properties. We first validate the method using a synthetic 2D reservoir model and then we apply the method to a real data set acquired in a carbonate field.
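The alternation the abstract describes (between-mode moves over the discrete facies, within-mode moves over the continuous properties) can be illustrated with a toy one-dimensional sampler. The uniform facies prior, the per-facies Gaussian property prior, and all names are assumptions for illustration only:

```python
import numpy as np

def facies_property_step(f, p, log_lik, means, std, rng):
    """One sketch iteration of a two-move sampler over a discrete facies
    label f and a continuous property p (toy version of the abstract's
    between-mode / within-mode alternation)."""
    # Between-mode move: independence proposal from the joint prior
    # (uniform facies, Gaussian property per facies), so the Metropolis
    # ratio reduces to the likelihood ratio.
    f_new = int(rng.integers(len(means)))
    p_new = means[f_new] + std * rng.standard_normal()
    if np.log(rng.random()) < log_lik(f_new, p_new) - log_lik(f, p):
        f, p = f_new, p_new
    # Within-mode move: symmetric random walk on the property; the target is
    # prior * likelihood, so the Gaussian prior ratio enters the acceptance.
    p_rw = p + 0.5 * rng.standard_normal()
    log_prior = -0.5 * ((p_rw - means[f]) ** 2
                        - (p - means[f]) ** 2) / std ** 2
    if np.log(rng.random()) < log_prior + log_lik(f, p_rw) - log_lik(f, p):
        p = p_rw
    return f, p
```

The between-mode move is what lets the chain hop across the facies-indexed modes, while the within-mode move refines the continuous properties locally, mirroring the two-stage sampling the abstract describes.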


1988 ◽  
Vol 102 ◽  
pp. 79-81
Author(s):  
A. Goldberg ◽  
S.D. Bloom

Abstract
Closed expressions for the first, second, and (in some cases) the third moment of atomic transition arrays now exist. Recently a method has been developed for getting to very high moments (up to the 12th and beyond) in cases where a “collective” state-vector (i.e. a state-vector containing the entire electric dipole strength) can be created from each eigenstate in the parent configuration. Both of these approaches give exact results. Herein we describe a statistical (or Monte Carlo) approach which requires only one representative state-vector |RV> for the entire parent manifold to get estimates of transition moments of high order. The representation is achieved through the random amplitudes associated with each basis vector making up |RV>. This also gives rise to the dispersion characterizing the method, which has been applied to a system (in the M shell) with ≈250,000 lines where we have calculated up to the 5th moment. It turns out that the dispersion in the moments decreases with the size of the manifold, making its application to very big systems statistically advantageous. A discussion of the method and these dispersion characteristics will be presented.
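The statistical principle at work here (random amplitudes in a single representative vector yield unbiased estimates of high-order moments) is the same one behind Hutchinson-style trace estimation, which a short sketch makes concrete. This is an illustration of the idea in generic linear-algebra terms, not the authors' atomic-physics implementation:

```python
import numpy as np

def moment_estimate(A, k, rng, n_vec=1):
    """Estimate the k-th moment Tr(A**k) / N using random-amplitude vectors:
    E[v.T @ A**k @ v] = Tr(A**k) when the amplitudes v_i are iid with zero
    mean and unit variance (here random +/-1)."""
    N = A.shape[0]
    total = 0.0
    for _ in range(n_vec):
        v = rng.choice([-1.0, 1.0], size=N)  # random amplitudes
        w = v.copy()
        for _ in range(k):                   # apply A k times: w = A^k v
            w = A @ w
        total += v @ w
    return total / (n_vec * N)
```

The dispersion of the estimate shrinks as the manifold (matrix dimension) grows, which echoes the abstract's observation that the method becomes statistically advantageous for very big systems.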

