A Distributed Computer System for Parallel Markov Chain Monte Carlo (MCMC)

Author(s):  
Michael Hynes

A ubiquitous problem in physics is to determine expectation values of observables associated with a system. This problem is typically formulated as an integration of some likelihood over a multidimensional parameter space. In Bayesian analysis, numerical Markov chain Monte Carlo (MCMC) algorithms are employed to solve such integrals using a fixed number of samples in the Markov chain. In general, MCMC algorithms are computationally expensive for large datasets and have difficulty sampling from multimodal parameter spaces. An MCMC implementation that is robust and inexpensive for researchers is therefore desirable. Distributed computing systems have shown the potential to act as virtual supercomputers, as in the SETI@home project, in which millions of private computers participate. We propose that a clustered peer-to-peer (P2P) computer network is an ideal structure on which to run Markovian state exchange algorithms such as Parallel Tempering (PT). PT overcomes the difficulty of sampling from multimodal distributions by running multiple chains in parallel with different target distributions and exchanging their states in a Markovian manner. To demonstrate the feasibility of peer-to-peer Parallel Tempering (P2P PT), a simple two-dimensional dataset consisting of two Gaussian signals separated by a region of low probability was used in a Bayesian parameter-fitting algorithm. A small connected peer-to-peer network was constructed using separate processes on a Linux kernel, and P2P PT was applied to the dataset. The sampling results were compared with those obtained by sampling the parameter space with a single chain. The single chain was unable to sample both modes effectively, whereas P2P PT explored the target distribution well, visiting both modes approximately equally. Future work will involve scaling to many dimensions and large networks, and establishing convergence conditions when the computing capabilities of network members are highly heterogeneous.
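
For readers unfamiliar with the exchange mechanism, the following is a minimal single-process sketch of standard Parallel Tempering on a bimodal two-dimensional target like the one described above. It is not the authors' P2P implementation; the target, the inverse-temperature ladder, and all tuning constants are illustrative assumptions.

```python
# Minimal sketch of Parallel Tempering on a two-Gaussian 2-D target.
# All names and tuning constants are illustrative, not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    """Mixture of two well-separated 2-D Gaussians (unnormalized)."""
    m1, m2 = np.array([-3.0, -3.0]), np.array([3.0, 3.0])
    l1 = -0.5 * np.sum((x - m1) ** 2)
    l2 = -0.5 * np.sum((x - m2) ** 2)
    return np.logaddexp(l1, l2)

betas = np.array([1.0, 0.5, 0.25, 0.1])      # inverse-temperature ladder
chains = [rng.normal(size=2) for _ in betas]
n_iter, swap_every, step = 20000, 10, 1.0
samples = []

for t in range(n_iter):
    # Metropolis update within each tempered chain.
    for k, beta in enumerate(betas):
        prop = chains[k] + step * rng.normal(size=2)
        if np.log(rng.uniform()) < beta * (log_target(prop) - log_target(chains[k])):
            chains[k] = prop
    # Markovian state exchange between a random adjacent pair.
    if t % swap_every == 0:
        k = rng.integers(len(betas) - 1)
        log_alpha = (betas[k] - betas[k + 1]) * (
            log_target(chains[k + 1]) - log_target(chains[k]))
        if np.log(rng.uniform()) < log_alpha:
            chains[k], chains[k + 1] = chains[k + 1], chains[k]
    samples.append(chains[0].copy())         # keep only the beta = 1 chain
```

The swap move is what lets states from the flatter, high-temperature chains migrate down to the beta = 1 chain, carrying it across the low-probability region between the two modes.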

2016, Vol. 9 (9), pp. 3213-3229
Author(s):  
Mark F. Lunt ◽  
Matt Rigby ◽  
Anita L. Ganesan ◽  
Alistair J. Manning

Abstract. Atmospheric trace gas inversions often attempt to attribute fluxes to a high-dimensional grid using observations. To make this problem computationally feasible, and to reduce the degree of under-determination, some form of dimension reduction is usually performed. Here, we present an objective method for reducing the spatial dimension of the parameter space in atmospheric trace gas inversions. In addition to solving for a set of unknowns that govern emissions of a trace gas, we set out a framework that considers the number of unknowns to itself be an unknown. We rely on the well-established reversible-jump Markov chain Monte Carlo algorithm to use the data to determine the dimension of the parameter space. This framework provides a single-step process that solves for both the resolution of the inversion grid and the magnitude of fluxes from this grid. Therefore, the uncertainty that surrounds the choice of aggregation is accounted for in the posterior parameter distribution. The posterior distribution of this transdimensional Markov chain provides a naturally smoothed solution, formed from an ensemble of coarser partitions of the spatial domain. We describe the form of the reversible-jump algorithm and how it may be applied to trace gas inversions. We build the system into a hierarchical Bayesian framework in which other unknown factors, such as the magnitude of the model uncertainty, can also be explored. A pseudo-data example shows the usefulness of this approach compared to a subjectively chosen partitioning of a spatial domain. An inversion using real data is also shown, to illustrate the scales at which the data allow methane emissions over north-west Europe to be resolved.
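
The dimension-changing moves in such a framework rely on Green's reversible-jump acceptance probability. For reference, its textbook form (rather than the paper's specific parameterization) for a proposed jump from model k with parameters theta to model k' with parameters theta', using auxiliary variables u ~ g (and u' ~ g' for the reverse move) to match dimensions, and with j(k' | k) the probability of proposing that jump, is:

```latex
\alpha = \min\left\{1,\;
  \frac{p(y \mid \theta', k')\, p(\theta' \mid k')\, p(k')\, j(k \mid k')\, g'(u')}
       {p(y \mid \theta,  k)\,  p(\theta  \mid k)\,  p(k)\,  j(k' \mid k)\, g(u)}
  \left|\frac{\partial(\theta', u')}{\partial(\theta, u)}\right|
\right\}
```

The Jacobian term is what keeps detailed balance intact when the move maps between parameter spaces of different dimension.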


2007, Vol. 16 (2), pp. 153-178
Author(s):  
Jeff Gill

Increasingly, political science researchers are turning to Markov chain Monte Carlo methods to solve inferential problems with complex models and problematic data. This is an enormously powerful set of tools based on replacing difficult or impossible analytical work with simulated empirical draws from the distributions of interest. Although practitioners are generally aware of the importance of convergence of the Markov chain, many are not fully aware of the difficulties in assessing convergence across multiple dimensions. In most applied circumstances, every parameter dimension must have converged for the others to converge. The usual culprit is slow mixing of the Markov chain and therefore slow convergence towards the target distribution. This work demonstrates the partial convergence problem for the two dominant algorithms and illustrates these issues with empirical examples.
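
As one concrete example of a per-dimension check, here is a minimal sketch of the Gelman-Rubin potential scale reduction factor (R-hat), computed separately for each parameter dimension; the function name and array conventions are our own, not from the article.

```python
# Minimal sketch of the Gelman-Rubin R-hat, one value per dimension.
import numpy as np

def rhat(chains):
    """chains: array of shape (m, n, d) = (num chains, draws, dimensions)."""
    m, n, d = chains.shape
    chain_means = chains.mean(axis=1)                 # (m, d)
    grand_mean = chain_means.mean(axis=0)             # (d,)
    B = n / (m - 1) * np.sum((chain_means - grand_mean) ** 2, axis=0)
    W = chains.var(axis=1, ddof=1).mean(axis=0)       # mean within-chain variance
    var_plus = (n - 1) / n * W + B / n                # pooled variance estimate
    return np.sqrt(var_plus / W)                      # one R-hat per dimension

# Example: values near 1 suggest (but never guarantee) convergence.
draws = np.random.default_rng(1).normal(size=(4, 1000, 3))
print(rhat(draws))
```

The point of the article applies directly here: with many dimensions, every entry of the returned vector must look converged, not just a few.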


2015, Vol. 52 (3), pp. 811-825
Author(s):  
Yves Atchadé ◽  
Yizao Wang

In this paper we study the mixing time of certain adaptive Markov chain Monte Carlo (MCMC) algorithms. Under some regularity conditions, we show that the convergence rate of importance resampling MCMC algorithms, measured in terms of the total variation distance, is O(n^{-1}). By means of an example, we establish that, in general, this algorithm does not converge at a faster rate. We also study the interacting tempering algorithm, a simplified version of the equi-energy sampler, and establish that its mixing time is of order O(n^{-1/2}).
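
To make the total variation metric concrete, here is a generic sketch of how the TV distance between two sampled distributions can be estimated by binning; it illustrates the metric only and is not the paper's analysis.

```python
# Rough histogram-based estimator of total variation distance from samples.
import numpy as np

def tv_distance(x, y, bins=50, lo=-6.0, hi=6.0):
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(x, bins=edges)
    q, _ = np.histogram(y, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()   # TV = (1/2) * L1 distance

rng = np.random.default_rng(2)
# Two sample sets from the same distribution: estimated TV is near 0.
print(tv_distance(rng.normal(size=10000), rng.normal(size=10000)))
```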


2004, Vol. 29 (4), pp. 461-488
Author(s):  
Sandip Sinharay

There is an increasing use of Markov chain Monte Carlo (MCMC) algorithms for fitting statistical models in psychometrics, especially in situations where traditional estimation techniques are very difficult to apply. One of the disadvantages of using an MCMC algorithm is that it is not straightforward to determine whether the algorithm has converged. Using the output of an MCMC algorithm that has not converged may lead to incorrect inferences about the problem at hand. The convergence is not to a single point but of the distribution of a sequence of generated values to another distribution, which makes it difficult to assess; there is no diagnostic tool guaranteed to determine the convergence of an MCMC algorithm in general. This article examines the convergence of MCMC algorithms using a number of convergence diagnostics for two real data examples from psychometrics. Findings from this research have the potential to be useful to researchers using the algorithms. For both examples, the number of iterations the diagnostics suggest is required to be reasonably confident that the MCMC algorithm has converged may be larger than what many practitioners consider to be safe.
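
As an illustration of the kind of diagnostic at issue, here is a simplified sketch of the Geweke z-score, which compares the mean of an early segment of a chain with that of a late segment. A full implementation would use spectral-density estimates of the asymptotic variances rather than the iid formulas below; the segment fractions and names are our own.

```python
# Simplified Geweke-style z-score: early-segment mean vs late-segment mean.
import numpy as np

def geweke_z(chain, first=0.1, last=0.5):
    n = len(chain)
    a = chain[: int(first * n)]
    b = chain[int((1 - last) * n):]
    # Crude iid-variance version; the real diagnostic estimates the
    # asymptotic variances from the spectral density at frequency zero.
    return (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / len(a)
                                           + b.var(ddof=1) / len(b))

rng = np.random.default_rng(3)
print(geweke_z(rng.normal(size=5000)))   # |z| >~ 2 suggests non-convergence
```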


2019, Vol. 51 (3), pp. 802-834
Author(s):  
Nicholas G. Tawn ◽  
Gareth O. Roberts

Abstract. It is well known that traditional Markov chain Monte Carlo (MCMC) methods can fail to effectively explore the state space for multimodal problems. Parallel tempering is a well-established population approach for such target distributions involving a collection of particles indexed by temperature. However, this method can suffer dramatically from the curse of dimensionality. In this paper we introduce an improvement on parallel tempering called QuanTA. A comprehensive theoretical analysis quantifying the improved efficiency and scalability of the approach is given. Under weak regularity conditions, QuanTA gives accelerated mixing through the temperature space. Empirical evidence of the effectiveness of this new algorithm is illustrated on canonical examples.
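
For context, the standard parallel tempering swap between chains at inverse temperatures beta_i and beta_j with states x_i and x_j is accepted with the probability below; QuanTA augments this move with a temperature-dependent transformation of the states, which is not reproduced here.

```latex
\alpha_{\mathrm{swap}} = \min\left\{1,\;
  \frac{\pi(x_j)^{\beta_i}\,\pi(x_i)^{\beta_j}}
       {\pi(x_i)^{\beta_i}\,\pi(x_j)^{\beta_j}}
\right\}
```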


2016, Vol. 53 (2), pp. 410-420
Author(s):  
Gareth O. Roberts ◽  
Jeffrey S. Rosenthal

Abstract. We connect known results about diffusion limits of Markov chain Monte Carlo (MCMC) algorithms to the computer science notion of algorithm complexity. Our main result states that any weak limit of a Markov process implies a corresponding complexity bound (in an appropriate metric). We then combine this result with previously known MCMC diffusion limit results to prove that, under appropriate assumptions, the random-walk Metropolis algorithm in d dimensions takes O(d) iterations to converge to stationarity, while the Metropolis-adjusted Langevin algorithm takes O(d^{1/3}) iterations to converge to stationarity.
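
A minimal sketch of the two samplers being compared, on a standard d-dimensional Gaussian target, using the classic optimal-scaling step sizes (proposal variance proportional to d^{-1} for random-walk Metropolis, step size proportional to d^{-1/3} for MALA). The constants and names are illustrative assumptions; the O(d) and O(d^{1/3}) statements above bound how the number of iterations to stationarity grows with d under these scalings.

```python
# Random-walk Metropolis vs MALA on a standard d-dimensional Gaussian.
import numpy as np

rng = np.random.default_rng(4)
d = 50
log_pi = lambda x: -0.5 * x @ x          # standard Gaussian target (unnormalized)
grad_log_pi = lambda x: -x

def rwm_step(x, h):
    # Symmetric Gaussian proposal with per-coordinate variance h.
    prop = x + np.sqrt(h) * rng.normal(size=d)
    return prop if np.log(rng.uniform()) < log_pi(prop) - log_pi(x) else x

def mala_step(x, h):
    # Langevin proposal plus Metropolis-Hastings correction.
    prop = x + 0.5 * h * grad_log_pi(x) + np.sqrt(h) * rng.normal(size=d)
    def log_q(a, b):  # log density of proposing a from current state b
        diff = a - b - 0.5 * h * grad_log_pi(b)
        return -diff @ diff / (2 * h)
    log_alpha = log_pi(prop) - log_pi(x) + log_q(x, prop) - log_q(prop, x)
    return prop if np.log(rng.uniform()) < log_alpha else x

x_rwm, x_mala = np.zeros(d), np.zeros(d)
for _ in range(1000):
    x_rwm = rwm_step(x_rwm, h=2.38 ** 2 / d)      # RWM: O(d) iterations regime
    x_mala = mala_step(x_mala, h=d ** (-1 / 3))   # MALA: O(d^{1/3}) regime
```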

