In search of lost mixing time: adaptive Markov chain Monte Carlo schemes for Bayesian variable selection with very large p

Biometrika
2020
Author(s):
J E Griffin
K G Łatuszyński
M F J Steel

Summary: The availability of datasets with large numbers of variables is rapidly increasing. The effective application of Bayesian variable selection methods for regression with these datasets has proved difficult, since available Markov chain Monte Carlo methods do not perform well at typical problem sizes of interest. We propose new adaptive Markov chain Monte Carlo algorithms to address this shortcoming. The adaptive design of these algorithms exploits the observation that in large-$p$, small-$n$ settings, the majority of the $p$ variables will be approximately uncorrelated a posteriori. The algorithms adaptively build suitable nonlocal proposals that result in moves with squared jumping distance significantly larger than that of standard methods. Their performance is studied empirically in high-dimensional problems, and speed-ups of up to four orders of magnitude are observed.
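The adaptive idea in the abstract lends itself to a short sketch: keep running estimates of the posterior inclusion probabilities and use them to set per-variable add/delete proposal rates, so that a single Metropolis-Hastings move can flip many coordinates at once. The following is a minimal illustration of that flavour of scheme, not the authors' algorithm; `log_post`, the tuning constant `zeta`, and the adaptation rule are assumptions made for the example.

```python
import numpy as np

def adaptive_vs_sampler(log_post, p, n_iter, zeta=0.1, eps=1e-3, rng=None):
    """Adaptive add/delete Metropolis-Hastings over inclusion vectors gamma.

    log_post(gamma) is assumed to return the log marginal posterior of the
    model indexed by the boolean vector gamma (available in closed form
    under conjugate priors).  Add/delete rates are adapted towards running
    estimates of the posterior inclusion probabilities.
    """
    rng = np.random.default_rng(rng)
    gamma = np.zeros(p, dtype=bool)          # start from the null model
    lp = log_post(gamma)
    pip = np.full(p, 0.5)                    # inclusion-probability estimates
    samples = np.zeros((n_iter, p), dtype=bool)

    for t in range(1, n_iter + 1):
        pi = np.clip(pip, eps, 1.0 - eps)
        A = np.minimum(1.0, zeta * pi / (1.0 - pi))    # P(add j | j excluded)
        D = np.minimum(1.0, zeta * (1.0 - pi) / pi)    # P(drop j | j included)

        fwd = np.where(gamma, D, A)          # flip probability from current state
        flips = rng.random(p) < fwd
        prop = gamma ^ flips
        rev = np.where(prop, D, A)           # flip probability from proposed state

        # Only flipped coordinates enter the proposal ratio: where
        # gamma_j == prop_j the stay-put factors coincide and cancel.
        log_q = np.log(rev[flips]).sum() - np.log(fwd[flips]).sum()
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp + log_q:
            gamma, lp = prop, lp_prop

        samples[t - 1] = gamma
        # Diminishing adaptation: the step size decays so that the
        # transition kernels settle down and ergodicity is preserved.
        pip += t ** -0.7 * (gamma.astype(float) - pip)

    return samples, pip
```

Because whole blocks of variables can be added or dropped in one move, the expected squared jumping distance can be far larger than that of one-variable-at-a-time samplers, which is the effect the abstract reports.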

2015
Vol 52 (3)
pp. 811-825
Author(s):
Yves Atchadé
Yizao Wang

In this paper we study the mixing time of certain adaptive Markov chain Monte Carlo (MCMC) algorithms. Under some regularity conditions, we show that the convergence rate of importance resampling MCMC algorithms, measured in terms of the total variation distance, is $O(n^{-1})$. By means of an example, we establish that, in general, this algorithm does not converge at a faster rate. We also study the interacting tempering algorithm, a simplified version of the equi-energy sampler, and establish that its mixing time is of order $O(n^{-1/2})$.
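For concreteness, the quoted rates bound the total variation distance between the law of the chain after $n$ steps and its target $\pi$; in standard notation (ours, not necessarily the paper's):

```latex
\[
  \bigl\lVert \mathcal{L}(X_n) - \pi \bigr\rVert_{\mathrm{TV}}
  \;=\; \sup_{A}\, \bigl| \Pr(X_n \in A) - \pi(A) \bigr|
  \;=\; O(n^{-\alpha}),
\]
```

with $\alpha = 1$ for importance resampling MCMC and $\alpha = 1/2$ for interacting tempering.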


2007
Vol 44 (2)
pp. 458-475
Author(s):
Gareth O. Roberts
Jeffrey S. Rosenthal

We consider basic ergodicity properties of adaptive Markov chain Monte Carlo algorithms under minimal assumptions, using coupling constructions. We prove convergence in distribution and a weak law of large numbers. We also give counterexamples to demonstrate that the assumptions we make are not redundant.
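A concrete instance of the kind of chain this theory covers is the classical adaptive random-walk Metropolis sampler, which learns its proposal covariance from the chain's own history with a decaying gain (diminishing adaptation). The sketch below is illustrative only and is not a construction from the paper.

```python
import numpy as np

def adaptive_metropolis(log_target, x0, n_iter, rng=None):
    """Random-walk Metropolis with an adaptively estimated proposal covariance.

    The O(1/t) gain on the mean/covariance updates makes the adaptation
    diminish over time, the kind of condition under which adaptive chains
    can be shown to remain ergodic.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    lp = log_target(x)
    mean, cov = x.copy(), np.eye(d)
    scale = 2.38 ** 2 / d                    # classical optimal-scaling factor
    samples = np.empty((n_iter, d))

    for t in range(1, n_iter + 1):
        prop = rng.multivariate_normal(x, scale * cov + 1e-6 * np.eye(d))
        lp_prop = log_target(prop)
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[t - 1] = x

        # Recursive mean/covariance estimates with decaying gain.
        g = 1.0 / (t + 1)
        delta = x - mean
        mean += g * delta
        cov += g * (np.outer(delta, delta) - cov)

    return samples
```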


Biometrika
2020
Vol 107 (4)
pp. 997-1004
Author(s):
Qifan Song
Yan Sun
Mao Ye
Faming Liang

Summary: Stochastic gradient Markov chain Monte Carlo algorithms have received much attention in Bayesian computing for big data problems, but they are applicable only to a small class of problems for which the parameter space has a fixed dimension and the log-posterior density is differentiable with respect to the parameters. This paper proposes an extended stochastic gradient Markov chain Monte Carlo algorithm which, by introducing appropriate latent variables, can be applied to more general large-scale Bayesian computing problems, such as those involving dimension jumping and missing data. Numerical studies show that the proposed algorithm is highly scalable and much more efficient than traditional Markov chain Monte Carlo algorithms.
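For orientation, the fixed-dimension, differentiable setting that the abstract starts from is handled by plain stochastic gradient Langevin dynamics; a minimal sketch follows. The name `grad_log_post`, the minibatching scheme, and the step-size schedule are assumptions for the example; the paper's extended algorithm with latent variables is not reproduced here.

```python
import numpy as np

def sgld(grad_log_post, theta0, data, n_iter, batch_size=32,
         step0=1e-3, decay=0.55, rng=None):
    """Stochastic gradient Langevin dynamics on a differentiable log posterior.

    grad_log_post(theta, batch, n_total) should return an unbiased
    minibatch estimate of the gradient of the log posterior; data is
    assumed to be a NumPy array of observations.
    """
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, dtype=float).copy()
    n = len(data)
    samples = np.empty((n_iter, theta.size))

    for t in range(1, n_iter + 1):
        step = step0 * t ** -decay                 # decaying step size
        batch = data[rng.choice(n, size=batch_size)]
        g = grad_log_post(theta, batch, n)
        # Half-gradient step plus injected Gaussian noise of variance `step`.
        theta += 0.5 * step * g + rng.normal(0.0, np.sqrt(step), size=theta.shape)
        samples[t - 1] = theta

    return samples
```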

