Another look at the treatment of data uncertainty in Markov chain Monte Carlo inversion and other probabilistic methods

Author(s):  
Frederik Tilmann ◽  
Hamzeh Sadeghisorkhani ◽  
Alexandra Mauerberger
2020 ◽  
Vol 222 (1) ◽  
pp. 388-405

SUMMARY In probabilistic Bayesian inversions, data uncertainty is a crucial parameter for quantifying the uncertainties and correlations of the resulting model parameters or, in transdimensional approaches, even the complexity of the model. However, in many geophysical inference problems it is poorly known. Therefore, it is common practice to treat the data uncertainty itself as a parameter to be determined. Although in principle any uncertainty distribution can be assumed, the usual choice is a Gaussian distribution whose standard deviation is the unknown parameter to be estimated. In this special case, the paper demonstrates that a simple analytical integration is sufficient to marginalise out this uncertainty parameter, reducing the complexity of the model space without compromising the accuracy of the posterior model probability distribution. However, it is well known that the distribution of geophysical measurement errors, although superficially similar to a Gaussian distribution, typically contains more frequent samples in the tails of the distribution, so-called outliers. In linearized inversions these are often removed in subsequent iterations based on some threshold criterion, but in Markov chain Monte Carlo (McMC) inversions this approach is not possible because they rely on likelihood ratios, which cannot be formed if the number of data points varies between the steps of the Markov chain. The flexibility to define the data error probability distribution in McMC can be exploited to account for this pattern of uncertainties in a natural way, without having to make arbitrary choices regarding residual thresholds. In particular, we can regard the data uncertainty distribution as a mixture of a Gaussian distribution, which represents valid measurements with some measurement error, and a uniform distribution, which represents invalid measurements. The relative balance between them is an unknown parameter to be estimated alongside the standard deviation of the Gaussian distribution. The algorithm can then assign each data point a probability of being an outlier, and the influence of each data point is effectively downweighted according to that probability. Furthermore, this assignment can change as the McMC search explores different parts of the model space. The approach is demonstrated with both synthetic and real tomography examples. In a synthetic test, the proposed mixed measurement error distribution allows recovery of the underlying model even in the presence of 6 per cent outliers, which completely destroy the ability of a regular McMC or linear search to provide a meaningful image. Applied to an actual ambient noise tomography study based on automatically picked dispersion curves, the resulting model is shown to be much more consistent across data sets that differ in the applied quality criteria, while retaining the ability to recover strong anomalies in selected parts of the model.
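
The Gaussian-plus-uniform mixture described in this summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name, the parameter names (sigma, outlier_fraction, outlier_range) and the example residuals are all hypothetical, and the uniform component is assumed to span a fixed residual range chosen by the user.

```python
import math

def mixture_log_likelihood(residuals, sigma, outlier_fraction, outlier_range):
    """Log-likelihood of residuals under a Gaussian + uniform mixture.

    Assumptions (not taken from the paper): zero-mean Gaussian for valid data,
    uniform density 1/outlier_range for invalid data, 0 < outlier_fraction < 1.
    Returns the total log-likelihood and a per-datum outlier probability.
    """
    uniform_density = 1.0 / outlier_range
    log_like = 0.0
    outlier_probs = []
    for r in residuals:
        gauss = math.exp(-0.5 * (r / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        p_valid = (1.0 - outlier_fraction) * gauss      # weight of "valid measurement"
        p_outlier = outlier_fraction * uniform_density  # weight of "invalid measurement"
        total = p_valid + p_outlier
        log_like += math.log(total)
        # Posterior probability that this datum belongs to the uniform component
        outlier_probs.append(p_outlier / total)
    return log_like, outlier_probs

# Hypothetical residuals: three small misfits and one gross error
residuals = [0.1, -0.2, 0.05, 3.0]
log_like, probs = mixture_log_likelihood(
    residuals, sigma=0.2, outlier_fraction=0.06, outlier_range=10.0
)
```

Because every datum contributes to the likelihood at every step, the number of data points never changes between steps of the chain. A large residual is simply absorbed by the uniform term, so it has little influence on the likelihood ratio.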


2011 ◽  
Vol 8 (2) ◽  
pp. 4025-4052 ◽  
Author(s):  
J. A. Vrugt

Abstract. Formal and informal Bayesian approaches are increasingly being used to treat forcing, model structural, parameter and calibration data uncertainty, and to summarize hydrologic prediction uncertainty. This requires posterior sampling methods that approximate the (evolving) posterior distribution. We recently introduced the DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm, an adaptive Markov Chain Monte Carlo (MCMC) method that is especially designed to solve complex, high-dimensional and multimodal posterior probability density functions. The method runs multiple chains in parallel, and maintains detailed balance and ergodicity. Here, I present the latest algorithmic developments, and introduce a discrete sampling variant of DREAM that samples the parameter space at fixed points. The development of this new code, DREAM(D), has been inspired by the existing class of integer optimization problems and the emerging class of experimental design problems. Such non-continuous parameter estimation problems are of considerable theoretical and practical interest. The theory developed herein is applicable to DREAM(ZS) (Vrugt et al., 2011) and MT-DREAM(ZS) (Laloy and Vrugt, 2011) as well. Two case studies, involving a sudoku puzzle and a rainfall-runoff model calibration problem, are used to illustrate DREAM(D).
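
The idea of sampling a parameter space at fixed points with parallel chains can be sketched as follows. This is an illustrative assumption about how a differential-evolution jump might be kept on an integer lattice (by rounding the jump), not the DREAM(D) code. The function name, the gamma default and the example chain states are hypothetical.

```python
import random

def discrete_de_proposal(chains, i, gamma=1.0, rng=random):
    """Propose an integer-valued move for chain i.

    The jump is the scaled difference between two other randomly chosen
    chains, rounded to the nearest integer so the proposal stays on the
    lattice. Illustrative only; rounding is an assumption of this sketch.
    """
    others = [k for k in range(len(chains)) if k != i]
    a, b = rng.sample(others, 2)  # two distinct chains, neither equal to i
    return [
        x + round(gamma * (xa - xb))
        for x, xa, xb in zip(chains[i], chains[a], chains[b])
    ]

# Hypothetical current states of four parallel chains in a 3-parameter problem
rng = random.Random(0)
chains = [[1, 5, 9], [2, 4, 7], [3, 3, 8], [6, 1, 2]]
proposal = discrete_de_proposal(chains, 0, rng=rng)
```

A complete sampler would also need a Metropolis acceptance step and a rule for handling parameter bounds, both omitted here.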


1994 ◽  
Author(s):  
Alan E. Gelfand ◽  
Sujit K. Sahu
