Accelerating inverse problems in seismology using adjoint-based machine learning

Author(s):  
Lars Gebraad ◽  
Sölvi Thrastarson ◽  
Andrea Zunino ◽  
Andreas Fichtner

Uncertainty quantification is an essential part of many studies in Earth science. It allows us, for example, to assess the quality of tomographic reconstructions, test hypotheses, and make physics-based risk assessments. In recent years there has been a surge in applications of uncertainty quantification to seismological inverse problems, driven mainly by increasing computational power and the 'discovery' of optimal use cases for many algorithms (e.g., gradient-based Markov chain Monte Carlo, MCMC). Performing Bayesian inference with these methods allows seismologists to carry out advanced uncertainty quantification. Often, however, Bayesian inference remains prohibitively expensive due to large parameter spaces and computationally expensive physics.

At the same time, machine learning has found its way into parameter estimation in the geosciences. Recent work shows that machine learning can both accelerate repetitive inferences [e.g., Shahraeeni & Curtis, 2011; Cao et al., 2020] and speed up single-instance Monte Carlo algorithms using surrogate networks [Aleardi, 2020]. These advances allow seismologists to use machine learning as a tool to bring accurate inference on the subsurface to scale.

In this work, we propose the novel inclusion of adjoint modelling in machine-learning-accelerated inverse problems. The aforementioned references train machine learning models on observations of the misfit function alone, with the aim of creating surrogate but accelerated models for the misfit computation, which in turn allows this function and its gradients to be evaluated much faster. This approach ignores the fact that many physical models have an adjoint state, which allows the gradient to be computed with only one additional simulation.

Including this gradient information in gradient-based sampling yields performance gains both in training the surrogate and in sampling the true posterior. We show how machine learning models that approximate misfits and gradients, trained specifically using adjoint methods, accelerate various types of inversions and bring Bayesian inference to scale. Practically, the proposed method allows us to utilize information from previous MCMC samples in the algorithm's proposal step.

The proposed machinery applies in settings where models are run extensively and repetitively. Markov chain Monte Carlo algorithms, which may require millions of evaluations of the forward modelling equations, can be accelerated by off-loading these simulations to neural nets. The approach is also promising for tomographic monitoring, where experiments are performed repeatedly. Lastly, the efficiently trained neural nets can be used to learn a likelihood for a given dataset, to which different priors can subsequently be applied efficiently. We show examples of all these use cases.

Lars Gebraad, Christian Boehm and Andreas Fichtner, 2020: Bayesian Elastic Full-Waveform Inversion Using Hamiltonian Monte Carlo.
Ruikun Cao, Stephanie Earp, Sjoerd A. L. de Ridder, Andrew Curtis and Erica Galetti, 2020: Near-real-time near-surface 3D seismic velocity and uncertainty models by wavefield gradiometry and neural network inversion of ambient seismic noise.
Mohammad S. Shahraeeni and Andrew Curtis, 2011: Fast probabilistic nonlinear petrophysical inversion.
Mattia Aleardi, 2020: Combining discrete cosine transform and convolutional neural networks to speed up the Hamiltonian Monte Carlo inversion of pre-stack seismic data.
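The value of the adjoint can be sketched in a toy setting: each forward run then yields not just one misfit value but also the full gradient, so a surrogate can be fitted to both. The linear physics, names, and quadratic surrogate below are illustrative stand-ins, not the paper's method.

```python
import numpy as np
from itertools import combinations_with_replacement

# Toy misfit chi(m) = 0.5 * ||G m - d||^2; its "adjoint" gradient G^T (G m - d)
# costs only one extra solve per sample. G, d are a synthetic linear "physics".
rng = np.random.default_rng(0)
G = rng.normal(size=(5, 3))
d = rng.normal(size=5)

def misfit_and_adjoint_gradient(m):
    r = G @ m - d
    return 0.5 * r @ r, G.T @ r          # one forward + one adjoint evaluation

# Each sampled model contributes 1 misfit value AND 3 gradient components.
samples = [rng.normal(size=3) for _ in range(10)]
observations = [misfit_and_adjoint_gradient(m) for m in samples]

# Fit a quadratic surrogate chi~(m) = 0.5 m^T A m + b^T m + c by least squares,
# matching both values and gradients; the gradient rows are what the adjoint adds.
n = 3
pairs = list(combinations_with_replacement(range(n), 2))  # upper triangle of A

def value_row(m):
    quad = [(0.5 if i == j else 1.0) * m[i] * m[j] for i, j in pairs]
    return np.concatenate([quad, m, [1.0]])

def gradient_rows(m):
    rows = []
    for k in range(n):
        quad = [m[j] if k == i else (m[i] if k == j else 0.0) for i, j in pairs]
        rows.append(np.concatenate([quad, np.eye(n)[k], [0.0]]))
    return rows

rows, rhs = [], []
for m, (chi, g) in zip(samples, observations):
    rows.append(value_row(m)); rhs.append(chi)
    rows.extend(gradient_rows(m)); rhs.extend(g)

theta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)

def surrogate(m):
    quad = sum((0.5 if i == j else 1.0) * a * m[i] * m[j]
               for (i, j), a in zip(pairs, theta))
    return quad + theta[len(pairs):len(pairs) + n] @ m + theta[-1]
```

Because every sample contributes n + 1 equations instead of 1, the surrogate is pinned down with far fewer expensive simulations; the same bookkeeping carries over when the surrogate is a neural network trained on value-plus-gradient losses.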

2020 ◽  
Author(s):  
Lars Gebraad ◽  
Andrea Zunino ◽  
Andreas Fichtner ◽  
Klaus Mosegaard

We present a framework to solve geophysical inverse problems using the Hamiltonian Monte Carlo (HMC) method, with a focus on Bayesian tomography. Recent work in the geophysical community has shown the potential of gradient-based Monte Carlo sampling for a wide range of inverse problems across several fields.

Many high-dimensional (non-linear) problems in geophysics have readily accessible gradient information that goes unused in classical probabilistic inversions. HMC improves on traditional Monte Carlo sampling while increasing the scalability of inference problems, giving access to uncertainty quantification for problems with many free parameters (>10,000). The result of HMC sampling is a collection of models representing the posterior probability density function, from which not only "best" models can be inferred, but also uncertainties and potentially different plausible scenarios, all compatible with the observed data. However, the number of tuning parameters required by HMC, as well as the complexity of existing statistical modelling software, has kept the geophysical community from widely adopting a specific tool for efficient large-scale Bayesian inference.

This work takes a step towards filling that gap by providing an HMC sampler tailored to geophysical inverse problems (e.g., by supplying relevant priors and visualizations), combined with a set of forward models ranging from elastic and acoustic wave propagation to magnetic anomaly modelling, traveltimes, etc. The framework is coded in the didactic but performant languages Julia and Python, and users can plug in their own forward models, which are linked to the sampler routines through proper interfaces. In this way, we hope to illustrate the usefulness and potential of HMC in Bayesian inference. Tutorials featuring an array of physical experiments are written with the aim of showcasing both Bayesian inference and successful HMC usage. They additionally include examples of how to speed up HMC, e.g., with automated tuning techniques and GPU computations.
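The core of any such sampler is the leapfrog-integrated HMC step. A minimal self-contained sketch (not the framework's actual API; the 2-D Gaussian posterior is a stand-in for a real forward problem):

```python
import numpy as np

# Minimal HMC: sample a 2-D Gaussian posterior with a hand-coded leapfrog
# integrator. grad() is exactly the gradient information that classical
# random-walk samplers leave unused.
rng = np.random.default_rng(1)
Cinv = np.array([[2.0, 0.6],
                 [0.6, 1.0]])                 # inverse posterior covariance

def neg_log_post(m):
    return 0.5 * m @ Cinv @ m

def grad(m):
    return Cinv @ m

def hmc_step(m, eps=0.25, n_leap=12):
    p0 = rng.normal(size=m.size)              # resample auxiliary momentum
    q, p = m.copy(), p0 - 0.5 * eps * grad(m) # leapfrog: half momentum step
    for i in range(n_leap):
        q = q + eps * p                       # full position step
        if i < n_leap - 1:
            p = p - eps * grad(q)             # full momentum step
    p = p - 0.5 * eps * grad(q)               # closing half momentum step
    h0 = neg_log_post(m) + 0.5 * p0 @ p0      # Hamiltonian before / after
    h1 = neg_log_post(q) + 0.5 * p @ p
    return q if rng.uniform() < np.exp(h0 - h1) else m  # Metropolis correction

chain = np.empty((5000, 2))
m = np.zeros(2)
for k in range(5000):
    m = hmc_step(m)
    chain[k] = m
print(np.cov(chain.T))                        # approaches inv(Cinv)
```

The step size `eps` and trajectory length `n_leap` are exactly the tuning parameters the abstract refers to; automated tuning amounts to adapting them so the Metropolis acceptance rate stays high without the trajectory doubling back on itself.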


2021 ◽  
Author(s):  
Alexander Kanonirov ◽  
Ksenia Balabaeva ◽  
Sergey Kovalchuk

The relevance of this study lies in improving the understanding of machine learning models. We present a method for interpreting clustering results and apply it to the case of clinical pathways modelling. The method is based on statistical inference and yields a description of the clusters, determining the influence of particular features on the differences between them. Based on the proposed approach, it is possible to determine the characteristic features of each cluster. Finally, we compare the method with a Bayesian inference explanation and with the interpretation of medical experts [1].
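One common statistical-inference route to such cluster descriptions is to test, per cluster, each feature's distribution inside the cluster against the rest of the cohort. The sketch below (synthetic data, Welch t-tests, and the significance threshold are all illustrative assumptions, not necessarily the authors' exact procedure) shows the idea:

```python
import numpy as np
from scipy import stats

# Synthetic "clinical" features: cluster 0 is high on feature 0, cluster 1 is
# high on feature 1, and feature 2 is uninformative noise.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal([5.0, 0.0, 0.0], 1.0, size=(40, 3)),
               rng.normal([0.0, 5.0, 0.0], 1.0, size=(40, 3))])
labels = np.array([0] * 40 + [1] * 40)

def describe_cluster(X, labels, c, alpha=1e-3):
    """Features whose distribution inside cluster c differs from the rest."""
    inside, rest = X[labels == c], X[labels != c]
    significant = []
    for j in range(X.shape[1]):
        t, p = stats.ttest_ind(inside[:, j], rest[:, j], equal_var=False)
        if p < alpha:                    # sign of t: higher or lower than rest
            significant.append((j, t, p))
    return significant

for c in (0, 1):
    found = [j for j, t, p in describe_cluster(X, labels, c)]
    print(f"cluster {c}: characterised by features {found}")
```

Here the noise feature is correctly excluded from both cluster descriptions, while the sign of the t-statistic says whether a characteristic feature is elevated or depressed in the cluster relative to the rest.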


2020 ◽  
Author(s):  
Maria Moreno de Castro

The presence of automated decision making continuously increases in today's society. Algorithms based on machine and deep learning decide how much we pay for insurance, translate our thoughts to speech, and shape our consumption of goods (via e-marketing) and knowledge (via search engines). Machine and deep learning models are ubiquitous in science too; in particular, many promising examples are being developed to prove their feasibility for earth science applications, like finding temporal trends or spatial patterns in data or improving parameterization schemes for climate simulations.

However, most machine and deep learning applications aim to optimise performance metrics (for instance, accuracy, the fraction of times the model's prediction was right), which are rarely good indicators of trust (i.e., of why those predictions were right). In fact, as data volume and model complexity increase, machine learning and deep learning predictions can be very accurate yet prone to rely on spurious correlations, encode and magnify bias, and draw conclusions that do not incorporate the underlying dynamics governing the system. Because of that, the uncertainty of the predictions and our confidence in the model are difficult to estimate, and the relation between inputs and outputs becomes hard to interpret.

Since it is challenging to shift a community from "black" to "glass" boxes, it is more useful to implement Explainable Artificial Intelligence (XAI) techniques right at the beginning of machine learning and deep learning adoption rather than trying to fix fundamental problems later. The good news is that most of the popular XAI techniques are basically sensitivity analyses: they consist of a systematic perturbation of some model component in order to observe how it affects the model predictions. The techniques comprise random sampling, Monte Carlo simulations, and ensemble runs, which are common methods in the geosciences. Moreover, many XAI techniques are reusable because they are model-agnostic and are applied after the model has been fitted. In addition, interpretability provides robust arguments when communicating machine and deep learning predictions to scientists and decision-makers.

In order to assist not only practitioners but also end-users in the evaluation of machine and deep learning results, we will explain the intuition behind some popular techniques of XAI and of aleatory and epistemic Uncertainty Quantification: (1) Permutation Importance and Gaussian processes on the inputs (i.e., perturbation of the model inputs); (2) Monte-Carlo Dropout, Deep Ensembles, Quantile Regression, and Gaussian processes on the weights (i.e., perturbation of the model architecture); (3) Conformal Predictors (useful to estimate the confidence interval on the outputs); and (4) Layerwise Relevance Propagation (LRP), Shapley values, and Local Interpretable Model-Agnostic Explanations (LIME) (designed to visualize how each feature in the data affected a particular prediction). We will also introduce some best practices, like the detection of anomalies in the training data before training, the implementation of fallbacks when a prediction is not reliable, and physics-guided learning that includes constraints in the loss function to avoid physical inconsistencies, like the violation of conservation laws.
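Permutation Importance, the first technique in the list above, illustrates why these methods are model-agnostic: one only needs to shuffle an input feature and re-score the fitted model. A minimal sketch with a synthetic stand-in model and dataset:

```python
import numpy as np

# Permutation importance: shuffle one input feature at a time and measure how
# much the model's error grows. No access to model internals is needed.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)
# feature 2 never enters y, so a trustworthy model should not depend on it

w, *_ = np.linalg.lstsq(X, y, rcond=None)    # the "fitted model": least squares

def mse(X_eval):
    return np.mean((X_eval @ w - y) ** 2)

baseline = mse(X)
importance = np.zeros(X.shape[1])
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])  # break this feature's link to y
    importance[j] = mse(X_perm) - baseline        # error increase = importance
print(importance)
```

An importance near zero for the irrelevant feature is exactly the kind of sanity check that builds trust beyond a bare accuracy number; a large importance on a feature that should be physically irrelevant flags a spurious correlation.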


AIChE Journal ◽  
2016 ◽  
Vol 62 (9) ◽  
pp. 3352-3368 ◽  
Author(s):  
Jayashree Kalyanaraman ◽  
Yoshiaki Kawajiri ◽  
Ryan P. Lively ◽  
Matthew J. Realff
