Uncertainty quantification in stochastic inversion with dimensionality reduction using variational autoencoder

Geophysics ◽  
2021 ◽  
pp. 1-65
Author(s):  
Mingliang Liu ◽  
Dario Grana ◽  
Leandro Passos de Figueiredo

Estimating rock and fluid properties in the subsurface from geophysical measurements is a computationally and memory-intensive inverse problem. For nonlinear problems with non-Gaussian variables, analytical solutions are generally not available, and the solutions of such inverse problems must be approximated using sampling and optimization methods. To reduce the computational cost, the model and data can be reparameterized into low-dimensional spaces where the solution of the inverse problem can be computed more efficiently. Among the potential dimensionality reduction methods, deep learning algorithms based on deep generative models provide an efficient approach to reducing the dimension of the model and data vectors. However, such dimension reduction might lead to information loss in the reconstructed model and data, reduced accuracy and resolution of the inverted models, and under- or overestimation of the uncertainty of the predicted models. To comprehensively investigate the impact of model and data dimension reduction with deep generative models on uncertainty quantification, we compare the prediction uncertainty in nonlinear inverse problem solutions obtained from Markov chain Monte Carlo and ensemble-based data assimilation methods implemented in lower-dimensional data and model spaces using a deep variational autoencoder. The proposed workflow is applied to two geophysical inverse problems for the prediction of reservoir properties: pre-stack seismic inversion and seismic history matching. The inversion results consist of the most likely model and a set of realizations of the variables of interest. The application of dimensionality reduction methods makes the stochastic inversion more efficient.
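
To make the latent-space sampling idea concrete, here is a minimal sketch (not the authors' implementation): a toy decoder stands in for a trained VAE decoder, a toy nonlinear operator stands in for the geophysical forward model, and a random-walk Metropolis sampler explores the low-dimensional latent space under a standard-normal latent prior. All names and model sizes are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): stochastic inversion in a VAE latent
# space via random-walk Metropolis. "decode" is a toy stand-in for a trained
# VAE decoder; "forward" is a toy nonlinear forward operator.
import numpy as np

rng = np.random.default_rng(0)
n_latent, n_model, n_data = 4, 50, 30
W_dec = rng.normal(size=(n_model, n_latent))           # toy "decoder" weights
G = rng.normal(size=(n_data, n_model)) / np.sqrt(n_model)

def decode(z):
    """Toy decoder: latent vector -> model vector (stand-in for a trained VAE)."""
    return np.tanh(W_dec @ z)

def forward(m):
    """Toy nonlinear forward operator: model vector -> data vector."""
    return G @ (m ** 2)

# Synthetic observed data generated from a reference latent vector
z_true = rng.normal(size=n_latent)
sigma = 0.05
d_obs = forward(decode(z_true)) + sigma * rng.normal(size=n_data)

def log_post(z):
    """Standard-normal latent prior (VAE assumption) plus Gaussian likelihood."""
    resid = d_obs - forward(decode(z))
    return -0.5 * np.sum(z ** 2) - 0.5 * np.sum(resid ** 2) / sigma ** 2

# Random-walk Metropolis in the low-dimensional latent space
z = np.zeros(n_latent)
lp = log_post(z)
samples = []
for it in range(5000):
    z_prop = z + 0.1 * rng.normal(size=n_latent)
    lp_prop = log_post(z_prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        z, lp = z_prop, lp_prop
    samples.append(z.copy())

# Map the retained latent samples back to model space for uncertainty estimates
posterior_models = np.array([decode(s) for s in samples[2500:]])
print("posterior mean model (first 5):", posterior_models.mean(axis=0)[:5])
print("posterior std  model (first 5):", posterior_models.std(axis=0)[:5])
```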

Geophysics ◽  
2019 ◽  
Vol 84 (6) ◽  
pp. M15-M24 ◽  
Author(s):  
Dario Grana ◽  
Leandro Passos de Figueiredo ◽  
Leonardo Azevedo

The prediction of rock properties in the subsurface from geophysical data generally requires the solution of a mathematical inverse problem. Because of the large size of geophysical (seismic) data sets and subsurface models, it is common to reduce the dimension of the problem by applying dimension reduction methods and considering a reparameterization of the model and/or the data. Especially for high-dimensional nonlinear inverse problems, in which the analytical solution is not available in closed form and iterative sampling or optimization methods must be applied to approximate the solution, model and/or data reduction lowers the computational cost of the inversion. However, part of the information in the data or in the model can be lost by working in the reduced model and/or data space. We focus on uncertainty quantification in the solution of the inverse problem with data and/or model order reduction. We operate in a Bayesian setting for the inversion and uncertainty quantification and validate the proposed approach in the linear case, in which the posterior distribution of the model variables can be written analytically and the uncertainty of the model predictions can be assessed exactly. To quantify the changes in the uncertainty of the inverse problem in the reduced space, we compare the uncertainty in the solution with and without data and/or model reduction. We then extend the approach to nonlinear inverse problems in which the solution is computed using an ensemble-based method. Examples of applications to linearized acoustic and nonlinear elastic inversion quantify the impact of applying reduction methods to model and data vectors on the uncertainty of inverse problem solutions.
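
The linear Gaussian case mentioned above, in which the posterior can be written analytically, can be illustrated with a short sketch. The example below (illustrative only, with an assumed smooth prior covariance and a random linear operator) computes the exact posterior covariance in the full model space and in a PCA-reduced model space, so their marginal variances can be compared.

```python
# Minimal sketch (illustrative only): exact Gaussian posterior of a linear
# inverse problem, with and without a PCA-type reduction of the model space,
# to compare the predicted marginal variances (uncertainty).
import numpy as np

rng = np.random.default_rng(1)
n_model, n_data, k = 60, 40, 10             # k = retained PCA components (assumed)

# Prior: smooth squared-exponential covariance; random linear forward operator
x = np.arange(n_model)
C_m = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 5.0 ** 2)
G = rng.normal(size=(n_data, n_model)) / np.sqrt(n_model)
C_d = 0.01 * np.eye(n_data)                  # data-noise covariance

def posterior_cov(G_op, C_prior, C_noise):
    """Exact posterior covariance for a linear Gaussian inverse problem."""
    K = C_prior @ G_op.T @ np.linalg.inv(G_op @ C_prior @ G_op.T + C_noise)
    return C_prior - K @ G_op @ C_prior

# Full-space posterior
C_post_full = posterior_cov(G, C_m, C_d)

# Reduced-space posterior: m = B a, with B the leading eigenvectors of C_m
eigval, eigvec = np.linalg.eigh(C_m)
B = eigvec[:, -k:]                           # n_model x k reduction basis
C_a = np.diag(eigval[-k:])                   # prior covariance of the coefficients
C_post_red = B @ posterior_cov(G @ B, C_a, C_d) @ B.T

print("mean posterior std, full space   :", np.sqrt(np.diag(C_post_full)).mean())
print("mean posterior std, reduced space:", np.sqrt(np.diag(C_post_red)).mean())
```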


2021 ◽  
Author(s):  
Yongin Choi ◽  
Gerald Quon

Deep neural networks implementing generative models for dimensionality reduction have been extensively used for the visualization and analysis of genomic data. One of their key limitations is a lack of interpretability: it is challenging to quantitatively identify which input features are used to construct the embedding dimensions, preventing insight into, for example, why cells are organized in a particular data visualization. Here we present a scalable, interpretable variational autoencoder (siVAE) that is interpretable by design: it learns feature embeddings that guide the interpretation of the cell embeddings in a manner analogous to the factor loadings of factor analysis. siVAE is as powerful and nearly as fast to train as the standard VAE, but achieves full interpretability of the embedding dimensions. We exploit a number of connections between dimensionality reduction and gene network inference to identify gene neighborhoods and gene hubs, without the explicit need for gene network inference. Finally, we observe a systematic difference between the gene neighborhoods identified by dimensionality reduction methods and those identified by gene network inference algorithms, suggesting that the two provide complementary information about the underlying structure of the gene co-expression network.
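
A hedged sketch of the general idea, not the siVAE implementation: a VAE whose decoder is a single linear layer, so that the decoder weight matrix can be read as factor-analysis-style feature loadings alongside the cell embeddings. PyTorch, the toy expression matrix, and all sizes are assumptions.

```python
# Minimal sketch (not siVAE): a VAE with a *linear* decoder, whose weight
# matrix plays the role of factor-analysis-style gene loadings.
import torch
import torch.nn as nn

n_genes, n_latent = 2000, 10

class LinearDecoderVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_genes, 256), nn.ReLU())
        self.mu = nn.Linear(256, n_latent)
        self.logvar = nn.Linear(256, n_latent)
        self.decoder = nn.Linear(n_latent, n_genes)   # linear => interpretable loadings

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum(dim=1).mean()
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=1).mean()
    return recon + kl

model = LinearDecoderVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(128, n_genes)                 # toy expression matrix (cells x genes)
for step in range(200):
    x_hat, mu, logvar = model(x)
    loss = vae_loss(x, x_hat, mu, logvar)
    opt.zero_grad()
    loss.backward()
    opt.step()

# "Feature embeddings": each gene's row of the decoder weight matrix
gene_loadings = model.decoder.weight.detach()   # shape (n_genes, n_latent)
cell_embeddings = mu.detach()                   # shape (n_cells, n_latent)
print(gene_loadings.shape, cell_embeddings.shape)
```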


2019 ◽  
Author(s):  
Leandro de Figueiredo ◽  
Dario Grana ◽  
Leonardo Azevedo ◽  
Mauro Roisenberg ◽  
Bruno Rodrigues

2021 ◽  
Vol 12 ◽  
Author(s):  
Ruizhi Xiang ◽  
Wencan Wang ◽  
Lei Yang ◽  
Shiyuan Wang ◽  
Chaohan Xu ◽  
...  

Single-cell RNA sequencing (scRNA-seq) is a high-throughput sequencing technology performed at the level of individual cells, which has the potential to reveal cellular heterogeneity. However, scRNA-seq data are high-dimensional, noisy, and sparse. Dimension reduction is therefore an important step in the downstream analysis of scRNA-seq data, and several dimension reduction methods have been developed. We developed a strategy to evaluate the stability, accuracy, and computing cost of 10 dimensionality reduction methods using 30 simulated datasets and five real datasets. Additionally, we investigated the sensitivity of all the methods to hyperparameter tuning and gave users appropriate suggestions. We found that t-distributed stochastic neighbor embedding (t-SNE) yielded the best overall performance, with the highest accuracy but also the highest computing cost. Meanwhile, uniform manifold approximation and projection (UMAP) exhibited the highest stability, as well as moderate accuracy and the second-highest computing cost. UMAP also preserves the original cohesion and separation of cell populations well. In addition, it is worth noting that users need to set the hyperparameters according to the specific situation before using dimensionality reduction methods based on non-linear models and neural networks.
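
For readers who want to reproduce a comparison of this kind on their own data, a minimal sketch follows (not the paper's benchmark): it runs PCA, t-SNE, and UMAP on a toy log-normalized count matrix using scikit-learn and the umap-learn package, with commonly used default hyperparameters rather than the paper's settings.

```python
# Illustrative sketch only: three of the compared reductions on a toy matrix.
# Assumes scikit-learn and umap-learn are installed.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import umap                                   # from the umap-learn package

rng = np.random.default_rng(0)
# Toy "scRNA-seq" matrix: 300 cells x 1000 genes, sparse-ish counts, log-normalized
counts = rng.poisson(0.3, size=(300, 1000)).astype(float)
X = np.log1p(counts)

X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(X)
X_umap = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1).fit_transform(X)

for name, emb in [("PCA", X_pca), ("t-SNE", X_tsne), ("UMAP", X_umap)]:
    print(name, emb.shape)
```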


2021 ◽  
Author(s):  
Lixiong Cao ◽  
Jie Liu ◽  
Cheng Lu ◽  
Wei Wang

Abstract Inverse problem analysis provides an effective way to identify structural parameters. Because multi-source uncertainties in the measured responses and the modeling parameters are coupled, identifying unknown structural parameters poses challenges in both the solution mechanism and the computational cost. In this paper, an uncertain inverse method based on a convex model and dimension-reduction decomposition is proposed to realize the interval identification of unknown structural parameters from the uncertain measured responses and modeling parameters. First, a polygonal convex set model is established to quantify the uncertainties of the modeling parameters. Then, a space collocation method based on dimension-reduction decomposition is proposed to transform the inverse problem with multi-source uncertainties into a few interval inverse problems that consider only response uncertainty. The transformed interval inverse problems involve a two-layer solving process consisting of interval propagation and optimization updating. To solve these interval inverse problems, an efficient interval inverse method based on high-dimensional model representation and an affine algorithm is further developed. By coupling the two methods, the proposed approach avoids a time-consuming multi-layer nested calculation procedure and effectively realizes the inverse uncertainty quantification of unknown structural parameters. Finally, two engineering examples are provided to verify the effectiveness of the proposed uncertain inverse method.
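
As a heavily simplified illustration of interval identification (not the authors' polygonal convex set, HDMR, or affine scheme), the sketch below recovers an interval for one unknown parameter from an interval-valued response and an interval-valued modeling parameter by solving two bound-matching optimizations; the toy forward model and all names are assumptions.

```python
# Heavily simplified sketch (not the authors' method): interval identification
# of one unknown parameter theta from an interval-valued measured response,
# using two bound-matching optimizations with scipy.
import numpy as np
from scipy.optimize import minimize_scalar

def forward(theta, p):
    """Toy forward model: response of a structure with unknown parameter theta
    and uncertain modeling parameter p (stand-in for a FE simulation)."""
    return p * np.sqrt(theta) + 0.1 * theta

# Interval quantification of the modeling parameter and the measured response
p_lo, p_hi = 0.9, 1.1
y_lo, y_hi = 2.8, 3.4

def theta_bound(y_target, p_value):
    """Find theta such that forward(theta, p_value) matches one response bound."""
    obj = lambda t: (forward(t, p_value) - y_target) ** 2
    return minimize_scalar(obj, bounds=(1e-6, 100.0), method="bounded").x

# The toy model is monotonic in theta and p, so the extreme vertex combinations
# of the intervals give the identified bounds of theta.
theta_min = theta_bound(y_lo, p_hi)
theta_max = theta_bound(y_hi, p_lo)
print(f"identified interval for theta: [{theta_min:.3f}, {theta_max:.3f}]")
```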


2013 ◽  
Vol 38 (4) ◽  
pp. 465-470 ◽  
Author(s):  
Jingjie Yan ◽  
Xiaolan Wang ◽  
Weiyi Gu ◽  
LiLi Ma

Abstract Speech emotion recognition is a meaningful and challenging problem in a number of domains, including sentiment analysis, computer science, and pedagogy. In this study, we investigate speech emotion recognition based on the sparse partial least squares regression (SPLSR) approach. We use SPLSR to perform feature selection and dimensionality reduction on the full set of acquired speech emotion features. With SPLSR, the coefficients of redundant and uninformative speech emotion features are shrunk to zero, while useful and informative features are retained and passed to the subsequent classification step. Tests on the Berlin database show that the recognition rate of the SPLSR method reaches 79.23%, which is superior to the other compared dimensionality reduction methods.
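
As an illustration of the feature-selection effect described above, the following sketch (not the paper's SPLSR) uses ordinary partial least squares from scikit-learn and then soft-thresholds the X-weights so that small contributions become exactly zero; the thresholding rule, data, and parameters are assumptions.

```python
# Illustrative sketch only: scikit-learn has no sparse PLS, so this applies
# ordinary PLSRegression and soft-thresholds the X-weights to mimic the
# sparsity-driven feature selection; it is not the paper's SPLSR.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import LabelBinarizer
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 120))               # toy acoustic emotion features
y = rng.integers(0, 4, size=200)              # toy emotion labels (4 classes)

# PLS on one-hot targets, a common setup for PLS-based classification
Y = LabelBinarizer().fit_transform(y)
pls = PLSRegression(n_components=10).fit(X, Y)

# Soft-threshold the X weights: small contributions shrink to exactly zero
W = pls.x_weights_.copy()                     # shape (n_features, n_components)
lam = 0.15
W = np.sign(W) * np.maximum(np.abs(W) - lam, 0.0)

# Keep only features with at least one nonzero weight, then classify
selected = np.flatnonzero(np.abs(W).sum(axis=1) > 0)
scores = X[:, selected] @ W[selected]         # reduced representation
clf = SVC().fit(scores, y)
print(f"{selected.size} of {X.shape[1]} features retained,",
      f"train accuracy {clf.score(scores, y):.2f}")
```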


2020 ◽  
Vol 28 (5) ◽  
pp. 727-738
Author(s):  
Victor Sadovnichii ◽  
Yaudat Talgatovich Sultanaev ◽  
Azamat Akhtyamov

Abstract We consider a new class of inverse problems on the recovery of the coefficients of differential equations from a finite set of eigenvalues of a boundary value problem with unseparated boundary conditions. A finite number of eigenvalues is possible only for problems in which the roots of the characteristic equation are multiple. The article describes solutions to such a problem for equations of the second, third, and fourth orders on a graph with three, four, and five edges. The inverse problem with an arbitrary number of edges is solved similarly.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Federico Calesella ◽  
Alberto Testolin ◽  
Michele De Filippo De Grazia ◽  
Marco Zorzi

Abstract Multivariate prediction of human behavior from resting-state data is gaining increasing popularity in the neuroimaging community, with far-reaching translational implications in neurology and psychiatry. However, the high dimensionality of neuroimaging data increases the risk of overfitting, calling for the use of dimensionality reduction methods to build robust predictive models. In this work, we assess the ability of four well-known dimensionality reduction techniques to extract relevant features from resting-state functional connectivity matrices of stroke patients, which are then used to build a predictive model of the associated deficits based on cross-validated regularized regression. In particular, we investigated prediction ability over different neuropsychological scores referring to the language, verbal memory, and spatial memory domains. Principal Component Analysis (PCA) and Independent Component Analysis (ICA) were the two best methods at extracting representative features, followed by Dictionary Learning (DL) and Non-Negative Matrix Factorization (NNMF). Consistent with these results, features extracted by PCA and ICA were found to be the best predictors of the neuropsychological scores across all the considered cognitive domains. For each feature extraction method, we also examined the impact of the regularization method, the model complexity (in terms of the number of features entering the model), and the quality of the maps that display predictive edges in the resting-state networks. We conclude that PCA-based models, especially when combined with L1 (LASSO) regularization, provide an optimal balance between prediction accuracy, model complexity, and interpretability.
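
A minimal sketch of the best-performing pipeline type described above (PCA features followed by cross-validated L1 regression) on toy data; the variable names, matrix sizes, and number of components are assumptions, not the study's settings.

```python
# Minimal sketch (assumed names, toy data): PCA feature extraction from
# vectorized connectivity matrices followed by cross-validated LASSO regression.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LassoCV
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_patients, n_edges = 100, 4950        # e.g. upper triangle of a 100x100 FC matrix
X = rng.normal(size=(n_patients, n_edges))                       # toy connectivity features
y = X[:, :20].mean(axis=1) + 0.5 * rng.normal(size=n_patients)   # toy deficit score

model = make_pipeline(PCA(n_components=30), LassoCV(cv=5))
r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2:", np.round(r2.mean(), 3))

# Fit on all data and back-project coefficients to see which edges drive the prediction
model.fit(X, y)
edge_weights = model.named_steps["pca"].components_.T @ model.named_steps["lassocv"].coef_
print("strongest predictive edge index:", int(np.abs(edge_weights).argmax()))
```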

