Chinese Character Image Completion Using a Generative Latent Variable Model

2021, Vol. 11 (2), pp. 624
Author(s): In-su Jo, Dong-bin Choi, Young B. Park

Ancient Chinese books contain many corrupted characters, and extraneous objects are often mixed in when the characters are extracted as images. To turn these incomplete images into accurate data, we use image completion technology, which removes unnecessary objects and restores corrupted images. In this paper, we propose a variational autoencoder with classification (VAE-C) model, characterized by its use of classification areas and a class activation map (CAM). The classification area disentangles the data distribution, and CAM then tracks the latent node to be adjusted. Decoding the latent variable with the identified node value reduced yields an image from which the unnecessary objects have been removed. The VAE-C model can be used not only to eliminate unnecessary objects but also to restore corrupted images. Comparing object-removal performance against mask regions with convolutional neural networks (Mask R-CNN), one of the prevalent object detection technologies, and image restoration performance against the partial convolution (PConv) and gated convolution (GConv) image inpainting models, we show that our model performs excellently at both removing objects and restoring corrupted areas.
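No implementation accompanies the abstract, so the following is only a minimal PyTorch sketch of the central idea: a VAE whose latent space is shaped by an auxiliary classification head, plus a hypothetical CAM-guided node-attenuation step for object removal. The layer sizes, the assumed 64x64 grayscale input, the loss weighting, and the `remove_object` helper are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of a VAE with a classification head (VAE-C-like).
# All sizes and weights are assumptions, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAEC(nn.Module):
    def __init__(self, latent_dim=64, num_classes=10):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU(),   # 64x64 -> 32x32
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),  # 32x32 -> 16x16
            nn.Flatten(),
        )
        self.mu = nn.Linear(64 * 16 * 16, latent_dim)
        self.logvar = nn.Linear(64 * 16 * 16, latent_dim)
        self.cls = nn.Linear(latent_dim, num_classes)  # classification area
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), self.cls(z), mu, logvar

def vaec_loss(x, x_hat, logits, y, mu, logvar, beta=1.0):
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    clf = F.cross_entropy(logits, y, reduction="sum")  # disentangling term
    return recon + beta * kld + clf

@torch.no_grad()
def remove_object(model, x, node_idx, scale=0.0):
    """Hypothetical CAM-guided edit: attenuate the latent node(s) that CAM
    links to an unwanted object class, then decode the edited latent."""
    z = model.mu(model.enc(x))            # posterior mean as the latent code
    z[:, node_idx] = z[:, node_idx] * scale
    return model.dec(z)
```

Here the classification loss plays the role of the "classification area" that disentangles the latent distribution; which node index CAM would single out depends on the trained model, so `node_idx` is left as an input.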

2020
Author(s): Aditya Arie Nugraha, Kouhei Sekiguchi, Kazuyoshi Yoshii

This paper describes a deep latent variable model of speech power spectrograms and its application to semi-supervised speech enhancement with a deep speech prior. By integrating two major deep generative models, a variational autoencoder (VAE) and a normalizing flow (NF), in a mutually beneficial manner, we formulate a flexible latent variable model called the NF-VAE that can extract low-dimensional latent representations from high-dimensional observations, akin to the VAE, and does not need to explicitly represent the distribution of the observations, akin to the NF. In this paper, we consider a variant of NF called the generative flow (GF, a.k.a. Glow) and formulate a latent variable model called the GF-VAE. We experimentally show that the proposed GF-VAE is better than the standard VAE at capturing fine-structured harmonics of speech spectrograms, especially in the high-frequency range. A similar finding is also obtained when the GF-VAE and the VAE are used to generate speech spectrograms from latent variables randomly sampled from the standard Gaussian distribution. Lastly, when these models are used as speech priors for statistical multichannel speech enhancement, the GF-VAE outperforms both the VAE and the GF.
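The affine coupling layer is the building block of Glow-style generative flows, so a minimal sketch of it may help clarify how the GF side of a GF-VAE obtains exact log-likelihoods through tractable Jacobians. The PyTorch code below is only illustrative: the split sizes, depth, and the way the flow would be attached to the VAE are assumptions, not the paper's formulation.

```python
# Minimal Glow-style affine coupling stack on a latent vector.
# Illustrative only; not the GF-VAE architecture from the paper.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One affine coupling step: transform z2 conditioned on z1."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, z):
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(z1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)          # keep the scales well-behaved
        z2 = z2 * log_s.exp() + t          # invertible affine transform
        logdet = log_s.sum(dim=1)          # exact log|det Jacobian|
        return torch.cat([z1, z2], dim=1), logdet

class LatentFlow(nn.Module):
    """Stack of coupling steps mapping a latent to a Gaussian base."""
    def __init__(self, dim, n_steps=4):
        super().__init__()
        self.steps = nn.ModuleList(AffineCoupling(dim) for _ in range(n_steps))

    def forward(self, z):
        logdet = torch.zeros(z.size(0), device=z.device)
        for step in self.steps:
            z = torch.flip(z, dims=[1])    # alternate the transformed half
            z, ld = step(z)
            logdet = logdet + ld
        return z, logdet
```

Because each step is invertible and its log-determinant is a simple sum, the flow provides an exact density without explicitly parameterizing the observation distribution, which is the property the abstract attributes to the NF side of the model.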


2021, Vol. 421, pp. 244-259
Author(s): Hao Xiong, Yuan Yan Tang, Fionn Murtagh, Leszek Rutkowski, Shlomo Berkovsky

Energies, 2021, Vol. 14 (11), pp. 3137
Author(s): Amine Tadjer, Reidar B. Bratvold, Remus G. Hanea

Production forecasting is the basis for decision making in the oil and gas industry and can be quite challenging, especially when complex geological modeling of the subsurface is involved. To help solve this problem, assisted history matching built on ensemble-based analysis, such as the ensemble smoother and the ensemble Kalman filter, is useful for estimating models that preserve geological realism and have predictive capability. These methods tend, however, to be computationally demanding, as they require a large ensemble size for stable convergence. In this paper, we propose a novel method for uncertainty quantification and reservoir model calibration with much-reduced computation time. The approach sequentially combines nonlinear dimensionality reduction techniques, t-distributed stochastic neighbor embedding (t-SNE) or the Gaussian process latent variable model (GPLVM), with K-means clustering and the ensemble smoother with multiple data assimilation (ES-MDA). Cluster analysis on the t-SNE or GPLVM embedding reduces the number of initial geostatistical realizations and selects a set of optimal reservoir models whose production performance is similar to that of the reference model. We then apply ES-MDA to obtain reliable assimilation results. Experimental results on the Brugge field case data verify the efficiency of the proposed approach.
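As a rough illustration of the workflow (not the authors' code), the sketch below embeds simulated production responses with t-SNE, keeps the realization closest to each K-means centroid, and performs one ES-MDA update; only the t-SNE branch of the dimensionality reduction is shown. The array shapes, the noise covariance, and all inputs are placeholders rather than the Brugge configuration.

```python
# Schematic sketch: t-SNE + K-means ensemble reduction, then one ES-MDA
# step. Shapes and inputs are placeholders, not the Brugge field setup.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

def select_representatives(responses, n_clusters=50, seed=0):
    """Keep the ensemble member closest to each K-means centroid in a
    2-D t-SNE embedding of the simulated production responses."""
    emb = TSNE(n_components=2, random_state=seed).fit_transform(responses)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(emb)
    return np.array([np.argmin(np.linalg.norm(emb - c, axis=1))
                     for c in km.cluster_centers_])

def esmda_update(m, d_pred, d_obs, c_e, alpha):
    """One ES-MDA step; m (parameters) and d_pred (simulated data) hold one
    column per ensemble member, c_e is the observation-error covariance."""
    n_e = m.shape[1]
    dm = m - m.mean(axis=1, keepdims=True)
    dd = d_pred - d_pred.mean(axis=1, keepdims=True)
    c_md = dm @ dd.T / (n_e - 1)                   # cross-covariance
    c_dd = dd @ dd.T / (n_e - 1)                   # predicted-data covariance
    noise = np.sqrt(alpha) * np.linalg.cholesky(c_e) @ np.random.randn(len(d_obs), n_e)
    d_pert = d_obs[:, None] + noise                # inflated obs perturbations
    gain = c_md @ np.linalg.inv(c_dd + alpha * c_e)
    return m + gain @ (d_pert - d_pred)
```

In the standard ES-MDA scheme the reciprocals of the inflation coefficients must sum to one, so a common choice is four assimilation passes with alpha = 4, rerunning the forward simulation on the updated ensemble between passes.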

