Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness

Author(s):  
Dazhong Shen ◽  
Chuan Qin ◽  
Chao Wang ◽  
Hengshu Zhu ◽  
Enhong Chen ◽  
...  

As one of the most popular generative models, the Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference. However, when the decoder network is sufficiently expressive, VAE may suffer from posterior collapse; that is, uninformative latent representations may be learned. To this end, in this paper, we propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space, so that representations can be learned in a meaningful and compact manner. Specifically, we first demonstrate theoretically that controlling the distribution of the posterior's parameters across the whole dataset yields a better latent space with high diversity and low uncertainty. Then, without introducing new loss terms or modifying the training strategy, we propose to apply Dropout to the variances and Batch-Normalization to the means simultaneously to regularize their distributions implicitly. Furthermore, to evaluate its generality, we also apply DU-VAE to the inverse autoregressive flow-based VAE (VAE-IAF) empirically. Finally, extensive experiments on three benchmark datasets clearly show that our approach can outperform state-of-the-art baselines on both likelihood estimation and downstream classification tasks.
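To make the regularization concrete, here is a minimal PyTorch sketch of an encoder head in the spirit of DU-VAE, with Batch-Normalization applied to the posterior means and Dropout applied to the posterior variances; the layer sizes, module names, and exact placement of the operations are assumptions for illustration, not the authors' code.

```python
# Illustrative sketch: BatchNorm regularizes the posterior means,
# Dropout the posterior variances. Sizes and names are assumptions.
import torch
import torch.nn as nn

class RegularizedGaussianEncoder(nn.Module):
    def __init__(self, x_dim=784, h_dim=256, z_dim=32, p_drop=0.2):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.fc_mu = nn.Linear(h_dim, z_dim)
        self.fc_logvar = nn.Linear(h_dim, z_dim)
        self.bn_mu = nn.BatchNorm1d(z_dim)   # spreads means across the batch -> diversity
        self.drop_var = nn.Dropout(p_drop)   # randomly zeroes variances -> less uncertainty

    def forward(self, x):
        h = self.backbone(x)
        mu = self.bn_mu(self.fc_mu(h))
        # Dropout on the variance itself (not the log-variance) is one
        # plausible reading of the paper's description.
        var = self.drop_var(torch.exp(self.fc_logvar(h)))
        std = torch.sqrt(var + 1e-8)
        z = mu + std * torch.randn_like(std)  # reparameterization trick
        return z, mu, var
```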

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Yoshihiro Nagano ◽  
Ryo Karakida ◽  
Masato Okada

Deep neural networks are good at extracting low-dimensional subspaces (latent spaces) that represent the essential features of a high-dimensional dataset. Deep generative models, represented by variational autoencoders (VAEs), can generate and infer high-quality datasets such as images. In particular, VAEs can eliminate the noise contained in an image by repeating the mapping between latent and data space. To clarify the mechanism of such denoising, we numerically analyzed how the activity pattern of trained networks changes in the latent space during inference. We considered the time development of the activity pattern for specific data as one trajectory in the latent space and investigated the collective behavior of these inference trajectories for many data points. Our study revealed that when a cluster structure exists in the dataset, the trajectory rapidly approaches the center of the cluster. This behavior is qualitatively consistent with the concept retrieval reported in associative memory models. Additionally, the larger the noise contained in the data, the closer the trajectory approached a more global cluster. Finally, we demonstrated that increasing the number of latent variables enhances this tendency to approach a cluster center and improves the generalization ability of the VAE.
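As an illustration of the trajectory analysis, the sketch below iterates the latent-to-data mapping of a trained VAE and records the latent mean at each step; `encoder` and `decoder` are hypothetical stand-ins for the trained network halves, not the authors' code.

```python
# Repeatedly map a noisy input through decoder and encoder,
# logging the latent mean visited at each inference step.
import torch

@torch.no_grad()
def latent_trajectory(x_noisy, encoder, decoder, n_steps=20):
    """Return the sequence of latent means visited during iterated denoising."""
    trajectory = []
    x = x_noisy
    for _ in range(n_steps):
        mu, logvar = encoder(x)        # q(z|x): posterior mean and log-variance
        trajectory.append(mu.clone())
        x = decoder(mu)                # deterministic decode; iterate the map
    return torch.stack(trajectory)     # shape: (n_steps, batch, z_dim)
```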


2020 ◽  
Author(s):  
Aditya Arie Nugraha ◽  
Kouhei Sekiguchi ◽  
Kazuyoshi Yoshii

This paper describes a deep latent variable model of speech power spectrograms and its application to semi-supervised speech enhancement with a deep speech prior. By integrating two major deep generative models, a variational autoencoder (VAE) and a normalizing flow (NF), in a mutually beneficial manner, we formulate a flexible latent variable model called the NF-VAE, which can extract low-dimensional latent representations from high-dimensional observations, akin to the VAE, and does not need to explicitly represent the distribution of the observations, akin to the NF. In this paper, we consider a variant of NF called the generative flow (GF, a.k.a. Glow) and formulate a latent variable model called the GF-VAE. We experimentally show that the proposed GF-VAE is better than the standard VAE at capturing fine-structured harmonics of speech spectrograms, especially in the high-frequency range. A similar finding is obtained when the GF-VAE and the VAE are used to generate speech spectrograms from latent variables randomly sampled from the standard Gaussian distribution. Lastly, when these models are used as speech priors for statistical multichannel speech enhancement, the GF-VAE outperforms both the VAE and the GF.
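For readers unfamiliar with generative flows, the sketch below shows an affine coupling layer, the invertible building block of Glow-style flows that yields exact log-likelihoods; it is a generic illustration of the flow component, not the authors' GF-VAE architecture.

```python
# A Glow-style affine coupling layer: invertible, with a cheap log-determinant.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, 64), nn.ReLU(),
            nn.Linear(64, 2 * (dim - self.half)))

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(x1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)        # keep the scale well-conditioned
        y2 = x2 * torch.exp(log_s) + t   # invertible transform given x1
        log_det = log_s.sum(dim=1)       # contribution to the log-likelihood
        return torch.cat([x1, y2], dim=1), log_det

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(y1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        x2 = (y2 - t) * torch.exp(-log_s)
        return torch.cat([y1, x2], dim=1)
```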


Author(s):  
Abdul Fatir Ansari ◽  
Harold Soh

We address the problem of unsupervised disentanglement of latent representations learnt via deep generative models. In contrast to current approaches that operate on the evidence lower bound (ELBO), we argue that statistical independence in the latent space of VAEs can be enforced in a principled hierarchical Bayesian manner. To this effect, we augment the standard VAE with an inverse-Wishart (IW) prior on the covariance matrix of the latent code. By tuning the IW parameters, we are able to encourage (or discourage) independence in the learnt latent dimensions. Extensive experimental results on a range of datasets (2DShapes, 3DChairs, 3DFaces and CelebA) show that our approach outperforms the β-VAE and is competitive with the state-of-the-art FactorVAE. Our approach achieves significantly better disentanglement and reconstruction on a new dataset (CorrelatedEllipses), which introduces correlations between the factors of variation.
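In symbols, the hierarchical prior described above can be written as follows (standard inverse-Wishart parameterization with degrees of freedom ν and scale matrix Ψ; the paper's exact parameterization may differ):

```latex
% Hierarchical generative model with an inverse-Wishart hyperprior
% on the latent covariance (standard IW parameterization assumed):
\begin{align*}
  \Sigma &\sim \mathcal{IW}(\nu, \Psi)        && \text{covariance hyperprior}\\
  z \mid \Sigma &\sim \mathcal{N}(0, \Sigma)  && \text{latent code}\\
  x \mid z &\sim p_\theta(x \mid z)           && \text{decoder likelihood}
\end{align*}
```

Since the IW mean is $\Psi/(\nu - d - 1)$ for a $d$-dimensional latent code, a diagonal scale matrix $\Psi$ with large $\nu$ concentrates the prior near diagonal covariances, which is one way tuning $(\nu, \Psi)$ can encourage independent latent dimensions.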


Author(s):  
Bidisha Samanta ◽  
Sharmila Reddy ◽  
Hussain Jagirdar ◽  
Niloy Ganguly ◽  
Soumen Chakrabarti

Code-switching, the interleaving of two or more languages within a sentence or discourse, is pervasive in multilingual societies. Accurate language models for code-switched text are critical for NLP tasks. State-of-the-art data-intensive neural language models are difficult to train well from scarce language-labeled code-switched text. A potential solution is to use deep generative models to synthesize large volumes of realistic code-switched text. Although generative adversarial networks and variational autoencoders can synthesize plausible monolingual text from a continuous latent space, they cannot adequately address code-switched text, owing to its informal style and the complex interplay between the constituent languages. We introduce VACS, a novel variational autoencoder architecture specifically tailored to code-switching phenomena. VACS encodes to and decodes from a two-level hierarchical representation, which models syntactic contextual signals in the lower level and language-switching signals in the upper level. Sampling representations from the prior and decoding them produces well-formed, diverse code-switched sentences. Extensive experiments show that augmenting natural monolingual data with synthetic code-switched text results in a significant (33.06%) drop in perplexity.
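A schematic sketch of such a two-level hierarchy is given below; it follows the abstract's description (a lower latent for syntactic context, an upper latent for switching behavior), but all sizes and wiring details are invented and this is not the released VACS code.

```python
# Schematic two-level hierarchical VAE encoder: lower latent for syntax,
# upper latent for language-switching signals. All dimensions are made up.
import torch
import torch.nn as nn

class TwoLevelEncoder(nn.Module):
    def __init__(self, vocab=10000, emb=128, h=256, z_syn=32, z_switch=8):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, h, batch_first=True)
        self.syn_head = nn.Linear(h, 2 * z_syn)            # lower level: syntax
        self.switch_head = nn.Linear(z_syn, 2 * z_switch)  # upper level: switching

    def forward(self, tokens):
        _, h_last = self.rnn(self.embed(tokens))           # h_last: (1, batch, h)
        mu_s, lv_s = self.syn_head(h_last.squeeze(0)).chunk(2, dim=-1)
        z_syn = mu_s + torch.exp(0.5 * lv_s) * torch.randn_like(mu_s)
        mu_w, lv_w = self.switch_head(z_syn).chunk(2, dim=-1)
        z_switch = mu_w + torch.exp(0.5 * lv_w) * torch.randn_like(mu_w)
        return (z_syn, mu_s, lv_s), (z_switch, mu_w, lv_w)
```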


Author(s):  
Hang Li ◽  
Haozheng Wang ◽  
Zhenglu Yang ◽  
Haochen Liu

Network representation is the basis of many applications and of extensive interest in various fields, such as information retrieval, social network analysis, and recommendation systems. Most previous methods for network representation consider only part of the problem, such as the link structure alone, node information alone, or a partial integration of the two. The present study proposes a deep network representation model that seamlessly integrates the text information and the structure of a network. Our model captures highly non-linear relationships between nodes and complex features of a network by exploiting the variational autoencoder (VAE), a deep unsupervised generative algorithm. We then merge the representation learned with a paragraph vector model and that learned with the VAE to obtain a network representation that preserves both structure and text information. We conduct comprehensive empirical experiments on benchmark datasets and find that our model outperforms state-of-the-art techniques by a large margin.
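The fusion step can be illustrated as below. The abstract does not specify how the two representations are merged, so simple concatenation of per-node embeddings is shown as the most direct reading; both input embeddings are hypothetical stand-ins.

```python
# Fuse a VAE-learned structural embedding with a doc2vec-style text embedding.
import numpy as np

def fuse_representations(z_structure: np.ndarray, z_text: np.ndarray) -> np.ndarray:
    """Concatenate per-node structure and text embeddings into one representation."""
    assert z_structure.shape[0] == z_text.shape[0], "one row per node"
    return np.concatenate([z_structure, z_text], axis=1)

# e.g., z_structure from a VAE over adjacency rows, z_text from a paragraph vector model:
nodes = 5
fused = fuse_representations(np.random.randn(nodes, 32), np.random.randn(nodes, 100))
print(fused.shape)  # (5, 132)
```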


2020 ◽  
Vol 34 (04) ◽  
pp. 3495-3502 ◽  
Author(s):  
Junxiang Chen ◽  
Kayhan Batmanghelich

Recently, research on unsupervised disentanglement learning with deep generative models has gained substantial popularity. However, without introducing supervision, there is no guarantee that the factors of interest can be successfully recovered (Locatello et al. 2018). Motivated by a real-world problem, we propose a setting where the user introduces weak supervision by providing similarities between instances based on a factor to be disentangled. The similarity is provided as either a binary (yes/no) or a real-valued label describing whether a pair of instances is similar. We propose a new method for weakly supervised disentanglement of latent variables within the Variational Autoencoder framework. Experimental results demonstrate that utilizing weak supervision substantially improves the performance of the disentanglement method.
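One plausible way to inject such pairwise weak supervision is a contrastive penalty on the latent dimensions assigned to the supervised factor, as sketched below; this is an illustrative objective added to the ELBO with a weighting coefficient, not necessarily the authors' exact formulation.

```python
# Contrastive penalty on the supervised latent dimensions: sim=1 pulls a pair
# together, sim=0 pushes it apart beyond a margin; sim may also be real-valued.
import torch

def pairwise_similarity_loss(z_i, z_j, sim, factor_dims=(0,), margin=1.0):
    """z_i, z_j: latent codes of a pair; sim in [0, 1] on the supervised factor."""
    dims = list(factor_dims)
    d = (z_i[:, dims] - z_j[:, dims]).pow(2).sum(dim=1)
    return (sim * d + (1.0 - sim) * torch.clamp(margin - d, min=0.0)).mean()
```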


2021 ◽  
Vol 15 ◽  
pp. 174830262110449
Author(s):  
Kai-Jun Hu ◽  
He-Feng Yin ◽  
Jun Sun

During the past decade, representation-based classification methods have received considerable attention in the pattern recognition community. The recently proposed non-negative representation-based classifier achieved superb recognition results in diverse pattern classification tasks. Unfortunately, the discriminative information of the training data is not fully exploited in the non-negative representation-based classifier, which undermines its classification performance in practical applications. To address this problem, we introduce a decorrelation regularizer into the formulation of the non-negative representation-based classifier and propose a discriminative non-negative representation-based classifier for pattern classification. The decorrelation regularizer reduces the correlation among the representation results of different classes, thus promoting competition among them. Experimental results on benchmark datasets validate the efficacy of the proposed discriminative non-negative representation-based classifier, which can outperform some state-of-the-art deep learning based methods. The source code of our proposed discriminative non-negative representation-based classifier is available at https://github.com/yinhefeng/DNRC.
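The underlying non-negative representation-based classification can be sketched as follows. For brevity the paper's decorrelation regularizer is omitted here (it would enter the coding objective); the full implementation is at the linked repository.

```python
# Base NRC idea: code a test sample non-negatively over training atoms,
# then assign the class whose atoms reconstruct it with smallest residual.
import numpy as np
from scipy.optimize import nnls

def nrc_predict(D, labels, y):
    """D: (features, atoms) training dictionary; labels: per-atom class ids; y: test sample."""
    code, _ = nnls(D, y)                       # non-negative representation of y
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        residuals[c] = np.linalg.norm(y - D[:, mask] @ code[mask])
    return min(residuals, key=residuals.get)   # class with the smallest residual
```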


2021 ◽  
Author(s):  
Benson Chen ◽  
Xiang Fu ◽  
Regina Barzilay ◽  
Tommi Jaakkola

Searching for novel molecular compounds with desired properties is an important problem in drug discovery. Many existing frameworks generate molecules one atom at a time. We instead propose a flexible editing paradigm that generates molecules using learned molecular fragments, i.e., meaningful substructures of molecules. To do so, we train a variational autoencoder (VAE) to encode molecular fragments in a coherent latent space, which we then utilize as a vocabulary for editing molecules to explore the complex chemical property space. Equipped with the learned fragment vocabulary, we propose Fragment-based Sequential Translation (FaST), which learns a reinforcement learning (RL) policy to iteratively translate model-discovered molecules into increasingly novel molecules while satisfying desired properties. Empirical evaluation shows that FaST significantly improves over state-of-the-art methods on benchmark single- and multi-objective molecular optimization tasks.
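The sequential-translation loop can be caricatured as a greedy search, as below; the actual FaST method learns an RL policy rather than greedy selection, and `propose_edits` and `score` are hypothetical stand-ins for the learned fragment vocabulary and the property oracle.

```python
# Toy greedy stand-in for the paper's RL policy: iteratively apply
# fragment edits and keep any candidate that improves the property score.
def sequential_translate(mol, propose_edits, score, n_steps=50):
    best, best_score = mol, score(mol)
    for _ in range(n_steps):
        candidates = propose_edits(best)   # fragment additions/deletions
        if not candidates:
            break
        cand = max(candidates, key=score)
        if score(cand) > best_score:
            best, best_score = cand, score(cand)
    return best
```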


2020 ◽  
Vol 10 (23) ◽  
pp. 8415
Author(s):  
Jeongmin Lee ◽  
Younkyoung Yoon ◽  
Junseok Kwon

We propose a novel generative adversarial network for class-conditional data augmentation (GANDA) to mitigate data imbalance problems in image classification tasks. The proposed GANDA generates minority-class data by exploiting majority-class information to enhance the classification accuracy of minority classes. For stable GAN training, we introduce a new denoising-autoencoder initialization with explicit class conditioning in the latent space, which enables the generation of well-defined samples. The generated samples are visually realistic and have a high resolution. Experimental results demonstrate that the proposed GANDA can considerably improve classification accuracy, especially on highly imbalanced standard benchmark datasets (MNIST and CelebA). The generated samples can easily be used to train conventional classifiers to enhance their classification accuracy.
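Explicit class conditioning in the latent space can be sketched as below: the generator consumes noise concatenated with a one-hot class label, so minority classes can be targeted at sampling time. The architecture details are invented for illustration and are not the GANDA implementation.

```python
# Class-conditional generator: noise z is concatenated with a one-hot label.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalGenerator(nn.Module):
    def __init__(self, z_dim=64, n_classes=10, img_dim=784):
        super().__init__()
        self.n_classes = n_classes
        self.net = nn.Sequential(
            nn.Linear(z_dim + n_classes, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh())

    def forward(self, z, labels):
        onehot = F.one_hot(labels, self.n_classes).float()  # explicit class signal
        return self.net(torch.cat([z, onehot], dim=1))

# Oversample a minority class (here class 3) to rebalance a training set:
g = ConditionalGenerator()
fake = g(torch.randn(16, 64), torch.full((16,), 3, dtype=torch.long))
```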

