Latent-Space Data Augmentation for Visually-Grounded Language Understanding

Author(s):  
Aly Magassouba ◽  
Komei Sugiura ◽  
Hisashi Kawai
2021 ◽  
pp. 149-159
Author(s):  
Chen Chen ◽  
Kerstin Hammernik ◽  
Cheng Ouyang ◽  
Chen Qin ◽  
Wenjia Bai ◽  
...  

Author(s):  
Xiaofeng Liu ◽  
Yang Zou ◽  
Lingsheng Kong ◽  
Zhihui Diao ◽  
Junliang Yan ◽  
...  

2021 ◽  
Author(s):  
Helin Wang ◽  
Yuexian Zou ◽  
Wenwu Wang

In this paper, we present SpecAugment++, a novel data augmentation method for deep neural networks based acoustic scene classification (ASC). Different from other popular data augmentation methods such as SpecAugment and mixup that only work on the input space, SpecAugment++ is applied to both the input space and the hidden space of the deep neural networks to enhance the input and the intermediate feature representations. For an intermediate hidden state, the augmentation techniques consist of masking blocks of frequency channels and masking blocks of time frames, which improve generalization by enabling a model to attend not only to the most discriminative parts of the feature, but also the entire parts. Apart from using zeros for masking, we also examine two approaches for masking based on the use of other samples within the mini-batch, which helps introduce noises to the networks to make them more discriminative for classification. The experimental results on the DCASE 2018 Task1 dataset and DCASE 2019 Task1 dataset show that our proposed method can obtain 3.6% and 4.7% accuracy gains over a strong baseline without augmentation (i.e. CP-ResNet) respectively, and outperforms other previous data augmentation methods.
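The hidden-state masking described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `mask_hidden_state`, the `(batch, channels, freq, time)` layout, the width parameters, and the `"swap"` fill are all assumptions made for illustration. The abstract does not say what the two mini-batch-based approaches are, so `"swap"` (copying the masked region from another sample in the batch) is only one plausible reading.

```python
import numpy as np


def mask_hidden_state(h, max_freq_width, max_time_width, mode="zero", rng=None):
    """Mask one frequency block and one time block of a hidden state.

    h    : array of shape (batch, channels, freq, time)  [assumed layout]
    mode : "zero" fills the blocks with zeros;
           "swap" fills them with the same region taken from another
           sample of the mini-batch (hypothetical variant, see lead-in).
    Returns the masked copy and the block positions (f0, f, t0, t).
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, _, n_freq, n_time = h.shape
    out = h.copy()

    # Draw block widths in [0, max_width] and start positions that fit.
    f = int(rng.integers(0, min(max_freq_width, n_freq) + 1))
    f0 = int(rng.integers(0, n_freq - f + 1))
    t = int(rng.integers(0, min(max_time_width, n_time) + 1))
    t0 = int(rng.integers(0, n_time - t + 1))

    if mode == "zero":
        out[:, :, f0:f0 + f, :] = 0.0
        out[:, :, :, t0:t0 + t] = 0.0
    elif mode == "swap":
        # Donors are the same mini-batch in a random order.
        donors = h[rng.permutation(batch)]
        out[:, :, f0:f0 + f, :] = donors[:, :, f0:f0 + f, :]
        out[:, :, :, t0:t0 + t] = donors[:, :, :, t0:t0 + t]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return out, (f0, f, t0, t)


# Example: mask a random hidden state of 4 samples, 2 channels, 8 x 10 bins.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 2, 8, 10))
masked, (f0, f, t0, t) = mask_hidden_state(h, 3, 4, mode="zero", rng=rng)
```

In a training loop, a function like this would be applied to the input spectrogram and to selected intermediate feature maps during training only; which layers to mask is a design choice the abstract does not specify.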


2019 ◽  
Vol 55 ◽  
pp. 136-147 ◽  
Author(s):  
Ilkay Oksuz ◽  
Bram Ruijsink ◽  
Esther Puyol-Antón ◽  
James R. Clough ◽  
Gastao Cruz ◽  
...  

2021 ◽  
Vol 503 (3) ◽  
pp. 3351-3370
Author(s):  
David J Bastien ◽  
Anna M M Scaife ◽  
Hongming Tang ◽  
Micah Bowles ◽  
Fiona Porter

ABSTRACT We present a model for generating postage stamp images of synthetic Fanaroff–Riley Class I and Class II radio galaxies suitable for use in simulations of future radio surveys such as those being developed for the Square Kilometre Array. This model uses a fully connected neural network to implement structured variational inference through a variational autoencoder and decoder architecture. In order to optimize the dimensionality of the latent space for the autoencoder, we introduce the radio morphology inception score (RAMIS), a quantitative method for assessing the quality of generated images, and discuss in detail how data pre-processing choices can affect the value of this measure. We examine the 2D latent space of the VAEs and discuss how this can be used to control the generation of synthetic populations, whilst also cautioning how it may lead to biases when used for data augmentation.
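As a rough illustration of how a 2D latent space can steer generation, the sketch below decodes a regular grid of latent points with a small fully connected decoder. Everything in it is an assumption for illustration: the weights are random rather than trained, the layer sizes and the 8 x 8 image size are arbitrary, and the helper names (`reparameterize`, `decode`, `latent_grid`) are hypothetical rather than taken from the paper.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Standard VAE reparameterization: z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def decode(z, w1, b1, w2, b2):
    """Two-layer fully connected decoder with a sigmoid output."""
    hidden = np.maximum(0.0, z @ w1 + b1)   # ReLU hidden layer
    logits = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))    # pixel values in [0, 1]


def latent_grid(lo, hi, steps):
    """Regular grid of points covering [lo, hi]^2 in a 2D latent space."""
    axis = np.linspace(lo, hi, steps)
    z1, z2 = np.meshgrid(axis, axis)
    return np.stack([z1.ravel(), z2.ravel()], axis=1)


# Untrained, randomly initialised decoder (assumption: stands in for a
# trained model so that the example is self-contained).
rng = np.random.default_rng(0)
latent_dim, hidden_dim, side = 2, 16, 8
w1 = rng.normal(scale=0.5, size=(latent_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(scale=0.5, size=(hidden_dim, side * side))
b2 = np.zeros(side * side)

grid = latent_grid(-2.0, 2.0, 5)                                # 25 latent points
images = decode(grid, w1, b1, w2, b2).reshape(-1, side, side)   # 25 images
```

A uniform grid is not a draw from the latent prior, so a population generated this way need not follow the class balance of the training data. That is one way the augmentation bias mentioned in the abstract could arise.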


Author(s):  
Mathis Peyron ◽  
Anthony Fillion ◽  
Selime Gürol ◽  
Victor Marchais ◽  
Serge Gratton ◽  
...  

Author(s):  
Kang Min Yoo ◽  
Youhyun Shin ◽  
Sang-goo Lee

Data scarcity is one of the main obstacles to domain adaptation in spoken language understanding (SLU), owing to the high cost of creating manually tagged SLU datasets. Recent work on neural text generative models, particularly latent variable models such as the variational autoencoder (VAE), has shown promising results in generating plausible and natural sentences. In this paper, we propose a novel generative architecture that leverages the generative power of latent variable models to jointly synthesize fully annotated utterances. Our experiments show that existing SLU models trained on the additional synthetic examples achieve performance gains. Our approach not only helps alleviate the data scarcity issue in the SLU task for many datasets but also improves language understanding performance across various SLU models, as supported by extensive experiments and rigorous statistical testing.


2020 ◽  
Vol 10 (23) ◽  
pp. 8415
Author(s):  
Jeongmin Lee ◽  
Younkyoung Yoon ◽  
Junseok Kwon

We propose a novel generative adversarial network for class-conditional data augmentation (i.e., GANDA) to mitigate data imbalance problems in image classification tasks. The proposed GANDA generates minority class data by exploiting majority class information to enhance the classification accuracy of minority classes. For stable GAN training, we introduce a new denoising autoencoder initialization with explicit class conditioning in the latent space, which enables the generation of definite samples. The generated samples are visually realistic and have a high resolution. Experimental results demonstrate that the proposed GANDA can considerably improve classification accuracy, especially when datasets are highly imbalanced on standard benchmark datasets (i.e., MNIST and CelebA). Our generated samples can be easily used to train conventional classifiers to enhance their classification accuracy.

