Searching the Latent Space of a Generative Adversarial Network to Generate DOOM Levels

Existing domain adaptation methods for visual sentiment classification are typically investigated under the single-source scenario, where knowledge learned from a source domain with sufficient labeled data is transferred to a target domain with loosely labeled or unlabeled data. In practice, however, data from a single source domain are usually limited in volume and can hardly cover the characteristics of the target domain. In this paper, we propose a novel multi-source domain adaptation (MDA) method, termed Multi-source Sentiment Generative Adversarial Network (MSGAN), for visual sentiment classification. To handle data from multiple source domains, it learns a unified sentiment latent space in which data from the source and target domains share a similar distribution. This is achieved via cycle-consistent adversarial learning in an end-to-end manner. Extensive experiments on four benchmark datasets demonstrate that MSGAN significantly outperforms state-of-the-art MDA approaches for visual sentiment classification.

Improving Deep Interactive Evolution with a Style-Based Generator for Artistic Expression and Creative Exploration

Entropy ◽ 10.3390/e23010011 ◽ 2020 ◽ Vol 23 (1) ◽ pp. 11

Author(s): Carlos Tejeda-Ocampo ◽ Armando López-Cuevas ◽ Hugo Terashima-Marin

Keyword(s): Evolutionary Computation ◽ Artistic Expression ◽ Interactive Evolutionary Computation ◽ Generative Adversarial Network ◽ Domain Specific ◽ Adversarial Network ◽ Latent Space

Deep interactive evolution (DeepIE) combines the capacity of interactive evolutionary computation (IEC) to capture a user’s preference with the domain-specific robustness of a trained generative adversarial network (GAN) generator, allowing the user to control the GAN output through evolutionary exploration of the latent space. However, the traditional GAN latent space exhibits feature entanglement, which limits the practical applications of DeepIE. In this paper, we implement DeepIE within a style-based generator from a StyleGAN model trained on the WikiArt dataset and propose StyleIE, a variation of DeepIE that takes advantage of the secondary disentangled latent space in the style-based generator. We performed two AB/BA crossover user tests that compared the performance of DeepIE against StyleIE for art generation. Self-rated evaluations of the performance were collected through a questionnaire. Findings from the tests suggest that StyleIE and DeepIE perform equally well in tasks with open-ended goals and relaxed constraints, but StyleIE performs better in closed-ended, more constrained tasks.
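
The evolutionary exploration of a latent space described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform crossover, the Gaussian mutation, and all parameter values are assumptions chosen for illustration, and the trained generator that would turn each vector into an image is left out.

```python
import random

def random_latent(dim, rng):
    """Sample one latent vector from a standard normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def crossover(a, b, rng):
    """Uniform crossover: each coordinate is copied from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(z, rng, sigma=0.5, rate=0.3):
    """Add Gaussian noise to a random subset of coordinates (values are illustrative)."""
    return [x + rng.gauss(0.0, sigma) if rng.random() < rate else x for x in z]

def next_generation(population, selected, rng, n_fresh=2):
    """Build the next population from the latent vectors the user selected.

    population: list of latent vectors currently shown to the user
    selected:   indices of the outputs the user liked (assumed non-empty)
    n_fresh:    number of new random vectors kept for exploration
    """
    size, dim = len(population), len(population[0])
    parents = [population[i] for i in selected]
    children = [list(p) for p in parents]  # selected vectors survive unchanged
    while len(children) < size - n_fresh:
        a, b = rng.choice(parents), rng.choice(parents)
        children.append(mutate(crossover(a, b, rng), rng))
    while len(children) < size:
        children.append(random_latent(dim, rng))
    return children[:size]

rng = random.Random(0)
population = [random_latent(8, rng) for _ in range(10)]
# In an interactive system these indices would come from the user's choices.
population = next_generation(population, selected=[1, 4], rng=rng)
```

Under the same assumptions, a StyleIE-like variant would apply these operators to vectors in the style-based generator's secondary latent space rather than to the input noise vectors.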

Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i07.6817 ◽ 2020 ◽ Vol 34 (07) ◽ pp. 11515-11522

Author(s): Kaiyi Lin ◽ Xing Xu ◽ Lianli Gao ◽ Zheng Wang ◽ Heng Tao Shen

Keyword(s): Large Scale ◽ Semantic Space ◽ Generative Adversarial Network ◽ Adversarial Network ◽ Latent Space ◽ Multimodal Features ◽ Benchmark Datasets ◽ Class Labels ◽ Low Dimensional ◽ Sketch Retrieval

Zero-Shot Cross-Modal Retrieval (ZS-CMR) is an emerging research topic that aims to retrieve data of new classes across different modalities. It is challenging not only because of the heterogeneous distributions across modalities, but also because of the inconsistent semantics across seen and unseen classes. A handful of recently proposed methods borrow the idea from zero-shot learning: they exploit word embeddings of class labels (class-embeddings) as a common semantic space and use a generative adversarial network (GAN) to capture the underlying multimodal data structures and to strengthen the relations between input data and the semantic space, so as to generalize across seen and unseen classes. In this paper, we propose a novel method termed Learning Cross-Aligned Latent Embeddings (LCALE) as an alternative to these GAN-based methods for ZS-CMR. Rather than using the class-embeddings as the semantic space, our method seeks a shared low-dimensional latent space for the input multimodal features and the class-embeddings by means of modality-specific variational autoencoders. Notably, we align the distributions learned from the multimodal input features and from the class-embeddings to construct latent embeddings that contain the essential cross-modal correlation associated with unseen classes. Effective cross-reconstruction and cross-alignment criteria are further developed to preserve class-discriminative information in the latent space, which improves retrieval efficiency and enables knowledge transfer to unseen classes. We evaluate our model on four benchmark datasets for image-text retrieval and one large-scale dataset for image-sketch retrieval. The experimental results show that our method establishes new state-of-the-art performance for both tasks on all datasets.
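
As a rough illustration of what aligning two learned latent distributions can mean, the sketch below computes the closed-form 2-Wasserstein distance between two Gaussians with diagonal covariance, such as the per-sample outputs of two variational autoencoder encoders. The abstract does not state which alignment criterion LCALE uses, so the choice of this distance, the function name, and the example values are all assumptions.

```python
import math

def w2_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """2-Wasserstein distance between N(mu_a, diag(sigma_a^2)) and N(mu_b, diag(sigma_b^2)).

    For diagonal covariances the squared distance reduces to
    ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return math.sqrt(mean_term + std_term)

# Hypothetical encoder outputs for one image feature and its class-embedding.
image_mu, image_sigma = [0.0, 0.0], [1.0, 1.0]
label_mu, label_sigma = [3.0, 4.0], [1.0, 1.0]
alignment_loss = w2_diag_gaussians(image_mu, image_sigma, label_mu, label_sigma)
```

Minimizing such a term during training would pull the two encoders' latent distributions together; a cross-reconstruction term (decoding one modality's latent code with another modality's decoder) would be added separately.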

BEGAN v3: Avoiding Mode Collapse in GANs Using Variational Inference

Electronics ◽ 10.3390/electronics9040688 ◽ 2020 ◽ Vol 9 (4) ◽ pp. 688

Author(s): Sung-Wook Park ◽ Jun-Ho Huh ◽ Jong-Chan Kim

Keyword(s): Image Synthesis ◽ Generative Model ◽ Generative Adversarial Networks ◽ Generative Adversarial Network ◽ Compression Performance ◽ Adversarial Network ◽ Adversarial Networks ◽ Latent Space ◽ Boundary Equilibrium ◽ Space Compression

In the field of deep learning, generative models did not attract much attention until generative adversarial networks (GANs) appeared. GANs were proposed in 2014 by Ian Goodfellow and colleagues, and they differ from earlier generative models in both structure and objective function. GANs use two neural networks: a generator that creates a realistic image, and a discriminator that distinguishes whether its input is real or synthetic. If training proceeds without problems, GANs can generate images whose authenticity is difficult to judge even for experts. GANs are currently among the most researched subjects in computer vision, covering image style translation, synthesis, and generation; various models have been unveiled, and the issues raised are being addressed one by one. In image synthesis, BEGAN (Boundary Equilibrium Generative Adversarial Network), which outperforms previously announced GANs, learns the latent space of the image while balancing the generator and discriminator. Nonetheless, BEGAN also suffers from mode collapse, in which the generator produces only a few images or a single one. BEGAN-CS (Boundary Equilibrium Generative Adversarial Network with Constrained Space), which improved the loss function, was introduced later, but it did not solve mode collapse. The discriminator of BEGAN-CS is an autoencoder (AE), which cannot create a particularly useful or structured latent space, and its compression performance is not good either. In this paper, this characteristic of the AE is considered to be related to the occurrence of mode collapse. We therefore used a variational autoencoder (VAE), which adds statistical techniques to the AE. In our experiments, the proposed model did not exhibit mode collapse and converged to a better state than BEGAN-CS.
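
The balancing of generator and discriminator mentioned in this abstract can be sketched with the update rule from the original BEGAN formulation, in which the discriminator is an autoencoder and each loss is a reconstruction error. This is a sketch of that baseline only, not of BEGAN-CS or of the VAE-based model proposed here; the function name and the numeric values in the example are illustrative assumptions.

```python
def began_step(loss_real, loss_fake, k, gamma=0.5, lambda_k=0.001):
    """One bookkeeping step of the BEGAN equilibrium scheme.

    loss_real: autoencoder reconstruction error on a real batch, L(x)
    loss_fake: autoencoder reconstruction error on a generated batch, L(G(z))
    k:         current balance coefficient k_t, kept in [0, 1]
    gamma:     target ratio L(G(z)) / L(x) (diversity hyperparameter)
    lambda_k:  step size for updating k
    """
    d_loss = loss_real - k * loss_fake         # discriminator objective
    g_loss = loss_fake                         # generator objective
    balance = gamma * loss_real - loss_fake    # zero at equilibrium
    k_next = min(max(k + lambda_k * balance, 0.0), 1.0)
    convergence = loss_real + abs(balance)     # global convergence measure
    return d_loss, g_loss, k_next, convergence

# Illustrative numbers only; real losses come from the networks during training.
d_loss, g_loss, k, m = began_step(loss_real=1.0, loss_fake=0.2, k=0.0, lambda_k=0.1)
```

When the generator's reconstruction error falls below the target ratio, k grows and the discriminator puts more weight on rejecting generated images, which is how the scheme keeps the two networks in balance.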

Generative Adversarial Network for Class-Conditional Data Augmentation

Applied Sciences ◽ 10.3390/app10238415 ◽ 2020 ◽ Vol 10 (23) ◽ pp. 8415

Author(s): Jeongmin Lee ◽ Younkyoung Yoon ◽ Junseok Kwon

Keyword(s): Classification Accuracy ◽ Data Augmentation ◽ Denoising Autoencoder ◽ Generative Adversarial Network ◽ Minority Class ◽ Adversarial Network ◽ Latent Space ◽ Benchmark Datasets ◽ Class Information ◽ Classification Tasks

We propose a novel generative adversarial network for class-conditional data augmentation (GANDA) to mitigate data imbalance problems in image classification tasks. The proposed GANDA generates minority-class data by exploiting majority-class information to enhance the classification accuracy of minority classes. For stable GAN training, we introduce a new denoising autoencoder initialization with explicit class conditioning in the latent space, which enables the generation of definite samples. The generated samples are visually realistic and have a high resolution. Experimental results on standard benchmark datasets (MNIST and CelebA) demonstrate that the proposed GANDA can considerably improve classification accuracy, especially when the datasets are highly imbalanced. Our generated samples can be easily used to train conventional classifiers to enhance their classification accuracy.

A Latent Space Understandable Generative Adversarial Network: SelfExGAN

2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA) ◽ 10.1109/dicta.2017.8227390 ◽ 2017 ◽ Cited By ~ 1

Author(s): Yongjie Liu ◽ Qianlong Wang ◽ Yanlei Gu ◽ Shunsuke Kamijo

Keyword(s): Generative Adversarial Network ◽ Adversarial Network ◽ Latent Space

Meta-learning with Latent Space Clustering in Generative Adversarial Network for Speaker Diarization

IEEE/ACM Transactions on Audio Speech and Language Processing ◽ 10.1109/taslp.2021.3061885 ◽ 2021 ◽ pp. 1-1

Author(s): Monisankha Pal ◽ Manoj Kumar ◽ Raghuveer Peri ◽ Tae Jin Park ◽ So Hyun Kim ◽ ...

Keyword(s): Speaker Diarization ◽ Generative Adversarial Network ◽ Adversarial Network ◽ Latent Space ◽ Meta Learning

Zero-shot Cross-modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network

ACM Transactions on Multimedia Computing Communications and Applications ◽ 10.1145/3424341 ◽ 2021 ◽ Vol 17 (1s) ◽ pp. 1-17

Author(s): Xing Xu ◽ Jialin Tian ◽ Kaiyi Lin ◽ Huimin Lu ◽ Jie Shao ◽ ...

Keyword(s): Knowledge Transfer ◽ Generative Models ◽ Training Set ◽ Retrieval Task ◽ Generative Adversarial Network ◽ Multimodal Data ◽ Common Space ◽ Adversarial Network ◽ Latent Space ◽ Testing Set

Conventional cross-modal retrieval models mainly assume that the training set and the testing set cover the same classes. This assumption limits their extensibility to zero-shot cross-modal retrieval (ZS-CMR), where the testing set consists of unseen classes that are disjoint from the seen classes in the training set. The ZS-CMR task is more challenging because of the heterogeneous distributions of different modalities and the semantic inconsistency between seen and unseen classes. A few recently proposed approaches, inspired by zero-shot learning (ZSL), estimate the distribution underlying multimodal data with generative models and transfer knowledge from seen classes to unseen classes by leveraging class embeddings. However, directly borrowing the idea from ZSL is not fully suited to the retrieval task, since the core of the retrieval task is learning the common space. To address these issues, we propose a novel approach named Assembling AutoEncoder and Generative Adversarial Network (AAEGAN), which combines the strengths of the AutoEncoder (AE) and the Generative Adversarial Network (GAN) to jointly incorporate common latent space learning, knowledge transfer, and feature synthesis for ZS-CMR. In addition, instead of using class embeddings as the common space, AAEGAN maps all multimodal data into a learned latent space with distribution alignment via three coupled AEs. We empirically show a remarkable improvement on the ZS-CMR task and establish state-of-the-art or competitive performance on four image-text retrieval datasets.
