Guiding Representation Learning in Deep Generative Models with Policy Gradients

A Survey on Variational Autoencoders from a Green AI Perspective

SN Computer Science ◽

10.1007/s42979-021-00702-9 ◽

2021 ◽

Vol 2 (4) ◽

Author(s):

Andrea Asperti ◽

Davide Evangelista ◽

Elena Loli Piccolomini

Keyword(s):

Architectural Design ◽

Mathematical Formulation ◽

Representation Learning ◽

Generative Models ◽

Model Description ◽

Energetic Efficiency ◽

Detailed Model ◽

Latent Distribution ◽

Quantitative Results ◽

Generation Problem

Variational Autoencoders (VAEs) are powerful generative models that merge elements from statistics and information theory with the flexibility offered by deep neural networks to efficiently solve the generation problem for high-dimensional data. The key insight of VAEs is to learn the latent distribution of data in such a way that new meaningful samples can be generated from it. This approach has led to extensive research and many variations in the architectural design of VAEs, nourishing the recent field of research known as unsupervised representation learning. In this article, we provide a comparative evaluation of some of the most successful recent variations of VAEs. We focus the analysis on the energetic efficiency of the different models, in the spirit of the so-called Green AI, aiming to reduce both the carbon footprint and the financial cost of generative techniques. For each architecture, we provide its mathematical formulation, the ideas underlying its design, a detailed model description, a running implementation, and quantitative results.
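As a reading aid, the objective usually optimized by a VAE can be written as the evidence lower bound (ELBO). This is the standard textbook form, not necessarily the notation used in the survey; the symbols $q_\phi$ (encoder), $p_\theta$ (decoder), and $p(z)$ (latent prior) are assumptions introduced here.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
```

The first term rewards accurate reconstruction of $x$ from the latent code, while the second keeps the learned latent distribution close to the prior so that new samples can be generated by drawing $z \sim p(z)$.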


TARA: Training and Representation Alteration for AI Fairness and Domain Generalization

Neural Computation ◽

10.1162/neco_a_01468 ◽

2022 ◽

pp. 1-38

Author(s):

William Paul ◽

Armin Hadzic ◽

Neil Joshi ◽

Fady Alajaji ◽

Philippe Burlina

Keyword(s):

Domain Adaptation ◽

Representation Learning ◽

Data Representation ◽

Generative Models ◽

Underrepresented Populations ◽

Latent Space ◽

Dual Strategy ◽

Fine Control ◽

Novel Method ◽

And Training

We propose a novel method for enforcing AI fairness with respect to protected or sensitive factors. This method uses a dual strategy, training and representation alteration (TARA), to mitigate prominent causes of AI bias. It combines representation learning alteration via adversarial independence, which suppresses the bias-inducing dependence of the data representation on protected factors, with training set alteration via intelligent augmentation, which addresses bias-causing data imbalance by using generative models that allow fine control of sensitive factors related to underrepresented populations via domain adaptation and latent space manipulation. When testing our methods on image analytics, experiments demonstrate that TARA significantly or fully debiases baseline models while outperforming competing debiasing methods that have the same amount of information; for example, with (% overall accuracy, % accuracy gap) = (78.8, 0.5) versus the baseline method's score of (71.8, 10.5) for Eye-PACS, and (73.7, 11.8) versus (69.1, 21.7) for CelebA. Furthermore, recognizing certain limitations in current metrics used for assessing debiasing performance, we propose novel conjunctive debiasing metrics. Our experiments also demonstrate the ability of these novel metrics to assess the Pareto efficiency of the proposed methods.
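To make the reported pair (% overall accuracy, % accuracy gap) concrete, here is a minimal Python sketch. It assumes the gap is the difference between the best and worst per-group accuracy, which the abstract does not spell out; the function name and the toy labels are hypothetical.

```python
from collections import defaultdict


def accuracy_and_gap(y_true, y_pred, groups):
    """Return (overall accuracy %, accuracy gap %).

    Assumption: the gap is max minus min per-group accuracy,
    where groups are defined by a protected factor.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    overall = 100.0 * sum(correct.values()) / sum(total.values())
    per_group = [100.0 * correct[g] / total[g] for g in total]
    return overall, max(per_group) - min(per_group)


# Toy data (hypothetical): two groups of four samples each.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
overall, gap = accuracy_and_gap(y_true, y_pred, groups)
print(overall, gap)
```

A debiasing method improves on a baseline in this sense when it raises the first number while shrinking the second.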


Neural generative models and representation learning for information retrieval

ACM SIGIR Forum ◽

10.1145/3458553.3458565 ◽

2019 ◽

Vol 53 (2) ◽

pp. 97-97

Author(s):

Qingyao Ai

Keyword(s):

Information Retrieval ◽

Theoretical Analysis ◽

Language Processing ◽

Ad Hoc ◽

Representation Learning ◽

Generative Models ◽

Neural Models ◽

Retrieval Models ◽

Types Of Information ◽

Text Images

Information Retrieval (IR) concerns the structure, analysis, organization, storage, and retrieval of information. Among the retrieval models proposed in the past decades, generative retrieval models, especially those under the statistical probabilistic framework, are among the most popular techniques and have been widely applied to IR problems. While they are known for their well-grounded theory and good empirical performance in text retrieval, their applications in IR are often limited by their complexity and low extensibility in the modeling of high-dimensional information. Recently, advances in deep learning techniques have provided new opportunities for representation learning and generative models for information retrieval. In contrast to statistical models, neural models have much more flexibility because they model information and data correlation in latent spaces without explicitly relying on any prior knowledge. Previous studies on pattern recognition and natural language processing have shown that semantically meaningful representations of text, images, and many other types of information can be acquired with neural models through supervised or unsupervised training. Nonetheless, the effectiveness of neural models for information retrieval is mostly unexplored. In this thesis, we study how to develop new generative models and representation learning frameworks with neural models for information retrieval.
Specifically, our contributions include three main components: (1) Theoretical Analysis: We present the first theoretical analysis and adaptation of existing neural embedding models for ad-hoc retrieval tasks; (2) Design Practice: Based on our experience and knowledge, we show how to design an embedding-based neural generative model for practical information retrieval tasks such as personalized product search; and (3) Generic Framework: We further generalize our proposed neural generative framework for complicated heterogeneous information retrieval scenarios that concern text, images, knowledge entities, and their relationships. Empirical results show that the proposed neural generative framework can effectively learn information representations and construct retrieval models that outperform the state-of-the-art systems in a variety of IR tasks.


Robot Concept Acquisition Based on Interaction Between Probabilistic and Deep Generative Models

Frontiers in Computer Science ◽

10.3389/fcomp.2021.618069 ◽

2021 ◽

Vol 3 ◽

Author(s):

Ryo Kuniyasu ◽

Tomoaki Nakamura ◽

Tadahiro Taniguchi ◽

Takayuki Nagai

Keyword(s):

Latent Variables ◽

Concept Formation ◽

Latent Dirichlet Allocation ◽

Multinomial Distribution ◽

Representation Learning ◽

Integrated Model ◽

Generative Models ◽

Image Features ◽

Sensory Data ◽

Multimodal Information

We propose a method for multimodal concept formation. In this method, unsupervised multimodal clustering and cross-modal inference, as well as unsupervised representation learning, can be performed by integrating multimodal latent Dirichlet allocation (MLDA)-based concept formation and variational autoencoder (VAE)-based feature extraction. Multimodal clustering, representation learning, and cross-modal inference are critical for robots to form multimodal concepts from sensory data. Various models have been proposed for concept formation. However, in previous studies, features were extracted using manually designed or pre-trained feature extractors, and representation learning was not performed simultaneously. Moreover, the generative probabilities of the features extracted from the sensory data could be predicted, but the sensory data themselves could not be predicted in the cross-modal inference. Therefore, a method that can perform clustering, feature learning, and cross-modal inference among multimodal sensory data is required for concept formation. To realize such a method, we extend the VAE to the multinomial VAE (MNVAE), the latent variables of which follow a multinomial distribution, and construct a model that integrates the MNVAE and MLDA. In the experiments, the multimodal information of the images and words acquired by a robot was classified using the integrated model. The results demonstrated that the integrated model can classify the multimodal information as accurately as the previous model even though the feature extractor is learned in an unsupervised manner, that suitable image features for clustering can be learned, and that cross-modal inference from words to images is possible.


Unsupervised Disentangled Representation Learning with Analogical Relations

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/335 ◽

2018 ◽

Cited By ~ 3

Author(s):

Zejian Li ◽

Yongchuan Tang ◽

Yongxing He

Keyword(s):

Artificial Intelligence ◽

Cognitive Processes ◽

State Of The Art ◽

Representation Learning ◽

Subspace Learning ◽

Generative Models ◽

The State ◽

The Other ◽

Learning Methods ◽

Training Strategy

Learning a disentangled representation of the interpretable generative factors of data is one of the foundations for enabling artificial intelligence to think like people. In this paper, we propose an analogical training strategy for unsupervised disentangled representation learning in generative models. Analogy is a typical cognitive process, and our proposed strategy is based on the observation that sample pairs in which one sample differs from the other in one specific generative factor show the same analogical relation. Thus, the generator is trained to generate sample pairs from which a designed classifier can identify the underlying analogical relation. In addition, we propose a disentanglement metric called the subspace score, which is inspired by subspace learning methods and does not require supervised information. Experiments show that our proposed training strategy allows generative models to find the disentangled factors, and that our methods achieve competitive performance compared with state-of-the-art methods.


A Survey of Unsupervised Generative Models for Exploratory Data Analysis and Representation Learning

ACM Computing Surveys ◽

10.1145/3450963 ◽

2021 ◽

Vol 54 (5) ◽

pp. 1-40

Author(s):

Mohanad Abukmeil ◽

Stefano Ferrari ◽

Angelo Genovese ◽

Vincenzo Piuri ◽

Fabio Scotti

Keyword(s):

Big Data ◽

Data Analysis ◽

Exploratory Data Analysis ◽

Representation Learning ◽

Data Representation ◽

Generative Models ◽

Data Exploration ◽

Generative Learning ◽

Learning Models ◽

Exploratory Data

For more than a century, the methods for data representation and the exploration of the intrinsic structures of data have developed remarkably and now comprise both supervised and unsupervised methods. However, recent years have witnessed the flourishing of big data, where typical dataset dimensions are high and the data can come in messy, incomplete, unlabeled, or corrupted forms. Consequently, discovering the hidden structure buried inside such data becomes highly challenging. From this perspective, exploratory data analysis plays a substantial role in learning the hidden structures that encompass the significant features of the data in an ordered manner by extracting patterns and testing hypotheses to identify anomalies. Unsupervised generative learning models are a class of machine learning models characterized by their potential to reduce the dimensionality, discover the exploratory factors, and learn representations without any predefined labels; moreover, such models can generate the data from the reduced factors' domain. In this survey, beginning researchers can find recent unsupervised generative learning models for data exploration and representation learning; specifically, this article covers three families of methods based on their usage in the era of big data: blind source separation, manifold learning, and neural networks, from shallow to deep architectures.


Unsupervised Discovery, Control, and Disentanglement of Semantic Attributes With Applications to Anomaly Detection

Neural Computation ◽

10.1162/neco_a_01359 ◽

2021 ◽

Vol 33 (3) ◽

pp. 802-826

Author(s):

William Paul ◽

I-Jeng Wang ◽

Fady Alajaji ◽

Philippe Burlina

Keyword(s):

Mutual Information ◽

Anomaly Detection ◽

Network Architecture ◽

State Of The Art ◽

Representation Learning ◽

Generative Models ◽

Superior Performance ◽

Detection Methods ◽

Latent Factors ◽

Semantic Attributes

Our work focuses on unsupervised and generative methods that address the following goals: (1) learning unsupervised generative representations that discover latent factors controlling image semantic attributes, (2) studying how this ability to control attributes formally relates to the issue of latent factor disentanglement, clarifying related but dissimilar concepts that had been confounded in the past, and (3) developing anomaly detection methods that leverage representations learned in the first goal. For goal 1, we propose a network architecture that exploits the combination of multiscale generative models with mutual information (MI) maximization. For goal 2, we derive an analytical result, lemma 1, that brings clarity to two related but distinct concepts: the ability of generative networks to control semantic attributes of images they generate, resulting from MI maximization, and the ability to disentangle latent space representations, obtained via total correlation minimization. More specifically, we demonstrate that maximizing semantic attribute control encourages disentanglement of latent factors. Using lemma 1 and adopting MI in our loss function, we then show empirically that for image generation tasks, the proposed approach exhibits superior performance as measured in the quality and disentanglement of the generated images when compared to other state-of-the-art methods, with quality assessed via the Fréchet inception distance (FID) and disentanglement via mutual information gap. For goal 3, we design several systems for anomaly detection exploiting representations learned in goal 1 and demonstrate their performance benefits when compared to state-of-the-art generative and discriminative algorithms. Our contributions in representation learning have potential applications in addressing other important problems in computer vision, such as bias and privacy in AI.


Classification with Rejection: Scaling Generative Classifiers with Supervised Deep Infomax

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/412 ◽

2020 ◽

Author(s):

Xin Wang ◽

Siu Ming Yiu

Keyword(s):

Representation Learning ◽

Generative Models ◽

Probabilistic Constraints ◽

Class Label ◽

Learning Framework ◽

Level Data ◽

Data Representations ◽

Comparable Performance ◽

Adversarial Examples ◽

High Level

Deep InfoMax (DIM) is an unsupervised representation learning framework that maximizes the mutual information between the inputs and the outputs of an encoder while imposing probabilistic constraints on the outputs. In this paper, we propose Supervised Deep InfoMax (SDIM), which introduces supervised probabilistic constraints to the encoder outputs. The supervised probabilistic constraints are equivalent to a generative classifier on high-level data representations, where class-conditional log-likelihoods of samples can be evaluated. Unlike other works that build generative classifiers with conditional generative models, SDIMs scale to complex datasets and can achieve performance comparable to that of their discriminative counterparts. With SDIM, we can perform classification with rejection. Instead of always reporting a class label, SDIM makes a prediction only when a test sample's largest class-conditional log-likelihood surpasses a pre-chosen threshold; otherwise the sample is deemed to be outside the data distribution and is rejected. Our experiments show that SDIM with the rejection policy can effectively reject illegal inputs, including adversarial examples and out-of-distribution samples.
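The rejection rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: it assumes the class-conditional log-likelihoods are already computed and that one threshold is chosen per class; the function name and numbers are hypothetical.

```python
def classify_with_rejection(class_log_likelihoods, thresholds):
    """Return the predicted class index, or None if the sample is rejected.

    class_log_likelihoods[c] is the class-conditional log-likelihood of the
    sample under class c; thresholds[c] is the pre-chosen threshold for
    class c (per-class thresholds are an assumption of this sketch).
    """
    # Pick the class with the largest class-conditional log-likelihood.
    best = max(range(len(class_log_likelihoods)),
               key=class_log_likelihoods.__getitem__)
    # Predict only if that largest value surpasses its threshold.
    if class_log_likelihoods[best] > thresholds[best]:
        return best
    return None


thresholds = [-3.0, -3.0, -3.0]
accepted = classify_with_rejection([-5.0, -1.0, -9.0], thresholds)
rejected = classify_with_rejection([-5.0, -4.0, -9.0], thresholds)
print(accepted, rejected)
```

Returning `None` rather than a fallback label keeps rejected inputs, such as adversarial or out-of-distribution samples, clearly separated from ordinary predictions.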


Emergence of a "visual number sense" in hierarchical generative models

PsycEXTRA Dataset ◽

10.1037/e512592013-298 ◽

2011 ◽

Author(s):

M. Zorzi ◽

I. Stoianov

Keyword(s):

Number Sense ◽

Generative Models ◽

Visual Number


Molecular Generation Targeting Desired Electronic Properties via Deep Generative Models

10.26434/chemrxiv.9913865.v2 ◽

2019 ◽

Author(s):

Qi Yuan ◽

Alejandro Santana-Bonilla ◽

Martijn Zwijnenburg ◽

Kim Jelfs

Keyword(s):

Neural Network ◽

Electronic Properties ◽

Transfer Learning ◽

Recurrent Neural Network ◽

Chemical Space ◽

Generative Models ◽

Molecular Features ◽

Donor Acceptor ◽

Homo Lumo ◽

Training Sets

The chemical space for novel electronic donor-acceptor oligomers with targeted properties was explored using deep generative models and transfer learning. A General Recurrent Neural Network model was trained from the ChEMBL database to generate chemically valid SMILES strings. The parameters of the General Recurrent Neural Network were fine-tuned via transfer learning using the electronic donor-acceptor database from the Computational Material Repository to generate novel donor-acceptor oligomers. Six different transfer learning models were developed with different subsets of the donor-acceptor database as training sets. We concluded that electronic properties such as HOMO-LUMO gaps and dipole moments of the training sets can be learned using the SMILES representation with deep generative models, and that the chemical space of the training sets can be efficiently explored. This approach identified approximately 1700 new molecules that have promising electronic properties (HOMO-LUMO gap < 2 eV and dipole moment < 2 Debye), six times more than in the original database. Amongst the molecular transformations, the deep generative model has learned how to produce novel molecules by trading off between selected atomic substitutions (such as halogenation or methylation) and molecular features such as the spatial extension of the oligomer. The method can be extended as a plausible source of new chemical combinations to effectively explore the chemical space for targeted properties.
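The screening criterion quoted above (HOMO-LUMO gap below 2 eV and dipole moment below 2 Debye) amounts to a simple filter over generated candidates. The Python sketch below shows only that filter; the `Molecule` record, the placeholder identifiers, and all property values are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    name: str            # placeholder identifier, not a real SMILES string
    gap_ev: float        # HOMO-LUMO gap in eV
    dipole_debye: float  # dipole moment in Debye


def is_promising(mol, max_gap_ev=2.0, max_dipole_debye=2.0):
    """Apply the two strict upper bounds stated in the abstract."""
    return mol.gap_ev < max_gap_ev and mol.dipole_debye < max_dipole_debye


# Hypothetical candidates with made-up property values.
candidates = [
    Molecule("mol_A", gap_ev=1.6, dipole_debye=0.9),
    Molecule("mol_B", gap_ev=2.4, dipole_debye=0.5),
    Molecule("mol_C", gap_ev=1.9, dipole_debye=3.1),
]
promising = [m.name for m in candidates if is_promising(m)]
print(promising)
```

In practice the property values would come from an electronic-structure calculation on each generated molecule, which is outside the scope of this sketch.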
