An Information-Theoretic Perspective on Proper Quaternion Variational Autoencoders

Variational autoencoders are deep generative models that have recently received a great deal of attention due to their ability to model the latent distribution of any kind of input such as images and audio signals, among others. A novel variational autoncoder in the quaternion domain H, namely the QVAE, has been recently proposed, leveraging the augmented second order statics of H-proper signals. In this paper, we analyze the QVAE under an information-theoretic perspective, studying the ability of the H-proper model to approximate improper distributions as well as the built-in H-proper ones and the loss of entropy due to the improperness of the input signal. We conduct experiments on a substantial set of quaternion signals, for each of which the QVAE shows the ability of modelling the input distribution, while learning the improperness and increasing the entropy of the latent space. The proposed analysis will prove that proper QVAEs can be employed with a good approximation even when the quaternion input data are improper.

Download Full-text

An Overview of Variational Autoencoders for Source Separation, Finance, and Bio-Signal Applications

Entropy ◽

10.3390/e24010055 ◽

2021 ◽

Vol 24 (1) ◽

pp. 55

Author(s):

Aman Singh ◽

Tokunbo Ogunfunmi

Keyword(s):

Input Data ◽

Source Separation ◽

Generative Models ◽

Learning System ◽

Generative Adversarial Networks ◽

Space Representation ◽

Comprehensive Overview ◽

Maximum Information ◽

Adversarial Networks ◽

Latent Space

Autoencoders are a self-supervised learning system where, during training, the output is an approximation of the input. Typically, autoencoders have three parts: Encoder (which produces a compressed latent space representation of the input data), the Latent Space (which retains the knowledge in the input data with reduced dimensionality but preserves maximum information) and the Decoder (which reconstructs the input data from the compressed latent space). Autoencoders have found wide applications in dimensionality reduction, object detection, image classification, and image denoising applications. Variational Autoencoders (VAEs) can be regarded as enhanced Autoencoders where a Bayesian approach is used to learn the probability distribution of the input data. VAEs have found wide applications in generating data for speech, images, and text. In this paper, we present a general comprehensive overview of variational autoencoders. We discuss problems with the VAEs and present several variants of the VAEs that attempt to provide solutions to the problems. We present applications of variational autoencoders for finance (a new and emerging field of application), speech/audio source separation, and biosignal applications. Experimental results are presented for an example of speech source separation to illustrate the powerful application of variants of VAE: VAE, β-VAE, and ITL-AE. We conclude the paper with a summary, and we identify possible areas of research in improving performance of VAEs in particular and deep generative models in general, of which VAEs and generative adversarial networks (GANs) are examples.

Download Full-text

The Dark Tetrad

Journal of Individual Differences ◽

10.1027/1614-0001/a000179 ◽

2015 ◽

Vol 36 (4) ◽

pp. 228-236 ◽

Cited By ~ 45

Author(s):

Janko Međedović ◽

Boban Petrović

Keyword(s):

Personality Traits ◽

Antisocial Behavior ◽

Second Order ◽

The Other ◽

Negative Pole ◽

Latent Space ◽

Order Factor

Abstract. Machiavellianism, narcissism, and psychopathy are personality traits understood to be dispositions toward amoral and antisocial behavior. Recent research has suggested that sadism should also be added to this set of traits. In the present study, we tested a hypothesis proposing that these four traits are expressions of one superordinate construct: The Dark Tetrad. Exploration of the latent space of four “dark” traits suggested that the singular second-order factor which represents the Dark Tetrad can be extracted. Analysis has shown that Dark Tetrad traits can be located in the space of basic personality traits, especially on the negative pole of the Honesty-Humility, Agreeableness, Conscientiousness, and Emotionality dimensions. We conclude that sadism behaves in a similar manner as the other dark traits, but it cannot be reduced to them. The results support the concept of “Dark Tetrad.”

Download Full-text

Supplemental Material for Information-Theoretic Latent Distribution Modeling: Distinguishing Discrete and Continuous Latent Variable Models

Psychological Methods ◽

10.1037/1082-989x.11.3.228.supp ◽

2006 ◽

Keyword(s):

Latent Variable ◽

Latent Variable Models ◽

Information Theoretic ◽

Distribution Modeling ◽

Latent Distribution

Download Full-text

A Survey on Variational Autoencoders from a Green AI Perspective

SN Computer Science ◽

10.1007/s42979-021-00702-9 ◽

2021 ◽

Vol 2 (4) ◽

Author(s):

Andrea Asperti ◽

Davide Evangelista ◽

Elena Loli Piccolomini

Keyword(s):

Architectural Design ◽

Mathematical Formulation ◽

Representation Learning ◽

Generative Models ◽

Model Description ◽

Energetic Efficiency ◽

Detailed Model ◽

Latent Distribution ◽

Quantitative Results ◽

Generation Problem

AbstractVariational Autoencoders (VAEs) are powerful generative models that merge elements from statistics and information theory with the flexibility offered by deep neural networks to efficiently solve the generation problem for high-dimensional data. The key insight of VAEs is to learn the latent distribution of data in such a way that new meaningful samples can be generated from it. This approach led to tremendous research and variations in the architectural design of VAEs, nourishing the recent field of research known as unsupervised representation learning. In this article, we provide a comparative evaluation of some of the most successful, recent variations of VAEs. We particularly focus the analysis on the energetic efficiency of the different models, in the spirit of the so-called Green AI, aiming both to reduce the carbon footprint and the financial cost of generative techniques. For each architecture, we provide its mathematical formulation, the ideas underlying its design, a detailed model description, a running implementation and quantitative results.

Download Full-text

Data Acquisition and Data Processing using Electroencephalogram in Neuromarketing: A Review

Pertanika Journal of Science and Technology ◽

10.47836/pjst.30.1.02 ◽

2021 ◽

Vol 30 (1) ◽

pp. 19-33

Author(s):

Annis Shafika Amran ◽

Sharifah Aida Sheikh Ibrahim ◽

Nurul Hashimah Ahamed Hassain Malim ◽

Nurfaten Hamzah ◽

Putra Sumari ◽

...

Keyword(s):

Data Acquisition ◽

Linear Models ◽

Brain Activity ◽

Generative Models ◽

Education Management ◽

Latent Space ◽

Latent Space Models ◽

Cost Feasibility ◽

Medical Healthcare ◽

Electroencephalogram Eeg

Electroencephalogram (EEG) is a neurotechnology used to measure brain activity via brain impulses. Throughout the years, EEG has contributed tremendously to data-driven research models (e.g., Generalised Linear Models, Bayesian Generative Models, and Latent Space Models) in Neuroscience Technology and Neuroinformatic. Due to versatility, portability, cost feasibility, and non-invasiveness. It contributed to various Neuroscientific data that led to advancement in medical, education, management, and even the marketing field. In the past years, the extensive uses of EEG have been inclined towards medical healthcare studies such as in disease detection and as an intervention in mental disorders, but not fully explored for uses in neuromarketing. Hence, this study construes the data acquisition technique in neuroscience studies using electroencephalogram and outlines the trend of revolution of this technique in aspects of its technology and databases by focusing on neuromarketing uses.

Download Full-text

RANCANGAN RANGKAIAN ANTI BOUNCING UNTUK RANGKAIAN DIGITAL

Sutet ◽

10.33322/sutet.v7i1.168 ◽

2018 ◽

Vol 7 (1) ◽

pp. 24-31

Author(s):

Redaksi Tim Jurnal

Keyword(s):

Electric Current ◽

Input Signal ◽

Input Data ◽

Digital Circuits ◽

Electronic Circuit ◽

Digital Electronics ◽

Mechanical Contacts ◽

Mechanical Switch

Push-On switches or toggle switches and mechanical relays are mechanical contacts made of metal which, when supplied with electric current, will result in a spike of electrical sparks, called Bouncing Effects. Bounce effects are often a problem in digital circuits, especially in digital electronics circuits, because these Bounce Effects will cause the value of data or signals coming into the circuit inaccurate or indeterminate, when the mechanical switch is pressed as input data. This will undoubtedly lead to undesirable conditions and must be overcome with an electronic circuit called De-Bounce for the data or input signal to be more certain.

Download Full-text

Incorporating structural knowledge into unsupervised deep learning for two-photon imaging data

10.1101/2021.05.18.443587 ◽

2021 ◽

Author(s):

Florian Eichin ◽

Maren Hackenberg ◽

Caroline Broichhagen ◽

Antje Kilias ◽

Jan Schmoranzer ◽

...

Keyword(s):

Deep Learning ◽

Live Imaging ◽

Temporal Changes ◽

Generative Models ◽

Structural Knowledge ◽

Imaging Data ◽

Two Photon ◽

Latent Space ◽

Photon Imaging ◽

Two Photon Imaging

Live imaging techniques, such as two-photon imaging, promise novel insights into cellular activity patterns at a high spatial and temporal resolution. While current deep learning approaches typically focus on specific supervised tasks in the analysis of such data, e.g., learning a segmentation mask as a basis for subsequent signal extraction steps, we investigate how unsupervised generative deep learning can be adapted to obtain interpretable models directly at the level of the video frames. Specifically, we consider variational autoencoders for models that infer a compressed representation of the data in a low-dimensional latent space, allowing for insight into what has been learned. Based on this approach, we illustrate how structural knowledge can be incorporated into the model architecture to improve model fitting and interpretability. Besides standard convolutional neural network components, we propose an architecture for separately encoding the foreground and background of live imaging data. We exemplify the proposed approach with two-photon imaging data from hippocampal CA1 neurons in mice, where we can disentangle the neural activity of interest from the neuropil background signal. Subsequently, we illustrate how to impose smoothness constraints onto the latent space for leveraging knowledge about gradual temporal changes. As a starting point for adaptation to similar live imaging applications, we provide a Jupyter notebook with code for exploration. Taken together, our results illustrate how architecture choices for deep generative models, such as for spatial structure, foreground vs. background, and gradual temporal changes, facilitate a modeling approach that combines the flexibility of deep learning with the benefits of incorporating domain knowledge. Such a strategy is seen to enable interpretable, purely image-based models of activity signals from live imaging, such as for two-photon data.

Download Full-text

Machine Learning for Non-Intrusive Speech Quality Assessment

10.26686/wgtn.16985584 ◽

2021 ◽

Author(s):

◽

Mouna Hakami

Keyword(s):

Machine Learning ◽

Quality Assessment ◽

Unsupervised Learning ◽

Supervised Learning ◽

Latent Variable ◽

Generative Models ◽

Speech Quality ◽

Speech Signals ◽

Latent Space ◽

Speech Quality Assessment

This thesis presents two studies on non-intrusive speech quality assessment methods. The first applies supervised learning methods to speech quality assessment, which is a common approach in machine learning based quality assessment. To outperform existing methods, we concentrate on enhancing the feature set. In the second study, we analyse quality assessment from a different point of view inspired by the biological brain and present the first unsupervised learning based non-intrusive quality assessment that removes the need for labelled training data. Supervised learning based, non-intrusive quality predictors generally involve the development of a regressor that maps signal features to a representation of perceived quality. The performance of the predictor largely depends on 1) how sensitive the features are to the different types of distortion, and 2) how well the model learns the relation between the features and the quality score. We improve the performance of the quality estimation by enhancing the feature set and using a contemporary machine learning model that fits this objective. We propose an augmented feature set that includes raw features that are presumably redundant. The speech quality assessment system benefits from this redundancy as it results in reducing the impact of unwanted noise in the input. Feature set augmentation generally leads to the inclusion of features that have non-smooth distributions. We introduce a new pre-processing method and re-distribute the features to facilitate the training. The evaluation of the system on the ITU-T Supplement23 database illustrates that the proposed system outperforms the popular standards and contemporary methods in the literature. The unsupervised learning quality assessment approach presented in this thesis is based on a model that is learnt from clean speech signals. Consequently, it does not need to learn the statistics of any corruption that exists in the degraded speech signals and is trained only with unlabelled clean speech samples. The quality has a new definition, which is based on the divergence between 1) the distribution of the spectrograms of test signals, and 2) the pre-existing model that represents the distribution of the spectrograms of good quality speech. The distribution of the spectrogram of the speech is complex, and hence comparing them is not trivial. To tackle this problem, we propose to map the spectrograms of speech signals to a simple latent space. Generative models that map simple latent distributions into complex distributions are excellent platforms for our work. Generative models that are trained on the spectrograms of clean speech signals learned to map the latent variable $Z$ from a simple distribution $P_Z$ into a spectrogram $X$ from the distribution of good quality speech. Consequently, an inference model is developed by inverting the pre-trained generator, which maps spectrograms of the signal under the test, $X_t$, into its relevant latent variable, $Z_t$, in the latent space. We postulate the divergence between the distribution of the latent variable and the prior distribution $P_Z$ is a good measure of the quality of speech. Generative adversarial nets (GAN) are an effective training method and work well in this application. The proposed system is a novel application for a GAN. The experimental results with the TIMIT and NOIZEUS databases show that the proposed measure correlates positively with the objective quality scores.

Download Full-text

Variational Autoencoder-Based Multiple Image Captioning Using a Caption Attention Map

Applied Sciences ◽

10.3390/app9132699 ◽

2019 ◽

Vol 9 (13) ◽

pp. 2699 ◽

Cited By ~ 4

Author(s):

Boeun Kim ◽

Saim Shin ◽

Hyedong Jung

Keyword(s):

Generative Models ◽

Research Topic ◽

Video Data ◽

Image Feature ◽

Image Captioning ◽

Multiple Image ◽

Visually Impaired People ◽

Proposed Model ◽

Variational Autoencoder ◽

Latent Distribution

Image captioning is a promising research topic that is applicable to services that search for desired content in a large amount of video data and a situation explanation service for visually impaired people. Previous research on image captioning has been focused on generating one caption per image. However, to increase usability in applications, it is necessary to generate several different captions that contain various representations for an image. We propose a method to generate multiple captions using a variational autoencoder, which is one of the generative models. Because an image feature plays an important role when generating captions, a method to extract a Caption Attention Map (CAM) of the image is proposed, and CAMs are projected to a latent distribution. In addition, methods for the evaluation of multiple image captioning tasks are proposed that have not yet been actively researched. The proposed model outperforms in the aspect of diversity compared with the base model when the accuracy is comparable. Moreover, it is verified that the model using CAM generates detailed captions describing various content in the image.

Download Full-text

The Effect of Evidence Transfer on Latent Feature Relevance for Clustering

Informatics ◽

10.3390/informatics6020017 ◽

2019 ◽

Vol 6 (2) ◽

pp. 17

Author(s):

Athanasios Davvetas ◽

Iraklis A. Klampanos ◽

Spiros Skiadopoulos ◽

Vangelis Karkaletsis

Keyword(s):

Mutual Information ◽

Ground Truth ◽

Original Data ◽

Information Theoretic ◽

Information Bottleneck ◽

Latent Space ◽

Before And After ◽

Feature Relevance ◽

Latent Representations ◽

Transfer Method

Evidence transfer for clustering is a deep learning method that manipulates the latent representations of an autoencoder according to external categorical evidence with the effect of improving a clustering outcome. Evidence transfer’s application on clustering is designed to be robust when introduced with a low quality of evidence, while increasing the effectiveness of the clustering accuracy during relevant corresponding evidence. We interpret the effects of evidence transfer on the latent representation of an autoencoder by comparing our method to the information bottleneck method. Information bottleneck is an optimisation problem of finding the best tradeoff between maximising the mutual information of data representations and a task outcome while at the same time being effective in compressing the original data source. We posit that the evidence transfer method has essentially the same objective regarding the latent representations produced by an autoencoder. We verify our hypothesis using information theoretic metrics from feature selection in order to perform an empirical analysis over the information that is carried through the bottleneck of the latent space. We use the relevance metric to compare the overall mutual information between the latent representations and the ground truth labels before and after their incremental manipulation, as well as, to study the effects of evidence transfer regarding the significance of each latent feature.

Download Full-text