Vector Quantization-Based Regularization for Autoencoders

2020
Vol 34 (04)
pp. 6380-6387
Author(s):
Hanwei Wu
Markus Flierl

Autoencoders and their variants provide unsupervised models for learning low-dimensional representations for downstream tasks. Without proper regularization, autoencoder models are susceptible to overfitting and to the so-called posterior collapse phenomenon. In this paper, we introduce a quantization-based regularizer in the bottleneck stage of autoencoder models to learn meaningful latent representations. We combine the perspectives of Vector Quantized-Variational AutoEncoders (VQ-VAE) and of classical denoising regularization methods for neural networks. We interpret quantizers as regularizers that constrain latent representations while fostering a similarity-preserving mapping at the encoder. Before quantization, we impose noise on the latent codes and use a Bayesian estimator to optimize the quantizer-based representation. The introduced bottleneck Bayesian estimator outputs the posterior mean of the centroids to the decoder and thus performs soft quantization of the noisy latent codes. We show that our proposed regularization method results in improved latent representations for both supervised learning and clustering downstream tasks when compared to autoencoders using other bottleneck structures.
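
A minimal sketch of a soft-quantization bottleneck of the kind described above (the function name, the Gaussian noise model, and the uniform prior over centroids are illustrative assumptions, not details taken from the paper): the noisy latent code is compared against all codebook centroids, and the decoder receives their posterior mean under a softmax weighting.

```python
import torch

def soft_quantize(z_noisy, codebook, noise_var=0.1):
    """Posterior-mean (soft) quantization of noisy latent codes.

    z_noisy:  (batch, dim) latent codes after additive noise
    codebook: (K, dim) centroids of the bottleneck quantizer
    Assumes Gaussian noise and a uniform prior over centroids, so the
    posterior over centroids is a softmax of negative scaled squared
    distances (an illustrative choice, not the paper's exact model).
    """
    # Squared Euclidean distance from each code to each centroid: (batch, K)
    d2 = torch.cdist(z_noisy, codebook).pow(2)
    # Posterior probability of each centroid given the noisy code
    posterior = torch.softmax(-d2 / (2.0 * noise_var), dim=1)
    # Posterior mean of the centroids is what the decoder receives
    return posterior @ codebook

# Toy usage
torch.manual_seed(0)
codebook = torch.randn(16, 8)            # K=16 centroids in an 8-dim latent space
z = torch.randn(4, 8)                    # encoder outputs for a batch of 4
z_noisy = z + 0.1 * torch.randn_like(z)  # noise injected before quantization
decoder_input = soft_quantize(z_noisy, codebook)
print(decoder_input.shape)               # torch.Size([4, 8])
```

Because the softmax weighting is differentiable, gradients can flow to both the encoder and the codebook without the straight-through estimator needed for hard VQ-VAE quantization.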

2018
Author(s):
Anirvan M. Sengupta
Mariano Tepper
Cengiz Pehlevan
Alexander Genkin
Dmitri B. Chklovskii

Many neurons in the brain, such as place cells in the rodent hippocampus, have localized receptive fields, i.e., they respond to a small neighborhood of stimulus space. What is the functional significance of such representations and how can they arise? Here, we propose that localized receptive fields emerge in similarity-preserving networks of rectifying neurons that learn low-dimensional manifolds populated by sensory inputs. Numerical simulations of such networks on standard datasets yield manifold-tiling localized receptive fields. More generally, we show analytically that, for data lying on symmetric manifolds, optimal solutions of objectives, from which similarity-preserving networks are derived, have localized receptive fields. Therefore, nonnegative similarity-preserving mapping (NSM) implemented by neural networks can model representations of continuous manifolds in the brain.
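
A minimal sketch of a nonnegative similarity-matching objective of the kind from which such networks are derived (the batch projected-gradient solver, step size, and circle dataset are illustrative assumptions; it is only a batch stand-in for an online neural implementation): the output similarity matrix is fit to the input similarity matrix under a nonnegativity constraint.

```python
import numpy as np

def nonnegative_similarity_matching(X, k, n_iters=500, lr=1e-3, seed=0):
    """Toy batch solver for min_{Y >= 0} ||X X^T - Y Y^T||_F^2.

    X: (n, d) inputs sampled from a low-dimensional manifold
    k: output dimensionality (number of "neurons")
    Returns Y: (n, k) nonnegative outputs. The step size may need tuning
    for other datasets; this is a sketch, not a tuned solver.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Gx = X @ X.T                            # input similarity matrix
    Y = np.abs(rng.standard_normal((n, k))) * 0.1
    for _ in range(n_iters):
        Gy = Y @ Y.T                        # output similarity matrix
        grad = -4.0 * (Gx - Gy) @ Y         # gradient of the Frobenius objective
        Y = np.maximum(0.0, Y - lr * grad)  # projected gradient step
    return Y

# Toy usage: points on a circle (a 1-D manifold embedded in 2-D)
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
Y = nonnegative_similarity_matching(X, k=10)
print(Y.shape)  # (100, 10)
```

With enough iterations, columns of Y tend to become active only on contiguous arcs of the circle, illustrating the manifold-tiling receptive fields described above.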


2020
Vol 34 (04)
pp. 4394-4403
Author(s):
Sekitoshi Kanai
Yasutoshi Ida
Yasuhiro Fujiwara
Masanori Yamada
Shuichi Adachi

We propose Absum, a regularization method for improving the adversarial robustness of convolutional neural networks (CNNs). Although CNNs can accurately recognize images, recent studies have shown that the convolution operations in CNNs commonly have structural sensitivity to specific noise composed of Fourier basis functions. By exploiting this sensitivity, these studies proposed a simple black-box adversarial attack: the single Fourier attack. To reduce structural sensitivity, we can regularize the convolution filter weights, since the sensitivity of a linear transform can be assessed by the norm of its weights. However, standard regularization methods can prevent minimization of the loss function because they impose a constraint that is too tight for obtaining high robustness. To solve this problem, Absum imposes a loose constraint: it penalizes the absolute values of the summation of the parameters in the convolution layers. Absum improves robustness against the single Fourier attack while being as simple and efficient as standard regularization methods (e.g., weight decay and L1 regularization). Our experiments demonstrate that Absum improves robustness against the single Fourier attack more than standard regularization methods do. Furthermore, we show that CNNs trained with Absum are more robust than those trained with standard regularization methods against transferred attacks, owing to their reduced common sensitivity, and against high-frequency noise. We also show that Absum can improve robustness against gradient-based attacks (projected gradient descent) when used with adversarial training.
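
A minimal PyTorch-style sketch of such a penalty (the helper name, the regularization weight, and the choice to sum the spatial kernel per output/input channel pair before taking absolute values are assumptions; the paper may define the summation at a different granularity):

```python
import torch
import torch.nn as nn

def absum_penalty(model: nn.Module) -> torch.Tensor:
    """Absolute-value-of-summation penalty over convolution weights.

    Granularity assumption: spatial kernel entries are summed per
    (output channel, input channel) pair before taking absolute values.
    """
    penalty = torch.zeros(())
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight                     # (out_ch, in_ch, kH, kW)
            penalty = penalty + w.sum(dim=(2, 3)).abs().sum()
    return penalty

# Toy usage inside a training step
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 10, 3))
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
logits = model(x).mean(dim=(2, 3))                # crude global pooling to 10 logits
loss = nn.functional.cross_entropy(logits, y) + 1e-4 * absum_penalty(model)
loss.backward()
```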


2020
Vol 46 (8)
pp. 609-618
Author(s):
N. Vershkov
M. Babenko
V. Kuchukov
N. Kuchukova

2021
Vol 14 (2)
pp. 201-214
Author(s):
Danilo Croce
Giuseppe Castellucci
Roberto Basili

In recent years, Deep Learning methods have become very popular in classification tasks for Natural Language Processing (NLP); this is mainly due to their ability to reach high performance while relying on very simple input representations, i.e., raw tokens. One drawback of deep architectures is the large amount of annotated data required for effective training. In Machine Learning, this problem is usually mitigated by semi-supervised methods or, more recently, by Transfer Learning in the context of deep architectures. One recent promising method for enabling semi-supervised learning in deep architectures has been formalized as Semi-Supervised Generative Adversarial Networks (SS-GANs) in the context of Computer Vision. In this paper, we adopt the SS-GAN framework to enable semi-supervised learning in the context of NLP. We demonstrate how an SS-GAN can boost the performance of simple architectures when operating on expressive low-dimensional embeddings; these are derived by combining the unsupervised approximation of linguistic Reproducing Kernel Hilbert Spaces and the so-called Universal Sentence Encoders. We experimentally evaluate the proposed approach on a semantic classification task, i.e., Question Classification, considering different sizes of training material and different numbers of target classes. By applying this adversarial scheme to a simple Multi-Layer Perceptron, a classifier trained on a subset comprising 1% of the original training material achieves 92% accuracy. Moreover, on a more complex classification scheme, e.g., one involving 50 classes, the proposed method outperforms state-of-the-art alternatives such as BERT.
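
A minimal sketch of the SS-GAN discriminator objective in this setting (the MLP sizes, embedding dimensionality, number of classes, and loss weighting are illustrative assumptions; random vectors stand in for the paper's RKHS/Universal Sentence Encoder representations): the discriminator has K real classes plus one "fake" class, labeled examples contribute a standard cross-entropy term, and unlabeled and generated examples are pushed toward "real" and "fake" respectively.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 6            # number of real question classes (illustrative)
EMB = 512        # dimensionality of the sentence embeddings (illustrative)

discriminator = nn.Sequential(nn.Linear(EMB, 256), nn.ReLU(), nn.Linear(256, K + 1))
generator = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, EMB))

def discriminator_loss(x_lab, y_lab, x_unlab, z):
    """SS-GAN discriminator loss: K real classes + 1 'fake' class (index K)."""
    # Supervised term: labeled embeddings must land in their true class.
    sup = F.cross_entropy(discriminator(x_lab), y_lab)
    # Unsupervised terms: unlabeled embeddings should look real (low fake
    # probability); generated embeddings should be assigned to the fake class.
    p_unlab = F.softmax(discriminator(x_unlab), dim=1)
    p_fake = F.softmax(discriminator(generator(z)), dim=1)
    unsup = -torch.log(1 - p_unlab[:, K] + 1e-8).mean() \
            - torch.log(p_fake[:, K] + 1e-8).mean()
    return sup + unsup

# Toy usage with random stand-ins for sentence embeddings
x_lab, y_lab = torch.randn(16, EMB), torch.randint(0, K, (16,))
x_unlab, z = torch.randn(64, EMB), torch.randn(64, 100)
loss = discriminator_loss(x_lab, y_lab, x_unlab, z)
loss.backward()
```

In practice the generator and discriminator are updated with separate optimizers and opposing objectives; the single backward call above only shows that the graph is well formed.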


Electronics
2021
Vol 10 (15)
pp. 1807
Author(s):
Sascha Grollmisch
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. A commonality of recent SSL methods is that they rely strongly on augmentation of the unannotated data, which remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the baseline CNN performance on the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only on the most challenging dataset, acoustic scene classification, showing that there is still room for improvement.
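
A minimal sketch of the FixMatch objective applied to spectrogram-like audio batches (the augmentation callables, confidence threshold, loss weight, and toy model are placeholders; selecting suitable audio augmentations is exactly the question the work addresses): pseudo-labels come from weakly augmented unlabeled clips and are used only where the model is confident, while the loss is taken on strongly augmented versions of the same clips.

```python
import torch
import torch.nn.functional as F

def fixmatch_loss(model, x_lab, y_lab, x_unlab,
                  weak_augment, strong_augment,
                  threshold=0.95, lambda_u=1.0):
    """FixMatch: supervised loss + confidence-masked consistency loss.

    weak_augment / strong_augment are placeholder callables; for audio one
    might use mild time shifts vs. aggressive SpecAugment-style masking.
    """
    # Supervised term on labeled clips
    sup = F.cross_entropy(model(weak_augment(x_lab)), y_lab)

    # Pseudo-labels from weakly augmented unlabeled clips (no gradient)
    with torch.no_grad():
        probs = F.softmax(model(weak_augment(x_unlab)), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()

    # Consistency term on strongly augmented versions, only where confident
    unsup = (F.cross_entropy(model(strong_augment(x_unlab)), pseudo,
                             reduction="none") * mask).mean()
    return sup + lambda_u * unsup

# Toy usage with identity "augmentations" and a random stand-in model
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 101, 10))
x_lab, y_lab = torch.randn(4, 1, 64, 101), torch.randint(0, 10, (4,))
x_unlab = torch.randn(16, 1, 64, 101)
loss = fixmatch_loss(model, x_lab, y_lab, x_unlab,
                     weak_augment=lambda x: x, strong_augment=lambda x: x)
loss.backward()
```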


Author(s):  
Prashant Shukla
Abhishek
Shekhar Verma
Manish Kumar

Author(s):  
Carlos Lassance
Vincent Gripon
Antonio Ortega

For the past few years, deep learning (DL) robustness (i.e., the ability to maintain the same decision when inputs are subject to perturbations) has become a question of paramount importance, in particular in settings where misclassification can have dramatic consequences. To address this question, authors have proposed different approaches, such as adding regularizers or training on noisy examples. In this paper we introduce a regularizer based on the Laplacian of similarity graphs obtained from the representation of the training data at each layer of the DL architecture. This regularizer penalizes large changes (across consecutive layers of the architecture) in the distance between examples of different classes, and as such enforces smooth variations of the class boundaries. We provide theoretical justification for this regularizer and demonstrate its effectiveness in improving robustness on classical supervised-learning vision datasets for various types of perturbations. We also show that it can be combined with existing methods to increase overall robustness.
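
A minimal sketch of a Laplacian-based smoothness penalty of the kind described (the Gaussian-kernel similarity graph, the use of one-hot labels as the graph signal, and the absolute layer-to-layer difference are assumptions about the exact formulation): each layer's representations define a similarity graph, and changes in the Laplacian smoothness of the class signal between consecutive layers are penalized.

```python
import torch
import torch.nn.functional as F

def layer_smoothness(features, labels_onehot, sigma=1.0):
    """Laplacian quadratic form of the class-indicator signal on a
    Gaussian-kernel similarity graph built from one layer's features."""
    f = features.flatten(1)                  # (batch, d)
    d2 = torch.cdist(f, f).pow(2)
    W = torch.exp(-d2 / (2 * sigma ** 2))    # similarity graph on the batch
    L = torch.diag(W.sum(dim=1)) - W         # (unnormalized) graph Laplacian
    return torch.trace(labels_onehot.t() @ L @ labels_onehot)

def laplacian_regularizer(layer_features, labels, num_classes):
    """Penalize changes in smoothness between consecutive layers.

    layer_features: list of per-layer representations for the same batch.
    """
    y = F.one_hot(labels, num_classes).float()
    s = [layer_smoothness(f, y) for f in layer_features]
    return sum((s[i + 1] - s[i]).abs() for i in range(len(s) - 1))

# Toy usage: three fake "layers" of representations for a batch of 8
feats = [torch.randn(8, 32), torch.randn(8, 16), torch.randn(8, 10)]
labels = torch.randint(0, 10, (8,))
penalty = laplacian_regularizer(feats, labels, num_classes=10)
print(penalty.item())
```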

