Invariant Representations through Adversarial Forgetting

Ayush Jaiswal; Daniel Moyer; Greg Ver Steeg; Wael AbdAlmageed; Premkumar Natarajan

doi:10.1609/aaai.v34i04.5850

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5850 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4272-4279

Author(s):

Ayush Jaiswal ◽

Daniel Moyer ◽

Greg Ver Steeg ◽

Wael AbdAlmageed ◽

Premkumar Natarajan

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Empirical Results ◽

Information Bottleneck ◽

Novel Approach ◽

Adversarial Training ◽

Invariant Representations ◽

Art Performance ◽

Forgetting Mechanism

We propose a novel approach to achieving invariance in deep neural networks: inducing amnesia to unwanted factors of data through a new adversarial forgetting mechanism. We show that the forgetting mechanism serves as an information bottleneck, which adversarial training manipulates to learn invariance to unwanted factors. Empirical results show that the proposed framework achieves state-of-the-art performance at learning invariance in both nuisance and bias settings on a diverse collection of datasets and tasks.
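The abstract above does not spell out how the forgetting mechanism is built. The following is a minimal sketch under the assumption that forgetting is realized as an elementwise soft mask on a latent vector, so that features with a mask value near 0 are suppressed before a downstream predictor sees them. The names `forget`, `z`, and `gate_logits` are hypothetical and are not taken from the paper; the adversarial training that would learn the gate is omitted.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forget(z, gate_logits):
    """Scale each latent feature by a soft mask value in (0, 1).

    Assumption: the bottleneck is an elementwise product of the latent
    vector with a sigmoid mask. A strongly negative gate logit nearly
    erases a feature; a strongly positive one passes it through.
    """
    if len(z) != len(gate_logits):
        raise ValueError("z and gate_logits must have the same length")
    mask = [sigmoid(g) for g in gate_logits]
    return [zi * mi for zi, mi in zip(z, mask)]


# Hypothetical latent code and gate logits for a single input.
z = [2.0, -1.0, 0.5]
gate_logits = [10.0, -10.0, 0.0]
z_tilde = forget(z, gate_logits)  # first feature kept, second nearly erased
```

In a full model, the gate logits would presumably be produced by a learned network and trained against a discriminator that tries to recover the unwanted factor from `z_tilde`; that part is outside the scope of this sketch.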

Learning robust features by extended generative stochastic networks

International Journal of Modeling Simulation and Scientific Computing ◽

10.1142/s1793962318500046 ◽

2018 ◽

Vol 09 (01) ◽

pp. 1850004

Author(s):

Da Teng ◽

Xiao Song ◽

Guanghong Gong ◽

Junhua Zhou

Keyword(s):

Neural Networks ◽

Object Recognition ◽

Deep Neural Networks ◽

State Of The Art ◽

Random Noise ◽

Stochastic Networks ◽

Experimental Results ◽

Feedforward Networks ◽

Adversarial Examples ◽

Art Performance

Deep neural networks have achieved state-of-the-art performance on many object recognition tasks, but they are vulnerable to small adversarial perturbations. In this paper, several extensions of generative stochastic networks (GSNs) are proposed to improve the robustness of neural networks to random noise and adversarial perturbations. Experimental results show that, compared to the normal GSN method, the extensions using adversarial examples, lateral connections and feedforward networks can improve the performance of GSNs by making the models more resistant to overfitting and noise.

Leveraging the Bhattacharyya coefficient for uncertainty quantification in deep neural networks

Neural Computing and Applications ◽

10.1007/s00521-021-05789-y ◽

2021 ◽

Author(s):

Pieter Van Molle ◽

Tim Verbelen ◽

Bert Vankeirsbilck ◽

Jonas De Vylder ◽

Bart Diricx ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Use Case ◽

Bhattacharyya Coefficient ◽

Output Uncertainty ◽

Novel Approach ◽

Benchmark Datasets ◽

Network Approaches

Modern deep learning models achieve state-of-the-art results for many tasks in computer vision, such as image classification and segmentation. However, their adoption into high-risk applications, e.g. automated medical diagnosis systems, happens at a slow pace. One of the main reasons for this is that regular neural networks do not capture uncertainty. To assess uncertainty in classification, several techniques have been proposed casting neural network approaches in a Bayesian setting. Amongst these techniques, Monte Carlo dropout is by far the most popular. This particular technique estimates the moments of the output distribution through sampling with different dropout masks. The output uncertainty of a neural network is then approximated as the sample variance. In this paper, we highlight the limitations of such a variance-based uncertainty metric and propose a novel approach. Our approach is based on the overlap between output distributions of different classes. We show that our technique leads to a better approximation of the inter-class output confusion. We illustrate the advantages of our method using benchmark datasets. In addition, we apply our metric to skin lesion classification, a real-world use case, and show that this yields promising results.
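To make the contrast between a variance-based metric and an overlap-based metric concrete, here is a small sketch. It assumes the Monte Carlo dropout samples are already available as lists of class scores in [0, 1] (the hand-written numbers below stand in for real forward passes), and it estimates the Bhattacharyya coefficient from normalized histograms. The histogram estimator and the bin count are assumptions of this sketch, not necessarily the estimator used in the paper.

```python
import math
from statistics import variance


def histogram(samples, bins=10):
    """Normalized histogram of values in [0, 1] with equal-width bins."""
    counts = [0] * bins
    for s in samples:
        idx = min(max(int(s * bins), 0), bins - 1)  # clamp to a valid bin
        counts[idx] += 1
    total = len(samples)
    return [c / total for c in counts]


def bhattacharyya_coefficient(p, q):
    """Overlap of two discrete distributions: 0 = disjoint, 1 = identical."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


# Hypothetical scores for two classes over five stochastic forward passes.
separated_a = [0.62, 0.58, 0.65, 0.55, 0.61]
separated_b = [0.38, 0.42, 0.35, 0.45, 0.39]
confused_a = [0.52, 0.48, 0.55, 0.45, 0.51]
confused_b = [0.48, 0.52, 0.45, 0.55, 0.49]

# Both cases have the same sample variance for class a ...
var_separated = variance(separated_a)
var_confused = variance(confused_a)

# ... but the overlap between the two class distributions differs sharply.
bc_separated = bhattacharyya_coefficient(histogram(separated_a), histogram(separated_b))
bc_confused = bhattacharyya_coefficient(histogram(confused_a), histogram(confused_b))
```

The two scenarios are constructed to have nearly the same spread, so a variance-only metric cannot tell them apart, while the overlap is zero in the first case and close to one in the second.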

Robust Neural Networks are More Interpretable for Genomics

10.1101/657437 ◽

2019 ◽

Cited By ~ 5

Author(s):

Peter K. Koo ◽

Sharon Qian ◽

Gal Kaplun ◽

Verena Volf ◽

Dimitris Kalimeris

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Random Noise ◽

Genomic Data ◽

Training Methods ◽

Generalization Performance ◽

Regulatory Genomics ◽

Adversarial Training

Deep neural networks (DNNs) have been applied to a variety of regulatory genomics tasks. For interpretability, attribution methods are employed to provide importance scores for each nucleotide in a given sequence. However, even with state-of-the-art DNNs, there is no guarantee that these methods can recover interpretable, biological representations. Here we perform systematic experiments on synthetic genomic data to raise awareness of this issue. We find that deeper networks have better generalization performance, but attribution methods recover less interpretable representations. Then, we show that training methods that promote robustness (including regularization, injecting random noise into the data, and adversarial training) significantly improve the interpretability of DNNs, especially for smaller datasets.

Effective Ensemble of Deep Neural Networks Predicts Neural Responses to Naturalistic Videos

10.1101/2021.08.24.457581 ◽

2021 ◽

Author(s):

Huzheng Yang ◽

Shanghang Zhang ◽

Yifan Wu ◽

Yuanning Li ◽

Shi Gu

Keyword(s):

Neural Networks ◽

Visual Cortex ◽

Deep Neural Networks ◽

Functional Neuroimaging ◽

State Of The Art ◽

Prediction Performance ◽

Neural Responses ◽

Neural Signals ◽

Art Performance ◽

Image Streams

This report provides a review of our submissions to the Algonauts Challenge 2021. In this challenge, neural responses in the visual cortex were recorded using functional neuroimaging when participants were watching naturalistic videos. The goal of the challenge is to develop voxel-wise encoding models which predict such neural signals based on the input videos. Here we built an ensemble of models that extract representations based on the input videos from 4 perspectives: image streams, motion, edges, and audio. We showed that adding new modules into the ensemble consistently improved our prediction performance. Our methods achieved state-of-the-art performance on both the mini track and the full track tasks.

Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks

Symmetry ◽

10.3390/sym13030428 ◽

2021 ◽

Vol 13 (3) ◽

pp. 428

Author(s):

Hyun Kwon ◽

Jun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Diversity Training ◽

Original Data ◽

Training Method ◽

Learning Framework ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Accuracy Rates

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance in image and voice recognition, as well as pattern analysis and intrusion detection, they perform poorly on adversarial examples. Adding a small amount of noise to the original data can produce adversarial examples that deep neural networks misclassify, even though humans still perceive them as normal. In this paper, a robust diversity adversarial training method against adversarial attacks is demonstrated. In this approach, the target model is more robust to unknown adversarial examples because it is trained on various adversarial samples. In the experiments, TensorFlow was employed as the deep learning framework, while MNIST and Fashion-MNIST were used as experimental datasets. Results revealed that the diversity training method lowered the attack success rate by an average of 27.2% and 24.3% for various adversarial examples, while maintaining accuracy rates of 98.7% and 91.5% on the original data of MNIST and Fashion-MNIST, respectively.
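The abstract above does not name the attacks used to generate its adversarial samples. As an illustration of how a small perturbation can flip a prediction, the sketch below applies one step of the fast gradient sign method (FGSM), a standard attack swapped in here for illustration, to a hand-written logistic regression model. The weights, input, and `eps` value are hypothetical.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move x by eps in the direction that increases the loss.

    For binary cross-entropy with a logistic model, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and a clean input with true label 1.
w, b = [2.0, -3.0], 0.0
x_clean, y_true = [0.3, 0.1], 1

x_adv = fgsm(w, b, x_clean, y_true, eps=0.1)
p_clean = predict(w, b, x_clean)  # above 0.5: classified as 1
p_adv = predict(w, b, x_adv)      # below 0.5: classified as 0
```

Adversarial training in general augments the training set with inputs such as `x_adv` paired with their true labels; generating them with several attacks or perturbation sizes would be one way to obtain the variety of adversarial samples the abstract mentions, though the exact recipe is not given there.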

Solving inverse problems in stochastic models using deep neural networks and adversarial training

Computer Methods in Applied Mechanics and Engineering ◽

10.1016/j.cma.2021.113976 ◽

2021 ◽

Vol 384 ◽

pp. 113976

Author(s):

Kailai Xu ◽

Eric Darve

Keyword(s):

Neural Networks ◽

Inverse Problems ◽

Stochastic Models ◽

Deep Neural Networks ◽

Adversarial Training

Representing Deep Neural Networks Latent Space Geometries with Graphs

Algorithms ◽

10.3390/a14020039 ◽

2021 ◽

Vol 14 (2) ◽

pp. 39

Author(s):

Carlos Lassance ◽

Vincent Gripon ◽

Antonio Ortega

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Objective Function ◽

Learning Process ◽

Deep Neural Networks ◽

State Of The Art ◽

The Core ◽

Learning Tasks ◽

Latent Space

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
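The idea of a similarity graph built from a batch of intermediate representations can be sketched as follows. This assumes cosine similarity as the edge weight and a squared Frobenius distance for comparing two graphs (for example, a teacher's and a student's); both choices, and the function names, are assumptions made for illustration rather than the paper's stated definitions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_graph(batch):
    """Adjacency matrix with one node per input in the batch.

    Edge (i, j) carries the cosine similarity between the latent
    representations of inputs i and j; self-loops are set to zero.
    """
    n = len(batch)
    return [
        [0.0 if i == j else cosine(batch[i], batch[j]) for j in range(n)]
        for i in range(n)
    ]


def graph_distance(adj_a, adj_b):
    """Squared Frobenius distance between two adjacency matrices."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(adj_a, adj_b)
        for a, b in zip(row_a, row_b)
    )


# Hypothetical latent representations of a batch of three inputs.
teacher_latents = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
student_latents = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

teacher_graph = similarity_graph(teacher_latents)
student_graph = similarity_graph(student_latents)
mismatch = graph_distance(teacher_graph, student_graph)
```

A quantity like `mismatch` could serve as a penalty that pushes the student's batch geometry toward the teacher's, which is one way to read problem (i) in the abstract.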

Reconfigurable Binary Neural Network Accelerator with Adaptive Parallelism Scheme

Electronics ◽

10.3390/electronics10030230 ◽

2021 ◽

Vol 10 (3) ◽

pp. 230

Author(s):

Jaechan Cho ◽

Yongchul Jung ◽

Seongjoo Lee ◽

Yunho Jung

Keyword(s):

Neural Networks ◽

High Throughput ◽

Deep Neural Networks ◽

State Of The Art ◽

Throughput Performance ◽

Adaptive Parallelism ◽

Sensor Applications ◽

Binary Neural Network ◽

Target Layer ◽

Network Topologies

Binary neural networks (BNNs) have attracted significant interest for the implementation of deep neural networks (DNNs) on resource-constrained edge devices, and various BNN accelerator architectures have been proposed to achieve higher efficiency. BNN accelerators can be divided into two categories: streaming and layer accelerators. Although streaming accelerators designed for a specific BNN network topology provide high throughput, they are infeasible for various sensor applications in edge AI because of their complexity and inflexibility. In contrast, layer accelerators with reasonable resources can support various network topologies, but they operate with the same parallelism for all the layers of the BNN, which degrades throughput performance at certain layers. To overcome this problem, we propose a BNN accelerator with adaptive parallelism that offers high throughput performance in all layers. The proposed accelerator analyzes target layer parameters and operates with optimal parallelism using reasonable resources. In addition, this architecture is able to fully compute all types of BNN layers thanks to its reconfigurability, and it can achieve a higher area–speed efficiency than existing accelerators. In performance evaluation using state-of-the-art BNN topologies, the designed BNN accelerator achieved an area–speed efficiency 9.69 times higher than previous FPGA implementations and 24% higher than existing VLSI implementations for BNNs.

Specializing Word Embeddings (for Parsing) by Information Bottleneck (Extended Abstract)

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/658 ◽

2020 ◽

Author(s):

Xiang Lisa Li ◽

Jason Eisner

Keyword(s):

Dimensionality Reduction ◽

Semantic Information ◽

State Of The Art ◽

Word Embedding ◽

Discrete Version ◽

Word Embeddings ◽

Continuous Version ◽

Continuous Vector ◽

Information Bottleneck ◽

Art Performance

Pre-trained word embeddings like ELMo and BERT contain rich syntactic and semantic information, resulting in state-of-the-art performance on various tasks. We propose a very fast variational information bottleneck (VIB) method to nonlinearly compress these embeddings, keeping only the information that helps a discriminative parser. We compress each word embedding to either a discrete tag or a continuous vector. In the discrete version, our automatically compressed tags form an alternative tag set: we show experimentally that our tags capture most of the information in traditional POS tag annotations, but our tag sequences can be parsed more accurately at the same level of tag granularity. In the continuous version, we show experimentally that moderately compressing the word embeddings by our method yields a more accurate parser in 8 of 9 languages, unlike simple dimensionality reduction.
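For the continuous version, a variational information bottleneck objective is commonly written as a task loss plus a weighted compression term. The sketch below assumes the usual formulation in which each compressed embedding is a diagonal Gaussian and the compression term is its KL divergence from a standard normal prior; the paper's exact objective, prior, and weighting are not given in the abstract, and the names `vib_loss` and `beta` are hypothetical.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def vib_loss(task_loss, mu, log_var, beta):
    """Task loss plus beta times the compression (KL) penalty.

    A larger beta compresses the embedding more aggressively, keeping
    only the information that helps reduce the task loss.
    """
    return task_loss + beta * kl_to_standard_normal(mu, log_var)


# Hypothetical encoder outputs for one word and a hypothetical parser loss.
mu = [1.0, 0.0]
log_var = [0.0, 0.0]
loss = vib_loss(task_loss=1.0, mu=mu, log_var=log_var, beta=0.1)
```

When the encoder output matches the prior exactly (zero mean, unit variance), the KL term vanishes and the embedding carries no information about the input, which is the fully compressed extreme of the trade-off.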

Label Distribution for Learning with Noisy Labels

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/356 ◽

2020 ◽

Author(s):

Yun-Peng Liu ◽

Ning Xu ◽

Yu Zhang ◽

Xin Geng

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Learning Algorithm ◽

State Of The Art ◽

Confidence Estimation ◽

Novel Method ◽

Real World Datasets ◽

Label Distribution ◽

Noisy Labels

The performance of deep neural networks (DNNs) relies crucially on the quality of labeling. In some situations, labels are easily corrupted and become noisy. Thus, designing algorithms that deal with noisy labels is of great importance for learning robust DNNs. However, it is difficult to distinguish between clean labels and noisy labels, which becomes the bottleneck of many methods. To address this problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on label distribution. The boundary between clean labels and noisy labels then becomes clear according to the confidence scores. To verify the effectiveness of the method, LDCE is combined with the existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm over state-of-the-art methods.
