Hierarchical Knowledge Squeezed Adversarial Network Compression

Deep network compression has achieved notable progress via knowledge distillation, in which a teacher-student learning scheme is adopted using a predetermined loss. Recently, more attention has shifted to employing adversarial training to minimize the discrepancy between the output distributions of the two networks. However, these methods emphasize result-oriented learning while neglecting process-oriented learning, leading to the loss of rich information contained in the whole network pipeline. In other (non-GAN-based) process-oriented methods, by contrast, the knowledge has usually been transferred in a redundant manner. Observing that a small network cannot perfectly mimic a large one due to the huge gap in network scale, we propose a knowledge transfer method, involving effective intermediate supervision, under the adversarial training framework to learn the student network. Unlike other intermediate supervision methods, we design the knowledge representation in a compact form by introducing a task-driven attention mechanism. Meanwhile, to improve the representation capability of the attention-based method, a hierarchical structure is utilized so that powerful but highly squeezed knowledge is realized and the knowledge from the teacher network can accommodate the size of the student network. Extensive experimental results on three typical benchmark datasets, i.e., CIFAR-10, CIFAR-100, and ImageNet, demonstrate that our method achieves highly superior performance against state-of-the-art methods.
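For readers unfamiliar with the "predetermined loss" used in classic teacher-student distillation, the following is a minimal sketch of the widely used temperature-softened KL-divergence loss. It is not the method proposed in the abstract above (which adds adversarial training and hierarchical attention); the function names and the default temperature are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    The temperature value is an assumption chosen for illustration.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.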

Adversarial Optimization-Based Knowledge Transfer of Layer-Wise Dense Flow for Image Classification

Applied Sciences ◽

10.3390/app11083720 ◽

2021 ◽

Vol 11 (8) ◽

pp. 3720

Author(s):

Doyeob Yeo ◽

Min-Suk Kim ◽

Ji-Hoon Bae

Keyword(s):

Knowledge Transfer ◽

Euclidean Distance ◽

Learning Technology ◽

Transfer Scheme ◽

Adversarial Network ◽

Knowledge Distillation ◽

Dense Flow ◽

Adversarial Training ◽

Transfer Method ◽

Accuracy Performance

A deep-learning technology for knowledge transfer is necessary to advance and optimize efficient knowledge distillation. Here, we aim to develop a new adversarial optimization-based knowledge transfer method involving a layer-wise dense flow that is distilled from a pre-trained deep neural network (DNN). The knowledge transferred to a target DNN through adversarial loss functions consists of multiple flow-based knowledge items that are densely extracted, with overlap, from the pre-trained DNN to enhance the existing knowledge. We propose a semi-supervised learning-based knowledge transfer with multiple items of dense flow-based knowledge extracted from the pre-trained DNN. The proposed loss function comprises a supervised cross-entropy loss for typical classification, an adversarial training loss for the target DNN and discriminators, and a Euclidean distance-based loss in terms of dense flow. For both the pre-trained and target DNNs considered in this study, we adopt a residual network (ResNet) architecture. We propose (1) adversarial-based knowledge optimization, (2) an extended flow-based knowledge transfer scheme, and (3) a combined layer-wise dense flow in an adversarial network. The results show that the proposed method yields higher accuracy in the improved target ResNet than prior knowledge transfer methods.
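The three-part loss described in this abstract can be illustrated with a small sketch. The abstract does not give the exact form or weighting of each term, so the non-saturating adversarial term, the unsquared Euclidean distance, and the weights `w_adv` and `w_flow` below are all assumptions made for illustration.

```python
import math

def cross_entropy(probs, label):
    """Supervised term: negative log-probability of the true class."""
    return -math.log(probs[label])

def adversarial_term(disc_score_on_target):
    """Assumed non-saturating form: the target DNN is rewarded when the
    discriminator scores its features as coming from the pre-trained DNN."""
    return -math.log(disc_score_on_target)

def euclidean_term(target_flow, teacher_flow):
    """Euclidean distance between flattened dense-flow vectors."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(target_flow, teacher_flow)))

def total_loss(probs, label, disc_score, target_flow, teacher_flow,
               w_adv=1.0, w_flow=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (cross_entropy(probs, label)
            + w_adv * adversarial_term(disc_score)
            + w_flow * euclidean_term(target_flow, teacher_flow))
```

For example, `total_loss([0.5, 0.5], 0, 0.5, [0.0, 0.0], [3.0, 4.0])` combines two terms of ln 2 with a flow distance of 5.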

Robust CNN Compression Framework for Security-Sensitive Embedded Systems

Applied Sciences ◽

10.3390/app11031093 ◽

2021 ◽

Vol 11 (3) ◽

pp. 1093

Author(s):

Jeonghyun Lee ◽

Sangkyun Lee

Keyword(s):

Embedded Systems ◽

Optimization Problem ◽

State Of The Art ◽

Classification Problems ◽

Proximal Gradient Method ◽

Knowledge Distillation ◽

New Type ◽

Adversarial Examples ◽

Adversarial Training ◽

Memory Efficient

Convolutional neural networks (CNNs) have achieved tremendous success in solving complex classification problems. Motivated by this success, various compression methods have been proposed for downsizing CNNs so that they can be deployed on resource-constrained embedded systems. However, a new type of vulnerability of compressed CNNs, known as adversarial examples, has been discovered recently. This vulnerability is critical for security-sensitive systems because adversarial examples can cause CNNs to malfunction and can be crafted easily in many cases. In this paper, we propose a compression framework to produce compressed CNNs that are robust against such adversarial examples. To achieve this goal, our framework uses both pruning and knowledge distillation with adversarial training. We formulate our framework as an optimization problem and provide a solution algorithm based on the proximal gradient method, which is more memory-efficient than the popular ADMM-based compression approaches. In experiments, we show that our framework can improve the trade-off between adversarial robustness and compression rate compared to the existing state-of-the-art adversarial pruning approach.
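To make the proximal gradient method concrete, here is a minimal sketch on a toy problem. The abstract does not state which sparsity regularizer the framework uses, so the L1 penalty (whose proximal operator is soft-thresholding) and the quadratic toy objective are assumptions, and all names are hypothetical.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value|: shrink toward zero."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def proximal_gradient(target, lam=1.0, step=0.5, iterations=100):
    """Minimize 0.5 * ||w - target||^2 + lam * ||w||_1 (toy objective).

    Each iteration takes a gradient step on the smooth part and then
    applies the proximal operator of the non-smooth L1 part.
    """
    weights = [0.0] * len(target)
    for _ in range(iterations):
        grads = [w - t for w, t in zip(weights, target)]
        weights = [
            soft_threshold(w - step * g, step * lam)
            for w, g in zip(weights, grads)
        ]
    return weights
```

Small-magnitude entries are driven exactly to zero, which is the pruning-like effect that makes this family of methods attractive for compression; only the current weights are stored, with no auxiliary or dual variables.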

Attention-Aware Adversarial Network for Person Re-Identification

Applied Sciences ◽

10.3390/app9081550 ◽

2019 ◽

Vol 9 (8) ◽

pp. 1550 ◽

Cited By ~ 1

Author(s):

Aihong Shen ◽

Huasheng Wang ◽

Junjie Wang ◽

Hongchen Tan ◽

Xiuping Liu ◽

...

Keyword(s):

High Performance ◽

Large Scale ◽

Data Augmentation ◽

Fundamental Problem ◽

Training Data ◽

Specific Data ◽

Training Strategy ◽

Adversarial Network ◽

Benchmark Datasets ◽

Adversarial Training

Person re-identification (re-ID) is a fundamental problem in the field of computer vision. The performance of deep learning-based person re-ID models suffers from a lack of training data. In this work, we introduce a novel image-specific data augmentation method at the feature map level to enforce feature diversity in the network. Furthermore, an attention assignment mechanism is proposed so that the person re-ID classifier focuses on nearly all important regions of the input person image. To achieve this, a three-stage framework is proposed. First, a baseline classification network is trained for person re-ID. Second, an attention assignment network is built on the baseline network, in which the attention module learns to suppress the response of the currently detected regions and reassign attention to other important locations. In this way, multiple important regions for classification are highlighted by the attention map. Finally, the attention map is integrated into the attention-aware adversarial network (AAA-Net), which generates high-performance classification results with an adversarial training strategy. We evaluate the proposed method on two large-scale benchmark datasets, Market1501 and DukeMTMC-reID. Experimental results show that our algorithm performs favorably against state-of-the-art methods.

Zero-Shot Learning from Adversarial Feature Residual to Compact Visual Feature

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6821 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11547-11554

Author(s):

Bo Liu ◽

Qiulei Dong ◽

Zhanyi Hu

Keyword(s):

State Of The Art ◽

Feature Space ◽

Visual Features ◽

Selection Strategy ◽

Semantic Features ◽

Visual Feature ◽

Adversarial Network ◽

Benchmark Datasets ◽

Residual Generator ◽

Object Features

Recently, many zero-shot learning (ZSL) methods have focused on learning discriminative object features in an embedding feature space. However, the distributions of the unseen-class features learned by these methods are prone to partial overlap, resulting in inaccurate object recognition. To address this problem, we propose a novel adversarial network to synthesize compact semantic visual features for ZSL, consisting of a residual generator, a prototype predictor, and a discriminator. The residual generator generates the visual feature residual, which is combined with a visual prototype predicted by the prototype predictor to synthesize the visual feature. The discriminator distinguishes the synthetic visual features from the real ones extracted from an existing categorization CNN. Since the generated residuals are generally numerically much smaller than the distances among the prototypes, the distributions of the unseen-class features synthesized by the proposed network overlap less. In addition, considering that the visual features from categorization CNNs are generally inconsistent with their semantic features, a simple feature selection strategy is introduced for extracting more compact semantic visual features. Extensive experimental results on six benchmark datasets demonstrate that our method achieves significantly better performance than existing state-of-the-art methods, by ∼1.2-13.2% in most cases.

EWGAN: Entropy-Based Wasserstein GAN for Imbalanced Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.330110011 ◽

2019 ◽

Vol 33 ◽

pp. 10011-10012 ◽

Cited By ~ 1

Author(s):

Jinfu Ren ◽

Yang Liu ◽

Jiming Liu

Keyword(s):

Feature Vector ◽

State Of The Art ◽

Random Noise ◽

Classification Performance ◽

Imbalanced Learning ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Benchmark Datasets ◽

Entropy Weighted ◽

Original Feature

In this paper, we propose a novel oversampling strategy dubbed Entropy-based Wasserstein Generative Adversarial Network (EWGAN) to generate data samples for minority classes in imbalanced learning. First, we construct an entropy-weighted label vector for each class to characterize the data imbalance in different classes. Then we concatenate this entropy-weighted label vector with the original feature vector of each data sample, and feed it into the WGAN model to train the generator. After the generator is trained, we concatenate the entropy-weighted label vector with random noise feature vectors, and feed them into the generator to generate data samples for minority classes. Experimental results on two benchmark datasets show that the samples generated by the proposed oversampling strategy can help to improve the classification performance when the data are highly imbalanced. Furthermore, the proposed strategy outperforms other state-of-the-art oversampling algorithms in terms of classification accuracy.
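The concatenation step described above can be sketched as follows. The abstract does not define how the entropy weight is computed, so the weighting used here (each class's contribution `-p * log(p)` to the entropy of the label distribution, placed in a one-hot slot) is purely a hypothetical stand-in, as are all function names.

```python
import math
from collections import Counter

def entropy_weighted_label_vectors(labels, num_classes):
    """Build one weighted one-hot vector per class.

    Assumed weighting (not taken from the paper): -p_c * log(p_c),
    where p_c is the empirical frequency of class c.
    """
    counts = Counter(labels)
    total = len(labels)
    vectors = {}
    for cls in range(num_classes):
        p = counts.get(cls, 0) / total
        weight = -p * math.log(p) if p > 0 else 0.0
        vector = [0.0] * num_classes
        vector[cls] = weight
        vectors[cls] = vector
    return vectors

def concat_features(features, label, label_vectors):
    """Append the class's weighted label vector to a feature or noise vector."""
    return list(features) + label_vectors[label]
```

The same `concat_features` call would serve both phases in the abstract: with real feature vectors when training the generator, and with random noise vectors when sampling new minority-class data.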

Structural Adversarial Variational Auto-Encoder for Attributed Network Embedding

Applied Sciences ◽

10.3390/app11052371 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2371

Author(s):

Junjian Zhan ◽

Feng Li ◽

Yang Wang ◽

Daoyu Lin ◽

Guangluan Xu

Keyword(s):

State Of The Art ◽

Global Information ◽

Network Embedding ◽

Sampling Process ◽

Attributed Network ◽

Benchmark Datasets ◽

Adversarial Training ◽

Low Dimensional ◽

Embedding Methods ◽

Local Proximity

As most networks come with some content in each node, attributed network embedding has aroused much research interest. Most existing attributed network embedding methods aim at learning a fixed representation for each node that encodes its local proximity. However, those methods usually neglect the global information between nodes distant from each other and the distribution of the latent codes. We propose the Structural Adversarial Variational Graph Auto-Encoder (SAVGAE), a novel framework that encodes the network structure and node content into low-dimensional embeddings. On the one hand, our model captures the local proximity and proximities at any distance in a network by exploiting a high-order proximity indicator named Rooted PageRank. On the other hand, our method learns the data distribution of each node representation while using adversarial training to circumvent the side effect that its sampling process has on learning a robust embedding. On benchmark datasets, we demonstrate that our method performs competitively compared with state-of-the-art models.
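Rooted PageRank, the proximity indicator named above, scores every node by the stationary probability of a random walk that keeps restarting at a chosen root. The sketch below computes it by power iteration on an adjacency list; the restart probability, the iteration count, and the handling of dangling nodes are assumptions, and the abstract does not say how SAVGAE computes or uses the scores internally.

```python
def rooted_pagerank(adjacency, root, restart=0.15, iterations=100):
    """Power iteration for a random walk with restart at `root`.

    adjacency: list where adjacency[u] is the list of neighbors of node u.
    At each step the walker jumps back to the root with probability
    `restart`, otherwise it moves to a uniformly random neighbor.
    """
    n = len(adjacency)
    scores = [0.0] * n
    scores[root] = 1.0
    for _ in range(iterations):
        nxt = [0.0] * n
        for u, neighbors in enumerate(adjacency):
            mass = (1.0 - restart) * scores[u]
            if not neighbors:
                # Assumed convention: a walker stuck on a dangling
                # node returns to the root.
                nxt[root] += mass
                continue
            share = mass / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        # Restart mass: scores always sum to 1, so this adds `restart`.
        nxt[root] += restart
        scores = nxt
    return scores
```

On a three-node path graph `[[1], [0, 2], [1]]` rooted at node 0, the far end (node 2) receives a nonzero but smaller score than the root, which is how the indicator reflects proximity beyond immediate neighbors.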

Bidirectional Adversarial Training for Semi-Supervised Domain Adaptation

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/130 ◽

2020 ◽

Author(s):

Pin Jiang ◽

Aming Wu ◽

Yahong Han ◽

Yunfeng Shao ◽

Meiyu Qi ◽

...

Keyword(s):

Additional Data ◽

Domain Adaptation ◽

State Of The Art ◽

Powerful Method ◽

Target Domain ◽

Unsupervised Domain Adaptation ◽

Benchmark Datasets ◽

Adversarial Examples ◽

Adversarial Training ◽

Effective Use

Semi-supervised domain adaptation (SSDA) is a novel branch of machine learning in which, unlike unsupervised domain adaptation, a small number of labeled target examples are available. To make effective use of these additional data so as to bridge the domain gap, one possible way is to generate adversarial examples, which are images with additional perturbations, between the two domains to fill the domain gap. Adversarial training has been proven to be a powerful method for this purpose. However, traditional adversarial training adds noise in arbitrary directions, which is inefficient for migrating between domains or for generating directional noise from the source to the target domain and in reverse. In this work, we devise a general bidirectional adversarial training method and employ gradients to guide adversarial examples across the domain gap, i.e., Adaptive Adversarial Training (AAT) from the source to the target domain and Entropy-penalized Virtual Adversarial Training (E-VAT) from the target to the source domain. In particular, we devise a Bidirectional Adversarial Training (BiAT) network to perform these diverse forms of adversarial training jointly. We evaluate the effectiveness of BiAT on three benchmark datasets, and experimental results demonstrate that the proposed method achieves state-of-the-art performance.

Distilling Portable Generative Adversarial Networks for Image Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5765 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3585-3592 ◽

Cited By ~ 1

Author(s):

Hanting Chen ◽

Yunhe Wang ◽

Han Shu ◽

Changyuan Wen ◽

Chunjing Xu ◽

...

Keyword(s):

Generative Models ◽

Generative Adversarial Networks ◽

Adversarial Networks ◽

Image Translation ◽

Level Information ◽

Benchmark Datasets ◽

Knowledge Distillation ◽

Network Compression ◽

High Level ◽

And Storage

Although Generative Adversarial Networks (GANs) have been widely used in various image-to-image translation tasks, they can hardly be applied on mobile devices due to their heavy computation and storage cost. Traditional network compression methods focus on visual recognition tasks, but never deal with generation tasks. Inspired by knowledge distillation, a student generator with fewer parameters is trained by inheriting the low-level and high-level information from the original heavy teacher generator. To promote the capability of the student generator, we include a student discriminator to measure the distances between real images and images generated by the student and teacher generators. An adversarial learning process is therefore established to optimize the student generator and student discriminator. Qualitative and quantitative analyses from experiments on benchmark datasets demonstrate that the proposed method can learn portable generative models with strong performance.

HorNet: A Hierarchical Offshoot Recurrent Network for Improving Person Re-ID via Image Captioning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/742 ◽

2019 ◽

Cited By ~ 1

Author(s):

Shiyang Yan ◽

Jun Xu ◽

Yuai Liu ◽

Lin Xu

Keyword(s):

State Of The Art ◽

Recurrent Network ◽

Image Captioning ◽

Generative Adversarial Network ◽

Visual Attributes ◽

Adversarial Network ◽

Language Representation ◽

Benchmark Datasets ◽

Similarity Preserving ◽

Domain Transfer

Person re-identification (re-ID) aims to recognize a person of interest across different cameras with notable appearance variance. Existing research has focused on the capability and robustness of visual representation. In this paper, we instead propose a novel hierarchical offshoot recurrent network (HorNet) for improving person re-ID via image captioning. Image captions are semantically richer and more consistent than visual attributes, which could significantly alleviate the variance. We use the similarity preserving generative adversarial network (SPGAN) and an image captioner to perform domain transfer and language description generation. The proposed HorNet can then learn the visual and language representations jointly from both the images and captions, and thus enhance the performance of person re-ID. Extensive experiments are conducted on several benchmark datasets with or without image captions, i.e., CUHK03, Market-1501, and Duke-MTMC, demonstrating the superiority of the proposed method. Our method can generate and extract meaningful image captions while achieving state-of-the-art performance.

MRD-Net: Multi-Modal Residual Knowledge Distillation for Spoken Question Answering

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/549 ◽

2021 ◽

Author(s):

Chenyu You ◽

Nuo Chen ◽

Yuexian Zou

Keyword(s):

Network Performance ◽

Question Answering ◽

State Of The Art ◽

Speech Community ◽

Benchmark Datasets ◽

Knowledge Distillation ◽

Text Features ◽

The Common ◽

The Given ◽

Residual Errors

Spoken question answering (SQA) has recently drawn considerable attention in the speech community. It requires systems to find correct answers from the given spoken passages simultaneously. Common SQA systems consist of an automatic speech recognition (ASR) module and a text-based question answering module. However, previous methods suffer from severe performance degradation due to ASR errors. To alleviate this problem, this work proposes a novel multi-modal residual knowledge distillation method (MRD-Net), which further distills knowledge at the acoustic level from an audio-assistant (Audio-A). Specifically, we utilize the teacher (T) trained on manual transcriptions to guide the training of the student (S) on ASR transcriptions. We also show that introducing an Audio-A helps this procedure by learning residual errors between T and S. Moreover, we propose a simple yet effective attention mechanism to adaptively leverage audio-text features as new deep attention knowledge to boost the network performance. Extensive experiments demonstrate that the proposed MRD-Net achieves superior results compared with state-of-the-art methods on three spoken question answering benchmark datasets.
