Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

Minhao Cheng; Jinfeng Yi; Pin-Yu Chen; Huan Zhang; Cho-Jui Hsieh

doi:10.1609/aaai.v34i04.5767

Proceedings of the AAAI Conference on Artificial Intelligence ◽

2020 ◽

Vol 34 (04) ◽

pp. 3601-3608

Author(s):

Minhao Cheng ◽

Jinfeng Yi ◽

Pin-Yu Chen ◽

Huan Zhang ◽

Cho-Jui Hsieh

Keyword(s):

Deep Neural Networks ◽

Classification Problem ◽

Text Summarization ◽

Loss Functions ◽

Challenging Problem ◽

Success Rates ◽

Projected Gradient Method ◽

Input Space ◽

Adversarial Examples ◽

Output Space

Crafting adversarial examples has become an important technique for evaluating the robustness of deep neural networks (DNNs). However, most existing work focuses on attacking image classification, since its input space is continuous and its output space is finite. In this paper, we study the much more challenging problem of crafting adversarial examples for sequence-to-sequence (seq2seq) models, whose inputs are discrete text strings and whose outputs have an almost infinite number of possibilities. To address the challenges caused by the discrete input space, we propose a projected gradient method combined with group lasso and gradient regularization. To handle the almost infinite output space, we design novel loss functions to conduct non-overlapping attacks and targeted keyword attacks. We apply our algorithm to machine translation and text summarization tasks and verify its effectiveness: by changing fewer than 3 words, we can make a seq2seq model produce the desired outputs with high success rates. We also use an external sentiment classifier to verify that our generated adversarial examples preserve semantic meaning. On the other hand, we recognize that, compared with the well-evaluated CNN-based classifiers, seq2seq models are intrinsically more robust to adversarial attacks.
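The two ingredients named in this abstract, a group lasso penalty and a projection back onto valid words, can be illustrated with a small sketch. It is not the authors' implementation: the toy 2-d embeddings, the perturbation values, and the helper names `group_soft_threshold` and `project_to_vocab` are all assumptions made for illustration. The group lasso step shrinks each word's perturbation vector as a unit, so most words stay unchanged, and the projection maps each perturbed vector to its nearest vocabulary embedding.

```python
import math

def group_soft_threshold(delta, lam):
    """Proximal step for a group lasso penalty: shrink each word's
    perturbation vector as a whole, zeroing it when its norm is <= lam."""
    out = []
    for row in delta:
        norm = math.sqrt(sum(v * v for v in row))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * v for v in row])
    return out

def project_to_vocab(vectors, vocab):
    """Map each continuous vector to the word with the nearest embedding."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(vocab, key=lambda w: sq_dist(vec, vocab[w])) for vec in vectors]

# Toy 2-d embeddings (hypothetical values, for illustration only).
vocab = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "movie": [0.0, 1.0]}
sentence = ["good", "movie"]
x = [vocab[w] for w in sentence]

# Pretend a gradient step on the attack loss produced these perturbations.
delta = [[-1.5, 0.1], [0.05, 0.05]]

# Group lasso keeps the large perturbation and zeroes the small one.
delta = group_soft_threshold(delta, lam=0.2)

# Add the perturbation and snap back to real words.
perturbed = [[a + b for a, b in zip(xi, di)] for xi, di in zip(x, delta)]
adv_sentence = project_to_vocab(perturbed, vocab)
print(adv_sentence)
```

In this toy run only the first word is replaced, which mirrors the abstract's goal of changing very few words per sentence.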

Demiguise Attack: Crafting Invisible Semantic Adversarial Perturbations with Perceptual Similarity

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/430 ◽

2021 ◽

Author(s):

Yajie Wang ◽

Shangbo Wu ◽

Wenyi Jiang ◽

Shengang Hao ◽

Yu-an Tan ◽

...

Keyword(s):

Deep Neural Networks ◽

Semantic Information ◽

State Of The Art ◽

Perceptual Similarity ◽

Success Rates ◽

Lp Norm ◽

Box Models ◽

Adversarial Examples ◽

Black Box Models ◽

Blind Spots

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples. Adversarial examples are malicious images with visually imperceptible perturbations. While these carefully crafted perturbations restricted with tight Lp norm bounds are small, they are still easily perceivable by humans. These perturbations also have limited success rates when attacking black-box models or models with defenses like noise reduction filters. To solve these problems, we propose Demiguise Attack, crafting "unrestricted" perturbations with Perceptual Similarity. Specifically, we can create powerful and photorealistic adversarial examples by manipulating semantic information based on Perceptual Similarity. Adversarial examples we generate are friendly to the human visual system (HVS), although the perturbations are of large magnitudes. We extend widely-used attacks with our approach, enhancing adversarial effectiveness impressively while contributing to imperceptibility. Extensive experiments show that the proposed method not only outperforms various state-of-the-art attacks in terms of fooling rate, transferability, and robustness against defenses but can also improve attacks effectively. In addition, we also notice that our implementation can simulate illumination and contrast changes that occur in real-world scenarios, which will contribute to exposing the blind spots of DNNs.

Boosting Targeted Black-Box Attacks via Ensemble Substitute Training and Linear Augmentation

Applied Sciences ◽

10.3390/app9112286 ◽

2019 ◽

Vol 9 (11) ◽

pp. 2286 ◽

Cited By ~ 6

Author(s):

Xianfeng Gao ◽

Yu-an Tan ◽

Hongwei Jiang ◽

Quanxin Zhang ◽

Xiaohui Kuang

Keyword(s):

Neural Networks ◽

Image Classification ◽

Deep Neural Networks ◽

Black Box ◽

Decision Boundary ◽

Success Rates ◽

Small Perturbations ◽

Targeted Attacks ◽

Adversarial Examples ◽

Effectiveness And Efficiency

In recent years, Deep Neural Networks (DNNs) have shown unprecedented performance in many areas. However, some recent studies have revealed their vulnerability to small perturbations added to source inputs. The methods used to generate these perturbations are called adversarial attacks, and they fall into two types, black-box and white-box, according to the adversary’s access to the target model. To overcome the black-box attacker’s lack of access to the internals of the target DNN, researchers have put forward a series of strategies. Previous work includes training a local substitute model for the target black-box model via Jacobian-based augmentation and then using the substitute model to craft adversarial examples with white-box methods. In this work, we improve the dataset augmentation so that the substitute models better fit the decision boundary of the target model. Unlike the previous work, which performed only non-targeted attacks, we are the first to generate targeted adversarial examples by training substitute models. Moreover, to boost the targeted attacks, we apply the idea of ensemble attacks to the substitute training. Experiments on MNIST and GTSRB, two common datasets for image classification, demonstrate the effectiveness and efficiency of our approach in boosting targeted black-box attacks: we attack the MNIST and GTSRB classifiers with success rates of 97.7% and 92.8%, respectively.
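For orientation, the baseline Jacobian-based augmentation mentioned in this abstract can be sketched as follows. This shows the earlier baseline technique, not the improved linear augmentation this paper proposes. The substitute here is a toy linear softmax model, and `W`, `oracle`, and `lam` are hypothetical stand-ins: each existing point is pushed a small step along the sign of the gradient of the oracle-assigned class probability, and the new points are added to the substitute's training set.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def jacobian_row(W, x, k):
    """Gradient of the class-k probability of softmax(W x) with respect to x."""
    p = softmax([sum(w * v for w, v in zip(row, x)) for row in W])
    mean_w = [sum(p[j] * W[j][i] for j in range(len(W))) for i in range(len(x))]
    return [p[k] * (W[k][i] - mean_w[i]) for i in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def augment(W, data, oracle, lam=0.1):
    """One augmentation round: x_new = x + lam * sign(dF_label/dx)."""
    new_points = []
    for x in data:
        label = oracle(x)            # query the black-box target for a label
        g = jacobian_row(W, x, label)
        new_points.append([xi + lam * sign(gi) for xi, gi in zip(x, g)])
    return data + new_points

# Hypothetical substitute weights and a stand-in black-box oracle.
W = [[1.0, -1.0], [-1.0, 1.0]]
oracle = lambda x: 0 if x[0] > x[1] else 1
augmented = augment(W, [[0.5, 0.2]], oracle)
print(augmented)
```

Each round doubles the dataset, so in practice the number of rounds is kept small to limit queries to the target model.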

Natural Scene Statistics for Detecting Adversarial Examples in Deep Neural Networks

2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP) ◽

10.1109/mmsp48831.2020.9287056 ◽

2020 ◽

Author(s):

Anouar Kherchouche ◽

Sid Ahmed Fezza ◽

Wassim Hamidouche ◽

Olivier Deforges

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Natural Scene ◽

Natural Scene Statistics ◽

Adversarial Examples

Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks

Symmetry ◽

10.3390/sym13030428 ◽

2021 ◽

Vol 13 (3) ◽

pp. 428

Author(s):

Hyun Kwon ◽

Jun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Diversity Training ◽

Original Data ◽

Training Method ◽

Learning Framework ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Accuracy Rates

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance in image and voice recognition, as well as in pattern analysis and intrusion detection, they perform poorly on adversarial examples. Adding a small amount of noise to the original data can produce adversarial examples that deep neural networks misclassify, even though humans still perceive them as normal. In this paper, a robust diversity adversarial training method against adversarial attacks is demonstrated. In this approach, the target model is more robust to unknown adversarial examples because it is trained on various adversarial samples. In the experiments, TensorFlow was employed as the deep learning framework, while MNIST and Fashion-MNIST were used as experimental datasets. Results revealed that the diversity training method lowered the attack success rate by an average of 27.2% and 24.3% for various adversarial examples, while maintaining accuracy rates of 98.7% and 91.5% on the original data of MNIST and Fashion-MNIST, respectively.
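A rough sketch of the idea of training on varied adversarial samples is given below. The abstract does not say how the diverse samples are produced, so using the fast gradient sign method (FGSM) at several step sizes is an assumption made purely for illustration, as are the logistic-regression model and the names `fgsm` and `diversified_training_set`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    """FGSM for logistic regression: step along the sign of dLoss/dx,
    where dLoss/dx = (sigmoid(w.x + b) - y) * w for cross-entropy loss."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

def diversified_training_set(w, b, data, eps_values):
    """Original samples plus one adversarial copy per (sample, eps) pair,
    each keeping its true label."""
    out = list(data)
    for x, y in data:
        for eps in eps_values:
            out.append((fgsm(w, b, x, y, eps), y))
    return out

# Hypothetical model parameters and a single labelled sample.
w, b = [2.0, -1.0], 0.0
data = [([1.0, 1.0], 1)]
train_set = diversified_training_set(w, b, data, eps_values=[0.1, 0.3])

logit = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
print([round(logit(x), 3) for x, _ in train_set])
```

The adversarial copies have lower logits for the true class than the clean sample, so retraining on `train_set` exposes the model to inputs near its decision boundary.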

Self-Amplificated Network: Learning fine-grained learner with few samples

Journal of Physics Conference Series ◽

10.1088/1742-6596/2050/1/012006 ◽

2021 ◽

Vol 2050 (1) ◽

pp. 012006

Author(s):

Xili Dai ◽

Chunmei Ma ◽

Jingwei Sun ◽

Tao Zhang ◽

Haigang Gong ◽

...

Keyword(s):

Deep Neural Networks ◽

Classification Problem ◽

The Self ◽

Superior Performance ◽

Query Image ◽

Network Learning ◽

Fine Grained ◽

Support Set ◽

Meta Learning ◽

Benchmark Datasets

Training deep neural networks from only a few examples has been an interesting topic that motivated few-shot learning. In this paper, we study the fine-grained image classification problem in a challenging few-shot learning setting and propose the Self-Amplificated Network (SAN), a meta-learning-based method to tackle this problem. The SAN model consists of three parts: the Encoder, Amplification, and Similarity Modules. The Encoder Module encodes a fine-grained image input into a feature vector. The Amplification Module amplifies subtle differences between fine-grained images using a self-attention mechanism composed of multi-head attention. The Similarity Module measures how similar the query image and the support set are in order to determine the classification result. In-depth experiments on three benchmark datasets show that our network achieves superior performance over the competing baselines.

USE OF GENETIC ALGORITHM IN DEEP NEURAL NETWORKS CONFIGURATION FOR THE PURPOSES OF COMPUTER ATTACKS CLASSIFICATION

10.22250/isu.2020.66.104-117 ◽

2020 ◽

pp. 104-117

Author(s):

O.S. Amosov ◽

S.G. Amosova ◽

D.S. Magola ◽

...

Keyword(s):

Neural Network ◽

Genetic Algorithm ◽

Network Architecture ◽

Deep Neural Network ◽

Deep Neural Networks ◽

Classification Problem ◽

Neural Network Architecture ◽

Computer Attacks ◽

Neural Network Technology

The task of multiclass network classification of computer attacks is stated. The applicability of deep neural network technology to solving this problem is considered. The deep neural network architecture was chosen based on a strategy of combining a set of convolutional and recurrent LSTM layers. Optimization of the neural network parameters based on a genetic algorithm is proposed. The presented modeling results show that the network classification problem can be solved in real time.
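A genetic algorithm over network configurations might look like the sketch below. Everything specific in it is hypothetical: the search space, the `TARGET` configuration, and the `fitness` function are stand-ins, since a real run would train each candidate network and score it by validation accuracy. The loop keeps the best half of the population, then fills the rest with children made by uniform crossover and random mutation.

```python
import random

# Hypothetical search space for a convolution + LSTM classifier.
SEARCH_SPACE = {
    "conv_layers": [1, 2, 3],
    "lstm_units": [32, 64, 128],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

# Made-up optimum so the sketch runs without training anything.
TARGET = {"conv_layers": 2, "lstm_units": 64, "learning_rate": 1e-3}

def fitness(cfg):
    """Stand-in score: minus the number of genes that differ from TARGET."""
    return -sum(cfg[k] != TARGET[k] for k in cfg)

def evolve(generations=20, pop_size=8, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    keys = list(SEARCH_SPACE)
    pop = [{k: rng.choice(SEARCH_SPACE[k]) for k in keys} for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]              # selection: keep the top half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = {k: rng.choice([a[k], b[k]]) for k in keys}  # uniform crossover
            for k in keys:
                if rng.random() < mutation_rate:    # mutation: resample one gene
                    child[k] = rng.choice(SEARCH_SPACE[k])
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness), history

best, history = evolve()
print(best, history[-1])
```

Because the surviving parents are carried over unchanged, the best score in `history` never decreases from one generation to the next.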

Deep Convolutional Neural Network for Object Classification

Handbook of Research on Deep Learning-Based Image Analysis Under Constrained and Unconstrained Environments - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-7998-6690-9.ch016 ◽

2021 ◽

pp. 317-343

Author(s):

Amira Ahmad Al-Sharkawy ◽

Gehan A. Bahgat ◽

Elsayed E. Hemayed ◽

Samia Abdel-Razik Mashali

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Computational Models ◽

Human Performance ◽

Deep Neural Networks ◽

Object Classification ◽

Classification Problem ◽

Deep Convolutional Neural Network ◽

Object Appearance

The object classification problem is essential in many applications nowadays. Humans can easily classify objects in unconstrained environments, whereas classical classification techniques fell far short of human performance. Researchers therefore tried to mimic the human visual system, which eventually led to deep neural networks. This chapter reviews and analyzes the use of deep convolutional neural networks for object classification under constrained and unconstrained environments. It gives a brief review of the classical techniques of object classification and of the development of bio-inspired computational models, from neuroscience to the creation of deep neural networks. A review is given of the constrained-environment issues: hardware computing resources and memory, object appearance and background, and training and processing time. The datasets used to test performance are analyzed according to the environmental conditions of their images, and dataset bias is also discussed.

Utilizing Information Bottleneck to Evaluate the Capability of Deep Neural Networks for Image Classification

Entropy ◽

10.3390/e21050456 ◽

2019 ◽

Vol 21 (5) ◽

pp. 456 ◽

Cited By ~ 3

Author(s):

Hao Cheng ◽

Dongze Lian ◽

Shenghua Gao ◽

Yanlin Geng

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Model Selection ◽

Image Classification ◽

Transfer Learning ◽

Deep Neural Networks ◽

Classification Problem ◽

Distribution Model ◽

Classification Problems ◽

Information Bottleneck

Inspired by the pioneering work of the information bottleneck (IB) principle for Deep Neural Networks’ (DNNs) analysis, we thoroughly study the relationship among the model accuracy, I(X;T) and I(T;Y), where I(X;T) and I(T;Y) are the mutual information of DNN’s output T with input X and label Y. Then, we design an information plane-based framework to evaluate the capability of DNNs (including CNNs) for image classification. Instead of each hidden layer’s output, our framework focuses on the model output T. We successfully apply our framework to many application scenarios arising in deep learning and image classification problems, such as image classification with unbalanced data distribution, model selection, and transfer learning. The experimental results verify the effectiveness of the information plane-based framework: our framework may facilitate a quick model selection and determine the number of samples needed for each class in the unbalanced classification problem. Furthermore, the framework explains the efficiency of transfer learning in the deep learning area.
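To make the quantity I(T;Y) concrete, the sketch below computes mutual information from a table of co-occurrence counts. Treating T as the predicted class label, so that the table is a confusion matrix, is a simplifying assumption for illustration and may differ from the estimator used in the paper. The function name `mutual_information` is likewise hypothetical.

```python
import math

def mutual_information(joint_counts):
    """I(T;Y) in bits, where joint_counts[t][y] is how often model output t
    co-occurred with true label y."""
    total = sum(sum(row) for row in joint_counts)
    p_t = [sum(row) / total for row in joint_counts]
    p_y = [sum(row[j] for row in joint_counts) / total
           for j in range(len(joint_counts[0]))]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count:                       # 0 * log(0) is taken as 0
                p_ty = count / total
                mi += p_ty * math.log2(p_ty / (p_t[i] * p_y[j]))
    return mi

perfect = [[50, 0], [0, 50]]     # predictions always match labels
chance = [[25, 25], [25, 25]]    # predictions independent of labels
print(mutual_information(perfect), mutual_information(chance))
```

For two balanced classes, a perfect classifier shares the full 1 bit of label information, while a classifier at chance shares none.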

A Black-Box Approach to Generate Adversarial Examples Against Deep Neural Networks for High Dimensional Input

2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC) ◽

10.1109/dsc.2019.00078 ◽

2019 ◽

Author(s):

Chengru Song ◽

Changqiao Xu ◽

Shujie Yang ◽

Zan Zhou ◽

Changhui Gong

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Black Box ◽

High Dimensional ◽

Adversarial Examples

Cycle-Consistent Adversarial GAN: The Integration of Adversarial Attack and Defense

Security and Communication Networks ◽

10.1155/2020/3608173 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Lingyun Jiang ◽

Kai Qiao ◽

Ruoxi Qin ◽

Linyuan Wang ◽

Wanting Yu ◽

...

Keyword(s):

Deep Learning ◽

Deep Neural Networks ◽

State Of The Art ◽

Small Magnitude ◽

Defense Strategies ◽

Adversarial Examples ◽

Adversarial Attack ◽

Public Datasets ◽

Attack And Defense

In deep-learning image classification, adversarial examples, that is, inputs to which small-magnitude perturbations have been intentionally added, may mislead deep neural networks (DNNs) into incorrect results, which means DNNs are vulnerable to them. Different attack and defense strategies have been proposed to better study the mechanism of deep learning. However, existing studies address only one aspect, either attack or defense. Offensive and defensive performance are improved separately, and it is difficult for the two to promote each other within the same framework. In this paper, we propose Cycle-Consistent Adversarial GAN (CycleAdvGAN) to generate adversarial examples; it can learn and approximate the distributions of the original instances and the adversarial examples, in particular driving attackers and defenders to confront each other and improve their abilities. For CycleAdvGAN, once the generators GA and GD are trained, GA can generate adversarial perturbations efficiently for any instance, improving the performance of existing attack methods, and GD can recover clean instances from adversarial examples, defending against existing attack methods. We apply CycleAdvGAN under semi-white-box and black-box settings on two public datasets, MNIST and CIFAR10. Extensive experiments show that our method achieves state-of-the-art adversarial attack performance and also efficiently improves defense ability, realizing the integration of adversarial attack and defense. In addition, it improves the attack effect when trained only on an adversarial dataset generated by any kind of adversarial attack.