Improving the Transferability of Adversarial Examples With a Noise Data Enhancement Framework and Random Erasing

Frontiers in Neurorobotics ◽

10.3389/fnbot.2021.784053 ◽

2021 ◽

Vol 15 ◽

Author(s):

Pengfei Xie ◽

Shuhao Shi ◽

Shuai Yang ◽

Kai Qiao ◽

Ningning Liang ◽

...

Keyword(s):

Success Rate ◽

Deep Neural Networks ◽

Black Box ◽

Excellent Performance ◽

Training Models ◽

Noise Data ◽

Adversarial Examples ◽

Adversarial Training ◽

Fast Gradient ◽

Sign Method

Deep neural networks (DNNs) are proven vulnerable to attack against adversarial examples. Black-box transfer attacks pose a massive threat to AI applications without accessing target models. At present, the most effective black-box attack methods mainly adopt data enhancement methods, such as input transformation. Previous data enhancement frameworks only work on input transformations that satisfy accuracy or loss invariance. However, it does not work for other transformations that do not meet the above conditions, such as the transformation which will lose information. To solve this problem, we propose a new noise data enhancement framework (NDEF), which only transforms adversarial perturbation to avoid the above issues effectively. In addition, we introduce random erasing under this framework to prevent the over-fitting of adversarial examples. Experimental results show that the black-box attack success rate of our method Random Erasing Iterative Fast Gradient Sign Method (REI-FGSM) is 4.2% higher than DI-FGSM in six models on average and 6.6% higher than DI-FGSM in three defense models. REI-FGSM can combine with other methods to achieve excellent performance. The attack performance of SI-FGSM can be improved by 22.9% on average when combined with REI-FGSM. Besides, our combined version with DI-TI-MI-FGSM, i.e., DI-TI-MI-REI-FGSM can achieve an average attack success rate of 97.0% against three ensemble adversarial training models, which is greater than the current gradient iterative attack method. We also introduce Gaussian blur to prove the compatibility of our framework.

Download Full-text

Random Transformation of image brightness for adversarial attack

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211157 ◽

2021 ◽

pp. 1-12

Author(s):

Bo Yang ◽

Kaiyong Xu ◽

Hengjun Wang ◽

Hengwei Zhang

Keyword(s):

Success Rate ◽

Data Augmentation ◽

Black Box ◽

Image Brightness ◽

Gradient Based ◽

Adversarial Examples ◽

Random Transformation ◽

Fast Gradient ◽

Adversarial Example ◽

Sign Method

Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding small, human-imperceptible perturbations to the original images, but make the model output inaccurate predictions. Before DNNs are deployed, adversarial attacks can thus be an important method to evaluate and select robust models in safety-critical applications. However, under the challenging black-box setting, the attack success rate, i.e., the transferability of adversarial examples, still needs to be improved. Based on image augmentation methods, this paper found that random transformation of image brightness can eliminate overfitting in the generation of adversarial examples and improve their transferability. In light of this phenomenon, this paper proposes an adversarial example generation method, which can be integrated with Fast Gradient Sign Method (FGSM)-related methods to build a more robust gradient-based attack and to generate adversarial examples with better transferability. Extensive experiments on the ImageNet dataset have demonstrated the effectiveness of the aforementioned method. Whether on normally or adversarially trained networks, our method has a higher success rate for black-box attacks than other attack methods based on data augmentation. It is hoped that this method can help evaluate and improve the robustness of models.

Download Full-text

Complete Defense Framework to Protect Deep Neural Networks against Adversarial Examples

Mathematical Problems in Engineering ◽

10.1155/2020/8319249 ◽

2020 ◽

Vol 2020 ◽

pp. 1-17

Author(s):

Guangling Sun ◽

Yuying Su ◽

Chuan Qin ◽

Wenbo Xu ◽

Xiaofeng Lu ◽

...

Keyword(s):

Neural Networks ◽

Iterative Method ◽

Deep Neural Networks ◽

Great Success ◽

Minor Alteration ◽

Adversarial Examples ◽

Adversarial Training ◽

Fast Gradient ◽

Sign Method

Although Deep Neural Networks (DNNs) have achieved great success on various applications, investigations have increasingly shown DNNs to be highly vulnerable when adversarial examples are used as input. Here, we present a comprehensive defense framework to protect DNNs against adversarial examples. First, we present statistical and minor alteration detectors to filter out adversarial examples contaminated by noticeable and unnoticeable perturbations, respectively. Then, we ensemble the detectors, a deep Residual Generative Network (ResGN), and an adversarially trained targeted network, to construct a complete defense framework. In this framework, the ResGN is our previously proposed network which is used to remove adversarial perturbations, and the adversarially trained targeted network is a network that is learned through adversarial training. Specifically, once the detectors determine an input example to be adversarial, it is cleaned by ResGN and then classified by the adversarially trained targeted network; otherwise, it is directly classified by this network. We empirically evaluate the proposed complete defense on ImageNet dataset. The results confirm the robustness against current representative attacking methods including fast gradient sign method, randomized fast gradient sign method, basic iterative method, universal adversarial perturbations, DeepFool method, and Carlini & Wagner method.

Download Full-text

MedicalGuard: U-Net Model Robust against Adversarially Perturbed Images

Security and Communication Networks ◽

10.1155/2021/5595026 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Hyun Kwon

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Human Perception ◽

Original Data ◽

Classification Model ◽

Medical Field ◽

Adversarial Examples ◽

Adversarial Training ◽

Fast Gradient ◽

Sign Method

Deep neural networks perform well for image recognition, speech recognition, and pattern analysis. This type of neural network has also been used in the medical field, where it has displayed good performance in predicting or classifying patient diagnoses. An example is the U-Net model, which has demonstrated good performance in data segmentation, an important technology in the field of medical imaging. However, deep neural networks are vulnerable to adversarial examples. Adversarial examples are samples created by adding a small amount of noise to an original data sample in such a way that to human perception they appear to be normal data but they will be incorrectly classified by the classification model. Adversarial examples pose a significant threat in the medical field, as they can cause models to misidentify or misclassify patient diagnoses. In this paper, I propose an advanced adversarial training method to defend against such adversarial examples. An advantage of the proposed method is that it creates a wide variety of adversarial examples for use in training, which are generated by the fast gradient sign method (FGSM) for a range of epsilon values. A U-Net model trained on these diverse adversarial examples will be more robust to unknown adversarial examples. Experiments were conducted using the ISBI 2012 dataset, with TensorFlow as the machine learning library. According to the experimental results, the proposed method builds a model that demonstrates segmentation robustness against adversarial examples by reducing the pixel error between the original labels and the adversarial examples to an average of 1.45.

Download Full-text

Enhancing adversarial attack transferability with multi-scale feature attack

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691320500769 ◽

2020 ◽

pp. 2050076

Author(s):

Caixia Sun ◽

Lian Zou ◽

Cien Fan ◽

Yu Shi ◽

Yifeng Liu

Keyword(s):

Internal Representation ◽

Feature Space ◽

Source Model ◽

Black Box ◽

Space Representation ◽

Scale Feature ◽

Multi Scale ◽

Adversarial Examples ◽

Fast Gradient ◽

Sign Method

Deep neural networks are vulnerable to adversarial examples, which can fool models by adding carefully designed perturbations. An intriguing phenomenon is that adversarial examples often exhibit transferability, thus making black-box attacks effective in real-world applications. However, the adversarial examples generated by existing methods typically overfit the structure and feature representation of the source model, resulting in a low success rate in a black-box manner. To address this issue, we propose the multi-scale feature attack to boost attack transferability, which adjusts the internal feature space representation of the adversarial image to get far to the internal representation of the original image. We show that we can select a low-level layer and a high-level layer of the source model to conduct the perturbations, and the crafted adversarial examples are confused with original images, not just in the class but also in the feature space representations. To further improve the transferability of adversarial examples, we apply reverse cross-entropy loss to reduce the overfitting further and show that it is effective for attacking adversarially trained models with strong defensive ability. Extensive experiments show that the proposed methods consistently outperform the iterative fast gradient sign method (IFGSM) and momentum iterative fast gradient sign method (MIFGSM) under the challenging black-box setting.

Download Full-text

Generating Adversarial Examples with Adversarial Networks

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/543 ◽

2018 ◽

Cited By ~ 65

Author(s):

Chaowei Xiao ◽

Bo Li ◽

Jun-yan Zhu ◽

Warren He ◽

Mingyan Liu ◽

...

Keyword(s):

Deep Neural Networks ◽

State Of The Art ◽

Black Box ◽

Generative Adversarial Networks ◽

Perceptual Quality ◽

Small Magnitude ◽

Adversarial Networks ◽

Original Target ◽

Adversarial Examples ◽

Adversarial Training

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial exam- ples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply Adv- GAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rate under state-of-the-art defenses compared to other attacks. Our attack has placed the first with 92.76% accuracy on a public MNIST black-box attack challenge.

Download Full-text

Boosting Adversarial Attacks on Neural Networks with Better Optimizer

Security and Communication Networks ◽

10.1155/2021/9983309 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Heng Yin ◽

Hengwei Zhang ◽

Jindong Wang ◽

Ruiyu Dou

Keyword(s):

Neural Networks ◽

Success Rate ◽

Gradient Descent ◽

State Of The Art ◽

Black Box ◽

Security Threats ◽

Gradient Descent Algorithm ◽

Gradient Based ◽

Adversarial Examples ◽

Fast Gradient

Convolutional neural networks have outperformed humans in image recognition tasks, but they remain vulnerable to attacks from adversarial examples. Since these data are crafted by adding imperceptible noise to normal images, their existence poses potential security threats to deep learning systems. Sophisticated adversarial examples with strong attack performance can also be used as a tool to evaluate the robustness of a model. However, the success rate of adversarial attacks can be further improved in black-box environments. Therefore, this study combines a modified Adam gradient descent algorithm with the iterative gradient-based attack method. The proposed Adam iterative fast gradient method is then used to improve the transferability of adversarial examples. Extensive experiments on ImageNet showed that the proposed method offers a higher attack success rate than existing iterative methods. By extending our method, we achieved a state-of-the-art attack success rate of 95.0% on defense models.

Download Full-text

Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks

Symmetry ◽

10.3390/sym13030428 ◽

2021 ◽

Vol 13 (3) ◽

pp. 428

Author(s):

Hyun Kwon ◽

Jun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Diversity Training ◽

Original Data ◽

Training Method ◽

Learning Framework ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Accuracy Rates

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance regarding image and voice recognition, as well as pattern analysis and intrusion detection, they exhibit inferior performance towards adversarial examples. Noise introduction, to some degree, to the original data could lead adversarial examples to be misclassified by deep neural networks, even though they can still be deemed as normal by humans. In this paper, a robust diversity adversarial training method against adversarial attacks was demonstrated. In this approach, the target model is more robust to unknown adversarial examples, as it trains various adversarial samples. During the experiment, Tensorflow was employed as our deep learning framework, while MNIST and Fashion-MNIST were used as experimental datasets. Results revealed that the diversity training method has lowered the attack success rate by an average of 27.2 and 24.3% for various adversarial examples, while maintaining the 98.7 and 91.5% accuracy rates regarding the original data of MNIST and Fashion-MNIST.

Download Full-text

A Black-Box Approach to Generate Adversarial Examples Against Deep Neural Networks for High Dimensional Input

2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC) ◽

10.1109/dsc.2019.00078 ◽

2019 ◽

Author(s):

Chengru Song ◽

Changqiao Xu ◽

Shujie Yang ◽

Zan Zhou ◽

Changhui Gong

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Black Box ◽

High Dimensional ◽

Adversarial Examples

Download Full-text

Generate Adversarial Examples by Nesterov-momentum Iterative Fast Gradient Sign Method

2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS) ◽

10.1109/icsess49938.2020.9237700 ◽

2020 ◽

Author(s):

Jin Xu

Keyword(s):

Adversarial Examples ◽

Fast Gradient ◽

Sign Method

Download Full-text

On the Effectiveness of Adversarial Training in Defending against Adversarial Example Attacks for Image Classification

Applied Sciences ◽

10.3390/app10228079 ◽

2020 ◽

Vol 10 (22) ◽

pp. 8079

Author(s):

Sanglee Park ◽

Jungmin So

Keyword(s):

Data Augmentation ◽

Black Box ◽

Training Data ◽

Model Parameters ◽

Neural Network Models ◽

Practical Applications ◽

Target Network ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Example

State-of-the-art neural network models are actively used in various fields, but it is well-known that they are vulnerable to adversarial example attacks. Throughout the efforts to make the models robust against adversarial example attacks, it has been found to be a very difficult task. While many defense approaches were shown to be not effective, adversarial training remains as one of the promising methods. In adversarial training, the training data are augmented by “adversarial” samples generated using an attack algorithm. If the attacker uses a similar attack algorithm to generate adversarial examples, the adversarially trained network can be quite robust to the attack. However, there are numerous ways of creating adversarial examples, and the defender does not know what algorithm the attacker may use. A natural question is: Can we use adversarial training to train a model robust to multiple types of attack? Previous work have shown that, when a network is trained with adversarial examples generated from multiple attack methods, the network is still vulnerable to white-box attacks where the attacker has complete access to the model parameters. In this paper, we study this question in the context of black-box attacks, which can be a more realistic assumption for practical applications. Experiments with the MNIST dataset show that adversarially training a network with an attack method helps defending against that particular attack method, but has limited effect for other attack methods. In addition, even if the defender trains a network with multiple types of adversarial examples and the attacker attacks with one of the methods, the network could lose accuracy to the attack if the attacker uses a different data augmentation strategy on the target network. These results show that it is very difficult to make a robust network using adversarial training, even for black-box settings where the attacker has restricted information on the target network.

Download Full-text