Complete Defense Framework to Protect Deep Neural Networks against Adversarial Examples

Mathematical Problems in Engineering ◽

10.1155/2020/8319249 ◽

2020 ◽

Vol 2020 ◽

pp. 1-17

Author(s):

Guangling Sun ◽

Yuying Su ◽

Chuan Qin ◽

Wenbo Xu ◽

Xiaofeng Lu ◽

...

Keyword(s):

Neural Networks ◽

Iterative Method ◽

Deep Neural Networks ◽

Great Success ◽

Minor Alteration ◽

Adversarial Examples ◽

Adversarial Training ◽

Fast Gradient ◽

Sign Method

Although Deep Neural Networks (DNNs) have achieved great success on various applications, investigations have increasingly shown DNNs to be highly vulnerable when adversarial examples are used as input. Here, we present a comprehensive defense framework to protect DNNs against adversarial examples. First, we present statistical and minor alteration detectors to filter out adversarial examples contaminated by noticeable and unnoticeable perturbations, respectively. Then, we ensemble the detectors, a deep Residual Generative Network (ResGN), and an adversarially trained targeted network, to construct a complete defense framework. In this framework, the ResGN is our previously proposed network which is used to remove adversarial perturbations, and the adversarially trained targeted network is a network that is learned through adversarial training. Specifically, once the detectors determine an input example to be adversarial, it is cleaned by ResGN and then classified by the adversarially trained targeted network; otherwise, it is directly classified by this network. We empirically evaluate the proposed complete defense on ImageNet dataset. The results confirm the robustness against current representative attacking methods including fast gradient sign method, randomized fast gradient sign method, basic iterative method, universal adversarial perturbations, DeepFool method, and Carlini & Wagner method.

Get full-text (via PubEx)

MedicalGuard: U-Net Model Robust against Adversarially Perturbed Images

Security and Communication Networks ◽

10.1155/2021/5595026 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Hyun Kwon

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Human Perception ◽

Original Data ◽

Classification Model ◽

Medical Field ◽

Adversarial Examples ◽

Adversarial Training ◽

Fast Gradient ◽

Sign Method

Deep neural networks perform well for image recognition, speech recognition, and pattern analysis. This type of neural network has also been used in the medical field, where it has displayed good performance in predicting or classifying patient diagnoses. An example is the U-Net model, which has demonstrated good performance in data segmentation, an important technology in the field of medical imaging. However, deep neural networks are vulnerable to adversarial examples. Adversarial examples are samples created by adding a small amount of noise to an original data sample in such a way that to human perception they appear to be normal data but they will be incorrectly classified by the classification model. Adversarial examples pose a significant threat in the medical field, as they can cause models to misidentify or misclassify patient diagnoses. In this paper, I propose an advanced adversarial training method to defend against such adversarial examples. An advantage of the proposed method is that it creates a wide variety of adversarial examples for use in training, which are generated by the fast gradient sign method (FGSM) for a range of epsilon values. A U-Net model trained on these diverse adversarial examples will be more robust to unknown adversarial examples. Experiments were conducted using the ISBI 2012 dataset, with TensorFlow as the machine learning library. According to the experimental results, the proposed method builds a model that demonstrates segmentation robustness against adversarial examples by reducing the pixel error between the original labels and the adversarial examples to an average of 1.45.

Get full-text (via PubEx)

Simple Iterative Method for Generating Targeted Universal Adversarial Perturbations

Algorithms ◽

10.3390/a13110268 ◽

2020 ◽

Vol 13 (11) ◽

pp. 268 ◽

Cited By ~ 1

Author(s):

Hokuto Hirano ◽

Kazuhiro Takemoto

Keyword(s):

Neural Networks ◽

Iterative Method ◽

Image Classification ◽

Deep Neural Networks ◽

State Of The Art ◽

Specific Class ◽

Targeted Attacks ◽

Fast Gradient ◽

Classification Tasks ◽

Sign Method

Deep neural networks (DNNs) are vulnerable to adversarial attacks. In particular, a single perturbation known as the universal adversarial perturbation (UAP) can foil most classification tasks conducted by DNNs. Thus, different methods for generating UAPs are required to fully evaluate the vulnerability of DNNs. A realistic evaluation would be with cases that consider targeted attacks; wherein the generated UAP causes the DNN to classify an input into a specific class. However, the development of UAPs for targeted attacks has largely fallen behind that of UAPs for non-targeted attacks. Therefore, we propose a simple iterative method to generate UAPs for targeted attacks. Our method combines the simple iterative method for generating non-targeted UAPs and the fast gradient sign method for generating a targeted adversarial perturbation for an input. We applied the proposed method to state-of-the-art DNN models for image classification and proved the existence of almost imperceptible UAPs for targeted attacks; further, we demonstrated that such UAPs can be easily generated.

Get full-text (via PubEx)

Improving the Transferability of Adversarial Examples With a Noise Data Enhancement Framework and Random Erasing

Frontiers in Neurorobotics ◽

10.3389/fnbot.2021.784053 ◽

2021 ◽

Vol 15 ◽

Author(s):

Pengfei Xie ◽

Shuhao Shi ◽

Shuai Yang ◽

Kai Qiao ◽

Ningning Liang ◽

...

Keyword(s):

Success Rate ◽

Deep Neural Networks ◽

Black Box ◽

Excellent Performance ◽

Training Models ◽

Noise Data ◽

Adversarial Examples ◽

Adversarial Training ◽

Fast Gradient ◽

Sign Method

Deep neural networks (DNNs) are proven vulnerable to attack against adversarial examples. Black-box transfer attacks pose a massive threat to AI applications without accessing target models. At present, the most effective black-box attack methods mainly adopt data enhancement methods, such as input transformation. Previous data enhancement frameworks only work on input transformations that satisfy accuracy or loss invariance. However, it does not work for other transformations that do not meet the above conditions, such as the transformation which will lose information. To solve this problem, we propose a new noise data enhancement framework (NDEF), which only transforms adversarial perturbation to avoid the above issues effectively. In addition, we introduce random erasing under this framework to prevent the over-fitting of adversarial examples. Experimental results show that the black-box attack success rate of our method Random Erasing Iterative Fast Gradient Sign Method (REI-FGSM) is 4.2% higher than DI-FGSM in six models on average and 6.6% higher than DI-FGSM in three defense models. REI-FGSM can combine with other methods to achieve excellent performance. The attack performance of SI-FGSM can be improved by 22.9% on average when combined with REI-FGSM. Besides, our combined version with DI-TI-MI-FGSM, i.e., DI-TI-MI-REI-FGSM can achieve an average attack success rate of 97.0% against three ensemble adversarial training models, which is greater than the current gradient iterative attack method. We also introduce Gaussian blur to prove the compatibility of our framework.

Get full-text (via PubEx)

Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks

Symmetry ◽

10.3390/sym13030428 ◽

2021 ◽

Vol 13 (3) ◽

pp. 428

Author(s):

Hyun Kwon ◽

Jun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Diversity Training ◽

Original Data ◽

Training Method ◽

Learning Framework ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Accuracy Rates

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance regarding image and voice recognition, as well as pattern analysis and intrusion detection, they exhibit inferior performance towards adversarial examples. Noise introduction, to some degree, to the original data could lead adversarial examples to be misclassified by deep neural networks, even though they can still be deemed as normal by humans. In this paper, a robust diversity adversarial training method against adversarial attacks was demonstrated. In this approach, the target model is more robust to unknown adversarial examples, as it trains various adversarial samples. During the experiment, Tensorflow was employed as our deep learning framework, while MNIST and Fashion-MNIST were used as experimental datasets. Results revealed that the diversity training method has lowered the attack success rate by an average of 27.2 and 24.3% for various adversarial examples, while maintaining the 98.7 and 91.5% accuracy rates regarding the original data of MNIST and Fashion-MNIST.

Get full-text (via PubEx)

Natural Scene Statistics for Detecting Adversarial Examples in Deep Neural Networks

2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP) ◽

10.1109/mmsp48831.2020.9287056 ◽

2020 ◽

Author(s):

Anouar Kherchouche ◽

Sid Ahmed Fezza ◽

Wassim Hamidouche ◽

Olivier Deforges

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Natural Scene ◽

Natural Scene Statistics ◽

Adversarial Examples

Get full-text (via PubEx)

Solving inverse problems in stochastic models using deep neural networks and adversarial training

Computer Methods in Applied Mechanics and Engineering ◽

10.1016/j.cma.2021.113976 ◽

2021 ◽

Vol 384 ◽

pp. 113976

Author(s):

Kailai Xu ◽

Eric Darve

Keyword(s):

Neural Networks ◽

Inverse Problems ◽

Stochastic Models ◽

Deep Neural Networks ◽

Adversarial Training

Get full-text (via PubEx)

Semisupervised Learning for Seismic Monitoring Applications

Seismological Research Letters ◽

10.1785/0220200195 ◽

2020 ◽

Vol 92 (1) ◽

pp. 388-395

Author(s):

Lisa Linville ◽

Dylan Anderson ◽

Joshua Michalenko ◽

Jennifer Galasso ◽

Timothy Draelos

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Seismic Event ◽

Model Performance ◽

Seismic Monitoring ◽

Semisupervised Learning ◽

Unlabeled Data ◽

Hybrid Technique ◽

Source Type ◽

Adversarial Training

Abstract The impressive performance that deep neural networks demonstrate on a range of seismic monitoring tasks depends largely on the availability of event catalogs that have been manually curated over many years or decades. However, the quality, duration, and availability of seismic event catalogs vary significantly across the range of monitoring operations, regions, and objectives. Semisupervised learning (SSL) enables learning from both labeled and unlabeled data and provides a framework to leverage the abundance of unreviewed seismic data for training deep neural networks on a variety of target tasks. We apply two SSL algorithms (mean-teacher and virtual adversarial training) as well as a novel hybrid technique (exponential average adversarial training) to seismic event classification to examine how unlabeled data with SSL can enhance model performance. In general, we find that SSL can perform as well as supervised learning with fewer labels. We also observe in some scenarios that almost half of the benefits of SSL are the result of the meaningful regularization enforced through SSL techniques and may not be attributable to unlabeled data directly. Lastly, the benefits from unlabeled data scale with the difficulty of the predictive task when we evaluate the use of unlabeled data to characterize sources in new geographic regions. In geographic areas where supervised model performance is low, SSL significantly increases the accuracy of source-type classification using unlabeled data.

Get full-text (via PubEx)

A Black-Box Approach to Generate Adversarial Examples Against Deep Neural Networks for High Dimensional Input

2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC) ◽

10.1109/dsc.2019.00078 ◽

2019 ◽

Author(s):

Chengru Song ◽

Changqiao Xu ◽

Shujie Yang ◽

Zan Zhou ◽

Changhui Gong

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Black Box ◽

High Dimensional ◽

Adversarial Examples

Get full-text (via PubEx)

Generate Adversarial Examples by Nesterov-momentum Iterative Fast Gradient Sign Method

2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS) ◽

10.1109/icsess49938.2020.9237700 ◽

2020 ◽

Author(s):

Jin Xu

Keyword(s):

Adversarial Examples ◽

Fast Gradient ◽

Sign Method

Get full-text (via PubEx)

Enhancing adversarial attack transferability with multi-scale feature attack

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691320500769 ◽

2020 ◽

pp. 2050076

Author(s):

Caixia Sun ◽

Lian Zou ◽

Cien Fan ◽

Yu Shi ◽

Yifeng Liu

Keyword(s):

Internal Representation ◽

Feature Space ◽

Source Model ◽

Black Box ◽

Space Representation ◽

Scale Feature ◽

Multi Scale ◽

Adversarial Examples ◽

Fast Gradient ◽

Sign Method

Deep neural networks are vulnerable to adversarial examples, which can fool models by adding carefully designed perturbations. An intriguing phenomenon is that adversarial examples often exhibit transferability, thus making black-box attacks effective in real-world applications. However, the adversarial examples generated by existing methods typically overfit the structure and feature representation of the source model, resulting in a low success rate in a black-box manner. To address this issue, we propose the multi-scale feature attack to boost attack transferability, which adjusts the internal feature space representation of the adversarial image to get far to the internal representation of the original image. We show that we can select a low-level layer and a high-level layer of the source model to conduct the perturbations, and the crafted adversarial examples are confused with original images, not just in the class but also in the feature space representations. To further improve the transferability of adversarial examples, we apply reverse cross-entropy loss to reduce the overfitting further and show that it is effective for attacking adversarially trained models with strong defensive ability. Extensive experiments show that the proposed methods consistently outperform the iterative fast gradient sign method (IFGSM) and momentum iterative fast gradient sign method (MIFGSM) under the challenging black-box setting.

Get full-text (via PubEx)