A Non-Global Disturbance Targeted Adversarial Example Algorithm Combined with C&W and Grad-Cam

Mapping Intimacies ◽

10.21203/rs.3.rs-865960/v1 ◽

2021 ◽

Author(s):

Yinghui Zhu ◽

Yuzhen Jiang

Keyword(s):

Learning Systems ◽

Fine Tuning ◽

Generation Process ◽

Original Image ◽

Signal Features ◽

Adversarial Examples ◽

Salient Regions ◽

Adversarial Attack ◽

Adversarial Example ◽

Generation Control

Adversarial examples are artificially crafted to mislead deep learning systems into making wrong decisions. For attacks against multi-class image classifiers, an improved strategy is proposed that applies category explanation to control the generation of targeted adversarial examples, reducing perturbation noise and improving adversarial robustness. Building on the C&W adversarial attack algorithm, the method uses Grad-CAM, a class visualization and explanation algorithm for CNNs, to dynamically obtain the salient regions from the signal features of the source and target categories during iterative generation. A non-global-perturbation adversarial example is finally obtained by gradually masking the non-salient regions and fine-tuning the perturbation signals. Compared with similar algorithms under the same conditions, the method strengthens the influence of the original image's category signal on where the perturbation is placed. Experimental results show that the improved adversarial examples have higher PSNR. In addition, across a variety of defense processing tests, the examples retain high adversarial performance and show strong attack robustness.
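The two quantities this abstract relies on, a perturbation restricted to salient regions and PSNR as the quality measure, can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the random `saliency` array merely stands in for a Grad-CAM heatmap, and `mask_perturbation` and `keep_fraction` are hypothetical names introduced here.

```python
import numpy as np

def psnr(original, adversarial, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means less visible distortion."""
    original = np.asarray(original, dtype=np.float64)
    adversarial = np.asarray(adversarial, dtype=np.float64)
    mse = np.mean((original - adversarial) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)

def mask_perturbation(perturbation, saliency, keep_fraction=0.25):
    """Zero the perturbation outside the most salient pixels (hypothetical helper)."""
    saliency = np.asarray(saliency, dtype=np.float64)
    # Keep only pixels whose saliency is in the top `keep_fraction`.
    threshold = np.quantile(saliency, 1.0 - keep_fraction)
    return perturbation * (saliency >= threshold)

rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(8, 8))
delta = rng.normal(0, 4, size=(8, 8))        # stand-in for a global perturbation
saliency = rng.uniform(0, 1, size=(8, 8))    # stand-in for a Grad-CAM heatmap
masked = mask_perturbation(delta, saliency)

global_psnr = psnr(image, image + delta)
masked_psnr = psnr(image, image + masked)
```

Because masking only removes perturbation energy, `masked_psnr` is never lower than `global_psnr`. Whether the masked example still fools the classifier is the part the paper's iterative fine-tuning addresses and this sketch does not.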


Heuristic Black-Box Adversarial Attacks on Video Recognition Models

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6918 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12338-12345 ◽

Cited By ~ 1

Author(s):

Zhipeng Wei ◽

Jingjing Chen ◽

Xingxing Wei ◽

Linxi Jiang ◽

Tat-Seng Chua ◽

...

Keyword(s):

Black Box ◽

Computation Cost ◽

Attack Model ◽

Video Recognition ◽

Spatial Domains ◽

Adversarial Examples ◽

Salient Regions ◽

Adversarial Attack ◽

Adversarial Example ◽

The Given

We study the problem of attacking video recognition models in the black-box setting, where the model information is unknown and the adversary can only query the model to obtain the predicted top-1 class and its probability. Compared with black-box attacks on images, attacking videos is more challenging because the computation cost of searching for adversarial perturbations on a video is much higher due to its high dimensionality. To overcome this challenge, we propose a heuristic black-box attack model that generates adversarial perturbations only on selected frames and regions. More specifically, a heuristic-based algorithm is proposed to measure the importance of each frame in the video towards generating the adversarial examples. Based on the frames' importance, the proposed algorithm heuristically searches for a subset of frames on which the generated adversarial example has strong attack ability while keeping the perturbations below the given bound. In addition, to further boost attack efficiency, we propose to generate the perturbations only on the salient regions of the selected frames. In this way, the generated perturbations are sparse in both the temporal and spatial domains. Experimental results of attacking two mainstream video recognition methods on the UCF-101 and HMDB-51 datasets demonstrate that the proposed heuristic black-box adversarial attack method can significantly reduce the computation cost and leads to a reduction of more than 28% in the number of queries for the untargeted attack on both datasets.
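The idea of perturbations that are sparse in both time and space can be sketched as a binary mask over a video tensor. This is an illustrative sketch only: the abstract does not say how frame importance or saliency is computed, so both are passed in as given arrays, and `sparse_video_mask`, `num_frames` and `keep_fraction` are hypothetical names.

```python
import numpy as np

def sparse_video_mask(frame_importance, saliency, num_frames, keep_fraction=0.2):
    """Return a (T, H, W) boolean mask marking where perturbation is allowed.

    Only the `num_frames` most important frames are touched, and within each
    of them only the top `keep_fraction` most salient pixels.
    """
    frame_importance = np.asarray(frame_importance, dtype=np.float64)
    saliency = np.asarray(saliency, dtype=np.float64)
    # Indices of the most important frames, highest score first.
    selected = np.argsort(frame_importance)[::-1][:num_frames]
    mask = np.zeros(saliency.shape, dtype=bool)
    for t in selected:
        threshold = np.quantile(saliency[t], 1.0 - keep_fraction)
        mask[t] = saliency[t] >= threshold
    return mask, np.sort(selected)

rng = np.random.default_rng(0)
importance = [0.1, 0.9, 0.3, 0.7]               # one assumed score per frame
saliency = rng.uniform(0, 1, size=(4, 10, 10))  # assumed per-pixel saliency
mask, selected = sparse_video_mask(importance, saliency, num_frames=2)
```

A query-based attack would then search for a perturbation only where `mask` is true, which shrinks the search space and is the source of the query savings the abstract reports.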


Random Untargeted Adversarial Example on Deep Neural Network

Symmetry ◽

10.3390/sym10120738 ◽

2018 ◽

Vol 10 (12) ◽

pp. 738 ◽

Cited By ~ 2

Author(s):

Hyun Kwon ◽

Yongchul Kim ◽

Hyunsoo Yoon ◽

Daeseon Choi

Keyword(s):

Autonomous Vehicles ◽

Deep Neural Networks ◽

Generation Process ◽

Research Attention ◽

Disease Diagnostics ◽

The Face ◽

Adversarial Examples ◽

Adversarial Example ◽

Original Class ◽

Minimum Distortion

Deep neural networks (DNNs) have demonstrated remarkable performance in machine learning areas such as image recognition, speech recognition, intrusion detection, and pattern analysis. However, it has been revealed that DNNs have weaknesses in the face of adversarial examples, which are created by adding a little noise to an original sample to cause misclassification by the DNN. Such adversarial examples can lead to fatal accidents in applications such as autonomous vehicles and disease diagnostics. Thus, the generation of adversarial examples has attracted extensive research attention recently. An adversarial example is categorized as targeted or untargeted. In this paper, we focus on the untargeted adversarial example scenario because it has a faster learning time and less distortion compared with the targeted adversarial example. However, there is a pattern vulnerability with untargeted adversarial examples: Because of the similarity between the original class and certain specific classes, it may be possible for the defending system to determine the original class by analyzing the output classes of the untargeted adversarial examples. To overcome this problem, we propose a new method for generating untargeted adversarial examples, one that uses an arbitrary class in the generation process. Moreover, we show that our proposed scheme can be applied to steganography. Through experiments, we show that our proposed scheme can achieve a 100% attack success rate with minimum distortion (1.99 and 42.32 using the MNIST and CIFAR10 datasets, respectively) and without the pattern vulnerability. Using a steganography test, we show that our proposed scheme can be used to fool humans, as demonstrated by the probability of their detecting hidden classes being equal to that of random selection.
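The step of using an arbitrary class in the generation process can be illustrated by drawing the wrong class uniformly at random, so that the output class carries no hint about the original class. This sketch covers only that selection step, under the assumption of uniform sampling; the attack optimization itself is not shown, and `pick_random_target` is a hypothetical name.

```python
import numpy as np

def pick_random_target(original_class, num_classes, rng):
    """Draw a class uniformly from all classes except the original one."""
    # An offset in 1 .. num_classes - 1 can never map back to the original class.
    offset = rng.integers(1, num_classes)
    return int((original_class + offset) % num_classes)

rng = np.random.default_rng(0)
targets = [pick_random_target(3, 10, rng) for _ in range(1000)]
```

Since every other class is equally likely, an observer who sees only the misclassified label cannot use class similarity to guess the original label.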


Harnessing the Vulnerability of Latent Layers in Adversarially Trained Models

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/385 ◽

2019 ◽

Cited By ~ 3

Author(s):

Nupur Kumari ◽

Mayank Singh ◽

Abhishek Sinha ◽

Harshitha Machiraju ◽

Balaji Krishnamurthy ◽

...

Keyword(s):

Fine Tuning ◽

Test Accuracy ◽

Small Magnitude ◽

First Order ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

A New Technique ◽

A Minor ◽

Minor Improvement

Neural networks are vulnerable to adversarial attacks - small, visually imperceptible crafted noise which, when added to the input, drastically changes the output. The most effective method of defending against adversarial attacks is adversarial training. We analyze adversarially trained robust models to study their vulnerability to adversarial attacks at the level of the latent layers. Our analysis reveals that, in contrast to the input layer, which is robust to adversarial attack, the latent layers of these robust models are highly susceptible to adversarial perturbations of small magnitude. Leveraging this information, we introduce a new technique, Latent Adversarial Training (LAT), which consists of fine-tuning the adversarially trained models to ensure robustness at the feature layers. We also propose Latent Attack (LA), a novel algorithm for constructing adversarial examples. LAT results in a minor improvement in test accuracy and leads to state-of-the-art adversarial accuracy against the universal first-order adversarial PGD attack, which is shown for the MNIST, CIFAR-10, CIFAR-100, SVHN and Restricted ImageNet datasets.
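As background for the PGD attack named above, here is a minimal sketch of L-infinity projected gradient descent against a toy logistic model. It is not LAT or Latent Attack, and the model, weights and step sizes are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def pgd_linear(x, y, w, b, epsilon, step, iters):
    """L-inf PGD against the logistic model p = sigmoid(w.x + b), y in {0, 1}."""
    x_adv = x.copy()
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        grad = (p - y) * w                       # d(cross-entropy)/d(input)
        x_adv = x_adv + step * np.sign(grad)     # ascend the loss
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # project to the ball
    return x_adv

w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, -0.1, 0.3])   # clean logit is 0.55, so predicted class 1
x_adv = pgd_linear(x, y=1, w=w, b=b, epsilon=0.3, step=0.1, iters=10)
clean_logit = w @ x + b
adv_logit = w @ x_adv + b
```

The same loop applied to a latent activation instead of the input is one way to picture the latent-layer susceptibility the abstract describes, although the paper's exact procedure is not given here.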


Robust Audio Adversarial Example for a Physical Attack

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/741 ◽

2019 ◽

Cited By ~ 7

Author(s):

Hiromu Yakura ◽

Jun Sakuma

Keyword(s):

Speech Recognition ◽

Process Evaluation ◽

State Of The Art ◽

Physical World ◽

Generation Process ◽

Recognition Model ◽

Physical Attack ◽

Adversarial Examples ◽

Adversarial Example ◽

Listening Experiment

We propose a method to generate audio adversarial examples that can attack a state-of-the-art speech recognition model in the physical world. Previous work assumes that generated adversarial examples are directly fed to the recognition model, and is not able to perform such a physical attack because of reverberation and noise from playback environments. In contrast, our method obtains robust adversarial examples by simulating transformations caused by playback or recording in the physical world and incorporating the transformations into the generation process. Evaluation and a listening experiment demonstrated that our adversarial examples are able to attack without being noticed by humans. This result suggests that audio adversarial examples generated by the proposed method may become a real threat.
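Simulating playback and recording can be pictured as convolving the waveform with a room impulse response and adding noise. The sketch below shows one such simulated transformation under assumed toy parameters; the paper's actual transformations and their parameters are not given in the abstract, and `simulate_playback` is a hypothetical name.

```python
import numpy as np

def simulate_playback(waveform, impulse_response, noise_std, rng):
    """Apply reverberation (convolution) and additive Gaussian noise."""
    # Truncate the convolution so the output has the same length as the input.
    reverberated = np.convolve(waveform, impulse_response, mode="full")[: len(waveform)]
    noise = rng.normal(0.0, noise_std, size=len(waveform))
    return reverberated + noise

rng = np.random.default_rng(0)
t = np.arange(1600) / 16000.0                     # 0.1 s at an assumed 16 kHz
waveform = 0.5 * np.sin(2 * np.pi * 440.0 * t)
impulse_response = 0.6 ** np.arange(8)            # toy decaying room response
recorded = simulate_playback(waveform, impulse_response, noise_std=0.01, rng=rng)
```

An attack that is robust in the physical world would optimize the perturbation so that it succeeds on many such simulated recordings, each with a different impulse response and noise draw, rather than on the clean waveform alone.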


Weighted-Sampling Audio Adversarial Example Attack

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5928 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4908-4915 ◽

Cited By ~ 1

Author(s):

Xiaolei Liu ◽

Kun Wan ◽

Yufei Ding ◽

Xiaosong Zhang ◽

Qingxin Zhu

Keyword(s):

Automatic Speech Recognition ◽

Loss Function ◽

State Of The Art ◽

Low Noise ◽

Denoising Method ◽

Adversarial Examples ◽

Adversarial Attack ◽

Recognition Systems ◽

Level 1 ◽

Adversarial Example

Recent studies have highlighted audio adversarial examples as a ubiquitous threat to state-of-the-art automatic speech recognition systems. Thorough studies on how to effectively generate adversarial examples are essential to prevent potential attacks. Despite much research on this topic, the efficiency and robustness of existing works are not yet satisfactory. In this paper, we propose weighted-sampling audio adversarial examples, focusing on the number and the weights of distortions to reinforce the attack. Further, we apply a denoising method in the loss function to make the adversarial attack more imperceptible. Experiments show that our method is the first in the field to generate audio adversarial examples with low noise and high audio robustness at a time cost on the order of minutes.


Bayesian Adversarial Attack on Graph Neural Networks (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7206 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13867-13868

Author(s):

Xiao Liu ◽

Jing Zhao ◽

Shiliang Sun

Keyword(s):

Gradient Descent ◽

Random Variable ◽

Misclassification Rate ◽

Experimental Comparison ◽

Graph Node ◽

Adversarial Examples ◽

Adversarial Attack ◽

Projected Gradient Descent ◽

Adversarial Example ◽

Graph Neural Networks

Adversarial attacks on graph neural networks (GNNs) are distinctive in that they often jointly train the available nodes to generate a graph as an adversarial example. Existing attack approaches usually assume that the entire training set is available, which may be impractical. In this paper, we propose a novel Bayesian adversarial attack approach based on projected gradient descent optimization, called the Bayesian PGD attack, which yields more general adversarial examples than deterministic attack approaches. Using the same partial dataset as deterministic attack approaches, the adversarial examples generated by our approach give the GNN a higher misclassification rate on graph node classification. Specifically, in our approach, the edge perturbation Z used for generating adversarial examples is viewed as a random variable with a scale constraint, and the optimization target of the edge perturbation is to maximize the KL divergence between its true posterior distribution p(Z|D) and its approximate variational distribution qθ(Z). We find experimentally that attack performance decreases as the number of available nodes is reduced, and that the effect of attacks using different nodes varies greatly, especially when the number of nodes is small. Experimental comparison with state-of-the-art attack approaches on GNNs shows that our approach has better and more robust attack performance.


Challenging the Adversarial Robustness of DNNs Based on Error-Correcting Output Codes

Security and Communication Networks ◽

10.1155/2020/8882494 ◽

2020 ◽

Vol 2020 ◽

pp. 1-11

Author(s):

Bowen Zhang ◽

Benedetta Tondi ◽

Xixiang Lv ◽

Mauro Barni

Keyword(s):

Deep Learning ◽

Learning Systems ◽

Defence Mechanisms ◽

Target Class ◽

Prediction Confidence ◽

Adversarial Examples ◽

Adversarial Attack ◽

Classification Tasks ◽

Error Correcting Output Codes ◽

Do So

The existence of adversarial examples and the ease with which they can be generated raise several security concerns with regard to deep learning systems, pushing researchers to develop suitable defence mechanisms. The use of networks adopting error-correcting output codes (ECOC) has recently been proposed to counter the creation of adversarial examples in a white-box setting. In this paper, we carry out an in-depth investigation of the adversarial robustness achieved by the ECOC approach. We do so by proposing a new adversarial attack specifically designed for multilabel classification architectures, like the ECOC-based one, and by applying two existing attacks. In contrast to previous findings, our analysis reveals that ECOC-based networks can be attacked quite easily by introducing a small adversarial perturbation. Moreover, the adversarial examples can be generated in such a way as to achieve high probabilities for the predicted target class, hence making it difficult to use the prediction confidence to detect them. Our findings are supported by experimental results obtained on MNIST, CIFAR-10, and GTSRB classification tasks.
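For readers unfamiliar with ECOC classifiers, the decoding step can be sketched as follows: each class is assigned a codeword, the network outputs one score per bit, and the predicted class is the codeword that correlates best with those scores. The codebook below is a small Hadamard-based example chosen for illustration, not the one used in the paper, and `ecoc_decode` is a hypothetical name.

```python
import numpy as np

def ecoc_decode(bit_scores, codebook):
    """Return the class whose +/-1 codeword best correlates with the bit scores."""
    return int(np.argmax(codebook @ bit_scores))

# Rows of an 8x8 Hadamard matrix differ pairwise in exactly 4 positions,
# so a single flipped bit can still be corrected.
h2 = np.array([[1, 1], [1, -1]])
h8 = np.kron(np.kron(h2, h2), h2)
codebook = h8[:4]                 # 4 classes, 8-bit codewords

clean = codebook[2].astype(float)
corrupted = clean.copy()
corrupted[5] *= -1                # one bit flipped

clean_class = ecoc_decode(clean, codebook)
corrupted_class = ecoc_decode(corrupted, codebook)
```

The error-correcting margin is what was expected to add robustness. The abstract's finding is that a small input perturbation can still move enough bit outputs at once to land on another codeword with high confidence.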


Adversarial Attack and Defence through Adversarial Training and Feature Fusion for Diabetic Retinopathy Recognition

Sensors ◽

10.3390/s21113922 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3922

Author(s):

Sheeba Lal ◽

Saeed Ur Rehman ◽

Jamal Hussain Shah ◽

Talha Meraj ◽

Hafiz Tayyab Rauf ◽

...

Keyword(s):

Diabetic Retinopathy ◽

Feature Fusion ◽

Speckle Noise ◽

Fundus Images ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Retinal Fundus Images ◽

Attacks And Defenses ◽

Retinal Fundus

Due to the rapid growth in artificial intelligence (AI) and deep learning (DL) approaches, the security and robustness of the deployed algorithms need to be guaranteed. The susceptibility of DL algorithms to adversarial examples has been widely acknowledged. Such artificially created examples cause DL models to misidentify instances that humans consider benign. Practical application in actual physical scenarios with adversarial threats shows their features. Thus, adversarial attacks and defenses, including machine learning and its reliability, have drawn growing interest and have become a hot research topic in recent years. We introduce a framework that provides a defensive model against the adversarial speckle-noise attack, combining adversarial training with a feature fusion strategy that preserves classification with correct labelling. We evaluate and analyze the adversarial attacks and defenses on retinal fundus images for the Diabetic Retinopathy recognition problem, which is considered a state-of-the-art endeavor. Results obtained on the retinal fundus images, which are prone to adversarial attacks, reach 99% accuracy and indicate that the proposed defensive model is robust.


Real-Time Adversarial Attack Detection with Deep Image Prior Initialized as a High-Level Representation Based Blurring Network

Electronics ◽

10.3390/electronics10010052 ◽

2020 ◽

Vol 10 (1) ◽

pp. 52

Author(s):

Richard Evan Sutanto ◽

Sukho Lee

Keyword(s):

Neural Network ◽

Attack Detection ◽

Detection Methods ◽

Defense System ◽

Image Prior ◽

The Neural Network ◽

Adversarial Examples ◽

Deep Image ◽

Adversarial Attack ◽

High Level

Several recent studies have shown that artificial intelligence (AI) systems can malfunction due to intentionally manipulated data coming through normal channels. Such manipulated data are called adversarial examples. Adversarial examples can pose a major threat to an AI-led society when an attacker uses them as a means to attack an AI system, which is called an adversarial attack. Therefore, major IT companies such as Google are now studying ways to build AI systems that are robust against adversarial attacks by developing effective defense methods. However, one reason it is difficult to establish an effective defense system is that it is hard to know in advance what kind of adversarial attack method the opponent is using. Therefore, in this paper, we propose a method to detect adversarial noise without knowledge of the kind of adversarial noise used by the attacker. To this end, we propose a blurring network that is trained only with normal images and also use it as an initial condition of the Deep Image Prior (DIP) network. This is in contrast to other neural network based detection methods, which require many adversarial noisy images to train the neural network. Experimental results indicate the validity of the proposed method.


Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks

Symmetry ◽

10.3390/sym13030428 ◽

2021 ◽

Vol 13 (3) ◽

pp. 428

Author(s):

Hyun Kwon ◽

Jun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Diversity Training ◽

Original Data ◽

Training Method ◽

Learning Framework ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Accuracy Rates

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance in image and voice recognition, as well as pattern analysis and intrusion detection, they perform poorly on adversarial examples. Adding a small amount of noise to the original data can cause deep neural networks to misclassify the resulting adversarial examples, even though humans still perceive them as normal. In this paper, a robust diversity adversarial training method against adversarial attacks is demonstrated. In this approach, the target model is more robust to unknown adversarial examples because it is trained on various adversarial samples. In the experiments, TensorFlow was used as the deep learning framework, and MNIST and Fashion-MNIST were used as the experimental datasets. Results show that the diversity training method lowered the attack success rate by an average of 27.2% and 24.3% for various adversarial examples, while maintaining accuracy rates of 98.7% and 91.5% on the original data of MNIST and Fashion-MNIST, respectively.
