A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks

Jinghui Chen; Dongruo Zhou; Jinfeng Yi; Quanquan Gu

doi:10.1609/aaai.v34i04.5753


Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5753 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3486-3494

Author(s):

Jinghui Chen ◽

Dongruo Zhou ◽

Jinfeng Yi ◽

Quanquan Gu

Keyword(s):

Gradient Descent ◽

State Of The Art ◽

Black Box ◽

Success Rates ◽

Practical Usefulness ◽

Efficiency And Effectiveness ◽

Large Distortion ◽

Adversarial Examples ◽

Adversarial Attack ◽

Projected Gradient Descent

Depending on how much information an adversary can access, adversarial attacks can be classified as white-box or black-box attacks. For white-box attacks, optimization-based attack algorithms such as projected gradient descent (PGD) can achieve relatively high attack success rates within a moderate number of iterations. However, they tend to generate adversarial examples near or on the boundary of the perturbation set, resulting in large distortion. Furthermore, their corresponding black-box attack algorithms also suffer from high query complexities, thereby limiting their practical usefulness. In this paper, we focus on the problem of developing efficient and effective optimization-based adversarial attack algorithms. In particular, we propose a novel adversarial attack framework for both white-box and black-box settings based on a variant of the Frank-Wolfe algorithm. We show in theory that the proposed attack algorithms are efficient with an O(1/√T) convergence rate. The empirical results of attacking the ImageNet and MNIST datasets also verify the efficiency and effectiveness of the proposed algorithms. More specifically, our proposed algorithms attain the best attack performances in both white-box and black-box attacks among all baselines, and are more time and query efficient than the state-of-the-art.
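
To make the contrast with PGD concrete, here is a minimal sketch of a plain (textbook) Frank-Wolfe step under an L∞ constraint. It is not the variant proposed in the paper, whose details the abstract does not give. The toy logistic classifier, the names `loss_and_grad` and `frank_wolfe_linf_attack`, the step size 2/(t+2), and all numeric values are assumptions chosen for illustration. Each iterate is a convex combination of points inside the perturbation set, so no projection step is needed.

```python
import numpy as np

def loss_and_grad(x, w, b, y):
    """Logistic loss of a toy linear classifier and its gradient w.r.t. x.

    Hypothetical stand-in for a real network; y must be +1 or -1.
    """
    margin = y * (w @ x + b)
    loss = np.log1p(np.exp(-margin))
    grad = -y * w / (1.0 + np.exp(margin))
    return loss, grad

def frank_wolfe_linf_attack(x_orig, w, b, y, eps=0.1, steps=20):
    """Plain Frank-Wolfe ascent on the loss over the L_inf ball of radius eps."""
    x = x_orig.copy()
    for t in range(steps):
        _, grad = loss_and_grad(x, w, b, y)
        # Linear maximization oracle: the point of the L_inf ball around
        # x_orig that maximizes the inner product with the gradient.
        v = x_orig + eps * np.sign(grad)
        # Classic diminishing step size (an assumption, not from the paper).
        gamma = 2.0 / (t + 2.0)
        # Convex combination of two feasible points stays feasible.
        x = x + gamma * (v - x)
    return x

# Toy usage with made-up data.
rng = np.random.default_rng(0)
w, b, y = rng.normal(size=5), 0.0, 1.0
x_clean = rng.normal(size=5)
x_adv = frank_wolfe_linf_attack(x_clean, w, b, y, eps=0.1, steps=20)
loss_clean, _ = loss_and_grad(x_clean, w, b, y)
loss_adv, _ = loss_and_grad(x_adv, w, b, y)
print("loss before:", loss_clean, "loss after:", loss_adv)
```

On a linear toy model the oracle returns the same corner at every step, so the sketch only illustrates the update rule, not the distortion behavior reported in the paper.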


Boosting Adversarial Attacks on Neural Networks with Better Optimizer

Security and Communication Networks ◽

10.1155/2021/9983309 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Heng Yin ◽

Hengwei Zhang ◽

Jindong Wang ◽

Ruiyu Dou

Keyword(s):

Neural Networks ◽

Success Rate ◽

Gradient Descent ◽

State Of The Art ◽

Black Box ◽

Security Threats ◽

Gradient Descent Algorithm ◽

Gradient Based ◽

Adversarial Examples ◽

Fast Gradient

Convolutional neural networks have outperformed humans in image recognition tasks, but they remain vulnerable to attacks from adversarial examples. Since these data are crafted by adding imperceptible noise to normal images, their existence poses potential security threats to deep learning systems. Sophisticated adversarial examples with strong attack performance can also be used as a tool to evaluate the robustness of a model. However, the success rate of adversarial attacks can be further improved in black-box environments. Therefore, this study combines a modified Adam gradient descent algorithm with the iterative gradient-based attack method. The proposed Adam iterative fast gradient method is then used to improve the transferability of adversarial examples. Extensive experiments on ImageNet showed that the proposed method offers a higher attack success rate than existing iterative methods. By extending our method, we achieved a state-of-the-art attack success rate of 95.0% on defense models.
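
As a rough illustration of combining Adam-style moment estimates with an iterative sign-gradient attack, the sketch below uses the standard Adam update to pick the step direction. The abstract says the authors use a modified Adam but does not say how, so this is not their method. The function names, the toy logistic gradient, the per-step size eps/steps, and the hyperparameters are all assumptions.

```python
import numpy as np

def toy_grad(x, w, y):
    """Gradient w.r.t. x of the logistic loss of a toy linear model (hypothetical)."""
    margin = y * (w @ x)
    return -y * w / (1.0 + np.exp(margin))

def toy_loss(x, w, y):
    """Logistic loss of the same toy linear model."""
    return np.log1p(np.exp(-y * (w @ x)))

def adam_iterative_sign_attack(x_orig, grad_fn, eps=0.1, steps=10,
                               beta1=0.9, beta2=0.999, delta=1e-8):
    """Iterative sign attack whose direction comes from standard Adam moments."""
    alpha = eps / steps              # per-step size (assumed schedule)
    x = x_orig.copy()
    m = np.zeros_like(x)             # first-moment estimate
    v = np.zeros_like(x)             # second-moment estimate
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)   # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        direction = m_hat / (np.sqrt(v_hat) + delta)
        x = x + alpha * np.sign(direction)
        # Keep the perturbation inside the L_inf ball of radius eps.
        x = np.clip(x, x_orig - eps, x_orig + eps)
    return x

# Toy usage with made-up data.
rng = np.random.default_rng(1)
w_toy, y_toy = rng.normal(size=8), 1.0
x_start = rng.normal(size=8)
x_adam = adam_iterative_sign_attack(x_start, lambda x: toy_grad(x, w_toy, y_toy))
print("loss before:", toy_loss(x_start, w_toy, y_toy),
      "loss after:", toy_loss(x_adam, w_toy, y_toy))
```

The moment estimates smooth the gradient across iterations, which is the usual motivation for momentum-style attacks when transferability matters; whether this particular form matches the paper's update is not something the abstract confirms.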


Bayesian Adversarial Attack on Graph Neural Networks (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7206 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13867-13868

Author(s):

Xiao Liu ◽

Jing Zhao ◽

Shiliang Sun

Keyword(s):

Gradient Descent ◽

Random Variable ◽

Misclassification Rate ◽

Experimental Comparison ◽

Graph Node ◽

Adversarial Examples ◽

Adversarial Attack ◽

Projected Gradient Descent ◽

Adversarial Example ◽

Graph Neural Networks

Adversarial attack on graph neural networks (GNNs) is distinctive as it often jointly trains the available nodes to generate a graph as an adversarial example. Existing attack approaches usually assume that the entire training set is available, which may be impractical. In this paper, we propose a novel Bayesian adversarial attack approach based on projected gradient descent optimization, called Bayesian PGD attack, which produces more general attack examples than deterministic attack approaches. Using the same partial dataset as deterministic attack approaches, the adversarial examples generated by our approach cause the GNN to have a higher misclassification rate on graph node classification. Specifically, in our approach, the edge perturbation Z, which is used for generating adversarial examples, is viewed as a random variable with a scale constraint, and the optimization target of the edge perturbation is to maximize the KL divergence between its true posterior distribution p(Z|D) and its approximate variational distribution qθ(Z). We experimentally find that the attack performance decreases as the number of available nodes is reduced, and that the effect of attacks using different nodes varies greatly, especially when the number of nodes is small. Through experimental comparison with state-of-the-art attack approaches on GNNs, our approach is demonstrated to have better and more robust attack performance.


A New Ensemble Adversarial Attack Powered by Long-Term Gradient Memories

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5743 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3405-3413

Author(s):

Zhaohui Che ◽

Ali Borji ◽

Guangtao Zhai ◽

Suiyi Ling ◽

Jing Li ◽

...

Keyword(s):

Broad Class ◽

Black Box ◽

Security Threat ◽

Source Models ◽

Adversarial Examples ◽

Adversarial Attack ◽

Prediction Systems ◽

Attack And Defense ◽

Decision Boundaries

Deep neural networks are vulnerable to adversarial attacks. More importantly, some adversarial examples crafted against an ensemble of pre-trained source models can transfer to other new target models, and thus pose a security threat to black-box applications (when the attackers have no access to the target models). Despite adopting diverse architectures and parameters, source and target models often share similar decision boundaries. Therefore, if an adversary is capable of fooling several source models concurrently, it can potentially capture intrinsic transferable adversarial information that may allow it to fool a broad class of other black-box target models. Current ensemble attacks, however, only consider a limited number of source models to craft an adversary, and obtain poor transferability. In this paper, we propose a novel black-box attack, dubbed Serial-Mini-Batch-Ensemble-Attack (SMBEA). SMBEA divides a large number of pre-trained source models into several mini-batches. For each single batch, we design three new ensemble strategies to improve the intra-batch transferability. In addition, we propose a new algorithm that recursively accumulates the “long-term” gradient memories of the previous batch into the following batch. This way, the learned adversarial information can be preserved and the inter-batch transferability can be improved. Experiments indicate that our method outperforms state-of-the-art ensemble attacks over multiple pixel-to-pixel vision tasks including image translation and salient region prediction. Our method successfully fools two online black-box saliency prediction systems, DeepGaze-II (Kummerer 2017) and SALICON (Huang et al. 2017). Finally, we also contribute a new repository to promote research on adversarial attack and defense over pixel-to-pixel tasks: https://github.com/CZHQuality/AAA-Pix2pix.


Two Improved Methods of Generating Adversarial Examples against Faster R-CNNs for Tram Environment Perception Systems

Complexity ◽

10.1155/2020/6814263 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10

Author(s):

Shize Huang ◽

Xiaowen Liu ◽

Xiaolu Yang ◽

Zhaoxin Zhang ◽

Lingyu Yang

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Gradient Descent ◽

Learning Networks ◽

Adversarial Examples ◽

Environment Perception ◽

Projected Gradient Descent ◽

Adversarial Example ◽

Improved Methods ◽

Growing Neural Networks

Trams have increasingly deployed object detectors to perceive running conditions, and deep learning networks have been widely adopted by those detectors. Growing neural networks have incurred severe attacks such as adversarial example attacks, imposing threats to tram safety. Only if adversarial attacks are studied thoroughly can researchers come up with better defence methods against them. However, most existing methods of generating adversarial examples have been devoted to classification, and none of them target tram environment perception systems. In this paper, we propose an improved projected gradient descent (PGD) algorithm and an improved Carlini and Wagner (C&W) algorithm to generate adversarial examples against Faster R-CNN object detectors. Experiments verify that both algorithms can successfully conduct nontargeted and targeted white-box digital attacks when trams are running. We also compare the performance of the two methods, including attack effects, similarity to clean images, and the generating time. The results show that both algorithms can generate adversarial examples within 220 seconds, a much shorter time, without a decrease in the success rate.
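
For readers unfamiliar with the baseline, the following is a minimal sketch of standard nontargeted L∞ PGD on a toy classifier. It is not the improved PGD from this paper, and it does not model a Faster R-CNN detection loss. The names `pgd_linf_attack`, `clf_loss`, and `clf_grad`, as well as the values of `eps`, `alpha`, and `steps`, are assumptions for illustration.

```python
import numpy as np

def clf_loss(x, w, y):
    """Logistic loss of a toy linear classifier (hypothetical stand-in)."""
    return np.log1p(np.exp(-y * (w @ x)))

def clf_grad(x, w, y):
    """Gradient of clf_loss with respect to the input x."""
    return -y * w / (1.0 + np.exp(y * (w @ x)))

def pgd_linf_attack(x_orig, grad_fn, eps=0.1, alpha=0.02, steps=10):
    """Standard nontargeted PGD: sign-gradient ascent plus projection."""
    x = x_orig.copy()
    for _ in range(steps):
        # Ascent step on the loss using the sign of the input gradient.
        x = x + alpha * np.sign(grad_fn(x))
        # Projection onto the L_inf ball of radius eps around x_orig.
        x = np.clip(x, x_orig - eps, x_orig + eps)
    return x

# Toy usage with made-up data.
rng = np.random.default_rng(2)
w_clf, y_clf = rng.normal(size=6), -1.0
x_in = rng.normal(size=6)
x_pgd = pgd_linf_attack(x_in, lambda x: clf_grad(x, w_clf, y_clf))
print("loss before:", clf_loss(x_in, w_clf, y_clf),
      "loss after:", clf_loss(x_pgd, w_clf, y_clf))
```

A targeted variant would instead descend on the loss computed against the attacker's chosen label; for images, an additional clip to the valid pixel range is usually applied after the projection.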


Cycle-Consistent Adversarial GAN: The Integration of Adversarial Attack and Defense

Security and Communication Networks ◽

10.1155/2020/3608173 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Lingyun Jiang ◽

Kai Qiao ◽

Ruoxi Qin ◽

Linyuan Wang ◽

Wanting Yu ◽

...

Keyword(s):

Deep Learning ◽

Deep Neural Networks ◽

State Of The Art ◽

Small Magnitude ◽

Defense Strategies ◽

Adversarial Examples ◽

Adversarial Attack ◽

Public Datasets ◽

Attack And Defense

In deep-learning image classification, adversarial examples, which are inputs with intentionally added small-magnitude perturbations, may mislead deep neural networks (DNNs) into incorrect results, which means DNNs are vulnerable to them. Different attack and defense strategies have been proposed to better study the mechanism of deep learning. However, existing work on these networks addresses only one aspect, either attack or defense, so improvements in offensive and defensive performance are pursued separately and it is difficult for the two to promote each other in the same framework. In this paper, we propose Cycle-Consistent Adversarial GAN (CycleAdvGAN) to generate adversarial examples, which can learn and approximate the distribution of the original instances and adversarial examples, in particular prompting attackers and defenders to confront each other and improve their ability. For CycleAdvGAN, once the generators GA and GD are trained, GA can generate adversarial perturbations efficiently for any instance, improving the performance of existing attack methods, and GD can recover clean instances from adversarial examples, defending against existing attack methods. We apply CycleAdvGAN under semi-white-box and black-box settings on two public datasets, MNIST and CIFAR10. Through extensive experiments, we show that our method achieves state-of-the-art adversarial attack performance and also efficiently improves defense ability, thereby integrating adversarial attack and defense. In addition, it improves the attack effect when trained only on an adversarial dataset generated by any kind of adversarial attack.


Universal Adversarial Attack Via Enhanced Projected Gradient Descent

2020 IEEE International Conference on Image Processing (ICIP) ◽

10.1109/icip40778.2020.9191288 ◽

2020 ◽

Author(s):

Yingpeng Deng ◽

Lina J. Karam

Keyword(s):

Gradient Descent ◽

Projected Gradient ◽

Adversarial Attack ◽

Projected Gradient Descent


Generating Adversarial Examples with Adversarial Networks

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/543 ◽

2018 ◽

Cited By ~ 65

Author(s):

Chaowei Xiao ◽

Bo Li ◽

Jun-yan Zhu ◽

Warren He ◽

Mingyan Liu ◽

...

Keyword(s):

Deep Neural Networks ◽

State Of The Art ◽

Black Box ◽

Generative Adversarial Networks ◽

Perceptual Quality ◽

Small Magnitude ◽

Adversarial Networks ◽

Original Target ◽

Adversarial Examples ◽

Adversarial Training

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have a high attack success rate under state-of-the-art defenses compared to other attacks. Our attack has placed first with 92.76% accuracy on a public MNIST black-box attack challenge.


Adv-Makeup: A New Imperceptible and Transferable Attack on Face Recognition

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/173 ◽

2021 ◽

Author(s):

Bangjie Yin ◽

Wenxuan Wang ◽

Taiping Yao ◽

Junfeng Guo ◽

Zelun Kong ◽

...

Keyword(s):

Face Recognition ◽

Black Box ◽

Fine Grained ◽

Box Models ◽

Meta Learning ◽

Adversarial Examples ◽

Adversarial Attack ◽

Face Generation ◽

Orbital Region ◽

Black Box Models

Deep neural networks, particularly face recognition models, have been shown to be vulnerable to both digital and physical adversarial examples. However, existing adversarial examples against face recognition systems either lack transferability to black-box models or fail to be implemented in practice. In this paper, we propose a unified adversarial face generation method, Adv-Makeup, which can realize imperceptible and transferable attacks under the black-box setting. Adv-Makeup develops a task-driven makeup generation method with a blending module to synthesize imperceptible eye shadow over the orbital region on faces. To achieve transferability, Adv-Makeup implements a fine-grained meta-learning based adversarial attack strategy to learn more vulnerable or sensitive features from various models. Compared to existing techniques, visualization results demonstrate that Adv-Makeup is capable of generating much more imperceptible attacks under both digital and physical scenarios. Meanwhile, extensive quantitative experiments show that Adv-Makeup can significantly improve the attack success rate under the black-box setting, even when attacking commercial systems.


Demiguise Attack: Crafting Invisible Semantic Adversarial Perturbations with Perceptual Similarity

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/430 ◽

2021 ◽

Author(s):

Yajie Wang ◽

Shangbo Wu ◽

Wenyi Jiang ◽

Shengang Hao ◽

Yu-an Tan ◽

...

Keyword(s):

Deep Neural Networks ◽

Semantic Information ◽

State Of The Art ◽

Perceptual Similarity ◽

Success Rates ◽

Lp Norm ◽

Box Models ◽

Adversarial Examples ◽

Black Box Models ◽

Blind Spots

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples. Adversarial examples are malicious images with visually imperceptible perturbations. While these carefully crafted perturbations restricted with tight Lp norm bounds are small, they are still easily perceivable by humans. These perturbations also have limited success rates when attacking black-box models or models with defenses like noise reduction filters. To solve these problems, we propose Demiguise Attack, crafting "unrestricted" perturbations with Perceptual Similarity. Specifically, we can create powerful and photorealistic adversarial examples by manipulating semantic information based on Perceptual Similarity. Adversarial examples we generate are friendly to the human visual system (HVS), although the perturbations are of large magnitudes. We extend widely-used attacks with our approach, enhancing adversarial effectiveness impressively while contributing to imperceptibility. Extensive experiments show that the proposed method not only outperforms various state-of-the-art attacks in terms of fooling rate, transferability, and robustness against defenses but can also improve attacks effectively. In addition, we also notice that our implementation can simulate illumination and contrast changes that occur in real-world scenarios, which will contribute to exposing the blind spots of DNNs.


Sanitizing hidden activations for improving adversarial robustness of convolutional neural networks

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210371 ◽

2021 ◽

pp. 1-11

Author(s):

Tianshi Mu ◽

Kequan Lin ◽

Huabing Zhang ◽

Jian Wang

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Black Box ◽

Experimental Results ◽

Amplification Effect ◽

Wide Range ◽

Adversarial Examples

Deep learning is gaining significant traction in a wide range of areas. However, recent studies have demonstrated that deep learning exhibits a fatal weakness to adversarial examples. Due to the black-box nature and lack of transparency of deep learning, it is difficult to explain why adversarial examples exist and also hard to defend against them. This study focuses on improving the adversarial robustness of convolutional neural networks. We first explore how adversarial examples behave inside the network through visualization. We find that adversarial examples produce perturbations in hidden activations, which form an amplification effect that fools the network. Motivated by this observation, we propose an approach, termed sanitizing hidden activations, to help the network correctly recognize adversarial examples by eliminating or reducing the perturbations in hidden activations. To demonstrate the effectiveness of our approach, we conduct experiments on three widely used datasets, MNIST, CIFAR-10 and ImageNet, and also compare with state-of-the-art defense techniques. The experimental results show that our sanitizing approach generalizes better in defending against different kinds of attacks and can effectively improve the adversarial robustness of convolutional neural networks.
