Adversarial attack and defense in reinforcement learning – from AI security view

Cybersecurity ◽  
2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Tong Chen ◽  
Jiqiang Liu ◽  
Yingxiao Xiang ◽  
Wenjia Niu ◽  
Endong Tong ◽  
...  


2020 ◽  
Vol 34 (04) ◽  
pp. 3405-3413
Author(s):  
Zhaohui Che ◽  
Ali Borji ◽  
Guangtao Zhai ◽  
Suiyi Ling ◽  
Jing Li ◽  
...  

Deep neural networks are vulnerable to adversarial attacks. More importantly, some adversarial examples crafted against an ensemble of pre-trained source models can transfer to other new target models, thus posing a security threat to black-box applications (where attackers have no access to the target models). Despite adopting diverse architectures and parameters, source and target models often share similar decision boundaries. Therefore, if an adversary is capable of fooling several source models concurrently, it can potentially capture intrinsic transferable adversarial information that may allow it to fool a broad class of other black-box target models. Current ensemble attacks, however, consider only a limited number of source models when crafting an adversarial example, and consequently obtain poor transferability. In this paper, we propose a novel black-box attack, dubbed Serial-Mini-Batch-Ensemble-Attack (SMBEA). SMBEA divides a large number of pre-trained source models into several mini-batches. For each batch, we design three new ensemble strategies to improve the intra-batch transferability. Besides, we propose a new algorithm that recursively accumulates the “long-term” gradient memories of the previous batch into the following batch. This way, the learned adversarial information is preserved and the inter-batch transferability is improved. Experiments indicate that our method outperforms state-of-the-art ensemble attacks on multiple pixel-to-pixel vision tasks, including image translation and salient region prediction. Our method successfully fools two online black-box saliency prediction systems: DeepGaze-II (Kümmerer et al. 2017) and SALICON (Huang et al. 2017). Finally, we also contribute a new repository to promote research on adversarial attack and defense over pixel-to-pixel tasks: https://github.com/CZHQuality/AAA-Pix2pix.
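As a rough illustration of the batch-to-batch memory idea, the sketch below accumulates a “long-term” gradient memory across mini-batches of source models. It is a minimal PyTorch rendering with hypothetical names, using a classification loss in place of the paper's pixel-to-pixel objectives; it is not the authors' SMBEA implementation.

```python
# Hypothetical sketch of serial mini-batch ensemble attack with gradient memory.
import torch

def smbea_sketch(x, y, source_models, batch_size=4, steps=10,
                 eps=8 / 255, alpha=1 / 255, decay=1.0):
    """Craft one adversarial example over mini-batches of source models,
    carrying a "long-term" gradient memory from each batch to the next."""
    x_adv = x.clone().detach()
    memory = torch.zeros_like(x)  # long-term gradient memory
    batches = [source_models[i:i + batch_size]
               for i in range(0, len(source_models), batch_size)]
    for batch in batches:
        for _ in range(steps):
            x_adv.requires_grad_(True)
            # Intra-batch ensemble: average the losses of the batch's models.
            loss = sum(torch.nn.functional.cross_entropy(m(x_adv), y)
                       for m in batch) / len(batch)
            grad, = torch.autograd.grad(loss, x_adv)
            # Recursively fold the previous batches' memory into the update.
            memory = decay * memory + grad / grad.abs().mean().clamp_min(1e-12)
            # Signed-gradient step, projected back into the L-inf budget.
            x_adv = (x + (x_adv + alpha * memory.sign() - x)
                     .clamp(-eps, eps)).detach()
    return x_adv
```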


2020 ◽  
Vol 2020 ◽  
pp. 1-9 ◽  
Author(s):  
Lingyun Jiang ◽  
Kai Qiao ◽  
Ruoxi Qin ◽  
Linyuan Wang ◽  
Wanting Yu ◽  
...  

In deep-learning-based image classification, adversarial examples, inputs perturbed by small-magnitude perturbations, can mislead deep neural networks (DNNs) into incorrect results, which means DNNs are vulnerable to them. Various attack and defense strategies have been proposed to better understand the mechanisms of deep learning. However, existing research addresses only one aspect at a time, either attack or defense: offensive and defensive performance are improved separately, and it is difficult for the two to promote each other within the same framework. In this paper, we propose the Cycle-Consistent Adversarial GAN (CycleAdvGAN) to generate adversarial examples. It learns to approximate the distributions of both original instances and adversarial examples, and in particular lets the attacker and defender confront each other and improve each other's ability. Once the generators GA and GD are trained, GA can efficiently generate adversarial perturbations for any instance, improving the performance of existing attack methods, while GD can recover adversarial examples to clean instances, defending against existing attack methods. We apply CycleAdvGAN under semi-white-box and black-box settings on two public datasets, MNIST and CIFAR-10. Extensive experiments show that our method achieves state-of-the-art adversarial attack performance while also efficiently improving defense ability, realizing the integration of adversarial attack and defense. In addition, it improves the attack effect even when trained only on an adversarial dataset generated by a single kind of adversarial attack.
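The core of this setup is a CycleGAN-style pair of generators tied together by cycle-consistency terms. The sketch below is a minimal, hypothetical rendering of those loss terms in PyTorch; the L1 reconstruction and the weight lambda_cyc are assumptions, not taken from the paper.

```python
# Hypothetical cycle-consistency terms for a CycleAdvGAN-style generator pair.
import torch
import torch.nn.functional as F

def cycle_losses(GA, GD, x_clean, x_adv, lambda_cyc=10.0):
    """GA maps clean -> adversarial, GD maps adversarial -> clean.
    The cycle terms tie the attack and defense mappings together."""
    fake_adv = GA(x_clean)    # attack direction
    fake_clean = GD(x_adv)    # defense (recovery) direction
    # A round trip through both generators should reconstruct the input.
    cyc_clean = F.l1_loss(GD(fake_adv), x_clean)
    cyc_adv = F.l1_loss(GA(fake_clean), x_adv)
    return lambda_cyc * (cyc_clean + cyc_adv), fake_adv, fake_clean
```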


Author(s):  
Nilaksh Das ◽  
Madhuri Shanbhogue ◽  
Shang-Tse Chen ◽  
Li Chen ◽  
Michael E. Kounavis ◽  
...  


2021 ◽  
Vol 11 (18) ◽  
pp. 8450
Author(s):  
Xiaojiao Chen ◽  
Sheng Li ◽  
Hao Huang

Voice Processing Systems (VPSes), now widely deployed, have become deeply involved in people’s daily lives, helping drive cars, unlock smartphones, make online purchases, and so on. Unfortunately, recent research has shown that those systems based on deep neural networks are vulnerable to adversarial examples, which has drawn significant attention to VPS security. This review presents a detailed introduction to the background knowledge of adversarial attacks, including the generation of adversarial examples, psychoacoustic models, and evaluation indicators. We then provide a concise introduction to defense methods against adversarial attacks. Finally, we propose a systematic classification of adversarial attacks and defense methods, with which we hope to give beginners in this field a better understanding of the area's classification and structure.
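For readers new to the area, the sketch below shows the simplest form of gradient-based adversarial example generation on a raw waveform, an FGSM-style step. The model, loss, and epsilon are placeholders, and practical attacks on VPSes add psychoacoustic constraints of the kind the review discusses.

```python
# Hypothetical FGSM-style perturbation of a raw audio waveform.
import torch

def fgsm_audio(model, waveform, target_loss, eps=0.002):
    """One signed-gradient step toward a target transcription.
    target_loss is a placeholder, e.g. a CTC loss against the target text."""
    waveform = waveform.clone().detach().requires_grad_(True)
    loss = target_loss(model(waveform))
    loss.backward()
    # Descend the loss toward the target; a psychoacoustic masking model
    # would further constrain where the perturbation is audible.
    return (waveform - eps * waveform.grad.sign()).detach()
```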


2020 ◽  
Vol 34 (04) ◽  
pp. 5883-5891
Author(s):  
Jianwen Sun ◽  
Tianwei Zhang ◽  
Xiaofei Xie ◽  
Lei Ma ◽  
Yan Zheng ◽  
...  

Adversarial attacks against conventional Deep Learning (DL) systems and algorithms have been widely studied, and various defenses have been proposed. However, the possibility and feasibility of such attacks against Deep Reinforcement Learning (DRL) are less explored. As DRL has achieved great success in various complex tasks, designing effective adversarial attacks is an indispensable prerequisite for building robust DRL algorithms. In this paper, we introduce two novel adversarial attack techniques to stealthily and efficiently attack DRL agents. These techniques enable an adversary to inject adversarial samples at a minimal set of critical moments while causing the most severe damage to the agent. The first technique is the critical point attack: the adversary builds a model to predict future environmental states and the agent's actions, assesses the damage of each possible attack strategy, and selects the optimal one. The second technique is the antagonist attack: the adversary automatically learns a domain-agnostic model to discover the critical moments for attacking the agent within an episode. Experimental results demonstrate the effectiveness of our techniques. Specifically, to successfully attack the DRL agent, our critical point technique requires only 1 (TORCS) or 2 (Atari Pong and Breakout) steps, and the antagonist technique needs fewer than 5 steps (4 Mujoco tasks), both significant improvements over state-of-the-art methods.
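The critical point attack can be pictured as a short model-based search: roll a learned dynamics model forward, score the predicted damage of attacking at each step, and attack only at the best one. The sketch below is a hypothetical rendering of that loop; predict_next, damage, and agent.act are placeholder interfaces, not the authors' code.

```python
# Hypothetical critical-point selection over a short prediction horizon.
def select_attack_step(env_state, agent, predict_next, damage, horizon=10):
    """Return the step within the horizon where injecting an adversarial
    observation is predicted to hurt the agent the most."""
    best_step, best_damage = None, float("-inf")
    state = env_state
    for t in range(horizon):
        action = agent.act(state)
        harm = damage(state, action)  # predicted harm if we attack at step t
        if harm > best_damage:
            best_step, best_damage = t, harm
        state = predict_next(state, action)  # learned dynamics model rollout
    return best_step
```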


Entropy ◽  
2021 ◽  
Vol 23 (11) ◽  
pp. 1433
Author(s):  
Kaifang Wan ◽  
Dingwei Wu ◽  
Yiwei Zhai ◽  
Bo Li ◽  
Xiaoguang Gao ◽  
...  

A pursuit–evasion game is a classical maneuver-confrontation problem in the multi-agent systems (MASs) domain. This paper develops an online decision technique based on deep reinforcement learning (DRL) to address environment sensing and decision-making in pursuit–evasion games. A control-oriented framework built on the multi-agent deep deterministic policy gradient (MADDPG) algorithm implements multi-agent cooperative decision-making, avoiding the tedious state variables required by the traditionally complicated modeling process. To address the discrepancy between the model and the real scenario, this paper introduces adversarial disturbances and proposes a novel adversarial attack trick together with an adversarial-learning MADDPG (A2-MADDPG) algorithm. By applying the adversarial attack trick to the agents themselves, real-world uncertainties are modeled, thereby making training more robust. During training, adversarial learning is incorporated into the algorithm to preprocess the actions of multiple agents, enabling them to respond properly to uncertain dynamic changes in MASs. Experimental results verify that the proposed approach provides superior performance and effectiveness for both pursuers and evaders, and both can learn the corresponding confrontation strategy during training.
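One way to realize such an adversarial attack trick is to perturb each agent's action with a small worst-case disturbance, found by descending the critic's value, before the action reaches the environment. The PyTorch sketch below illustrates this under assumed actor–critic interfaces; it is not the A2-MADDPG implementation.

```python
# Hypothetical worst-case action disturbance for robust actor-critic training.
import torch

def adversarial_action(critic, obs, action, eps=0.05, steps=3, lr=0.02):
    """Find a small bounded disturbance that minimizes the critic's Q-value."""
    action = action.detach()
    delta = torch.zeros_like(action, requires_grad=True)
    opt = torch.optim.SGD([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Lower Q means a more damaging disturbance for this agent.
        q = critic(obs, (action + delta).clamp(-1.0, 1.0))
        q.mean().backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the disturbance bounded
    return (action + delta).clamp(-1.0, 1.0).detach()
```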


Author(s):  
Chaoning Zhang ◽  
Philipp Benz ◽  
Chenguo Lin ◽  
Adil Karjauv ◽  
Jing Wu ◽  
...  

The intriguing phenomenon of adversarial examples has attracted significant attention in machine learning, and what might be more surprising to the community is the existence of universal adversarial perturbations (UAPs), i.e., a single perturbation that fools the target DNN on most images. Focusing on UAPs against deep classifiers, this survey summarizes recent progress on universal adversarial attacks, discussing the challenges from both the attack and defense sides, as well as the reasons UAPs exist. We aim to extend this work as a dynamic survey that regularly updates its content to follow new works on UAPs and universal attacks in a wide range of domains, such as image, audio, video, and text. Relevant updates will be discussed at: https://bit.ly/2SbQlLG. We welcome authors of future works in this field to contact us about including their new findings.
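To make the notion concrete, a UAP can be crafted by aggregating gradient steps over many images so that one shared perturbation fools the model on most inputs. The sketch below is a simplified gradient-sign variant in PyTorch, not a specific published algorithm; loader, eps, and alpha are assumptions.

```python
# Hypothetical sketch: one shared perturbation trained over a whole dataset.
import torch

def universal_perturbation(model, loader, eps=10 / 255, alpha=1 / 255,
                           epochs=5):
    """Build a single L-inf-bounded perturbation that raises the loss
    across many images at once."""
    delta = None
    for _ in range(epochs):
        for x, y in loader:
            if delta is None:
                delta = torch.zeros_like(x[:1])  # broadcasts over the batch
            delta.requires_grad_(True)
            loss = torch.nn.functional.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            # One perturbation, updated to fool as many inputs as possible.
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
    return delta
```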

