Attention Mechanism Based Adversarial Attack Against Deep Reinforcement Learning

A Black-Box Adversarial Attack via Deep Reinforcement Learning on the Feature Space

2021 IEEE Conference on Dependable and Secure Computing (DSC) ◽

10.1109/dsc49826.2021.9346264 ◽

2021 ◽

Author(s):

Lyue Li ◽

Amir Rezapour ◽

Wen-Guey Tzeng

Keyword(s):

Reinforcement Learning ◽

Feature Space ◽

Black Box ◽

Adversarial Attack

Attentional Reinforcement Learning in the Brain

New Generation Computing ◽

10.1007/s00354-019-00081-z ◽

2020 ◽

Vol 38 (1) ◽

pp. 49-64 ◽

Cited By ~ 2

Author(s):

Hiroshi Yamakawa

Keyword(s):

Deep Learning ◽

Reinforcement Learning ◽

Basal Ganglia ◽

Language Processing ◽

Information Source ◽

Attention Mechanism ◽

Transmission Route ◽

Thalamic Relay ◽

Signal Changes ◽

The Brain

AbstractRecently, attention mechanisms have significantly boosted the performance of natural language processing using deep learning. An attention mechanism can select the information to be used, such as by conducting a dictionary lookup; this information is then used, for example, to select the next utterance word in a sentence. In neuroscience, the basis of the function of sequentially selecting words is considered to be the cortico-basal ganglia-thalamocortical loop. Here, we first show that the attention mechanism used in deep learning corresponds to the mechanism in which the cerebral basal ganglia suppress thalamic relay cells in the brain. Next, we demonstrate that, in neuroscience, the output of the basal ganglia is associated with the action output in the actor of reinforcement learning. Based on these, we show that the aforementioned loop can be generalized as reinforcement learning that controls the transmission of the prediction signal so as to maximize the prediction reward. We call this attentional reinforcement learning (ARL). In ARL, the actor selects the information transmission route according to the attention, and the prediction signal changes according to the context detected by the information source of the route. Hence, ARL enables flexible action selection that depends on the situation, unlike traditional reinforcement learning, wherein the actor must directly select an action.

Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6047 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5883-5891

Author(s):

Jianwen Sun ◽

Tianwei Zhang ◽

Xiaofei Xie ◽

Lei Ma ◽

Yan Zheng ◽

...

Keyword(s):

Deep Learning ◽

Reinforcement Learning ◽

Critical Point ◽

State Of The Art ◽

Great Success ◽

Severe Damage ◽

Minimal Set ◽

Adversarial Attack ◽

Attack Strategy ◽

Critical Moments

Adversarial attacks against conventional Deep Learning (DL) systems and algorithms have been widely studied, and various defenses were proposed. However, the possibility and feasibility of such attacks against Deep Reinforcement Learning (DRL) are less explored. As DRL has achieved great success in various complex tasks, designing effective adversarial attacks is an indispensable prerequisite towards building robust DRL algorithms. In this paper, we introduce two novel adversarial attack techniques to stealthily and efficiently attack the DRL agents. These two techniques enable an adversary to inject adversarial samples in a minimal set of critical moments while causing the most severe damage to the agent. The first technique is the critical point attack: the adversary builds a model to predict the future environmental states and agent's actions, assesses the damage of each possible attack strategy, and selects the optimal one. The second technique is the antagonist attack: the adversary automatically learns a domain-agnostic model to discover the critical moments of attacking the agent in an episode. Experimental results demonstrate the effectiveness of our techniques. Specifically, to successfully attack the DRL agent, our critical point technique only requires 1 (TORCS) or 2 (Atari Pong and Breakout) steps, and the antagonist technique needs fewer than 5 steps (4 Mujoco tasks), which are significant improvements over state-of-the-art methods.

An Improved Approach towards Multi-Agent Pursuit–Evasion Game Decision-Making Using Deep Reinforcement Learning

Entropy ◽

10.3390/e23111433 ◽

2021 ◽

Vol 23 (11) ◽

pp. 1433

Author(s):

Kaifang Wan ◽

Dingwei Wu ◽

Yiwei Zhai ◽

Bo Li ◽

Xiaoguang Gao ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Superior Performance ◽

State Variables ◽

Multi Agent Systems ◽

Adversarial Learning ◽

Pursuit Evasion ◽

Evasion Game ◽

Multi Agent ◽

Adversarial Attack

A pursuit–evasion game is a classical maneuver confrontation problem in the multi-agent systems (MASs) domain. An online decision technique based on deep reinforcement learning (DRL) was developed in this paper to address the problem of environment sensing and decision-making in pursuit–evasion games. A control-oriented framework developed from the DRL-based multi-agent deep deterministic policy gradient (MADDPG) algorithm was built to implement multi-agent cooperative decision-making to overcome the limitation of the tedious state variables required for the traditionally complicated modeling process. To address the effects of errors between a model and a real scenario, this paper introduces adversarial disturbances. It also proposes a novel adversarial attack trick and adversarial learning MADDPG (A2-MADDPG) algorithm. By introducing an adversarial attack trick for the agents themselves, uncertainties of the real world are modeled, thereby optimizing robust training. During the training process, adversarial learning was incorporated into our algorithm to preprocess the actions of multiple agents, which enabled them to properly respond to uncertain dynamic changes in MASs. Experimental results verified that the proposed approach provides superior performance and effectiveness for pursuers and evaders, and both can learn the corresponding confrontational strategy during training.

An Actor-Critic-Attention Mechanism for Deep Reinforcement Learning in Multi-view Environments

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/277 ◽

2019 ◽

Cited By ~ 1

Author(s):

Elaheh Barati ◽

Xuewen Chen

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Attention Mechanism ◽

Feature Representation ◽

Decision Making Process ◽

Multiple Views ◽

Single Feature ◽

3D Environments ◽

Noisy Conditions ◽

Level Of Importance

In reinforcement learning algorithms, leveraging multiple views of the environment can improve the learning of complicated policies. In multi-view environments, due to the fact that the views may frequently suffer from partial observability, their level of importance are often different. In this paper, we propose a deep reinforcement learning method and an attention mechanism in a multi-view environment. Each view can provide various representative information about the environment. Through our attention mechanism, our method generates a single feature representation of environment given its multiple views. It learns a policy to dynamically attend to each view based on its importance in the decision-making process. Through experiments, we show that our method outperforms its state-of-the-art baselines on TORCS racing car simulator and three other complex 3D environments with obstacles. We also provide experimental results to evaluate the performance of our method on noisy conditions and partial observation settings.

Adversarial Attack for Deep Reinforcement Learning Based Demand Response

10.1109/pesgm46819.2021.9637826 ◽

2021 ◽

Author(s):

Zhiqiang Wan ◽

Hepeng Li ◽

Hang Shuai ◽

Yan Lindsay Sun ◽

Haibo He

Keyword(s):

Reinforcement Learning ◽

Demand Response ◽

Adversarial Attack

Deep Deterministic Policy Gradient Algorithm Based on Convolutional Block Attention for Autonomous Driving

Symmetry ◽

10.3390/sym13061061 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1061

Author(s):

Yanliang Jin ◽

Qianhong Liu ◽

Liquan Shen ◽

Leiji Zhu

Keyword(s):

Reinforcement Learning ◽

Supervised Learning ◽

State Of The Art ◽

Autonomous Driving ◽

Attention Mechanism ◽

Gradient Algorithm ◽

Excellent Performance ◽

Average Speed ◽

The Road ◽

Policy Gradient

The research on autonomous driving based on deep reinforcement learning algorithms is a research hotspot. Traditional autonomous driving requires human involvement, and the autonomous driving algorithms based on supervised learning must be trained in advance using human experience. To deal with autonomous driving problems, this paper proposes an improved end-to-end deep deterministic policy gradient (DDPG) algorithm based on the convolutional block attention mechanism, and it is called multi-input attention prioritized deep deterministic policy gradient algorithm (MAPDDPG). Both the actor network and the critic network of the model have the same structure with symmetry. Meanwhile, the attention mechanism is introduced to help the vehicles focus on useful environmental information. The experiments are conducted in the open racing car simulator (TORCS)and the results of five experiment runs on the test tracks are averaged to obtain the final result. Compared with the state-of-the-art algorithm, the maximum reward increases from 62,207 to 116,347, and the average speed increases from 135 km/h to 193 km/h, while the number of success episodes to complete a circle increases from 96 to 147. Also, the variance of the distance from the vehicle to the center of the road is compared, and the result indicates that the variance of the DDPG is 0.6 m while that of the MAPDDPG is only 0.2 m. The above results indicate that the proposed MAPDDPG achieves excellent performance.

Adversarial Attack against Deep Reinforcement Learning with Static Reward Impact Map

Proceedings of the 15th ACM Asia Conference on Computer and Communications Security ◽

10.1145/3320269.3384715 ◽

2020 ◽

Cited By ~ 1

Author(s):

Patrick P. K. Chan ◽

Yaxuan Wang ◽

Daniel S. Yeung

Keyword(s):

Reinforcement Learning ◽

Adversarial Attack

A Graph Attention Mechanism Based Multi-Agent Reinforcement Learning Method for Efficient Traffic Light Control

2021 International Wireless Communications and Mobile Computing (IWCMC) ◽

10.1109/iwcmc51323.2021.9498671 ◽

2021 ◽

Author(s):

Changqing Su ◽

Yan Yan ◽

Tao Wang ◽

Baoxian Zhang ◽

Cheng Li

Keyword(s):

Reinforcement Learning ◽

Attention Mechanism ◽

Learning Method ◽

Light Control ◽

Traffic Light ◽

Traffic Light Control ◽

Multi Agent

Image Inpainting and Classification Agent Training Based on Reinforcement Learning and Generative Models with Attention Mechanism

10.1109/icm52667.2021.9664950 ◽

2021 ◽

Author(s):

Chiagoziem C. Ukwuoma ◽

Md Belal Bin Heyat ◽

Mahmoud Masadeh ◽

Faijan Akhtar ◽

Qin Zhiguang ◽

...

Keyword(s):

Reinforcement Learning ◽

Image Inpainting ◽

Generative Models ◽

Attention Mechanism