Autonomous Decision-Making Method for Combat Mission of UAV based on Deep Reinforcement Learning

Author(s):  
Jie Xu ◽  
Qing Guo ◽  
Lei Xiao ◽  
Zhaoyi Li ◽  
Gaowei Zhang


2013 ◽  
Vol 92 (1) ◽  
pp. 5-39 ◽  
Author(s):  
Markus Peters ◽  
Wolfgang Ketter ◽  
Maytal Saar-Tsechansky ◽  
John Collins

Author(s):  
Junfeng Zhang ◽  
Qing Xue

In a tactical wargame, the decisions of the artificial intelligence (AI) commander are critical to the final combat result. Because of the fog of war, the AI commander faces unknown and invisible battlefield information, has an incomplete understanding of the situation, and therefore struggles to devise appropriate tactical strategies. Traditional knowledge- and rule-based decision-making methods lack flexibility and autonomy, so making flexible, autonomous decisions in complex battlefield situations remains a difficult problem. This paper addresses the AI commander's decision-making problem with deep reinforcement learning (DRL). We develop a tactical wargame as the research environment; it contains a built-in scripted AI and supports a machine-versus-machine combat mode. On this basis, we design an end-to-end actor–critic framework for commander decision making in which a convolutional neural network represents the battlefield situation, and we use reinforcement learning to explore different tactical strategies. Finally, we run a combat experiment between a DRL-based agent and a rule-based agent in a jungle terrain scenario. The results show that the AI commander trained with the actor–critic method learns to achieve a higher score in the tactical wargame, and the DRL-based agent attains a higher winning ratio than the rule-based agent.
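As a rough illustration of the architecture this abstract describes, the sketch below (PyTorch; layer sizes, channel counts, action count, and reward values are all illustrative and not taken from the paper) shows a CNN encoder shared by an actor head and a critic head, followed by one advantage actor–critic update on stand-in data:

```python
import torch
import torch.nn as nn

class CommanderActorCritic(nn.Module):
    """Illustrative actor-critic network: a CNN encodes a grid-shaped
    battlefield tensor; separate heads output policy logits and a value."""
    def __init__(self, in_channels: int, n_actions: int, grid: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat = 64 * grid * grid
        self.actor = nn.Linear(feat, n_actions)   # policy logits
        self.critic = nn.Linear(feat, 1)          # state-value estimate

    def forward(self, state: torch.Tensor):
        h = self.encoder(state)
        return self.actor(h), self.critic(h)

# One advantage actor-critic update step on placeholder data:
net = CommanderActorCritic(in_channels=8, n_actions=16)
opt = torch.optim.Adam(net.parameters(), lr=3e-4)
state = torch.randn(1, 8, 32, 32)          # placeholder battlefield tensor
logits, value = net(state)
dist = torch.distributions.Categorical(logits=logits)
action = dist.sample()
reward, next_value = 1.0, 0.0              # placeholder rollout outcome
advantage = reward + 0.99 * next_value - value
loss = -dist.log_prob(action) * advantage.detach() + advantage.pow(2)
opt.zero_grad()
loss.mean().backward()
opt.step()
```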


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Baolai Wang ◽  
Shengang Li ◽  
Xianzhong Gao ◽  
Tao Xie

With the development of unmanned aerial vehicle (UAV) technology, UAV swarm confrontation has attracted many researchers' attention. However, the situation faced by a UAV swarm is highly uncertain and dynamically variable, and the state and action spaces grow exponentially with the number of UAVs, which makes autonomous decision-making in a confrontation environment a difficult problem. In this paper, a multiagent reinforcement learning method with macro actions and human expertise is proposed for the autonomous decision-making of UAVs. In the proposed approach, the UAV swarm is modeled as a large multiagent system (MAS) with each individual UAV as an agent, and the sequential decision-making problem in swarm confrontation is modeled as a Markov decision process. Agents are trained over macro actions, which effectively mitigates the problems of sparse and delayed rewards and of large state and action spaces. The key to the method's success is the construction of macro actions that allow the high-level policy to find a near-optimal solution; in this paper, we further leverage human expertise to design a set of good macro actions. Extensive empirical experiments in our constructed swarm confrontation environment show that our method outperforms the compared algorithms.
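To make the macro-action idea concrete, here is a minimal sketch (not the paper's actual macro set; the controllers, primitive action ids, and Gym-style env.step interface are all assumptions) in which each macro action is a hand-designed expansion into primitive actions, so the high-level policy makes one decision per macro rather than per time step:

```python
from typing import Callable, List

# Hypothetical hand-designed controllers encoding human expertise.
def attack_nearest(obs) -> List[int]:
    """Expand into primitive actions steering toward the nearest enemy."""
    return [0, 0, 2]                        # placeholder primitive action ids

def evade(obs) -> List[int]:
    """Expand into primitive actions that break away from pursuers."""
    return [1, 3, 3]

MACRO_ACTIONS: List[Callable] = [attack_nearest, evade]

def run_macro(env, obs, macro_id: int):
    """Execute one macro action and accumulate its reward, so the
    high-level learner sees a single transition per macro decision
    (mitigating sparse/delayed rewards and shrinking the action space)."""
    total_reward, done = 0.0, False
    for a in MACRO_ACTIONS[macro_id](obs):
        obs, reward, done, _ = env.step(a)  # assumed Gym-style interface
        total_reward += reward
        if done:
            break
    return obs, total_reward, done
```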


Author(s):  
Richard Ashcroft

This chapter discusses the ethics of depression from a personal perspective. The author, an academic who has worked in the field of medical ethics or bioethics, has suffered episodes of depression throughout his life, some lasting several months. Here he offers some informal reflections on how these two facts about him are connected. He first considers the paradigm of autonomy and autonomous decision-making, as well as the problem that functional accounts of autonomy pose with regard to depression. He then reflects on an approach to ethics and depression that involves thinking about the ethics of being depressed. Finally, he highlights two aspects of the 'ethics of depression': treatment and the ethical obligation to talk about it.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Batel Yifrah ◽  
Ayelet Ramaty ◽  
Genela Morris ◽  
Avi Mendelsohn

Decision making can be shaped both by trial-and-error experiences and by memory of unique contextual information. Moreover, these types of information can be acquired either through active experience or by observing others behave in similar situations. The interactions between the reinforcement learning parameters that inform decision updating and the formation of declarative memories in experienced and observational learning settings are, however, unknown. In the current study, participants took part in a probabilistic decision-making task involving situations that either yielded outcomes similar to those of an observed player or opposed them. By fitting alternative reinforcement learning models to each subject, we distinguished participants who learned similarly from experience and observation from those who assigned different weights to learning signals from the two sources. Participants who weighted their own experience differently from that of others displayed enhanced memory performance, as well as greater subjective memory strength, for episodes involving significant reward prospects. Conversely, the memory performance of participants who did not prioritize their own experience over that of others did not appear to be influenced by reinforcement learning parameters. These findings demonstrate that interactions between implicit and explicit learning systems depend on how individuals weigh relevant information conveyed via experience and observation.
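As a schematic of this kind of model comparison (the study's actual model space and parameters are not given here), one candidate model assigns separate learning rates to experienced and observed outcomes; fitting it per participant and comparing it with a single-rate variant would distinguish the two groups described above:

```python
import numpy as np

def neg_log_likelihood(params, choices, rewards, was_observed, n_options=2):
    """Rescorla-Wagner-style model with separate learning rates for one's
    own outcomes (alpha_self) and observed outcomes (alpha_obs)."""
    alpha_self, alpha_obs, beta = params
    q = np.zeros(n_options)                # action values
    nll = 0.0
    for c, r, obs in zip(choices, rewards, was_observed):
        p = np.exp(beta * q - np.max(beta * q))
        p /= p.sum()                       # softmax choice probabilities
        nll -= np.log(p[c])
        alpha = alpha_obs if obs else alpha_self
        q[c] += alpha * (r - q[c])         # prediction-error update
    return nll
```

Minimizing `neg_log_likelihood` per subject (for example with `scipy.optimize.minimize`) and comparing the fit against a constrained model with alpha_self equal to alpha_obs is one standard way to identify participants who weight the two information sources differently.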


Author(s):  
Ming-Sheng Ying ◽  
Yuan Feng ◽  
Sheng-Gang Ying

The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random; in particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, the quantum MDP (qMDP), which can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies of qMDPs in the finite-horizon case. The results obtained in this paper provide useful mathematical tools for reinforcement learning techniques applied to the quantum world.
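For orientation, the classical finite-horizon counterpart of these algorithms is backward induction; the sketch below (plain NumPy, with the quantum-specific state and operation structure of qMDPs deliberately omitted) computes an optimal time-dependent policy by dynamic programming:

```python
import numpy as np

def backward_induction(P, R, horizon):
    """Finite-horizon optimal policy for a classical MDP via dynamic
    programming. P[a][s, s'] are transition probabilities under action a;
    R[s, a] is the immediate reward for taking a in state s."""
    n_states, n_actions = R.shape
    v = np.zeros(n_states)                     # value at the final stage
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in reversed(range(horizon)):
        # Q[s, a] = immediate reward + expected value of successor state
        q = R + np.stack([P[a] @ v for a in range(n_actions)], axis=1)
        policy[t] = q.argmax(axis=1)
        v = q.max(axis=1)
    return policy, v
```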

