Drone Deep Reinforcement Learning: A Review

Electronics ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 999
Author(s):  
Ahmad Taher Azar ◽  
Anis Koubaa ◽  
Nada Ali Mohamed ◽  
Habiba A. Ibrahim ◽  
Zahra Fathy Ibrahim ◽  
...  

Unmanned Aerial Vehicles (UAVs) are increasingly being used in many challenging and diversified applications, in both the civilian and military fields. To name a few: infrastructure inspection, traffic patrolling, remote sensing, mapping, surveillance, rescuing humans and animals, environment monitoring, and Intelligence, Surveillance, Target Acquisition, and Reconnaissance (ISTAR) operations. However, the use of UAVs in these applications requires a substantial level of autonomy. In other words, UAVs should have the ability to accomplish planned missions in unexpected situations without requiring human intervention. To ensure this level of autonomy, many artificial intelligence algorithms have been designed, targeting the guidance, navigation, and control (GNC) of UAVs. In this paper, we describe the state of the art of one subset of these algorithms: deep reinforcement learning (DRL) techniques. We describe them in detail and deduce the current limitations in this area. We note that most of these DRL methods were designed to ensure stable and smooth UAV navigation by training in computer-simulated environments, and we conclude that further research efforts are needed to address the challenges that restrain their deployment in real-life scenarios.
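As a rough illustration of the simulation-trained reinforcement-learning loop that such reviews survey, the sketch below trains a tabular Q-learning agent to reach a waypoint in a toy grid-world "flight area". The environment, reward shaping, and hyper-parameters are illustrative assumptions, and the tabular agent stands in for the deep variants covered in the paper.

```python
# Minimal, hypothetical sketch of simulation-based RL for UAV waypoint navigation.
# The 2D grid world, rewards, and hyper-parameters are illustrative assumptions only.
import random

GRID = 5                                        # 5x5 simulated flight area
GOAL = (4, 4)                                   # target waypoint
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # four planar moves

def step(state, action):
    x, y = state
    dx, dy = ACTIONS[action]
    nx = min(max(x + dx, 0), GRID - 1)
    ny = min(max(y + dy, 0), GRID - 1)
    reward = 10.0 if (nx, ny) == GOAL else -1.0  # goal bonus vs. per-step cost
    return (nx, ny), reward, (nx, ny) == GOAL

Q = {}                                          # tabular value estimates
alpha, gamma, eps = 0.1, 0.95, 0.2

for episode in range(500):
    state, done = (0, 0), False
    while not done:
        if random.random() < eps:                              # epsilon-greedy exploration
            action = random.randrange(len(ACTIONS))
        else:
            action = max(range(len(ACTIONS)), key=lambda a: Q.get((state, a), 0.0))
        next_state, reward, done = step(state, action)
        best_next = max(Q.get((next_state, a), 0.0) for a in range(len(ACTIONS)))
        q = Q.get((state, action), 0.0)
        Q[(state, action)] = q + alpha * (reward + gamma * best_next - q)  # TD update
        state = next_state

best_start = max(range(len(ACTIONS)), key=lambda a: Q.get(((0, 0), a), 0.0))
print("greedy action at start:", ACTIONS[best_start])
```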

2021 ◽  
Vol 10 (2) ◽  
pp. 54-63
Author(s):  
Cezar POPA ◽  
Ion MITULEȚU

With the diversification of risks and threats in the multidimensional operational environment, in variable-geometry conflicts, state-of-the-art technology must be used at all times in the architecture of command and control systems. This will ensure optimal response conditions both at the planning level and at the level of the execution of the military operation/action. Real-time communication, horizontally and vertically, between tactical-level forces, operational- and strategic-level command and support structures, and other institutions with security and defence responsibilities can only be ensured and protected by using advanced technologies. The training of human resources for efficient use of the equipment must not be neglected, nor should the algorithms and processes that support efficient decision-making, in line with technical, technological, and artificial-intelligence developments. Keywords: command and control; artificial intelligence; efficiency; technologies; information technologies.


2021 ◽  
Author(s):  
Zhenhui Ye

In this paper, we aim to design a deep reinforcement learning (DRL) based control solution for navigating a swarm of unmanned aerial vehicles (UAVs) to fly around an unexplored target area while providing optimal communication coverage for the ground mobile users. In contrast to existing DRL-based solutions, which mainly solve the problem with global observation and centralized training, a practical and efficient Decentralized Training and Decentralized Execution (DTDE) framework is desirable for training and deploying each UAV in a distributed manner. To this end, we propose a novel DRL approach named Deep Recurrent Graph Network (DRGN), which uses a Graph Attention Network-based Flying Ad-hoc Network (GAT-FANET) to achieve inter-UAV communication and a Gated Recurrent Unit (GRU) to record historical information. We conducted extensive experiments to define an appropriate structure for GAT-FANET and to examine the performance of DRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and four heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of DRGN.
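A minimal sketch of the architecture described above, assuming PyTorch and illustrative layer sizes: attention restricted to the UAV communication graph plays the role of GAT-FANET, and a GRU cell keeps per-UAV history. The class name and sizes are assumptions, not the authors' implementation.

```python
# Illustrative DRGN-style network: adjacency-masked attention over the FANET graph,
# then a GRU memory per UAV. Layer sizes and the masking scheme are assumptions.
import torch
import torch.nn as nn

class DRGNSketch(nn.Module):
    def __init__(self, obs_dim=16, hidden=64, n_actions=5, heads=4):
        super().__init__()
        self.encode = nn.Linear(obs_dim, hidden)              # per-UAV observation encoder
        self.gat = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.gru = nn.GRUCell(hidden, hidden)                 # per-UAV recurrent memory
        self.policy = nn.Linear(hidden, n_actions)            # action logits

    def forward(self, obs, adjacency, memory):
        # obs: (n_uav, obs_dim); adjacency: (n_uav, n_uav) bool; memory: (n_uav, hidden)
        h = torch.relu(self.encode(obs)).unsqueeze(0)         # (1, n_uav, hidden)
        mask = ~adjacency                                     # True = no communication link
        h, _ = self.gat(h, h, h, attn_mask=mask)              # attend only to FANET neighbours
        memory = self.gru(h.squeeze(0), memory)               # fold messages into history
        return self.policy(memory), memory

n_uav = 4
model = DRGNSketch()
obs = torch.randn(n_uav, 16)
adj = torch.ones(n_uav, n_uav, dtype=torch.bool)              # fully connected FANET for the demo
memory = torch.zeros(n_uav, 64)
logits, memory = model(obs, adj, memory)
print(logits.shape)                                           # torch.Size([4, 5])
```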


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Canghong Jin ◽  
Yuli Zhou ◽  
Shengyu Ying ◽  
Chi Zhang ◽  
Weisong Wang ◽  
...  

In recent decades, more teachers are using question generators to provide students with online homework. Learning-to-rank (LTR) methods can partially rank questions to address the needs of individual students and reduce their study burden. Unfortunately, ranking questions for students is not trivial because of three main challenges: (1) discovering students’ latent knowledge and cognitive level is difficult, (2) the content of quizzes can be totally different but the knowledge points of these quizzes may be inherently related, and (3) ranking models based on supervised, semisupervised, or reinforcement learning focus on the current assignment without considering past performance. In this work, we propose KFRank, a knowledge-fusion ranking model based on reinforcement learning, which considers both a student’s assignment history and the relevance of quizzes to their knowledge points. First, we load students’ assignment history, reorganize it using knowledge points, and calculate the effective features for ranking in terms of the relation between a student’s cognitive knowledge state and the question. Then, a similarity estimator is built to choose historical questions, and an attention neural network is used to calculate the attention value and update the current study state with knowledge fusion. Finally, a rank algorithm based on a Markov decision process is used to optimize the parameters. Extensive experiments were conducted on a real-life dataset spanning a year, and we compared our model with state-of-the-art ranking models (e.g., ListNET and LambdaMART) and reinforcement-learning methods (such as MDPRank). Based on top-k nDCG values, our model outperforms other methods for groups of average and weak students, whose study abilities are relatively poor and whose behaviors are therefore more difficult to predict.
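The knowledge-fusion step described above could look roughly like the following sketch, in which attention over a student's previously answered questions updates the current study state. The embedding sizes, softmax attention, and mixing rule are assumptions, not the paper's exact model.

```python
# Illustrative knowledge-fusion update: attention over historical questions is used
# to refresh the student's study state before ranking the current question.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_knowledge(current_q, history_q, state, mix=0.5):
    """current_q: (d,) embedding of the question being ranked;
       history_q: (n, d) embeddings of previously answered questions;
       state:     (d,) the student's current study state."""
    scores = history_q @ current_q            # similarity of past questions to the current one
    attn = softmax(scores)                    # attention over the assignment history
    context = attn @ history_q                # knowledge-fused summary of relevant history
    return (1 - mix) * state + mix * context  # updated study state

rng = np.random.default_rng(0)
state = fuse_knowledge(rng.normal(size=8), rng.normal(size=(5, 8)), np.zeros(8))
print(state.shape)                            # (8,)
```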


2009 ◽  
Vol 129 (4) ◽  
pp. 363-367
Author(s):  
Tomoyuki Maeda ◽  
Makishi Nakayama ◽  
Hiroshi Narazaki ◽  
Akira Kitamura

2018 ◽  
Vol 28 (6) ◽  
pp. 1887-1891
Author(s):  
Todor Kalinov

Management and Command are two different words and terms, but military structures use them as synonyms. Military commanders’ authorities are almost equal in meaning to civilian managers’ privileges and power. A comparison between the military command and civilian management systems’ structure, organization, and way of working shows almost full identity and overlap. The largest military systems are national ministries of defence and multinational military alliances and coalitions. Military systems at this level combine military command structures with civilian political leadership and support elements. Therefore, they incorporate both military command and civilian management organizations without any complications, because their nature originated from the same source and they have a similar framework and content. Management of organizations requires communication in order to plan, coordinate, lead, control, and conduct all routine or extraordinary activities. Immediate long-distance communication originated with telegraphy, first applied in the 19th century. Later, long-distance communications included telephony, radio transmission, satellite links, and, last but not least, internet data exchange. They allowed the immediate exchange of letters, voice, and images, bringing new capabilities to managers. Their sophisticated technical base gave rise to a new area within military command and civilian management structures. This area covered the technical and operational parts of communications and created an engineering sub-field of science that has become one of the most popular fields of education worldwide. Communications were separated from military command and moved to a distinct field, named Computers and Communications. A historic overview and analysis of command and management structures and requirements shows their relationships, common origin, and mission. They have significant differences: management and control are based on the humanities and the natural and social sciences, while communications are mainly based on engineering and technology. These differences do not create sufficient grounds for detaching communications from the management structures. They exist together in symbiosis, and management structures need communications in order to exist and to multiply their effectiveness and efficiency. A future separation of military command from communications would bring risks of worse coordination, a need for more human resources, and worse end states. These risks are extremely negative for nations and should be avoided through the wide application of education and science among today’s and future leaders, managers, and commanders.


Author(s):  
Ivan Herreros

This chapter discusses basic concepts from control theory and machine learning to facilitate a formal understanding of animal learning and motor control. It first distinguishes between feedback and feed-forward control strategies, and later introduces the classification of machine learning applications into supervised, unsupervised, and reinforcement learning problems. Next, it links these concepts with their counterparts in the domain of the psychology of animal learning, highlighting the analogies between supervised learning and classical conditioning, between reinforcement learning and operant conditioning, and between unsupervised and perceptual learning. Additionally, it interprets innate and acquired actions from the standpoint of feedback vs. anticipatory and adaptive control. Finally, it argues how this framework of translating knowledge between formal and biological disciplines can serve not only to structure and advance our understanding of brain function but also to enrich engineering solutions at the level of robot learning and control with insights coming from biology.
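A toy contrast between the feedback and feed-forward strategies discussed in the chapter, on a one-dimensional tracking task with an unmodeled disturbance; the plant model, gain, and disturbance are illustrative assumptions only.

```python
# Feedback reacts to the measured error; feed-forward issues a pre-computed command
# from a plant model and cannot compensate an unmodeled disturbance.
def plant(position, command, disturbance=0.0):
    return position + command + disturbance          # simple additive dynamics

target = 10.0

pos_fb = 0.0
for _ in range(20):
    error = target - pos_fb                          # measure, then correct
    pos_fb = plant(pos_fb, 0.5 * error, disturbance=-0.2)

pos_ff = 0.0
for _ in range(20):
    pos_ff = plant(pos_ff, target / 20, disturbance=-0.2)  # pre-computed open-loop command

print(round(pos_fb, 2), round(pos_ff, 2))            # feedback ends near the target; feed-forward falls short
```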


2021 ◽  
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Giorgia Ramponi ◽  
Andrea Tirinzoni ◽  
Matteo Giuliani ◽  
...  

In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging, as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and present three application scenarios: (1) the high-level decision-making problem in the highway driving scenario, (2) inferring the user preferences in a social network (Twitter), and (3) the management of the water release in Lake Como. For each of these scenarios, we provide a formalization, experiments, and a discussion to interpret the obtained results.
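As a simplified, hypothetical stand-in for the batch IRL setting described above, the sketch below recovers a linear reward direction from fixed feature expectations alone, with no further environment interaction. It is a max-margin-style illustration under assumed feature vectors, not the authors' truly batch model-free algorithm.

```python
# Toy batch IRL with a linear reward: choose weights under which the expert's average
# features score higher than every candidate policy's, using only the fixed dataset.
import numpy as np

rng = np.random.default_rng(1)
mu_expert = np.array([1.0, 0.2, -0.5])          # expert's average feature counts (assumed)
mu_candidates = rng.normal(size=(5, 3))         # feature counts of non-expert candidate policies

w = np.zeros(3)
for _ in range(50):
    margins = mu_expert @ w - mu_candidates @ w  # how much the expert outscores each candidate
    worst = mu_candidates[np.argmin(margins)]    # most "confusing" candidate under current w
    w += 0.1 * (mu_expert - worst)               # enlarge the margin against that candidate
    w /= max(np.linalg.norm(w), 1.0)             # keep the reward weights bounded

print(np.round(w, 2))                            # recovered reward direction (illustrative)
```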


Metals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 870
Author(s):  
Robby Neven ◽  
Toon Goedemé

Automating sheet steel visual inspection can improve quality and reduce costs during its production. While many manufacturers still rely on manual or traditional inspection methods, deep learning-based approaches have proven their efficiency. In this paper, we go beyond the state of the art in this domain by proposing a multi-task model that performs both pixel-based defect segmentation and severity estimation of the defects in one two-branch network. Additionally, we show how incorporating the production process parameters improves the model’s performance. After manually constructing a real-life industrial dataset, we first implemented and trained two single-task models performing the defect segmentation and severity estimation tasks separately. Next, we compared this to a multi-task model that performs the two tasks simultaneously. By combining the tasks into one model, the two tasks improved by 2.5% and 3% mIoU, respectively. In the next step, we extended the multi-task model using sensor fusion with process parameters. We demonstrate that incorporating the process parameters resulted in a further mIoU increase of 6.8% and 2.9% for the defect segmentation and severity estimation tasks, respectively.
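A minimal sketch of the two-branch, sensor-fusion idea described above, assuming PyTorch: a shared encoder feeds a defect-segmentation head and a severity-estimation head, with the process parameters broadcast and concatenated into the shared features. Channel counts and the fusion-by-concatenation scheme are illustrative assumptions, not the paper's exact network.

```python
# Two-branch multi-task sketch with process-parameter fusion; sizes are illustrative.
import torch
import torch.nn as nn

class TwoBranchNet(nn.Module):
    def __init__(self, n_params=4, n_defects=3, n_severity=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.param_proj = nn.Linear(n_params, 16)              # embed process parameters
        self.seg_head = nn.Conv2d(32, n_defects, 1)            # pixel-wise defect classes
        self.sev_head = nn.Conv2d(32, n_severity, 1)           # pixel-wise severity levels

    def forward(self, image, params):
        feats = self.encoder(image)                            # (B, 16, H, W)
        p = self.param_proj(params)                            # (B, 16)
        p = p[:, :, None, None].expand(-1, -1, *feats.shape[2:])
        fused = torch.cat([feats, p], dim=1)                   # sensor fusion by concatenation
        return self.seg_head(fused), self.sev_head(fused)

model = TwoBranchNet()
seg, sev = model(torch.randn(2, 1, 64, 64), torch.randn(2, 4))
print(seg.shape, sev.shape)   # torch.Size([2, 3, 64, 64]) torch.Size([2, 4, 64, 64])
```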

