Drone Deep Reinforcement Learning: A Review

Electronics ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 999
Author(s):  
Ahmad Taher Azar ◽  
Anis Koubaa ◽  
Nada Ali Mohamed ◽  
Habiba A. Ibrahim ◽  
Zahra Fathy Ibrahim ◽  
...  

Unmanned Aerial Vehicles (UAVs) are increasingly being used in many challenging and diversified applications, in both the civilian and military fields. To name a few: infrastructure inspection, traffic patrolling, remote sensing, mapping, surveillance, rescuing humans and animals, environment monitoring, and Intelligence, Surveillance, Target Acquisition, and Reconnaissance (ISTAR) operations. However, the use of UAVs in these applications requires a substantial level of autonomy. In other words, UAVs should have the ability to accomplish planned missions in unexpected situations without requiring human intervention. To ensure this level of autonomy, many artificial intelligence algorithms have been designed, targeting the guidance, navigation, and control (GNC) of UAVs. In this paper, we describe the state of the art of one subset of these algorithms: deep reinforcement learning (DRL) techniques. We describe them in detail and deduce the current limitations in this area. We note that most of these DRL methods were designed to ensure stable and smooth UAV navigation by training in computer-simulated environments, and we conclude that further research efforts are needed to address the challenges that restrain their deployment in real-life scenarios.
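As a rough illustration of the simulation-trained reinforcement-learning loop that such reviews survey, the sketch below trains a tabular Q-learning agent to reach a waypoint in a toy grid-world "flight area". The environment, reward shaping, and hyper-parameters are illustrative assumptions, and the tabular agent stands in for the deep variants covered in the paper.

```python
# Minimal, hypothetical sketch of simulation-based RL for UAV waypoint navigation.
# The 2D grid world, rewards, and hyper-parameters are illustrative assumptions only.
import random

GRID = 5                                        # 5x5 simulated flight area
GOAL = (4, 4)                                   # target waypoint
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # four planar moves

def step(state, action):
    x, y = state
    dx, dy = ACTIONS[action]
    nx = min(max(x + dx, 0), GRID - 1)
    ny = min(max(y + dy, 0), GRID - 1)
    reward = 10.0 if (nx, ny) == GOAL else -1.0  # goal bonus vs. per-step cost
    return (nx, ny), reward, (nx, ny) == GOAL

Q = {}                                          # tabular value estimates
alpha, gamma, eps = 0.1, 0.95, 0.2

for episode in range(500):
    state, done = (0, 0), False
    while not done:
        if random.random() < eps:                              # epsilon-greedy exploration
            action = random.randrange(len(ACTIONS))
        else:
            action = max(range(len(ACTIONS)), key=lambda a: Q.get((state, a), 0.0))
        next_state, reward, done = step(state, action)
        best_next = max(Q.get((next_state, a), 0.0) for a in range(len(ACTIONS)))
        q = Q.get((state, action), 0.0)
        Q[(state, action)] = q + alpha * (reward + gamma * best_next - q)  # TD update
        state = next_state

best_start = max(range(len(ACTIONS)), key=lambda a: Q.get(((0, 0), a), 0.0))
print("greedy action at start:", ACTIONS[best_start])
```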

2021 ◽  
Vol 10 (2) ◽  
pp. 54-63
Author(s):  
Cezar POPA ◽  
Ion MITULEȚU

With the diversification of risks and threats in the multidimensional operational environment, in variable-geometry conflicts, state-of-the-art technology must be used at all times in the architecture of command and control systems. This will ensure optimal response conditions both at the planning level and at the level of the execution of the military operation/action. Real-time communication, horizontally and vertically, between tactical-level forces, operational- and strategic-level command and support structures, and other institutions with security and defence responsibilities can only be ensured and protected by using advanced technologies. The training of human resources for efficient use of the equipment must not be neglected, nor should the algorithms and processes that support efficient decision-making, in line with technical, technological, and artificial-intelligence developments. Keywords: command and control; artificial intelligence; efficiency; technologies; information technologies.


2021 ◽  
Author(s):  
Zhenhui Ye

In this paper, we aim to design a deep reinforcement learning (DRL) based control solution for navigating a swarm of unmanned aerial vehicles (UAVs) to fly around an unexplored target area while providing optimal communication coverage for the ground mobile users. In contrast to existing DRL-based solutions, which mainly solve the problem with global observation and centralized training, a practical and efficient Decentralized Training and Decentralized Execution (DTDE) framework is desirable for training and deploying each UAV in a distributed manner. To this end, we propose a novel DRL approach named Deep Recurrent Graph Network (DRGN), which uses a Graph Attention Network-based Flying Ad-hoc Network (GAT-FANET) to achieve inter-UAV communication and a Gated Recurrent Unit (GRU) to record historical information. We conducted extensive experiments to define an appropriate structure for GAT-FANET and to examine the performance of DRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and four heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of DRGN.
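A minimal sketch of the architecture described above, assuming PyTorch and illustrative layer sizes: attention restricted to the UAV communication graph plays the role of GAT-FANET, and a GRU cell keeps per-UAV history. The class name and sizes are assumptions, not the authors' implementation.

```python
# Illustrative DRGN-style network: adjacency-masked attention over the FANET graph,
# then a GRU memory per UAV. Layer sizes and the masking scheme are assumptions.
import torch
import torch.nn as nn

class DRGNSketch(nn.Module):
    def __init__(self, obs_dim=16, hidden=64, n_actions=5, heads=4):
        super().__init__()
        self.encode = nn.Linear(obs_dim, hidden)              # per-UAV observation encoder
        self.gat = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.gru = nn.GRUCell(hidden, hidden)                 # per-UAV recurrent memory
        self.policy = nn.Linear(hidden, n_actions)            # action logits

    def forward(self, obs, adjacency, memory):
        # obs: (n_uav, obs_dim); adjacency: (n_uav, n_uav) bool; memory: (n_uav, hidden)
        h = torch.relu(self.encode(obs)).unsqueeze(0)         # (1, n_uav, hidden)
        mask = ~adjacency                                     # True = no communication link
        h, _ = self.gat(h, h, h, attn_mask=mask)              # attend only to FANET neighbours
        memory = self.gru(h.squeeze(0), memory)               # fold messages into history
        return self.policy(memory), memory

n_uav = 4
model = DRGNSketch()
obs = torch.randn(n_uav, 16)
adj = torch.ones(n_uav, n_uav, dtype=torch.bool)              # fully connected FANET for the demo
memory = torch.zeros(n_uav, 64)
logits, memory = model(obs, adj, memory)
print(logits.shape)                                           # torch.Size([4, 5])
```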


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Canghong Jin ◽  
Yuli Zhou ◽  
Shengyu Ying ◽  
Chi Zhang ◽  
Weisong Wang ◽  
...  

In recent decades, more teachers are using question generators to provide students with online homework. Learning-to-rank (LTR) methods can partially rank questions to address the needs of individual students and reduce their study burden. Unfortunately, ranking questions for students is not trivial because of three main challenges: (1) discovering students’ latent knowledge and cognitive level is difficult, (2) the content of quizzes can be totally different but the knowledge points of these quizzes may be inherently related, and (3) ranking models based on supervised, semisupervised, or reinforcement learning focus on the current assignment without considering past performance. In this work, we propose KFRank, a knowledge-fusion ranking model based on reinforcement learning, which considers both a student’s assignment history and the relevance of quizzes to their knowledge points. First, we load students’ assignment history, reorganize it using knowledge points, and calculate the effective features for ranking in terms of the relation between a student’s cognitive knowledge state and the question. Then, a similarity estimator is built to choose historical questions, and an attention neural network is used to calculate the attention value and update the current study state with knowledge fusion. Finally, a rank algorithm based on a Markov decision process is used to optimize the parameters. Extensive experiments were conducted on a real-life dataset spanning a year, and we compared our model with state-of-the-art ranking models (e.g., ListNET and LambdaMART) and reinforcement-learning methods (such as MDPRank). Based on top-k nDCG values, our model outperforms other methods for groups of average and weak students, whose study abilities are relatively poor and whose behaviors are therefore more difficult to predict.
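The knowledge-fusion step described above could look roughly like the following sketch, in which attention over a student's previously answered questions updates the current study state. The embedding sizes, softmax attention, and mixing rule are assumptions, not the paper's exact model.

```python
# Illustrative knowledge-fusion update: attention over historical questions is used
# to refresh the student's study state before ranking the current question.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_knowledge(current_q, history_q, state, mix=0.5):
    """current_q: (d,) embedding of the question being ranked;
       history_q: (n, d) embeddings of previously answered questions;
       state:     (d,) the student's current study state."""
    scores = history_q @ current_q            # similarity of past questions to the current one
    attn = softmax(scores)                    # attention over the assignment history
    context = attn @ history_q                # knowledge-fused summary of relevant history
    return (1 - mix) * state + mix * context  # updated study state

rng = np.random.default_rng(0)
state = fuse_knowledge(rng.normal(size=8), rng.normal(size=(5, 8)), np.zeros(8))
print(state.shape)                            # (8,)
```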


2009 ◽  
Vol 129 (4) ◽  
pp. 363-367
Author(s):  
Tomoyuki Maeda ◽  
Makishi Nakayama ◽  
Hiroshi Narazaki ◽  
Akira Kitamura

2018 ◽  
Vol 28 (6) ◽  
pp. 1887-1891
Author(s):  
Todor Kalinov

Management and Command are two different words and terms, but military structures use them as synonyms. Military commanders’ authorities are almost equal in meaning to civilian managers’ privileges and power. A comparison between the military command and civilian management systems’ structure, organization, and way of working shows almost full identity and overlap. The largest military systems are national ministries of defence and multinational military alliances and coalitions. Military systems at this level combine military command structures with civilian political leadership and support elements. Therefore, they incorporate both military command and civilian management organizations without any complications, because their nature originated from the same source and they have a similar framework and content. Management of organizations requires communication in order to plan, coordinate, lead, control, and conduct all routine or extraordinary activities. Immediate long-distance communication originated with telegraphy, first applied in the 19th century. Later, long-distance communications included telephony, radio transmission, satellite links, and, last but not least, internet data exchange. They allowed the immediate exchange of letters, voice, and images, bringing new capabilities to managers. Their sophisticated technical base gave rise to a new area within military command and civilian management structures. This area covered the technical and operational parts of communications and created an engineering sub-field of science that has become one of the most popular fields of education worldwide. Communications were separated from military command and moved to a distinct field, named Computers and Communications. A historic overview and analysis of command and management structures and requirements shows their relationships, common origin, and mission. They have significant differences: management and control are based on the humanities and the natural and social sciences, while communications are mainly based on engineering and technology. These differences do not create sufficient grounds for detaching communications from the management structures. They exist together in symbiosis, and management structures need communications in order to exist and to multiply their effectiveness and efficiency. A future separation of military command from communications would bring risks of worse coordination, a need for more human resources, and worse end states. These risks are extremely negative for nations and should be avoided through the wide application of education and science among today’s and future leaders, managers, and commanders.


Author(s):  
Ivan Herreros

This chapter discusses basic concepts from control theory and machine learning to facilitate a formal understanding of animal learning and motor control. It first distinguishes between feedback and feed-forward control strategies, and later introduces the classification of machine learning applications into supervised, unsupervised, and reinforcement learning problems. Next, it links these concepts with their counterparts in the domain of the psychology of animal learning, highlighting the analogies between supervised learning and classical conditioning, between reinforcement learning and operant conditioning, and between unsupervised and perceptual learning. Additionally, it interprets innate and acquired actions from the standpoint of feedback vs. anticipatory and adaptive control. Finally, it argues how this framework of translating knowledge between formal and biological disciplines can serve not only to structure and advance our understanding of brain function but also to enrich engineering solutions at the level of robot learning and control with insights coming from biology.
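A toy contrast between the feedback and feed-forward strategies discussed in the chapter, on a one-dimensional tracking task with an unmodeled disturbance; the plant model, gain, and disturbance are illustrative assumptions only.

```python
# Feedback reacts to the measured error; feed-forward issues a pre-computed command
# from a plant model and cannot compensate an unmodeled disturbance.
def plant(position, command, disturbance=0.0):
    return position + command + disturbance          # simple additive dynamics

target = 10.0

pos_fb = 0.0
for _ in range(20):
    error = target - pos_fb                          # measure, then correct
    pos_fb = plant(pos_fb, 0.5 * error, disturbance=-0.2)

pos_ff = 0.0
for _ in range(20):
    pos_ff = plant(pos_ff, target / 20, disturbance=-0.2)  # pre-computed open-loop command

print(round(pos_fb, 2), round(pos_ff, 2))            # feedback ends near the target; feed-forward falls short
```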


2021 ◽  
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Giorgia Ramponi ◽  
Andrea Tirinzoni ◽  
Matteo Giuliani ◽  
...  

In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging, as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and present three application scenarios: (1) the high-level decision-making problem in the highway driving scenario, (2) inferring the user preferences in a social network (Twitter), and (3) the management of the water release in Lake Como. For each of these scenarios, we provide a formalization, experiments, and a discussion to interpret the obtained results.
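As a simplified, hypothetical stand-in for the batch IRL setting described above, the sketch below recovers a linear reward direction from fixed feature expectations alone, with no further environment interaction. It is a max-margin-style illustration under assumed feature vectors, not the authors' truly batch model-free algorithm.

```python
# Toy batch IRL with a linear reward: choose weights under which the expert's average
# features score higher than every candidate policy's, using only the fixed dataset.
import numpy as np

rng = np.random.default_rng(1)
mu_expert = np.array([1.0, 0.2, -0.5])          # expert's average feature counts (assumed)
mu_candidates = rng.normal(size=(5, 3))         # feature counts of non-expert candidate policies

w = np.zeros(3)
for _ in range(50):
    margins = mu_expert @ w - mu_candidates @ w  # how much the expert outscores each candidate
    worst = mu_candidates[np.argmin(margins)]    # most "confusing" candidate under current w
    w += 0.1 * (mu_expert - worst)               # enlarge the margin against that candidate
    w /= max(np.linalg.norm(w), 1.0)             # keep the reward weights bounded

print(np.round(w, 2))                            # recovered reward direction (illustrative)
```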


Metals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 870
Author(s):  
Robby Neven ◽  
Toon Goedemé

Automating sheet steel visual inspection can improve quality and reduce costs during its production. While many manufacturers still rely on manual or traditional inspection methods, deep learning-based approaches have proven their efficiency. In this paper, we go beyond the state of the art in this domain by proposing a multi-task model that performs both pixel-based defect segmentation and severity estimation of the defects in one two-branch network. Additionally, we show how incorporating the production process parameters improves the model’s performance. After manually constructing a real-life industrial dataset, we first implemented and trained two single-task models performing the defect segmentation and severity estimation tasks separately. Next, we compared this to a multi-task model that performs the two tasks simultaneously. By combining the tasks into one model, the two tasks improved by 2.5% and 3% mIoU, respectively. In the next step, we extended the multi-task model using sensor fusion with process parameters. We demonstrate that incorporating the process parameters resulted in a further mIoU increase of 6.8% and 2.9% for the defect segmentation and severity estimation tasks, respectively.
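A minimal sketch of the two-branch, sensor-fusion idea described above, assuming PyTorch: a shared encoder feeds a defect-segmentation head and a severity-estimation head, with the process parameters broadcast and concatenated into the shared features. Channel counts and the fusion-by-concatenation scheme are illustrative assumptions, not the paper's exact network.

```python
# Two-branch multi-task sketch with process-parameter fusion; sizes are illustrative.
import torch
import torch.nn as nn

class TwoBranchNet(nn.Module):
    def __init__(self, n_params=4, n_defects=3, n_severity=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.param_proj = nn.Linear(n_params, 16)              # embed process parameters
        self.seg_head = nn.Conv2d(32, n_defects, 1)            # pixel-wise defect classes
        self.sev_head = nn.Conv2d(32, n_severity, 1)           # pixel-wise severity levels

    def forward(self, image, params):
        feats = self.encoder(image)                            # (B, 16, H, W)
        p = self.param_proj(params)                            # (B, 16)
        p = p[:, :, None, None].expand(-1, -1, *feats.shape[2:])
        fused = torch.cat([feats, p], dim=1)                   # sensor fusion by concatenation
        return self.seg_head(fused), self.sev_head(fused)

model = TwoBranchNet()
seg, sev = model(torch.randn(2, 1, 64, 64), torch.randn(2, 4))
print(seg.shape, sev.shape)   # torch.Size([2, 3, 64, 64]) torch.Size([2, 4, 64, 64])
```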

