Multi-UAV Navigation for Partially Observable Communication Coverage by Graph Reinforcement Learning

Author(s):  
Zhenhui Ye

<div>In this paper, we aim to design a deep reinforcement learning (DRL)-based control solution that navigates a swarm of unmanned aerial vehicles (UAVs) to fly around an unexplored target area and provide optimal communication coverage for the ground mobile users. In contrast to existing DRL-based solutions, which mainly solve the problem with global observation and centralized training, a practical and efficient Decentralized Training and Decentralized Execution (DTDE) framework is desirable for training and deploying each UAV in a distributed manner. To this end, we propose a novel DRL approach named Deep Recurrent Graph Network (DRGN), which uses a Graph Attention Network-based Flying Ad-hoc Network (GAT-FANET) to achieve inter-UAV communication and a Gated Recurrent Unit (GRU) to record historical information. We conducted extensive experiments to determine an appropriate structure for GAT-FANET and examine the performance of DRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and four heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of DRGN.</div>
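The architectural recipe the abstract describes, attention-weighted aggregation over the FANET graph followed by a recurrent unit that carries history, can be sketched roughly as below. This is a simplified NumPy illustration under assumed toy dimensions, not the authors' implementation; all weight names (`W`, `a`, `Wz`, `Wr`, `Wh`) and sizes are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(h, adj, W, a):
    """One simplified graph-attention layer: each UAV aggregates
    neighbor features weighted by learned attention scores."""
    z = h @ W                          # projected features, shape (N, F_out)
    out = np.zeros_like(z)
    for i in range(len(h)):
        nbrs = [j for j in range(len(h)) if adj[i, j]]
        scores = np.array([a @ np.concatenate([z[i], z[j]]) for j in nbrs])
        alpha = softmax(scores)        # attention over i's FANET neighbors
        out[i] = np.tanh(sum(w * z[j] for w, j in zip(alpha, nbrs)))
    return out

def gru_cell(x, hprev, Wz, Wr, Wh):
    """Minimal GRU cell recording one UAV's history."""
    xc = np.concatenate([x, hprev])
    zt = 1.0 / (1.0 + np.exp(-(Wz @ xc)))                 # update gate
    rt = 1.0 / (1.0 + np.exp(-(Wr @ xc)))                 # reset gate
    hhat = np.tanh(Wh @ np.concatenate([x, rt * hprev]))  # candidate state
    return (1.0 - zt) * hprev + zt * hhat

rng = np.random.default_rng(0)
N, F, Fp = 4, 6, 5                     # 4 UAVs, toy feature sizes
h = rng.normal(size=(N, F))            # per-UAV observations
adj = np.ones((N, N), dtype=bool)      # fully connected FANET for the demo
W = rng.normal(size=(F, Fp))
a = rng.normal(size=2 * Fp)
msg = gat_layer(h, adj, W, a)          # inter-UAV message passing

Wz = rng.normal(size=(Fp, 2 * Fp))
Wr = rng.normal(size=(Fp, 2 * Fp))
Wh = rng.normal(size=(Fp, 2 * Fp))
hidden = gru_cell(msg[0], np.zeros(Fp), Wz, Wr, Wh)  # UAV 0's memory update
print(msg.shape, hidden.shape)         # (4, 5) (5,)
```

In the actual model the GRU output would feed a policy head; here it simply shows how the attention output becomes the recurrent input.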

2021


Electronics, 2021, Vol. 10(9), p. 999
Author(s):  
Ahmad Taher Azar
Anis Koubaa
Nada Ali Mohamed
Habiba A. Ibrahim
Zahra Fathy Ibrahim
...  

Unmanned Aerial Vehicles (UAVs) are increasingly being used in many challenging and diversified applications, in both the civilian and military fields: infrastructure inspection, traffic patrolling, remote sensing, mapping, surveillance, rescuing humans and animals, environment monitoring, and Intelligence, Surveillance, Target Acquisition, and Reconnaissance (ISTAR) operations, to name a few. However, the use of UAVs in these applications requires a substantial level of autonomy. In other words, UAVs should have the ability to accomplish planned missions in unexpected situations without requiring human intervention. To ensure this level of autonomy, many artificial intelligence algorithms have been designed, targeting the guidance, navigation, and control (GNC) of UAVs. In this paper, we describe the state of the art of one subset of these algorithms: deep reinforcement learning (DRL) techniques. We provide a detailed description of these methods and deduce the current limitations in this area. We note that most of these DRL methods were designed to ensure stable and smooth UAV navigation by training in computer-simulated environments, and we conclude that further research efforts are needed to address the challenges that restrain their deployment in real-life scenarios.


Author(s):  
Rahul Desai
B P Patil

<p class="Abstract">This paper describes and evaluates the performance of various reinforcement learning algorithms against the shortest path algorithms that are widely used for routing packets through a network. Shortest path routing is the simplest policy: packets are routed along the path with the minimum number of hops. In high-traffic or high-mobility conditions, however, the shortest path gets flooded with a huge number of packets and congestion occurs, so the nominally shortest path no longer delivers packets fastest and the delay for packets to reach the destination increases. Reinforcement learning algorithms are adaptive: the path is selected based on the traffic present in the network in real time, so they aim to minimize the time for packets to reach the destination. Analysis on a 6-by-6 irregular grid and a sample ad hoc network shows that the performance parameters used for judging the network, packet delivery ratio and delay, achieve optimum results with the reinforcement learning algorithms.</p>
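The adaptive-routing idea described here, each node learning delivery-time estimates from locally observed delays, is classically captured by the Q-routing update of Boyan and Littman. The sketch below is a toy illustration of that update on a hypothetical three-node line network, not necessarily the exact algorithm evaluated in the paper; the topology, delays, and learning rate are all made up.

```python
def q_routing_update(Q, x, y, dest, delay, eta=0.5):
    """Q-routing style update: node x revises its estimated delivery
    time to `dest` via neighbor y, after observing the actual hop
    delay plus y's best remaining estimate."""
    best_from_y = min(Q[y][dest].values()) if Q[y][dest] else 0.0
    old = Q[x][dest][y]
    Q[x][dest][y] = old + eta * (delay + best_from_y - old)

# toy line network 0 -- 1 -- 2, routing packets toward destination node 2
neighbors = {0: [1], 1: [0, 2], 2: [1]}
Q = {x: {2: {y: 10.0 for y in neighbors[x]}} for x in neighbors}
Q[2] = {2: {}}   # at the destination, zero remaining delay

for _ in range(50):
    # a packet hops 0 -> 1 -> 2; each hop is observed to take 1 time unit
    q_routing_update(Q, 1, 2, 2, delay=1.0)
    q_routing_update(Q, 0, 1, 2, delay=1.0)

# estimates converge to the true remaining delays: 1 hop and 2 hops
print(round(Q[1][2][2], 2), round(Q[0][2][1], 2))  # 1.0 2.0
```

Under congestion, the observed `delay` for the shortest path would grow, pushing its Q-value above that of a longer but less loaded route, which is exactly the adaptivity the abstract credits to reinforcement learning.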


Author(s):  
John Aslanides
Jan Leike
Marcus Hutter

Many state-of-the-art reinforcement learning (RL) algorithms assume that the environment is an ergodic Markov Decision Process (MDP). In contrast, the field of universal reinforcement learning (URL) is concerned with algorithms that make as few assumptions as possible about the environment. The universal Bayesian agent AIXI and a family of related URL algorithms have been developed in this setting. While numerous theoretical optimality results have been proven for these agents, there has been no empirical investigation of their behavior to date. We present a short and accessible survey of these URL algorithms under a unified notation and framework, along with results of experiments that qualitatively illustrate properties of the resulting policies and their relative performance on partially observable gridworld environments. We also present an open-source reference implementation of the algorithms, which we hope will facilitate further understanding of, and experimentation with, these ideas.
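At the heart of the universal Bayesian agent is a mixture over candidate environment models, with posterior weights updated from observations. The minimal sketch below illustrates only that Bayesian-mixture core on a toy bit-prediction task; the three coin models and their names are invented for the example and bear no relation to the paper's environment classes.

```python
# three hypothetical environment models, each giving P(next bit = 1)
models = {"biased_0.9": 0.9, "fair": 0.5, "biased_0.1": 0.1}
post = {name: 1.0 / len(models) for name in models}  # uniform prior

observations = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
for bit in observations:
    # Bayes update: reweight each model by its likelihood of the observed bit
    for name, p1 in models.items():
        post[name] *= p1 if bit == 1 else (1.0 - p1)
    z = sum(post.values())
    post = {n: w / z for n, w in post.items()}       # renormalize

# the mixture predicts the next bit as the posterior-weighted average
p_next = sum(post[n] * models[n] for n in models)
print(max(post, key=post.get))  # biased_0.9 dominates after mostly-1 data
```

AIXI-style agents extend this idea from passive prediction to acting: they plan against the mixture, which is what makes exact computation intractable and approximations necessary.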


2021, Vol. 15
Author(s):  
Philipp Weidel
Renato Duarte
Abigail Morrison

Reinforcement learning is a paradigm that can account for how organisms learn to adapt their behavior in complex environments with sparse rewards. To partition an environment into discrete states, implementations in spiking neuronal networks typically rely on input architectures involving place cells or receptive fields specified ad hoc by the researcher. This is problematic as a model for how an organism can learn appropriate behavioral sequences in unknown environments, as it fails to account for the unsupervised and self-organized nature of the required representations. Additionally, this approach presupposes knowledge on the part of the researcher on how the environment should be partitioned and represented and scales poorly with the size or complexity of the environment. To address these issues and gain insights into how the brain generates its own task-relevant mappings, we propose a learning architecture that combines unsupervised learning on the input projections with biologically motivated clustered connectivity within the representation layer. This combination allows input features to be mapped to clusters; thus the network self-organizes to produce clearly distinguishable activity patterns that can serve as the basis for reinforcement learning on the output projections. On the basis of the MNIST and Mountain Car tasks, we show that our proposed model performs better than either a comparable unclustered network or a clustered network with static input projections. We conclude that the combination of unsupervised learning and clustered connectivity provides a generic representational substrate suitable for further computation.
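The two-stage idea above, an unsupervised stage that self-organizes raw inputs into discrete representations, followed by reinforcement learning on those representations, can be illustrated in a rate-based, non-spiking analogue. The sketch below uses online k-means as a stand-in for the paper's self-organizing spiking layer and tabular Q-learning as the output stage; the toy 1-D environment, cluster count, and learning rates are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- unsupervised stage: online k-means self-organizes raw inputs
#     into discrete "cluster states" (no hand-specified place cells) ---
def nearest(centroids, x):
    return int(np.argmin(((centroids - x) ** 2).sum(axis=1)))

def kmeans_step(centroids, x, lr=0.05):
    k = nearest(centroids, x)
    centroids[k] += lr * (x - centroids[k])  # pull winning centroid toward input
    return k

# toy 1-D environment: position in [0, 1), reward for crossing 0.9
centroids = rng.uniform(0.0, 1.0, size=(5, 1))
Qtab = np.zeros((5, 2))                      # Q over cluster-states x {left, right}

pos = 0.5
for _ in range(5000):
    s = kmeans_step(centroids, np.array([pos]))
    a = int(rng.integers(0, 2))              # random behavior policy (off-policy)
    new_pos = min(max(pos + (0.05 if a else -0.05), 0.0), 0.999)
    r = 1.0 if new_pos > 0.9 else 0.0
    s2 = nearest(centroids, np.array([new_pos]))
    # --- reinforcement stage: tabular Q-learning on the cluster states ---
    Qtab[s, a] += 0.2 * (r + 0.9 * Qtab[s2].max() - Qtab[s, a])
    pos = 0.5 if r > 0.0 else new_pos        # reset after reaching the goal

print(Qtab.max())                            # positive: value was learned
```

The point of the sketch is the division of labor: the researcher never specifies how to partition the position axis; the clustering discovers a discretization, and the RL stage only ever sees cluster indices.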


Author(s):  
Penghui Wei
Wenji Mao
Guandan Chen

Analyzing public attitudes plays an important role in opinion mining systems. Stance detection aims to determine from a text whether its author is in favor of, against, or neutral towards a given target. One challenge of this task is that a text may not explicitly express an attitude towards the target, yet existing approaches utilize target content alone to build models. Moreover, although weakly supervised approaches have been proposed to ease the burden of manually annotating large-scale training data, such approaches are confronted with the problem of noisy labels. To address these two issues, in this paper we propose a Topic-Aware Reinforced Model (TARM) for weakly supervised stance detection. Our model consists of two complementary components: (1) a detection network that incorporates target-related topic information into representation learning to identify stance effectively; (2) a policy network that learns to eliminate noisy instances from auto-labeled data based on off-policy reinforcement learning. The two networks are alternately optimized to improve each other's performance. Experimental results demonstrate that our proposed model TARM outperforms state-of-the-art approaches.


Author(s):  
Rahul Desai
B.P. Patil

This paper describes and evaluates the performance of various reinforcement learning algorithms against the shortest path algorithms that are widely used for routing packets throughout a network. Shortest path routing is the simplest policy: packets are routed along the path with the minimum number of hops. In high-traffic or high-mobility conditions, however, the shortest path gets flooded with a huge number of packets and congestion occurs, so the nominally shortest path no longer delivers packets fastest and the delay for packets to reach the destination increases. Reinforcement learning algorithms are adaptive: the path is selected based on the traffic present in the network in real time, so they aim to minimize the time for packets to reach the destination. Analysis on a 6-by-6 irregular grid and a sample ad hoc network shows that the performance parameters used for judging the network, such as packet delivery ratio and delay, achieve optimum results with the reinforcement learning algorithms.


Author(s):  
Srilakshmi R.
Jaya Bhaskar M.

The mobile ad-hoc network (MANET) is a trending field in the smart digital world and is widely used for communication sharing. Beyond communication, it offers numerous advanced capabilities, much like a personal computer. However, packet drops and a low throughput ratio remain serious issues. Several algorithms increase the throughput ratio by developing multipath routing, but in some cases multipath routing ends in routing overhead and takes more time to transfer the data because of the data load on a single path. To address this problem, this research develops a novel temporary ordered route energy migration (TOREM) scheme. Here, the migration approach balances the data load equally and enhances the communication channel, while the reference-node creation strategy reduces the routing overhead and the packet drop ratio. Finally, the proposed model is validated against recent existing works and achieves better results, minimizing packet drops and maximizing the throughput ratio.

