A Deep Reinforcement Learning Framework for UAV Navigation in Indoor Environments

Abstract. Autonomous exploration and mapping is one of the open challenges of robotics and artificial intelligence. Especially when the environment is unknown, choosing the optimal navigation directive is not straightforward. In this paper, we propose a reinforcement learning framework for navigating, exploring, and mapping unknown environments. The reinforcement learning agent is in charge of selecting the commands for steering the mobile robot, while a SLAM algorithm estimates the robot pose and maps the environment. To select optimal actions, the agent is trained to be curious about the world. This concept translates into a curiosity-driven reward function that encourages the agent to steer the mobile robot towards unknown and unseen areas of the world and the map. We test our approach in exploration challenges in different indoor environments. The agent trained with the proposed reward function outperforms agents trained with reward functions commonly used in the literature for such tasks.
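
The abstract does not give the form of the curiosity-driven reward, so the following is only a minimal sketch of one way such a reward could be computed from a SLAM occupancy grid. The `UNKNOWN` marker, the function name `curiosity_reward`, and the choice to reward the fraction of newly observed cells are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

UNKNOWN = -1  # assumed marker for map cells that have not been observed yet


def curiosity_reward(prev_map: np.ndarray, new_map: np.ndarray, scale: float = 1.0) -> float:
    """Hypothetical reward: fraction of cells that changed from unknown to known."""
    newly_known = (prev_map == UNKNOWN) & (new_map != UNKNOWN)
    return scale * float(newly_known.sum()) / prev_map.size


# Toy 4x4 map: everything unknown, then two cells are observed as free (0).
prev_map = np.full((4, 4), UNKNOWN)
new_map = prev_map.copy()
new_map[0, :2] = 0
reward = curiosity_reward(prev_map, new_map)
print(reward)
```

Under these assumptions the reward is zero when a step reveals nothing new, so an agent maximizing it is pushed towards unmapped areas.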

Hierarchical Reinforcement Learning Framework for Secure UAV Communication in the Presence of Multiple UAV Adaptive Eavesdroppers

2020 IEEE 6th International Conference on Computer and Communications (ICCC) ◽

10.1109/iccc51575.2020.9344970 ◽

2020 ◽

Author(s):

Liu Jue ◽

Yang Weiwei

Keyword(s):

Reinforcement Learning ◽

Hierarchical Reinforcement Learning ◽

Learning Framework

A Reinforcement Learning Framework for Spiking Networks with Dynamic Synapses

Computational Intelligence and Neuroscience ◽

10.1155/2011/869348 ◽

2011 ◽

Vol 2011 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Karim El-Laithy ◽

Martin Bogdan

Keyword(s):

Reinforcement Learning ◽

Spike Timing ◽

Neural Representation ◽

Model Parameters ◽

Learning Framework ◽

Reference Target ◽

Wide Range ◽

Spiking Network ◽

Dynamic Synapses ◽

Exclusive Or

An integration of Hebbian-based and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters regulating the synaptic response rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested to learn the exclusive-OR computation on a temporally coded basis. Reward values are calculated from the distance between the output spike train of the network and a reference target one. Results show that the network is able to capture the required dynamics and that the proposed framework can indeed reveal an integrated version of Hebbian learning and RL. The proposed framework is tractable and less computationally expensive. The framework is applicable to a wide class of synaptic models and is not restricted to the neural representation used. This generality, along with the reported results, supports adopting the introduced approach to benefit from the biologically plausible synaptic models in a wide range of intuitive signal processing.
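
As a rough illustration of a distance-based reward and of the value and sign of its trial-to-trial difference, the sketch below uses binned binary spike trains and a Euclidean distance. Both choices, and the names `spike_train_reward` and `reward_difference`, are assumptions for this example. The abstract does not state which distance measure or update rule the authors use.

```python
import numpy as np


def spike_train_reward(output, target) -> float:
    """Assumed reward: negative Euclidean distance between binned spike trains."""
    output = np.asarray(output, dtype=float)
    target = np.asarray(target, dtype=float)
    return -float(np.linalg.norm(output - target))


def reward_difference(prev_reward: float, reward: float):
    """Return the value and the sign of the change in reward between two trials."""
    delta = reward - prev_reward
    return delta, float(np.sign(delta))


target = [0, 1, 0, 0, 1]                            # hypothetical reference spike train
r1 = spike_train_reward([1, 0, 0, 0, 1], target)    # two bins differ from the target
r2 = spike_train_reward([0, 1, 0, 0, 1], target)    # exact match with the target
delta, sign = reward_difference(r1, r2)
print(r1, r2, delta, sign)
```

A positive sign here means the second trial moved the output closer to the target, which is the kind of signal a parameter update could be gated on.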

Drone Deep Reinforcement Learning: A Review

Electronics ◽

10.3390/electronics10090999 ◽

2021 ◽

Vol 10 (9) ◽

pp. 999

Author(s):

Ahmad Taher Azar ◽

Anis Koubaa ◽

Nada Ali Mohamed ◽

Habiba A. Ibrahim ◽

Zahra Fathy Ibrahim ◽

...

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Real Life ◽

Environment Monitoring ◽

Simulated Environments ◽

Infrastructure Inspection ◽

Remote Sensing Mapping ◽

And Control ◽

The Military ◽

UAV Navigation

Unmanned Aerial Vehicles (UAVs) are increasingly being used in many challenging and diversified applications. These applications belong to the civilian and the military fields. Examples include infrastructure inspection, traffic patrolling, remote sensing, mapping, surveillance, rescuing humans and animals, environment monitoring, and Intelligence, Surveillance, Target Acquisition, and Reconnaissance (ISTAR) operations. However, the use of UAVs in these applications needs a substantial level of autonomy. In other words, UAVs should have the ability to accomplish planned missions in unexpected situations without requiring human intervention. To ensure this level of autonomy, many artificial intelligence algorithms have been designed. These algorithms target the guidance, navigation, and control (GNC) of UAVs. In this paper, we describe the state of the art of one subset of these algorithms: the deep reinforcement learning (DRL) techniques. We describe them in detail and deduce the current limitations in this area. We note that most of these DRL methods were designed to ensure stable and smooth UAV navigation by training in computer-simulated environments. We conclude that further research efforts are needed to address the challenges that restrain their deployment in real-life scenarios.

Adaptive Reinforcement Learning Framework for NOMA-UAV Networks

IEEE Communications Letters ◽

10.1109/lcomm.2021.3093385 ◽

2021 ◽

pp. 1-1

Author(s):

Syed Khurram Mahmud ◽

Yuanwei Liu ◽

Yue Chen ◽

Kok Keong Chai

Keyword(s):

Reinforcement Learning ◽

Learning Framework

A Dual-Critic Reinforcement Learning Framework for Frame-Level Bit Allocation in HEVC/H.265

2021 Data Compression Conference (DCC) ◽

10.1109/dcc50243.2021.00009 ◽

2021 ◽

Author(s):

Yung-Han Ho ◽

Guo-Lun Jin ◽

Yun Liang ◽

Wen-Hsiao Peng ◽

Xiaobo Li

Keyword(s):

Reinforcement Learning ◽

Bit Allocation ◽

Learning Framework

An efficient UAV navigation solution for confined but partially known indoor environments

11th IEEE International Conference on Control & Automation (ICCA) ◽

10.1109/icca.2014.6871120 ◽

2014 ◽

Cited By ~ 9

Author(s):

Fei Wang ◽

Kangli Wang ◽

Shupeng Lai ◽

Swee King Phang ◽

Ben M. Chen ◽

...

Keyword(s):

Indoor Environments ◽

Navigation Solution ◽

UAV Navigation

A novel reinforcement learning framework for sensor subset selection

2010 International Conference on Networking, Sensing and Control (ICNSC) ◽

10.1109/icnsc.2010.5461532 ◽

2010 ◽

Cited By ~ 3

Author(s):

Omkar Tilak ◽

Snehasis Mukhopadhyay ◽

Mihran Tuceryan ◽

Rajeev Raje

Keyword(s):

Reinforcement Learning ◽

Subset Selection ◽

Learning Framework

A Reinforcement Learning Framework for Relevance Feedback

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval ◽

10.1145/3397271.3401099 ◽

2020 ◽

Cited By ~ 2

Author(s):

Ali Montazeralghaem ◽

Hamed Zamani ◽

James Allan

Keyword(s):

Reinforcement Learning ◽

Relevance Feedback ◽

Learning Framework

Predicting Human Mobility with Reinforcement-Learning-Based Long-Term Periodicity Modeling

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3469860 ◽

2021 ◽

Vol 12 (6) ◽

pp. 1-23

Author(s):

Shuo Tao ◽

Jingang Jiang ◽

Defu Lian ◽

Kai Zheng ◽

Enhong Chen

Keyword(s):

Reinforcement Learning ◽

Human Mobility ◽

Recurrent Network ◽

Mobility Prediction ◽

Learning Framework ◽

Temporal Features ◽

Wide Range ◽

Spatio Temporal ◽

Historical Trajectory

Mobility prediction plays an important role in a wide range of location-based applications and services. However, there are three problems in the existing literature: (1) explicit high-order interactions of spatio-temporal features are not systematically modeled; (2) most existing algorithms place attention mechanisms on top of recurrent networks, so they cannot allow for full parallelism and are inferior to self-attention for capturing long-range dependence; (3) most literature does not make good use of long-term historical information and does not effectively model the long-term periodicity of users. To this end, we propose MoveNet and RLMoveNet. MoveNet is a self-attention-based sequential model, predicting each user’s next destination based on her most recent visits and historical trajectory. MoveNet first introduces a cross-based learning framework for modeling feature interactions. With self-attention on both the most recent visits and historical trajectory, MoveNet can use an attention mechanism to capture the user’s long-term regularity in a more efficient way. Based on MoveNet, to model long-term periodicity more effectively, we add a reinforcement learning layer and name the resulting model RLMoveNet. RLMoveNet regards human mobility prediction as a reinforcement learning problem, using the reinforcement learning layer as a regularization part that drives the model to pay attention to behavior with periodic actions, which can help make the algorithm more effective. We evaluate both models on three real-world mobility datasets. MoveNet outperforms the state-of-the-art mobility predictor by around 10% in terms of accuracy, and simultaneously achieves faster convergence and over 4x training speedup. Moreover, RLMoveNet achieves higher prediction accuracy than MoveNet, which proves that modeling periodicity explicitly from the perspective of reinforcement learning is more effective.
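
For readers unfamiliar with the self-attention operation the abstract refers to, here is a generic single-head scaled dot-product self-attention sketch in NumPy. It is not MoveNet's architecture. The embedding width, the random projection matrices, and the name `self_attention` are assumptions chosen only to keep the example self-contained.

```python
import numpy as np


def self_attention(x, w_q, w_k, w_v):
    """Generic single-head scaled dot-product self-attention over a sequence x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights


rng = np.random.default_rng(0)
visits = rng.normal(size=(5, 8))  # 5 hypothetical visit embeddings of width 8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
context, weights = self_attention(visits, w_q, w_k, w_v)
print(context.shape, weights.shape)
```

Every position attends to every other position in a single matrix product, which is why this operation parallelizes across the sequence, unlike a recurrent network that processes visits one step at a time.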

CURIOSITY-DRIVEN REINFORCEMENT LEARNING AGENT FOR MAPPING UNKNOWN INDOOR ENVIRONMENTS
