Autonomous Surface Vessel Obstacle Avoidance Based on Hierarchical Reinforcement Learning

Abstract The obstacle avoidance problem of autonomous surface vessels (ASV) has attracted the attention of the marine control research community for long years. Out of safety consideration, it is important for ASV to avoid all kinds of obstacles like shores, cliffs, floaters and other vessels. Developing a heading and path planning strategy for ASV is the main task and the remaining challenge. Traditional obstacle avoidance algorithms lead to too much computing in working environment. The issue of computation cost can be solved by training obstacle avoidance models with reinforcement learning (RL). By using the RL method, the ASV will choose the most efficient action according to the ASV’s experience it learned from the past. In this paper, RL is adopted to design a decision-making agent for obstacle avoidance. To train the obstacle avoidance model under a sparse feedback environment, hierarchical reinforcement learning (HRL) method is applied. Using this algorithm, better obstacle avoidance performance and longer survival time can be achieved. Memory pool modification and target network modification are also used to smooth the training process of the ASV. Simulation results demonstrate that HRL can make the learning process of un-manned ship’s obstacle avoidance smoother and more effective. Also, the living time of ASVs is improved.

Download Full-text

Autonomous Surface Vessel Obstacle Avoidance Based on Hierarchical Reinforcement Learning With Potential Field Method

10.1115/1.0000710v ◽

2021 ◽

Author(s):

Chang Zhou ◽

Lei Wang ◽

Huacheng He ◽

Shangyu Yu

Keyword(s):

Reinforcement Learning ◽

Obstacle Avoidance ◽

Potential Field ◽

Field Method ◽

Hierarchical Reinforcement Learning ◽

Potential Field Method ◽

Surface Vessel

Download Full-text

Robot obstacle avoidance system using deep reinforcement learning

Industrial Robot the international journal of robotics research and application ◽

10.1108/ir-06-2021-0127 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Xiaojun Zhu ◽

Yinghao Liang ◽

Hanxu Sun ◽

Xueqian Wang ◽

Bin Ren

Keyword(s):

Reinforcement Learning ◽

Collision Avoidance ◽

Obstacle Avoidance ◽

Learning Algorithm ◽

Optimal Path ◽

Environmental Parameters ◽

Working Environment ◽

Content Type ◽

Practical Applications ◽

Human Operators

Purpose Most manufacturing plants choose the easy way of completely separating human operators from robots to prevent accidents, but as a result, it dramatically affects the overall quality and speed that is expected from human–robot collaboration. It is not an easy task to ensure human safety when he/she has entered a robot’s workspace, and the unstructured nature of those working environments makes it even harder. The purpose of this paper is to propose a real-time robot collision avoidance method to alleviate this problem. Design/methodology/approach In this paper, a model is trained to learn the direct control commands from the raw depth images through self-supervised reinforcement learning algorithm. To reduce the effect of sample inefficiency and safety during initial training, a virtual reality platform is used to simulate a natural working environment and generate obstacle avoidance data for training. To ensure a smooth transfer to a real robot, the automatic domain randomization technique is used to generate randomly distributed environmental parameters through the obstacle avoidance simulation of virtual robots in the virtual environment, contributing to better performance in the natural environment. Findings The method has been tested in both simulations with a real UR3 robot for several practical applications. The results of this paper indicate that the proposed approach can effectively make the robot safety-aware and learn how to divert its trajectory to avoid accidents with humans within the workspace. Research limitations/implications The method has been tested in both simulations with a real UR3 robot in several practical applications. The results indicate that the proposed approach can effectively make the robot be aware of safety and learn how to change its trajectory to avoid accidents with persons within the workspace. Originality/value This paper provides a novel collision avoidance framework that allows robots to work alongside human operators in unstructured and complex environments. The method uses end-to-end policy training to directly extract the optimal path from the visual inputs for the scene.

Download Full-text

Hierarchical Reinforcement Learning Considering Stochastic Wind Disturbance for Power Line Maintenance Robot

10.21203/rs.3.rs-783306/v1 ◽

2021 ◽

Author(s):

Xiaoliang Zheng ◽

Gongping Wu

Keyword(s):

Reinforcement Learning ◽

Obstacle Avoidance ◽

Power Line ◽

Gradient Algorithm ◽

State Function ◽

Wind Disturbance ◽

Local Approach ◽

Reward Function ◽

Hierarchical Reinforcement Learning ◽

Autonomous Operation

Abstract Robot intelligence includes motion intelligence and cognitive intelligence. Aiming at the motion intelligence, a hierarchical reinforcement learning architecture considering stochastic wind disturbance is proposed for the decision-making of the power line maintenance robot with autonomous operation. This architecture uses the prior information of the mechanism knowledge and empirical data to improve the safety and efficiency of the robot operation. In this architecture, the high-level policy selection and the low-level motion control at global and local levels are considered comprehensively under the condition of stochastic wind disturbance. Firstly, the operation task is decomposed into three sub-policies: global obstacle avoidance, local approach and local tightening, and each sub-policy is learned. Then, a master policy is learned to select the operation sub-policy in the current state. The dual deep Q network algorithm is used for the master policy, while the deep deterministic policy gradient algorithm is used for the operation policy. In order to improve the training efficiency, the global obstacle avoidance sub-policy takes the random forest composed of dynamic environmental decision tree as the expert algorithm for imitation learning. The architecture is applied to a power line maintenance scenario, the state function and reward function of each policy are designed, and all policies are trained in an asynchronous and parallel computing environment. It is proved that this architecture can realize stable and safe autonomous operating decision for the power line maintenance robot subjected to stochastic wind disturbance.

Download Full-text

Autonomous Surface Vessel Obstacle Avoidance Based on Hierarchical Reinforcement Learning With Potential Field Method

10.1115/1.0005152v ◽

2021 ◽

Author(s):

Chang Zhou ◽

Lei Wang ◽

Huacheng He ◽

Shangyu Yu

Keyword(s):

Reinforcement Learning ◽

Obstacle Avoidance ◽

Potential Field ◽

Field Method ◽

Hierarchical Reinforcement Learning ◽

Potential Field Method ◽

Surface Vessel

Download Full-text

Improving Student-System Interaction Through Data-driven Explanations of Hierarchical Reinforcement Learning Induced Pedagogical Policies

Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization ◽

10.1145/3340631.3394848 ◽

2020 ◽

Author(s):

Guojing Zhou ◽

Xi Yang ◽

Hamoon Azizsoltani ◽

Tiffany Barnes ◽

Min Chi

Keyword(s):

Reinforcement Learning ◽

Data Driven ◽

Hierarchical Reinforcement Learning

Download Full-text

Automatic Hierarchical Reinforcement Learning for Reusing Service Process Fragments

IEEE Access ◽

10.1109/access.2021.3054852 ◽

2021 ◽

Vol 9 ◽

pp. 20746-20759

Author(s):

Rong Yang ◽

Bing Li ◽

Zhengli Liu

Keyword(s):

Reinforcement Learning ◽

Service Process ◽

Hierarchical Reinforcement Learning ◽

Process Fragments

Download Full-text

Hierarchical Reinforcement Learning Framework for Secure UAV Communication in the Presence of Multiple UAV Adaptive Eavesdroppers

2020 IEEE 6th International Conference on Computer and Communications (ICCC) ◽

10.1109/iccc51575.2020.9344970 ◽

2020 ◽

Author(s):

Liu Jue ◽

Yang Weiwei

Keyword(s):

Reinforcement Learning ◽

Hierarchical Reinforcement Learning ◽

Learning Framework

Download Full-text

Training a simulated bat: Modeling sonar-based obstacle avoidance using deep-reinforcement learning

2020 IEEE Symposium Series on Computational Intelligence (SSCI) ◽

10.1109/ssci47803.2020.9308555 ◽

2020 ◽

Author(s):

Adithya Venkatesh Mohan ◽

Dieter Vanderelst

Keyword(s):

Reinforcement Learning ◽

Obstacle Avoidance

Download Full-text

Depth-based Obstacle Avoidance through Deep Reinforcement Learning

Proceedings of the 5th International Conference on Mechatronics and Robotics Engineering - ICMRE'19 ◽

10.1145/3314493.3314495 ◽

2019 ◽

Cited By ~ 1

Author(s):

Keyu Wu ◽

Mahdi Abolfazli Esfahani ◽

Shenghai Yuan ◽

Han Wang

Keyword(s):

Reinforcement Learning ◽

Obstacle Avoidance

Download Full-text

Hierarchical Reinforcement Learning

ACM Computing Surveys ◽

10.1145/3453160 ◽

2021 ◽

Vol 54 (5) ◽

pp. 1-35

Author(s):

Shubham Pateria ◽

Budhitama Subagdja ◽

Ah-hwee Tan ◽

Chai Quek

Keyword(s):

Reinforcement Learning ◽

Future Research ◽

Comprehensive Overview ◽

Open Problems ◽

Practical Applications ◽

Hierarchical Reinforcement Learning ◽

The Past ◽

Agent Learning ◽

Multi Agent ◽

Supplementary Material

Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. During the past years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate the future research in HRL. Furthermore, we outline a few suitable task domains for evaluating the HRL approaches and a few interesting examples of the practical applications of HRL in the Supplementary Material.

Download Full-text