Reinforcement Learning Based Obstacle Avoidance for Autonomous Underwater Vehicle

2019 ◽  
Vol 18 (2) ◽  
pp. 228-238 ◽  
Author(s):  
Prashant Bhopale ◽  
Faruk Kazi ◽  
Navdeep Singh


2016 ◽  
Vol 39 (8) ◽  
pp. 1236-1252 ◽  
Author(s):  
Basant Kumar Sahu ◽  
Bidyadhar Subudhi

This paper presents the development of simple but powerful path-following and obstacle-avoidance control laws for an underactuated autonomous underwater vehicle (AUV). A potential function-based proportional-derivative (PFPD) control law and a potential function-based augmented proportional-derivative (PFAPD) control law are developed to govern the motion of the AUV in an obstacle-rich environment. For obstacle avoidance, a mathematical potential function is used that formulates the repulsive force between the AUV and the solid obstacles intersecting the desired path. Numerical simulations are carried out to study the efficacy of the proposed controllers. To reduce the overshoots and steady-state errors observed with the PFPD controller, a PFAPD controller is designed that drives the AUV along the desired trajectory. The simulation results show that both proposed controllers drive the AUV along the desired path while avoiding obstacles in an obstacle-rich environment, and that the PFAPD outperforms the PFPD in tracking the desired trajectory. The results also show that highly complicated controllers are not necessary for solving the obstacle-avoidance and path-following problems of underactuated AUVs; these problems can be solved with PFAPD controllers.
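The control law described above combines PD tracking of a desired point with a repulsive potential around each obstacle. A minimal 2-D sketch of this idea follows; the specific potential (the classical inverse-distance form with influence radius `d0`) and the gains `kp`, `kd`, `eta` are illustrative assumptions, not the paper's actual formulation or tuning.

```python
import math

def repulsive_force(pos, obstacle, eta=1.0, d0=2.0):
    """Repulsive force from one obstacle; zero beyond the influence radius d0.

    Derived as the negative gradient of U = 0.5 * eta * (1/d - 1/d0)^2,
    a common (assumed) choice of repulsive potential.
    """
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if d >= d0 or d == 0.0:
        return (0.0, 0.0)
    mag = eta * (1.0 / d - 1.0 / d0) / (d ** 2)
    # Force points away from the obstacle, along (dx, dy)/d
    return (mag * dx / d, mag * dy / d)

def pfpd_control(pos, vel, goal, obstacles, kp=1.0, kd=0.5):
    """PD attraction toward the goal plus summed repulsive forces."""
    fx = kp * (goal[0] - pos[0]) - kd * vel[0]
    fy = kp * (goal[1] - pos[1]) - kd * vel[1]
    for ob in obstacles:
        rx, ry = repulsive_force(pos, ob)
        fx, fy = fx + rx, fy + ry
    return (fx, fy)
```

With an obstacle on the straight line to the goal, the repulsive term reduces the commanded force along that line, steering the vehicle around the obstacle while the PD term keeps pulling it toward the path.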


Author(s):  
Zhuo Wang ◽  
Shiwei Zhang ◽  
Xiaoning Feng ◽  
Yancheng Sui

The environmental adaptability of autonomous underwater vehicles is a persistent problem for path planning. Although reinforcement learning can improve environmental adaptability, multi-behavior coupling slows its convergence, making it difficult for an autonomous underwater vehicle to avoid moving obstacles. This article proposes a multi-behavior critic reinforcement learning algorithm for autonomous underwater vehicle path planning that overcomes the oscillating amplitudes and low learning efficiency common in the early stages of training with traditional actor–critic algorithms. The behavior critic assesses the actions of the actor from several perspectives, such as energy saving and security, and combines these aspects into a single evaluation of the actor. In this article, the policy gradient method serves as the actor and the value function method as the critic; both are approximated by backpropagation neural networks whose parameters are updated by gradient descent. The simulation results show that the method optimizes learning in the environment and improves learning efficiency, meeting the real-time and adaptability requirements of autonomous underwater vehicle dynamic obstacle avoidance.
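The actor–critic structure described above pairs a policy-gradient actor with a value-function critic, both updated by gradient descent. A minimal sketch of a one-step actor–critic update follows, using a tabular softmax policy on a toy 1-D corridor as a stand-in for the paper's neural-network approximators; the task, reward values, and learning rates are illustrative assumptions.

```python
import math
import random

# Toy task (assumed): a 1-D corridor of 6 states; reaching state 5 ends the
# episode with reward +1, every other step costs -0.1.
N_STATES, ACTIONS = 6, (-1, +1)
theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor: action preferences
v = [0.0] * N_STATES                           # critic: state values

def policy(s):
    """Softmax over the actor's action preferences in state s."""
    exps = [math.exp(p) for p in theta[s]]
    z = sum(exps)
    return [e / z for e in exps]

def step(s, a_idx):
    s2 = min(max(s + ACTIONS[a_idx], 0), N_STATES - 1)
    done = (s2 == N_STATES - 1)
    return s2, (1.0 if done else -0.1), done

def episode(alpha=0.1, beta=0.1, gamma=0.95):
    s = 0
    for _ in range(100):
        probs = policy(s)
        a = 0 if random.random() < probs[0] else 1
        s2, r, done = step(s, a)
        # TD error from the critic drives both updates
        delta = r + (0.0 if done else gamma * v[s2]) - v[s]
        v[s] += beta * delta                    # critic: gradient step on value
        for i in range(2):                      # actor: policy-gradient step
            grad = (1.0 if i == a else 0.0) - probs[i]
            theta[s][i] += alpha * delta * grad
        if done:
            return
        s = s2

random.seed(0)
for _ in range(300):
    episode()
```

The single TD error plays both roles: it moves the critic's value estimate toward the observed return and scales the actor's log-probability gradient, which is the coupling the article's multi-behavior critic generalizes by evaluating the actor from several perspectives at once.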

