Path Planning of Coastal Ships Based on Optimized DQN Reward Function

2021 ◽  
Vol 9 (2) ◽  
pp. 210
Author(s):  
Siyu Guo ◽  
Xiuguo Zhang ◽  
Yiquan Du ◽  
Yisong Zheng ◽  
Zhiying Cao

Path planning is a key issue for coastal ships and a core foundation of intelligent ship development. To better realize ship path planning during navigation, this paper proposes a coastal ship path planning model based on an optimized deep Q network (DQN) algorithm. The model is mainly composed of environment status information and the DQN algorithm. The environment status information provides the training space for the DQN algorithm and is quantified according to the actual navigation environment and the international rules for collision avoidance at sea. The DQN algorithm mainly includes four components: ship state space, action space, action exploration strategy and reward function. The traditional reward function of DQN may lead to low learning efficiency and slow convergence of the model. This paper optimizes the traditional reward function in three respects: (a) a potential energy reward of the target point to the ship is set; (b) a reward area is added near the target point; and (c) a danger area is added near the obstacle. With these optimizations, the ship can avoid obstacles and reach the target point faster, and the convergence of the model is accelerated. The traditional DQN algorithm, A* algorithm, BUG2 algorithm and artificial potential field (APF) algorithm are selected for experimental comparison, and the experimental data are analyzed in terms of path length, planning time and number of path corners. The experimental results show that the optimized DQN algorithm has better stability and convergence and greatly reduces the calculation time. It can plan the optimal path in line with the actual navigation rules and improve the safety, economy and autonomous decision-making ability of ship navigation.
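
The three reward modifications (a) to (c) in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' reward function: the function name `shaped_reward`, every radius, and every numeric reward value are assumptions chosen only for illustration, and the "potential energy" term is approximated here by the reduction in distance to the target between two steps.

```python
import math


def shaped_reward(pos, prev_pos, goal, obstacles,
                  goal_radius=1.0, reward_radius=5.0,
                  danger_radius=3.0, collision_radius=0.5):
    """Return (reward, done) for one step. All constants are illustrative."""
    d_goal = math.dist(pos, goal)
    d_prev = math.dist(prev_pos, goal)
    # Distance to the nearest obstacle (infinite when there are none).
    d_obs = min((math.dist(pos, o) for o in obstacles), default=math.inf)

    if d_obs <= collision_radius:      # collision ends the episode
        return -100.0, True
    if d_goal <= goal_radius:          # target reached
        return 100.0, True

    reward = -0.1                      # small cost per step
    reward += 1.0 * (d_prev - d_goal)  # (a) potential-style term: progress toward target
    if d_goal <= reward_radius:        # (b) reward area near the target
        reward += 1.0
    if d_obs <= danger_radius:         # (c) danger area near an obstacle
        reward -= 2.0
    return reward, False
```

Under these assumed constants, a step that moves toward the target inside the reward area scores higher than the same step taken inside a danger area, which is the kind of denser feedback the abstract credits with faster convergence.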

2021 ◽  
Vol 2021 ◽  
pp. 1-23
Author(s):  
Yiquan Du ◽  
Xiuguo Zhang ◽  
Zhiying Cao ◽  
Shaobo Wang ◽  
Jiacheng Liang ◽  
...  

Deep Reinforcement Learning (DRL) is widely used in path planning because of its powerful neural network fitting and learning abilities. However, existing DRL-based methods use a discrete action space and do not consider the impact of historical state information. As a result, the algorithm cannot learn the optimal strategy for planning the path, and the planned path has arcs or too many corners, which does not meet the actual sailing requirements of the ship. In this paper, an optimized path planning method for coastal ships based on an improved Deep Deterministic Policy Gradient (DDPG) and the Douglas–Peucker (DP) algorithm is proposed. Firstly, Long Short-Term Memory (LSTM) is used to improve the network structure of DDPG; it uses historical state information to approximate the current environmental state information, so that the predicted action is more accurate. In addition, the traditional reward function of DDPG may lead to low learning efficiency and slow convergence of the model. Hence, this paper improves the reward principle of traditional DDPG through a mainline reward function and an auxiliary reward function, which not only helps to plan a better path for the ship but also improves the convergence speed of the model. Secondly, because too many turning points in the planned path may increase the navigation risk, an improved DP algorithm is proposed to further optimize the planned path and make the final path safer and more economical. Finally, simulation experiments are carried out to verify the proposed method in terms of path planning effect and convergence trend. Results show that the proposed method can plan safe and economical navigation paths and has good stability and convergence.
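
For reference, the classic Douglas–Peucker simplification that the abstract above builds on can be sketched as follows. This is the textbook recursive form, not the improved variant the paper proposes (the abstract does not describe its modifications); the function names and the tolerance parameter `epsilon` are illustrative assumptions.

```python
import math


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:            # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Drop waypoints that lie within epsilon of the simplified path."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord first-last.
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax > epsilon:
        # Keep that point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]
```

A larger `epsilon` removes more turning points at the price of a path that deviates further from the one originally planned, so in a navigation setting the tolerance would have to be checked against obstacle clearance.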


2021 ◽  
Vol 9 (3) ◽  
pp. 252
Author(s):  
Yushan Sun ◽  
Xiaokun Luo ◽  
Xiangrui Ran ◽  
Guocheng Zhang

This research aims to solve the safe navigation problem of autonomous underwater vehicles (AUVs) in the deep ocean, a complex and changeable environment with various mountains. When an AUV navigates in the deep sea, it encounters many underwater canyons, and the hard valley walls seriously threaten its safety. To solve the problem of safe driving of AUVs in underwater canyons and address the potential of AUV autonomous obstacle avoidance in uncertain environments, an improved AUV path planning algorithm based on the deep deterministic policy gradient (DDPG) algorithm is proposed in this work. The method is an end-to-end path planning algorithm that optimizes the strategy directly. It takes sensor information as input and driving speed and yaw angle as outputs. The path planning algorithm can reach the predetermined target point while avoiding large-scale static obstacles, such as valley walls in the simulated underwater canyon environment, as well as sudden small-scale dynamic obstacles, such as marine life and other vehicles. In addition, this research addresses the multi-objective structure of obstacle avoidance in path planning, modularizes the reward function design, and combines it with the artificial potential field method to set continuous rewards. This research also proposes a new algorithm called the deep SumTree-deterministic policy gradient algorithm (SumTree-DDPG), which improves the random storage and extraction strategy of DDPG experience samples. The samples are classified according to their importance and stored using the SumTree structure, high-quality samples are extracted continuously, and the SumTree-DDPG algorithm ultimately improves the convergence speed of the model.
Finally, this research uses the Python language to write an underwater canyon simulation environment and builds a deep reinforcement learning simulation platform on a high-performance computer to conduct simulation learning training for the AUV. Simulation data verified that the proposed path planning method can guide the under-actuated underwater robot to navigate to the target without colliding with any obstacles. Compared with the DDPG algorithm, the improved SumTree-DDPG planner in this study shows better stability, total training reward, and robustness.
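
The SumTree structure mentioned in the abstract above is commonly implemented as a binary tree stored in a flat array, where each leaf holds one sample's priority and each internal node holds the sum of its children. The sketch below shows that generic structure only; the class layout and method names are assumptions, and how the paper assigns priorities or classifies samples is not stated in the abstract.

```python
class SumTree:
    """Array-backed sum tree for priority-proportional sampling."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes, then leaves
        self.data = [None] * capacity           # stored experience samples
        self.write = 0                          # next slot to overwrite

    def total(self):
        """Sum of all priorities (value at the root)."""
        return self.tree[0]

    def add(self, priority, sample):
        """Store a sample, overwriting the oldest slot when full."""
        leaf = self.write + self.capacity - 1
        self.data[self.write] = sample
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def update(self, leaf, priority):
        """Set a leaf priority and propagate the change up to the root."""
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf != 0:
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def get(self, value):
        """Return (leaf index, priority, sample) for 0 <= value <= total()."""
        idx = 0
        while 2 * idx + 1 < len(self.tree):     # descend until a leaf
            left = 2 * idx + 1
            if value <= self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]
```

Drawing `value` uniformly from `[0, total()]` and calling `get(value)` selects each stored sample with probability proportional to its priority, so high-priority experiences are replayed more often than under uniform sampling.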


Information ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 99 ◽  
Author(s):  
Haiyan Wang ◽  
Zhiyu Zhou

Path planning, as the core of navigation control for mobile robots, has become the focus of research in the field of mobile robots. Various path planning algorithms have been recently proposed. In this paper, in view of the advantages and disadvantages of different path planning algorithms, a heuristic elastic particle swarm algorithm is proposed. Using the path planned by the A* algorithm in a large-scale grid for global guidance, the elastic particle swarm optimization algorithm uses a shrinking operation to determine the globally optimal path formed by locally optimal nodes so that the particles can converge to it rapidly. Furthermore, in the iterative process, the diversity of the particles is ensured by a rebound operation. Computer simulation and real experimental results show that the proposed algorithm not only overcomes the shortcomings of the A* algorithm, which cannot yield the shortest path, but also avoids the problem of failure to converge to the globally optimal path, owing to a lack of heuristic information. Additionally, the proposed algorithm maintains the simplicity and high efficiency of both the algorithms.


2019 ◽  
Vol 9 (6) ◽  
pp. 1057 ◽  
Author(s):  
Chenguang Liu ◽  
Qingzhou Mao ◽  
Xiumin Chu ◽  
Shuo Xie

A traditional A-Star (A*) algorithm generates an optimal path by minimizing the path cost. For a vessel, path length, obstacle collision risk, traffic separation rules and manoeuvrability restrictions should all be taken into account in path planning. Meanwhile, the water current also plays an important role in voyaging and berthing for vessels. In consideration of these shortcomings of the traditional A-Star algorithm when it is used for vessel path planning, an improved A-Star algorithm is proposed. Specifically, risk models of obstacles (bridge piers, moored or anchored ships, ports, shores, etc.) that consider currents, traffic separation, berthing and manoeuvrability restrictions are built first. Then, normal path generation and berthing path generation with the proposed improved A-Star algorithm are presented, respectively. Moreover, the problem of combining the normal path and the berthing path is also solved. To verify the effectiveness of the proposed A-Star path planning methods, four cases are studied in simulation and real scenarios. The experimental results show that the proposed A-Star path planning methods can deal well with the problems described in this article and achieve a trade-off between path length and navigation safety.
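
The trade-off between path length and navigation safety described in the abstract above can be illustrated by adding a per-cell risk penalty to the step cost of a grid A* search. This is only a sketch under assumptions: the 4-connected grid, the `risk` matrix, and the `risk_weight` parameter are hypothetical and stand in for the paper's risk models, which the abstract does not specify.

```python
import heapq


def astar_with_risk(grid, risk, start, goal, risk_weight=1.0):
    """A* on a 4-connected grid; grid[r][c] == 1 marks a blocked cell.

    Step cost = 1 + risk_weight * risk[r][c] of the entered cell.
    Returns (path, cost), or (None, inf) when the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible because risk penalties are >= 0.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0.0, start)]
    best = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, cost, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue
        closed.add(cur)
        if cur == goal:
            path = []
            while cur is not None:       # walk parents back to the start
                path.append(cur)
                cur = parent[cur]
            return path[::-1], cost
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = cur[0] + dr, cur[1] + dc
            if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == 1:
                continue
            new_cost = cost + 1.0 + risk_weight * risk[r][c]
            if new_cost < best.get((r, c), float("inf")):
                best[(r, c)] = new_cost
                parent[(r, c)] = cur
                heapq.heappush(open_heap,
                               (new_cost + heuristic((r, c)), new_cost, (r, c)))
    return None, float("inf")
```

With `risk_weight=0` the search reduces to a plain shortest-path A*; raising the weight makes the planner accept longer routes that stay away from high-risk cells.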


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Zihan Yu ◽  
Linying Xiang

In recent years, robot path planning has been a hot research direction, and multirobot formation has practical application prospects in daily life. This article proposes a hybrid path planning algorithm applied to robot formation. The improved Rapidly Exploring Random Trees algorithm PQ-RRT* with a new distance evaluation function is used as the global planning algorithm to generate the initial global path. The determined parent nodes and child nodes are used as the starting points and target points of the local planning algorithm, respectively. The dynamic window approach is used as the local planning algorithm to avoid dynamic obstacles. At the same time, the algorithm restricts the movement of robots inside the formation to avoid internal collisions. The locally optimal path is selected by an evaluation function that includes the possibility of formation collision. Therefore, multiple mobile robots can quickly and safely reach the global target point in a complex environment with dynamic and static obstacles through the hybrid path planning algorithm. Numerical simulations are given to verify the effectiveness and superiority of the proposed hybrid path planning algorithm.


Robotica ◽  
2019 ◽  
Vol 37 (11) ◽  
pp. 1956-1970 ◽  
Author(s):  
Xin-Yi Yu ◽  
Zhen-Yong Fan ◽  
Lin-Lin Ou ◽  
Feng Zhu ◽  
Yong-Kui Guo

Summary: Robots often need to accomplish complex tasks such as surveillance, response and obstacle avoidance. In this paper, a dynamic search method is proposed to generate optimal robot trajectories satisfying complex task requirements in an uncertain environment. The LTL-A* algorithm is presented to generate a global optimal path, and the A* algorithm is provided to modify the global optimal path. The task is specified by a linear temporal logic (LTL) formula, and a weighted transition system based on the known information in the uncertain environment is modeled to describe the robot motion. Subsequently, a product automaton is constructed by combining the transition system with the task requirement. Based on the product automaton, the LTL-A* algorithm is proposed to generate a global optimal path. Local path planning based on the A* algorithm is employed to deal with environment changes while the robot tracks the global optimal path. The results of the simulation and experiments show that the proposed method can not only meet the complex task requirements in an uncertain environment but also improve the search efficiency.


2020 ◽  
Vol 10 (18) ◽  
pp. 6622
Author(s):  
Ziyu Zhao ◽  
Lin Bi

During the operation of open-pit mining, the loading position of a haulage truck often changes, bringing a new challenge concerning how to plan an optimal truck transportation path considering the terrain factors. This paper proposes a path planning method based on a high-precision digital map. It contains two parts: (1) constructing a high-precision digital map of the cutting zone and (2) planning the optimal path based on the modified Hybrid A* algorithm. Firstly, we process the high-precision map based on different terrain feature factors to generate the obstacle cost map and surface roughness cost map of the cutting zone. Then, we fuse the two cost maps to generate the final cost map for path planning. Finally, we incorporate the contact cost between tire and ground to improve the node extension and path smoothing part of the Hybrid A* algorithm and further enhance the algorithm’s capability of avoiding the roughness. We use real elevation data with different terrain resolutions to perform random tests and the results show that, compared with the path without considering the terrain factors, the total transportation cost of the optimal path is reduced by 10%–20%. Moreover, the methods demonstrate robustness.


2014 ◽  
Vol 568-570 ◽  
pp. 1054-1058 ◽  
Author(s):  
Qiang Hong ◽  
Mei Xiao Chen ◽  
Yan Song Deng

Based on an improved A* algorithm, this paper proposes optimal path planning for robot fish in a globally known environment, so as to achieve better coordination between the robot fish by improving their path planning. In the known, rasterized obstacle environment, target nodes are generated via a smoothing A* algorithm. Unnecessary connection points are then removed, and the path is smoothed at the turning points. The improved algorithm, in combination with distributed scroll algorithms, is applied to multi-robot path planning to optimize the path while avoiding collisions. The experimental results on the 2D simulation platform have verified the feasibility of the method.
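
One simple way to remove unnecessary connection points from a grid path, as the abstract above describes, is to drop every waypoint that lies on a straight line between its neighbours. The sketch below shows only that pruning step under this assumption; the function name is hypothetical, and the paper's own removal rule and its smoothing at turning points are not described in the abstract.

```python
def prune_collinear(path):
    """Remove interior waypoints that do not change the travel direction."""
    if len(path) < 3:
        return list(path)
    pruned = [path[0]]
    for prev, cur, nxt in zip(path, path[1:], path[2:]):
        # A zero 2D cross product means prev, cur and nxt are collinear.
        cross = ((cur[0] - prev[0]) * (nxt[1] - cur[1])
                 - (cur[1] - prev[1]) * (nxt[0] - cur[0]))
        if cross != 0:
            pruned.append(cur)           # keep genuine turning points
    pruned.append(path[-1])
    return pruned
```

For a grid path that moves right twice and then down twice, only the start, the corner and the end remain, which leaves the turning points as the only places where further smoothing is needed.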


2012 ◽  
Vol 229-231 ◽  
pp. 2019-2024 ◽  
Author(s):  
Zhi Qiang Zhao ◽  
Zhi Hua Liu ◽  
Jia Xin Hao

For manoeuvre simulation of ground simulation objects in large-scale operation simulation, an efficient path planning method based on the A* algorithm is proposed. By introducing various geographic factors and security factors into the heuristic function, the method solves the problem of finding an optimal path given the acquired enemy situation and terrain data. Experimental results show that it effectively raises the path planning speed of the A* algorithm and that the scheme is practical and feasible.


2017 ◽  
Vol 12 (4) ◽  
pp. 26-35 ◽  
Author(s):  
Nizar Hadi Abbas ◽  
Farah Mahdi Ali

This paper describes the problem of online autonomous mobile robot path planning, which consists of finding optimal paths or trajectories for an autonomous mobile robot from a starting point to a destination across a flat map of a terrain, represented by a 2-D workspace. An enhanced algorithm for solving the path planning problem using the Bacterial Foraging Optimization algorithm is presented. This nature-inspired metaheuristic algorithm, which imitates the foraging behavior of E. coli bacteria, was used to find the optimal path from a starting point to a target point. The proposed algorithm was demonstrated by simulations in different static and dynamic environments. A comparative study was carried out between the developed algorithm and two other state-of-the-art algorithms. This study showed that the proposed method is effective and produces trajectories with satisfactory results.

