Dynamic Path Planning of Unknown Environment Based on Deep Reinforcement Learning

2018
Vol 2018
pp. 1-10
Author(s):
Xiaoyun Lei
Zhian Zhang
Peifang Dong

Dynamic path planning in unknown environments has always been a challenge for mobile robots. In this paper, we apply double deep Q-network (DDQN) deep reinforcement learning, proposed by DeepMind in 2016, to dynamic path planning in unknown environments. A reward and punishment function and a training method are designed to address the instability of the training stage and the sparsity of the environment state space. In different training stages, we dynamically adjust the starting position and the target position. As the neural network is updated and the greedy-rule probability increases, the local space searched by the agent expands. The Pygame module in Python is used to build the dynamic environments. Taking the lidar signal and the local target position as inputs, convolutional neural networks (CNNs) are used to generalize the environmental state. The Q-learning algorithm enhances the agent's dynamic obstacle avoidance and local planning abilities in the environment. The results show that, after training in different dynamic environments and testing in a new environment, the agent is able to reach the local target position successfully in unknown dynamic environments.
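
For context, a minimal sketch of the Double DQN target computation described above is given below, with the lidar scan and local target position as network inputs. This is an illustrative PyTorch implementation: the network architecture, layer sizes, and the QNetwork/ddqn_target names are assumptions, not the authors' exact design.

# Illustrative Double DQN target with lidar + local-target inputs (assumed
# PyTorch setup; layer sizes and hyperparameters are placeholders).
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small CNN over a 1-D lidar scan, concatenated with the local target position."""
    def __init__(self, n_beams: int, n_actions: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        conv_out = self.conv(torch.zeros(1, 1, n_beams)).shape[1]
        self.head = nn.Sequential(
            nn.Linear(conv_out + 2, 128), nn.ReLU(),  # +2 for (dx, dy) to the local target
            nn.Linear(128, n_actions),
        )

    def forward(self, scan, target):
        return self.head(torch.cat([self.conv(scan), target], dim=1))

def ddqn_target(online, target_net, reward, next_scan, next_target, done, gamma=0.99):
    """Double DQN: the online net selects the next action, the target net evaluates it."""
    with torch.no_grad():
        next_a = online(next_scan, next_target).argmax(dim=1, keepdim=True)
        next_q = target_net(next_scan, next_target).gather(1, next_a).squeeze(1)
        return reward + gamma * next_q * (1.0 - done)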

Symmetry
2022
Vol 14 (1)
pp. 132
Author(s):
Jianfeng Zheng
Shuren Mao
Zhenyu Wu
Pengcheng Kong
Hao Qiang

To address the poor exploration ability and slow convergence of traditional deep reinforcement learning in the navigation task of a patrol robot on specified indoor routes, this paper proposes an improved deep reinforcement learning algorithm based on Pan/Tilt/Zoom (PTZ) image information. The obtained symmetric image information and target position information are taken as the input of the network, the robot's speed is taken as the output for the next action, and a bounded circular route is used as the test case. An improved reward and punishment function is designed to speed up convergence and optimize the path, so that the robot plans a safer path while prioritizing obstacle avoidance. Compared with the Deep Q-Network (DQN) algorithm, the improved algorithm converges about 40% faster and its loss function is more stable.
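
As a rough illustration of the kind of improved reward and punishment function described above, the sketch below shapes the reward with goal progress and an obstacle-safety penalty; the weights, thresholds, and the shaped_reward name are assumptions rather than the authors' published values.

# Hedged sketch of a shaped reward-and-punishment function (weights and
# thresholds are illustrative assumptions, not the paper's exact values).
def shaped_reward(dist_to_goal, prev_dist_to_goal, min_obstacle_dist,
                  reached_goal, collided,
                  safe_dist=0.5, w_progress=10.0, w_safety=2.0):
    """Reward progress toward the goal; penalize approaching obstacles."""
    if collided:
        return -100.0                                 # large punishment on collision
    if reached_goal:
        return 100.0                                  # large bonus on reaching the target
    progress = prev_dist_to_goal - dist_to_goal       # positive when moving closer to the goal
    safety_penalty = max(0.0, safe_dist - min_obstacle_dist)  # grows inside the safety margin
    return w_progress * progress - w_safety * safety_penalty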


2021
Vol 2021
pp. 1-10
Author(s):
Peng Li
Xiangcheng Ding
Hongfang Sun
Shiquan Zhao
Ricardo Cajo

To address the low success rate and slow learning speed of the DDPG algorithm in path planning for a mobile robot in a dynamic environment, an improved DDPG algorithm is designed. In this article, the RAdam algorithm replaces the neural network optimizer in DDPG and is combined with a curiosity-driven exploration module to improve the success rate and convergence speed. On top of the improved algorithm, prioritized experience replay is added and transfer learning is introduced to improve the training effect. A dynamic simulation environment is built with the Robot Operating System (ROS) and the Gazebo simulator, and the improved DDPG algorithm is compared with the original DDPG algorithm. For the dynamic path planning task of the mobile robot, the simulation results show that the convergence speed of the improved DDPG algorithm is increased by 21% and the success rate rises to 90% compared with the original DDPG algorithm. The improved algorithm performs well on dynamic path planning for mobile robots with a continuous action space.
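
A minimal sketch of how RAdam can replace the usual Adam optimizer inside a DDPG update is shown below, using PyTorch's torch.optim.RAdam. The network call signatures, hyperparameters, and function names are placeholders, not the authors' exact implementation, and the curiosity module and prioritized replay are omitted for brevity.

# Minimal DDPG update with RAdam optimizers (illustrative; network
# definitions and hyperparameters are placeholders).
import torch
import torch.nn as nn

def make_optimizers(actor: nn.Module, critic: nn.Module, lr=1e-3):
    # RAdam (rectified Adam) in place of the usual Adam optimizer
    return (torch.optim.RAdam(actor.parameters(), lr=lr),
            torch.optim.RAdam(critic.parameters(), lr=lr))

def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99):
    s, a, r, s2, done = batch  # tensors sampled from the replay buffer

    # Critic: regress Q(s, a) toward the bootstrapped target value
    with torch.no_grad():
        q_target = r + gamma * (1 - done) * target_critic(s2, target_actor(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: ascend the critic's estimate of Q(s, actor(s))
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()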


IEEE Access
2021
pp. 1-1
Author(s):
Jianxin Feng
Jingze Zhang
Geng Zhang
Shuang Xie
Yuanming Ding
...  

Author(s):  
Junior Sundar ◽  
Sumith Yesudasan ◽  
Sibi Chacko

This investigation explores a novel path-planning and optimization strategy for multiple cooperative robotic agents operating in a fully observable, dynamically changing obstacle field. Current dynamic path-planning strategies typically apply static algorithms over incremental time steps. We propose a cooperative multi-agent (CMA) algorithm inspired by the natural flocking of animals and implemented with vector operations. It is preferred over common graph-search algorithms such as A* because it can be applied directly to dynamic environments. The CMA algorithm performs obstacle avoidance using potential fields around obstacles that scale with relative motion. Optimization strategies, including interpolation and Bézier curves, are applied to the algorithm. To validate its effectiveness, the CMA algorithm is compared with A* on static obstacles, owing to the lack of equivalent algorithms for dynamic environments; CMA performs comparably to A*, with differences ranging from -0.2% to 1.3%. The CMA algorithm is also applied experimentally and achieves comparable performance, with an error range of -0.5% to 5.2%; these errors are attributed to the limitations of the Kinect V1 sensor used for obstacle detection. Finally, the algorithm is implemented in a 3D simulated space, indicating that it could also be applied to drones. This algorithm shows promise for warehouse and inventory automation, especially when the workspace is fully observable.
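
A hedged sketch of the vector-based, motion-scaled potential-field repulsion described above is given below; the repulsion function, its parameters, and the combination with goal attraction are illustrative assumptions, not the paper's exact formulation.

# Illustrative potential-field repulsion whose strength scales with the
# relative closing speed (function and parameter names are assumptions).
import numpy as np

def repulsion(agent_pos, agent_vel, obstacle_pos, obstacle_vel,
              influence_radius=2.0, gain=1.0):
    """Return a repulsive velocity vector pushing the agent away from an obstacle."""
    offset = agent_pos - obstacle_pos
    dist = np.linalg.norm(offset)
    if dist >= influence_radius or dist == 0.0:
        return np.zeros(2)                     # outside the field's influence
    # Closing speed: positive when the obstacle is approaching the agent
    closing_speed = max(0.0, np.dot(obstacle_vel - agent_vel, offset) / dist)
    strength = gain * (1.0 / dist - 1.0 / influence_radius) * (1.0 + closing_speed)
    return strength * offset / dist

# Example combination: velocity command = goal attraction + summed obstacle repulsion
# v_cmd = k_goal * (goal - agent_pos) + sum(repulsion(...) for each obstacle)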

