Path planning in an unknown environment based on deep reinforcement learning with prior knowledge

2021 ◽  
pp. 1-17
Author(s):  
Ping Lou ◽  
Kun Xu ◽  
Xuemei Jiang ◽  
Zheng Xiao ◽  
Junwei Yan

Path planning in an unknown environment is a fundamental capability that mobile robots need in order to complete their tasks. As a typical deep reinforcement learning method, the deep Q-network (DQN) algorithm has gained wide popularity in path planning because of its self-learning ability and adaptability to complex environments. However, most DQN-based path planning algorithms spend a great deal of time on model training, and the learned policy depends only on the information observed by the sensors, which leads to poor generalization to new tasks and wasted time retraining the model. Therefore, a new deep reinforcement learning method combining DQN with prior knowledge is proposed to reduce training time and enhance generalization capability. In this method, a fuzzy logic controller is designed to avoid obstacles and keep the robot from exploring blindly, which reduces the training time. A target-driven approach addresses the lack of generalization: the learned policy depends on the fusion of observed information and target information. Extensive experiments show that the proposed algorithm converges faster than the DQN algorithm in path planning tasks and that the target can be reached without retraining when the path planning task changes.
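The fuzzy-logic prior described above can be pictured with a minimal sketch. This is not the authors' controller; the membership thresholds, rule base, and three-beam sensor layout are all assumptions introduced for illustration of how fuzzy rules could replace uniformly random exploration:

```python
def fuzzy_membership(distance, near=0.5, far=2.0):
    """Degree (0..1) to which a range reading counts as 'near an obstacle'.

    Readings at or below `near` are fully 'near' (1.0); at or above `far`
    they are fully 'far' (0.0); in between, membership falls off linearly.
    """
    if distance <= near:
        return 1.0
    if distance >= far:
        return 0.0
    return (far - distance) / (far - near)

def fuzzy_avoidance_action(left, front, right):
    """Choose turn_left / forward / turn_right from three range readings.

    A hand-written rule base: steer away from whichever side has the
    strongest 'near' activation, instead of exploring at random.
    """
    n_left, n_front, n_right = (fuzzy_membership(d) for d in (left, front, right))
    if n_front > 0.5:  # obstacle ahead: turn toward the more open side
        return "turn_left" if n_left < n_right else "turn_right"
    if n_left > n_right:
        return "turn_right"
    if n_right > n_left:
        return "turn_left"
    return "forward"
```

In a DQN training loop, such a controller would be consulted during the exploration phase (where plain DQN would pick a random action), so early episodes avoid obstacle collisions and the replay buffer fills with more informative transitions.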

2021 ◽  
pp. 229-236
Author(s):  
Xinshun Ning ◽  
Hongyong Yang ◽  
Zhilin Fan ◽  
Yilin Han

2020 ◽  
Vol 1576 ◽  
pp. 012009
Author(s):  
Yang Wang ◽  
Yilin Fang ◽  
Ping Lou ◽  
Junwei Yan ◽  
Nianyun Liu

2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Liang Huang ◽  
Xuequn Wu ◽  
Qiuzhi Peng ◽  
Xueqin Yu

Tobacco grown in plateau mountain areas is characterized by fragmented planting, uneven growth, and mixed cropping or interplanting, which makes it difficult for object-oriented image analysis methods to extract effective features and accurately delineate tobacco planting areas. To this end, this paper relies on the self-learning ability of deep features and proposes an accurate extraction method for tobacco planting areas based on deep semantic segmentation of unmanned aerial vehicle (UAV) remote sensing images of plateau mountains. Firstly, a tobacco semantic segmentation dataset is built with Labelme. Four deep semantic segmentation models, DeeplabV3+, PSPNet, SegNet, and U-Net, are trained on the sample data; to reduce training time, the lightweight MobileNet family of networks replaces the original backbones of the four models. Finally, the trained networks segment the prediction images, and the mean Intersection over Union (mIoU) is used to evaluate accuracy. The experimental results show that, on 71 prediction images, DeeplabV3+, PSPNet, SegNet, and U-Net achieve mIoU scores of 0.9436, 0.9118, 0.9392, and 0.9473, respectively, i.e., high segmentation accuracy. This verifies the feasibility of deep semantic segmentation for extracting tobacco planting areas from UAV remote sensing images, and the method can serve as a reference for subsequent automatic extraction of tobacco planting areas.
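The mIoU metric used to compare the four models above is standard, and a minimal NumPy implementation looks like the following (the function name and the choice to skip classes absent from both maps are this sketch's own, not taken from the paper):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes.

    pred, target: integer arrays of identical shape holding class indices.
    Classes that appear in neither map are skipped so they do not
    distort the average.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        inter = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

Per-image mIoU values like the 0.94 range reported above mean that, on average, each class's predicted region overlaps its ground-truth region by roughly 94% relative to their union.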


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Xiaoyun Lei ◽  
Zhian Zhang ◽  
Peifang Dong

Dynamic path planning in unknown environments has always been a challenge for mobile robots. In this paper, we apply double deep Q-network (DDQN) reinforcement learning, proposed by DeepMind in 2016, to dynamic path planning in unknown environments. The reward-and-punishment function and the training method are designed to cope with the instability of the training stage and the sparsity of the environment's state space. In different training stages, we dynamically adjust the starting and target positions; as the neural network is updated and the greedy-rule probability increases, the local space searched by the agent expands. The Pygame module in Python is used to build the dynamic environments. Taking lidar signals and the local target position as inputs, convolutional neural networks (CNNs) generalize the environmental state, and the Q-learning algorithm enhances the agent's dynamic obstacle avoidance and local planning in the environment. The results show that, after training in different dynamic environments and testing in a new one, the agent can successfully reach the local target position in an unknown dynamic environment.
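The core of DDQN, as distinct from plain DQN, is the bootstrap target: the online network selects the next action, while the target network evaluates it, which reduces Q-value overestimation. A minimal sketch of that target computation (function name and array layout are illustrative assumptions, not the paper's code):

```python
import numpy as np

def ddqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN bootstrap targets for a batch of transitions.

    q_online_next, q_target_next: (batch, n_actions) Q-value arrays for
    the next states from the online and target networks respectively.
    The online net *selects* the action, the target net *evaluates* it.
    """
    next_actions = np.argmax(q_online_next, axis=1)
    next_values = q_target_next[np.arange(len(next_actions)), next_actions]
    # Terminal transitions (done == 1) bootstrap to zero.
    return rewards + gamma * (1.0 - dones) * next_values
```

Plain DQN would instead take `q_target_next.max(axis=1)`, letting the same network both select and evaluate, which is the overestimation DDQN avoids.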


Author(s):  
Jean Phelipe De Oliveira Lima ◽  
Raimundo Correa de Oliveira ◽  
Cleinaldo de Almeida Costa

Autonomous vehicle path planning aims to allow safe and rapid movement through an environment without human interference. Recently, reinforcement learning methods have been applied to this problem with satisfactory results. This work uses deep reinforcement learning for the path planning of autonomous vehicles via trajectory simulation, defining routes that offer greater safety (no collisions) and a shorter distance between two points. A method for creating simulation environments was developed to analyze the performance of the proposed models under circumstances of varying difficulty. The decision-making strategy is based on multilayer perceptron artificial neural networks whose parameters and hyperparameters were determined by a grid search. The models were evaluated by the reward curves produced during learning, in two phases: an isolated evaluation, in which the models were placed in the environment without prior knowledge, and an incremental evaluation, in which models were placed in unknown environments carrying intelligence previously accumulated under other conditions. The results obtained are competitive with state-of-the-art works and highlight the adaptive character of the models, which, when inserted into environments with prior knowledge, can reduce convergence time by up to 89.47% compared with related works.
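The hyperparameter grid search mentioned above follows a generic pattern that can be sketched briefly. The parameter names, value ranges, and scoring function below are illustrative assumptions, not the configuration used in the work:

```python
from itertools import product

# Hypothetical search space for an MLP policy network.
param_grid = {
    "hidden_units": [64, 128],
    "learning_rate": [1e-3, 1e-4],
    "gamma": [0.95, 0.99],
}

def grid_search(evaluate, grid):
    """Exhaustively try every combination in `grid`.

    evaluate(config) -> scalar score (e.g. mean episode reward);
    returns the best-scoring config and its score.
    """
    keys = sorted(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In the incremental-evaluation setting described above, the winning configuration's trained weights would then be reused as the starting point in a new environment, which is what yields the reported reduction in convergence time.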

