Path planning for intelligent robots based on deep Q-learning with experience replay and heuristic knowledge

Abstract: This paper develops an autonomous path-planning strategy based on the Skinner operant conditioning principle and the reinforcement learning principle. The core of the strategy is the use of a tendency cell and a cognitive learning cell, which simulate bionic orientation and asymptotic learning ability. The cognitive learning cell is designed on the basis of a Boltzmann machine and an improved Q-learning algorithm; it executes the operant action learning function to approximate the operative part of the robot system. The tendency cell adjusts the network weights, using information entropy to evaluate the effect of each operant action. Simulation experiments on a mobile robot showed that the designed strategy lets the robot realize autonomous navigation path planning. The robot learns to select actions autonomously according to the bionic orientation action, and it shows a fast convergence rate and higher adaptability.
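
The abstract does not give the update rules, so the following is only a minimal sketch of two ingredients it names: Boltzmann (softmax) action selection over Q-values, and the information entropy of the resulting action distribution. The function names, the temperature parameter, and the use of entropy as a plain diagnostic are assumptions made for illustration, not the authors' tendency cell or cognitive learning cell.

```python
import math
import random


def boltzmann_probs(q_values, temperature=1.0):
    """Softmax (Boltzmann) distribution over actions given their Q-values."""
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; lower values mean a more decided policy."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its Boltzmann probability."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [0.2, 1.0, 0.5, 0.1]  # hypothetical Q-values for four actions
    for t in (5.0, 1.0, 0.1):
        p = boltzmann_probs(q, t)
        print(f"T={t}: entropy={entropy(p):.3f}, sampled action={select_action(q, t)}")
```

Under these assumptions, lowering the temperature concentrates probability on the highest-valued action, so the entropy falls as the agent's preference becomes more definite.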

Fast Path Planning for Unmanned Aerial Vehicles by Self-Correction Based on Q-Learning

Journal of Aerospace Information Systems ◽ 10.2514/1.i010856 ◽ 2021 ◽ pp. 1-9

Author(s): Ziyang Wang ◽ Hongbing Yang ◽ Qingsong Wu ◽ Jiafei Zheng

Keyword(s): Path Planning ◽ Unmanned Aerial Vehicles ◽ Q Learning ◽ Aerial Vehicles

BIOINSPIRED NEURAL NETWORK-BASED Q-LEARNING APPROACH FOR ROBOT PATH PLANNING IN UNKNOWN ENVIRONMENTS

International Journal of Robotics and Automation ◽ 10.2316/journal.206.2016.6.206-4526 ◽ 2016 ◽ Vol 31 (6) ◽ Cited By ~ 3

Author(s): Jianjun Ni ◽ Xinyun Li ◽ Mingang Hua ◽ Simon X. Yang

Keyword(s): Neural Network ◽ Path Planning ◽ Learning Approach ◽ Robot Path Planning ◽ Q Learning ◽ Unknown Environments ◽ Robot Path

Cognition of a Robotic Manipulator Using the Q-Learning Based Situation-Operator Model

Journal of Information Technology Research ◽ 10.4018/jitr.2018010109 ◽ 2018 ◽ Vol 11 (1) ◽ pp. 146-157 ◽ Cited By ~ 1

Author(s): Akash Dutt Dubey ◽ Ravi Bhushan Mishra

Keyword(s): Neural Network ◽ Artificial Neural Network ◽ Path Planning ◽ Mobile Robot ◽ Knowledge Base ◽ Operator Model ◽ Q Learning ◽ Sensory Inputs ◽ Learning Module ◽ Som Model

In this article, we apply cognition to a robot using a Q-learning based situation-operator model. The situation-operator model takes the initial situation of the mobile robot and applies a set of operators in order to move the robot to the destination. The initial situation of the mobile robot is defined by a set of characteristics inferred from the sensor inputs. The Situation-Operator Model (SOM) comprises a planning and learning module, which uses certain heuristics for learning through the mobile robot, and a knowledge base, which stores the experiences of the mobile robot. Control and learning of the robot are done using Q-learning. A camera sensor and an ultrasonic sensor were used as the sensory inputs for the mobile robot. These sensory inputs are used to define the initial situation, which is then used in the learning module to apply a valid operator. The results obtained by the proposed method were compared with those obtained by a Reinforcement-Based Artificial Neural Network for path planning.
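
As a rough illustration of how Q-learning can rank operators per situation, the sketch below applies the standard tabular Q-learning update to a table keyed by (situation, operator) pairs. The situation labels, the operator names, the reward values, and the learning rate and discount factor are all hypothetical; the abstract does not specify them, and this is not the authors' SOM implementation.

```python
from collections import defaultdict

# Hypothetical operator set; the paper's actual operators are not listed in the abstract.
OPERATORS = ("forward", "turn_left", "turn_right")


def q_update(q, situation, operator, reward, next_situation,
             operators=OPERATORS, alpha=0.5, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q-value.

    Q(s, o) <- Q(s, o) + alpha * (r + gamma * max_o' Q(s', o') - Q(s, o))
    """
    best_next = max(q[(next_situation, op)] for op in operators)
    td_target = reward + gamma * best_next
    q[(situation, operator)] += alpha * (td_target - q[(situation, operator)])
    return q[(situation, operator)]


def best_operator(q, situation, operators=OPERATORS):
    """Greedy choice: the operator with the highest learned value in this situation."""
    return max(operators, key=lambda op: q[(situation, op)])


if __name__ == "__main__":
    q_table = defaultdict(float)  # unseen (situation, operator) pairs start at 0.0
    # Hypothetical experience: turning left at an obstacle led to a clear path.
    q_update(q_table, "obstacle_ahead", "turn_left", 1.0, "clear")
    q_update(q_table, "obstacle_ahead", "forward", -1.0, "obstacle_ahead")
    print(best_operator(q_table, "obstacle_ahead"))
```

Here the dictionary plays the role of a very small knowledge base of experiences; a real system would derive the situation label from the camera and ultrasonic readings.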

High-Level Path Planning for an Autonomous Sailboat Robot Using Q-Learning

Sensors ◽ 10.3390/s20061550 ◽ 2020 ◽ Vol 20 (6) ◽ pp. 1550 ◽ Cited By ~ 3

Author(s): Andouglas Gonçalves da Silva Junior ◽ Davi Henrique dos Santos ◽ Alvaro Pinto Fernandes de Negreiros ◽ João Moreno Vilas Boas de Souza Silva ◽ Luiz Marcos Garcia Gonçalves

Keyword(s): Path Planning ◽ Dead Zone ◽ Weather Conditions ◽ Target Point ◽ Q Learning ◽ Ground Station ◽ Global Route ◽ Global Planning ◽ High Level ◽ Path Planner

Path planning for sailboat robots is a challenging task, particularly because of the kinematic and dynamic modelling of such wind-propelled boats. The problem is divided into two layers. The first is global, where a general trajectory composed of waypoints is planned; this can be done automatically based on variables such as weather conditions, or defined by hand using a human–robot interface (a ground station). In the second, local layer, the global route is followed at execution time by making the sailboat proceed between each pair of consecutive waypoints. Our proposal in this paper is an algorithm for the global path-generation layer, developed for the N-Boat (The Sailboat Robot project), that computes feasible sailing routes between a start and a target point while avoiding dangerous situations such as obstacles and borders. A reinforcement learning approach (Q-Learning) is used, based on a reward matrix and a set of actions that changes according to the wind direction to account for the dead zone, which is the region against the wind where the sailboat cannot gain velocity. Our algorithm generates straight and zigzag paths, accounting for wind direction. The generated path also guarantees the sailboat's safety and robustness, enabling it to sail for long periods of time, depending only on the start and target points defined for this global planning. The result is a complete path-planner algorithm that, together with the local planner solved in previous work, can be used in the final development of the N-Boat, making it a fully autonomous sailboat.
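
One way to picture an action set that changes with the wind is to filter a fixed set of compass headings, dropping those that point into the dead zone. The sketch below assumes eight headings at 45-degree steps, a wind direction given as the bearing the wind blows from, and a dead-zone half-angle of 45 degrees; all of these values and names are illustrative assumptions, not parameters taken from the paper.

```python
# Hypothetical action set: eight compass headings in degrees (0 = north, clockwise).
HEADINGS = {"N": 0, "NE": 45, "E": 90, "SE": 135,
            "S": 180, "SW": 225, "W": 270, "NW": 315}


def angular_difference(a_deg, b_deg):
    """Smallest absolute angle between two bearings, in the range [0, 180]."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def allowed_actions(wind_from_deg, dead_zone_half_angle=45.0):
    """Return the headings that lie outside the dead zone.

    A heading is excluded when it is strictly closer than
    dead_zone_half_angle to the bearing the wind is blowing from.
    """
    return [name for name, heading in HEADINGS.items()
            if angular_difference(heading, wind_from_deg) >= dead_zone_half_angle]


if __name__ == "__main__":
    # Wind from due north: sailing straight north is not available.
    print(allowed_actions(0.0))
    # Wind from 20 degrees: both N and NE fall inside the assumed dead zone.
    print(allowed_actions(20.0))
```

With a filter of this kind, a Q-learning agent that must travel upwind can only reach the target by alternating between the remaining headings on either side of the wind, which is consistent with the zigzag paths the abstract describes.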
