Conditional Q-learning algorithm for path-planning of a mobile robot

Author(s): Indrani Goswami, Pradipta Kumar Das, Amit Konar, R. Janarthanan
2012, Vol 51 (9), pp. 40-46
Author(s): Pradipta K. Das, S. C. Mandhata, H. S. Behera, S. N. Patro

2016, Vol 16 (4), pp. 113-125
Author(s): Jianxian Cai, Xiaogang Ruan, Pengxuan Li

Abstract An autonomous path-planning strategy based on Skinner's operant conditioning principle and reinforcement learning is developed in this paper. Its core components are a tendency cell and a cognitive learning cell, which simulate bionic orientation and asymptotic learning ability, respectively. The cognitive learning cell is designed on the basis of a Boltzmann machine and an improved Q-learning algorithm; it carries out operant action learning and approximates the operative part of the robot system. The tendency cell adjusts the network weights, using information entropy to evaluate the effect of each operant action. Simulation experiments on a mobile robot show that the designed strategy enables the robot to carry out autonomous navigation path planning: the robot learns to select actions autonomously according to the bionic orientation behavior, with a fast convergence rate and high adaptability.
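
The abstract does not spell out the algorithm, but the combination it names is well known: Boltzmann (softmax) action selection layered on a one-step Q-learning update. The sketch below illustrates that combination in Python; the grid size, the reward handling, and all parameter values (ALPHA, GAMMA, TAU) are illustrative assumptions, not values from the paper.

    import numpy as np

    # Illustrative sketch only: softmax (Boltzmann) action selection over
    # Q-values plus a standard one-step Q-learning backup. Grid size and
    # all constants are assumptions, not taken from the paper.
    N_STATES, N_ACTIONS = 25, 4        # e.g. a 5x5 grid, 4 move directions
    ALPHA, GAMMA, TAU = 0.1, 0.9, 0.5  # learning rate, discount, temperature

    Q = np.zeros((N_STATES, N_ACTIONS))

    def boltzmann_action(state, tau=TAU):
        """Sample an action with probability proportional to exp(Q/tau)."""
        prefs = Q[state] / tau
        prefs -= prefs.max()                         # numerical stability
        probs = np.exp(prefs) / np.exp(prefs).sum()
        return np.random.choice(N_ACTIONS, p=probs)

    def q_update(s, a, r, s_next):
        """One-step Q-learning update toward r + gamma * max_a' Q(s', a')."""
        Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])

Lower values of TAU make the selection greedier; the entropy of the softmax distribution is one natural way to realize the information-entropy evaluation the abstract mentions.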


2021, Vol 09 (06), pp. 138-157
Author(s): Guoming Liu, Caihong Li, Tengteng Gao, Yongdi Li, Xiaopei He

2018, Vol 7 (4.27), p. 57
Author(s): Ee Soong Low, Pauline Ong, Cheng Yee Low

In path planning for mobile robots, the classical Q-learning algorithm requires a high iteration count and a long time to converge, because its early stage consists mostly of exploration through random direction decisions. This paper proposes adding a distance criterion to the direction decision making in Q-learning, intended to reduce the time the algorithm takes to converge fully. In addition, random direction decision making is retained and activated when the mobile robot becomes trapped in a local optimum, enabling it to escape. The results show that the improved Q-learning with distance guidance takes longer to converge than the classical Q-learning; however, the total number of steps used is lower.
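
The paper's exact decision rule is not reproduced here; the sketch below shows one plausible reading of it: exploration is biased toward the action that most reduces the straight-line distance to the goal, and a purely random choice takes over when the robot appears trapped. The grid geometry, the revisit-count trap test, and the threshold value are assumptions for illustration.

    import math
    import random
    from collections import defaultdict

    # Illustrative sketch only: distance-guided action choice with a random
    # fallback for escaping local traps. The trap test (frequent revisits)
    # and all constants are assumptions, not the authors' exact formulation.
    ACTIONS = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}  # N, S, W, E

    def choose_action(pos, goal, visit_counts, trap_threshold=4):
        """Prefer the move that gets closest to the goal; go random if trapped."""
        if visit_counts[pos] >= trap_threshold:    # assumed trap indicator
            return random.randrange(len(ACTIONS))
        def next_dist(a):
            dx, dy = ACTIONS[a]
            return math.hypot(goal[0] - (pos[0] + dx), goal[1] - (pos[1] + dy))
        return min(ACTIONS, key=next_dist)

    # Usage: track revisits per cell and pick the next move.
    visits = defaultdict(int)
    pos, goal = (0, 0), (4, 4)
    visits[pos] += 1
    action = choose_action(pos, goal, visits)

One reading of the reported result is that the extra distance computation adds per-step cost (hence the longer convergence time) while the guidance removes wasted random moves (hence the lower total step count).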

