Improved Q-Learning Method for Multirobot Formation and Path Planning with Concave Obstacles

2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Zhilin Fan ◽  
Fei Liu ◽  
Xinshun Ning ◽  
Yilin Han ◽  
Jian Wang ◽  
...  

Aiming at the formation and path planning of multirobot systems in an unknown environment, a path planning method for multirobot formation based on improved Q-learning is proposed. Based on the leader-following approach, the leader robot uses an improved Q-learning algorithm to plan the path, and the follower robots achieve a tracking strategy based on the gravitational potential field (GPF) by designing a cost function to select actions. Specifically, to improve Q-learning, the Q-values are initialized with environmental guidance from the target's GPF. Then, a virtual obstacle-filling avoidance strategy is presented, which fills non-obstacle cells judged likely to lead into concave obstacles with virtual obstacles. In addition, the simulated annealing (SA) algorithm, whose control temperature is adjusted in real time according to the progress of the Q-learning, is applied to improve the action selection strategy. The experimental results show that the improved Q-learning algorithm reduces the convergence time by 89.9% and the number of convergence rounds by 63.4% compared with the traditional algorithm. With this method, multiple robots have a clear division of labor and quickly plan a globally optimized formation path in a completely unknown environment.
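
The abstract names two of the leader's Q-learning modifications: GPF-based Q-value initialization and SA-controlled action selection. Below is a minimal sketch of both under assumed grid-world conventions; the grid size, goal position, and attractive gain K_ATT are illustrative choices, not values from the paper.

```python
import numpy as np

GRID, ACTIONS = 20, 4          # hypothetical 20x20 grid, 4 moves (up/down/left/right)
GOAL = np.array([18, 18])      # illustrative goal cell
K_ATT = 0.5                    # assumed attractive (gravitational) gain

# 1) Initialize Q from the goal's gravitational potential field: states closer
#    to the goal start with higher values, guiding early exploration.
xs, ys = np.meshgrid(np.arange(GRID), np.arange(GRID), indexing="ij")
dist = np.hypot(xs - GOAL[0], ys - GOAL[1])
U_att = 0.5 * K_ATT * dist**2                        # attractive potential, minimal at goal
Q = np.repeat(-U_att[..., None], ACTIONS, axis=-1)   # higher initial Q near the goal

# 2) SA-style action selection: sample from a Boltzmann distribution whose
#    temperature is lowered as learning progresses, shifting from exploration
#    to exploitation.
def sa_select(q_row, temperature):
    prefs = (q_row - q_row.max()) / max(temperature, 1e-6)
    probs = np.exp(prefs)
    probs /= probs.sum()
    return np.random.choice(ACTIONS, p=probs)
```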

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 47824-47844 ◽  
Author(s):  
Meng Zhao ◽  
Hui Lu ◽  
Siyi Yang ◽  
Fengjuan Guo

2012 ◽  
Vol 51 (9) ◽  
pp. 40-46 ◽  
Author(s):  
Pradipta K. Das ◽  
S. C. Mandhata ◽  
H. S. Behera ◽  
S. N. Patro

Author(s):  
Tianze Zhang ◽  
Xin Huo ◽  
Songlin Chen ◽  
Baoqing Yang ◽  
Guojiang Zhang

2016 ◽  
Vol 16 (4) ◽  
pp. 113-125
Author(s):  
Jianxian Cai ◽  
Xiaogang Ruan ◽  
Pengxuan Li

Abstract An autonomous path-planning strategy based on the Skinner operant conditioning principle and the reinforcement learning principle is developed in this paper. The core of the strategy is the use of a tendency cell and a cognitive learning cell, which simulate bionic orientation and asymptotic learning ability. The cognitive learning cell is designed on the basis of a Boltzmann machine and an improved Q-learning algorithm, and performs operant action learning to approximate the operative part of the robot system. The tendency cell adjusts network weights by using information entropy to evaluate the effect of each operant action. Simulation experiments on a mobile robot show that the designed strategy enables the robot to achieve autonomous navigation path planning: the robot learns to select actions autonomously according to the bionic orientation, with a fast convergence rate and higher adaptability.
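
A minimal sketch of the two ingredients the abstract pairs: Boltzmann (softmax) action selection and an information-entropy measure over the resulting action distribution, loosely following the cognitive-cell/tendency-cell idea. The Q-values and temperature below are illustrative, not from the paper.

```python
import numpy as np

def boltzmann_policy(q_values, temperature):
    """Softmax over Q-values: exploratory at high T, near-greedy as T -> 0."""
    prefs = (q_values - q_values.max()) / temperature
    p = np.exp(prefs)
    return p / p.sum()

def action_entropy(p):
    """Information entropy of the action distribution; low entropy means the
    learned tendency toward one action is strong."""
    return -np.sum(p * np.log(p + 1e-12))

q = np.array([0.1, 0.9, 0.3, 0.2])       # hypothetical Q-values for one state
p = boltzmann_policy(q, temperature=0.5)
print(action_entropy(p))                  # falls as Q-values separate during learning
```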


2017 ◽  
Vol 7 (1.5) ◽  
pp. 269
Author(s):  
D. Ganesha ◽  
Vijayakumar Maragal Venkatamuni

This research introduces a self-learning modified Q-learning technique in EMCAP (Enhanced Mind Cognitive Architecture of Pupils). Q-learning is a model-free reinforcement learning (RL) method; in particular, it can be applied to establish an optimal action-selection strategy for any given Markov decision process. The EMCAP architecture [1] enables and presents various agent control strategies for static and dynamic environments. Experiments were conducted to evaluate the performance of each agent individually, and the same statistics were collected for comparison among the different agents. This work considered various kinds of agents at different levels of the architecture. The Fungus World testbed, implemented in SWI-Prolog 5.4.6, was used for the experiments; fixed obstacles were placed at specific locations to configure the environment, and various parameters were introduced to test an agent's performance. The modified Q-learning algorithm is well suited to the EMCAP architecture: in the experiments, it accumulated more reward than the existing Q-learning.
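
For reference, the standard tabular Q-learning update that such modifications build on is shown below; the learning rate and discount factor are illustrative values, and the four-action table is an assumption for a grid-like testbed.

```python
import numpy as np
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9                    # learning rate, discount factor
Q = defaultdict(lambda: np.zeros(4))       # Q-table: state -> Q-values per action

def q_update(state, action, reward, next_state):
    """One-step off-policy TD update:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + GAMMA * Q[next_state].max()
    Q[state][action] += ALPHA * (td_target - Q[state][action])
```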


Sensors ◽  
2018 ◽  
Vol 18 (11) ◽  
pp. 3606 ◽  
Author(s):  
Wanli Xue ◽  
Zhiyong Feng ◽  
Chao Xu ◽  
Zhaopeng Meng ◽  
Chengwei Zhang

Although tracking research has achieved excellent performance from a mathematical standpoint, it is still worthwhile to analyze tracking problems from multiple perspectives. This motivation not only promotes the independence of tracking research but also increases the flexibility of practical applications. This paper presents a tracking framework based on reinforcement learning over a multi-dimensional state-action space, termed multi-angle analysis collaboration tracking (MACT). MACT comprises a basic tracking framework and a strategic framework that assists it. Notably, the strategic framework is extensible and currently includes a feature selection strategy (FSS) and a movement trend strategy (MTS). These strategies are abstracted from multi-angle analysis of tracking problems (the observer's attention and the object's motion), and the content of the analysis corresponds to specific actions in the multi-dimensional action space. Concretely, the tracker, regarded as an agent, is trained with the Q-learning algorithm and an ϵ-greedy exploration strategy, where a customized rewarding function encourages robust object tracking. Extensive comparative evaluations on the OTB50 benchmark demonstrate the effectiveness of the strategies and the improvement in speed and accuracy of the MACT tracker.
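
A minimal sketch of the ϵ-greedy selection named above, paired with an illustrative overlap-based reward; the IoU threshold and reward values are stand-ins for the paper's customized rewarding function, not its actual definition.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon take a random action, else the greedy one."""
    if np.random.rand() < epsilon:
        return np.random.randint(len(q_values))
    return int(np.argmax(q_values))

def tracking_reward(iou, threshold=0.7):
    """Hypothetical reward: encourage strong overlap between the predicted
    and ground-truth bounding boxes, penalize weak overlap."""
    return 1.0 if iou >= threshold else -1.0
```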


2019 ◽  
Vol 9 (15) ◽  
pp. 3057 ◽  
Author(s):  
Hyansu Bae ◽  
Gidong Kim ◽  
Jonguk Kim ◽  
Dianwei Qian ◽  
Sukgyu Lee

This paper proposes a novel multi-robot path planning algorithm using deep Q-learning combined with a convolutional neural network (CNN). In conventional path planning algorithms, robots need to search a comparatively wide area for navigation and move in a predesigned formation in a given environment. Each robot in a multi-robot system is inherently required to navigate independently while collaborating with other robots for efficient performance. In addition, the robot collaboration scheme depends heavily on the condition of each robot, such as its position and velocity. However, conventional methods do not actively cope with variable situations, since each robot has difficulty recognizing whether a moving robot around it is an obstacle or a cooperative robot. To compensate for these shortcomings, we apply deep Q-learning combined with a CNN, which is needed to analyze the situation efficiently: the CNN assesses the situation using image information about the environment, and the robot navigates based on the situation analyzed through deep Q-learning. Simulation results show that the proposed algorithm yields more flexible and efficient movement of the robots than conventional methods in various environments.
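
A minimal PyTorch sketch of the kind of CNN Q-network the abstract describes, mapping an environment image to per-action Q-values; the input shape (4 stacked 84x84 frames), layer widths, and action count are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CnnQNet(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(          # encodes the environment image
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(              # maps features to Q-values
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, x):
        return self.head(self.features(x))

# One robot's Q-values over, e.g., 5 hypothetical discrete moves:
q_values = CnnQNet(n_actions=5)(torch.zeros(1, 4, 84, 84))
```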


2020 ◽  
Author(s):  
Josias G. Batista ◽  
Felipe J. S. Vasconcelos ◽  
Kaio M. Ramos ◽  
Darielson A. Souza ◽  
José L. N. Silva

Industrial robots have grown over the years, making production systems more and more efficient and creating the need for trajectory generation algorithms that optimize and, if possible, generate collision-free trajectories without interrupting the production process. This work presents the use of Reinforcement Learning (RL), based on the Q-learning algorithm, for trajectory generation of a robotic manipulator, and compares its use with and without constraints on the manipulator kinematics in order to generate collision-free trajectories. Simulation results are presented with respect to the efficiency of the algorithm and its use in trajectory generation; a comparison of the computational cost of using the constraints is also presented.
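
One plausible way to fold kinematic constraints into the Q-learning action choice, shown as a sketch only: infeasible joint moves are masked out before the greedy selection. The two-joint limits, step size, and action set below are hypothetical, and the paper may enforce its constraints differently.

```python
import numpy as np

JOINT_LIMITS = np.array([[-np.pi, np.pi], [-np.pi / 2, np.pi / 2]])  # assumed limits (rad)
STEP = 0.05                                   # hypothetical joint increment (rad)
MOVES = np.array([[STEP, 0], [-STEP, 0], [0, STEP], [0, -STEP]])     # discrete actions

def feasible_mask(joints):
    """True for actions that keep every joint inside its limits."""
    nxt = joints + MOVES
    return np.all((nxt >= JOINT_LIMITS[:, 0]) & (nxt <= JOINT_LIMITS[:, 1]), axis=1)

def constrained_greedy(q_row, joints):
    q = np.where(feasible_mask(joints), q_row, -np.inf)  # forbid infeasible moves
    return int(np.argmax(q))
```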

