The optimization of path planning for multi-robot system using Boltzmann Policy based Q-learning algorithm

Author(s):  
Zeying Wang ◽  
Zhiguo Shi ◽  
Yuankai Li ◽  
Jun Tu
2019 ◽  
Vol 9 (15) ◽  
pp. 3057
Author(s):  
Hyansu Bae ◽  
Gidong Kim ◽  
Jonguk Kim ◽  
Dianwei Qian ◽  
Sukgyu Lee

This paper proposes a novel multi-robot path planning algorithm that combines Deep Q-learning with a CNN (Convolutional Neural Network). In conventional path planning algorithms, robots need to search a comparatively wide area for navigation and move in a predesigned formation under a given environment. Each robot in a multi-robot system must navigate independently while collaborating with other robots for efficient performance. In addition, the robot collaboration scheme depends heavily on the conditions of each robot, such as its position and velocity. However, the conventional method does not cope actively with changing situations, since each robot has difficulty recognizing whether a moving robot nearby is an obstacle or a cooperative robot. To compensate for these shortcomings, we combine Deep Q-learning with a CNN, which is needed to analyze the situation efficiently. The CNN analyzes the situation using image information about the environment, and the robot navigates based on the analyzed situation through Deep Q-learning. Simulation results show that, with the proposed algorithm, the robots move more flexibly and efficiently than with conventional methods under various environments.
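As a rough illustration of the learning rule behind the approach described above, the sketch below applies a single tabular Q-learning update. It is not the authors' implementation: the abstract gives no network architecture, reward design, or hyperparameters, so the function name `q_update`, the grid-cell states, the action names, and the values of `alpha` and `gamma` are all assumptions. Deep Q-learning replaces the table used here with a neural network (a CNN in the paper) that estimates the same values.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q(state, action).

    alpha (learning rate) and gamma (discount factor) are illustrative values,
    not taken from the paper.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical grid world: states are (row, col) cells, actions are moves.
actions = ["up", "down", "left", "right"]
q = defaultdict(float)  # every unseen (state, action) pair starts at 0.0
new_value = q_update(q, state=(0, 0), action="right", reward=1.0,
                     next_state=(0, 1), actions=actions)
```

With an all-zero table, the updated entry is simply `alpha * reward`, which makes the effect of the learning rate easy to check by hand.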


2021 ◽  
Vol 50 (2) ◽  
pp. 357-374
Author(s):  
Novak Zagradjanin ◽  
Aleksandar Rodic ◽  
Dragan Pamucar ◽  
Bojan Pavkovic

This paper considers an autonomous cloud-based multi-robot system designed to execute highly repetitive tasks in a dynamic environment such as a modern megastore. The cloud level is intended for performing the most demanding operations in order to unload the robots that are users of cloud services in this architecture. For path planning on the global level, the D* Lite algorithm is applied, bearing in mind its high efficiency in dynamic environments. In order to introduce a smart cost map for further improvement of path planning in complex and crowded environments, the implementation of a fuzzy inference system and a learning algorithm is proposed. The results indicate the possibility of applying a similar concept in different real-world robotics applications, in order to reduce the total path length, as well as to minimize the risk in path planning related to human-robot interactions.


2021 ◽  
Vol 11 (4) ◽  
pp. 1448
Author(s):  
Wenju Mao ◽  
Zhijie Liu ◽  
Heng Liu ◽  
Fuzeng Yang ◽  
Meirong Wang

Multi-robots have shown good application prospects in agricultural production. Studying the synergistic technologies of agricultural multi-robots can not only improve the efficiency of the overall robot system and meet the needs of precision farming but also solve the problems of decreasing effective labor supply and increasing labor costs in agriculture. Therefore, starting from the point of view of agricultural multi-robot system architectures, this paper reviews the representative research results of five synergistic technologies of agricultural multi-robots in recent years, namely, environment perception, task allocation, path planning, formation control, and communication, and summarizes the technological progress and development characteristics of these five technologies. Finally, based on these development characteristics, it is shown that the trends and research focus for agricultural multi-robots are to optimize the existing technologies and apply them to a variety of agricultural multi-robots, such as building a hybrid architecture of multi-robot systems, SLAM (simultaneous localization and mapping), cooperation learning of robots, hybrid path planning, and formation reconstruction. While synergistic technologies of agricultural multi-robots are extremely challenging in production, in combination with previous research results for real agricultural multi-robots and social development demand, we conclude that it is realistic to expect automated multi-robot systems in the future.


2012 ◽  
Vol 51 (9) ◽  
pp. 40-46
Author(s):  
Pradipta K Das ◽  
S. C. Mandhata ◽  
H. S. Behera ◽  
S. N. Patro

Author(s):  
Tianze Zhang ◽  
Xin Huo ◽  
Songlin Chen ◽  
Baoqing Yang ◽  
Guojiang Zhang

2016 ◽  
Vol 16 (4) ◽  
pp. 113-125
Author(s):  
Jianxian Cai ◽  
Xiaogang Ruan ◽  
Pengxuan Li

An autonomous path-planning strategy based on the Skinner operant conditioning principle and the reinforcement learning principle is developed in this paper. The core of the strategy is the use of a tendency cell and a cognitive learning cell, which simulate bionic orientation and asymptotic learning ability. The cognitive learning cell is designed on the basis of a Boltzmann machine and an improved Q-learning algorithm, and executes the operant action learning function to approximate the operative part of the robot system. The tendency cell adjusts network weights by using information entropy to evaluate the function of the operant action. Simulation experiments on a mobile robot showed that the designed strategy lets the robot achieve autonomous navigation path planning. The robot learns to select actions autonomously according to the bionic orientation, and shows a fast convergence rate and higher adaptability.
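The combination of Boltzmann-style selection with Q-learning mentioned above can be illustrated with a standard Boltzmann (softmax) exploration rule, in which an action is chosen with probability proportional to exp(Q / T). This is a minimal sketch of that textbook rule, not the paper's cognitive learning cell: the function names `boltzmann_probabilities` and `boltzmann_select`, the example Q-values, and the temperature values are assumptions made for illustration.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature=1.0):
    """Return softmax probabilities over actions for the given Q-values.

    A high temperature gives near-uniform exploration; a low temperature
    concentrates probability on the highest-valued action.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    highest = max(q_values)
    exps = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_select(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Hypothetical Q-values for three candidate actions in one state.
probs = boltzmann_probabilities([1.0, 2.0, 3.0], temperature=1.0)
chosen = boltzmann_select([1.0, 2.0, 3.0], temperature=1.0)
```

Lowering the temperature over the course of training is a common way to shift from exploration to exploitation; whether the paper uses such a schedule is not stated in the abstract.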


2013 ◽  
Vol 823 ◽  
pp. 321-325
Author(s):  
Lu Jin ◽  
Yue Quan Yang ◽  
Chun Bo Ni ◽  
Zhi Qiang Cao ◽  
Yi Fei Kong

As the number of robots grows, information interaction in a multi-robot system becomes more sophisticated and important in a community perception network environment. By exploiting and fusing the learning information of robots in a perception community, a community information sharing mechanism is proposed, together with updating rules for the community Q-value table. Moreover, considering the delays in the transmission of learning information, an improved Q-learning method based on homogeneous delays is presented to improve robot learning efficiency over the community perception network. Finally, test experiments demonstrate the effectiveness of the proposed scheme.

