Indoor Emergency Path Planning Based on the Q-Learning Optimization Algorithm

2022, Vol 11 (1), pp. 66
Author(s): Shenghua Xu, Yang Gu, Xiaoyan Li, Cai Chen, Yingyi Hu, ...

The internal structure of buildings is becoming increasingly complex. Providing a scientific and reasonable evacuation route for trapped persons in a complex indoor environment is important for reducing casualties and property losses. In emergency and disaster relief environments, indoor path planning involves great uncertainty and stricter safety requirements. Q-learning is a value-based reinforcement learning algorithm that can complete path planning tasks through autonomous learning, without establishing mathematical models or environmental maps. We therefore propose an indoor emergency path planning method based on a Q-learning optimization algorithm. First, a grid environment model is established. The Q-learning algorithm is optimized by applying a discount (decay) rate to the exploration factor: the exploration factor in the ε-greedy strategy is dynamically adjusted before random actions are selected, which accelerates convergence in a large-scale grid environment. An indoor emergency path planning experiment based on the optimized algorithm was carried out using both simulated data and real indoor environment data. The proposed Q-learning optimization algorithm essentially converges after about 500 learning episodes, roughly 2000 episodes fewer than the classic Q-learning algorithm requires, while the SARSA algorithm shows no clear convergence trend within 5000 episodes. The results show that the proposed algorithm outperforms both the SARSA algorithm and the classic Q-learning algorithm in solving time and convergence speed when planning the shortest path in a grid environment; its convergence is approximately five times faster than that of classic Q-learning. In the grid environment, the proposed algorithm successfully plans the shortest path around obstacle areas in a short time.
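The core of the optimization is the decaying exploration factor in the ε-greedy policy. Below is a minimal Python sketch of that idea; the grid layout, reward values, decay rate, and learning parameters are illustrative assumptions, not the authors' reported settings.

```python
import numpy as np

# Hypothetical 10x10 grid: 0 = free cell, 1 = obstacle; exit at (9, 9).
GRID = np.zeros((10, 10), dtype=int)
GRID[4, 1:8] = 1                                  # an obstacle wall (assumed layout)
GOAL = (9, 9)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # up, down, left, right

alpha, gamma = 0.1, 0.9                           # learning rate, discount factor
epsilon, eps_decay, eps_min = 1.0, 0.995, 0.05    # decaying exploration factor
Q = np.zeros((10, 10, len(ACTIONS)))

def step(state, a):
    """Apply action a; hitting a wall or obstacle leaves the state unchanged."""
    r, c = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
    if not (0 <= r < 10 and 0 <= c < 10) or GRID[r, c] == 1:
        return state, -5.0, False                 # penalty for obstacle/boundary
    if (r, c) == GOAL:
        return (r, c), 100.0, True                # large reward at the exit
    return (r, c), -1.0, False                    # step cost encourages short paths

for episode in range(500):
    state, done = (0, 0), False
    while not done:
        if np.random.rand() < epsilon:            # explore
            a = np.random.randint(len(ACTIONS))
        else:                                     # exploit
            a = int(np.argmax(Q[state]))
        nxt, reward, done = step(state, a)
        # Standard Q-learning update
        Q[state][a] += alpha * (reward + gamma * np.max(Q[nxt]) - Q[state][a])
        state = nxt
    # Dynamically shrink the exploration factor so later episodes exploit more
    epsilon = max(eps_min, epsilon * eps_decay)
```

Early episodes explore almost uniformly; as ε decays toward its floor, the agent increasingly follows the learned Q-values, which is what speeds convergence in large grids.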

2012, Vol 51 (9), pp. 40-46
Author(s): Pradipta K. Das, S. C. Mandhata, H. S. Behera, S. N. Patro

Author(s): Tianze Zhang, Xin Huo, Songlin Chen, Baoqing Yang, Guojiang Zhang

2016, Vol 16 (4), pp. 113-125
Author(s): Jianxian Cai, Xiaogang Ruan, Pengxuan Li

Abstract An autonomous path-planning strategy based on the Skinner operant conditioning principle and the reinforcement learning principle is developed in this paper. The core of the strategy is the use of a tendency cell and a cognitive learning cell, which simulate bionic orientation and asymptotic learning ability. The cognitive learning cell is designed on the basis of a Boltzmann machine and an improved Q-learning algorithm, and it executes the operant action learning function to approximate the operative part of the robot system. The tendency cell adjusts the network weights, using information entropy to evaluate the operant actions. Simulation experiments on a mobile robot show that the designed strategy enables the robot to perform autonomous navigation and path planning: the robot learns to select actions autonomously according to the bionic orientation mechanism, with a fast convergence rate and high adaptability.
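The abstract names a Boltzmann machine combined with improved Q-learning but gives no equations. Boltzmann (softmax) action selection over Q-values is the standard formulation of that combination, so the following is a hedged sketch of it; the temperature schedule and Q-values are assumed for illustration.

```python
import numpy as np

def boltzmann_action(q_values, temperature):
    """Sample an action with probability proportional to exp(Q / T).
    High T gives near-uniform exploration; low T approaches greedy choice."""
    prefs = q_values / max(temperature, 1e-8)
    prefs = prefs - prefs.max()            # subtract the max for numerical stability
    probs = np.exp(prefs)
    probs /= probs.sum()
    return np.random.choice(len(q_values), p=probs)

# Illustrative use: anneal the temperature across learning stages (assumed schedule)
q = np.array([0.2, 1.5, 0.7, 0.1])
for t in (5.0, 1.0, 0.1):
    print(t, boltzmann_action(q, t))
```

Annealing the temperature plays a role analogous to decaying ε in ε-greedy: exploration dominates early, exploitation later.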


Author(s): Taichi Chujo, Kosei Nishida, Tatsushi Nishi

Abstract In modern large-scale fabrication, hundreds of vehicles are used for transportation. Since traffic conditions change rapidly, the routing of automated guided vehicles (AGVs) must be adapted accordingly. We propose a conflict-free routing method for AGVs that uses reinforcement learning under dynamic transportation conditions. An advantage of the proposed method is that a change in the traffic state can be scored by an evaluation function, so actions can be selected according to the current state. A deadlock avoidance method for bidirectional transport systems is also developed using reinforcement learning. The effectiveness of the proposed method is demonstrated by comparing its performance with that of the conventional Q-learning algorithm in computational experiments.
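The key idea, as the abstract states it, is that a change in traffic state can be scored by an evaluation function. A minimal sketch of that idea follows; the node identifiers, reward values, and action count are entirely assumed, not taken from the paper.

```python
# Sketch: one tabular Q-learning step for a single AGV on a shared route graph,
# where the evaluation of a state change penalizes moves that would create a
# node conflict with another vehicle. All names and values are illustrative.

def evaluate_transition(next_node, occupied, goal):
    """Score a candidate move from the traffic state it produces."""
    if next_node in occupied:          # node already held by another AGV
        return -50.0                   # heavy penalty deters conflicts/deadlocks
    if next_node == goal:
        return 100.0                   # reaching the destination
    return -1.0                        # per-step cost favors short routes

def q_update(Q, node, action, next_node, occupied, goal,
             n_actions=4, alpha=0.1, gamma=0.9):
    """One Q-learning update keyed on (node, action) pairs."""
    r = evaluate_transition(next_node, occupied, goal)
    best_next = max(Q.get((next_node, a), 0.0) for a in range(n_actions))
    old = Q.get((node, action), 0.0)
    Q[(node, action)] = old + alpha * (r + gamma * best_next - old)

# Example: the AGV at node 3 takes action 1 toward node 7 while node 5 is occupied
Q = {}
q_update(Q, node=3, action=1, next_node=7, occupied={5}, goal=9)
```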


2019, Vol 9 (15), pp. 3057
Author(s): Hyansu Bae, Gidong Kim, Jonguk Kim, Dianwei Qian, Sukgyu Lee

This paper proposes a novel multi-robot path planning algorithm that combines deep Q-learning with a convolutional neural network (CNN). In conventional path planning algorithms, robots must search a comparatively wide area for navigation and move in a predesigned formation in a given environment. Each robot in a multi-robot system must navigate independently while collaborating with the other robots for efficient performance. Moreover, the collaboration scheme depends strongly on the condition of each robot, such as its position and velocity. Conventional methods, however, do not cope actively with changing situations, since each robot has difficulty deciding whether a moving robot nearby is an obstacle or a cooperating robot. To compensate for these shortcomings, we apply deep Q-learning combined with a CNN, which is needed to analyze the situation efficiently. The CNN assesses the situation from image information about the environment, and the robot navigates based on the situation analyzed through deep Q-learning. Simulation results show that the proposed algorithm yields more flexible and efficient robot movement than conventional methods in various environments.
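The architecture described is a CNN that reads an image of the environment and feeds a deep Q-network that outputs one Q-value per discrete action. A minimal PyTorch sketch of such a network follows; the input resolution, channel counts, layer sizes, and four-action space are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CNNQNetwork(nn.Module):
    """Maps an environment image to Q-values, one per discrete robot action."""
    def __init__(self, n_actions=4):
        super().__init__()
        self.features = nn.Sequential(                    # assumed 1x84x84 input
            nn.Conv2d(1, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),        # 84 -> 20 -> 9 spatially
            nn.Linear(256, n_actions),                    # Q-value per action
        )

    def forward(self, x):
        return self.head(self.features(x))

# Greedy action from a single image observation (batch of 1)
net = CNNQNetwork()
obs = torch.zeros(1, 1, 84, 84)
action = net(obs).argmax(dim=1).item()
```

Training would wrap this network in a standard DQN loop (replay buffer, target network); the sketch shows only the CNN-to-Q-value mapping the abstract describes.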


2020
Author(s): Josias G. Batista, Felipe J. S. Vasconcelos, Kaio M. Ramos, Darielson A. Souza, José L. N. Silva

Industrial robots have advanced over the years, making production systems increasingly efficient and creating a need for trajectory generation algorithms that optimize and, where possible, produce collision-free trajectories without interrupting the production process. This work presents the use of Reinforcement Learning (RL), based on the Q-learning algorithm, for trajectory generation of a robotic manipulator, together with a comparison of its use with and without kinematic constraints on the manipulator, in order to generate collision-free trajectories. Simulation results are presented regarding the efficiency of the algorithm and its use in trajectory generation; a comparison of the computational cost of applying the constraints is also presented.
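One common way to impose kinematic constraints on tabular Q-learning is to mask out actions whose resulting joint configuration violates the limits, at some extra cost per step. The paper does not specify its mechanism, so the following is a hedged sketch of constraint masking; the two-joint limits, step sizes, and action set are placeholder assumptions.

```python
import numpy as np

# Assumed 2-DOF manipulator: joint limits and discrete joint-angle increments
JOINT_LIMITS = [(-np.pi, np.pi), (-np.pi / 2, np.pi / 2)]
DELTAS = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]])

def feasible_actions(joints):
    """Indices of actions whose resulting joint angles stay within the limits."""
    ok = []
    for i, d in enumerate(DELTAS):
        nxt = joints + d
        if all(lo <= q <= hi for q, (lo, hi) in zip(nxt, JOINT_LIMITS)):
            ok.append(i)
    return ok

def select_action(q_row, joints, epsilon=0.1):
    """Epsilon-greedy choice restricted to kinematically feasible actions."""
    allowed = feasible_actions(joints)
    if np.random.rand() < epsilon:
        return int(np.random.choice(allowed))
    return max(allowed, key=lambda a: q_row[a])

# Example: pick an action for the home configuration with untrained Q-values
q_row = np.zeros(len(DELTAS))
print(select_action(q_row, np.array([0.0, 0.0])))
```

The feasibility check is the extra work the unconstrained variant skips, which is consistent with the computational-cost comparison the abstract mentions.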

