Crowd Evacuation Guidance Based on Combined Action Reinforcement Learning

Algorithms ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 26
Author(s):  
Yiran Xue ◽  
Rui Wu ◽  
Jiafeng Liu ◽  
Xianglong Tang

Existing crowd evacuation guidance systems require the manual design of models and input parameters, which incurs a significant workload and a potential for errors. This paper proposes an end-to-end intelligent evacuation guidance method based on deep reinforcement learning and designs an interactive simulation environment based on the social force model. The agent automatically learns a scene model and a path planning strategy with only scene images as input, and directly outputs dynamic signage information. To address the “dimension disaster” (curse of dimensionality) of the deep Q network (DQN) algorithm in crowd evacuation, the paper proposes a combined action-space DQN (CA-DQN) algorithm that groups the Q network's output layer nodes according to action dimensions, which significantly reduces network complexity and improves system practicality in complex scenes. The evacuation guidance system is defined as a reinforcement learning agent and implemented by the CA-DQN method, which provides a novel approach to the evacuation guidance problem. The experiments demonstrate that the proposed method is superior to the static guidance method and on par with the manually designed model method.
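
To make the effect of grouping output nodes by action dimension concrete, the following minimal sketch compares the size of a joint-action output layer with a grouped one. The scene (6 signs with 4 directions each), the function names, and the per-group arg-max readout are illustrative assumptions, not details taken from the paper.

```python
from math import prod


def joint_output_size(options_per_dimension):
    """Output nodes when every joint action gets its own Q-value node."""
    return prod(options_per_dimension)


def grouped_output_size(options_per_dimension):
    """Output nodes when nodes are grouped by action dimension."""
    return sum(options_per_dimension)


def select_combined_action(q_values, options_per_dimension):
    """Assumed readout: take the arg-max inside each group of a flat Q vector."""
    action, start = [], 0
    for n in options_per_dimension:
        group = q_values[start:start + n]
        action.append(max(range(n), key=group.__getitem__))
        start += n
    return action


# Hypothetical scene: 6 dynamic signs, each able to show one of 4 directions.
signs = [4] * 6
print(joint_output_size(signs))    # 4096
print(grouped_output_size(signs))  # 24
```

Under these assumptions the output layer grows with the sum of the per-dimension option counts rather than their product, which is the kind of reduction in network complexity the abstract describes.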

2018 ◽  
Vol 27 (2) ◽  
pp. 111-126 ◽  
Author(s):  
Thommen George Karimpanal ◽  
Roland Bouffanais

The idea of reusing or transferring information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of a reinforcement learning agent. In this work, we describe a novel approach for reusing previously acquired knowledge by using it to guide the exploration of an agent while it learns new tasks. In order to do so, we employ a variant of the growing self-organizing map algorithm, which is trained using a measure of similarity that is defined directly in the space of the vectorized representations of the value functions. In addition to enabling transfer across tasks, the resulting map is simultaneously used to enable the efficient storage of previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment and also demonstrate its utility through simple experiments using a mobile micro-robotics platform. In addition, we demonstrate the scalability of this approach and analytically examine its relation to the proposed network growth mechanism. Furthermore, we briefly discuss some of the possible improvements and extensions to this approach, as well as its relevance to real-world scenarios in the context of continual learning.
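
As a rough illustration of comparing tasks directly in the space of vectorized value functions, the sketch below picks the stored task whose value vector is most similar to a target task. The abstract does not name the similarity measure, so cosine similarity is an assumption here, as are the function names and the toy vectors.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two vectorized value functions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero value function as dissimilar to everything
    return dot / (norm_u * norm_v)


def most_similar_task(target_values, stored_values):
    """Return the name of the stored task closest to the target (assumed lookup)."""
    return max(
        stored_values,
        key=lambda name: cosine_similarity(target_values, stored_values[name]),
    )


# Hypothetical source tasks, each summarized by a flattened value function.
stored = {"reach_left": [1.0, 0.0, 0.2], "reach_right": [0.0, 1.0, 0.2]}
print(most_similar_task([0.9, 0.1, 0.2], stored))
```

The selected source task could then be used to bias exploration on the new task, which is the role the abstract assigns to previously acquired knowledge.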


2015 ◽  
Vol 168 ◽  
pp. 529-537 ◽  
Author(s):  
Mingliang Xu ◽  
Yunpeng Wu ◽  
Pei Lv ◽  
Hao Jiang ◽  
Mingxuan Luo ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Juan Wei ◽  
Wenjie Fan ◽  
Zhongyu Li ◽  
Yangyong Guo ◽  
Yuanyuan Fang ◽  
...  

Owing to mutual interaction and external interference, crowds constantly and dynamically adjust their evacuation paths during an evacuation in order to evacuate quickly. Information from the previous process can be used to modify the current evacuation control information to achieve a better evacuation effect, and iterative learning control can effectively predict the expected path within a limited running time. To describe this process, the social force model is improved with an iterative extended state observer so that the crowd can move along the optimal evacuation path. First, the objective function of the optimal evacuation path is established in the improved model, and an iterative extended state observer is designed to obtain the estimated value. Second, the model is verified through simulation experiments. The results show that, as the number of iterations increases, the evacuation time first decreases and then increases.
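
The idea of reusing information from the previous run to correct the current control input can be illustrated with a generic P-type iterative learning control update. This is a minimal sketch and not the paper's iterative extended state observer: the static plant `y = plant_gain * u`, the function names, and the gain values are all assumptions chosen for illustration.

```python
def ilc_update(control, error, learning_gain):
    """P-type update: next control = previous control + gain * previous error."""
    return [u + learning_gain * e for u, e in zip(control, error)]


def run_ilc(reference, plant_gain, learning_gain, iterations):
    """Repeat the same task, correcting the control signal after each run."""
    control = [0.0] * len(reference)
    max_errors = []
    for _ in range(iterations):
        # Assumed toy plant: the output is a scaled copy of the control input.
        output = [plant_gain * u for u in control]
        error = [r - y for r, y in zip(reference, output)]
        max_errors.append(max(abs(e) for e in error))
        control = ilc_update(control, error, learning_gain)
    return control, max_errors


control, max_errors = run_ilc([1.0, 2.0], plant_gain=0.5, learning_gain=1.0, iterations=5)
print(max_errors)  # [2.0, 1.0, 0.5, 0.25, 0.125]
```

In this toy setting the tracking error halves on every run. It does not reproduce the reported behaviour of evacuation time first decreasing and then increasing, which depends on the full crowd model.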


2017 ◽  
Vol 14 (1) ◽  
pp. 359-366
Author(s):  
Liang Li ◽  
Hong Liu ◽  
Lei Lv

The Arch Effect is a universal natural phenomenon caused by the non-homogeneous displacement of the media. Similarly, pedestrians in a crowd evacuation simulation based on the social force model form an arch near the exit within a short time. The resulting arch may decrease evacuation efficiency and lead to extremely dangerous situations. In this paper, inspired by the similarity between the natural Arch Effect and pedestrians’ evacuation behaviors, a hypothesis is proposed that the arching phenomenon in crowd evacuation simulations can be controlled by applying the theory of the natural Arch Effect. The hypothesis is tested in two steps. First, the obstacles in the scene are treated as the arch feet. By setting obstacles at appropriate positions, the arch is formed at a position more distant from the exit. This outer arch can help to avoid extremely dangerous situations and improve evacuation efficiency. Second, a modified interval equation based on the Arch Effect is proposed to calculate the proper interval between the obstacles to be set in the scene. Taking into account the pressure in the crowd and the size of the obstacles, the equation aims to provide the optimal interval value for pedestrian evacuation. The experimental results show that applying the theory of the Arch Effect is an effective way to analyze and control the arching phenomenon in crowd evacuation simulations.


2020 ◽  
Vol 309 ◽  
pp. 05001
Author(s):  
Benbu Liang ◽  
Kefan Xie ◽  
Xueqin Dong

With growing concern about stadiums, which attract large mass gatherings, modeling and simulating crowd evacuation is pertinent to ensuring efficient and safe conditions. Based on the modified social force model and multi-agent simulation, several simulation scenarios are run to study the walking-along-side effects. The results show that walking along the sides increases evacuation time, but it can mitigate the pressure of clogging effects and stream arching queue. Meanwhile, walking-along-side effects can relieve the density pressure at the exit and the "fast-is-slow" phenomenon. Finally, several suggestions are put forward to improve the evacuation capacity of the stadium.
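
For readers unfamiliar with the social force model that this and neighbouring entries build on, the sketch below computes two of its standard terms for one pedestrian: the driving force toward a desired velocity and an exponential repulsion from a neighbour. It shows the basic model only, not the modified version used in the paper, and the function names and parameter values are illustrative assumptions.

```python
from math import exp, hypot


def driving_force(mass, desired_speed, direction, velocity, relax_time):
    """Driving term m * (v0 * e - v) / tau, computed per component."""
    return tuple(
        mass * (desired_speed * e - v) / relax_time
        for e, v in zip(direction, velocity)
    )


def repulsive_force(pos_i, pos_j, radius_sum, strength, falloff):
    """Repulsion on pedestrian i from j: A * exp((r_ij - d_ij) / B) * n_ij.

    Assumes the two pedestrians are not at exactly the same position.
    """
    dx, dy = pos_i[0] - pos_j[0], pos_i[1] - pos_j[1]
    distance = hypot(dx, dy)
    magnitude = strength * exp((radius_sum - distance) / falloff)
    return (magnitude * dx / distance, magnitude * dy / distance)


# Illustrative values: an 80 kg pedestrian starting from rest, heading along +x.
print(driving_force(80.0, 1.0, (1.0, 0.0), (0.0, 0.0), 0.5))      # (160.0, 0.0)
print(repulsive_force((1.0, 0.0), (0.0, 0.0), 1.0, 2000.0, 0.08))  # (2000.0, 0.0)
```

Summing such terms over all neighbours and walls and integrating the resulting acceleration gives the pedestrian trajectories from which effects such as clogging at the exit emerge.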


Author(s):  
Shahzaib Hamid ◽  
Ali Nasir ◽  
Yasir Saleem

The field of robotics has been in the limelight because of recent advances in Artificial Intelligence (AI). Due to increased diversity in multi-agent systems, new models are being developed to handle the complexity of such systems. However, most of these models do not address problems such as uncertainty handling, efficient learning, agent coordination and fault detection. This paper presents a novel approach to implementing Reinforcement Learning (RL) on hierarchical robotic search teams. The proposed algorithm handles uncertainties in the system by implementing Q-learning and shows improved efficiency and lower time consumption compared to prior models. The reason is that each agent can take actions on its own, so there is less dependency on the leader agent for the RL policy. The performance of this algorithm is measured by introducing agents into an unknown environment with both Markov Decision Process (MDP) and RL policies at their disposal. A simulation-based comparison of the agent motion is presented using the results of the MDP and RL policies. Furthermore, a qualitative comparison of the proposed model with prior models is also presented.
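
The Q-learning step that each agent could run independently can be sketched as a single tabular update. This is the textbook update rule, not the paper's hierarchical algorithm; the state and action names, the learning rate, and the discount factor are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical search agent with two moves; unseen state-action pairs start at 0.
q_table = defaultdict(float)
moves = ["left", "right"]
print(q_update(q_table, "cell_0", "right", 1.0, "cell_1", moves, alpha=0.5))  # 0.5
print(q_update(q_table, "cell_0", "right", 1.0, "cell_1", moves, alpha=0.5))  # 0.75
```

Because the update needs only the agent's own transition, each team member can maintain its own table, which is consistent with the reduced dependency on the leader agent described above.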


2022 ◽  
Vol 27 (3) ◽  
pp. 619-629
Author(s):  
Wenhan Wu ◽  
Maoyin Chen ◽  
Jinghai Li ◽  
Binglu Liu ◽  
Xiaolu Wang ◽  
...  
