Crowd Evacuation Guidance Based on Combined Action Reinforcement Learning

Existing crowd evacuation guidance systems require the manual design of models and input parameters, which incurs a significant workload and introduces the potential for errors. This paper proposes an end-to-end intelligent evacuation guidance method based on deep reinforcement learning and designs an interactive simulation environment based on the social force model. The agent automatically learns a scene model and a path planning strategy with only scene images as input, and directly outputs dynamic signage information. To address the curse of dimensionality (the “dimension disaster”) that the deep Q network (DQN) algorithm faces in crowd evacuation, the paper proposes a combined action-space DQN (CA-DQN) algorithm that groups the Q network's output layer nodes according to action dimensions, which significantly reduces network complexity and improves the system's practicality in complex scenes. The evacuation guidance system is defined as a reinforcement learning agent and implemented with the CA-DQN method, providing a novel approach to the evacuation guidance problem. Experiments demonstrate that the proposed method is superior to the static guidance method and on par with the manually designed model method.
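The grouping of output nodes by action dimension can be illustrated with a small sketch. It is not the paper's implementation: the function names, the idea that each action dimension is one sign with a fixed number of choices, and the flat list of Q values are all assumptions made for illustration. It only shows why one output group per dimension grows linearly where a joint action space grows exponentially.

```python
from typing import List


def joint_output_size(num_signs: int, choices_per_sign: int) -> int:
    """Output nodes for a plain DQN with one node per joint action (assumed layout)."""
    return choices_per_sign ** num_signs


def grouped_output_size(num_signs: int, choices_per_sign: int) -> int:
    """Output nodes when the layer is split into one group per action dimension."""
    return num_signs * choices_per_sign


def select_combined_action(
    q_values: List[float], num_signs: int, choices_per_sign: int
) -> List[int]:
    """Pick the greedy choice inside each group of a flat output vector.

    The flat vector is assumed to hold the groups back to back:
    group 0 first, then group 1, and so on.
    """
    if len(q_values) != num_signs * choices_per_sign:
        raise ValueError("q_values length must equal num_signs * choices_per_sign")
    action = []
    for g in range(num_signs):
        group = q_values[g * choices_per_sign:(g + 1) * choices_per_sign]
        # Index of the largest Q value within this group only.
        action.append(max(range(choices_per_sign), key=group.__getitem__))
    return action
```

Under these assumptions, six signs with four choices each would need 4096 joint-action outputs but only 24 grouped outputs, and the combined action is read off as one argmax per group.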

Self-organizing maps for storage and transfer of knowledge in reinforcement learning

Adaptive Behavior ◽

10.1177/1059712318818568 ◽

2018 ◽

Vol 27 (2) ◽

pp. 111-126 ◽

Cited By ~ 5

Author(s):

Thommen George Karimpanal ◽

Roland Bouffanais

Keyword(s):

Reinforcement Learning ◽

Self Organizing Map ◽

Value Functions ◽

Transfer Of Knowledge ◽

Network Growth ◽

Self Organizing Maps ◽

Task Knowledge ◽

Novel Approach ◽

Learning Agent ◽

Self Organizing

The idea of reusing or transferring information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of a reinforcement learning agent. In this work, we describe a novel approach for reusing previously acquired knowledge by using it to guide the exploration of an agent while it learns new tasks. In order to do so, we employ a variant of the growing self-organizing map algorithm, which is trained using a measure of similarity that is defined directly in the space of the vectorized representations of the value functions. In addition to enabling transfer across tasks, the resulting map is simultaneously used to enable the efficient storage of previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment and also demonstrate its utility through simple experiments using a mobile micro-robotics platform. In addition, we demonstrate the scalability of this approach and analytically examine its relation to the proposed network growth mechanism. Furthermore, we briefly discuss some of the possible improvements and extensions to this approach, as well as its relevance to real-world scenarios in the context of continual learning.
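A rough sketch of how a growing map over vectorized value functions might store task knowledge is given below. The abstract does not state the similarity measure or the growth rule, so cosine similarity, the `growth_threshold` parameter, the `learning_rate` parameter, and the omission of neighborhood updates are all assumptions chosen to keep the example small.

```python
import math
from typing import List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectorized value functions (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def update_map(
    nodes: List[List[float]],
    value_vector: List[float],
    growth_threshold: float = 0.9,  # hypothetical parameter
    learning_rate: float = 0.5,     # hypothetical parameter
) -> int:
    """Store a task's value vector in the map and return the node index used.

    If no existing node is similar enough, the map grows by one node.
    Otherwise the best matching node is moved toward the new vector.
    """
    if not nodes:
        nodes.append(list(value_vector))
        return 0
    sims = [cosine_similarity(node, value_vector) for node in nodes]
    best = max(range(len(nodes)), key=sims.__getitem__)
    if sims[best] < growth_threshold:
        nodes.append(list(value_vector))
        return len(nodes) - 1
    nodes[best] = [w + learning_rate * (x - w) for w, x in zip(nodes[best], value_vector)]
    return best
```

In this simplified form, similar tasks share a node while dissimilar tasks trigger growth, which is one way a map could stay compact as the number of learned tasks increases.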

miSFM: On combination of Mutual Information and Social Force Model towards simulating crowd evacuation

Neurocomputing ◽

10.1016/j.neucom.2015.05.074 ◽

2015 ◽

Vol 168 ◽

pp. 529-537 ◽

Cited By ~ 30

Author(s):

Mingliang Xu ◽

Yunpeng Wu ◽

Pei Lv ◽

Hao Jiang ◽

Mingxuan Luo ◽

...

Keyword(s):

Mutual Information ◽

Force Model ◽

Social Force ◽

Social Force Model ◽

Crowd Evacuation

Simulating Crowd Evacuation in a Social Force Model with Iterative Extended State Observer

Journal of Advanced Transportation ◽

10.1155/2020/4604187 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Juan Wei ◽

Wenjie Fan ◽

Zhongyu Li ◽

Yangyong Guo ◽

Yuanyuan Fang ◽

...

Keyword(s):

State Observer ◽

Extended State Observer ◽

Force Model ◽

Social Force ◽

Extended State ◽

Social Force Model ◽

External Interference ◽

The Social ◽

Crowd Evacuation ◽

Improved Model

Due to interaction and external interference, crowds constantly and dynamically adjust their evacuation paths during the evacuation process in order to evacuate rapidly. Information from the previous process can be used to modify the current evacuation control information to achieve a better evacuation effect, and iterative learning control can effectively predict the expected path within a limited running time. To depict this process, the social force model is improved with an iterative extended state observer so that the crowd can move along the optimal evacuation path. First, the objective function of the optimal evacuation path is established in the improved model, and an iterative extended state observer is designed to obtain the estimated value. Second, the model is verified through simulation experiments. The results show that, as the number of iterations increases, the evacuation time first decreases and then increases.

Modified social force model based on information transmission toward crowd evacuation simulation

Physica A: Statistical Mechanics and its Applications ◽

10.1016/j.physa.2016.11.014 ◽

2017 ◽

Vol 469 ◽

pp. 499-509 ◽

Cited By ~ 35

Author(s):

Yanbin Han ◽

Hong Liu

Keyword(s):

Information Transmission ◽

Force Model ◽

Social Force ◽

Social Force Model ◽

Evacuation Simulation ◽

Model Based ◽

Crowd Evacuation

A social force model for the crowd evacuation in a terrorist attack

Physica A: Statistical Mechanics and its Applications ◽

10.1016/j.physa.2018.02.136 ◽

2018 ◽

Vol 502 ◽

pp. 315-330 ◽

Cited By ~ 20

Author(s):

Qian Liu

Keyword(s):

Terrorist Attack ◽

Force Model ◽

Social Force ◽

Social Force Model ◽

Crowd Evacuation

Crowd evacuation simulation method combining the density field and social force model

Physica A: Statistical Mechanics and its Applications ◽

10.1016/j.physa.2020.125652 ◽

2021 ◽

Vol 566 ◽

pp. 125652

Author(s):

Yutong Sun ◽

Hong Liu

Keyword(s):

Simulation Method ◽

Density Field ◽

Force Model ◽

Social Force ◽

Social Force Model ◽

Evacuation Simulation ◽

Crowd Evacuation

Research of Arching Phenomenon in Crowd Evacuation Simulation Based on the Arch Effect

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2017.6329 ◽

2017 ◽

Vol 14 (1) ◽

pp. 359-366

Author(s):

Liang Li ◽

Hong Liu ◽

Lei Lv

Keyword(s):

Natural Phenomenon ◽

Force Model ◽

Social Force ◽

Evacuation Simulation ◽

Optimal Interval ◽

Arch Effect ◽

The Media ◽

Crowd Evacuation ◽

Short Time ◽

And Control

The Arch Effect is a universal natural phenomenon caused by the non-homogeneous displacement of the media. Likewise, according to the social force model, the pedestrians in a crowd evacuation simulation form an arch near the exit within a short time. The resulting arch may decrease evacuation efficiency and lead to extremely dangerous situations. In this paper, inspired by the similarity between the natural Arch Effect and pedestrians’ evacuation behaviors, a hypothesis is proposed that the arching phenomenon in crowd evacuation simulations can be controlled by applying the theory of the natural Arch Effect. The hypothesis is tested in two steps. First, the obstacles in the scene are treated as the arch feet. By setting obstacles at appropriate positions, the arch is formed at a position more distant from the exit. This outer arch can help to avoid extremely dangerous situations and improve evacuation efficiency. Second, a modified interval equation based on the Arch Effect is proposed to calculate the proper interval between the obstacles to be set in the scene. Taking the pressure in the crowd and the size of the obstacles into account, the equation aims to provide the optimal interval value for pedestrian evacuation. The experimental results illustrate that applying the theory of the Arch Effect is an effective way to analyze and control the arching phenomenon in crowd evacuation simulations.

Crowd evacuation simulation for walking-along-side effect in the Stadium

MATEC Web of Conferences ◽

10.1051/matecconf/202030905001 ◽

2020 ◽

Vol 309 ◽

pp. 05001

Author(s):

Benbu Liang ◽

Kefan Xie ◽

Xueqin Dong

Keyword(s):

Side Effects ◽

Large Mass ◽

Force Model ◽

Mass Gathering ◽

Social Force ◽

Social Force Model ◽

Evacuation Simulation ◽

Agent Simulation ◽

Multi Agent ◽

Crowd Evacuation

With growing concerns about stadiums that attract large mass gatherings, modeling and simulating crowd evacuation is pertinent to ensuring efficient and safe conditions. Based on a modified social force model and multi-agent simulation, several simulation scenarios are run to study walking-along-side effects. The results show that walking along the sides increases evacuation time, but it can mitigate the pressure of clogging effects and the stream arching queue. Meanwhile, walking-along-side effects can relieve the density pressure at the exit and the "fast-is-slow" phenomenon. Finally, several suggestions are put forward to improve the evacuation capacity of the stadium.

Reinforcement Learning Based Hierarchical Multi-Agent Robotic Search Team in Uncertain Environment

Mehran University Research Journal of Engineering and Technology ◽

10.22581/muet1982.2103.17 ◽

2021 ◽

Vol 40 (3) ◽

pp. 645-662

Author(s):

Shahzaib Hamid ◽

Ali Nasir ◽

Yasir Saleem

Keyword(s):

Reinforcement Learning ◽

Multi Agent Systems ◽

Qualitative Comparison ◽

Q Learning ◽

Novel Approach ◽

Learning Agent ◽

Markov Decision ◽

Multi Agent ◽

Efficient Learning ◽

Prior Models

The field of robotics has been in the limelight because of recent advances in Artificial Intelligence (AI). Due to increased diversity in multi-agent systems, new models are being developed to handle the complexity of such systems. However, most of these models do not address problems such as uncertainty handling, efficient learning, agent coordination, and fault detection. This paper presents a novel approach for implementing Reinforcement Learning (RL) in hierarchical robotic search teams. The proposed algorithm handles uncertainties in the system by implementing Q-learning and shows enhanced efficiency as well as lower time consumption compared to prior models. The reason is that each agent can take action on its own, so there is less dependency on the leader agent for the RL policy. The performance of the algorithm is measured by introducing agents into an unknown environment with both Markov Decision Process (MDP) and RL policies at their disposal. A simulation-based comparison of the agent motion is presented using the results of the MDP and RL policies. Furthermore, a qualitative comparison of the proposed model with prior models is presented.
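For readers unfamiliar with Q-learning, the standard tabular update rule can be sketched as follows. This is the textbook rule, not the hierarchical multi-agent algorithm of the paper; the dictionary-based table, the state names, and the default `alpha` and `gamma` values are assumptions for illustration.

```python
from typing import Dict, Hashable, List


def q_learning_update(
    q_table: Dict[Hashable, List[float]],
    state: Hashable,
    action: int,
    reward: float,
    next_state: Hashable,
    alpha: float = 0.1,  # learning rate (assumed value)
    gamma: float = 0.9,  # discount factor (assumed value)
) -> float:
    """Apply one tabular Q-learning step in place and return the new Q value."""
    # Bootstrapped target: immediate reward plus discounted best next value.
    td_target = reward + gamma * max(q_table[next_state])
    # Move the current estimate a fraction alpha toward the target.
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

Because the update needs only the agent's own transition, each agent could in principle maintain its own table, which is consistent with the abstract's remark that agents depend less on a leader.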
