A Transfer Learning-Based Multi-cues Multi-scale Spatial–Temporal Modeling for Effective Video-Based Crowd Counting and Density Estimation Using a Single-Column 2D-Atrous Net

2021 ◽  
pp. 179-194
Author(s):  
Santosh Kumar Tripathy ◽  
Rajeev Srivastava
Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3777
Author(s):  
Yani Zhang ◽  
Huailin Zhao ◽  
Zuodong Duan ◽  
Liangjun Huang ◽  
Jiahao Deng ◽  
...  

In this paper, we propose a novel congested crowd counting network for crowd density estimation, i.e., the Adaptive Multi-scale Context Aggregation Network (MSCANet). MSCANet efficiently leverages spatial context information to accomplish crowd density estimation in complicated crowd scenes. To achieve this, a multi-scale context learning block, called the Multi-scale Context Aggregation module (MSCA), is proposed to first extract information at different scales and then adaptively aggregate it to capture the full scale range of the crowd. By employing multiple MSCAs in a cascaded manner, MSCANet can deeply utilize the spatial context information and modulate preliminary features into more distinguishing and scale-sensitive features, which are finally passed through a 1 × 1 convolution to obtain the crowd density results. Extensive experiments on three challenging crowd counting benchmarks showed that our model yielded compelling performance against other state-of-the-art methods. To thoroughly demonstrate the generality of MSCANet, we extend our method to two relevant tasks: crowd localization and remote sensing object counting. The results of these extension experiments also confirmed the effectiveness of MSCANet.
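The "extract at several scales, then adaptively aggregate" idea in the abstract above can be illustrated with a small sketch. It is not the MSCANet implementation: the use of dilated 3 × 3 convolutions as the per-scale branches, the softmax weighting, the single-channel feature map, and the names `dilated_conv3x3` and `aggregate_scales` are all assumptions made for illustration.

```python
import math

def dilated_conv3x3(fmap, kernel, rate):
    """3x3 convolution with dilation `rate` and zero padding (output keeps the input size)."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + (ky - 1) * rate, x + (kx - 1) * rate
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * fmap[yy][xx]
            out[y][x] = acc
    return out

def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_scales(fmap, kernels, rates, scale_scores):
    """Hypothetical block: one dilated branch per scale, then a softmax-weighted sum.

    In a trained network the scores would be predicted from the input;
    here they are passed in as fixed numbers (an assumption of this sketch).
    """
    weights = softmax(scale_scores)
    branches = [dilated_conv3x3(fmap, k, r) for k, r in zip(kernels, rates)]
    h, w = len(fmap), len(fmap[0])
    return [[sum(wt * b[y][x] for wt, b in zip(weights, branches))
             for x in range(w)] for y in range(h)]
```

Blocks of this kind could be chained by feeding one block's output into the next. With a single channel, the final 1 × 1 convolution reduces to multiplying every pixel by one scalar.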


2020 ◽  
Vol 30 (1) ◽  
pp. 180-191
Author(s):  
Liping Zhu ◽  
Hong Zhang ◽  
Sikandar Ali ◽  
Baoli Yang ◽  
Chengyang Li

The purpose of crowd counting is to estimate the number of pedestrians in crowd images. Crowd counting, or density estimation, is an extremely challenging task in computer vision because of large scale variations and dense scenes. Current methods address these issues by combining multi-scale convolutional neural networks with different receptive fields. In this paper, a novel end-to-end architecture based on a Multi-Scale Adversarial Convolutional Neural Network (MSA-CNN) is proposed to generate crowd density maps and estimate the crowd count. First, a multi-scale network extracts the globally relevant features in the crowd image; fractionally-strided convolutional layers then up-sample the output to recover crucial details lost in the earlier max pooling layers. An adversarial loss is employed directly to constrain the estimate to the realistic subspace, which reduces the blurring effect in density estimation. The adversarial loss and the Euclidean loss are combined in a joint, end-to-end training scheme to improve density estimation performance. We conduct extensive experiments on available datasets to show significant improvements of the proposed approach over available state-of-the-art approaches.
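A combination of a Euclidean loss and an adversarial loss, as described in the abstract above, might look like the following sketch. The abstract does not give the exact formulation, so the weighting factor `lam`, the `-log D(G(x))` form of the generator-side adversarial term, and all function names are assumptions.

```python
import math

def euclidean_loss(pred, target):
    """Half the mean squared per-pixel difference between two density maps."""
    n = sum(len(row) for row in pred)
    sq = sum((p - t) ** 2
             for pred_row, target_row in zip(pred, target)
             for p, t in zip(pred_row, target_row))
    return sq / (2 * n)

def adversarial_loss(disc_score):
    """Generator-side term -log D(G(x)); `disc_score` is the discriminator output in (0, 1]."""
    eps = 1e-12  # guards against log(0)
    return -math.log(max(disc_score, eps))

def joint_loss(pred, target, disc_score, lam=0.01):
    """Weighted sum of the two terms; `lam` is a hypothetical trade-off weight."""
    return euclidean_loss(pred, target) + lam * adversarial_loss(disc_score)
```

The Euclidean term pulls the prediction towards the ground-truth density map pixel by pixel, while the adversarial term penalizes predictions the discriminator judges unrealistic, which is the mechanism the abstract credits with reducing blur.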


Author(s):  
Anran Zhang ◽  
Xiaolong Jiang ◽  
Baochang Zhang ◽  
Xianbin Cao

2020 ◽  
Vol 34 (07) ◽  
pp. 11693-11700 ◽  
Author(s):  
Ao Luo ◽  
Fan Yang ◽  
Xin Li ◽  
Dong Nie ◽  
Zhicheng Jiao ◽  
...  

Crowd counting is an important yet challenging task due to large variations in scale and density. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture remains a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to alleviate this problem by interweaving the multi-scale features for crowd density with those of its auxiliary task (localization) and performing joint reasoning over a graph. Specifically, HyGnn integrates a hybrid graph to jointly represent the task-specific feature maps of different scales as nodes, and two types of relations as edges: (i) multi-scale relations capturing the feature dependencies across scales and (ii) mutually beneficial relations building bridges for the cooperation between counting and localization. Thus, through message passing, HyGnn can capture and distill richer relations between nodes to obtain more powerful representations, providing robust and accurate results. HyGnn performs well on four challenging datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF, outperforming the state-of-the-art algorithms by a large margin.
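The message-passing step mentioned in the abstract above can be sketched generically as follows. This is not HyGnn itself: the mean aggregator, the additive update, the node names, and the function `message_passing_step` are assumptions, and a real model would use learned message and update functions.

```python
def message_passing_step(features, edges):
    """One round of message passing on a directed graph.

    features: dict mapping node name -> feature vector (list of floats)
    edges:    list of (source, destination) pairs
    Each node adds the mean of its incoming neighbours' features to its own.
    """
    inbox = {node: [] for node in features}
    for src, dst in edges:
        inbox[dst].append(features[src])
    updated = {}
    for node, feat in features.items():
        msgs = inbox[node]
        if msgs:
            mean = [sum(vals) / len(msgs) for vals in zip(*msgs)]
        else:
            mean = [0.0] * len(feat)  # isolated node keeps its features
        updated[node] = [f + m for f, m in zip(feat, mean)]
    return updated

# Hypothetical graph: counting features at two scales plus one localization node.
features = {"count_s1": [1.0], "count_s2": [3.0], "loc_s1": [5.0]}
edges = [("count_s2", "count_s1"),  # multi-scale relation
         ("loc_s1", "count_s1")]    # counting/localization relation
result = message_passing_step(features, edges)
```

In this toy graph, `count_s1` receives one message across scales and one from the localization node, mirroring the two edge types the abstract describes.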


2018 ◽  
Vol 32 (7) ◽  
pp. 2897-2908 ◽  
Author(s):  
Bisheng Wang ◽  
Guo Cao ◽  
Yanfeng Shang ◽  
Licun Zhou ◽  
Youqiang Zhang ◽  
...  

2021 ◽  
Vol 5 (4) ◽  
pp. 50
Author(s):  
Rafik Gouiaa ◽  
Moulay A. Akhloufi ◽  
Mozhdeh Shahbazi

Automatically estimating the number of people in unconstrained scenes is a crucial yet challenging task in many real-world applications, including video surveillance, public safety, urban planning, and traffic monitoring. In addition, methods developed to estimate the number of people can be adapted and applied to related tasks in various fields, such as plant counting, vehicle counting, and cell microscopy. Crowd counting faces many challenges, including cluttered scenes, extreme occlusions, scale variation, and changes in camera perspective. Therefore, in the past few years, tremendous research efforts have been devoted to crowd counting, and numerous excellent techniques have been proposed. The significant progress in crowd counting methods in recent years is mostly attributed to advances in deep convolutional neural networks (CNNs) as well as to public crowd counting datasets. In this work, we review the papers that have been published in the last decade and provide a comprehensive survey of recent CNN-based crowd counting techniques. We briefly review detection-based, regression-based, and traditional density-estimation-based approaches. Then, we examine in detail the deep-learning-based density estimation approaches and recently published datasets. In addition, we discuss the potential applications of crowd counting, in particular its applications using unmanned aerial vehicle (UAV) images.
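Density-estimation approaches of the kind surveyed above are commonly trained against a ground-truth density map built by placing a normalized Gaussian at each annotated head, so that the map sums to the person count. The sketch below illustrates that common construction; it is not taken from the survey, and the fixed `sigma`, the border renormalization, and the function names are assumptions.

```python
import math

def gaussian_kernel(sigma, radius):
    """Square Gaussian kernel of side 2*radius+1, normalized to sum to 1."""
    size = 2 * radius + 1
    k = [[math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def density_map(height, width, heads, sigma=2.0):
    """Build a density map from head annotations given as (row, col) pairs.

    Assumes every annotation lies inside the image. Kernels clipped at the
    border are renormalized so each head still contributes exactly 1.
    """
    radius = int(3 * sigma)
    kernel = gaussian_kernel(sigma, radius)
    dmap = [[0.0] * width for _ in range(height)]
    for row, col in heads:
        cells = []
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = row + dy, col + dx
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy + radius][dx + radius]))
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / mass
    return dmap

def count_from_density(dmap):
    """The estimated count is the integral (sum) of the density map."""
    return sum(map(sum, dmap))
```

Because each head contributes unit mass, summing a predicted density map gives the estimated count, while the map itself shows where the crowd is concentrated.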


2021 ◽  
pp. 403-417
Author(s):  
Jinfang Zheng ◽  
Panpan Zhao ◽  
Jinyang Xie ◽  
Chen Lyu ◽  
Lei Lyu
