An improved one-stage pedestrian detection method based on multi-scale attention feature extraction

Author(s):  
Jun Ma ◽  
Honglin Wan ◽  
Junxia Wang ◽  
Hao Xia ◽  
Chengjie Bai
2021 ◽  
Vol 13 (2) ◽  
pp. 38
Author(s):  
Yao Xu ◽  
Qin Yu

Great achievements have been made in pedestrian detection through deep learning. For detectors based on deep learning, making better use of features has become the key to their detection effect. While current pedestrian detectors have made efforts in feature utilization to improve their detection performance, the feature utilization is still inadequate. To solve the problem of inadequate feature utilization, we proposed the Multi-Level Feature Fusion Module (MFFM) and its Multi-Scale Feature Fusion Unit (MFFU) sub-module, which connect feature maps of the same scale and different scales by using horizontal and vertical connections and shortcut structures. All of these connections are accompanied by weights that can be learned; thus, they can be used as adaptive multi-level and multi-scale feature fusion modules to fuse the best features. Then, we built a complete pedestrian detector, the Adaptive Feature Fusion Detector (AFFDet), which is an anchor-free one-stage pedestrian detector that can make full use of features for detection. As a result, compared with other methods, our method has better performance on the challenging Caltech Pedestrian Detection Benchmark (Caltech) and has quite competitive speed. It is the current state-of-the-art one-stage pedestrian detection method.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1240
Author(s):  
Yang Liu ◽  
Hailong Su ◽  
Cao Zeng ◽  
Xiaoli Li

In complex scenes, it is a huge challenge to accurately detect motion-blurred, tiny, and dense objects in the thermal infrared images. To solve this problem, robust thermal infrared vehicle and pedestrian detection method is proposed in this paper. An important weight parameter β is first proposed to reconstruct the loss function of the feature selective anchor-free (FSAF) module in its online feature selection process, and the FSAF module is optimized to enhance the detection performance of motion-blurred objects. The proposal of parameter β provides an effective solution to the challenge of motion-blurred object detection. Then, the optimized anchor-free branches of the FSAF module are plugged into the YOLOv3 single-shot detector and work jointly with the anchor-based branches of the YOLOv3 detector in both training and inference, which efficiently improves the detection precision of the detector for tiny and dense objects. Experimental results show that the method proposed is superior to other typical thermal infrared vehicle and pedestrian detection algorithms due to 72.2% mean average precision (mAP).


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 319
Author(s):  
Yi Wang ◽  
Xiao Song ◽  
Guanghong Gong ◽  
Ni Li

Due to the rapid development of deep learning and artificial intelligence techniques, denoising via neural networks has drawn great attention due to their flexibility and excellent performances. However, for most convolutional network denoising methods, the convolution kernel is only one layer deep, and features of distinct scales are neglected. Moreover, in the convolution operation, all channels are treated equally; the relationships of channels are not considered. In this paper, we propose a multi-scale feature extraction-based normalized attention neural network (MFENANN) for image denoising. In MFENANN, we define a multi-scale feature extraction block to extract and combine features at distinct scales of the noisy image. In addition, we propose a normalized attention network (NAN) to learn the relationships between channels, which smooths the optimization landscape and speeds up the convergence process for training an attention model. Moreover, we introduce the NAN to convolutional network denoising, in which each channel gets gain; channels can play different roles in the subsequent convolution. To testify the effectiveness of the proposed MFENANN, we used both grayscale and color image sets whose noise levels ranged from 0 to 75 to do the experiments. The experimental results show that compared with some state-of-the-art denoising methods, the restored images of MFENANN have larger peak signal-to-noise ratios (PSNR) and structural similarity index measure (SSIM) values and get better overall appearance.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1820
Author(s):  
Xiaotao Shao ◽  
Qing Wang ◽  
Wei Yang ◽  
Yun Chen ◽  
Yi Xie ◽  
...  

The existing pedestrian detection algorithms cannot effectively extract features of heavily occluded targets which results in lower detection accuracy. To solve the heavy occlusion in crowds, we propose a multi-scale feature pyramid network based on ResNet (MFPN) to enhance the features of occluded targets and improve the detection accuracy. MFPN includes two modules, namely double feature pyramid network (FPN) integrated with ResNet (DFR) and repulsion loss of minimum (RLM). We propose the double FPN which improves the architecture to further enhance the semantic information and contours of occluded pedestrians, and provide a new way for feature extraction of occluded targets. The features extracted by our network can be more separated and clearer, especially those heavily occluded pedestrians. Repulsion loss is introduced to improve the loss function which can keep predicted boxes away from the ground truths of the unrelated targets. Experiments carried out on the public CrowdHuman dataset, we obtain 90.96% AP which yields the best performance, 5.16% AP gains compared to the FPN-ResNet50 baseline. Compared with the state-of-the-art works, the performance of the pedestrian detection system has been boosted with our method.


Sign in / Sign up

Export Citation Format

Share Document