Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

Author(s):  
Mengyuan Ding ◽  
Shanshan Zhang ◽  
Jian Yang
Electronics ◽  
2019 ◽  
Vol 8 (11) ◽  
pp. 1370
Author(s):  
Tingzhu Sun ◽  
Weidong Fang ◽  
Wei Chen ◽  
Yanxin Yao ◽  
Fangming Bi ◽  
...  

Although image inpainting based on generative adversarial networks (GANs) has made great breakthroughs in accuracy and speed in recent years, existing methods can only process low-resolution images because of memory limitations and training difficulty. For high-resolution images, the inpainted regions become blurred and unpleasant boundaries become visible. Building on current advanced image generation networks, we propose a novel high-resolution image inpainting method based on a multi-scale neural network. The method is a two-stage network comprising content reconstruction and texture detail restoration. After obtaining a visually believable but fuzzy texture, we further restore the finer details to produce a smoother, clearer, and more coherent inpainting result. We then propose a special application scenario for image inpainting: deleting redundant pedestrians from an image while keeping the restored background realistic. This involves detecting pedestrians, identifying the redundant ones, and filling them in with plausible content. To improve inpainting accuracy in this scenario, we propose a new mask dataset, which collects the characters in the COCO dataset as masks. Finally, we evaluate our method on the COCO and VOC datasets. The experimental results show that our method produces clearer and more coherent inpainting results, especially for high-resolution images, and that the proposed mask dataset produces better inpainting results in the special application scenario.
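
The fill-in step described in this abstract can be illustrated with the compositing operation that inpainting pipelines commonly apply: pixels under the mask come from the generated image, all others from the original. The sketch below is an assumption for illustration only, not the authors' network; the function name `composite` and the nested-list image layout are hypothetical.

```python
def composite(original, generated, mask):
    """Blend two equally sized 2-D images (nested lists).

    Assumption: mask[r][c] is truthy where a pedestrian was removed,
    so that pixel is taken from `generated`; elsewhere the original
    pixel is kept unchanged.
    """
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


original = [[1, 2], [3, 4]]
generated = [[9, 9], [9, 9]]
mask = [[0, 1], [0, 0]]  # only the top-right pixel is inpainted
result = composite(original, generated, mask)
```

Keeping unmasked pixels untouched is what lets a mask drawn from person silhouettes change only the pedestrian region.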


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1820
Author(s):  
Xiaotao Shao ◽  
Qing Wang ◽  
Wei Yang ◽  
Yun Chen ◽  
Yi Xie ◽  
...  

Existing pedestrian detection algorithms cannot effectively extract features of heavily occluded targets, which results in lower detection accuracy. To address heavy occlusion in crowds, we propose a multi-scale feature pyramid network based on ResNet (MFPN) to enhance the features of occluded targets and improve detection accuracy. MFPN includes two modules: a double feature pyramid network (FPN) integrated with ResNet (DFR) and repulsion loss of minimum (RLM). The double FPN improves the architecture to further enhance the semantic information and contours of occluded pedestrians, and provides a new way to extract features of occluded targets. The features extracted by our network are more separated and clearer, especially for heavily occluded pedestrians. Repulsion loss is introduced into the loss function to keep predicted boxes away from the ground truths of unrelated targets. In experiments on the public CrowdHuman dataset, we obtain 90.96% AP, the best performance, which is a gain of 5.16% AP over the FPN-ResNet50 baseline. Compared with state-of-the-art works, our method boosts the performance of the pedestrian detection system.
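
The idea of keeping predicted boxes away from unrelated ground truths can be sketched as a penalty on intersection-over-ground-truth (IoG). This is a minimal illustration of a generic repulsion term, assuming a `-ln(1 - IoG)` penalty against the most-overlapping non-target box; it is not the RLM formulation of the paper, whose details the abstract does not give. All function names are hypothetical.

```python
import math


def intersection_area(a, b):
    """Overlap area of two boxes given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def iog(pred, gt):
    """Intersection over the ground-truth box area."""
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return intersection_area(pred, gt) / gt_area if gt_area > 0 else 0.0


def repulsion_penalty(pred, other_gts, eps=1e-6):
    """Penalty that grows as `pred` covers a ground truth it is not assigned to.

    Assumption: only the most-overlapping non-target box is penalized.
    """
    if not other_gts:
        return 0.0
    worst = max(iog(pred, g) for g in other_gts)
    return -math.log(max(1.0 - worst, eps))


# A prediction covering half of a neighbouring pedestrian's box.
penalty = repulsion_penalty((0, 0, 10, 10), [(5, 0, 15, 10)])
```

The penalty is zero when the prediction touches no unrelated box and rises sharply as the overlap approaches the whole neighbouring box.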


2021 ◽  
Vol 13 (2) ◽  
pp. 328
Author(s):  
Wenkai Liang ◽  
Yan Wu ◽  
Ming Li ◽  
Yice Cao ◽  
Xin Hu

The classification of high-resolution (HR) synthetic aperture radar (SAR) images is of great importance for SAR scene interpretation and application. However, the intricate spatial structural patterns and complex statistical nature of SAR images make their classification a challenging task, especially when labeled SAR data are limited. This paper proposes a novel HR SAR image classification method using a multi-scale deep feature fusion network and a covariance pooling manifold network (MFFN-CPMN). MFFN-CPMN combines the advantages of local spatial features and global statistical properties and considers the multi-feature information fusion of SAR images in representation learning. First, we propose a Gabor-filtering-based multi-scale feature fusion network (MFFN) to capture the spatial patterns and obtain discriminative features of SAR images. The MFFN is a deep convolutional neural network (CNN). To make full use of the large amount of unlabeled data, the weights of each MFFN layer are optimized by an unsupervised denoising dual-sparse encoder. Moreover, the feature fusion strategy in MFFN can effectively exploit the complementary information between different levels and different scales. Second, we use a covariance pooling manifold network to further extract the global second-order statistics of SAR images over the fused feature maps. As a result, the obtained covariance descriptor is more distinct for various land covers. Experimental results on four HR SAR images demonstrate the effectiveness of the proposed method, which achieves promising results compared with other related algorithms.
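
The second-order statistics mentioned in this abstract can be illustrated by plain covariance pooling: the feature vectors at all spatial positions are summarized by their channel-by-channel sample covariance matrix. The sketch below assumes features arrive as a list of per-position vectors and uses the unbiased (n - 1) estimator; the manifold network layers that the paper applies afterwards are not shown, and `covariance_pooling` is a hypothetical name.

```python
def covariance_pooling(features):
    """Return the C x C sample covariance of N feature vectors of length C.

    Assumption: `features` holds at least two vectors, one per spatial
    position of a feature map.
    """
    n = len(features)
    c = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(c)]
    centered = [[f[j] - mean[j] for j in range(c)] for f in features]
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(c)]
        for i in range(c)
    ]


# Three positions, two channels; channel 2 is exactly twice channel 1.
descriptor = covariance_pooling([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
```

Unlike average pooling, the descriptor keeps how channels vary together, which is the global statistical information the abstract refers to.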


2021 ◽  
Vol 41 (2) ◽  
pp. 0208002
Author(s):  
李江勇 Li Jiangyong ◽  
冯位欣 Feng Weixin ◽  
刘飞 Liu Fei ◽  
魏雅喆 Wei Yazhe ◽  
邵晓鹏 Shao Xiaopeng

2021 ◽  
Vol 11 (22) ◽  
pp. 10508
Author(s):  
Chaowei Tang ◽  
Xinxin Feng ◽  
Haotian Wen ◽  
Xu Zhou ◽  
Yanqing Shao ◽  
...  

Surface defect detection of automobile wheel hubs is important to the automobile industry because these defects directly affect the safety and appearance of automobiles. At present, surface defect detection networks based on convolutional neural networks use many pooling layers when extracting features, which reduces the spatial resolution of the features and prevents accurate detection of defect boundaries. On the basis of DeepLab v3+, we propose a semantic segmentation network for the surface defect detection of automobile wheel hubs. To solve the gridding effect of atrous convolution, the high-resolution network (HRNet) is used as the backbone network to extract high-resolution features, and the multi-scale features extracted by the Atrous Spatial Pyramid Pooling (ASPP) of DeepLab v3+ are superimposed. On the basis of optical flow, we decouple the body and edge features of the defects to accurately detect defect boundaries. Furthermore, in the upsampling process, a decoder can accurately obtain detection results by fusing the body, edge, and multi-scale features. We use supervised training to optimize these features. Experimental results on four defect datasets (i.e., wheels, magnetic tiles, fabrics, and welds) show that the proposed network achieves a better F1 score, average precision, and intersection over union than SegNet, Unet, and DeepLab v3+, indicating that the proposed network is effective for different defect detection scenarios.
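
Two of the metrics named in this abstract, F1 score and intersection over union (IoU), can be computed for a binary defect mask as sketched below. This uses the standard pixel-wise definitions and is not taken from the paper's evaluation code; the function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
def mask_scores(pred, truth):
    """Return (IoU, F1) for two equally sized binary masks (nested lists)."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1  # defect pixel found
            elif p and not t:
                fp += 1  # false alarm
            elif t and not p:
                fn += 1  # missed defect pixel
    union = tp + fp + fn
    if union == 0:
        return 1.0, 1.0  # assumption: two empty masks count as a perfect match
    return tp / union, 2 * tp / (2 * tp + fp + fn)


pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
iou, f1 = mask_scores(pred, truth)
```

Both scores penalize boundary errors, which is why they are sensitive to the edge accuracy the abstract emphasizes.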


Author(s):  
Y. Di ◽  
G. Jiang ◽  
L. Yan ◽  
H. Liu ◽  
S. Zheng

Most multi-scale segmentation algorithms are not designed for high-resolution remote sensing images and have difficulty communicating and using information across layers. To address this, we propose a method for multi-scale segmentation of high-resolution remote sensing images that integrates multiple features. First, the Canny operator is used to extract edge information, and a band-weighted distance function is then built to obtain the edge weights. According to this criterion, the initial segmentation objects of color images are obtained by the Kruskal minimum spanning tree algorithm. Finally, the segmented images are obtained by an adaptive Mumford–Shah region-merging rule combined with spectral and texture information. The proposed method is evaluated on analog images and ZY-3 satellite images through quantitative and qualitative analysis. The experimental results show that the proposed method outperforms the fractal network evolution algorithm (FNEA) of the eCognition software in accuracy and is slightly inferior to FNEA in efficiency.
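
The initial-segmentation step can be sketched as a Kruskal-style pass over a pixel graph: edges between neighbouring pixels are sorted by weight and merged with a union-find structure while the weight stays below a threshold. The sketch assumes a single-band image, absolute intensity difference as the edge weight, and a fixed threshold; the paper's band-weighted distance function, Canny-derived edge weights, and Mumford–Shah merging are not reproduced. The name `segment_by_kruskal` is hypothetical.

```python
def segment_by_kruskal(image, threshold):
    """Label pixels of a 2-D image by merging cheap edges in weight order."""
    rows, cols = len(image), len(image[0])
    parent = list(range(rows * cols))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Build 4-connected edges: (weight, pixel index, neighbour index).
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((abs(image[r][c] - image[r][c + 1]), i, i + 1))
            if r + 1 < rows:
                edges.append((abs(image[r][c] - image[r + 1][c]), i, i + cols))

    edges.sort()
    for weight, a, b in edges:
        if weight > threshold:
            break  # all remaining edges are heavier
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two regions

    return [[find(r * cols + c) for c in range(cols)] for r in range(rows)]


image = [[10, 11, 50], [10, 12, 52]]
labels = segment_by_kruskal(image, threshold=5)
```

Here the dark left block and the bright right column end up as two regions, which a later region-merging stage could refine.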

