Multiscale Anchor-Free Region Proposal Network for Pedestrian Detection

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zhiwei Cao ◽  
Huihua Yang ◽  
Weijin Xu ◽  
Juan Zhao ◽  
Lingqiao Li ◽  
...  

Pedestrian detection based on visual sensors has made significant progress, and region proposal is a key step. There are two mainstream ways to generate region proposals: anchor-based and anchor-free. Anchor-based methods, however, require more anchor-related hyperparameters for training than anchor-free methods. In this paper, we propose a novel multiscale anchor-free (MSAF) region proposal network to obtain proposals, especially for small-scale pedestrians. The network usually has several branches that predict proposals, and ground truth is assigned to them according to pedestrian height. Each branch consists of two components: feature extraction and a detection head. Adapted channel feature fusion (ACFF) is proposed to select features from different levels of the backbone so that features are extracted effectively. The detection head predicts the pedestrian center location, center offsets, and height, from which bounding boxes are obtained. Experiments on Caltech and CityPersons demonstrate that MSAF significantly boosts pedestrian detection performance: the log-average miss rate (MR) on the reasonable setting is 3.97% and 9.5%, respectively. If proposals are reclassified with our classifier, MR is 3.38% and 8.4%, a further improvement that is most pronounced for small-scale pedestrians.
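The two mechanisms named in this abstract, height-based assignment of ground truth to branches and decoding a box from a center, an offset, and a height, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the height boundary of 80 pixels, the feature-map stride of 4, and the fixed width-to-height ratio of 0.41 are all assumptions introduced here, since the abstract does not state how box width is obtained or where the branch boundaries lie.

```python
import bisect


def assign_branch(height, bounds=(80.0,)):
    """Return the index of the branch responsible for a pedestrian of this height.

    `bounds` are hypothetical height boundaries in pixels, sorted ascending;
    with one boundary there are two branches (small-scale and large-scale).
    """
    return bisect.bisect_right(bounds, height)


def decode_box(col, row, dx, dy, height, stride=4, aspect_ratio=0.41):
    """Convert one center prediction into (x1, y1, x2, y2) in image pixels.

    (col, row) is the feature-map cell flagged as a pedestrian center,
    (dx, dy) is the predicted sub-cell center offset, and `height` is the
    predicted box height in pixels. Width is derived from an assumed fixed
    aspect ratio because only the height is predicted.
    """
    cx = (col + dx) * stride
    cy = (row + dy) * stride
    width = aspect_ratio * height
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)
```

For example, `decode_box(10, 20, 0.5, 0.25, 100.0, aspect_ratio=0.5)` places the center at (42, 81) and yields a box 50 pixels wide and 100 pixels tall.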

2021 ◽  
Vol 2035 (1) ◽  
pp. 012023
Author(s):  
Yuhao You ◽  
Houjin Chen ◽  
Yanfeng Li ◽  
Minjun Wang ◽  
Jinlei Zhu

Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4184
Author(s):  
Zhiwei Cao ◽  
Huihua Yang ◽  
Juan Zhao ◽  
Shuhong Guo ◽  
Lingqiao Li

Multispectral pedestrian detection, which consists of a color stream and a thermal stream, is essential under conditions of insufficient illumination because the fusion of the two streams can provide complementary information for detecting pedestrians based on deep convolutional neural networks (CNNs). In this paper, we introduced and adapted a simple and efficient one-stage YOLOv4 to replace the current state-of-the-art two-stage fast-RCNN for multispectral pedestrian detection and to directly predict bounding boxes with confidence scores. To further improve the detection performance, we analyzed the existing multispectral fusion methods and proposed a novel multispectral channel feature fusion (MCFF) module for integrating the features from the color and thermal streams according to the illumination conditions. Moreover, several fusion architectures, such as Early Fusion, Halfway Fusion, Late Fusion, and Direct Fusion, were carefully designed based on the MCFF to transfer the feature information from the bottom to the top at different stages. Finally, the experimental results on the KAIST and Utokyo pedestrian benchmarks showed that Halfway Fusion achieved the best performance of all architectures and that the MCFF could adapt fused features in the two modalities. The log-average miss rates (MR) for the two modalities with reasonable settings were 4.91% and 23.14%, respectively.
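The log-average miss rate reported in these abstracts is conventionally computed, following the Caltech benchmark protocol, by sampling the miss-rate curve at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0 and taking the geometric mean. The sketch below assumes that convention; the function name is hypothetical, and using a miss rate of 1.0 when the curve has no point at or below a reference FPPI is an assumption of this sketch rather than something stated in the abstracts.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate sampled at log-spaced FPPI values.

    `fppi` must be sorted ascending and `miss_rate` holds the matching
    miss-rate values (both as fractions, not percentages).
    """
    # Reference FPPI values: 10^-2, ..., 10^0, evenly spaced in log space.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    samples = []
    for ref in refs:
        # Miss rate at the last curve point whose FPPI does not exceed ref.
        reached = [mr for f, mr in zip(fppi, miss_rate) if f <= ref]
        # Assumption: if the curve never gets that low, count it as a full miss.
        samples.append(reached[-1] if reached else 1.0)
    # Clamp to avoid log(0) when a sample is a perfect zero miss rate.
    logs = [math.log(max(s, 1e-10)) for s in samples]
    return math.exp(sum(logs) / len(logs))
```

Averaging in log space keeps a single very low miss rate at high FPPI from dominating the summary, which is why the metric is reported as one percentage per benchmark setting.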


2021 ◽  
Author(s):  
Yanjiao Yang ◽  
Danhong Jin ◽  
Zhenqiang Yuan ◽  
Jiachen Han

2019 ◽  
Vol 12 (3) ◽  
pp. 318-332
Author(s):  
Shuang-Shuang Liu

Purpose: Conventional pedestrian detection algorithms lack scale sensitivity. The purpose of this paper is to propose a novel self-adaptive scale pedestrian detection algorithm, based on a deep residual network (DRN), to address this shortcoming.

Design/methodology/approach: First, the “Edge boxes” algorithm is introduced to extract regions of interest from pedestrian images. Then, the extracted bounding boxes are fed into two different DRNs: one large-scale DRN and one small-scale DRN. The height of the bounding boxes is used to classify the results of pedestrians and to regress the bounding boxes to the entity of the pedestrian. Finally, a weighted self-adaptive scale function, which combines the large-scale and small-scale results, is designed for the final pedestrian detection.

Findings: To validate the effectiveness and feasibility of the proposed algorithm, comparison experiments were carried out on common pedestrian detection data sets: Caltech, INRIA, ETH and KITTI. Experimental results show that the proposed algorithm adapts to pedestrians of various scales. For small-scale pedestrians, which are hard to detect, the proposed algorithm improves the accuracy and robustness of detection.

Originality/value: By applying different models to different scales of pedestrians, the proposed algorithm with the weighted calculation function improves the accuracy and robustness of detection across pedestrian scales.
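The abstract does not give the form of the weighted self-adaptive scale function, so the following is only one plausible sketch of the idea: blend the scores of the large-scale and small-scale networks with a weight that depends smoothly on the bounding-box height. The sigmoid weighting, the function name, and the `pivot` and `softness` parameters are assumptions made for illustration and are not taken from the paper.

```python
import math


def fuse_scores(score_large, score_small, box_height, pivot=80.0, softness=10.0):
    """Blend two detector scores with a height-dependent weight.

    Tall boxes lean on the large-scale network and short boxes on the
    small-scale one. `pivot` (pixels) is the hypothetical height at which
    both networks are trusted equally; `softness` controls how gradual
    the hand-over between them is.
    """
    w_large = 1.0 / (1.0 + math.exp(-(box_height - pivot) / softness))
    return w_large * score_large + (1.0 - w_large) * score_small
```

With these assumed defaults, a box exactly 80 pixels tall receives the plain average of the two scores, while a much taller box receives essentially the large-scale network's score.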


Mathematics ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 139
Author(s):  
Zhifeng Ding ◽  
Zichen Gu ◽  
Yanpeng Sun ◽  
Xinguang Xiang

Anchor-free detection methods not only reduce the training cost of object detection but also avoid the imbalance problem caused by an excessive number of anchors. However, these methods attend only to the impact of the detection head on detection performance and ignore the impact of feature fusion. In this article, we take pedestrian detection as an example and propose an anchor-free one-stage network, the Cascaded Cross-layer Fusion Network (CCFNet). It consists of a Cascaded Cross-layer Fusion (CCF) module and a novel detection head. CCF fully considers the distribution of high-level and low-level information in the feature maps at different stages of the network: the deep network is first used to remove a large amount of noise from the shallow features, and the high-level features are then reused to obtain a more complete feature representation. For the pedestrian detection task, a novel detection head is designed that uses the global smooth map (GSMap) to provide global information for the center map, yielding a more accurate center map. Finally, we verified the feasibility of CCFNet on the Caltech and CityPersons datasets.


2022 ◽  
Vol 11 (01) ◽  
pp. 22-26
Author(s):  
Hui Xiang ◽  
Junyan Han ◽  
Hanqing Wang ◽  
Hao Li ◽  
Shangqing Li ◽  
...  

To address the low detection accuracy and poor recognition of small-scale targets in traditional vehicle and pedestrian detection methods, a vehicle and pedestrian detection method based on an improved YOLOv4-Tiny is proposed. On the basis of YOLOv4-Tiny, the 8-fold downsampling feature layer was added for feature fusion, the PANet structure was used to perform bidirectional fusion of the deep and shallow features from the output feature layer of the backbone network, and a detection head for small targets was added. The results show that the mean average precision of the improved method reaches 85.93%, and the detection performance is similar to that of YOLOv4. Compared with YOLOv4-Tiny, the mean average precision of the improved method is increased by 24.45%, and the detection speed reaches 67.83 FPS, which means that the detection effect is significantly improved and can meet real-time requirements.


2020 ◽  
Vol 3 (1) ◽  
pp. 61
Author(s):  
Kazuhiro Aruga

In this study, two operational methodologies to extract thinned woods were investigated in the Nasunogahara area, Tochigi Prefecture, Japan. Methodology one included manual extraction and light truck transportation. Methodology two included mini-forwarder forwarding and four-ton truck transportation. Furthermore, a newly introduced chipper was investigated. As a result, costs of manual extractions within 10 m and 20 m were JPY942/m3 and JPY1040/m3, respectively. On the other hand, the forwarding cost of the mini-forwarder was JPY499/m3, which was significantly lower than the cost of manual extractions. Transportation costs with light trucks and four-ton trucks were JPY7224/m3 and JPY1298/m3, respectively, with 28 km transportation distances. Chipping operation costs were JPY1036/m3 and JPY1160/m3 with three and two persons, respectively. Finally, the total costs of methodologies one and two from extraction within 20 m to chipping were estimated as JPY9300/m3 and JPY2833/m3, respectively, with 28 km transportation distances and three-person chipping operations (EUR1 = JPY126, as of 12 August 2020).
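The totals quoted in this abstract follow from summing the per-stage costs it reports, which the short check below reproduces. The function and variable names are hypothetical; the figures are the ones given in the abstract (extraction within 20 m or mini-forwarder forwarding, transportation over 28 km, and three-person chipping, all in JPY per cubic metre).

```python
def total_cost(extraction, transport, chipping):
    """Sum the per-stage costs (JPY/m3) of one extraction methodology."""
    return extraction + transport + chipping


# Methodology one: manual extraction within 20 m, light truck, three-person chipping.
method_one = total_cost(extraction=1040, transport=7224, chipping=1036)

# Methodology two: mini-forwarder forwarding, four-ton truck, three-person chipping.
method_two = total_cost(extraction=499, transport=1298, chipping=1036)

JPY_PER_EUR = 126  # exchange rate quoted in the abstract (12 August 2020)
method_one_eur = method_one / JPY_PER_EUR
method_two_eur = method_two / JPY_PER_EUR
```

The sums come to JPY9300/m3 and JPY2833/m3, matching the reported totals; almost all of the gap between the two methodologies comes from the transportation stage.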


2014 ◽  
Vol 881-883 ◽  
pp. 757-760
Author(s):  
Xiao Qing Ren ◽  
Li Zhen Ma ◽  
Xin Yi He

The objective of this study was to examine the effect of different levels of catfish bone paste to flour on the physicochemical, textural and crumb structure properties of steamed bread. Six different levels (0, 1, 3, 5, 7, and 10%) of catfish bone paste to flour were used in the formulation of the steamed bread. The results showed that the weight loss and TTA of steamed bread decreased with an increase in the level of catfish bone paste. On the other hand, the pH increased with an increase in the level of catfish bone paste. The specific volume, hardness, chewiness and gas cell structure in the crumb of steamed bread were better at the 5% supplementation level. Thus, 5% catfish bone paste was considered a better level for incorporation into the steamed bread.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1299
Author(s):  
Honglin Yuan ◽  
Tim Hoogenkamp ◽  
Remco C. Veltkamp

Deep learning has achieved great success on robotic vision tasks. However, when compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset consisting of commonly used objects for benchmarking in 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups by the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested by the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and photo-realistic images are helpful in increasing the performance of pose estimation algorithms.


Author(s):  
Zhenying Xu ◽  
Ziqian Wu ◽  
Wei Fan

Defect detection of electromagnetic luminescence (EL) cells is the core step in the production and preparation of solar cell modules to ensure conversion efficiency and long service life of batteries. However, because it lacks feature extraction capability for small defects, the traditional single shot multibox detector (SSD) algorithm does not achieve high accuracy in EL defect detection. Consequently, an improved SSD algorithm with a modified feature fusion scheme, within the framework of deep learning, is proposed to improve the recognition rate of EL multi-class defects. An EL dataset containing images with four different types of defects is established through rotation, denoising, and binarization. Drawing on the idea of feature pyramid networks, the proposed algorithm can greatly improve the detection accuracy for small-scale defects. An experimental study on the detection of EL defects shows the effectiveness of the proposed algorithm. Moreover, a comparison study shows that the proposed method outperforms other detection methods, such as SIFT, Faster R-CNN, and YOLOv3, in detecting EL defects.

