Multiscale Anchor-Free Region Proposal Network for Pedestrian Detection
Pedestrian detection based on visual sensors has made significant progress, and region proposal is a key step in it. There are two mainstream approaches to generating region proposals: anchor-based and anchor-free. Compared with anchor-free methods, anchor-based methods require additional anchor-related hyperparameters for training. In this paper, we propose a novel multiscale anchor-free (MSAF) region proposal network to obtain proposals, especially for small-scale pedestrians. It has several branches to predict proposals and assigns ground truth to them according to pedestrian height. Each branch consists of two components: feature extraction and a detection head. Adapted channel feature fusion (ACFF) is proposed to select features from different levels of the backbone for effective feature extraction. The detection head predicts the pedestrian center location, center offsets, and height, from which bounding boxes are obtained. Experiments on the Caltech and CityPersons datasets demonstrate that MSAF can significantly boost pedestrian detection performance: the log-average miss rate (MR) on the reasonable setting is 3.97% and 9.5%, respectively. If the proposals are reclassified with our classifier, MR drops to 3.38% and 8.4%, respectively, and the detection performance is further improved, especially for small-scale pedestrians.
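To illustrate how a detection head that predicts center location, center offsets, and height can yield bounding boxes, the following is a minimal sketch and not the MSAF implementation. The function name decode_proposals, the stride of 4, the fixed width-to-height ratio of 0.41, the score threshold, and the conventions that heights are given in pixels and offsets in units of feature-map cells are all assumptions made for illustration.

```python
import numpy as np


def decode_proposals(center, offset, height, stride=4,
                     aspect_ratio=0.41, score_thresh=0.5):
    """Turn per-cell predictions into boxes (x1, y1, x2, y2) in image pixels.

    center: (H, W) array of center scores in [0, 1].
    offset: (2, H, W) array of (dy, dx) sub-cell offsets (assumed layout).
    height: (H, W) array of pedestrian heights in pixels (assumed encoding).
    """
    # Keep only cells whose center score exceeds the threshold.
    ys, xs = np.nonzero(center > score_thresh)
    scores = center[ys, xs]

    # Refine the cell index with the predicted offset, then map to pixels.
    cy = (ys + offset[0, ys, xs]) * stride
    cx = (xs + offset[1, ys, xs]) * stride

    # Width is derived from height via a fixed aspect ratio (an assumption).
    h = height[ys, xs]
    w = aspect_ratio * h

    boxes = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    return boxes, scores


if __name__ == "__main__":
    # One synthetic center at row 1, column 2 of a 4x4 map.
    center = np.zeros((4, 4))
    center[1, 2] = 0.9
    offset = np.zeros((2, 4, 4))
    offset[0, 1, 2] = 0.5   # dy
    offset[1, 1, 2] = 0.25  # dx
    height = np.full((4, 4), 100.0)
    print(decode_proposals(center, offset, height))
```

In a multibranch setting, each branch would produce its own set of maps, and the decoded boxes from all branches would typically be merged and filtered (for example with non-maximum suppression) before any reclassification step.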