Interactive multi-scale feature representation enhancement for small object detection

2021 ◽  
Vol 108 ◽  
pp. 104128
Author(s):  
Qiyuan Zheng ◽  
Ying Chen

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3031
Author(s):  
Jing Lian ◽  
Yuhang Yin ◽  
Linhui Li ◽  
Zhenghao Wang ◽  
Yafu Zhou

Traffic scenes contain many small objects, but their low resolution and limited information make them difficult to detect, even though small object detection is essential for understanding traffic scene environments. To improve the detection accuracy of small objects in traffic scenes, we propose a small object detection method based on attention feature fusion. First, a multi-scale channel attention block (MS-CAB) is designed, which aggregates the effective information of feature maps at both local and global scales. Based on this block, an attention feature fusion block (AFFB) is proposed, which better integrates contextual information from different layers. Finally, the AFFB replaces the linear fusion module in the object detection network to obtain the final network structure. The experimental results show that, compared to the baseline model YOLOv5s, this method achieves a higher mean Average Precision (mAP) while maintaining real-time performance: on the validation set of the traffic scene dataset BDD100K, it increases the mAP of all objects by 0.9 percentage points and the mAP of small objects by 3.5%.
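To make the fusion idea concrete, here is a minimal PyTorch sketch of one plausible reading of MS-CAB and AFFB as described in the abstract. The module names, the channel-reduction ratio, and the exact two-branch layout are assumptions for illustration, not the authors' published design.

```python
import torch
import torch.nn as nn


class MSCAB(nn.Module):
    """Sketch of a multi-scale channel attention block: a local branch
    (point-wise convs, no pooling) and a global branch (global average
    pooling first) are summed and squashed into per-channel weights."""

    def __init__(self, channels, reduction=4):  # reduction ratio is assumed
        super().__init__()
        mid = max(channels // reduction, 1)
        self.local_branch = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
        )
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),            # global scale: squeeze H x W
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # Global branch output (N, C, 1, 1) broadcasts over the local one.
        return self.sigmoid(self.local_branch(x) + self.global_branch(x))


class AFFB(nn.Module):
    """Sketch of an attention feature fusion block: replaces a linear
    fusion (add/concat) of two layers with an attention-weighted sum."""

    def __init__(self, channels):
        super().__init__()
        self.attention = MSCAB(channels)

    def forward(self, shallow, deep):
        w = self.attention(shallow + deep)      # per-channel fusion weights
        return w * shallow + (1.0 - w) * deep


if __name__ == "__main__":
    # Fuse two same-sized feature maps, e.g. from different pyramid levels.
    x = torch.randn(2, 128, 40, 40)
    y = torch.randn(2, 128, 40, 40)
    print(AFFB(128)(x, y).shape)                # torch.Size([2, 128, 40, 40])
```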


2021 ◽  
Vol 13 (6) ◽  
pp. 1198
Author(s):  
Bi-Yuan Liu ◽  
Huai-Xin Chen ◽  
Zhou Huang ◽  
Xing Liu ◽  
Yun-Zhi Yang

Drone-based object detection has been widely applied in ground object surveillance, urban patrol, and other fields. However, the dramatic scale changes and complex backgrounds of drone images usually result in weak feature representations of small objects, which makes high-precision object detection challenging. To improve small object detection, this paper proposes a novel cross-scale knowledge distillation (CSKD) method that enhances the features of small objects in a manner similar to image enlargement, and is therefore termed ZoomInNet. First, based on an efficient feature pyramid network structure, the teacher and student networks are trained with images at different scales to introduce cross-scale features. Then, the proposed layer adaption (LA) and feature level alignment (FA) mechanisms are applied to align the feature sizes of the two models. After that, the adaptive key distillation point (AKDP) algorithm is used to locate the crucial positions in the feature maps that need knowledge distillation. Finally, a position-aware L2 loss measures the difference between the feature maps of the cross-scale models, realizing cross-scale information compression in a single model. Experiments on the challenging VisDrone2018 dataset show that the proposed method draws on the advantages of image pyramid methods while avoiding their heavy computation, and significantly improves the detection accuracy of small objects. A comparison with mainstream methods further shows that our method achieves the best performance in small object detection.
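As a rough illustration of the distillation step described above, the following PyTorch sketch shows a feature-alignment adapter and a position-weighted L2 loss. The function names, the 1x1-conv adapter, the bilinear resize, and the random key-point mask are illustrative assumptions standing in for the paper's LA/FA and AKDP mechanisms, not their actual implementations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def align_student_to_teacher(student_feat, teacher_feat, adapter):
    """Alignment sketch (stands in for the LA/FA steps): a 1x1 conv maps
    student channels to the teacher's, then bilinear resizing matches the
    teacher's (larger-input) spatial resolution."""
    x = adapter(student_feat)
    return F.interpolate(x, size=teacher_feat.shape[-2:],
                         mode="bilinear", align_corners=False)


def position_aware_l2_loss(student_feat, teacher_feat, key_mask, eps=1e-6):
    """Position-weighted L2 distillation loss sketch: squared feature
    differences are weighted by a mask of key positions (the role AKDP
    plays in the paper) and normalized by the mask's total weight."""
    diff = (student_feat - teacher_feat) ** 2
    weighted = diff * key_mask                  # emphasize key positions
    return weighted.sum() / (key_mask.sum() * student_feat.size(1) + eps)


if __name__ == "__main__":
    # The teacher sees an enlarged image, so its feature map is larger.
    teacher_feat = torch.randn(2, 256, 80, 80)
    student_feat = torch.randn(2, 128, 40, 40)
    adapter = nn.Conv2d(128, 256, kernel_size=1)  # hypothetical layer adapter
    aligned = align_student_to_teacher(student_feat, teacher_feat, adapter)
    # Hypothetical key-point mask; the paper derives it adaptively (AKDP).
    key_mask = torch.rand(2, 1, 80, 80)
    print(position_aware_l2_loss(aligned, teacher_feat, key_mask).item())
```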


Author(s):  
Tripop Tongboonsong ◽  
Akkarat Boonpoonga ◽  
Kittisak Phaebua ◽  
Titipong Lertwiriyaprapa ◽  
Lakkhana Bannawat
