Multi-scale Object Detection in Optical Remote Sensing Images Using Atrous Feature Pyramid Network

Deep-learning-based object detection methods are increasingly applied to the interpretation of optical remote sensing images. However, complex backgrounds and the wide range of object sizes in these images make detection difficult. In this paper, we improve detection performance by incorporating attention information and generating adaptive anchor boxes from the attention map. Specifically, an attention mechanism enhances the features of object regions while suppressing the influence of the background. The resulting attention map is then used by the guided anchoring method to produce diverse, adaptable anchor boxes, which match the scene and the objects better than traditional proposal boxes. Finally, a modulated feature adaptation module transforms the feature maps to suit the diverse anchor boxes. Comprehensive evaluations on the DIOR dataset demonstrate the superiority of the proposed method over state-of-the-art methods such as RetinaNet, FCOS and CornerNet. The mean average precision of the proposed method is 4.5% higher than that of the feature pyramid network. Ablation experiments further analyze the contribution of each block to the performance improvement.
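
As a rough illustration of how an attention map can enhance object regions and suppress background, the sketch below re-weights a feature map with a sigmoid-normalized spatial attention map. It is a minimal NumPy sketch, not the paper's network: the function name `apply_spatial_attention`, the tensor shapes, and the use of a sigmoid are assumptions made for illustration.

```python
import numpy as np

def apply_spatial_attention(features, attention_logits):
    """Re-weight a feature map with a spatial attention map.

    features:         array of shape (C, H, W)
    attention_logits: array of shape (H, W), unnormalized scores
                      (hypothetical input; in a real detector these
                      would come from a learned attention branch)
    Returns the modulated features and the attention map in (0, 1).
    """
    features = np.asarray(features, dtype=float)
    logits = np.asarray(attention_logits, dtype=float)
    # Sigmoid squashes each location's score into (0, 1).
    attention = 1.0 / (1.0 + np.exp(-logits))
    # Broadcast the (H, W) map across all C channels.
    modulated = features * attention[None, :, :]
    return modulated, attention

# Locations with high scores keep their features; low scores are damped.
feats = np.ones((2, 2, 2))
logits = np.array([[10.0, -10.0], [0.0, 0.0]])
modulated, attention = apply_spatial_attention(feats, logits)
```

Locations with strongly positive scores pass through almost unchanged, while strongly negative scores push the features toward zero.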

A Multi-Scale Spatial Attention Region Proposal Network for High-Resolution Optical Remote Sensing Imagery

Remote Sensing ◽

10.3390/rs13173362 ◽

2021 ◽

Vol 13 (17) ◽

pp. 3362

Author(s):

Ruchan Dong ◽

Licheng Jiao ◽

Yan Zhang ◽

Jin Zhao ◽

Weiyan Shen

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Object Detection ◽

Spatial Attention ◽

Recall Rate ◽

Optical Remote Sensing ◽

Remote Sensing Images ◽

Remote Sensing Imagery ◽

Multi Scale ◽

Backbone Network

Deep convolutional neural networks (DCNNs) are driving progress in object detection for high-resolution remote sensing images. Region proposal generation, one of the key steps in object detection, has also become a focus of research. High-resolution remote sensing images usually contain objects of various sizes against complex backgrounds, so small objects are easily missed or misidentified. Improving the recall rate of region proposals for small and multi-scale objects would therefore improve detection accuracy. Spatial attention is the ability to focus on local features in images and can improve the learning efficiency of DCNNs. This study proposes a multi-scale spatial attention region proposal network (MSA-RPN) for high-resolution optical remote sensing imagery. The MSA-RPN is an end-to-end deep learning network with a ResNet backbone. It deploys three novel modules. First, the Scale-specific Feature Gate (SFG) focuses on object features by processing multi-scale features extracted from the backbone network. Second, the Spatial Attention-Guided Model (SAGM) obtains spatial information about objects from the multi-scale attention maps. Third, the Selective Strong Attention Maps Model (SSAMM) adaptively selects sliding windows according to the loss values fed back by the system and sends the windowed samples to the spatial attention decoder. Finally, the candidate regions and their corresponding confidences are obtained. We evaluate the proposed network on the public LEVIR dataset and compare it with several state-of-the-art methods. The proposed MSA-RPN yields a higher recall rate for region proposal generation, especially for small targets in remote sensing images.
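
The recall rate of region proposals mentioned above is commonly measured as the fraction of ground-truth boxes covered by at least one proposal above an IoU threshold. The following is a minimal sketch of that metric, assuming boxes in `(x1, y1, x2, y2)` format with positive area and a threshold of 0.5; the function names and the threshold are illustrative assumptions, not the evaluation protocol of the paper.

```python
import numpy as np

def iou_matrix(gt_boxes, proposals):
    """Pairwise IoU between ground-truth boxes and proposals.

    Both inputs are sequences of (x1, y1, x2, y2) boxes with positive area.
    Returns an array of shape (num_gt, num_proposals).
    """
    gt = np.asarray(gt_boxes, dtype=float).reshape(-1, 4)
    pr = np.asarray(proposals, dtype=float).reshape(-1, 4)
    # Corners of the intersection rectangle for every (gt, proposal) pair.
    x1 = np.maximum(gt[:, None, 0], pr[None, :, 0])
    y1 = np.maximum(gt[:, None, 1], pr[None, :, 1])
    x2 = np.minimum(gt[:, None, 2], pr[None, :, 2])
    y2 = np.minimum(gt[:, None, 3], pr[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_gt = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    area_pr = (pr[:, 2] - pr[:, 0]) * (pr[:, 3] - pr[:, 1])
    union = area_gt[:, None] + area_pr[None, :] - inter
    return inter / union

def proposal_recall(gt_boxes, proposals, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by at least one proposal."""
    if len(gt_boxes) == 0:
        raise ValueError("recall is undefined without ground-truth boxes")
    if len(proposals) == 0:
        return 0.0
    ious = iou_matrix(gt_boxes, proposals)
    matched = ious.max(axis=1) >= iou_threshold
    return float(matched.mean())

# One of the two ground-truth objects is covered by a proposal.
gt = [[0, 0, 10, 10], [20, 20, 30, 30]]
props = [[0, 0, 10, 10], [100, 100, 110, 110]]
recall = proposal_recall(gt, props)
```

Small objects are penalized heavily by this metric, since a shift of a few pixels can drop their IoU below the threshold.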

Laplacian Feature Pyramid Network for Object Detection in VHR Optical Remote Sensing Images

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2021.3072488 ◽

2021 ◽

pp. 1-14

Author(s):

Wenhua Zhang ◽

Licheng Jiao ◽

Yuxuan Li ◽

Zhongjian Huang ◽

Haoran Wang

Keyword(s):

Remote Sensing ◽

Object Detection ◽

Optical Remote Sensing ◽

Remote Sensing Images ◽

Feature Pyramid

Multi-Scale Feature Fusion Network for Object Detection in VHR Optical Remote Sensing Images

IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium ◽

10.1109/igarss.2019.8897842 ◽

2019 ◽

Cited By ~ 1

Author(s):

Wenhua Zhang ◽

Licheng Jiao ◽

Xu Liu ◽

Jia Liu

Keyword(s):

Remote Sensing ◽

Object Detection ◽

Feature Fusion ◽

Optical Remote Sensing ◽

Remote Sensing Images ◽

Scale Feature ◽

Multi Scale

Object Detection in Remote Sensing Images via Multi-Feature Pyramid Network with Receptive Field Block

Remote Sensing ◽

10.3390/rs13050862 ◽

2021 ◽

Vol 13 (5) ◽

pp. 862

Author(s):

Zhichao Yuan ◽

Ziming Liu ◽

Chunbo Zhu ◽

Jing Qi ◽

Danpei Zhao

Keyword(s):

Remote Sensing ◽

Receptive Field ◽

Object Detection ◽

Experimental Results ◽

Local Context ◽

Optical Remote Sensing ◽

Remote Sensing Images ◽

Global Features ◽

Complex Background ◽

Feature Pyramid

Object detection in optical remote sensing images (ORSIs) remains a difficult task because ORSIs have specific characteristics such as scale differences between classes, numerous instances in one image and complex background texture. To address these problems, we propose a new Multi-Feature Pyramid Network (MFPNet) with a Receptive Field Block (RFB) that integrates both local and global features to detect scattered objects and targets with scale differences in ORSIs. We build a Multi-Feature Pyramid Module (M-FPM) with two cascaded convolution pyramids as the main structure of MFPNet, which handles objects of different scales well. The RFB is designed to construct local context information, which makes the network better suited to detecting objects against complex backgrounds. An asymmetric convolution kernel is introduced into the RFB to improve the ability of feature attraction by adding nonlinear transformation. Then, a two-step detection network is constructed to combine the M-FPM and RFB to obtain more accurate results. Through a comprehensive evaluation on two publicly available remote sensing datasets, Levir and DIOR, we demonstrate that our method outperforms state-of-the-art networks by about 1.3% mAP on the Levir dataset and 4.1% mAP on the DIOR dataset. The experimental results demonstrate the effectiveness of our method on ORSIs of complex environments.
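
The abstract does not specify how the asymmetric kernels are arranged inside the RFB, so the sketch below only illustrates the general idea of an asymmetric convolution: a 1×k filter followed by a k×1 filter, with an optional nonlinearity in between. Without the nonlinearity the pair is equivalent to a single k×k filter formed as an outer product; inserting one adds a nonlinear transformation. All names here are hypothetical.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 2-D cross-correlation with no padding ('valid' output)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def asymmetric_conv(image, row_kernel, col_kernel, nonlinearity=None):
    """Apply a 1xk filter, an optional nonlinearity, then a kx1 filter."""
    mid = correlate2d_valid(image, np.asarray(row_kernel, float).reshape(1, -1))
    if nonlinearity is not None:
        mid = nonlinearity(mid)
    return correlate2d_valid(mid, np.asarray(col_kernel, float).reshape(-1, 1))

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6))
row = np.array([1.0, 2.0, 1.0])   # 1x3 filter
col = np.array([1.0, 0.0, -1.0])  # 3x1 filter

# Linear case: identical to one 3x3 filter built as an outer product.
separable = asymmetric_conv(image, row, col)
full = correlate2d_valid(image, np.outer(col, row))

# With a ReLU in between, the result is no longer a single linear filter.
relu = lambda x: np.maximum(x, 0.0)
nonlinear = asymmetric_conv(image, row, col, nonlinearity=relu)
```

The factorized form uses 2k weights instead of k² per filter, which is one common motivation for asymmetric kernels.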