VaryBlock: A Novel Approach for Object Detection in Remote Sensed Images

In recent years, research on optical remote sensing images has received growing attention. Object detection, one of the most challenging tasks in remote sensing, has been advanced considerably by convolutional neural network (CNN)-based methods such as You Only Look Once (YOLO) and Faster R-CNN. However, because of complex backgrounds and the distinctive distribution of objects, directly applying these general-purpose detectors to remote sensing images usually yields poor performance. To tackle this problem, an efficient and robust framework based on YOLO is proposed. We devise VaryBlock and integrate it into the architecture, which effectively offsets some of the information loss caused by downsampling. In addition, several techniques are used to improve performance and to avoid overfitting. Experimental results show that the proposed method improves the mean average precision by a large margin on the NWPU VHR-10 dataset.

Feature Enhancement Network for Object Detection in Optical Remote Sensing Images

Journal of Remote Sensing ◽ 10.34133/2021/9805389 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-14

Author(s): Gong Cheng ◽ Chunbo Lang ◽ Maoxiong Wu ◽ Xingxing Xie ◽ Xiwen Yao ◽ ...

Keyword(s): Remote Sensing ◽ Object Detection ◽ Large Scale ◽ Poor Performance ◽ Natural Image ◽ Land Resource ◽ Optical Remote Sensing ◽ Remote Sensing Images ◽ Feature Enhancement ◽ Context Cues

Automatic and robust object detection in remote sensing images is of vital significance in real-world applications such as land resource management and disaster rescue. However, state-of-the-art natural image detection algorithms perform poorly when applied directly to remote sensing images, largely because of variations in object scale and aspect ratio, indistinguishable object appearances, and complex background scenarios. In this paper, we propose a novel Feature Enhancement Network (FENet) for object detection in optical remote sensing images, which consists of a Dual Attention Feature Enhancement (DAFE) module and a Context Feature Enhancement (CFE) module. Specifically, the DAFE module guides the network to focus on the distinctive features of the objects of interest and to suppress useless ones by jointly recalibrating the spatial and channel feature responses. The CFE module is designed to capture global context cues and selectively strengthen class-aware features by leveraging image-level contextual information that indicates the presence or absence of the object classes. To this end, we employ a context encoding loss to regularize model training, which helps the object detector understand the scene better and narrows the probable object categories in prediction. We build FENet by unifying DAFE and CFE into the framework of Faster R-CNN. In the experiments, we evaluate the proposed method on two large-scale remote sensing object detection datasets, DIOR and DOTA, and demonstrate its effectiveness compared with the baseline methods.
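
The image-level presence/absence signal described in this abstract can be illustrated with a minimal sketch. It assumes the context encoding loss behaves like a multi-label binary cross-entropy over class logits; the abstract does not give the exact form, so the function names `presence_vector` and `context_bce_loss` and the loss formula itself are illustrative assumptions, not the authors' implementation.

```python
import math

def presence_vector(box_labels, num_classes):
    """Multi-hot target: 1.0 if a class appears among the image's boxes, else 0.0."""
    present = set(box_labels)
    return [1.0 if c in present else 0.0 for c in range(num_classes)]

def context_bce_loss(logits, targets):
    """Mean binary cross-entropy over classes, computed from raw logits.

    Assumed stand-in for the context encoding loss (hypothetical form).
    Uses the numerically stable identity
    BCE(z, t) = max(z, 0) - z * t + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Example: an image annotated with boxes of classes 2, 0 and 2 (4 classes in total).
targets = presence_vector([2, 0, 2], num_classes=4)
loss_good = context_bce_loss([4.0, -4.0, 4.0, -4.0], targets)  # logits agree with targets
loss_bad = context_bce_loss([-4.0, 4.0, -4.0, 4.0], targets)   # logits contradict targets
```

Under this assumption, a scene-level prediction that agrees with the annotated classes yields a lower loss than one that contradicts them, which is the regularizing effect the abstract attributes to the context cue.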

Generating Anchor Boxes Based on Attention Mechanism for Object Detection in Remote Sensing Images

Remote Sensing ◽ 10.3390/rs12152416 ◽ 2020 ◽ Vol 12 (15) ◽ pp. 2416 ◽ Cited By ~ 1

Author(s): Zhuangzhuang Tian ◽ Ronghui Zhan ◽ Jiemin Hu ◽ Wei Wang ◽ Zhiqiang He ◽ ...

Keyword(s): Remote Sensing ◽ Object Detection ◽ Attention Mechanism ◽ Detection Methods ◽ Optical Remote Sensing ◽ Remote Sensing Images ◽ Feature Maps ◽ Wide Range ◽ Feature Pyramid ◽ Comprehensive Evaluations

Object detection methods based on deep learning are increasingly applied to the interpretation of optical remote sensing images. However, the complex backgrounds and the wide range of object sizes in remote sensing images make object detection more difficult. In this paper, we improve detection performance by incorporating attention information and generating adaptive anchor boxes based on the attention map. Specifically, an attention mechanism is introduced to enhance the features of object regions while reducing the influence of the background. The generated attention map is then used to obtain diverse and adaptable anchor boxes through the guided anchoring method. Compared with traditional proposal boxes, the generated anchor boxes match the scene and the objects better. Finally, a modulated feature adaptation module transforms the feature maps to fit the diverse anchor boxes. Comprehensive evaluations on the DIOR dataset demonstrate the superiority of the proposed method over state-of-the-art methods such as RetinaNet, FCOS and CornerNet. The mean average precision of the proposed method is 4.5% higher than that of the feature pyramid network. In addition, ablation experiments are conducted to further analyze the influence of each block on the performance improvement.
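
The idea of deriving anchors from an attention map can be sketched roughly as follows. This is a simplified illustration, assuming that anchor centers are kept wherever the attention value exceeds a threshold and that a per-location shape prediction is already available; the names `anchors_from_attention`, `shapes`, `stride` and `thresh` are hypothetical and do not come from the paper.

```python
def anchors_from_attention(attn, shapes, stride, thresh=0.5):
    """Return anchors (cx, cy, w, h) in image coordinates.

    attn   : 2D list of attention scores in [0, 1], one per feature-map cell
    shapes : 2D list of (w, h) tuples, an assumed per-cell shape prediction
    stride : downsampling factor between the image and the feature map
    thresh : assumed cutoff; cells at or below it produce no anchor
    """
    anchors = []
    for y, row in enumerate(attn):
        for x, score in enumerate(row):
            if score > thresh:
                # Map the cell center back to image coordinates.
                cx = (x + 0.5) * stride
                cy = (y + 0.5) * stride
                w, h = shapes[y][x]
                anchors.append((cx, cy, w, h))
    return anchors

# Example: only the top-right cell is salient enough to receive an anchor.
attn = [[0.1, 0.9],
        [0.2, 0.05]]
shapes = [[(8, 8), (32, 16)],
          [(8, 8), (8, 8)]]
anchors = anchors_from_attention(attn, shapes, stride=8)
```

Compared with tiling fixed anchors over every cell, this kind of selection places boxes only where the attention map suggests an object, which is the motivation the abstract gives for adaptive anchors.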

Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images

IEEE Transactions on Image Processing ◽ 10.1109/tip.2020.3042084 ◽ 2021 ◽ Vol 30 ◽ pp. 1305-1317

Author(s): Qijian Zhang ◽ Runmin Cong ◽ Chongyi Li ◽ Ming-Ming Cheng ◽ Yuming Fang ◽ ...

Keyword(s): Remote Sensing ◽ Object Detection ◽ Salient Object Detection ◽ Optical Remote Sensing ◽ Salient Object ◽ Remote Sensing Images ◽ Fluid Network

Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images

IEEE Transactions on Geoscience and Remote Sensing ◽ 10.1109/tgrs.2016.2601622 ◽ 2016 ◽ Vol 54 (12) ◽ pp. 7405-7415 ◽ Cited By ~ 612

Author(s): Gong Cheng ◽ Peicheng Zhou ◽ Junwei Han

Keyword(s): Remote Sensing ◽ Neural Networks ◽ Object Detection ◽ Convolutional Neural Networks ◽ Optical Remote Sensing ◽ Remote Sensing Images ◽ Rotation Invariant

A Fine-Grained Object Detection Framework Based on Fixed ROI Masking and Feature Optimization in Optical Remote Sensing Images

10.1109/iccais52680.2021.9624648 ◽ 2021

Author(s): Zhang Xiaohan ◽ Lv Yafei ◽ Bi Aipeng ◽ Zhao Jianming ◽ Yao Libo

Keyword(s): Remote Sensing ◽ Object Detection ◽ Optical Remote Sensing ◽ Remote Sensing Images ◽ Fine Grained ◽ Feature Optimization

Improved Oriented Object Detection in Remote Sensing Images Based on a Three-Point Regression Method

Remote Sensing ◽ 10.3390/rs13224517 ◽ 2021 ◽ Vol 13 (22) ◽ pp. 4517

Author(s): Falin Wu ◽ Jiaqi He ◽ Guopeng Zhou ◽ Haolun Li ◽ Yushuang Liu ◽ ...

Keyword(s): Remote Sensing ◽ Object Detection ◽ Poor Performance ◽ Regression Method ◽ Remote Sensing Images ◽ Sensing Applications ◽ Bounding Box ◽ Bounding Boxes ◽ Fully Connected ◽ Oriented Object

Object detection in remote sensing images plays an important role in both military and civilian remote sensing applications. Objects in remote sensing images differ from those in natural images: they vary in scale, appear in arbitrary orientations, and are often densely arranged, all of which makes detection difficult. For oblique, densely arranged objects with a large aspect ratio, using an oriented bounding box helps avoid mistakenly deleting correct detection boxes. The classic rotational region convolutional neural network (R2CNN) has advantages for text detection. However, R2CNN performs poorly on slender objects with arbitrary orientations in remote sensing images, and its fault tolerance rate is low. To solve this problem, this paper proposes an improved R2CNN based on a double detection head structure and a three-point regression method, namely TPR-R2CNN. The proposed network modifies the original R2CNN structure by applying a double fully connected (2-fc) detection head and classification fusion. One detection head performs classification and horizontal bounding box regression; the other performs classification and oriented bounding box regression. The three-point regression method (TPR) is proposed for oriented bounding box regression; it determines the position of the oriented bounding box by regressing the coordinates of the center point and the first two vertices. The proposed network was validated on the DOTA-v1.5 and HRSC2016 datasets, where it improved the mean average precision (mAP) by 3.90% and 15.27%, respectively, over feature pyramid network (FPN) baselines with a ResNet-50 backbone.
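
Why a center point and two vertices are enough to fix an oriented box can be shown with a short geometric sketch. It assumes the two regressed vertices are adjacent corners and that the remaining corners are recovered by reflecting them through the center; the function name `decode_three_points` and this decoding step are illustrative assumptions, since the abstract does not spell out the decoding.

```python
def decode_three_points(center, v1, v2):
    """Recover all four corners of an oriented box from its center and two
    adjacent vertices (assumed ordering: v1, v2, v3, v4 around the box).

    In a rectangle, opposite corners are symmetric about the center,
    so v3 = 2*c - v1 and v4 = 2*c - v2.
    """
    cx, cy = center
    v3 = (2 * cx - v1[0], 2 * cy - v1[1])
    v4 = (2 * cx - v2[0], 2 * cy - v2[1])
    return [tuple(v1), tuple(v2), v3, v4]

# Example: an axis-aligned 4 x 2 box centered at the origin.
corners = decode_three_points(center=(0, 0), v1=(-2, -1), v2=(2, -1))
```

Three points carry six numbers, while an eight-number corner representation would have to satisfy the rectangle constraints on its own; regressing fewer, geometrically tied quantities is one plausible reading of why the method suits slender, arbitrarily oriented objects.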

Edge-Aware Multiscale Feature Integration Network for Salient Object Detection in Optical Remote Sensing Images

IEEE Transactions on Geoscience and Remote Sensing ◽ 10.1109/tgrs.2021.3091312 ◽ 2021 ◽ pp. 1-15

Author(s): Xiaofei Zhou ◽ Kunye Shen ◽ Zhi Liu ◽ Chen Gong ◽ Jiyong Zhang ◽ ...

Keyword(s): Remote Sensing ◽ Object Detection ◽ Salient Object Detection ◽ Feature Integration ◽ Optical Remote Sensing ◽ Salient Object ◽ Remote Sensing Images

Rotation-Invariant Feature Learning for Object Detection in VHR Optical Remote Sensing Images by Double-Net

IEEE Access ◽ 10.1109/access.2019.2960931 ◽ 2020 ◽ Vol 8 ◽ pp. 20818-20827 ◽ Cited By ~ 5

Author(s): Zhi Zhang ◽ Ruoqiao Jiang ◽ Shaohui Mei ◽ Shun Zhang ◽ Yifan Zhang

Keyword(s): Remote Sensing ◽ Object Detection ◽ Feature Learning ◽ Optical Remote Sensing ◽ Remote Sensing Images ◽ Rotation Invariant ◽ Invariant Feature

Object Detection in Optical Remote Sensing Images Based on Transfer Learning Convolutional Neural Networks

2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS) ◽ 10.1109/ccis.2018.8691238 ◽ 2018

Author(s): Zhenguo Yan ◽ Xin Song ◽ Hanyang Zhong ◽ Xiaozhou Zhu

Keyword(s): Remote Sensing ◽ Neural Networks ◽ Object Detection ◽ Transfer Learning ◽ Convolutional Neural Networks ◽ Optical Remote Sensing ◽ Remote Sensing Images

Detection of Collapsed Buildings in Post-Earthquake Remote Sensing Images Based on the Improved YOLOv3

Remote Sensing ◽ 10.3390/rs12010044 ◽ 2019 ◽ Vol 12 (1) ◽ pp. 44 ◽ Cited By ~ 10

Author(s): Haojie Ma ◽ Yalan Liu ◽ Yuhuan Ren ◽ Jingxian Yu

Keyword(s): Remote Sensing ◽ Object Detection ◽ Spatial Resolution ◽ High Spatial Resolution ◽ Detection Methods ◽ Remote Sensing Images ◽ Sensing Technology ◽ Collapsed Buildings ◽ Improved Model ◽ Detection Speed

An important and effective method for the preliminary mitigation and relief of an earthquake is the rapid estimation of building damage via high spatial resolution remote sensing technology. Traditional object detection methods rely only on hand-crafted shallow features, which cope poorly with the uncertain and complex backgrounds of post-earthquake remote sensing images and require time-consuming feature selection, so satisfactory results are often difficult to obtain. Therefore, this study applies the convolutional neural network (CNN)-based object detection method You Only Look Once (YOLOv3) to locate collapsed buildings in post-earthquake remote sensing images. Moreover, YOLOv3 was improved to obtain more effective detection results. First, we replaced the Darknet53 CNN in YOLOv3 with the lightweight CNN ShuffleNet v2. Second, the prediction-box center-point (XY) loss and the prediction-box width and height (WH) loss in the loss function were replaced with the generalized intersection over union (GIoU) loss. Experiments performed using the improved YOLOv3 model, with high spatial resolution aerial remote sensing images at a resolution of 0.5 m acquired after the Yushu and Wenchuan earthquakes, show a significant reduction in the number of parameters, a detection speed of up to 29.23 f/s, and a target precision of 90.89%. Compared with the general YOLOv3, the detection speed improved by 5.21 f/s and the precision improved by 5.24%. Moreover, the improved model had stronger noise immunity, which indicates a significant improvement in the model's generalization. Therefore, this improved YOLOv3 model is effective for the detection of collapsed buildings in post-earthquake high-resolution remote sensing images.
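
The GIoU loss mentioned in this abstract can be written out for two axis-aligned boxes as a minimal sketch. It uses the standard definition, GIoU = IoU - |C \ (A ∪ B)| / |C| with C the smallest box enclosing both, and loss = 1 - GIoU; the function name `giou_loss`, the (x1, y1, x2, y2) box format, and the assumption of boxes with positive area are choices made for this illustration rather than details from the paper.

```python
def giou_loss(box_a, box_b):
    """GIoU loss for two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest enclosing box C and the share of C not covered by the union.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (c_area - union) / c_area
    return 1.0 - giou

# Identical boxes give zero loss; disjoint boxes still give a distance-aware value.
same = giou_loss((0, 0, 2, 2), (0, 0, 2, 2))
apart = giou_loss((0, 0, 1, 1), (2, 0, 3, 1))
```

Unlike a plain IoU loss, which is constant for all non-overlapping pairs, this loss keeps growing as disjoint boxes move apart. That property is commonly cited as the reason for using it in place of separate center and size terms.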
