A Lightweight Keypoint-Based Oriented Object Detection of Remote Sensing Images

2021 ◽  
Vol 13 (13) ◽  
pp. 2459
Author(s):  
Yangyang Li ◽  
Heting Mao ◽  
Ruijiao Liu ◽  
Xuan Pei ◽  
Licheng Jiao ◽  
...  

Object detection in remote sensing images is widely used in both military and civilian fields and is a challenging task due to complex backgrounds, large scale variations, and the dense arrangement of objects in arbitrary orientations. In addition, existing object detection methods rely on increasingly deep networks, which add considerable computational overhead and parameters and are unfavorable for deployment on edge devices. In this paper, we propose a lightweight keypoint-based oriented object detector for remote sensing images. First, we propose a semantic transfer block (STB) for merging shallow and deep features, which reduces noise and restores semantic information. Then, the proposed adaptive Gaussian kernel (AGK) adapts to objects of different scales and further improves detection performance. Finally, we propose a distillation loss tailored to object detection to obtain a lightweight student network. Experiments on the HRSC2016 and UCAS-AOD datasets show that the proposed method adapts to objects of different scales, produces accurate bounding boxes, and reduces the influence of complex backgrounds. Comparison with mainstream methods shows that our method achieves comparable performance while remaining lightweight.
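
The abstract does not give the AGK's exact formulation, so the following is only a minimal NumPy sketch of the general idea behind a scale-adaptive Gaussian keypoint target: the spread of the splatted Gaussian grows with the object's box size, so supervision widens for large objects and tightens for small ones. The per-axis sigmas and the `scale_factor` hyperparameter are assumptions, not values from the paper.

```python
import numpy as np

def adaptive_gaussian_heatmap(heatmap, center, box_w, box_h, scale_factor=0.15):
    """Splat a scale-adaptive 2D Gaussian onto a keypoint heatmap.

    The standard deviation grows with the object's size, so large objects
    get a wider positive region than small ones. `scale_factor` is a
    hypothetical hyperparameter, not taken from the paper.
    """
    cx, cy = int(center[0]), int(center[1])
    sigma_x = max(box_w * scale_factor, 1.0)
    sigma_y = max(box_h * scale_factor, 1.0)
    h, w = heatmap.shape
    ys, xs = np.ogrid[0:h, 0:w]
    gauss = np.exp(-((xs - cx) ** 2) / (2 * sigma_x ** 2)
                   - ((ys - cy) ** 2) / (2 * sigma_y ** 2))
    np.maximum(heatmap, gauss, out=heatmap)  # keep the peak where objects overlap
    return heatmap
```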

2021 ◽  
Vol 13 (22) ◽  
pp. 4517
Author(s):  
Falin Wu ◽  
Jiaqi He ◽  
Guopeng Zhou ◽  
Haolun Li ◽  
Yushuang Liu ◽  
...  

Object detection in remote sensing images plays an important role in both military and civilian remote sensing applications. Objects in remote sensing images differ from those in natural images: they exhibit scale diversity, arbitrary orientation, and dense arrangement, all of which make detection difficult. For slender, oblique, and densely arranged objects, using an oriented bounding box helps avoid mistakenly discarding correct detections. The classic rotational region convolutional neural network (R2CNN) works well for text detection, but it performs poorly on slender, arbitrarily oriented objects in remote sensing images and has a low fault tolerance. To solve this problem, this paper proposes an improved R2CNN based on a double detection head structure and a three-point regression method, namely TPR-R2CNN. The proposed network modifies the original R2CNN structure by applying a double fully connected (2-fc) detection head with classification fusion: one head performs classification and horizontal bounding box regression, while the other performs classification and oriented bounding box regression. The three-point regression (TPR) method determines the position of the oriented bounding box by regressing the coordinates of the center point and the first two vertices. The proposed network was validated on the DOTA-v1.5 and HRSC2016 datasets, where it improved mean average precision (mAP) by 3.90% and 15.27%, respectively, over feature pyramid network (FPN) baselines with a ResNet-50 backbone.
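
The three-point parameterization has a simple geometric consequence worth making explicit: once the center and two adjacent vertices are known, the remaining two vertices follow by point symmetry about the center. A minimal NumPy sketch of that decoding step (the regression head itself is omitted):

```python
import numpy as np

def decode_tpr(center, v1, v2):
    """Recover all four vertices of an oriented box from the TPR targets:
    the center and the first two (adjacent) vertices. The remaining two
    vertices are the reflections of v1 and v2 through the center.
    """
    c = np.asarray(center, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    v2 = np.asarray(v2, dtype=np.float64)
    v3 = 2.0 * c - v1  # point-symmetric to v1 about the center
    v4 = 2.0 * c - v2  # point-symmetric to v2 about the center
    return np.stack([v1, v2, v3, v4])

# e.g. a box centered at the origin, rotated 45 degrees:
# decode_tpr((0, 0), (-2.12, 0.71), (-0.71, 2.12))
```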


2021 ◽  
Vol 13 (18) ◽  
pp. 3622
Author(s):  
Xu He ◽  
Shiping Ma ◽  
Linyuan He ◽  
Le Ru ◽  
Chen Wang

Oriented object detection in remote sensing images (RSIs) is a significant yet challenging Earth Vision task, as objects in RSIs usually appear against complicated backgrounds and with arbitrary orientations, multi-scale distributions, and dramatic aspect ratio variations. Existing oriented object detectors mostly inherit the anchor-based paradigm; however, their strong performance in high-precision, real-time detection is offset by the design burden of tediously tuned rotated anchors. Exploiting the simplicity and efficiency of keypoint-based detection, in this work we extend a keypoint-based detector to oriented object detection in RSIs. Specifically, we first simplify the oriented bounding box (OBB) as a center-based rotated inscribed ellipse (RIE), and then employ six parameters to represent the RIE inside each OBB: the center point position of the RIE, the offsets of the long half-axis, the length of the short half-axis, and an orientation label. In addition, to counter the influence of complex backgrounds and large scale variations, a high-resolution gated aggregation network (HRGANet) is designed to separate targets of interest from complex backgrounds and to fuse multi-scale features using a gated aggregation model (GAM). Furthermore, by analyzing the influence of eccentricity on orientation error, an eccentricity-wise orientation loss (ewoLoss) is proposed that scales the orientation penalty by the eccentricity of the RIE, which effectively improves detection accuracy for oriented objects with large aspect ratios. Extensive experimental results on the DOTA and HRSC2016 datasets demonstrate the effectiveness of the proposed method.
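
As a concrete illustration of the six-parameter RIE representation, the sketch below converts a conventional (cx, cy, w, h, theta) OBB into (center, long half-axis offsets, short half-axis length, orientation label). The abstract does not spell out how the orientation label is defined, so the binary label here is an assumption:

```python
import math

def obb_to_rie(cx, cy, w, h, theta):
    """Convert an oriented bounding box (cx, cy, w, h, theta in radians)
    into six RIE parameters as described in the abstract: ellipse center,
    the (dx, dy) offset of the long half-axis endpoint from the center,
    the short half-axis length, and an orientation label.
    """
    if h > w:                       # make the first axis the long one
        w, h = h, w
        theta += math.pi / 2.0
    dx = 0.5 * w * math.cos(theta)  # long half-axis offset from the center
    dy = 0.5 * w * math.sin(theta)
    b = 0.5 * h                     # short half-axis length
    label = 1 if dy >= 0 else 0     # hypothetical orientation label
    return cx, cy, dx, dy, b, label
```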


2021 ◽  
Vol 13 (7) ◽  
pp. 1318
Author(s):  
Jie-Bo Hou ◽  
Xiaobin Zhu ◽  
Xu-Cheng Yin

Object detection is a significant and challenging problem in remote sensing. Since remote sensing images are typically captured from a bird's-eye view, the aspect ratios of objects in the same category tend to follow a Gaussian distribution. Existing object detection methods generally ignore this distributional character of aspect ratios. In this paper, we propose a novel Self-Adaptive Aspect Ratio Anchor (SARA) to explicitly exploit the aspect ratio variations of objects in remote sensing images. Concretely, SARA self-adaptively learns an appropriate aspect ratio for each category, so a single simple square anchor (tied to the stride of each Feature Pyramid Network level) suffices to regress objects of various aspect ratios. Finally, we adopt an Oriented Box Decoder (OBD) to align the feature maps and encode the orientation information of oriented objects. Our method achieves a promising mAP of 79.91% on the DOTA dataset.
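
A minimal PyTorch sketch of the SARA idea, under the assumption that each category's aspect ratio is held as a learnable log-ratio that reshapes a square, stride-sized anchor; the names and the area-preserving mapping are illustrative, not the paper's code:

```python
import torch
import torch.nn as nn

class SelfAdaptiveAspectRatio(nn.Module):
    """Each category owns a learnable log-aspect-ratio that reshapes a
    square anchor into a category-specific rectangle."""

    def __init__(self, num_classes):
        super().__init__()
        # log-ratio = 0 means a 1:1 (square) anchor at initialization
        self.log_ratio = nn.Parameter(torch.zeros(num_classes))

    def forward(self, square_side, class_ids):
        """square_side: scalar tied to the FPN stride of this level.
        class_ids: LongTensor of category indices, one per anchor.
        Returns per-anchor (w, h) with w/h = exp(log_ratio[class])."""
        ratio = torch.exp(self.log_ratio[class_ids])
        w = square_side * torch.sqrt(ratio)   # keep the anchor area fixed
        h = square_side / torch.sqrt(ratio)
        return torch.stack([w, h], dim=-1)
```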


2021 ◽  
Vol 42 (17) ◽  
pp. 6670-6691
Author(s):  
Qiuyu Guan ◽  
Zhenshen Qu ◽  
Ming Zeng ◽  
Jianxiong Shen ◽  
Jingda Du

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 223373-223384
Author(s):  
Lin Zhou ◽  
Haoran Wei ◽  
Hao Li ◽  
Wenzhe Zhao ◽  
Yi Zhang ◽  
...  

2019 ◽  
Vol 12 (1) ◽  
pp. 44
Author(s):  
Haojie Ma ◽  
Yalan Liu ◽  
Yuhuan Ren ◽  
Jingxian Yu

Rapid estimation of building damage via high-spatial-resolution remote sensing is an important and effective tool for preliminary earthquake mitigation and relief. Traditional object detection methods rely only on hand-crafted shallow features of post-earthquake remote sensing images; such features are unreliable against complex backgrounds, require time-consuming feature selection, and rarely yield satisfactory results. This study therefore applies the convolutional neural network (CNN)-based object detector You Only Look Once (YOLOv3) to locate collapsed buildings in post-earthquake remote sensing images, and improves YOLOv3 to obtain more effective detections. First, we replaced the Darknet53 CNN in YOLOv3 with the lightweight CNN ShuffleNet v2. Second, the prediction-box center-point (XY) loss and width-height (WH) loss in the loss function were replaced with the generalized intersection over union (GIoU) loss. Experiments with the improved YOLOv3 model on high-spatial-resolution aerial remote sensing images at a resolution of 0.5 m, acquired after the Yushu and Wenchuan earthquakes, show a significant reduction in the number of parameters, a detection speed of up to 29.23 f/s, and a precision of 90.89%. Compared with the original YOLOv3, the detection speed improved by 5.21 f/s and the precision by 5.24%. Moreover, the improved model has stronger noise immunity, indicating better generalization. The improved YOLOv3 model is therefore effective for detecting collapsed buildings in post-earthquake high-resolution remote sensing images.
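
GIoU is a standard, published loss, so the replacement step can be shown concretely. A minimal PyTorch version for axis-aligned (x1, y1, x2, y2) boxes; the paper's training code is not available from the abstract, so this is a generic implementation of the same loss:

```python
import torch

def giou_loss(pred, target, eps=1e-7):
    """Generalized IoU loss for axis-aligned boxes in (x1, y1, x2, y2)
    format, as used to replace YOLOv3's separate XY and WH terms.
    pred, target: tensors of shape (N, 4)."""
    # intersection
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # smallest enclosing box
    ex1 = torch.min(pred[:, 0], target[:, 0])
    ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2])
    ey2 = torch.max(pred[:, 3], target[:, 3])
    enclose = (ex2 - ex1) * (ey2 - ey1)

    giou = iou - (enclose - union) / (enclose + eps)
    return (1.0 - giou).mean()
```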


Sensors ◽  
2019 ◽  
Vol 19 (23) ◽  
pp. 5284
Author(s):  
Heng Zhang ◽  
Jiayu Wu ◽  
Yanli Liu ◽  
Jia Yu

In recent years, research on optical remote sensing images has received increasing attention. Object detection, one of the most challenging tasks in remote sensing, has been remarkably advanced by convolutional neural network (CNN)-based methods such as You Only Look Once (YOLO) and Faster R-CNN. However, due to complex backgrounds and the distinctive distribution of objects, directly applying these general-purpose detectors to remote sensing usually yields poor performance. To tackle this problem, a highly efficient and robust framework based on YOLO is proposed. We devise and integrate a VaryBlock into the architecture, which effectively offsets some of the information loss caused by downsampling. In addition, several techniques are used to boost performance and avoid overfitting. Experimental results show that our proposed method improves the mean average precision by a large margin on the NWPU VHR-10 dataset.
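
The abstract does not describe VaryBlock's internals, so no faithful reconstruction is possible; as a hedged stand-in, the sketch below shows one well-known way to downsample without discarding information (space-to-depth rearrangement followed by a 1x1 projection), which addresses the same failure mode the paper names. This is explicitly not the paper's module:

```python
import torch
import torch.nn as nn

# NOT the paper's VaryBlock: a common lossless-downsampling pattern used
# here only to illustrate "offsetting information loss from downsampling".
class LosslessDownsample(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.unshuffle = nn.PixelUnshuffle(2)          # HxW -> H/2 x W/2, channels x4
        self.project = nn.Conv2d(in_ch * 4, out_ch, kernel_size=1)

    def forward(self, x):
        # all spatial information is rearranged into channels, then mixed
        return self.project(self.unshuffle(x))
```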


2020 ◽  
Vol 9 (6) ◽  
pp. 370
Author(s):  
Atakan Körez ◽  
Necaattin Barışçı ◽  
Aydın Çetin ◽  
Uçman Ergün

The detection of objects in very-high-resolution (VHR) remote sensing images has become increasingly popular with the advancement of remote sensing technologies. High-resolution images from aircraft or satellites contain highly detailed and cluttered backgrounds that reduce object detection success. In this study, a model that performs weighted ensemble object detection using optimized coefficients is proposed. The model takes the outputs of two or more object detectors trained on the same dataset (three in our experiments) and combines them in an ensemble weighted by optimized coefficients. The Northwestern Polytechnical University Very High Resolution 10 (NWPU-VHR10) and Remote Sensing Object Detection (RSOD) datasets were used to measure the detection performance of the proposed model. Our experiments show that the proposed model improves Mean Average Precision (mAP) by 0.78%–16.5% over stand-alone models and outperforms other state-of-the-art methods (by 3.55% on the NWPU-VHR-10 dataset and 1.49% on the RSOD dataset).
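
The abstract does not detail the fusion rule or how the coefficients are optimized, so the following is only an illustrative NumPy sketch of a coefficient-weighted ensemble: each model's confidence scores are scaled by its optimized weight, and the pooled detections are merged with standard NMS. Names and the NMS-based merging are assumptions:

```python
import numpy as np

def weighted_ensemble(detections, weights, iou_thr=0.5):
    """detections: list (one per model) of arrays with rows
    (x1, y1, x2, y2, score); weights: per-model coefficients.
    Returns the merged detections after weighting and NMS."""
    pooled = np.vstack([
        np.column_stack([d[:, :4], d[:, 4] * w])   # rescale scores by model weight
        for d, w in zip(detections, weights)
    ])
    order = pooled[:, 4].argsort()[::-1]           # highest weighted score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # IoU of the top box against the remaining candidates
        xx1 = np.maximum(pooled[i, 0], pooled[rest, 0])
        yy1 = np.maximum(pooled[i, 1], pooled[rest, 1])
        xx2 = np.minimum(pooled[i, 2], pooled[rest, 2])
        yy2 = np.minimum(pooled[i, 3], pooled[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (pooled[i, 2] - pooled[i, 0]) * (pooled[i, 3] - pooled[i, 1])
        area_r = (pooled[rest, 2] - pooled[rest, 0]) * (pooled[rest, 3] - pooled[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-7)
        order = rest[iou < iou_thr]                # drop heavily overlapping boxes
    return pooled[keep]
```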

