Dense Semantic Labeling with Atrous Spatial Pyramid Pooling and Decoder for High-Resolution Remote Sensing Imagery

Dense semantic labeling is an important task in high-resolution remote sensing imagery research and has been widely used in land-use analysis and environmental protection. With the recent success of fully convolutional networks (FCN), various network architectures have greatly improved performance. Among them, atrous spatial pyramid pooling (ASPP) and the encoder-decoder are two successful ones. The former extracts multi-scale contextual information through multiple effective fields of view, while the latter recovers spatial information to obtain sharper object boundaries. In this study, we propose a more efficient fully convolutional network that combines the advantages of both structures. Our model uses the deep residual network (ResNet) followed by ASPP as the encoder, and a decoder that combines two scales of high-level features with the corresponding low-level features at the upsampling stage. We further develop a multi-scale loss function to enhance the learning procedure. In postprocessing, a novel superpixel-based dense conditional random field is employed to refine the predictions. We evaluate the proposed method on the Potsdam and Vaihingen datasets, and the experimental results demonstrate that it performs better than other machine learning and deep learning methods. Compared with the state-of-the-art DeepLab_v3+, our model gains 0.4% and 0.6% in overall accuracy on these two datasets, respectively.
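The multi-scale behaviour of ASPP can be illustrated with a minimal 1-D sketch. It is not the authors' network: the function names, the toy signal, and the rates (1, 2, 4) are assumptions chosen for illustration. It only shows how one kernel applied at several atrous rates covers progressively wider fields of view. A real ASPP module would also pad each branch to a common size and fuse the branches.

```python
def effective_kernel_size(kernel_size, rate):
    """Span covered by a kernel whose taps are spaced `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1-D atrous (dilated) convolution, written as cross-correlation."""
    span = effective_kernel_size(len(kernel), rate)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel tap i reads the sample i * rate positions after `start`.
        out.append(sum(w * signal[start + i * rate] for i, w in enumerate(kernel)))
    return out


def aspp_1d(signal, kernel, rates):
    """Apply the same kernel at several rates; one output branch per rate."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


# Hypothetical toy input: a ramp of 13 samples and a 3-tap summing kernel.
signal = list(range(13))
branches = aspp_1d(signal, kernel=[1, 1, 1], rates=(1, 2, 4))
for rate, out in branches.items():
    print(f"rate={rate} field_of_view={effective_kernel_size(3, rate)} outputs={len(out)}")
```

With a 3-tap kernel, rates 1, 2 and 4 cover 3, 5 and 9 input samples respectively, which is the sense in which parallel atrous branches see several scales of context at once.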


Automatic Building Extraction on High-Resolution Remote Sensing Imagery Using Deep Convolutional Encoder-Decoder With Spatial Pyramid Pooling

IEEE Access, 2019, Vol 7, pp. 128774-128786. DOI: 10.1109/access.2019.2940527. Cited by ~5.

Author(s): Yaohui Liu, Lutz Gross, Zhiqiang Li, Xiaoli Li, Xiwei Fan, ...

Keyword(s): Remote Sensing, High Resolution, Building Extraction, Remote Sensing Imagery, Spatial Pyramid Pooling, Convolutional Encoder, Spatial Pyramid


A Novel Effectively Optimized One-Stage Network for Object Detection in Remote Sensing Imagery

Remote Sensing, 2019, Vol 11 (11), pp. 1376. DOI: 10.3390/rs11111376. Cited by ~4.

Author(s): Weiying Xie, Haonan Qin, Yunsong Li, Zhuo Wang, Jie Lei

Keyword(s): Remote Sensing, Spatial Information, Feature Fusion, Field Enhancement, Feature Representation, Detection Accuracy, Convolutional Network, Remote Sensing Imagery, Multi Scale, One Stage

Detecting small and densely arranged objects in wide-scale remote sensing imagery is of great significance in military and civilian applications, yet it remains challenging. To address this problem, we propose a novel effectively optimized one-stage network (NEOON). As a fully convolutional network, NEOON consists of four parts: feature extraction, feature fusion, feature enhancement, and multi-scale detection. To extract effective features, the first part implements coherent bottom-up and top-down processing through successive down-sampling and up-sampling operations in conjunction with residual modules. The second part consolidates high-level and low-level features through concatenation followed by convolution, explicitly yielding strong feature representation and semantic information. The third part constructs a receptive field enhancement (RFE) module and incorporates it into the front part of the network, where the information of small objects resides. The final part uses four detectors with different sensitivities that access the fused features in parallel, enabling the network to make full use of information about objects at different scales. In addition, Focal Loss is applied to the cross entropy for classification to address the class imbalance problem in one-stage methods, and Soft-NMS is introduced in the post-processing stage to preserve accurate bounding boxes, especially for densely arranged objects. The split-and-merge strategy and a multi-scale training strategy are employed in training. Thorough experiments are performed on the ACS dataset, which we constructed, and the NWPU VHR-10 dataset to evaluate the performance of NEOON. Specifically, improvements of 4.77% in mAP and 5.50% in recall over YOLOv3 on the ACS dataset show that NEOON can effectively improve the detection accuracy of small objects in remote sensing imagery. In addition, extensive experiments and comprehensive evaluations on the 10-class NWPU VHR-10 dataset illustrate the superiority of NEOON in extracting spatial information from high-resolution remote sensing images.
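The two generic components named above, Focal Loss and Soft-NMS, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not NEOON's implementation: the abstract gives no hyperparameters, so gamma=2.0, alpha=0.25, sigma=0.5, the score threshold, the Gaussian form of the decay, and the example boxes are all chosen here for illustration.

```python
import math


def focal_loss(p, target, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction; p is the predicted foreground probability."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    p_t = max(p_t, eps)  # guard against log(0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of deleting them."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in remaining:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:  # drop boxes whose score has decayed away
                decayed.append((box, score))
        remaining = decayed
    return kept


# Hypothetical detections: two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
kept = soft_nms(boxes, [0.9, 0.8, 0.7])
for box, score in kept:
    print(box, round(score, 3))
print(focal_loss(0.9, 1) < focal_loss(0.1, 1))  # easy positives cost less than hard ones
```

In this example the overlapping second box is retained with a reduced score rather than discarded, which is why Soft-NMS tends to suit densely arranged objects better than hard suppression.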


Detection of excavated areas in high-resolution remote sensing imagery using combined hierarchical spatial pyramid pooling and VGGNet

Remote Sensing Letters, 2021, Vol 12 (12), pp. 1269-1280. DOI: 10.1080/2150704x.2021.1980240.

Author(s): Yungang Cao, Wei Zhang, Xueqin Bai, Kai Chen

Keyword(s): Remote Sensing, High Resolution, Remote Sensing Imagery, Spatial Pyramid Pooling, Spatial Pyramid


Greenhouse extraction with high-resolution remote sensing imagery using fused fully convolutional network and object-oriented image analysis

Journal of Applied Remote Sensing, 2021, Vol 15 (04). DOI: 10.1117/1.jrs.15.046502.

Author(s): Hairong Ma, Tianjing Feng, Xiangcheng Shen, Zhiqing Luo, Pingting Chen, ...

Keyword(s): Remote Sensing, Image Analysis, High Resolution, Object Oriented, Convolutional Network, Fully Convolutional Network, Remote Sensing Imagery


Raft cultivation area extraction from high resolution remote sensing imagery by fusing multi-scale region-line primitive association features

ISPRS Journal of Photogrammetry and Remote Sensing, 2017, Vol 123, pp. 104-113. DOI: 10.1016/j.isprsjprs.2016.10.008. Cited by ~23.

Author(s): Min Wang, Qi Cui, Jie Wang, Dongping Ming, Guonian Lv

Keyword(s): Remote Sensing, High Resolution, Remote Sensing Imagery, Multi Scale, Region Line


Adaptive conditional random field classification framework based on spatial homogeneity for high-resolution remote sensing imagery

Remote Sensing Letters, 2020, Vol 11 (6), pp. 515-524. DOI: 10.1080/2150704x.2020.1731768. Cited by ~2.

Author(s): Yanfei Zhong, Jing Wang, Ji Zhao

Keyword(s): Remote Sensing, High Resolution, Random Field, Conditional Random Field, Spatial Homogeneity, Remote Sensing Imagery, Classification Framework


Geospatial Object Detection on High Resolution Remote Sensing Imagery Based on Double Multi-Scale Feature Pyramid Network

Remote Sensing, 2019, Vol 11 (7), pp. 755. DOI: 10.3390/rs11070755. Cited by ~20.

Author(s): Xiaodong Zhang, Kun Zhu, Guanzhou Chen, Xiaoliang Tan, Lifei Zhang, ...

Keyword(s): Remote Sensing, High Resolution, Object Detection, Large Scale, Training Data, Validation Dataset, Remote Sensing Imagery, Scale Feature, Multi Scale, Feature Pyramid

Object detection on very-high-resolution (VHR) remote sensing imagery has attracted a lot of attention in the field of automatic image interpretation. Region-based convolutional neural networks (CNNs), which first generate candidate regions and then classify and locate the objects within them, have been widely adopted in this domain. However, overly large images, complex image backgrounds, and the uneven size and quantity distribution of training samples make detection more challenging, especially for small and dense objects. To solve these problems, this paper proposes an effective region-based object detection framework for VHR remote sensing imagery, named Double Multi-scale Feature Pyramid Network (DM-FPN), which exploits inherent multi-scale pyramidal features and combines strong-semantic, low-resolution features with weak-semantic, high-resolution features. DM-FPN consists of a multi-scale region proposal network and a multi-scale object detection network; the two modules share convolutional layers and can be trained end-to-end. We propose several multi-scale training strategies to increase the diversity of the training data and overcome the size restrictions on input images. We also propose multi-scale inference and adaptive categorical non-maximum suppression (ACNMS) strategies to improve detection performance, especially for small and dense objects. Extensive experiments and comprehensive evaluations on the large-scale DOTA dataset demonstrate the effectiveness of the proposed framework, which achieves a mean average precision (mAP) of 0.7927 on the validation dataset and a best mAP of 0.793 on the testing dataset.
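The abstract does not spell out the ACNMS rule, so the sketch below shows only the general idea of categorical NMS: suppression is run separately for each class, here with a per-class IoU threshold. The per-class thresholds, the function names, and the example detections are assumptions for illustration and should not be read as the paper's adaptive rule.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold):
    """Greedy NMS over (box, score) pairs; returns the kept pairs, best first."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


def categorical_nms(detections, thresholds, default_threshold=0.5):
    """Run NMS independently per class; detections are (box, score, label) triples."""
    by_class = defaultdict(list)
    for box, score, label in detections:
        by_class[label].append((box, score))
    result = []
    for label, dets in by_class.items():
        threshold = thresholds.get(label, default_threshold)  # hypothetical per-class value
        result.extend((box, score, label) for box, score in nms(dets, threshold))
    return result


# Hypothetical detections: the same pair of overlapping boxes in two classes.
detections = [
    ((0, 0, 10, 10), 0.9, "plane"),
    ((1, 1, 11, 11), 0.8, "plane"),
    ((1, 1, 11, 11), 0.7, "ship"),
    ((0, 0, 10, 10), 0.6, "ship"),
]
result = categorical_nms(detections, thresholds={"plane": 0.5, "ship": 0.7})
for det in result:
    print(det)
```

The two boxes overlap with an IoU of about 0.68, so the stricter "plane" threshold suppresses the weaker box while the looser "ship" threshold keeps both. A looser threshold of this kind is one way to avoid deleting true positives among densely packed objects.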
