Semantic Multigranularity Feature Learning for High-Resolution Remote Sensing Image Scene Classification

2021 ◽  
Vol 11 (19) ◽  
pp. 9204
Author(s):  
Xinyi Ma ◽  
Zhifeng Xiao ◽  
Hong-sik Yun ◽  
Seung-Jun Lee

High-resolution remote sensing image scene classification is a challenging visual task due to the large intra-class variance and small inter-class variance between categories. To accurately recognize the scene categories, it is essential to learn discriminative features from both global and local critical regions. Recent efforts encourage the network to learn multigranularity features by destroying the spatial structure of the input image at different scales, which introduces meaningless edges that are harmful to training. In this study, we propose a novel method named Semantic Multigranularity Feature Learning Network (SMGFL-Net) for remote sensing image scene classification. The core idea is to learn both global and multigranularity local features from rearranged intermediate feature maps, thereby eliminating the meaningless edges. These features are then fused for the final prediction. The proposed framework is compared with a collection of state-of-the-art (SOTA) methods on two fine-grained remote sensing image scene datasets: NWPU-RESISC45 and the Aerial Image Dataset (AID). We justify several design choices, including the branch granularities, fusion strategies, pooling operations, and the necessity of feature map rearrangement, through a comparative study. The overall results show that SMGFL-Net consistently outperforms peer methods in classification accuracy, and its superiority is more apparent with less training data, demonstrating the efficacy of its feature learning.
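
A minimal sketch of the core idea, assuming PyTorch: patches of an intermediate (B, C, H, W) feature map are permuted rather than patches of the input image, so no artificial pixel-domain edges are created. The function name and grid size are illustrative, not the authors' exact design.

```python
import torch

def rearrange_feature_map(fmap: torch.Tensor, n: int) -> torch.Tensor:
    """Randomly permute a (B, C, H, W) feature map as an n x n grid of
    patches; H and W are assumed divisible by n."""
    b, c, h, w = fmap.shape
    ph, pw = h // n, w // n
    # split into an n x n grid of (ph, pw) patches
    patches = fmap.reshape(b, c, n, ph, n, pw).permute(0, 1, 2, 4, 3, 5)
    patches = patches.reshape(b, c, n * n, ph, pw)
    perm = torch.randperm(n * n, device=fmap.device)
    patches = patches[:, :, perm]          # shuffle patch positions
    # reassemble the grid into a feature map of the original shape
    patches = patches.reshape(b, c, n, n, ph, pw).permute(0, 1, 2, 4, 3, 5)
    return patches.reshape(b, c, h, w)
```

A local branch at granularity n would then pool features from the rearranged map, while a global branch pools the unmodified map; the pooled vectors are fused (e.g., concatenated) for the final prediction.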

2019 ◽  
Vol 11 (5) ◽  
pp. 494 ◽  
Author(s):  
Wei Zhang ◽  
Ping Tang ◽  
Lijun Zhao

Remote sensing image scene classification is one of the most challenging problems in understanding high-resolution remote sensing images. Deep learning techniques, especially the convolutional neural network (CNN), have improved the performance of remote sensing image scene classification thanks to their powerful feature learning and reasoning capabilities. However, several fully connected layers are usually appended to the end of CNN models; these are inefficient at capturing the hierarchical structure of the entities in an image and do not fully exploit the spatial information that is important to classification. Fortunately, the capsule network (CapsNet), a novel architecture that replaces individual neurons with groups of neurons (capsules, or vectors) and can encode the properties and spatial information of features in an image to achieve equivariance, has become an active research direction in classification over the past two years. Motivated by this idea, this paper proposes an effective remote sensing image scene classification architecture named CNN-CapsNet that makes full use of the merits of both models. First, a CNN without fully connected layers is used as an initial feature-map extractor; in detail, a deep CNN model pretrained on the ImageNet dataset is selected as the feature extractor. Then, the initial feature maps are fed into a newly designed CapsNet to obtain the final classification result. The proposed architecture is extensively evaluated on three challenging public benchmark remote sensing image datasets: the UC Merced Land-Use dataset with 21 scene categories, the AID dataset with 30 scene categories, and the NWPU-RESISC45 dataset with 45 scene categories. The experimental results demonstrate that the proposed method achieves competitive classification performance compared with state-of-the-art methods.
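
A minimal sketch of the pipeline, assuming PyTorch and torchvision; the backbone choice (VGG-16), capsule dimensions, and the linear read-out standing in for dynamic routing are all illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

def squash(s: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Capsule squashing nonlinearity: preserves vector orientation
    while mapping vector length into [0, 1)."""
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + 1e-8)

class CNNCapsNet(nn.Module):
    def __init__(self, num_classes: int, caps_dim: int = 8):
        super().__init__()
        # pretrained CNN with its fully connected layers removed
        self.extractor = models.vgg16(weights="IMAGENET1K_V1").features
        self.primary = nn.Conv2d(512, 32 * caps_dim, kernel_size=3, stride=2)
        self.caps_dim = caps_dim
        # dynamic routing (as in the paper) is replaced by a simple
        # linear read-out here for brevity
        self.readout = nn.LazyLinear(num_classes)

    def forward(self, x):
        f = self.extractor(x)                 # initial feature maps
        u = self.primary(f)                   # primary capsules
        u = u.view(u.size(0), -1, self.caps_dim)
        u = squash(u)                         # (B, n_caps, caps_dim)
        return self.readout(u.flatten(1))
```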


2021 ◽  
Vol 13 (16) ◽  
pp. 3113
Author(s):  
Ming Li ◽  
Lin Lei ◽  
Yuqi Tang ◽  
Yuli Sun ◽  
Gangyao Kuang

Remote sensing image scene classification (RSISC) has broad application prospects, but related challenges still exist and urgently need to be addressed. One of the most important is how to learn a strongly discriminative scene representation. Recently, convolutional neural networks (CNNs) have shown great potential in RSISC due to their powerful feature learning ability; however, their performance may be restricted by the complexity of remote sensing images, such as spatial layout, varying scales, complex backgrounds, and category diversity. In this paper, we propose an attention-guided multilayer feature aggregation network (AGMFA-Net) that improves scene classification performance by effectively aggregating features from different layers. Specifically, to reduce the discrepancies between layers, we employ channel–spatial attention on multiple high-level convolutional feature maps to more accurately capture the semantic regions that correspond to the content of the given scene. We then use the learned semantic regions as guidance to aggregate the valuable information from multilayer convolutional features, yielding stronger scene features for classification. Experimental results on three remote sensing scene datasets indicate that our approach achieves competitive classification performance compared with the baselines and other state-of-the-art methods.
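
A CBAM-style block is one plausible realization of the channel–spatial attention described above; this sketch assumes PyTorch, and the reduction ratio and kernel size are assumptions rather than the paper's exact module.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # channel attention: squeeze spatial dims, re-weight channels
        avg = self.channel_mlp(x.mean(dim=(2, 3)))
        mx = self.channel_mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # spatial attention: squeeze channels, highlight semantic regions
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))
```

The resulting spatial attention map can double as the guidance mask for aggregating lower-layer features.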


2019 ◽  
Vol 11 (20) ◽  
pp. 2380 ◽  
Author(s):  
Liu ◽  
Luo ◽  
Huang ◽  
Hu ◽  
Sun ◽  
...  

Deep convolutional neural networks have driven significant progress in building extraction from high-resolution remote sensing imagery. While most such work focuses on modifying existing image segmentation networks from computer vision, in this paper we propose a new network, the Deep Encoding Network (DE-Net), designed specifically for this problem and built on recently introduced image segmentation techniques. Four modules are used to construct DE-Net: inception-style downsampling modules combining a strided convolution layer and a max-pooling layer; encoding modules comprising six linear residual blocks with a scaled exponential linear unit (SELU) activation function; compressing modules that reduce the number of feature channels; and a densely upsampling module that enables the network to encode spatial information inside feature maps. DE-Net achieves state-of-the-art performance on the WHU Building Dataset in recall, F1-score, and intersection over union (IoU) without pretraining. It also outperformed several segmentation networks on our self-built Suzhou Satellite Building Dataset. The experimental results validate the effectiveness of DE-Net for building extraction from both aerial and satellite imagery, and suggest that, given enough training data, designing and training a network from scratch may outperform fine-tuning models pretrained on datasets unrelated to building extraction.
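
A hedged sketch of the inception-style downsampling module, assuming PyTorch: a strided convolution branch and a max-pooling branch are concatenated so that both learned and pooled evidence survive the resolution drop. Channel arithmetic here is an assumption, not DE-Net's published configuration.

```python
import torch
import torch.nn as nn

class DownsamplingModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # the strided conv supplies the channels that pooling does not
        self.conv = nn.Conv2d(in_ch, out_ch - in_ch, kernel_size=3,
                              stride=2, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.act = nn.SELU(inplace=True)   # SELU, as in the encoder blocks

    def forward(self, x):
        # both branches halve H and W; concatenation yields out_ch channels
        return self.act(torch.cat([self.conv(x), self.pool(x)], dim=1))
```

The densely upsampling module could be sketched analogously with nn.PixelShuffle, which rearranges channel information into spatial resolution.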


2020 ◽  
Vol 12 (11) ◽  
pp. 1887 ◽  
Author(s):  
Xiaolei Zhao ◽  
Jing Zhang ◽  
Jimiao Tian ◽  
Li Zhuo ◽  
Jie Zhang

Scene classification of remote sensing images has been widely used in various fields as an important task in understanding the content of a remote sensing image. In particular, a high-resolution remote sensing scene contains rich information and complex content. Because the scene content in a remote sensing image is tightly coupled with spatial relationships, the design of a feature extraction network that fully mines the spatial information in a high-resolution remote sensing image directly determines the quality of the classification. In recent years, convolutional neural networks (CNNs) have achieved excellent performance in remote sensing image classification; in particular, the residual dense network (RDN), one of the representative CNN architectures, shows strong feature learning ability because it fully utilizes the information of all convolutional layers. We therefore design an RDN based on channel-spatial attention for scene classification of high-resolution remote sensing images. First, multi-layer convolutional features are fused with residual dense blocks. Then, a channel-spatial attention module is added to obtain a more effective feature representation. Finally, a softmax classifier is applied to classify the scene, after adopting a data augmentation strategy to meet the training requirements of the network parameters. Five experiments are conducted on the UC Merced Land-Use Dataset (UCM) and the Aerial Image Dataset (AID), and the competitive results demonstrate that our method extracts more effective features and is better suited to scene classification.
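
A minimal residual dense block (RDB) sketch, assuming PyTorch; the growth rate and depth are illustrative. Each layer receives the concatenation of all preceding layers' outputs, a 1x1 convolution fuses them, and a residual connection adds the block input back.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    def __init__(self, channels: int, growth: int = 32, layers: int = 4):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = channels
        for _ in range(layers):
            self.convs.append(nn.Sequential(
                nn.Conv2d(ch, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)))
            ch += growth                      # dense connectivity widens input
        # 1x1 local feature fusion back to the input width
        self.fuse = nn.Conv2d(ch, channels, kernel_size=1)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))
        # local residual learning over the fused dense features
        return x + self.fuse(torch.cat(feats, dim=1))
```

In the architecture described above, the channel-spatial attention module (see the earlier sketch) would follow blocks like this one before pooling and the softmax classifier.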


2019 ◽  
Vol 11 (21) ◽  
pp. 2504 ◽  
Author(s):  
Jun Zhang ◽  
Min Zhang ◽  
Lukui Shi ◽  
Wenjie Yan ◽  
Bin Pan

Scene classification is one of the bases for automatic remote sensing image interpretation. Recently, deep convolutional neural networks have shown promising performance in high-resolution remote sensing scene classification research. In general, most researchers directly use the raw deep features extracted from the convolutional networks to classify scenes. However, this strategy considers only single-scale features, which cannot describe both the local and the global characteristics of images. In fact, the dissimilarity of scene targets within the same category may cause convolutional features to fail to group them into the same category, while the similarity of global features across different categories may likewise cause fully connected layer features to fail to distinguish them. To address these issues, we propose a scene classification method based on multi-scale deep feature representation (MDFR), with two main contributions: (1) region-based feature selection and representation; and (2) multi-scale feature fusion. The proposed method first filters the multi-scale deep features extracted from pre-trained convolutional networks, and these features are then fused via two efficient fusion methods. By effectively exploiting features of different scales and discarding redundant information, our method utilizes the complementarity between local and global features. Experimental results on three benchmark high-resolution remote sensing image datasets indicate that the proposed method is comparable to some state-of-the-art algorithms.
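
An illustrative sketch of multi-scale deep feature extraction and fusion, assuming PyTorch and torchvision; the ResNet-50 backbone, the resize scales, and concatenation as the fusion method are assumptions standing in for MDFR's specific choices.

```python
import torch
import torch.nn.functional as F
from torchvision import models

backbone = models.resnet50(weights="IMAGENET1K_V1")
extractor = torch.nn.Sequential(*list(backbone.children())[:-1])  # drop FC
extractor.eval()

@torch.no_grad()
def multiscale_features(img: torch.Tensor, scales=(224, 320, 448)):
    """Pool global-average features at several input scales and fuse
    them by concatenation; img is a (B, 3, H, W) normalized batch."""
    feats = []
    for s in scales:
        x = F.interpolate(img, size=(s, s), mode="bilinear",
                          align_corners=False)
        feats.append(extractor(x).flatten(1))   # (B, 2048) per scale
    return torch.cat(feats, dim=1)              # (B, 2048 * len(scales))
```

A feature-selection step (e.g., dropping low-variance dimensions) would sit between extraction and fusion to discard the redundant information mentioned above.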


Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3232 ◽  
Author(s):  
Yan Liu ◽  
Qirui Ren ◽  
Jiahui Geng ◽  
Meng Ding ◽  
Jiangyun Li

Efficient and accurate semantic segmentation is the key technique for automatic remote sensing image analysis. While many segmentation methods based on traditional hand-crafted feature extractors exist, it remains challenging to process high-resolution, large-scale remote sensing images. In this work, a novel patch-wise semantic segmentation method with a new training strategy based on fully convolutional networks is presented to segment common land resources. First, to handle the high-resolution imagery, the images are split into local patches and a patch-wise network is built. Second, the training data are preprocessed in several ways to account for specific characteristics of remote sensing images, i.e., color imbalance, object rotation variations, and lens distortion. Third, a multi-scale training strategy is developed to address the severe scale variation problem. In addition, the impact of a conditional random field (CRF) post-processing step on precision is studied. The proposed method was evaluated on a dataset collected over a capital city in western China with the Gaofen-2 satellite, containing ten common land-resource classes (grassland, road, etc.). The experimental results show that the proposed algorithm achieves 54.96% mean intersection over union (MIoU) and outperforms other state-of-the-art methods in remote sensing image segmentation.
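
A sketch of patch-wise inference on a large image, assuming NumPy; `predict_patch` is a hypothetical stand-in for the trained patch-wise FCN, and the patch and stride sizes are illustrative.

```python
import numpy as np

def segment_large_image(image: np.ndarray, predict_patch,
                        patch: int = 512, stride: int = 384) -> np.ndarray:
    """Slide a window over an (H, W, C) image, run the network on each
    patch, and average overlapping class scores before the argmax.
    Assumes H, W >= patch and borders aligned with the stride."""
    h, w, _ = image.shape
    num_classes = predict_patch(image[:patch, :patch]).shape[-1]
    scores = np.zeros((h, w, num_classes), dtype=np.float32)
    counts = np.zeros((h, w, 1), dtype=np.float32)
    for y in range(0, max(h - patch, 0) + 1, stride):
        for x in range(0, max(w - patch, 0) + 1, stride):
            tile = image[y:y + patch, x:x + patch]
            scores[y:y + patch, x:x + patch] += predict_patch(tile)
            counts[y:y + patch, x:x + patch] += 1.0
    return np.argmax(scores / np.maximum(counts, 1.0), axis=-1)
```

Overlapping strides (stride < patch) smooth the seams between patches; a CRF would then refine the resulting label map.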


Author(s):  
Y. Yang ◽  
H. T. Li ◽  
Y. S. Han ◽  
H. Y. Gu

Image segmentation is the foundation of further object-oriented image analysis, understanding, and recognition, and is one of the key technologies in high-resolution remote sensing applications. In this paper, a new fast image segmentation algorithm for high-resolution remote sensing imagery is proposed, based on graph theory and the fractal net evolution approach (FNEA). First, an image is modelled as a weighted undirected graph, where nodes correspond to pixels and edges connect adjacent pixels. An initial object layer is obtained efficiently from graph-based segmentation, which runs in time nearly linear in the number of image pixels. FNEA then starts from this initial object layer and performs pairwise merges of neighbouring objects so as to minimize the resulting summed heterogeneity. Furthermore, according to the characteristics of different features in high-resolution remote sensing images, three merging criteria for image objects, based on spectral and spatial information, are adopted. Finally, in a comparison with the commercial remote sensing software eCognition, the experimental results demonstrate that the efficiency of the algorithm is significantly improved while the result maintains good feature boundaries.
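
A sketch of the spectral part of an FNEA-style merging cost, assuming NumPy; each image object is summarized by its pixel count and per-band standard deviations, and the band weights are user-chosen. This is the standard FNEA formulation, not necessarily the paper's exact criterion.

```python
import numpy as np

def spectral_merge_cost(n1: int, std1, n2: int, std2, std_merged,
                        band_weights=None) -> float:
    """Increase in area-weighted spectral heterogeneity if two
    neighbouring objects merge. FNEA greedily merges the pair with
    the smallest increase, stopping once the cost exceeds a
    user-set scale parameter."""
    std1, std2, std_merged = map(np.asarray, (std1, std2, std_merged))
    w = np.ones_like(std1) if band_weights is None else np.asarray(band_weights)
    before = n1 * std1 + n2 * std2           # heterogeneity of the parts
    after = (n1 + n2) * std_merged           # heterogeneity of the union
    return float(np.sum(w * (after - before)))
```

Shape-based criteria (compactness, smoothness) enter the full cost as a weighted combination with this spectral term.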


2019 ◽  
Vol 11 (5) ◽  
pp. 518 ◽  
Author(s):  
Bao-Di Liu ◽  
Jie Meng ◽  
Wen-Yang Xie ◽  
Shuai Shao ◽  
Ye Li ◽  
...  

At present, nonparametric subspace classifiers, such as collaborative representation-based classification (CRC) and sparse representation-based classification (SRC), are widely used in many pattern classification and recognition tasks. Meanwhile, the spatial pyramid matching (SPM) scheme, which incorporates spatial information into the image representation, is efficient for image classification. However, in SPM the weights used to evaluate the representation of different subregions are fixed. In this paper, we first introduce the spatial pyramid matching scheme to remote sensing (RS) image scene classification to improve performance. We then propose a weighted spatial pyramid matching collaborative-representation-based classification method, combining the CRC method with a weighted spatial pyramid matching scheme that is capable of learning the weights of the different subregions in representing an image. Finally, extensive experiments conducted on several benchmark remote sensing image datasets clearly demonstrate the superior performance of the proposed algorithm compared with state-of-the-art approaches.
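
A minimal CRC sketch with per-subregion weighting, assuming NumPy; the closed-form ridge solution is standard CRC, while the fixed `weights` below stand in for the weight-learning step of the paper.

```python
import numpy as np

def crc_residuals(D: np.ndarray, y: np.ndarray, labels: np.ndarray,
                  lam: float = 0.01) -> np.ndarray:
    """D: (d, n) dictionary of training features (one column per sample),
    y: (d,) query feature. Returns the class-wise regularized residuals
    that CRC uses for the decision."""
    # collaborative code over all training samples jointly (ridge regression)
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    classes = np.unique(labels)
    res = np.empty(len(classes))
    for i, c in enumerate(classes):
        idx = labels == c
        a_c = alpha[idx]
        res[i] = np.linalg.norm(y - D[:, idx] @ a_c) / (np.linalg.norm(a_c) + 1e-12)
    return res

def classify_spm(region_dicts, region_queries, labels, weights):
    """Weighted fusion over spatial-pyramid subregions: the predicted
    class minimizes the weighted sum of per-region CRC residuals."""
    total = sum(w * crc_residuals(D, y, labels)
                for D, y, w in zip(region_dicts, region_queries, weights))
    return np.unique(labels)[np.argmin(total)]
```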

