Ensemble model with cascade attention mechanism for high-resolution remote sensing image scene classification

2020 ◽  
Vol 28 (15) ◽  
pp. 22358
Author(s):  
Fengpeng Li ◽  
Ruyi Feng ◽  
Wei Han ◽  
Lizhe Wang
2021 ◽  
Vol 13 (10) ◽  
pp. 1950
Author(s):  
Cuiping Shi ◽  
Xin Zhao ◽  
Liguo Wang

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve classification performance, many studies have increased the depth and width of convolutional neural networks (CNNs) to extract deeper features, which increases model complexity. To address this problem, we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. First, we propose two convolution combination modules for feature extraction, through which the deep features of images can be fully extracted by multiple cooperating convolutions. Then, the feature weights are calculated, and the extracted deep features are sent to the attention mechanism for further feature extraction. Next, all of the extracted features are fused by multiple branches. Finally, depthwise separable convolution and asymmetric convolution are used to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method retains a clear advantage in classification accuracy while using very few parameters.
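To make the parameter savings from depthwise separable and asymmetric convolution concrete, the following minimal sketch counts the weights of a single layer under each scheme. The function names and layer sizes are hypothetical and chosen only for illustration; they are not taken from AMB-CNN, and bias terms are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a kernel x kernel convolution (bias terms ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """One kernel x kernel filter per input channel, then a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def asymmetric_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """A kernel x 1 convolution followed by a 1 x kernel convolution.

    Assumption: the intermediate feature map has c_out channels.
    """
    return kernel * c_in * c_out + kernel * c_out * c_out


if __name__ == "__main__":
    # Illustrative layer sizes only, not values reported in the paper.
    k, channels_in, channels_out = 3, 64, 64
    print("standard:           ", standard_conv_params(k, channels_in, channels_out))
    print("depthwise separable:", depthwise_separable_params(k, channels_in, channels_out))
    print("asymmetric:         ", asymmetric_conv_params(k, channels_in, channels_out))
```

For a 3x3 layer with 64 input and 64 output channels, the depthwise separable form needs roughly an eighth of the weights of the standard convolution, and the asymmetric pair needs two thirds.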


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Jiayuan Kong ◽  
Yurong Gao ◽  
Yanjun Zhang ◽  
Huimin Lei ◽  
Yao Wang ◽  
...  

2019 ◽  
Vol 11 (20) ◽  
pp. 2349 ◽  
Author(s):  
Zhengyuan Zhang ◽  
Wenhui Diao ◽  
Wenkai Zhang ◽  
Menglong Yan ◽  
Xin Gao ◽  
...  

Significant progress has been made in remote sensing image captioning with encoder-decoder frameworks. The conventional attention mechanism is prevalent in this task but has a drawback: it uses only the visual information of the remote sensing images and does not use label information to guide the calculation of attention masks. To this end, a novel attention mechanism, the Label-Attention Mechanism (LAM), is proposed in this paper. LAM additionally utilizes the label information of high-resolution remote sensing images to generate natural sentences describing the given images. It is worth noting that, instead of high-level image features, the word embedding vectors of the predicted categories are adopted to guide the calculation of attention masks. Representing the content of images in the form of word embedding vectors can filter out redundant image features while preserving the pure and useful information needed to generate complete sentences. The experimental results on UCM-Captions, Sydney-Captions and RSICD demonstrate that LAM improves the model’s performance in describing high-resolution remote sensing images and obtains better S_m scores than other methods. The S_m score is a hybrid scoring method derived from the AI Challenge 2017 scoring method. In addition, the validity of LAM is verified by an experiment using true labels.
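The idea of letting a label word embedding guide the attention mask can be sketched as follows. This is not the LAM architecture from the paper, whose details the abstract does not give; it is a generic scaled dot-product attention in which the query is a label embedding. The names `label_guided_attention`, `regions`, and `label_embedding` are hypothetical, and the sketch assumes region features and the label embedding share one dimension.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_guided_attention(
    regions: List[Vector], label_embedding: Vector
) -> Tuple[Vector, Vector]:
    """Weight image regions by their similarity to a label word embedding.

    Assumption: every region feature has the same length as the label
    embedding, so a scaled dot product can serve as the score function.
    Returns the attention weights and the attended context vector.
    """
    dim = len(label_embedding)
    scores = [
        sum(r * l for r, l in zip(region, label_embedding)) / math.sqrt(dim)
        for region in regions
    ]
    weights = softmax(scores)
    context = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Toy region features and a toy label embedding (illustrative values).
    toy_regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    toy_label = [2.0, 0.0]
    attention, attended = label_guided_attention(toy_regions, toy_label)
    print(attention, attended)
```

In this toy setup the region most aligned with the label embedding receives the largest weight, which is the filtering effect the abstract attributes to using word embeddings in place of high-level image features.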


2021 ◽  
Vol 87 (8) ◽  
pp. 577-591
Author(s):  
Fengpeng Li ◽  
Jiabao Li ◽  
Wei Han ◽  
Ruyi Feng ◽  
Lizhe Wang

Inspired by the outstanding achievements of deep learning, supervised deep learning representation methods for high-spatial-resolution remote sensing image scene classification have obtained state-of-the-art performance. However, these methods need a considerable amount of labeled data to capture class-specific features, which limits their application when only a few labeled training samples are available. To address this issue, an unsupervised deep learning representation method for high-resolution remote sensing image scene classification is proposed in this work. The proposed method, called contrastive learning, narrows the distance between positive view pairs (color channels belonging to the same image) and widens the gaps between negative view pairs (color channels from different images) to obtain class-specific representations of the input data without any supervised information. The classifier uses the features extracted by the convolutional neural network (CNN)-based feature extractor, together with the labels of the training data, to set the space of each category and then, using linear regression, makes predictions in the testing procedure. Compared with existing unsupervised deep learning representation methods for high-resolution remote sensing image scene classification, the contrastive learning CNN achieves state-of-the-art performance on three benchmark data sets of different scales: the small-scale RSSCN7 data set, the midscale aerial image data set, and the large-scale NWPU-RESISC45 data set.
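The pull-together and push-apart behavior described above can be illustrated with a small contrastive loss. The abstract does not state which loss the authors use, so this sketch substitutes the widely used InfoNCE form as an assumption. The names `info_nce`, `views_a`, `views_b`, and `temperature` are hypothetical, and the embeddings are assumed to be non-zero vectors.

```python
import math
from typing import List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(
    views_a: List[Vector], views_b: List[Vector], temperature: float = 0.1
) -> float:
    """Mean InfoNCE loss over a batch of view pairs.

    views_a[i] and views_b[i] are embeddings of two views of image i
    (for example, two color channels); they form the positive pair.
    Every views_b[j] with j != i acts as a negative for views_a[i].
    """
    total = 0.0
    for i, anchor in enumerate(views_a):
        logits = [cosine(anchor, other) / temperature for other in views_b]
        peak = max(logits)
        # Log-sum-exp, shifted by the maximum for numerical stability.
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(views_a)


if __name__ == "__main__":
    first_views = [[1.0, 0.0], [0.0, 1.0]]
    matching_views = [[1.0, 0.0], [0.0, 1.0]]
    mismatched_views = [[0.0, 1.0], [1.0, 0.0]]
    print("matching:  ", info_nce(first_views, matching_views))
    print("mismatched:", info_nce(first_views, mismatched_views))
```

The loss is small when each view is closest to its own partner and large when it is closer to a view from a different image, so minimizing it narrows positive pairs and widens negative ones.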


2021 ◽  
Vol 13 (13) ◽  
pp. 2457
Author(s):  
Xuan Wu ◽  
Zhijie Zhang ◽  
Wanchang Zhang ◽  
Yaning Yi ◽  
Chuanrong Zhang ◽  
...  

Convolutional neural networks (CNNs) are capable of automatically extracting image features and have been widely used in remote sensing image classification. Feature extraction is an important and difficult problem in current research. In this paper, data augmentation was used to avoid overfitting and to enrich the features of samples, improving the performance of a newly proposed convolutional neural network on the UC-Merced and RSI-CB datasets for remotely sensed scene classification. A multiple grouped convolutional neural network (MGCNN) for self-learning that is capable of promoting the efficiency of CNNs was proposed, and a method of grouping multiple convolutional layers that can be applied elsewhere as a plug-in model was developed. Meanwhile, a hyper-parameter C is introduced in MGCNN to probe the influence of different grouping strategies on feature extraction. Experiments on the two selected datasets, RSI-CB and UC-Merced, were carried out to verify the effectiveness of the newly proposed network; the accuracy obtained by MGCNN was 2% higher than that of ResNet-50. An attention mechanism was then incorporated into the grouping process, yielding a multiple grouped attention convolutional neural network (MGCNN-A) intended to enhance the generalization capability of MGCNN. The additional experiments indicate that incorporating the attention mechanism into MGCNN only slightly improved the accuracy of scene classification but considerably enhanced the robustness of the network in remote sensing image classification.
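One reason grouping can improve efficiency is that a grouped convolution connects each output channel to only a subset of the input channels. The sketch below counts the weights of a generic grouped convolution layer. It does not reproduce MGCNN or its hyper-parameter C, whose definition the abstract does not give; `grouped_conv_params` and `groups` are hypothetical names, and bias terms are ignored.

```python
def grouped_conv_params(kernel: int, c_in: int, c_out: int, groups: int) -> int:
    """Weights in a kernel x kernel grouped convolution (bias terms ignored).

    Each group maps c_in / groups input channels to c_out / groups output
    channels, so the total is 1 / groups of the ungrouped layer.
    """
    if groups <= 0 or c_in % groups or c_out % groups:
        raise ValueError("groups must be positive and divide both c_in and c_out")
    per_group = kernel * kernel * (c_in // groups) * (c_out // groups)
    return per_group * groups


if __name__ == "__main__":
    # Illustrative layer sizes only, not values reported in the paper.
    for g in (1, 2, 4, 8):
        print(f"groups={g}: {grouped_conv_params(3, 64, 128, g)} weights")
```

With `groups=1` the count matches an ordinary convolution, and each doubling of the group count halves the number of weights.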

