Multi-modal Remote Sensing Image Description Based on Word Embedding and Self-Attention Mechanism

Encoder-decoder frameworks have driven significant progress in remote sensing image captioning. The conventional attention mechanism is prevalent in this task but has a drawback: it uses only the visual information of remote sensing images and does not use label information to guide the calculation of attention masks. To address this, a novel attention mechanism, the Label-Attention Mechanism (LAM), is proposed in this paper. LAM additionally uses the label information of high-resolution remote sensing images when generating natural sentences to describe the given images. Notably, instead of high-level image features, the word embedding vectors of the predicted categories are adopted to guide the calculation of attention masks. Representing the content of images as word embedding vectors can filter out redundant image features while preserving the information useful for generating complete sentences. Experimental results on UCM-Captions, Sydney-Captions and RSICD demonstrate that LAM improves the model’s performance in describing high-resolution remote sensing images and obtains better S_m scores than other methods. The S_m score is a hybrid scoring method derived from the AI Challenge 2017 scoring method. In addition, the validity of LAM is verified by an experiment using true labels.
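
The idea of letting a label's word embedding guide the attention mask can be illustrated with a minimal sketch. This is not the LAM architecture from the paper: it is plain scaled dot-product attention, and it assumes the region features and the label embedding already share one dimension (in practice a learned projection would be needed). The function name `label_guided_attention` and all values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_guided_attention(region_features, label_embedding):
    """Weight image regions by their similarity to a label embedding.

    region_features: list of N feature vectors, each of length D.
    label_embedding: word embedding of the predicted category, length D
                     (assumption: same D as the region features).
    Returns (mask, context): N attention weights and the weighted sum.
    """
    scale = math.sqrt(len(label_embedding))
    scores = [
        sum(f * e for f, e in zip(feat, label_embedding)) / scale
        for feat in region_features
    ]
    mask = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(mask, region_features))
        for d in range(dim)
    ]
    return mask, context


# Hypothetical toy input: three image regions and one label embedding.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
label = [2.0, 0.0]
mask, context = label_guided_attention(regions, label)
print(mask, context)
```

Regions whose features align with the label embedding receive larger weights, so the decoder would see a context vector dominated by label-relevant regions.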

A Multi-Branch Feature Fusion Strategy Based on an Attention Mechanism for Remote Sensing Image Scene Classification

Remote Sensing ◽

10.3390/rs13101950 ◽

2021 ◽

Vol 13 (10) ◽

pp. 1950

Author(s):

Cuiping Shi ◽

Xin Zhao ◽

Liguo Wang

Keyword(s):

Remote Sensing ◽

Feature Extraction ◽

Classification Accuracy ◽

Feature Fusion ◽

State Of The Art ◽

Rapid Development ◽

Remote Sensing Image ◽

Classification Performance ◽

Attention Mechanism ◽

Scene Classification

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve classification performance, many studies have increased the depth and width of convolutional neural networks (CNNs) to extract deeper features, which increases the complexity of the model. To solve this problem, we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. First, we propose two convolution combination modules for feature extraction, through which the deep features of images can be fully extracted by multiple cooperating convolutions. Then, the weights of the features are calculated, and the extracted deep features are sent to the attention mechanism for further feature extraction. Next, all of the extracted features are fused by multiple branches. Finally, depthwise separable convolution and asymmetric convolution are used to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method achieves a clear advantage in classification accuracy with very few parameters.
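
The parameter savings from the two convolution variants named in the abstract can be checked with simple arithmetic. The sketch below counts weights only (no biases) and assumes the asymmetric variant replaces one k×k convolution with a 1×k convolution followed by a k×1 convolution, using the output channel count in between. The layer sizes are hypothetical and are not taken from AMB-CNN.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in one k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k depthwise filter per input channel, then a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def asymmetric_conv_params(c_in, c_out, k):
    """A 1 x k conv (c_in -> c_out) followed by a k x 1 conv (c_out -> c_out)."""
    return k * c_in * c_out + k * c_out * c_out


# Hypothetical layer: 128 input channels, 128 output channels, 3 x 3 kernel.
c_in, c_out, k = 128, 128, 3
print(standard_conv_params(c_in, c_out, k))        # 147456
print(depthwise_separable_params(c_in, c_out, k))  # 17536
print(asymmetric_conv_params(c_in, c_out, k))      # 98304
```

For this layer the depthwise separable form needs roughly 12% of the standard weights and the asymmetric form about 67%, which shows why both are common choices in lightweight networks.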

Ensemble model with cascade attention mechanism for high-resolution remote sensing image scene classification

Optics Express ◽

10.1364/oe.395866 ◽

2020 ◽

Vol 28 (15) ◽

pp. 22358

Author(s):

Fengpeng Li ◽

Ruyi Feng ◽

Wei Han ◽

Lizhe Wang

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Remote Sensing Image ◽

Attention Mechanism ◽

Ensemble Model ◽

Scene Classification

An Augmentation Attention Mechanism for High-Spatial-Resolution Remote Sensing Image Scene Classification

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing ◽

10.1109/jstars.2020.3006241 ◽

2020 ◽

Vol 13 ◽

pp. 3862-3878

Author(s):

Fengpeng Li ◽

Ruyi Feng ◽

Wei Han ◽

Lizhe Wang

Keyword(s):

Remote Sensing ◽

Spatial Resolution ◽

High Spatial Resolution ◽

Remote Sensing Image ◽

Attention Mechanism ◽

Scene Classification

Remote Sensing Image Change Detection Based on Information Transmission and Attention Mechanism

IEEE Access ◽

10.1109/access.2019.2947286 ◽

2019 ◽

Vol 7 ◽

pp. 156349-156359 ◽

Cited By ~ 4

Author(s):

Ruochen Liu ◽

Zhihong Cheng ◽

Langlang Zhang ◽

Jianxia Li

Keyword(s):

Remote Sensing ◽

Change Detection ◽

Information Transmission ◽

Remote Sensing Image ◽

Attention Mechanism ◽

Image Change Detection

Improved Attention Mechanism and Residual Network for Remote Sensing Image Scene Classification

IEEE Access ◽

10.1109/access.2021.3116968 ◽

2021 ◽

pp. 1-1

Author(s):

Jiayuan Kong ◽

Yurong Gao ◽

Yanjun Zhang ◽

Huimin Lei ◽

Yao Wang ◽

...

Keyword(s):

Remote Sensing ◽

Remote Sensing Image ◽

Attention Mechanism ◽

Scene Classification ◽

Residual Network

A Remote Sensing Image Segmentation Method Based on Fusion Mechanism

Journal of Physics Conference Series ◽

10.1088/1742-6596/2138/1/012016 ◽

2021 ◽

Vol 2138 (1) ◽

pp. 012016

Author(s):

Shuangling Zhu ◽

Guli Nazi·Aili Mujiang ◽

Huxidan Jumahong ◽

Pazi Laiti·Nuer Maiti

Keyword(s):

Remote Sensing ◽

Semantic Segmentation ◽

Remote Sensing Image ◽

Detection Algorithm ◽

Attention Mechanism ◽

Segmentation Method ◽

Remote Sensing Images ◽

Convolutional Network ◽

Input Layer ◽

Basic Network

The U-Net convolutional network structure can be trained end to end with very little data and still achieve good results. When a convolutional network has short connections between layers near the input and layers near the output, it can be trained more deeply, accurately and effectively. This paper proposes a high-resolution remote sensing image change detection algorithm based on a dense convolutional channel attention mechanism. The detection algorithm uses the U-Net network module as the basic network to extract features, combines the DenseNet dense module to enhance U-Net, and introduces the dense convolution channel attention mechanism into the basic convolution unit to highlight important features, thus completing semantic segmentation of remote sensing images. Simulation results verify the effectiveness and robustness of the method.
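
The abstract does not specify how its channel attention is computed, so the following is only a generic squeeze-and-excitation style sketch of the idea of highlighting important channels. It is not the paper's dense convolution channel attention. The function name `channel_attention` and the weight matrices are hypothetical and would be learned in a real network.

```python
import math


def channel_attention(feature_map, w_reduce, w_expand):
    """Rescale each channel of a feature map by a learned gate in (0, 1).

    feature_map: list of C channels, each an H x W list of lists.
    w_reduce:    R x C weight matrix (hypothetical, normally learned).
    w_expand:    C x R weight matrix (hypothetical, normally learned).
    Returns (gates, rescaled).
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation: a small bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    rescaled = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, feature_map)
    ]
    return gates, rescaled


# Hypothetical toy input: two 2 x 2 channels and identity weights.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
gates, out = channel_attention(fmap, identity, identity)
print(gates)
```

With these toy weights the active channel receives a larger gate than the empty one, which is the sense in which channel attention emphasizes informative features.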
