Scene Attention Mechanism for Remote Sensing Image Caption Generation

Author(s):  
Shiqi Wu ◽  
Xiangrong Zhang ◽  
Xin Wang ◽  
Chen Li ◽  
Licheng Jiao
2018 ◽  
Vol 56 (4) ◽  
pp. 2183-2195 ◽  
Author(s):  
Xiaoqiang Lu ◽  
Binqiang Wang ◽  
Xiangtao Zheng ◽  
Xuelong Li

2020 ◽  
Vol 79 (35-36) ◽  
pp. 26661-26682
Author(s):  
Xiangqing Shen ◽  
Bing Liu ◽  
Yong Zhou ◽  
Jiaqi Zhao

2020 ◽  
Vol 12 (6) ◽  
pp. 939 ◽  
Author(s):  
Yangyang Li ◽  
Shuangkang Fang ◽  
Licheng Jiao ◽  
Ruijiao Liu ◽  
Ronghua Shang

Image captioning is the task of generating a sentence that appropriately describes an image, and it sits at the intersection of computer vision and natural language processing. Although research on remote sensing image captioning has only recently begun, it is of great significance. The attention mechanism, which is inspired by the way humans think, is widely used in remote sensing image captioning. However, the attention mechanisms currently used for this task attend mainly to the image, which is too simple to express such a complex task well. Therefore, in this paper, we propose a multi-level attention model that more closely imitates human attention. The model contains three attention structures, which represent attention to different areas of the image, attention to different words, and attention to vision and semantics. Experiments show that our model achieves better results than previous methods and is currently state-of-the-art. In addition, the existing datasets for remote sensing image captioning contain a large number of errors. Therefore, in this paper, a lot of work has been done to correct the existing datasets in order to promote research on remote sensing image captioning.
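
To make the first of the three attention structures concrete, the following is a minimal sketch of generic soft attention over image regions. It is not the model from the abstract above: the dot-product scoring, the function names `softmax` and `soft_attention`, and the toy feature vectors are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(query, regions):
    """Weight each region feature by its similarity to the query.

    Assumption: similarity is a plain dot product; real captioning
    models usually learn this scoring function.
    """
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    # The context vector is the attention-weighted sum of region features.
    context = [
        sum(w * region[i] for w, region in zip(weights, regions))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical toy data: three image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # stands in for the decoder state at one time step
weights, context = soft_attention(query, regions)
```

In this toy case the first and third regions score equally against the query and receive equal, larger weights than the second region. Attention over words, or over visual versus semantic inputs, could reuse the same weighting step with different inputs, although the abstract does not specify how its three structures are combined.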


2021 ◽  
Vol 13 (10) ◽  
pp. 1950
Author(s):  
Cuiping Shi ◽  
Xin Zhao ◽  
Liguo Wang

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve classification performance, many studies have increased the depth of convolutional neural networks (CNNs) and expanded their width to extract deeper features, which increases model complexity. To address this problem, in this paper, we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. First, we propose two convolution combination modules for feature extraction, through which the deep features of images can be fully extracted by multiple cooperating convolutions. Then, the feature weights are calculated, and the extracted deep features are sent to the attention mechanism for further feature extraction. Next, all of the extracted features are fused across multiple branches. Finally, depthwise separable convolution and asymmetric convolution are used to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method retains a clear advantage in classification accuracy while using very few parameters.
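
The parameter savings mentioned at the end of the abstract can be illustrated with standard weight-count formulas. This is a sketch under stated assumptions, not the AMB-CNN configuration: biases are ignored, the asymmetric variant is assumed to be a k x 1 convolution followed by a 1 x k convolution, and the kernel size and channel counts are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """One k x k filter per input channel, then a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def asymmetric_conv_params(k, c_in, c_out):
    """Assumed layout: a k x 1 conv (c_in -> c_out), then a 1 x k conv
    (c_out -> c_out), replacing a single k x k conv."""
    return k * c_in * c_out + k * c_out * c_out


# Hypothetical layer: 3 x 3 kernel, 64 input and 64 output channels.
k, c_in, c_out = 3, 64, 64
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
asymmetric = asymmetric_conv_params(k, c_in, c_out)
print(standard, separable, asymmetric)  # prints: 36864 4672 24576
```

For this layer the depthwise separable form needs roughly one eighth of the weights of the standard convolution, and the asymmetric pair needs two thirds.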


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Jiayuan Kong ◽  
Yurong Gao ◽  
Yanjun Zhang ◽  
Huimin Lei ◽  
Yao Wang ◽  
...  
