Improved SRCNN remote sensing image spatio-temporal fusion based on multi-stream data input and attention mechanism: taking Landsat8 and MODIS remote sensing images as examples

Author(s):  
Ping Liu ◽  
Xiangru Jia ◽  
Bo Li ◽  
Xinrui Li ◽  
Feilong Wang ◽  
...  
2021 ◽  
Vol 2138 (1) ◽  
pp. 012016
Author(s):  
Shuangling Zhu ◽  
Guli Nazi·Aili Mujiang ◽  
Huxidan Jumahong ◽  
Pazi Laiti·Nuer Maiti

Abstract A U-Net convolutional network structure is fully capable of completing the end-to-end training with extremely little data, and can achieve better results. When the convolutional network has a short link between a near input layer and a near output layer, it can implement training in a deeper, more accurate and effective way. This paper mainly proposes a high-resolution remote sensing image change detection algorithm based on dense convolutional channel attention mechanism. The detection algorithm uses U-Net network module as the basic network to extract features, combines Dense-Net dense module to enhance U-Net, and introduces dense convolution channel attention mechanism into the basic convolution unit to highlight important features, thus completing semantic segmentation of dense convolutional remote sensing images. Simulation results have verified the effectiveness and robustness of this study.


2019 ◽  
Vol 11 (20) ◽  
pp. 2349 ◽  
Author(s):  
Zhengyuan Zhang ◽  
Wenhui Diao ◽  
Wenkai Zhang ◽  
Menglong Yan ◽  
Xin Gao ◽  
...  

Significant progress has been made in remote sensing image captioning by encoder-decoder frameworks. The conventional attention mechanism is prevalent in this task but still has some drawbacks. The conventional attention mechanism only uses visual information about the remote sensing images without considering using the label information to guide the calculation of attention masks. To this end, a novel attention mechanism, namely Label-Attention Mechanism (LAM), is proposed in this paper. LAM additionally utilizes the label information of high-resolution remote sensing images to generate natural sentences to describe the given images. It is worth noting that, instead of high-level image features, the predicted categories’ word embedding vectors are adopted to guide the calculation of attention masks. Representing the content of images in the form of word embedding vectors can filter out redundant image features. In addition, it can also preserve pure and useful information for generating complete sentences. The experimental results from UCM-Captions, Sydney-Captions and RSICD demonstrate that LAM can improve the model’s performance for describing high-resolution remote sensing images and obtain better S m scores compared with other methods. S m score is a hybrid scoring method derived from the AI Challenge 2017 scoring method. In addition, the validity of LAM is verified by the experiment of using true labels.


2019 ◽  
Vol 11 (6) ◽  
pp. 612 ◽  
Author(s):  
Xiangrong Zhang ◽  
Xin Wang ◽  
Xu Tang ◽  
Huiyu Zhou ◽  
Chen Li

Image captioning generates a semantic description of an image. It deals with image understanding and text mining, which has made great progress in recent years. However, it is still a great challenge to bridge the “semantic gap” between low-level features and high-level semantics in remote sensing images, in spite of the improvement of image resolutions. In this paper, we present a new model with an attribute attention mechanism for the description generation of remote sensing images. Therefore, we have explored the impact of the attributes extracted from remote sensing images on the attention mechanism. The results of our experiments demonstrate the validity of our proposed model. The proposed method obtains six higher scores and one slightly lower, compared against several state of the art techniques, on the Sydney Dataset and Remote Sensing Image Caption Dataset (RSICD), and receives all seven higher scores on the UCM Dataset for remote sensing image captioning, indicating that the proposed framework achieves robust performance for semantic description in high-resolution remote sensing images.


2021 ◽  
Vol 13 (16) ◽  
pp. 3192
Author(s):  
Yuxin Dong ◽  
Fukun Chen ◽  
Shuang Han ◽  
Hao Liu

At present, reliable and precise ship detection in high-resolution optical remote sensing images affected by wave clutter, thin clouds, and islands under complex sea conditions is still challenging. At the same time, object detection algorithms in satellite remote sensing images are challenged by color, aspect ratio, complex background, and angle variability. Even the results obtained based on the latest convolutional neural network (CNN) method are not satisfactory. In order to obtain more accurate ship detection results, this paper proposes a remote sensing image ship object detection method based on a brainlike visual attention mechanism. We refer to the robust expression mode of the human brain, design a vector field filter with active rotation capability, and explicitly encode the direction information of the remote sensing object in the neural network. The progressive enhancement learning model guided by the visual attention mechanism is used to dynamically solve the problem, and the object can be discovered and detected through time–space information. To verify the effectiveness of the proposed method, a remote sensing ship object detection data set is established, and the proposed method is compared with other state-of-the-art methods on the established data set. Experiments show that the object detection accuracy of this method and the ability to capture image details have been improved. Compared with other models, the average intersection rate of the joint is 80.12%, which shows a clear advantage. The proposed method is fast enough to meet the needs of ship detection in remote sensing images.


Author(s):  
Xiaochuan Tang ◽  
Mingzhe Liu ◽  
Hao Zhong ◽  
Yuanzhen Ju ◽  
Weile Li ◽  
...  

Landslide recognition is widely used in natural disaster risk management. Traditional landslide recognition is mainly conducted by geologists, which is accurate but inefficient. This article introduces multiple instance learning (MIL) to perform automatic landslide recognition. An end-to-end deep convolutional neural network is proposed, referred to as Multiple Instance Learning–based Landslide classification (MILL). First, MILL uses a large-scale remote sensing image classification dataset to build pre-train networks for landslide feature extraction. Second, MILL extracts instances and assign instance labels without pixel-level annotations. Third, MILL uses a new channel attention–based MIL pooling function to map instance-level labels to bag-level label. We apply MIL to detect landslides in a loess area. Experimental results demonstrate that MILL is effective in identifying landslides in remote sensing images.


2021 ◽  
Vol 13 (9) ◽  
pp. 1713
Author(s):  
Songwei Gu ◽  
Rui Zhang ◽  
Hongxia Luo ◽  
Mengyao Li ◽  
Huamei Feng ◽  
...  

Deep learning is an important research method in the remote sensing field. However, samples of remote sensing images are relatively few in real life, and those with markers are scarce. Many neural networks represented by Generative Adversarial Networks (GANs) can learn from real samples to generate pseudosamples, rather than traditional methods that often require more time and man-power to obtain samples. However, the generated pseudosamples often have poor realism and cannot be reliably used as the basis for various analyses and applications in the field of remote sensing. To address the abovementioned problems, a pseudolabeled sample generation method is proposed in this work and applied to scene classification of remote sensing images. The improved unconditional generative model that can be learned from a single natural image (Improved SinGAN) with an attention mechanism can effectively generate enough pseudolabeled samples from a single remote sensing scene image sample. Pseudosamples generated by the improved SinGAN model have stronger realism and relatively less training time, and the extracted features are easily recognized in the classification network. The improved SinGAN can better identify sub-jects from images with complex ground scenes compared with the original network. This mechanism solves the problem of geographic errors of generated pseudosamples. This study incorporated the generated pseudosamples into training data for the classification experiment. The result showed that the SinGAN model with the integration of the attention mechanism can better guarantee feature extraction of the training data. Thus, the quality of the generated samples is improved and the classification accuracy and stability of the classification network are also enhanced.


2021 ◽  
Vol 13 (5) ◽  
pp. 869
Author(s):  
Zheng Zhuo ◽  
Zhong Zhou

In recent years, the amount of remote sensing imagery data has increased exponentially. The ability to quickly and effectively find the required images from massive remote sensing archives is the key to the organization, management, and sharing of remote sensing image information. This paper proposes a high-resolution remote sensing image retrieval method with Gabor-CA-ResNet and a split-based deep feature transform network. The main contributions include two points. (1) For the complex texture, diverse scales, and special viewing angles of remote sensing images, A Gabor-CA-ResNet network taking ResNet as the backbone network is proposed by using Gabor to represent the spatial-frequency structure of images, channel attention (CA) mechanism to obtain stronger representative and discriminative deep features. (2) A split-based deep feature transform network is designed to divide the features extracted by the Gabor-CA-ResNet network into several segments and transform them separately for reducing the dimensionality and the storage space of deep features significantly. The experimental results on UCM, WHU-RS, RSSCN7, and AID datasets show that, compared with the state-of-the-art methods, our method can obtain competitive performance, especially for remote sensing images with rare targets and complex textures.


2021 ◽  
Vol 11 (11) ◽  
pp. 5050
Author(s):  
Jiahai Tan ◽  
Ming Gao ◽  
Kai Yang ◽  
Tao Duan

Road extraction from remote sensing images has attracted much attention in geospatial applications. However, the existing methods do not accurately identify the connectivity of the road. The identification of the road pixels may be interfered with by the abundant ground such as buildings, trees, and shadows. The objective of this paper is to enhance context and strip features of the road by designing UNet-like architecture. The overall method first enhances the context characteristics in the segmentation step and then maintains the stripe characteristics in a refinement step. The segmentation step exploits an attention mechanism to enhance the context information between the adjacent layers. To obtain the strip features of the road, the refinement step introduces the strip pooling in a refinement network to restore the long distance dependent information of the road. Extensive comparative experiments demonstrate that the proposed method outperforms other methods, achieving an overall accuracy of 98.25% on the DeepGlobe dataset, and 97.68% on the Massachusetts dataset.


2021 ◽  
Vol 13 (10) ◽  
pp. 1950
Author(s):  
Cuiping Shi ◽  
Xin Zhao ◽  
Liguo Wang

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve the classification performance, many studies have increased the depth of convolutional neural networks (CNNs) and expanded the width of the network to extract more deep features, thereby increasing the complexity of the model. To solve this problem, in this paper, we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. Firstly, we propose two convolution combination modules for feature extraction, through which the deep features of images can be fully extracted with multi convolution cooperation. Then, the weights of the feature are calculated, and the extracted deep features are sent to the attention mechanism for further feature extraction. Next, all of the extracted features are fused by multiple branches. Finally, depth separable convolution and asymmetric convolution are implemented to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method still has a great advantage in classification accuracy with very few parameters.


Sign in / Sign up

Export Citation Format

Share Document