Swin-HSTPS: Research on Target Detection Algorithms for Multi-Source High-Resolution Remote Sensing Images

Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8113
Author(s):  
Kun Fang ◽  
Jianquan Ouyang ◽  
Buwei Hu

Traffic port stations are composed of buildings, infrastructure, and transportation vehicles. Detecting traffic port stations in high-resolution remote sensing images requires collecting feature information from nearby small targets, analyzing and classifying it comprehensively, and then locating the station. Deep learning methods based on convolutional neural networks have made great progress in single-target detection in high-resolution remote sensing images, but adapting well to the recognition of multi-target complexes in such images remains a difficult problem in the remote sensing field. This paper constructs a novel traffic port station detection model for high-resolution remote sensing images (Swin-HSTPS) to detect traffic port stations (such as airports and ports) and to improve the recognition accuracy of multi-target complexes. It addresses high-precision positioning by comprehensively analyzing the combined feature information of multiple small targets. The model incorporates the MixUp hybrid enhancement algorithm to enhance image feature information in the preprocessing stage. The PReLU activation function is added to the forward network of the Swin Transformer model to construct a ResNet-like residual network that applies a non-linear transformation to the convolutional feature maps, strengthening the information interaction between pixel blocks. The experiment evaluates model training by comparing two indicators, average precision and average recall, in the training phase; in the prediction stage, the accuracy of the predicted target is measured by confidence.
Experimental results show that the optimal average precision of Swin-HSTPS reaches 85.3%, about 8% higher than that of the Swin Transformer detection model. Its target prediction accuracy is also higher than that of the Swin Transformer detection model, and it can accurately locate traffic port stations such as airports and ports in high-resolution remote sensing images. The model inherits the advantages of the Swin Transformer detection model and outperforms mainstream models such as R-CNN and YOLOv5 in predicting traffic port stations in high-resolution remote sensing images.
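
The MixUp enhancement mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common classification-style formulation (two samples blended with a weight drawn from a Beta distribution); the abstract does not say how Swin-HSTPS adapts MixUp to detection boxes, so the function name `mixup`, the flat-list image representation, and the default `alpha` are all hypothetical choices for illustration.

```python
import random


def mixup(image_a, image_b, label_a, label_b, alpha=0.2, rng=None):
    """Blend two equally sized samples with a weight drawn from Beta(alpha, alpha).

    Images are flat lists of pixel values and labels are one-hot lists.
    This is an illustrative assumption, not the paper's data format.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    mixed_image = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_image, mixed_label, lam


if __name__ == "__main__":
    img, lab, lam = mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                          rng=random.Random(0))
    print(lam, img, lab)
```

The mixed label stays a valid probability vector because both inputs are weighted by `lam` and `1 - lam`.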

Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3232 ◽  
Author(s):  
Yan Liu ◽  
Qirui Ren ◽  
Jiahui Geng ◽  
Meng Ding ◽  
Jiangyun Li

Efficient and accurate semantic segmentation is a key technique for automatic remote sensing image analysis. While there are many segmentation methods based on traditional hand-crafted feature extractors, processing high-resolution, large-scale remote sensing images remains challenging. In this work, a novel patch-wise semantic segmentation method with a new training strategy based on fully convolutional networks is presented to segment common land resources. First, to handle high-resolution images, the images are split into local patches and a patch-wise network is built. Second, the training data are preprocessed in several ways to account for specific characteristics of remote sensing images, i.e., color imbalance, object rotation variations, and lens distortion. Third, a multi-scale training strategy is developed to address the severe scale variation problem. In addition, the impact of a conditional random field (CRF) on precision is studied. The proposed method was evaluated on a dataset collected from a capital city in West China with the Gaofen-2 satellite. The dataset contains ten common land resources (Grassland, Road, etc.). The experimental results show that the proposed algorithm achieves 54.96% in terms of mean intersection over union (MIoU) and outperforms other state-of-the-art methods in remote sensing image segmentation.
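
As a reminder of how the MIoU metric reported here is usually computed, the following sketch derives it from a pixel-level confusion matrix. It assumes the standard definition (per-class intersection over union, averaged over classes); the function name `mean_iou` and the example matrix are hypothetical and are not taken from the paper.

```python
def mean_iou(confusion):
    """Mean intersection over union from a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Made-up two-class example: per-class IoU is 3/5 and 5/7.
    print(mean_iou([[3, 1], [1, 5]]))
```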


2012 ◽  
Vol 500 ◽  
pp. 716-721
Author(s):  
Yi Ding Wang ◽  
Shuai Qin

In the field of remote sensing, acquiring higher-resolution images has become a hot topic as high-resolution remote sensing images come into wide use. This paper focuses on the characteristics of high-resolution remote sensing images and, taking full account of the correlation between geometric features and image pixels, proposes a fusion-based image mosaic processing algorithm. With this algorithm, surface features are well preserved after the remote sensing images are mosaicked, and the overlapping areas transition naturally, which benefits post-processing, analysis, and application.


2021 ◽  
Vol 13 (22) ◽  
pp. 4528
Author(s):  
Xin Yang ◽  
Lei Hu ◽  
Yongmei Zhang ◽  
Yunqing Li

Remote sensing image change detection (CD) is an important task in remote sensing image analysis and is essential for an accurate understanding of changes in the Earth’s surface. Deep learning (DL) is becoming increasingly popular for solving CD tasks on remote sensing images. Most existing DL-based CD methods use ordinary convolutional blocks to extract and compare remote sensing image features, which cannot fully extract the rich features of high-resolution (HR) remote sensing images. In addition, most existing methods lack robustness to pseudochange information. To overcome these problems, in this article, we propose a new method, namely MRA-SNet, for CD in remote sensing images. With UNet as the base network, the method uses a Siamese network to extract the features of the bitemporal images separately in the encoder and performs a difference connection to better generate difference maps. Meanwhile, we replace the ordinary convolution blocks with Multi-Res blocks to extract spatial and spectral features at different scales in remote sensing images. Residual connections are used to extract additional detailed features. To better highlight the change region features and suppress the irrelevant region features, we introduce the Attention Gates module before the skip connection between the encoder and the decoder. Experimental results on a public dataset of remote sensing image CD show that our proposed method outperforms other state-of-the-art (SOTA) CD methods in terms of evaluation metrics and performance.


Author(s):  
Jingtan Li ◽  
Maolin Xu ◽  
Hongling Xiu

As the resolution of remote sensing images grows ever higher, high-resolution remote sensing images are widely used in many areas. Image information extraction is one of their basic applications. Traditional target recognition methods struggle to cope with massive volumes of high-resolution remote sensing image data. Therefore, this paper proposes a remote sensing image extraction method based on the U-net network. First, the U-net semantic segmentation network is trained on the training set while the validation set is used to validate the training, and finally the test set is used for testing. The experimental results show that U-net can be applied to the extraction of buildings.


2019 ◽  
Vol 11 (20) ◽  
pp. 2349 ◽  
Author(s):  
Zhengyuan Zhang ◽  
Wenhui Diao ◽  
Wenkai Zhang ◽  
Menglong Yan ◽  
Xin Gao ◽  
...  

Significant progress has been made in remote sensing image captioning by encoder-decoder frameworks. The conventional attention mechanism is prevalent in this task but still has some drawbacks. The conventional attention mechanism only uses visual information about the remote sensing images without considering using the label information to guide the calculation of attention masks. To this end, a novel attention mechanism, namely Label-Attention Mechanism (LAM), is proposed in this paper. LAM additionally utilizes the label information of high-resolution remote sensing images to generate natural sentences to describe the given images. It is worth noting that, instead of high-level image features, the predicted categories’ word embedding vectors are adopted to guide the calculation of attention masks. Representing the content of images in the form of word embedding vectors can filter out redundant image features. In addition, it can also preserve pure and useful information for generating complete sentences. The experimental results from UCM-Captions, Sydney-Captions and RSICD demonstrate that LAM can improve the model’s performance for describing high-resolution remote sensing images and obtain better S_m scores compared with other methods. The S_m score is a hybrid scoring method derived from the AI Challenge 2017 scoring method. In addition, the validity of LAM is verified by an experiment using true labels.


2020 ◽  
Vol 12 (14) ◽  
pp. 2334
Author(s):  
Lu Zhao ◽  
Hongyan Ren ◽  
Cheng Cui ◽  
Yaohuan Huang

High-resolution remotely sensed imagery has been widely employed to detect urban villages (UVs) in highly urbanized regions, especially in developing countries. However, the understanding of the potential impacts of spatially and temporally differentiated urban internal development on UV detection is still limited. In this study, a partition-strategy-based framework integrating the random forest (RF) model, the object-based image analysis (OBIA) method, and high-resolution remote sensing images was proposed for the UV-detection model. In the core regions of Guangzhou, four original districts were re-divided into five new zones, according to their different proportions of construction land, for the subsequent object-based RF detection of UVs with a series of features. The results show that the proposed framework performs well on UV detection, with an average overall accuracy of 90.23% and a kappa coefficient of 0.8. It also shows the possibility of transferring samples and models to a similar area. In summary, the partition strategy is a potential solution for improving UV-detection accuracy from high-resolution remote sensing images in Guangzhou. We suggest that the spatiotemporal process of urban construction land expansion should be comprehensively understood so as to ensure efficient UV detection in highly urbanized regions. This study can provide some meaningful clues for city managers to identify UVs efficiently before devising and implementing their urban planning in the future.
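
The two accuracy figures quoted in this abstract, overall accuracy and the kappa coefficient, are conventionally derived from a confusion matrix. The sketch below assumes the standard Cohen's kappa definition; the function name `overall_accuracy_and_kappa` and the example matrix are hypothetical and do not reproduce the study's data.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of true class i assigned to class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(confusion[r][i] for r in range(n))
        for i in range(n)
    ) / (total * total)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined in this degenerate case
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Made-up balanced two-class example (UV vs. non-UV objects).
    print(overall_accuracy_and_kappa([[45, 5], [5, 45]]))
```

Kappa discounts the agreement expected by chance, which is why it is often reported alongside overall accuracy when class proportions differ between zones.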


2019 ◽  
Vol 15 (5) ◽  
pp. 391-395 ◽  
Author(s):  
Min Wang ◽  
Jin-yong Chen ◽  
Gang Wang ◽  
Feng Gao ◽  
Kang Sun ◽  
...  
