Multi-Modal Remote Sensing Image Matching Method Based on Deep Learning Technology

2021 ◽  
Vol 2083 (3) ◽  
pp. 032093
Author(s):  
Hao Han ◽  
Canhai Li ◽  
Xiaofeng Qiu

Abstract Remote sensing is a scientific technology that uses sensors to detect, without contact and over long distances, the electromagnetic wave signals reflected, radiated, or scattered by ground objects. Classifying and recognizing images from the extracted image feature information is a further step in obtaining target feature information, and it is of great significance to urban planning, disaster monitoring, and ecological environment evaluation. The image matching framework proposed in this paper matches the deep feature maps, propagates the geometric deformation found between the deep feature maps back to the original reference image and target image, and eliminates the geometric deformation between the original images. Finally, features are extracted from the corrected image, and the extracted local feature image patches are input into the trained multi-modal feature matching network to complete the matching process. Experiments show that the proposed negative sample set construction strategy, which takes sample distance into account, can effectively deal with the problem of neighboring-point interference in remote sensing image (RSI) matching and improve the matching performance of the network model.

2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Qi Zhang

Abstract Image classification plays an important role in computer vision. Existing convolutional neural network methods have several problems in image classification, such as low accuracy of tumor classification and poor feature expression and feature extraction ability. Therefore, we propose a novel ResNet101 model based on dense dilated convolution for medical liver tumor classification. The multi-scale feature extraction module is used to extract multi-scale features of images and to increase the receptive field of the network. The depth feature extraction module is used to reduce background noise and focus on effective features of the focal region. To obtain broader and deeper semantic information, a dense dilated convolution module is deployed in the network. This module combines the advantages of Inception, residual structure, and multi-scale dilated convolution to obtain deeper feature information without causing exploding or vanishing gradients. To solve the common feature loss problem in classification networks, the up- and down-sampling module in the network is improved, and multiple convolution kernels of different scales are cascaded to widen the network, which effectively avoids feature loss. Finally, experiments are carried out on the proposed method. Compared with existing mainstream classification networks, the proposed method improves classification performance and achieves accurate classification of liver tumors. The effectiveness of the proposed method is further verified by ablation experiments.

Highlights
The multi-scale feature extraction module is introduced to extract multi-scale features of images; it can extract deep context information of the lesion region and surrounding tissues to enhance the feature extraction ability of the network.
The depth feature extraction module is used to focus on the local features of the lesion region in both the channel and spatial dimensions, weaken the influence of irrelevant information, and strengthen the recognition ability for the lesion region.
The feature extraction module is enhanced by the parallel structure of dense dilated convolution, and deeper feature information is obtained without losing image feature information, which improves classification accuracy.
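The claim that dilated convolution increases the receptive field can be checked with simple arithmetic. The sketch below is a minimal illustration, not the paper's module: it assumes stride-1 layers, and the dilation rates 1, 2, 4 are hypothetical values chosen only for the example.

```python
def effective_kernel(kernel_size, dilation):
    """Span covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs (illustrative values).
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += dilation * (kernel_size - 1)
    return rf


# Three plain 3x3 layers versus three 3x3 layers with assumed dilations 1, 2, 4.
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
```

With the same number of weights per layer, the dilated stack sees a wider input window than the plain stack, which is the effect the abstract relies on.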


2021 ◽  
Vol 13 (19) ◽  
pp. 3871
Author(s):  
Xu Cheng ◽  
Lihua Liu ◽  
Chen Song

Object detection and segmentation have recently shown encouraging results in image analysis and interpretation, with promising applications in the field of remote sensing image fusion. Although numerous methods have been proposed, implementing effective and efficient object detection is still very challenging, especially given the limitations of single-modal data. Single-modal data alone are not always enough to reach adequate spectral and spatial resolutions. The rapid expansion in the number and availability of multi-source data creates new challenges for their effective and efficient processing. In this paper, we propose an effective feature information–interaction visual attention model for multimodal data segmentation and enhancement, which utilizes channel information to weight self-attentive feature maps of different sources, completing the extraction, fusion, and enhancement of global semantic features with local contextual information of the object. Additionally, we propose an adaptively cyclic feature information–interaction model, which adopts branch prediction to decide the number of visual perceptions, accomplishing adaptive fusion of global semantic features and local fine-grained information. Numerous experiments on several benchmarks show that the proposed approach achieves significant improvements over the baseline model.


2020 ◽  
Vol 12 (4) ◽  
pp. 681
Author(s):  
Yunsheng Xiong ◽  
Xin Niu ◽  
Yong Dou ◽  
Hang Qie ◽  
Kang Wang

Aircraft recognition has great application value, but aircraft in remote sensing images suffer from low resolution, poor contrast, poor sharpness, and a lack of detail caused by the vertical view, which makes aircraft recognition very difficult. Fine-grained recognition is even more challenging when there are many kinds of aircraft and the differences between them are subtle. In this paper, we propose a non-locally enhanced feature fusion network (NLFFNet) and attempt to make full use of the features from discriminative parts of aircraft. First, based on the long-distance self-correlation in aircraft images, we adopt a non-locally enhanced operation that guides the network to pay more attention to the discriminative areas and enhances the features beneficial to classification. Second, we propose a part-level feature fusion mechanism (PFF), which crops 5 parts of the aircraft on the shared feature maps, extracts the subtle features inside the parts through the part fully connected layer (PFC), and fuses the features of these parts through the combined fully connected layer (CFC). In addition, by adopting an improved loss function, we increase the weight of hard examples in the loss while reducing the weight of excessively hard examples, which improves the overall recognition ability of the network. The dataset includes 47 categories of aircraft, including many aircraft of the same family with slight differences in appearance, and our method achieves 89.12% accuracy on the test dataset, which demonstrates its effectiveness.


2021 ◽  
Vol 13 (15) ◽  
pp. 2966
Author(s):  
Yunchuan Ma ◽  
Pengyuan Lv ◽  
Hao Liu ◽  
Xuehong Sun ◽  
Yanfei Zhong

In recent years, convolutional neural network (CNN)-based super-resolution (SR) methods have been widely used in the field of remote sensing. However, complicated remote sensing images contain abundant high-frequency details, which are difficult to capture and reconstruct effectively. To address this problem, we propose a dense channel attention network (DCAN) to reconstruct high-resolution (HR) remote sensing images. The proposed method learns multi-level feature information and pays more attention to the important and useful regions in order to better reconstruct the final image. Specifically, we construct a dense channel attention mechanism (DCAM), which densely uses the feature maps from the channel attention block via skip connections. This mechanism makes better use of multi-level feature maps, which contain abundant high-frequency information. Further, we add a spatial attention block, which gives the network a more flexible discriminative ability. Experimental results demonstrate that the proposed DCAN method outperforms several state-of-the-art methods in both quantitative evaluation and visual quality.
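The idea behind a channel attention block can be illustrated with a generic squeeze-and-gate sketch. This is an assumption-laden simplification, not the DCAM described above: each channel is summarized by its global average, and a single hypothetical weight and bias per channel (`weights`, `biases`) stand in for the small learned network that a real block would use.

```python
import math


def channel_attention(feature_maps, weights, biases):
    """Rescale each channel by a sigmoid gate computed from its global mean.

    feature_maps: list of C channels, each a list of rows (H x W floats).
    weights, biases: one hypothetical scalar per channel (illustrative only).
    """
    out = []
    for channel, w, b in zip(feature_maps, weights, biases):
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count   # squeeze step
        gate = 1.0 / (1.0 + math.exp(-(w * mean + b)))    # gate in (0, 1)
        out.append([[v * gate for v in row] for row in channel])
    return out


# One 2x2 channel; zero weight and bias give a gate of exactly 0.5.
scaled = channel_attention([[[1.0, 3.0], [2.0, 2.0]]], [0.0], [0.0])
```

Channels whose gate is close to 1 pass through almost unchanged, while channels with a gate near 0 are suppressed, which is how such a block emphasizes informative feature maps.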


2020 ◽  
Vol 64 (1) ◽  
pp. 10505-1-10505-16
Author(s):  
Yin Zhang ◽  
Xuehan Bai ◽  
Junhua Yan ◽  
Yongqi Xiao ◽  
C. R. Chatwin ◽  
...  

Abstract A new blind image quality assessment method called No-Reference Image Quality Assessment Based on Multi-Order Gradients Statistics is proposed. It aims to solve two problems: existing no-reference image quality assessment methods cannot determine the type of image distortion, and their quality evaluation has poor robustness across different types of distortion. In this article, an 18-dimensional image feature vector is constructed from gradient magnitude features, relative gradient orientation features, and relative gradient magnitude features over two scales and three orders, on the basis of the relationship between multi-order gradient statistics and the type and degree of image distortion. The feature matrix and distortion types of known distorted images are used to train an AdaBoost_BP neural network to determine the image distortion type; the feature matrix and subjective scores of known distorted images are used to train an AdaBoost_BP neural network to determine the image distortion degree. A series of comparative experiments were carried out using the Laboratory of Image and Video Engineering (LIVE), LIVE Multiply Distorted Image Quality, Tampere Image, and Optics Remote Sensing Image databases. Experimental results show that the proposed method judges the distortion type with high accuracy and that the quality score shows good subjective consistency and robustness for all types of distortion. The performance of the proposed method is not restricted to a particular database, and the proposed method has high operational efficiency.


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5312
Author(s):  
Yanni Zhang ◽  
Yiming Liu ◽  
Qiang Li ◽  
Jianzhong Wang ◽  
Miao Qi ◽  
...  

Recently, deep learning-based image deblurring and deraining have been well developed. However, most of these methods fail to distill the useful features. Moreover, exploiting detailed image features in a deep learning framework usually requires a large number of parameters, which inevitably burdens the network with a high computational cost. We propose a lightweight fusion distillation network (LFDN) for image deblurring and deraining to solve the above problems. The proposed LFDN is designed as an encoder–decoder architecture. In the encoding stage, the image feature is reduced to various small-scale spaces for multi-scale information extraction and fusion without much information loss. Then, a feature distillation normalization block is designed at the beginning of the decoding stage, which enables the network to distill and screen valuable channel information of feature maps continuously. In addition, an information fusion strategy between distillation modules and feature channels is carried out by the attention mechanism. By fusing different information in the proposed approach, our network can achieve state-of-the-art image deblurring and deraining results with a smaller number of parameters and outperform the existing methods in model complexity.


2021 ◽  
Vol 13 (11) ◽  
pp. 2171
Author(s):  
Yuhao Qing ◽  
Wenyi Liu ◽  
Liuyan Feng ◽  
Wanjia Gao

Despite significant progress in object detection tasks, remote sensing image target detection is still challenging owing to complex backgrounds, large differences in target sizes, and uneven distribution of rotating objects. In this study, we consider model accuracy, inference speed, and detection of objects at any angle. We propose a RepVGG-YOLO network that uses an improved RepVGG model as the backbone feature extraction network, which performs the initial feature extraction from the input image and considers network training accuracy and inference speed. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess the features output by the backbone network. The FPN and PANet module integrates feature maps of different layers, combines context information on multiple scales, accumulates multiple features, and strengthens feature information extraction. To maximize the detection accuracy for objects of all sizes, we use four target detection scales at the network output to enhance feature extraction from small remote sensing targets. To handle objects at arbitrary angles, we improved the classification loss function using circular smooth label technology, turning the angle regression problem into a classification problem and increasing the detection accuracy of objects at any angle. We conducted experiments on two public datasets, DOTA and HRSC2016. Our results show that the proposed method performs better than previous methods.
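Turning angle regression into classification with a circular smooth label can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the 180 one-degree bins, the Gaussian window, and the `sigma` and `radius` values are all hypothetical choices for the example.

```python
import math


def circular_smooth_label(angle_bin, num_bins=180, sigma=4.0, radius=6):
    """Soft classification target centered on angle_bin with wrap-around.

    Bins within `radius` of the target (measured on the circle) receive a
    Gaussian weight; all other bins are zero. All defaults are assumed values.
    """
    label = [0.0] * num_bins
    for i in range(num_bins):
        diff = abs(i - angle_bin)
        distance = min(diff, num_bins - diff)   # circular distance between bins
        if distance <= radius:
            label[i] = math.exp(-(distance * distance) / (2.0 * sigma * sigma))
    return label


def decode_angle(scores):
    """Predicted angle bin is the index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


# A target at bin 179 also gives weight to bin 0, its neighbor on the circle.
target = circular_smooth_label(179)
```

Because the label wraps around, a prediction at bin 0 for a true angle at bin 179 is penalized only lightly, which avoids the boundary discontinuity that plain angle regression suffers from.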


2021 ◽  
Vol 11 (11) ◽  
pp. 5050
Author(s):  
Jiahai Tan ◽  
Ming Gao ◽  
Kai Yang ◽  
Tao Duan

Road extraction from remote sensing images has attracted much attention in geospatial applications. However, existing methods do not accurately identify the connectivity of roads, and the identification of road pixels may be disturbed by abundant ground objects such as buildings, trees, and shadows. The objective of this paper is to enhance the context and strip features of roads by designing a UNet-like architecture. The overall method first enhances the context characteristics in a segmentation step and then maintains the strip characteristics in a refinement step. The segmentation step exploits an attention mechanism to enhance the context information between adjacent layers. To obtain the strip features of the road, the refinement step introduces strip pooling in a refinement network to restore the long-distance dependency information of the road. Extensive comparative experiments demonstrate that the proposed method outperforms other methods, achieving an overall accuracy of 98.25% on the DeepGlobe dataset and 97.68% on the Massachusetts dataset.
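Strip pooling averages a feature map along whole rows and whole columns, so that each position receives context from a long, narrow band. The sketch below is a simplified illustration only: it omits the convolutions and gating that a full strip pooling module would include, and the fusion by simple addition is an assumption made for the example.

```python
def strip_pooling(feature_map):
    """Fuse horizontal and vertical strip averages of a 2D feature map.

    feature_map: list of H rows, each with W floats.
    Returns an H x W map where entry (i, j) is the mean of row i plus the
    mean of column j (simplified fusion, assumed for illustration).
    """
    height = len(feature_map)
    width = len(feature_map[0])
    row_means = [sum(row) / width for row in feature_map]
    col_means = [sum(feature_map[i][j] for i in range(height)) / height
                 for j in range(width)]
    return [[row_means[i] + col_means[j] for j in range(width)]
            for i in range(height)]


# A vertical "road" in column 1 raises every entry of that column after pooling.
pooled = strip_pooling([[0.0, 1.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 1.0, 0.0]])
```

Elongated structures such as roads line up with these strips, so their responses reinforce one another over long distances, while compact objects contribute much less to any single strip.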


Sensors ◽  
2019 ◽  
Vol 19 (19) ◽  
pp. 4115
Author(s):  
Yuxia Li ◽  
Bo Peng ◽  
Lei He ◽  
Kunlong Fan ◽  
Zhenxu Li ◽  
...  

Roads are vital components of infrastructure, and their extraction has become a topic of significant interest in the field of remote sensing. Because deep learning has become a popular method in image processing and information extraction, researchers have paid more attention to extracting roads using neural networks. This article proposes improvements to neural networks for extracting roads from Unmanned Aerial Vehicle (UAV) remote sensing images. D-LinkNet was first considered for its high performance; however, the huge scale of the network reduced computational efficiency. Focusing on the low computational efficiency of the popular D-LinkNet, this article makes the following improvements: (1) Replace the initial block with a stem block. (2) Rebuild the entire network based on ResNet units with a new structure, allowing for the construction of an improved neural network, D-Linknetplus. (3) Add a 1 × 1 convolution layer before DBlock to reduce the input feature maps, reducing parameters and improving computational efficiency, and add another 1 × 1 convolution layer after DBlock to recover the required number of output channels. Accordingly, another improved neural network, B-D-LinknetPlus, was built. The networks were compared, and verification was performed on the Massachusetts Roads Dataset. The results show that the improved neural networks are helpful in reducing the network size and developing the precision needed for road extraction.
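The parameter saving from wrapping a block in 1 × 1 convolutions can be worked out with simple arithmetic. The sketch below is illustrative only: the channel count of 512, the four 3 × 3 convolutions inside the block, and the reduction factor of 2 are assumed values, not figures from the article, and biases are ignored.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weight count of one 2D convolution, ignoring the bias."""
    return kernel_size * kernel_size * in_channels * out_channels


def block_params(channels, num_convs):
    """A block of num_convs 3x3 convolutions at a fixed channel width."""
    return num_convs * conv_params(3, channels, channels)


def bottleneck_block_params(channels, num_convs, reduction):
    """Same block, with 1x1 convolutions reducing and restoring the channels."""
    reduced = channels // reduction
    squeeze = conv_params(1, channels, reduced)   # 1x1 before the block
    expand = conv_params(1, reduced, channels)    # 1x1 after the block
    return squeeze + block_params(reduced, num_convs) + expand


# Assumed sizes: 512 channels, four 3x3 convolutions, channels halved.
baseline = block_params(512, 4)
bottleneck = bottleneck_block_params(512, 4, 2)
```

Under these assumptions the 3 × 3 weights shrink by the square of the reduction factor, while the two added 1 × 1 layers cost comparatively little, so the total drops to well under a third of the baseline.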

