Local Deep Descriptor for Remote Sensing Image Feature Matching

2019 ◽  
Vol 11 (4) ◽  
pp. 430 ◽  
Author(s):  
Yunyun Dong ◽  
Weili Jiao ◽  
Tengfei Long ◽  
Lanfa Liu ◽  
Guojin He ◽  
...  

Feature matching via local descriptors is one of the most fundamental problems in many computer vision tasks, as well as in the remote sensing image processing community. For example, in feature-based remote sensing image registration, feature matching is a vital step that determines the quality of the transform model. In turn, the quality of the feature descriptor directly determines the matching result. At present, the most commonly used descriptors are hand-crafted, relying on the designer’s expertise or intuition. However, it is hard for such descriptors to cover all the different cases, especially for remote sensing images with nonlinear grayscale deformation. Recently, deep learning has shown explosive growth and has improved the performance of tasks in various fields, especially in the computer vision community. Here, we created remote sensing image training patch samples, named Invar-Dataset, in a novel and automatic way, and then trained a deep convolutional neural network, named DescNet, to generate a robust feature descriptor for feature matching. A dedicated experiment was carried out to illustrate that our training dataset was more helpful for training a network to generate a good feature descriptor. A qualitative experiment was then performed to show that the feature descriptor vectors learned by DescNet could be used to successfully register remote sensing images with large grayscale differences. A quantitative experiment was then carried out to illustrate that the feature vectors generated by DescNet could acquire more matched points than those generated by the hand-crafted Scale Invariant Feature Transform (SIFT) descriptor and by other networks. On average, the number of matched points acquired by DescNet was almost twice that acquired by other methods. Finally, we analyzed the advantages of our training dataset Invar-Dataset and of DescNet, and discussed possible future developments in training deep descriptor networks.
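As a rough illustration of how descriptor vectors are typically turned into matched points, the sketch below applies a nearest-neighbour ratio test to two small sets of descriptors. It is not the authors' matching procedure: the function names, the toy 2-D descriptors, and the `ratio=0.8` threshold are all assumptions made for this example, and the abstract does not state which matching rule was used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (index_a, index_b) pairs that pass the nearest-neighbour ratio test.

    ratio is a hypothetical threshold chosen for illustration only.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Distances from descriptor i of image A to every descriptor of image B, ascending.
        ranked = sorted((euclidean(d, e), j) for j, e in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        # Accept only if the best candidate is clearly closer than the runner-up.
        if best < ratio * second:
            matches.append((i, j))
    return matches

# Toy descriptors: the third descriptor of A is equally close to two
# descriptors of B, so it is ambiguous and gets rejected.
desc_a = [[0.0, 0.0], [1.0, 1.0], [6.0, 6.0]]
desc_b = [[0.1, 0.0], [1.0, 0.9], [3.0, 3.0], [9.0, 9.0]]
matches = match_descriptors(desc_a, desc_b)
```

A more discriminative descriptor tends to separate the best candidate from the runner-up more clearly, so more pairs survive a test of this kind.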

Open Physics ◽  
2019 ◽  
Vol 17 (1) ◽  
pp. 871-878
Author(s):  
Yijun Liu ◽  
Ziwen Zhang ◽  
Feng Li

When key frames are extracted from multi-resolution remote sensing images with traditional key frame feature extraction methods, only the feature information of the images is considered and the images are not clustered, which leads to low efficiency and poor-quality extraction results. To address this, a key frame extraction algorithm for multi-resolution remote sensing images under quality constraints is proposed. Rough key frames are first extracted according to the similarity between image features and the selected image frame. On this basis, the quality-constrained key frame extraction clusters the multi-resolution remote sensing images corresponding to the rough key frames, which shortens the time needed to retrieve key frame images. According to the clustering results, the multi-resolution remote sensing images are divided into several clusters. The key frame of each cluster is obtained by calculating the distance between each remote sensing image and the cluster center. The quality of the key frames thus determined is then evaluated to check that it meets the standard, so as to achieve effective key frame extraction from multi-resolution remote sensing images. The experimental results show that the proposed method can significantly improve the quality of key frame extraction from multi-resolution remote sensing images.
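The step of choosing one key frame per cluster by distance to the cluster center can be sketched as follows. This is a minimal illustration under assumptions: the cluster labels are taken as given (the abstract does not say which clustering algorithm is used), the feature vectors are toy 2-D values, and `key_frames` is a hypothetical helper name.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def key_frames(features, labels):
    """Map each cluster label to the index of the image nearest that cluster's center."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    result = {}
    for lab, members in clusters.items():
        center = centroid([features[i] for i in members])
        # The key frame is the cluster member with the smallest distance to the center.
        result[lab] = min(members, key=lambda i: math.dist(features[i], center))
    return result

# Toy feature vectors for six images, already assigned to two clusters.
features = [[0.0, 0.0], [1.0, 0.0], [0.4, 0.1],
            [10.0, 10.0], [12.0, 10.0], [11.0, 10.2]]
labels = [0, 0, 0, 1, 1, 1]
selected = key_frames(features, labels)
```

The quality check described in the abstract would then be applied to the selected indices; it is left out here because the abstract gives no detail on the quality measure.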


2018 ◽  
Vol 10 (6) ◽  
pp. 964 ◽  
Author(s):  
Zhenfeng Shao ◽  
Ke Yang ◽  
Weixun Zhou

Benchmark datasets are essential for developing and evaluating remote sensing image retrieval (RSIR) approaches. However, most of the existing datasets are single-labeled, with each image annotated by a single label representing its most significant semantic content. This is sufficient for simple problems, such as distinguishing between a building and a beach, but multiple labels and sometimes even dense (pixel) labels are required for more complex problems, such as RSIR and semantic segmentation. We therefore extended the existing multi-labeled dataset collected for multi-label RSIR and presented a dense labeling remote sensing dataset termed "DLRSD". DLRSD contained a total of 17 classes, and the pixels of each image were labeled using 17 pre-defined labels. We used DLRSD to evaluate the performance of RSIR methods ranging from traditional handcrafted feature-based methods to deep learning-based ones. More specifically, we evaluated the performance of RSIR methods from both single-label and multi-label perspectives. These results demonstrated the advantages of multiple labels over single labels for interpreting complex remote sensing images. DLRSD provided the literature with a benchmark for RSIR and other pixel-based problems such as semantic segmentation.


Author(s):  
Akey Sungheetha ◽  
Rajesh Sharma R

Over the last decade, remote sensing technology has advanced dramatically, resulting in significant improvements in image quality, data volume, and application usage. These images have essential applications since they can help with quick and easy interpretation. Many standard detection algorithms fail to accurately categorize a scene from a remote sensing image recorded from the earth. This work presents a method that uses bilinear convolutional neural networks to produce a less-weighted set of models, which results in better visual recognition in remote sensing images using fine-grained techniques. The proposed hybrid method extracts scene feature information from remote sensing images twice for improved recognition. In layman's terms, these features are defined as raw and have only a single defined frame, so they allow basic recognition from remote sensing images. This research work proposes a double feature extraction hybrid deep learning approach to classify remotely sensed image scenes based on feature abstraction techniques. The proposed algorithm is also applied to the feature values in order to convert them, after many product operations, into feature vectors that have pure black and white values. The next stage is pooling and normalization, which occurs after the CNN feature extraction process has changed. This research work has developed a novel hybrid framework that has a better level of accuracy and recognition rate than any prior model.


2020 ◽  
Vol 12 (18) ◽  
pp. 2937
Author(s):  
Song Cui ◽  
Miaozhong Xu ◽  
Ailong Ma ◽  
Yanfei Zhong

The nonlinear radiation distortions (NRD) among multimodal remote sensing images bring enormous challenges to image registration. The traditional feature-based registration methods commonly use the image intensity or gradient information to detect and describe the features that are sensitive to NRD. However, the nonlinear mapping of the corresponding features of the multimodal images often results in failure of the feature matching, as well as the image registration. In this paper, a modality-free multimodal remote sensing image registration method (SRIFT) is proposed for the registration of multimodal remote sensing images, which is invariant to scale, radiation, and rotation. In SRIFT, the nonlinear diffusion scale (NDS) space is first established to construct a multi-scale space. A local orientation and scale phase congruency (LOSPC) algorithm is then used so that the features of the images with NRD are mapped to establish a one-to-one correspondence, to obtain sufficiently stable key points. In the feature description stage, a rotation-invariant coordinate (RIC) system is adopted to build a descriptor, without requiring estimation of the main direction. The experiments undertaken in this study included one set of simulated data experiments and nine groups of experiments with different types of real multimodal remote sensing images with rotation and scale differences (including synthetic aperture radar (SAR)/optical, digital surface model (DSM)/optical, light detection and ranging (LiDAR) intensity/optical, near-infrared (NIR)/optical, short-wave infrared (SWIR)/optical, classification/optical, and map/optical image pairs), to test the proposed algorithm from both quantitative and qualitative aspects. The experimental results showed that the proposed method has strong robustness to NRD, being invariant to scale, radiation, and rotation, and the achieved registration precision was better than that of the state-of-the-art methods.


Author(s):  
Tong Wang ◽  
Hemeng Yang ◽  
Ling Zhu ◽  
Yazhou Fan ◽  
Xue Yang ◽  
...  

Remote sensing technology is an effective tool for sensing the earth’s surface. With the continuous improvement of remote sensing technology, remote sensing detectors can obtain more spectral and spatial information, including clear feature contours, complex texture features and spatial layout rules. This information has played an important role in mineral resource detection, surface substance identification, water pollution monitoring and many other areas. The coding algorithm and its defects, the storage algorithm, and interference from atmospheric cloud radiation during the imaging process lead to varying degrees of distortion and deterioration of remote sensing images during imaging, transmission and storage. This makes it difficult to process, analyze and apply remote sensing images. Therefore, designing a reasonable remote sensing image quality evaluation method is worthwhile in practice: it benefits not only quality evaluation in real-time remote sensing image processing systems, but also the optimization of remote sensing imaging systems and image processing algorithms. In this paper, we build on the observation that the deteriorated features of remote sensing images change their statistical distribution, and propose a deep learning method for evaluating the quality of remote sensing images. A convolutional neural network is used for feature learning and for classifying the blur and noise intensity of remote sensing images. The evaluation model is modified by a masking effect and a perceptual weighting factor, and the quality evaluation results of remote sensing images are obtained in accordance with human vision. The research shows that this method can effectively solve the problem of removing and evaluating the noise of remote sensing images, and can effectively and accurately evaluate the quality of remote sensing images. It is also consistent with subjective assessment and human perception.


2011 ◽  
Vol 271-273 ◽  
pp. 205-210
Author(s):  
Ying Zhao Ma ◽  
Wei Li Jiao ◽  
Wang Wei

Cloud is an important factor affecting the quality of optical remote sensing images. Automatically detecting the cloud cover of an image reduces the transmission of useless data and is of great significance for raising the proportion of useful data. This paper presents a method based on Landsat 5 data that automatically marks the location of cloud regions in each image, effectively calculates the cloud cover of each image, and removes useless remote sensing images.
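Once cloud regions have been marked, the cloud cover of an image is simply the fraction of pixels flagged as cloud, and scenes above some cutoff can be discarded. The sketch below shows that arithmetic only. The binary mask, the helper names, and the `max_cover=0.5` cutoff are assumptions for illustration, and the cloud detection step itself is not reproduced because the abstract does not describe it.

```python
def cloud_cover(mask):
    """Fraction of pixels flagged as cloud in a binary mask (1 = cloud, 0 = clear)."""
    total = sum(len(row) for row in mask)
    cloudy = sum(sum(row) for row in mask)
    return cloudy / total if total else 0.0

def keep_scene(mask, max_cover=0.5):
    """Keep a scene only if its cloud cover does not exceed max_cover (hypothetical cutoff)."""
    return cloud_cover(mask) <= max_cover

# Toy 3 x 4 cloud mask: 3 cloudy pixels out of 12.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
cover = cloud_cover(mask)  # 3 / 12 = 0.25
```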


2019 ◽  
Vol 11 (9) ◽  
pp. 1044 ◽  
Author(s):  
Wei Cui ◽  
Fei Wang ◽  
Xin He ◽  
Dongyou Zhang ◽  
Xuxiang Xu ◽  
...  

A comprehensive interpretation of remote sensing images involves not only remote sensing object recognition but also the recognition of spatial relations between objects. Especially in the case of different objects with the same spectrum, the spatial relationship can help interpret remote sensing objects more accurately. Compared with traditional remote sensing object recognition methods, deep learning has the advantages of high accuracy and strong generalizability regarding scene classification and semantic segmentation. However, it is difficult to recognize remote sensing objects and their spatial relationships simultaneously and end-to-end by relying only on existing deep learning networks. To address this problem, we propose a multi-scale remote sensing image interpretation network, called the MSRIN. The architecture of the MSRIN is a parallel deep neural network based on a fully convolutional network (FCN), a U-Net, and a long short-term memory network (LSTM). The MSRIN recognizes remote sensing objects and their spatial relationships through three processes. First, the MSRIN defines a multi-scale remote sensing image caption strategy and simultaneously segments the same image using the FCN and U-Net on different spatial scales so that a two-scale hierarchy is formed. The outputs of the FCN and U-Net are masked to obtain the locations and boundaries of remote sensing objects. Second, using an attention-based LSTM, the remote sensing image captions include the remote sensing objects (nouns) and their spatial relationships described in natural language. Finally, we designed a remote sensing object recognition and correction mechanism that builds the relationship between nouns in captions and object mask graphs using an attention weight matrix, transferring the spatial relationships from captions to object mask graphs.
In other words, the MSRIN simultaneously realizes the semantic segmentation of the remote sensing objects and their spatial relationship identification end-to-end. Experimental results demonstrated that the matching rate between samples and the mask graph increased by 67.37 percentage points, and the matching rate between nouns and the mask graph increased by 41.78 percentage points compared to before correction. The proposed MSRIN has achieved remarkable results.


2020 ◽  
Vol 12 (21) ◽  
pp. 3547 ◽  
Author(s):  
Yuanyuan Ren ◽  
Xianfeng Zhang ◽  
Yongjian Ma ◽  
Qiyuan Yang ◽  
Chuanjian Wang ◽  
...  

Segmenting remote sensing images with imbalanced samples has long been an important issue. Typically, a high-resolution remote sensing image is characterized by high spatial resolution and low spectral resolution, complex large-scale land covers, small class differences for some land covers, vague foreground, and an imbalanced distribution of samples. However, traditional machine learning algorithms have limitations in deep image feature extraction and in dealing with sample imbalance. In this paper, we propose an improved version of the fully convolutional neural network DeepLab V3+, with a loss-function-based solution to sample imbalance. In addition, we select Sentinel-2 remote sensing images covering Yuli County, Bayingolin Mongol Autonomous Prefecture, Xinjiang Uygur Autonomous Region, China as data sources, and build a typical regional image dataset by data augmentation. The experimental results show that the improved DeepLab V3+ model can not only utilize the spectral information of high-resolution remote sensing images, but also take their rich spatial information into account. The classification accuracy of the proposed method on the test dataset reaches 97.97%, the mean Intersection-over-Union reaches 87.74%, and the Kappa coefficient reaches 0.9587. The work provides methodological guidance for sample imbalance correction, and the established data resource can serve as a reference for future studies.
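For readers unfamiliar with the two reported metrics, the sketch below computes mean Intersection-over-Union and Cohen's Kappa from a confusion matrix using their standard definitions. The 2 x 2 matrix is invented toy data, not the paper's results, and the function names are chosen for this example.

```python
def mean_iou(cm):
    """Mean Intersection-over-Union from a square confusion matrix.

    cm[t][p] counts pixels of true class t predicted as class p.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted class c, truly otherwise
        denom = tp + fp + fn
        if denom:                                   # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious)

def cohen_kappa(cm):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    expected = sum(
        sum(cm[i]) * sum(cm[r][i] for r in range(n)) for i in range(n)
    ) / total ** 2
    return (observed - expected) / (1 - expected)

# Toy confusion matrix for two classes (rows: truth, columns: prediction).
cm = [[40, 10],
      [5, 45]]
miou = mean_iou(cm)      # (40/55 + 45/60) / 2
kappa = cohen_kappa(cm)  # (0.85 - 0.5) / (1 - 0.5)
```

Because Kappa discounts agreement expected by chance, it is a useful companion to overall accuracy when class frequencies are imbalanced, which is the setting this abstract addresses.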


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 495
Author(s):  
Liang Jin ◽  
Guodong Liu

Compared with ordinary images, remote sensing images each contain many kinds of objects with large scale changes and provide more details. Ships are typical objects in remote sensing images, and ship detection plays an essential role in the field of remote sensing. With the rapid development of deep learning, remote sensing image detection methods based on convolutional neural networks (CNNs) have come to occupy a key position. In remote sensing images, objects are closely arranged and small-scale objects account for a large proportion of them. In addition, the convolution layers in a CNN lack ample context information, leading to low detection accuracy on remote sensing images. To improve detection accuracy while keeping real-time detection speed, this paper proposes an efficient ship detection algorithm for remote sensing images based on an improved SSD. First, we add a feature fusion module to the shallow feature layers to refine the feature extraction ability for small objects. Then, we add a Squeeze-and-Excitation Network (SE) module to each feature layer, introducing an attention mechanism into the network. The experimental results on the Synthetic Aperture Radar ship detection dataset (SSDD) show that the mAP reaches 94.41% and the average detection speed is 31 FPS. Compared with SSD and other representative object detection algorithms, the improved algorithm performs better in detection accuracy and can achieve real-time detection.
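The channel attention that an SE module adds can be sketched in a few lines. The version below is a generic squeeze-and-excitation block written with plain Python lists, not the authors' implementation: biases are omitted, and the weights `w1` and `w2`, the tiny feature map, and the function name `se_block` are all assumptions made for illustration.

```python
import math

def se_block(fmap, w1, w2):
    """Apply a bias-free squeeze-and-excitation block.

    fmap: feature map as nested lists, channels x height x width.
    w1:   bottleneck weights, shape (reduced_channels x channels).
    w2:   expansion weights, shape (channels x reduced_channels).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    h = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in w1]
    # Excitation, step 2: expand to one gate per channel, squashed by a sigmoid.
    s = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, h)))) for row in w2]
    # Scale: multiply every value of a channel by that channel's gate.
    out = [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, s)]
    return out, s

# Toy input: two 2 x 2 channels, reduced to one hidden unit and expanded back.
fmap = [[[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]
w2 = [[0.0], [2.0]]
out, gates = se_block(fmap, w1, w2)
```

In this toy case the second channel receives a gate close to 1 and the first a gate of 0.5, so the block emphasizes one channel over the other based on globally pooled context.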


2021 ◽  
Vol 13 (13) ◽  
pp. 2578
Author(s):  
Samir Touzani ◽  
Jessica Granderson

Advances in machine learning and computer vision, combined with increased access to unstructured data (e.g., images and text), have created an opportunity for automated extraction of building characteristics, cost-effectively and at scale. These characteristics are relevant to a variety of urban and energy applications, yet are time consuming and costly to acquire with today’s manual methods. Several recent research studies have shown that, in comparison to more traditional methods based on a feature engineering approach, an end-to-end learning approach based on deep learning algorithms significantly improves the accuracy of automatic building footprint extraction from remote sensing images. However, these studies used limited benchmark datasets that have been carefully curated and labeled. How the accuracy of these deep learning-based approaches holds up when using less curated training data has not received enough attention. The aim of this work is to leverage openly available data to automatically generate a larger training dataset with more variability in terms of regions and types of cities, which can be used to build more accurate deep learning models. In contrast to most benchmark datasets, the gathered data have not been manually curated. Thus, the training dataset is not perfectly clean in terms of remote sensing images exactly matching the ground truth building footprints. A workflow that includes data pre-processing, deep learning semantic segmentation modeling, and results post-processing is introduced and applied to a dataset that includes remote sensing images from 15 cities and five counties from various regions of the USA, which include 8,607,677 buildings. The accuracy of the proposed approach was measured on an out-of-sample testing dataset corresponding to 364,000 buildings from three USA cities. The results compared favorably to those obtained from Microsoft’s recently released US building footprint dataset.

