Estimation of the Fundamental Matrix and Feature Matching of an Incomplete Image based on Deep Learning

Abstract. Matching images with large changes in viewpoint and viewing direction, and hence large perspective differences, remains a very challenging problem. Affine shape estimation, orientation assignment and feature description algorithms based on detected hand-crafted features have been shown to be error-prone. In this paper, affine shape estimation, orientation assignment and description of local features are achieved through deep learning. These three modules are trained with loss functions that optimize the matching performance of input patch pairs. The trained descriptors are first evaluated on the Brown dataset (Brown et al., 2011), a standard descriptor performance benchmark. The whole pipeline is then tested on images of small blocks acquired with an aerial penta camera, to compute image orientation. The results show that learned features perform significantly better than alternatives based on hand-crafted features.

LSV-ANet: Deep Learning on Local Structure Visualization for Feature Matching

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2021.3062498 ◽

2021 ◽

pp. 1-18

Author(s):

Jiaxuan Chen ◽

Shuang Chen ◽

Xiaoxian Chen ◽

Yang Yang ◽

Linjie Xing ◽

...

Keyword(s):

Deep Learning ◽

Local Structure ◽

Feature Matching

Deep-learning with synthetic data enables automated picking of cryo-EM particle images of biological macromolecules

Bioinformatics ◽

10.1093/bioinformatics/btz728 ◽

2019 ◽

Cited By ~ 2

Author(s):

Ruijie Yao ◽

Jiaqiang Qian ◽

Qiang Huang

Keyword(s):

Deep Learning ◽

Single Particle ◽

Feature Matching ◽

Synthetic Data ◽

Supplementary Information ◽

Particle Analysis ◽

Biological Macromolecules ◽

Low Contrast ◽

3D Structures ◽

Particle Images

Abstract. Motivation: Single-particle cryo-electron microscopy (cryo-EM) has become a powerful technique for determining 3D structures of biological macromolecules at near-atomic resolution. However, this approach requires picking huge numbers of macromolecular particle images from thousands of low-contrast, highly noisy electron micrographs. Although machine-learning methods have been developed to remove this bottleneck, universal methods that can automatically pick the noisy cryo-EM particles of various macromolecules are still lacking. Results: Here, we present a deep-learning segmentation model, called PARSED (PARticle SEgmentation Detector), that employs fully convolutional networks trained with synthetic data of known 3D structures. Without using any experimental information, PARSED can automatically segment the cryo-EM particles in a whole micrograph at a time, enabling faster particle picking than previous template/feature-matching and particle-classification methods. Applications to six large public cryo-EM datasets clearly validated its universal ability to pick macromolecular particles of various sizes. Thus, our deep-learning method could break the particle-picking bottleneck in single-particle analysis and thereby accelerate high-resolution structure determination by cryo-EM. Availability and implementation: The PARSED package and user manual for noncommercial use are available as Supplementary Material (in the compressed file: parsed_v1.zip). Supplementary information: Supplementary data are available at Bioinformatics online.

A Deep Learning-Based Semantic Filter for RANSAC-Based Fundamental Matrix Calculation and the ORB-SLAM System

IEEE Access ◽

10.1109/access.2019.2962268 ◽

2020 ◽

Vol 8 ◽

pp. 3212-3223

Author(s):

Chunyan Shao ◽

Chi Zhang ◽

Zaojun Fang ◽

Guilin Yang

Keyword(s):

Deep Learning ◽

Fundamental Matrix ◽

Matrix Calculation

Image Feature Matching Based on Deep Learning

2018 IEEE 4th International Conference on Computer and Communications (ICCC) ◽

10.1109/compcomm.2018.8780936 ◽

2018 ◽

Cited By ~ 2

Author(s):

Yinyang Liu ◽

Xiaobin Xu ◽

Feixiang Li

Keyword(s):

Deep Learning ◽

Feature Matching ◽

Image Feature

Local Deep Descriptor for Remote Sensing Image Feature Matching

Remote Sensing ◽

10.3390/rs11040430 ◽

2019 ◽

Vol 11 (4) ◽

pp. 430 ◽

Cited By ~ 4

Author(s):

Yunyun Dong ◽

Weili Jiao ◽

Tengfei Long ◽

Lanfa Liu ◽

Guojin He ◽

...

Keyword(s):

Remote Sensing ◽

Computer Vision ◽

Deep Learning ◽

Feature Matching ◽

Remote Sensing Image ◽

Image Feature ◽

Training Dataset ◽

Feature Descriptor ◽

Remote Sensing Images

Feature matching via local descriptors is one of the most fundamental problems in many computer vision tasks, as well as in the remote sensing image processing community. For example, in feature-based remote sensing image registration, feature matching is a vital step that determines the quality of the transform model. In turn, the quality of the feature descriptor directly determines the matching result. At present, the most commonly used descriptors are hand-crafted, relying on the designer's expertise or intuition. However, it is hard for them to cover all the different cases, especially for remote sensing images with nonlinear grayscale deformation. Recently, deep learning has shown explosive growth and has improved the performance of tasks in various fields, especially in the computer vision community. Here, we created remote sensing image training patch samples, named Invar-Dataset, in a novel and automatic way, then trained a deep convolutional neural network, named DescNet, to generate a robust feature descriptor for feature matching. A dedicated experiment was carried out to illustrate that our training dataset is more helpful for training a network to generate a good feature descriptor. A qualitative experiment was then performed to show that the feature descriptor vector learned by DescNet can be used to successfully register remote sensing images with large grayscale differences. A quantitative experiment was then carried out to illustrate that the feature vector generated by DescNet yields more matched points than the hand-crafted Scale Invariant Feature Transform (SIFT) descriptor and other networks. On average, the number of matched points acquired by DescNet was almost twice that acquired by other methods. Finally, we analyzed the advantages of our training dataset Invar-Dataset and of DescNet, and outlined possible developments in training deep descriptor networks.
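To make the role of the descriptor in matching concrete, the following is a minimal sketch of descriptor matching by nearest neighbour with a ratio test and a cross-check. It is not the DescNet pipeline from the abstract above; the function name `match_descriptors`, the `ratio` parameter, and its default of 0.8 are assumptions chosen for illustration, and the descriptors can come from any source (hand-crafted or learned).

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match rows of desc_a to rows of desc_b (hypothetical helper).

    desc_a: (N, D) array, desc_b: (M, D) array with M >= 2.
    Returns a list of (i, j) index pairs.
    """
    # Pairwise Euclidean distances, shape (N, M).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = order[0], order[1]
        # Ratio test: keep the match only if the best candidate is
        # clearly closer than the runner-up.
        if row[best] >= ratio * row[second]:
            continue
        # Cross-check: i must also be the nearest neighbour of `best`
        # when matching in the opposite direction.
        if np.argmin(dists[:, best]) == i:
            matches.append((i, int(best)))
    return matches
```

The more discriminative the descriptor, the larger the gap between the best and second-best distances, so more correct matches survive the ratio test; this is one plausible reading of why descriptor quality drives the number of matched points.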

Unsupervised Deep Learning-Based RGB-D Visual Odometry

Applied Sciences ◽

10.3390/app10165426 ◽

2020 ◽

Vol 10 (16) ◽

pp. 5426 ◽

Cited By ~ 1

Author(s):

Qiang Liu ◽

Haidong Zhang ◽

Yiming Xu ◽

Li Wang

Keyword(s):

Deep Learning ◽

Feature Matching ◽

Ground Truth ◽

Visual Odometry ◽

Depth Images ◽

Network Training ◽

Stream Structure ◽

Unsupervised Deep Learning ◽

Rgb Images ◽

Learning Frameworks

Recently, deep learning frameworks have been deployed in visual odometry systems and have achieved results comparable to those of traditional feature-matching-based systems. However, most deep learning-based frameworks inevitably need labeled data as ground truth for training. In addition, monocular odometry systems are incapable of restoring absolute scale, so external or prior information has to be introduced for scale recovery. To solve these problems, we present a novel deep learning-based RGB-D visual odometry system. Our two main contributions are: (i) during network training and pose estimation, the depth images are fed into the network to form a dual-stream structure with the RGB images, and a dual-stream deep neural network is proposed; (ii) the system adopts an unsupervised end-to-end training method, so the labor-intensive data labeling task is not required. We have tested our system on the KITTI dataset, and the results show that the proposed RGB-D Visual Odometry (VO) system has obvious advantages over other state-of-the-art systems in terms of both translation and rotation errors.

3D Reconstruction of a Complex Grid Structure Combining UAS Images and Deep Learning

Remote Sensing ◽

10.3390/rs12193128 ◽

2020 ◽

Vol 12 (19) ◽

pp. 3128

Author(s):

Vladimir A. Knyaz ◽

Vladimir V. Kniaz ◽

Fabio Remondino ◽

Sergey Y. Zheltov ◽

Armin Gruen

Keyword(s):

Neural Network ◽

Deep Learning ◽

3D Reconstruction ◽

Network Model ◽

Neural Network Model ◽

Feature Matching ◽

Semantic Segmentation ◽

Unmanned Aerial Systems ◽

Grid Structure ◽

Large Size

The latest advances in the technical characteristics of unmanned aerial systems (UAS) and their onboard sensors have opened the way for smart flying vehicles that exploit new application areas and perform missions that previously seemed impossible. One of these complicated tasks is the 3D reconstruction and monitoring of large, complex, grid-like structures such as radio or television towers. Although an image-based 3D survey contains a lot of visual and geometrical information useful for drawing preliminary conclusions on construction health, standard photogrammetric processing fails to perform dense and robust 3D reconstruction of complex, large mesh structures. The main problem with such objects is their repeated, self-occluding, similar elements, which result in false feature matches. This paper presents a method developed for accurate Multi-View Stereo (MVS) dense 3D reconstruction of the Shukhov Radio Tower in Moscow (Russia) based on a UAS photogrammetric survey. A key element of the successful image-based 3D reconstruction is the WireNetV2 neural network model, developed for robust automatic semantic segmentation of wire structures. The proposed neural network provides high matching quality thanks to accurate masking of the tower elements. The main contributions of the paper are: (1) a deep learning WireNetV2 convolutional neural network model that outperforms state-of-the-art semantic segmentation results on a dataset containing images of grid structures of complicated topology with repeated elements, holes and self-occlusions, thus providing robust grid structure masking and, as a result, accurate 3D reconstruction; (2) an advanced image-based pipeline aided by a neural network for the accurate 3D reconstruction of large, complex grid structures, evaluated on UAS imagery of the Shukhov Radio Tower in Moscow.

LFM: A Lightweight LCD Algorithm Based on Feature Matching between Similar Key Frames

Sensors ◽

10.3390/s21134499 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4499

Author(s):

Zuojun Zhu ◽

Xiangrong Xu ◽

Xuefei Liu ◽

Yanglin Jiang

Keyword(s):

Deep Learning ◽

Feature Matching ◽

Binary Classification ◽

Recall Rate ◽

Current Position ◽

Loop Closure ◽

Localization And Mapping ◽

Similar Images ◽

Key Frames ◽

Target Detection Task

Loop Closure Detection (LCD) is an important technique for improving the accuracy of Simultaneous Localization and Mapping (SLAM). In this paper, we propose a deep learning LCD algorithm based on binary classification and feature matching between similar images, which greatly improves the accuracy of LCD. A novel lightweight convolutional neural network (CNN) is proposed and applied to the target detection task on key frames. On this basis, the key frames are binary-classified according to their labels. Finally, similar frames are input into an improved lightweight Transformer-based feature matching network to judge whether the current position is a loop closure. The experimental results show that, compared with the traditional method, LFM-LCD achieves higher accuracy and recall in the LCD task of indoor SLAM while keeping the number of parameters and the amount of computation in check. The research in this paper provides a new direction for LCD in robotic SLAM, which will be further improved with the development of deep learning.

An Optimal Algorithm for Estimating Fundamental Matrix by Removing the Outliers

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.1435 ◽

2014 ◽

Vol 989-994 ◽

pp. 1435-1440

Author(s):

Li Qian Guo ◽

Jun Bi Liao ◽

Fang Ping Zhong ◽

Xiang Yu ◽

Jun Tao

Keyword(s):

Fundamental Matrix ◽

Stereo Matching ◽

Feature Matching ◽

Optimal Algorithm ◽

Estimation Method ◽

Accurate Estimation ◽

Matrix Estimation ◽

Concentration Problem ◽

Random Samples ◽

Partial Concentration

The accurate estimation of the fundamental matrix is one of the most important steps in many computer vision applications such as 3D reconstruction, camera self-calibration, motion estimation and stereo matching. In this paper, an optimal fundamental matrix estimation method based on removing outlying match points is proposed. First, the initial mismatches are reduced by a bidirectional SIFT feature matching algorithm. Second, the partial concentration problem of random samples is solved by a bucket segmentation method. To obtain robustness, the fundamental matrix is estimated in a RANSAC framework according to the principle of minimizing the geometric distance. Finally, an iterative process using the Levenberg-Marquardt (LM) algorithm improves the accuracy of the fundamental matrix. Experimental results show that the proposed method reduces the interference of outliers more effectively and improves the estimation precision of the fundamental matrix.
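The RANSAC stage described in this abstract can be illustrated with a minimal NumPy sketch. It is an assumption-laden simplification and not the authors' method: it uses the normalized eight-point algorithm as the minimal solver, uniform random sampling instead of bucket segmentation, the Sampson distance as the geometric error, and a final linear refit on the inliers in place of LM refinement. All function names, the `threshold` of 1.0 (in squared pixels), and the iteration count are hypothetical.

```python
import numpy as np

def _normalize(pts):
    """Hartley normalization: zero mean, mean distance sqrt(2) from origin."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(x1, x2):
    """Estimate F with x2^T F x1 = 0 from N >= 8 matches (N x 2 arrays)."""
    p1, T1 = _normalize(x1)
    p2, T2 = _normalize(x2)
    # Each match contributes one row of the linear system A f = 0,
    # where f is F flattened row by row.
    A = np.column_stack([p2[:, [0]] * p1, p2[:, [1]] * p1, p1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization.
    return T2.T @ F @ T1

def sampson_distance(F, x1, x2):
    """First-order approximation of the squared geometric error per match."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    Fx1 = h1 @ F.T    # row i is F x1_i
    Ftx2 = h2 @ F     # row i is F^T x2_i
    num = np.sum(h2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / den

def ransac_fundamental(x1, x2, threshold=1.0, iterations=500, seed=0):
    """Return (F, inlier_mask); sampling is uniform, not bucketed."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(x1), dtype=bool)
    for _ in range(iterations):
        sample = rng.choice(len(x1), size=8, replace=False)
        F = eight_point(x1[sample], x2[sample])
        inliers = sampson_distance(F, x1, x2) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 8:
        raise ValueError("too few inliers to refit the fundamental matrix")
    # Refit on all inliers of the best hypothesis.
    return eight_point(x1[best_inliers], x2[best_inliers]), best_inliers
```

Given two `(N, 2)` arrays of matched pixel coordinates, `ransac_fundamental(x1, x2)` returns the refitted matrix and a boolean inlier mask. A bucketed sampler would replace the `rng.choice` call by drawing points from distinct image cells, which is how the abstract addresses samples that cluster in one part of the image.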
