Optimal Feature Transport for Cross-View Image Geo-Localization

2020 ◽  
Vol 34 (07) ◽  
pp. 11990-11997 ◽  
Author(s):  
Yujiao Shi ◽  
Xin Yu ◽  
Liu Liu ◽  
Tong Zhang ◽  
Hongdong Li

This paper addresses the problem of cross-view image geo-localization, where the geographic location of a ground-level street-view query image is estimated by matching it against a large-scale aerial map (e.g., a high-resolution satellite image). State-of-the-art deep-learning-based methods tackle this problem as deep metric learning, which aims to learn global feature representations of the scene seen by the two different views. Although such deep metric learning methods obtain promising results, they fail to exploit a crucial cue for localization, namely the spatial layout of local features. Moreover, little attention has been paid to the obvious domain gap (between aerial view and ground view) in the context of cross-view localization. This paper proposes a novel Cross-View Feature Transport (CVFT) technique to explicitly establish cross-view domain transfer that facilitates feature alignment between ground and aerial images. Specifically, we implement CVFT as network layers that transport features from one domain to the other, leading to more meaningful feature similarity comparison. Our model is differentiable and can be learned end-to-end. Experiments on large-scale datasets demonstrate that our method remarkably boosts state-of-the-art cross-view localization performance: on the CVUSA dataset, for example, top-1 recall improves from 40.79% to 61.43%, and top-10 recall from 76.36% to 90.49%. We expect the key insight of the paper (i.e., explicitly handling the domain difference via feature transport) to prove useful for other similar problems in computer vision as well.
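
The paper implements feature transport as differentiable network layers. A minimal NumPy sketch of the underlying idea, assuming an entropy-regularized optimal transport plan computed with Sinkhorn iterations over a cosine-distance cost between local feature cells, is shown below; the function names, the uniform marginals and the cost choice are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=50):
    """Entropy-regularized optimal transport plan between two
    uniform distributions over feature cells (Sinkhorn iterations)."""
    n, m = cost.shape
    K = np.exp(-cost / eps)           # Gibbs kernel
    r = np.ones(n) / n                # uniform source marginal (assumed)
    c = np.ones(m) / m                # uniform target marginal (assumed)
    u, v = np.ones(n) / n, np.ones(m) / m
    for _ in range(n_iters):
        u = r / (K @ v)
        v = c / (K.T @ u)
    return np.diag(u) @ K @ np.diag(v)   # transport plan P

def transport_features(ground_feat, aerial_feat, eps=0.1):
    """Align aerial feature cells to the ground-view spatial layout.
    ground_feat, aerial_feat: (cells, channels) arrays of local features."""
    g = ground_feat / np.linalg.norm(ground_feat, axis=1, keepdims=True)
    a = aerial_feat / np.linalg.norm(aerial_feat, axis=1, keepdims=True)
    cost = 1.0 - g @ a.T                 # cosine distance between cells
    P = sinkhorn(cost, eps)
    # Each ground cell becomes a P-weighted mixture of aerial cells.
    return (P / P.sum(axis=1, keepdims=True)) @ aerial_feat
```

Because every operation above is differentiable, such a layer can sit inside a network and be trained end-to-end, which matches the property the abstract emphasizes.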

2020 ◽  
Vol 10 (2) ◽  
pp. 615 ◽  
Author(s):  
Tomas Iesmantas ◽  
Agne Paulauskaite-Taraseviciene ◽  
Kristina Sutiene

(1) Background: The segmentation of cell nuclei is an essential task in a wide range of biomedical studies and clinical practices. Fully automating this process remains a challenge due to intra- and internuclear variations across a wide range of tissue morphologies, as well as differences in staining protocols and imaging procedures. (2) Methods: A deep learning model with metric embeddings, such as contrastive loss and triplet loss with semi-hard negative mining, is proposed in order to accurately segment cell nuclei in a diverse set of microscopy images. The effectiveness of the proposed model was tested on a large-scale multi-tissue collection of microscopy image sets. (3) Results: The use of deep metric learning increased the overall segmentation performance by 3.12% in the average Dice similarity coefficient compared to no metric learning. The largest gain was observed when segmenting cell nuclei in H&E-stained images with a deep learning network and triplet loss with semi-hard negative mining. (4) Conclusion: We conclude that deep metric learning gives an additional boost to the overall learning process and consequently improves segmentation performance. Notably, the improvement ranges between approximately 0.13% and 22.31% in Dice coefficient for different types of images, compared to deep learning without metric learning.
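
As a rough illustration of the training signal described above, the following sketch computes a triplet loss with semi-hard negative mining over a batch of embeddings. It follows the common FaceNet-style definition of semi-hard negatives (farther than the positive, but within the margin); the batch-level mining strategy and margin value are assumptions, not the authors' exact code.

```python
import numpy as np

def semi_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Triplet loss with semi-hard negative mining on one batch.
    embeddings: (N, D) vectors; labels: (N,) class ids."""
    # Pairwise squared Euclidean distances.
    sq = np.sum(embeddings ** 2, axis=1)
    dist = np.maximum(sq[:, None] + sq[None, :]
                      - 2.0 * embeddings @ embeddings.T, 0.0)
    losses = []
    N = len(labels)
    for a in range(N):
        for p in range(N):
            if p == a or labels[p] != labels[a]:
                continue                     # keep only valid positives
            d_ap = dist[a, p]
            # Semi-hard negatives: farther than the positive but
            # still inside the margin band.
            neg = (labels != labels[a]) & (dist[a] > d_ap) \
                  & (dist[a] < d_ap + margin)
            if neg.any():
                d_an = dist[a][neg].min()    # hardest semi-hard negative
                losses.append(max(d_ap - d_an + margin, 0.0))
    return float(np.mean(losses)) if losses else 0.0
```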


Author(s):  
S. Bullinger ◽  
C. Bodensteiner ◽  
M. Arens

Abstract. The reconstruction of accurate three-dimensional environment models is one of the most fundamental goals in the field of photogrammetry. Since satellite images provide suitable properties for obtaining large-scale environment reconstructions, a variety of stereo-matching-based methods exist to reconstruct point clouds from satellite image pairs. Recently, a Structure from Motion (SfM) based approach has been proposed that makes it possible to reconstruct point clouds from multiple satellite images. In this work, we propose an extension of this SfM-based pipeline that allows us to reconstruct not only point clouds but also watertight meshes, including texture information. We provide a detailed description of several steps that are mandatory to exploit state-of-the-art mesh reconstruction algorithms in the context of satellite imagery. These include the decomposition of finite projective camera calibration matrices, a skew correction of the corresponding depth maps and input images, and the recovery of real-world depth maps from reparameterized depth values. The paper presents an extensive quantitative evaluation on multi-date satellite images, demonstrating that the proposed pipeline combined with current meshing algorithms outperforms state-of-the-art point cloud reconstruction algorithms in terms of completeness and median error. We make the source code of our pipeline publicly available.
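
One of the steps listed above, the decomposition of a finite projective camera calibration matrix, can be sketched with a standard RQ factorization (the textbook procedure for P = K[R | t], with sign fixing so that K has a positive diagonal); this is a generic reconstruction of that step, not necessarily the authors' implementation.

```python
import numpy as np
from scipy.linalg import rq

def decompose_camera(P):
    """Split a 3x4 finite projective camera matrix P = K [R | t]
    into intrinsics K, rotation R, translation t and camera center C."""
    M = P[:, :3]
    K, R = rq(M)                      # M = K @ R, K upper triangular
    # Force positive diagonal entries of K (RQ is sign-ambiguous).
    S = np.diag(np.sign(np.diag(K)))
    K, R = K @ S, S @ R               # S @ S = I, so K @ R is unchanged
    # If det(R) == -1, negate P beforehand; P is defined only up to scale.
    K /= K[2, 2]                      # normalize so K[2, 2] == 1
    # Camera center satisfies P @ [C; 1] = 0  =>  C = -M^{-1} p4.
    C = -np.linalg.solve(M, P[:, 3])
    t = -R @ C
    return K, R, t, C
```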


Author(s):  
Ali Salim Rasheed ◽  
Davood Zabihzadeh ◽  
Sumia Abdulhussien Razooqi Al-Obaidi

Metric learning algorithms aim to bring conceptually related data items closer together while keeping dissimilar ones at a distance. The most common approach to metric learning is based on the Mahalanobis method. Despite its success, this method is limited to learning a linear projection and also suffers from scalability issues with respect to both the dimensionality and the size of the input data. To address these problems, this paper presents a new scalable metric learning algorithm for multi-modal data. Our method learns an optimal metric for any feature set of the multi-modal data in an online fashion. We also combine the learned metrics with a novel Passive/Aggressive (PA)-based algorithm, which results in a higher convergence rate compared to state-of-the-art methods. To address scalability with respect to dimensionality, Dual Random Projection (DRP) is adopted in this paper. The proposed method is evaluated on several challenging machine vision datasets for image classification and Content-Based Information Retrieval (CBIR) tasks. The experimental results confirm that the proposed method significantly surpasses other state-of-the-art metric learning methods on most of these datasets in terms of both accuracy and efficiency.
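
A minimal sketch of a Passive/Aggressive-style online update for a Mahalanobis matrix is given below. The closed-form PA-I step size applied to a pairwise margin constraint, the threshold b and the PSD projection are standard ingredients assumed here for illustration; the paper's actual algorithm additionally involves Dual Random Projection and multi-modal metric combination.

```python
import numpy as np

def pa_metric_update(M, x1, x2, y, b=1.0, C=0.1):
    """One Passive-Aggressive (PA-I) update of a Mahalanobis matrix M.
    y = +1 if (x1, x2) are similar, -1 otherwise; b is a distance threshold."""
    z = x1 - x2
    d = z @ M @ z                          # squared Mahalanobis distance
    loss = max(0.0, 1.0 - y * (b - d))     # hinge on the margin constraint
    zz = z @ z
    if loss > 0.0 and zz > 0.0:
        tau = min(C, loss / zz ** 2)       # PA-I step; ||zz^T||_F^2 = ||z||^4
        M = M - tau * y * np.outer(z, z)   # stays passive when loss == 0
        # Project back onto the PSD cone to keep M a valid metric.
        w, V = np.linalg.eigh(M)
        M = (V * np.maximum(w, 0.0)) @ V.T
    return M
```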


2021 ◽  
Author(s):  
Shichao Hu ◽  
Beici Liang ◽  
Zhouxuan Chen ◽  
Xiao Lu ◽  
Ethan Zhao ◽  
...  

2020 ◽  
Vol 12 (16) ◽  
pp. 2603
Author(s):  
Jian Kang ◽  
Rubén Fernández-Beltrán ◽  
Zhen Ye ◽  
Xiaohua Tong ◽  
Pedram Ghamisi ◽  
...  

Deep metric learning has recently received special attention in the field of remote sensing (RS) scene characterization, owing to its prominent capabilities for modeling distances among RS images based on their semantic information. Most existing deep metric learning methods exploit pairwise and triplet losses to learn feature embeddings that preserve semantic similarity, which requires constructing image pairs and triplets from supervised information (e.g., class labels). However, generating such semantic annotations becomes prohibitively expensive in large-scale RS archives, which may eventually constrain the availability of sufficient training data for models of this kind. To address this issue, we reformulate the deep metric learning scheme in a semi-supervised manner to effectively characterize RS scenes. Specifically, we aim at learning metric spaces by utilizing the supervised information from a small number of labeled RS images and exploring the potential decision boundaries for massive sets of unlabeled aerial scenes. To reach this goal, a joint loss function, composed of a normalized softmax loss with margin and a high-rankness regularization term, is proposed, together with its corresponding optimization algorithm. The conducted experiments (including different state-of-the-art methods and two benchmark RS archives) validate the effectiveness of the proposed approach for RS image classification, clustering and retrieval tasks. The codes of this paper are publicly available.
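
The supervised term of the joint loss, a normalized softmax loss with margin, can be sketched as follows. The additive cosine margin on the true-class logit and the scale parameter follow common conventions for margin-based softmax losses and are assumptions here, not necessarily the exact variant used in the paper.

```python
import numpy as np

def normalized_softmax_margin_loss(features, weights, labels,
                                   margin=0.35, scale=30.0):
    """Normalized softmax loss with an additive cosine margin.
    features: (N, D) embeddings; weights: (C, D) class prototypes."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = f @ w.T                                   # cosine logits, (N, C)
    idx = np.arange(len(labels))
    cos[idx, labels] -= margin                      # penalize the true class
    logits = scale * cos
    # Numerically stable cross-entropy.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[idx, labels].mean()
```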


2021 ◽  
Author(s):  
Igor Soares ◽  
Fernando Camargo ◽  
Adriano Marques ◽  
Oliver Crook

Abstract. Genome engineering is undergoing unprecedented development and is now becoming widely available. To ensure responsible biotechnology innovation and to reduce the misuse of engineered DNA sequences, it is vital to develop tools to identify the lab-of-origin of engineered plasmids. Genetic engineering attribution (GEA), the ability to make sequence-lab associations, would support forensic experts in this process. Here, we propose a method, based on metric learning, that ranks the most likely labs-of-origin whilst simultaneously generating embeddings for plasmid sequences and labs. These embeddings can be used to perform various downstream tasks, such as clustering DNA sequences and labs, as well as using them as features in machine learning models. Our approach employs circular shift augmentation and is able to correctly rank the lab-of-origin 90% of the time within its top 10 predictions, outperforming all current state-of-the-art approaches. We also demonstrate that we can perform few-shot learning and obtain 76% top-10 accuracy using only 10% of the sequences; that is, we outperform the previous CNN approach using only one-tenth of the data. Finally, we demonstrate that we are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.

CCS Concepts: Information systems → Similarity measures; Learning to rank.
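
The circular shift augmentation mentioned above can be sketched in a few lines: since plasmids are circular DNA molecules, any rotation of the sequence string represents the same molecule, so rotations make cheap label-preserving training samples. The function name and string encoding are illustrative assumptions.

```python
import random

def circular_shift(sequence, shift=None):
    """Rotate a circular plasmid sequence by a (random) offset.
    Plasmids are circular DNA, so every rotation encodes the same molecule."""
    if shift is None:
        shift = random.randrange(len(sequence))
    return sequence[shift:] + sequence[:shift]

# Example: augment one plasmid into several equivalent training samples.
plasmid = "ATGCGTACCGGA"
augmented = [circular_shift(plasmid) for _ in range(4)]
```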


2020 ◽  
Author(s):  
Yuki Takashima ◽  
Ryoichi Takashima ◽  
Tetsuya Takiguchi ◽  
Yasuo Ariki

Impact ◽  
2019 ◽  
Vol 2019 (10) ◽  
pp. 90-92
Author(s):  
Kae Doki ◽  
Yuki Funabora ◽  
Shinji Doki

Every day we see an increasing number of robots employed in our day-to-day lives. They are working in factories, cleaning our houses and may soon be chauffeuring us around in vehicles. The affordability of drones has also come down, and it is now conceivable for almost anyone to own a sophisticated unmanned aerial vehicle (UAV). While fun to fly, these devices also represent powerful new tools for several industries. Any time an aerial view is needed, for planning, surveillance or surveying, for example, a UAV can be deployed. Further still, equipping these vehicles with an array of sensors, for climate research or mapping, increases their capability even more. This gives companies, governments and researchers a cheap and safe way to collect vast amounts of data and complete tasks in remote or dangerous areas that were once impossible to reach.

One area where UAVs are proving particularly useful is infrastructure inspection. In countries all over the world, large-scale infrastructure projects like dams and bridges are ageing and in need of upkeep. Identifying which ones, and exactly where they need patching, is a huge undertaking. Not only can this work be dangerous, requiring trained inspectors to climb these megaprojects, it is also incredibly time-consuming and costly. Enter the UAVs. With a fleet of specially equipped UAVs, and a small team piloting them and interpreting the data they bring back, the speed and safety of this work increase dramatically.

The promise of UAVs to overturn the infrastructure inspection process is enticing, but several obstacles remain. One is achieving the fine level of control and positioning required to navigate the robots around 3D structures for inspection. One can imagine that piloting a small UAV underneath a huge highway bridge without missing a single small crack is quite difficult, especially when the operators are safely on the ground hundreds of meters away. Knowing exactly where the vehicle is in space therefore becomes a critical variable. The job can be made even easier if a flight plan based on set waypoints can be pre-programmed and followed autonomously by the UAV. It is exactly this problem that Dr Kae Doki from the Department of Electrical Engineering at Aichi Institute of Technology and collaborators are focused on solving.

