Can we Localize an AV from a Single Image? Deep-Geometric 6 DoF Localization in Topo-metric Maps

Author(s): Punarjay Chakravarty, Tom Roussel, Gaurav Pandey, Tinne Tuytelaars

Abstract We describe a Deep-Geometric Localizer that estimates the full six degrees-of-freedom (6DoF) global pose of a camera from a single image in a previously mapped environment. Our map is topo-metric, with discrete topological nodes whose 6DoF poses are known. Each topo-node also comprises a set of points whose 2D features and 3D locations are stored as part of the mapping process. For the mapping phase, we utilise a stereo camera and a regular stereo visual SLAM pipeline. During the localization phase, we take a single camera image, localize it to a topological node using deep learning, and apply a geometric algorithm (Perspective-n-Point, PnP) to the matched 2D features and their 3D positions in the topo-metric map to determine the full 6DoF globally consistent pose of the camera. Our method decouples the mapping and localization algorithms and sensors (stereo for mapping, monocular for localization), allowing accurate 6DoF pose estimation in a previously mapped environment with a single camera. With results in simulated and real environments, our hybrid algorithm is particularly useful for autonomous vehicles (AVs) and shuttles that repeatedly traverse the same route.
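As a concrete illustration of the localization phase described in this abstract, the hedged Python sketch below recovers a 6DoF camera pose from matched 2D image features and their stored 3D map positions using OpenCV's PnP-with-RANSAC solver. The function and variable names (localize_against_topo_node, map_points_3d, K) are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the geometric step: 6DoF pose from 2D-3D matches via PnP.
import numpy as np
import cv2

def localize_against_topo_node(map_points_3d, image_points_2d, K):
    """Estimate camera pose from matched 2D features (Nx2) and their
    3D positions stored in the topo-node (Nx3), given intrinsics K."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(map_points_3d, dtype=np.float64),    # 3D points from the map
        np.asarray(image_points_2d, dtype=np.float64),  # matched image features
        K,                                              # 3x3 camera intrinsics
        distCoeffs=None,
        reprojectionError=3.0,   # inlier threshold in pixels (assumed value)
        flags=cv2.SOLVEPNP_ITERATIVE,
    )
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return R, tvec, inliers      # camera pose relative to the map frame
```

RANSAC matters here because feature matching against a stored topo-node inevitably produces some wrong 2D-3D associations.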

Author(s): Karl Ludwig Fetzer, Sergey G. Nersesov, Hashem Ashrafiuon

Abstract In this paper, the authors derive backstepping control laws for tracking a time-based reference trajectory with a 3D model of an autonomous vehicle that has two degrees of underactuation. Tracking all six degrees of freedom is made possible by a transformation that reduces the order of the error dynamics. Stability of the resulting error dynamics is proven analytically, and the approach is demonstrated in simulations.
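To make the backstepping structure named here concrete, the following is a generic, hedged sketch for a double integrator tracking a time-based reference; the paper's 3D underactuated vehicle model and order-reducing transformation are not reproduced, and the gains k1, k2 are arbitrary assumptions.

```python
# Generic backstepping for x1_dot = x2, x2_dot = u tracking x_ref(t) = sin(t).
import numpy as np

k1, k2 = 2.0, 2.0  # positive gains (assumed values)

def reference(t):
    # reference position, velocity, acceleration
    return np.sin(t), np.cos(t), -np.sin(t)

def backstepping_control(x1, x2, t):
    xr, xr_d, xr_dd = reference(t)
    e1 = x1 - xr                        # tracking error
    alpha = -k1 * e1 + xr_d             # virtual control for x2
    z2 = x2 - alpha                     # second backstepping error
    alpha_dot = -k1 * (x2 - xr_d) + xr_dd
    # Yields V_dot = -k1*e1**2 - k2*z2**2 for V = (e1**2 + z2**2)/2.
    return -k2 * z2 - e1 + alpha_dot

# Euler simulation of the closed loop.
x1, x2, dt = 0.0, 0.0, 1e-3
for step in range(10000):
    t = step * dt
    u = backstepping_control(x1, x2, t)
    x1, x2 = x1 + dt * x2, x2 + dt * u
print(f"tracking error at t=10s: {x1 - np.sin(10.0):+.4f}")
```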


2020, Vol 10 (16), pp. 5442
Author(s): Ryo Hachiuma, Hideo Saito

This paper presents a method for estimating the six Degrees of Freedom (6DoF) pose of texture-less, primitive-shaped objects from depth images. Since conventional methods for object pose estimation require rich texture or distinctive geometric features on the target objects, they are not suitable for texture-less, geometrically simple objects. Instead, the pose of a primitive-shaped object can be recovered by estimating the parameters that represent its shape; however, existing parametric methods explicitly limit the types of primitive shapes they can estimate. We employ superquadrics as a primitive shape representation that can express various types of primitive shapes with only a few parameters. In order to estimate the superquadric parameters of a primitive-shaped object, the object's point cloud must be segmented from the depth image, and parameter estimation is known to be sensitive to the outliers caused by mis-segmentation. Therefore, we propose a novel estimation method for superquadric parameters that is robust to such outliers. For the experiments, we constructed a dataset in which a person grasps and moves primitive-shaped objects. The experimental results show that our estimation method outperformed three conventional methods and the baseline method.
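The sketch below shows one standard way to make superquadric fitting robust to segmentation outliers: minimize the classic inside-outside residual under a robust loss so outlier points are down-weighted. This is a generic formulation, not the paper's estimator; for brevity the object pose is assumed to be already normalized, and the bounds and loss choice are assumptions.

```python
# Robust superquadric fitting with a robust loss (soft L1) on the
# inside-outside residual; outliers from mis-segmentation get down-weighted.
import numpy as np
from scipy.optimize import least_squares

def inside_outside(points, a1, a2, a3, e1, e2):
    """Superquadric implicit function F(x, y, z); F = 1 on the surface."""
    x, y, z = np.abs(points.T) + 1e-9          # symmetry + numerical safety
    f_xy = (x / a1) ** (2.0 / e2) + (y / a2) ** (2.0 / e2)
    return f_xy ** (e2 / e1) + (z / a3) ** (2.0 / e1)

def residuals(params, points):
    a1, a2, a3, e1, e2 = params
    F = inside_outside(points, a1, a2, a3, e1, e2)
    # Classic size-weighted radial residual (Solina-style formulation).
    return np.sqrt(a1 * a2 * a3) * (F ** (e1 / 2.0) - 1.0)

def fit_superquadric(points):
    x0 = np.r_[(points.max(axis=0) - points.min(axis=0)) / 2.0, 1.0, 1.0]
    x0[3:] = 1.0                                # start from an ellipsoid
    return least_squares(
        residuals, x0, args=(points,),
        bounds=([1e-3] * 3 + [0.1, 0.1], [np.inf] * 3 + [2.0, 2.0]),
        loss="soft_l1",                         # robustness to outliers
    )
```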


2005, Vol 23 (1), pp. 74-76
Author(s): Li Zhongke, Wang Yong, Qin Yongyuan, Lu Peijun

2019, Vol 19 (19), pp. 8824-8831
Author(s): Wouter Jansen, Dennis Laurijssen, Walter Daems, Jan Steckel

Author(s): Pascal Fua, Vincent Lepetit

Mixed Reality applications require accurate knowledge of the relative positions of the camera and the scene. When either of them moves, this means keeping track in real time of all six degrees of freedom that define the camera position and orientation relative to the scene or, equivalently, the 3D displacement of an object relative to the camera. Many technologies have been tried to achieve this goal, but computer vision is the only one with the potential to yield non-invasive, accurate, and low-cost solutions, provided one is willing to invest the effort required to develop sufficiently robust algorithms. In this chapter, we therefore discuss some of the most promising approaches, their strengths, and their weaknesses.


2017, Vol 17 (1), pp. 151-159
Author(s): Dennis Laurijssen, Steven Truijen, Wim Saeys, Walter Daems, Jan Steckel

Sensors, 2021, Vol 21 (23), pp. 8112
Author(s): Xudong Lv, Shuo Wang, Dong Ye

As an essential procedure of data fusion, LiDAR-camera calibration is critical for autonomous vehicles and robot navigation. Most calibration methods require laborious manual work, complicated environmental settings, and specific calibration targets, while targetless methods rely on complex optimization workflows that are time-consuming and require prior information. Convolutional neural networks (CNNs) can regress the six degrees of freedom (6-DOF) extrinsic parameters from raw LiDAR and image data. However, these CNN-based methods merely learn representations of the projected LiDAR data and the image, ignoring the correspondences at different locations; their performance is consequently unsatisfactory and worse than that of non-CNN methods. In this paper, we propose a novel CNN-based LiDAR-camera extrinsic calibration algorithm named CFNet. We first introduce a correlation layer to provide explicit matching capabilities. We then define calibration flow, which represents the deviation of the initial projection from the ground truth. Instead of directly predicting the extrinsic parameters, we utilize CFNet to predict the calibration flow. The efficient Perspective-n-Point (EPnP) algorithm within the RANdom SAmple Consensus (RANSAC) scheme is then applied to estimate the extrinsic parameters from the 2D-3D correspondences constructed by the calibration flow. Because it exploits this geometric information, our proposed method outperforms state-of-the-art CNN-based methods on the KITTI datasets. Furthermore, we also tested the flexibility of our approach on the KITTI360 datasets.
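The final geometric step named in this abstract, EPnP inside a RANSAC scheme, maps directly onto OpenCV's solver, as the hedged sketch below shows. The 2D-3D correspondences would come from the predicted calibration flow in the paper; here they are simply function arguments, and the names (extrinsics_from_correspondences, reprojection threshold) are assumptions, not the CFNet code.

```python
# Extrinsic estimation from 2D-3D correspondences with EPnP + RANSAC.
import numpy as np
import cv2

def extrinsics_from_correspondences(lidar_points_3d, image_points_2d, K):
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        lidar_points_3d.astype(np.float64),   # Nx3 LiDAR points
        image_points_2d.astype(np.float64),   # Nx2 pixels from calibration flow
        K, distCoeffs=None,
        flags=cv2.SOLVEPNP_EPNP,              # the EPnP solver named above
        reprojectionError=2.0,                # assumed inlier threshold (px)
    )
    if not ok:
        raise RuntimeError("PnP failed: too few consistent correspondences")
    R, _ = cv2.Rodrigues(rvec)
    T = np.eye(4)                             # 4x4 LiDAR-to-camera extrinsics
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T, inliers
```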


2017, Vol 6 (2)
Author(s): Anko Börner, Dirk Baumbach, Maximilian Buder, Andre Choinowski, Ines Ernst, ...

Abstract Ego localization is an important prerequisite for several scientific, commercial, and statutory tasks. Only by knowing one's own position can guidance be provided, inspections be executed, and autonomous vehicles be operated. Localization becomes challenging if satellite-based navigation systems are not available or their data quality is insufficient. To overcome this problem, a team at the German Aerospace Center (DLR) developed a multi-sensor system modeled on the human head and its navigation sensors – the eyes and the vestibular system. This system, called the integrated positioning system (IPS), contains a stereo camera and an inertial measurement unit for determining an ego pose in six degrees of freedom in a local coordinate system. IPS operates in real time and can be applied in indoor and outdoor scenarios without any external reference or prior knowledge. In this paper, the system and its key hardware and software components are introduced. The main issues in the development of such complex multi-sensor measurement systems are identified and discussed, and the performance of the technology is demonstrated. The developer team started from scratch and is currently transferring the technology into a commercial product. The paper finishes with an outlook.
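To illustrate the loosely-coupled idea behind a stereo-plus-IMU system of this kind, here is a deliberately simplified, hedged sketch: the pose is dead-reckoned from IMU measurements between frames and pulled toward the stereo visual-odometry pose when one arrives. Real systems, IPS included, use far more careful filtering; the class and parameter names are assumptions.

```python
# Simplified loosely-coupled stereo VO + IMU pose propagation (sketch only).
import numpy as np

def skew(w):
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

class LooselyCoupledFusion:
    def __init__(self):
        self.R = np.eye(3)                 # orientation, body -> local frame
        self.p = np.zeros(3)               # position in local frame
        self.v = np.zeros(3)               # velocity in local frame
        self.g = np.array([0.0, 0.0, -9.81])

    def propagate_imu(self, gyro, accel, dt):
        # First-order integration of rotation, then double-integrated accel.
        self.R = self.R @ (np.eye(3) + skew(gyro) * dt)
        a_local = self.R @ accel + self.g
        self.p += self.v * dt + 0.5 * a_local * dt * dt
        self.v += a_local * dt

    def correct_with_stereo(self, R_vo, p_vo, blend=0.9):
        # Pull the drift-prone dead-reckoned pose toward the stereo VO pose;
        # the blend factor is an arbitrary stand-in for a proper filter gain.
        self.p = blend * p_vo + (1.0 - blend) * self.p
        self.R = R_vo
```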

