Evaluation of four point cloud similarity measures for the use in autonomous driving

Measuring the similarity between point clouds is required in many areas. In autonomous driving, point clouds for 3D perception are estimated from camera images, but these estimates are error-prone. Furthermore, there is a lack of measures for quantifying their quality against ground truth. In this paper, we derive conditions that point cloud comparisons need to fulfill and evaluate accordingly the Chamfer distance, a lower bound of the Gromov-Wasserstein metric, and the ratio measure. We show that the ratio measure is not affected by erroneous points and therefore introduce the new measure “average ratio”. All measures are evaluated and compared using exemplary point clouds. We discuss their characteristics, advantages, and drawbacks with respect to interpretability, noise resistance, environmental representation, and computation.
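
As a rough illustration of one of the measures named above, the following sketch computes a symmetric Chamfer distance between two small point clouds. It assumes the variant that averages unsquared nearest-neighbour Euclidean distances in both directions; the paper may use a different normalization, and the function name and sample points are hypothetical.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty lists of 3D points.

    Assumption: mean of unsquared nearest-neighbour distances per direction.
    Other formulations use squared distances or sums instead of means.
    """
    if not a or not b:
        raise ValueError("both point clouds must be non-empty")

    def one_sided(src, dst):
        # Average distance from each source point to its closest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_sided(a, b) + one_sided(b, a)


# Hypothetical data: the estimate contains one erroneous point.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
estimate = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 3.0, 0.0)]

print(chamfer_distance(ground_truth, ground_truth))
print(chamfer_distance(ground_truth, estimate))
```

The brute-force search is quadratic in the number of points, so a spatial index such as a k-d tree would usually be preferred for realistic cloud sizes.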

Parallel Point Clouds: Hybrid Point Cloud Generation and 3D Model Enhancement via Virtual–Real Integration

Remote Sensing ◽

10.3390/rs13152868 ◽

2021 ◽

Vol 13 (15) ◽

pp. 2868

Author(s):

Yonglin Tian ◽

Xiao Wang ◽

Yu Shen ◽

Zhongzheng Guo ◽

Zilei Wang ◽

...

Keyword(s):

Point Cloud ◽

Closed Loop ◽

Real Data ◽

Point Clouds ◽

Autonomous Driving ◽

Training Model ◽

Labor Cost ◽

Open Loop ◽

Training Dataset ◽

Data Annotation

Three-dimensional information perception from point clouds is of vital importance for improving the ability of machines to understand the world, especially for autonomous driving and unmanned aerial vehicles. Data annotation for point clouds is one of the most challenging and costly tasks. In this paper, we propose a closed-loop, virtual–real interactive point cloud generation and model-upgrading framework called Parallel Point Clouds (PPCs). To the best of our knowledge, this is the first time that model training has been changed from an open-loop to a closed-loop mechanism. The feedback from the evaluation results is used to update the training dataset, benefiting from the flexibility of artificial scenes. Under this framework, a point-based LiDAR simulation model is proposed, which greatly simplifies the scanning operation. In addition, a group-based placing method is put forward to integrate hybrid point clouds by locating candidate positions for virtual objects in real scenes. Taking advantage of CAD models and mobile LiDAR devices, two hybrid point cloud datasets, ShapeKITTI and MobilePointClouds, are built for 3D detection tasks. With almost zero labor cost for annotating newly added objects, the models (PointPillars) trained with ShapeKITTI and MobilePointClouds achieved 78.6% and 60.0%, respectively, of the average precision of the model trained with real data on 3D detection.

PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation

Sensors ◽

10.3390/s20061573 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1573 ◽

Cited By ~ 1

Author(s):

Haojie Liu ◽

Kang Liao ◽

Chunyu Lin ◽

Yao Zhao ◽

Meiqin Liu

Keyword(s):

Point Cloud ◽

Spatial Information ◽

Interpolation Method ◽

Low Frequency ◽

Point Clouds ◽

Autonomous Driving ◽

Sensor Data ◽

Navigation Systems ◽

Intermediate Point ◽

Cascade Structure

LiDAR sensors can provide dependable 3D spatial information at a low frequency (around 10 Hz) and have been widely applied in the fields of autonomous driving and unmanned aerial vehicles (UAVs). However, in a multi-sensor system, the higher frame rate of the camera (around 20 Hz) has to be reduced to match that of the LiDAR. In this paper, we propose a novel Pseudo-LiDAR interpolation network (PLIN) to increase the frequency of LiDAR sensor data. PLIN can generate temporally and spatially high-quality point cloud sequences to match the high frequency of cameras. To achieve this goal, we design a coarse interpolation stage guided by consecutive sparse depth maps and their motion relationship. We also propose a refined interpolation stage guided by the realistic scene. Using this coarse-to-fine cascade structure, our method can progressively perceive multi-modal information and generate accurate intermediate point clouds. To the best of our knowledge, this is the first deep framework for Pseudo-LiDAR point cloud interpolation, which has appealing applications in navigation systems equipped with LiDAR and cameras. Experimental results demonstrate that PLIN achieves promising performance on the KITTI dataset, significantly outperforming the traditional interpolation method and the state-of-the-art video interpolation technique.

A comparison of multi-view 3D reconstruction of a rock wall using several cameras and a laser scanner

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xl-5-573-2014 ◽

2014 ◽

Vol XL-5 ◽

pp. 573-580 ◽

Cited By ~ 35

Author(s):

K. Thoeni ◽

A. Giacomini ◽

R. Murtagh ◽

E. Kniest

Keyword(s):

3D Reconstruction ◽

Point Cloud ◽

Laser Scanner ◽

Ground Truth ◽

Point Clouds ◽

Digital Cameras ◽

Rock Wall ◽

Geological Features ◽

Dense Point ◽

Sharp Edges

This work presents a comparative study between multi-view 3D reconstruction using various digital cameras and a terrestrial laser scanner (TLS). Five different digital cameras were used in order to estimate the limits related to the camera type and to establish the minimum camera requirements for obtaining results comparable to those of the TLS. The cameras used for this study range from commercial grade to professional grade and included a GoPro Hero 1080 (5 Mp), iPhone 4S (8 Mp), Panasonic Lumix LX5 (9.5 Mp), Panasonic Lumix ZS20 (14.1 Mp) and Canon EOS 7D (18 Mp). The TLS used for this work was a FARO Focus 3D laser scanner with a range accuracy of ±2 mm. The study area is a small rock wall about 6 m high and 20 m long. The wall is partly smooth with some evident geological features, such as non-persistent joints and sharp edges. Eight control points were placed on the wall and their coordinates were measured using a total station. These coordinates were then used to georeference all models. A similar number of images was acquired with each camera from a distance of approximately 5 to 10 m, depending on the field of view of the camera. The commercial software package PhotoScan was used to process the images, georeference and scale the models, and generate the dense point clouds. Finally, the open-source package CloudCompare was used to assess the accuracy of the multi-view results. Each point cloud obtained from a specific camera was compared to the point cloud obtained with the TLS, which is taken as ground truth. The result is a coloured point cloud for each camera showing the deviation in relation to the TLS data. The main goal of this study is to quantify the quality of the multi-view 3D reconstruction results obtained with various cameras as objectively as possible and to evaluate their applicability to geotechnical problems.

HCNET: A Point Cloud Object Detection Network Based on Height and Channel Attention

Remote Sensing ◽

10.3390/rs13245071 ◽

2021 ◽

Vol 13 (24) ◽

pp. 5071

Author(s):

Jing Zhang ◽

Jiajun Wang ◽

Da Xu ◽

Yunsong Li

Keyword(s):

Object Detection ◽

Point Cloud ◽

Feature Fusion ◽

Three Dimensional ◽

Point Clouds ◽

Autonomous Driving ◽

Attention Mechanism ◽

Uneven Distribution ◽

Adaptive Adjustment ◽

High Level

The use of LiDAR point clouds for accurate three-dimensional perception is crucial for realizing high-level autonomous driving systems. Considering the drawbacks of current point cloud object-detection algorithms, this paper proposes HCNet, an algorithm that combines an attention mechanism with adaptive adjustment, starting from feature fusion and overcoming the sparse and uneven distribution of point clouds. Inspired by the basic idea of an attention mechanism, a feature-fusion structure, the HC module, with height attention and channel attention weighted in parallel, is proposed to perform feature fusion on multiple pseudo images. The use of several weighting mechanisms enhances the ability to express feature information. Additionally, we designed an adaptively adjusted detection head that overcomes the sparsity of the point cloud from the perspective of original information fusion and reduces the interference caused by the uneven distribution of the point cloud from the perspective of adaptive adjustment. The results show that HCNet achieves better accuracy than other one-stage networks, and even two-stage RCNN networks, under some detection metrics, at a detection rate of 30 FPS. For hard samples in particular, the proposed algorithm has better detection performance than many existing algorithms.

Morphing and Sampling Network for Dense Point Cloud Completion

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6827 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11596-11603 ◽

Cited By ~ 4

Author(s):

Minghua Liu ◽

Lu Sheng ◽

Sheng Yang ◽

Jing Shao ◽

Shi-Min Hu

Keyword(s):

Point Cloud ◽

Point Clouds ◽

Coarse Grained ◽

Second Stage ◽

The Earth ◽

Novel Approach ◽

Dense Point ◽

Chamfer Distance ◽

Structural Loss ◽

Two Stages

3D point cloud completion, the task of inferring the complete geometric shape from a partial point cloud, has been attracting attention in the community. To acquire high-fidelity dense point clouds and avoid the uneven distribution, blurred details, or structural loss found in the results of existing methods, we propose a novel approach that completes the partial point cloud in two stages. In the first stage, the approach predicts a complete but coarse-grained point cloud with a collection of parametric surface elements. In the second stage, it merges the coarse-grained prediction with the input point cloud using a novel sampling algorithm. Our method utilizes a joint loss function to guide the distribution of the points. Extensive experiments verify the effectiveness of our method and demonstrate that it outperforms existing methods in both the Earth Mover's Distance (EMD) and the Chamfer Distance (CD).

GENERATION OF GROUND TRUTH DATASETS FOR THE ANALYSIS OF 3D POINT CLOUDS IN URBAN SCENES ACQUIRED VIA DIFFERENT SENSORS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-3-2009-2018 ◽

2018 ◽

Vol XLII-3 ◽

pp. 2009-2015 ◽

Cited By ~ 1

Author(s):

Y. Xu ◽

Z. Sun ◽

R. Boerner ◽

T. Koch ◽

L. Hoegner ◽

...

Keyword(s):

Point Cloud ◽

Spatial Information ◽

Ground Truth ◽

Semantic Segmentation ◽

Point Clouds ◽

Urban Scenes ◽

3D Space ◽

3D Point Clouds ◽

Testing Site ◽

Voxel Grid

In this work, we report a novel way of generating ground truth datasets for analyzing point clouds from different sensors and for validating algorithms. Instead of directly labeling a large number of 3D points, which requires time-consuming manual work, a multi-resolution 3D voxel grid of the testing site is generated. Then, with the help of a set of basic labeled points from the reference dataset, we can generate a 3D labeled space of the entire testing site at different resolutions. Specifically, an octree-based voxel structure is applied to voxelize the annotated reference point cloud, so that all points are organized in 3D grids of multiple resolutions. When automatically annotating new testing point clouds, a voting-based approach is applied to the labeled points within the multi-resolution voxels in order to assign a semantic label to the 3D space represented by each voxel. Lastly, robust line- and plane-based fast registration methods are developed for aligning point clouds obtained via various sensors. Benefiting from the labeled 3D spatial information, we can easily create new annotated 3D point clouds of the same scene from different sensors by directly looking up the labels of the 3D space in which the points are located, which is convenient for the validation and evaluation of algorithms related to point cloud interpretation and semantic segmentation.
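
The voting step described above can be sketched as follows. This is a simplified illustration only: it assumes a single uniform voxel size rather than the octree-based multi-resolution structure of the paper, and all function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def voxel_key(point, size):
    # Integer grid index of the voxel that contains the point.
    return tuple(int(c // size) for c in point)


def build_labeled_voxels(ref_points, ref_labels, size):
    """Assign each occupied voxel the majority label of its reference points."""
    votes = defaultdict(Counter)
    for point, label in zip(ref_points, ref_labels):
        votes[voxel_key(point, size)][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def annotate(points, voxel_labels, size, unknown=None):
    """Label new points by looking up the voxel each one falls into."""
    return [voxel_labels.get(voxel_key(p, size), unknown) for p in points]


# Hypothetical annotated reference cloud and a new, unlabeled cloud.
ref_points = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (0.4, 0.4, 0.4), (1.5, 0.2, 0.1)]
ref_labels = ["road", "road", "car", "building"]
labeled_space = build_labeled_voxels(ref_points, ref_labels, size=1.0)

new_points = [(0.9, 0.9, 0.9), (1.1, 0.5, 0.5), (5.0, 5.0, 5.0)]
print(annotate(new_points, labeled_space, size=1.0))
```

Points that fall into voxels without any reference labels stay unlabeled here; a multi-resolution grid could fall back to a coarser voxel in that case.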

GACM: A Graph Attention Capsule Model for the Registration of TLS Point Clouds in the Urban Scene

Remote Sensing ◽

10.3390/rs13224497 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4497

Author(s):

Jianjun Zou ◽

Zhenxin Zhang ◽

Dong Chen ◽

Qinghua Li ◽

Lan Sun ◽

...

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Three Dimensional ◽

Point Clouds ◽

Autonomous Driving ◽

Feature Descriptors ◽

Local Point ◽

Urban Scene ◽

Pairwise Registration ◽

Structure Complexity

Point cloud registration is the foundation and a key step for many vital applications, such as digital cities, autonomous driving, passive positioning, and navigation. The differences between spatial objects and the structural complexity of object surfaces are the main challenges of the registration problem. In this paper, we propose a graph attention capsule model (GACM) for the efficient registration of terrestrial laser scanning (TLS) point clouds in urban scenes, which fuses graph attention convolution and a three-dimensional (3D) capsule network to extract local point cloud features and obtain 3D feature descriptors. These descriptors take into account the differences in spatial structure and point density between objects and make the spatial features of ground objects more prominent. During training, we used both matched points and non-matched points to train the model. During registration testing, the points in the neighborhood of each keypoint were sent to the trained network to obtain feature descriptors, and the rotation and translation matrix was calculated after constructing a K-dimensional (KD) tree and applying the random sample consensus (RANSAC) algorithm. Experiments show that the proposed method achieves more efficient registration results and higher robustness than other frontier registration methods in the pairwise registration of point clouds.

Contrastive Learning for 3D Point Clouds Classification and Shape Completion

Sensors ◽

10.3390/s21217392 ◽

2021 ◽

Vol 21 (21) ◽

pp. 7392

Author(s):

Danish Nazir ◽

Muhammad Zeshan Afzal ◽

Alain Pagani ◽

Marcus Liwicki ◽

Didier Stricker

Keyword(s):

Point Cloud ◽

Feature Learning ◽

Point Clouds ◽

Classification Performance ◽

Feature Representations ◽

3D Point Clouds ◽

Chamfer Distance ◽

Shape Completion ◽

Number Of Classes

In this paper, we present the idea of self-supervised learning for the shape completion and classification of point clouds. Most 3D shape completion pipelines utilize AutoEncoders to extract features from point clouds that are used in downstream tasks such as classification, segmentation, detection, and other related applications. Our idea is to add contrastive learning to AutoEncoders to encourage global feature learning of the point cloud classes. This is done by optimizing a triplet loss. Furthermore, local feature representations of the point cloud are learned by adding the Chamfer distance function. To evaluate the performance of our approach, we utilize the PointNet classifier. We also extend the number of classes for evaluation from 4 to 10 to show the generalization ability of the learned features. Based on our results, embeddings generated from the contrastive AutoEncoder enhance shape completion and improve classification performance on point clouds from 84.2% to 84.9%, achieving state-of-the-art results with 10 classes.
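
For readers unfamiliar with the triplet loss mentioned above, here is a minimal sketch of its standard hinge form on plain embedding vectors. It assumes unsquared Euclidean distances and a margin of 1.0; the margin, distance, and sample embeddings are illustrative assumptions rather than values from the paper.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on three embedding vectors.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is. The margin value is an assumption.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2D embeddings of point clouds.
anchor = (0.0, 0.0)
same_class = (0.0, 1.0)
far_other_class = (3.0, 4.0)
near_other_class = (0.0, 1.5)

print(triplet_loss(anchor, same_class, far_other_class))
print(triplet_loss(anchor, same_class, near_other_class))
```

Only the second triplet contributes a non-zero loss, which is what pushes embeddings of different classes apart during training.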

SFGAN: Unsupervised Generative Adversarial Learning of 3D Scene Flow from the 3D Scene Self

10.22541/au.163335790.03073492/v1 ◽

2021 ◽

Author(s):

Guangming Wang ◽

Chaokang Jiang ◽

Zehang Shen ◽

Yanzi Miao ◽

Hesheng Wang

Keyword(s):

Point Cloud ◽

Ground Truth ◽

Autonomous Driving ◽

Generative Adversarial Networks ◽

Real Point ◽

Current Frame ◽

3D Motion ◽

3D Space ◽

Scene Flow ◽

3D Scene

3D scene flow represents the 3D motion of each point in 3D space and forms the fundamental 3D motion perception for autonomous driving and service robots. Although RGBD cameras and LiDAR capture discrete 3D points in space, objects and motions are usually continuous in the macro world. That is, objects remain consistent as they flow from the current frame to the next frame. Based on this insight, a Generative Adversarial Network (GAN) is utilized to self-learn 3D scene flow without the need for ground truth. The fake point cloud of the second frame is synthesized from the predicted scene flow and the point cloud of the first frame. The adversarial training of the generator and discriminator is realized by synthesizing an indistinguishable fake point cloud and by discriminating between the real point cloud and the synthesized fake point cloud. Experiments on the KITTI scene flow dataset show that our method achieves promising results without ground truth. Just like a human observing a real-world scene, the proposed approach is capable of determining the consistency of the scene at different moments even though the exact flow value of each point is unknown in advance.
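
The synthesis of the second-frame point cloud can be pictured with the short sketch below. It assumes that scene flow is a per-point 3D displacement added to the first frame; the function name and sample values are hypothetical, and the generator and discriminator networks themselves are not shown.

```python
def warp_points(points, flow):
    """Move each first-frame point by its predicted scene-flow vector."""
    if len(points) != len(flow):
        raise ValueError("need exactly one flow vector per point")
    return [tuple(c + d for c, d in zip(p, f)) for p, f in zip(points, flow)]


# Hypothetical first-frame points and predicted per-point flow.
frame_1 = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
predicted_flow = [(0.5, 0.0, 0.0), (0.0, -1.0, 0.0)]

synthesized_frame_2 = warp_points(frame_1, predicted_flow)
print(synthesized_frame_2)
```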

COMPLETION OF SPARSE AND PARTIAL POINT CLOUDS OF VEHICLES USING A NOVEL END-TO-END NETWORK

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2020-933-2020 ◽

2020 ◽

Vol V-2-2020 ◽

pp. 933-940

Author(s):

Y. Xia ◽

W. Liu ◽

Z. Luo ◽

Y. Xu ◽

U. Stilla

Keyword(s):

Point Cloud ◽

Point Clouds ◽

Autonomous Driving ◽

Fine Grained ◽

Additional Information ◽

Distribution Uniformity ◽

End To End ◽

Scan Data ◽

Translation Accuracy ◽

3D Shapes

Completing the 3D shape of vehicles from real scan data, which aims to estimate the complete geometry of vehicles from partial inputs, plays a role in the fields of remote sensing and autonomous driving. With the recent popularity of deep learning, plenty of data-driven methods have been proposed. However, most of them require additional information as prior knowledge about the input, for example, semantic labels and symmetry assumptions. In this paper, we design a novel end-to-end network, termed S2U-Net, to complete the 3D shapes of vehicles from partial and sparse point clouds. Our network includes two modules, the encoder and the generator. The encoder is designed to extract the global feature of the incomplete and sparse point cloud, while the generator is designed to produce a fine-grained and dense completion. In particular, we adopt an upsampling strategy to output a more uniform point cloud. Experimental results on the KITTI dataset illustrate that our method achieves better performance than the state of the art in terms of distribution uniformity and completion quality. Specifically, we improve the translation accuracy by 50.8% and the rotation accuracy by 40.6% when the completed results are evaluated with a point cloud registration task.
