PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation

Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1573 ◽  
Author(s):  
Haojie Liu ◽  
Kang Liao ◽  
Chunyu Lin ◽  
Yao Zhao ◽  
Meiqin Liu

LiDAR sensors can provide dependable 3D spatial information at a low frequency (around 10 Hz) and have been widely applied in the fields of autonomous driving and unmanned aerial vehicles (UAVs). However, the frame rate of the camera, which is higher (around 20 Hz), has to be decreased to match that of the LiDAR in a multi-sensor system. In this paper, we propose a novel Pseudo-LiDAR interpolation network (PLIN) to increase the frequency of LiDAR sensor data. PLIN can generate temporally and spatially high-quality point cloud sequences to match the high frequency of cameras. To achieve this goal, we design a coarse interpolation stage guided by consecutive sparse depth maps and motion relationships, and a refined interpolation stage guided by the realistic scene. Using this coarse-to-fine cascade structure, our method can progressively perceive multi-modal information and generate accurate intermediate point clouds. To the best of our knowledge, this is the first deep framework for Pseudo-LiDAR point cloud interpolation, which shows appealing applications in navigation systems equipped with LiDAR and cameras. Experimental results demonstrate that PLIN achieves promising performance on the KITTI dataset, significantly outperforming the traditional interpolation method and the state-of-the-art video interpolation technique.
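As a point of reference for the "traditional interpolation method" baseline the abstract compares against, naive temporal interpolation of two aligned sparse depth maps followed by pinhole back-projection into a pseudo-LiDAR cloud can be sketched as follows. This is an illustrative baseline, not PLIN itself; `fx`, `fy`, `cx`, `cy` are assumed camera intrinsics.

```python
import numpy as np

def interpolate_depth(d0, d2, t=0.5):
    """Naive temporal interpolation of two aligned depth maps.
    Valid only where both frames have returns (depth > 0)."""
    valid = (d0 > 0) & (d2 > 0)
    return np.where(valid, (1 - t) * d0 + t * d2, 0.0)

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map to a pseudo-LiDAR point cloud
    using the standard pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    pts = np.stack([x, y, depth], axis=-1)
    return pts[depth > 0]          # keep only valid returns, shape (n, 3)
```

A deep coarse-to-fine network like PLIN is needed precisely because this linear blend ignores scene motion between frames.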

2021 ◽  
Vol 13 (15) ◽  
pp. 2868
Author(s):  
Yonglin Tian ◽  
Xiao Wang ◽  
Yu Shen ◽  
Zhongzheng Guo ◽  
Zilei Wang ◽  
...  

Three-dimensional information perception from point clouds is of vital importance for improving the ability of machines to understand the world, especially for autonomous driving and unmanned aerial vehicles. Data annotation for point clouds is one of the most challenging and costly tasks. In this paper, we propose a closed-loop, virtual–real interactive point cloud generation and model-upgrading framework called Parallel Point Clouds (PPCs). To the best of our knowledge, this is the first time the training pipeline has been changed from an open-loop to a closed-loop mechanism: feedback from the evaluation results is used to update the training dataset, benefiting from the flexibility of artificial scenes. Under this framework, a point-based LiDAR simulation model is proposed, which greatly simplifies the scanning operation. In addition, a group-based placement method is put forward to integrate hybrid point clouds by locating candidate positions for virtual objects in real scenes. Taking advantage of CAD models and mobile LiDAR devices, two hybrid point cloud datasets, i.e., ShapeKITTI and MobilePointClouds, are built for 3D detection tasks. With almost zero labor cost for annotating newly added objects, the models (PointPillars) trained with ShapeKITTI and MobilePointClouds achieved 78.6% and 60.0%, respectively, of the average precision of the model trained with real data on 3D detection.


2021 ◽  
Vol 13 (24) ◽  
pp. 5071
Author(s):  
Jing Zhang ◽  
Jiajun Wang ◽  
Da Xu ◽  
Yunsong Li

The use of LiDAR point clouds for accurate three-dimensional perception is crucial for realizing high-level autonomous driving systems. Considering the drawbacks of current point cloud object-detection algorithms, this paper proposes HCNet, an algorithm that combines an attention mechanism with adaptive adjustment, starting from feature fusion to overcome the sparse and uneven distribution of point clouds. Inspired by the basic idea of an attention mechanism, a feature-fusion structure, the HC module, with height attention and channel attention weighted in parallel, is proposed to perform feature fusion on multiple pseudo-images. The use of several weighting mechanisms enhances the expressiveness of feature information. Additionally, we designed an adaptively adjusted detection head that also overcomes the sparsity of the point cloud from the perspective of original information fusion and reduces the interference caused by the uneven distribution of the point cloud through adaptive adjustment. The results show that our HCNet achieves better accuracy than other one-stage networks, and even some two-stage R-CNN-based networks, under several evaluation detection metrics, while running at 30 FPS. For hard samples in particular, the algorithm has better detection performance than many existing algorithms.
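The abstract does not specify the exact fusion rule of the HC module, so the following is only a minimal NumPy sketch of the idea of parallel height and channel gating on a pseudo-image, with untrained sigmoid gates standing in for the learned attention weights and element-wise summation assumed as the fusion step:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """feat: (C, H, W) pseudo-image; squeeze over spatial dims,
    then gate each channel (learned projection omitted)."""
    w = sigmoid(feat.mean(axis=(1, 2)))           # (C,)
    return feat * w[:, None, None]

def height_attention(feat):
    """Gate each height bin (row) of the pseudo-image."""
    w = sigmoid(feat.mean(axis=(0, 2)))           # (H,)
    return feat * w[None, :, None]

def hc_fuse(feat):
    """Parallel weighting, fused by summation (assumed fusion rule)."""
    return channel_attention(feat) + height_attention(feat)
```

In the actual network the gates would be produced by small learned layers rather than raw pooled means.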


2019 ◽  
Vol 2019 ◽  
pp. 1-17
Author(s):  
Mingde Gong ◽  
Haohao Wang ◽  
Xin Wang

Road input can be provided to a vehicle in advance by using an optical sensor to preview the terrain ahead, and suspension parameters can be adjusted before the corresponding moment to keep the body as level as possible, thereby improving ride comfort and handling stability. However, few studies have described this approach in detail. In this study, a LiDAR coupled with global positioning and inertial navigation systems was used to obtain the digital terrain in front of a vehicle in the form of a 3D point cloud, which was processed with a statistical filter from the Point Cloud Library to obtain accurate data. Next, the inverse distance weighting interpolation method and fractal interpolation were adopted to extract the road height profile from the 3D point cloud and improve its accuracy. The roughness grade of the road height profile was used as the input to the active suspension. The active suspension, based on an LQG controller, then used the analytic hierarchy process to select proper weight coefficients for the performance indicators according to the previously calculated road grade. Finally, a road experiment verified that a reasonable selection of the active suspension's LQG controller weightings, based on the road profile and road class estimated through fractal interpolation, improves the ride comfort and handling stability of the vehicle beyond what a passive suspension achieves.


2017 ◽  
Vol 36 (13-14) ◽  
pp. 1455-1473 ◽  
Author(s):  
Andreas ten Pas ◽  
Marcus Gualtieri ◽  
Kate Saenko ◽  
Robert Platt

Recently, a number of grasp detection methods have been proposed that can be used to localize robotic grasp configurations directly from sensor data without estimating object pose. The underlying idea is to treat grasp perception analogously to object detection in computer vision. These methods take as input a noisy and partially occluded RGBD image or point cloud and produce as output pose estimates of viable grasps, without assuming a known CAD model of the object. Although these methods generalize grasp knowledge to new objects well, they have not yet been demonstrated to be reliable enough for wide use. Many grasp detection methods achieve grasp success rates (grasp successes as a fraction of the total number of grasp attempts) between 75% and 95% for novel objects presented in isolation or in light clutter. Not only are these success rates too low for practical grasping applications, but the light clutter scenarios that are evaluated often do not reflect the realities of real-world grasping. This paper proposes a number of innovations that together result in an improvement in grasp detection performance. The specific improvement in performance due to each of our contributions is quantitatively measured either in simulation or on robotic hardware. Ultimately, we report a series of robotic experiments that average a 93% end-to-end grasp success rate for novel objects presented in dense clutter.


Author(s):  
Sameera Palipana ◽  
Dariush Salami ◽  
Luis A. Leiva ◽  
Stephan Sigg

We introduce Pantomime, a novel mid-air gesture recognition system exploiting spatio-temporal properties of millimeter-wave radio frequency (RF) signals. Pantomime is positioned in a unique region of the RF landscape: mid-resolution, mid-range, high-frequency sensing, which makes it ideal for motion gesture interaction. We configure a commercial frequency-modulated continuous-wave radar device to favor spatial information over temporal resolution by means of sparse 3D point clouds, and contribute a deep learning architecture that directly consumes the point cloud, enabling real-time performance with low computational demands. Pantomime achieves 95% accuracy and 99% AUC on a challenging set of 21 gestures articulated by 41 participants in two indoor environments, outperforming four state-of-the-art 3D point cloud recognizers. We further analyze the effect of the environment across 5 different indoor settings, as well as the effects of articulation speed, articulation angle, and the distance of the person up to 5 m. We have made the collected mmWave gesture dataset, consisting of nearly 22,000 gesture instances, publicly available along with our radar sensor configuration, trained models, and source code for reproducibility. We conclude that Pantomime is resilient to various input conditions and may enable novel applications in industrial, vehicular, and smart home scenarios.


Author(s):  
Y. Xu ◽  
Z. Sun ◽  
R. Boerner ◽  
T. Koch ◽  
L. Hoegner ◽  
...  

In this work, we report a novel way of generating a ground truth dataset for analyzing point clouds from different sensors and for the validation of algorithms. Instead of directly labeling a large number of 3D points, which requires time-consuming manual work, a multi-resolution 3D voxel grid of the testing site is generated. Then, with the help of a set of labeled points from the reference dataset, we can generate a 3D labeled space of the entire testing site at different resolutions. Specifically, an octree-based voxel structure is applied to voxelize the annotated reference point cloud, organizing all the points in 3D grids of multiple resolutions. When automatically annotating new testing point clouds, a voting-based approach is applied to the labeled points within the multi-resolution voxels in order to assign a semantic label to the 3D space represented by each voxel. Lastly, robust line- and plane-based fast registration methods are developed for aligning point clouds obtained with various sensors. Benefiting from the labeled 3D spatial information, we can easily create new annotated 3D point clouds of the same scene from different sensors by looking up the labels of the 3D space in which the points are located, which is convenient for the validation and evaluation of algorithms for point cloud interpretation and semantic segmentation.
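The voting-based label transfer can be sketched as follows, simplified to a single-resolution uniform voxel grid rather than the paper's multi-resolution octree; each voxel stores the majority label of the reference points inside it, and new (already registered) clouds are annotated by voxel lookup:

```python
import numpy as np
from collections import Counter, defaultdict

def build_label_grid(ref_pts, ref_labels, voxel=0.5):
    """Voxelize labeled reference points; each voxel keeps the
    majority (voted) label of the points falling inside it."""
    grid = defaultdict(Counter)
    keys = np.floor(ref_pts / voxel).astype(int)
    for key, lab in zip(map(tuple, keys), ref_labels):
        grid[key][lab] += 1
    return {k: c.most_common(1)[0][0] for k, c in grid.items()}

def annotate(new_pts, grid, voxel=0.5, unknown=-1):
    """Transfer labels to a new, registered cloud via voxel lookup;
    points outside any labeled voxel receive `unknown`."""
    keys = np.floor(new_pts / voxel).astype(int)
    return np.array([grid.get(tuple(k), unknown) for k in keys])
```

An octree would replace the flat dictionary with nested grids, letting coarse voxels answer queries where fine ones are empty.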


2021 ◽  
Vol 13 (22) ◽  
pp. 4497
Author(s):  
Jianjun Zou ◽  
Zhenxin Zhang ◽  
Dong Chen ◽  
Qinghua Li ◽  
Lan Sun ◽  
...  

Point cloud registration is the foundation and a key step for many vital applications, such as digital cities, autonomous driving, passive positioning, and navigation. The diversity of spatial objects and the structural complexity of object surfaces are the main challenges for the registration problem. In this paper, we propose a graph attention capsule model (named GACM) for the efficient registration of terrestrial laser scanning (TLS) point clouds in urban scenes, which fuses graph attention convolution and a three-dimensional (3D) capsule network to extract local point cloud features and obtain 3D feature descriptors. These descriptors can account for differences in spatial structure and point density between objects and make the spatial features of ground objects more prominent. During training, we used both matched and non-matched points to train the model. During the registration test, the points in the neighborhood of each keypoint were fed to the trained network to obtain feature descriptors, and the rotation and translation matrices were calculated after constructing a K-dimensional (KD) tree and running the random sample consensus (RANSAC) algorithm. Experiments show that the proposed method achieves more efficient registration results and higher robustness than other state-of-the-art registration methods in the pairwise registration of point clouds.
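The final stage, estimating rotation and translation with RANSAC over putative correspondences, can be sketched with a Kabsch least-squares solver inside a RANSAC loop. Here the descriptor-matching step is omitted and correspondences are assumed index-aligned (`src[i]` matches `dst[i]`), which in practice would come from the KD-tree search over GACM descriptors:

```python
import numpy as np

def rigid_from_pairs(src, dst):
    """Least-squares rotation/translation (Kabsch) mapping src to dst."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # fix an improper rotation (reflection)
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def ransac_register(src, dst, iters=200, thresh=0.1, seed=0):
    """RANSAC over putative correspondences: fit on random 3-point
    samples, keep the model with the most inliers."""
    rng = np.random.default_rng(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        idx = rng.choice(len(src), 3, replace=False)
        R, t = rigid_from_pairs(src[idx], dst[idx])
        err = np.linalg.norm((src @ R.T + t) - dst, axis=1)
        n = int((err < thresh).sum())
        if n > best_inliers:
            best_inliers, best = n, (R, t)
    return best
```

With outlier-contaminated descriptor matches, the inlier threshold and iteration count become the knobs that trade robustness against runtime.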


Author(s):  
Y. Xia ◽  
W. Liu ◽  
Z. Luo ◽  
Y. Xu ◽  
U. Stilla

Abstract. Completing the 3D shape of vehicles from real scan data, which aims to estimate the complete geometry of vehicles from partial inputs, plays an important role in the fields of remote sensing and autonomous driving. With the recent popularity of deep learning, plenty of data-driven methods have been proposed. However, most of them usually require additional information as prior knowledge for the input, for example, semantic labels and symmetry assumptions. In this paper, we design a novel end-to-end network, termed S2U-Net, to complete the 3D shapes of vehicles from partial and sparse point clouds. Our network consists of two modules: an encoder and a generator. The encoder is designed to extract a global feature from the incomplete and sparse point cloud, while the generator is designed to produce a fine-grained and dense completion. In particular, we adopt an upsampling strategy to output a more uniform point cloud. Experimental results on the KITTI dataset show that our method achieves better performance than the state of the art in terms of distribution uniformity and completion quality. Specifically, when evaluating completed results with a point cloud registration task, we improve the translation accuracy by 50.8% and the rotation accuracy by 40.6%.
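The abstract does not specify S2U-Net's upsampling strategy, but a common way to encourage uniform coverage of a point set, and a useful reference point for the distribution-uniformity metric, is greedy farthest point sampling:

```python
import numpy as np

def farthest_point_sample(pts, k, seed=0):
    """Greedy farthest-point sampling: iteratively pick the point
    farthest from everything chosen so far, yielding k points that
    cover the cloud roughly uniformly."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(pts)))]          # random seed point
    d = np.linalg.norm(pts - pts[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))                     # farthest remaining point
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(pts - pts[nxt], axis=1))
    return pts[chosen]
```

This is only an illustration of the uniformity idea, not the network's learned upsampler.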


2021 ◽  
Vol 13 (17) ◽  
pp. 3427
Author(s):  
Chunjiao Zhang ◽  
Shenghua Xu ◽  
Tao Jiang ◽  
Jiping Liu ◽  
Zhengjun Liu ◽  
...  

LiDAR point clouds are rich in spatial information and can effectively express the size, shape, position, and direction of objects; thus, they offer high spatial utilization. A point cloud describes the shape of an object's external surface and does not store redundant information about the occupied volume. Therefore, point clouds have become a research focus among 3D data models and are widely used in large-scale scene reconstruction, virtual reality, digital elevation model production, and other fields. Since point clouds have various characteristics, such as disorder, density inconsistency, lack of structure, and incomplete information, point cloud classification remains complex and challenging. To realize the semantic classification of LiDAR point clouds in complex scenarios, this paper proposes the integration of normal vector features into an atrous convolution residual network. Based on the RandLA-Net network structure, the proposed network integrates atrous convolution into the residual module to extract global and local features of the point clouds; the atrous convolution can learn more valuable point cloud feature information by expanding the receptive field. The point cloud normal vector is then embedded in the local feature aggregation module of RandLA-Net to extract local semantic aggregation features. The improved local feature aggregation module can merge the deep features of the point cloud and mine its fine-grained information to improve the model's segmentation ability in complex scenes. Finally, to resolve the imbalanced distribution of point cloud categories, the original loss function is optimized with a reweighting method to prevent overfitting, so that the network can focus on small target categories during training and effectively improve classification performance.
Through experimental analysis on the Vaihingen (Germany) urban 3D semantic dataset from the ISPRS website, the proposed algorithm is verified to have strong generalization ability. Its overall accuracy (OA) on the Vaihingen dataset reached 97.9%, and the average reached 96.1%. The experiments show that the proposed algorithm fully exploits the semantic features of point clouds and effectively improves the accuracy of point cloud classification.
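The loss-reweighting idea can be illustrated with inverse-frequency class weights applied to cross-entropy. This is one common reweighting choice; the paper's exact scheme is not specified in the abstract:

```python
import numpy as np

def inverse_freq_weights(labels, n_classes):
    """Per-class weights inversely proportional to class frequency,
    normalized so the weights average to 1."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    counts[counts == 0] = 1.0           # guard against empty classes
    w = 1.0 / counts
    return w * n_classes / w.sum()

def weighted_ce(probs, labels, weights):
    """Reweighted cross-entropy: errors on rare classes cost more,
    pushing the network to attend to small target categories."""
    p = probs[np.arange(len(labels)), labels]
    return float(np.mean(weights[labels] * -np.log(p + 1e-12)))
```

With this weighting, a class that is three times rarer contributes three times more per point to the loss, counteracting the category imbalance described above.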


2021 ◽  
Author(s):  
Simone Müller ◽  
Dieter Kranzlmüller

Based on the depth perception of individual stereo cameras, spatial structures can be derived as point clouds. The quality of such three-dimensional data is technically restricted by sensor limitations, recording latency, and insufficient object reconstruction caused by surface representation. Additionally, external physical effects such as lighting conditions, material properties, and reflections can lead to deviations between real and virtual object perception. Such physical influences appear in rendered point clouds as geometric imaging errors on surfaces and edges. We propose the simultaneous use of multiple, dynamically arranged cameras. The increased information density leads to more detail in surrounding detection and object depiction. During a pre-processing phase, the collected data are merged and prepared. Subsequently, a logical analysis stage examines the captured images and allocates them to three-dimensional space; for this purpose, a new metadata set consisting of image and localisation data must be created. The post-processing stage reworks and matches the locally assigned images. As a result, the dynamically moving images become comparable, so that a more accurate point cloud can be generated. For evaluation and better comparability, we use synthetically generated datasets. Our approach lays the foundation for dynamic, real-time generation of digital twins with the aid of real sensor data.

