SEMANTIC SEGMENTATION OF POINT CLOUDS WITH POINTNET AND KPCONV ARCHITECTURES APPLIED TO RAILWAY TUNNELS

Author(s):  
M. Soilán ◽  
A. Nóvoa ◽  
A. Sánchez-Rodríguez ◽  
B. Riveiro ◽  
P. Arias

Abstract. Transport infrastructure monitoring has lately attracted increasing attention due to the rise in extreme natural hazards posed by climate change. Mobile Mapping Systems gather information regarding the state of the assets, which allows for more efficient decision-making. These systems provide information in the form of three-dimensional point clouds. Point cloud analysis through deep learning has emerged as a focal research area due to its wide application in areas such as autonomous driving. This paper applies the pioneering PointNet and the current state-of-the-art KPConv architectures to perform scene segmentation of railway tunnels, in order to validate their applicability as an alternative to heuristic classification methods. The approach is a multi-class classification of the most relevant tunnel components: ground, lining, wiring and rails. Both architectures are trained from scratch with heuristically classified point clouds of two different railway tunnels. Results show that, while both architectures are suitable for the proposed classification task, KPConv outperforms PointNet with F1-scores over 97% for the ground, lining and wiring classes, and over 90% for rails. In addition, KPConv is tested using transfer learning, which gives F1-scores slightly lower than those of the model trained from scratch but shows better generalization capabilities.
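
As a rough illustration of the per-point classification these architectures perform, the sketch below shows a minimal PointNet-style segmentation head in PyTorch. The four tunnel classes follow the abstract, but the layer widths and training setup are assumptions, not the authors' configuration.

```python
# Minimal PointNet-style per-point classifier (sketch, not the authors' network).
# Layer widths and the 4 tunnel classes (ground, lining, wiring, rails) follow
# common practice and the abstract, not the original implementation.
import torch
import torch.nn as nn

class PointNetSegSketch(nn.Module):
    def __init__(self, num_classes=4):
        super().__init__()
        # Shared per-point MLPs (implemented as 1x1 convolutions over points).
        self.local = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
        )
        self.global_feat = nn.Sequential(
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        # Per-point head on concatenated local + global features.
        self.head = nn.Sequential(
            nn.Conv1d(128 + 1024, 256, 1), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, num_classes, 1),
        )

    def forward(self, xyz):              # xyz: (B, 3, N)
        local = self.local(xyz)          # (B, 128, N)
        g = self.global_feat(local)      # (B, 1024, N)
        g = torch.max(g, dim=2, keepdim=True)[0]        # global max-pool, (B, 1024, 1)
        g = g.expand(-1, -1, xyz.shape[2])              # broadcast to every point
        return self.head(torch.cat([local, g], dim=1))  # (B, num_classes, N) logits

logits = PointNetSegSketch()(torch.randn(2, 3, 4096))   # e.g. 4096 points per block
```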

2020 ◽  
Vol 8 (3) ◽  
pp. 188
Author(s):  
Fangfang Liu ◽  
Ming Fang

Image semantic segmentation technology has been increasingly applied in many fields, for example, autonomous driving, indoor navigation, virtual reality and augmented reality. However, its application to underwater scenes, which contain a huge amount of marine biological resources and irreplaceable biological gene banks that need to be researched and exploited, remains limited. In this paper, image semantic segmentation technology is used to study underwater scenes. We extend the current state-of-the-art semantic segmentation network DeepLabv3+ and employ it as the basic framework. First, the unsupervised color correction method (UCM) module is introduced into the encoder structure of the framework to improve image quality. Moreover, two up-sampling layers are added to the decoder structure to retain more target features and object boundary information. The model is trained by fine-tuning and optimizing the relevant parameters. Experimental results indicate that our method improves the appearance of the segmented target object, prevents its pixels from mingling with pixels of other classes, enhances the segmentation accuracy of the target boundaries and retains more feature information. Compared with the original method, our method improves the segmentation accuracy by 3%.
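
For illustration, the sketch below shows the kind of per-channel balancing and contrast stretching that underwater color-correction preprocessing such as UCM performs; it is a simplified assumption, not the exact module integrated into the paper's encoder.

```python
# Simplified underwater color correction (sketch in the spirit of UCM-style
# preprocessing: channel balancing followed by contrast stretching).
# This is an illustrative assumption, not the exact module used in the paper.
import numpy as np

def simple_color_correction(img, clip=0.01):
    """img: float32 RGB image in [0, 1], shape (H, W, 3)."""
    out = np.empty_like(img)
    mean_gray = img.mean()
    for c in range(3):
        ch = img[..., c]
        # Gray-world style channel balancing.
        ch = ch * (mean_gray / (ch.mean() + 1e-8))
        # Percentile-based contrast stretching.
        lo, hi = np.percentile(ch, [100 * clip, 100 * (1 - clip)])
        out[..., c] = np.clip((ch - lo) / (hi - lo + 1e-8), 0.0, 1.0)
    return out
```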


2019 ◽  
Vol 8 (5) ◽  
pp. 213 ◽  
Author(s):  
Florent Poux ◽  
Roland Billen

Automation in point cloud data processing is central to knowledge discovery within decision-making systems. The definition of relevant features is often key for segmentation and classification, with automated workflows presenting the main challenges. In this paper, we propose a voxel-based feature engineering approach that better characterizes point clusters and provides strong support for supervised or unsupervised classification. We provide different feature generalization levels to permit interoperable frameworks. First, we recommend a shape-based feature set (SF1) that only leverages the raw X, Y, Z attributes of any point cloud. Afterwards, we derive relationships and topology between voxel entities to obtain a three-dimensional (3D) structural connectivity feature set (SF2). Finally, we provide a knowledge-based decision tree to permit infrastructure-related classification. We study the SF1/SF2 synergy on a new semantic segmentation framework that constitutes a higher semantic representation of point clouds in relevant clusters. Lastly, we benchmark the approach against novel and best-performing deep-learning methods on the full S3DIS dataset. We highlight good performance, easy integration, and high F1-scores (>85%) for planar-dominant classes that are comparable to state-of-the-art deep learning.
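
The sketch below computes the kind of raw-XYZ shape descriptors (linearity, planarity, sphericity from covariance eigenvalues) that a shape-based set such as SF1 could contain; the paper's actual feature list may differ, so treat the specific features as assumptions.

```python
# Eigenvalue-based shape descriptors for one voxel of points (sketch).
# These are common raw-XYZ features; the exact SF1 definition may differ.
import numpy as np

def voxel_shape_features(points):
    """points: (N, 3) array of XYZ coordinates inside one voxel (N >= 3)."""
    cov = np.cov(points.T)                      # 3x3 covariance of the voxel
    evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    l3, l2, l1 = evals + 1e-12                  # l1 >= l2 >= l3
    return {
        "linearity":  (l1 - l2) / l1,
        "planarity":  (l2 - l3) / l1,
        "sphericity": l3 / l1,
        # z-component of the smallest-eigenvalue eigenvector (≈ surface normal).
        "verticality": abs(evecs[2, 0]),
    }
```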


Author(s):  
S. Vincke ◽  
R. de Lima Hernandez ◽  
M. Bassier ◽  
M. Vergauwen

Abstract. By adopting Building Information Modelling (BIM) software, the architecture, engineering and construction (AEC) industry shifted from a two-dimensional approach to a three-dimensional one in the design phase of a building. However, a similar three-dimensional approach for the visualisation of the current state of the construction works is lacking. Currently, progress reports typically include numerous pictures of the construction site or elements, alongside the appropriate parts of the 3D as-design BIM model. If a proper transition to a "3D design versus 3D current state" comparison were achieved, this evolved type of report would become more comprehensible, resulting in more well-informed decision-making. This requires a single, unique software platform that is able to import, process, analyse and visualise both the as-design BIM model and the recorded data of the current construction state. At present, however, the visualisation and interpretation of the different datasets alone already requires multiple software packages.

As a partial solution, this work presents a platform to easily visualise and interpret various data sources such as point clouds, meshes, BIM models and analysis results. Recent advances in gaming engines focus on and allow for excellent visualisation of mesh data. Therefore, all of the aforementioned data sources are converted into mesh objects upon import. Moreover, gaming engines provide the necessary tools to traverse the scene intuitively, allowing construction site managers and other stakeholders to gain a more complete and better oversight of the construction project. Furthermore, these engines also provide the possibility to take the immersion to the next level: incorporating the 3D entities into a Virtual Reality (VR) environment makes the visualised data and the executed analyses even more comprehensible.

By means of a case study, the potential of the presented approach is showcased. Real-world construction site recordings, models and analyses are visualised and implemented in VR using the Unity gaming engine.
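
As a minimal sketch of the point-cloud-to-mesh conversion step that such an import pipeline requires, the snippet below uses Open3D Poisson reconstruction to produce a mesh file a game engine can load. The file names and the choice of Poisson reconstruction are assumptions; the platform's actual conversion procedure is not detailed in the abstract.

```python
# Sketch: converting a construction-site point cloud to a mesh for import into
# a game engine such as Unity. Uses Open3D Poisson reconstruction; the actual
# conversion pipeline of the presented platform may differ.
import open3d as o3d

pcd = o3d.io.read_point_cloud("site_scan.ply")           # hypothetical file name
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)
o3d.io.write_triangle_mesh("site_scan_mesh.obj", mesh)    # .obj imports into Unity
```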


2021 ◽  
Vol 13 (16) ◽  
pp. 3121
Author(s):  
Beanbonyka Rim ◽  
Ahyoung Lee ◽  
Min Hong

Semantic segmentation of large-scale outdoor 3D LiDAR point clouds is essential for understanding the scene environment in various applications, such as geometry mapping, autonomous driving, and more. While they offer the advantage of a 3D metric space, 3D LiDAR point clouds pose a challenge for deep learning approaches due to their unstructured, unordered, irregular, and large-scale characteristics. Therefore, this paper presents an encoder–decoder shared multi-layer perceptron (MLP) with multiple losses to address this semantic segmentation problem. The challenge gives rise to a trade-off between efficiency and effectiveness. To balance this trade-off, we propose simple yet effective mechanisms: a random point sampling layer, an attention-based pooling layer, and a summation of multiple losses, integrated with the encoder–decoder shared MLP method for large-scale outdoor point cloud semantic segmentation. We conducted our experiments on two large-scale benchmark datasets: Toronto-3D and DALES. Our method achieved an overall accuracy (OA) of 83.60% and a mean intersection over union (mIoU) of 71.03% on the Toronto-3D dataset, and an OA of 76.43% and an mIoU of 59.52% on the DALES dataset. Additionally, our proposed method uses few model parameters and is about three times faster than PointNet++ during inference.
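
To make the attention-based pooling idea concrete, the sketch below shows the common attentive-pooling pattern over each point's K neighbours; the paper's exact layer design (dimensions, normalisation, placement in the shared MLP) is an assumption here.

```python
# Attention-based pooling over a point's K neighbours (sketch).
# Generic attentive-pooling pattern; not the paper's exact layer.
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score_fn = nn.Linear(channels, channels, bias=False)

    def forward(self, neighbour_feats):                 # (B, N, K, C)
        # Learn a per-neighbour, per-channel score, normalise over the K axis.
        scores = torch.softmax(self.score_fn(neighbour_feats), dim=2)
        return (scores * neighbour_feats).sum(dim=2)    # weighted sum -> (B, N, C)

pooled = AttentivePooling(64)(torch.randn(2, 1024, 16, 64))
```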


2021 ◽  
Vol 13 (24) ◽  
pp. 5071
Author(s):  
Jing Zhang ◽  
Jiajun Wang ◽  
Da Xu ◽  
Yunsong Li

The use of LiDAR point clouds for accurate three-dimensional perception is crucial for realizing high-level autonomous driving systems. Considering the drawbacks of current point cloud object-detection algorithms, this paper proposes HCNet, an algorithm that combines an attention mechanism with adaptive adjustment, starting from feature fusion, to overcome the sparse and uneven distribution of point clouds. Inspired by the basic idea of an attention mechanism, a feature-fusion HC module with height attention and channel attention, weighted in parallel, is proposed to perform feature fusion on multiple pseudo-images. The use of several weighting mechanisms enhances the expressiveness of the feature information. Additionally, we designed an adaptively adjusted detection head that also mitigates the sparsity of the point cloud from the perspective of original information fusion and reduces the interference caused by its uneven distribution through adaptive adjustment. The results show that HCNet achieves better accuracy than other one-stage networks, and even two-stage R-CNNs, under several detection metrics, while running at 30 FPS. For hard samples in particular, the proposed algorithm has better detection performance than many existing algorithms.
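
The abstract does not give the internals of the HC module, so the sketch below only illustrates the general pattern of two attention branches (channel-wise and along the height axis of a pseudo-image) weighted in parallel; the branch designs and the fusion by summation are assumptions.

```python
# Parallel channel attention and height attention on a pseudo-image (sketch).
# The exact HC module is not described in the abstract; this shows the generic
# pattern of two attention branches applied in parallel and summed.
import torch
import torch.nn as nn

class ParallelAttentionSketch(nn.Module):
    def __init__(self, channels, height):
        super().__init__()
        # SE-style channel attention branch.
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // 4), nn.ReLU(),
            nn.Linear(channels // 4, channels), nn.Sigmoid())
        # Attention over the height (row) axis of the pseudo-image.
        self.height_fc = nn.Sequential(
            nn.Linear(height, height // 4), nn.ReLU(),
            nn.Linear(height // 4, height), nn.Sigmoid())

    def forward(self, x):                         # x: (B, C, H, W)
        b, c, h, w = x.shape
        ca = self.channel_fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)   # channel weights
        ha = self.height_fc(x.mean(dim=(1, 3))).view(b, 1, h, 1)    # height weights
        return x * ca + x * ha                    # parallel weighting, fused by summation

out = ParallelAttentionSketch(64, 128)(torch.randn(2, 64, 128, 128))
```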


Micromachines ◽  
2020 ◽  
Vol 11 (5) ◽  
pp. 456 ◽  
Author(s):  
Dingkang Wang ◽  
Connor Watkins ◽  
Huikai Xie

In recent years, Light Detection and Ranging (LiDAR) has been drawing extensive attention both in academia and industry because of the increasing demand for autonomous vehicles. LiDAR is believed to be a crucial sensor for autonomous driving and flying, as it can provide high-density point clouds with accurate three-dimensional information. This review presents an extensive overview of Microelectromechanical Systems (MEMS) scanning mirrors specifically for applications in LiDAR systems. MEMS mirror-based laser scanners have unrivalled advantages in terms of size, speed and cost over other types of laser scanners, making them ideal for LiDAR in a wide range of applications. A figure of merit (FoM) is defined for MEMS mirrors in LiDAR scanners in terms of aperture size, field of view (FoV) and resonant frequency. Various MEMS mirrors based on different actuation mechanisms are compared using this FoM. Finally, a preliminary assessment of off-the-shelf MEMS-scanned LiDAR systems is given.
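
The abstract names the three quantities entering the figure of merit; a product-type FoM of the following form is one natural reading, though the review's exact definition and normalisation should be taken from the paper itself.

```latex
% Product-type figure of merit combining the quantities named in the abstract:
% optical field of view \theta_{FoV}, mirror aperture D, resonant frequency f_r.
% The review's exact definition may differ.
\mathrm{FoM} \;=\; \theta_{\mathrm{FoV}} \cdot D \cdot f_r
```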


2021 ◽  
Vol 13 (22) ◽  
pp. 4497
Author(s):  
Jianjun Zou ◽  
Zhenxin Zhang ◽  
Dong Chen ◽  
Qinghua Li ◽  
Lan Sun ◽  
...  

Point cloud registration is the foundation and a key step for many vital applications, such as digital cities, autonomous driving, passive positioning, and navigation. The differences between spatial objects and the structural complexity of object surfaces are the main challenges for the registration problem. In this paper, we propose a graph attention capsule model (named GACM) for the efficient registration of terrestrial laser scanning (TLS) point clouds in urban scenes, which fuses graph attention convolution and a three-dimensional (3D) capsule network to extract local point cloud features and obtain 3D feature descriptors. These descriptors take into account differences in spatial structure and point density between objects and make the spatial features of ground objects more prominent. During the training process, we used both matched and non-matched points to train the model. In the registration test, the points in the neighborhood of each keypoint are sent to the trained network to obtain feature descriptors, and the rotation and translation matrix is then calculated after constructing a K-dimensional (KD) tree and applying the random sample consensus (RANSAC) algorithm. Experiments show that the proposed method achieves more efficient registration results and higher robustness than other state-of-the-art registration methods in the pairwise registration of point clouds.
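
The sketch below illustrates the generic matching step described in the abstract: nearest-neighbour descriptor matching with a KD-tree followed by RANSAC estimation of the rotation and translation. The learned GACM descriptors themselves are assumed to be given and are not reproduced here; thresholds and iteration counts are arbitrary.

```python
# Descriptor matching with a KD-tree followed by RANSAC rigid alignment (sketch).
# Keypoints and descriptors are numpy arrays assumed to come from the trained network.
import numpy as np
from scipy.spatial import cKDTree

def rigid_from_correspondences(src, dst):
    """Least-squares R, t with R @ src_i + t ≈ dst_i (Kabsch algorithm)."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                      # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def ransac_register(src_kp, dst_kp, src_desc, dst_desc, iters=1000, thresh=0.1):
    # Nearest-neighbour matching in descriptor space via a KD-tree.
    idx = cKDTree(dst_desc).query(src_desc)[1]
    matches = np.stack([np.arange(len(src_kp)), idx], axis=1)
    best_R, best_t, best_inliers = np.eye(3), np.zeros(3), 0
    rng = np.random.default_rng(0)
    for _ in range(iters):
        sample = matches[rng.choice(len(matches), 3, replace=False)]
        R, t = rigid_from_correspondences(src_kp[sample[:, 0]], dst_kp[sample[:, 1]])
        residuals = np.linalg.norm((src_kp[matches[:, 0]] @ R.T + t)
                                   - dst_kp[matches[:, 1]], axis=1)
        inliers = (residuals < thresh).sum()
        if inliers > best_inliers:
            best_R, best_t, best_inliers = R, t, inliers
    return best_R, best_t
```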


Mathematics ◽  
2021 ◽  
Vol 9 (20) ◽  
pp. 2589
Author(s):  
Artyom Makovetskii ◽  
Sergei Voronin ◽  
Vitaly Kober ◽  
Aleksei Voronin

The registration of point clouds in three-dimensional space is an important task in many areas of computer vision, including robotics and autonomous driving. The purpose of registration is to find a rigid geometric transformation that aligns two point clouds. The registration problem can be affected by noise and partiality (the two point clouds having only partial overlap). The Iterative Closest Point (ICP) algorithm is a common method for solving the registration problem. Recently, artificial neural networks have begun to be used for the registration of point clouds. A drawback of ICP and other registration algorithms is possible convergence to a local minimum; thus, an important characteristic of a registration algorithm is the ability to avoid local minima. In this paper, we propose an ICP-type registration algorithm (λ-ICP) that uses a multiparameter functional (λ-functional). The proposed λ-ICP algorithm generalizes the NICP algorithm (normal ICP). The application of the λ-functional requires a consistent choice of the eigenvectors of the covariance matrices of the two point clouds. The paper also proposes an algorithm for choosing the directions of these eigenvectors. The performance of the proposed λ-ICP algorithm is compared with that of a standard point-to-point ICP and the neural network Deep Closest Point (DCP).
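
For reference, the sketch below implements the classical point-to-point ICP baseline the abstract compares against (closest-point correspondences plus a closed-form SVD update); the proposed λ-ICP with its multiparameter functional is not reproduced here. Convergence of this loop to a local minimum is exactly the failure mode discussed in the abstract.

```python
# Baseline point-to-point ICP (sketch of the classical comparison algorithm).
import numpy as np
from scipy.spatial import cKDTree

def icp_point_to_point(src, dst, iters=50, tol=1e-6):
    """Align src (N, 3) to dst (M, 3); returns accumulated R, t."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    prev_err = np.inf
    for _ in range(iters):
        cur = src @ R.T + t
        dist, idx = tree.query(cur)               # closest-point correspondences
        matched = dst[idx]
        # Closed-form rigid update (SVD / Kabsch on current correspondences).
        cs, cd = cur.mean(0), matched.mean(0)
        H = (cur - cs).T @ (matched - cd)
        U, _, Vt = np.linalg.svd(H)
        R_step = Vt.T @ U.T
        if np.linalg.det(R_step) < 0:
            Vt[-1] *= -1
            R_step = Vt.T @ U.T
        t_step = cd - R_step @ cs
        R, t = R_step @ R, R_step @ t + t_step    # compose with accumulated transform
        err = dist.mean()
        if abs(prev_err - err) < tol:             # converged (possibly a local minimum)
            break
        prev_err = err
    return R, t
```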


Sensors ◽  
2019 ◽  
Vol 19 (19) ◽  
pp. 4329 ◽  
Author(s):  
Guorong Cai ◽  
Zuning Jiang ◽  
Zongyue Wang ◽  
Shangfeng Huang ◽  
Kai Chen ◽  
...  

Semantic segmentation of 3D point clouds plays a vital role in autonomous driving, 3D maps, smart cities, and other applications. Recent work such as PointSIFT shows that spatial structure information can improve the performance of semantic segmentation. Motivated by this, we propose the Spatial Aggregation Net (SAN) for point cloud semantic segmentation. SAN is based on a multi-directional convolution scheme that utilizes the spatial structure information of the point cloud. Firstly, Octant-Search is employed to capture the neighboring points around each sampled point. Secondly, we use multi-directional convolution to extract information from different directions around the sampled points. Finally, max-pooling is used to aggregate the information from the different directions. Experimental results on the ScanNet database show that the proposed SAN achieves results comparable with state-of-the-art algorithms such as PointNet, PointNet++, and PointSIFT. In particular, our method performs better on flat, small objects, and the edge areas that connect objects. Moreover, our model achieves a good trade-off between segmentation accuracy and time complexity.
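
The sketch below illustrates one plausible reading of the Octant-Search step: grouping a sampled point's neighbours into the eight octants around it so that per-direction features can later be extracted. The radius and neighbour handling are assumptions, not the paper's exact procedure.

```python
# Octant-Search sketch: group neighbours of a sampled point into eight octants.
import numpy as np
from scipy.spatial import cKDTree

def octant_search(points, centre, radius=0.5):
    """Return a dict mapping octant index (0..7) -> neighbour offsets from centre."""
    idx = cKDTree(points).query_ball_point(centre, r=radius)
    neighbours = points[idx] - centre             # offsets relative to the sample
    # Octant code from the signs of (x, y, z): bit 0 = x>=0, bit 1 = y>=0, bit 2 = z>=0.
    codes = ((neighbours >= 0).astype(int) * np.array([1, 2, 4])).sum(axis=1)
    return {o: neighbours[codes == o] for o in range(8)}

pts = np.random.rand(1000, 3)
octants = octant_search(pts, pts[0])              # features per direction follow from here
```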


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5900
Author(s):  
Sungjin Cho ◽  
Chansoo Kim ◽  
Jaehyun Park ◽  
Myoungho Sunwoo ◽  
Kichun Jo

LiDAR-based Simultaneous Localization And Mapping (SLAM), which provides environmental information for autonomous vehicles through map building, is a major challenge for autonomous driving. With the advent of deep neural network-based semantic segmentation algorithms, semantic information has also been used in LiDAR-based SLAM. Semantically segmented point clouds provide a much greater range of functionality for autonomous vehicles than geometry alone and can play an important role in the mapping step. However, due to the uncertainty of the semantic segmentation algorithms, semantically segmented point clouds cannot be used directly for SLAM without limitations. To address these limitations, this paper proposes a semantic segmentation-based LiDAR SLAM system that considers the uncertainty of the semantic segmentation algorithms. The uncertainty is explicitly modeled by proposed probability models derived from data-driven approaches. Based on these probability models, the paper proposes a semantic registration method that calculates the transformation between consecutive point clouds using semantic information together with the proposed probability models. Furthermore, the proposed probability models are used to determine the semantic class of points when multiple scans indicate different classes due to this uncertainty. The proposed framework is verified and evaluated on the KITTI dataset and in outdoor environments. The experimental results show that the proposed semantic mapping framework reduces the errors of the mapping poses and eliminates the ambiguity of the semantic information in the generated semantic map.
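
As a simplified stand-in for the class-determination step, the sketch below fuses per-point labels observed across multiple scans using per-class reliability weights; the reliability values, class list, and voting rule are assumptions standing in for the paper's data-driven probability models.

```python
# Fusing per-point semantic labels from multiple scans (sketch).
# A per-class reliability value weights each observation; this is a generic
# weighted-voting stand-in for the paper's probability models.
import numpy as np

NUM_CLASSES = 4                                           # hypothetical class set
class_reliability = np.array([0.95, 0.90, 0.80, 0.70])    # hypothetical; data-driven in the paper

def fuse_labels(observations):
    """observations: list of class ids observed for one map point across scans."""
    scores = np.zeros(NUM_CLASSES)
    for c in observations:
        scores[c] += class_reliability[c]         # reliable classes get a stronger vote
    return int(np.argmax(scores))

print(fuse_labels([2, 2, 1, 2, 3]))               # -> most strongly supported class
```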

