ON THE ASSOCIATION OF LIDAR POINT CLOUDS AND TEXTURED MESHES FOR MULTI-MODAL SEMANTIC SEGMENTATION

Author(s):  
D. Laupheimer ◽  
M. H. Shams Eddin ◽  
N. Haala

Abstract. The semantic segmentation of the huge amount of acquired 3D data has become an important task in recent years. We propose a novel association mechanism that enables information transfer between two 3D representations: point clouds and meshes. The association mechanism can be used in a two-fold manner: (i) feature transfer to stabilize semantic segmentation of one representation with features from the other representation and (ii) label transfer to achieve the semantic annotation of both representations. We claim that point clouds are an intermediate product whereas meshes are a final user product that jointly provides geometrical and textural information. For this reason, we opt for semantic mesh segmentation in the first place. We apply an off-the-shelf PointNet++ to a textured urban triangle mesh as generated from LiDAR and oblique imagery. For each face within a mesh, a feature vector is computed and optionally extended by inherent LiDAR features as provided by the sensor (e.g. intensity). The feature vector extension is accomplished with the proposed association mechanism. By these means, we leverage inherent features from both data representations for the semantic mesh segmentation (multi-modality). We achieve an overall accuracy of 86.40% on the face-level on a dedicated test mesh. Neglecting LiDAR-inherent features in the per-face feature vectors decreases mean intersection over union by ∼2%. Leveraging our association mechanism, we transfer predicted mesh labels to the LiDAR point cloud in a single step. In this way, we semantically segment the point cloud by implicit usage of geometric and textural mesh features. The semantic point cloud segmentation achieves an overall accuracy close to 84% on the point-level for both feature vector compositions.
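The two uses of an association (feature transfer and label transfer) can be illustrated with a small sketch. The abstract does not specify how points and faces are linked, so the nearest-face-centroid rule below is a stand-in assumption, not the authors' mechanism. All function names and the toy data are hypothetical.

```python
import math


def face_centroid(vertices, face):
    """Mean of the three vertex positions of a triangle."""
    pts = [vertices[i] for i in face]
    return tuple(sum(p[k] for p in pts) / 3.0 for k in range(3))


def associate_points_to_faces(points, vertices, faces):
    """Assumed rule: link each point to the face with the nearest centroid."""
    centroids = [face_centroid(vertices, f) for f in faces]
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def transfer_labels(face_labels, point_to_face):
    """Label transfer: every point inherits the label of its linked face."""
    return [face_labels[j] for j in point_to_face]


def transfer_feature(point_values, point_to_face, n_faces):
    """Feature transfer: average a per-point scalar (e.g. intensity) per face.

    Faces without any linked point get None.
    """
    sums = [0.0] * n_faces
    counts = [0] * n_faces
    for value, j in zip(point_values, point_to_face):
        sums[j] += value
        counts[j] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy data: two triangles far apart, three LiDAR points with intensities.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 0, 0), (11, 0, 0), (10, 1, 0)]
faces = [(0, 1, 2), (3, 4, 5)]
points = [(0.2, 0.2, 0.1), (10.5, 0.2, 0.0), (0.4, 0.3, 0.0)]
intensity = [10.0, 30.0, 20.0]

point_to_face = associate_points_to_faces(points, vertices, faces)
point_labels = transfer_labels(["roof", "facade"], point_to_face)
face_intensity = transfer_feature(intensity, point_to_face, len(faces))
```

The same index list serves both directions: it carries per-point sensor features onto faces and carries predicted face labels back onto points.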

Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2161 ◽  
Author(s):  
Arnadi Murtiyoso ◽  
Pierre Grussenmeyer

3D heritage documentation has seen a surge in the past decade due to developments in reality-based 3D recording techniques. Several methods such as photogrammetry and laser scanning are becoming ubiquitous amongst architects, archaeologists, surveyors, and conservators. The main result of these methods is a 3D representation of the object in the form of point clouds. However, a solely geometric point cloud is often insufficient for further analysis, monitoring, and predictive modeling of the heritage object. The semantic annotation of point clouds remains an interesting research topic since it traditionally requires manual labeling and therefore a lot of time and resources. This paper proposes an automated pipeline to segment and classify multi-scalar point clouds of heritage objects. The aim is to perform multi-level segmentation from the scale of a historical neighborhood down to that of architectural elements, specifically pillars and beams. The proposed workflow involves an algorithmic approach in the form of a toolbox which includes various functions covering the semantic segmentation of large point clouds into smaller, more manageable and semantically labeled clusters. The first part of the workflow addresses the segmentation and semantic labeling of heritage complexes into individual buildings, while the second part uses the same toolbox to segment the resulting buildings further into architectural elements. The toolbox was tested on several historical buildings and showed promising results. The ultimate intention of the project is to support manual point cloud labeling, especially when confronted with the large training data requirements of machine learning-based algorithms.


Author(s):  
N. Haala ◽  
M. Kölle ◽  
M. Cramer ◽  
D. Laupheimer ◽  
G. Mandlburger ◽  
...  

Abstract. This paper presents a study on the potential of highly accurate UAV-based 3D data capture by combining both imagery and LiDAR data. Our work is motivated by a project aiming at the monitoring of subsidence in an area of mixed use. Thus, it covers built-up regions in a village with a ship lock as the main object of interest as well as regions of agricultural use. In order to monitor potential subsidence on the order of 10 mm/year, we aim at sub-centimeter accuracies of the respective 3D point clouds. We show that hybrid georeferencing helps to increase the accuracy of the adjusted LiDAR point cloud by integrating results from photogrammetric block adjustment to improve the time-dependent trajectory corrections. As our main contribution, we demonstrate that joint orientation of laser scans and images in a hybrid adjustment framework significantly improves the relative and absolute height accuracies. By these means, accuracies corresponding to the GSD of the integrated imagery can be achieved. Image data can also help to enhance the LiDAR point clouds. As an example, integrating results from Multi-View Stereo potentially increases the point density from airborne LiDAR. Furthermore, image texture can support 3D point cloud classification. This semantic segmentation, discussed in the final part of the paper, is a prerequisite for further enhancement and analysis of the captured point cloud.


2021 ◽  
Author(s):  
Radu Alexandru Rosu ◽  
Peer Schütt ◽  
Jan Quenzel ◽  
Sven Behnke

Abstract. Deep convolutional neural networks have shown outstanding performance in the task of semantically segmenting images. Applying the same methods on 3D data still poses challenges due to the heavy memory requirements and the lack of structured data. Here, we propose LatticeNet, a novel approach for 3D semantic segmentation, which takes raw point clouds as input. A PointNet describes the local geometry which we embed into a sparse permutohedral lattice. The lattice allows for fast convolutions while keeping a low memory footprint. Further, we introduce DeformSlice, a novel learned data-dependent interpolation for projecting lattice features back onto the point cloud. We present results of 3D segmentation on multiple datasets where our method achieves state-of-the-art performance. We also extend and evaluate our network for instance and dynamic object segmentation.


Author(s):  
F. Politz ◽  
M. Sester

Abstract. Over the past years, the algorithms for dense image matching (DIM) to obtain point clouds from aerial images improved significantly. Consequently, DIM point clouds are now a good alternative to the established Airborne Laser Scanning (ALS) point clouds for remote sensing applications. In order to derive high-level applications such as digital terrain models or city models, each point within a point cloud must be assigned a class label. Usually, ALS and DIM are labelled with different classifiers due to their varying characteristics. In this work, we explore both point cloud types in a fully convolutional encoder-decoder network, which learns to classify ALS as well as DIM point clouds. As input, we project the point clouds onto a 2D image raster plane and calculate the minimal, average and maximal height values for each raster cell. The network then differentiates between the classes ground, non-ground, building and no data. We test our network in six training setups using only one point cloud type, both point clouds as well as several transfer-learning approaches. We quantitatively and qualitatively compare all results and discuss the advantages and disadvantages of all setups. The best network achieves an overall accuracy of 96% in an ALS and 83% in a DIM test set.
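The rasterization step described above (minimal, average and maximal height per raster cell) can be sketched as follows. The cell size, the grid origin and the function name are illustrative assumptions. Cells that receive no point are simply absent from the result, which would correspond to the "no data" case.

```python
def rasterize_heights(points, cell_size, origin=(0.0, 0.0)):
    """Project (x, y, z) points onto a 2D grid and summarize z per cell.

    Returns {(col, row): (z_min, z_mean, z_max)}; empty cells are omitted.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    cells = {}
    for x, y, z in points:
        # Floor division maps each coordinate to its integer cell index.
        key = (int((x - origin[0]) // cell_size), int((y - origin[1]) // cell_size))
        cells.setdefault(key, []).append(z)
    return {
        key: (min(zs), sum(zs) / len(zs), max(zs)) for key, zs in cells.items()
    }


# Two points fall into cell (0, 0), one into cell (1, 0).
cloud = [(0.25, 0.25, 1.0), (0.75, 0.5, 3.0), (1.5, 0.5, 2.0)]
raster = rasterize_heights(cloud, cell_size=1.0)
```

Stacking the three statistics as channels gives an image-like input of the kind an encoder-decoder network can consume.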


Author(s):  
Y. Cao ◽  
M. Previtali ◽  
M. Scaioni

Abstract. In the wake of the success of Deep Learning Networks (DLN) for image recognition, object detection, shape classification and semantic segmentation, this approach has proven to be both a major breakthrough and an excellent tool in point cloud classification. However, an understanding of how different types of DLN achieve their results is still lacking. In several studies the output of the segmentation/classification process is compared against benchmarks, but the network is treated as a “black-box” and intermediate steps are not deeply analysed. Specifically, the following questions are discussed here: (1) What exactly did the DLN learn from a point cloud? (2) On the basis of what information do DLN make decisions? To investigate DLN applied to point clouds in this respect, this paper examines the visual interpretability of their decision-making process. Firstly, we introduce a reconstruction network able to reconstruct and visualise the learned features, in order to address question (1). Then, we propose 3DCAM to indicate the discriminative point cloud regions used by these networks to identify a given category, thus dealing with question (2). By answering these two questions, the paper aims to offer some initial solutions for better understanding the application of DLN to point clouds.


2019 ◽  
Vol 8 (5) ◽  
pp. 213 ◽  
Author(s):  
Florent Poux ◽  
Roland Billen

Automation in point cloud data processing is central in knowledge discovery within decision-making systems. The definition of relevant features is often key for segmentation and classification, with automated workflows presenting the main challenges. In this paper, we propose a voxel-based feature engineering approach that better characterizes point clusters and provides strong support to supervised or unsupervised classification. We provide different feature generalization levels to permit interoperable frameworks. First, we recommend a shape-based feature set (SF1) that only leverages the raw X, Y, Z attributes of any point cloud. Afterwards, we derive relationship and topology between voxel entities to obtain a three-dimensional (3D) structural connectivity feature set (SF2). We also provide a knowledge-based decision tree to permit infrastructure-related classification. We study SF1/SF2 synergy on a new semantic segmentation framework for the constitution of a higher semantic representation of point clouds in relevant clusters. Finally, we benchmark the approach against novel and best-performing deep-learning methods while using the full S3DIS dataset. We highlight good performance, easy integration, and a high F1-score (> 85%) for planar-dominant classes that is comparable to state-of-the-art deep learning.
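As a rough illustration of the voxel-based starting point, the sketch below groups raw X, Y, Z points into voxels and derives a few simple per-voxel descriptors. The descriptors chosen here (count, centroid, axis-aligned extent) are assumptions made for illustration; they are not the SF1 or SF2 feature sets of the paper, and the function names are hypothetical.

```python
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group (x, y, z) points by their integer voxel index."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def voxel_summary(voxel_points):
    """Simple descriptors computed from raw coordinates only."""
    n = len(voxel_points)
    centroid = tuple(sum(p[k] for p in voxel_points) / n for k in range(3))
    extent = tuple(
        max(p[k] for p in voxel_points) - min(p[k] for p in voxel_points)
        for k in range(3)
    )
    return {"count": n, "centroid": centroid, "extent": extent}


cloud = [(0.25, 0.5, 0.5), (0.75, 0.5, 0.5), (1.5, 0.5, 0.5)]
voxels = voxelize(cloud, voxel_size=1.0)
summaries = {key: voxel_summary(pts) for key, pts in voxels.items()}
```

Because the voxel keys are integer grid indices, neighbouring voxels can be found by offsetting a key by one along each axis, which is one possible basis for connectivity-style features.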


Author(s):  
A. Adam ◽  
L. Grammatikopoulos ◽  
G. Karras ◽  
E. Protopapadakis ◽  
K. Karantzalos

Abstract. 3D semantic segmentation is the joint task of partitioning a point cloud into semantically consistent 3D regions and assigning them to a semantic class/label. While the traditional approaches for 3D semantic segmentation typically rely only on structural information of the objects (i.e. object geometry and shape), in recent years many techniques combining both visual and geometric features have emerged, taking advantage of the progress in SfM/MVS algorithms that reconstruct point clouds from multiple overlapping images. Our work describes a hybrid methodology for 3D semantic segmentation that relies on both 2D and 3D space and explores whether image selection is critical for the accuracy of 3D semantic segmentation of point clouds. Experimental results are demonstrated on a free online dataset depicting city blocks around Paris. The experimental procedure not only validates that hybrid features (geometric and visual) can achieve a more accurate semantic segmentation, but also demonstrates the importance of the most appropriate view for the 2D feature extraction.


Author(s):  
J. Balado ◽  
P. van Oosterom ◽  
L. Díaz-Vilariño ◽  
P. Arias

Abstract. Although point clouds are characterized as a type of unstructured data, the timestamp attribute can structure point clouds into scanlines and shape them into a time signal. The present work studies the transformation of the street point cloud into a time signal based on the Z component for semantic segmentation using Long Short-Term Memory (LSTM) networks. The experiment was conducted on the point cloud of a real case study. Several training sessions were performed, varying the Level of Detail of the classification (coarse level with 3 classes and fine level with 11 classes), the network depth (two levels) and the use of weighting to improve classes with a low number of points. The results showed high accuracy, reaching at best 97.3% in the classification with 3 classes (ground, buildings, and objects) and 95.7% with 11 classes. The distribution of the success rates was not the same for all classes. The classes with the highest number of points obtained better results than the others. The application of weighting improved the classes with few points at the expense of the classes with more points. Increasing the number of hidden layers was shown to be a preferable alternative to weighting. Given the high success rates and a behaviour of the LSTM consistent with other Neural Networks in point cloud processing, it is concluded that the LSTM is a feasible alternative for the semantic segmentation of point clouds transformed into time signals.
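The idea of turning a point cloud into a time signal can be sketched as below: points are ordered by timestamp and the Z values are cut into scanlines. The abstract does not say how scanline boundaries are found, so splitting at a timestamp gap larger than a threshold is an assumption, as are the function name and the `max_gap` parameter.

```python
def to_scanline_signals(points, max_gap):
    """Order (t, x, y, z) points by time and split Z values into scanlines.

    Assumed rule: a new scanline starts whenever the time between two
    consecutive points exceeds max_gap.
    """
    ordered = sorted(points, key=lambda p: p[0])
    scanlines = []
    current = []
    previous_t = None
    for t, _x, _y, z in ordered:
        if previous_t is not None and t - previous_t > max_gap:
            scanlines.append(current)
            current = []
        current.append(z)
        previous_t = t
    if current:
        scanlines.append(current)
    return scanlines


# Unordered input with a clear pause between two sweeps.
cloud = [
    (0.50, 2.0, 0.0, 2.0),
    (0.00, 0.0, 0.0, 1.0),
    (0.02, 0.2, 0.0, 1.2),
    (0.01, 0.1, 0.0, 1.1),
    (0.51, 2.1, 0.0, 2.1),
]
signals = to_scanline_signals(cloud, max_gap=0.1)
```

Each resulting list is a one-dimensional Z sequence of the kind that could be fed to a sequence model such as an LSTM.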


Author(s):  
D. Tosic ◽  
S. Tuttas ◽  
L. Hoegner ◽  
U. Stilla

Abstract. This work proposes an approach for semantic classification of an outdoor-scene point cloud acquired with a high precision Mobile Mapping System (MMS), with the major goal of contributing to the automatic creation of High Definition (HD) Maps. The automatic point labeling is achieved by combining a feature-based approach for semantic classification of point clouds with a deep learning approach for semantic segmentation of images. Both the point cloud data and the data from a multi-camera system are used for gaining spatial information in an urban scene. Two types of classification are applied for this task: 1) A feature-based approach, in which the point cloud is organized into a supervoxel structure for capturing geometric characteristics of points. Several geometric features are then extracted for appropriate representation of the local geometry, followed by removing the effect of local tendency for each supervoxel to enhance the distinction between similar structures. Lastly, the Random Forests (RF) algorithm is applied in the classification phase, for assigning labels to supervoxels and therefore to the points within them. 2) A deep learning approach, employed for semantic segmentation of MMS images of the same scene. To achieve this, an implementation of Pyramid Scene Parsing Network is used. The resulting segmented images, with each pixel containing a class label, are then projected onto the point cloud, enabling label assignment for each point. Finally, experimental results are presented for a complex urban scene, and the performance of the method is evaluated on a manually labeled dataset for the deep learning and feature-based classification individually, as well as for the result of the label fusion. The overall accuracy achieved with the fused output is 0.87 on the final test set, which significantly outperforms the results of the individual methods on the same point cloud.
The labeled data is published on the TUM-PF Semantic-Labeling-Benchmark.


2021 ◽  
Vol 16 (4) ◽  
pp. 579-587
Author(s):  
Pitisit Dillon ◽  
Pakinee Aimmanee ◽  
Akihiko Wakai ◽  
Go Sato ◽  
Hoang Viet Hung ◽  
...  

The density-based spatial clustering of applications with noise (DBSCAN) algorithm is a well-known algorithm for the spatial clustering of point cloud data. It can be applied to many applications, such as crack detection, rockfall detection, and glacier movement detection. Traditional DBSCAN requires two predefined parameters. Suitable values of these parameters depend upon the distribution of the input point cloud. Therefore, estimating these parameters is challenging. This paper proposes a new version of DBSCAN that can automatically customize the parameters. The proposed method consists of two processes: initial parameter estimation based on grid analysis and DBSCAN based on the divide-and-conquer (DC-DBSCAN) approach, which repeatedly performs DBSCAN on each cluster separately and recursively. To verify the proposed method, we applied it to a 3D point cloud dataset that was used to analyze rockfall events at the Puigcercós cliff, Spain. The total number of data points used in this study was 15,567. The experimental results show that the proposed method is better than the traditional DBSCAN in terms of purity and NMI scores. The purity scores of the proposed method and the traditional DBSCAN method were 96.22% and 91.09%, respectively. The NMI scores of the proposed method and the traditional DBSCAN method were 0.78 and 0.49, respectively. The proposed method can also detect events that traditional DBSCAN cannot detect.
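For readers unfamiliar with the purity score used in this comparison, the sketch below computes it in its standard form: each cluster is credited with its most frequent ground-truth class, and the credited counts are divided by the total number of points. The function name and toy labels are illustrative, and how noise points are treated in the paper's evaluation is not stated, so the sketch simply treats every cluster id alike.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must not be empty")
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        per_cluster[cluster][label] += 1
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(true_labels)


# Cluster 0 holds two "a" and one "b"; cluster 1 holds three "b".
clusters = [0, 0, 0, 1, 1, 1]
truth = ["a", "a", "b", "b", "b", "b"]
score = purity(clusters, truth)  # 5 of 6 points match their cluster majority
```

Purity rewards over-segmentation (one cluster per point scores 1.0), which is one reason it is usually reported together with a measure such as NMI.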

