Towards Urban Scene Semantic Segmentation with Deep Learning from LiDAR Point Clouds: A Case Study in Baden-Württemberg, Germany

2021 ◽  
Vol 13 (16) ◽  
pp. 3220
Author(s):  
Yanling Zou ◽  
Holger Weinacker ◽  
Barbara Koch

An accurate understanding of urban objects is critical for urban modeling, intelligent infrastructure planning and city management. The semantic segmentation of light detection and ranging (LiDAR) point clouds is a fundamental approach for urban scene analysis. Over the last few years, several methods have been developed to segment urban furniture from point clouds. However, the traditional processing of large amounts of spatial data has become increasingly costly, both time-wise and financially. Recently, deep learning (DL) techniques have been increasingly used for 3D segmentation tasks. Yet, most of these deep neural networks (DNNs) were evaluated only on benchmarks. It is, therefore, arguable whether DL approaches can achieve state-of-the-art performance in 3D point cloud segmentation in real-life scenarios. In this research, we apply an adapted DNN (ARandLA-Net) to directly process large-scale point clouds. In particular, we develop a new paradigm for training and validation, which represents a typical urban scene in central Europe (Munzingen, Freiburg, Baden-Württemberg, Germany). Our dataset consists of nearly 390 million dense points acquired by Mobile Laser Scanning (MLS); it contains a considerably larger quantity of sample points than existing datasets and includes meaningful object categories that are particular to applications for smart cities and urban planning. We further assess the DNN on our dataset and investigate a number of key challenges from varying aspects, such as data preparation strategies, the advantage of color information and the unbalanced class distribution in the real world. The final segmentation model achieved a mean Intersection-over-Union (mIoU) score of 54.4% and an overall accuracy score of 83.9%. Our experiments indicated that different data preparation strategies influenced the model performance. Additional RGB information yielded an approximately 4% higher mIoU score. Our results also demonstrate that the use of weighted cross-entropy with an inverse-square-root frequency loss led to better segmentation performance than the other losses considered.
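The class-weighting scheme mentioned above can be sketched in a few lines. This is a minimal, hypothetical NumPy illustration of weighted cross-entropy with inverse-square-root class-frequency weights, not the authors' implementation; the function names are invented for this example.

```python
import numpy as np

def inverse_sqrt_frequency_weights(labels, num_classes):
    """Per-class weights proportional to 1/sqrt(class frequency),
    so rare classes contribute more to the loss."""
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    freqs = counts / counts.sum()
    weights = 1.0 / np.sqrt(np.maximum(freqs, 1e-12))
    return weights / weights.sum() * num_classes  # normalize to mean 1

def weighted_cross_entropy(probs, labels, weights):
    """Mean per-point negative log-likelihood, scaled by the class weight
    of each point's ground-truth label."""
    picked = probs[np.arange(len(labels)), labels]
    nll = -np.log(np.maximum(picked, 1e-12))
    return float(np.mean(weights[labels] * nll))
```

In a real training pipeline the same weight vector would typically be passed to the framework's cross-entropy loss rather than computed by hand at every step.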

2021 ◽  
Vol 13 (15) ◽  
pp. 3021
Author(s):  
Bufan Zhao ◽  
Xianghong Hua ◽  
Kegen Yu ◽  
Xiaoxing He ◽  
Weixing Xue ◽  
...  

Urban object segmentation and classification are critical data processing steps in scene understanding, intelligent vehicles and 3D high-precision maps. Semantic segmentation of 3D point clouds is the foundational step in object recognition. To identify intersecting objects and improve classification accuracy, this paper proposes a segment-based classification method for 3D point clouds. The method first divides points into multi-scale supervoxels and groups them through the proposed inverse node graph (IN-Graph) construction, which needs no prior information about the nodes: it divides supervoxels by judging the connection state of the edges between them. The method reaches a global energy minimum by graph cutting, obtains structural segments as completely as possible, and retains boundaries at the same time. Then, a random forest classifier is used for supervised classification. To deal with the mislabeling of scattered fragments, a higher-order CRF with small-label-cluster optimization is proposed to refine the classification results. Experiments were carried out on a mobile laser scanning (MLS) point dataset and a terrestrial laser scanning (TLS) point dataset; overall accuracies of 97.57% and 96.39% were obtained on the two datasets. Object boundaries were retained well, and the method achieved good results in the classification of cars and motorcycles. Further experimental analyses verified the advantages of the proposed method and proved its practicability and versatility.
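The edge-based grouping of supervoxels can be illustrated with a toy sketch: given supervoxels and a per-edge connectivity decision (supplied externally here, whereas the paper derives it from the IN-Graph construction and graph cutting), connected supervoxels are merged into segments via union-find. All names are hypothetical.

```python
class UnionFind:
    """Minimal disjoint-set structure with path halving."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def group_supervoxels(num_nodes, edges, connected):
    """Merge supervoxels whose shared edge is judged 'connected';
    returns a compact segment id per supervoxel."""
    uf = UnionFind(num_nodes)
    for (a, b), is_conn in zip(edges, connected):
        if is_conn:
            uf.union(a, b)
    roots = [uf.find(i) for i in range(num_nodes)]
    ids = {r: i for i, r in enumerate(dict.fromkeys(roots))}
    return [ids[r] for r in roots]
```

Edges judged "disconnected" act as retained boundaries: the supervoxels on either side end up in different segments.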


Sensors ◽  
2020 ◽  
Vol 20 (12) ◽  
pp. 3568 ◽  
Author(s):  
Takayuki Shinohara ◽  
Haoyi Xiu ◽  
Masashi Matsuoka

In the computer vision field, many 3D deep learning models that directly manage 3D point clouds (proposed after PointNet) have been published. Moreover, deep learning-based techniques have demonstrated state-of-the-art performance on supervised learning tasks for 3D point cloud data, such as classification and segmentation tasks on open competition datasets. Furthermore, many researchers have attempted to apply these deep learning-based techniques to 3D point clouds observed by airborne laser scanners (ALSs). However, most of these studies were developed for 3D point clouds without radiometric information. In this paper, we investigate the possibility of using a deep learning method to solve the semantic segmentation task for airborne full-waveform light detection and ranging (lidar) data, which consist of geometric information and radiometric waveform data. Thus, we propose a data-driven semantic segmentation model called the full-waveform network (FWNet), which handles the waveform of full-waveform lidar data without any conversion process, such as projection onto a 2D grid or calculation of handcrafted features. Our FWNet is based on a PointNet-based architecture, which can extract the local and global features of each input waveform, along with its corresponding geographical coordinates. Subsequently, the classifier, consisting of 1D convolutional layers, predicts the class vector corresponding to the input waveform from the extracted local and global features. Our trained FWNet achieved higher recall, precision, and F1 scores on unseen test data than previously proposed methods in the full-waveform lidar data analysis domain: a mean recall of 0.73, a mean precision of 0.81, and a mean F1 score of 0.76. We further performed an ablation study, assessing the effectiveness of each part of our proposed method on the above-mentioned metrics. Moreover, we investigated the effectiveness of our PointNet-based local and global feature extraction by visualizing the feature vectors. In this way, we have shown that our network for local and global feature extraction allows training for semantic segmentation without requiring expert knowledge of full-waveform lidar data or translation into 2D images or voxels.
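The local-plus-global feature idea at the heart of PointNet-style architectures can be sketched as follows. This is a hypothetical NumPy illustration, not FWNet itself: the real model uses learned 1D convolutional layers over waveform samples, while here a single shared layer stands in for them.

```python
import numpy as np

def shared_mlp(x, w, b):
    """A shared per-point layer: the same weights are applied to every
    point, equivalent to a 1x1 (pointwise) convolution, followed by ReLU."""
    return np.maximum(x @ w + b, 0.0)

def pointnet_features(points, w, b):
    """Per-point (local) features concatenated with a max-pooled (global)
    feature; max-pooling makes the global part permutation-invariant."""
    local = shared_mlp(points, w, b)            # (N, F)
    global_feat = local.max(axis=0)             # (F,)
    tiled = np.broadcast_to(global_feat, local.shape)
    return np.concatenate([local, tiled], axis=1)  # (N, 2F)
```

A downstream classifier can then predict one class vector per point from this concatenation, which is the role the 1D convolutional classifier plays in the paper.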


2020 ◽  
Vol 9 (9) ◽  
pp. 535
Author(s):  
Francesca Matrone ◽  
Eleonora Grilli ◽  
Massimo Martini ◽  
Marina Paolanti ◽  
Roberto Pierdicca ◽  
...  

In recent years, the semantic segmentation of 3D point clouds has been a topic involving many different fields of application. Cultural heritage scenarios have become a subject of study mainly thanks to the development of photogrammetry and laser scanning techniques. Classification algorithms based on machine and deep learning methods make it possible to process huge amounts of data such as 3D point clouds. In this context, the aim of this paper is to compare machine and deep learning methods for the classification of large 3D cultural heritage point clouds. Then, considering the best performances of both techniques, it proposes an architecture named DGCNN-Mod+3Dfeat, which combines the positive aspects and advantages of the two methodologies for the semantic segmentation of cultural heritage point clouds. To demonstrate the validity of our idea, several experiments on the ArCH benchmark are reported and discussed.
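Handcrafted 3D features of the kind machine learning pipelines (and hybrid architectures) typically rely on are often eigenvalue-based local shape descriptors. The sketch below is a generic, hypothetical illustration of such covariance features; the paper does not specify that these exact formulas are the ones used in DGCNN-Mod+3Dfeat.

```python
import numpy as np

def covariance_features(neighborhood):
    """Eigenvalue-based shape descriptors for a local point neighborhood:
    linearity, planarity, and sphericity from the sorted eigenvalues of
    the neighborhood covariance matrix."""
    centered = neighborhood - neighborhood.mean(axis=0)
    cov = centered.T @ centered / len(neighborhood)
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]   # λ1 ≥ λ2 ≥ λ3
    l1, l2, l3 = np.maximum(evals, 1e-12)            # guard against zeros
    return {
        "linearity":  (l1 - l2) / l1,
        "planarity":  (l2 - l3) / l1,
        "sphericity": l3 / l1,
    }
```

Appending such descriptors as extra input channels is one common way to hand prior geometric knowledge to a learned segmentation network.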


Author(s):  
D. Tosic ◽  
S. Tuttas ◽  
L. Hoegner ◽  
U. Stilla

<p><strong>Abstract.</strong> This work proposes an approach for the semantic classification of an outdoor-scene point cloud acquired with a high-precision Mobile Mapping System (MMS), with the major goal of contributing to the automatic creation of High Definition (HD) Maps. Automatic point labeling is achieved by combining a feature-based approach for the semantic classification of point clouds with a deep learning approach for the semantic segmentation of images. Both point cloud data and data from a multi-camera system are used to gain spatial information about an urban scene. Two types of classification are applied for this task: 1) A feature-based approach, in which the point cloud is organized into a supervoxel structure to capture the geometric characteristics of points. Several geometric features are then extracted for an appropriate representation of the local geometry, followed by removing the effect of local tendency for each supervoxel to enhance the distinction between similar structures. Lastly, the Random Forests (RF) algorithm is applied in the classification phase, assigning labels to supervoxels and therefore to the points within them. 2) A deep learning approach, employed for the semantic segmentation of MMS images of the same scene using an implementation of the Pyramid Scene Parsing Network. The resulting segmented images, with each pixel carrying a class label, are then projected onto the point cloud, enabling label assignment for each point. Finally, experimental results from a complex urban scene are presented, and the performance of the method is evaluated on a manually labeled dataset, for the deep learning and feature-based classification individually as well as for the fusion of the labels. The overall accuracy achieved with the fused output is 0.87 on the final test set, which significantly outperforms the results of the individual methods on the same point cloud. The labeled data is published on the TUM-PF Semantic-Labeling-Benchmark.</p>
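The image-to-point label transfer described above can be sketched with a standard pinhole-camera projection. This is a hypothetical NumPy illustration with invented names (`K` intrinsics, `[R|t]` world-to-camera pose); the paper does not publish this exact routine, and real pipelines additionally need occlusion handling.

```python
import numpy as np

def project_labels(points, K, R, t, label_image):
    """Assign each 3D point the semantic label of the pixel it projects to;
    points behind the camera or outside the image get label -1."""
    cam = points @ R.T + t                 # world frame -> camera frame
    labels = np.full(len(points), -1)
    in_front = cam[:, 2] > 0
    uvw = cam[in_front] @ K.T              # homogeneous pixel coordinates
    uv = np.rint(uvw[:, :2] / uvw[:, 2:3]).astype(int)
    h, w = label_image.shape
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    idx = np.flatnonzero(in_front)[ok]
    labels[idx] = label_image[uv[ok, 1], uv[ok, 0]]
    return labels
```

Fusing these projected labels with the supervoxel/RF predictions per point is then what produces the combined output evaluated in the paper.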


2020 ◽  
Vol 12 (6) ◽  
pp. 1005 ◽  
Author(s):  
Roberto Pierdicca ◽  
Marina Paolanti ◽  
Francesca Matrone ◽  
Massimo Martini ◽  
Christian Morbidoni ◽  
...  

In the Digital Cultural Heritage (DCH) domain, the semantic segmentation of 3D point clouds with Deep Learning (DL) techniques can help to recognize historical architectural elements at an adequate level of detail, and thus speed up the modeling of historical buildings for developing BIM models from survey data, referred to as HBIM (Historical Building Information Modeling). In this paper, we propose a DL framework for point cloud segmentation, which employs an improved DGCNN (Dynamic Graph Convolutional Neural Network) by adding meaningful features such as normals and colour. The approach has been applied to a newly collected, publicly available DCH dataset: the ArCH (Architectural Cultural Heritage) Dataset. This dataset comprises 11 labeled point clouds, derived from the union of several single scans or from the integration of the latter with photogrammetric surveys. The scenes involved are both indoor and outdoor, with churches, chapels, cloisters, porticoes and loggias covered by a variety of vaults and borne by many different types of columns. They belong to different historical periods and styles, in order to make the dataset as little uniform and homogeneous as possible (in the repetition of the architectural elements) and the results as general as possible. The experiments yield high accuracy, demonstrating the effectiveness and suitability of the proposed approach.
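When normals are not delivered by the scanner, they are commonly estimated by local PCA before being appended to the network input alongside colour. The sketch below is a generic, hypothetical illustration of that preprocessing (brute-force neighbor search for brevity), not the paper's code.

```python
import numpy as np

def estimate_normals(points, k=8):
    """Per-point normals from local PCA: the eigenvector of the k-nearest-
    neighbor covariance matrix with the smallest eigenvalue."""
    normals = np.empty_like(points)
    for i, p in enumerate(points):
        d = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(d)[:k]]
        cov = np.cov((nbrs - nbrs.mean(axis=0)).T)
        _, vecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
        normals[i] = vecs[:, 0]
    return normals

def make_input_features(points, colors, k=8):
    """Stack xyz, normal, and rgb into a 9-channel per-point input, in the
    spirit of feature-enriched DGCNN variants."""
    return np.hstack([points, estimate_normals(points, k), colors])
```

Note that PCA normals have an arbitrary sign; production pipelines typically orient them consistently (e.g. toward the sensor) before use.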


Sensors ◽  
2019 ◽  
Vol 19 (19) ◽  
pp. 4329 ◽  
Author(s):  
Guorong Cai ◽  
Zuning Jiang ◽  
Zongyue Wang ◽  
Shangfeng Huang ◽  
Kai Chen ◽  
...  

The semantic segmentation of 3D point clouds plays a vital role in autonomous driving, 3D maps, smart cities, etc. Recent work such as PointSIFT shows that spatial structure information can improve the performance of semantic segmentation. Motivated by this, we propose the Spatial Aggregation Net (SAN) for point cloud semantic segmentation. SAN is based on a multi-directional convolution scheme that utilizes the spatial structure information of the point cloud. Firstly, Octant-Search is employed to capture the neighboring points around each sampled point. Secondly, multi-directional convolution is used to extract information from the different directions of the sampled points. Finally, max-pooling is used to aggregate the information from the different directions. Experimental results on the ScanNet database show that the proposed SAN achieves results comparable to state-of-the-art algorithms such as PointNet, PointNet++, and PointSIFT. In particular, our method performs better on flat objects, small objects, and the edge areas that connect objects. Moreover, our model achieves a good trade-off between segmentation accuracy and time complexity.
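The direction-aware neighbor selection underlying such schemes can be sketched as follows: instead of plain k-nearest neighbors, the nearest points are taken per octant around each sampled point, so every spatial direction contributes. This is a hypothetical illustration of the octant-search idea, not the paper's implementation.

```python
import numpy as np

def octant_search(points, center, per_octant=1):
    """Return indices of the nearest `per_octant` points in each of the
    8 octants around `center`, encoded by the offset signs per axis."""
    offsets = points - center
    octant = ((offsets[:, 0] > 0).astype(int) * 4
              + (offsets[:, 1] > 0).astype(int) * 2
              + (offsets[:, 2] > 0).astype(int))
    dist = np.linalg.norm(offsets, axis=1)
    picked = []
    for o in range(8):
        idx = np.flatnonzero(octant == o)
        picked.extend(idx[np.argsort(dist[idx])[:per_octant]])
    return sorted(picked)
```

A per-direction convolution can then be applied to each octant's neighbors before max-pooling aggregates the eight directional responses.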


Author(s):  
E. S. Malinverni ◽  
R. Pierdicca ◽  
M. Paolanti ◽  
M. Martini ◽  
C. Morbidoni ◽  
...  

<p><strong>Abstract.</strong> Cultural Heritage is a testimony of past human activity, and, as such, its objects exhibit great variety in their nature, size and complexity: from small artefacts and museum items to cultural landscapes, from historical buildings and ancient monuments to city centers and archaeological sites. Cultural Heritage around the globe suffers from wars, natural disasters and human negligence. The importance of digital documentation is well recognized, and there is increasing pressure to document our heritage both nationally and internationally. For this reason, the three-dimensional scanning and modeling of sites and artifacts of cultural heritage have increased remarkably in recent years. The semantic segmentation of point clouds is an essential step in the entire pipeline; in fact, it allows complex architectures to be decomposed into single elements, which are then enriched with meaningful information within Building Information Modelling software. Notwithstanding, this step is very time-consuming and entirely entrusted to the manual work of domain experts, far from being automated. This work describes a method to automatically label and cluster a point cloud based on a supervised Deep Learning approach, using a state-of-the-art Neural Network called PointNet++. Although other methods are known, we chose PointNet++ as it has achieved significant results for classifying and segmenting 3D point clouds. PointNet++ has been tested and improved by training the network with annotated point clouds coming from a real survey and evaluating how performance changes according to the input training data. This work can be of great interest to the research community dealing with point cloud semantic segmentation, since it makes public a labelled dataset of CH elements for further tests.</p>
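A core building block of PointNet++'s set-abstraction layers is farthest point sampling (FPS), which picks well-spread centroids from a dense cloud. The sketch below is a minimal NumPy version of the standard greedy FPS algorithm, included only to illustrate the mechanism the network relies on.

```python
import numpy as np

def farthest_point_sampling(points, m):
    """Greedy FPS: starting from point 0, repeatedly select the point
    farthest from all previously selected centroids."""
    selected = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(dist))
        selected.append(nxt)
        # keep, for each point, its distance to the nearest selected centroid
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return selected
```

Each selected centroid then anchors a local neighborhood from which PointNet-style features are learned at progressively coarser scales.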


Author(s):  
Desire Mulindwa Burume ◽  
Shengzhi Du

Beyond semantic segmentation, 3D instance segmentation (a process to delineate objects of interest and also classify them into a set of categories) is gaining more and more interest among researchers, since numerous computer vision applications need accurate segmentation processes (autonomous driving, indoor navigation, and even virtual or augmented reality systems). This paper gives an overview and a technical comparison of the existing deep learning architectures for handling unstructured Euclidean data in the rapidly developing field of 3D instance segmentation. First, the authors divide the 3D point cloud based instance segmentation techniques into two major categories: proposal-based methods and proposal-free methods. Then, they introduce and compare the datasets most used for 3D instance segmentation. Furthermore, they compare and analyze the performance of these techniques (speed, accuracy, robustness to noise, etc.). Finally, the paper reviews possible future directions of deep learning for 3D sensor-based information and provides insight into the most promising areas for prospective research.


2021 ◽  
Vol 87 (4) ◽  
pp. 283-293
Author(s):  
Wei Wang ◽  
Yuan Xu ◽  
Yingchao Ren ◽  
Gang Wang

Recently, performance improvements in facade parsing from 3D point clouds have been achieved by designing ever more complex network structures, which cost huge computing resources and do not take full advantage of prior knowledge of facade structure. Instead, from the perspective of data distribution, we construct a new hierarchical mesh multi-view data domain based on the characteristics of facade objects to fuse deep-learning models with prior knowledge, thereby significantly improving segmentation accuracy. We comprehensively evaluate current mainstream methods on the RueMonge 2014 dataset and demonstrate the superiority of our method. The mean intersection-over-union index on the facade-parsing task reached 76.41%, which is 2.75% higher than the previous best result. In addition, comparative experiments further analyze the reasons for the performance improvement of the proposed method.
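A multi-view approach of this kind ultimately needs to fuse per-view predictions back onto the mesh. The sketch below is a generic, hypothetical illustration of one simple fusion rule (averaging class scores over the views in which a face is visible); the paper's hierarchical scheme is more elaborate.

```python
import numpy as np

def fuse_view_predictions(face_scores_per_view, visibility):
    """Fuse per-view class scores into one label per mesh face.
    face_scores_per_view: (V, F, C) scores; visibility: (V, F) 0/1 mask.
    Averages scores over visible views, then takes the argmax class."""
    vis = visibility[:, :, None]                        # (V, F, 1)
    summed = (face_scores_per_view * vis).sum(axis=0)   # (F, C)
    count = np.maximum(vis.sum(axis=0), 1)              # avoid divide-by-zero
    return (summed / count).argmax(axis=1)              # (F,) labels
```

Because the fusion operates on mesh faces rather than raw points, structural priors (e.g. facade regularity) can be imposed at this stage.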

