WHICH 3D DATA REPRESENTATION DOES THE CROWD LIKE BEST? CROWD-BASED ACTIVE LEARNING FOR COUPLED SEMANTIC SEGMENTATION OF POINT CLOUDS AND TEXTURED MESHES

Abstract. Semantic interpretation of multi-modal datasets is of great importance in many domains of geospatial data analysis. However, when training models for automated semantic segmentation, labeled training data is required and in case of multi-modality for each representation form of the scene. To completely avoid the time-consuming and cost-intensive involvement of an expert in the annotation procedure, we propose an Active Learning (AL) pipeline where a Random Forest classifier selects a subset of points sufficient for training and where necessary labels are received from the crowd. In this AL loop, we aim on coupled semantic segmentation of an Airborne Laser Scanning (ALS) point cloud and the corresponding 3D textured mesh generated from LiDAR data and imagery in a hybrid manner. Within this work we pursue two main objectives: i) We evaluate the performance of the AL pipeline applied to an ultra-high resolution ALS point cloud and a derived textured mesh (both benchmark datasets are available at https://ifpwww.ifp.uni-stuttgart.de/benchmark/hessigheim/default.aspx). ii) We investigate the capabilities of the crowd regarding interpretation of 3D geodata and observed that the crowd performs about 3 percentage points better when labeling meshes compared to point clouds. We additionally demonstrate that labels received solely by the crowd can power a machine learning system only differing in Overall Accuracy by less than 2 percentage points for the point cloud and less than 3 percentage points for the mesh, compared to using the completely labeled training pool. For deriving this sparse training set, we ask the crowd to label 0.25 % of available training points, resulting in costs of 190 &dollar;.

Download Full-text

HYBRID ACQUISITION OF HIGH QUALITY TRAINING DATA FOR SEMANTIC SEGMENTATION OF 3D POINT CLOUDS USING CROWD-BASED ACTIVE LEARNING

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2020-501-2020 ◽

2020 ◽

Vol V-2-2020 ◽

pp. 501-508

Author(s):

M. Kölle ◽

V. Walter ◽

S. Schmohl ◽

U. Soergel

Keyword(s):

Active Learning ◽

Semantic Segmentation ◽

Point Clouds ◽

Human Interaction ◽

Training Data ◽

Semantic Interpretation ◽

Training Dataset ◽

3D Point Clouds ◽

Semantic Labeling ◽

3D Cnn

Abstract. Automated semantic interpretation of 3D point clouds is crucial for many tasks in the domain of geospatial data analysis. For this purpose, labeled training data is required, which has often to be provided manually by experts. One approach to minimize effort in terms of costs of human interaction is Active Learning (AL). The aim is to process only the subset of an unlabeled dataset that is particularly helpful with respect to class separation. Here a machine identifies informative instances which are then labeled by humans, thereby increasing the performance of the machine. In order to completely avoid involvement of an expert, this time-consuming annotation can be resolved via crowdsourcing. Therefore, we propose an approach combining AL with paid crowdsourcing. Although incorporating human interaction, our method can run fully automatically, so that only an unlabeled dataset and a fixed financial budget for the payment of the crowdworkers need to be provided. We conduct multiple iteration steps of the AL process on the ISPRS Vaihingen 3D Semantic Labeling benchmark dataset (V3D) and especially evaluate the performance of the crowd when labeling 3D points. We prove our concept by using labels derived from our crowd-based AL method for classifying the test dataset. The analysis outlines that by labeling only 0:4% of the training dataset by the crowd and spending less than 145 $, both our trained Random Forest and sparse 3D CNN classifier differ in Overall Accuracy by less than 3 percentage points compared to the same classifiers trained on the complete V3D training set.

Download Full-text

Virtual Disassembling of Historical Edifices: Experiments and Assessments of an Automatic Approach for Classifying Multi-Scalar Point Clouds into Architectural Elements

Sensors ◽

10.3390/s20082161 ◽

2020 ◽

Vol 20 (8) ◽

pp. 2161 ◽

Cited By ~ 4

Author(s):

Arnadi Murtiyoso ◽

Pierre Grussenmeyer

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Semantic Annotation ◽

Semantic Segmentation ◽

Point Clouds ◽

Training Data ◽

Algorithmic Approach ◽

Architectural Elements ◽

Semantic Labeling ◽

Geometric Point

3D heritage documentation has seen a surge in the past decade due to developments in reality-based 3D recording techniques. Several methods such as photogrammetry and laser scanning are becoming ubiquitous amongst architects, archaeologists, surveyors, and conservators. The main result of these methods is a 3D representation of the object in the form of point clouds. However, a solely geometric point cloud is often insufficient for further analysis, monitoring, and model predicting of the heritage object. The semantic annotation of point clouds remains an interesting research topic since traditionally it requires manual labeling and therefore a lot of time and resources. This paper proposes an automated pipeline to segment and classify multi-scalar point clouds in the case of heritage object. This is done in order to perform multi-level segmentation from the scale of a historical neighborhood up until that of architectural elements, specifically pillars and beams. The proposed workflow involves an algorithmic approach in the form of a toolbox which includes various functions covering the semantic segmentation of large point clouds into smaller, more manageable and semantically labeled clusters. The first part of the workflow will explain the segmentation and semantic labeling of heritage complexes into individual buildings, while a second part will discuss the use of the same toolbox to segment the resulting buildings further into architectural elements. The toolbox was tested on several historical buildings and showed promising results. The ultimate intention of the project is to help the manual point cloud labeling, especially when confronted with the large training data requirements of machine learning-based algorithms.

Download Full-text

EXPLORING CROSS-CITY SEMANTIC SEGMENTATION OF ALS POINT CLOUDS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2021-247-2021 ◽

2021 ◽

Vol XLIII-B2-2021 ◽

pp. 247-254

Author(s):

Y. Xie ◽

K. Schindler ◽

J. Tian ◽

X. X. Zhu

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Data Augmentation ◽

Poor Performance ◽

Semantic Segmentation ◽

Point Clouds ◽

Training Data ◽

The Impact ◽

Cloud Density ◽

Deep Coral

Abstract. Deep learning models achieve excellent semantic segmentation results for airborne laser scanning (ALS) point clouds, if sufficient training data are provided. Increasing amounts of annotated data are becoming publicly available thanks to contributors from all over the world. However, models trained on a specific dataset typically exhibit poor performance on other datasets. I.e., there are significant domain shifts, as data captured in different environments or by distinct sensors have different distributions. In this work, we study this domain shift and potential strategies to mitigate it, using two popular ALS datasets: the ISPRS Vaihingen benchmark from Germany and the LASDU benchmark from China. We compare different training strategies for cross-city ALS point cloud semantic segmentation. In our experiments, we analyse three factors that may lead to domain shift and affect the learning: point cloud density, LiDAR intensity, and the role of data augmentation. Moreover, we evaluate a well-known standard method of domain adaptation, deep CORAL (Sun and Saenko, 2016). In our experiments, adapting the point cloud density and appropriate data augmentation both help to reduce the domain gap and improve segmentation accuracy. On the contrary, intensity features can bring an improvement within a dataset, but deteriorate the generalisation across datasets. Deep CORAL does not further improve the accuracy over the simple adaptation of density and data augmentation, although it can mitigate the impact of improperly chosen point density, intensity features, and further dataset biases like lack of diversity.

Download Full-text

Towards synthesized training data for semantic segmentation of mobile laser scanning point clouds: Generating level crossings from real and synthetic point cloud samples

Automation in Construction ◽

10.1016/j.autcon.2021.103839 ◽

2021 ◽

Vol 130 ◽

pp. 103839

Author(s):

Gustaf Uggla ◽

Milan Horemuz

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Semantic Segmentation ◽

Point Clouds ◽

Training Data ◽

Level Crossings

Download Full-text

GEOMETRY-BASED POINT CLOUD CLASSIFICATION USING HEIGHT DISTRIBUTIONS

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2020-259-2020 ◽

2020 ◽

Vol V-2-2020 ◽

pp. 259-266

Author(s):

F. Politz ◽

M. Sester ◽

C. Brenner

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Ground Level ◽

Semantic Segmentation ◽

Point Clouds ◽

Training Data ◽

Mountainous Regions ◽

Dense Image ◽

The Mean ◽

Point Cloud Classification

Abstract. Semantic segmentation is one of the main steps in the processing chain for Airborne Laser Scanning (ALS) point clouds, but it is also one of the most labour intensive steps, as it requires many labelled examples to train a classifier. National mapping agencies (NMAs) have to acquire nationwide ALS data every couple of years for their duties. Having point clouds cover different terrains such as flat or mountainous regions, a classifier often requires a refinement using additional data from those specific terrains. In this study, we present an algorithm, which is able to classify point clouds of similar terrain types without requiring any additional training data and which is still able to achieve overall F1-Scores of over 90% in most setups. Our algorithm uses up to two height distributions within a single cell in a rasterized point cloud. For each distribution, the empirical mean and standard deviation are calculated, which are the input for a Convolutional Neural Network (CNN) classifier. Consequently, our approach only requires the geometry of point clouds, which enables also the usage of the same network structure for point clouds from other sensor systems such as Dense Image Matching. Since the mean ground level varies with the observed area, we also examined five different normalisation methods for our input in order to reduce the ground influence on the point clouds and thus increase its transferability towards other datasets. We test our trained networks on four different tests sets with the classes’ ground, building, water, non-ground and bridge.

Download Full-text

EXPLORING ALS AND DIM DATA FOR SEMANTIC SEGMENTATION USING CNNS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-1-347-2018 ◽

2018 ◽

Vol XLII-1 ◽

pp. 347-354 ◽

Cited By ~ 5

Author(s):

F. Politz ◽

M. Sester

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Semantic Segmentation ◽

Point Clouds ◽

Good Alternative ◽

Aerial Images ◽

Learning Approaches ◽

Advantages And Disadvantages ◽

Sensing Applications ◽

High Level

Abstract. Over the past years, the algorithms for dense image matching (DIM) to obtain point clouds from aerial images improved significantly. Consequently, DIM point clouds are now a good alternative to the established Airborne Laser Scanning (ALS) point clouds for remote sensing applications. In order to derive high-level applications such as digital terrain models or city models, each point within a point cloud must be assigned a class label. Usually, ALS and DIM are labelled with different classifiers due to their varying characteristics. In this work, we explore both point cloud types in a fully convolutional encoder-decoder network, which learns to classify ALS as well as DIM point clouds. As input, we project the point clouds onto a 2D image raster plane and calculate the minimal, average and maximal height values for each raster cell. The network then differentiates between the classes ground, non-ground, building and no data. We test our network in six training setups using only one point cloud type, both point clouds as well as several transfer-learning approaches. We quantitatively and qualitatively compare all results and discuss the advantages and disadvantages of all setups. The best network achieves an overall accuracy of 96% in an ALS and 83% in a DIM test set.

Download Full-text

SEMANTIC SEGMENTATION OF MOBILE LASER SCANNING POINT CLOUDS WITH LONG SHORT-TERM MEMORY NETWORKS: PRELIMINARY RESULTS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2021-123-2021 ◽

2021 ◽

Vol XLIII-B2-2021 ◽

pp. 123-130

Author(s):

J. Balado ◽

P. van Oosterom ◽

L. Díaz-Vilariño ◽

P. Arias

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Short Term Memory ◽

Semantic Segmentation ◽

Point Clouds ◽

Time Signal ◽

Success Rates ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

Abstract. Although point clouds are characterized as a type of unstructured data, timestamp attribute can structure point clouds into scanlines and shape them into a time signal. The present work studies the transformation of the street point cloud into a time signal based on the Z component for the semantic segmentation using Long Short-Term Memory (LSTM) networks. The experiment was conducted on the point cloud of a real case study. Several training sessions were performed changing the Level of Detail of the classification (coarse level with 3 classes and fine level with 11 classes), two levels of network depth and the use of weighting for the improvement of classes with low number of points. The results showed high accuracy, reaching at best 97.3% in the classification with 3 classes (ground, buildings, and objects) and 95.7% with 11 classes. The distribution of the success rates was not the same for all classes. The classes with the highest number of points obtained better results than the others. The application of weighting improved the classes with few points at the expense of the classes with more points. Increasing the number of hidden layers was shown as a preferable alternative to weighting. Given the high success rates and a behaviour of the LSTM consistent with other Neural Networks in point cloud processing, it is concluded that the LSTM is a feasible alternative for the semantic segmentation of point clouds transformed into time signals.

Download Full-text

AN EFFICIENT DEEP LEARNING APPROACH FOR GROUND POINT FILTERING IN AERIAL LASER SCANNING POINT CLOUDS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b1-2021-31-2021 ◽

2021 ◽

Vol XLIII-B1-2021 ◽

pp. 31-38

Author(s):

A. Nurunnabi ◽

F. N. Teferle ◽

J. Li ◽

R. C. Lindenbergh ◽

A. Hunegnaw

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Laser Scanning ◽

Three Dimensional ◽

Ground Surface ◽

Point Clouds ◽

Urban Environments ◽

Training Data ◽

Ground Point ◽

End To End

Abstract. Ground surface extraction is one of the classic tasks in airborne laser scanning (ALS) point cloud processing that is used for three-dimensional (3D) city modelling, infrastructure health monitoring, and disaster management. Many methods have been developed over the last three decades. Recently, Deep Learning (DL) has become the most dominant technique for 3D point cloud classification. DL methods used for classification can be categorized into end-to-end and non end-to-end approaches. One of the main challenges of using supervised DL approaches is getting a sufficient amount of training data. The main advantage of using a supervised non end-to-end approach is that it requires less training data. This paper introduces a novel local feature-based non end-to-end DL algorithm that generates a binary classifier for ground point filtering. It studies feature relevance, and investigates three models that are different combinations of features. This method is free from the limitations of point clouds’ irregular data structure and varying data density, which is the biggest challenge for using the elegant convolutional neural network. The new algorithm does not require transforming data into regular 3D voxel grids or any rasterization. The performance of the new method has been demonstrated through two ALS datasets covering urban environments. The method successfully labels ground and non-ground points in the presence of steep slopes and height discontinuity in the terrain. Experiments in this paper show that the algorithm achieves around 97% in both F1-score and model accuracy for ground point labelling.

Download Full-text

Semantic Segmentation of Building Point Clouds Using Deep Learning: A Method for Creating Training Data Using BIM to Point Cloud Label Transfer

Computing in Civil Engineering 2019 ◽

10.1061/9780784482421.052 ◽

2019 ◽

Cited By ~ 2

Author(s):

Thomas Czerniawski ◽

Fernanda Leite

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

Training Data ◽

Label Transfer

Download Full-text

Road Environment Semantic Segmentation with Deep Learning from MLS Point Cloud Data

Sensors ◽

10.3390/s19163466 ◽

2019 ◽

Vol 19 (16) ◽

pp. 3466 ◽

Cited By ~ 15

Author(s):

Balado ◽

Martínez-Sánchez ◽

Arias ◽

Novo

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Semantic Segmentation ◽

Point Clouds ◽

Success Rates ◽

Cloud Data ◽

Road Surfaces ◽

Autonomous Cars ◽

Training Cost ◽

Near Future

In the near future, the communication between autonomous cars will produce a network of sensors that will allow us to know the state of the roads in real time. Lidar technology, upon which most autonomous cars are based, allows the acquisition of 3D geometric information of the environment. The objective of this work is to use point clouds acquired by Mobile Laser Scanning (MLS) to segment the main elements of road environment (road surface, ditches, guardrails, fences, embankments, and borders) through the use of PointNet. Previously, the point cloud was automatically divided into sections in order for semantic segmentation to be scalable to different case studies, regardless of their shape or length. An overall accuracy of 92.5% has been obtained, but with large variations between classes. Elements with a greater number of points have been segmented more effectively than the other elements. In comparison with other point-by-point extraction and ANN-based classification techniques, the same success rates have been obtained for road surfaces and fences, and better results have been obtained for guardrails. Semantic segmentation with PointNet is suitable when segmenting the scene as a whole, however, if certain classes have more interest, there are other alternatives that do not need a high training cost.

Download Full-text