GENERATION OF GROUND TRUTH DATASETS FOR THE ANALYSIS OF 3D POINT CLOUDS IN URBAN SCENES ACQUIRED VIA DIFFERENT SENSORS

In this work, we report a novel way of generating ground truth datasets for analyzing point clouds from different sensors and for validating algorithms. Instead of directly labeling a large number of 3D points, which requires time-consuming manual work, a multi-resolution 3D voxel grid of the testing site is generated. Then, with the help of a set of basic labeled points from the reference dataset, we can generate a 3D labeled space of the entire testing site at different resolutions. Specifically, an octree-based voxel structure is applied to voxelize the annotated reference point cloud, by which all the points are organized in 3D grids of multiple resolutions. When automatically annotating the new testing point clouds, a voting-based approach is applied to the labeled points within the multi-resolution voxels in order to assign a semantic label to the 3D space represented by each voxel. Lastly, robust line- and plane-based fast registration methods are developed for aligning point clouds obtained via various sensors. Benefiting from the labeled 3D spatial information, we can easily create new annotated 3D point clouds of the same scene from different sensors, directly by looking up the labels of the 3D space in which the points are located. This is convenient for the validation and evaluation of algorithms related to point cloud interpretation and semantic segmentation.
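The voxel-voting label transfer described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the point clouds are already registered, and it replaces the octree with plain hashed grids at several voxel sizes. The names `voxel_key`, `build_labeled_grids` and `annotate`, the voxel sizes and the sample labels are all hypothetical.

```python
from collections import Counter, defaultdict

def voxel_key(point, size):
    """Integer grid index of the voxel of edge length `size` containing `point`."""
    return tuple(int(c // size) for c in point)

def build_labeled_grids(ref_points, ref_labels, sizes):
    """For each voxel size, majority-vote the reference labels that fall in each voxel."""
    grids = {}
    for size in sizes:
        votes = defaultdict(Counter)
        for p, lab in zip(ref_points, ref_labels):
            votes[voxel_key(p, size)][lab] += 1
        grids[size] = {k: c.most_common(1)[0][0] for k, c in votes.items()}
    return grids

def annotate(points, grids, unknown="unlabeled"):
    """Label each new point from the finest voxel that holds reference votes."""
    out = []
    for p in points:
        label = unknown
        for size in sorted(grids):  # finest resolution first (assumed fallback order)
            label = grids[size].get(voxel_key(p, size), unknown)
            if label != unknown:
                break
        out.append(label)
    return out

# Hypothetical reference annotation and a new scan of the same (registered) scene.
ref_points = [(0.1, 0.1, 0.1), (0.2, 0.1, 0.1), (0.3, 0.2, 0.1), (3.1, 0.1, 0.1)]
ref_labels = ["ground", "ground", "ground", "facade"]
grids = build_labeled_grids(ref_points, ref_labels, sizes=[0.5, 4.0])
new_points = [(0.4, 0.4, 0.4), (1.5, 0.2, 0.2), (3.2, 0.2, 0.2), (9.0, 9.0, 9.0)]
labels = annotate(new_points, grids)
```

Falling back from fine to coarse voxels is one possible way to use the multiple resolutions; the abstract does not state how the resolutions are combined.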

Semantic Segmentation of 3D Point Cloud Based on Spatial Eight-Quadrant Kernel Convolution

Remote Sensing ◽

10.3390/rs13163140 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3140

Author(s):

Liman Liu ◽

Jinjin Yu ◽

Longyu Tan ◽

Wanjuan Su ◽

Lin Zhao ◽

...

Keyword(s):

Point Cloud ◽

Poor Performance ◽

Semantic Segmentation ◽

Point Clouds ◽

Small Object ◽

Fine Grained ◽

3D Point Clouds ◽

Kernel Convolution ◽

Segmentation Accuracy ◽

Indoor Scenes

To address the problem that existing semantic segmentation networks for 3D point clouds generally perform poorly on small objects, a Spatial Eight-Quadrant Kernel Convolution (SEQKC) algorithm is proposed to enhance the ability of the network to extract fine-grained features from 3D point clouds. As a result, the semantic segmentation accuracy of small objects in indoor scenes can be improved. Specifically, in the spherical space of the point cloud neighborhoods, a kernel point with attached weights is constructed in each octant, the distances between the kernel point and the points in its neighborhood are calculated, and the distances and the kernel points’ weights are used together to weight the point cloud features in the neighborhood space. In this way, the relationship between points is modeled, so that the local fine-grained features of the point clouds can be extracted by the SEQKC. Based on the SEQKC, we design a downsampling module for point clouds and embed it into classical semantic segmentation networks (PointNet++, PointSIFT and PointConv). Experimental results on the benchmark dataset ScanNet V2 show that the SEQKC-based PointNet++, PointSIFT and PointConv outperform the original networks by about 1.35–2.12% in terms of MIoU, and that they effectively improve the semantic segmentation performance of the networks on small objects in indoor scenes; e.g., the segmentation accuracy of the small object “picture” is improved from 0.70% for PointNet++ to 10.37% for SEQKC-PointNet++.
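The octant-wise weighting described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact formula, so the inverse-distance form, the scalar kernel weights and the names `octant_index` and `seqkc_feature` are hypothetical stand-ins for the learned quantities in the paper.

```python
import math

def octant_index(dx, dy, dz):
    """Map an offset to one of 8 octants using the signs of its components."""
    return (dx >= 0) * 4 + (dy >= 0) * 2 + (dz >= 0)

def seqkc_feature(center, neighbors, feats, kernel_pts, kernel_w, eps=1e-8):
    """Sum neighbor features, each weighted by (kernel weight / distance to kernel point).

    kernel_pts[o] is the offset of the kernel point in octant o, relative to center.
    kernel_w[o] is its scalar weight. Both stand in for learned parameters, and the
    inverse-distance weighting is an assumption, not the published formula.
    """
    dim = len(feats[0])
    out = [0.0] * dim
    for p, f in zip(neighbors, feats):
        off = [p[i] - center[i] for i in range(3)]
        o = octant_index(*off)
        d = math.dist(off, kernel_pts[o])
        w = kernel_w[o] / (d + eps)
        for i in range(dim):
            out[i] += w * f[i]
    return out

# One kernel point per octant at (+-0.5, +-0.5, +-0.5), all with weight 1.0.
kernel_pts = [tuple(0.5 if (o >> s) & 1 else -0.5 for s in (2, 1, 0)) for o in range(8)]
kernel_w = [1.0] * 8
neighbors = [(0.5, 0.5, 1.5), (-0.5, -0.5, -2.5)]
feats = [[2.0, 4.0], [2.0, 0.0]]
agg = seqkc_feature((0.0, 0.0, 0.0), neighbors, feats, kernel_pts, kernel_w)
```

In this toy input the first neighbor lies at distance 1 from its octant's kernel point and the second at distance 2, so the second contributes half as strongly.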

Point Cloud Semantic Segmentation Using a Deep Learning Framework for Cultural Heritage

Remote Sensing ◽

10.3390/rs12061005 ◽

2020 ◽

Vol 12 (6) ◽

pp. 1005 ◽

Cited By ~ 7

Author(s):

Roberto Pierdicca ◽

Marina Paolanti ◽

Francesca Matrone ◽

Massimo Martini ◽

Christian Morbidoni ◽

...

Keyword(s):

Deep Learning ◽

Cultural Heritage ◽

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

Information Modeling ◽

Dynamic Graph ◽

Historical Building ◽

Architectural Elements ◽

3D Point Clouds

In the Digital Cultural Heritage (DCH) domain, the semantic segmentation of 3D Point Clouds with Deep Learning (DL) techniques can help to recognize historical architectural elements at an adequate level of detail, and thus speed up the modeling of historical buildings for developing BIM models from survey data, referred to as HBIM (Historical Building Information Modeling). In this paper, we propose a DL framework for Point Cloud segmentation, which employs an improved DGCNN (Dynamic Graph Convolutional Neural Network) by adding meaningful features such as normal and colour. The approach has been applied to a newly collected DCH dataset which is publicly available: the ArCH (Architectural Cultural Heritage) Dataset. This dataset comprises 11 labeled point clouds, derived from the union of several single scans or from the integration of the latter with photogrammetric surveys. The scenes involved are both indoor and outdoor, with churches, chapels, cloisters, porticoes and loggias covered by a variety of vaults and supported by many different types of columns. They belong to different historical periods and different styles, in order to make the dataset as little uniform and homogeneous as possible (in the repetition of the architectural elements) and the results as general as possible. The experiments yield high accuracy, demonstrating the effectiveness and suitability of the proposed approach.
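For readers unfamiliar with DGCNN, the sketch below shows how an EdgeConv-style input can be built when each point carries extra channels such as normal and colour. It is an illustration only, not the paper's network: neighbours are found once in xyz space (DGCNN recomputes the graph in feature space at each layer), no learned layers are included, and the function names and the 9-channel layout are assumptions.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], by Euclidean distance on xyz."""
    order = sorted((j for j in range(len(points)) if j != i),
                   key=lambda j: math.dist(points[i][:3], points[j][:3]))
    return order[:k]

def edge_features(points, k):
    """For every point x_i, build EdgeConv-style inputs [x_i, x_j - x_i] over its k-NN.

    Each point is a flat list: xyz followed by any extra channels (here normal, colour).
    """
    out = []
    for i, xi in enumerate(points):
        edges = []
        for j in knn(points, i, k):
            diff = [a - b for a, b in zip(points[j], xi)]
            edges.append(list(xi) + diff)
        out.append(edges)
    return out

# Hypothetical points: x, y, z, nx, ny, nz, r, g, b
pts = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.5, 0.5],
    [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.7, 0.5, 0.5],
    [5.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.1, 0.1, 0.1],
]
ef = edge_features(pts, k=2)
```

With 9 input channels each edge vector has 18 entries, which is what a first shared MLP layer would consume in this setup.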

JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6994 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12951-12958 ◽

Cited By ~ 3

Author(s):

Lin Zhao ◽

Wenbing Tao

Keyword(s):

Point Cloud ◽

Large Scale ◽

Feature Fusion ◽

Mean Shift ◽

Semantic Segmentation ◽

Point Clouds ◽

Semantic Features ◽

Backbone Network ◽

3D Point Clouds ◽

Instance Segmentation

In this paper, we propose a novel joint instance and semantic segmentation approach, called JSNet, to address the instance and semantic segmentation of 3D point clouds simultaneously. Firstly, we build an effective backbone network to extract robust features from the raw point clouds. Secondly, to obtain more discriminative features, a point cloud feature fusion module is proposed to fuse the features from different layers of the backbone network. Furthermore, a joint instance semantic segmentation module is developed to transform semantic features into the instance embedding space, and the transformed features are then fused with instance features to facilitate instance segmentation. Meanwhile, this module also aggregates instance features into the semantic feature space to promote semantic segmentation. Finally, the instance predictions are generated by applying a simple mean-shift clustering to the instance embeddings. We evaluate the proposed JSNet on the large-scale 3D indoor point cloud dataset S3DIS and the part dataset ShapeNet, and compare it with existing approaches. Experimental results demonstrate that our approach outperforms the state-of-the-art method in 3D instance segmentation, with a significant improvement in 3D semantic prediction, and that it is also beneficial for part segmentation. The source code for this work is available at https://github.com/dlinzhao/JSNet.
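The final clustering step mentioned above can be illustrated with a small flat-kernel mean-shift sketch over embedding vectors. It is not taken from the JSNet repository; the bandwidth, the toy 2D embeddings and the helper names `mean_shift` and `group_modes` are assumptions for illustration.

```python
import math

def mean_shift(embeddings, bandwidth, iters=50, tol=1e-6):
    """Flat-kernel mean shift: move each mode to the mean of embeddings within `bandwidth`."""
    modes = [list(e) for e in embeddings]
    for _ in range(iters):
        moved = 0.0
        for m in modes:
            near = [e for e in embeddings if math.dist(m, e) <= bandwidth]
            new = [sum(c) / len(near) for c in zip(*near)]
            moved = max(moved, math.dist(m, new))
            m[:] = new
        if moved < tol:
            break
    return modes

def group_modes(modes, bandwidth):
    """Give the same instance id to points whose converged modes lie close together."""
    centers, ids = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if math.dist(m, c) <= bandwidth / 2:
                ids.append(idx)
                break
        else:
            centers.append(m)
            ids.append(len(centers) - 1)
    return ids

# Hypothetical 2D instance embeddings forming two well-separated groups.
emb = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
instance_ids = group_modes(mean_shift(emb, bandwidth=1.0), bandwidth=1.0)
```

Mean shift does not need the number of instances in advance, which is presumably why it suits per-scene instance grouping; the bandwidth still has to be chosen.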

Pantomime

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3448110 ◽

2021 ◽

Vol 5 (1) ◽

pp. 1-27

Author(s):

Sameera Palipana ◽

Dariush Salami ◽

Luis A. Leiva ◽

Stephan Sigg

Keyword(s):

Point Cloud ◽

Continuous Wave ◽

Spatial Information ◽

Point Clouds ◽

Recognition System ◽

Indoor Environments ◽

Unique Region ◽

Temporal Properties ◽

3D Point Clouds ◽

Wave Radar

We introduce Pantomime, a novel mid-air gesture recognition system exploiting spatio-temporal properties of millimeter-wave radio frequency (RF) signals. Pantomime is positioned in a unique region of the RF landscape: mid-resolution, mid-range, high-frequency sensing, which makes it ideal for motion gesture interaction. We configure a commercial frequency-modulated continuous-wave radar device to favor spatial information over temporal resolution by means of sparse 3D point clouds, and contribute a deep learning architecture that directly consumes the point cloud, enabling real-time performance with low computational demands. Pantomime achieves 95% accuracy and 99% AUC on a challenging set of 21 gestures articulated by 41 participants in two indoor environments, outperforming four state-of-the-art 3D point cloud recognizers. We further analyze the effect of the environment in 5 different indoor environments, as well as the effect of articulation speed, angle, and the distance of the person up to 5 m. We have made publicly available the collected mmWave gesture dataset, consisting of nearly 22,000 gesture instances, along with our radar sensor configuration, trained models, and source code for reproducibility. We conclude that Pantomime is resilient to various input conditions and that it may enable novel applications in industrial, vehicular, and smart home scenarios.

TESSERAE3D: A BENCHMARK FOR TESSERAE SEMANTIC SEGMENTATION IN 3D POINT CLOUDS

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2021-121-2021 ◽

2021 ◽

Vol V-2-2021 ◽

pp. 121-128

Author(s):

A. Kharroubi ◽

L. Van Wersch ◽

R. Billen ◽

F. Poux

Keyword(s):

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

3D Point Cloud ◽

Learning Models ◽

Semantic Classes ◽

3D Point Clouds ◽

Wide Range ◽

Extraction Pattern ◽

Automated Processes

Abstract. 3D point clouds of mosaic tesserae are used by heritage researchers, restorers and archaeologists for digital investigations. Information extraction, pattern analysis and semantic assignment are necessary to complement the geometric information. Automated processes that can speed up the task are highly sought after, especially new supervised approaches. However, the availability of the labelled data necessary for training supervised learning models is a significant constraint. This paper introduces Tesserae3D, a 3D point cloud benchmark dataset for training and evaluating machine learning models, applied to mosaic tesserae segmentation. It is a publicly available, very high density, coloured dataset, accompanied by a standard multi-class semantic segmentation baseline. It consists of about 502 million points and contains 11 semantic classes covering a wide range of tesserae types. We propose a semantic segmentation baseline building on radiometric and covariance features fed to ensemble learning methods. The results show an achievable F1-score of 89% and are made available at https://github.com/akharroubi/Tesserae3D, providing a simple interface to improve the score based on feedback from the research community.

Spatial Aggregation Net: Point Cloud Semantic Segmentation Based on Multi-Directional Convolution

Sensors ◽

10.3390/s19194329 ◽

2019 ◽

Vol 19 (19) ◽

pp. 4329 ◽

Cited By ~ 3

Author(s):

Guorong Cai ◽

Zuning Jiang ◽

Zongyue Wang ◽

Shangfeng Huang ◽

Kai Chen ◽

...

Keyword(s):

Spatial Structure ◽

Point Cloud ◽

Smart Cities ◽

Semantic Segmentation ◽

Point Clouds ◽

Autonomous Driving ◽

Spatial Aggregation ◽

Structure Information ◽

Aggregate Information ◽

3D Point Clouds

Semantic segmentation of 3D point clouds plays a vital role in applications such as autonomous driving, 3D maps, and smart cities. Recent work such as PointSIFT shows that spatial structure information can improve the performance of semantic segmentation. Motivated by this observation, we propose Spatial Aggregation Net (SAN) for point cloud semantic segmentation. SAN is based on a multi-directional convolution scheme that utilizes the spatial structure information of the point cloud. Firstly, Octant-Search is employed to capture the neighboring points around each sampled point. Secondly, we use multi-directional convolution to extract information from different directions of the sampled points. Finally, max-pooling is used to aggregate information from the different directions. Experiments conducted on the ScanNet database show that the proposed SAN achieves results comparable with state-of-the-art algorithms such as PointNet, PointNet++, and PointSIFT. In particular, our method performs better on flat objects, small objects, and the edge areas that connect objects. Moreover, our model offers a good trade-off between segmentation accuracy and time complexity.
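The first and last of the three steps above (octant search and max-pooling over directions) can be sketched as follows. This is a simplified illustration, not the SAN implementation: it keeps one nearest neighbour per octant, omits the learned multi-directional convolution in between, and the radius, function names and toy features are assumptions.

```python
import math

def octant_search(center, points, radius):
    """For each of the 8 octants around `center`, return the index of the nearest
    point within `radius`, or None if that octant is empty."""
    best = [None] * 8
    best_d = [radius] * 8
    for idx, p in enumerate(points):
        dx, dy, dz = (p[i] - center[i] for i in range(3))
        d = math.sqrt(dx * dx + dy * dy + dz * dz)
        if d == 0.0 or d > radius:  # skip the center itself and far points
            continue
        o = (dx >= 0) * 4 + (dy >= 0) * 2 + (dz >= 0)
        if best[o] is None or d < best_d[o]:
            best[o], best_d[o] = idx, d
    return best

def max_pool_directions(neighbor_ids, feats):
    """Channel-wise max over the features of the neighbours found in each direction."""
    picked = [feats[i] for i in neighbor_ids if i is not None]
    return [max(ch) for ch in zip(*picked)] if picked else []

points = [(1.0, 1.0, 1.0), (2.0, 2.0, 2.0), (-1.0, -1.0, -1.0), (0.5, -0.5, 0.5)]
feats = [[1, 9], [7, 7], [3, 2], [5, 4]]
ids = octant_search((0.0, 0.0, 0.0), points, radius=3.0)
pooled = max_pool_directions(ids, feats)
```

Searching per octant, rather than taking the k nearest points overall, guarantees that the neighbourhood covers every direction in which points exist, which is the property the abstract attributes to spatial structure information.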

DEEP LEARNING FOR SEMANTIC SEGMENTATION OF 3D POINT CLOUD

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-2-w15-735-2019 ◽

2019 ◽

Vol XLII-2/W15 ◽

pp. 735-742 ◽

Cited By ~ 6

Author(s):

E. S. Malinverni ◽

R. Pierdicca ◽

M. Paolanti ◽

M. Martini ◽

C. Morbidoni ◽

...

Keyword(s):

Deep Learning ◽

Cultural Heritage ◽

Point Cloud ◽

Cultural Landscapes ◽

Three Dimensional ◽

Semantic Segmentation ◽

Point Clouds ◽

Training Data ◽

Historical Building ◽

3D Point Clouds

Abstract. Cultural Heritage is a testimony of past human activity, and, as such, its objects exhibit great variety in their nature, size and complexity: from small artefacts and museum items to cultural landscapes, from historical buildings and ancient monuments to city centers and archaeological sites. Cultural Heritage around the globe suffers from wars, natural disasters and human negligence. The importance of digital documentation is well recognized, and there is increasing pressure to document our heritage both nationally and internationally. For this reason, the three-dimensional scanning and modeling of cultural heritage sites and artifacts have increased remarkably in recent years. The semantic segmentation of point clouds is an essential step of the entire pipeline; in fact, it allows complex architectures to be decomposed into single elements, which are then enriched with meaningful information within Building Information Modelling software. Nevertheless, this step is very time consuming and entrusted entirely to the manual work of domain experts, and it is far from being automated. This work describes a method to label and cluster a point cloud automatically, based on a supervised Deep Learning approach that uses a state-of-the-art Neural Network called PointNet++. Although other methods are known, we have chosen PointNet++ because it has achieved significant results in classifying and segmenting 3D point clouds. PointNet++ has been tested and improved by training the network with annotated point clouds coming from a real survey, in order to evaluate how performance changes according to the input training data. This work can be of great interest to the research community dealing with point cloud semantic segmentation, since it makes public a labelled dataset of CH elements for further tests.

HYBRID GEOREFERENCING, ENHANCEMENT AND CLASSIFICATION OF ULTRA-HIGH RESOLUTION UAV LIDAR AND IMAGE POINT CLOUDS FOR MONITORING APPLICATIONS

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2020-727-2020 ◽

2020 ◽

Vol V-2-2020 ◽

pp. 727-734

Author(s):

N. Haala ◽

M. Kölle ◽

M. Cramer ◽

D. Laupheimer ◽

G. Mandlburger ◽

...

Keyword(s):

Point Cloud ◽

Image Data ◽

Semantic Segmentation ◽

Point Clouds ◽

Airborne Lidar ◽

Image Texture ◽

Image Point ◽

Joint Orientation ◽

3D Point Clouds ◽

3D Data

Abstract. This paper presents a study on the potential of ultra-high-accuracy UAV-based 3D data capture by combining imagery and LiDAR data. Our work is motivated by a project aiming at the monitoring of subsidence in an area of mixed use. Thus, it covers built-up regions in a village, with a ship lock as the main object of interest, as well as regions of agricultural use. In order to monitor potential subsidence on the order of 10 mm/year, we aim at sub-centimeter accuracies of the respective 3D point clouds. We show that hybrid georeferencing helps to increase the accuracy of the adjusted LiDAR point cloud by integrating results from photogrammetric block adjustment to improve the time-dependent trajectory corrections. As our main contribution, we demonstrate that joint orientation of laser scans and images in a hybrid adjustment framework significantly improves the relative and absolute height accuracies. By these means, accuracies corresponding to the GSD of the integrated imagery can be achieved. Image data can also help to enhance the LiDAR point clouds. As an example, integrating results from Multi-View Stereo potentially increases the point density from airborne LiDAR. Furthermore, image texture can support 3D point cloud classification. This semantic segmentation, discussed in the final part of the paper, is a prerequisite for further enhancement and analysis of the captured point cloud.

INDOOR 3D POINT CLOUDS SEMANTIC SEGMENTATION BASES ON MODIFIED POINTNET NETWORK

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2020-369-2020 ◽

2020 ◽

Vol XLIII-B2-2020 ◽

pp. 369-373

Author(s):

J. Zhao ◽

X. Zhang ◽

Y. Wang

Keyword(s):

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

Local Features ◽

Indoor Navigation ◽

Convolution Kernel ◽

Structural Elements ◽

Global Features ◽

3D Point Clouds ◽

Point Cloud Segmentation

Abstract. Indoor 3D point cloud semantic segmentation is one of the key technologies for constructing 3D indoor models, which play an important role in domains such as indoor navigation and positioning, intelligent cities, and intelligent robots. Deep-learning-based methods for point cloud segmentation offer a higher degree of automation and intelligence. PointNet, the first deep neural network that operates on point clouds directly, mainly extracts global features but lacks the ability to learn and extract local features, which leads to poor segmentation of the local details of architecture and reduces the precision of structural element segmentation. To address these problems, this paper puts forward an automatic end-to-end segmentation method based on a modified PointNet. Because the intensity of different indoor structural elements differs considerably, we input the 3D coordinates, color, and intensity of the point cloud into the feature space of the points. In addition, a MaxPooling layer is added to the original PointNet network to improve its ability to extract and learn local features, and the 1×1 convolution kernel of the original PointNet is replaced with a 3×3 convolution kernel in the feature extraction process to improve the segmentation precision of indoor point clouds. The results show that this method improves the automation and precision of indoor point cloud segmentation: the precision exceeds 80% for structural elements such as walls and doors, and the average segmentation precision over all structural elements reaches 66%.

GENERATIVE NETWORKS FOR POINT CLOUD GENERATION IN CULTURAL HERITAGE DOMAIN

10.4995/arqueologica9.2021.12101 ◽

2021 ◽

Author(s):

Massimo Martini ◽

Roberto Pierdicca ◽

Marina Paolanti ◽

Ramona Quattrini ◽

Eva Savina Malinverni ◽

...

Keyword(s):

Cultural Heritage ◽

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

Historical Buildings ◽

Architectural Elements ◽

3D Point Clouds ◽

Chamfer Distance ◽

Segmentation Task ◽

Jensen Shannon Divergence

In the Cultural Heritage (CH) domain, the semantic segmentation of 3D point clouds with Deep Learning (DL) techniques makes it possible to recognize historical architectural elements at a suitable level of detail, and hence to expedite the process of modelling historical buildings for the development of BIM models from survey data. However, it is difficult to collect a balanced dataset of labelled architectural elements for training a network. In fact, CH objects are unique, and it is challenging for the network to recognize this kind of data. In recent years, Generative Networks have proven suitable for generating new data. Starting from these premises, in this paper Generative Networks have been used to augment a CH dataset. In particular, the performance of three state-of-the-art Generative Networks, namely PointGrow, PointFlow and PointGMM, has been compared in terms of Jensen-Shannon Divergence (JSD), Minimum Matching Distance-Chamfer Distance (MMD-CD) and Minimum Matching Distance-Earth Mover’s Distance (MMD-EMD). The generated objects have been used to augment two classes of the ArCH dataset, namely columns and windows. Then a DGCNN-Mod network was trained and tested on the semantic segmentation task, comparing its performance on the ArCH dataset with and without augmentation.
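For reference, the Chamfer-based metric named above can be sketched in a few lines. Conventions differ between papers (squared or unsquared distances, sums or means), so the variant below, using the mean of squared nearest-neighbour distances in both directions, is an assumption rather than the exact definition used in this study; the function names are illustrative.

```python
def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean squared nearest-neighbour distance, both ways."""
    def sq(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)

def mmd_cd(generated, reference):
    """Minimum Matching Distance: for each reference cloud, the Chamfer distance to
    its closest generated cloud, averaged over the reference set."""
    return sum(min(chamfer_distance(g, r) for g in generated)
               for r in reference) / len(reference)

# Two tiny hypothetical point clouds.
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd_ab = chamfer_distance(cloud_a, cloud_b)
score = mmd_cd(generated=[cloud_b], reference=[cloud_a, cloud_b])
```

A lower MMD-CD means every reference shape has at least one generated shape close to it, so the metric rewards fidelity but does not by itself penalise a generator that covers only part of the reference set.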
