JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds

Lin Zhao; Wenbing Tao

doi:10.1609/aaai.v34i07.6994

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6994 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12951-12958 ◽

Cited By ~ 3

Author(s):

Lin Zhao ◽

Wenbing Tao

Keyword(s):

Point Cloud ◽

Large Scale ◽

Feature Fusion ◽

Mean Shift ◽

Semantic Segmentation ◽

Point Clouds ◽

Semantic Features ◽

Backbone Network ◽

3D Point Clouds ◽

Instance Segmentation

In this paper, we propose a novel joint instance and semantic segmentation approach, called JSNet, that addresses the instance and semantic segmentation of 3D point clouds simultaneously. First, we build an effective backbone network to extract robust features from the raw point clouds. Second, to obtain more discriminative features, a point cloud feature fusion module is proposed to fuse the features from different layers of the backbone network. Furthermore, a joint instance semantic segmentation module is developed to transform semantic features into the instance embedding space; the transformed features are then fused with instance features to facilitate instance segmentation. Meanwhile, this module also aggregates instance features into the semantic feature space to promote semantic segmentation. Finally, the instance predictions are generated by applying a simple mean-shift clustering to the instance embeddings. We evaluate the proposed JSNet on the large-scale 3D indoor point cloud dataset S3DIS and the part dataset ShapeNet, and compare it with existing approaches. Experimental results demonstrate that our approach outperforms the state-of-the-art method in 3D instance segmentation, with a significant improvement in 3D semantic prediction, and that our method is also beneficial for part segmentation. The source code for this work is available at https://github.com/dlinzhao/JSNet.
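The final clustering step mentioned in the abstract can be illustrated with a minimal flat-kernel mean shift in NumPy. This is only a sketch, not the authors' implementation (see their repository for that); the function name `mean_shift`, the `bandwidth` value, and the toy 2D embeddings are assumptions made for illustration.

```python
import numpy as np

def mean_shift(embeddings, bandwidth, n_iter=50, tol=1e-5):
    """Flat-kernel mean shift (illustrative sketch).

    Returns one integer cluster label per point and the cluster centers.
    """
    points = np.asarray(embeddings, dtype=float)
    modes = points.copy()
    for _ in range(n_iter):
        # Distances from every current mode to every original point.
        dists = np.linalg.norm(modes[:, None, :] - points[None, :, :], axis=2)
        within = (dists <= bandwidth).astype(float)
        # Move each mode to the mean of the original points inside its window.
        new_modes = within @ points / within.sum(axis=1, keepdims=True)
        shift = np.linalg.norm(new_modes - modes, axis=1).max()
        modes = new_modes
        if shift < tol:
            break
    # Points whose modes converged to (nearly) the same place share a label.
    labels = -np.ones(len(points), dtype=int)
    centers = []
    for i, mode in enumerate(modes):
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(mode)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)

# Toy embeddings (assumed data): two tight groups standing in for two instances.
group = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
toy_embeddings = np.vstack([group, group + 5.0])
toy_labels, toy_centers = mean_shift(toy_embeddings, bandwidth=1.0)
```

In a learned embedding space, points of the same instance are expected to lie close together, so each converged mode corresponds to one predicted instance. The pairwise distance matrix makes this sketch quadratic in the number of points, which is acceptable only for small inputs.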

Exploiting Structured CNNs For Semantic Segmentation Of Unstructured Point Clouds From LiDAR Sensor

Remote Sensing ◽

10.3390/rs13183621 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3621

Author(s):

Muhammad Ibrahim ◽

Naveed Akhtar ◽

Khalil Ullah ◽

Ajmal Mian

Keyword(s):

Point Cloud ◽

Large Scale ◽

Semantic Segmentation ◽

Point Clouds ◽

Histogram Equalization ◽

Lidar Data ◽

3D Point Clouds ◽

Novel Method ◽

Public Datasets ◽

Processing Techniques

Accurate semantic segmentation of 3D point clouds is a long-standing problem in remote sensing and computer vision. Due to the unstructured nature of point clouds, designing deep neural architectures for point cloud semantic segmentation is often not straightforward. In this work, we circumvent this problem by devising a technique to exploit structured neural architectures for unstructured data. In particular, we employ popular convolutional neural network (CNN) architectures to perform semantic segmentation of LiDAR data. We propose a projection-based scheme that performs an angle-wise slicing of large 3D point clouds and transforms those slices into 2D grids. Accounting for the intensity and reflectivity of the LiDAR input, the 2D grid allows us to construct a pseudo image for the point cloud slice. We enhance this image with the low-level image processing techniques of normalization, histogram equalization, and decorrelation stretch to suit our ultimate objective of semantic segmentation. A large number of images thus generated are used to train an encoder-decoder CNN model that learns to compute a segmented 2D projection of the scene, which we finally back-project to the 3D point cloud. In addition to the novel method, this article makes a second major contribution by introducing an enhanced version of our large-scale public PC-Urban outdoor dataset, which is captured in a civic setup with an Ouster LiDAR sensor. The updated dataset (PC-Urban_V2) provides nearly 8 billion points, including over 100 million points labeled for 25 classes of interest. We provide a thorough evaluation of our technique on PC-Urban_V2 and three other public datasets.
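The idea of angle-wise slicing followed by a 2D projection can be sketched as below. The abstract does not give the exact projection, so everything here is an assumption for illustration: points are binned by azimuth around the sensor origin, and each slice is rasterized over (horizontal range, height) with the maximum intensity kept per cell. The names `angle_slices` and `slice_to_image` and the grid parameters are hypothetical.

```python
import numpy as np

def angle_slices(points, n_slices):
    """Assign each point to an azimuth slice around the sensor origin."""
    azimuth = np.arctan2(points[:, 1], points[:, 0])  # in [-pi, pi]
    idx = np.floor((azimuth + np.pi) / (2 * np.pi) * n_slices).astype(int)
    return np.clip(idx, 0, n_slices - 1)  # azimuth == pi goes to the last slice

def slice_to_image(points, intensity, height=32, width=32,
                   max_range=50.0, z_range=(-2.0, 6.0)):
    """Rasterize one slice into a pseudo image (assumed layout: rows = height
    above ground, columns = horizontal range, value = max intensity)."""
    rng_xy = np.hypot(points[:, 0], points[:, 1])
    col = np.clip((rng_xy / max_range * width).astype(int), 0, width - 1)
    z0, z1 = z_range
    row = np.clip(((points[:, 2] - z0) / (z1 - z0) * height).astype(int),
                  0, height - 1)
    image = np.zeros((height, width), dtype=float)
    np.maximum.at(image, (row, col), intensity)  # keep max intensity per cell
    # row/col are returned so 2D predictions can be mapped back to the points.
    return image, row, col

# Toy usage (assumed data): four points, one per quadrant, split into 4 slices.
toy_points = np.array([[1.0, 1.0, 0.0], [-1.0, 1.0, 0.0],
                       [-1.0, -1.0, 0.0], [1.0, -1.0, 0.0]])
toy_slice_ids = angle_slices(toy_points, n_slices=4)
toy_image, toy_row, toy_col = slice_to_image(
    np.array([[3.0, 4.0, 0.0]]), np.array([0.7]))
```

Keeping the per-point `(row, col)` indices is what makes the back-projection step simple: a label predicted for a pixel can be copied to every point that fell into that cell.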

TUM-MLS-2016: An Annotated Mobile LiDAR Dataset of the TUM City Campus for Semantic Point Cloud Interpretation in Urban Areas

Remote Sensing ◽

10.3390/rs12111875 ◽

2020 ◽

Vol 12 (11) ◽

pp. 1875 ◽

Cited By ~ 1

Author(s):

Jingwei Zhu ◽

Joachim Gehrung ◽

Rong Huang ◽

Björn Borgmann ◽

Zhenghao Sun ◽

...

Keyword(s):

Test Data ◽

Urban Areas ◽

Point Cloud ◽

Large Scale ◽

Point Clouds ◽

Semantic Interpretation ◽

3D Point Clouds ◽

Semantic Labeling ◽

Benchmark Datasets ◽

Semantic Point

In the past decade, a vast number of strategies, methods, and algorithms have been developed to explore the semantic interpretation of 3D point clouds for extracting desirable information. To assess the performance of the developed algorithms or methods, public standard benchmark datasets should invariably be introduced and used, serving as an indicator and ruler in evaluation and comparison. In this work, we present large-scale mobile LiDAR point clouds acquired at the city campus of the Technical University of Munich, which have been manually annotated and can be used for the evaluation of related algorithms and methods for semantic point cloud interpretation. We created three datasets from a measurement campaign conducted in April 2016, including a benchmark dataset for semantic labeling, test data for instance segmentation, and test data for annotated single 360° laser scans. These datasets cover an urban area with roadways approximately 1 km long and include more than 40 million annotated points with eight classes of objects labeled. Moreover, experiments were carried out in which results from several baseline methods were compared and analyzed, revealing the quality of this dataset and its effectiveness for performance evaluation.

Tunnel Deformation Inspection via Global Spatial Axis Extraction from 3D Raw Point Cloud

Sensors ◽

10.3390/s20236815 ◽

2020 ◽

Vol 20 (23) ◽

pp. 6815

Author(s):

Cheng Yi ◽

Dening Lu ◽

Qian Xie ◽

Jinxuan Xu ◽

Jun Wang

Keyword(s):

Cross Sections ◽

Point Cloud ◽

Large Scale ◽

Spline Approximation ◽

Point Clouds ◽

Driving Safety ◽

Inspection Method ◽

B Spline ◽

3D Point Clouds ◽

Spatial Axis

Global inspection of large-scale tunnels is a fundamental yet challenging task to ensure the structural stability of tunnels and driving safety. Advanced LiDAR scanners, which sample tunnels into 3D point clouds, are making their debut in Tunnel Deformation Inspection (TDI). However, the acquired raw point clouds inevitably possess noticeable occlusions, missing areas, and noise/outliers. Considering the tunnel as a geometrical sweeping feature, we propose an effective tunnel deformation inspection algorithm by extracting the global spatial axis from the poor-quality raw point cloud. Essentially, we convert tunnel axis extraction into an iterative fitting optimization problem. Specifically, given the scanned raw point cloud of a tunnel, the initial design axis is sampled to generate a series of normal planes within the corresponding Frenet frame, followed by intersecting those planes with the tunnel point cloud to yield a sequence of cross sections. By fitting cross sections with circles, the fitted circle centers are approximated with a B-spline curve, which is considered as an updated axis. The procedure of “circle fitting and B-spline approximation” repeats iteratively until convergence, that is, until the distance of each fitted circle center to the current axis is smaller than a given threshold. By this means, the spatial axis of the tunnel can be accurately obtained. Subsequently, according to the practical mechanism of tunnel deformation, we design a segmentation approach to partition cross sections into meaningful pieces, based on which various inspection parameters regarding tunnel deformation can be automatically computed. A variety of practical experiments have demonstrated the feasibility and effectiveness of our inspection method.
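The circle-fitting step applied to each cross section can be illustrated with an algebraic least-squares fit. The abstract does not say which fitting method the authors use, so the choice below (the linear fit often attributed to Kasa) is an assumption, as are the function name `fit_circle` and the toy arc.

```python
import numpy as np

def fit_circle(xy):
    """Algebraic least-squares circle fit to 2D points (illustrative sketch).

    Uses (x - cx)^2 + (y - cy)^2 = r^2 rewritten as the linear system
    x^2 + y^2 = 2*cx*x + 2*cy*y + c, with c = r^2 - cx^2 - cy^2.
    """
    x, y = xy[:, 0], xy[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    radius = np.sqrt(c + cx**2 + cy**2)
    return np.array([cx, cy]), radius

# Toy cross section (assumed data): only the upper half of a circle is
# observed, mimicking an occluded scan. Center (2, -1), radius 3.
theta = np.linspace(0.0, np.pi, 20)
arc = np.column_stack([2.0 + 3.0 * np.cos(theta), -1.0 + 3.0 * np.sin(theta)])
toy_center, toy_radius = fit_circle(arc)
```

In the iterative scheme described above, the centers returned for successive cross sections would then be approximated by a B-spline to give the updated axis. A plain least-squares fit like this one is sensitive to outliers, so a robust variant would likely be needed on noisy scans.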

IMAGE TO POINT CLOUD TRANSLATION USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORK FOR AIRBORNE LIDAR DATA

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2021-169-2021 ◽

2021 ◽

Vol V-2-2021 ◽

pp. 169-174

Author(s):

T. Shinohara ◽

H. Xiu ◽

M. Matsuoka

Keyword(s):

Point Cloud ◽

Large Scale ◽

Point Clouds ◽

Airborne Lidar ◽

Real Point ◽

3D Point Cloud ◽

Generative Adversarial Network ◽

Adversarial Network ◽

3D Point Clouds ◽

Latent Features

This study introduces a novel image-to-3D-point-cloud translation method based on a conditional generative adversarial network that creates a large-scale 3D point cloud. It can generate supervised point clouds, as observed via airborne LiDAR, from aerial images. The network is composed of an encoder to produce latent features of input images, a generator to translate latent features to fake point clouds, and a discriminator to classify point clouds as fake or real. The encoder is a pre-trained ResNet; to overcome the difficulty of generating 3D point clouds in an outdoor scene, we use a FoldingNet with features from ResNet. After a fixed number of iterations, our generator can produce fake point clouds that correspond to the input image. Experimental results show that our network can learn and generate certain point clouds using the data from the 2018 IEEE GRSS Data Fusion Contest.

Can 3D Point Clouds Replace GCPs?

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsannals-ii-5-347-2014 ◽

2014 ◽

Vol II-5 ◽

pp. 347-354 ◽

Cited By ~ 2

Author(s):

G. Stavropoulou ◽

G. Tzovla ◽

A. Georgopoulos

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Large Scale ◽

Point Clouds ◽

Field Work ◽

Surface Model ◽

Control Points ◽

3D Point Clouds ◽

Surface Models

Over the past decade, large-scale photogrammetric products have been extensively used for the geometric documentation of cultural heritage monuments, as they combine metric information with the qualities of an image document. Additionally, the rising technology of terrestrial laser scanning has enabled the easier and faster production of accurate digital surface models (DSM), which have in turn contributed to the documentation of heavily textured monuments. However, due to the required accuracy of control points, photogrammetric methods are always applied in combination with surveying measurements and hence are dependent on them. Along this line of thought, this paper explores the possibility of limiting the surveying measurements and the field work necessary for the production of large-scale photogrammetric products, and proposes an alternative method in which the necessary control points, instead of being measured with surveying procedures, are chosen from a dense and accurate point cloud. Using this point cloud also as a surface model, the only field work necessary is the scanning of the object and image acquisition, which need not be subject to strict planning. To evaluate the proposed method, an algorithm and a complementary interface were produced that allow the parallel manipulation of 3D point clouds and images and through which single-image procedures take place. The paper concludes by presenting the results of a case study at the ancient temple of Hephaestus in Athens and by providing a set of guidelines for implementing the method effectively.

Semantic Segmentation of 3D Point Cloud Based on Spatial Eight-Quadrant Kernel Convolution

Remote Sensing ◽

10.3390/rs13163140 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3140

Author(s):

Liman Liu ◽

Jinjin Yu ◽

Longyu Tan ◽

Wanjuan Su ◽

Lin Zhao ◽

...

Keyword(s):

Point Cloud ◽

Poor Performance ◽

Semantic Segmentation ◽

Point Clouds ◽

Small Object ◽

Fine Grained ◽

3D Point Clouds ◽

Kernel Convolution ◽

Segmentation Accuracy ◽

Indoor Scenes

To address the problem that some existing semantic segmentation networks for 3D point clouds generally perform poorly on small objects, a Spatial Eight-Quadrant Kernel Convolution (SEQKC) algorithm is proposed to enhance the ability of the network to extract fine-grained features from 3D point clouds. As a result, the semantic segmentation accuracy of small objects in indoor scenes can be improved. Specifically, in the spherical space of the point cloud neighborhoods, a kernel point with attached weights is constructed in each octant, the distances between the kernel point and the points in its neighborhood are calculated, and the distances and the kernel points’ weights are used together to weight the point cloud features in the neighborhood space. In this way, the relationships between points are modeled, so that the local fine-grained features of the point clouds can be extracted by the SEQKC. Based on the SEQKC, we design a downsampling module for point clouds and embed it into classical semantic segmentation networks (PointNet++, PointSIFT and PointConv) for semantic segmentation. Experimental results on the benchmark dataset ScanNet V2 show that SEQKC-based PointNet++, PointSIFT and PointConv outperform the original networks by about 1.35–2.12% in terms of MIoU, and that they effectively improve the semantic segmentation performance of the networks for small objects in indoor scenes; e.g., the segmentation accuracy of the small object “picture” is improved from 0.70% with PointNet++ to 10.37% with SEQKC-PointNet++.
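The per-octant construction relies on knowing which of the eight octants around a center point each neighbor falls into. A minimal sketch of that bookkeeping is shown below; it does not reproduce the SEQKC weighting itself, and the function name `octant_index` and the sign-bit encoding are assumptions for illustration.

```python
import numpy as np

def octant_index(center, neighbors):
    """Return an octant id in 0..7 for each neighbor of `center`.

    Encoding (assumed): one bit per axis, set when the neighbor's offset
    along that axis is non-negative; x is the high bit, z the low bit.
    """
    rel = np.asarray(neighbors, dtype=float) - np.asarray(center, dtype=float)
    bits = (rel >= 0).astype(int)
    return bits[:, 0] * 4 + bits[:, 1] * 2 + bits[:, 2]

# Toy neighborhood (assumed data) around the origin.
toy_center = np.zeros(3)
toy_neighbors = np.array([[1.0, 1.0, 1.0],
                          [-1.0, -1.0, -1.0],
                          [1.0, -1.0, 1.0]])
toy_octants = octant_index(toy_center, toy_neighbors)
```

With such an index, the neighbors in each octant can be grouped and compared against that octant's kernel point, which is where the distance-based weighting described in the abstract would be applied.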

Point Cloud Semantic Segmentation Using a Deep Learning Framework for Cultural Heritage

Remote Sensing ◽

10.3390/rs12061005 ◽

2020 ◽

Vol 12 (6) ◽

pp. 1005 ◽

Cited By ~ 7

Author(s):

Roberto Pierdicca ◽

Marina Paolanti ◽

Francesca Matrone ◽

Massimo Martini ◽

Christian Morbidoni ◽

...

Keyword(s):

Deep Learning ◽

Cultural Heritage ◽

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

Information Modeling ◽

Dynamic Graph ◽

Historical Building ◽

Architectural Elements ◽

3D Point Clouds

In the Digital Cultural Heritage (DCH) domain, the semantic segmentation of 3D Point Clouds with Deep Learning (DL) techniques can help to recognize historical architectural elements at an adequate level of detail, and thus speed up the process of modeling historical buildings for developing BIM models from survey data, referred to as HBIM (Historical Building Information Modeling). In this paper, we propose a DL framework for Point Cloud segmentation, which employs an improved DGCNN (Dynamic Graph Convolutional Neural Network) by adding meaningful features such as normal and colour. The approach has been applied to a newly collected DCH Dataset which is publicly available: the ArCH (Architectural Cultural Heritage) Dataset. This dataset comprises 11 labeled point clouds, derived from the union of several single scans or from the integration of the latter with photogrammetric surveys. The scenes involved are both indoor and outdoor, with churches, chapels, cloisters, porticoes and loggias covered by a variety of vaults and supported by many different types of columns. They belong to different historical periods and different styles, in order to make the dataset as non-uniform and heterogeneous as possible (in the repetition of the architectural elements) and the results as general as possible. The experiments yield high accuracy, demonstrating the effectiveness and suitability of the proposed approach.

LASDU: A Large-Scale Aerial LiDAR Dataset for Semantic Labeling in Dense Urban Areas

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9070450 ◽

2020 ◽

Vol 9 (7) ◽

pp. 450

Author(s):

Zhen Ye ◽

Yusheng Xu ◽

Rong Huang ◽

Xiaohua Tong ◽

Xin Li ◽

...

Keyword(s):

Urban Area ◽

Urban Areas ◽

Point Cloud ◽

Large Scale ◽

Point Clouds ◽

Light Detection And Ranging ◽

Early Age ◽

3D Mapping ◽

3D Point Clouds ◽

Semantic Labeling

The semantic labeling of urban areas is an essential but challenging task for a wide variety of applications such as mapping, navigation, and monitoring. The rapid advance in Light Detection and Ranging (LiDAR) systems provides this task with a possible solution using 3D point clouds, which are accessible, affordable, accurate, and applicable. Among all types of platforms, the airborne platform with LiDAR can serve as an efficient and effective tool for large-scale 3D mapping in urban areas. Against this background, a large number of algorithms and methods have been developed to fully explore the potential of 3D point clouds. However, the creation of publicly accessible large-scale annotated datasets, which are critical for assessing the performance of the developed algorithms and methods, is still at an early stage. In this work, we present a large-scale aerial LiDAR point cloud dataset acquired in a highly dense and complex urban area for the evaluation of semantic labeling methods. This dataset covers an urban area of approximately 1 km² with highly dense buildings and includes more than three million points with five classes of objects labeled. Moreover, experiments are carried out with the results from several baseline methods, demonstrating the feasibility and capability of the dataset to serve as a benchmark for assessing semantic labeling methods.

EDC-Net: Edge Detection Capsule Network for 3D Point Clouds

Applied Sciences ◽

10.3390/app11041833 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1833 ◽

Cited By ~ 1

Author(s):

Dena Bazazian ◽

M. Eulàlia Parés

Keyword(s):

Edge Detection ◽

Network Architecture ◽

Large Scale ◽

Semantic Segmentation ◽

Point Clouds ◽

3D Point Clouds ◽

Abstract Shape ◽

Edge Points ◽

Weakly Supervised ◽

Edge Features

Edge features in point clouds are prominent due to their capability of describing the abstract shape of a set of points. Point clouds obtained by 3D scanner devices are often immense in size. Edges are essential features in large-scale point clouds since they are capable of describing the shapes in down-sampled point clouds while maintaining the principal information. In this paper, we tackle the challenges of edge detection tasks in 3D point clouds. To this end, we propose a novel technique to detect edges of point clouds based on a capsule network architecture. In this approach, we define the edge detection task for point clouds as a semantic segmentation problem. We built a classifier through the capsules to predict edge and non-edge points in 3D point clouds. We applied a weakly-supervised learning approach in order to improve the performance of our proposed method and to build in the capability of testing the technique on a wider range of shapes. We provide several quantitative and qualitative experimental results to demonstrate the robustness of our proposed EDC-Net for edge detection in 3D point clouds. We performed a statistical analysis over the ABC and ShapeNet datasets. Our numerical results demonstrate the robust and efficient performance of EDC-Net.

GENERATION OF GROUND TRUTH DATASETS FOR THE ANALYSIS OF 3D POINT CLOUDS IN URBAN SCENES ACQUIRED VIA DIFFERENT SENSORS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-3-2009-2018 ◽

2018 ◽

Vol XLII-3 ◽

pp. 2009-2015 ◽

Cited By ~ 1

Author(s):

Y. Xu ◽

Z. Sun ◽

R. Boerner ◽

T. Koch ◽

L. Hoegner ◽

...

Keyword(s):

Point Cloud ◽

Spatial Information ◽

Ground Truth ◽

Semantic Segmentation ◽

Point Clouds ◽

Urban Scenes ◽

3D Space ◽

3D Point Clouds ◽

Testing Site ◽

Voxel Grid

In this work, we report a novel way of generating ground truth datasets for analyzing point clouds from different sensors and for validating algorithms. Instead of directly labeling a large number of 3D points, which requires time-consuming manual work, a multi-resolution 3D voxel grid for the testing site is generated. Then, with the help of a set of basic labeled points from the reference dataset, we can generate a 3D labeled space of the entire testing site at different resolutions. Specifically, an octree-based voxel structure is applied to voxelize the annotated reference point cloud, by which all the points are organized in multi-resolution 3D grids. When automatically annotating new testing point clouds, a voting-based approach is applied to the labeled points within the multi-resolution voxels in order to assign a semantic label to the 3D space represented by each voxel. Lastly, robust line- and plane-based fast registration methods are developed for aligning point clouds obtained via various sensors. Benefiting from the labeled 3D spatial information, we can easily create new annotated 3D point clouds from different sensors of the same scene directly by considering the corresponding labels of the 3D space in which the points are located, which is convenient for the validation and evaluation of algorithms related to point cloud interpretation and semantic segmentation.
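The voting-based label transfer can be sketched as follows. This simplified version uses a single-resolution voxel grid rather than the octree-based multi-resolution structure described in the abstract, and it assumes the new point cloud is already registered to the reference; the function names `build_voxel_labels` and `transfer_labels` and the `unknown` marker are hypothetical.

```python
from collections import Counter, defaultdict

import numpy as np

def build_voxel_labels(ref_points, ref_labels, voxel_size):
    """Majority-vote one label per occupied voxel from labeled reference points."""
    votes = defaultdict(Counter)
    keys = np.floor(np.asarray(ref_points) / voxel_size).astype(int)
    for key, label in zip(map(tuple, keys), ref_labels):
        votes[key][int(label)] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

def transfer_labels(new_points, voxel_labels, voxel_size, unknown=-1):
    """Label new (already registered) points by the voxel they fall into."""
    keys = np.floor(np.asarray(new_points) / voxel_size).astype(int)
    return np.array([voxel_labels.get(tuple(key), unknown) for key in keys])

# Toy data (assumed): three reference points vote in voxel (0, 0, 0),
# one reference point occupies voxel (1, 0, 0).
toy_ref = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2],
                    [0.3, 0.3, 0.3], [1.5, 0.1, 0.1]])
toy_ref_labels = [1, 1, 2, 3]
toy_voxels = build_voxel_labels(toy_ref, toy_ref_labels, voxel_size=1.0)
toy_new = np.array([[0.9, 0.9, 0.9], [1.1, 0.5, 0.5], [5.0, 5.0, 5.0]])
toy_out = transfer_labels(toy_new, toy_voxels, voxel_size=1.0)
```

Points that land in a voxel with no reference votes receive the `unknown` marker here; a multi-resolution structure could instead fall back to a coarser voxel in that case.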
