scholarly journals Point Cloud Semantic Segmentation Network Based on Multi-Scale Feature Fusion

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1625
Author(s):  
Jing Du ◽  
Zuning Jiang ◽  
Shangfeng Huang ◽  
Zongyue Wang ◽  
Jinhe Su ◽  
...  

The semantic segmentation of small objects in point clouds is currently one of the most demanding tasks in photogrammetry and remote sensing applications. Multi-resolution feature extraction and fusion can significantly enhance the ability of object classification and segmentation, so it is widely used in the image field. For this motivation, we propose a point cloud semantic segmentation network based on multi-scale feature fusion (MSSCN) to aggregate the feature of a point cloud with different densities and improve the performance of semantic segmentation. In our method, random downsampling is first applied to obtain point clouds of different densities. A Spatial Aggregation Net (SAN) is then employed as the backbone network to extract local features from these point clouds, followed by concatenation of the extracted feature descriptors at different scales. Finally, a loss function is used to combine the different semantic information from point clouds of different densities for network optimization. Experiments were conducted on the S3DIS and ScanNet datasets, and our MSSCN achieved accuracies of 89.80% and 86.3%, respectively, on these datasets. Our method showed better performance than the recent methods PointNet, PointNet++, PointCNN, PointSIFT, and SAN.

Author(s):  
F. Politz ◽  
M. Sester

<p><strong>Abstract.</strong> Over the past years, the algorithms for dense image matching (DIM) to obtain point clouds from aerial images improved significantly. Consequently, DIM point clouds are now a good alternative to the established Airborne Laser Scanning (ALS) point clouds for remote sensing applications. In order to derive high-level applications such as digital terrain models or city models, each point within a point cloud must be assigned a class label. Usually, ALS and DIM are labelled with different classifiers due to their varying characteristics. In this work, we explore both point cloud types in a fully convolutional encoder-decoder network, which learns to classify ALS as well as DIM point clouds. As input, we project the point clouds onto a 2D image raster plane and calculate the minimal, average and maximal height values for each raster cell. The network then differentiates between the classes ground, non-ground, building and no data. We test our network in six training setups using only one point cloud type, both point clouds as well as several transfer-learning approaches. We quantitatively and qualitatively compare all results and discuss the advantages and disadvantages of all setups. The best network achieves an overall accuracy of 96<span class="thinspace"></span>% in an ALS and 83<span class="thinspace"></span>% in a DIM test set.</p>


2020 ◽  
Vol 34 (07) ◽  
pp. 12951-12958 ◽  
Author(s):  
Lin Zhao ◽  
Wenbing Tao

In this paper, we propose a novel joint instance and semantic segmentation approach, which is called JSNet, in order to address the instance and semantic segmentation of 3D point clouds simultaneously. Firstly, we build an effective backbone network to extract robust features from the raw point clouds. Secondly, to obtain more discriminative features, a point cloud feature fusion module is proposed to fuse the different layer features of the backbone network. Furthermore, a joint instance semantic segmentation module is developed to transform semantic features into instance embedding space, and then the transformed features are further fused with instance features to facilitate instance segmentation. Meanwhile, this module also aggregates instance features into semantic feature space to promote semantic segmentation. Finally, the instance predictions are generated by applying a simple mean-shift clustering on instance embeddings. As a result, we evaluate the proposed JSNet on a large-scale 3D indoor point cloud dataset S3DIS and a part dataset ShapeNet, and compare it with existing approaches. Experimental results demonstrate our approach outperforms the state-of-the-art method in 3D instance segmentation with a significant improvement in 3D semantic prediction and our method is also beneficial for part segmentation. The source code for this work is available at https://github.com/dlinzhao/JSNet.


2020 ◽  
Vol 12 (6) ◽  
pp. 1049 ◽  
Author(s):  
Jie Chen ◽  
Fen He ◽  
Yi Zhang ◽  
Geng Sun ◽  
Min Deng

The lack of pixel-level labeling limits the practicality of deep learning-based building semantic segmentation. Weakly supervised semantic segmentation based on image-level labeling results in incomplete object regions and missing boundary information. This paper proposes a weakly supervised semantic segmentation method for building detection. The proposed method takes the image-level label as supervision information in a classification network that combines superpixel pooling and multi-scale feature fusion structures. The main advantage of the proposed strategy is its ability to improve the intactness and boundary accuracy of a detected building. Our method achieves impressive results on two 2D semantic labeling datasets, which outperform some competing weakly supervised methods and are close to the result of the fully supervised method.


2021 ◽  
Vol 13 (4) ◽  
pp. 691
Author(s):  
Xiaoxiao Geng ◽  
Shunping Ji ◽  
Meng Lu ◽  
Lingli Zhao

Semantic segmentation of LiDAR point clouds has implications in self-driving, robots, and augmented reality, among others. In this paper, we propose a Multi-Scale Attentive Aggregation Network (MSAAN) to achieve the global consistency of point cloud feature representation and super segmentation performance. First, upon a baseline encoder-decoder architecture for point cloud segmentation, namely, RandLA-Net, an attentive skip connection was proposed to replace the commonly used concatenation to balance the encoder and decoder features of the same scales. Second, a channel attentive enhancement module was introduced to the local attention enhancement module to boost the local feature discriminability and aggregate the local channel structure information. Third, we developed a multi-scale feature aggregation method to capture the global structure of a point cloud from both the encoder and the decoder. The experimental results reported that our MSAAN significantly outperformed state-of-the-art methods, i.e., at least 15.3% mIoU improvement for scene-2 of CSPC dataset, 5.2% for scene-5 of CSPC dataset, and 6.6% for Toronto3D dataset.


2021 ◽  
Vol 13 (17) ◽  
pp. 3474
Author(s):  
Jian Li ◽  
Shuowen Huang ◽  
Hao Cui ◽  
Yurong Ma ◽  
Xiaolong Chen

As an important and fundamental step in 3D reconstruction, point cloud registration aims to find rigid transformation that register two point sets. The major challenge in point cloud registration techniques is finding correct correspondences in the scenes which may contain many repetitive structures and noise. This paper is primarily concerned with improving registration using a priori semantic information in the search for correspondences. In particular, we present a new point cloud registration pipeline for large outdoor scenes that takes advantage of semantic segmentation. Our method consists of extracting semantic segments from point clouds uses an efficient deep neural network; then, detecting the key points of the point cloud and using a feature descriptor to get the initial correspondence set; finally, applying a Random Sample Consensus (RANSAC) strategy to estimate the transformations that align segments with the same labels. Instead of using all points to estimate a global alignment, our method aligns two point clouds using transformations calculated by each segment with the highest inlier ratio. We evaluate our method on the publicly available Whu-TLS registration dataset. These experiments demonstrate how a priori semantic information the improves registration in terms of precision and speed.


Sign in / Sign up

Export Citation Format

Share Document