Neighborhood co-occurrence modeling in 3D point cloud segmentation

2021 ◽  
Vol 8 (2) ◽  
pp. 303-315
Author(s):  
Jingyu Gong ◽  
Zhou Ye ◽  
Lizhuang Ma

Abstract. A significant performance boost has been achieved in point cloud semantic segmentation through the use of encoder-decoder architectures and novel convolution operations for point clouds. However, co-occurrence relationships within a local region, which can directly influence segmentation results, are usually ignored by current works. In this paper, we propose a neighborhood co-occurrence matrix (NCM) to model local co-occurrence relationships in a point cloud. We generate a target NCM and a prediction NCM from the semantic labels and the prediction map, respectively. Then, Kullback-Leibler (KL) divergence is used to maximize the similarity between the target and prediction NCMs to learn the co-occurrence relationship. Moreover, for large scenes where the NCMs of a sampled point cloud and the whole scene differ greatly, we introduce a reverse form of KL divergence which can better handle this difference when supervising the prediction NCMs. We integrate our method into an existing backbone and conduct comprehensive experiments on three datasets: Semantic3D for outdoor space segmentation, and S3DIS and ScanNet v2 for indoor scene segmentation. Results indicate that our method can significantly improve upon the backbone and outperform many leading competitors.
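As a rough illustration of the idea (not the authors' implementation), an NCM can be built as a normalized histogram of class pairs over each point's neighborhood, and the target and prediction NCMs compared with KL divergence; all function names here are hypothetical:

```python
import numpy as np

def neighborhood_cooccurrence_matrix(labels, neighbors, num_classes):
    """Build a normalized class co-occurrence matrix from per-point labels.

    labels:    (N,) integer class label per point
    neighbors: (N, K) indices of the K nearest neighbors of each point
    """
    ncm = np.zeros((num_classes, num_classes))
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            ncm[labels[i], labels[j]] += 1
    return ncm / ncm.sum()  # normalize to a joint distribution

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) between two normalized co-occurrence matrices."""
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)))
```

The forward loss would be `kl_divergence(target_ncm, pred_ncm)`; the reverse form the abstract mentions simply swaps the arguments, penalizing prediction mass where the target has little.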

2021 ◽  
Vol 13 (4) ◽  
pp. 691
Author(s):  
Xiaoxiao Geng ◽  
Shunping Ji ◽  
Meng Lu ◽  
Lingli Zhao

Semantic segmentation of LiDAR point clouds has implications in self-driving, robotics, and augmented reality, among others. In this paper, we propose a Multi-Scale Attentive Aggregation Network (MSAAN) to achieve globally consistent point cloud feature representation and superior segmentation performance. First, building on a baseline encoder-decoder architecture for point cloud segmentation, namely RandLA-Net, an attentive skip connection is proposed to replace the commonly used concatenation, balancing the encoder and decoder features of the same scale. Second, a channel attentive enhancement module is introduced into the local attention enhancement module to boost local feature discriminability and aggregate local channel structure information. Third, we develop a multi-scale feature aggregation method to capture the global structure of a point cloud from both the encoder and the decoder. Experimental results show that our MSAAN significantly outperforms state-of-the-art methods, with at least a 15.3% mIoU improvement on scene-2 of the CSPC dataset, 5.2% on scene-5 of the CSPC dataset, and 6.6% on the Toronto3D dataset.
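The attentive skip connection is described only at a high level; a minimal NumPy sketch of gated encoder-decoder fusion (a sigmoid gate standing in for the paper's learned attention, with hypothetical names) might look like:

```python
import numpy as np

def attentive_skip(enc, dec):
    """Fuse same-scale encoder and decoder features with an element-wise gate.

    enc, dec: (N, C) per-point features at the same scale.
    The sigmoid gate here is a stand-in for the learned attention weights.
    """
    gate = 1.0 / (1.0 + np.exp(-(enc + dec)))  # values in (0, 1)
    # convex combination: the gate decides how much of each branch to keep
    return gate * enc + (1.0 - gate) * dec
```

Because the output is a convex combination, each fused value stays between the corresponding encoder and decoder values, unlike plain concatenation, which leaves balancing entirely to the next layer.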


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Liang Gong ◽  
Xiaofeng Du ◽  
Kai Zhu ◽  
Ke Lin ◽  
Qiaojun Lou ◽  
...  

The automated measurement of crop phenotypic parameters is of great significance to the quantitative study of crop growth. Segmentation and classification of crop point clouds help to automate the measurement of crop phenotypic parameters. At present, crop spike-shaped point cloud segmentation suffers from problems such as scarce samples, uneven point cloud distribution, occlusion between stem and spike, disorderly arrangement of points, and a lack of targeted network models. Traditional clustering methods can segment plant-organ point clouds with relatively independent spatial locations, but their accuracy is not acceptable. This paper first builds a desktop-level point cloud scanning apparatus based on a structured-light projection module to facilitate point cloud acquisition. Then, rice ear point clouds were collected and a rice ear point cloud dataset was constructed. In addition, data augmentation is used to improve sample utilization efficiency and training accuracy. Finally, a 3D point cloud convolutional neural network called Panicle-3D was designed to achieve better segmentation accuracy. Specifically, the design of Panicle-3D targets the multiscale characteristics of plant organs, combining the structure of PointConv with long and short skip connections, which accelerates the convergence of the network and reduces the loss of features during point cloud downsampling. In comparison experiments, the segmentation accuracy of Panicle-3D reaches 93.4%, higher than that of PointNet. Panicle-3D is also suitable for other, similar crop point cloud segmentation tasks.
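The abstract does not specify which augmentations were used; a common point cloud augmentation recipe (random rotation about the vertical axis plus small Gaussian jitter) can be sketched as:

```python
import numpy as np

def augment_point_cloud(points, rng=None):
    """Apply a random z-axis rotation and Gaussian jitter to (N, 3) points."""
    rng = np.random.default_rng(rng)
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    # rotation about the vertical (z) axis
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    jitter = rng.normal(scale=0.005, size=points.shape)  # small coordinate noise
    return points @ rot.T + jitter
```

Each call yields a geometrically plausible new sample from the same scan, which is how augmentation improves sample utilization when labeled spikes are scarce.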


Author(s):  
R. Honma ◽  
H. Date ◽  
S. Kanai

Abstract. Point clouds acquired using Mobile Laser Scanning (MLS) are used to extract road information such as curb stones, road markings, and roadside objects. In this paper, we present a scanline-based MLS point cloud segmentation method for various road and roadside objects. First, end points of the scanline, jump edge points, and corner points are extracted as feature points. The feature points are then interpolated to accurately extract irregular parts consisting of irregularly distributed points, such as vegetation. Next, using a point reduction method, additional feature points on smooth surfaces are extracted for segmentation at the edges of curb cuts. Finally, points between the feature points are extracted as flat segments on the scanline, and consecutive feature points are extracted as irregular segments on the scanline. These scanline segments are then integrated into flat or irregular regions. When extracting the feature points, neighboring points selected by spatial distance are used to avoid the influence of differences in point density. Experiments on an MLS point cloud demonstrated the effectiveness of the proposed method.
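Jump edge detection on a scanline can be illustrated with a simple distance-gap test between consecutive points; the threshold value and function name are illustrative, not taken from the paper:

```python
import numpy as np

def jump_edge_points(scanline, threshold):
    """Indices of jump edges on one scanline.

    scanline:  (N, 3) points ordered along the scanline
    threshold: maximum allowed gap between consecutive points
    """
    gaps = np.linalg.norm(np.diff(scanline, axis=0), axis=1)
    # +1 so the returned index is the first point *after* the jump
    return np.where(gaps > threshold)[0] + 1
```

A depth discontinuity between the road surface and, say, a pole produces one large gap, so the point right after it is flagged as a feature point.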


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2731
Author(s):  
Yunbo Rao ◽  
Menghan Zhang ◽  
Zhanglin Cheng ◽  
Junmin Xue ◽  
Jiansu Pu ◽  
...  

Accurate segmentation of entity categories is a critical step for 3D scene understanding. This paper presents a fast deep neural network model with a Dense Conditional Random Field (DCRF) as a post-processing method, which performs accurate semantic segmentation of 3D point cloud scenes. On this basis, a compact but flexible framework is introduced that performs semantic segmentation of point clouds concurrently, contributing to more precise segmentation. Moreover, based on the semantic labels, a novel DCRF model is elaborated to refine the segmentation result. Furthermore, without any sacrifice in accuracy, we optimize the original point cloud data, allowing the network to handle less data. In experiments, our proposed method is evaluated comprehensively on four indicators, demonstrating its superiority.


Author(s):  
J. Zhao ◽  
X. Zhang ◽  
Y. Wang

Abstract. Indoor 3D point cloud semantic segmentation is one of the key technologies for constructing 3D indoor models, which play an important role in domains such as indoor navigation and positioning, intelligent cities, and intelligent robots. Deep-learning-based methods for point cloud segmentation offer a higher degree of automation and intelligence. PointNet, the first deep neural network to operate on point clouds directly, mainly extracts global features but lacks the ability to learn and extract local features, which weakens the segmentation of local architectural details and reduces the precision of structural element segmentation. Addressing these problems, this paper puts forward an automatic end-to-end segmentation method based on a modified PointNet. Because the intensity of different indoor structural elements differs considerably, we input the 3D coordinates, color, and intensity of the point cloud into the feature space of the points. A MaxPooling layer is also added to the original PointNet network to improve its ability to extract and learn local features. In addition, the 1×1 convolution kernels of the original PointNet are replaced with 3×3 convolution kernels during feature extraction to improve the segmentation precision for indoor point clouds. The results show that this method improves the automation and precision of indoor point cloud segmentation: the precision exceeds 80% for structural elements such as walls and doors, and the average segmentation precision over all structural elements reaches 66%.
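Assembling the seven-dimensional per-point input (coordinates, color, intensity) described above might look like the following sketch; the normalization choices are assumptions, not taken from the paper:

```python
import numpy as np

def build_input_features(xyz, rgb, intensity):
    """Stack coordinates, color, and intensity into a (N, 7) feature matrix.

    xyz:       (N, 3) point coordinates
    rgb:       (N, 3) color values in [0, 255]
    intensity: (N,)   raw LiDAR return intensity
    """
    rgb = rgb.astype(np.float32) / 255.0              # color to [0, 1]
    inten = intensity.reshape(-1, 1).astype(np.float32)
    inten = (inten - inten.min()) / (np.ptp(inten) + 1e-8)  # min-max scale
    return np.hstack([xyz, rgb, inten])
```

This matrix is what a PointNet-style network would consume as its per-point input layer, with intensity providing the extra cue the paper exploits to separate structural elements.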


2021 ◽  
Author(s):  
Radu Alexandru Rosu ◽  
Peer Schütt ◽  
Jan Quenzel ◽  
Sven Behnke

Abstract. Deep convolutional neural networks have shown outstanding performance in the task of semantically segmenting images. Applying the same methods on 3D data still poses challenges due to the heavy memory requirements and the lack of structured data. Here, we propose LatticeNet, a novel approach for 3D semantic segmentation, which takes raw point clouds as input. A PointNet describes the local geometry which we embed into a sparse permutohedral lattice. The lattice allows for fast convolutions while keeping a low memory footprint. Further, we introduce DeformSlice, a novel learned data-dependent interpolation for projecting lattice features back onto the point cloud. We present results of 3D segmentation on multiple datasets where our method achieves state-of-the-art performance. We also extend and evaluate our network for instance and dynamic object segmentation.


2021 ◽  
Vol 13 (5) ◽  
pp. 1003
Author(s):  
Nan Luo ◽  
Hongquan Yu ◽  
Zhenfeng Huo ◽  
Jinhui Liu ◽  
Quan Wang ◽  
...  

Semantic segmentation of sensed point cloud data plays a significant role in scene understanding and reconstruction, robot navigation, etc. This work presents a Graph Convolutional Network integrating K-Nearest Neighbor (KNN) searching and the Vector of Locally Aggregated Descriptors (VLAD). KNN searching is used to construct the topological graph of each point and its neighbors. Then, we perform convolution on the edges of the constructed graph to extract representative local features using multiple Multilayer Perceptrons (MLPs). Afterwards, a trainable VLAD layer, NetVLAD, is embedded in the feature encoder to aggregate local and global contextual features. The designed feature encoder is repeated multiple times, and the extracted features are concatenated in a skip-connection style to strengthen their distinctiveness and thereby improve segmentation. Experimental results on two datasets show that the proposed work addresses the shortcoming of insufficient local feature extraction and improves the accuracy of semantic segmentation (mIoU 60.9% and oAcc 87.4% on S3DIS) compared to existing models.
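KNN graph construction, the first step described above, can be sketched with a brute-force pairwise distance matrix (a real pipeline would use a KD-tree for large clouds); names are illustrative:

```python
import numpy as np

def knn_graph(points, k):
    """Return the (N, k) indices of each point's k nearest neighbors.

    points: (N, D) coordinates; each point's own index is excluded.
    """
    # pairwise Euclidean distance matrix via broadcasting
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # never pick the point itself
    return np.argsort(d, axis=1)[:, :k]  # k closest indices per point
```

The edge convolution would then operate on pairs (x_i, x_j) along these neighbor indices before the NetVLAD aggregation.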


Author(s):  
F. Politz ◽  
M. Sester

Abstract. Over the past years, algorithms for dense image matching (DIM) to obtain point clouds from aerial images have improved significantly. Consequently, DIM point clouds are now a good alternative to the established Airborne Laser Scanning (ALS) point clouds for remote sensing applications. In order to derive high-level products such as digital terrain models or city models, each point within a point cloud must be assigned a class label. Usually, ALS and DIM are labelled with different classifiers due to their varying characteristics. In this work, we explore both point cloud types in a fully convolutional encoder-decoder network, which learns to classify ALS as well as DIM point clouds. As input, we project the point clouds onto a 2D image raster plane and calculate the minimal, average, and maximal height values for each raster cell. The network then differentiates between the classes ground, non-ground, building, and no data. We test our network in six training setups using only one point cloud type, both point clouds, as well as several transfer-learning approaches. We quantitatively and qualitatively compare all results and discuss the advantages and disadvantages of all setups. The best network achieves an overall accuracy of 96% on an ALS test set and 83% on a DIM test set.
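The min/average/max height rasterization described as the network input can be sketched as follows; the cell indexing scheme and the NaN no-data convention are assumptions of this sketch:

```python
import numpy as np
from collections import defaultdict

def rasterize_heights(points, cell_size):
    """Project (N, 3) points onto a 2D raster of (min, mean, max) heights."""
    ij = np.floor(points[:, :2] / cell_size).astype(int)
    ij -= ij.min(axis=0)                    # shift grid indices to start at zero
    cells = defaultdict(list)
    for (i, j), z in zip(map(tuple, ij), points[:, 2]):
        cells[(i, j)].append(z)             # gather heights per cell
    h, w = ij.max(axis=0) + 1
    raster = np.full((h, w, 3), np.nan)     # empty cells stay NaN ("no data")
    for (i, j), zs in cells.items():
        raster[i, j] = (min(zs), sum(zs) / len(zs), max(zs))
    return raster
```

The three channels play the role of an RGB image, letting a standard 2D fully convolutional network consume either ALS or DIM clouds.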


Author(s):  
Y. Cao ◽  
M. Previtali ◽  
M. Scaioni

Abstract. In the wake of the success of Deep Learning Networks (DLN) for image recognition, object detection, shape classification, and semantic segmentation, this approach has proven to be both a major breakthrough and an excellent tool in point cloud classification. However, an understanding of how different types of DLN achieve their results is still lacking. In several studies the output of the segmentation/classification process is compared against benchmarks, but the network is treated as a "black box" and intermediate steps are not deeply analysed. Specifically, the following questions are discussed here: (1) What exactly do DLN learn from a point cloud? (2) On the basis of what information do DLN make decisions? To conduct a quantitative investigation of DLN applied to point clouds, this paper investigates the visual interpretability of their decision-making process. Firstly, we introduce a reconstruction network able to reconstruct and visualise the learned features, addressing question (1). Then, we propose 3DCAM to indicate the discriminative point cloud regions these networks use to identify a category, thus dealing with question (2). By answering these two questions, the paper offers some initial solutions toward a better understanding of the application of DLN to point clouds.
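3DCAM itself is not detailed in the abstract; a class-activation-map-style sketch for point clouds (projecting per-point features onto one class's classifier weights, all names hypothetical) could look like:

```python
import numpy as np

def point_cam(point_features, class_weights):
    """Per-point activation map for one class, normalized to [0, 1].

    point_features: (N, C) features before the global pooling layer
    class_weights:  (C,)  classifier weights of the target class
    """
    scores = point_features @ class_weights          # per-point class evidence
    return (scores - scores.min()) / (np.ptp(scores) + 1e-8)
```

Points with values near 1 are the regions the classifier relied on for that category, which is the kind of discriminative-region evidence question (2) asks for.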


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2501 ◽  
Author(s):  
Yanan Song ◽  
Liang Gao ◽  
Xinyu Li ◽  
Weiming Shen

Deep learning is robust to the perturbation of a point cloud, which is an important data form in the Internet of Things. However, it cannot effectively capture the local information of the point cloud and recognize the fine-grained features of an object. Different levels of features in the deep learning network are integrated to obtain local information, but this strategy increases network complexity. This paper proposes an effective point cloud encoding method that facilitates the deep learning network to utilize the local information. An axis-aligned cube is used to search for a local region that represents the local information. All of the points in the local region are available to construct the feature representation of each point. These feature representations are then input to a deep learning network. Two well-known datasets, ModelNet40 shape classification benchmark and Stanford 3D Indoor Semantics Dataset, are used to test the performance of the proposed method. Compared with other methods with complicated structures, the proposed method with only a simple deep learning network, can achieve a higher accuracy in 3D object classification and semantic segmentation.
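The axis-aligned cube search for a local region can be sketched as a simple per-axis bound check; the function name and parameters are illustrative:

```python
import numpy as np

def cube_neighbors(points, center, half_size):
    """Indices of points inside an axis-aligned cube around `center`.

    points:    (N, 3) coordinates
    center:    (3,)   cube center
    half_size: half the cube edge length
    """
    mask = np.all(np.abs(points - center) <= half_size, axis=1)
    return np.where(mask)[0]
```

Because the test is a per-axis comparison rather than a radius search, it avoids distance computations entirely, which fits the paper's goal of keeping the encoding step simple.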

