Go Wider: An Efficient Neural Network for Point Cloud Analysis via Group Convolutions

To improve performance in point cloud analysis, many researchers apply deep neural networks that stack Multi-Layer-Perceptron (MLP) convolutions over an irregular point cloud. However, applying these dense MLP convolutions to a large number of points (e.g., in autonomous driving applications) is limited by the available computation and memory. To achieve higher performance while decreasing computational complexity, we propose a deep-wide neural network, named ShufflePointNet, which exploits fine-grained local features and reduces redundancy using group convolution and a channel shuffle operation. Unlike conventional operations that apply MLPs directly to the high-dimensional features of a point cloud, our model goes “wider” by first splitting the features into groups of smaller depth and applying each MLP computation to a single group only, which significantly reduces complexity and computation. At the same time, we allow communication between groups by shuffling the feature channels to capture fine-grained features. We further discuss how the multi-branch method for wider neural networks also benefits feature extraction for point clouds. We present extensive experiments on shape classification with the ModelNet40 dataset and on semantic segmentation with the large-scale datasets ShapeNet part, S3DIS and KITTI. Finally, we carry out an ablation study and compare our model with other state-of-the-art algorithms to show its efficiency in terms of complexity and accuracy.
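The two operations named in this abstract, group convolution and channel shuffle, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `(num_points, channels)` layout, the ReLU activation, and the weight shape are all assumptions made for illustration.

```python
import numpy as np


def channel_shuffle(features, groups):
    """Interleave channels across groups.

    features: array of shape (num_points, channels); layout is an assumption.
    """
    n, c = features.shape
    if c % groups != 0:
        raise ValueError("channels must be divisible by groups")
    # View channels as (groups, channels_per_group), swap the two axes,
    # then flatten so neighbouring channels come from different groups.
    return features.reshape(n, groups, c // groups).transpose(0, 2, 1).reshape(n, c)


def grouped_pointwise_mlp(features, weights):
    """Apply one small linear layer per channel group, followed by ReLU.

    weights: hypothetical shape (groups, in_per_group, out_per_group).
    """
    groups, c_in_g, c_out_g = weights.shape
    n, c = features.shape
    if c != groups * c_in_g:
        raise ValueError("feature channels do not match grouped weights")
    grouped = features.reshape(n, groups, c_in_g)
    # Each group only sees its own slice of the input channels.
    out = np.einsum("ngi,gio->ngo", grouped, weights)
    return np.maximum(out, 0.0).reshape(n, groups * c_out_g)


def dense_params(c_in, c_out):
    """Weight count of a dense pointwise layer (bias ignored)."""
    return c_in * c_out


def grouped_params(c_in, c_out, groups):
    """Weight count of the grouped layer (bias ignored)."""
    return groups * (c_in // groups) * (c_out // groups)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(1024, 64))      # 1024 points, 64 feature channels
    w = rng.normal(size=(4, 16, 16))          # 4 groups of 16 -> 16 channels
    out = channel_shuffle(grouped_pointwise_mlp(points, w), groups=4)
    print(out.shape, dense_params(64, 64), grouped_params(64, 64, 4))
```

With 64 input and 64 output channels, splitting into 4 groups cuts the weight count from 4096 to 1024 in this sketch; the shuffle afterwards lets the next grouped layer see channels that originated in different groups.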

Ground-distance segmentation of 3D LiDAR point cloud toward autonomous driving

APSIPA Transactions on Signal and Information Processing ◽

10.1017/atsip.2020.21 ◽

2020 ◽

Vol 9 ◽

Author(s):

Jian Wu ◽

Qingxiong Yang

Keyword(s):

Point Cloud ◽

Large Scale ◽

Ground Plane ◽

Semantic Segmentation ◽

Point Clouds ◽

Autonomous Driving ◽

Urban Environments ◽

Cloud Data ◽

Dense Point ◽

3D Lidar

In this paper, we study the semantic segmentation of 3D LiDAR point cloud data in urban environments for autonomous driving and propose a method that utilizes the surface information of the ground plane. In practice, the resolution of a LiDAR sensor installed in a self-driving vehicle is relatively low, so the acquired point cloud is quite sparse. While recent work on dense point cloud segmentation has achieved promising results, its performance is relatively low when applied directly to sparse point clouds. This paper focuses on semantic segmentation of the sparse point clouds obtained from a 32-channel LiDAR sensor using deep neural networks. The main contribution is the integration of ground information, which is used to group ground points that are far away from each other. Qualitative and quantitative experiments on two large-scale point cloud datasets show that the proposed method outperforms the current state of the art.

SketchGNN: Semantic Sketch Segmentation with Graph Neural Networks

ACM Transactions on Graphics ◽

10.1145/3450284 ◽

2021 ◽

Vol 40 (3) ◽

pp. 1-13

Author(s):

Lumin Yang ◽

Jiajie Zhuang ◽

Hongbo Fu ◽

Xiangzhi Wei ◽

Kun Zhou ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Large Scale ◽

State Of The Art ◽

Semantic Segmentation ◽

Structure Information ◽

Graph Neural Networks ◽

Node Labels ◽

Point Level

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph, with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves on the accuracy of state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on the large-scale, challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.

Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images

Remote Sensing ◽

10.3390/rs13163065 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3065

Author(s):

Libo Wang ◽

Rui Li ◽

Dongzhi Wang ◽

Chenxi Duan ◽

Teng Wang ◽

...

Keyword(s):

Large Scale ◽

Texture Features ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Research Field ◽

Learning Approaches ◽

Fine Grained ◽

Urban Scene ◽

Fine Resolution ◽

With Memory

Semantic segmentation of very fine resolution (VFR) urban scene images plays a significant role in several application scenarios, including autonomous driving, land cover classification, and urban planning. However, the tremendous amount of detail contained in a VFR image, especially the considerable variations in the scale and appearance of objects, severely limits the potential of existing deep learning approaches. Addressing such issues represents a promising research field in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network (BANet), which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is based on ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on stacked convolution operations. In addition, using the linear attention mechanism, a feature aggregation module is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on three large-scale urban scene image segmentation datasets, i.e., the ISPRS Vaihingen dataset, the ISPRS Potsdam dataset, and the UAVid dataset, demonstrate the effectiveness of BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.

Recalibration of Neural Networks for Point Cloud Analysis

2020 International Conference on 3D Vision (3DV) ◽

10.1109/3dv50981.2020.00054 ◽

2020 ◽

Author(s):

Ignacio Sarasua ◽

Sebastian Polsterl ◽

Christian Wachinger

Keyword(s):

Neural Networks ◽

Point Cloud ◽

Point Cloud Analysis ◽

Cloud Analysis

Methodology for remote assessment of pavement distresses from point cloud analysis

10.21079/11681/40401 ◽

2021 ◽

Author(s):

Ernest Berney ◽

Naveen Ganesh ◽

Andrew Ward ◽

J. Newman ◽

John Rushing

Keyword(s):

Point Cloud ◽

Surface Condition ◽

Three Dimensional ◽

Point Clouds ◽

Pavement Surface ◽

Cloud Models ◽

Rapid Generation ◽

Point To Point ◽

Point Cloud Analysis ◽

Cloud Analysis

The ability to remotely assess road and airfield pavement condition is critical to dynamic basing, contingency deployment, convoy entry and sustainment, and post-attack reconnaissance. Current Army processes to evaluate surface condition are time-consuming and require Soldier presence. Recent developments in the area of photogrammetry and light detection and ranging (LiDAR) enable rapid generation of three-dimensional point cloud models of the pavement surface. Point clouds were generated from data collected on a series of asphalt, concrete, and unsurfaced pavements using ground- and aerial-based sensors. ERDC-developed algorithms automatically discretize the pavement surface into cross- and grid-based sections to identify physical surface distresses such as depressions, ruts, and cracks. Depressions can be sized from the point-to-point distances bounding each depression, and surface roughness is determined based on the point heights along a given cross section. Noted distresses are exported to a distress map file containing only the distress points and their locations for later visualization and quality control along with classification and quantification. Further research and automation into point cloud analysis is ongoing with the goal of enabling Soldiers with limited training the capability to rapidly assess pavement surface condition from a remote platform.

Semantic Segmentation of 3D Point Cloud Based on Spatial Eight-Quadrant Kernel Convolution

Remote Sensing ◽

10.3390/rs13163140 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3140

Author(s):

Liman Liu ◽

Jinjin Yu ◽

Longyu Tan ◽

Wanjuan Su ◽

Lin Zhao ◽

...

Keyword(s):

Point Cloud ◽

Poor Performance ◽

Semantic Segmentation ◽

Point Clouds ◽

Small Object ◽

Fine Grained ◽

3D Point Clouds ◽

Kernel Convolution ◽

Segmentation Accuracy ◽

Indoor Scenes

To address the problem that some existing semantic segmentation networks for 3D point clouds perform poorly on small objects, a Spatial Eight-Quadrant Kernel Convolution (SEQKC) algorithm is proposed to enhance the network's ability to extract fine-grained features from 3D point clouds. As a result, the semantic segmentation accuracy for small objects in indoor scenes can be improved. Specifically, in the spherical space of a point cloud neighborhood, a kernel point with attached weights is constructed in each octant, the distances between the kernel point and the points in its neighborhood are calculated, and the distances and the kernel points’ weights are used together to weight the point cloud features in the neighborhood space. In this way, the relationships between points are modeled, so that the local fine-grained features of the point clouds can be extracted by the SEQKC. Based on the SEQKC, we design a downsampling module for point clouds and embed it into classical semantic segmentation networks (PointNet++, PointSIFT and PointConv). Experimental results on the benchmark dataset ScanNet V2 show that the SEQKC-based PointNet++, PointSIFT and PointConv outperform the original networks by about 1.35–2.12% in terms of MIoU, and that they effectively improve the networks' semantic segmentation performance on small objects in indoor scenes; e.g., the segmentation accuracy for the small object “picture” is improved from 0.70% with PointNet++ to 10.37% with SEQKC-PointNet++.
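The octant partition and the point-to-kernel-point distances described in this abstract can be sketched as follows. This only illustrates the geometric bookkeeping: the placement of one kernel point on each octant diagonal at half the neighborhood radius is a hypothetical choice, all function names are invented for illustration, and the abstract does not specify how distances and kernel weights are combined, so that step is left out.

```python
import numpy as np


def octant_index(offsets):
    """Map (k, 3) neighbor offsets (neighbor minus center) to octant ids 0..7.

    Bit 0 is set when x >= 0, bit 1 when y >= 0, bit 2 when z >= 0.
    """
    signs = (np.asarray(offsets) >= 0).astype(int)
    return signs[:, 0] + 2 * signs[:, 1] + 4 * signs[:, 2]


def octant_kernel_points(radius):
    """Hypothetical placement: one kernel point per octant, located on the
    octant diagonal at distance radius / 2 from the center."""
    signs = np.array(
        [[1.0 if (i >> b) & 1 else -1.0 for b in range(3)] for i in range(8)]
    )
    return signs * (radius / 2.0) / np.sqrt(3.0)


def kernel_point_distances(offsets, kernel_points):
    """Distance from each neighbor to the kernel point of its own octant."""
    offsets = np.asarray(offsets, dtype=float)
    octs = octant_index(offsets)
    return np.linalg.norm(offsets - kernel_points[octs], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    neighbors = rng.uniform(-1.0, 1.0, size=(16, 3))  # offsets around a center
    kp = octant_kernel_points(radius=1.0)
    print(octant_index(neighbors))
    print(kernel_point_distances(neighbors, kp))
```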

Semantic Segmentation of Large-Scale Outdoor Point Clouds by Encoder–Decoder Shared MLPs with Multiple Losses

Remote Sensing ◽

10.3390/rs13163121 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3121

Author(s):

Beanbonyka Rim ◽

Ahyoung Lee ◽

Min Hong

Keyword(s):

Large Scale ◽

Semantic Segmentation ◽

Point Clouds ◽

Autonomous Driving ◽

Trade Off ◽

Efficiency And Effectiveness ◽

Benchmark Datasets ◽

Scale Characteristics ◽

3D Lidar ◽

Geometry Mapping

Semantic segmentation of large-scale outdoor 3D LiDAR point clouds has become essential for understanding the scene environment in various applications, such as geometry mapping and autonomous driving. Although 3D LiDAR point clouds have the advantage of lying in a 3D metric space, they pose a challenge for deep learning approaches because of their unstructured, unordered, irregular, and large-scale characteristics. This paper therefore presents an encoder–decoder shared multi-layer perceptron (MLP) with multiple losses to address this semantic segmentation problem. The challenge gives rise to a trade-off between efficiency and effectiveness. To balance this trade-off, we propose simple yet effective common mechanisms: a random point sampling layer, an attention-based pooling layer, and a summation of multiple losses, integrated with the encoder–decoder shared MLPs method for semantic segmentation of large-scale outdoor point clouds. We conducted experiments on two large-scale benchmark datasets, Toronto-3D and DALES. Our method achieved an overall accuracy (OA) and a mean intersection over union (mIoU) of 83.60% and 71.03% on the Toronto-3D dataset and 76.43% and 59.52% on the DALES dataset, respectively. Additionally, the proposed model has few parameters and runs about three times faster than PointNet++ during inference.
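For readers unfamiliar with the two metrics reported above, the following sketch computes overall accuracy (OA) and mean intersection over union (mIoU) from per-point labels using their standard definitions. The function names and the toy labels are illustrative assumptions; the abstract does not describe the authors' evaluation code, and benchmark protocols may differ in details such as ignored classes.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    flat = y_true * num_classes + y_pred
    counts = np.bincount(flat, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def overall_accuracy(cm):
    """Fraction of points whose predicted label equals the true label."""
    return np.trace(cm) / cm.sum()


def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that occur at all."""
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0  # skip classes absent from both labels and predictions
    return float((tp[present] / union[present]).mean())


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]  # toy per-point labels, not real benchmark data
    pred = [0, 1, 1, 1, 2, 0]
    cm = confusion_matrix(truth, pred, num_classes=3)
    print(overall_accuracy(cm), mean_iou(cm))
```

Because mIoU averages over classes rather than points, it penalizes poor performance on rare classes more than OA does, which is one reason the two figures in the abstract differ so much.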

JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6994 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12951-12958 ◽

Cited By ~ 3

Author(s):

Lin Zhao ◽

Wenbing Tao

Keyword(s):

Point Cloud ◽

Large Scale ◽

Feature Fusion ◽

Mean Shift ◽

Semantic Segmentation ◽

Point Clouds ◽

Semantic Features ◽

Backbone Network ◽

3D Point Clouds ◽

Instance Segmentation

In this paper, we propose a novel joint instance and semantic segmentation approach, called JSNet, to address the instance and semantic segmentation of 3D point clouds simultaneously. First, we build an effective backbone network to extract robust features from the raw point clouds. Second, to obtain more discriminative features, a point cloud feature fusion module is proposed to fuse the features from different layers of the backbone network. Furthermore, a joint instance semantic segmentation module is developed to transform semantic features into the instance embedding space; the transformed features are then fused with instance features to facilitate instance segmentation. Meanwhile, this module also aggregates instance features into the semantic feature space to promote semantic segmentation. Finally, the instance predictions are generated by applying a simple mean-shift clustering to the instance embeddings. We evaluate the proposed JSNet on the large-scale 3D indoor point cloud dataset S3DIS and the part dataset ShapeNet, and compare it with existing approaches. Experimental results demonstrate that our approach outperforms the state-of-the-art method in 3D instance segmentation, with a significant improvement in 3D semantic prediction, and that our method is also beneficial for part segmentation. The source code for this work is available at https://github.com/dlinzhao/JSNet.
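The final step described above, mean-shift clustering on instance embeddings, can be illustrated with a small flat-kernel mean-shift in NumPy. This is a generic sketch rather than the JSNet code: the flat kernel, the bandwidth value, the rule for merging nearby modes, and the toy 2D "embeddings" are all assumptions.

```python
import numpy as np


def mean_shift(points, bandwidth, max_iters=50):
    """Flat-kernel mean shift; returns (labels, cluster_centers)."""
    pts = np.asarray(points, dtype=float)
    modes = pts.copy()
    for _ in range(max_iters):
        # Pairwise distances between current modes and the original points.
        dists = np.linalg.norm(modes[:, None, :] - pts[None, :, :], axis=2)
        inside = (dists <= bandwidth).astype(float)
        # Move each mode to the mean of the points inside its window.
        shifted = inside @ pts / inside.sum(axis=1, keepdims=True)
        if np.allclose(shifted, modes):
            break
        modes = shifted
    # Merge modes that converged to (almost) the same location.
    labels = np.full(len(pts), -1, dtype=int)
    centers = []
    for i, mode in enumerate(modes):
        for j, center in enumerate(centers):
            if np.linalg.norm(mode - center) < bandwidth / 2.0:
                labels[i] = j
                break
        else:
            centers.append(mode)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)


if __name__ == "__main__":
    # Toy 2D embeddings forming two well-separated instances.
    emb = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
    labels, centers = mean_shift(emb, bandwidth=1.0)
    print(labels, centers)
```

Mean shift does not need the number of instances in advance, which suits instance segmentation where the object count varies per scene; the bandwidth takes the place of that parameter.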

Geometric Attentional Dynamic Graph Convolutional Neural Networks for Point Cloud Analysis

Neurocomputing ◽

10.1016/j.neucom.2020.12.067 ◽

2020 ◽

Author(s):

Yiming Cui ◽

Xin Liu ◽

Hongmin Liu ◽

Jiyong Zhang ◽

Alina Zare ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Point Cloud ◽

Dynamic Graph ◽

Point Cloud Analysis ◽

Cloud Analysis

Lightweight Convolutional Neural Networks with Model-Switching Architecture for Multi-Scenario Road Semantic Segmentation

Applied Sciences ◽

10.3390/app11167424 ◽

2021 ◽

Vol 11 (16) ◽

pp. 7424

Author(s):

Peng-Wei Lin ◽

Chih-Ming Hsu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Individual Model ◽

Suppression Effect ◽

Optimal Weights ◽

Model Size ◽

Multiple Scenarios

A convolutional neural network (CNN) trained on datasets for multiple scenarios was proposed to facilitate real-time road semantic segmentation for the various scenarios encountered in autonomous driving. However, the CNN inhibited the mutual suppression effect between weights; thus, it did not perform as well as a network trained on a single scenario. To address this limitation, we used a model-switching architecture in the network and maintained the optimal weights of each individual model, which required considerable space and computation. We subsequently incorporated a lightweight process into the model to reduce the model size and computational load. The experimental results indicated that the proposed lightweight CNN with a model-switching architecture outperformed, and was faster than, the conventional methods across multiple scenarios in road semantic segmentation.
