Real-Time LiDAR Point Cloud Semantic Segmentation for Autonomous Driving

Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 11
Author(s):  
Xing Xie ◽  
Lin Bai ◽  
Xinming Huang

LiDAR has been widely used in autonomous driving systems to provide high-precision 3D geometric information about the vehicle’s surroundings for perception, localization, and path planning. LiDAR-based point cloud semantic segmentation is an important task with a critical real-time requirement. However, most existing convolutional neural network (CNN) models for 3D point cloud semantic segmentation are very complex and can hardly be processed in real time on an embedded platform. In this study, a lightweight CNN structure is proposed for projection-based LiDAR point cloud semantic segmentation, with only 1.9 M parameters, an 87% reduction compared to state-of-the-art networks. When evaluated on a GPU, the processing time was 38.5 ms per frame, and it achieved a 47.9% mIoU score on the Semantic-KITTI dataset. In addition, the proposed CNN was implemented on an FPGA using the NVDLA architecture, which resulted in a 2.74x speedup over the GPU implementation and a 46x improvement in power efficiency.
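The "projection-based" approach mentioned above typically flattens the unordered 3D point cloud into a 2D range image, so that efficient 2D CNNs can be applied. Below is a minimal sketch of one common spherical-projection scheme; the beam count, field of view, and image resolution are illustrative assumptions, not values from the paper.

```python
import numpy as np

def spherical_projection(points, H=64, W=1024, fov_up=3.0, fov_down=-25.0):
    """Project Nx3 LiDAR points onto an HxW range image (one common scheme)."""
    fov_up_r = np.radians(fov_up)
    fov_down_r = np.radians(fov_down)
    fov = fov_up_r - fov_down_r

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)

    yaw = np.arctan2(y, x)          # azimuth angle in [-pi, pi]
    pitch = np.arcsin(z / depth)    # elevation angle

    # Normalize angles to image coordinates
    u = 0.5 * (1.0 - yaw / np.pi) * W            # column index
    v = (1.0 - (pitch - fov_down_r) / fov) * H   # row index

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    image = np.zeros((H, W), dtype=np.float32)
    image[v, u] = depth   # store range at each projected pixel
    return image
```

A point straight ahead of the sensor at 10 m, e.g. `spherical_projection(np.array([[10.0, 0.0, 0.0]]))`, lands near the horizontal center of the image with pixel value 10.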

2022 ◽  
Author(s):  
Yuehua Zhao ◽  
Ma Jie ◽  
Chong Nannan ◽  
Wen Junjie

Real-time large-scale point cloud segmentation is an important but challenging task for practical applications like autonomous driving. Existing real-time methods have achieved acceptable performance by aggregating local information. However, most of them exploit either local spatial information or local semantic information alone, and few consider the complementarity of the two. In this paper, we propose a model named Spatial-Semantic Incorporation Network (SSI-Net) for real-time large-scale point cloud segmentation. A Spatial-Semantic Cross-correction (SSC) module is introduced in SSI-Net as a basic unit. High-quality contextual features can be learned through SSC by correcting and updating semantic features using spatial cues, and vice versa. Adopting the plug-and-play SSC module, we design SSI-Net as an encoder-decoder architecture. To ensure efficiency, it also adopts a random-sampling-based hierarchical network structure. Extensive experiments on several prevalent datasets demonstrate that our method achieves state-of-the-art performance.
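The abstract does not give the internals of the SSC module, but the described idea of each stream correcting the other can be sketched as mutual residual updates. The weight matrices below stand in for learned layers and are purely hypothetical; this is one plausible reading of the cross-correction mechanism, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w, b):
    """A single linear layer with ReLU, standing in for a learned sub-network."""
    return np.maximum(x @ w + b, 0.0)

N, Cs, Cf = 128, 3, 16                       # points, spatial dims, semantic dims
spatial = rng.standard_normal((N, Cs))       # per-point spatial cues (e.g. coordinates)
semantic = rng.standard_normal((N, Cf))      # per-point semantic features

# Hypothetical weights; in the real network these would be learned.
W_s2f, b_s2f = 0.1 * rng.standard_normal((Cs, Cf)), np.zeros(Cf)
W_f2s, b_f2s = 0.1 * rng.standard_normal((Cf, Cs)), np.zeros(Cs)

# Cross-correction: each stream receives a residual computed from the other,
# then the corrected streams are fused into one contextual feature.
semantic_corr = semantic + mlp(spatial, W_s2f, b_s2f)   # spatial cues refine semantics
spatial_corr = spatial + mlp(semantic, W_f2s, b_f2s)    # semantics refine spatial cues
fused = np.concatenate([spatial_corr, semantic_corr], axis=1)
```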


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8072
Author(s):  
Yu-Bang Chang ◽  
Chieh Tsai ◽  
Chang-Hong Lin ◽  
Poki Chen

As autonomous driving techniques become increasingly valued and widespread, real-time semantic segmentation has become a popular and challenging topic in deep learning and computer vision in recent years. However, in order to deploy deep learning models on the edge devices that accompany sensors on vehicles, we need to design a structure with the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed to find the best accuracy under the constraint of real-time operation. Nevertheless, the accuracy of previous real-time semantic segmentation methods still lags far behind that of general semantic segmentation methods. As a result, we propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieved a 78.6% mIoU at 39.4 FPS with a 1024 × 2048 resolution on a Cityscapes test submission.
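The self-attention mechanism referenced above is, in its plain scaled dot-product form, a weighted aggregation over all spatial positions. A minimal numpy sketch (identity projection matrices used only for illustration; the real network would learn `Wq`, `Wk`, `Wv`):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a flattened feature map.

    x: (positions, channels). Each output position is a weighted sum of
    the value vectors at all positions, so context is aggregated globally.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d))   # rows sum to 1
    return attn @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 32))   # 64 spatial positions, 32 channels
Wq = Wk = Wv = np.eye(32)           # placeholder for learned projections
out = self_attention(x, Wq, Wk, Wv)
```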


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Shida Zhao ◽  
Guangzhao Hao ◽  
Yichi Zhang ◽  
Shucai Wang

Accurate recognition of the three parts of a sheep carcass is key to the development of mutton cutting robots. The parts of the sheep carcass are connected to each other and share similar features, which makes them difficult to identify and detect; however, with the development of deep-learning-based image semantic segmentation, it has become possible to explore this technology for real-time recognition of the three parts of the sheep carcass. Based on ICNet, we propose a real-time semantic segmentation method for sheep carcass images. We first acquire images of the sheep carcass and use augmentation to expand the image data; after normalization, we annotate the images with LabelMe and build a sheep carcass image dataset. We then establish the ICNet model and train it with transfer learning. The segmentation accuracy, MIoU, and average processing time per image are used as the evaluation criteria. In addition, we verify the generalization ability of ICNet on the sheep carcass dataset through segmentation experiments at different image brightness levels. Finally, U-Net, DeepLabv3, PSPNet, and Fast-SCNN are introduced for comparative experiments to further verify the segmentation performance of ICNet. The experimental results show that, on the sheep carcass dataset, the segmentation accuracy and MIoU of our method are 97.68% and 88.47%, respectively, and the single-image processing time is 83 ms. The MIoU of U-Net and DeepLabv3 is 0.22% and 0.03% higher than that of ICNet, but their single-image processing times are longer by 186 ms and 430 ms, respectively. Compared with PSPNet and Fast-SCNN, the MIoU of ICNet is higher by 1.25% and 4.49%, respectively, while its single-image processing time is 469 ms shorter than that of PSPNet and 7 ms longer than that of Fast-SCNN.
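MIoU (mean intersection-over-union), used as the evaluation standard above and throughout this listing, is straightforward to compute from predicted and ground-truth label maps. A minimal sketch (the class-skipping rule for classes absent from both maps is a common convention, not necessarily the paper's exact protocol):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU across classes; classes absent from both maps are skipped."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

pred   = np.array([0, 0, 1, 1, 2, 2])
target = np.array([0, 0, 1, 2, 2, 2])
# class 0: IoU 2/2, class 1: 1/2, class 2: 2/3 -> mean = 13/18
```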


2020 ◽  
Vol 34 (07) ◽  
pp. 12500-12507 ◽  
Author(s):  
Mingye Xu ◽  
Zhipeng Zhou ◽  
Yu Qiao

In spite of recent progress on classifying 3D point clouds with deep CNNs, large geometric transformations such as rotation and translation remain a challenging problem and harm the final classification performance. To address this challenge, we propose the Geometry Sharing Network (GS-Net), which effectively learns point descriptors with holistic context to enhance robustness to geometric transformations. Compared with previous 3D point CNNs that perform convolution on nearby points, GS-Net can aggregate point features in a more global way. Specifically, GS-Net consists of Geometry Similarity Connection (GSC) modules, which exploit the Eigen-Graph to group distant points with similar and relevant geometric information and aggregate features from nearest neighbors in both Euclidean space and Eigenvalue space. This design allows GS-Net to efficiently capture both local and holistic geometric features such as symmetry, curvature, convexity, and connectivity. Theoretically, we show that the nearest neighbors of each point in Eigenvalue space are invariant to rotation and translation. We conduct extensive experiments on the public ModelNet40 and ShapeNet Part datasets. The experiments demonstrate that GS-Net achieves state-of-the-art performance on major datasets, 93.3% on ModelNet40, and is more robust to geometric transformations.
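The invariance claim above rests on a standard fact: the eigenvalues of a local neighborhood's covariance matrix are unchanged by rotation and translation, so any neighbor relation defined on those eigenvalues is too. A small numerical check (the neighborhood size and transform are arbitrary test values):

```python
import numpy as np

def local_eigenvalues(points):
    """Descending eigenvalues of the covariance of a local neighborhood."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

rng = np.random.default_rng(0)
neighborhood = rng.standard_normal((32, 3))

# An arbitrary rotation about the z-axis, plus a translation
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([5.0, -2.0, 1.0])

ev_before = local_eigenvalues(neighborhood)
ev_after = local_eigenvalues(neighborhood @ R.T + t)
# ev_before and ev_after agree, so Eigenvalue-space neighbors are preserved
```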


2021 ◽  
Vol 13 (18) ◽  
pp. 3640
Author(s):  
Hao Fu ◽  
Hanzhang Xue ◽  
Xiaochang Hu ◽  
Bokai Liu

In autonomous driving scenarios, the point cloud generated by LiDAR is usually considered an accurate but sparse representation. To enrich the LiDAR point cloud, this paper proposes a new technique that combines spatially adjacent frames and temporally adjacent frames. To eliminate the “ghost” artifacts caused by moving objects, a moving-point identification algorithm is introduced that compares range images. Experiments are performed on the publicly available Semantic KITTI dataset. Experimental results show that the proposed method outperforms most previous approaches and, compared with those previous works, is the only method that can run in real time for online usage.
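The range-image comparison idea can be illustrated with a simple rule: a pixel whose range in the current frame is much smaller than in an aligned reference frame likely belongs to a moving object. This threshold rule and the 0.5 m value are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def moving_point_mask(range_cur, range_ref, threshold=0.5):
    """Flag pixels whose range disagrees across aligned frames (a simple sketch).

    A pixel that is significantly closer in the current frame than in the
    reference frame suggests something moved into the line of sight.
    Zero entries mark pixels with no return and are ignored.
    """
    valid = (range_cur > 0) & (range_ref > 0)
    return valid & (range_ref - range_cur > threshold)

cur = np.array([[10.0, 4.0], [0.0, 7.0]])   # current-frame range image
ref = np.array([[10.1, 9.0], [6.0, 7.2]])   # aligned reference range image
mask = moving_point_mask(cur, ref)
# Only the pixel that jumped from 9.0 m to 4.0 m is flagged as moving
```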


Author(s):  
Mennatullah Siam ◽  
Mostafa Gamal ◽  
Moemen Abdel-Razek ◽  
Senthil Yogamani ◽  
Martin Jagersand ◽  
...  

Sensors ◽  
2019 ◽  
Vol 19 (19) ◽  
pp. 4329 ◽  
Author(s):  
Guorong Cai ◽  
Zuning Jiang ◽  
Zongyue Wang ◽  
Shangfeng Huang ◽  
Kai Chen ◽  
...  

Semantic segmentation of 3D point clouds plays a vital role in autonomous driving, 3D mapping, smart cities, etc. Recent work such as PointSIFT shows that spatial structure information can improve the performance of semantic segmentation. Motivated by this observation, we propose the Spatial Aggregation Net (SAN) for point cloud semantic segmentation. SAN is based on a multi-directional convolution scheme that utilizes the spatial structure information of the point cloud. Firstly, Octant-Search is employed to capture the neighboring points around each sampled point. Secondly, we use multi-directional convolution to extract information from different directions around the sampled points. Finally, max-pooling is used to aggregate the information from the different directions. Experimental results on the ScanNet database show that the proposed SAN is comparable with state-of-the-art algorithms such as PointNet, PointNet++, and PointSIFT. In particular, our method performs better on flat and small objects and on the edge areas that connect objects. Moreover, our model offers a good trade-off between segmentation accuracy and time complexity.
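Octant-Search, like the similar scheme in PointSIFT, gathers neighbors separately from each of the eight octants around a query point, so that every direction contributes to the local descriptor. A simplified sketch (the per-octant capacity and tie-handling are assumptions for illustration):

```python
import itertools
import numpy as np

def octant_search(query, points, k_per_octant=1):
    """Group neighbors by the octant of their offset from the query point,
    keeping the nearest k in each octant (a simplified sketch)."""
    offsets = points - query
    # Encode the sign pattern of (x, y, z) offsets as an octant id in 0..7
    octants = ((offsets[:, 0] > 0).astype(int) * 4
               + (offsets[:, 1] > 0).astype(int) * 2
               + (offsets[:, 2] > 0).astype(int))
    dists = np.linalg.norm(offsets, axis=1)
    groups = {}
    for o in range(8):
        idx = np.where(octants == o)[0]
        groups[o] = idx[np.argsort(dists[idx])][:k_per_octant]
    return groups

query = np.zeros(3)
corners = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
groups = octant_search(query, corners)
# Each of the 8 cube corners falls into its own octant
```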


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7089
Author(s):  
Bushi Liu ◽  
Yongbo Lv ◽  
Yang Gu ◽  
Wanjun Lv

Due to deep learning’s accurate understanding of the street environment, convolutional neural networks have developed rapidly in street-scene applications. Driven by the needs of autonomous and assisted driving, computer vision is generally used to find obstacles and avoid collisions, which has made semantic segmentation a research priority in recent years. However, semantic segmentation has long faced new challenges: complex network depth, large datasets, and real-time requirements are typical problems that must be solved before autonomous driving technology can be realized. To address these problems, we propose an improved lightweight real-time semantic segmentation network based on the efficient image cascade network (ICNet) architecture, which uses multi-scale branches and a cascade feature fusion unit to extract rich multi-level features. In this paper, a spatial information network is designed to transmit more prior knowledge of spatial location and edge information. During the training phase, we also append an external loss function to enhance the learning of the network. This lightweight network can quickly perceive obstacles and detect roads in the drivable area from images, satisfying the requirements of autonomous driving. The proposed model shows substantial performance on the Cityscapes dataset. Under the premise of ensuring real-time performance, several sets of experimental comparisons illustrate that SP-ICNet improves the accuracy of road obstacle detection and provides nearly ideal prediction outputs. Compared to current popular semantic segmentation networks, this study also demonstrates the effectiveness of our lightweight network for road obstacle detection in autonomous driving.
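The cascade feature fusion idea from ICNet, referenced above, merges a coarse low-resolution branch into a finer branch by upsampling and residual addition. The sketch below strips out ICNet's dilated and projection convolutions and keeps only the upsample-add-activate skeleton, so it is a simplification, not the exact unit.

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map."""
    return f.repeat(2, axis=1).repeat(2, axis=2)

def cascade_fusion(low_res, high_res):
    """Fuse a coarse branch into a finer branch: upsample, add, activate."""
    return np.maximum(upsample2x(low_res) + high_res, 0.0)  # ReLU after the sum

coarse = np.ones((8, 16, 32))    # low-resolution branch features
fine = np.zeros((8, 32, 64))     # high-resolution branch features
fused = cascade_fusion(coarse, fine)
# fused has the high-resolution shape, carrying information from both branches
```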

