DCARN: Deep Context Aware Recurrent Neural Network for Semantic Segmentation of Large Scale Unstructured 3D Point Cloud

To achieve better performance in point cloud analysis, many researchers apply deep neural networks that use stacked Multi-Layer-Perceptron (MLP) convolutions over an irregular point cloud. However, applying these dense MLP convolutions over a large number of points (e.g., in autonomous driving applications) is limited by the available computation and memory. To achieve higher performance while decreasing computational complexity, we propose a deep-wide neural network, named ShufflePointNet, which exploits fine-grained local features while reducing redundancy through group convolution and a channel shuffle operation. Unlike conventional operations that directly apply MLPs to the high-dimensional features of a point cloud, our model goes “wider” by splitting features into groups of smaller depth in advance and applying each MLP computation to a single group only, which can significantly reduce complexity and computation. At the same time, we allow communication between groups by shuffling the feature channels to capture fine-grained features. We further discuss how the multi-branch method for wider neural networks also benefits feature extraction for point clouds. We present extensive experiments for shape classification on the ModelNet40 dataset and for semantic segmentation on the large-scale datasets ShapeNet Part, S3DIS, and KITTI. Finally, we carry out an ablation study and compare our model to other state-of-the-art algorithms to show its efficiency in terms of complexity and accuracy.
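The grouping and channel shuffle ideas in this abstract can be illustrated with a minimal sketch. It assumes the common reshape, transpose, flatten form of channel shuffle and counts weights for a bias-free pointwise layer; the function names `channel_shuffle` and `mlp_weight_count` are hypothetical and are not taken from the paper.

```python
def channel_shuffle(features, groups):
    """Interleave channels across groups (reshape -> transpose -> flatten).

    `features` is a flat list of per-channel values for one point.
    This is an illustrative assumption, not the paper's implementation.
    """
    channels = len(features)
    if channels % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = channels // groups
    # View the channels as a (groups, per_group) grid and read it column by column.
    return [features[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def mlp_weight_count(c_in, c_out, groups=1):
    """Weights of a bias-free pointwise layer split into independent groups."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    return groups * (c_in // groups) * (c_out // groups)


# Channels a*, b*, c* belong to three groups; after shuffling, every
# new group of two consecutive channels mixes inputs from several groups.
print(channel_shuffle(["a0", "a1", "b0", "b1", "c0", "c1"], groups=3))
# ['a0', 'b0', 'c0', 'a1', 'b1', 'c1']

# Splitting a 64 -> 128 layer into 4 groups divides the weight count by 4.
print(mlp_weight_count(64, 128), mlp_weight_count(64, 128, groups=4))
# 8192 2048
```

Under these assumptions, grouping alone would isolate each group's channels, which is why the shuffle step is needed to let information flow between groups.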


3D point cloud semantic segmentation toward large-scale unstructured agricultural scene classification

Computers and Electronics in Agriculture ◽ 10.1016/j.compag.2021.106445 ◽ 2021 ◽ Vol 190 ◽ pp. 106445

Author(s): Yi Chen ◽ Yingjun Xiong ◽ Baohua Zhang ◽ Jun Zhou ◽ Qian Zhang

Keyword(s): Point Cloud ◽ Large Scale ◽ Semantic Segmentation ◽ 3D Point Cloud ◽ Scene Classification


A progressive image semantic segmentation method using recurrent neural network

2021 6th International Conference on Intelligent Computing and Signal Processing (ICSP) ◽ 10.1109/icsp51882.2021.9408920 ◽ 2021

Author(s): Li Yi

Keyword(s): Neural Network ◽ Recurrent Neural Network ◽ Semantic Segmentation ◽ Segmentation Method


SketchGNN: Semantic Sketch Segmentation with Graph Neural Networks

ACM Transactions on Graphics ◽ 10.1145/3450284 ◽ 2021 ◽ Vol 40 (3) ◽ pp. 1-13

Author(s): Lumin Yang ◽ Jiajie Zhuang ◽ Hongbo Fu ◽ Xiangzhi Wei ◽ Kun Zhou ◽ ...

Keyword(s): Neural Network ◽ Neural Networks ◽ Network Architecture ◽ Large Scale ◽ State Of The Art ◽ Semantic Segmentation ◽ Structure Information ◽ Graph Neural Networks ◽ Node Labels ◽ Point Level

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph, with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves on the accuracy of state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on a large-scale, challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
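As a rough illustration of treating a sketch as a graph, the snippet below builds nodes from sampled stroke points and connects consecutive points within each stroke. The edge rule and the helper name `build_stroke_graph` are assumptions made for this example; the abstract does not specify how SketchGNN constructs its edges.

```python
def build_stroke_graph(strokes):
    """Turn a list of strokes into (nodes, edges, stroke_ids).

    Each stroke is a list of (x, y) sampled points. Edges link consecutive
    points of the same stroke (an assumed, simplified stroke structure).
    """
    nodes = []       # all sampled points, in stroke order
    edges = []       # pairs of node indices
    stroke_ids = []  # which stroke each node came from
    for stroke_index, stroke in enumerate(strokes):
        offset = len(nodes)
        nodes.extend(stroke)
        stroke_ids.extend([stroke_index] * len(stroke))
        edges.extend((offset + i, offset + i + 1)
                     for i in range(len(stroke) - 1))
    return nodes, edges, stroke_ids


strokes = [
    [(0, 0), (1, 0), (2, 0)],  # first stroke: three sampled points
    [(0, 1), (1, 1)],          # second stroke: two sampled points
]
nodes, edges, stroke_ids = build_stroke_graph(strokes)
print(edges)       # [(0, 1), (1, 2), (3, 4)]
print(stroke_ids)  # [0, 0, 0, 1, 1]
```

The `stroke_ids` list is kept so that per-point features could later be pooled per stroke, mirroring the point-level and stroke-level distinction mentioned in the abstract.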


Real-Time 3D object detection using improved convolutional neural network based on image-driven point cloud

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽ 10.2174/2352096514666211026142721 ◽ 2021 ◽ Vol 14

Author(s): Zhiyong Gao ◽ Jianhong Xiang

Keyword(s): Neural Network ◽ Object Detection ◽ Convolutional Neural Network ◽ Real Time ◽ Point Cloud ◽ Point Clouds ◽ 3D Point Cloud ◽ 3D Object ◽ 3D Object Detection ◽ Instance Segmentation

Background: When detecting objects directly from a 3D point cloud, the natural 3D patterns and invariances of 3D data are often obscured. Objective: In this work, we study 3D object detection from discrete, disordered, and sparse 3D point clouds. Methods: The CNN is composed of the frustum sequence module, the 3D instance segmentation module S-NET, the 3D point cloud transformation module T-NET, and the 3D bounding box estimation module E-NET. The search space of the object is determined by the frustum sequence module. The instance segmentation of the point cloud is performed by the 3D instance segmentation module. The 3D coordinates of the object are confirmed by the transformation module and the 3D bounding box estimation module. Results: Evaluated on the KITTI benchmark dataset, our method outperforms the state of the art by remarkable margins while having real-time capability. Conclusion: We achieve real-time 3D object detection by proposing an improved convolutional neural network (CNN) based on image-driven point clouds.
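One way to picture how an image-driven frustum narrows the search space is to keep only the 3D points whose projection falls inside a 2D detection box. The sketch below assumes a pinhole camera with points already in camera coordinates; the function `points_in_frustum` and all parameter values are illustrative assumptions, not the paper's frustum sequence module.

```python
def points_in_frustum(points, box, fx, fy, cx, cy):
    """Keep points whose pinhole projection lies inside a 2D box.

    points: iterable of (x, y, z) in camera coordinates (z forward).
    box: (u_min, v_min, u_max, v_max) in pixels.
    fx, fy, cx, cy: assumed pinhole intrinsics.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:
            continue  # behind the camera, cannot project into the image
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


cloud = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 0.0, -1.0)]
inside = points_in_frustum(cloud, box=(40, 40, 60, 60),
                           fx=100.0, fy=100.0, cx=50.0, cy=50.0)
print(inside)  # [(0.0, 0.0, 5.0)]
```

Only the first point projects to pixel (50, 50) inside the box; the second lands at u = 70 and the third lies behind the camera.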


Food Volume Estimation Based on Deep Learning View Synthesis from a Single Depth Map

Nutrients ◽ 10.3390/nu10122005 ◽ 2018 ◽ Vol 10 (12) ◽ pp. 2005 ◽ Cited By ~ 12

Author(s): Frank Lo ◽ Yingnan Sun ◽ Jianing Qiu ◽ Benny Lo

Keyword(s): Neural Network ◽ Deep Learning ◽ Point Cloud ◽ Volume Estimation ◽ Assessment System ◽ View Synthesis ◽ Depth Image ◽ 3D Point Cloud ◽ Viewing Angle ◽ 3D Point Clouds

An objective dietary assessment system can help users understand their dietary behavior and enable targeted interventions to address underlying health problems. To accurately quantify dietary intake, measurement of the portion size or food volume is required. For volume estimation, previous research has mostly focused on model-based or stereo-based approaches, which rely on manual intervention or require users to capture multiple frames from different viewing angles, which can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate the volume from a single depth image. A distinct neural network is designed to use a depth image from one viewing angle to predict another depth image captured from the corresponding opposite viewing angle. The whole 3D point cloud map is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database of depth images of food items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated from the synthesized 3D point cloud with the ground truth volume of the object items.
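Reconstructing a point cloud from a depth image typically starts by back-projecting each pixel into 3D. The sketch below shows that step under an assumed pinhole camera model; the helper `depth_to_points`, the intrinsics, and the convention that non-positive depth means "missing" are assumptions for illustration and do not reproduce the paper's completion or ICP stages.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image into 3D camera-frame points.

    depth: 2D list indexed as depth[v][u], in metric units.
    fx, fy, cx, cy: assumed pinhole intrinsics.
    Non-positive depth values are treated as missing measurements.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels with no valid depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


depth_image = [
    [0.0, 2.0],
    [2.0, 0.0],
]
print(depth_to_points(depth_image, fx=1.0, fy=1.0, cx=0.0, cy=0.0))
# [(2.0, 0.0, 2.0), (0.0, 2.0, 2.0)]
```

Applying the same function to the input depth image and to a synthesized opposite-view depth image would yield two partial clouds in their own camera frames, which would still need to be aligned before any volume is computed.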


Survey Analysis of Robust and Real-Time Multi-Lane and Single Lane Detection in Indian Highway Scenarios

E3S Web of Conferences ◽ 10.1051/e3sconf/202130901117 ◽ 2021 ◽ Vol 309 ◽ pp. 01117

Author(s): A. Sai Hanuman ◽ G. Prasanna Kumar

Keyword(s): Neural Network ◽ System Integration ◽ Large Scale ◽ Feature Learning ◽ Semantic Segmentation ◽ Lane Detection ◽ Learning Approaches ◽ Survey Analysis ◽ Lane Recognition ◽ Continuous Frames

Lane detection methods, integration, and evaluation strategies are all examined. The system integration approaches for building more robust detection systems are then evaluated and analyzed, taking into account the inherent limits of camera-based lane detection systems. Present deep learning approaches to lane detection are essentially CNN-based semantic segmentation networks; the results of the segmentation of the roadways and the segmentation of the lane markers are fused using a fusion method. By processing a large number of frames from a continuous driving environment, we examine lane detection, and we propose a hybrid deep architecture that combines a convolutional neural network (CNN) and a recurrent neural network (RNN). Because of the extensive information background and the high cost of camera equipment, a substantial number of existing results concentrate on vision-based lane recognition systems. Extensive tests on two large-scale datasets show that the proposed technique outperforms competing lane detection strategies, particularly in challenging settings. In particular, a CNN block extracts information from each frame before sending the CNN features of several continuous frames, which have time-series qualities, to the RNN block for feature learning and lane prediction.
