3D Object Detection for Autonomous Driving Using Temporal LiDAR Data

Multi-Object Tracking (MOT) is an integral part of any autonomous driving pipeline because it produces trajectories of other moving objects in the scene and predicts their future motion. Thanks to recent advances in 3D object detection enabled by deep learning, track-by-detection has become the dominant paradigm in 3D MOT. In this paradigm, a MOT system essentially consists of an object detector and a data association algorithm that establishes track-to-detection correspondences. While 3D object detection has been actively researched, association algorithms for 3D MOT have settled on bipartite matching formulated as a Linear Assignment Problem (LAP) and solved by the Hungarian algorithm. In this paper, we adapt to the 3D setting a two-stage data association method that was successfully applied to image-based tracking, thus providing an alternative for data association in 3D MOT. Our method outperforms the baseline that uses one-stage bipartite matching for data association, achieving 0.587 Average Multi-Object Tracking Accuracy (AMOTA) on the NuScenes validation set and 0.365 AMOTA (at level 2) on the Waymo test set.
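
As an illustration of the one-stage baseline described above (not the two-stage method the paper proposes), the following minimal sketch casts track-to-detection association as a linear assignment problem. It makes several assumptions that are not from the abstract: objects are reduced to 2D centre points, the cost is Euclidean distance, the gating threshold `max_dist` is arbitrary, and the assignment is solved by brute force rather than by the Hungarian algorithm, which is only practical for a handful of objects.

```python
from itertools import permutations
from math import dist


def solve_lap(cost):
    """Brute-force minimum-cost assignment for a small cost matrix.

    Returns (row, col) pairs. A real tracker would use the Hungarian
    algorithm instead; this exhaustive search is for illustration only.
    """
    n_rows = len(cost)
    n_cols = len(cost[0]) if cost else 0
    if n_rows == 0 or n_cols == 0:
        return []
    if n_rows <= n_cols:
        # Give every row a distinct column.
        best = min(
            permutations(range(n_cols), n_rows),
            key=lambda cols: sum(cost[r][c] for r, c in enumerate(cols)),
        )
        return list(enumerate(best))
    # More rows than columns: give every column a distinct row.
    best = min(
        permutations(range(n_rows), n_cols),
        key=lambda rows: sum(cost[r][c] for c, r in enumerate(rows)),
    )
    return sorted((r, c) for c, r in enumerate(best))


def associate(tracks, detections, max_dist=2.0):
    """One-stage association of track centres to detection centres.

    `max_dist` is a hypothetical gating threshold: assigned pairs that
    are farther apart than this are treated as unmatched.
    """
    cost = [[dist(t, d) for d in detections] for t in tracks]
    matches = [(r, c) for r, c in solve_lap(cost) if cost[r][c] <= max_dist]
    matched_tracks = {r for r, _ in matches}
    matched_dets = {c for _, c in matches}
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matched_tracks]
    unmatched_dets = [j for j in range(len(detections)) if j not in matched_dets]
    return matches, unmatched_tracks, unmatched_dets


if __name__ == "__main__":
    tracks = [(0.0, 0.0), (10.0, 0.0)]
    detections = [(10.5, 0.0), (0.2, 0.1), (50.0, 50.0)]
    print(associate(tracks, detections))
```

In this sketch, unmatched detections would typically start new tracks and unmatched tracks would be kept alive or terminated by a separate track-management rule, which is outside what the abstract describes.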

Strong-Weak Feature Alignment for 3D Object Detection

Electronics ◽

10.3390/electronics10101205 ◽

2021 ◽

Vol 10 (10) ◽

pp. 1205

Author(s):

Zhiyu Wang ◽

Li Wang ◽

Bin Dai

Keyword(s):

Object Detection ◽

Point Clouds ◽

Autonomous Driving ◽

Feature Representation ◽

Alignment Algorithm ◽

3D Object ◽

3D Point Clouds ◽

Object Feature ◽

3D Object Detection ◽

Feature Alignment

Object detection in 3D point clouds is still a challenging task in autonomous driving. Because of the inherent occlusion and density changes of the point cloud, the data distribution of the same object can change dramatically. In particular, incomplete data affected by sparsity or occlusion cannot represent the complete characteristics of the object. In this paper, we propose a novel strong–weak feature alignment algorithm between complete and incomplete objects for 3D object detection, which explores the correlations within the data. It is an end-to-end adaptive network that does not require additional data and can be easily applied to other object detection networks. Through a complete object feature extractor, we obtain a robust feature representation of the object. It serves as a guarding feature that helps the incomplete object feature generator to produce effective features. The strong–weak feature alignment algorithm reduces the gap between different states of the same object and enhances the ability to represent the incomplete object. The proposed adaptation framework is validated on the KITTI object benchmark and achieves an improvement of about 6% in detection average precision on the 3D moderate difficulty level compared to the basic model. The results show that our adaptation method improves the detection performance for incomplete 3D objects.

Monocular 3D Object Detection for Autonomous Driving

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) ◽

10.1109/cvpr.2016.236 ◽

2016 ◽

Cited By ~ 221

Author(s):

Xiaozhi Chen ◽

Kaustav Kundu ◽

Ziyu Zhang ◽

Huimin Ma ◽

Sanja Fidler ◽

...

Keyword(s):

Object Detection ◽

Autonomous Driving ◽

3D Object ◽

3D Object Detection

Deep 3D Object Detection Networks Using LiDAR Data: A Review

IEEE Sensors Journal ◽

10.1109/jsen.2020.3020626 ◽

2021 ◽

Vol 21 (2) ◽

pp. 1152-1171

Author(s):

Yutian Wu ◽

Yueyu Wang ◽

Shuwei Zhang ◽

Harutoshi Ogai

Keyword(s):

Object Detection ◽

Lidar Data ◽

3D Object ◽

3D Object Detection

Optimization of the PointPillars network for 3D object detection in point clouds

10.36227/techrxiv.12593555.v1 ◽

2020 ◽

Author(s):

Joanna Stanisz ◽

Konrad Lis ◽

Tomasz Kryjak ◽

Marek Gorgon

Keyword(s):

Object Detection ◽

Point Cloud ◽

Main Part ◽

Point Clouds ◽

Lidar Data ◽

Detection Accuracy ◽

3D Object ◽

Fold Reduction ◽

Low Energy Consumption ◽

3D Object Detection

In this paper, we present our research on the optimisation of a deep neural network for 3D object detection in a point cloud. Techniques such as quantisation and pruning, available in the Brevitas and PyTorch tools, were used. We performed the experiments on the PointPillars network, which offers a reasonable compromise between detection accuracy and computational complexity. The aim of this work was to propose a variant of the network that we will ultimately implement in an FPGA device. This will allow for real-time LiDAR data processing with low energy consumption. The obtained results indicate that even a significant quantisation from 32-bit floating point to 2-bit integer in the main part of the algorithm results in a 5%-9% decrease in detection accuracy, while allowing for an almost 16-fold reduction in the size of the model.
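
The arithmetic behind the reported size reduction can be sketched as follows. Going from 32 bits to 2 bits per weight gives a factor of 32 / 2 = 16 for the quantised layers, which is consistent with "almost 16-fold" if some parts of the network stay at higher precision. The quantiser below is a generic uniform min-max scheme chosen for illustration; it is an assumption, not the Brevitas quantiser or the authors' configuration, and the function names are hypothetical.

```python
def compression_ratio(bits_from=32, bits_to=2):
    """Upper bound on size reduction when every weight is re-encoded."""
    return bits_from / bits_to


def quantize_uniform(weights, bits=2):
    """Map floats to integer codes in [0, 2**bits - 1] and back.

    Generic min-max uniform quantisation, used here only to show what
    a 2-bit weight representation looks like.
    """
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant weight vector (zero range).
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    dequantized = [lo + c * scale for c in codes]
    return codes, dequantized


if __name__ == "__main__":
    weights = [-1.0, -0.4, 0.1, 0.5, 1.0]
    codes, approx = quantize_uniform(weights, bits=2)
    print("codes:", codes)
    print("approximation:", approx)
    print("ratio:", compression_ratio())
```

With only four representable levels, every weight is rounded to the nearest of four values, which gives some intuition for why accuracy drops by a few percentage points rather than being preserved.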

A Novel Regional Fusion Network for 3D Object Detection based on RGB Images and Point Clouds

10.5121/csit.2021.111812 ◽

2021 ◽

Author(s):

Hung-Hao Chen ◽

Chia-Hung Wang ◽

Hsueh-Wei Chen ◽

Pei-Yung Hsiao ◽

Li-Chen Fu ◽

...

Keyword(s):

Object Detection ◽

Receptive Fields ◽

Point Clouds ◽

Detection Methods ◽

Lidar Data ◽

3D Object ◽

Multi Scale ◽

Interest Level ◽

RGB Images ◽

3D Object Detection

Current fusion-based methods transform LiDAR data into bird’s eye view (BEV) representations or 3D voxels, leading to information loss and the heavy computational cost of 3D convolution. In contrast, we directly consume raw point clouds and perform fusion between the two modalities. We employ the concept of a region proposal network to generate proposals from each of the two streams. To make the two sensors compensate for each other's weaknesses, we use the calibration parameters to project proposals from one stream onto the other. With the proposed multi-scale feature aggregation module, we are able to combine the extracted region-of-interest-level (RoI-level) features of the RGB stream from different receptive fields, which enriches the features. Experiments on the KITTI dataset show that our proposed network outperforms other fusion-based methods, with meaningful improvements over 3D object detection methods under challenging settings.
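
The projection step mentioned above can be sketched with a plain pinhole camera model. This is a generic illustration under stated assumptions, not the authors' implementation: `R` and `t` stand for a hypothetical LiDAR-to-camera rotation and translation, `K` for a 3x3 intrinsic matrix without skew, and a 3D proposal is represented simply by a list of corner points. Datasets such as KITTI also apply a rectification matrix, which is omitted here.

```python
def project_point(point_lidar, R, t, K):
    """Project one LiDAR-frame point to pixel coordinates (u, v).

    Returns None when the point lies behind the camera.
    """
    # Camera-frame coordinates: X_c = R @ X_l + t
    xc = [sum(R[i][j] * point_lidar[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return u, v


def project_box(corners_lidar, R, t, K):
    """Axis-aligned 2D box (u_min, v_min, u_max, v_max) around the projected corners."""
    pts = [project_point(c, R, t, K) for c in corners_lidar]
    pts = [p for p in pts if p is not None]
    if not pts:
        return None
    us, vs = zip(*pts)
    return min(us), min(vs), max(us), max(vs)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    zero = [0.0, 0.0, 0.0]
    K = [[100.0, 0.0, 50.0], [0.0, 100.0, 40.0], [0.0, 0.0, 1.0]]
    print(project_point((1.0, 2.0, 10.0), identity, zero, K))
```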

R-CNN Based 3D Object Detection for Autonomous Driving

CICTP 2020 ◽

10.1061/9780784483053.077 ◽

2020 ◽

Author(s):

Hongyu Hu ◽

Tongtong Zhao ◽

Qi Wang ◽

Fei Gao ◽

Lei He

Keyword(s):

Object Detection ◽

Autonomous Driving ◽

3D Object ◽

3D Object Detection

Stereo R-CNN Based 3D Object Detection for Autonomous Driving

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) ◽

10.1109/cvpr.2019.00783 ◽

2019 ◽

Cited By ~ 48

Author(s):

Peiliang Li ◽

Xiaozhi Chen ◽

Shaojie Shen

Keyword(s):

Object Detection ◽

Autonomous Driving ◽

3D Object ◽

3D Object Detection

3D Object Detection Based on LiDAR Data

2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) ◽

10.1109/uemcon47517.2019.8993088 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ramin Sahba ◽

Amin Sahba ◽

Mo Jamshidi ◽

Paul Rad

Keyword(s):

Object Detection ◽

Lidar Data ◽

3D Object ◽

3D Object Detection

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6945 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12557-12564 ◽

Cited By ~ 4

Author(s):

Zhenbo Xu ◽

Wei Zhang ◽

Xiaoqing Ye ◽

Xiao Tan ◽

Wei Yang ◽

...

Keyword(s):

Object Detection ◽

Point Clouds ◽

Autonomous Driving ◽

Disparity Estimation ◽

3D Object ◽

Detection Model ◽

Occluded Objects ◽

Bounding Boxes ◽

Detection Quality ◽

3D Object Detection

3D object detection is an essential task in autonomous driving and robotics. Though great progress has been made, challenges remain in estimating 3D pose for distant and occluded objects. In this paper, we present a novel framework named ZoomNet for stereo imagery-based 3D detection. The pipeline of ZoomNet begins with an ordinary 2D object detection model, which is used to obtain pairs of left-right bounding boxes. To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module, adaptive zooming, which simultaneously resizes 2D instance bounding boxes to a unified resolution and adjusts the camera intrinsic parameters accordingly. In this way, we are able to estimate higher-quality disparity maps from the resized box images and then construct dense point clouds for both nearby and distant objects. Moreover, we learn part locations as complementary features to improve robustness against occlusion, and put forward the 3D fitting score to better estimate the 3D detection quality. Extensive experiments on the popular KITTI 3D detection dataset indicate that ZoomNet surpasses all previous state-of-the-art methods by large margins (improved by 9.4% on APbv (IoU=0.7) over pseudo-LiDAR). An ablation study also demonstrates that our adaptive zooming strategy brings an improvement of over 10% on AP3d (IoU=0.7). In addition, since the official KITTI benchmark lacks fine-grained annotations like pixel-wise part locations, we also present our KFG dataset by augmenting KITTI with detailed instance-wise annotations including pixel-wise part location, pixel-wise disparity, etc. Both the KFG dataset and our code will be publicly available at https://github.com/detectRecog/ZoomNet.
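
The idea of adjusting the camera intrinsics when a bounding box is cropped and resized can be sketched with the standard crop-and-scale update of a pinhole model. This is a generic illustration, not ZoomNet's code: the function name, the box format `(x0, y0, x1, y1)` and the output size are assumptions made for the example.

```python
def zoom_intrinsics(fx, fy, cx, cy, box, out_w, out_h):
    """Intrinsics of the virtual camera that sees the resized crop.

    Cropping shifts the principal point by the box origin; resizing
    scales focal lengths and principal point by the zoom factors.
    """
    x0, y0, x1, y1 = box
    sx = out_w / (x1 - x0)  # horizontal zoom factor
    sy = out_h / (y1 - y0)  # vertical zoom factor
    return fx * sx, fy * sy, (cx - x0) * sx, (cy - y0) * sy


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy


if __name__ == "__main__":
    box = (500.0, 150.0, 600.0, 200.0)
    new_k = zoom_intrinsics(700.0, 700.0, 600.0, 180.0, box, 200, 100)
    print(new_k)
    # The same 3D point lands on the corresponding pixel of the zoomed crop.
    print(project((1.0, 0.5, 10.0), *new_k))
```

Under this update, a 3D point projects to the same location in the zoomed crop as its original pixel would after cropping and scaling, so depth recovered from the resized images stays geometrically consistent with the original camera.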