Efficient representation and feature extraction for neural network-based 3D object pose estimation

2013 ◽  
Vol 120 ◽  
pp. 90-100 ◽  
Author(s):  
Rigas Kouskouridas ◽  
Antonios Gasteratos ◽  
Christos Emmanouilidis
2021 ◽  
Vol 218 ◽  
pp. 106839
Author(s):  
Pengshuai Yin ◽  
Jiayong Ye ◽  
Guoshen Lin ◽  
Qingyao Wu

Sensors ◽  
2019 ◽  
Vol 19 (6) ◽  
pp. 1434 ◽  
Author(s):  
Minle Li ◽  
Yihua Hu ◽  
Nanxiang Zhao ◽  
Qishu Qian

Three-dimensional (3D) object detection has important applications in robotics, automatic loading, autonomous driving, and other scenarios. With improvements in sensing hardware, multi-sensor/multimodal data can be collected from a variety of sensors such as LiDAR and cameras. To make full use of this complementary information and improve detection performance, we propose Complex-Retina, a convolutional neural network for 3D object detection based on multi-sensor data fusion. First, a unified architecture with two feature extraction networks is designed, so that features are extracted from the point clouds and images of the different sensors synchronously. Then, a series of 3D anchors is projected onto the feature maps; the projections are cropped into 2D anchors of the same size and fused together. Finally, object classification and 3D bounding box regression are carried out on multiple paths of fully connected layers. The proposed network is a one-stage convolutional neural network that balances detection accuracy and speed. Experiments on the KITTI dataset show that the proposed network outperforms the baseline algorithms in both average precision (AP) and time consumption, demonstrating its effectiveness.
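The anchor-projection-and-fusion step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`project_anchors`, `crop_resize`), the projection matrix, and the element-wise-mean fusion are assumptions; the paper only states that 3D anchors are projected to the feature maps, cropped to equal-size 2D anchors, and fused.

```python
import numpy as np

def project_anchors(anchors_3d, P):
    """Project 3D anchor corners into the image plane and take the
    axis-aligned 2D bounding box of the projected corners.

    anchors_3d: (N, 8, 3) array of box corners in camera coordinates.
    P: (3, 4) projection matrix (hypothetical; KITTI provides such matrices).
    """
    boxes_2d = []
    for corners in anchors_3d:
        hom = np.hstack([corners, np.ones((8, 1))])   # homogeneous coordinates
        uv = (P @ hom.T).T                            # (8, 3) projected points
        uv = uv[:, :2] / uv[:, 2:3]                   # perspective divide
        x1, y1 = uv.min(axis=0)
        x2, y2 = uv.max(axis=0)
        boxes_2d.append([x1, y1, x2, y2])
    return np.array(boxes_2d)

def crop_resize(feat, box, size=7):
    """Crop a 2D anchor from a (H, W, C) feature map and resize it to a
    fixed (size, size, C) grid via nearest-neighbor sampling, so that crops
    from different modalities have the same spatial shape."""
    h, w = feat.shape[:2]
    x1, y1, x2, y2 = box
    xs = np.clip(np.linspace(x1, x2, size).astype(int), 0, w - 1)
    ys = np.clip(np.linspace(y1, y2, size).astype(int), 0, h - 1)
    return feat[np.ix_(ys, xs)]

def fuse(crop_img, crop_pc):
    """Fuse equal-size crops from the image and point-cloud branches.
    Element-wise mean is one simple choice; channel concatenation is
    another common option."""
    return 0.5 * (crop_img + crop_pc)
```

In a full detector, the fused crops would feed the fully connected layers that perform classification and 3D box regression; here only the geometry of projecting and aligning anchors across modalities is shown.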


Author(s):  
Vassileios Balntas ◽  
Andreas Doumanoglou ◽  
Caner Sahin ◽  
Juil Sock ◽  
Rigas Kouskouridas ◽  
...  

2020 ◽  
Vol 2020 (8) ◽  
pp. 221-1-221-7
Author(s):  
Jianhang Chen ◽  
Daniel Mas Montserrat ◽  
Qian Lin ◽  
Edward J. Delp ◽  
Jan P. Allebach

We introduce a new image dataset for object detection and 6D pose estimation, named Extra FAT. The dataset consists of 825K photorealistic RGB images with ground-truth location and rotation annotations for both the virtual camera and the objects. A registered pixel-level object segmentation mask is also provided for object detection and segmentation tasks. The dataset includes 110 different 3D object models, rendered in five scenes with diverse illumination, reflection, and occlusion conditions.
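A 6D pose annotation of the kind described above is typically consumed as a rotation matrix R and translation vector t. The sketch below shows that standard usage; the exact annotation file format of Extra FAT is not specified here, and the function names and intrinsics matrix are illustrative assumptions.

```python
import numpy as np

def apply_pose(points, R, t):
    """Transform 3D model points into the camera frame using a pose
    given as a 3x3 rotation matrix R and a 3-vector translation t."""
    return points @ R.T + t

def project(points_cam, K):
    """Project camera-frame points to pixel coordinates with a
    3x3 intrinsics matrix K (perspective divide)."""
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]
```

Given a model point set and a ground-truth (R, t) pair, `project(apply_pose(pts, R, t), K)` yields the pixel locations that should coincide with the object's segmentation mask, which is a common sanity check when loading pose datasets.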

