Image Reconstruction and Per-pixel Classification

Depth maps obtained from sensors are often unsatisfactory because of their low resolution and noise. In this paper, we propose a real-time depth map enhancement system based on a residual network that processes depth maps and intensity maps in two separate channels and requires no preprocessing step; the proposed algorithm runs in real time at more than 30 fps. We also present an FPGA design and implementation for depth sensing, in which the intensity image and depth image are captured by a dual-camera synchronous acquisition system and serve as the input to the neural network. Experiments on various depth map restoration tasks show that our algorithm outperforms the existing LRMC, DE-CNN, and DDTF algorithms on standard datasets and achieves better depth map super-resolution. Tests of the FPGA system confirm that the data throughput of the acquisition system's USB 3.0 interface is stable at 226 Mbps and that it supports both cameras working at full speed, that is, 54 fps @ (1280 × 960 + 328 × 248 × 3).

RobotP: A Benchmark Dataset for 6D Object Pose Estimation

Sensors ◽

10.3390/s21041299 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1299

Author(s):

Honglin Yuan ◽

Tim Hoogenkamp ◽

Remco C. Veltkamp

Keyword(s):

Pose Estimation ◽

Ground Truth ◽

3D Models ◽

Depth Image ◽

Great Success ◽

Estimation Algorithms ◽

Depth Images ◽

Object Pose Estimation ◽

Image Pairs ◽

Bounding Boxes

Deep learning has achieved great success on robotic vision tasks. However, compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset, which consists of commonly used objects for benchmarking 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Based on the generated data, we then produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups through the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested on the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and that photo-realistic images help improve the performance of pose estimation algorithms.
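
The step in which 2D bounding boxes are derived from segmentation masks can be illustrated with a minimal sketch. It assumes a binary mask stored as a list of pixel rows; the abstract does not describe the dataset's actual mask format or tooling, and the function name `mask_to_bbox` is hypothetical.

```python
from typing import List, Optional, Tuple


def mask_to_bbox(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max), inclusive, of the nonzero pixels.

    Assumption: `mask` is a list of rows, where a nonzero value marks an
    object pixel. Returns None when the mask contains no object pixel.
    """
    # Column indices of every object pixel.
    xs = [x for row in mask for x, value in enumerate(row) if value]
    # Row indices of every row that contains at least one object pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


# Toy 4 x 4 mask with a small L-shaped object.
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
toy_bbox = mask_to_bbox(toy_mask)
```

Returning inclusive pixel coordinates is a design choice of this sketch; a pipeline that stores boxes as (x, y, width, height) would convert with `width = x_max - x_min + 1`.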

Hole Concealment for Depth Image Using Pixel Classification in Multiview System

2021 IEEE International Conference on Consumer Electronics (ICCE) ◽

10.1109/icce50685.2021.9427596 ◽

2021 ◽

Author(s):

Geon-Won Lee ◽

Jong-Ki Han

Keyword(s):

Depth Image ◽

Pixel Classification

Robust hand gesture recognition using multiple shape-oriented visual cues

EURASIP Journal on Image and Video Processing ◽

10.1186/s13640-021-00567-1 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Samy Bakheet ◽

Ayoub Al-Hamadi

Keyword(s):

Real Time ◽

Gesture Recognition ◽

Pose Estimation ◽

Depth Map ◽

Hand Gesture Recognition ◽

Support Vector ◽

Hand Gesture ◽

Hand Pose Estimation ◽

Time Operation ◽

Hand Pose

Abstract: Robust vision-based hand pose estimation is highly sought after but remains a challenging task, partly because of self-occlusion among the fingers. In this paper, an innovative framework for real-time static hand gesture recognition is introduced, based on an optimized shape representation built from multiple shape cues. The framework incorporates a dedicated module for hand pose estimation based on depth map data, in which the hand silhouette is first extracted from the extremely detailed and accurate depth map captured by a time-of-flight (ToF) depth sensor. A hybrid multi-modal descriptor that integrates multiple affine-invariant boundary-based and region-based features is created from the hand silhouette to obtain a reliable and representative description of individual gestures. Finally, an ensemble of one-vs.-all support vector machines (SVMs) is independently trained on each of these learned feature representations to perform gesture classification. When evaluated on a publicly available dataset incorporating a relatively large and diverse collection of egocentric hand gestures, the approach yields encouraging results that compare very favorably with those reported in the literature, while maintaining real-time operation.
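
The one-vs.-all classification stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a linear SVM trained by full-batch subgradient descent on the regularized hinge loss (the abstract does not state the kernel or solver), and the two-dimensional toy features and all function names are hypothetical stand-ins for the shape descriptors.

```python
from typing import Dict, List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     lam: float = 0.01, epochs: int = 500,
                     lr: float = 0.1) -> Model:
    """Binary linear SVM with labels in {-1, +1}.

    Minimises lam/2 * ||w||^2 + mean hinge loss by subgradient descent.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_all(X: Sequence[Sequence[float]],
                     labels: Sequence[str]) -> Dict[str, Model]:
    """Train one binary SVM per class: that class against all the others."""
    return {c: train_linear_svm(X, [1 if l == c else -1 for l in labels])
            for c in sorted(set(labels))}


def predict(models: Dict[str, Model], x: Sequence[float]) -> str:
    """Return the class whose SVM gives the highest decision score."""
    def score(c: str) -> float:
        w, b = models[c]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)


# Hypothetical toy features: three well-separated gesture classes.
X_toy = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
         [4.0, 0.0], [4.5, 0.0], [4.0, 0.5],
         [0.0, 4.0], [0.0, 4.5], [0.5, 4.0]]
y_toy = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
gesture_models = train_one_vs_all(X_toy, y_toy)
```

Calling `predict(gesture_models, [4.2, 0.3])` picks the class with the largest decision score. An ensemble over several feature representations, as the abstract describes, would train one such set of models per representation and combine their scores; the combination rule is not given in the abstract.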

MOPED: A scalable and low latency object recognition and pose estimation system

2010 IEEE International Conference on Robotics and Automation ◽

10.1109/robot.2010.5509801 ◽

2010 ◽

Cited By ~ 48

Author(s):

Manuel Martinez ◽

Alvaro Collet ◽

Siddhartha S Srinivasa

Keyword(s):

Object Recognition ◽

Pose Estimation ◽

Low Latency ◽

Estimation System

RGB-D object recognition and pose estimation based on pre-trained convolutional neural network features

2015 IEEE International Conference on Robotics and Automation (ICRA) ◽

10.1109/icra.2015.7139363 ◽

2015 ◽

Cited By ~ 117

Author(s):

Max Schwarz ◽

Hannes Schulz ◽

Sven Behnke

Keyword(s):

Neural Network ◽

Object Recognition ◽

Convolutional Neural Network ◽

Pose Estimation

Local Regression Based Hourglass Network for Hand Pose Estimation from a Single Depth Image

2018 24th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr.2018.8545460 ◽

2018 ◽

Author(s):

Jia Li ◽

Zengfu Wang

Keyword(s):

Pose Estimation ◽

Depth Image ◽

Local Regression ◽

Hand Pose Estimation ◽

Hand Pose

Principal Component Analysis For ICP Pose Estimation Of Space Structures

10.32920/ryerson.14656941 ◽

2021 ◽

Author(s):

Lun H. Mark

Keyword(s):

Pose Estimation ◽

Point Cloud ◽

Principal Component ◽

Point Clouds ◽

Optimal Combination ◽

Estimation Accuracy ◽

Space Structures ◽

Complex Objects ◽

Error Norm ◽

Simulation Results

This thesis investigates how the geometry of complex objects is related to LIDAR scanning with Iterative Closest Point (ICP) pose estimation and provides statistical means to assess the pose accuracy. LIDAR scanners have become essential parts of space vision systems for autonomous docking and rendezvous. Principal Component Analysis-based geometric constraint indices have been found to be strongly related to the pose error norm and to the error of each individual degree of freedom. This leads to the development of several strategies for identifying the best view of an object and the optimal combination of localized scanned areas of the object's surface to achieve accurate pose estimation. Also investigated is the possible relation between the ICP pose estimation accuracy and the distribution or allocation of the point cloud. The simulation results were validated using point clouds generated by scanning models of Quicksat and a cuboctahedron using Neptec's TriDAR scanner.
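
As a rough illustration of a PCA-based geometric index, the sketch below computes the principal-component variances of a 3D point cloud and the ratio of the smallest to the largest. This is an assumption made for illustration only: the abstract does not define the thesis's constraint indices, and plain coordinate PCA is a simplified stand-in. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def covariance_3d(points: Sequence[Sequence[float]]) -> List[List[float]]:
    """Population covariance matrix (3 x 3) of a list of 3D points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    return cov


def symmetric_eigenvalues_3x3(a: List[List[float]]) -> List[float]:
    """Eigenvalues of a symmetric 3 x 3 matrix in descending order,
    using the closed-form trigonometric solution."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # b = (a - q * I) / p
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3
    return [e1, e2, e3]


def spread_ratio(points: Sequence[Sequence[float]]) -> float:
    """Smallest over largest principal variance; near 0 for a flat patch."""
    e = symmetric_eigenvalues_3x3(covariance_3d(points))
    return e[2] / e[0] if e[0] > 0.0 else 0.0


# Hypothetical scans: a tilted planar patch and the corners of a unit cube.
tilted_plane = [(0, 0, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1)]
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
```

A ratio near zero flags a scanned area whose points lie close to a plane, while a ratio near one indicates points spread evenly in all three directions. How such a quantity relates to ICP pose error is the subject of the thesis and is not reproduced here.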

Foreground Segmentation and High-Resolution Depth Map Generation Using a Time-of-Flight Depth Camera

The Journal of Korean Institute of Communications and Information Sciences ◽

10.7840/kics.2012.37c.9.751 ◽

2012 ◽

Vol 37C (9) ◽

pp. 751-756

Author(s):

Yun-Suk Kang ◽

Yo-Sung Ho

Keyword(s):

High Resolution ◽

Time Of Flight ◽

Depth Map ◽

Foreground Segmentation ◽

Depth Camera ◽

Map Generation
