scholarly journals Image Reconstruction and Per-pixel Classification

2020 ◽  
Vol 8 (6) ◽  
pp. 5612-5617

We describe face classification algorithm which can be used for object recognition, pose estimation, tracking and gesture recognition which are useful for human-computer interaction. We make use of depth camera (Creative Interactive Gesture Camera – Kinect®) to acquire the images which gives several advantages when compared over a normal RGB optical camera. In this paper we demonstrate a intermediate parsing scheme, so that an accurate per-pixel classification is used to localize the joints. We make use of an efficient random decision forest to classify the image which in turn helps to estimate the pose. As we employ depth camera to acquire depth image it may contain holes on or around depth map, so we first fill those holes and the classify the image. Simulation results was observed by varying several training parameters of the decision forest. We generally learned an efficient method which stems the basics in the development of pose estimation and tracking. Also we gained an intensive knowledge on Decision forests

Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 546
Author(s):  
Zhenni Li ◽  
Haoyi Sun ◽  
Yuliang Gao ◽  
Jiao Wang

Depth maps obtained through sensors are often unsatisfactory because of their low-resolution and noise interference. In this paper, we propose a real-time depth map enhancement system based on a residual network which uses dual channels to process depth maps and intensity maps respectively and cancels the preprocessing process, and the algorithm proposed can achieve real-time processing speed at more than 30 fps. Furthermore, the FPGA design and implementation for depth sensing is also introduced. In this FPGA design, intensity image and depth image are captured by the dual-camera synchronous acquisition system as the input of neural network. Experiments on various depth map restoration shows our algorithms has better performance than existing LRMC, DE-CNN and DDTF algorithms on standard datasets and has a better depth map super-resolution, and our FPGA completed the test of the system to ensure that the data throughput of the USB 3.0 interface of the acquisition system is stable at 226 Mbps, and support dual-camera to work at full speed, that is, 54 fps@ (1280 × 960 + 328 × 248 × 3).


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1299
Author(s):  
Honglin Yuan ◽  
Tim Hoogenkamp ◽  
Remco C. Veltkamp

Deep learning has achieved great success on robotic vision tasks. However, when compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset consisting of commonly used objects for benchmarking in 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups by the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested by the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and photo-realistic images are helpful in increasing the performance of pose estimation algorithms.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Samy Bakheet ◽  
Ayoub Al-Hamadi

AbstractRobust vision-based hand pose estimation is highly sought but still remains a challenging task, due to its inherent difficulty partially caused by self-occlusion among hand fingers. In this paper, an innovative framework for real-time static hand gesture recognition is introduced, based on an optimized shape representation build from multiple shape cues. The framework incorporates a specific module for hand pose estimation based on depth map data, where the hand silhouette is first extracted from the extremely detailed and accurate depth map captured by a time-of-flight (ToF) depth sensor. A hybrid multi-modal descriptor that integrates multiple affine-invariant boundary-based and region-based features is created from the hand silhouette to obtain a reliable and representative description of individual gestures. Finally, an ensemble of one-vs.-all support vector machines (SVMs) is independently trained on each of these learned feature representations to perform gesture classification. When evaluated on a publicly available dataset incorporating a relatively large and diverse collection of egocentric hand gestures, the approach yields encouraging results that agree very favorably with those reported in the literature, while maintaining real-time operation.


2021 ◽  
Author(s):  
Lun H. Mark

This thesis investigates how geometry of complex objects is related to LIDAR scanning with the Iterative Closest Point (ICP) pose estimation and provides statistical means to assess the pose accuracy. LIDAR scanners have become essential parts of space vision systems for autonomous docking and rendezvous. Principal Componenet Analysis based geometric constraint indices have been found to be strongly related to the pose error norm and the error of each individual degree of freedom. This leads to the development of several strategies for identifying the best view of an object and the optimal combination of localized scanned areas of the object's surface to achieve accurate pose estimation. Also investigated is the possible relation between the ICP pose estimation accuracy and the districution or allocation of the point cloud. The simulation results were validated using point clouds generated by scanning models of Quicksat and a cuboctahedron using Neptec's TriDAR scanner.


2021 ◽  
Author(s):  
Lun H. Mark

This thesis investigates how geometry of complex objects is related to LIDAR scanning with the Iterative Closest Point (ICP) pose estimation and provides statistical means to assess the pose accuracy. LIDAR scanners have become essential parts of space vision systems for autonomous docking and rendezvous. Principal Componenet Analysis based geometric constraint indices have been found to be strongly related to the pose error norm and the error of each individual degree of freedom. This leads to the development of several strategies for identifying the best view of an object and the optimal combination of localized scanned areas of the object's surface to achieve accurate pose estimation. Also investigated is the possible relation between the ICP pose estimation accuracy and the districution or allocation of the point cloud. The simulation results were validated using point clouds generated by scanning models of Quicksat and a cuboctahedron using Neptec's TriDAR scanner.


Sign in / Sign up

Export Citation Format

Share Document