Iterative Pose Refinement for Object Pose Estimation Based on RGBD Data

Sensors, 2020, Vol. 20 (15), 4114
Author(s): Shao-Kang Huang, Chen-Chien Hsu, Wei-Yen Wang, Cheng-Hung Lin

Accurate estimation of 3D object pose is highly desirable in a wide range of applications, such as robotics and augmented reality. Although significant progress has been made in pose estimation, there is still room for improvement. Recent pose estimation systems use an iterative refinement process to revise the predicted pose and obtain a better final output. However, such a refinement process takes only geometric features into account during the iterations. Building on this approach, this paper designs a novel iterative refinement process that exploits both color and geometric features for object pose refinement. Experiments show that the proposed method reaches 94.74% and 93.2% on the ADD(-S) metric with only two iterations, outperforming state-of-the-art methods on the LINEMOD and YCB-Video datasets, respectively.
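
A minimal sketch of the kind of refinement loop described above, in which a corrective pose update is predicted from fused color and geometric features at each iteration. The refiner object and its two methods are hypothetical placeholders standing in for the paper's network, not the authors' implementation:

    import numpy as np

    def compose(R, t, dR, dt):
        # Apply a predicted pose update (dR, dt) to the current pose (R, t).
        return dR @ R, dR @ t + dt

    def refine_pose(rgb, model_points, R, t, refiner, n_iters=2):
        for _ in range(n_iters):
            # Transform the object model into the current pose hypothesis.
            cloud = model_points @ R.T + t
            # Fuse per-point color and geometry features (placeholder call).
            feats = refiner.extract_features(rgb, cloud)
            # Predict a small corrective rotation/translation and compose it.
            dR, dt = refiner.predict_delta(feats)
            R, t = compose(R, t, dR, dt)
        return R, t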

2021, Vol. 11 (18), 8750
Author(s): Styliani Verykokou, Argyro-Maria Boutsi, Charalabos Ioannidis

Mobile Augmented Reality (MAR) is designed to keep pace with high-end mobile devices and their powerful sensors. This evolution excludes users with low-end devices and network constraints. This article presents ModAR, a hybrid Android prototype that extends the MAR experience to this target group. It combines feature-based image matching and pose estimation with fast rendering of 3D textured models. Planar objects in the real environment are used as pattern images for overlaying users' meshes or the app's default ones. Since ModAR is built on the OpenCV C++ library via the Android NDK and the OpenGL ES 2.0 graphics API, it has no dependencies on additional software, operating-system version, or model-specific hardware. The developed 3D graphics engine implements optimized vertex-data rendering with a combination of data grouping, synchronization, sub-texture compression, and instancing for limited CPU/GPU resources and a single-threaded approach. It achieves up to a 3× speed-up compared to standard index rendering, and AR overlay of a 50K-vertex 3D model in less than 30 s. Several deployment scenarios for pose estimation demonstrate that the oriented FAST detector with an upper threshold of features per frame, combined with the ORB descriptor, yields the best results in terms of robustness and efficiency: image matching time is reduced by 90% compared to the AGAST detector with the BRISK descriptor, with pattern recognition accuracy above 90% for a wide range of scale changes, regardless of in-plane rotations and partial occlusions of the pattern.
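
The reported detector/descriptor combination maps directly onto standard OpenCV calls. The following Python sketch (the app itself uses the C++ API) shows oriented FAST keypoints capped per frame, ORB descriptors, Hamming-distance matching, and a RANSAC homography against a planar pattern; the feature cap and RANSAC threshold are illustrative values, not the app's tuned ones:

    import cv2
    import numpy as np

    MAX_FEATURES = 500  # assumed upper threshold of features per frame
    orb = cv2.ORB_create(nfeatures=MAX_FEATURES)  # oriented FAST + ORB
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    pattern = cv2.imread("pattern.jpg", cv2.IMREAD_GRAYSCALE)
    frame = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)

    kp_p, des_p = orb.detectAndCompute(pattern, None)
    kp_f, des_f = orb.detectAndCompute(frame, None)
    matches = sorted(matcher.match(des_p, des_f), key=lambda m: m.distance)

    # RANSAC homography tolerates partial occlusion and in-plane rotation.
    src = np.float32([kp_p[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_f[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)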


Sensors, 2020, Vol. 20 (23), 6790
Author(s): Chi Xu, Jiale Chen, Mengyang Yao, Jun Zhou, Lijun Zhang, et al.

6DoF object pose estimation is a foundation for many important applications, such as robotic grasping and automated driving. However, estimating the 6DoF pose of transparent objects, which are common in daily life, is very challenging, because the optical characteristics of transparent materials lead to significant depth errors, which in turn cause false estimates. To solve this problem, a two-stage approach is proposed to estimate the 6DoF pose of a transparent object from a single RGB-D image. In the first stage, the influence of the depth error is eliminated by transparent-object segmentation, surface-normal recovery, and RANSAC plane estimation. In the second stage, an extended point-cloud representation is presented to estimate the object pose accurately and efficiently. To the best of our knowledge, this is the first deep-learning-based approach focused on 6DoF pose estimation of transparent objects from a single RGB-D image. Experimental results show that the proposed approach effectively estimates the 6DoF pose of transparent objects and outperforms state-of-the-art baselines by a large margin.
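
The first stage can be pictured as follows: depth measurements inside the transparent-object mask are unreliable, so they are discarded and the supporting plane is recovered from the trusted points by RANSAC. A sketch using Open3D's plane segmentation, with the segmentation mask assumed to come from a separate network:

    import numpy as np
    import open3d as o3d

    def recover_support_plane(cloud_xyz, transparent_mask):
        # Keep only points outside the transparent region (trusted depth).
        trusted = cloud_xyz[~transparent_mask]
        pcd = o3d.geometry.PointCloud()
        pcd.points = o3d.utility.Vector3dVector(trusted)
        # RANSAC plane fit: returns (a, b, c, d) with ax + by + cz + d = 0.
        plane, inlier_idx = pcd.segment_plane(distance_threshold=0.005,
                                              ransac_n=3,
                                              num_iterations=1000)
        return plane, inlier_idx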


2021, Vol. 2
Author(s): Paschalis Panteleris, Damien Michel, Antonis Argyros

The solutions to many computer vision problems, including 6D object pose estimation, are nowadays dominated by the learning-based paradigm. In this paper, we investigate 6D object pose estimation in a practical, real-world setting in which a mobile device (smartphone or tablet) needs to be localized in front of a museum exhibit, in support of an augmented-reality application scenario. In view of the constraints and priorities set by this particular setting, we consider an appropriately tailored classical method as well as a learning-based one. Moreover, we develop a hybrid method that consists of both classical and learning-based components. All three methods are evaluated quantitatively on a standard benchmark dataset, as well as on a new dataset specific to the museum guidance scenario of interest.
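
One plausible shape for such a hybrid, sketched under assumptions since the paper's components are not detailed here: run the cheap classical estimator first and fall back to the learned model only when match quality is too low. All names and the inlier threshold are invented for illustration:

    def estimate_pose(frame, classical, learned, min_inliers=30):
        pose, inliers = classical(frame)   # e.g. feature matching + PnP
        if pose is not None and inliers >= min_inliers:
            return pose                    # fast path for textured exhibits
        return learned(frame)              # robust but heavier fallback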


Sensors, 2021, Vol. 21 (23), 8092
Author(s): Maomao Zhang, Ao Li, Honglei Liu, Minghui Wang

The analysis of hand–object poses from RGB images is important for understanding and imitating human behavior, and acts as a key factor in various applications. In this paper, we propose a novel coarse-to-fine two-stage framework for hand–object pose estimation, which explicitly models hand–object relations during 3D pose refinement rather than in the process of converting 2D poses to 3D poses. Specifically, in the coarse stage, 2D heatmaps of hand and object keypoints are obtained from the RGB image and subsequently fed into a pose regressor to derive coarse 3D poses. In the fine stage, an interaction-aware graph convolutional network called InterGCN is introduced to perform pose refinement by fully leveraging hand–object relations in the 3D context. One major challenge in 3D pose refinement lies in the fact that the relations between hand and object change dynamically across different hand–object interaction (HOI) scenarios. In response, we leverage both general and interaction-specific relation graphs to significantly enhance the network's capacity to cover variations of HOI scenarios for successful 3D pose refinement. Extensive experiments demonstrate state-of-the-art performance of our approach on benchmark hand–object datasets.
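
To make the refinement stage concrete, the following PyTorch sketch shows a single graph-convolution layer over the combined set of hand and object keypoints, where the adjacency encodes which joints and corners are related; InterGCN's interaction-specific graphs would swap in different adjacencies per HOI scenario. This is an illustration, not the authors' network:

    import torch
    import torch.nn as nn

    class GraphConv(nn.Module):
        def __init__(self, in_dim, out_dim, num_nodes):
            super().__init__()
            self.weight = nn.Linear(in_dim, out_dim, bias=False)
            # Learnable adjacency over all hand + object keypoints.
            self.adj = nn.Parameter(torch.eye(num_nodes))

        def forward(self, x):  # x: (batch, num_nodes, in_dim)
            a = torch.softmax(self.adj, dim=-1)     # normalize relations
            return torch.relu(a @ self.weight(x))   # aggregate neighbors

    # E.g. 21 hand joints + 8 object corners; 3D coordinates in, features out.
    layer = GraphConv(3, 64, num_nodes=29)
    coarse_poses = torch.randn(2, 29, 3)   # coarse 3D poses from stage one
    refined_feats = layer(coarse_poses)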


DOI: 10.29007/d3bt, 2018
Author(s): Sepideh Asadi, Martin Blicha, Grigory Fedyukovich, Antti Hyvärinen, Karine Even-Mendoza, et al.

SMT-based program verification can achieve high precision using bit-precise models or combinations of different theories. Such approaches often suffer from scalability problems due to the complexity of the underlying decision procedures. Precision can be traded for performance by increasing the abstraction level of the model, but as the level of abstraction increases, missing important details of the program model becomes problematic. In this paper we address this problem with an incremental verification approach that adjusts the precision of program modules on demand. The idea is to model a program using the lightest (i.e., least expensive) theories that suffice to verify the desired property. To this end, we employ safe over-approximations of the program based on both function summaries and lightweight SMT theories. If verification reveals that the precision is too low, our approach lazily strengthens the affected summaries or the theory through an iterative refinement procedure. The resulting summarization framework provides a natural and lightweight way to carry information between different theories. An experimental evaluation with a bounded model checker for C on a wide range of benchmarks demonstrates that our approach scales well, often effortlessly solving instances on which the state-of-the-art model checker CBMC runs out of time or memory.
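
A toy z3 sketch of the lazy-strengthening idea: the property is first checked against the coarsest over-approximation (the callee left entirely uninterpreted), and only when a potential counterexample appears is the summary strengthened to a bit-precise definition and rechecked. The program and property are invented for illustration:

    from z3 import BitVec, BitVecSort, Function, Solver, ForAll, unsat

    x = BitVec('x', 32)
    f = Function('f', BitVecSort(32), BitVecSort(32))  # callee summary

    def holds(assumption):
        s = Solver()
        s.add(assumption)
        s.add(f(x) < 0)            # negated property: f(x) >= 0 (signed)
        return s.check() == unsat

    # Level 1: coarsest summary, f is completely unconstrained.
    if holds(True):
        print("verified with the lightweight summary")
    else:
        # Potential counterexample: lazily strengthen f to a bit-precise
        # definition (here, f clears the sign bit) and recheck.
        precise = ForAll([x], f(x) == (x & 0x7FFFFFFF))
        print("verified bit-precisely" if holds(precise) else "real bug")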


2021, Vol. 1, 87
Author(s): Konstantinos C. Apostolakis, Nikolaos Dimitriou, George Margetis, Stavroula Ntoa, Dimitrios Tzovaras, et al.

Background: Augmented reality (AR) and artificial intelligence (AI) are highly disruptive technologies that have revolutionised practices in a wide range of domains. Their potential has not gone unnoticed in the security sector, with several law enforcement agencies (LEAs) employing AI applications in their daily operations for forensics and surveillance. In this paper, we present the DARLENE ecosystem, which aims to bridge existing gaps in applying AR and AI technologies for rapid tactical decision-making in situ with minimal error margin, thus enhancing LEAs' efficiency and situational awareness (SA).

Methods: DARLENE incorporates novel AI techniques for computer vision tasks such as activity recognition and pose estimation, while also building an AR framework for visualizing the inference results via dynamic content adaptation according to each officer's stress level and current context. The concept has been validated with end-users through co-creation workshops, while the decision-making mechanism for enhancing LEAs' SA has been assessed with experts. Regarding the computer vision components, preliminary tests of the instance segmentation method for detecting humans and objects have been conducted on a subset of videos from the RWF-2000 dataset for violence detection; these videos have also been used to test a human pose estimation method that has so far exhibited impressive results and will form the basis of further developments in DARLENE.

Results: Evaluation results highlight that target users are positive towards adopting the proposed solution in field operations, and that the SA decision-making mechanism produces highly acceptable outcomes. Evaluation of the computer vision components yielded promising results and identified opportunities for improvement.

Conclusions: This work provides the context of the DARLENE ecosystem, presents the DARLENE architecture, analyses its individual technologies, and demonstrates preliminary results, which are positive both in terms of technological achievements and in terms of user acceptance of the proposed solution.
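
The DARLENE models themselves are not described in detail here, so as a stand-in illustration of the kind of preliminary pose-estimation test reported on RWF-2000 frames, the following sketch runs an off-the-shelf torchvision keypoint detector over a single frame; the confidence threshold is an arbitrary assumption:

    import torch
    from torchvision.models.detection import keypointrcnn_resnet50_fpn

    model = keypointrcnn_resnet50_fpn(weights="DEFAULT").eval()

    frame = torch.rand(3, 480, 640)      # stand-in RGB frame in [0, 1]
    with torch.no_grad():
        out = model([frame])[0]          # boxes, labels, scores, keypoints

    confident = out["scores"] > 0.8      # keep confident person detections
    keypoints = out["keypoints"][confident]  # (N, 17, 3): x, y, visibility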


2021, Vol. 8
Author(s): Zhou Hao, R. B. Ashith Shyam, Arunkumar Rathinam, Yang Gao

Conventional spacecraft Guidance, Navigation, and Control (GNC) architectures have been designed to receive and execute commands from ground control, with minimal automation and autonomy onboard the spacecraft. In contrast, Artificial Intelligence (AI)-based systems allow real-time decision-making that takes into account system information which is difficult to model and incorporate into the conventional decision-making process involving ground control or human operators. With growing interest in on-orbit services involving manipulation, conventional GNC faces numerous challenges in adapting to a wide range of possible scenarios, such as removing unknown debris, which could potentially be addressed using emerging AI-enabled robotic technologies. However, a complete paradigm shift may take years of effort. As an intermediate solution, we introduce a novel visual GNC system with two state-of-the-art AI modules that replace the corresponding functions in a conventional GNC system for on-orbit manipulation. The AI components are as follows: (i) a Deep Learning (DL)-based pose estimation algorithm that estimates the target's pose from two-dimensional images using a pre-trained neural network, without requiring any prior information on the dynamics or state of the target; and (ii) a technique for modeling and controlling space robot manipulator trajectories using probabilistic modeling and reproduction, which generalizes to previously unseen situations and avoids complex on-board trajectory optimization. This also minimizes the attitude disturbances induced on the spacecraft by the motion of the robot arm. The architecture uses a centralized camera network as the main sensor, and the trajectory-learning module of the 7-degrees-of-freedom robotic arm is integrated into the GNC system. The intelligent visual GNC system is demonstrated through simulation of a conceptual mission, AISAT, in which a micro-satellite carries out on-orbit manipulation around a non-cooperative CubeSat. The simulation shows how the GNC system works in discrete time, with control and trajectory planning generated in MATLAB/Simulink. The physics rendering engine Eevee renders the whole simulation to provide graphical realism for DL pose estimation. Finally, the testbeds developed to evaluate and demonstrate the GNC system are introduced. The novel intelligent GNC system can be a stepping stone toward future fully autonomous orbital robot systems.
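
The abstract does not spell out the probabilistic trajectory formulation, but one common instantiation is a movement primitive with a Gaussian distribution over basis-function weights, learned from demonstrations and reproduced as a mean trajectory with pointwise uncertainty. A minimal numpy sketch under that assumption, for a single joint:

    import numpy as np

    T, K = 100, 15                       # timesteps, radial basis functions
    t = np.linspace(0, 1, T)
    centers = np.linspace(0, 1, K)
    Phi = np.exp(-(t[:, None] - centers[None, :]) ** 2 / (2 * 0.02))
    Phi /= Phi.sum(axis=1, keepdims=True)          # (T, K) basis matrix

    def fit_weights(demo, lam=1e-6):
        # Ridge-regress basis weights for one demonstrated joint trajectory.
        return np.linalg.solve(Phi.T @ Phi + lam * np.eye(K), Phi.T @ demo)

    demos = [np.sin(np.pi * t) + 0.05 * np.random.randn(T) for _ in range(10)]
    W = np.stack([fit_weights(d) for d in demos])  # (n_demos, K)
    mu_w, Sigma_w = W.mean(axis=0), np.cov(W.T)    # weight distribution

    mean_traj = Phi @ mu_w                         # reproduce the mean motion
    var_traj = np.einsum('tk,kl,tl->t', Phi, Sigma_w, Phi)  # uncertainty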

