Depth Estimation of a Deformable Object via a Monocular Camera

2019 ◽  
Vol 9 (7) ◽  
pp. 1366 ◽  
Author(s):  
Guolai Jiang ◽  
Shaokun Jin ◽  
Yongsheng Ou ◽  
Shoujun Zhou

Depth estimation of 3D deformable objects has become increasingly important to various intelligent applications. In this paper, we propose a feature-based approach for accurate depth estimation of a deformable 3D object with a single camera, which reduces the depth estimation problem to a pose estimation problem. The proposed method reconstructs the target object at the very beginning. With this 3D reconstruction as an a priori model, only one monocular image is required afterwards to estimate the target object’s depth accurately, regardless of pose changes or deformation of the object. Experiments are conducted on a NAO robot and a human to evaluate the depth estimation accuracy of the proposed method.
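
The reduction of depth estimation to pose estimation can be illustrated with a minimal sketch (an illustrative simplification, not the authors' implementation): once the rotation R and translation t of the a priori 3D model have been estimated from a single image, the depth of any model point is simply its z-coordinate after transformation into the camera frame.

```python
import numpy as np

def point_depths(model_pts, R, t):
    """Depth (camera-frame z) of each 3D model point under pose (R, t).

    model_pts : (N, 3) points in the object/model frame
    R : (3, 3) rotation, t : (3,) translation of model -> camera frame
    """
    cam_pts = model_pts @ R.T + t   # transform into the camera frame
    return cam_pts[:, 2]            # depth along the optical axis

# toy example: model placed 1 m in front of the camera, no rotation
pts = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.05]])
depths = point_depths(pts, np.eye(3), np.array([0.0, 0.0, 1.0]))
```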

Author(s):  
CHENGGUANG ZHU ◽  
Zhongpai Gao ◽  
Jiankang Zhao ◽  
Haihui Long ◽  
Chuanqi Liu

Abstract: The relative pose estimation of a space noncooperative target is an attractive yet challenging task due to the complexity of the target background and illumination and the lack of a priori knowledge. These negative factors gravely impact the estimation accuracy and the robustness of filter algorithms. In response, this paper proposes a novel filter algorithm, based on a stereo vision system, that estimates the relative pose with improved robustness. First, to obtain a coarse relative pose, the weighted total least squares (WTLS) algorithm is adopted to estimate the relative pose from several feature points. The resulting relative pose is fed into the subsequent filter scheme as the observation quantity. Second, the classic Bayes filter is exploited to estimate the relative state, except for the moment-of-inertia ratios. Additionally, the one-step prediction results are fed back to initialize the WTLS. The proposed algorithm eliminates the dependency on continuous tracking of several fixed points. Finally, comparison experiments demonstrate that the proposed algorithm performs better in terms of robustness and convergence time.
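
The paper's WTLS formulation is not given here; as a rough illustration of recovering a coarse rigid pose from weighted feature-point correspondences, a weighted Kabsch/Procrustes alignment (an assumption standing in for the authors' WTLS) looks like:

```python
import numpy as np

def weighted_pose(src, dst, w):
    """Coarse rigid pose (R, t) aligning weighted 3D point sets.

    src, dst : (N, 3) corresponding feature points, w : (N,) weights
    Minimizes sum_i w_i * ||R @ src_i + t - dst_i||^2.
    """
    w = w / w.sum()
    mu_s = w @ src                       # weighted centroids
    mu_d = w @ dst
    H = (src - mu_s).T @ np.diag(w) @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                   # proper rotation (det = +1)
    t = mu_d - R @ mu_s
    return R, t

# toy check: recover a known rotation about z and a known translation
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -0.2, 1.0])
src = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.],
                [0., 0., 1.], [1., 1., 1.]])
dst = src @ R_true.T + t_true
R_est, t_est = weighted_pose(src, dst, np.ones(len(src)))
```

In a filter pipeline such as the one described, the weights would come from per-feature confidence, and the resulting (R, t) would serve as the observation for the Bayes filter.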


2014 ◽  
Vol 11 (03) ◽  
pp. 1450025
Author(s):  
Jonghyun Seo ◽  
Jangmyung Lee

This paper proposes an object-based vision (O-BV) system to implement visual servoing for autonomous mobile manipulators using two charge-coupled device (CCD) cameras. A conventional stereo vision (C-SV) system estimates depth from the disparity between the two camera images of the same object; however, disparity is not an effective cue at long distances, where it becomes small. To resolve this problem, in the proposed O-BV system each camera tracks the object independently, and the angles of the two cameras are used to estimate the distance to the object. This depth estimation technique enables an autonomous mobile robot to approach a target object precisely. The O-BV system is experimentally compared with the C-SV system in terms of computing time and depth estimation accuracy. The two cameras, mounted on top of the autonomous mobile manipulator, are used for visual servoing so that the manipulator can approach a target object precisely. The experiments demonstrate that fast and precise depth estimation is a critical factor for successful visual servoing.
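
The angle-based depth estimate behind the O-BV idea can be sketched as follows (a simplified geometry, assuming both cameras pan in a common plane and each angle is measured from the forward direction toward the target):

```python
import math

def angle_based_depth(baseline, theta_left, theta_right):
    """Distance to a target tracked by two independently panning cameras.

    baseline    : distance between the two camera centers (m)
    theta_left  : pan angle of the left camera toward the target (rad)
    theta_right : pan angle of the right camera toward the target (rad)

    With cameras at (-b/2, 0) and (b/2, 0) and the target at (x, z):
        tan(theta_l) = (x + b/2) / z,  tan(theta_r) = (b/2 - x) / z
    so  z = b / (tan(theta_l) + tan(theta_r)).
    """
    return baseline / (math.tan(theta_left) + math.tan(theta_right))

# symmetric target: each camera pans by atan((b/2) / z), recovering z
b, z = 0.2, 2.0
theta = math.atan((b / 2) / z)
```

Unlike disparity, the pan angles remain directly measurable at long range, which is the advantage the abstract describes.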


2020 ◽  
Vol 11 (1) ◽  
pp. 3
Author(s):  
Laura Gonçalves Ribeiro ◽  
Olli J. Suominen ◽  
Ahmed Durmush ◽  
Sari Peltonen ◽  
Emilio Ruiz Morales ◽  
...  

Visual technologies have an indispensable role in safety-critical applications, where tasks must often be performed through teleoperation. Due to the lack of stereoscopic and motion parallax depth cues in conventional images, alignment tasks pose a significant challenge to remote operation. In this context, machine vision can provide mission-critical information to augment the operator’s perception. In this paper, we propose a retro-reflector marker-based teleoperation aid to be used in hostile remote handling environments. The system computes the remote manipulator’s position with respect to the target using a set of one or two low-resolution cameras attached to its wrist. We develop an end-to-end pipeline of calibration, marker detection, and pose estimation, and extensively study the performance of the overall system. The results demonstrate that we have successfully engineered a retro-reflective marker from materials that can withstand the extreme temperature and radiation levels of the environment. Furthermore, we demonstrate that the proposed marker-based approach provides robust and reliable estimates and significantly outperforms a previous stereo-matching-based approach, even with a single camera.
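
The marker-detection step of such a pipeline can be sketched minimally (thresholding a bright retro-reflector and taking an intensity-weighted centroid; an illustrative simplification, not the paper's full detector):

```python
import numpy as np

def marker_centroid(img, thresh=200):
    """Sub-pixel centroid of the bright retro-reflector in a grayscale image.

    Returns (row, col) as the intensity-weighted mean of all pixels
    above `thresh`, or None if nothing is bright enough.
    """
    ys, xs = np.nonzero(img > thresh)
    if ys.size == 0:
        return None
    w = img[ys, xs].astype(float)
    return (ys @ w) / w.sum(), (xs @ w) / w.sum()

# synthetic frame: a uniform bright 3x3 marker centered at (10, 20)
frame = np.zeros((32, 32), dtype=np.uint8)
frame[9:12, 19:22] = 255
```

The detected image coordinates of several such markers would then feed a standard 2D–3D pose estimation stage.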


Author(s):  
Zhihui Yang ◽  
Xiangyu Tang ◽  
Lijuan Zhang ◽  
Zhiling Yang

Human pose estimation can be used in action recognition, video surveillance, and other fields, and has received much attention. Since the flexibility of human joints and environmental factors greatly influence pose estimation accuracy, related research faces many challenges. In this paper, we incorporate pyramid convolution and an attention mechanism into the residual block, and introduce a hybrid structure model that synthetically applies the local and global information of the image for keypoint detection. In addition, our improved model adopts grouped convolution, and the attention module used is lightweight, which reduces the computational cost of the network. Experiments on the MS COCO human body keypoint detection dataset show that, compared with the Simple Baseline model, our model is similar in parameters and GFLOPs (giga floating-point operations) but achieves better detection accuracy in multi-person scenes.
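
The lightweight channel-attention idea can be sketched in NumPy (an SE-style block; the layer sizes, the reduction ratio, and the plain sigmoid gate are illustrative assumptions, not the paper's exact module):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style channel attention over a (C, H, W) feature map.

    w1 : (C//r, C) squeeze weights, w2 : (C, C//r) excitation weights
    """
    s = feat.mean(axis=(1, 2))               # squeeze: global average pool -> (C,)
    g = sigmoid(w2 @ np.maximum(w1 @ s, 0))  # excitation: FC-ReLU-FC-sigmoid
    return feat * g[:, None, None]           # rescale each channel by its gate

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
out = channel_attention(feat, w1, w2)
```

Because the gate is computed from pooled statistics through two small fully connected layers, the added parameter and compute cost is negligible next to the backbone, which is the "lightweight" property the abstract emphasizes.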


Author(s):  
Ran Li ◽  
Nayun Xu ◽  
Xutong Lu ◽  
Yucheng Xing ◽  
Haohua Zhao ◽  
...  

Author(s):  
Tao Chen ◽  
Dongbing Gu

Abstract: 6D object pose estimation plays a crucial role in robotic manipulation and grasping tasks. Estimating the 6D object pose from RGB or RGB-D images means detecting objects and estimating their orientations and translations relative to given canonical models. RGB-D cameras provide two sensory modalities, RGB and depth images, which can benefit estimation accuracy, but exploiting the two different modality sources remains a challenging issue. In this paper, inspired by recent work on attention networks that focus on important regions and ignore unnecessary information, we propose a novel network, the Channel-Spatial Attention Network (CSA6D), to estimate the 6D object pose from an RGB-D camera. The proposed CSA6D includes a pre-trained 2D network that segments the objects of interest from the RGB image. It then uses two separate networks to extract appearance and geometrical features from the RGB and depth images for each segmented object. The two feature vectors for each pixel are stacked together as a fusion vector, which is refined by an attention module to generate an aggregated feature vector. The attention module includes a channel attention block and a spatial attention block, which effectively leverage the concatenated embeddings for accurate 6D pose prediction on known objects. We evaluate the proposed network on two benchmark datasets, the YCB-Video dataset and the LineMod dataset, and the results show that it outperforms previous state-of-the-art methods under the ADD and ADD-S metrics. The attention map also demonstrates that our network searches for the unique geometric information as the most likely features for pose estimation. From these experiments, we conclude that the proposed network can accurately estimate the object pose by effectively leveraging multi-modality features.
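
The fusion-then-attention step can be illustrated with a minimal per-pixel sketch (a CBAM-style spatial gate over stacked RGB and depth embeddings; the gating form is an assumption, not the CSA6D architecture):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_with_spatial_attention(rgb_feat, depth_feat):
    """Stack per-pixel appearance and geometry features, then gate
    each pixel by a spatial attention map built from channel statistics.

    rgb_feat, depth_feat : (C, H, W) feature maps from the two branches
    """
    fused = np.concatenate([rgb_feat, depth_feat], axis=0)  # (2C, H, W)
    avg = fused.mean(axis=0)           # (H, W) channel-average descriptor
    mx = fused.max(axis=0)             # (H, W) channel-max descriptor
    gate = sigmoid(avg + mx)           # spatial attention map in (0, 1)
    return fused * gate[None, :, :]    # emphasize informative pixels

rng = np.random.default_rng(1)
rgb = rng.standard_normal((16, 8, 8))
dep = rng.standard_normal((16, 8, 8))
fused = fuse_with_spatial_attention(rgb, dep)
```

A channel attention block (gating along the channel axis from pooled per-channel statistics) would typically be applied alongside this spatial gate, mirroring the two-block module the abstract describes.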


2021 ◽  
Author(s):  
Dengqing Tang ◽  
Lincheng Shen ◽  
Xiaojiao Xiang ◽  
Han Zhou ◽  
Tianjiang Hu

We propose a learning-type anchors-driven real-time pose estimation method for autolanding fixed-wing unmanned aerial vehicles (UAVs). The proposed method enables online tracking of both position and attitude by a ground stereo vision system in Global Navigation Satellite System-denied environments. A pipeline of convolutional neural network (CNN)-based UAV anchor detection and anchor-driven UAV pose estimation is employed. To realize robust and accurate anchor detection, we design and implement a Block-CNN architecture to reduce the impact of outliers. On the basis of the anchors, monocular and stereo vision-based filters are established to update the UAV position and attitude. To expand the training dataset without extra outdoor experiments, we develop a parallel system containing outdoor and simulated systems with the same configuration. Simulated and outdoor experiments demonstrate a remarkable improvement in pose estimation accuracy compared with the conventional Perspective-n-Point solution. In addition, the experiments validate the feasibility of the proposed architecture and algorithm in terms of the accuracy and real-time requirements of fixed-wing autolanding UAVs.
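
The role of the vision-based filters that update position from anchor observations can be illustrated with a minimal scalar constant-position Kalman update (a textbook sketch, not the paper's filter, which tracks full position and attitude):

```python
def kalman_update(x, P, z, R_obs, Q=1e-4):
    """One predict-update step of a scalar constant-position Kalman filter.

    x, P  : current state estimate and its variance
    z     : new position observation (e.g. from anchor-driven vision)
    R_obs : observation noise variance, Q : process noise variance
    """
    P = P + Q                  # predict: state unchanged, uncertainty grows
    K = P / (P + R_obs)        # Kalman gain
    x = x + K * (z - x)        # correct the estimate toward the observation
    P = (1.0 - K) * P          # shrink the uncertainty
    return x, P

# noisy position observations of a UAV hovering near 1.0 m
x, P = 0.0, 1.0
for z in [1.0, 1.02, 0.98, 1.01]:
    x, P = kalman_update(x, P, z, R_obs=0.05)
```

Fusing successive anchor-based observations this way smooths the per-frame CNN detections into a stable pose track, which is what makes the filtered estimate usable for autolanding.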

