FloW vision: Depth image enhancement by combining stereo RGB-depth sensor

Author(s):  
Suryo Aji Waskitho ◽  
Ardiansyah Alfarouq ◽  
Sritrusta Sukaridhoto ◽  
Dadet Pramadihanto
Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2815
Author(s):  
Shih-Hung Yang ◽  
Yao-Mao Cheng ◽  
Jyun-We Huang ◽  
Yon-Ping Chen

Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which have limited receptive fields (RFs) and often cannot detect subtle discriminative details, are applied to learn features. In this study, we propose a receptive field-aware network with finger attention (RFaNet) that highlights the finger regions and builds inter-finger relations. To highlight the discriminative details of these fingers, RFaNet reweights the low-level features of the hand depth image with those of the non-forearm image and improves finger localization, even when the wrist is occluded. RFaNet captures neighboring and inter-region dependencies between fingers in high-level features. An atrous convolution procedure enlarges the RFs at multiple scales and a non-local operation computes the interactions between multi-scale feature maps, thereby facilitating the building of inter-finger relations. Thus, the representation of a sign is invariant to viewpoint changes, which are primarily responsible for intra-class variability. On an American Sign Language fingerspelling dataset, RFaNet achieved 1.77% higher classification accuracy than state-of-the-art methods. RFaNet achieved effective transfer learning when the number of labeled depth images was insufficient. The fingerspelling representation of a depth image can be effectively transferred from large- to small-scale datasets via highlighting the finger regions and building inter-finger relations, thereby reducing the requirement for expensive fingerspelling annotations.
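
The abstract says atrous convolution enlarges the receptive field at multiple scales. A minimal sketch of that idea in one dimension is given below. It is not RFaNet's implementation: the function names `effective_kernel_size` and `atrous_conv1d` are hypothetical, and the kernel and dilation rates are illustrative assumptions. A kernel of size k with dilation rate d spans k + (k - 1)(d - 1) input samples, so the receptive field grows without adding weights.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input samples spanned by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def atrous_conv1d(signal, kernel, dilation=1):
    """1-D atrous (dilated) convolution in cross-correlation form, no padding.

    Hypothetical helper for illustration only; kernel taps are applied
    `dilation` samples apart.
    """
    span = effective_kernel_size(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Sample the input every `dilation` steps under each kernel tap.
        outputs.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return outputs


if __name__ == "__main__":
    # Same 3-tap kernel, three assumed dilation rates (multi-scale receptive fields).
    for rate in (1, 2, 4):
        print(rate, effective_kernel_size(3, rate))
```

With a 3-tap kernel, the assumed dilation rates 1, 2, and 4 cover 3, 5, and 9 input samples respectively, which is the sense in which the receptive field is enlarged at multiple scales.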


Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 393 ◽  
Author(s):  
Jonha Lee ◽  
Dong-Wook Kim ◽  
Chee Won ◽  
Seung-Won Jung

Segmentation of human bodies in images is useful for a variety of applications, including background substitution, human activity recognition, security, and video surveillance. However, human body segmentation remains a challenging problem because of the complicated shape and motion of the non-rigid human body. Meanwhile, depth sensors with advanced pattern recognition algorithms provide human body skeletons in real time with reasonable accuracy. In this study, we propose an algorithm that projects the human body skeleton from a depth image onto a color image, where the human body region is then segmented by using the projected skeleton as a segmentation cue. Experimental results using the Kinect sensor demonstrate that the proposed method provides high-quality segmentation results and outperforms conventional methods.
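
The projection step described in the abstract can be sketched with a standard pinhole camera model. This is a hedged illustration rather than the authors' algorithm: the function name `project_joint`, the rotation and translation between the depth and color cameras, and the intrinsics (`fx`, `fy`, `cx`, `cy`) are all assumptions, and a real sensor would supply its own calibration.

```python
def project_joint(joint_xyz, rotation, translation, fx, fy, cx, cy):
    """Project a 3-D skeleton joint (depth-camera frame, meters) to color-image pixels.

    rotation (3x3 nested lists) and translation (length 3) are the assumed
    depth-to-color extrinsics; fx, fy, cx, cy are assumed color-camera intrinsics.
    """
    # Rigid transform into the color camera's coordinate frame.
    x, y, z = (
        sum(rotation[r][c] * joint_xyz[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        raise ValueError("joint lies behind the color camera")
    # Pinhole projection onto the image plane.
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Illustrative values only: a joint 2 m in front of the camera.
    print(project_joint((0.2, -0.1, 2.0), identity, (0.0, 0.0, 0.0),
                        fx=500.0, fy=500.0, cx=320.0, cy=240.0))
```

Projecting every joint this way yields a 2-D skeleton in the color image, which is the kind of cue the abstract describes feeding into segmentation.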


2020 ◽  
Vol 6 (3) ◽  
pp. 11
Author(s):  
Naoyuki Awano

Depth sensors are important in several fields to recognize real space. However, there are cases where most depth values in a depth image captured by a sensor are constrained because the depths of distal objects are not always captured. This often occurs when a low-cost depth sensor or structured-light depth sensor is used. It also occurs frequently in applications where depth sensors are used to replicate human vision, e.g., when the sensors are mounted in head-mounted displays (HMDs). One ideal inpainting (repair or restoration) approach for depth images with large missing areas, such as partial foreground depths, is to inpaint only the foreground; however, conventional inpainting studies have attempted to inpaint entire images. Thus, under the assumption of an HMD-mounted depth sensor, we propose a method to partially inpaint and reconstruct an RGB-D depth image so that foreground shapes are preserved. The proposed method comprises a smoothing process for noise reduction, filling of defects in the foreground area, and refinement of the filled depths. Experimental results demonstrate that the proposed method preserves object shapes in the foreground area and that the inpainted depths are accurate with respect to the real depth, as measured by the peak signal-to-noise ratio (PSNR).
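
The abstract evaluates inpainted depth against real depth with the peak signal-to-noise ratio. A minimal sketch of that metric follows, using the standard definition PSNR = 10 log10(MAX^2 / MSE). The function name `depth_psnr`, the choice of `max_depth` as the peak value, and the evaluation over all pixels are assumptions; the abstract does not state which peak value or pixel mask the paper uses.

```python
import math


def depth_psnr(reference, estimate, max_depth):
    """PSNR in dB between two equally sized depth maps given as nested lists.

    Hypothetical helper: max_depth is the assumed peak depth value, and
    every pixel is included in the mean squared error.
    """
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref, est in zip(ref_row, est_row):
            squared_error += (ref - est) ** 2
            count += 1
    if count == 0:
        raise ValueError("depth maps are empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical maps
    return 10.0 * math.log10(max_depth ** 2 / mse)


if __name__ == "__main__":
    real = [[0.0, 0.0], [0.0, 0.0]]
    inpainted = [[1.0, 1.0], [1.0, 1.0]]
    print(depth_psnr(real, inpainted, max_depth=10.0))
```

A higher PSNR means the inpainted depths are closer to the real depths; restricting the loop to the inpainted region only would be a natural variation, but that choice is not specified in the abstract.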


2015 ◽  
Vol 9 (3) ◽  
pp. 435-448 ◽  
Author(s):  
Pravin Kumar Rana ◽  
Jalil Taghia ◽  
Zhanyu Ma ◽  
Markus Flierl
