Manipulator grabbing position detection with information fusion of color image and depth image using deep learning

Author(s):  
Du Jiang ◽  
Gongfa Li ◽  
Ying Sun ◽  
Jiabing Hu ◽  
Juntong Yun ◽  
...  
2018 ◽  
Vol 2018 ◽  
pp. 1-17


Author(s):  
Rui Zhang ◽  
Zhaokui Wang ◽  
Yulin Zhang

Real-time astronaut visual tracking is the most important prerequisite for a flying assistant robot to follow and assist the served astronaut in a space station. In this paper, an astronaut visual tracking algorithm based on deep learning and a probabilistic model is proposed. An improved SSD (Single Shot MultiBox Detector) network, fine-tuned with its feature extraction layers initialized from a ready-made model, is proposed for robust astronaut detection in color images. By associating the detection results with the synchronized depth image measured by an RGB-D camera, a probabilistic model is presented to ensure accurate and continuous tracking of the specific served astronaut. The algorithm runs at 10 fps on a Jetson TX2 and was extensively validated on several datasets covering most instances of astronaut activities. The experimental results indicate that the proposed algorithm achieves not only robust tracking of the specified person across diverse postures and dress but also effective occlusion detection that avoids mistaken tracking.
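
The paper associates each SSD detection with the synchronized depth image before feeding it to the probabilistic tracker. As a minimal sketch of one plausible association step (not the authors' actual model; the helper and its conventions are hypothetical), the depth of a detected astronaut could be summarized as the median valid depth inside the bounding box:

```python
import numpy as np

def box_depth(depth_image: np.ndarray, box: tuple) -> float:
    """Median depth inside a detection box, ignoring missing readings.

    depth_image: HxW depth map aligned to the color frame (0 = no reading).
    box: (x1, y1, x2, y2) pixel coordinates from the SSD detector.
    Hypothetical helper; the paper's actual association model is probabilistic.
    """
    x1, y1, x2, y2 = box
    patch = depth_image[y1:y2, x1:x2]
    valid = patch[patch > 0]          # discard pixels with no depth return
    return float(np.median(valid)) if valid.size else float("nan")
```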


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1962
Author(s):  
Enrico Buratto ◽  
Adriano Simonetto ◽  
Gianluca Agresti ◽  
Henrik Schäfer ◽  
Pietro Zanuttigh

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene; each sensor pixel receives information coming from different light paths, which generally leads to an overestimation of the depth. We introduce a novel deep learning approach that estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data, and a fixed backscattering model that translates the encoded representation into the high-dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performance.
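
To illustrate the two-block structure described above (a learned encoder producing a compact code, and a fixed, non-trainable backscattering model expanding it into the light response), here is a minimal PyTorch sketch; the layer sizes and the fixed basis are assumptions, not the authors' architecture:

```python
import torch
import torch.nn as nn

class MPICorrector(nn.Module):
    """Sketch: learned encoder + fixed (frozen) backscattering model."""
    def __init__(self, n_inputs=6, code_dim=8, response_len=256):
        super().__init__()
        # Predictive block: maps noisy ToF measurements to a compact code.
        self.encoder = nn.Sequential(
            nn.Linear(n_inputs, 64), nn.ReLU(),
            nn.Linear(64, code_dim),
        )
        # Fixed block: a frozen basis (placeholder here) translating the
        # code into the high-dimensional time-dependent light response.
        self.register_buffer("basis", torch.randn(code_dim, response_len))

    def forward(self, x):
        code = self.encoder(x)        # compact encoded representation
        return code @ self.basis      # backscattering vector estimate
```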


2021 ◽  
pp. 1-1
Author(s):  
Lianjie Jiang ◽  
Xinli Wang ◽  
Wei Li ◽  
Lei Wang ◽  
Xiaohong Yin ◽  
...  

Mathematics ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. 2258
Author(s):  
Madhab Raj Joshi ◽  
Lewis Nkenyereye ◽  
Gyanendra Prasad Joshi ◽  
S. M. Riazul Islam ◽  
Mohammad Abdullah-Al-Wadud ◽  
...  

Enhancement of cultural heritage such as historical images is crucial to safeguarding the diversity of cultures. Automated colorization of black-and-white images has been the subject of extensive research in computer vision and machine learning. Our research addresses the problem of generating plausible colored photographs from ancient, historical black-and-white images of Nepal using deep learning techniques without direct human intervention. Motivated by the recent success of deep learning in image processing, a feed-forward, deep Convolutional Neural Network (CNN) in combination with Inception-ResNetV2 is trained on sets of sample images using back-propagation to learn the relationship between grayscale and RGB values. The trained neural network is then used to predict the two chroma channels, a* and b*, given the grayscale L channel of test images. The CNN colorizes images vividly with the help of a fusion layer that accounts for both local and global features. Two objective measures, Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR), are employed for objective quality assessment between the estimated color image and its ground truth. The model is trained on a dataset we created of 1.2K historical images, comprising old and ancient photographs of Nepal, each at 256 × 256 resolution. The loss (MSE), PSNR, and accuracy of the model are found to be 6.08%, 34.65 dB, and 75.23%, respectively. In addition to the training results, public acceptance (subjective validation) of the generated images is assessed by means of a user study, in which the colorization results achieve a naturalness score of 41.71%.
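
The core prediction task, a network mapping the L channel of a Lab image to its a*/b* chroma channels under an MSE objective, can be sketched as follows. This is a bare-bones stand-in: the layer sizes are illustrative, and the paper's fusion of local CNN features with global Inception-ResNetV2 embeddings is omitted.

```python
import torch
import torch.nn as nn

class ColorizationNet(nn.Module):
    """Sketch: predict a*/b* chroma from the L channel of a Lab image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 2, 3, padding=1), nn.Tanh(),  # a*, b* in [-1, 1]
        )

    def forward(self, L):              # L: (B, 1, 256, 256), scaled to [0, 1]
        return self.net(L)             # ab: (B, 2, 256, 256)

# One training step under the MSE objective named in the abstract:
model = ColorizationNet()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
L = torch.rand(4, 1, 256, 256)              # dummy grayscale batch
ab_true = torch.rand(4, 2, 256, 256) * 2 - 1
optim.zero_grad()
loss = nn.functional.mse_loss(model(L), ab_true)
loss.backward()
optim.step()
```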


Author(s):  
Mohammadreza Hajiarbabi ◽  
Arvin Agah

Human skin detection is an important and challenging problem in computer vision. Skin detection can be used as the first phase in face detection when using color images. Differences in illumination and in the range of skin colors make skin detection a challenging task. Gaussian models, rule-based methods, and artificial neural networks have all been used for human skin color detection. Deep learning methods are newer learning techniques that have shown improved classification power compared to neural networks. In this paper, the authors use deep learning methods to enhance the capabilities of skin detection algorithms. Several experiments have been performed using autoencoders and different color spaces. The proposed technique is evaluated and compared with other available methods in this domain using two color image databases. The results show that skin detection utilizing deep learning yields better results than other methods such as rule-based methods, the Gaussian model, and feed-forward neural networks.
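
As a rough sketch of the kind of autoencoder-based skin classifier the abstract describes, the model below learns a compact code for a color pixel and classifies it as skin or non-skin from that code. The input color space, layer sizes, and classifier head are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class SkinAutoencoder(nn.Module):
    """Sketch: autoencoder features for per-pixel skin classification.

    Input is a 3-component color pixel (e.g., normalized RGB or YCbCr);
    the exact color spaces and architecture from the paper are not
    reproduced here.
    """
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, 16), nn.ReLU(),
                                     nn.Linear(16, 4), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(4, 16), nn.ReLU(),
                                     nn.Linear(16, 3))
        self.classifier = nn.Linear(4, 1)   # skin / non-skin logit

    def forward(self, pixel):
        code = self.encoder(pixel)          # compact learned representation
        return self.decoder(code), self.classifier(code)
```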


Nutrients ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 2005 ◽  
Author(s):  
Frank Lo ◽  
Yingnan Sun ◽  
Jianing Qiu ◽  
Benny Lo

An objective dietary assessment system can help users to understand their dietary behavior and enable targeted interventions to address underlying health problems. To accurately quantify dietary intake, measurement of the portion size or food volume is required. For volume estimation, previous research mostly focused on model-based or stereo-based approaches, which rely on manual intervention or require users to capture multiple frames from different viewing angles, a process that can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate the volume from a single depth image. A dedicated neural network is designed to take a depth image from one viewing angle and predict the depth image captured from the corresponding opposite viewing angle. The whole 3D point cloud map is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database of depth images of food items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated from the synthesized 3D point cloud with the ground-truth volume of the object items.
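
The fusion step registers the synthesized points against the initial ones via ICP. A minimal point-to-point ICP iteration, in its generic textbook form rather than the paper's completion-plus-ICP pipeline, might look like this:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_step(source: np.ndarray, target: np.ndarray):
    """One point-to-point ICP iteration: returns (R, t) aligning source to target.

    source: (N, 3) point cloud to be moved; target: (M, 3) reference cloud.
    """
    # 1. Nearest-neighbour correspondences.
    _, idx = cKDTree(target).query(source)
    matched = target[idx]
    # 2. Closed-form rigid transform (Kabsch / SVD).
    src_c, tgt_c = source.mean(0), matched.mean(0)
    H = (source - src_c).T @ (matched - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # correct an improper rotation
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = tgt_c - R @ src_c
    return R, t
```

In practice this step is iterated, re-matching correspondences and accumulating the transform, until the alignment error stops decreasing.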


2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi133-vi134
Author(s):  
Julia Cluceru ◽  
Joanna Phillips ◽  
Annette Molinaro ◽  
Yannet Interian ◽  
Tracy Luks ◽  
...  

In contrast to the WHO 2016 guidelines that use genetic alterations to further stratify patients within a designated grade, new recommendations suggest that IDH mutation status, followed by 1p19q-codeletion, should be used before grade when differentiating gliomas. Although most gliomas will be resected and their tissue evaluated with genetic profiling, non-invasive characterization of genetic subgroup can benefit patients where surgery is not otherwise advised or a fast turn-around is required for clinical trial eligibility. Prior studies have demonstrated the utility of using anatomical images and deep learning to distinguish either IDH-mutant from IDH-wildtype tumors or 1p19q-codeleted from non-codeleted lesions separately, but not combined or using the most recent recommendations for stratification. The goal of this study was to evaluate the effects of training strategy and incorporation of Apparent Diffusion Coefficient (ADC) maps from diffusion-weighted imaging on predicting new genetic subgroups with deep learning. Using 414 patients with newly diagnosed glioma (split 285/50/49 training/validation/test) and optimized training hyperparameters, we found that a 3-class approach with T1-post-contrast, T2-FLAIR, and ADC maps as inputs achieved the best performance for molecular subgroup classification, with overall accuracies of 86.0% [CI: 0.839, 1.0], 80.0% [CI: 0.720, 1.0], and 85.7% [CI: 0.771, 1.0] on the training, validation, and test sets, respectively, and final test class accuracies of 95.2% (IDH-wildtype), 88.9% (IDH-mutated, 1p19q-intact), and 60% (IDH-mutated, 1p19q-codeleted). Creating an RGB color image from the 3 MRI inputs and applying transfer learning with a residual network architecture pretrained on ImageNet resulted in an average 8% increase in overall accuracy. Although classifying both IDH and 1p19q mutations together was overall advantageous compared with a tiered structure that first classified IDH mutational status, the 2-tiered approach generalized better to an independent multi-site dataset when only anatomical images were used. Including biologically relevant ADC images improved model generalization to our test set regardless of modeling approach, highlighting the utility of incorporating diffusion-weighted imaging in future multi-site analyses of molecular subgroup.
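
The transfer-learning setup described above (stacking three co-registered MRI inputs as an RGB-like image and fine-tuning an ImageNet-pretrained residual network with a 3-class head) can be sketched as follows. The choice of ResNet-50, the input resolution, and the preprocessing are assumptions, not the study's reported configuration:

```python
import torch
import torch.nn as nn
from torchvision import models

# ImageNet-pretrained residual network (torchvision >= 0.13 weights API),
# re-headed for the three molecular subgroups.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 3)  # wt / mut-intact / mut-codeleted

# Dummy normalized slices standing in for co-registered MRI inputs.
t1c   = torch.rand(2, 1, 224, 224)   # T1 post-contrast
flair = torch.rand(2, 1, 224, 224)   # T2-FLAIR
adc   = torch.rand(2, 1, 224, 224)   # ADC map from diffusion-weighted imaging

x = torch.cat([t1c, flair, adc], dim=1)  # (B, 3, H, W) "RGB" stack
logits = model(x)                        # molecular-subgroup scores
```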

