Accurate Hand Detection from Single-Color Images by Reconstructing Hand Appearances

Sensors ◽  
2019 ◽  
Vol 20 (1) ◽  
pp. 192 ◽  
Author(s):  
Chi Xu ◽  
Wendi Cai ◽  
Yongbo Li ◽  
Jun Zhou ◽  
Longsheng Wei

Hand detection is a crucial pre-processing step for many hand-related computer vision tasks, such as hand pose estimation, hand gesture recognition, and human activity analysis. However, reliably detecting multiple hands in cluttered scenes remains challenging because of the complex appearance diversity of dexterous human hands in color images (e.g., different hand shapes, skin colors, illuminations, orientations, and scales). To tackle this problem, an accurate hand detection method is proposed that reliably detects multiple hands from a single color image using a hybrid detection/reconstruction convolutional neural network (CNN) framework, in which hand regions are detected and hand appearances are reconstructed in parallel by sharing features extracted from a region proposal layer, and the model is trained end-to-end. Furthermore, it is observed that a generative adversarial network (GAN) can further boost detection performance by generating more realistic hand appearances. Experimental results show that the proposed approach outperforms the state of the art on challenging public hand detection benchmarks.

2020 ◽  
Vol 10 (23) ◽  
pp. 8699
Author(s):  
Yeongseop Lee ◽  
Seongjin Lee

Line-arts are used in many ways in the media industry. However, line-art colorization is tedious, labor-intensive, and time-consuming. For these reasons, Generative Adversarial Network (GAN)-based image-to-image colorization methods have received much attention because of their promising results. In this paper, we propose a color point hinting method with two GAN-based generators used to enhance image quality. To improve coloring performance on drawings with various line styles, the generator takes the line loss of the line-art into account. We propose a Line Detection Model (LDM), a method for extracting lines from a color image, which is used to measure the line loss. We also propose applying histogram equalization to the input line-art to generalize the distribution of line styles without increasing the complexity of the inference stage. In addition, we propose seven segment hint-pointing constraints to evaluate the colorization performance of the model with the Fréchet Inception Distance (FID) score. We present visual and qualitative evaluations of the proposed methods. The results show that using histogram equalization together with the LDM-enabled line loss performs best. With XDoG (eXtended Difference-of-Gaussians)-generated line-art, the base model exhibits FID scores for colorized images of 35.83 with color hints and 44.70 without, whereas the proposed model in the same scenario exhibits 32.16 and 39.77, respectively.
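The histogram equalization step mentioned above is a standard preprocessing technique. As an illustrative sketch (not the authors' code), the classic equalization mapping on 8-bit intensities can be written as:

```python
# Illustrative sketch: classic histogram equalization on 8-bit grayscale
# pixel values, the kind of preprocessing used to normalize varied line
# styles in the input line-art.

def equalize_histogram(pixels, levels=256):
    """Map pixel intensities so their cumulative distribution is ~uniform."""
    n = len(pixels)
    # Histogram of intensity values.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function.
    cdf = []
    total = 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    # Standard equalization mapping to the full [0, levels-1] range.
    def remap(p):
        return round((cdf[p] - cdf_min) / max(n - cdf_min, 1) * (levels - 1))
    return [remap(p) for p in pixels]

# A dark, low-contrast patch is spread across the full intensity range:
patch = [10, 10, 12, 12, 14, 14, 16, 16]
print(equalize_histogram(patch))  # [0, 0, 85, 85, 170, 170, 255, 255]
```

The same mapping applied to every line-art image flattens out stylistic differences in stroke darkness before the image reaches the generator.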


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Chen Xie ◽  
Kecheng Yang ◽  
Anni Wang ◽  
Chunxu Chen ◽  
Wei Li

2021 ◽  
Author(s):  
Kazutake Uehira ◽  
Hiroshi Unno

A technique for removing unnecessary patterns from captured images using a generative network is studied. The patterns, composed of lines and spaces, are superimposed onto the blue component of an RGB color image at capture time for the purpose of acquiring a depth map; once the depth map has been acquired, the superimposed patterns become unnecessary. We tried to remove these unnecessary patterns using a generative adversarial network (GAN) and an autoencoder (AE). The experimental results show that both a GAN and an AE can remove the patterns to the point of being invisible. They also show that the performance of the GAN is much higher than that of the AE, with PSNR over 45 dB and SSIM of about 0.99. From these results, we demonstrate the effectiveness of the technique with a GAN.
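PSNR, the metric used above to score pattern removal, has a standard definition; values above roughly 45 dB correspond to error that is essentially invisible. A minimal sketch of the computation (not code from the paper):

```python
# Illustrative sketch: peak signal-to-noise ratio between a clean image
# and a restored one, treated here as flat lists of 8-bit pixel values.
import math

def psnr(original, restored, max_val=255.0):
    """PSNR in dB: 10 * log10(max_val^2 / mean squared error)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

# A residual of +/-1 per pixel is barely visible (PSNR ~48 dB):
clean    = [100, 120, 140, 160]
restored = [101, 119, 141, 159]
print(round(psnr(clean, restored), 1))
```

SSIM is computed differently (from local means, variances, and covariances), but the two metrics together cover both pixel-level and structural fidelity.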


2020 ◽  
Vol 34 (05) ◽  
pp. 8830-8837
Author(s):  
Xin Sheng ◽  
Linli Xu ◽  
Junliang Guo ◽  
Jingchang Liu ◽  
Ruoyu Zhao ◽  
...  

We propose a novel introspective model for variational neural machine translation (IntroVNMT) in this paper, inspired by the recent successful application of the introspective variational autoencoder (IntroVAE) to high-quality image synthesis. Different from the vanilla variational NMT model, IntroVNMT is capable of improving itself introspectively by evaluating the quality of the generated target sentences according to the high-level latent variables of the real and generated target sentences. As a consequence of introspective training, the proposed model is able to discriminate between generated and real sentences of the target language via the latent variables produced by the encoder of the model. In this way, IntroVNMT is able to generate more realistic target sentences in practice. At the same time, IntroVNMT inherits the advantages of variational autoencoders (VAEs), and its training process is more stable than that of generative adversarial network (GAN)-based models. Experimental results on different translation tasks demonstrate that the proposed model achieves significant improvements over the vanilla variational NMT model.


Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 702
Author(s):  
Seungbin Roh ◽  
Johyun Shin ◽  
Keemin Sohn

Almost all vision technologies used to measure traffic volume rely on a two-step procedure involving detection and tracking. Object detection algorithms such as YOLO and Fast R-CNN have been successfully applied to detecting vehicles. Tracking requires an additional algorithm that traces vehicles appearing in a previous video frame to their appearance in a subsequent frame. This two-step approach prevails in the field but requires substantial computational resources for training, testing, and evaluation. The present study devised a simpler algorithm based on an autoencoder that requires no labeled data for training. An autoencoder was trained in an unsupervised manner on the pixel intensities of a virtual line placed on the images. The last hidden node of the encoding portion of the autoencoder generates a scalar signal that can be used to judge whether a vehicle is passing. A cycle-consistent generative adversarial network (CycleGAN) was used to transform the original input photo, with complex vehicle images and backgrounds, into a simple illustration-style image that enhances the performance of the autoencoder in judging the presence of a vehicle. The proposed model is much lighter and faster than a YOLO-based model, with equivalent or better accuracy. In measuring traffic volumes, the proposed approach proved robust in terms of both accuracy and efficiency.
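The core idea, collapsing a virtual line's pixels into one scalar per frame and thresholding it, can be sketched without any neural network. In this illustrative toy (not the study's trained model), the scalar is the mean absolute deviation from an empty-road background profile, standing in for the autoencoder's last hidden node, and vehicles are counted on rising edges of the thresholded signal:

```python
# Illustrative sketch: one scalar per video frame from the pixels on a
# virtual line, thresholded to count passing vehicles.

def line_signal(frame_line, background_line):
    """Scalar activation for one frame: mean |pixel - background|."""
    return sum(abs(p - b) for p, b in zip(frame_line, background_line)) / len(frame_line)

def count_passings(frames, background_line, threshold=20.0):
    """Count rising edges of the thresholded per-frame signal."""
    count, above = 0, False
    for line in frames:
        now_above = line_signal(line, background_line) > threshold
        if now_above and not above:
            count += 1  # signal just crossed the threshold: one vehicle
        above = now_above
    return count

background = [90, 90, 90, 90]
frames = [
    [90, 91, 89, 90],      # empty road
    [30, 35, 40, 30],      # vehicle on the line
    [32, 30, 38, 31],      # same vehicle, next frame
    [90, 90, 91, 89],      # empty again
    [200, 210, 205, 198],  # a second, brighter vehicle
]
print(count_passings(frames, background))  # two rising edges -> 2
```

The study's autoencoder learns this kind of activation from data rather than from a fixed background profile, and the CycleGAN-simplified input makes the per-frame signal easier to separate from clutter.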


2014 ◽  
Vol 511-512 ◽  
pp. 550-553 ◽  
Author(s):  
Jian Yong Liang

Edge detection is a long-standing and active topic in image processing, pattern recognition, and computer vision. Numerous edge detection approaches have been proposed for gray-scale images, but they are difficult to extend to color image edge detection. A novel edge detection method for color images based on mathematical morphology is proposed in this paper. The proposed approach first computes a vector gradient based on morphological gradient operators, and then computes the optimal gradient using structuring elements of different sizes. Finally, a threshold is used to binarize the gradient images and obtain the edge images. Experimental results show that the proposed approach suppresses noise while preserving edge details and is not sensitive to noisy pixels. The final edge images produced by the proposed method have higher PSNR and NC than those of the traditional approaches.
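The building block behind this approach is the morphological gradient: dilation minus erosion under a structuring element. The paper's method extends this to a vector gradient over color channels; the following sketch shows only the basic grayscale case with a 3x3 structuring element, followed by the same kind of thresholding step:

```python
# Illustrative sketch: morphological gradient (dilation - erosion) on a
# grayscale grid with a 3x3 structuring element, then binarized by a
# threshold to yield an edge map.

def morph_gradient(img):
    """Per-pixel max minus min over a 3x3 neighborhood (clipped at borders)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = max(window) - min(window)  # local contrast
    return out

def binarize(grad, threshold):
    return [[1 if v > threshold else 0 for v in row] for row in grad]

# A vertical step edge between dark (0) and bright (100) regions:
img = [[0, 0, 100, 100],
       [0, 0, 100, 100],
       [0, 0, 100, 100]]
edges = binarize(morph_gradient(img), 50)
for row in edges:
    print(row)  # edge pixels flagged on both sides of the step
```

The gradient is large only where the neighborhood straddles the intensity step, which is why morphological gradients localize edges while staying insensitive to smooth regions.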


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Erick Costa de Farias ◽  
Christian di Noia ◽  
Changhee Han ◽  
Evis Sala ◽  
Mauro Castelli ◽  
...  

Robust machine learning models based on radiomic features might allow for accurate diagnosis, prognosis, and medical decision-making. Unfortunately, the lack of standardized radiomic feature extraction has hampered their clinical use. Since radiomic features tend to be affected by low voxel statistics in regions of interest, increasing the sample size would improve their robustness in clinical studies. Therefore, we propose a Generative Adversarial Network (GAN)-based lesion-focused framework for Computed Tomography (CT) image Super-Resolution (SR); for lesion (i.e., cancer) patch-focused training, we incorporate Spatial Pyramid Pooling (SPP) into GAN-Constrained by the Identical, Residual, and Cycle Learning Ensemble (GAN-CIRCLE). At 2× SR, the proposed model achieved better perceptual quality with less blurring than the other considered state-of-the-art SR methods, while producing comparable results at 4× SR. We also evaluated the robustness of our model's radiomic features in terms of quantization on a different lung cancer CT dataset using Principal Component Analysis (PCA). Intriguingly, the most important radiomic features in our PCA-based analysis were also the most robust features extracted from the GAN-super-resolved images. These achievements pave the way for the application of GAN-based image Super-Resolution techniques in radiomics studies for robust biomarker discovery.
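Spatial Pyramid Pooling, the component the authors add for lesion-patch-focused training, pools a feature map over grids of several sizes and concatenates the results, producing a fixed-length vector regardless of input size. A minimal sketch of the idea (not GAN-CIRCLE itself), using max pooling over a two-level pyramid:

```python
# Illustrative sketch: Spatial Pyramid Pooling over a 2D feature map.
# Each pyramid level n tiles the map into an n x n grid and max-pools
# each cell; all pooled values are concatenated into one fixed-length
# vector, so variably sized patches yield same-sized descriptors.

def spp(feature_map, levels=(1, 2)):
    """Max-pool over n x n grids for each n in levels; concatenate results."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for gy in range(n):
            for gx in range(n):
                # Cell boundaries that tile the map as evenly as possible.
                y0, y1 = gy * h // n, (gy + 1) * h // n
                x0, x1 = gx * w // n, (gx + 1) * w // n
                pooled.append(max(feature_map[y][x]
                                  for y in range(y0, y1)
                                  for x in range(x0, x1)))
    return pooled

fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 8, 7, 6]]
print(spp(fmap))  # 1 global max + 4 quadrant maxima = 5 values
```

Output length depends only on the pyramid levels (1 + 4 = 5 values here), which is what lets arbitrarily sized lesion patches feed a fixed-size downstream layer.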

