scholarly journals Parallel Image Completion with Edge and Color Map

2019 ◽  
Vol 9 (18) ◽  
pp. 3856 ◽  
Author(s):  
Dan Zhao ◽  
Baolong Guo ◽  
Yunyi Yan

Over the last few years, image completion has made significant progress due to the generative adversarial networks (GANs) that are able to synthesize photorealistic contents. However, one of the main obstacles faced by many existing methods is that they often create blurry textures or distorted structures that are inconsistent with surrounding regions. The main reason is the ineffectiveness of disentangling style latent space implicitly from images. To address this problem, we develop a novel image completion framework called PIC-EC: parallel image completion networks with edge and color maps, which explicitly provides image edge and color information as the prior knowledge for image completion. The PIC-EC framework consists of the parallel edge and color generators followed by an image completion network. Specifically, the parallel paths generate edge and color maps for the missing region at the same time, and then the image completion network fills the missing region with fine details using the generated edge and color information as the priors. The proposed method was evaluated over CelebA-HQ and Paris StreetView datasets. Experimental results demonstrate that PIC-EC achieves superior performance on challenging cases with complex compositions and outperforms existing methods on evaluations of realism and accuracy, both quantitatively and qualitatively.

Author(s):  
Sudipto Mukherjee ◽  
Himanshu Asnani ◽  
Eugene Lin ◽  
Sreeram Kannan

Generative Adversarial networks (GANs) have obtained remarkable success in many unsupervised learning tasks and unarguably, clustering is an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space. In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon that GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.


2021 ◽  
Author(s):  
Van Bettauer ◽  
Anna CBP Costa ◽  
Raha Parvizi Omran ◽  
Samira Massahi ◽  
Eftyhios Kirbizakis ◽  
...  

We present deep learning-based approaches for exploring the complex array of morphologies exhibited by the opportunistic human pathogen C. albicans. Our system entitled Candescence automatically detects C. albicans cells from Differential Image Contrast microscopy, and labels each detected cell with one of nine vegetative, mating-competent or filamentous morphologies. The software is based upon a fully convolutional one-stage object detector and exploits a novel cumulative curriculum-based learning strategy that stratifies our images by difficulty from simple vegetative forms to more complex filamentous architectures. Candescence achieves very good performance on this difficult learning set which has substantial intermixing between the predicted classes. To capture the essence of each C. albicans morphology, we develop models using generative adversarial networks and identify subcomponents of the latent space which control technical variables, developmental trajectories or morphological switches. We envision Candescence as a community meeting point for quantitative explorations of C. albicans morphology.


2020 ◽  
Vol 128 (10-11) ◽  
pp. 2665-2683 ◽  
Author(s):  
Grigorios G. Chrysos ◽  
Jean Kossaifi ◽  
Stefanos Zafeiriou

Abstract Conditional image generation lies at the heart of computer vision and conditional generative adversarial networks (cGAN) have recently become the method of choice for this task, owing to their superior performance. The focus so far has largely been on performance improvement, with little effort in making cGANs more robust to noise. However, the regression (of the generator) might lead to arbitrarily large errors in the output, which makes cGANs unreliable for real-world applications. In this work, we introduce a novel conditional GAN model, called RoCGAN, which leverages structure in the target space of the model to address the issue. Specifically, we augment the generator with an unsupervised pathway, which promotes the outputs of the generator to span the target manifold, even in the presence of intense noise. We prove that RoCGAN share similar theoretical properties as GAN and establish with both synthetic and real data the merits of our model. We perform a thorough experimental validation on large scale datasets for natural scenes and faces and observe that our model outperforms existing cGAN architectures by a large margin. We also empirically demonstrate the performance of our approach in the face of two types of noise (adversarial and Bernoulli).


Author(s):  
Jianfu Zhang ◽  
Yuanyuan Huang ◽  
Yaoyi Li ◽  
Weijie Zhao ◽  
Liqing Zhang

Recent studies show significant progress in image-to-image translation task, especially facilitated by Generative Adversarial Networks. They can synthesize highly realistic images and alter the attribute labels for the images. However, these works employ attribute vectors to specify the target domain which diminishes image-level attribute diversity. In this paper, we propose a novel model formulating disentangled representations by projecting images to latent units, grouped feature channels of Convolutional Neural Network, to disassemble the information between different attributes. Thanks to disentangled representation, we can transfer attributes according to the attribute labels and moreover retain the diversity beyond the labels, namely, the styles inside each image. This is achieved by specifying some attributes and swapping the corresponding latent units to “swap” the attributes appearance, or applying channel-wise interpolation to blend different attributes. To verify the motivation of our proposed model, we train and evaluate our model on face dataset CelebA. Furthermore, the evaluation of another facial expression dataset RaFD demonstrates the generalizability of our proposed model.


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Hui Liu ◽  
Tinglong Tang ◽  
Jake Luo ◽  
Meng Zhao ◽  
Baole Zheng ◽  
...  

Purpose This study aims to address the challenge of training a detection model for the robot to detect the abnormal samples in the industrial environment, while abnormal patterns are very rare under this condition. Design/methodology/approach The authors propose a new model with double encoder–decoder (DED) generative adversarial networks to detect anomalies when the model is trained without any abnormal patterns. The DED approach is used to map high-dimensional input images to a low-dimensional space, through which the latent variables are obtained. Minimizing the change in the latent variables during the training process helps the model learn the data distribution. Anomaly detection is achieved by calculating the distance between two low-dimensional vectors obtained from two encoders. Findings The proposed method has better accuracy and F1 score when compared with traditional anomaly detection models. Originality/value A new architecture with a DED pipeline is designed to capture the distribution of images in the training process so that anomalous samples are accurately identified. A new weight function is introduced to control the proportion of losses in the encoding reconstruction and adversarial phases to achieve better results. An anomaly detection model is proposed to achieve superior performance against prior state-of-the-art approaches.


Author(s):  
Zhizhong Huang ◽  
Shouzhen Chen ◽  
Junping Zhang ◽  
Hongming Shan

Age progression and regression aim to synthesize photorealistic appearance of a given face image with aging and rejuvenation effects, respectively. Existing generative adversarial networks (GANs) based methods suffer from the following three major issues: 1) unstable training introducing strong ghost artifacts in the generated faces, 2) unpaired training leading to unexpected changes in facial attributes such as genders and races, and 3) non-bijective age mappings increasing the uncertainty in the face transformation. To overcome these issues, this paper proposes a novel framework, termed AgeFlow, to integrate the advantages of both flow-based models and GANs. The proposed AgeFlow contains three parts: an encoder that maps a given face to a latent space through an invertible neural network, a novel invertible conditional translation module (ICTM) that translates the source latent vector to target one, and a decoder that reconstructs the generated face from the target latent vector using the same encoder network; all parts are invertible achieving bijective age mappings. The novelties of ICTM are two-fold. First, we propose an attribute-aware knowledge distillation to learn the manipulation direction of age progression while keeping other unrelated attributes unchanged, alleviating unexpected changes in facial attributes. Second, we propose to use GANs in the latent space to ensure the learned latent vector indistinguishable from the real ones, which is much easier than traditional use of GANs in the image domain. Experimental results demonstrate superior performance over existing GANs-based methods on two benchmarked datasets. The source code is available at https://github.com/Hzzone/AgeFlow.


Sign in / Sign up

Export Citation Format

Share Document