CDL-GAN: Contrastive Distance Learning Generative Adversarial Network for Image Generation

2021 ◽  
Vol 11 (4) ◽  
pp. 1380
Author(s):  
Yingbo Zhou ◽  
Pengcheng Zhao ◽  
Weiqin Tong ◽  
Yongxin Zhu

While Generative Adversarial Networks (GANs) have shown promising performance in image generation, they suffer from numerous issues such as mode collapse and training instability. To stabilize GAN training and improve both the quality and the diversity of synthesized images, we propose a simple yet effective approach, Contrastive Distance Learning GAN (CDL-GAN). Specifically, we add Consistent Contrastive Distance (CoCD) and Characteristic Contrastive Distance (ChCD) into a principled framework to improve GAN performance. The CoCD explicitly maximizes the ratio of the distance between generated images to the increment between noise vectors to strengthen image feature learning for the generator. The ChCD measures the sampling distance of the encoded images in Euler space to boost feature representations for the discriminator. We model the framework by adding a Siamese Network as a module to GANs without any modification of the backbone. Both qualitative and quantitative experiments conducted on three public datasets demonstrate the effectiveness of our method.
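The ratio that CoCD is described as maximizing can be illustrated with a minimal sketch. This is not the paper's implementation: the L2 distance, the `toy_generator` stand-in, and the `eps` stabilizer are all assumptions made for illustration. The point is only that a generator which maps different noise vectors to the same image (mode collapse) drives the ratio to zero.

```python
import math
import random


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_generator(z, scale=2.0):
    """Hypothetical stand-in for a generator: noise vector -> flat 'image'."""
    return [scale * v for v in z] + [scale * sum(z)]


def distance_ratio(generator, z1, z2, eps=1e-8):
    """Distance between generated outputs divided by distance between noise inputs.

    eps is an assumed stabilizer that avoids division by zero.
    """
    image_gap = l2_distance(generator(z1), generator(z2))
    noise_gap = l2_distance(z1, z2)
    return image_gap / (noise_gap + eps)


rng = random.Random(0)
z1 = [rng.gauss(0, 1) for _ in range(4)]
z2 = [rng.gauss(0, 1) for _ in range(4)]

# A generator whose output responds to the noise yields a positive ratio.
ratio = distance_ratio(toy_generator, z1, z2)

# A collapsed generator ignores its input, so the ratio is zero.
collapsed_ratio = distance_ratio(lambda z: [0.0, 0.0, 0.0], z1, z2)

print(ratio, collapsed_ratio)
```

In a training loop, a term such as the negative of this ratio could be added to the generator loss so that minimizing the loss increases the ratio; how CDL-GAN actually weights and combines the term is not stated in the abstract.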

Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1810
Author(s):  
Dat Tien Nguyen ◽  
Tuyen Danh Pham ◽  
Ganbayar Batchuluun ◽  
Kyoung Jun Noh ◽  
Kang Ryoung Park

Although face-based biometric recognition systems have been widely used in many applications, this type of recognition method is still vulnerable to presentation attacks, which use fake samples to deceive the recognition system. To overcome this problem, presentation attack detection (PAD) methods for face recognition systems (face-PAD), which aim to classify real and presentation attack face images before performing a recognition task, have been developed. However, the performance of PAD systems is limited and biased due to the lack of presentation attack images for training PAD systems. In this paper, we propose a method for artificially generating presentation attack face images by learning the characteristics of real and presentation attack images using a few captured images. As a result, our proposed method helps save time in collecting presentation attack samples for training PAD systems and possibly enhances the performance of PAD systems. Our study is the first attempt to generate PA face images for a PAD system based on the CycleGAN network, a deep-learning-based framework for image generation. In addition, we propose a new measurement method to evaluate the quality of generated PA images based on a face-PAD system. Through experiments with two public datasets (CASIA and Replay-mobile), we show that the generated face images can capture the characteristics of presentation attack images, making them usable as captured presentation attack samples for PAD system training.


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Linyan Li ◽  
Yu Sun ◽  
Fuyuan Hu ◽  
Tao Zhou ◽  
Xuefeng Xi ◽  
...  

In this paper, we propose an Attentional Concatenation Generative Adversarial Network (ACGAN) aiming at generating 1024 × 1024 high-resolution images. First, we propose a multilevel cascade structure for text-to-image synthesis. During training, we gradually add new layers and, at the same time, use the results and word vectors from the previous layer as inputs to the next layer to generate high-resolution images with photo-realistic details. Second, the deep attentional multimodal similarity model is introduced into the network, and we match word vectors with images in a common semantic space to compute a fine-grained matching loss for training the generator. In this way, we can pay attention to the fine-grained, word-level information in the semantics. Finally, a measure of diversity is added to the discriminator, which enables the generator to obtain more diverse gradient directions and improves the diversity of generated samples. The experimental results show that the inception scores of the proposed model on the CUB and Oxford-102 datasets reach 4.48 and 4.16, improvements of 2.75% and 6.42% over Attentional Generative Adversarial Networks (AttenGAN). The ACGAN model performs better at generating images from text, and the resulting images are closer to real images.
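As a quick sanity check on the reported figures, the baseline inception scores implied by the stated improvements can be backed out with simple arithmetic. This sketch assumes the percentages are relative gains over the baseline, which the abstract does not state explicitly; the resulting values are implied, not reported, and the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(score, relative_gain_percent):
    """Baseline score implied by a reported score and a relative gain in percent."""
    return score / (1.0 + relative_gain_percent / 100.0)


# Reported scores and gains are taken from the abstract above.
cub_baseline = implied_baseline(4.48, 2.75)
oxford_baseline = implied_baseline(4.16, 6.42)

print(f"CUB: {cub_baseline:.2f}, Oxford-102: {oxford_baseline:.2f}")
```

Under that assumption, the implied baselines come out at about 4.36 for CUB and about 3.91 for Oxford-102.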


2020 ◽  
Vol 2020 ◽  
pp. 1-13 ◽  
Author(s):  
C. Yuan ◽  
C. Q. Sun ◽  
X. Y. Tang ◽  
R. F. Liu

The purpose of image fusion is to combine the source images of the same scene into a single composite image with more useful information and better visual effects. Fusion GAN has made a breakthrough in this field by proposing to use the generative adversarial network to fuse images. In some cases, while retaining infrared radiation information and gradient information at the same time, the existing fusion methods ignore the image contrast and other elements. To this end, we propose a new end-to-end network structure based on generative adversarial networks (GANs), termed FLGC-Fusion GAN. In the generator, using the learnable grouping convolution improves the efficiency of the model and saves computing resources, giving a better trade-off between the accuracy and speed of the model. Besides, we take the residual dense block as the basic network building unit and use the perception characteristics of the inactive as content loss characteristics of input, achieving the effect of deep network supervision. Experimental results on two public datasets show that the proposed method performs well in subjective visual performance and objective criteria and has obvious advantages over other current typical methods.


2019 ◽  
Vol 490 (4) ◽  
pp. 5424-5439 ◽  
Author(s):  
Ping Guo ◽  
Fuqing Duan ◽  
Pei Wang ◽  
Yao Yao ◽  
Qian Yin ◽  
...  

Discovering pulsars is a significant and meaningful research topic in the field of radio astronomy. With the advent of astronomical instruments, the volume and rate of data acquisition have grown exponentially. This development necessitates a focus on artificial intelligence (AI) technologies that can mine large astronomical data sets. Automatic pulsar candidate identification (APCI) can be considered as a task determining potential candidates for further investigation and eliminating the noise of radio-frequency interference and other non-pulsar signals. As reported in the existing literature, AI techniques, especially convolutional neural network (CNN)-based techniques, have been adopted for APCI. However, it is challenging to enhance the performance of CNN-based pulsar identification because only an extremely limited number of real pulsar samples exist, which results in a crucial class imbalance problem. To address these problems, we propose a framework that combines a deep convolution generative adversarial network (DCGAN) with a support vector machine (SVM). The DCGAN is used as a sample generation and feature learning model, and the SVM is adopted as the classifier for predicting the label of a candidate at the inference stage. The proposed framework is a novel technique, which not only can solve the class imbalance problem but also can learn the discriminative feature representations of pulsar candidates instead of computing hand-crafted features in the pre-processing steps. The proposed method can enhance the accuracy of the APCI, and the computer experiments performed on two pulsar data sets verified the effectiveness and efficiency of the proposed method.


2019 ◽  
Vol 11 (9) ◽  
pp. 1017 ◽  
Author(s):  
Yang Zhang ◽  
Zhangyue Xiong ◽  
Yu Zang ◽  
Cheng Wang ◽  
Jonathan Li ◽  
...  

Road network extraction from remote sensing images has played an important role in various areas. However, due to complex imaging conditions and terrain factors, such as occlusion and shadows, it is very challenging to extract road networks with complete topology structures. In this paper, we propose a learning-based road network extraction framework via a Multi-supervised Generative Adversarial Network (MsGAN), which is jointly trained by the spectral and topology features of the road network. Such a design makes the network capable of learning how to “guess” the aberrant road cases, which are caused by occlusion and shadow, based on the relationship between the road region and centerline; thus, it is able to provide a road network with integrated topology. Additionally, we present a sample quality measurement to efficiently generate a large number of training samples with little human interaction. Through experiments on images from various satellites and comprehensive comparisons with state-of-the-art approaches on the public datasets, it is demonstrated that the proposed method is able to provide high-quality results, especially for the completeness of the road network.


Algorithms ◽  
2018 ◽  
Vol 11 (10) ◽  
pp. 164 ◽  
Author(s):  
Aggeliki Vlachostergiou ◽  
George Caridakis ◽  
Phivos Mylonas ◽  
Andreas Stafylopatis

The ability to learn robust, resizable feature representations from unlabeled data has potential applications in a wide variety of machine learning tasks. One way to create such representations is to train deep generative models that can learn to capture the complex distribution of real-world data. Generative adversarial network (GAN) approaches have shown impressive results in producing generative models of images, but relatively little work has been done on evaluating the performance of these methods for the learning representation of natural language, both in supervised and unsupervised settings at the document, sentence, and aspect level. Extensive research validation experiments were performed by leveraging the 20 Newsgroups corpus, the Movie Review (MR) Dataset, and the Finegrained Sentiment Dataset (FSD). Our experimental analysis suggests that GANs can successfully learn representations of natural language texts at all three aforementioned levels.


2021 ◽  
Author(s):  
Bingqi Liu ◽  
Jiwei Lv ◽  
Xinyue Fan ◽  
Jie Luo ◽  
Tianyi Zou

With the rapid development of deep learning, image generation technology has become one of the current hot research areas. A deep convolutional generative adversarial network (DCGAN) can better adapt to complex image distributions than other methods. In this paper, based on a traditional generative adversarial network (GAN) image generation model, first, the fully connected layer of the DCGAN is further improved. To address the vanishing gradient problem in GANs, all layers of the discriminator use LeakyReLU activations, the output layer of the generator uses the Tanh activation function, and the other layers use ReLU. Second, the improved DCGAN model is verified on the MNIST dataset, and simple initial fraction (ISs) and complex initial fraction (ISc) indexes are established from the two aspects of image quality and image generation diversity, respectively. Finally, through a comparison of the two groups of experiments, it is found that the quality of images generated by the DCGAN model constructed in this paper is 2.02 higher than that of the GANs model, and the diversity of the images generated by the DCGAN is 1.55 higher than that of GANs. The results show that the improved DCGAN model can solve the problem of low-quality images being generated by the GANs and achieve good results.
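The activation choices named in this abstract can be illustrated with a small sketch in plain Python. It is not the authors' model: the negative slope of 0.2 is an assumed, commonly used value that the abstract does not give. The sketch shows why LeakyReLU helps with the gradient problem the abstract describes: for negative inputs its derivative stays nonzero, while the ReLU derivative is exactly zero.

```python
import math


def relu(x):
    """ReLU: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: zero for non-positive inputs."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, negative_slope=0.2):
    """LeakyReLU: scales negative inputs by an assumed slope of 0.2."""
    return x if x > 0 else negative_slope * x


def leaky_relu_grad(x, negative_slope=0.2):
    """Derivative of LeakyReLU: never exactly zero."""
    return 1.0 if x > 0 else negative_slope


def tanh_output(x):
    """Tanh squashes values into (-1, 1), matching images scaled to that range."""
    return math.tanh(x)


for x in (-2.0, -0.5, 0.5, 2.0):
    print(x, relu_grad(x), leaky_relu_grad(x), round(tanh_output(x), 4))
```

With a Tanh output layer, training images are usually rescaled to the range [-1, 1] so that real and generated samples share the same value range; whether the paper does this is not stated in the abstract.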


Electronics ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 688
Author(s):  
Sung-Wook Park ◽  
Jun-Ho Huh ◽  
Jong-Chan Kim

In the field of deep learning, the generative model did not attract much attention until GANs (generative adversarial networks) appeared. In 2014, Ian Goodfellow and colleagues proposed a generative model called GANs. GANs use different structures and objective functions from the existing generative model. For example, GANs use two neural networks: a generator that creates a realistic image, and a discriminator that distinguishes whether the input is real or synthetic. If there are no problems in the training process, GANs can generate images that are difficult even for experts to distinguish in terms of authenticity. Currently, GANs are the most researched subject in the field of computer vision, which deals with the technology of image style translation, synthesis, and generation, and various models have been unveiled. The issues raised are also being resolved one by one. In image synthesis, BEGAN (Boundary Equilibrium Generative Adversarial Network), which outperforms the previously announced GANs, learns the latent space of the image, while balancing the generator and discriminator. Nonetheless, BEGAN also suffers from mode collapse, wherein the generator generates only a few images or a single one. Although BEGAN-CS (Boundary Equilibrium Generative Adversarial Network with Constrained Space), which was improved in terms of loss function, was introduced, it did not solve the mode collapse. The discriminator structure of BEGAN-CS is AE (AutoEncoder), which cannot create a particularly useful or structured latent space. Compression performance is not good either. In this paper, this characteristic of AE is considered to be related to the occurrence of mode collapse. Thus, we used VAE (Variational AutoEncoder), which added statistical techniques to AE. As a result of the experiment, the proposed model did not exhibit mode collapse and converged to a better state than BEGAN-CS.
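For reference, the two-network game between generator and discriminator described above is usually written as the minimax objective of the original 2014 GAN formulation. This is the standard GAN objective, not the BEGAN or BEGAN-CS loss, whose details the abstract does not give. Here G is the generator, D the discriminator, p_data the data distribution, and p_z the noise prior.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes V by assigning high probability to real samples and low probability to generated ones, while the generator minimizes V by producing samples the discriminator accepts as real.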


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3119 ◽  
Author(s):  
Jingtao Li ◽  
Zhanlong Chen ◽  
Xiaozhen Zhao ◽  
Lijia Shao

In recent years, the generative adversarial network (GAN)-based image translation model has achieved great success in image synthesis, image inpainting, image super-resolution, and other tasks. However, the images generated by these models often have problems such as insufficient details and low quality. Especially for the task of map generation, the generated electronic map cannot achieve effects comparable to industrial production in terms of accuracy and aesthetics. This paper proposes a model called Map Generative Adversarial Networks (MapGAN) for generating multitype electronic maps accurately and quickly based on both remote sensing images and render matrices. MapGAN improves the generator architecture of Pix2pixHD and adds a classifier to enhance the model, enabling it to learn the characteristics and style differences of different types of maps. Using the datasets of Google Maps, Baidu maps, and Map World maps, we compare MapGAN with some recent image translation models in the fields of one-to-one map generation and one-to-many domain map generation. The results show that the quality of the electronic maps generated by MapGAN is optimal in terms of both intuitive vision and classic evaluation indicators.


2020 ◽  
Vol 10 (6) ◽  
pp. 1995 ◽  
Author(s):  
Jeong gi Kwak ◽  
Hanseok Ko

The processing of facial images is an important task, because it is required for a large number of real-world applications. As deep-learning models evolve, they require a huge number of images for training. In reality, however, the number of images available is limited. Generative adversarial networks (GANs) have thus been utilized for database augmentation, but they suffer from unstable training, low visual quality, and a lack of diversity. In this paper, we propose an auto-encoder-based GAN with an enhanced network structure and training scheme for Database (DB) augmentation and image synthesis. Our generator and decoder are divided into two separate modules that each take input vectors for low-level and high-level features; these input vectors affect all layers within the generator and decoder. The effectiveness of the proposed method is demonstrated by comparing it with baseline methods. In addition, we introduce a new scheme that can combine two existing images without the need for extra networks based on the auto-encoder structure of the discriminator in our model. We add a novel double-constraint loss to make the encoded latent vectors equal to the input vectors.

