Review on Generative Adversarial Networks: Focusing on Computer Vision and Its Applications

Sung-Wook Park; Jae-Sub Ko; Jun-Ho Huh; Jong-Chan Kim

doi:10.3390/electronics10101216

Review on Generative Adversarial Networks: Focusing on Computer Vision and Its Applications

Electronics ◽

10.3390/electronics10101216 ◽

2021 ◽

Vol 10 (10) ◽

pp. 1216

Author(s):

Sung-Wook Park ◽

Jae-Sub Ko ◽

Jun-Ho Huh ◽

Jong-Chan Kim

Keyword(s):

Machine Learning ◽

Computer Vision ◽

Random Noise ◽

Image Synthesis ◽

Research Direction ◽

Input Image ◽

Generative Model ◽

Generative Adversarial Networks ◽

Future Research ◽

Adversarial Networks

The emergence of the deep learning model GAN (Generative Adversarial Network) is an important turning point in generative modeling. GAN is more powerful in feature and representation learning than machine learning-based generative model algorithms. Nowadays, it is also used to generate non-image data, such as voice and natural language. Typical technologies include BERT (Bidirectional Encoder Representations from Transformers), GPT-3 (Generative Pretrained Transformer-3), and MuseNet. GAN differs from machine learning-based generative models in its structure and objective function. Training is conducted by two networks: a generator and a discriminator. The generator converts random noise into a true-to-life image, whereas the discriminator distinguishes whether the input image is real or synthetic. As training continues, the generator learns more sophisticated synthesis techniques, and the discriminator grows into a more accurate differentiator. GAN has problems, such as mode collapse, training instability, and a lack of evaluation metrics, and many researchers have tried to solve these problems. For example, solutions such as one-sided label smoothing, instance normalization, and minibatch discrimination have been proposed. The field of application has also expanded. This paper provides an overview of GAN and application solutions for researchers in computer vision and artificial intelligence healthcare. The structure and principle of operation of GAN, the core models of GAN proposed to date, and the theory of GAN were analyzed. Application examples of GAN, such as image classification and regression, image synthesis and inpainting, image-to-image translation, super-resolution, and point registration, were then presented. The discussion tackled GAN’s problems and solutions, and a future research direction was finally proposed.
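The two-network training and the one-sided label smoothing mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the smoothed target of 0.9, and the use of the non-saturating generator loss are assumptions chosen for illustration, and the discriminator outputs are plain probabilities rather than a real network.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for one predicted probability and one (possibly soft) target."""
    eps = 1e-12  # guards against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

def discriminator_loss(real_probs, fake_probs, real_target=0.9):
    """Mean discriminator loss with one-sided label smoothing.

    Only the target for real images is softened (1.0 -> real_target, an
    assumed value); the target for synthetic images stays at 0.0.
    """
    real_term = sum(bce(p, real_target) for p in real_probs) / len(real_probs)
    fake_term = sum(bce(p, 0.0) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term

def generator_loss(fake_probs):
    """Non-saturating generator loss: push the discriminator's output on fakes toward 1."""
    return sum(bce(p, 1.0) for p in fake_probs) / len(fake_probs)

# Hypothetical discriminator outputs for a batch of real and synthetic images.
d_loss = discriminator_loss(real_probs=[0.9, 0.8], fake_probs=[0.1, 0.3])
g_loss = generator_loss(fake_probs=[0.1, 0.3])
```

With the smoothed target, a discriminator that outputs 0.999 on real images is penalized more than one that outputs 0.9, which is the intended damping of overconfident predictions.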


Handwritten Signature Spoofing With Conditional Generative Adversarial Nets

10.4018/978-1-7998-7323-5.ch006 ◽

2022 ◽

pp. 98-110

Author(s):

Md Fazle Rabby ◽

Md Abdullah Al Momin ◽

Xiali Hei

Keyword(s):

Computer Vision ◽

Image Synthesis ◽

Research Topic ◽

Generative Adversarial Networks ◽

Adversarial Networks ◽

Image Translation ◽

Handwritten Signature ◽

Condition Vector

Generative adversarial networks have been an intensively studied research topic in computer vision, especially in image synthesis and image-to-image translation. There are many variants of generative networks, and different GANs are suitable for different applications. In this chapter, the authors investigated conditional generative adversarial networks to generate fake images, such as handwritten signatures. The authors demonstrated an implementation of conditional generative adversarial networks that can generate fake handwritten signatures according to a condition vector tailored by humans.
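As a rough illustration of how a condition vector can steer a conditional GAN, the sketch below builds a generator input by concatenating random noise with a one-hot condition vector. The chapter does not specify its encoding, so the one-hot scheme, the dimensions, and all function names here are hypothetical.

```python
import random

def one_hot(index, size):
    """Return a list of `size` zeros with a single 1.0 at position `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]

def conditional_generator_input(noise_dim, condition, rng):
    """Concatenate Gaussian noise with a condition vector.

    A conditional generator would receive this combined vector, so the
    same noise can yield different outputs under different conditions.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + list(condition)

rng = random.Random(0)        # seeded for reproducibility
condition = one_hot(2, 5)     # hypothetical: class 2 out of 5 signature styles
g_input = conditional_generator_input(8, condition, rng)
```

In a typical conditional GAN, the discriminator also receives the condition vector alongside the image, so that it judges whether the image matches the requested condition.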


BEGAN v3: Avoiding Mode Collapse in GANs Using Variational Inference

Electronics ◽

10.3390/electronics9040688 ◽

2020 ◽

Vol 9 (4) ◽

pp. 688

Author(s):

Sung-Wook Park ◽

Jun-Ho Huh ◽

Jong-Chan Kim

Keyword(s):

Image Synthesis ◽

Generative Model ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Compression Performance ◽

Adversarial Network ◽

Adversarial Networks ◽

Latent Space ◽

Boundary Equilibrium ◽

Space Compression

In the field of deep learning, generative models did not attract much attention until GANs (generative adversarial networks) appeared. In 2014, Ian Goodfellow proposed a generative model called GANs. GANs use different structures and objective functions from existing generative models. For example, GANs use two neural networks: a generator that creates a realistic image, and a discriminator that distinguishes whether the input is real or synthetic. If there are no problems in the training process, GANs can generate images whose authenticity is difficult even for experts to judge. Currently, GANs are the most researched subject in the field of computer vision, which deals with the technology of image style translation, synthesis, and generation, and various models have been unveiled. The issues raised are also being resolved one by one. In image synthesis, BEGAN (Boundary Equilibrium Generative Adversarial Network), which outperforms the previously announced GANs, learns the latent space of the image while balancing the generator and discriminator. Nonetheless, BEGAN also suffers from mode collapse, wherein the generator generates only a few images or a single one. Although BEGAN-CS (Boundary Equilibrium Generative Adversarial Network with Constrained Space), which improved the loss function, was introduced, it did not solve the mode collapse. The discriminator structure of BEGAN-CS is an AE (AutoEncoder), which cannot create a particularly useful or structured latent space. Its compression performance is not good either. In this paper, this characteristic of the AE is considered to be related to the occurrence of mode collapse. Thus, we used a VAE (Variational AutoEncoder), which adds statistical techniques to the AE. In the experiments, the proposed model did not exhibit mode collapse and converged to a better state than BEGAN-CS.
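The balancing of generator and discriminator that BEGAN performs can be sketched as follows. The abstract gives no formulas, so this follows the commonly cited BEGAN formulation (an autoencoder discriminator, a balance variable k, and a diversity ratio gamma) and should be read as an assumption; the function name and default values are illustrative, and the reconstruction losses are supplied as plain numbers.

```python
def began_step(recon_real, recon_fake, k, gamma=0.5, lambda_k=0.001):
    """One bookkeeping step of BEGAN-style equilibrium balancing.

    recon_real: autoencoder reconstruction loss on real images, L(x)
    recon_fake: reconstruction loss on generated images, L(G(z))
    k:          balance variable weighting the fake term, kept in [0, 1]
    gamma:      target ratio L(G(z)) / L(x) (assumed default)
    lambda_k:   learning rate for k (assumed default)
    """
    d_loss = recon_real - k * recon_fake       # discriminator objective
    g_loss = recon_fake                        # generator objective
    balance = gamma * recon_real - recon_fake  # zero at equilibrium
    k_next = min(max(k + lambda_k * balance, 0.0), 1.0)
    convergence = recon_real + abs(balance)    # global convergence measure
    return d_loss, g_loss, k_next, convergence

# Hypothetical loss values for a single training step.
d_loss, g_loss, k_next, convergence = began_step(recon_real=1.0, recon_fake=0.2, k=0.0)
```

The proposed model replaces the AE discriminator with a VAE; this sketch covers only the balancing logic, not that change.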


Deep Fake Image Detection Based on Pairwise Learning

Applied Sciences ◽

10.3390/app10010370 ◽

2020 ◽

Vol 10 (1) ◽

pp. 370 ◽

Cited By ~ 6

Author(s):

Chih-Chung Hsu ◽

Yi-Xiu Zhuang ◽

Chia-Yen Lee

Keyword(s):

State Of The Art ◽

Random Noise ◽

Input Image ◽

Generative Adversarial Networks ◽

Source Image ◽

Image Forgery ◽

Social Media Networks ◽

Adversarial Networks ◽

Pairwise Learning ◽

Image Pairs

Generative adversarial networks (GANs) can be used to generate a photo-realistic image from low-dimensional random noise. Such a synthesized (fake) image with inappropriate content can be used on social media networks, which can cause severe problems. To detect fake images successfully, an effective and efficient image forgery detector is necessary. However, conventional image forgery detectors fail to recognize fake images generated by a GAN-based generator, since these images are generated and manipulated from the source image. Therefore, in this paper, we propose a deep learning-based approach for detecting fake images by using the contrastive loss. First, several state-of-the-art GANs are employed to generate the fake–real image pairs. Next, the reduced DenseNet is extended into a two-stream network structure to allow pairwise information as the input. Then, the proposed common fake feature network is trained using pairwise learning to distinguish the features of fake and real images. Finally, a classification layer is concatenated to the proposed common fake feature network to detect whether the input image is fake or real. The experimental results demonstrated that the proposed method significantly outperformed other state-of-the-art fake image detectors.
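The contrastive loss used for pairwise learning can be sketched in its standard form: pairs from the same class are pulled together, and pairs from different classes are pushed apart up to a margin. The abstract does not give the exact formulation used in the paper, so the 0.5 scaling, the margin value, and the function names below are assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(feat_a, feat_b, same_class, margin=1.0):
    """Standard contrastive loss for one pair of feature vectors.

    same_class=True  (e.g. real-real or fake-fake): penalize distance.
    same_class=False (e.g. fake-real): penalize only pairs closer than `margin`.
    """
    d = euclidean(feat_a, feat_b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

# Hypothetical 2-D features produced by the two streams of the network.
loss_similar = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True)      # 12.5
loss_dissimilar = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False)  # 0.0
```

A fake–real pair that is already farther apart than the margin contributes no loss, so training focuses on pairs the feature network still confuses.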


Generative Adversarial Networks in Computer Vision

ACM Computing Surveys ◽

10.1145/3439723 ◽

2021 ◽

Vol 54 (2) ◽

pp. 1-38

Author(s):

Zhengwei Wang ◽

Qi She ◽

Tomás E. Ward

Keyword(s):

Computer Vision ◽

Generative Adversarial Networks ◽

Future Research ◽

Image Generation ◽

Adversarial Networks ◽

Image Translation ◽

The Status ◽

Loss Variant ◽

Future Research Directions ◽

Application Requirements

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision, where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation, and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are as follows: (1) the generation of high quality images, (2) diversity of image generation, and (3) stabilizing training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state-of-the-art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress toward addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant and loss-variant GANs for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress toward critical computer vision application requirements. As we do this, we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success, along with some suggestions for future research directions. Code related to the GAN variants studied in this work is summarized at https://github.com/sheqi/GAN_Review.


Unpaired Image-to-Image Translation using Cycle Generative Adversarial Networks

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1525.089620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 380-385

Keyword(s):

Mapping Function ◽

Image Synthesis ◽

Input Image ◽

General Purpose ◽

Training Dataset ◽

Generative Adversarial Networks ◽

Inverse Mapping ◽

Style Transfer ◽

Adversarial Networks ◽

Image Translation

As adversarial networks attract growing interest, we extend our research towards adversarial networks as a general-purpose solution to image-to-image translation problems. Image-to-image translation is a class of computer science problems in the field of neural networks. We apply generative adversarial networks as a solution for image-to-image translation, where the goal is to learn a mapping between an input image (X) and an output image (Y) using a set of predefined pairs [4]. However, a paired dataset is not always available, which motivates unpaired adversarial methods. We therefore present a method that can translate an image from a domain X to another domain Y, and recover it, in the absence of paired datasets. Our objective is to learn a mapping function G: A → B such that the distribution of images G(A) is indistinguishable from the distribution of B under an adversarial loss [1]. Because this mapping is highly biased, we introduce an inverse mapping function F: B → A and a cycle consistency loss [7]. Furthermore, we wish to extend our research to various domains, including neural style transfer and semantic image synthesis. Our main contribution is to show that, on a wide variety of problems, conditional GANs produce reasonable results. This paper hence draws attention to converting image X to image Y, and we commit to transfer learning on the training dataset and to optimising our code. You can find the source code for the same here.
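The cycle consistency loss described in this abstract can be sketched with toy mappings standing in for the two generator networks. This is an illustration only: the L1 distance follows the usual CycleGAN formulation rather than anything stated in the abstract, and the mappings, function names, and sample values are hypothetical.

```python
def l1(a, b):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(G, F, batch_a, batch_b):
    """Penalize failure to recover an input after a round trip.

    G maps domain A -> B and F maps domain B -> A, so the loss asks that
    F(G(a)) stays close to a and G(F(b)) stays close to b.
    """
    forward = sum(l1(F(G(a)), a) for a in batch_a) / len(batch_a)
    backward = sum(l1(G(F(b)), b) for b in batch_b) / len(batch_b)
    return forward + backward

# Toy stand-ins for the generator networks (hypothetical).
G = lambda v: [2.0 * x for x in v]       # A -> B
F = lambda v: [x / 2.0 for x in v]       # B -> A, exact inverse of G
F_bad = lambda v: list(v)                # identity, not an inverse of G

perfect = cycle_consistency_loss(G, F, [[1.0, 3.0]], [[2.0]])        # 0.0
imperfect = cycle_consistency_loss(G, F_bad, [[1.0, 3.0]], [[2.0]])  # 4.0
```

In training, this term would be added to the two adversarial losses, so that the mappings are constrained to be approximate inverses of each other even without paired examples.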


Wasserstein Generative Adversarial Networks Based Data Augmentation for Radar Data Analysis

Applied Sciences ◽

10.3390/app10041449 ◽

2020 ◽

Vol 10 (4) ◽

pp. 1449

Author(s):

Hansoo Lee ◽

Jonggeun Kim ◽

Eun Kyeong Kim ◽

Sungshin Kim

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Weather Radar ◽

Image Synthesis ◽

Radar Data ◽

Generative Adversarial Networks ◽

Necessary Condition ◽

Radar Images ◽

Adversarial Networks ◽

Wide Range

Ground-based weather radars can observe a wide range with high spatial and temporal resolution. They benefit meteorological research and services by providing valuable information. Recent research on weather radar data has focused on applying machine learning and deep learning to solve complicated problems. It is well known that an adequate amount of data is a necessary condition in machine learning and deep learning. Generative adversarial networks (GANs) have received extensive attention for their remarkable data generation capacity since their fascinating competitive structure was proposed. Consequently, a massive number of variants have been proposed, and which model is adequate to solve a given problem is an inevitable concern. In this paper, we explore the problem of radar image synthesis and evaluate different GANs against authentic radar observation results. The experimental results showed that the improved Wasserstein GAN is more capable of generating radar images similar to the observations while achieving higher structural similarity results.
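The structural similarity measure mentioned in this abstract can be illustrated with a simplified, single-window version of the SSIM index. Standard implementations average the index over local (often Gaussian-weighted) windows, so this global variant is an assumption made for brevity; the constants 0.01 and 0.03 are the conventional defaults, and the pixel values are hypothetical.

```python
def global_ssim(x, y, data_range=1.0):
    """SSIM computed once over whole images given as flat pixel lists.

    Uses population statistics over all pixels instead of local windows.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

observed = [0.0, 0.5, 1.0, 0.5]   # hypothetical radar pixels scaled to [0, 1]
inverted = [1.0, 0.5, 0.0, 0.5]   # same pixels with the structure reversed

same_score = global_ssim(observed, observed)
inverted_score = global_ssim(observed, inverted)
```

An image compared with itself scores 1, while the structurally reversed image scores far lower, which is why a higher score indicates generated radar images closer to the observations.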


Creating Synthetic Images from Text Descriptions with the Generative Adversarial Network Algorithm

Aisyah Journal Of Informatics and Electrical Engineering (A.J.I.E.E) ◽

10.30604/jti.v2i2.31 ◽

2020 ◽

Vol 2 (2) ◽

pp. 111-114

Author(s):

R Wisnu Prio Pamungkas ◽

Rakhmi Khalida ◽

Siti Setiawati

Keyword(s):

Machine Learning ◽

Image Synthesis ◽

Generative Adversarial Networks ◽

Human Intelligence ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks

ABSTRACT Recently, computers have become able to produce realistic photos from text. This is one of the creative uses of machine learning. Machine learning is the field of solving problems that require an understanding equivalent to human intelligence. In this study, the Generative Adversarial Networks (GAN) algorithm is used to create images from text descriptions. The basic GAN architecture consists of two networks, called the Generator and the Discriminator. The resulting images still lack detail in interpreting a text description, but the authors aim to produce images that inspire; the images can be more poetic when generated from poetry, lyrics, or book quotes. Keywords: GAN, Image Synthesis, Text Description


A Survey on Generative Adversarial Networks: Variants, Applications, and Training

ACM Computing Surveys ◽

10.1145/3463475 ◽

2022 ◽

Vol 54 (8) ◽

pp. 1-49

Author(s):

Abdul Jabbar ◽

Xi Li ◽

Bourahla Omar

Keyword(s):

Machine Learning ◽

Computer Vision ◽

Nash Equilibrium ◽

Generative Models ◽

Generative Adversarial Networks ◽

Data Generation ◽

Crucial Issue ◽

Practical Applications ◽

Adversarial Networks ◽

And Training

Generative models have gained considerable attention in unsupervised learning via a new and practical framework called Generative Adversarial Networks (GANs), due to their outstanding data generation capability. Many GAN models have been proposed, and several practical applications have emerged in various domains of computer vision and machine learning. Despite GANs' excellent success, there are still obstacles to stable training. The problems are Nash equilibrium, internal covariate shift, mode collapse, vanishing gradient, and lack of proper evaluation metrics. Therefore, stable training is a crucial issue in different applications for the success of GANs. Herein, we survey several training solutions proposed by different researchers to stabilize GAN training. We discuss (I) the original GAN model and its modified versions, (II) a detailed analysis of various GAN applications in different domains, and (III) a detailed study of the various GAN training obstacles as well as training solutions. Finally, we reveal several open issues as well as research outlines for the topic.


Image Synthesis Based On Feature Description

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37812 ◽

2021 ◽

Vol 9 (8) ◽

pp. 2492-2494

Author(s):

Rohan Bolusani

Keyword(s):

Neural Network ◽

Machine Learning ◽

Language Processing ◽

Image Synthesis ◽

Generative Models ◽

Generative Adversarial Networks ◽

Feature Representations ◽

Adversarial Networks ◽

Text Feature ◽

Image Modelling

Abstract: Generating realistic images from text is innovative and interesting, but modern-day machine learning models are still far from this goal. With research and development in the field of natural language processing, neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, in the field of machine learning, generative adversarial networks (GANs) have begun to generate extremely accurate images, especially in categories such as faces, album covers, and room interiors. In this work, the main goal is to develop a neural network that bridges these advances in text and image modelling. By essentially translating characters to pixels, the project will demonstrate the capability of generative models to take detailed text descriptions and generate plausible images. Keywords: Deep Learning, Computer Vision, NLP, Generative Adversarial Networks


ORGANIC

10.26434/chemrxiv.5309668.v1 ◽

2017 ◽

Author(s):

Benjamin Sanchez-Lengeling ◽

Carlos Outeiral ◽

Gabriel L. Guimaraes ◽

Alan Aspuru-Guzik

Keyword(s):

Machine Learning ◽

Learning Community ◽

Chemical Species ◽

Material Design ◽

Organic Photovoltaic ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks ◽

Photovoltaic Material

Molecular discovery seeks to generate chemical species tailored to very specific needs. In this paper, we present ORGANIC, a framework based on Objective-Reinforced Generative Adversarial Networks (ORGAN), capable of producing a distribution over molecular space that matches with a certain set of desirable metrics. This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes. We explore several applications, from optimization of random physicochemical properties to candidates for drug discovery and organic photovoltaic material design.
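The combination of a GAN with reinforcement learning described in this abstract can be sketched as a reward that blends the discriminator's judgment with a domain objective. The abstract does not give the formula, so the linear mix below, the weight `lam`, and the function name are assumptions made for illustration.

```python
def blended_reward(discriminator_score, objective_score, lam=0.5):
    """Mix a realism score with a desirability score into one reward.

    discriminator_score: how plausible the discriminator finds the molecule, in [0, 1]
    objective_score:     how well the molecule meets the target metric, in [0, 1]
    lam:                 trade-off weight (assumed); 1.0 ignores the objective,
                         0.0 ignores the discriminator
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * discriminator_score + (1.0 - lam) * objective_score

# Hypothetical scores for one generated molecule.
reward = blended_reward(discriminator_score=0.8, objective_score=0.2, lam=0.5)
```

Shifting `lam` toward 0 biases generation towards the desired attributes, at the risk of producing less plausible molecules; shifting it toward 1 does the opposite.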
