scholarly journals Distilling Portable Generative Adversarial Networks for Image Translation

2020 ◽  
Vol 34 (04) ◽  
pp. 3585-3592 ◽  
Author(s):  
Hanting Chen ◽  
Yunhe Wang ◽  
Han Shu ◽  
Changyuan Wen ◽  
Chunjing Xu ◽  
...  

Despite Generative Adversarial Networks (GANs) have been widely used in various image-to-image translation tasks, they can be hardly applied on mobile devices due to their heavy computation and storage cost. Traditional network compression methods focus on visually recognition tasks, but never deal with generation tasks. Inspired by knowledge distillation, a student generator of fewer parameters is trained by inheriting the low-level and high-level information from the original heavy teacher generator. To promote the capability of student generator, we include a student discriminator to measure the distances between real images, and images generated by student and teacher generators. An adversarial learning process is therefore established to optimize student generator and student discriminator. Qualitative and quantitative analysis by conducting experiments on benchmark datasets demonstrate that the proposed method can learn portable generative models with strong performance.

2021 ◽  
Vol 8 (1) ◽  
pp. 3-31
Author(s):  
Yuan Xue ◽  
Yuan-Chen Guo ◽  
Han Zhang ◽  
Tao Xu ◽  
Song-Hai Zhang ◽  
...  

AbstractIn many applications of computer graphics, art, and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph, or layout, and have a computer system automatically generate photo-realistic images according to that input. While classically, works that allow such automatic image content generation have followed a framework of image retrieval and composition, recent advances in deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), and flow-based methods have enabled more powerful and versatile image generation approaches. This paper reviews recent works for image synthesis given intuitive user input, covering advances in input versatility, image generation methodology, benchmark datasets, and evaluation metrics. This motivates new perspectives on input representation and interactivity, cross fertilization between major image generation paradigms, and evaluation and comparison of generation methods.


2020 ◽  
Vol 2020 (10) ◽  
pp. 312-1-312-7
Author(s):  
Habib Ullah ◽  
Sultan Daud Khan ◽  
Mohib Ullah ◽  
Maqsood Mahmud ◽  
Faouzi Alaya Cheikh

Generative adversarial networks (GANs) have been significantly investigated in the past few years due to its outstanding data generation capacity. The extensive use of the GANs techniques is dominant in the field of computer vision, for example, plausible image generation, image to image translation, facial attribute manipulation, improving image resolution, and image to text translation. In spite of the significant success achieved in these domains, applying GANs to various other problems still presents important challenges. Several reviews and surveys for GANs are available in the literature. However, none of them present short but focused review about the most significant aspects of GANs. In this paper, we address these aspects. We analyze the basic theory of GANs and the differences among various generative models. Then, we discuss the recent spectrum of applications covered by the GANs. We also provide an insight into the challenges and future directions.


2021 ◽  
Author(s):  
Huajun Liu ◽  
Ziyan Wang ◽  
Haigang Sui ◽  
Qing Zhu ◽  
Shubo Liu ◽  
...  

2021 ◽  
Vol 54 (3) ◽  
pp. 1-42
Author(s):  
Divya Saxena ◽  
Jiannong Cao

Generative Adversarial Networks (GANs) is a novel class of deep generative models that has recently gained significant attention. GANs learn complex and high-dimensional distributions implicitly over images, audio, and data. However, there exist major challenges in training of GANs, i.e., mode collapse, non-convergence, and instability, due to inappropriate design of network architectre, use of objective function, and selection of optimization algorithm. Recently, to address these challenges, several solutions for better design and optimization of GANs have been investigated based on techniques of re-engineered network architectures, new objective functions, and alternative optimization algorithms. To the best of our knowledge, there is no existing survey that has particularly focused on the broad and systematic developments of these solutions. In this study, we perform a comprehensive survey of the advancements in GANs design and optimization solutions proposed to handle GANs challenges. We first identify key research issues within each design and optimization technique and then propose a new taxonomy to structure solutions by key research issues. In accordance with the taxonomy, we provide a detailed discussion on different GANs variants proposed within each solution and their relationships. Finally, based on the insights gained, we present promising research directions in this rapidly growing field.


2021 ◽  
Vol 13 (4) ◽  
pp. 596
Author(s):  
David Vint ◽  
Matthew Anderson ◽  
Yuhao Yang ◽  
Christos Ilioudis ◽  
Gaetano Di Caterina ◽  
...  

In recent years, the technological advances leading to the production of high-resolution Synthetic Aperture Radar (SAR) images has enabled more and more effective target recognition capabilities. However, high spatial resolution is not always achievable, and, for some particular sensing modes, such as Foliage Penetrating Radars, low resolution imaging is often the only option. In this paper, the problem of automatic target recognition in Low Resolution Foliage Penetrating (FOPEN) SAR is addressed through the use of Convolutional Neural Networks (CNNs) able to extract both low and high level features of the imaged targets. Additionally, to address the issue of limited dataset size, Generative Adversarial Networks are used to enlarge the training set. Finally, a Receiver Operating Characteristic (ROC)-based post-classification decision approach is used to reduce classification errors and measure the capability of the classifier to provide a reliable output. The effectiveness of the proposed framework is demonstrated through the use of real SAR FOPEN data.


2021 ◽  
Vol 251 ◽  
pp. 03055
Author(s):  
John Blue ◽  
Braden Kronheim ◽  
Michelle Kuchera ◽  
Raghuram Ramanujan

Detector simulation in high energy physics experiments is a key yet computationally expensive step in the event simulation process. There has been much recent interest in using deep generative models as a faster alternative to the full Monte Carlo simulation process in situations in which the utmost accuracy is not necessary. In this work we investigate the use of conditional Wasserstein Generative Adversarial Networks to simulate both hadronization and the detector response to jets. Our model takes the 4-momenta of jets formed from partons post-showering and pre-hadronization as inputs and predicts the 4-momenta of the corresponding reconstructed jet. Our model is trained on fully simulated tt events using the publicly available GEANT-based simulation of the CMS Collaboration. We demonstrate that the model produces accurate conditional reconstructed jet transverse momentum (pT) distributions over a wide range of pT for the input parton jet. Our model takes only a fraction of the time necessary for conventional detector simulation methods, running on a CPU in less than a millisecond per event.


Author(s):  
Chien-Yu Lu ◽  
Min-Xin Xue ◽  
Chia-Che Chang ◽  
Che-Rung Lee ◽  
Li Su

Style transfer of polyphonic music recordings is a challenging task when considering the modeling of diverse, imaginative, and reasonable music pieces in the style different from their original one. To achieve this, learning stable multi-modal representations for both domain-variant (i.e., style) and domaininvariant (i.e., content) information of music in an unsupervised manner is critical. In this paper, we propose an unsupervised music style transfer method without the need for parallel data. Besides, to characterize the multi-modal distribution of music pieces, we employ the Multi-modal Unsupervised Image-to-Image Translation (MUNIT) framework in the proposed system. This allows one to generate diverse outputs from the learned latent distributions representing contents and styles. Moreover, to better capture the granularity of sound, such as the perceptual dimensions of timbre and the nuance in instrument-specific performance, cognitively plausible features including mel-frequency cepstral coefficients (MFCC), spectral difference, and spectral envelope, are combined with the widely-used mel-spectrogram into a timbreenhanced multi-channel input representation. The Relativistic average Generative Adversarial Networks (RaGAN) is also utilized to achieve fast convergence and high stability. We conduct experiments on bilateral style transfer tasks among three different genres, namely piano solo, guitar solo, and string quartet. Results demonstrate the advantages of the proposed method in music style transfer with improved sound quality and in allowing users to manipulate the output.


Sign in / Sign up

Export Citation Format

Share Document