S2I-Bird: Sound-to-Image Generation of Bird Species using Generative Adversarial Networks

Author(s):  
Joo Yong Shim ◽  
Joongheon Kim ◽  
Jong-Kook Kim
2021 ◽  
Vol 15 ◽  
Author(s):  
Jiasong Wu ◽  
Xiang Qiu ◽  
Jing Zhang ◽  
Fuzhi Wu ◽  
Youyong Kong ◽  
...  

Generative adversarial networks (GANs) and variational autoencoders (VAEs) provide impressive image generation from Gaussian white noise, but both are difficult to train, since they need a generator (or encoder) and a discriminator (or decoder) to be trained simultaneously, which can easily lead to unstable training. To solve or alleviate these synchronous training problems of GANs and VAEs, researchers recently proposed generative scattering networks (GSNs), which use wavelet scattering networks (ScatNets) as the encoder to obtain features (or ScatNet embeddings) and convolutional neural networks (CNNs) as the decoder to generate an image. The advantage of GSNs is that the parameters of ScatNets do not need to be learned, while the disadvantage is that the representational ability of ScatNets is slightly weaker than that of CNNs. In addition, the dimensionality reduction method of principal component analysis (PCA) can easily lead to overfitting during the training of GSNs and, therefore, degrade the quality of generated images at test time. To further improve the quality of generated images while keeping the advantages of GSNs, this study proposes generative fractional scattering networks (GFRSNs), which use the more expressive fractional wavelet scattering networks (FrScatNets) instead of ScatNets as the encoder to obtain features (or FrScatNet embeddings) and use CNN decoders similar to those of GSNs to generate an image. Additionally, this study develops a new dimensionality reduction method named feature-map fusion (FMF), used instead of PCA, to better retain the information of the FrScatNet embeddings; it also discusses the effect of image fusion on the quality of the generated images. The experimental results obtained on the CIFAR-10 and CelebA datasets show that the proposed GFRSNs generate better images than the original GSNs on the testing datasets. Experimental comparisons of the proposed GFRSNs with deep convolutional GAN (DCGAN), progressive GAN (PGAN), and CycleGAN are also given.
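
As a rough illustration of the GSN/GFRSN idea, the sketch below pairs a fixed (parameter-free) scattering encoder with a small learned CNN decoder trained only on a reconstruction loss, so there is no adversarial game to balance. It uses kymatio's standard Scattering2D as a stand-in for the paper's FrScatNet, which is not publicly packaged as far as we know; the decoder layout, batch, and training step are illustrative assumptions, not the authors' code.

```python
# Sketch of the GSN-style pipeline: a fixed scattering encoder plus a learned
# CNN decoder. Scattering2D (kymatio) stands in for the paper's FrScatNet;
# decoder layout and training step are illustrative assumptions.
import torch
import torch.nn as nn
from kymatio.torch import Scattering2D

J, L, H = 2, 8, 32                                  # scattering scale, orientations, image size
scattering = Scattering2D(J=J, shape=(H, H), L=L)   # fixed transform: nothing to learn
K = 1 + J * L + L ** 2 * J * (J - 1) // 2           # scattering channels per input channel (81)

decoder = nn.Sequential(                            # learned decoder: 8x8 features -> 32x32 image
    nn.Conv2d(3 * K, 128, 3, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # 8x8 -> 16x16
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),    # 16x16 -> 32x32
)

opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
x = torch.rand(16, 3, H, H) * 2 - 1                 # stand-in batch of CIFAR-10-sized images
s = scattering(x).reshape(16, -1, H // 2 ** J, H // 2 ** J)  # (16, 3*K, 8, 8) embedding
opt.zero_grad()
loss = nn.functional.mse_loss(decoder(s), x)        # plain reconstruction objective:
loss.backward()                                     # only the decoder is trained, so there is
opt.step()                                          # no GAN/VAE-style two-network instability
```

The paper's FMF dimensionality reduction would sit between the scattering output and the decoder; in this sketch the full embedding is passed through unreduced.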


IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 1250-1260
Author(s):  
Muhammad Zeeshan Khan ◽  
Saira Jabeen ◽  
Muhammad Usman Ghani Khan ◽  
Tanzila Saba ◽  
Asim Rehmat ◽  
...  

2021 ◽  
Vol 13 (19) ◽  
pp. 3939
Author(s):  
Jihyong Oh ◽  
Munchurl Kim

Although generative adversarial networks (GANs) have been successfully applied to diverse fields, training GANs on synthetic aperture radar (SAR) data is a challenging task due to speckle noise. From the perspective of human learning, it is natural to learn a task by using information from multiple sources; however, previous GAN works on SAR image generation have used only target-class information. Due to the backscattering characteristics of SAR signals, the structures of SAR images depend strongly on their pose angles; nevertheless, pose-angle information has not been incorporated into GAN models for SAR images. In this paper, we propose a novel GAN-based multi-task learning (MTL) method for SAR target image generation, called PeaceGAN, which attaches two additional modules, a pose estimator and an auxiliary classifier, to its discriminator in order to effectively combine the pose and class information via MTL. Extensive experiments showed that the proposed MTL framework helps PeaceGAN's generator learn the distributions of SAR images effectively, so that it generates SAR target images more faithfully at the intended pose angles for the desired target classes than recent state-of-the-art methods.
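
A minimal sketch of the kind of multi-task discriminator described above is given below: one shared convolutional trunk feeding three heads (a real/fake logit, a pose estimate, and auxiliary class logits). The layer sizes, the (cos θ, sin θ) pose parameterization, and the input resolution are assumptions for illustration, not the paper's exact architecture.

```python
# Illustrative multi-task discriminator in the spirit of PeaceGAN: a shared
# trunk with an adversarial head, a pose-estimation head, and an auxiliary
# classification head. All sizes here are assumed, not taken from the paper.
import torch
import torch.nn as nn

class MTLDiscriminator(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.trunk = nn.Sequential(                 # features shared by all three tasks
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.adv = nn.Linear(128, 1)                # real/fake logit
        self.pose = nn.Linear(128, 2)               # pose angle as (cos, sin) to avoid wrap-around
        self.cls = nn.Linear(128, n_classes)        # auxiliary class logits

    def forward(self, x):
        h = self.trunk(x)
        return self.adv(h), self.pose(h), self.cls(h)

d = MTLDiscriminator()
sar = torch.randn(8, 1, 64, 64)                     # stand-in single-channel SAR batch
adv_logit, pose_pred, cls_logits = d(sar)           # shapes: (8, 1), (8, 2), (8, 10)
```

Training would then add a pose-regression loss and a classification loss to the usual adversarial loss; the abstract does not specify how those terms are weighted.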


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Ruixin Ma ◽  
Junying Lou ◽  
Peng Li ◽  
Jing Gao

Generating pictures from text is an interesting, classic, and challenging task. Benefiting from the development of generative adversarial networks (GANs), the generation quality of this task has been greatly improved, and many strong cross-modal GAN models have been put forward. These models add extensive layers and constraints to produce impressive images. However, the complexity and computational cost of existing cross-modal GANs are too high for deployment on mobile devices. To solve this problem, this paper designs a compact cross-modal GAN based on canonical polyadic (CP) decomposition. We replace an original convolution layer with three small convolution layers and use an autoencoder to stabilize and speed up training. The experimental results show that our model compresses both parameters and FLOPs to about 20% of the original without loss of quality in the generated images.
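
The core compression move, replacing one full convolution with three small ones, can be sketched as below. The abstract does not give the exact CP factorization or rank the authors use, so the 1x1 -> depthwise kxk -> 1x1 layout and the rank value here are assumptions for illustration.

```python
# Sketch of a CP-motivated factorization: one full KxK convolution replaced by
# three small convolutions. Layout and rank are assumptions; the paper's
# variant may differ in detail.
import torch
import torch.nn as nn

def cp_factorized_conv(c_in, c_out, k=3, rank=16):
    """1x1 (mix in-channels) -> depthwise kxk (spatial) -> 1x1 (mix out-channels)."""
    return nn.Sequential(
        nn.Conv2d(c_in, rank, kernel_size=1, bias=False),    # C_in -> R
        nn.Conv2d(rank, rank, kernel_size=k, padding=k // 2,
                  groups=rank, bias=False),                  # per-channel spatial filtering
        nn.Conv2d(rank, c_out, kernel_size=1, bias=False),   # R -> C_out
    )

full = nn.Conv2d(256, 256, 3, padding=1, bias=False)
slim = cp_factorized_conv(256, 256, k=3, rank=16)
n_params = lambda m: sum(p.numel() for p in m.parameters())
print(n_params(full), n_params(slim))   # 589824 vs 8336 for this single layer
```

A single 3x3 layer factorized at this rank shrinks to under 2% of its original size; the whole-model figure of about 20% quoted above is naturally less dramatic, since not every layer can be factorized this aggressively without hurting quality.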

