Semantically consistent text to fashion image synthesis with an enhanced attentional generative adversarial network

2020 ◽  
Vol 135 ◽  
pp. 22-29 ◽  
Author(s):  
Kenan E. Ak ◽  
Joo Hwee Lim ◽  
Jo Yew Tham ◽  
Ashraf A. Kassim

2021 ◽  
Vol 11 (4) ◽  
pp. 1380
Author(s):  
Yingbo Zhou ◽  
Pengcheng Zhao ◽  
Weiqin Tong ◽  
Yongxin Zhu

While Generative Adversarial Networks (GANs) have shown promising performance in image generation, they suffer from issues such as mode collapse and training instability. To stabilize GAN training and improve image synthesis quality and diversity, we propose a simple yet effective approach, Contrastive Distance Learning GAN (CDL-GAN), in this paper. Specifically, we add a Consistent Contrastive Distance (CoCD) and a Characteristic Contrastive Distance (ChCD) into a principled framework to improve GAN performance. The CoCD explicitly maximizes the ratio of the distance between generated images to the increment between noise vectors, strengthening image feature learning for the generator. The ChCD measures the sampling distance of the encoded images in Euler space to boost feature representations for the discriminator. We model the framework by employing a Siamese network as a module within GANs, without any modification to the backbone. Both qualitative and quantitative experiments on three public datasets demonstrate the effectiveness of our method.
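The CoCD term can be illustrated with a minimal PyTorch sketch (not the authors' released code); the generator interface G, the paired noise vectors z1/z2, and the L1 distances used for the ratio are assumptions based on the description above.

import torch

def cocd_loss(G, z1, z2, eps=1e-8):
    # Hypothetical Consistent Contrastive Distance (CoCD) term: encourage the
    # distance between two generated images to grow with the increment between
    # their noise vectors, which is one reading of the ratio described above.
    img1, img2 = G(z1), G(z2)
    img_dist = (img1 - img2).abs().flatten(1).mean(dim=1)  # per-sample image distance
    z_dist = (z1 - z2).abs().flatten(1).mean(dim=1)        # per-sample noise increment
    ratio = img_dist / (z_dist + eps)
    return -ratio.mean()  # maximizing the ratio = minimizing its negative

In training, such a term would be added to the usual generator loss with a weighting coefficient; the exact form and weight used in CDL-GAN may differ.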


2020 ◽  
Vol 34 (05) ◽  
pp. 8830-8837
Author(s):  
Xin Sheng ◽  
Linli Xu ◽  
Junliang Guo ◽  
Jingchang Liu ◽  
Ruoyu Zhao ◽  
...  

We propose a novel introspective model for variational neural machine translation (IntroVNMT) in this paper, inspired by the recent successful application of the introspective variational autoencoder (IntroVAE) to high-quality image synthesis. Unlike the vanilla variational NMT model, IntroVNMT is capable of improving itself introspectively by evaluating the quality of the generated target sentences according to the high-level latent variables of the real and generated target sentences. As a consequence of introspective training, the proposed model is able to discriminate between generated and real sentences of the target language via the latent variables produced by its encoder. In this way, IntroVNMT is able to generate more realistic target sentences in practice. Meanwhile, IntroVNMT inherits the advantages of variational autoencoders (VAEs), and its training process is more stable than that of generative adversarial network (GAN) based models. Experimental results on different translation tasks demonstrate that the proposed model achieves significant improvements over the vanilla variational NMT model.
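The introspective mechanism can be sketched with an IntroVAE-style margin objective on the KL divergence of the latent variables; the following PyTorch illustration is a rough sketch, where the function names, the margin m, and the simple KL margin are assumptions, not the paper's exact formulation.

import torch

def kl_to_prior(mu, logvar):
    # KL divergence between N(mu, exp(logvar)) and the standard normal prior.
    return 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1.0, dim=-1)

def introspective_terms(mu_real, logvar_real, mu_fake, logvar_fake, m=5.0):
    # Encoder side: keep latents of real target sentences close to the prior,
    # while pushing latents of generated sentences at least a margin m away.
    kl_real = kl_to_prior(mu_real, logvar_real).mean()
    kl_fake = kl_to_prior(mu_fake, logvar_fake).mean()
    encoder_term = kl_real + torch.clamp(m - kl_fake, min=0.0)
    # Decoder/generator side: produce sentences whose latents look "real",
    # i.e., whose KL the encoder cannot push past the margin.
    decoder_term = kl_fake
    return encoder_term, decoder_term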


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Linyan Li ◽  
Yu Sun ◽  
Fuyuan Hu ◽  
Tao Zhou ◽  
Xuefeng Xi ◽  
...  

In this paper, we propose an Attentional Concatenation Generative Adversarial Network (ACGAN) aimed at generating 1024 × 1024 high-resolution images. First, we propose a multilevel cascade structure for text-to-image synthesis. During training, we gradually add new layers and, at the same time, use the results and word vectors from the previous layer as inputs to the next layer to generate high-resolution images with photo-realistic details. Second, the deep attentional multimodal similarity model is introduced into the network, and we match word vectors with images in a common semantic space to compute a fine-grained matching loss for training the generator. In this way, we can attend to fine-grained, word-level information in the semantics. Finally, a diversity measure is added to the discriminator, which enables the generator to obtain more diverse gradient directions and improves the diversity of generated samples. The experimental results show that the inception scores of the proposed model on the CUB and Oxford-102 datasets reach 4.48 and 4.16, improvements of 2.75% and 6.42% over Attentional Generative Adversarial Networks (AttnGAN). The ACGAN model performs better on text-to-image generation, and the resulting images are closer to real images.
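A simplified PyTorch sketch of a fine-grained word-to-region matching score in the spirit of the deep attentional multimodal similarity model mentioned above; the tensor shapes, the temperature gamma, and the mean aggregation are illustrative assumptions rather than the exact DAMSM formulation.

import torch
import torch.nn.functional as F

def word_region_matching(word_feats, region_feats, gamma=5.0):
    # word_feats:   (B, T, D) word embeddings mapped into the common semantic space
    # region_feats: (B, R, D) image region features mapped into the same space
    w = F.normalize(word_feats, dim=-1)
    r = F.normalize(region_feats, dim=-1)
    sim = torch.bmm(w, r.transpose(1, 2))                   # (B, T, R) word-region similarities
    attn = F.softmax(gamma * sim, dim=-1)                   # attend each word over image regions
    context = torch.bmm(attn, r)                            # (B, T, D) region context per word
    word_scores = F.cosine_similarity(w, context, dim=-1)   # (B, T) per-word relevance
    return word_scores.mean(dim=1)                          # (B,) image-sentence matching score

Such a score would typically feed a contrastive matching loss over matched and mismatched image-text pairs to supervise the generator at the word level.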


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Huan Yang ◽  
Pengjiang Qian ◽  
Chao Fan

Multimodal registration is a challenging task due to the significant variations exhibited by images of different modalities. CT and MRI are two of the most commonly used medical images in clinical diagnosis, since MRI with multicontrast images, together with CT, can provide complementary auxiliary information. Deformable image registration between MRI and CT is essential for analyzing the relationships among different modality images. Here, we propose an indirect multimodal image registration method, i.e., an sCT-guided multimodal image registration and problematic image completion method. In addition, we design a deep learning-based generative network, the Conditional Auto-Encoder Generative Adversarial Network (CAE-GAN), which combines the ideas of VAE and GAN under a conditional process to tackle the problem of synthetic CT (sCT) synthesis. Our main contributions in this work can be summarized in three aspects: (1) We design a new generative network, CAE-GAN, which incorporates the advantages of two popular image synthesis methods, i.e., VAE and GAN, and produces high-quality synthetic images with limited training data. (2) We use the sCT generated from multicontrast MRI as an intermediary to transform multimodal MRI-CT registration into monomodal sCT-CT registration, which greatly reduces the registration difficulty. (3) Using normal CT as guidance and reference, we repair the abnormal MRI while registering the MRI to the normal CT.
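A compact sketch of how a conditional VAE objective (reconstruction plus KL) can be combined with an adversarial term, in the spirit of the CAE-GAN described above; the loss weights, the L1 reconstruction, and the function interface are placeholders, not the paper's exact settings.

import torch
import torch.nn.functional as F

def cae_gan_generator_loss(sct_pred, ct_target, mu, logvar, d_fake_logits,
                           w_rec=10.0, w_kl=0.1):
    # Reconstruct the target CT from MRI (VAE reconstruction), keep the latent
    # code close to the prior (VAE KL), and fool the discriminator (GAN part).
    rec = F.l1_loss(sct_pred, ct_target)
    kl = 0.5 * torch.mean(mu.pow(2) + logvar.exp() - logvar - 1.0)
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    return w_rec * rec + w_kl * kl + adv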


2021 ◽  
Vol 30 ◽  
pp. 1275-1290
Author(s):  
Hongchen Tan ◽  
Xiuping Liu ◽  
Meng Liu ◽  
Baocai Yin ◽  
Xin Li

2021 ◽  
Author(s):  
Ziyu Li ◽  
Qiyuan Tian ◽  
Chanon Ngamsombat ◽  
Samuel Cartmell ◽  
John Conklin ◽  
...  

Purpose: To improve the signal-to-noise ratio (SNR) of highly accelerated volumetric MRI while preserving realistic textures using a generative adversarial network (GAN). Methods: A hybrid GAN for denoising, entitled "HDnGAN", with a 3D generator and a 2D discriminator was proposed to denoise 3D T2-weighted fluid-attenuated inversion recovery (FLAIR) images acquired in 2.75 minutes (R=3×2) using wave-controlled aliasing in parallel imaging (Wave-CAIPI). HDnGAN was trained on data from 25 multiple sclerosis patients by minimizing a combined mean squared error and adversarial loss with adjustable weight λ. Results were evaluated on eight separate patients by comparison with standard T2-SPACE FLAIR images acquired in 7.25 minutes (R=2×2), using mean absolute error (MAE), peak SNR (PSNR), the structural similarity index (SSIM), and VGG perceptual loss, and by two neuroradiologists using a five-point score regarding gray-white matter contrast, sharpness, SNR, lesion conspicuity, and overall quality. Results: HDnGAN (λ=0) produced the lowest MAE and the highest PSNR and SSIM. HDnGAN (λ=10⁻³) produced the lowest VGG loss. In the reader study, HDnGAN (λ=10⁻³) significantly improved the gray-white matter contrast and SNR of Wave-CAIPI images and outperformed BM4D and HDnGAN (λ=0) regarding image sharpness. The overall quality score from HDnGAN (λ=10⁻³) was significantly higher than those from Wave-CAIPI, BM4D, and HDnGAN (λ=0), with no significant difference compared to standard images. Conclusion: HDnGAN concurrently benefits from the improved image synthesis performance of 3D convolution and from the increased number of training samples available to the 2D discriminator on limited data. HDnGAN generates images with high SNR and realistic textures, similar to those acquired with longer scan times and preferred by neuroradiologists.
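The combined objective (mean squared error plus an adversarial term weighted by λ) can be sketched as follows; the slice-wise handling for the 2D discriminator and the binary cross-entropy adversarial loss are assumptions for illustration, not necessarily the paper's exact implementation.

import torch
import torch.nn.functional as F

def hdngan_generator_loss(denoised_vol, target_vol, discriminator_2d, lam=1e-3):
    # denoised_vol, target_vol: (B, 1, D, H, W) volumes from the 3D generator.
    mse = F.mse_loss(denoised_vol, target_vol)
    # Split the 3D output into a batch of 2D slices so the 2D discriminator
    # sees many training samples per volume, which helps with limited data.
    b, c, d, h, w = denoised_vol.shape
    slices = denoised_vol.permute(0, 2, 1, 3, 4).reshape(b * d, c, h, w)
    logits = discriminator_2d(slices)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return mse + lam * adv

Setting lam=0 reduces this to plain MSE training, corresponding to the λ=0 configuration compared in the results; lam=1e-3 corresponds to the λ=10⁻³ setting preferred by the readers.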

