Recurrent Relational Memory Network for Unsupervised Image Captioning

Author(s):  
Dan Guo ◽  
Yang Wang ◽  
Peipei Song ◽  
Meng Wang

Unsupervised image captioning with no annotations is an emerging challenge in computer vision, where existing methods usually adopt Generative Adversarial Network (GAN) models. In this paper, we propose a novel memory-based network rather than a GAN, named Recurrent Relational Memory Network (R2M). Unlike complicated and sensitive adversarial learning, which performs poorly on long sentence generation, R2M implements a concepts-to-sentence memory translator through a two-stage memory mechanism: fusion memory and recurrent memory, which model the relational reasoning between common visual concepts and the generated words over long periods. R2M encodes visual context through unsupervised training on images, while the memory learns from an irrelevant textual corpus in a supervised fashion. Our solution has fewer learnable parameters and higher computational efficiency than GAN-based methods, which suffer from parameter sensitivity. We experimentally validate the superiority of R2M over the state of the art on all benchmark datasets.
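A minimal sketch of the two-stage memory idea described above, assuming a pooled-concept fusion step followed by a GRU-style recurrent memory that is carried across decoding steps; all module names, sizes, and the mean-pooling shortcut are illustrative assumptions, not the authors' released code.

```python
# Illustrative sketch of a two-stage (fusion + recurrent) memory decoder.
# Module names, sizes, and pooling are assumptions for exposition only.
import torch
import torch.nn as nn

class FusionRecurrentMemory(nn.Module):
    def __init__(self, concept_dim=300, mem_dim=512, vocab_size=10000):
        super().__init__()
        self.fuse = nn.Linear(concept_dim + mem_dim, mem_dim)   # fusion memory
        self.cell = nn.GRUCell(mem_dim, mem_dim)                 # recurrent memory
        self.out = nn.Linear(mem_dim, vocab_size)                # word distribution

    def forward(self, concepts, max_len=16):
        # concepts: (batch, num_concepts, concept_dim) detected visual concepts
        batch = concepts.size(0)
        memory = concepts.new_zeros(batch, self.cell.hidden_size)
        logits = []
        for _ in range(max_len):
            # fuse pooled concept features with the current memory
            # (a real attention mechanism is omitted for brevity)
            pooled = concepts.mean(dim=1)
            fused = torch.tanh(self.fuse(torch.cat([pooled, memory], dim=-1)))
            memory = self.cell(fused, memory)   # carry relational state across steps
            logits.append(self.out(memory))
        return torch.stack(logits, dim=1)       # (batch, max_len, vocab_size)

# usage: greedy word ids for a batch of 2 images with 5 detected concepts each
model = FusionRecurrentMemory()
words = model(torch.randn(2, 5, 300)).argmax(-1)  # shape (2, 16)
```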

2020 ◽  
Vol 34 (07) ◽  
pp. 12829-12836 ◽  
Author(s):  
Ling Zhang ◽  
Chengjiang Long ◽  
Xiaolong Zhang ◽  
Chunxia Xiao

Residual images and illumination estimation have proved very helpful in image enhancement. In this paper, we propose a general and novel framework, RIS-GAN, which explores residuals and illumination with generative adversarial networks for shadow removal. Combined with the coarse shadow-removal image, the estimated negative residual images and inverse illumination maps can be used to generate indirect shadow-removal images that refine the coarse shadow-removal result into a fine shadow-free image in a coarse-to-fine fashion. Three discriminators are designed to jointly distinguish whether the predicted negative residual images, shadow-removal images, and inverse illumination maps are real or fake, compared with the corresponding ground-truth information. To the best of our knowledge, we are the first to explore residuals and illumination for shadow removal. We evaluate our proposed method on two benchmark datasets, i.e., SRD and ISTD, and extensive experiments demonstrate that our proposed method achieves performance superior to the state of the art, although we have no particular shadow-aware components designed in our generators.
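A minimal sketch of how the three cues might be recombined in the coarse-to-fine step: the negative residual is added to the shadow image, the inverse illumination map is multiplied with it, and the resulting estimates are blended with the coarse prediction. The fusion weights and function names are assumptions for illustration, not the RIS-GAN implementation.

```python
# Sketch of recombining residual and illumination branches with the coarse prediction.
# Fusion weights and names are illustrative assumptions.
import numpy as np

def compose_shadow_free(shadow_img, coarse_free, neg_residual, inv_illum, w=(0.4, 0.3, 0.3)):
    """shadow_img, coarse_free: HxWx3 in [0, 1]; neg_residual: HxWx3 additive correction;
    inv_illum: HxWx3 multiplicative correction (>= 1 inside the shadow region)."""
    residual_free = np.clip(shadow_img + neg_residual, 0.0, 1.0)  # indirect estimate via residual
    illum_free = np.clip(shadow_img * inv_illum, 0.0, 1.0)        # indirect estimate via illumination
    fine = w[0] * coarse_free + w[1] * residual_free + w[2] * illum_free
    return np.clip(fine, 0.0, 1.0)

# usage with dummy inputs
img = np.random.rand(64, 64, 3)
fine = compose_shadow_free(img, img, np.zeros_like(img), np.ones_like(img))
```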


2021 ◽  
Vol 8 (1) ◽  
pp. 3-31
Author(s):  
Yuan Xue ◽  
Yuan-Chen Guo ◽  
Han Zhang ◽  
Tao Xu ◽  
Song-Hai Zhang ◽  
...  

In many applications of computer graphics, art, and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph, or layout, and have a computer system automatically generate photo-realistic images according to that input. While classically, works that allow such automatic image content generation have followed a framework of image retrieval and composition, recent advances in deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), and flow-based methods have enabled more powerful and versatile image generation approaches. This paper reviews recent works on image synthesis from intuitive user input, covering advances in input versatility, image generation methodology, benchmark datasets, and evaluation metrics. It motivates new perspectives on input representation and interactivity, cross-fertilization between major image generation paradigms, and evaluation and comparison of generation methods.


Author(s):  
Cory J. Butz ◽  
Jhonatan S. Oliveira ◽  
André E. Dos Santos ◽  
André L. Teixeira

We give conditions under which convolutional neural networks (CNNs) define valid sum-product networks (SPNs). One subclass, called convolutional SPNs (CSPNs), can be implemented using tensors, but can also suffer from being too shallow. Fortunately, tensors can be augmented while maintaining valid SPNs. This yields a larger subclass of CNNs, which we call deep convolutional SPNs (DCSPNs), where the convolutional and sum-pooling layers form rich directed acyclic graph structures. One salient feature of DCSPNs is that they constitute a rigorous probabilistic model. As such, they can exploit multiple kinds of probabilistic reasoning, including marginal inference and most probable explanation (MPE) inference. This enables an alternative method for learning DCSPNs using vectorized differentiable MPE, which plays a role similar to that of the generator in generative adversarial networks (GANs). Image sampling is yet another application demonstrating the robustness of DCSPNs. Our preliminary results on image sampling are encouraging, since the DCSPN-sampled images exhibit variability. Experiments on image completion show that DCSPNs significantly outperform competing methods, achieving several state-of-the-art mean squared error (MSE) scores in both left-completion and bottom-completion on benchmark datasets.
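To make the marginal and MPE queries mentioned above concrete, here is a toy valid SPN over two binary variables (a complete sum node over decomposable product nodes); the structure, weights, and enumeration-based MPE are purely illustrative and far simpler than a DCSPN.

```python
# Toy sum-product network over two binary variables X1, X2.
# Structure and weights are made up for exposition.
import numpy as np

# Leaf distributions under two mixture components k = 0, 1.
p_x1 = np.array([[0.8, 0.2], [0.3, 0.7]])   # row k: [P(X1=0), P(X1=1)]
p_x2 = np.array([[0.6, 0.4], [0.1, 0.9]])
weights = np.array([0.5, 0.5])               # root sum node; children share scope {X1, X2} (complete)

def joint(x1, x2):
    # product nodes are decomposable: each child covers a disjoint scope (X1 vs. X2)
    return float(np.sum(weights * p_x1[:, x1] * p_x2[:, x2]))

def marginal_x1(x1):
    # marginalize X2 by summing its leaf over both states (evaluates to 1 per component)
    return float(np.sum(weights * p_x1[:, x1] * p_x2.sum(axis=1)))

def mpe():
    # MPE in an SPN replaces sum nodes with max nodes and traces back the best branch;
    # for this tiny model we simply enumerate the complete assignments.
    return max((joint(a, b), (a, b)) for a in (0, 1) for b in (0, 1))

print(joint(1, 1), marginal_x1(1), mpe())
```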


2020 ◽  
Vol 34 (10) ◽  
pp. 13931-13932
Author(s):  
Avinash Swaminathan ◽  
Raj Kuwar Gupta ◽  
Haimin Zhang ◽  
Debanjan Mahata ◽  
Rakesh Gosangi ◽  
...  

In this paper, we present a keyphrase generation approach using conditional generative adversarial networks (GANs). In our GAN model, the generator outputs a sequence of keyphrases based on the title and abstract of a scientific article. The discriminator learns to distinguish between machine-generated and human-curated keyphrases. We evaluate this approach on standard benchmark datasets. Our model achieves state-of-the-art performance in the generation of abstractive keyphrases and is also comparable to the best-performing extractive techniques. We further demonstrate that our method generates more diverse keyphrases, and we make our implementation publicly available.
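A skeleton of such a conditional text GAN, assuming a GRU encoder-decoder generator conditioned on the document and a discriminator scoring (document, keyphrase) pairs; names and sizes are illustrative and not taken from the paper's released code.

```python
# Minimal skeleton: encoder-decoder generator + (document, keyphrase) discriminator.
# Names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class KeyphraseGenerator(nn.Module):
    def __init__(self, vocab=5000, emb=128, hid=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)   # encodes title + abstract tokens
        self.decoder = nn.GRU(emb, hid, batch_first=True)   # emits keyphrase tokens
        self.proj = nn.Linear(hid, vocab)

    def forward(self, doc_ids, phrase_ids):
        _, h = self.encoder(self.embed(doc_ids))
        out, _ = self.decoder(self.embed(phrase_ids), h)     # teacher-forced decoding
        return self.proj(out)                                # (batch, len, vocab) logits

class KeyphraseDiscriminator(nn.Module):
    def __init__(self, vocab=5000, emb=128, hid=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.doc_enc = nn.GRU(emb, hid, batch_first=True)
        self.phr_enc = nn.GRU(emb, hid, batch_first=True)
        self.score = nn.Linear(2 * hid, 1)                   # human-curated (real) vs. generated (fake)

    def forward(self, doc_ids, phrase_ids):
        _, hd = self.doc_enc(self.embed(doc_ids))
        _, hp = self.phr_enc(self.embed(phrase_ids))
        return self.score(torch.cat([hd[-1], hp[-1]], dim=-1))

# usage with dummy token ids
doc = torch.randint(0, 5000, (4, 50))
phr = torch.randint(0, 5000, (4, 6))
logits = KeyphraseGenerator()(doc, phr)
real_score = KeyphraseDiscriminator()(doc, phr)
```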


2020 ◽  
Vol 34 (04) ◽  
pp. 3585-3592 ◽  
Author(s):  
Hanting Chen ◽  
Yunhe Wang ◽  
Han Shu ◽  
Changyuan Wen ◽  
Chunjing Xu ◽  
...  

Although Generative Adversarial Networks (GANs) have been widely used in various image-to-image translation tasks, they can hardly be applied on mobile devices due to their heavy computation and storage cost. Traditional network compression methods focus on visual recognition tasks but never deal with generation tasks. Inspired by knowledge distillation, a student generator with fewer parameters is trained by inheriting the low-level and high-level information from the original heavy teacher generator. To promote the capability of the student generator, we include a student discriminator to measure the distances between real images and the images generated by the student and teacher generators. An adversarial learning process is thereby established to optimize the student generator and the student discriminator. Qualitative and quantitative analyses from experiments on benchmark datasets demonstrate that the proposed method can learn portable generative models with strong performance.
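One possible shape of such a distillation step, sketched below: the student imitates the teacher's translated images pixel-wise while a student discriminator supplies an adversarial signal. For brevity the teacher's outputs stand in for the "real" side, a simplification of the paper's real/teacher/student comparison; architectures, loss weights, and names are assumptions.

```python
# Sketch of one generator-distillation step (teacher frozen, student + student-D trained).
# Architectures and loss weights are illustrative assumptions.
import torch
import torch.nn as nn

teacher_G = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.Conv2d(64, 3, 3, padding=1))
student_G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
student_D = nn.Sequential(nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
                          nn.Flatten(), nn.Linear(16 * 16 * 16, 1))

opt_G = torch.optim.Adam(student_G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(student_D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

x = torch.randn(8, 3, 32, 32)                  # source-domain batch
with torch.no_grad():
    y_teacher = teacher_G(x)                   # frozen teacher translation
y_student = student_G(x)

# Student discriminator: teacher outputs as "real", student outputs as "fake".
d_loss = bce(student_D(y_teacher), torch.ones(8, 1)) + \
         bce(student_D(y_student.detach()), torch.zeros(8, 1))
opt_D.zero_grad(); d_loss.backward(); opt_D.step()

# Student generator: imitate the teacher pixel-wise and fool the student discriminator.
g_loss = nn.functional.l1_loss(y_student, y_teacher) + \
         0.1 * bce(student_D(y_student), torch.ones(8, 1))
opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```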


Author(s):  
Tao He ◽  
Yuan-Fang Li ◽  
Lianli Gao ◽  
Dongxiang Zhang ◽  
Jingkuan Song

With the recent explosive increase of digital data, image recognition and retrieval have become critical practical applications. Hashing is an effective solution to this problem due to its low storage requirement and high query speed. However, most past work focuses on hashing in a single (source) domain. Thus, the learned hash function may not adapt well to a new (target) domain that has a large distributional difference from the source domain. In this paper, we explore an end-to-end domain-adaptive learning framework that simultaneously and precisely generates discriminative hash codes and classifies target-domain images. Our method encodes images from the two domains into a semantic common space, followed by two independent generative adversarial networks aiming at crosswise reconstructing the two domains' images, reducing domain disparity and improving alignment in the shared space. We evaluate our framework on four public benchmark datasets, all of which show that our method is superior to other state-of-the-art methods on the tasks of object recognition and image retrieval.
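A minimal sketch of the shared hashing-encoder idea: features from both domains pass through one encoder, a tanh-relaxed hash layer produces codes that are binarized at retrieval time, and a classifier operates on the codes. The cross-reconstruction GANs are omitted, and all names and sizes are illustrative assumptions.

```python
# Sketch of a shared hashing encoder over pre-extracted features from two domains.
# Names, sizes, and the ranking step are illustrative assumptions.
import torch
import torch.nn as nn

class SharedHashEncoder(nn.Module):
    def __init__(self, feat_dim=2048, code_bits=64, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU())
        self.hash_head = nn.Linear(512, code_bits)        # relaxed hash layer
        self.cls_head = nn.Linear(code_bits, num_classes) # classifies from the codes

    def forward(self, feats):
        z = self.backbone(feats)
        codes = torch.tanh(self.hash_head(z))             # in (-1, 1) during training
        return codes, self.cls_head(codes)

enc = SharedHashEncoder()
src, tgt = torch.randn(4, 2048), torch.randn(4, 2048)    # source / target domain features
codes_s, logits_s = enc(src)
codes_t, _ = enc(tgt)
binary_t = torch.sign(codes_t)                            # binarize for retrieval
scores = binary_t @ torch.sign(codes_s).t()               # inner product approximates Hamming ranking
```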


2020 ◽  
Vol 34 (07) ◽  
pp. 11588-11595 ◽  
Author(s):  
Junhao Liu ◽  
Kai Wang ◽  
Chunpu Xu ◽  
Zhou Zhao ◽  
Ruifeng Xu ◽  
...  

Image captioning is usually built on either generation-based or retrieval-based approaches. Both have certain strengths but suffer from their own limitations. In this paper, we propose an Interactive Dual Generative Adversarial Network (IDGAN) for image captioning, which combines the retrieval-based and generation-based methods to learn a better image captioning ensemble. IDGAN consists of two generators and two discriminators, where the generation-based and retrieval-based generators mutually benefit from each other's complementary targets, learned from two dual adversarial discriminators. Specifically, the generation-based and retrieval-based generators produce improved synthetic and retrieved candidate captions using informative feedback signals from the two respective discriminators, which are trained to distinguish generated captions from true captions and to assign top rankings to true captions, thus combining the merits of both retrieval-based and generation-based approaches. Extensive experiments on the MSCOCO dataset demonstrate that the proposed IDGAN model significantly outperforms the compared methods for image captioning.
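A sketch of how the two adversarial signals could be wired, assuming stand-in scorers for the two discriminators: a binary real/fake loss for the generation branch and a margin ranking loss that pushes true captions above retrieved candidates. None of the modules below come from the authors' code.

```python
# Sketch of the dual discriminator losses with stand-in (image, caption) scorers.
# All modules and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class CaptionScorer(nn.Module):
    """Scores an (image, caption) feature pair; stands in for either discriminator."""
    def __init__(self, img_dim=512, cap_dim=512):
        super().__init__()
        self.score = nn.Bilinear(img_dim, cap_dim, 1)
    def forward(self, img, cap):
        return self.score(img, cap)

d_gen, d_ret = CaptionScorer(), CaptionScorer()
img = torch.randn(8, 512)
true_cap, gen_cap, ret_cap = (torch.randn(8, 512) for _ in range(3))

# Generation branch: binary real/fake loss on generated captions.
bce = nn.BCEWithLogitsLoss()
loss_gen_d = bce(d_gen(img, true_cap), torch.ones(8, 1)) + \
             bce(d_gen(img, gen_cap), torch.zeros(8, 1))

# Retrieval branch: margin ranking loss pushing true captions above retrieved candidates.
margin = nn.MarginRankingLoss(margin=0.2)
loss_ret_d = margin(d_ret(img, true_cap), d_ret(img, ret_cap), torch.ones(8, 1))
```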

