Better Understanding: Stylized Image Captioning with Style Attention and Adversarial Training

Tibetan is a low-resource language. In order to alleviate the shortage of parallel corpus between Tibetan and Chinese, this paper uses two monolingual corpora and a small number of seed dictionaries to learn the semi-supervised method with seed dictionaries and self-supervised adversarial training method through the similarity calculation of word clusters in different embedded spaces and puts forward an improved self-supervised adversarial learning method of Tibetan and Chinese monolingual data alignment only. The experimental results are as follows. First, the experimental results of Tibetan syllables Chinese characters are not good, which reflects the weak semantic correlation between Tibetan syllables and Chinese characters; second, the seed dictionary of semi-supervised method made before 10 predicted word accuracy of 66.5 (Tibetan - Chinese) and 74.8 (Chinese - Tibetan) results, to improve the self-supervision methods in both language directions have reached 53.5 accuracy.

Download Full-text

Multi-Channel Convolutional Neural Networks with Adversarial Training for Few-Shot Relation Classification (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7256 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13967-13968

Author(s):

Yuxiang Xie ◽

Hua Xu ◽

Congcong Yang ◽

Kai Gao

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Word Embedding ◽

Vector Representation ◽

Adversarial Learning ◽

Model Experiments ◽

Robust Model ◽

Adversarial Training ◽

Relation Classification

The distant supervised (DS) method has improved the performance of relation classification (RC) by means of extending the dataset. However, DS also brings the problem of wrong labeling. Contrary to DS, the few-shot method relies on few supervised data to predict the unseen classes. In this paper, we use word embedding and position embedding to construct multi-channel vector representation and use the multi-channel convolutional method to extract features of sentences. Moreover, in order to alleviate few-shot learning to be sensitive to overfitting, we introduce adversarial learning for training a robust model. Experiments on the FewRel dataset show that our model achieves significant and consistent improvements on few-shot RC as compared with baselines.

Download Full-text

Improving Diversity of Image Captioning Through Variational Autoencoders and Adversarial Learning

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) ◽

10.1109/wacv.2019.00034 ◽

2019 ◽

Cited By ~ 1

Author(s):

Li Ren ◽

Guo-Jun Qi ◽

Kien Hua

Keyword(s):

Image Captioning ◽

Adversarial Learning

Download Full-text

Discriminative Adversarial Domain Adaptation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6054 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5940-5947 ◽

Cited By ~ 2

Author(s):

Hui Tang ◽

Kui Jia

Keyword(s):

Domain Adaptation ◽

Set Domain ◽

Target Domain ◽

Adversarial Learning ◽

Closed Set ◽

Joint Distributions ◽

Open Set ◽

Benchmark Datasets ◽

Adversarial Training ◽

Minimax Game

Given labeled instances on a source domain and unlabeled ones on a target domain, unsupervised domain adaptation aims to learn a task classifier that can well classify target instances. Recent advances rely on domain-adversarial training of deep networks to learn domain-invariant features. However, due to an issue of mode collapse induced by the separate design of task and domain classifiers, these methods are limited in aligning the joint distributions of feature and category across domains. To overcome it, we propose a novel adversarial learning method termed Discriminative Adversarial Domain Adaptation (DADA). Based on an integrated category and domain classifier, DADA has a novel adversarial objective that encourages a mutually inhibitory relation between category and domain predictions for any input instance. We show that under practical conditions, it defines a minimax game that can promote the joint distribution alignment. Except for the traditional closed set domain adaptation, we also extend DADA for extremely challenging problem settings of partial and open set domain adaptation. Experiments show the efficacy of our proposed methods and we achieve the new state of the art for all the three settings on benchmark datasets.

Download Full-text

Comparison of domain adaptation techniques for white matter hyperintensity segmentation in brain MR images

10.1101/2021.03.12.435171 ◽

2021 ◽

Author(s):

Vaanathi Sundaresan ◽

Giovanna Zamboni ◽

Nicola K. Dinsdale ◽

Peter M. Rothwell ◽

Ludovica Griffanti ◽

...

Keyword(s):

White Matter ◽

Transfer Learning ◽

Domain Adaptation ◽

White Matter Hyperintensity ◽

Mr Images ◽

Target Domain ◽

Baseline Model ◽

Adversarial Learning ◽

Adversarial Training ◽

Brain Mr Images

AbstractRobust automated segmentation of white matter hyperintensities (WMHs) in different datasets (domains) is highly challenging due to differences in acquisition (scanner, sequence), population (WMH amount and location) and limited availability of manual segmentations to train supervised algorithms. In this work we explore various domain adaptation techniques such as transfer learning and domain adversarial learning methods, including domain adversarial neural networks and domain unlearning, to improve the generalisability of our recently proposed triplanar ensemble network, which is our baseline model. We evaluated the domain adaptation techniques on source and target domains consisting of 5 different datasets with variations in intensity profile, lesion characteristics and acquired using different scanners. For transfer learning, we also studied various training options such as minimal number of unfrozen layers and subjects required for finetuning in the target domain. On comparing the performance of different techniques on the target dataset, unsupervised domain adversarial training of neural network gave the best performance, making the technique promising for robust WMH segmentation.

Download Full-text

Visual Question Answering Through Adversarial Learning of Multi-modal Representation

10.36227/techrxiv.12731948 ◽

2020 ◽

Author(s):

Iqbal Chowdhury ◽

Kien Nguyen Thanh ◽

Clinton fookes ◽

Sridha Sridharan

Keyword(s):

Natural Language ◽

Question Answering ◽

Feature Representation ◽

Fusion Method ◽

Adversarial Learning ◽

Visual Question Answering ◽

Proposed Model ◽

Fusion Methods ◽

Adversarial Training ◽

Multimodal Representation

Solving the Visual Question Answering (VQA) task is a step towards achieving human-like reasoning capability of the machines. This paper proposes an approach to learn multimodal feature representation with adversarial training. The purpose of the adversarial training allows the model to learn from standard fusion methods in an unsupervised manner. The discriminator model is equipped with a siamese combinatin of two standard fusion method namely multimodal compact bilinear pooling and multimodal tucker fusion. Output multimodal feature representation from generator is a resultant of graph convolutional operation. The resultant multimodal representation of the adversarial training allows the proposed model to infer the correct answers from open-ended natural language questions from the VQA 2.0 dataset. An overall accuracy of 69.86\% demonstrates the accuracy of the proposed model.

Download Full-text

Visual Question Answering Through Adversarial Learning of Multi-modal Representation

10.36227/techrxiv.12731948.v1 ◽

2020 ◽

Author(s):

Iqbal Chowdhury ◽

Kien Nguyen Thanh ◽

Clinton fookes ◽

Sridha Sridharan

Keyword(s):

Natural Language ◽

Question Answering ◽

Feature Representation ◽

Fusion Method ◽

Adversarial Learning ◽

Visual Question Answering ◽

Proposed Model ◽

Fusion Methods ◽

Adversarial Training ◽

Multimodal Representation

Solving the Visual Question Answering (VQA) task is a step towards achieving human-like reasoning capability of the machines. This paper proposes an approach to learn multimodal feature representation with adversarial training. The purpose of the adversarial training allows the model to learn from standard fusion methods in an unsupervised manner. The discriminator model is equipped with a siamese combinatin of two standard fusion method namely multimodal compact bilinear pooling and multimodal tucker fusion. Output multimodal feature representation from generator is a resultant of graph convolutional operation. The resultant multimodal representation of the adversarial training allows the proposed model to infer the correct answers from open-ended natural language questions from the VQA 2.0 dataset. An overall accuracy of 69.86\% demonstrates the accuracy of the proposed model.

Download Full-text

Adversarial Training for Image Captioning Incorporating Relation Attention

10.1007/978-3-030-89188-6_15 ◽

2021 ◽

pp. 197-211

Author(s):

Tianyu Chen ◽

Zhixin Li ◽

Canlong Zhang ◽

Huifang Ma

Keyword(s):

Image Captioning ◽

Adversarial Training

Download Full-text

Self-adversarial Training and Attention for Multi-task Wheat Phenotyping

Applied Engineering in Agriculture ◽

10.13031/aea.13406 ◽

2019 ◽

Vol 35 (6) ◽

pp. 1009-1014 ◽

Cited By ~ 1

Author(s):

Gensheng Hu ◽

Lidong Qian ◽

Dong Liang ◽

Mingzhu Wan

Keyword(s):

Deep Learning ◽

Precision Agriculture ◽

Attention Mechanism ◽

Learning Ability ◽

Learning Networks ◽

Data Set ◽

Adversarial Networks ◽

Previous State ◽

Adversarial Training ◽

Agriculture Management

Abstract. Phenotypic monitoring provides important data support for precision agriculture management. This study proposes a deep learning-based method to gain an accurate count of wheat ears and spikelets. The deep learning networks incorporate self-adversarial training and attention mechanism with stacked hourglass networks. Four stacked hourglass networks follow a holistic attention map to construct a generator of self-adversarial networks. The holistic attention maps enable the networks to focus on the overall consistency of the whole wheat. The discriminator of self-adversarial networks displays the same structure as the generator, which causes adversarial loss to the generator. This process improves the generator’s learning ability and prediction accuracy for occluded wheat ears. This method yields higher wheat ear count in the Annotated Crop Image Database (ACID) data set than the previous state-of-the-art algorithm. Keywords: Attention mechanism, Plant phenotype, Self-adversarial networks, Stacked hourglass.

Download Full-text

Improving Hypernymy Prediction via Taxonomy Enhanced Adversarial Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017128 ◽

2019 ◽

Vol 33 ◽

pp. 7128-7135

Author(s):

Chengyu Wang ◽

Xiaofeng He ◽

Aoying Zhou

Keyword(s):

Computational Linguistics ◽

Prediction Models ◽

Semantic Relation ◽

Semantic Relations ◽

Training Algorithm ◽

Adversarial Learning ◽

Generic Concept ◽

Adversarial Training ◽

Basic Semantic ◽

Projection Network

Hypernymy is a basic semantic relation in computational linguistics that expresses the “is-a” relation between a generic concept and its specific instances, serving as the backbone in taxonomies and ontologies. Although several NLP tasks related to hypernymy prediction have been extensively addressed, few methods have fully exploited the large number of hypernymy relations in Web-scale taxonomies.In this paper, we introduce the Taxonomy Enhanced Adversarial Learning (TEAL) for hypernymy prediction. We first propose an unsupervised measure U-TEAL to distinguish hypernymy with other semantic relations. It is implemented based on a word embedding projection network distantly trained over a taxonomy. To address supervised hypernymy detection tasks, the supervised model S-TEAL and its improved version, the adversarial supervised model AS-TEAL, are further presented. Specifically, AS-TEAL employs a coupled adversarial training algorithm to transfer hierarchical knowledge in taxonomies to hypernymy prediction models. We conduct extensive experiments to confirm the effectiveness of TEAL over three standard NLP tasks: unsupervised hypernymy classification, supervised hypernymy detection and graded lexical entailment. We also show that TEAL can be applied to non-English languages and can detect missing hypernymy relations in taxonomies.

Download Full-text