Deep Multi-Task Learning with Adversarial-and-Cooperative Nets

Author(s):  
Pei Yang ◽  
Qi Tan ◽  
Jieping Ye ◽  
Hanghang Tong ◽  
Jingrui He

In this paper, we propose a deep multi-Task learning model based on Adversarial-and-COoperative nets (TACO). The goal is to use an adversarial-and-cooperative strategy to decouple task-common and task-specific knowledge, facilitating fine-grained knowledge sharing among tasks. TACO accommodates multiple game players, i.e., feature extractors, a domain discriminator, and tri-classifiers. They play minimax games adversarially and cooperatively to distill task-common and task-specific features while respecting their discriminative structures. Moreover, TACO adopts a divide-and-combine strategy that leverages the decoupled multi-view information to further improve the generalization performance of the model. Experimental results show that the proposed method significantly outperforms state-of-the-art algorithms on benchmark datasets in both multi-task learning and semi-supervised domain adaptation scenarios.
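
To make the adversarial-and-cooperative idea concrete, here is a minimal PyTorch sketch of one training step, assuming a gradient-reversal layer for the adversarial part; the module names and dimensions are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

shared = nn.Sequential(nn.Linear(128, 64), nn.ReLU())           # task-common extractor
specific = nn.ModuleList([nn.Sequential(nn.Linear(128, 64), nn.ReLU())
                          for _ in range(3)])                   # per-task extractors
discriminator = nn.Linear(64, 3)        # guesses which task a common feature came from
heads = nn.ModuleList([nn.Linear(128, 10) for _ in range(3)])   # per-task classifiers

def taco_style_step(x, y, task_id):
    common = shared(x)
    private = specific[task_id](x)
    # Adversarial part: gradient reversal pushes the shared extractor to
    # produce features the task discriminator cannot tell apart.
    adv_loss = F.cross_entropy(
        discriminator(GradReverse.apply(common)),
        torch.full((x.size(0),), task_id, dtype=torch.long))
    # Cooperative part: common and private features jointly predict labels.
    cls_loss = F.cross_entropy(heads[task_id](torch.cat([common, private], dim=1)), y)
    return cls_loss + adv_loss
```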


Author(s):  
Ziliang Cai ◽  
Lingyue Wang ◽  
Miaomiao Guo ◽  
Guizhi Xu ◽  
Lei Guo ◽  
...  

Emotion plays a significant role in human daily activities, and it can be effectively recognized from EEG signals. However, individual variability limits the generalization of emotion classifiers across subjects. Domain adaptation (DA) is a reliable method to address this issue. Owing to the nonstationarity of EEG, low-quality source-domain data introduce negative transfer into DA procedures. To solve this problem, an auto-augmentation joint distribution adaptation (AA-JDA) method and a burden-lightened and source-preferred JDA (BLSP-JDA) approach are proposed in this paper. The methods are based on a novel transfer idea: learning target-domain-specific knowledge from the samples that are most appropriate for transfer, which reduces the difficulty of transferring between the two domains. On multiple emotion databases, our model shows state-of-the-art performance.
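
As a rough illustration of the "source-preferred" selection idea (not the AA-JDA/BLSP-JDA algorithms themselves), the sketch below scores each source sample by how target-like it appears to a domain classifier and keeps only the most transferable ones; `select_transferable` and `keep_ratio` are hypothetical names.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_transferable(source_X, target_X, keep_ratio=0.7):
    """Rank source samples by how target-like they appear and keep the best."""
    # Train a domain classifier: label 0 = source, 1 = target.
    X = np.vstack([source_X, target_X])
    d = np.r_[np.zeros(len(source_X)), np.ones(len(target_X))]
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    # Source samples that the classifier mistakes for target samples lie
    # close to the target distribution and are cheaper to transfer.
    target_likeness = clf.predict_proba(source_X)[:, 1]
    keep = np.argsort(-target_likeness)[:int(keep_ratio * len(source_X))]
    return source_X[keep], keep
```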


2020 ◽  
Vol 34 (07) ◽  
pp. 12605-12612 ◽  
Author(s):  
Jie Yang ◽  
Zhiquan Qi ◽  
Yong Shi

This paper develops a multi-task learning framework that incorporates image structure knowledge to assist image inpainting, which is not well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and its corresponding structures (edge and gradient), thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. Meanwhile, we introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding, and attention, our framework takes advantage of structure knowledge and outperforms several state-of-the-art methods on benchmark datasets both quantitatively and qualitatively.
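
A minimal sketch of the shared-generator and pyramid-loss ideas might look as follows; the layer sizes, output heads, and pooling scales are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedGenerator(nn.Module):
    """One backbone, three task heads: completed image, edge map, gradients."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.image_head = nn.Conv2d(32, 3, 3, padding=1)  # completed RGB image
        self.edge_head = nn.Conv2d(32, 1, 3, padding=1)   # edge map
        self.grad_head = nn.Conv2d(32, 2, 3, padding=1)   # x/y image gradients

    def forward(self, masked_image, mask):
        h = self.backbone(torch.cat([masked_image, mask], dim=1))
        return self.image_head(h), self.edge_head(h), self.grad_head(h)

def pyramid_l1(pred, target, scales=(1, 2, 4)):
    """L1 structure loss accumulated over several downsampled resolutions."""
    loss = 0.0
    for s in scales:
        p = F.avg_pool2d(pred, s) if s > 1 else pred
        t = F.avg_pool2d(target, s) if s > 1 else target
        loss = loss + F.l1_loss(p, t)
    return loss
```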


2020 ◽  
Vol 34 (08) ◽  
pp. 13267-13272
Author(s):  
Alex Foo ◽  
Wynne Hsu ◽  
Mong Li Lee ◽  
Gilbert Lim ◽  
Tien Yin Wong

Although deep learning for Diabetic Retinopathy (DR) screening has shown great success in achieving clinically acceptable accuracy for referable versus non-referable DR, there remains a need for more fine-grained grading of DR severity as well as automated segmentation of any lesions in the retina images. We observe that the DR severity level of an image depends on the types of lesions present and their prevalence. In this work, we adopt a multi-task learning approach to perform the DR grading and lesion segmentation tasks. In light of the lack of ground-truth lesion segmentation masks, we further propose a semi-supervised learning process to obtain segmentation masks for the various datasets. Experimental results on publicly available datasets and a real-world dataset obtained from population screening demonstrate the effectiveness of the multi-task solution over state-of-the-art networks.
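
The multi-task setup described above can be sketched as a shared encoder feeding a grading head and a segmentation head, plus a generic confidence-thresholded pseudo-masking step for images without ground truth; the architecture and threshold below are assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class DRMultiTaskNet(nn.Module):
    def __init__(self, num_grades=5, num_lesions=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.grade_head = nn.Sequential(          # DR severity classifier
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_grades))
        self.seg_head = nn.Sequential(            # per-pixel lesion logits
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, num_lesions, 4, stride=2, padding=1))

    def forward(self, x):
        h = self.encoder(x)
        return self.grade_head(h), self.seg_head(h)

@torch.no_grad()
def pseudo_masks(model, images, threshold=0.9):
    """Keep only confident per-pixel predictions as pseudo ground truth."""
    _, seg_logits = model(images)
    probs = seg_logits.sigmoid()
    pseudo = (probs > threshold).float()
    confident = (probs > threshold) | (probs < 1 - threshold)
    return pseudo, confident
```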


Author(s):  
Yonghao Xu ◽  
Bo Du ◽  
Lefei Zhang ◽  
Qian Zhang ◽  
Guoli Wang ◽  
...  

Recent years have witnessed the great success of deep learning models in semantic segmentation. Nevertheless, these models may not generalize well to unseen image domains due to the phenomenon of domain shift. Since pixel-level annotations are laborious to collect, developing algorithms that can adapt labeled data from a source domain to a target domain is of great significance. To this end, we propose self-ensembling attention networks to reduce the domain gap between different datasets. To the best of our knowledge, the proposed method is the first attempt to introduce a self-ensembling model to domain adaptation for semantic segmentation, which provides a different view on how to learn domain-invariant features. Besides, since different regions in an image usually correspond to different levels of domain gap, we introduce an attention mechanism into the proposed framework to generate attention-aware features, which are further utilized to guide the calculation of the consistency loss in the target domain. Experiments on two benchmark datasets demonstrate that the proposed framework yields competitive performance compared with state-of-the-art methods.
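
The self-ensembling component is closely related to the mean-teacher scheme, so a hedged sketch might pair an exponential-moving-average teacher with an attention-weighted consistency loss; the weighting rule below is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def ema_update(teacher, student, decay=0.99):
    """Teacher weights track an exponential moving average of the student's.
    Typical usage: teacher = copy.deepcopy(student), then call this after
    every student optimizer step."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

def attention_consistency(student_logits, teacher_logits, attention_map):
    # Per-pixel consistency between student and teacher predictions on the
    # target domain, re-weighted so that regions with a larger estimated
    # domain gap (higher attention) contribute more to the loss.
    per_pixel = F.mse_loss(student_logits.softmax(dim=1),
                           teacher_logits.softmax(dim=1),
                           reduction='none').mean(dim=1)   # (B, H, W)
    return (attention_map * per_pixel).mean()
```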


Author(s):  
Wenya Wang ◽  
Sinno Jialin Pan

In fine-grained opinion mining, aspect and opinion term extraction has become a fundamental task that provides key information about user-generated texts. Despite its importance, a lack of annotated resources in many domains impedes the ability to train a precise model. Very few attempts have applied unsupervised domain adaptation methods to transfer fine-grained knowledge (at the word level) from labeled source domain(s) to an unlabeled target domain. Existing methods depend on the construction of "pivot" knowledge, e.g., common opinion terms or syntactic relations between aspect and opinion words. In this work, we propose an interactive memory network that consists of local and global memory units. The model exploits both local and global memory interactions to capture intra-correlations among aspect words or opinion words themselves, as well as the interconnections between aspect and opinion words. The source and target spaces are aligned through these domain-invariant interactions by incorporating an auxiliary task and domain adversarial networks. The proposed model requires no external resources and demonstrates promising results on three benchmark datasets.
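
As a speculative sketch of local and global memory interactions (the paper's exact architecture may differ), each word representation below attends both within its own sentence and over a set of learned global memory slots; `MemoryInteraction` and the slot count are hypothetical.

```python
import torch
import torch.nn as nn

class MemoryInteraction(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.local_attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.global_memory = nn.Parameter(torch.randn(32, dim))  # learned global slots
        self.global_attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, words):                 # words: (batch, seq_len, dim)
        # Local interaction: intra-sentence correlations among aspect/opinion words.
        local, _ = self.local_attn(words, words, words)
        # Global interaction: each word reads from shared, corpus-level memory.
        g = self.global_memory.unsqueeze(0).expand(words.size(0), -1, -1)
        global_out, _ = self.global_attn(words, g, g)
        return words + local + global_out     # residual combination
```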


Author(s):  
Pin Jiang ◽  
Aming Wu ◽  
Yahong Han ◽  
Yunfeng Shao ◽  
Meiyu Qi ◽  
...  

Semi-supervised domain adaptation (SSDA) is a branch of machine learning in which, unlike in unsupervised domain adaptation, a few labeled target examples are available. To make effective use of these additional data and bridge the domain gap, one possible approach is to generate adversarial examples (images with additional perturbations) between the two domains to fill the domain gap. Adversarial training has proven to be a powerful method for this purpose. However, traditional adversarial training either adds noise in arbitrary directions, which is inefficient for migrating between domains, or generates directional noise only from the source to the target domain or the reverse. In this work, we devise a general bidirectional adversarial training method that employs gradients to guide adversarial examples across the domain gap: Adaptive Adversarial Training (AAT) for the source-to-target direction and Entropy-penalized Virtual Adversarial Training (E-VAT) for the target-to-source direction. In particular, we devise a Bidirectional Adversarial Training (BiAT) network to perform these diverse adversarial training procedures jointly. We evaluate BiAT on three benchmark datasets, and experimental results demonstrate that the proposed method achieves state-of-the-art performance.
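
The target-to-source direction builds on virtual adversarial training (VAT), a standard technique; the sketch below shows how a gradient-guided VAT perturbation is computed, omitting BiAT's entropy penalty and bidirectional scheduling.

```python
import torch
import torch.nn.functional as F

def vat_perturbation(model, x, xi=1e-6, eps=1.0):
    """Gradient-guided adversarial direction for an image batch x (B, C, H, W)."""
    with torch.no_grad():
        p = model(x).softmax(dim=1)                     # current predictions
    # Start from a small random direction, normalized per sample.
    d = torch.randn_like(x)
    d = xi * d / d.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
    d.requires_grad_(True)
    # KL divergence between predictions on x and x + d measures how strongly
    # the perturbation changes the model's output.
    kl = F.kl_div(model(x + d).log_softmax(dim=1), p, reduction='batchmean')
    grad, = torch.autograd.grad(kl, d)
    # Scale the steepest direction to radius eps; the virtual adversarial
    # example is then x + r_adv.
    r_adv = eps * grad / grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
    return r_adv.detach()
```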


Author(s):  
Jun Wen ◽  
Risheng Liu ◽  
Nenggan Zheng ◽  
Qian Zheng ◽  
Zhefeng Gong ◽  
...  

Unsupervised domain adaptation methods aim to alleviate the performance degradation caused by domain shift by learning domain-invariant representations. Existing deep domain adaptation methods focus on holistic feature alignment, matching source and target holistic feature distributions without considering local features and their multi-mode statistics. We show that learned local feature patterns are more generic and transferable, and that further matching local feature distributions enables fine-grained feature alignment. In this paper, we present a method for learning domain-invariant local feature patterns and jointly aligning holistic and local feature statistics. Comparisons with state-of-the-art unsupervised domain adaptation methods on two popular benchmark datasets demonstrate the superiority of our approach and its effectiveness in alleviating negative transfer.
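
One plausible way to realize joint holistic and local alignment is to apply a distribution-matching penalty at both granularities; the sketch below uses a simple linear-kernel MMD as a stand-in for whatever matching criterion the paper employs, and the weighting is illustrative.

```python
import torch

def mmd(x, y):
    """Linear-kernel maximum mean discrepancy between two feature batches."""
    return (x.mean(dim=0) - y.mean(dim=0)).pow(2).sum()

def alignment_loss(src_maps, tgt_maps, local_weight=0.5):
    # src_maps, tgt_maps: (B, C, H, W) convolutional feature maps.
    holistic = mmd(src_maps.mean(dim=(2, 3)), tgt_maps.mean(dim=(2, 3)))
    # Treat every spatial position as one local feature vector, so the
    # penalty also matches the distribution of local patterns.
    s_local = src_maps.permute(0, 2, 3, 1).reshape(-1, src_maps.size(1))
    t_local = tgt_maps.permute(0, 2, 3, 1).reshape(-1, tgt_maps.size(1))
    return holistic + local_weight * mmd(s_local, t_local)
```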


2020 ◽  
Vol 34 (07) ◽  
pp. 12152-12159
Author(s):  
Hao Wang ◽  
Cheng Deng ◽  
Fan Ma ◽  
Yi Yang

Actor and action video segmentation with language queries aims to segment out the objects referred to by the expression in a video. This process requires comprehensive language reasoning and fine-grained video understanding. Previous methods mainly leverage dynamic convolutional networks to match visual and semantic representations. However, dynamic convolution neglects spatial context when processing each region in the frame and thus struggles to segment similar objects in complex scenarios. To address this limitation, we construct a context-modulated dynamic convolutional network. Specifically, we propose a context-modulated dynamic convolutional operation in which the kernels for a specific region are generated from both the language sentence and surrounding context features. Moreover, we devise a temporal encoder to incorporate motion into the visual features to further match the query descriptions. Extensive experiments on two benchmark datasets, Actor-Action Dataset Sentences (A2D Sentences) and J-HMDB Sentences, demonstrate that our proposed approach notably outperforms state-of-the-art methods.
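
A minimal sketch of a context-modulated dynamic convolution follows: the per-sample kernel is generated from the sentence embedding and modulated by pooled visual context before being applied as a grouped convolution; the fusion rule and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextDynamicConv(nn.Module):
    """Per-sample kernels from language, modulated by pooled visual context."""
    def __init__(self, feat_dim=64, lang_dim=300, ksize=3):
        super().__init__()
        self.ksize = ksize
        self.kernel_gen = nn.Linear(lang_dim, feat_dim * ksize * ksize)
        self.context_mod = nn.Linear(feat_dim, feat_dim * ksize * ksize)

    def forward(self, feats, sentence):
        # feats: (B, C, H, W) frame features; sentence: (B, lang_dim) embedding.
        B, C, H, W = feats.shape
        context = feats.mean(dim=(2, 3))              # pooled visual context
        kernels = self.kernel_gen(sentence) * torch.sigmoid(self.context_mod(context))
        kernels = kernels.view(B, C, self.ksize, self.ksize)
        # One dynamic kernel per sample, applied via a grouped convolution.
        out = F.conv2d(feats.reshape(1, B * C, H, W), kernels,
                       padding=self.ksize // 2, groups=B)
        return out.view(B, 1, H, W)                   # per-pixel response map
```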


Computation ◽  
2019 ◽  
Vol 7 (2) ◽  
pp. 20 ◽  
Author(s):  
Yunfei Teng ◽  
Anna Choromanska

Unsupervised image-to-image translation aims at finding a mapping between the source (A) and target (B) image domains, where, in many applications, aligned image pairs are not available at training time. This is an ill-posed learning problem since it requires inferring the joint probability distribution from marginals. Joint learning of coupled mappings F_AB : A → B and F_BA : B → A is commonly used by state-of-the-art methods such as CycleGAN, which learn this translation by introducing a cycle-consistency requirement into the learning problem, i.e., F_AB(F_BA(B)) ≈ B and F_BA(F_AB(A)) ≈ A. Cycle consistency enforces the preservation of mutual information between input and translated images, but it does not explicitly enforce F_BA to be the inverse of F_AB. We propose a new deep architecture that we call the invertible autoencoder (InvAuto) to explicitly enforce this relation. This is done by forcing the encoder to be an inverted version of the decoder, where corresponding layers perform opposite mappings and share parameters. The mappings are constrained to be orthonormal. The resulting architecture reduces the number of trainable parameters by up to a factor of two. We present image translation results on benchmark datasets and demonstrate state-of-the-art performance of our approach. Finally, we test the proposed domain adaptation method on the task of road video conversion. We demonstrate that videos converted with InvAuto are of high quality, and show that the NVIDIA neural-network-based end-to-end learning system for autonomous driving, known as PilotNet, trained on real road videos performs well when tested on the converted ones.
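
The weight-sharing idea can be sketched directly: each decoder layer reuses the transpose of the corresponding encoder weight, and an orthonormality penalty pushes the transpose toward a true inverse; the layer sizes and penalty form below are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InvertibleAutoencoder(nn.Module):
    def __init__(self, dims=(784, 256, 64)):
        super().__init__()
        # Only encoder weights are stored; decoder layers reuse their
        # transposes, roughly halving the number of trainable parameters.
        self.weights = nn.ParameterList(
            [nn.Parameter(torch.empty(o, i)) for i, o in zip(dims, dims[1:])])
        for w in self.weights:
            nn.init.orthogonal_(w)

    def encode(self, x):
        for w in self.weights:
            x = torch.tanh(F.linear(x, w))
        return x

    def decode(self, z):
        for w in reversed(self.weights):
            z = torch.tanh(F.linear(z, w.t()))   # shared, transposed weight
        return z

    def orthonormal_penalty(self):
        # Encourages W W^T ≈ I so that the transpose acts as a true inverse.
        return sum(((w @ w.t()) - torch.eye(w.size(0))).pow(2).sum()
                   for w in self.weights)
```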

