One Network for Multi-Domains: Domain Adaptive Hashing with Intersectant Generative Adversarial Networks

With the recent explosive growth of digital data, image recognition and retrieval have become critical practical applications. Hashing is an effective solution to this problem because of its low storage requirement and high query speed. However, most past work focuses on hashing in a single (source) domain, so the learned hash function may not adapt well to a new (target) domain whose distribution differs substantially from that of the source domain. In this paper, we explore an end-to-end domain adaptive learning framework that simultaneously and precisely generates discriminative hash codes and classifies target domain images. Our method encodes images from the two domains into a common semantic space, followed by two independent generative adversarial networks that aim at crosswise reconstruction of the two domains’ images, reducing domain disparity and improving alignment in the shared space. We evaluate our framework on four public benchmark datasets, all of which show that our method is superior to other state-of-the-art methods on the tasks of object recognition and image retrieval.
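The storage and query-speed argument for hashing can be illustrated with a minimal sketch. It assumes sign thresholding to binarize features and Hamming distance for ranking, which are common choices but are not stated in the abstract; the feature vectors and item names below are hypothetical.

```python
def binarize(features):
    """Pack a real-valued feature vector into an integer whose bits form the hash code."""
    code = 0
    for value in features:
        # Assumption: sign thresholding, bit = 1 for non-negative features, else 0.
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two hash codes (XOR, then popcount)."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database, top_k=2):
    """Return the names of the top_k database items closest to the query."""
    ranked = sorted(database, key=lambda name: hamming(query_code, database[name]))
    return ranked[:top_k]


# Hypothetical 4-dimensional features; a real system would take them from the encoder.
database = {
    "cat_1": binarize([0.9, -0.2, 0.4, -0.7]),
    "cat_2": binarize([0.8, -0.1, 0.3, 0.2]),
    "car_1": binarize([-0.5, 0.6, -0.3, 0.9]),
}
query = binarize([0.7, -0.3, 0.5, -0.6])
print(retrieve(query, database))  # ['cat_1', 'cat_2']
```

Each item is stored as a handful of bits and compared with one XOR plus a bit count, which is where the low storage cost and fast queries come from.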


Data-Efficient Domain Adaptation for Semantic Segmentation of Aerial Imagery Using Generative Adversarial Networks

Applied Sciences ◽

10.3390/app10031092 ◽

2020 ◽

Vol 10 (3) ◽

pp. 1092 ◽

Cited By ~ 2

Author(s):

Bilel Benjdira ◽

Adel Ammar ◽

Anis Koubaa ◽

Kais Ouni

Keyword(s):

Domain Adaptation ◽

Semantic Segmentation ◽

Aerial Imagery ◽

Generative Adversarial Networks ◽

Target Domain ◽

Source Domain ◽

Adversarial Networks ◽

Semantic Label ◽

Global Accuracy ◽

The Cost

Despite the significant advances in semantic segmentation of aerial imagery, a considerable limitation is blocking its adoption in real cases. If we test a segmentation model on a new area that is not included in its initial training set, accuracy decreases markedly. This is caused by the domain shift between the new target domain and the source domain used to train the model. In this paper, we address this challenge and propose a new algorithm that uses a Generative Adversarial Network (GAN) architecture to minimize the domain shift and increase the ability of the model to work on new target domains. The proposed architecture contains two GANs. The first GAN converts the chosen image from the target domain into a semantic label. The second GAN converts this generated semantic label into an image that belongs to the source domain but conserves the semantic map of the target image. The resulting image is then used by the semantic segmentation model to generate a better semantic label of the chosen image. Our algorithm is tested on the ISPRS semantic segmentation dataset and improves the global accuracy by up to 24% when passing from the Potsdam domain to the Vaihingen domain. This margin can be increased by adding other labeled data from the target domain. To minimize the cost of supervision in the translation process, we propose a methodology to use these labeled data efficiently.
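The data flow through the two GANs and the segmentation model can be sketched as a simple function composition. The three stages below are hypothetical toy stand-ins (thresholds on pixel intensities) rather than trained networks; they only show the order in which the stages are chained.

```python
def adapt_and_segment(target_image, image_to_label, label_to_source, segment):
    """Chain the three stages described in the abstract."""
    coarse_label = image_to_label(target_image)  # first GAN: target image -> semantic label
    source_like = label_to_source(coarse_label)  # second GAN: label -> source-style image
    return segment(source_like)                  # segmentation model trained on the source


# Toy stand-ins (assumptions): an "image" is a list of intensities, a "label" a list of 0/1.
def toy_image_to_label(image):
    return [1 if pixel > 0.2 else 0 for pixel in image]


def toy_label_to_source(label):
    return [0.8 if cls == 1 else 0.2 for cls in label]


def toy_source_segmenter(image):
    # Tuned for source imagery, where the foreground class is brighter than 0.5.
    return [1 if pixel > 0.5 else 0 for pixel in image]


dark_target_image = [0.1, 0.3, 0.1, 0.35]
print(toy_source_segmenter(dark_target_image))
print(adapt_and_segment(dark_target_image, toy_image_to_label,
                        toy_label_to_source, toy_source_segmenter))
```

In this toy setting the source-tuned segmenter misses every foreground pixel of the darker target image, while the translated image is segmented as intended.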


TriGAN: image-to-image translation for multi-source domain adaptation

Machine Vision and Applications ◽

10.1007/s00138-020-01164-4 ◽

2021 ◽

Vol 32 (1) ◽

Author(s):

Subhankar Roy ◽

Aliaksandr Siarohin ◽

Enver Sangineto ◽

Nicu Sebe ◽

Elisa Ricci

Keyword(s):

Domain Adaptation ◽

Image Features ◽

Generative Adversarial Networks ◽

Source Image ◽

Multiple Sources ◽

Target Domain ◽

Invariant Representation ◽

Source Domain ◽

Practical Applications ◽

Adversarial Networks

Most domain adaptation methods consider the problem of transferring knowledge to the target domain from a single-source dataset. However, in practical applications, we typically have access to multiple sources. In this paper we propose the first approach for multi-source domain adaptation (MSDA) based on generative adversarial networks. Our method is inspired by the observation that the appearance of a given image depends on three factors: the domain, the style (characterized in terms of low-level feature variations) and the content. For this reason, we propose to project the source image features onto a space where only the dependence on the content is kept, and then re-project this invariant representation onto the pixel space using the target domain and style. In this way, new labeled images can be generated which are used to train a final target classifier. We test our approach using common MSDA benchmarks, showing that it outperforms state-of-the-art methods.


Digital Transformation

The Data Imperative ◽

10.1093/oso/9780198840817.003.0001 ◽

2020 ◽

pp. 1-18

Author(s):

Henri Schildt

Keyword(s):

Language Processing ◽

Digital Transformation ◽

Digital Data ◽

Generative Adversarial Networks ◽

Adversarial Networks ◽

Geographic Boundaries ◽

Technological Advances ◽

Data Flows ◽

Speed Up ◽

Systematic Shift

The introductory chapter to the book The Data Imperative examines how technological advances together with a new managerial mindset are driving digital transformation. While early business information systems were often self-contained and designed to solve specific problems, contemporary systems are highly interconnected and integrated. Corporations can use data flows to coordinate diverse processes and activities across organizational and geographic boundaries. The chapter explains how digital transformation involves a systematic shift from predominant reliance on human knowledge and skills to digital data flows and smart algorithms. Artificial intelligence techniques, such as generative adversarial networks and advanced natural language processing, and 5G wireless technologies create new opportunities to replace human routines with algorithmic processing. Data will continue to break down organizational silos, enable deeper collaboration across company boundaries, and speed up the development of new services.


Multi-Attribute Transfer via Disentangled Representation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019195 ◽

2019 ◽

Vol 33 ◽

pp. 9195-9202 ◽

Cited By ~ 4

Author(s):

Jianfu Zhang ◽

Yuanyuan Huang ◽

Yaoyi Li ◽

Weijie Zhao ◽

Liqing Zhang

Keyword(s):

Neural Network ◽

Facial Expression ◽

Generative Adversarial Networks ◽

Significant Progress ◽

Target Domain ◽

Adversarial Networks ◽

Proposed Model ◽

Image Translation ◽

Realistic Images ◽

Novel Model

Recent studies show significant progress in the image-to-image translation task, especially facilitated by Generative Adversarial Networks. They can synthesize highly realistic images and alter the attribute labels for the images. However, these works employ attribute vectors to specify the target domain, which diminishes image-level attribute diversity. In this paper, we propose a novel model formulating disentangled representations by projecting images to latent units, which are grouped feature channels of a Convolutional Neural Network, to disassemble the information between different attributes. Thanks to the disentangled representation, we can transfer attributes according to the attribute labels and moreover retain the diversity beyond the labels, namely, the styles inside each image. This is achieved by specifying some attributes and swapping the corresponding latent units to “swap” the attributes’ appearance, or applying channel-wise interpolation to blend different attributes. To verify the motivation of our proposed model, we train and evaluate our model on the face dataset CelebA. Furthermore, the evaluation on another facial expression dataset, RaFD, demonstrates the generalizability of our proposed model.
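The two operations named in the abstract, swapping latent units and channel-wise interpolation, can be sketched on plain lists. The layout is an assumption made for illustration: each latent code is a list of units, each unit is a list of channel values, and one unit corresponds to one attribute.

```python
def swap_units(latent_a, latent_b, unit_ids):
    """Exchange the selected latent units between two images' representations."""
    new_a = [list(unit) for unit in latent_a]
    new_b = [list(unit) for unit in latent_b]
    for i in unit_ids:
        new_a[i], new_b[i] = new_b[i], new_a[i]
    return new_a, new_b


def blend_units(latent_a, latent_b, unit_ids, alpha):
    """Channel-wise linear interpolation on selected units; others are copied from latent_a."""
    blended = [list(unit) for unit in latent_a]
    for i in unit_ids:
        blended[i] = [(1 - alpha) * x + alpha * y
                      for x, y in zip(latent_a[i], latent_b[i])]
    return blended


# Hypothetical codes with three units (attributes) of two channels each.
face_a = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
face_b = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]

swapped_a, swapped_b = swap_units(face_a, face_b, unit_ids=[1])
print(swapped_a)  # unit 1 now comes from face_b
print(blend_units(face_a, face_b, unit_ids=[0], alpha=0.5))
```

A decoder (not shown) would map the edited codes back to images; units left untouched keep the style of the original image.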


Deep image synthesis from intuitive user input: A review and perspectives

Computational Visual Media ◽

10.1007/s41095-021-0234-8 ◽

2021 ◽

Vol 8 (1) ◽

pp. 3-31

Author(s):

Yuan Xue ◽

Yuan-Chen Guo ◽

Han Zhang ◽

Tao Xu ◽

Song-Hai Zhang ◽

...

Keyword(s):

Image Synthesis ◽

Generative Models ◽

Generative Adversarial Networks ◽

Image Generation ◽

Art And Design ◽

User Input ◽

Adversarial Networks ◽

Benchmark Datasets ◽

Deep Image ◽

Realistic Images

In many applications of computer graphics, art, and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph, or layout, and have a computer system automatically generate photo-realistic images according to that input. While classically, works that allow such automatic image content generation have followed a framework of image retrieval and composition, recent advances in deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), and flow-based methods have enabled more powerful and versatile image generation approaches. This paper reviews recent works for image synthesis given intuitive user input, covering advances in input versatility, image generation methodology, benchmark datasets, and evaluation metrics. This motivates new perspectives on input representation and interactivity, cross fertilization between major image generation paradigms, and evaluation and comparison of generation methods.


Deep Convolutional Sum-Product Networks

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013248 ◽

2019 ◽

Vol 33 ◽

pp. 3248-3255

Author(s):

Cory J. Butz ◽

Jhonatan S. Oliveira ◽

André E. Dos Santos ◽

André L. Teixeira

Keyword(s):

Probabilistic Reasoning ◽

Mean Squared Error ◽

Salient Feature ◽

Generative Adversarial Networks ◽

Image Sampling ◽

Adversarial Networks ◽

Squared Error ◽

Benchmark Datasets ◽

Most Probable Explanation ◽

Marginal Inference

We give conditions under which convolutional neural networks (CNNs) define valid sum-product networks (SPNs). One subclass, called convolutional SPNs (CSPNs), can be implemented using tensors, but also can suffer from being too shallow. Fortunately, tensors can be augmented while maintaining valid SPNs. This yields a larger subclass of CNNs, which we call deep convolutional SPNs (DCSPNs), where the convolutional and sum-pooling layers form rich directed acyclic graph structures. One salient feature of DCSPNs is that they are a rigorous probabilistic model. As such, they can exploit multiple kinds of probabilistic reasoning, including marginal inference and most probable explanation (MPE) inference. This allows an alternative method for learning DCSPNs using vectorized differentiable MPE, which plays a similar role to the generator in generative adversarial networks (GANs). Image sampling is yet another application demonstrating the robustness of DCSPNs. Our preliminary results on image sampling are encouraging, since the DCSPN sampled images exhibit variability. Experiments on image completion show that DCSPNs significantly outperform competing methods by achieving several state-of-the-art mean squared error (MSE) scores in both left-completion and bottom-completion in benchmark datasets.
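To make the probabilistic-reasoning claim concrete, here is a minimal sketch of a plain (non-convolutional) sum-product network over two binary variables. It is not the DCSPN architecture from the paper; the structure, weights, and leaf probabilities are made up for illustration. Marginal inference is done by letting the leaves of unobserved variables evaluate to 1.

```python
def evaluate(node, evidence):
    """Evaluate an SPN bottom-up; variables missing from `evidence` are marginalized out."""
    kind = node[0]
    if kind == "leaf":
        _, var, p_one = node  # Bernoulli leaf with P(var = 1) = p_one
        if var not in evidence:
            return 1.0  # summing the leaf over both values gives 1
        return p_one if evidence[var] == 1 else 1.0 - p_one
    if kind == "product":
        result = 1.0
        for child in node[1]:
            result *= evaluate(child, evidence)
        return result
    if kind == "sum":
        return sum(weight * evaluate(child, evidence) for weight, child in node[1])
    raise ValueError(f"unknown node type: {kind}")


# Hypothetical SPN: a mixture (sum node) of two factorized components (product nodes).
spn = ("sum", [
    (0.6, ("product", [("leaf", "x1", 0.9), ("leaf", "x2", 0.8)])),
    (0.4, ("product", [("leaf", "x1", 0.2), ("leaf", "x2", 0.1)])),
])

joint = evaluate(spn, {"x1": 1, "x2": 1})  # P(x1=1, x2=1)
marginal = evaluate(spn, {"x1": 1})        # P(x1=1), with x2 summed out
print(round(joint, 4), round(marginal, 4))
```

Because the sum weights add up to 1 and each product node combines leaves over disjoint variables, a single bottom-up pass yields an exact joint or marginal probability.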


Self-Ensembling Attention Networks: Addressing Domain Shift for Semantic Segmentation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33015581 ◽

2019 ◽

Vol 33 ◽

pp. 5581-5588 ◽

Cited By ~ 3

Author(s):

Yonghao Xu ◽

Bo Du ◽

Lefei Zhang ◽

Qian Zhang ◽

Guoli Wang ◽

...

Keyword(s):

Domain Adaptation ◽

State Of The Art ◽

Semantic Segmentation ◽

Great Success ◽

Learning Models ◽

Target Domain ◽

Attention Networks ◽

Source Domain ◽

Benchmark Datasets ◽

Different Levels

Recent years have witnessed the great success of deep learning models in semantic segmentation. Nevertheless, these models may not generalize well to unseen image domains due to the phenomenon of domain shift. Since pixel-level annotations are laborious to collect, developing algorithms which can adapt labeled data from the source domain to the target domain is of great significance. To this end, we propose self-ensembling attention networks to reduce the domain gap between different datasets. To the best of our knowledge, the proposed method is the first attempt to introduce a self-ensembling model to domain adaptation for semantic segmentation, which provides a different view on how to learn domain-invariant features. Besides, since different regions in the image usually correspond to different levels of domain gap, we introduce the attention mechanism into the proposed framework to generate attention-aware features, which are further utilized to guide the calculation of the consistency loss in the target domain. Experiments on two benchmark datasets demonstrate that the proposed framework can yield competitive performance compared with state-of-the-art methods.
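A rough sketch of the two ingredients follows. It assumes the common mean-teacher form of self-ensembling (a teacher whose weights are an exponential moving average of the student's) and a per-pixel attention weight on a squared-error consistency term. The abstract does not give the exact formulation, so the function names, decay value, and loss form are illustrative only.

```python
def ema_update(teacher_weights, student_weights, decay=0.99):
    """Move each teacher weight toward the student weight (exponential moving average)."""
    return [decay * t + (1 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]


def attention_consistency_loss(student_probs, teacher_probs, attention):
    """Mean over pixels of attention-weighted squared differences between predictions.

    student_probs / teacher_probs: one class-probability list per pixel.
    attention: one non-negative weight per pixel (assumed to come from an attention module).
    """
    total = 0.0
    for s, t, a in zip(student_probs, teacher_probs, attention):
        total += a * sum((si - ti) ** 2 for si, ti in zip(s, t))
    return total / len(attention)


# Hypothetical two-pixel, two-class predictions on a target-domain image.
student = [[0.5, 0.5], [1.0, 0.0]]
teacher = [[0.5, 0.5], [0.0, 1.0]]
attention = [1.0, 0.5]
print(attention_consistency_loss(student, teacher, attention))
print(ema_update([1.0, 2.0], [0.0, 0.0], decay=0.9))
```

Pixels with a low attention weight contribute less to the loss, so disagreement between student and teacher is penalized mainly where the attention map is high.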


Unified Generative Adversarial Networks for Multidomain Fingerprint Presentation Attack Detection

Entropy ◽

10.3390/e23081089 ◽

2021 ◽

Vol 23 (8) ◽

pp. 1089

Author(s):

Soha B. Sandouka ◽

Yakoub Bazi ◽

Haikel Alhichri ◽

Naif Alajlan

Keyword(s):

Attack Detection ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Source Domain ◽

The Public ◽

Adversarial Network ◽

Adversarial Networks ◽

Biometric Systems ◽

Security And Reliability ◽

Presentation Attack Detection

With the rapid growth of fingerprint-based biometric systems, it is essential to ensure the security and reliability of the deployed algorithms. Indeed, the security vulnerability of these systems has been widely recognized. Thus, it is critical to enhance the generalization ability of fingerprint presentation attack detection (PAD) in cross-sensor and cross-material settings. In this work, we propose a novel solution for addressing the case of a single source domain (sensor) with a large number of labeled real/fake fingerprint images and multiple target domains (sensors) with only a few real images obtained from different sensors. Our aim is to build a model that mitigates the limited-sample issue in all target domains by transferring knowledge from the source domain. To this end, we train a unified generative adversarial network (UGAN) for multidomain conversion to learn several mappings between all domains. This allows us to generate additional synthetic images for the target domains from the source domain to reduce the distribution shift between fingerprint representations. Then, we train a scale compound network (EfficientNetV2) coupled with multiple head classifiers (one classifier for each domain) using the source domain and the translated images. The outputs of these classifiers are then aggregated using an additional fusion layer with learnable weights. In the experiments, we validate the proposed methodology on the public LivDet2015 dataset. The experimental results show that the proposed method improves the average classification accuracy over twelve classification scenarios from 67.80% to 80.44% after adaptation.
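The aggregation step can be sketched as a weighted sum of the per-domain head outputs. The abstract only says that the fusion layer has learnable weights; normalizing those weights with a softmax is an assumption made here so that the fused output stays a valid probability vector. The head outputs and logits below are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(head_outputs, fusion_logits):
    """Combine per-head class probabilities with softmax-normalized fusion weights."""
    weights = softmax(fusion_logits)
    n_classes = len(head_outputs[0])
    return [sum(w * head[c] for w, head in zip(weights, head_outputs))
            for c in range(n_classes)]


# Two heads (one per domain) predicting [P(real), P(fake)].
heads = [[0.9, 0.1], [0.5, 0.5]]
print(fuse(heads, fusion_logits=[0.0, 0.0]))  # equal weights: plain average of the heads
print(fuse(heads, fusion_logits=[2.0, 0.0]))  # first head trusted more
```

During training the fusion logits would be updated by gradient descent together with the rest of the network; here they are fixed by hand.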


Multi-Source Domain Adaptation for Visual Sentiment Classification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i03.5651 ◽

2020 ◽

Vol 34 (03) ◽

pp. 2661-2668

Author(s):

Chuang Lin ◽

Sicheng Zhao ◽

Lei Meng ◽

Tat-Seng Chua

Keyword(s):

Domain Adaptation ◽

Sentiment Classification ◽

Similar Distribution ◽

Single Source ◽

Target Domain ◽

Generative Adversarial Network ◽

Source Domain ◽

Adversarial Network ◽

Latent Space ◽

Benchmark Datasets

Existing domain adaptation methods on visual sentiment classification typically are investigated under the single-source scenario, where the knowledge learned from a source domain of sufficient labeled data is transferred to the target domain of loosely labeled or unlabeled data. However, in practice, data from a single source domain usually have a limited volume and can hardly cover the characteristics of the target domain. In this paper, we propose a novel multi-source domain adaptation (MDA) method, termed Multi-source Sentiment Generative Adversarial Network (MSGAN), for visual sentiment classification. To handle data from multiple source domains, it learns to find a unified sentiment latent space where data from both the source and target domains share a similar distribution. This is achieved via cycle consistent adversarial learning in an end-to-end manner. Extensive experiments conducted on four benchmark datasets demonstrate that MSGAN significantly outperforms the state-of-the-art MDA approaches for visual sentiment classification.
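The cycle-consistency idea can be sketched in a few lines. This is the generic CycleGAN-style term (map a sample to the other domain and back, then penalize the L1 reconstruction error), not MSGAN's actual objective, which the abstract does not spell out. The two mappings below are hypothetical toy functions standing in for generators.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x_source, x_target, to_target, to_source):
    """L1 error after a round trip source -> target -> source, plus the reverse trip."""
    recon_source = to_source(to_target(x_source))
    recon_target = to_target(to_source(x_target))
    return l1(x_source, recon_source) + l1(x_target, recon_target)


# Toy "generators" (assumptions): the domains differ by a constant offset of 1.0.
def shift_up(x):
    return [v + 1.0 for v in x]


def shift_down(x):
    return [v - 1.0 for v in x]


def shift_down_badly(x):
    return [v - 0.5 for v in x]


source_sample, target_sample = [0.0, 2.0], [1.0, 3.0]
print(cycle_consistency_loss(source_sample, target_sample, shift_up, shift_down))
print(cycle_consistency_loss(source_sample, target_sample, shift_up, shift_down_badly))
```

When the two mappings are exact inverses the loss is zero; any mismatch between them shows up as a positive reconstruction error.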
