Multiple Graphs and Low-Rank Embedding for Multi-Source Heterogeneous Domain Adaptation

2022 ◽  
Vol 16 (4) ◽  
pp. 1-25
Hanrui Wu ◽  
Michael K. Ng

Multi-source domain adaptation is a challenging topic in transfer learning, especially when the data of each domain are represented by different kinds of features, i.e., Multi-source Heterogeneous Domain Adaptation (MHDA). It is important to take advantage of the knowledge extracted from multiple sources as well as bridge the heterogeneous spaces for handling the MHDA paradigm. This article proposes a novel method named Multiple Graphs and Low-rank Embedding (MGLE), which models the local structure information of multiple domains using multiple graphs and learns the low-rank embedding of the target domain. Then, MGLE augments the learned embedding with the original target data. Specifically, we introduce the modules of both domain discrepancy and domain relevance into the multiple graphs and low-rank embedding learning procedure. Subsequently, we develop an iterative optimization algorithm to solve the resulting problem. We evaluate the effectiveness of the proposed method on several real-world datasets. Promising results show that the performance of MGLE is better than that of the baseline methods in terms of several metrics, such as AUC, MAE, accuracy, precision, F1 score, and MCC, demonstrating the effectiveness of the proposed method.

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0253415
Hyunsik Jeon ◽  
Seongmin Lee ◽  
U Kang

Given trained models from multiple source domains, how can we predict the labels of unlabeled data in a target domain? Unsupervised multi-source domain adaptation (UMDA) aims for predicting the labels of unlabeled target data by transferring the knowledge of multiple source domains. UMDA is a crucial problem in many real-world scenarios where no labeled target data are available. Previous approaches in UMDA assume that data are observable over all domains. However, source data are not easily accessible due to privacy or confidentiality issues in a lot of practical scenarios, although classifiers learned in source domains are readily available. In this work, we target data-free UMDA where source data are not observable at all, a novel problem that has not been studied before despite being very realistic and crucial. To solve data-free UMDA, we propose DEMS (Data-free Exploitation of Multiple Sources), a novel architecture that adapts target data to source domains without exploiting any source data, and estimates the target labels by exploiting pre-trained source classifiers. Extensive experiments for data-free UMDA on real-world datasets show that DEMS provides the state-of-the-art accuracy which is up to 27.5% point higher than that of the best baseline.

2020 ◽  
Vol 34 (07) ◽  
pp. 12975-12983
Sicheng Zhao ◽  
Guangzhi Wang ◽  
Shanghang Zhang ◽  
Yang Gu ◽  
Yaxian Li ◽  

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA). Conventional DA methods usually assume that the labeled data is sampled from a single source distribution. However, in practice, labeled data may be collected from multiple sources, while naive application of the single-source DA algorithms may lead to suboptimal solutions. In this paper, we propose a novel multi-source distilling domain adaptation (MDDA) network, which not only considers the different distances among multiple sources and the target, but also investigates the different similarities of the source samples to the target ones. Specifically, the proposed MDDA includes four stages: (1) pre-train the source classifiers separately using the training data from each source; (2) adversarially map the target into the feature space of each source respectively by minimizing the empirical Wasserstein distance between source and target; (3) select the source training samples that are closer to the target to fine-tune the source classifiers; and (4) classify each encoded target feature by corresponding source classifier, and aggregate different predictions using respective domain weight, which corresponds to the discrepancy between each source and target. Extensive experiments are conducted on public DA benchmarks, and the results demonstrate that the proposed MDDA significantly outperforms the state-of-the-art approaches. Our source code is released at:

Cheng Chen ◽  
Qi Dou ◽  
Hao Chen ◽  
Jing Qin ◽  
Pheng-Ann Heng

This paper presents a novel unsupervised domain adaptation framework, called Synergistic Image and Feature Adaptation (SIFA), to effectively tackle the problem of domain shift. Domain adaptation has become an important and hot topic in recent studies on deep learning, aiming to recover performance degradation when applying the neural networks to new testing domains. Our proposed SIFA is an elegant learning diagram which presents synergistic fusion of adaptations from both image and feature perspectives. In particular, we simultaneously transform the appearance of images across domains and enhance domain-invariance of the extracted features towards the segmentation task. The feature encoder layers are shared by both perspectives to grasp their mutual benefits during the end-to-end learning procedure. Without using any annotation from the target domain, the learning of our unified model is guided by adversarial losses, with multiple discriminators employed from various aspects. We have extensively validated our method with a challenging application of crossmodality medical image segmentation of cardiac structures. Experimental results demonstrate that our SIFA model recovers the degraded performance from 17.2% to 73.0%, and outperforms the state-of-the-art methods by a significant margin.

2021 ◽  
Carlos Garrido-Munoz ◽  
Adrián Sánchez-Hernández ◽  
Francisco J. Castellanos ◽  
Jorge Calvo-Zaragoza

Binarization represents a key role in many document image analysis workflows. The current state of the art considers the use of supervised learning, and specifically deep neural networks. However, it is very difficult for the same model to work successfully in a number of document styles, since the set of potential domains is very heterogeneous. We study a multi-source domain adaptation strategy for binarization. Within this scenario, we look into a novel hypothesis where a specialized binarization model must be selected to be used over a target domain, instead of a single model that tries to generalize across multiple domains. The problem then boils down to, given several specialized models and a new target set, deciding which model to use. We propose here a simple way to address this question by using a domain classifier, that estimates which of the source models must be considered to binarize the new target domain. Our experiments on several datasets, including different text styles and music scores, show that our initial hypothesis is quite promising, yet the way to deal with the decision of which model to use still shows great room for improvement.

2020 ◽  
Vol 34 (04) ◽  
pp. 4028-4035 ◽  
Aditya Grover ◽  
Christopher Chute ◽  
Rui Shu ◽  
Zhangjie Cao ◽  
Stefano Ermon

Given datasets from multiple domains, a key challenge is to efficiently exploit these data sources for modeling a target domain. Variants of this problem have been studied in many contexts, such as cross-domain translation and domain adaptation. We propose AlignFlow, a generative modeling framework that models each domain via a normalizing flow. The use of normalizing flows allows for a) flexibility in specifying learning objectives via adversarial training, maximum likelihood estimation, or a hybrid of the two methods; and b) learning and exact inference of a shared representation in the latent space of the generative model. We derive a uniform set of conditions under which AlignFlow is marginally-consistent for the different learning objectives. Furthermore, we show that AlignFlow guarantees exact cycle consistency in mapping datapoints from a source domain to target and back to the source domain. Empirically, AlignFlow outperforms relevant baselines on image-to-image translation and unsupervised domain adaptation and can be used to simultaneously interpolate across the various domains using the learned representation.

2021 ◽  
Vol 7 (10) ◽  
pp. 198
Mattia Litrico ◽  
Sebastiano Battiato ◽  
Sotirios A. Tsaftaris ◽  
Mario Valerio Giuffrida

This paper proposes a novel approach for semi-supervised domain adaptation for holistic regression tasks, where a DNN predicts a continuous value y∈R given an input image x. The current literature generally lacks specific domain adaptation approaches for this task, as most of them mostly focus on classification. In the context of holistic regression, most of the real-world datasets not only exhibit a covariate (or domain) shift, but also a label gap—the target dataset may contain labels not included in the source dataset (and vice versa). We propose an approach tackling both covariate and label gap in a unified training framework. Specifically, a Generative Adversarial Network (GAN) is used to reduce covariate shift, and label gap is mitigated via label normalisation. To avoid overfitting, we propose a stopping criterion that simultaneously takes advantage of the Maximum Mean Discrepancy and the GAN Global Optimality condition. To restore the original label range—that was previously normalised—a handful of annotated images from the target domain are used. Our experimental results, run on 3 different datasets, demonstrate that our approach drastically outperforms the state-of-the-art across the board. Specifically, for the cell counting problem, the mean squared error (MSE) is reduced from 759 to 5.62; in the case of the pedestrian dataset, our approach lowered the MSE from 131 to 1.47. For the last experimental setup, we borrowed a task from plant biology, i.e., counting the number of leaves in a plant, and we ran two series of experiments, showing the MSE is reduced from 2.36 to 0.88 (intra-species), and from 1.48 to 0.6 (inter-species).

Victor Bouvier ◽  
Philippe Very ◽  
Clément Chastagnol ◽  
Myriam Tami ◽  
Céline Hudelot

Domain Invariant Representations (IR) has improved drastically the transferability of representations from a labelled source domain to a new and unlabelled target domain. Unsupervised Domain Adaptation (UDA) in presence of label shift remains an open problem. To this purpose, we present a bound of the target risk which incorporates both weights and invariant representations. Our theoretical analysis highlights the role of inductive bias in aligning distributions across domains. We illustrate it on standard benchmarks by proposing a new learning procedure for UDA. We observed empirically that weak inductive bias makes adaptation robust to label shift. The elaboration of stronger inductive bias is a promising direction for new UDA algorithms.

Yongchun Zhu ◽  
Fuzhen Zhuang ◽  
Deqing Wang

While Unsupervised Domain Adaptation (UDA) algorithms, i.e., there are only labeled data from source domains, have been actively studied in recent years, most algorithms and theoretical results focus on Single-source Unsupervised Domain Adaptation (SUDA). However, in the practical scenario, labeled data can be typically collected from multiple diverse sources, and they might be different not only from the target domain but also from each other. Thus, domain adapters from multiple sources should not be modeled in the same way. Recent deep learning based Multi-source Unsupervised Domain Adaptation (MUDA) algorithms focus on extracting common domain-invariant representations for all domains by aligning distribution of all pairs of source and target domains in a common feature space. However, it is often very hard to extract the same domain-invariant representations for all domains in MUDA. In addition, these methods match distributions without considering domain-specific decision boundaries between classes. To solve these problems, we propose a new framework with two alignment stages for MUDA which not only respectively aligns the distributions of each pair of source and target domains in multiple specific feature spaces, but also aligns the outputs of classifiers by utilizing the domainspecific decision boundaries. Extensive experiments demonstrate that our method can achieve remarkable results on popular benchmark datasets for image classification.

Atsutoshi Kumagai ◽  
Tomoharu Iwata

We propose a simple yet effective method for unsupervised domain adaptation. When training and test distributions are different, standard supervised learning methods perform poorly. Semi-supervised domain adaptation methods have been developed for the case where labeled data in the target domain are available. However, the target data are often unlabeled in practice. Therefore, unsupervised domain adaptation, which does not require labels for target data, is receiving a lot of attention. The proposed method minimizes the discrepancy between the source and target distributions of input features by transforming the feature space of the source domain. Since such unilateral transformations transfer knowledge in the source domain to the target one without reducing dimensionality, the proposed method can effectively perform domain adaptation without losing information to be transfered. With the proposed method, it is assumed that the transformed features and the original features differ by a small residual to preserve the relationship between features and labels. This transformation is learned by aligning the higher-order moments of the source and target feature distributions based on the maximum mean discrepancy, which enables to compare two distributions without density estimation. Once the transformation is found, we learn supervised models by using the transformed source data and their labels. We use two real-world datasets to demonstrate experimentally that the proposed method achieves better classification performance than existing methods for unsupervised domain adaptation.

Yuguang Yan ◽  
Wen Li ◽  
Hanrui Wu ◽  
Huaqing Min ◽  
Mingkui Tan ◽  

Heterogeneous domain adaptation (HDA) aims to exploit knowledge from a heterogeneous source domain to improve the learning performance in a target domain. Since the feature spaces of the source and target domains are different, the transferring of knowledge is extremely difficult. In this paper, we propose a novel semi-supervised algorithm for HDA by exploiting the theory of optimal transport (OT), a powerful tool originally designed for aligning two different distributions. To match the samples between heterogeneous domains, we propose to preserve the semantic consistency between heterogeneous domains by incorporating label information into the entropic Gromov-Wasserstein discrepancy, which is a metric in OT for different metric spaces, resulting in a new semi-supervised scheme. Via the new scheme, the target and transported source samples with the same label are enforced to follow similar distributions. Lastly, based on the Kullback-Leibler metric, we develop an efficient algorithm to optimize the resultant problem. Comprehensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of our proposed method.

Sign in / Sign up

Export Citation Format

Share Document