Multi-Source Distilling Domain Adaptation

Sicheng Zhao; Guangzhi Wang; Shanghang Zhang; Yang Gu; Yaxian Li; Zhichao Song; Pengfei Xu; Runbo Hu; Hua Chai; Kurt Keutzer

doi:10.1609/aaai.v34i07.6997

Multi-Source Distilling Domain Adaptation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6997 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12975-12983

Author(s):

Sicheng Zhao ◽

Guangzhi Wang ◽

Shanghang Zhang ◽

Yang Gu ◽

Yaxian Li ◽

...

Keyword(s):

Domain Adaptation ◽

Feature Space ◽

Wasserstein Distance ◽

Training Data ◽

Source Distribution ◽

Single Source ◽

Multiple Sources ◽

Target Domain ◽

Target Feature ◽

Training Samples

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA). Conventional DA methods usually assume that the labeled data is sampled from a single source distribution. However, in practice, labeled data may be collected from multiple sources, while naive application of the single-source DA algorithms may lead to suboptimal solutions. In this paper, we propose a novel multi-source distilling domain adaptation (MDDA) network, which not only considers the different distances among multiple sources and the target, but also investigates the different similarities of the source samples to the target ones. Specifically, the proposed MDDA includes four stages: (1) pre-train the source classifiers separately using the training data from each source; (2) adversarially map the target into the feature space of each source respectively by minimizing the empirical Wasserstein distance between source and target; (3) select the source training samples that are closer to the target to fine-tune the source classifiers; and (4) classify each encoded target feature by corresponding source classifier, and aggregate different predictions using respective domain weight, which corresponds to the discrepancy between each source and target. Extensive experiments are conducted on public DA benchmarks, and the results demonstrate that the proposed MDDA significantly outperforms the state-of-the-art approaches. Our source code is released at: https://github.com/daoyuan98/MDDA.

Download Full-text

Differentially Private Optimal Transport: Application to Domain Adaptation

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/395 ◽

2019 ◽

Author(s):

Nam LeTien ◽

Amaury Habrard ◽

Marc Sebban

Keyword(s):

Optimal Transport ◽

Transport Model ◽

Domain Adaptation ◽

Wasserstein Distance ◽

Source Distribution ◽

Target Domain ◽

The Past ◽

Series Of Experiments ◽

Adaptation Scenarios ◽

Transportation Plan

Optimal transport has received much attention during the past few years to deal with domain adaptation tasks. The goal is to transfer knowledge from a source domain to a target domain by finding a transportation of minimal cost moving the source distribution to the target one. In this paper, we address the challenging task of privacy preserving domain adaptation by optimal transport. Using the Johnson-Lindenstrauss transform together with some noise, we present the first differentially private optimal transport model and show how it can be directly applied on both unsupervised and semi-supervised domain adaptation scenarios. Our theoretically grounded method allows the optimization of the transportation plan and the Wasserstein distance between the two distributions while protecting the data of both domains.We perform an extensive series of experiments on various benchmarks (VisDA, Office-Home and Office-Caltech datasets) that demonstrates the efficiency of our method compared to non-private strategies.

Download Full-text

Aligning Domain-Specific Distribution and Classifier for Cross-Domain Classification from Multiple Sources

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33015989 ◽

2019 ◽

Vol 33 ◽

pp. 5989-5996 ◽

Cited By ~ 11

Author(s):

Yongchun Zhu ◽

Fuzhen Zhuang ◽

Deqing Wang

Keyword(s):

Domain Adaptation ◽

Feature Space ◽

Multiple Sources ◽

Target Domain ◽

Domain Specific ◽

Unsupervised Domain Adaptation ◽

Specific Distribution ◽

Benchmark Datasets ◽

Invariant Representations ◽

Decision Boundaries

While Unsupervised Domain Adaptation (UDA) algorithms, i.e., there are only labeled data from source domains, have been actively studied in recent years, most algorithms and theoretical results focus on Single-source Unsupervised Domain Adaptation (SUDA). However, in the practical scenario, labeled data can be typically collected from multiple diverse sources, and they might be different not only from the target domain but also from each other. Thus, domain adapters from multiple sources should not be modeled in the same way. Recent deep learning based Multi-source Unsupervised Domain Adaptation (MUDA) algorithms focus on extracting common domain-invariant representations for all domains by aligning distribution of all pairs of source and target domains in a common feature space. However, it is often very hard to extract the same domain-invariant representations for all domains in MUDA. In addition, these methods match distributions without considering domain-specific decision boundaries between classes. To solve these problems, we propose a new framework with two alignment stages for MUDA which not only respectively aligns the distributions of each pair of source and target domains in multiple specific feature spaces, but also aligns the outputs of classifiers by utilizing the domainspecific decision boundaries. Extensive experiments demonstrate that our method can achieve remarkable results on popular benchmark datasets for image classification.

Download Full-text

Unsupervised Domain Adaptation by Matching Distributions Based on the Maximum Mean Discrepancy via Unilateral Transformations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014106 ◽

2019 ◽

Vol 33 ◽

pp. 4106-4113 ◽

Cited By ~ 1

Author(s):

Atsutoshi Kumagai ◽

Tomoharu Iwata

Keyword(s):

Domain Adaptation ◽

Feature Space ◽

Classification Performance ◽

Target Domain ◽

Target Feature ◽

Maximum Mean Discrepancy ◽

Source Domain ◽

Unsupervised Domain Adaptation ◽

Real World Datasets ◽

Target Data

We propose a simple yet effective method for unsupervised domain adaptation. When training and test distributions are different, standard supervised learning methods perform poorly. Semi-supervised domain adaptation methods have been developed for the case where labeled data in the target domain are available. However, the target data are often unlabeled in practice. Therefore, unsupervised domain adaptation, which does not require labels for target data, is receiving a lot of attention. The proposed method minimizes the discrepancy between the source and target distributions of input features by transforming the feature space of the source domain. Since such unilateral transformations transfer knowledge in the source domain to the target one without reducing dimensionality, the proposed method can effectively perform domain adaptation without losing information to be transfered. With the proposed method, it is assumed that the transformed features and the original features differ by a small residual to preserve the relationship between features and labels. This transformation is learned by aligning the higher-order moments of the source and target feature distributions based on the maximum mean discrepancy, which enables to compare two distributions without density estimation. Once the transformation is found, we learn supervised models by using the transformed source data and their labels. We use two real-world datasets to demonstrate experimentally that the proposed method achieves better classification performance than existing methods for unsupervised domain adaptation.

Download Full-text

Siamese Reconstruction Network: Accurate Image Reconstruction from Human Brain Activity by Learning to Compare

Applied Sciences ◽

10.3390/app9224749 ◽

2019 ◽

Vol 9 (22) ◽

pp. 4749

Author(s):

Lingyun Jiang ◽

Kai Qiao ◽

Linyuan Wang ◽

Chi Zhang ◽

Jian Chen ◽

...

Keyword(s):

Deep Learning ◽

Human Brain ◽

Brain Activity ◽

Feature Space ◽

Training Data ◽

Reconstruction Method ◽

Learning Method ◽

Training Samples ◽

Visual Reconstruction ◽

Relationship Of

Decoding human brain activities, especially reconstructing human visual stimuli via functional magnetic resonance imaging (fMRI), has gained increasing attention in recent years. However, the high dimensionality and small quantity of fMRI data impose restrictions on satisfactory reconstruction, especially for the reconstruction method with deep learning requiring huge amounts of labelled samples. When compared with the deep learning method, humans can recognize a new image because our human visual system is naturally capable of extracting features from any object and comparing them. Inspired by this visual mechanism, we introduced the mechanism of comparison into deep learning method to realize better visual reconstruction by making full use of each sample and the relationship of the sample pair by learning to compare. In this way, we proposed a Siamese reconstruction network (SRN) method. By using the SRN, we improved upon the satisfying results on two fMRI recording datasets, providing 72.5% accuracy on the digit dataset and 44.6% accuracy on the character dataset. Essentially, this manner can increase the training data about from n samples to 2n sample pairs, which takes full advantage of the limited quantity of training samples. The SRN learns to converge sample pairs of the same class or disperse sample pairs of different class in feature space.

Download Full-text

An intelligent fault diagnosis method based on domain adaptation for rolling bearings under variable load conditions

Proceedings of the Institution of Mechanical Engineers Part C Journal of Mechanical Engineering Science ◽

10.1177/09544062211032995 ◽

2021 ◽

pp. 095440622110329

Author(s):

Jianqun Zhang ◽

Qing Zhang ◽

Xianrong Qin ◽

Yuantao Sun

Keyword(s):

Feature Extraction ◽

Fault Diagnosis ◽

Domain Adaptation ◽

Rolling Bearing ◽

Training Data ◽

Variable Load ◽

K Nearest Neighbor ◽

Target Domain ◽

Bearing Faults ◽

Load Conditions

To identify rolling bearing faults under variable load conditions, a method named DISA-KNN is proposed in this paper, which is based on the strategy of feature extraction-domain adaptation-classification. To be specific, the time-domain and frequency-domain indicators are used for feature extraction. Discriminative and domain invariant subspace alignment (DISA) is used to minimize the data distributions’ discrepancies between the training data (source domain) and testing data (target domain). K-nearest neighbor (KNN) is applied to identify rolling bearing faults. DISA-KNN’s validation is proved by the experimental signal collected under different load conditions. The identification accuracies obtained by the DISA-KNN method are more than 90% on four datasets, including one dataset with 99.5% accuracy. The strength of the proposed method is further highlighted by comparisons with the other 8 methods. These results reveal that the proposed method is promising for the rolling bearing fault diagnosis in real rotating machinery.

Download Full-text

Joint Partial Optimal Transport for Open Set Domain Adaptation

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/352 ◽

2020 ◽

Author(s):

Renjun Xu ◽

Pelen Liu ◽

Yin Zhang ◽

Fang Cai ◽

Jindong Wang ◽

...

Keyword(s):

Optimal Transport ◽

Transport Model ◽

Domain Adaptation ◽

General Setting ◽

Feature Space ◽

Set Domain ◽

Target Domain ◽

Source Domain ◽

Open Set ◽

Proposed Model

Domain adaptation (DA) has achieved a resounding success to learn a good classifier by leveraging labeled data from a source domain to adapt to an unlabeled target domain. However, in a general setting when the target domain contains classes that are never observed in the source domain, namely in Open Set Domain Adaptation (OSDA), existing DA methods failed to work because of the interference of the extra unknown classes. This is a much more challenging problem, since it can easily result in negative transfer due to the mismatch between the unknown and known classes. Existing researches are susceptible to misclassification when target domain unknown samples in the feature space distributed near the decision boundary learned from the labeled source domain. To overcome this, we propose Joint Partial Optimal Transport (JPOT), fully utilizing information of not only the labeled source domain but also the discriminative representation of unknown class in the target domain. The proposed joint discriminative prototypical compactness loss can not only achieve intra-class compactness and inter-class separability, but also estimate the mean and variance of the unknown class through backpropagation, which remains intractable for previous methods due to the blindness about the structure of the unknown classes. To our best knowledge, this is the first optimal transport model for OSDA. Extensive experiments demonstrate that our proposed model can significantly boost the performance of open set domain adaptation on standard DA datasets.

Download Full-text

USING SEMANTICALLY PAIRED IMAGES TO IMPROVE DOMAIN ADAPTATION FOR THE SEMANTIC SEGMENTATION OF AERIAL IMAGES

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2020-483-2020 ◽

2020 ◽

Vol V-2-2020 ◽

pp. 483-492

Author(s):

D. Gritzner ◽

J. Ostermann

Keyword(s):

Time Window ◽

Domain Adaptation ◽

Geographical Area ◽

Model Performance ◽

Ground Truth ◽

Semantic Segmentation ◽

Training Data ◽

Aerial Images ◽

Target Domain ◽

Training Examples

Abstract. Modern machine learning, especially deep learning, which is used in a variety of applications, requires a lot of labelled data for model training. Having an insufficient amount of training examples leads to models which do not generalize well to new input instances. This is a particular significant problem for tasks involving aerial images: often training data is only available for a limited geographical area and a narrow time window, thus leading to models which perform poorly in different regions, at different times of day, or during different seasons. Domain adaptation can mitigate this issue by using labelled source domain training examples and unlabeled target domain images to train a model which performs well on both domains. Modern adversarial domain adaptation approaches use unpaired data. We propose using pairs of semantically similar images, i.e., whose segmentations are accurate predictions of each other, for improved model performance. In this paper we show that, as an upper limit based on ground truth, using semantically paired aerial images during training almost always increases model performance with an average improvement of 4.2% accuracy and .036 mean intersection-over-union (mIoU). Using a practical estimate of semantic similarity, we still achieve improvements in more than half of all cases, with average improvements of 2.5% accuracy and .017 mIoU in those cases.

Download Full-text

Optimal Transport with Dimensionality Reduction for Domain Adaptation

Symmetry ◽

10.3390/sym12121994 ◽

2020 ◽

Vol 12 (12) ◽

pp. 1994

Author(s):

Ping Li ◽

Zhiwei Ni ◽

Xuhui Zhu ◽

Juan Song ◽

Wenying Wu

Keyword(s):

Dimensionality Reduction ◽

Optimal Transport ◽

Domain Adaptation ◽

Wasserstein Distance ◽

Local Information ◽

Target Domain ◽

Source Domain ◽

Second Stage ◽

Cross Domain ◽

Feature Based

Domain adaptation manages to learn a robust classifier for target domain, using the source domain, but they often follow different distributions. To bridge distribution shift between the two domains, most of previous works aim to align their feature distributions through feature transformation, of which optimal transport for domain adaptation has attract researchers’ interest, as it can exploit the local information of the two domains in the process of mapping the source instances to the target ones by minimizing Wasserstein distance between their feature distributions. However, it may weaken the feature discriminability of source domain, thus degrade domain adaptation performance. To address this problem, this paper proposes a two-stage feature-based adaptation approach, referred to as optimal transport with dimensionality reduction (OTDR). In the first stage, we apply the dimensionality reduction with intradomain variant maximization but source intraclass compactness minimization, to separate data samples as much as possible and enhance the feature discriminability of the source domain. In the second stage, we leverage optimal transport-based technique to preserve the local information of the two domains. Notably, the desirable properties in the first stage can mitigate the degradation of feature discriminability of the source domain in the second stage. Extensive experiments on several cross-domain image datasets validate that OTDR is superior to its competitors in classification accuracy.

Download Full-text

Domain Adversarial Neural Networks for Large-Scale Land Cover Classification

Remote Sensing ◽

10.3390/rs11101153 ◽

2019 ◽

Vol 11 (10) ◽

pp. 1153 ◽

Cited By ~ 1

Author(s):

Mesay Belete Bejiga ◽

Farid Melgani ◽

Pietro Beraldini

Keyword(s):

Neural Networks ◽

Land Cover ◽

Transfer Learning ◽

Large Scale ◽

Domain Adaptation ◽

Land Cover Classification ◽

Target Domain ◽

Training Samples ◽

Latent Space ◽

Large Scale Land

Learning classification models require sufficiently labeled training samples, however, collecting labeled samples for every new problem is time-consuming and costly. An alternative approach is to transfer knowledge from one problem to another, which is called transfer learning. Domain adaptation (DA) is a type of transfer learning that aims to find a new latent space where the domain discrepancy between the source and the target domain is negligible. In this work, we propose an unsupervised DA technique called domain adversarial neural networks (DANNs), composed of a feature extractor, a class predictor, and domain classifier blocks, for large-scale land cover classification. Contrary to the traditional methods that perform representation and classifier learning in separate stages, DANNs combine them into a single stage, thereby learning a new representation of the input data that is both domain-invariant and discriminative. Once trained, the classifier of a DANN can be used to predict both source and target domain labels. Additionally, we also modify the domain classifier of a DANN to evaluate its suitability for multi-target domain adaptation problems. Experimental results obtained for both single and multiple target DA problems show that the proposed method provides a performance gain of up to 40%.

Download Full-text

Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification (Extended Abstract)

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/802 ◽

2018 ◽

Author(s):

Alejandro Moreo Fernández ◽

Andrea Esuli ◽

Fabrizio Sebastiani

Keyword(s):

Domain Adaptation ◽

State Of The Art ◽

Sentiment Classification ◽

Training Data ◽

Target Domain ◽

Source Domain ◽

Machine Learning Methods ◽

Cross Domain ◽

Current State ◽

Cross Lingual

Domain Adaptation (DA) techniques aim at enabling machine learning methods learn effective classifiers for a “target” domain when the only available training data belongs to a different “source” domain. In this extended abstract, we briefly describe our new DA method called Distributional Correspondence Indexing (DCI) for sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. The experiments we have conducted show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification.

Download Full-text