ITERATIVE SELF-LABELING DOMAIN ADAPTATION FOR LINEAR STRUCTURED IMAGE CLASSIFICATION

2013 ◽  
Vol 22 (05) ◽  
pp. 1360005 ◽  
Author(s):  
AMAURY HABRARD ◽  
JEAN-PHILIPPE PEYRACHE ◽  
MARC SEBBAN

A strong assumption to derive generalization guarantees in the standard PAC framework is that training (or source) data and test (or target) data are drawn according to the same distribution. Because of the presence of possibly outdated data in the training set, or the use of biased collections, this assumption is often violated in real-world applications leading to different source and target distributions. To go around this problem, a new research area known as Domain Adaptation (DA) has recently been introduced giving rise to many adaptation algorithms and theoretical results in the form of generalization bounds. This paper deals with self-labeling DA whose goal is to iteratively incorporate semi-labeled target data in the learning set to progressively adapt the classifier from the source to the target domain. The contribution of this work is three-fold: First, we provide the minimum and necessary theoretical conditions for a self-labeling DA algorithm to perform an actual domain adaptation. Second, following these theoretical recommendations, we design a new iterative DA algorithm, called GESIDA, able to deal with structured data. This algorithm makes use of the new theory of learning with (ε,γ,τ)-good similarity functions introduced by Balcan et al., which does not require the use of a valid kernel to learn well and allows us to induce sparse models. Finally, we apply our algorithm on a structured image classification task and show that self-labeling domain adaptation is a new original way to deal with scaling and rotation problems.
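As a rough illustration of the iterative self-labeling idea described in this abstract, the Python sketch below repeatedly trains a classifier, labels the target points it is most confident about, and adds them to the learning set. It is not GESIDA: a toy nearest-centroid classifier on 1-D points stands in for the (ε,γ,τ)-good similarity learner, and the function names, the margin-based confidence score, and the data are all assumptions made for the example.

```python
from statistics import mean

def fit_centroids(points, labels):
    """Toy model: the per-class mean of 1-D points."""
    return {c: mean(p for p, l in zip(points, labels) if l == c)
            for c in set(labels)}

def predict(centroids, x):
    """Return (label, margin). The margin is how much closer x is to the
    nearest centroid than to the second nearest (assumes >= 2 classes)."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[ranked[0]])
    return ranked[0], margin

def self_label(src_x, src_y, tgt_x, per_round=2, rounds=10):
    """Iteratively move confidently labeled target points into the learning set."""
    train_x, train_y = list(src_x), list(src_y)
    remaining = list(tgt_x)
    for _ in range(rounds):
        if not remaining:
            break
        model = fit_centroids(train_x, train_y)
        # Rank unlabeled target points by confidence, most confident first.
        scored = sorted(remaining, key=lambda x: predict(model, x)[1], reverse=True)
        for x in scored[:per_round]:
            train_x.append(x)
            train_y.append(predict(model, x)[0])  # pseudo-label from current model
            remaining.remove(x)
    return fit_centroids(train_x, train_y)

# Hypothetical data: the target domain is the source domain shifted by +1.
src_x = [-0.5, 0.0, 0.5, 3.5, 4.0, 4.5]
src_y = [0, 0, 0, 1, 1, 1]
tgt_x = [0.5, 1.0, 1.5, 4.5, 5.0, 5.5]
adapted = self_label(src_x, src_y, tgt_x)
```

On this toy data the class centroids drift from 0 and 4 (source only) toward the target clusters, so the decision boundary moves with the shift. A real self-labeling method would also need a rule for discarding source points and a safeguard against reinforcing wrong pseudo-labels, which this sketch omits.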

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0253415
Author(s):  
Hyunsik Jeon ◽  
Seongmin Lee ◽  
U Kang

Given trained models from multiple source domains, how can we predict the labels of unlabeled data in a target domain? Unsupervised multi-source domain adaptation (UMDA) aims to predict the labels of unlabeled target data by transferring the knowledge of multiple source domains. UMDA is a crucial problem in many real-world scenarios where no labeled target data are available. Previous approaches in UMDA assume that data are observable over all domains. However, source data are not easily accessible due to privacy or confidentiality issues in many practical scenarios, although classifiers learned in source domains are readily available. In this work, we target data-free UMDA where source data are not observable at all, a novel problem that has not been studied before despite being very realistic and crucial. To solve data-free UMDA, we propose DEMS (Data-free Exploitation of Multiple Sources), a novel architecture that adapts target data to source domains without exploiting any source data, and estimates the target labels by exploiting pre-trained source classifiers. Extensive experiments for data-free UMDA on real-world datasets show that DEMS provides state-of-the-art accuracy, up to 27.5 percentage points higher than that of the best baseline.


Author(s):  
Si Wu ◽  
Jian Zhong ◽  
Wenming Cao ◽  
Rui Li ◽  
Zhiwen Yu ◽  
...  

For unsupervised domain adaptation, the process of learning domain-invariant representations could be dominated by the labeled source data, such that the specific characteristics of the target domain may be ignored. In order to improve the performance in inferring target labels, we propose a target-specific network which is capable of learning collaboratively with a domain adaptation network, instead of directly minimizing domain discrepancy. A clustering regularization is also utilized to improve the generalization capability of the target-specific network by forcing target data points to be close to accumulated class centers. As this network learns and specializes to the target domain, its performance in inferring target labels improves, which in turn facilitates the learning process of the adaptation network. Therefore, there is a mutually beneficial relationship between these two networks. We perform extensive experiments on multiple digit and object datasets, and the effectiveness and superiority of the proposed approach are verified on multiple visual adaptation benchmarks, e.g., we improve the state-of-the-art on the task of MNIST→SVHN from 76.5% to 84.9% without specific augmentation.
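The clustering regularization mentioned in this abstract can be pictured with a small Python sketch. The abstract does not say how the class centers are accumulated or how closeness is measured, so the exponential moving average, the squared Euclidean distance, and every name below are assumptions chosen only to make the idea concrete.

```python
def update_centers(centers, features, pseudo_labels, momentum=0.9):
    """Accumulate per-class centers as an exponential moving average
    of the batch means (an assumed accumulation rule)."""
    updated = dict(centers)
    for c in set(pseudo_labels):
        members = [f for f, l in zip(features, pseudo_labels) if l == c]
        batch_mean = [sum(col) / len(members) for col in zip(*members)]
        if c in updated:
            updated[c] = [momentum * old + (1 - momentum) * new
                          for old, new in zip(updated[c], batch_mean)]
        else:
            updated[c] = batch_mean  # first time this class is seen
    return updated

def clustering_penalty(centers, features, pseudo_labels):
    """Mean squared distance between each target feature and the center
    of the class it is currently assigned to."""
    total = 0.0
    for f, l in zip(features, pseudo_labels):
        total += sum((a - b) ** 2 for a, b in zip(f, centers[l]))
    return total / len(features)

# Hypothetical 2-D target features with pseudo-labels.
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
pseudo_labels = [0, 0, 1]
centers = update_centers({}, features, pseudo_labels)
penalty = clustering_penalty(centers, features, pseudo_labels)
```

Adding such a penalty to the training loss pulls target features toward the center of their assigned class, which is the effect the abstract attributes to the regularizer.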


Author(s):  
Yuguang Yan ◽  
Wen Li ◽  
Michael Ng ◽  
Mingkui Tan ◽  
Hanrui Wu ◽  
...  

Domain adaptation aims to reduce the effort of collecting and annotating target data by leveraging knowledge from a different source domain. The domain adaptation problem becomes extremely challenging when the feature spaces of the source and target domains are different, which is also known as the heterogeneous domain adaptation (HDA) problem. In this paper, we propose a novel HDA method to find the optimal discriminative correlation subspace for the source and target data. The discriminative correlation subspace is inherited from the canonical correlation subspace between the source and target data, and is further optimized to maximize the discriminative ability for the target domain classifier. We formulate a joint objective in order to simultaneously learn the discriminative correlation subspace and the target domain classifier. We then apply an alternating direction method of multipliers (ADMM) algorithm to address the resulting non-convex optimization problem. Comprehensive experiments on two real-world data sets demonstrate the effectiveness of the proposed method compared to the state-of-the-art methods.


Author(s):  
Jyoti Sandesh Deshmukh ◽  
Amiya Kumar Tripathy ◽  
Dilendra Hiran

The growing use of the web produces a large volume of information about products, and people use online reviews to make decisions. Opinion mining is a vast research area in which different types of reviews are analyzed, and several issues remain open in it. Domain adaptation is an emerging issue in opinion mining. Labeling data for every domain is a time-consuming and costly task. Hence the need arises for a model that is trained on one domain and applied to another, reducing cost as well as time. This is called domain adaptation, and it is the problem addressed in this paper. Source-domain data are used for training with maximum entropy and a clustering technique. The model trained on the source domains is then applied to the target data for labeling. Results show moderate accuracy under 5-fold cross-validation and for combinations of source domains on the Blitzer et al. (2007) multi-domain product dataset.


2020 ◽  
Vol 39 (6) ◽  
pp. 8149-8159
Author(s):  
Ping Li ◽  
Zhiwei Ni ◽  
Xuhui Zhu ◽  
Juan Song

Domain adaptation (DA) aims to train a robust predictor by transferring rich knowledge from a well-labeled source domain to annotate a newly coming target domain; however, the two domains are usually drawn from very different distributions. Most current methods either learn the common features by matching inter-domain feature distributions and training the classifier separately or align inter-domain label distributions to directly obtain an adaptive classifier based on the original features despite feature distortion. Moreover, intra-domain information may be greatly degraded during the DA process; i.e., the source data samples from different classes might grow closer. To this end, this paper proposes a novel DA approach, referred to as inter-class distribution alienation and inter-domain distribution alignment based on manifold embedding (IDAME). Specifically, IDAME commits to adapting the classifier on the Grassmann manifold by using structural risk minimization, where inter-domain feature distributions are aligned to mitigate feature distortion, and the target pseudo labels are exploited using the distances on the Grassmann manifold. During the classifier adaptation process, we simultaneously consider the inter-class distribution alienation, the inter-domain distribution alignment, and the manifold consistency. Extensive experiments validate that IDAME can outperform several comparative state-of-the-art methods on real-world cross-domain image datasets.


2021 ◽  
Author(s):  
Muhammad Ghifary

Machine learning has achieved great successes in the area of computer vision, especially in object recognition or classification. One of the core factors of these successes is the availability of massive labeled image or video data for training, collected manually by humans. Labeling source training data, however, can be expensive and time consuming. Furthermore, a large amount of labeled source data does not always guarantee that traditional machine learning techniques generalize well; there is a potential bias or mismatch in the data, i.e., the training data do not represent the target environment. To mitigate the above dataset bias/mismatch, one can consider domain adaptation: utilizing labeled training data and unlabeled target data to develop a well-performing classifier on the target environment. In some cases, however, the unlabeled target data are nonexistent, but multiple labeled sources of data exist. Such situations can be addressed by domain generalization: using multiple source training sets to produce a classifier that generalizes on the unseen target domain. Although several domain adaptation and generalization approaches have been proposed, the domain mismatch in object recognition remains a challenging, open problem: the model performance has yet to reach a satisfactory level in real-world applications. The overall goal of this thesis is to progress towards solving dataset bias in visual object recognition through representation learning in the context of domain adaptation and domain generalization. Representation learning is concerned with finding proper data representations or features via learning rather than via engineering by human experts. This thesis proposes several representation learning solutions based on deep learning and kernel methods. This thesis introduces a robust-to-noise deep neural network for handwritten digit classification trained on “clean” images only, which we name Deep Hybrid Network (DHN). DHNs are based on a particular combination of sparse autoencoders and restricted Boltzmann machines. The results show that DHN performs better than the standard deep neural network in recognizing digits with Gaussian and impulse noise, block and border occlusions. This thesis proposes the Domain Adaptive Neural Network (DaNN), a neural network based domain adaptation algorithm that minimizes the classification error and the domain discrepancy between the source and target data representations. The experiments show the competitiveness of DaNN against several state-of-the-art methods on a benchmark object dataset. This thesis develops the Multi-task Autoencoder (MTAE), a domain generalization algorithm based on autoencoders trained via multi-task learning. MTAE learns to transform the original image into its analogs in multiple related domains simultaneously. The results show that the MTAE’s representations provide better classification performance than some alternative autoencoder-based models as well as the current state-of-the-art domain generalization algorithms. This thesis proposes a fast kernel-based representation learning algorithm for both domain adaptation and domain generalization, Scatter Component Analysis (SCA). SCA finds a data representation that trades between maximizing the separability of classes, minimizing the mismatch between domains, and maximizing the separability of the whole data points. The results show that SCA runs much faster than some competitive algorithms, while providing state-of-the-art accuracy in both domain adaptation and domain generalization. Finally, this thesis presents the Deep Reconstruction-Classification Network (DRCN), a deep convolutional network for domain adaptation. DRCN learns to classify labeled source data and also to reconstruct unlabeled target data via a shared encoding representation. The results show that DRCN provides competitive or better performance than the prior state-of-the-art model on several cross-domain object datasets.


Algorithms ◽  
2019 ◽  
Vol 12 (5) ◽  
pp. 96 ◽  
Author(s):  
Imad Eddine Ibrahim Bekkouch ◽  
Youssef Youssry ◽  
Rustam Gafarov ◽  
Adil Khan ◽  
Asad Masood Khattak

Domain adaptation is a sub-field of transfer learning that aims at bridging the dissimilarity gap between different domains by transferring and re-using the knowledge obtained in the source domain to the target domain. Many methods have been proposed to resolve this problem, using techniques such as generative adversarial networks (GAN), but the complexity of such methods makes it hard to use them in different problems, as fine-tuning such networks is usually a time-consuming task. In this paper, we propose a method for unsupervised domain adaptation that is both simple and effective. Our model (referred to as TripNet) harnesses the idea of a discriminator and Linear Discriminant Analysis (LDA) to push the encoder to generate domain-invariant features that are category-informative. At the same time, pseudo-labelling is used for the target data to train the classifier and to bring the same classes from both domains together. We evaluate TripNet against several existing, state-of-the-art methods on three image classification tasks: digit classification (MNIST, SVHN, and USPS datasets), object recognition (Office31 dataset), and traffic sign recognition (GTSRB and Synthetic Signs datasets). Our experimental results demonstrate that (i) TripNet beats almost all existing methods with a similarly simple model on all of these tasks; and (ii) for models that are significantly more complex (or harder to train) than TripNet, it even beats their performance in some cases. Hence, the results confirm the effectiveness of using TripNet for unsupervised domain adaptation in image classification.


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7539
Author(s):  
Jungchan Cho

Universal domain adaptation (UDA) is a crucial research topic for efficient deep learning model training using data from various imaging sensors. However, its development is affected by unlabeled target data. Moreover, the nonexistence of prior knowledge of the source and target domain makes it more challenging for UDA to train models. I hypothesize that the degradation of trained models in the target domain is caused by the lack of direct training loss to improve the discriminative power of the target domain data. As a result, the target data adapted to the source representations is biased toward the source domain. I found that the degradation was more pronounced when I used synthetic data for the source domain and real data for the target domain. In this paper, I propose a UDA method with target domain contrastive learning. The proposed method enables models to leverage synthetic data for the source domain and train the discriminativeness of target features in an unsupervised manner. In addition, the target domain feature extraction network is shared with the source domain classification task, preventing unnecessary computational growth. Extensive experimental results on VisDa-2017 and MNIST to SVHN demonstrated that the proposed method significantly outperforms the baseline by 2.7% and 5.1%, respectively.


Author(s):  
Zhedong Zheng ◽  
Yi Yang

This work focuses on the unsupervised scene adaptation problem of learning from both labeled source data and unlabeled target data. Existing approaches focus on narrowing the inter-domain gap between the source and target domains. However, the intra-domain knowledge and inherent uncertainty learned by the network are under-explored. In this paper, we propose an orthogonal method, called memory regularization in vivo, to exploit the intra-domain knowledge and regularize the model training. Specifically, we refer to the segmentation model itself as the memory module, and minimize the discrepancy between the two classifiers, i.e., the primary classifier and the auxiliary classifier, to reduce the prediction inconsistency. Without extra parameters, the proposed method is complementary to most existing domain adaptation methods and could generally improve the performance of existing methods. Albeit simple, we verify the effectiveness of memory regularization on two synthetic-to-real benchmarks: GTA5 → Cityscapes and SYNTHIA → Cityscapes, yielding +11.1% and +11.3% mIoU improvement over the baseline model, respectively. Besides, a similar +12.0% mIoU improvement is observed on the cross-city benchmark: Cityscapes → Oxford RobotCar.
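To make the idea of penalizing disagreement between a primary and an auxiliary classifier more concrete, here is a minimal Python sketch for a single pixel. The abstract does not specify the discrepancy measure, so the symmetric KL divergence between softmax outputs is an assumption, as are the function names and the example logits.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries in q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_penalty(primary_logits, auxiliary_logits):
    """Symmetric KL between the two classifiers' predictions for one pixel
    (an assumed choice of discrepancy, not taken from the abstract)."""
    p = softmax(primary_logits)
    q = softmax(auxiliary_logits)
    return kl_divergence(p, q) + kl_divergence(q, p)

# Hypothetical logits: the classifiers agree in the first case, disagree in the second.
agree = consistency_penalty([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
disagree = consistency_penalty([3.0, 0.0, 0.0], [0.0, 3.0, 0.0])
```

The penalty is zero when both classifiers produce the same distribution and grows as their predictions diverge, so minimizing it on unlabeled target images pushes the two heads toward consistent outputs without adding parameters.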


Author(s):  
Zechang Li ◽  
Yuxuan Lai ◽  
Yansong Feng ◽  
Dongyan Zhao

Recently, semantic parsing has attracted much attention in the community. Although many neural modeling efforts have greatly improved the performance, it still suffers from the data scarcity issue. In this paper, we propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain. Our semantic parser benefits from a two-stage coarse-to-fine framework, thus can provide different and accurate treatments for the two stages, i.e., focusing on domain invariant and domain specific information, respectively. In the coarse stage, our novel domain discrimination component and domain relevance attention encourage the model to learn transferable domain general structures. In the fine stage, the model is guided to concentrate on domain related details. Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies. Additionally, we show that our model can well exploit limited target data to capture the difference between the source and target domain, even when the target domain has far fewer training instances.

