HoMM: Higher-Order Moment Matching for Unsupervised Domain Adaptation

2020 ◽  
Vol 34 (04) ◽  
pp. 3422-3429
Author(s):  
Chao Chen ◽  
Zhihang Fu ◽  
Zhihong Chen ◽  
Sheng Jin ◽  
Zhaowei Cheng ◽  
...  

Minimizing the discrepancy of feature distributions between different domains is one of the most promising directions in unsupervised domain adaptation. From the perspective of moment matching, most existing discrepancy-based methods are designed to match second-order or lower moments, which, however, have limited capacity to express the statistical characteristics of non-Gaussian distributions. In this work, we propose a Higher-order Moment Matching (HoMM) method and further extend HoMM into reproducing kernel Hilbert spaces (RKHS). In particular, our proposed HoMM can perform arbitrary-order moment matching; we show that the first-order HoMM is equivalent to Maximum Mean Discrepancy (MMD) and the second-order HoMM is equivalent to Correlation Alignment (CORAL). Moreover, HoMM (order ≥ 3) is expected to perform fine-grained domain alignment, as higher-order statistics can approximate more complex, non-Gaussian distributions. We also exploit pseudo-labeled target samples to learn discriminative representations in the target domain, which further improves the transfer performance. Extensive experiments show that our proposed HoMM consistently outperforms existing moment matching methods by a large margin. Code is available at https://github.com/chenchao666/HoMM-Master
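
The idea of arbitrary-order moment matching can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that): it assumes the discrepancy is the squared Frobenius distance between the mean p-fold outer products of source and target features, and the function names `moment_tensor` and `moment_matching_loss` are hypothetical.

```python
import numpy as np

def moment_tensor(h, order):
    """Mean over samples of the order-fold outer product of each row of h (shape n x d)."""
    n, d = h.shape
    m = h  # order 1: shape (n, d)
    for _ in range(order - 1):
        # Add one more feature axis: (n, d, ..., d) times (n, 1, ..., 1, d)
        m = m[..., None] * h.reshape((n,) + (1,) * (m.ndim - 1) + (d,))
    return m.mean(axis=0)

def moment_matching_loss(hs, ht, order):
    """Squared Frobenius distance between source and target moment tensors (illustrative)."""
    diff = moment_tensor(hs, order) - moment_tensor(ht, order)
    return float(np.sum(diff ** 2))

# Two toy 1-D feature sets with equal first and second moments but opposite skew.
hs = np.array([[2.0], [-1.0], [-1.0]])
ht = np.array([[-2.0], [1.0], [1.0]])
losses = {p: moment_matching_loss(hs, ht, p) for p in (1, 2, 3)}
```

In this toy case the first- and second-order losses vanish while the third-order loss does not, which is the kind of non-Gaussian difference that higher-order matching is meant to capture. The full tensor has d^p entries, so this direct form is only practical for small feature dimensions.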

Author(s):  
Rui Wang ◽  
Weiguo Huang ◽  
Juanjuan Shi ◽  
Jun Wang ◽  
Changqing Shen ◽  
...  

Due to the data distribution discrepancy caused by time-varying working conditions, intelligent diagnosis methods fail to achieve accurate fault classification in engineering scenarios. To this end, this paper presents a novel higher-order moment matching-based adversarial domain adaptation method (HMMADA) for intelligent bearing fault diagnosis. First, a deep one-dimensional convolutional neural network is constructed as the feature extractor to learn the discriminative features of each category across different domains. Then, the distribution discrepancy across domains is significantly reduced by using the joint higher-order moment statistics (HMS) and adversarial learning. In particular, the HMS integrates the first-order and second-order statistics into a unified framework and achieves a fine-grained distribution adaptation between different domains. Finally, the feasibility and effectiveness of HMMADA are validated by several transfer experiments constructed on two different bearing datasets. The results demonstrate that the HMS is more effective than lower-order statistics.


2021 ◽  
pp. 1-7
Author(s):  
Rong Chen ◽  
Chongguang Ren

Domain adaptation aims to solve the problem of missing labels. Most existing work on domain adaptation focuses on aligning the feature distributions between the source and target domains. However, in the field of Natural Language Processing, some words convey different sentiment in different domains. Thus, not all features of the source domain should be transferred, and aligning the untransferable features causes negative transfer. To address this issue, we propose a Correlation Alignment with Attention mechanism for unsupervised Domain Adaptation (CAADA) model. In the model, an attention mechanism is introduced into the transfer process for domain adaptation, which can capture the positively transferable features in the source and target domains. Moreover, the CORrelation ALignment (CORAL) loss is utilized to minimize the domain discrepancy by aligning the second-order statistics of the positively transferable features extracted by the attention mechanism. Extensive experiments on the Amazon review dataset demonstrate the effectiveness of the CAADA method.
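
As a minimal sketch of the second-order alignment mentioned above, the following NumPy code computes a CORAL-style loss as the squared Frobenius distance between source and target feature covariances, scaled by 1/(4d²) as in the Deep CORAL formulation. The attention mechanism of CAADA is not modelled here, and the function name `coral_loss` is hypothetical.

```python
import numpy as np

def coral_loss(source, target):
    """CORAL-style loss between two feature batches of shape (n, d)."""
    d = source.shape[1]
    # Unbiased covariance over samples; columns are feature dimensions.
    cov_s = np.atleast_2d(np.cov(source, rowvar=False))
    cov_t = np.atleast_2d(np.cov(target, rowvar=False))
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))

# Toy batches: same per-feature variance, opposite correlation between features.
xs = np.array([[0.0, 0.0], [2.0, 2.0]])
xt = np.array([[0.0, 0.0], [2.0, -2.0]])
loss = coral_loss(xs, xt)
```

In a model along the lines of the abstract, such a loss would presumably be applied only to the features that the attention mechanism selects as transferable; that selection step is left out of this sketch.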


2021 ◽  
Author(s):  
Jiahao Fan ◽  
Hangyu Zhu ◽  
Xinyu Jiang ◽  
Long Meng ◽  
Cong Fu ◽  
...  

Deep sleep staging networks have reached top performance on large-scale datasets. However, these models perform worse when trained and tested on small sleep cohorts due to data inefficiency. Transferring well-trained models from large-scale datasets (source domain) to small sleep cohorts (target domain) is a promising solution but remains challenging due to the domain-shift issue. In this work, an unsupervised domain adaptation approach, domain statistics alignment (DSA), is developed to bridge the gap between the data distributions of the source and target domains. DSA adapts the source models to the target domain by modulating the domain-specific statistics of deep features stored in the Batch Normalization (BN) layers. Furthermore, we have extended DSA by introducing cross-domain statistics in each BN layer to perform DSA adaptively (AdaDSA). The proposed methods need only the well-trained source model, without access to the source data, which may be proprietary and inaccessible. DSA and AdaDSA are universally applicable to various deep sleep staging networks that have BN layers. We have validated the proposed methods by extensive experiments on two state-of-the-art deep sleep staging networks, DeepSleepNet+ and U-time. The performance was evaluated by conducting various transfer tasks on six sleep databases, with two large-scale databases, MASS and SHHS, as the source domain and four small sleep databases as the target domain. Among the latter, clinical sleep records acquired in Huashan Hospital, Shanghai, were used. The results show that both DSA and AdaDSA could significantly improve the performance of source models on target domains, providing novel insights into the domain generalization problem in sleep staging tasks.
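
The abstract does not spell out how the BN statistics are modulated. One simple reading, sketched below under that assumption, is to re-estimate the per-channel mean and variance on unlabeled target features and use them in place of the stored source statistics at inference time. The function names are hypothetical and this is not the authors' DSA or AdaDSA procedure.

```python
import numpy as np

def batch_norm_inference(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization in inference mode, using the supplied stored statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def estimate_domain_statistics(features):
    """Per-channel mean and biased variance of features with shape (n, channels)."""
    return features.mean(axis=0), features.var(axis=0)

# Statistics stored from the source domain, and target features that are shifted and scaled.
source_mean, source_var = np.array([0.0]), np.array([1.0])
target_features = np.array([[3.0], [7.0]])

# Normalizing target data with source statistics leaves a large offset ...
with_source_stats = batch_norm_inference(target_features, source_mean, source_var)

# ... while statistics estimated on the target domain re-center and re-scale it.
target_mean, target_var = estimate_domain_statistics(target_features)
with_target_stats = batch_norm_inference(target_features, target_mean, target_var)
```

Only the stored statistics change in this sketch; the learned weights, including `gamma` and `beta`, stay fixed, which is consistent with the abstract's point that no source data is required.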


Author(s):  
Juan J. González De la Rosa ◽  
Carlos G. Puntonet ◽  
A. Moreno-Muñoz

Power quality (PQ) event detection and classification is gaining importance due to the worldwide use of delicate electronic devices. Events such as lightning, large switching loads, non-linear load stresses, inadequate or incorrect wiring and grounding, or accidents involving electric lines can create problems for sensitive equipment if it is designed to operate within narrow voltage limits or if it cannot filter fluctuations in the electrical supply (Gerek et al., 2006; Moreno et al., 2006). Solving a PQ problem requires the acquisition and monitoring of long data records from the energy distribution system, along with an automated detection and classification strategy that allows the cause of these voltage anomalies to be identified. Signal processing tools have been widely used for this purpose and are mainly based on spectral analysis and wavelet transforms. These second-order methods, the most familiar to the scientific community, are based on the independence of the spectral components and the evolution of the spectrum in the time domain. Other tools are threshold-based algorithms, linear classifiers and Bayesian networks. The goal of the signal processing analysis is to obtain a feature vector from the data record under study, which constitutes the input to the computational intelligence module, which performs the classification. Some recent works adopt a different strategy, based on higher-order statistics (HOS), for the analysis of transients within PQ analysis (Gerek et al., 2006; Moreno et al., 2006) and in other fields of science (De la Rosa et al., 2004, 2005, 2007). Without perturbation, the 50-Hz voltage waveform exhibits Gaussian behaviour. Deviations from Gaussianity can be detected and characterized via HOS. Non-Gaussian processes need third- and fourth-order statistical characterization in order to be recognized.
In other words, second-order moments and cumulants may not be capable of differentiating non-Gaussian events. The situation described matches the problem of differentiating between a long-duration transient named fault (within a signal period) and a short-duration transient (25 per cent of a cycle). The latter can also bring the 50-Hz voltage to zero instantly and generally affects the sinusoid dramatically. By contrast, the long-duration transient can be considered as a modulating signal (the 50-Hz signal is the carrier). These transients are intrinsically non-stationary, so a battery of observations (sample registers) is necessary to obtain a reliable characterization. The main contribution of this work consists of the application of higher-order central cumulants to characterize PQ events, along with the use of a competitive layer as the classification tool. Results reveal that two different clusters, associated with the two types of transients, can be recognized in the 2D graph. The successful results convey the idea that the underlying physical processes associated with the analyzed transients generate different types of deviations from the typical effects that noise causes in the 50-Hz sinusoidal voltage waveform. The paper is organized as follows: the section on higher-order cumulants summarizes the main equations of the cumulants used in the paper. Then, we recall the competitive layer's foundations, along with the Kohonen learning rule. The experiment is then described, and the conclusions are drawn.
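
To make the role of third- and fourth-order statistics concrete, here is a minimal sketch of zero-lag central cumulant estimates from a sample record. It uses the standard relations for a centered signal, κ2 = m2, κ3 = m3 and κ4 = m4 − 3·m2², where mk is the k-th central sample moment. The restriction to zero lag and the function name `central_cumulants` are assumptions; the paper's exact estimators are not shown in this abstract.

```python
import numpy as np

def central_cumulants(x):
    """Zero-lag second-, third- and fourth-order central cumulant estimates of a 1-D record."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()            # center the record
    m2 = np.mean(xc ** 2)        # variance (second cumulant)
    m3 = np.mean(xc ** 3)        # third cumulant, sensitive to asymmetry
    m4 = np.mean(xc ** 4)
    k4 = m4 - 3.0 * m2 ** 2      # fourth cumulant, zero for a Gaussian process
    return m2, m3, k4

# A symmetric two-level record and an asymmetric three-sample record.
symmetric = central_cumulants([-1.0, 1.0])
skewed = central_cumulants([2.0, -1.0, -1.0])
```

For a Gaussian process both the third and fourth cumulants are zero in expectation, so nonzero estimates of either are what flag a non-Gaussian event. A feature vector for a classifier could then be formed from such values, for instance a pair of them per record.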


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Baoying Chen ◽  
Shunquan Tan

Recently, various Deepfake detection methods have been proposed, and most of them are based on convolutional neural networks (CNNs). These detection methods suffer from overfitting on the source dataset and do not perform well on cross-domain datasets whose distributions differ from that of the source dataset. To address these limitations, a new method named FeatureTransfer is proposed in this paper, which is a two-stage Deepfake detection method combined with transfer learning. First, a CNN model pretrained on a third-party large-scale Deepfake dataset is used to extract more transferable feature vectors of Deepfake videos in the source and target domains. Second, these feature vectors are fed into a domain-adversarial neural network based on backpropagation (BP-DANN) for unsupervised domain adaptive training, where the videos in the source domain have real or fake labels, while the videos in the target domain are unlabelled. The experimental results indicate that the proposed FeatureTransfer method can effectively mitigate the overfitting problem in Deepfake detection and greatly improve the performance of cross-dataset evaluation.
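
Domain-adversarial training by backpropagation is commonly implemented with a gradient reversal layer placed between the feature extractor and the domain classifier. The sketch below shows only that layer, with manual forward and backward passes in NumPy. The class name and the scaling factor `lam` are illustrative assumptions, and the abstract does not confirm the details of the BP-DANN used in the paper.

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; negates and scales gradients in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # trade-off between the label loss and the domain loss

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return features

    def backward(self, grad_from_domain_classifier):
        # The feature extractor receives the reversed gradient, so it is pushed
        # to make source and target features harder to tell apart.
        return -self.lam * grad_from_domain_classifier

layer = GradientReversal(lam=0.5)
features = np.array([[1.0, -2.0], [0.5, 3.0]])
out = layer.forward(features)
grad_in = layer.backward(np.array([[0.2, 0.4], [-1.0, 0.0]]))
```

With this layer in place, a single backward pass trains the domain classifier to separate the domains while training the feature extractor to confuse it.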


2020 ◽  
Vol 34 (05) ◽  
pp. 7618-7625
Author(s):  
Yong Dai ◽  
Jian Liu ◽  
Xiancong Ren ◽  
Zenglin Xu

Multi-source unsupervised domain adaptation (MS-UDA) for sentiment analysis (SA) aims to leverage useful information in multiple source domains to help perform SA in an unlabeled target domain that has no supervised information. Existing MS-UDA algorithms either exploit only the shared features, i.e., the domain-invariant information, or are based on weak assumptions in NLP, e.g., the smoothness assumption. To avoid these problems, we propose two transfer learning frameworks based on the multi-source domain adaptation methodology for SA, which combine the source hypotheses to derive a good target hypothesis. The first framework is a novel Weighting Scheme based Unsupervised Domain Adaptation framework (WS-UDA), which combines the source classifiers to acquire pseudo labels for target instances directly. The second framework is a Two-Stage Training based Unsupervised Domain Adaptation framework (2ST-UDA), which further exploits these pseudo labels to train a target private extractor. Importantly, the weights assigned to each source classifier are based on the relations between target instances and source domains, which are measured by a discriminator through adversarial training. Furthermore, through the same discriminator, we also achieve the separation of shared features and private features. Experimental results on two SA datasets demonstrate the promising performance of our frameworks, which outperform unsupervised state-of-the-art competitors.
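
The weighting idea can be sketched as follows, assuming that a discriminator yields a nonnegative relatedness score for each target instance and source domain, and that each source classifier outputs class probabilities. The normalization used here, the array layout and the function name `weighted_pseudo_labels` are all assumptions for illustration, not the scheme defined in the paper.

```python
import numpy as np

def weighted_pseudo_labels(source_probs, domain_scores):
    """Combine K source classifiers into per-instance pseudo labels.

    source_probs:  shape (K, n, C), class probabilities from each source classifier.
    domain_scores: shape (n, K), nonnegative relatedness of each target instance
                   to each source domain (assumed to come from a discriminator).
    """
    # Normalize the scores so the weights for each instance sum to one.
    weights = domain_scores / domain_scores.sum(axis=1, keepdims=True)
    # Weighted mixture of the classifiers' predictions, shape (n, C).
    combined = np.einsum("nk,knc->nc", weights, source_probs)
    return combined.argmax(axis=1), combined

# Two source classifiers that disagree, and two target instances that each
# resemble a different source domain.
probs = np.array([
    [[0.9, 0.1], [0.9, 0.1]],   # classifier trained on source domain 0
    [[0.2, 0.8], [0.2, 0.8]],   # classifier trained on source domain 1
])
scores = np.array([[3.0, 1.0], [1.0, 3.0]])
labels, combined = weighted_pseudo_labels(probs, scores)
```

Each instance ends up following the classifier of the source domain it resembles most, which is the behaviour the weighting scheme is meant to provide.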


Algorithms ◽  
2019 ◽  
Vol 12 (5) ◽  
pp. 96 ◽  
Author(s):  
Imad Eddine Ibrahim Bekkouch ◽  
Youssef Youssry ◽  
Rustam Gafarov ◽  
Adil Khan ◽  
Asad Masood Khattak

Domain adaptation is a sub-field of transfer learning that aims at bridging the dissimilarity gap between different domains by transferring and re-using the knowledge obtained in the source domain in the target domain. Many methods have been proposed to resolve this problem, using techniques such as generative adversarial networks (GANs), but the complexity of such methods makes them hard to apply to different problems, as fine-tuning such networks is usually a time-consuming task. In this paper, we propose a method for unsupervised domain adaptation that is both simple and effective. Our model (referred to as TripNet) harnesses the idea of a discriminator and Linear Discriminant Analysis (LDA) to push the encoder to generate domain-invariant features that are category-informative. At the same time, pseudo-labelling is used for the target data to train the classifier and to bring the same classes from both domains together. We evaluate TripNet against several existing state-of-the-art methods on three image classification tasks: digit classification (MNIST, SVHN, and USPS datasets), object recognition (Office31 dataset), and traffic sign recognition (GTSRB and Synthetic Signs datasets). Our experimental results demonstrate that (i) TripNet beats almost all existing methods that use a similarly simple model on all of these tasks; and (ii) in some cases it even beats models that are significantly more complex (or harder to train) than TripNet. Hence, the results confirm the effectiveness of using TripNet for unsupervised domain adaptation in image classification.


2020 ◽  
Vol 34 (05) ◽  
pp. 7830-7838 ◽  
Author(s):  
Han Guo ◽  
Ramakanth Pasunuru ◽  
Mohit Bansal

Domain adaptation performance of a learning algorithm on a target domain is a function of its source domain error and a divergence measure between the data distributions of these two domains. We present a study of various distance-based measures in the context of NLP tasks that characterize the dissimilarity between domains based on sample estimates. We first conduct analysis experiments to show which of these distance measures can best differentiate samples from the same versus different domains and are correlated with empirical results. Next, we develop a DistanceNet model which uses these distance measures, or a mixture of these distance measures, as an additional loss function to be minimized jointly with the task's loss function, so as to achieve better unsupervised domain adaptation. Finally, we extend this model to a novel DistanceNet-Bandit model, which employs a multi-armed bandit controller to dynamically switch between multiple source domains and allow the model to learn an optimal trajectory and mixture of domains for transfer to the low-resource target domain. We conduct experiments on popular sentiment analysis datasets with several diverse domains and show that our DistanceNet model, as well as its dynamic bandit variant, can outperform competitive baselines in the context of unsupervised domain adaptation.
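
The abstract does not say which bandit algorithm drives the controller. As an illustration only, the sketch below swaps in the standard UCB1 rule, treats each source domain as an arm, and uses a fixed, made-up reward per domain in place of a real signal such as validation improvement on the target task. The class name and the reward values are hypothetical.

```python
import math

class UCB1:
    """UCB1 bandit: one arm per source domain (illustrative stand-in controller)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms     # how often each source domain was chosen
        self.values = [0.0] * n_arms   # running mean reward per source domain

    def select(self):
        # Try every arm once before using the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            value + math.sqrt(2.0 * math.log(total) / count)
            for value, count in zip(self.values, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Hypothetical fixed rewards: source domain 1 helps the target task the most.
toy_rewards = [0.1, 0.9, 0.2]
controller = UCB1(n_arms=3)
for _ in range(500):
    arm = controller.select()
    controller.update(arm, toy_rewards[arm])
```

After the loop, the controller has spent most rounds on the most useful source domain while still sampling the others occasionally, which is the exploration and exploitation trade-off a dynamic domain-switching controller needs.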


2020 ◽  
Vol 34 (04) ◽  
pp. 6243-6250 ◽  
Author(s):  
Qian Wang ◽  
Toby Breckon

Unsupervised domain adaptation aims to address the problem of classifying unlabeled samples from the target domain when labeled samples are available only from the source domain and the data distributions differ between the two domains. As a result, classifiers trained on labeled samples in the source domain suffer from a significant performance drop when directly applied to samples from the target domain. To address this issue, different approaches have been proposed to learn domain-invariant features or domain-specific classifiers. In either case, the lack of labeled samples in the target domain can be an issue, which is usually overcome by pseudo-labeling. Inaccurate pseudo-labeling, however, could result in catastrophic error accumulation during learning. In this paper, we propose a novel selective pseudo-labeling strategy based on structured prediction. The idea of structured prediction is inspired by the fact that samples in the target domain are well clustered within the deep feature space, so that unsupervised clustering analysis can be used to facilitate accurate pseudo-labeling. Experimental results on four datasets (i.e., Office-Caltech, Office31, ImageCLEF-DA and Office-Home) show that our approach outperforms contemporary state-of-the-art methods.
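
A much simplified stand-in for selective pseudo-labeling is sketched below: each target sample takes the label of its nearest class centroid, and only the fraction of samples closest to their centroid is kept. This is not the structured prediction procedure of the paper; the centroids, the distance-based confidence and the function name `selective_pseudo_labels` are assumptions for illustration.

```python
import numpy as np

def selective_pseudo_labels(target_feats, class_centroids, keep_fraction):
    """Label target samples by nearest centroid and keep only the most confident ones.

    Returns the indices of the kept samples (most confident first) and their labels.
    """
    # Euclidean distance from every target sample to every class centroid: shape (n, C).
    dists = np.linalg.norm(
        target_feats[:, None, :] - class_centroids[None, :, :], axis=2
    )
    labels = dists.argmin(axis=1)
    nearest = dists.min(axis=1)          # a smaller distance means higher confidence
    n_keep = max(1, int(round(keep_fraction * len(target_feats))))
    keep = np.argsort(nearest)[:n_keep]
    return keep, labels[keep]

centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
targets = np.array([[0.1, 0.0], [9.0, 9.0], [5.0, 4.9], [10.0, 10.2]])
kept_idx, kept_labels = selective_pseudo_labels(targets, centroids, keep_fraction=0.5)
```

The ambiguous sample near the midpoint between the two centroids is left out, which illustrates how selection limits the error accumulation that inaccurate pseudo-labels would otherwise cause.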


Author(s):  
Xiaobin Chang ◽  
Yongxin Yang ◽  
Tao Xiang ◽  
Timothy M. Hospedales

In this paper, a unified approach to transfer learning is presented that addresses several source and target domain label-space and annotation assumptions with a single model. It is particularly effective in handling a challenging case, where source and target label-spaces are disjoint, and outperforms alternatives in both unsupervised and semi-supervised settings. The key ingredient is a common representation termed Common Factorised Space. It is shared between source and target domains, and trained with an unsupervised factorisation loss and a graph-based loss. With a wide range of experiments, we demonstrate the flexibility, relevance and efficacy of our method, both in the challenging cases with disjoint label-spaces, and in the more conventional cases such as unsupervised domain adaptation, where the source and target domains share the same label-sets.

