In recent years, research on dependency parsing has focused on improving accuracy on domain-specific (in-domain) test datasets and has made remarkable progress. However, innumerable real-world scenarios are not covered by these datasets and therefore constitute out-of-domain data. As a result, parsers that perform well on in-domain data usually suffer significant performance degradation on out-of-domain data. Cross-domain transfer learning methods are therefore essential for adapting high-performing in-domain parsers to a new domain. This paper examines two scenarios for cross-domain transfer learning: semi-supervised and unsupervised. Specifically, we train the pre-trained language model BERT on the source-domain (in-domain) data at the subword level and introduce self-training methods derived from tri-training for these two scenarios. The evaluation results on the NLPCC-2019 shared task and the universal dependency parsing task indicate the effectiveness of the adopted approaches for cross-domain transfer learning and show the potential of self-learning for cross-lingual transfer learning.
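The agreement rule at the heart of classic tri-training can be sketched as follows. This is a minimal illustration and not the paper's actual variant: the function name `tri_training_pseudo_labels` is hypothetical, models are assumed to be plain callables that map an input to a label, and the error-rate checks of the original tri-training algorithm are omitted.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

# Assumption: a "model" is any callable mapping an input to a hashable label.
Model = Callable[[object], Hashable]


def tri_training_pseudo_labels(
    models: Sequence[Model], unlabeled: Sequence[object]
) -> List[List[Tuple[object, Hashable]]]:
    """For each of three models, collect the unlabeled examples on which
    the OTHER two models agree, paired with the agreed label."""
    if len(models) != 3:
        raise ValueError("tri-training uses exactly three models")
    new_data: List[List[Tuple[object, Hashable]]] = [[], [], []]
    for x in unlabeled:
        preds = [m(x) for m in models]
        for i in range(3):
            j, k = [t for t in range(3) if t != i]
            if preds[j] == preds[k]:
                # The two peers agree, so their label becomes a
                # pseudo-label in model i's next training set.
                new_data[i].append((x, preds[j]))
    return new_data
```

In a full self-training loop, each model would be retrained on its labeled data plus its pseudo-labeled set, and the round would repeat until the pseudo-labels stop changing.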
Multi-source domain adaptation is a challenging topic in transfer learning, especially when the data of each domain are represented by different kinds of features, a setting known as Multi-source Heterogeneous Domain Adaptation (MHDA). Handling the MHDA paradigm requires both exploiting the knowledge extracted from multiple sources and bridging the heterogeneous feature spaces. This article proposes a novel method named Multiple Graphs and Low-rank Embedding (MGLE), which models the local structure information of multiple domains using multiple graphs and learns a low-rank embedding of the target domain. MGLE then augments the learned embedding with the original target data. Specifically, we introduce modules for both domain discrepancy and domain relevance into the multiple-graph and low-rank embedding learning procedure. Subsequently, we develop an iterative optimization algorithm to solve the resulting problem. We evaluate the proposed method on several real-world datasets. The results show that MGLE outperforms the baseline methods on several metrics, including AUC, MAE, accuracy, precision, F1 score, and MCC.
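One common way to encode the local structure of a domain in a graph is a k-nearest-neighbor graph and its Laplacian. The sketch below assumes that construction purely for illustration; the abstract does not specify how MGLE builds its graphs, and the function name `knn_laplacian` and the binary edge weights are hypothetical choices.

```python
import math
from typing import List, Sequence


def knn_laplacian(points: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Unnormalized graph Laplacian L = D - W of a symmetrized k-NN graph
    with binary edge weights (an illustrative choice)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Sort the other points by Euclidean distance to point i.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            w[i][j] = w[j][i] = 1.0  # symmetrize: keep the edge in both directions
    lap = [[-w[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] = sum(w[i])  # degree of node i on the diagonal
    return lap
```

With several domains, one such Laplacian could be built per domain, for example `[knn_laplacian(d, k) for d in domains]`, giving the "multiple graphs" that a method of this kind would then combine.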
Domain adaptation aims to improve the performance of learning tasks in a target domain by leveraging knowledge extracted from a source domain, that is, by transferring knowledge between the two domains. However, this problem becomes extremely challenging when the two domains are characterized by different types of features, i.e., when the feature spaces of the source and target domains differ, a setting referred to as heterogeneous domain adaptation (HDA). To solve this problem, we propose a novel model called Knowledge Preserving and Distribution Alignment (KPDA), which learns an augmented target space by jointly minimizing information loss and maximizing domain distribution alignment. Specifically, we seek a latent space in which knowledge is preserved by means of graph Laplacian terms and reconstruction regularization. Moreover, we adopt the Maximum Mean Discrepancy (MMD) to align the distributions of the source and target domains in the latent space. Mathematically, KPDA is formulated as a minimization problem with orthogonality constraints involving two projection variables. We then develop an algorithm based on the Gauss–Seidel iteration scheme that splits the problem into two subproblems, each solved by a search algorithm based on the Barzilai–Borwein (BB) stepsize. Promising results demonstrate the effectiveness of the proposed method.
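The Maximum Mean Discrepancy used for distribution alignment can be estimated from two samples. The following is a minimal sketch of the biased empirical estimate of the squared MMD. The RBF kernel and the names `rbf` and `mmd2` are assumptions made for illustration; the abstract does not state which kernel KPDA uses.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def rbf(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))


def mmd2(
    xs: Sequence[Vector],
    ys: Sequence[Vector],
    kernel: Callable[[Vector, Vector], float] = rbf,
) -> float:
    """Biased estimate of squared MMD:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""

    def mean_k(p: Sequence[Vector], q: Sequence[Vector]) -> float:
        return sum(kernel(a, b) for a in p for b in q) / (len(p) * len(q))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)
```

The estimate is zero when the two samples coincide and grows as they move apart, which is why minimizing it over a projection pulls the source and target distributions together in the latent space.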
Deep learning has achieved considerable success in medical image segmentation. However, applying deep learning in clinical environments often involves two problems: (1) scarcity of annotated data, as data annotation is time-consuming, and (2) varying attributes of different datasets due to domain shift. To address these problems, we propose an improved generative adversarial network (GAN) segmentation model, called U-shaped GAN, for chest radiograph datasets with limited annotations. The semi-supervised learning approach and the unsupervised domain adaptation (UDA) approach are modeled in a unified framework for effective segmentation. We improve the GAN by replacing the traditional discriminator with a U-shaped net, which predicts a label for each pixel. The proposed U-shaped net is designed for high-resolution radiographs (1,024 × 1,024) to segment effectively while taking the computational burden into account. Pointwise convolution is applied in U-shaped GAN for dimensionality reduction, which decreases the number of feature maps while retaining their salient features. Moreover, we design the U-shaped net with a pretrained ResNet-50 as the encoder to avoid the computational burden of training the encoder from scratch. We propose a semi-supervised learning approach that learns from limited annotated data while exploiting additional unannotated data with a pixel-level loss. U-shaped GAN is extended to UDA by treating the source-domain and target-domain data as the annotated and unannotated data of the semi-supervised learning approach, respectively. Compared with previous models that deal with the aforementioned problems separately, U-shaped GAN is compatible with the varying data distributions of multiple medical centers and offers efficient training and optimized performance. U-shaped GAN can be generalized to chest radiograph segmentation for clinical deployment. We evaluate U-shaped GAN on two chest radiograph datasets. U-shaped GAN is shown to significantly outperform the state-of-the-art models.
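To make the dimensionality-reduction role of pointwise convolution concrete, the sketch below applies a 1 × 1 convolution to a stack of feature maps in plain Python. It is an illustration of the general operation only, not of the U-shaped GAN implementation; the function name `pointwise_conv`, the nested-list layout `[channel][row][column]`, and the absence of a bias term are all assumptions.

```python
from typing import List

FeatureMaps = List[List[List[float]]]  # assumed layout: [channel][row][column]


def pointwise_conv(feature_maps: FeatureMaps, weights: List[List[float]]) -> FeatureMaps:
    """1x1 convolution: every output channel is a per-pixel weighted sum of
    the input channels. `weights` has shape [out_channels][in_channels]."""
    c_in = len(feature_maps)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    return [
        [
            [
                sum(row_w[c] * feature_maps[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for row_w in weights
    ]
```

Choosing fewer rows in `weights` than there are input channels reduces the number of feature maps while leaving the spatial resolution unchanged, which is the property the abstract relies on.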
Owing to the non-invasiveness and high precision of electroencephalography (EEG), the combination of EEG and artificial intelligence (AI) is often used for emotion recognition. However, the internal differences in EEG data have become an obstacle to classification accuracy. To address this problem, domain adaptation, which exploits labeled data of a similar nature but from different domains, offers an attractive option. Most existing studies aggregate the EEG data from different subjects and sessions into a single source domain, which ignores the assumption that a source domain has a certain marginal distribution. Moreover, existing methods often align only the representation distributions extracted from a single structure, which may contain only partial information. Therefore, we propose multi-source and multi-representation adaptation (MSMRA) for cross-domain EEG emotion recognition, which divides the EEG data from different subjects and sessions into multiple domains and aligns the distributions of multiple representations extracted from a hybrid structure. Two datasets, SEED and SEED IV, are used to validate the proposed method in cross-session and cross-subject transfer scenarios. Experimental results demonstrate that our model outperforms state-of-the-art models in most settings.
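The first step described above, treating each subject and session as its own source domain instead of pooling everything, can be sketched as a simple grouping. The record layout (dictionaries with `subject` and `session` keys) and the function name `split_into_source_domains` are hypothetical; the abstract does not describe the data format.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Mapping, Tuple

DomainKey = Tuple[Hashable, Hashable]


def split_into_source_domains(
    samples: List[Mapping[str, object]],
) -> Dict[DomainKey, List[Mapping[str, object]]]:
    """Group samples so that every (subject, session) pair forms its own
    source domain. The 'subject' and 'session' keys are assumed fields."""
    domains: Dict[DomainKey, List[Mapping[str, object]]] = defaultdict(list)
    for sample in samples:
        domains[(sample["subject"], sample["session"])].append(sample)
    return dict(domains)
```

Each resulting group could then be aligned with the target domain separately, so that no single pooled source is assumed to share one marginal distribution.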