Adversarial Training for Cross-Domain Universal Dependency Parsing

Author(s): Motoki Sato, Hitoshi Manabe, Hiroshi Noji, Yuji Matsumoto

Author(s): Tseng-Hung Chen, Yuan-Hong Liao, Ching-Yao Chuang, Wan-Ting Hsu, Jianlong Fu, ...

Author(s): Shu Jiang, Zuchao Li, Hai Zhao, Bao-Liang Lu, Rui Wang

In recent years, research on dependency parsing has focused on improving accuracy on domain-specific (in-domain) test sets and has made remarkable progress. However, countless real-world scenarios are not covered by these datasets, i.e., they constitute out-of-domain data. As a result, parsers that perform well on in-domain data usually suffer significant performance degradation on out-of-domain data. Therefore, to adapt existing high-performing in-domain parsers to a new domain scenario, cross-domain transfer learning methods are essential for solving the domain problem in parsing. This paper examines two scenarios for cross-domain transfer learning: semi-supervised and unsupervised. Specifically, we adopt the pre-trained language model BERT for training on the source-domain (in-domain) data at the subword level and introduce self-training methods derived from tri-training for both scenarios. Evaluation results on the NLPCC-2019 shared task and the universal dependency parsing task indicate the effectiveness of the adopted approaches for cross-domain transfer learning and show the potential of self-training for cross-lingual transfer learning.
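
The abstract gives no implementation details, but the core self-training idea can be illustrated with a minimal, runnable sketch of tri-training-style pseudo-labeling: three models label unlabeled target-domain data, and a sample is added to one model's training set when the other two agree on its label. The toy data and the scikit-learn decision trees below are illustrative stand-ins for the BERT-based parsers, not the authors' code:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy stand-ins: labeled source-domain data and unlabeled target-domain data.
X_src = rng.normal(size=(300, 10))
y_src = (X_src[:, 0] + X_src[:, 1] > 0).astype(int)
X_tgt = rng.normal(loc=0.5, size=(200, 10))          # shifted, i.e. out-of-domain

# Initialize three diverse models on bootstrap samples of the source data.
models = []
for seed in range(3):
    idx = rng.integers(0, len(X_src), size=len(X_src))
    models.append(DecisionTreeClassifier(random_state=seed).fit(X_src[idx], y_src[idx]))

for _ in range(5):                                    # a few self-training rounds
    preds = [m.predict(X_tgt) for m in models]
    for i in range(3):
        j, k = [x for x in range(3) if x != i]
        agree = preds[j] == preds[k]                  # the other two models agree
        if agree.any():                               # add agreed pseudo-labels to model i
            X_aug = np.vstack([X_src, X_tgt[agree]])
            y_aug = np.concatenate([y_src, preds[j][agree]])
            models[i] = DecisionTreeClassifier(random_state=i).fit(X_aug, y_aug)

# Final target-domain prediction: majority vote of the three models.
votes = np.stack([m.predict(X_tgt) for m in models])
y_tgt_pred = (votes.sum(axis=0) >= 2).astype(int)
```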


Author(s): Aibo Guo, Xinyi Li, Ning Pang, Xiang Zhao

A community Q&A forum is a special type of social media that provides a platform for raising questions and answering them (both by forum participants), facilitating online information sharing. Currently, community Q&A forums in professional domains have attracted a large number of users by offering professional knowledge. To support information access and save users the effort of raising new questions, such forums usually come with a question retrieval function, which retrieves existing questions (and their answers) similar to a user's query. However, it is difficult for community Q&A forums to cover all domains, especially those that have emerged recently with little labeled data but great discrepancy from existing domains. We refer to this scenario as cross-domain question retrieval. To handle its unique challenges, we design a model based on adversarial training, namely X-QR, which consists of two modules: a domain discriminator and a sentence matcher. The domain discriminator aims at aligning the source and target data distributions and unifying the feature space through domain-adversarial training. With the assistance of the domain discriminator, the sentence matcher is able to learn domain-consistent knowledge for the final matching prediction. To the best of our knowledge, this work is among the first to investigate the domain adaptation problem of sentence matching for community Q&A question retrieval. The experimental results suggest that the proposed X-QR model offers better performance than conventional sentence matching methods on cross-domain community Q&A tasks.
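
The two modules are described only at a high level here; the following is a generic PyTorch sketch of domain-adversarial training with a gradient-reversal layer, a common way to realize the pattern the abstract describes. Module names, feature sizes, and the random inputs are assumptions for illustration, not the X-QR implementation:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

class AdversarialMatcher(nn.Module):
    """Shared sentence-pair encoder + matching head + domain discriminator."""
    def __init__(self, in_dim=768, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.matcher = nn.Linear(hidden, 2)        # match / no-match
        self.domain_disc = nn.Linear(hidden, 2)    # source / target

    def forward(self, pair_repr, lamb=1.0):
        h = self.encoder(pair_repr)
        match_logits = self.matcher(h)
        # Gradient reversal trains the encoder to *confuse* the discriminator,
        # pushing source and target features toward a shared space.
        domain_logits = self.domain_disc(GradReverse.apply(h, lamb))
        return match_logits, domain_logits

# Usage sketch: matching loss on labeled source pairs, domain loss on both domains.
model = AdversarialMatcher()
src = torch.randn(8, 768)                          # placeholder sentence-pair features
tgt = torch.randn(8, 768)
match_logits, src_dom = model(src)
_, tgt_dom = model(tgt)
ce = nn.CrossEntropyLoss()
loss = (ce(match_logits, torch.randint(0, 2, (8,)))
        + ce(src_dom, torch.zeros(8, dtype=torch.long))
        + ce(tgt_dom, torch.ones(8, dtype=torch.long)))
loss.backward()
```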


2019, Vol. 7, pp. 695-713
Author(s): Guy Rotman, Roi Reichart

Neural dependency parsing has proven very effective, achieving state-of-the-art results on numerous domains and languages. Unfortunately, it requires large amounts of labeled data, which is costly and laborious to create. In this paper, we propose a self-training algorithm that alleviates this annotation bottleneck by training a parser on its own output. Our Deep Contextualized Self-training (DCST) algorithm utilizes representation models trained on sequence labeling tasks that are derived from the parser's output when applied to unlabeled data, and integrates these models with the base parser through a gating mechanism. We conduct experiments across multiple languages, both in low-resource in-domain and in cross-domain setups, and demonstrate that DCST substantially outperforms traditional self-training as well as recent semi-supervised training methods.
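
The gating mechanism is only named in the abstract; one plausible form is an element-wise sigmoid gate over the concatenated representations, sketched below in PyTorch. The dimensions and module names are assumptions for illustration, not the DCST code:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Element-wise gate mixing the base parser representation with an auxiliary
    representation learned on parser-derived sequence-labeling tasks."""
    def __init__(self, dim=400):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, h_base, h_aux):
        # g in (0, 1): how much of each source to keep, per dimension.
        g = torch.sigmoid(self.gate(torch.cat([h_base, h_aux], dim=-1)))
        return g * h_base + (1.0 - g) * h_aux

# Usage sketch: per-token representations for a small batch of sentences.
fusion = GatedFusion(dim=400)
h_base = torch.randn(2, 30, 400)   # from the base dependency parser's encoder
h_aux = torch.randn(2, 30, 400)    # from an encoder trained on pseudo-labeled data
h_fused = fusion(h_base, h_aux)    # fed to the parser's scoring layers
```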


Author(s): Ying Li, Meishan Zhang, Zhenghua Li, Min Zhang, Zhefeng Wang, ...

2020, Vol. 380, pp. 125-132
Author(s): Yuhu Shan, Chee Meng Chew, Wen Feng Lu

2020, Vol. 34 (04), pp. 4028-4035
Author(s): Aditya Grover, Christopher Chute, Rui Shu, Zhangjie Cao, Stefano Ermon

Given datasets from multiple domains, a key challenge is to efficiently exploit these data sources for modeling a target domain. Variants of this problem have been studied in many contexts, such as cross-domain translation and domain adaptation. We propose AlignFlow, a generative modeling framework that models each domain via a normalizing flow. The use of normalizing flows allows for a) flexibility in specifying learning objectives via adversarial training, maximum likelihood estimation, or a hybrid of the two methods; and b) learning and exact inference of a shared representation in the latent space of the generative model. We derive a uniform set of conditions under which AlignFlow is marginally-consistent for the different learning objectives. Furthermore, we show that AlignFlow guarantees exact cycle consistency in mapping datapoints from a source domain to target and back to the source domain. Empirically, AlignFlow outperforms relevant baselines on image-to-image translation and unsupervised domain adaptation and can be used to simultaneously interpolate across the various domains using the learned representation.
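
The exact cycle-consistency claim follows from invertibility: the cross-domain map composes one flow with the inverse of another, so mapping a point to the other domain and back recovers it exactly. A tiny numerical sketch, with simple affine bijections standing in for the learned normalizing flows (the specific maps are illustrative assumptions):

```python
import numpy as np

class AffineFlow:
    """Toy invertible map x -> a*x + b, standing in for a learned normalizing flow."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def to_latent(self, x):        # domain -> shared latent Z
        return self.a * x + self.b

    def from_latent(self, z):      # exact inverse: Z -> domain
        return (z - self.b) / self.a

flow_a = AffineFlow(a=2.0, b=1.0)   # domain A <-> Z
flow_b = AffineFlow(a=0.5, b=-3.0)  # domain B <-> Z

def a_to_b(x):
    return flow_b.from_latent(flow_a.to_latent(x))

def b_to_a(y):
    return flow_a.from_latent(flow_b.to_latent(y))

x = np.array([0.3, -1.2, 4.0])
assert np.allclose(x, b_to_a(a_to_b(x)))   # exact cycle consistency by construction
```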

