Project-then-Transfer: Effective Two-stage Cross-lingual Transfer for Semantic Dependency Parsing

Author(s):  
Hiroaki Ozaki ◽  
Gaku Morio ◽  
Terufumi Morishita ◽  
Toshinori Miyoshi
2014 ◽  
Author(s):  
Željko Agić ◽  
Jörg Tiedemann ◽  
Danijela Merkler ◽  
Simon Krek ◽  
Kaja Dobrovoljc ◽  
...  

Author(s):  
Shu Jiang ◽  
Zuchao Li ◽  
Hai Zhao ◽  
Bao-Liang Lu ◽  
Rui Wang

In recent years, research on dependency parsing has focused on improving accuracy on domain-specific (in-domain) test datasets and has made remarkable progress. However, innumerable real-world scenarios are not covered by these datasets; data from such scenarios are out-of-domain. As a result, parsers that perform well on in-domain data usually suffer significant performance degradation on out-of-domain data. Therefore, to adapt existing high-performing in-domain parsers to a new domain, cross-domain transfer learning methods are essential. This paper examines two scenarios for cross-domain transfer learning: semi-supervised and unsupervised. Specifically, we adopt the pre-trained language model BERT, trained on the source-domain (in-domain) data at the subword level, and introduce self-training methods adapted from tri-training for these two scenarios. The evaluation results on the NLPCC-2019 shared task and the universal dependency parsing task indicate the effectiveness of the adopted approaches for cross-domain transfer learning and show the potential of self-learning for cross-lingual transfer learning.
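The abstract above mentions self-training methods based on tri-training. As a rough illustration only, the following sketch shows the agreement rule of classical tri-training: an unlabeled item becomes a pseudo-labeled training example for one model when the other two models agree on its label. The function name `tri_training_pseudo_labels` and the toy dependency labels are hypothetical, and the paper's own variants are not reproduced here.

```python
def tri_training_pseudo_labels(preds_a, preds_b, preds_c):
    """Select pseudo-labels for each of three models (classical tri-training rule).

    preds_x[i] is model x's predicted label for unlabeled item i.
    Returns three lists of (item_index, label) pairs; list t holds the items
    on which the two models other than t agree.
    """
    all_preds = [preds_a, preds_b, preds_c]
    selected = [[], [], []]
    for i in range(len(preds_a)):
        for target in range(3):
            # The two "teacher" models are the ones that are not the target.
            j, k = [m for m in range(3) if m != target]
            if all_preds[j][i] == all_preds[k][i]:
                selected[target].append((i, all_preds[j][i]))
    return selected


# Toy predictions from three hypothetical parsers for three unlabeled tokens.
preds_a = ["nsubj", "obj", "det"]
preds_b = ["nsubj", "obl", "det"]
preds_c = ["nsubj", "obj", "amod"]
extra_a, extra_b, extra_c = tri_training_pseudo_labels(preds_a, preds_b, preds_c)
```

In a full loop, each model would be retrained on its labeled data plus its selected pseudo-labels, and the round would repeat until the selections stop changing.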


2019 ◽  
Vol 7 ◽  
pp. 643-659
Author(s):  
Amichay Doitch ◽  
Ram Yazdi ◽  
Tamir Hazan ◽  
Roi Reichart

The best solution of structured prediction models in NLP is often inaccurate because of the limited expressive power of the model or non-exact parameter estimation. One way to mitigate this problem is sampling candidate solutions from the model’s solution space, reasoning that effective exploration of this space should yield high-quality solutions. Unfortunately, sampling is often computationally hard, and many works hence back off to sub-optimal strategies, such as extracting the best-scoring solutions of the model, which are not as diverse as sampled solutions. In this paper we propose a perturbation-based approach where sampling from a probabilistic model is computationally efficient. We present a learning algorithm for the variance of the perturbations, and empirically demonstrate its importance. Moreover, while finding the argmax in our model is intractable, we propose an efficient and effective approximation. We apply our framework to cross-lingual dependency parsing across 72 corpora from 42 languages and to lightly supervised dependency parsing across 13 corpora from 12 languages, and demonstrate strong results in terms of both the quality of the entire solution list and of the final solution.
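To make the idea of perturbation-based sampling concrete, here is a minimal sketch that adds Gumbel noise to arc scores and then decodes. It is not the paper's model: the decoder picks each word's head greedily with no tree constraint, and the `scale` parameter is set by hand, whereas the abstract describes learning the perturbation variance. All function names and the toy score matrix are assumptions made for illustration.

```python
import math
import random


def gumbel(rng):
    # Standard Gumbel(0, 1) noise via inverse transform sampling.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return -math.log(-math.log(u))


def greedy_heads(scores):
    # scores[h][m] is the score of head h for modifier m; index 0 is an
    # artificial root. Each modifier picks its best head independently, so
    # the result is not guaranteed to be a tree (a simplifying assumption).
    n = len(scores)
    heads = []
    for m in range(1, n):
        candidates = [h for h in range(n) if h != m]
        heads.append(max(candidates, key=lambda h: scores[h][m]))
    return heads


def sample_parses(scores, k, scale, seed=0):
    # Draw k candidate parses by perturbing every arc score and decoding.
    rng = random.Random(seed)
    n = len(scores)
    parses = []
    for _ in range(k):
        perturbed = [
            [scores[h][m] + scale * gumbel(rng) for m in range(n)]
            for h in range(n)
        ]
        parses.append(greedy_heads(perturbed))
    return parses


# Toy 2-word sentence plus root (hypothetical scores).
scores = [
    [0.0, 2.0, 0.1],
    [0.0, 0.0, 1.5],
    [0.0, 0.3, 0.0],
]
candidates = sample_parses(scores, k=5, scale=1.0)
```

With `scale=0.0` every draw collapses to the unperturbed greedy solution; larger values trade score for diversity in the candidate list.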


2019 ◽  
Author(s):  
Wasi Uddin Ahmad ◽  
Zhisong Zhang ◽  
Xuezhe Ma ◽  
Kai-Wei Chang ◽  
Nanyun Peng

2013 ◽  
Vol 46 ◽  
pp. 203-233 ◽  
Author(s):  
H. Zhao ◽  
X. Zhang ◽  
C. Kit

Semantic parsing, i.e., the automatic derivation of a meaning representation such as an instantiated predicate-argument structure for a sentence, plays a critical role in deep processing of natural language. Unlike all other top systems for semantic dependency parsing, which rely on a pipeline framework to chain up a series of submodels each specialized for a specific subtask, the one presented in this article integrates everything into one model, in hopes of achieving desirable integrity and practicality for real applications while maintaining competitive performance. This integrative approach tackles semantic parsing as a word pair classification problem using a maximum entropy classifier. We leverage adaptive pruning of argument candidates and large-scale feature selection engineering to allow the largest feature space used so far in this field. The system achieves state-of-the-art performance on the evaluation data set for the CoNLL-2008 shared task, ranking above all but one of the top pipeline systems, which confirms its feasibility and effectiveness.
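The word pair classification view described above can be sketched as a small maximum entropy (multinomial logistic) classifier over (predicate, argument candidate) pairs. The feature templates, the label set, and the training loop below are illustrative assumptions; they do not reproduce the article's feature space, pruning, or feature selection.

```python
import math
from collections import defaultdict


def pair_features(words, tags, p, a):
    # Hypothetical indicator features for predicate index p and candidate a.
    return [
        f"pred_word={words[p]}",
        f"arg_word={words[a]}",
        f"tag_pair={tags[p]}|{tags[a]}",
        f"direction={'left' if a < p else 'right'}",
    ]


def label_probs(weights, labels, feats):
    # Maximum entropy model: softmax over summed feature-label weights.
    scores = [sum(weights[(f, y)] for f in feats) for y in labels]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return {y: e / z for y, e in zip(labels, exps)}


def train(examples, labels, epochs=50, lr=0.5):
    # Stochastic gradient ascent on the conditional log-likelihood.
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            probs = label_probs(weights, labels, feats)
            for y in labels:
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights


# Toy sentence with hypothetical role labels; "NONE" means no relation.
words = ["She", "ate", "apples"]
tags = ["PRP", "VBD", "NNS"]
labels = ["A0", "A1", "NONE"]
examples = [
    (pair_features(words, tags, 1, 0), "A0"),
    (pair_features(words, tags, 1, 2), "A1"),
]
weights = train(examples, labels)
probs = label_probs(weights, labels, pair_features(words, tags, 1, 0))
best = max(probs, key=probs.get)
```

Every word pair is scored by the same model, so predicate identification and argument labeling need no separate pipeline stages in this simplified setting.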


2020 ◽  
Author(s):  
Zixia Jia ◽  
Youmi Ma ◽  
Jiong Cai ◽  
Kewei Tu
