Language engineering for syntactic knowledge transfer

Mihaela Colhon

doi:10.2298/csis120130032c

Computer Science and Information Systems ◽

10.2298/csis120130032c ◽

2012 ◽

Vol 9 (3) ◽

pp. 1231-1247 ◽

Cited By ~ 3

Author(s):

Mihaela Colhon

Keyword(s):

Knowledge Transfer ◽

Syntactic Parsing ◽

Language Engineering ◽

Syntactic Knowledge ◽

Cross Lingual ◽

Parallel Texts

In this paper we present a method for constructing an English-Romanian treebank, together with the evaluation results obtained. The treebank is built upon a parallel English-Romanian corpus that is word-aligned and annotated at the morphological and syntactic levels. The syntactic trees of the Romanian texts are generated from the syntactic phrases of the parallel English texts, which are obtained automatically by syntactic parsing. The method reuses and adjusts existing tools and algorithms for cross-lingual transfer of syntactic constituents and for syntactic tree alignment.

Multilingual Projection for Parsing Truly Low-Resource Languages

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00100 ◽

2016 ◽

Vol 4 ◽

pp. 301-312 ◽

Cited By ~ 12

Author(s):

Željko Agić ◽

Anders Johannsen ◽

Barbara Plank ◽

Héctor Martínez Alonso ◽

Natalie Schluter ◽

...

Keyword(s):

Empirical Evaluation ◽

Upper Bounds ◽

Low Resource ◽

Part Of Speech Tagging ◽

Part Of Speech ◽

Novel Approach ◽

Cross Lingual ◽

Test Languages ◽

Speech Tagging ◽

Parallel Texts

We propose a novel approach to cross-lingual part-of-speech tagging and dependency parsing for truly low-resource languages. Our annotation projection-based approach yields tagging and parsing models for over 100 languages. All that is needed are freely available parallel texts, and taggers and parsers for resource-rich languages. The empirical evaluation across 30 test languages shows that our method consistently provides top-level accuracies, close to established upper bounds, and outperforms several competitive baselines.

Neural knowledge transfer for low-resource sentiment analysis: cross-domain, cross-task & cross-lingual

10.14711/thesis-991012879862503412 ◽

2020 ◽

Author(s):

Zheng Li

Keyword(s):

Knowledge Transfer ◽

Sentiment Analysis ◽

Cross Domain ◽

Cross Lingual

Low Resource Named Entity Recognition Using Contextual Word Representation and Neural Cross-Lingual Knowledge Transfer

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-36708-4_25 ◽

2019 ◽

pp. 299-311

Author(s):

Soyeon Caren Han ◽

Yingru Lin ◽

Siqu Long ◽

Josiah Poon

Keyword(s):

Knowledge Transfer ◽

Named Entity Recognition ◽

Entity Recognition ◽

Low Resource ◽

Named Entity ◽

Word Representation ◽

Cross Lingual

Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR

2012 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt.2012.6424230 ◽

2012 ◽

Cited By ~ 55

Author(s):

Pawel Swietojanski ◽

Arnab Ghoshal ◽

Steve Renals

Keyword(s):

Knowledge Transfer ◽

Cross Lingual

A Multi-media Approach to Cross-lingual Entity Knowledge Transfer

10.18653/v1/p16-1006 ◽

2016 ◽

Cited By ~ 4

Author(s):

Di Lu ◽

Xiaoman Pan ◽

Nima Pourdamghani ◽

Shih-Fu Chang ◽

Heng Ji ◽

...

Keyword(s):

Knowledge Transfer ◽

Multi Media ◽

Cross Lingual

Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6500 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9547-9554

Author(s):

Mozhi Zhang ◽

Yoshinari Fujinuma ◽

Jordan Boyd-Graber

Keyword(s):

Knowledge Transfer ◽

Text Classification ◽

Document Classification ◽

Training Data ◽

Target Language ◽

Source Language ◽

Low Resource ◽

Classification Framework ◽

Related Language ◽

Cross Lingual

Text classification must sometimes be applied in a low-resource language with no labeled training data. However, training data may be available in a related language. We investigate whether character-level knowledge transfer from a related language helps text classification. We present a cross-lingual document classification framework (CACO) that exploits cross-lingual subword similarity by jointly training a character-based embedder and a word-based classifier. The embedder derives vector representations for input words from their written forms, and the classifier makes predictions based on the word vectors. We use a joint character representation for both the source language and the target language, which allows the embedder to generalize knowledge about source language words to target language words with similar forms. We propose a multi-task objective that can further improve the model if additional cross-lingual or monolingual resources are available. Experiments confirm that character-level knowledge transfer is more data-efficient than word-level transfer between related languages.
