A Universal Feature Schema for Rich Morphological Annotation and Fine-Grained Cross-Lingual Part-of-Speech Tagging

We propose a novel approach to cross-lingual part-of-speech tagging and dependency parsing for truly low-resource languages. Our annotation projection-based approach yields tagging and parsing models for over 100 languages. All that is needed are freely available parallel texts, and taggers and parsers for resource-rich languages. The empirical evaluation across 30 test languages shows that our method consistently provides top-level accuracies, close to established upper bounds, and outperforms several competitive baselines.

Fine-grained part-of-speech tagging in Nepali text

Procedia Computer Science, DOI 10.1016/j.procs.2021.05.099, 2021, Vol 189, pp. 300-311
Author(s): Ingroj Shrestha, Shreeya Singh Dhakal
Keyword(s): Fine Grained, Part Of Speech Tagging, Part Of Speech, Speech Tagging

Cross-Lingual Part-of-Speech Tagging through Ambiguous Learning

DOI 10.3115/v1/d14-1187, 2014, Cited By ~ 7
Author(s): Guillaume Wisniewski, Nicolas Pécheux, Souhir Gahbiche-Braham, François Yvon
Keyword(s): Part Of Speech Tagging, Part Of Speech, Cross Lingual, Speech Tagging

Unsupervised Cross-Lingual Part-of-Speech Tagging for Truly Low-Resource Scenarios

DOI 10.18653/v1/2020.emnlp-main.391, 2020
Author(s): Ramy Eskander, Smaranda Muresan, Michael Collins
Keyword(s): Low Resource, Part Of Speech Tagging, Part Of Speech, Cross Lingual, Speech Tagging

Joint Prediction of Morphosyntactic Categories for Fine-Grained Arabic Part-of-Speech Tagging Exploiting Tag Dictionary Information

DOI 10.18653/v1/k17-1042, 2017, Cited By ~ 1
Author(s): Go Inoue, Hiroyuki Shindo, Yuji Matsumoto
Keyword(s): Fine Grained, Part Of Speech Tagging, Part Of Speech, Joint Prediction, Speech Tagging

Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging

Transactions of the Association for Computational Linguistics, DOI 10.1162/tacl_a_00205, 2013, Vol 1, pp. 1-12, Cited By ~ 23
Author(s): Oscar Täckström, Dipanjan Das, Slav Petrov, Ryan McDonald, Joakim Nivre
Keyword(s): Conditional Random Field, Target Language, Part Of Speech Tagging, Part Of Speech, European Languages, Partially Observed, Resource Poor, Conditional Random Field Model, Cross Lingual, Speech Tagging

We consider the construction of part-of-speech taggers for resource-poor languages. Recently, manually constructed tag dictionaries from Wiktionary and dictionaries projected via bitext have been used as type constraints to overcome the scarcity of annotated data in this setting. In this paper, we show that additional token constraints can be projected from a resource-rich source language to a resource-poor target language via word-aligned bitext. We present several models to this end; in particular a partially observed conditional random field model, where coupled token and type constraints provide a partial signal for training. Averaged across eight previously studied Indo-European languages, our model achieves a 25% relative error reduction over the prior state of the art. We further present successful results on seven additional languages from different families, empirically demonstrating the applicability of coupled token and type constraints across a diverse set of languages.
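The coupling of token and type constraints described in this abstract can be illustrated with a small sketch. It is only one plausible reading of the idea, not the authors' implementation: the toy tagset, the function names `allowed_tags` and `constrain_sentence`, and the exact fallback order are assumptions made for illustration. The result is a set of permitted tags per token, which is the kind of partial signal a partially observed model could be trained on.

```python
from typing import Dict, List, Optional, Set

# Toy tagset, assumed purely for illustration.
ALL_TAGS: Set[str] = {"NOUN", "VERB", "ADJ", "DET"}


def allowed_tags(
    word: str,
    projected_tag: Optional[str],
    tag_dict: Dict[str, Set[str]],
    all_tags: Set[str] = ALL_TAGS,
) -> Set[str]:
    """Combine a type constraint (tag dictionary) with a token constraint
    (tag projected through a word alignment) into one set of permitted tags.

    Assumed rule, hypothetical and simplified:
      1. Word in dictionary and projected tag agrees -> only the projected tag.
      2. Word in dictionary, no agreement            -> the dictionary tags.
      3. Word not in dictionary, projected tag given -> only the projected tag.
      4. Otherwise                                   -> the full tagset.
    """
    type_tags = tag_dict.get(word)
    if type_tags:
        if projected_tag in type_tags:
            return {projected_tag}
        return set(type_tags)
    if projected_tag is not None:
        return {projected_tag}
    return set(all_tags)


def constrain_sentence(
    words: List[str],
    projected: List[Optional[str]],
    tag_dict: Dict[str, Set[str]],
) -> List[Set[str]]:
    """Return the permitted tag set for every token of one sentence."""
    return [allowed_tags(w, p, tag_dict) for w, p in zip(words, projected)]
```

For instance, with a dictionary that lists "run" as NOUN or VERB, a projected VERB tag narrows the token to VERB alone, while a conflicting projected ADJ tag is ignored in favor of the dictionary entry.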

Cross-lingual Annotation Projection Is Effective for Neural Part-of-Speech Tagging

DOI 10.18653/v1/w19-1425, 2019, Cited By ~ 1
Author(s): Matthias Huck, Diana Dutka, Alexander Fraser
Keyword(s): Part Of Speech Tagging, Part Of Speech, Cross Lingual, Speech Tagging

A fine-grained Chinese word segmentation and part-of-speech tagging corpus for clinical text

BMC Medical Informatics and Decision Making, DOI 10.1186/s12911-019-0770-7, 2019, Vol 19 (S2), Cited By ~ 3
Author(s): Ying Xiong, Zhongmin Wang, Dehuan Jiang, Xiaolong Wang, Qingcai Chen, ...
Keyword(s): Word Segmentation, Chinese Word, Chinese Word Segmentation, Clinical Text, Fine Grained, Part Of Speech Tagging, Part Of Speech, Speech Tagging

How much can part-of-speech tagging help parsing?

Natural Language Engineering, DOI 10.1017/s1351324905004079, 2006, Vol 12 (4), pp. 373-389, Cited By ~ 2
Author(s): Mary Dalrymple
Keyword(s): Large Scale, Equivalence Classes, Fine Grained, Development Platform, English Grammar, Part Of Speech Tagging, Part Of Speech, Folk Wisdom, Speech Tagging

Folk wisdom holds that incorporating a part-of-speech tagger into a system that performs deep linguistic analysis will improve the speed and accuracy of the system. Previous studies of tagging have tested this belief by incorporating an existing tagger into a parsing system and observing the effect on the speed of the parser and accuracy of the results. However, not much work has been done to determine in a fine-grained manner exactly how much tagging can help to disambiguate or reduce ambiguity in parser output. We take a new approach to this issue by examining the full parse-forest output of a large-scale LFG-based English grammar (Riezler et al. (2002)) running on the XLE grammar development platform (Maxwell and Kaplan (1993); Maxwell and Kaplan (1996)); and partitioning the parse outputs into equivalence classes based on the tag sequences for each parse. If we find a large number of tag-sequence equivalence classes for each sentence, we can conclude that different parses tend to be distinguished by their tags; a small number means that tagging would not help much in reducing ambiguity. In this way, we can determine how much tagging would help us in the best case, if we had the “perfect tagger” to give us the correct tag sequence for each sentence. We show that if a perfect tagger were available, a reduction in ambiguity of about 50% would be available. Somewhat surprisingly, about 30% of the sentences in the corpus that was examined would not be disambiguated, even by the perfect tagger, since all of the parses for these sentences shared the same tag sequence. Our study also helps to inform research on tagging by providing a targeted determination of exactly which tags can help the most in disambiguation.
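The partitioning step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual procedure: parses are represented as hypothetical `(parse_id, tag_sequence)` pairs, and the ambiguity reduction is computed here as the fraction of parses eliminated once a perfect tagger fixes the tag sequence, which may differ from the measure used in the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: a parse is an id plus the tag sequence it assigns.
Parse = Tuple[str, Sequence[str]]


def tag_equivalence_classes(parses: List[Parse]) -> Dict[Tuple[str, ...], List[str]]:
    """Group parses of one sentence by the tag sequence they share."""
    classes: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for parse_id, tags in parses:
        classes[tuple(tags)].append(parse_id)
    return dict(classes)


def best_case_remaining(parses: List[Parse], gold_tags: Sequence[str]) -> int:
    """Number of parses that survive when a perfect tagger supplies gold_tags.

    Raises KeyError if no parse carries the gold tag sequence.
    """
    return len(tag_equivalence_classes(parses)[tuple(gold_tags)])


def ambiguity_reduction(parses: List[Parse], gold_tags: Sequence[str]) -> float:
    """Fraction of parses eliminated by the perfect tagger (assumed measure)."""
    if not parses:
        return 0.0
    return 1.0 - best_case_remaining(parses, gold_tags) / len(parses)
```

A sentence whose parses all fall into a single class yields a reduction of 0.0, which corresponds to the case the abstract describes where even a perfect tagger cannot disambiguate.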
