ADAPTING WORD ORDER TRANSFORMATION FOR VIETNAMESE DEPENDENCY PARSING

2021 ◽  
Author(s):  
Do Tran Anh Duc ◽  
Pham Hoai Phu Thinh ◽  
Vu Tran Duy ◽  
Luong An Vinh


2011 ◽  
Vol 37 (1) ◽  
pp. 197-230 ◽  
Author(s):  
Ryan McDonald ◽  
Joakim Nivre

There has been a rapid increase in the volume of research on data-driven dependency parsers in the past five years. This increase has been driven by the availability of treebanks in a wide variety of languages—due in large part to the CoNLL shared tasks—as well as the straightforward mechanisms by which dependency theories of syntax can encode complex phenomena in free word order languages. In this article, our aim is to take a step back and analyze the progress that has been made through an analysis of the two predominant paradigms for data-driven dependency parsing, which are often called graph-based and transition-based dependency parsing. Our analysis covers both theoretical and empirical aspects and sheds light on the kinds of errors each type of parser makes and how they relate to theoretical expectations. Using these observations, we present an integrated system based on a stacking learning framework and show that such a system can learn to overcome the shortcomings of each non-integrated system.
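To make the transition-based paradigm discussed above concrete, here is a minimal sketch of an arc-standard transition system applied to a toy sentence. The sentence, the hand-written action sequence, and the (head, dependent) arc format are illustrative assumptions, not material from the article; a real parser would predict the actions with a learned classifier.

```python
def parse(words, transitions):
    """Apply SHIFT / LEFT-ARC / RIGHT-ARC actions; return (head, dependent) arcs."""
    buffer = list(range(len(words)))  # token indices, left to right
    stack, arcs = [], []
    for action in transitions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":    # second-top of stack becomes dependent of top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif action == "RIGHT-ARC":   # top of stack becomes dependent of second-top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

words = ["economic", "news", "had", "effects"]
actions = ["SHIFT", "SHIFT", "LEFT-ARC",   # economic <- news
           "SHIFT", "LEFT-ARC",            # news <- had
           "SHIFT", "RIGHT-ARC"]           # had -> effects
print(parse(words, actions))  # [(1, 0), (2, 1), (2, 3)]
```

A graph-based parser would instead score all candidate arcs jointly and extract a maximum spanning tree, which is the contrast the article's error analysis builds on.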


2014 ◽  
pp. 1536-1549
Author(s):  
Utpal Garain ◽  
Sankar De

Grammar-driven dependency parsing has been attempted for Bangla (Bengali). The free-word-order nature of the language makes the development of an accurate parser very difficult. The Paninian grammatical model has been used to tackle the free-word-order problem. The approach is to simplify complex and compound sentences and then to parse simple sentences by satisfying the Karaka demands of the Demand Groups (Verb Groups). Finally, parsed structures are rejoined with appropriate links and Karaka labels. The parser has been trained with a treebank of 1,000 annotated sentences and then evaluated with unannotated test data of 150 sentences. The evaluation shows that the proposed approach achieves 90.32% and 79.81% accuracy for unlabeled and labeled attachments, respectively.
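The unlabeled and labeled attachment figures reported above are conventionally computed as the percentage of tokens whose predicted head (UAS), or head and dependency label together (LAS), match the gold annotation. A sketch of that computation, with a toy gold/predicted pair invented for illustration (the Karaka-style labels are placeholders, not the paper's data):

```python
def attachment_scores(gold, pred):
    """gold/pred: per-token (head, label) pairs. Returns (UAS, LAS) as percentages."""
    assert len(gold) == len(pred)
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head matches
    las = sum(g == p for g, p in zip(gold, pred))        # head and label match
    return 100.0 * uas / n, 100.0 * las / n

gold = [(2, "karta"), (2, "karma"), (0, "root"), (3, "adv")]
pred = [(2, "karta"), (2, "adhikarana"), (0, "root"), (2, "adv")]
print(attachment_scores(gold, pred))  # (75.0, 50.0)
```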


2016 ◽  
Vol 4 ◽  
pp. 343-356 ◽  
Author(s):  
Kristina Gulordava ◽  
Paola Merlo

The growing work in multi-lingual parsing faces the challenge of fair comparative evaluation and performance analysis across languages and their treebanks. The difficulty lies in teasing apart the properties of treebanks, such as their size or average sentence length, from those of the annotation scheme, and from the linguistic properties of languages. We propose a method to evaluate the effects of word order of a language on dependency parsing performance, while controlling for confounding treebank properties. The method uses artificially-generated treebanks that are minimal permutations of actual treebanks with respect to two word order properties: word order variation and dependency lengths. Based on these artificial data on twelve languages, we show that longer dependencies and higher word order variability degrade parsing performance. Our method also extends to minimal pairs of individual sentences, leading to a finer-grained understanding of parsing errors.
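One of the two word order properties the method controls for is dependency length. A common operationalization, assumed here for illustration, is the token distance between each dependent and its head, averaged over the sentence; the example sentence and head indices are invented:

```python
def avg_dependency_length(heads):
    """heads[i] is the 1-based head of token i+1 (0 marks the root, excluded)."""
    lengths = [abs(h - (i + 1)) for i, h in enumerate(heads) if h != 0]
    return sum(lengths) / len(lengths)

# "economic news had little effect", with "had" (token 3) as root:
heads = [2, 3, 0, 5, 3]  # economic->news, news->had, little->effect, effect->had
print(avg_dependency_length(heads))  # (1 + 1 + 1 + 2) / 4 = 1.25
```

Permuting the word order of a treebank while holding the tree structure fixed changes exactly this quantity, which is what lets the authors isolate its effect on parsing accuracy.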


2016 ◽  
Vol 39 ◽  
Author(s):  
William O'Grady

I focus on two challenges that processing-based theories of language must confront: the need to explain why language has the particular properties that it does, and the need to explain why processing pressures are manifested in the particular way that they are. I discuss these matters with reference to two illustrative phenomena: proximity effects in word order and a constraint on contraction.


1967 ◽  
Vol 10 (3) ◽  
pp. 600-605 ◽  
Author(s):  
Penelope B. Odom ◽  
Richard L. Blanton

Two groups, each containing 24 deaf subjects, were compared with 24 fifth graders and 24 twelfth graders with normal hearing on the learning of segments of written English. Eight subjects from each group learned phrasally defined segments such as “paid the tall lady,” eight more learned the same words in nonphrases having acceptable English word order such as “lady paid the tall,” and the remaining eight in each group learned the same words scrambled, “lady tall the paid.” The task consisted of 12 study-test trials. Analyses of the mean number of words recalled correctly, and of the probability of recalling the whole phrase correctly given that one word of it was recalled, indicated that hearing subjects of both ages showed facilitation on the phrasally defined segments and interference on the scrambled segments. The deaf groups showed no differential recall as a function of phrasal structure. It was concluded that the deaf do not possess the same perceptual or memory processes with regard to English as do the hearing subjects.


Author(s):  
Jae Jung Song