shallow parser
Recently Published Documents


TOTAL DOCUMENTS: 16 (FIVE YEARS: 2)
H-INDEX: 5 (FIVE YEARS: 0)

2020 ◽  
Vol 17 (9) ◽  
pp. 4255-4257
Author(s):  
Abhay Desalli ◽  
R. Anirudh ◽  
N. Prajwal Pai ◽  
S. B. Rajeshwari ◽  
Jagadish S. Kallimani

Polysemous words share the same surface form but take on different senses depending on context, and within a single sentence this can yield different part-of-speech (POS) assignments. When such a word occurs more than once in a Kannada sentence, the resulting ambiguity makes translation into English error-prone, since the sentence is easily misinterpreted. The proposed algorithm, demonstrated on a set of examples, identifies the incorrectly POS-tagged word in a sentence: the POS-tagged sentence is obtained as input from a shallow parser, and the algorithm then exploits the structure of the Kannada language to locate the mis-tagged word.
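
To make the idea concrete, here is a minimal Python sketch (not the authors' algorithm) that takes shallow-parser output as a list of (word, POS) pairs and flags any word whose repeated occurrences in the sentence carry conflicting tags; the Kannada word and the NN/VM tags are illustrative placeholders only.

    # Minimal sketch, assuming shallow-parser output as (word, tag) pairs.
    # A word that appears more than once with conflicting POS tags is a
    # candidate for polysemy-induced mis-tagging.
    from collections import defaultdict

    def find_suspect_tags(tagged_sentence):
        """Return words whose repeated occurrences carry different POS tags."""
        tags_seen = defaultdict(set)
        for word, tag in tagged_sentence:
            tags_seen[word].add(tag)
        return {w: tags for w, tags in tags_seen.items() if len(tags) > 1}

    # Hypothetical example: the same surface form tagged as noun (NN)
    # and verb (VM) within one sentence.
    sentence = [("ಹಣ್ಣು", "NN"), ("ಬೇಕು", "VM"), ("ಹಣ್ಣು", "VM")]
    print(find_suspect_tags(sentence))  # {'ಹಣ್ಣು': {'NN', 'VM'}}

A real system would then apply Kannada-specific structural rules, as the abstract describes, to decide which of the conflicting tags is the wrong one.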


2020 ◽  
Vol 26 (4) ◽  
pp. 455-479
Author(s):  
Branislava Šandrih ◽  
Cvetana Krstev ◽  
Ranka Stanković

Abstract
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for the target language, and a tool for chunk alignment. The two approaches differ in how terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second uses a term extraction tool. For both approaches, four experiments were performed, with two parameters being varied. In the experiments presented in this paper, the source language was English, the target language was Serbian, and the selected domain was Library and Information Science, for which an aligned corpus exists, as well as a bilingual terminological dictionary. For term extraction, we used the FlexiTerm tool for the source language and a shallow parser for the target language, while for word alignment we used GIZA++. The evaluation results show that for the first approach the F1 score varies from 29.43% to 51.15%, while for the second it varies from 61.03% to 71.03%. On the basis of the evaluation results, we developed a binary classifier that decides whether a candidate pair, composed of aligned source and target terms, is valid. We trained and evaluated different classifiers on a list of manually labeled candidate pairs obtained after running our extraction system. The best results in a fivefold cross-validation setting were achieved with the Radial Basis Function Support Vector Machine classifier, yielding an F1 score of 82.09% and an accuracy of 78.49%.
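
As a rough illustration of the evaluation setup described above (not the authors' code), the following Python sketch runs an RBF-kernel Support Vector Machine in fivefold cross-validation over feature vectors for candidate term pairs; the feature matrix is random stand-in data, since the paper's actual features for (source term, target term) pairs are not reproduced here.

    # Illustrative sketch: fivefold cross-validation of an RBF-kernel SVM
    # over labeled candidate pairs. X and y are synthetic stand-ins; in
    # practice X would hold one feature vector per manually labeled
    # (source term, target term) pair and y its 0/1 validity label.
    import numpy as np
    from sklearn.model_selection import cross_validate
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))      # stand-in feature vectors
    y = rng.integers(0, 2, size=200)   # stand-in validity labels

    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    scores = cross_validate(clf, X, y, cv=5, scoring=("f1", "accuracy"))
    print("F1:", scores["test_f1"].mean())
    print("Accuracy:", scores["test_accuracy"].mean())

With real pair features in place of the random data, the reported F1 and accuracy figures correspond to the means over the five folds.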


2015 ◽  
Vol 110 (14) ◽  
pp. 1-3
Author(s):  
Prabhas Tiwari ◽  
Md. Tabrez Nafis

2010 ◽  
Vol 16 (4) ◽  
pp. 391-415
Author(s):  
JIANGUO LI ◽  
CHRIS BREW

Abstract
Lapata and Brew (Computational Linguistics, vol. 30, 2004, pp. 295–313) (hereafter LB04) obtain from untagged texts a statistical prior model that is able to generate class preferences for ambiguous Levin verbs (Levin, English Verb Classes and Alternations: A Preliminary Investigation, 1993, University of Chicago Press). They also show that their informative priors, incorporated into a Naive Bayes classifier deduced from hand-tagged data (HTD), can aid in verb class disambiguation. We re-analyse LB04's prior model and show that a single factor (the joint probability of class and frame) determines the predominant class for a particular verb in a particular frame. This means that the prior model cannot be sensitive to fine-grained lexical distinctions between different individual verbs falling in the same class. We replicate LB04's supervised disambiguation experiments on large-scale data, using deep parsers rather than the shallow parser of LB04. In addition, we introduce a method for training our classifier without using HTD. This relies on knowledge of Levin class memberships to move information from unambiguous to ambiguous instances of each class. We regard this system as unsupervised because it does not rely on human annotation of individual verb instances. Although our unsupervised verb class disambiguator does not match the performance of the ones that make use of HTD, it consistently outperforms the random baseline model. Our experiments also demonstrate that the informative priors derived from untagged texts help improve the performance of the classifier trained on untagged data.
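
The re-analysis point, that a prior driven solely by the joint probability of class and frame assigns every verb in a given frame the same predominant class, can be shown with a small Python sketch; the class names, frame labels, and probabilities below are invented for illustration.

    # Minimal sketch of the re-analysis above: the predominant class is
    # the argmax over classes of P(class, frame). All values are invented.
    joint = {  # hypothetical P(class, frame) estimated from untagged text
        ("change-of-state", "NP-V"):    0.12,
        ("change-of-state", "NP-V-NP"): 0.30,
        ("object-drop",     "NP-V"):    0.25,
        ("object-drop",     "NP-V-NP"): 0.08,
    }

    def predominant_class(frame):
        # Note: no verb argument appears here, which is exactly the
        # insensitivity to individual verbs that the re-analysis shows.
        candidates = {c: p for (c, f), p in joint.items() if f == frame}
        return max(candidates, key=candidates.get)

    print(predominant_class("NP-V"))     # object-drop
    print(predominant_class("NP-V-NP"))  # change-of-state

Because predominant_class depends only on the frame, two different verbs occurring in the same frame necessarily receive the same class preference from the prior, regardless of their individual lexical behaviour.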

