A shallow parser for Tamil

Abstract: In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second uses a term extraction tool. For both approaches, four experiments were performed with two parameters being varied. In the experiments presented in this paper, the source language was English, the target language was Serbian, and the selected domain was Library and Information Science, for which an aligned corpus exists, as well as a bilingual terminological dictionary. For term extraction, we used the FlexiTerm tool for the source language and a shallow parser for the target language, while for word alignment we used GIZA++. The evaluation results show that for the first approach the F1 score varies from 29.43% to 51.15%, while for the second it varies from 61.03% to 71.03%. On the basis of the evaluation results, we developed a binary classifier that decides whether a candidate pair, composed of aligned source and target terms, is valid. We trained and evaluated different classifiers on a list of manually labeled candidate pairs obtained after the implementation of our extraction system. The best results in a fivefold cross-validation setting were achieved with the Radial Basis Function Support Vector Machine classifier, giving an F1 score of 82.09% and an accuracy of 78.49%.
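The abstract reports F1 and accuracy but does not say how they were computed. The sketch below assumes the standard definitions over binary valid/invalid labels for candidate term pairs; the function name `pair_scores` and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def pair_scores(gold: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float, float, float]:
    """Return (precision, recall, F1, accuracy) for binary valid/invalid labels.

    Assumes the standard definitions; the paper's exact evaluation
    protocol is not given in the abstract.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Confusion counts: "positive" means the candidate pair is judged valid.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    tn = sum(1 for g, p in zip(gold, predicted) if not g and not p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(gold) if gold else 0.0
    return precision, recall, f1, accuracy


# Toy labels (hypothetical): True means the aligned term pair is valid.
gold_labels = [True, True, True, False, False]
predicted_labels = [True, True, False, True, False]
precision, recall, f1, accuracy = pair_scores(gold_labels, predicted_labels)
```

In a fivefold cross-validation setting such as the one the abstract mentions, these scores would typically be computed per fold and then averaged, though the abstract does not state which aggregation was used.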

A language-independent shallow-parser compiler

10.3115/1073012.1073055 ◽

2001 ◽

Cited By ~ 12

Author(s):

Alexandra Kinyon

Keyword(s):

Shallow Parser

Robustness beyond shallowness: incremental deep parsing

Natural Language Engineering ◽

10.1017/s1351324902002887 ◽

2002 ◽

Vol 8 (2-3) ◽

pp. 121-144 ◽

Cited By ~ 81

Author(s):

S. AÏT-MOKHTAR ◽

J.-P. CHANOD ◽

C. ROUX

Keyword(s):

Language Processing ◽

Syntactic Structures ◽

Partial Parsing ◽

Language Analysis ◽

Robust Parsing ◽

Shallow Parsing ◽

Linguistic Relations ◽

Shallow Parser ◽

Dependency Parser ◽

Generic System

Robustness is a key issue for natural language processing in general and parsing in particular, and many approaches have been explored in the last decade for the design of robust parsing systems. Among those approaches is shallow or partial parsing, which produces minimal and incomplete syntactic structures, often in an incremental way. We argue that with a systematic incremental methodology one can go beyond shallow parsing to deeper language analysis, while preserving robustness. We describe a generic system based on such a methodology and designed for building robust analyzers that tackle deeper linguistic phenomena than those traditionally handled by the now widespread shallow parsers. The rule formalism allows the recognition of n-ary linguistic relations between words or constituents on the basis of global or local structural, topological and/or lexical conditions. It offers the advantage of accepting various types of inputs, ranging from raw to chunked or constituent-marked texts, so for instance it can be used to process existing annotated corpora, or to perform a deeper analysis on the output of an existing shallow parser. It has been successfully used to build a deep functional dependency parser, as well as for the task of co-reference resolution, in a modular way.

A shallow parser based on closed-class words to capture relations in biomedical text

Journal of Biomedical Informatics ◽

10.1016/s1532-0464(03)00039-x ◽

2003 ◽

Vol 36 (3) ◽

pp. 145-158 ◽

Cited By ~ 57

Author(s):

Gondy Leroy ◽

Hsinchun Chen ◽

Jesse D Martinez

Keyword(s):

Biomedical Text ◽

Closed Class ◽

Shallow Parser

UCSG Shallow Parser

Computational Linguistics and Intelligent Text Processing - Lecture Notes in Computer Science ◽

10.1007/11671299_18 ◽

2006 ◽

pp. 156-167 ◽

Cited By ~ 5

Author(s):

Guntur Bharadwaja Kumar ◽

Kavi Narayana Murthy

Keyword(s):

Shallow Parser

Error Patterns and Analysis of Hindi Shallow Parser

International Journal of Computer Applications ◽

10.5120/19381-0632 ◽

2015 ◽

Vol 110 (14) ◽

pp. 1-3

Author(s):

Prabhas Tiwari ◽

Md. Tabrez Nafis

Keyword(s):

Error Patterns ◽

Shallow Parser

Rectifying Incorrectly Part of Speech-Tagged Polysemy Words in Kannada Language for Machine Translation

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9057 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4255-4257

Author(s):

Abhay Desalli ◽

R. Anirudh ◽

N. Prajwal Pai ◽

S. B. Rajeshwari ◽

Jagadish S. Kallimani

Keyword(s):

Machine Translation ◽

Single Sentence ◽

Part Of Speech ◽

Kannada Language ◽

Shallow Parser

Polysemy words are words that keep the same form but take on different meanings depending on context; when such a word is used in a single sentence, it can receive varying Part of Speech (POS) tags. When these polysemy words occur more than once in a Kannada sentence, they introduce ambiguity, and translating the sentence into English is often cumbersome because the sentence is interpreted incorrectly. The proposed algorithm, which concentrates on a few examples, identifies the incorrectly POS-tagged word in a sentence. The POS-tagged sentence is obtained as input from a shallow parser, and the algorithm then relies on the structure of the Kannada language to identify the incorrectly tagged word.
