Word Sense Disambiguation Using Target Language Corpus in a Machine Translation System

2005 ◽ Vol 20 (2) ◽ pp. 237-249
Author(s): Tayebeh Mosavi Miangah, Ali Delavar Khalafi

2014 ◽ Vol 981 ◽ pp. 153-156
Author(s): Chun Xiang Zhang, Long Deng, Xue Yao Gao, Li Li Guo

Word sense disambiguation is key to many application problems in natural language processing. In this paper, a specific word sense disambiguation classifier is introduced into a machine translation system in order to improve the quality of the output translation. First, the translation of the ambiguous word is deleted from the machine translation of the Chinese sentence. Second, the ambiguous word is disambiguated by the classifier, whose classification labels are the candidate translations of the ambiguous word. Third, the two translations are combined. Fifty Chinese sentences containing ambiguous words are collected for test experiments. Experimental results show that the translation quality is improved after the proposed method is applied.
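The three steps in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SLOT` marker, the function names, the example sentence, and the rule-based `toy_classifier` are all hypothetical, and the sketch assumes the position of the ambiguous word's translation in the machine translation output is already known.

```python
from typing import Callable, List

# Hypothetical placeholder marking where the ambiguous word's translation was removed.
SLOT = "<AMBIGUOUS>"


def delete_translation(mt_tokens: List[str], mt_choice: str) -> List[str]:
    """Step 1: replace the MT system's translation of the ambiguous word with a slot."""
    tokens = list(mt_tokens)
    if mt_choice in tokens:
        tokens[tokens.index(mt_choice)] = SLOT  # first occurrence only
    return tokens


def toy_classifier(source_tokens: List[str]) -> str:
    """Step 2 (toy stand-in, not the paper's classifier): the returned label
    is itself a translation of the ambiguous word, chosen from its context."""
    return "plays" if "篮球" in source_tokens else "hits"


def combine(tokens_with_slot: List[str], label: str) -> List[str]:
    """Step 3: put the classifier's translation back into the sentence."""
    return [label if tok == SLOT else tok for tok in tokens_with_slot]


def retranslate(
    source_tokens: List[str],
    mt_tokens: List[str],
    mt_choice: str,
    classifier: Callable[[List[str]], str],
) -> List[str]:
    """Run the three steps in order on one sentence."""
    return combine(delete_translation(mt_tokens, mt_choice), classifier(source_tokens))


# Made-up example: the ambiguous word is 打, which the baseline output renders as "hits".
source = ["他", "打", "篮球"]
baseline_output = ["He", "hits", "basketball"]
print(retranslate(source, baseline_output, "hits", toy_classifier))
```

In a real system the classifier would be trained on sense-labelled examples, and locating the token to delete would require word alignment; both are assumed away here.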


Author(s): David Vickrey, Luke Biewald, Marc Teyssier, Daphne Koller

IEEE Access ◽ 2018 ◽ Vol 6 ◽ pp. 38512-38523
Author(s): Quang-Phuoc Nguyen, Anh-Dung Vo, Joon-Choul Shin, Cheol-Young Ock

2002 ◽ Vol 8 (4) ◽ pp. 293-310
Author(s): David Yarowsky, Radu Florian

This paper presents a comprehensive empirical exploration and evaluation of a diverse range of data characteristics which influence word sense disambiguation performance. It focuses on a set of six core supervised algorithms, including three variants of Bayesian classifiers, a cosine model, non-hierarchical decision lists, and an extension of the transformation-based learning model. Performance is investigated in detail with respect to the following parameters: (a) target language (English, Spanish, Swedish and Basque); (b) part of speech; (c) sense granularity; (d) inclusion and exclusion of major feature classes; (e) variable context width (further broken down by part-of-speech of keyword); (f) number of training examples; (g) baseline probability of the most likely sense; (h) sense distributional entropy; (i) number of senses per keyword; (j) divergence between training and test data; (k) degree of (artificially introduced) noise in the training data; (l) the effectiveness of an algorithm's confidence rankings; and (m) a full keyword breakdown of the performance of each algorithm. The paper concludes with a brief analysis of similarities, differences, strengths and weaknesses of the algorithms and a hierarchical clustering of these algorithms based on agreement of sense classification behavior. Collectively, the paper constitutes the most comprehensive survey of evaluation measures and tests yet applied to sense disambiguation algorithms. And it does so over a diverse range of supervised algorithms, languages and parameter spaces in a single unified experimental framework.
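Two of the parameters listed above, (g) the baseline probability of the most likely sense and (h) sense distributional entropy, can be computed directly from the sense labels of a keyword's training examples. The sketch below is an illustration under stated assumptions: it uses relative-frequency estimates and Shannon entropy in bits, which may differ from the exact definitions used in the paper, and the function name and sample labels are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def sense_statistics(labels: Iterable[str]) -> Tuple[float, float]:
    """Return (most-likely-sense probability, sense entropy in bits)
    for the sense labels observed for one keyword."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one sense label is required")
    probs = [count / total for count in counts.values()]
    baseline = max(probs)  # accuracy of always guessing the majority sense
    entropy = -sum(p * math.log2(p) for p in probs)
    return baseline, entropy


# Made-up labels: a keyword with two equally frequent senses.
baseline, entropy = sense_statistics(["sense_a", "sense_b", "sense_a", "sense_b"])
print(baseline, entropy)
```

A keyword dominated by one sense has a high baseline and low entropy, so a classifier has little room to beat the majority-sense guess; a keyword with evenly spread senses has the opposite profile.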


2016 ◽ Vol 13
Author(s): Sharid Loáiciga, Cristina Grisot

This paper proposes a method for improving the results of a statistical Machine Translation system using boundedness, a pragmatic component of the verbal phrase’s lexical aspect. First, the paper presents manual and automatic annotation experiments for lexical aspect in English-French parallel corpora. It will be shown that this aspectual property is identified and classified with ease both by humans and by automatic systems. Second, Statistical Machine Translation experiments using the boundedness annotations are presented. These experiments show that the information regarding lexical aspect is useful to improve the output of a Machine Translation system in terms of better choices of verbal tenses in the target language, as well as better lexical choices. Ultimately, this work aims at providing a method for the automatic annotation of data with boundedness information and at contributing to Machine Translation by taking into account linguistic data.

