Global Word Sense Disambiguation of Polysemous Words in Telugu Language

Word Sense Disambiguation (WSD) is a significant problem in Natural Language Processing (NLP). WSD is the task of identifying the correct sense of a word in a given context. It can improve numerous NLP applications such as machine translation, text summarization, information retrieval, and sentiment analysis. This paper proposes an approach named ShotgunWSD, an unsupervised, knowledge-based algorithm for global word sense disambiguation. The algorithm is motivated by the shotgun sequencing technique. ShotgunWSD is applied to disambiguate word senses in Telugu documents in three functional phases. It achieves better performance than other WSD approaches in disambiguating ambiguous words in Telugu documents. The dataset used is drawn from IndoWordNet.

Techniques for Lexical Semantics in Hindi Language

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l3636.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 4075-4080

Keyword(s):

Language Processing ◽

Word Sense Disambiguation ◽

Word Sense ◽

Official Language ◽

Indian Languages ◽

Semantic Task ◽

Sense Disambiguation ◽

Polysemous Word ◽

The Given ◽

Polysemous Words

A word with multiple senses in a text gives rise to the lexical semantic task of finding out which particular sense is appropriate for the given context. One such task is word sense disambiguation, which refers to identifying the most appropriate meaning of a polysemous word in a given context using computational algorithms. Language processing research in Hindi, the official language of India, and in other Indian languages is constrained by the non-availability of standard corpora. For Hindi word sense disambiguation, too, a large corpus is not available. In this work, we prepared text containing new senses of certain words, enriching the available sense-tagged Hindi corpus of sixty polysemous words. Furthermore, we analyzed two novel lexical associations for Hindi word sense disambiguation based on the contextual features of the polysemous word. These methods are evaluated with learning algorithms, and favourable results are achieved.

Word Sense Disambiguation for Improving the Quality of Machine Translation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.981.153 ◽

2014 ◽

Vol 981 ◽

pp. 153-156

Author(s):

Chun Xiang Zhang ◽

Long Deng ◽

Xue Yao Gao ◽

Li Li Guo

Keyword(s):

Machine Translation ◽

Language Processing ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Translation System ◽

Word Sense ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Machine Translation System

Word sense disambiguation is key to many application problems in natural language processing. In this paper, a specific word sense disambiguation classifier is introduced into a machine translation system in order to improve the quality of the output translation. First, the translation of the ambiguous word is deleted from the machine translation of the Chinese sentence. Second, the ambiguous word is disambiguated, where the classification labels are the candidate translations of the ambiguous word. Third, these two translations are combined. Fifty Chinese sentences containing ambiguous words are collected for test experiments. Experimental results show that translation quality improves after the proposed method is applied.
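The three steps in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `wsd_translate`, the assumption that the position of the ambiguous word's translation (`mt_slot`) is already known, and the keyword rule standing in for the classifier are all hypothetical.

```python
def wsd_translate(source_words, mt_words, ambiguous_src, mt_slot, classify):
    """Swap the MT rendering of one ambiguous word for the WSD classifier's label.

    mt_slot is the (assumed known) index of that rendering in mt_words.
    """
    # Step 1: delete the translation of the ambiguous word from the MT output.
    remainder = mt_words[:mt_slot] + mt_words[mt_slot + 1:]
    # Step 2: disambiguate; the predicted class label is itself a translation.
    context = [w for w in source_words if w != ambiguous_src]
    label = classify(ambiguous_src, context)
    # Step 3: combine the remaining MT output with the classifier's translation.
    return remainder[:mt_slot] + [label] + remainder[mt_slot:]


def toy_classifier(word, context):
    """Hypothetical stand-in for a trained classifier: one hand-written rule."""
    if word == "打" and "篮球" in context:
        return "plays"
    return "hits"


source = ["他", "打", "篮球"]
baseline = ["he", "hits", "basketball"]  # invented baseline MT output
improved = wsd_translate(source, baseline, "打", 1, toy_classifier)
print(" ".join(improved))
```

In a real system the slot would come from word alignment and the classifier would be trained on sense-labelled data; both are simplified away here.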

An Associative Concept Dictionary for Natural Language Processing: Text Summarization and Word Sense Disambiguation

Journal of Cognitive Science ◽

10.17791/jcs.2011.12.3.261 ◽

2011 ◽

Vol 12 (3) ◽

pp. 261-278

Author(s):

Jun Okamoto ◽

Shun Ishizaki

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Word Sense Disambiguation ◽

Text Summarization ◽

Word Sense ◽

Sense Disambiguation

A Knowledge Based Word Sense Disambiguation in Telugu Language

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1911.1010120 ◽

2020 ◽

Vol 10 (1) ◽

pp. 440-445

Keyword(s):

Computational Linguistics ◽

Word Sense Disambiguation ◽

The Other ◽

Word Sense ◽

Knowledge Based ◽

Ambiguous Words ◽

Sense Disambiguation ◽

The Senses ◽

Definition Of ◽

Polysemous Words

Telugu (తెలుగు) is one of the Dravidian languages, which are morphologically rich. As in other languages, it contains ambiguous words that have different meanings in different contexts. Such words are referred to as polysemous words, i.e. words having multiple senses. A knowledge-based approach is proposed for disambiguating Telugu polysemous words using the computational linguistics tool IndoWordNet. The task of WSD (word sense disambiguation) requires finding the similarity between the target word and the nearby words. In this approach, the similarity is calculated either by finding the number of common words (intersection) between the glosses (definitions) of the target and nearby words, or by finding the exact occurrence of the nearby word's sense in the hierarchy (hypernyms/hyponyms) of the target word's senses. These parameters are modified by taking the intersection over not only the glosses but also the related words. Additionally, there is a third parameter, 'distance', which measures the distance between the target and nearby words. The proposed method thus uses more parameters for calculating similarity. It scores the senses based on the overall impact of the parameters, i.e. intersection, hierarchy and distance, and then chooses the sense with the best score. The correct meaning of a Telugu polysemous word can be identified with this technique.
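The scoring idea described above (intersection, hierarchy, distance) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's method: the `Sense` structure, the way the three parameters are combined into one number, and the English toy data are all hypothetical, and a real system would read senses from IndoWordNet instead of hand-built objects.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """Toy sense entry (hypothetical structure, not an IndoWordNet API)."""
    sense_id: str
    gloss: str
    related: set = field(default_factory=set)    # related words
    hierarchy: set = field(default_factory=set)  # ids of hypernym/hyponym senses


def tokens(text):
    """Lowercased whitespace tokens of a gloss."""
    return set(text.lower().split())


def score_sense(target, neighbour, distance):
    """Combine the three parameters into one score (the formula is an assumption)."""
    # Intersection: words shared between glosses plus related words.
    target_bag = tokens(target.gloss) | target.related
    neighbour_bag = tokens(neighbour.gloss) | neighbour.related
    intersection = len(target_bag & neighbour_bag)
    # Hierarchy: 1 if the neighbour's sense occurs in the target sense's hierarchy.
    hierarchy = 1 if neighbour.sense_id in target.hierarchy else 0
    # Distance: nearer context words contribute more.
    return (intersection + hierarchy) / max(distance, 1)


def disambiguate(target_senses, context):
    """context is a list of (neighbour_sense, distance_in_words) pairs."""
    def total(sense):
        return sum(score_sense(sense, n, d) for n, d in context)
    return max(target_senses, key=total)


river = Sense("bank_river", "sloping land beside body of water", {"river", "shore"})
money = Sense("bank_money", "financial institution that accepts deposits", {"money", "loan"})
water = Sense("water_1", "clear liquid found in river or lake", {"river"})
print(disambiguate([river, money], [(water, 2)]).sense_id)
```

Dividing by distance is only one possible way to let nearby words count for more; the abstract does not state how the parameters are weighted.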

Word sense disambiguation using implicit information

Natural Language Engineering ◽

10.1017/s1351324919000421 ◽

2019 ◽

Vol 26 (4) ◽

pp. 413-432 ◽

Cited By ~ 1

Author(s):

Goonjan Jain ◽

D.K. Lobiyal

Keyword(s):

Word Sense Disambiguation ◽

Ambiguous Word ◽

Word Sense ◽

Implicit Information ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Novel Method ◽

Unsupervised Approach ◽

Polysemous Words ◽

Better Than

Humans proficiently interpret the true sense of an ambiguous word by establishing associations among the words in a sentence. The complete sense of a text is also based on implicit information, which is not explicitly mentioned. The absence of this implicit information is a significant problem for a computer program that attempts to determine the correct sense of ambiguous words. In this paper, we propose a novel method to uncover the implicit information that links the words of a sentence. We reveal this implicit information using a graph, which is then used to disambiguate the ambiguous word. The experiments show that the proposed algorithm interprets the correct sense for both homonyms and polysemous words. Our proposed algorithm has performed better than the approaches presented in the SemEval-2013 task for word sense disambiguation and has shown an accuracy of 79.6 percent, which is 2.5 percent better than the best unsupervised approach in SemEval-2007.

Networks and Natural Language Processing

AI Magazine ◽

10.1609/aimag.v29i3.2160 ◽

2008 ◽

Vol 29 (3) ◽

pp. 16 ◽

Cited By ~ 9

Author(s):

Dragomir R. Radev ◽

Rada Mihalcea

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Word Sense Disambiguation ◽

Text Clustering ◽

Text Summarization ◽

Syntactic Parsing ◽

Word Sense ◽

Ontology Construction ◽

Sense Disambiguation

Over the last few years, a number of areas of natural language processing have begun applying graph-based techniques. These include, among others, text summarization, syntactic parsing, word-sense disambiguation, ontology construction, sentiment and subjectivity analysis, and text clustering. In this paper, we present some of the most successful graph-based representations and algorithms used in language processing and try to explain how and why they work.

Word vs. Class-Based Word Sense Disambiguation

Journal of Artificial Intelligence Research ◽

10.1613/jair.4727 ◽

2015 ◽

Vol 54 ◽

pp. 83-122 ◽

Cited By ~ 4

Author(s):

Ruben Izquierdo ◽

Armando Suarez ◽

German Rigau

Keyword(s):

Word Sense Disambiguation ◽

Coarse Grained ◽

Semantic Features ◽

Word Sense ◽

Simple Method ◽

Word Meanings ◽

Semantic Class ◽

Semantic Classes ◽

Sense Disambiguation ◽

Word Senses

As empirically demonstrated by the Word Sense Disambiguation (WSD) tasks of the last SensEval/SemEval exercises, assigning the appropriate meaning to words in context has resisted all attempts to be successfully addressed. Many authors argue that one possible reason could be the use of inappropriate sets of word meanings. In particular, WordNet has been used as a de facto standard repository of word meanings in most of these tasks. Thus, instead of using the word senses defined in WordNet, some approaches have derived semantic classes representing groups of word senses. However, the meanings represented by WordNet have only been used for WSD at a very fine-grained sense level or at a very coarse-grained semantic class level (also called SuperSenses). We suspect that an appropriate level of abstraction could lie between these two levels. The contributions of this paper are manifold. First, we propose a simple method to automatically derive semantic classes at intermediate levels of abstraction covering all nominal and verbal WordNet meanings. Second, we empirically demonstrate that our automatically derived semantic classes outperform classical approaches based on word senses and more coarse-grained sense groupings. Third, we also demonstrate that our supervised WSD system benefits from using these new semantic classes as additional semantic features while reducing the number of training examples. Finally, we also demonstrate the robustness of our supervised semantic class-based WSD system when tested on an out-of-domain corpus.

A HIGHLY ACCURATE BOOTSTRAPPING ALGORITHM FOR WORD SENSE DISAMBIGUATION

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213001000398 ◽

2001 ◽

Vol 10 (01n02) ◽

pp. 5-21 ◽

Cited By ~ 17

Author(s):

RADA F. MIHALCEA ◽

DAN I. MOLDOVAN

Keyword(s):

High Precision ◽

Word Sense Disambiguation ◽

Original Text ◽

Word Sense ◽

New Words ◽

Input Text ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Very High

In this paper, we present a bootstrapping algorithm for Word Sense Disambiguation which succeeds in disambiguating a subset of the words in the input text with very high precision. It uses WordNet and a semantic tagged corpus, for the purpose of identifying the correct sense of the words in a given text. The bootstrapping process initializes a set of ambiguous words with all the nouns and verbs in the text. It then applies various disambiguation procedures and builds a set of disambiguated words: new words are sense tagged based on their relation to the already disambiguated words, and then added to the set. This process allows us to identify, in the original text, a set of words which can be disambiguated with high precision; 55% of the verbs and nouns are disambiguated with an accuracy of 92%.
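The bootstrapping loop described above can be illustrated with a small Python sketch. The two procedures used here (tagging single-sense words first, then tagging a word when exactly one of its senses is related to an already chosen sense) are simplified stand-ins assumed for illustration; the abstract does not specify the actual procedures, and the sense ids and relations are invented.

```python
def bootstrap(candidates, related):
    """candidates: word -> list of sense ids.
    related: set of frozenset({sense_a, sense_b}) pairs that count as related.
    Returns (tagged, still_ambiguous)."""
    tagged = {}
    ambiguous = dict(candidates)
    # Procedure 1 (assumed): words with a single sense are tagged immediately.
    for word, senses in list(ambiguous.items()):
        if len(senses) == 1:
            tagged[word] = senses[0]
            del ambiguous[word]
    # Procedure 2 (assumed): tag a word only when exactly one of its senses is
    # related to a sense already in the disambiguated set; repeat until stable.
    changed = True
    while changed:
        changed = False
        chosen = set(tagged.values())
        for word, senses in list(ambiguous.items()):
            linked = [s for s in senses
                      if any(frozenset((s, c)) in related for c in chosen)]
            if len(linked) == 1:
                tagged[word] = linked[0]
                chosen.add(linked[0])
                del ambiguous[word]
                changed = True
    return tagged, ambiguous


candidates = {
    "river": ["river#1"],
    "bank": ["bank#1", "bank#2"],
    "deposit": ["deposit#1", "deposit#2"],
    "run": ["run#1", "run#2"],
}
related = {
    frozenset(("river#1", "bank#1")),
    frozenset(("bank#1", "deposit#1")),
}
tagged, left = bootstrap(candidates, related)
coverage = len(tagged) / len(candidates)
print(tagged, left, coverage)
```

Words with no confident link, such as "run" in the toy data, are left untagged. This mirrors the trade-off reported in the abstract, where only a subset of words is disambiguated in exchange for high precision.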

A critical analysis and explication of word sense disambiguation as approached by natural language processing

Lingua ◽

10.1016/j.lingua.2020.102896 ◽

2020 ◽

Vol 243 ◽

pp. 102896

Author(s):

Julie Mennes ◽

Stephan van der Waart van Gulik

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Critical Analysis ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation

deepBioWSD: effective deep neural word sense disambiguation of biomedical text data

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy189 ◽

2019 ◽

Vol 26 (5) ◽

pp. 438-446 ◽

Cited By ~ 3

Author(s):

Ahmad Pesaranghader ◽

Stan Matwin ◽

Marina Sokolova ◽

Ali Pesaranghader

Keyword(s):

Language Processing ◽

Short Term Memory ◽

Word Sense Disambiguation ◽

Training Data ◽

Biomedical Text ◽

Word Sense ◽

Vocabulary Size ◽

Unified Medical Language System ◽

Knowledge Based ◽

Sense Disambiguation

Objective: In biomedicine, there is a wealth of information hidden in unstructured narratives such as research articles and clinical reports. To exploit these data properly, a word sense disambiguation (WSD) algorithm prevents downstream difficulties in the natural language processing applications pipeline. Supervised WSD algorithms largely outperform un- or semisupervised and knowledge-based methods; however, they train 1 separate classifier for each ambiguous term, necessitating a large amount of expert-labeled training data, an unattainable goal in medical informatics. To alleviate this need, a single model that shares statistical strength across all instances and scales well with the vocabulary size is desirable. Materials and Methods: Built on recent advances in deep learning, our deepBioWSD model leverages 1 single bidirectional long short-term memory network that makes sense predictions for any ambiguous term. In the model, the Unified Medical Language System sense embeddings are first computed using their text definitions; then, after the network is initialized with these embeddings, it is trained on all (available) training data collectively. This method also considers a novel technique for automatic collection of training data from PubMed to (pre)train the network in an unsupervised manner. Results: We use the MSH WSD dataset to compare WSD algorithms, with macro and micro accuracies employed as evaluation metrics. deepBioWSD outperforms existing models in biomedical text WSD by achieving the state-of-the-art performance of 96.82% for macro accuracy. Conclusions: Apart from the disambiguation improvement and unsupervised training, deepBioWSD depends on considerably fewer expert-labeled data as it learns the target and the context terms jointly. These merits make deepBioWSD conveniently deployable in real-time biomedical applications.
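The macro and micro accuracies mentioned as evaluation metrics differ in how they aggregate over ambiguous terms. The sketch below uses the standard definitions (micro pools all instances, macro averages the per-term accuracies); the function name and the counts are invented for illustration and are not taken from the MSH WSD dataset.

```python
def micro_macro_accuracy(results):
    """results maps each ambiguous term to (correct, total) instance counts."""
    total_correct = sum(c for c, _ in results.values())
    total_instances = sum(t for _, t in results.values())
    # Micro accuracy: pool every instance, so frequent terms dominate.
    micro = total_correct / total_instances
    # Macro accuracy: average the per-term accuracies, so each term counts equally.
    macro = sum(c / t for c, t in results.values()) / len(results)
    return micro, macro


toy = {"cold": (90, 100), "discharge": (5, 10)}  # invented counts
micro, macro = micro_macro_accuracy(toy)
print(f"micro={micro:.4f} macro={macro:.4f}")
```

With these toy counts the micro accuracy is 95/110 while the macro accuracy is (0.9 + 0.5)/2, which shows how a rare, poorly handled term lowers the macro figure far more than the micro one.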
