DISTRIBUTIONAL ANALYSIS OF RELATED SYNSETS IN WordNet FOR A WORD SENSE DISAMBIGUATION TASK

2005 ◽  
Vol 14 (06) ◽  
pp. 919-934 ◽  
Author(s):  
KOSTAS FRAGOS ◽  
YANIS MAISTROS

This work presents a new method for unsupervised word sense disambiguation using WordNet semantic relations. In this method we expand the context of the word being disambiguated with related synsets drawn from the available WordNet relations and study, within this expanded set, the distribution of the related synsets that correspond to each sense of the target word. A single-sample Pearson chi-square goodness-of-fit hypothesis test is used to determine whether the composite null hypothesis of normality is a reasonable assumption for the set of related synsets corresponding to a sense. The p-value calculated from this test is the criterion for deciding the correct sense: the target word is assigned the sense whose related synsets are distributed most "abnormally" relative to the sets of the other senses. Our algorithm is evaluated on English lexical sample data from the Senseval-2 word sense disambiguation competition. Three WordNet relations, antonymy, hyponymy and hypernymy, yield a distributional set of related synsets for the context that proved to be quite a good word sense discriminator, achieving results comparable with those of the best-performing system among the competing participants.
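
A minimal sketch of the statistical idea in Python follows, assuming NLTK's WordNet corpus and SciPy. The per-synset count statistic, the binning, and the fitted normal baseline are illustrative assumptions, not the authors' exact procedure; what the sketch shows is the selection rule itself: each sense is scored by the chi-square goodness-of-fit p-value of its related synsets' context-occurrence counts, and the sense with the most "abnormal" distribution (smallest p-value) wins.

# Sketch: chi-square goodness-of-fit sense scoring over WordNet relations.
from nltk.corpus import wordnet as wn
import numpy as np
from scipy import stats

def related_synsets(sense):
    # Related synsets via the three relations named in the abstract:
    # hyponymy, hypernymy, and antonymy.
    related = list(sense.hyponyms()) + list(sense.hypernyms())
    for lemma in sense.lemmas():
        related += [ant.synset() for ant in lemma.antonyms()]
    return related

def sense_p_value(sense, context_tokens):
    # One observation per related synset: how many of its lemma names
    # occur in the (expanded) context window. This count statistic is
    # an assumption made for the sketch.
    counts = []
    for syn in related_synsets(sense):
        names = {n.lower().replace('_', ' ') for n in syn.lemma_names()}
        counts.append(sum(tok in names for tok in context_tokens))
    if len(counts) < 5:
        return 1.0  # too little evidence; treat as "normal"
    counts = np.array(counts, dtype=float)
    # Bin the counts and compare with frequencies expected under a
    # normal fitted to the data (Pearson chi-square goodness of fit).
    observed, edges = np.histogram(counts, bins=5)
    mu, sigma = counts.mean(), counts.std() or 1.0
    cdf = stats.norm.cdf(edges, mu, sigma)
    expected = len(counts) * np.diff(cdf)
    expected = np.clip(expected, 1e-6, None)
    expected *= observed.sum() / expected.sum()  # match totals
    _, p = stats.chisquare(observed, expected)
    return p

def disambiguate(word, context_tokens):
    # Smallest p-value = most "abnormal" distribution = chosen sense.
    return min(wn.synsets(word), key=lambda s: sense_p_value(s, context_tokens))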

2013 ◽  
Vol 411-414 ◽  
pp. 287-290
Author(s):  
Nantapong Keandoungchun ◽  
Nithinant Thammakoranonta

This paper proposes a novel approach to word sense disambiguation (WSD) in English-to-Thai translation. The approach generates a knowledge base that stores local-context information and then applies this information to estimate the probabilities of the possible meanings of a target word. The meaning with the maximum probability is taken as the Thai translation of the English target word. The approach was evaluated by measuring the accuracy of target-word translation in each paper and comparing it against Google Translate. The experimental results indicate that the proposed approach is more accurate than Google Translate, with a paired t-test statistic of 6.628 (sig. = 0.00 < 0.05).
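
A minimal sketch of the selection step follows, under the assumption that the knowledge base stores local-context counts per candidate meaning and that the most probable meaning is chosen in a naïve-Bayes fashion with add-one smoothing; the class structure and the smoothing are illustrative, not the paper's exact design.

# Sketch: pick the Thai meaning with maximum probability given local context.
from collections import defaultdict
import math

class ContextKB:
    def __init__(self):
        # counts[meaning][context_word] plus a frequency per candidate meaning
        self.counts = defaultdict(lambda: defaultdict(int))
        self.meaning_freq = defaultdict(int)

    def observe(self, meaning, context_words):
        # Build the knowledge base from disambiguated training contexts.
        self.meaning_freq[meaning] += 1
        for w in context_words:
            self.counts[meaning][w] += 1

    def best_meaning(self, context_words):
        # Maximum-probability meaning given the local context.
        def log_prob(meaning):
            total = sum(self.counts[meaning].values())
            vocab = len(self.counts[meaning]) + 1
            score = math.log(self.meaning_freq[meaning])
            for w in context_words:
                score += math.log((self.counts[meaning][w] + 1) / (total + vocab))
            return score
        return max(self.meaning_freq, key=log_prob)

# Hypothetical usage: Thai meanings of the ambiguous English word "bank".
kb = ContextKB()
kb.observe("ธนาคาร", ["money", "deposit", "loan"])  # financial institution
kb.observe("ตลิ่ง", ["river", "water", "shore"])     # river bank
print(kb.best_meaning(["loan", "money"]))            # -> "ธนาคาร"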


2013 ◽  
Vol 22 (02) ◽  
pp. 1350003 ◽  
Author(s):  
KOSTAS FRAGOS

In this work, we propose a new measure of semantic relatedness between concepts, applied to word sense disambiguation. Using the overlaps between WordNet definitions of concepts (glosses) and a goodness-of-fit statistical test, we establish a formal mechanism for quantifying and estimating the semantic relatedness between concepts. More concretely, we model WordNet gloss overlaps by making a theoretical assumption about their distribution and then quantify the discrepancy between the theoretical and the actual distribution. This discrepancy is used to measure the relatedness between the input concepts. The experimental results showed very good performance on Senseval-2 lexical sample data for word sense disambiguation.
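
The following rough sketch illustrates the mechanism with NLTK's WordNet and SciPy. The Poisson baseline is a stand-in for the paper's theoretical distribution (which the abstract does not name), and expanding each concept with its direct hypernyms/hyponyms is likewise an assumption; the point is the shape of the computation: collect gloss-overlap sizes, then use the goodness-of-fit discrepancy as the relatedness score.

# Sketch: gloss-overlap relatedness as a goodness-of-fit discrepancy.
from nltk.corpus import wordnet as wn
import numpy as np
from scipy import stats

def gloss_tokens(synset):
    return set(synset.definition().lower().split())

def overlap_counts(s1, s2):
    # Overlap sizes between the glosses of the two concepts and the
    # glosses of their directly related synsets (an assumed expansion).
    g1 = [s1] + s1.hypernyms() + s1.hyponyms()
    g2 = [s2] + s2.hypernyms() + s2.hyponyms()
    return [len(gloss_tokens(a) & gloss_tokens(b)) for a in g1 for b in g2]

def relatedness(s1, s2):
    counts = np.array(overlap_counts(s1, s2))
    if counts.sum() == 0:
        return 0.0
    # Discrepancy between the observed overlap distribution and a
    # Poisson baseline fitted to the mean: the larger the chi-square
    # statistic, the more "surprising" (related) the overlap pattern.
    values = np.arange(counts.max() + 1)
    observed = np.array([(counts == v).sum() for v in values])
    expected = len(counts) * stats.poisson.pmf(values, counts.mean())
    expected = np.clip(expected, 1e-6, None)
    expected *= observed.sum() / expected.sum()  # match totals
    chi2, _ = stats.chisquare(observed, expected)
    return chi2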


2021 ◽  
Vol 11 (6) ◽  
pp. 2567
Author(s):  
Mohammed El-Razzaz ◽  
Mohamed Waleed Fakhr ◽  
Fahima A. Maghraby

Word Sense Disambiguation (WSD) aims to predict the correct sense of a word given its context. This problem is of extreme importance in Arabic, as written words can be highly ambiguous: 43% of diacritized words have multiple interpretations, and the percentage rises to 72% for non-diacritized words. Nevertheless, most Arabic written text does not have diacritical marks. Gloss-based WSD methods measure the semantic similarity or the overlap between the context of a target word that needs to be disambiguated and the dictionary definition of that word (the gloss of the word). Arabic gloss WSD suffers from a lack of context-gloss datasets. In this paper, we present an Arabic gloss-based WSD technique. We utilize Bidirectional Encoder Representations from Transformers (BERT) to build two models that can efficiently perform Arabic WSD. These models can be trained with only a few training samples, since they utilize BERT models that were pretrained on a large Arabic corpus. Our experimental results show that our models outperform two of the most recent gloss-based WSD systems when tested against the same test data used to evaluate our model. Additionally, our model achieves an F1-score of 89%, compared to the best reported F1-score of 85% for knowledge-based Arabic WSD. Another contribution of this paper is the introduction of a context-gloss benchmark that may help to overcome the lack of a standardized benchmark for Arabic gloss-based WSD.
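
A minimal sketch of gloss-based WSD as context-gloss pair scoring with Hugging Face Transformers follows. The AraBERT checkpoint name and the classification head are assumptions, not the authors' released model, and the head shown here is untrained: in practice the pair classifier would first be fine-tuned on labeled context-gloss pairs, so this only shows the wiring.

# Sketch: score each sense by pairing the context with the sense's gloss.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "aubmindlab/bert-base-arabertv02"  # assumed Arabic BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

def score_sense(context: str, gloss: str) -> float:
    # Probability that `gloss` is the correct sense for `context`.
    # BERT encodes the two texts as a sentence pair (context [SEP] gloss).
    inputs = tokenizer(context, gloss, return_tensors="pt",
                       truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def disambiguate(context: str, glosses: list[str]) -> str:
    # The sense whose gloss pairs best with the context wins.
    return max(glosses, key=lambda g: score_sense(context, g))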


Author(s):  
Oleg Kalinin

The article dwells on a modern cognitive and discourse study of metaphors. Drawing on the analysis and synthesis of foreign and domestic research, the author examines its classification from the ontological, axiological and epistemological points of view. The ontological level breaks down into two basic approaches, namely the metaphorical nature of discourse and the discursive nature of metaphors. The former analyses metaphors to fathom characteristics of discourse, while the latter provides for the study of metaphorical features in the context of discursive communication. The axiological aspect covers critical and descriptive studies, and the epistemological angle comprises quantitative and qualitative methods in metaphor studies. Other issues covered in the paper include a thorough review of methods for the identification of metaphors, including computer-assisted solutions (Word Sense Disambiguation, Categorisation, Metaphor Clusters) and numerical analysis of the metaphorical nature of discourse: descriptor analysis, the metaphor power index, cluster analysis, and complex metaphor power analysis. On the one hand, the conceptualization of research papers boils down to the major features of the discursive approach to metaphors; on the other, multiple studies of metaphors in the context of discourse pave the way for a discursive trend in cognitive metaphorology.


Telugu (తెలుగు) is one of the Dravidian languages, which are morphologically rich. Like other languages, it contains ambiguous words that have different meanings in different contexts. Such words are referred to as polysemous words, i.e. words having multiple senses. A knowledge-based approach is proposed for disambiguating Telugu polysemous words using the computational linguistics tool IndoWordNet. The task of WSD (word sense disambiguation) requires finding the similarity between the target word and a nearby word. In this approach, the similarity is calculated either by counting the common words (intersection) between the glosses (definitions) of the target and nearby words, or by checking whether the nearby word's sense occurs in the hierarchy (hypernyms/hyponyms) of the target word's senses. These parameters are then extended by computing the intersection not only over the glosses but also over the related words. Additionally, a third parameter, 'distance', measures the distance between the target and nearby words. The proposed method thus uses several parameters for calculating similarity: it scores the senses based on the overall effect of the parameters, i.e. intersection, hierarchy and distance, and then chooses the sense with the best score. The correct meaning of a Telugu polysemous word can be identified with this technique.
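
The three-parameter scoring can be sketched as follows. NLTK's English WordNet stands in for IndoWordNet (whose Telugu synsets would be used in practice), and the hierarchy bonus, the distance discount, and the combination of the scores are illustrative assumptions.

# Sketch: score senses by gloss intersection, hierarchy, and distance.
from nltk.corpus import wordnet as wn

def gloss_words(synset):
    # Gloss tokens extended with the related words (lemma names).
    words = set(synset.definition().lower().split())
    words.update(n.lower() for n in synset.lemma_names())
    return words

def score_sense(sense, nearby_word, distance):
    score = 0.0
    for ns in wn.synsets(nearby_word):
        # Parameter 1: gloss (plus related-word) intersection.
        score += len(gloss_words(sense) & gloss_words(ns))
        # Parameter 2: does the nearby sense occur in the target sense's
        # hierarchy? (Direct hypernyms/hyponyms here; the full closure
        # could be used instead.)
        if ns in sense.hypernyms() or ns in sense.hyponyms():
            score += 2.0
    # Parameter 3: nearer context words count for more.
    return score / (1 + distance)

def disambiguate(target, context_words):
    # context_words: list of (word, distance-from-target) pairs;
    # the sense with the best combined score is chosen.
    def total(sense):
        return sum(score_sense(sense, w, d) for w, d in context_words)
    return max(wn.synsets(target), key=total)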


Author(s):  
Sebastian Weigelt

Systems such as Alexa, Cortana, and Siri appear rather smart. However, they only react to predefined wordings and do not actually grasp the user's intent. To overcome this limitation, a system must understand the topics the user is talking about. Therefore, we apply unsupervised multi-topic labeling to spoken utterances. Although topic labeling is a well-studied task on textual documents, its potential for spoken input is almost unexplored. Our approach for topic labeling is tailored to spoken utterances; it copes with short and ungrammatical input. The approach is two-tiered. First, we disambiguate word senses. We utilize Wikipedia as a pre-labeled corpus to train a naïve Bayes classifier. Second, we build topic graphs based on DBpedia relations. We use two strategies to determine central terms in the graphs, i.e. the shared topics. One focuses on the dominant senses in the utterance and the other covers as many distinct senses as possible. Our approach creates multiple distinct topics per utterance and ranks the results. The evaluation shows that the approach is feasible; the word sense disambiguation achieves a recall of 0.799. Concerning topic labeling, in a user study subjects assessed that in 90.9% of the cases at least one proposed topic label among the first four is a good fit. With regard to precision, the subjects judged that 77.2% of the top-ranked labels are a good fit or good but somewhat too broad (Fleiss' kappa κ = 0.27). We illustrate areas of application of topic labeling in the field of programming in spoken language. With topic labeling applied to the spoken input, as well as ontologies that model the situational context, we are able to select the most appropriate ontologies with an F1-score of 0.907.
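
The second tier can be sketched with networkx as follows. The dbpedia_neighbors lookup is a stand-in for real DBpedia relation queries, and degree centrality is an assumed stand-in for the paper's two centrality strategies; the sketch shows how central terms shared by several disambiguated senses surface as topic labels.

# Sketch: topic graph over disambiguated senses, ranked by centrality.
import networkx as nx

def build_topic_graph(senses, dbpedia_neighbors):
    # senses: DBpedia resource names for the disambiguated words.
    # dbpedia_neighbors: dict mapping a resource to related resources.
    g = nx.Graph()
    for s in senses:
        g.add_node(s, seed=True)  # mark utterance senses as seeds
        for neighbor in dbpedia_neighbors.get(s, []):
            g.add_edge(s, neighbor)
    return g

def top_topic_labels(graph, k=4):
    # Central non-seed terms shared by many senses make good topic labels.
    centrality = nx.degree_centrality(graph)
    non_seeds = [n for n in graph if not graph.nodes[n].get("seed")]
    return sorted(non_seeds, key=centrality.get, reverse=True)[:k]

# Hypothetical usage for "play some jazz music in the kitchen":
neighbors = {
    "Jazz": ["Music", "Music_genre"],
    "Music": ["Art", "Entertainment"],
    "Kitchen": ["Room", "Home"],
}
g = build_topic_graph(["Jazz", "Music", "Kitchen"], neighbors)
print(top_topic_labels(g))  # e.g. ['Music_genre', 'Art', 'Entertainment', 'Room']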

