Normalized Statistical Algorithm for Afaan Oromo Word Sense Disambiguation

2021
Vol 13 (6)
pp. 40-50
Author(s):
Abdo Ababor Abafogi

Language is the main means of communication used by humans. In many situations the same word can mean different things depending on how it is used in a particular sentence, which makes it challenging for a computer to understand text at a human level. Word Sense Disambiguation (WSD), which aims to identify the correct sense of a given ambiguous word, is a long-standing problem in natural language processing (NLP). Because the major aim of WSD is to accurately capture the sense of a word in a particular context, it can be used for the correct labeling of words in natural language applications. In this paper, I propose a normalized statistical algorithm that performs WSD for the Afaan Oromo language without requiring morphological analysis. The proposed algorithm can discriminate an ambiguous word's sense without a fixed window size, without predefined rules, and without an annotated training dataset, which eases a key challenge for under-resourced languages. The proposed system was tested on 249 sentences and evaluated with precision, recall, and F-measure. The overall effectiveness of the system is 80.76% F-measure, which suggests that the proposed approach is promising for Afaan Oromo, one of the under-resourced languages spoken in East Africa. The algorithm can be extended to semantic text similarity with little or no modification. Furthermore, the directions outlined here can further improve the performance of the proposed algorithm.
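
The abstract does not spell out the algorithm itself, so the following Python sketch only illustrates the general family it belongs to: unsupervised sense scoring from raw-corpus co-occurrence statistics, using whole-sentence context (no fixed window) and a normalization over candidate sense signatures. The corpus and sense signatures are hypothetical placeholders, not the paper's data.

```python
# A minimal sketch of unsupervised, corpus-statistics-based sense scoring.
# Each candidate sense is scored by normalized co-occurrence association
# (NPMI) between the sentence's context words and a set of "signature"
# words gathered per sense from raw, untagged text.
from collections import Counter
import math

def cooccurrence_counts(corpus_sentences):
    """Count how often word pairs appear in the same sentence
    (whole-sentence context, i.e. no fixed window size)."""
    pair_counts, word_counts = Counter(), Counter()
    for sent in corpus_sentences:
        tokens = set(sent.lower().split())
        word_counts.update(tokens)
        for w in tokens:
            for v in tokens:
                if w < v:
                    pair_counts[(w, v)] += 1
    return pair_counts, word_counts

def npmi(w, v, pair_counts, word_counts, n_sents):
    """Normalized pointwise mutual information, in [-1, 1]."""
    if w == v:
        return 1.0
    if w not in word_counts or v not in word_counts:
        return 0.0
    key = (w, v) if w < v else (v, w)
    c_wv = pair_counts.get(key, 0)
    if c_wv == 0:
        return -1.0
    p_wv = c_wv / n_sents
    p_w, p_v = word_counts[w] / n_sents, word_counts[v] / n_sents
    denom = -math.log(p_wv) or 1e-9      # guard against p_wv == 1
    return math.log(p_wv / (p_w * p_v)) / denom

def disambiguate(sentence, target, sense_signatures, pair_counts, word_counts, n_sents):
    """Pick the sense whose signature words associate most strongly with
    the sentence context, normalized by signature and context size."""
    context = [t for t in sentence.lower().split() if t != target]
    best_sense, best_score = None, float("-inf")
    for sense, signature in sense_signatures.items():
        score = sum(npmi(c, s, pair_counts, word_counts, n_sents)
                    for c in context for s in signature)
        score /= max(1, len(signature) * len(context))   # normalization step
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense
```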

2020
Vol 1
pp. 1-18
Author(s):
Amine Medad
Mauro Gaio
Ludovic Moncla
Sébastien Mustière
Yannick Le Nir

Abstract. Discourse may contain both named and nominal entities. Most common nouns or nominal mentions in natural language do not have a single, simple meaning but rather a number of related meanings. This form of ambiguity led to the development of a task in natural language processing known as Word Sense Disambiguation. Recognition and categorisation of named and nominal entities is an essential step for Word Sense Disambiguation methods. Up to now, named entity recognition and categorisation systems have mainly focused on the annotation, categorisation and identification of named entities. This paper focuses on the annotation and identification of spatial nominal entities. We explore the combination of the transfer learning principle and supervised learning algorithms in order to build a system to detect spatial nominal entities. For this purpose, different supervised learning algorithms are evaluated with three different context sizes on two manually annotated datasets built from Wikipedia articles and hiking description texts. The studied algorithms were selected for one or more of their specific properties potentially useful in solving our problem. The results of the first phase of experiments reveal that the selected algorithms have similar performance in terms of their ability to detect spatial nominal entities. The study also confirms the importance of the size of the window used to describe the context when the word-embedding principle is used to represent the semantics of each word.
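
As a rough illustration of the experimental setup (not the authors' exact pipeline), the sketch below represents a candidate nominal by the averaged word embeddings of a symmetric context window and trains a supervised classifier for several window sizes. The embeddings, toy sentences, and labels are placeholders, not the Wikipedia/hiking data used in the study.

```python
# Context-window embedding features + a supervised classifier, evaluated
# for several window sizes. Random vectors stand in for trained embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
DIM = 50
vocab = ["the", "valley", "trail", "village", "idea", "summit", "meeting"]
embedding = {w: rng.normal(size=DIM) for w in vocab}   # stand-in for word2vec/fastText

def context_features(tokens, index, window):
    """Average the embeddings of tokens within +/- window of the target."""
    lo, hi = max(0, index - window), min(len(tokens), index + window + 1)
    vecs = [embedding[t] for t in tokens[lo:hi] if t in embedding]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

# Toy annotated examples: (tokens, target index, is_spatial_nominal)
examples = [
    ("the trail climbs the valley".split(), 4, 1),
    ("the village lies below the summit".split(), 1, 1),
    ("the idea came during the meeting".split(), 1, 0),
]

for window in (1, 2, 3):   # compare context sizes, as in the study
    X = np.stack([context_features(t, i, window) for t, i, _ in examples])
    y = np.array([label for _, _, label in examples])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print(window, clf.score(X, y))
```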


2019
Vol 9 (2)
pp. 3985-3989
Author(s):
P. Sharma
N. Joshi

The purpose of word sense disambiguation (WSD) is to determine, with the help of a computer, the proper meaning of a word or lexeme in the available context of the problem area, together with the relationships between lexical items. This is done using natural language processing (NLP) techniques applied to queries, machine translation (MT), NLP-specific documents, or output text. MT automatically translates text from one natural language into another. Application areas for WSD include information retrieval (IR), lexicography, MT, text processing, speech processing, etc. In this article we investigate Hindi WSD using a knowledge-based technique, which incorporates word knowledge from external knowledge resources to remove the ambiguity of words. In this experiment, we developed a WSD tool based on a knowledge-based approach with the Hindi WordNet. The tool uses the knowledge-based Lesk algorithm for Hindi WSD. Our proposed system gives an accuracy of about 71.4%.
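
For readers unfamiliar with Lesk, here is a minimal sketch of the simplified variant: the selected sense is the one whose dictionary gloss overlaps most with the sentence context. A real system would query the Hindi WordNet for glosses; the tiny English sense inventory below is a hypothetical stand-in.

```python
# Simplified Lesk: pick the sense whose gloss shares the most words
# with the sentence context.
def simplified_lesk(context_sentence, ambiguous_word, sense_glosses):
    context = set(context_sentence.lower().split()) - {ambiguous_word}
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

senses_of_bank = {
    "bank.financial": "an institution that accepts deposits and lends money",
    "bank.river": "the sloping land beside a body of water such as a river",
}
print(simplified_lesk("he sat on the bank of the river", "bank", senses_of_bank))
# -> bank.river
```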


Information
2021
Vol 12 (11)
pp. 452
Author(s):
Ammar Arbaaeen
Asadullah Shah

Within the space of question answering (QA) systems, the most critical module for improving overall performance is question analysis processing. Extracting the lexical semantics of a natural language (NL) question presents challenges at the syntactic and semantic levels for most QA systems. This is due to the difference between the words posed by a user and the terms stored in the knowledge bases. Many studies have achieved encouraging results in lexical semantic resolution on the topic of word sense disambiguation (WSD), and several other works consider these challenges in the context of QA applications. However, few scholars have examined the role of WSD in returning potential answers corresponding to particular questions, and natural language processing (NLP) still faces several challenges in determining the precise meaning of various ambiguous expressions. Therefore, the motivation of this work is to propose a novel knowledge-based sense disambiguation (KSD) method for resolving the problem of lexical ambiguity associated with questions posed to QA systems. The major contribution is the proposed method, which incorporates multiple knowledge sources, namely the question's metadata (date/GPS), context knowledge, and a domain ontology, into a shallow NLP pipeline. The proposed KSD method is developed into a tool for a mobile QA application that aims to determine the intended meaning of questions expressed by pilgrims. The experimental results reveal that our method obtained comparable or better accuracy than the baselines in the context of the pilgrimage domain.
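
The abstract only names the knowledge sources, so the following sketch merely illustrates the general idea of fusing their evidence: each source scores every candidate sense and a weighted sum decides. The scorers, weights, and example question are hypothetical, not the paper's actual KSD scoring.

```python
# Combine evidence from several knowledge sources to rank candidate senses.
def combine_evidence(candidate_senses, scorers, weights):
    """Each scorer maps a sense id to a score in [0, 1]; senses are ranked
    by the weighted sum of the per-source scores."""
    ranking = []
    for sense in candidate_senses:
        total = sum(weights[name] * scorer(sense) for name, scorer in scorers.items())
        ranking.append((total, sense))
    return max(ranking)[1]

# Hypothetical scorers for the ambiguous word "place" in
# "What is the closest place to perform the ritual?"
scorers = {
    "metadata": lambda s: 1.0 if s == "place.religious_site" else 0.2,  # GPS near a holy site
    "context":  lambda s: 0.8 if s == "place.religious_site" else 0.5,  # context: ritual, perform
    "ontology": lambda s: 1.0 if s.startswith("place.") else 0.0,       # present in domain ontology
}
weights = {"metadata": 0.4, "context": 0.4, "ontology": 0.2}
print(combine_evidence(["place.religious_site", "place.position"], scorers, weights))
# -> place.religious_site
```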


2016
Vol 13 (10)
pp. 6929-6934
Author(s):
Junting Chen
Liyun Zhong
Caiyun Cai

Word sense disambiguation (WSD) in natural language text is a fundamental semantic understanding task at the lexical level in natural language processing (NLP) applications. Kernel methods such as the support vector machine (SVM) have been successfully applied to WSD, mainly because of their relatively high classification accuracy and their ability to handle high-dimensional, sparse data. A significant challenge in WSD is to reduce the need for labeled training data while maintaining acceptable performance. In this paper, we present a semi-supervised technique using the exponential kernel for WSD. Specifically, the semantic similarities between terms are first determined from both labeled and unlabeled training data by means of a diffusion process on a graph defined by lexicon and co-occurrence information, and the exponential kernel is then constructed based on the learned semantic similarity. Finally, an SVM classifier trains a model for each class during the training phase, and this model is applied to all test examples in the test phase. The main feature of this approach is that it exploits the exponential kernel to reveal the semantic similarities between terms in an unsupervised manner, which provides a kernel framework for semi-supervised learning. Experiments on several SENSEVAL benchmark data sets demonstrate that the proposed approach is sound and effective.
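
A compact sketch of the exponential-kernel idea follows, assuming the usual semantic diffusion construction K = expm(λB) over a term-term co-occurrence matrix B (not necessarily the authors' exact formulation). The toy contexts, labels, and λ value are placeholders, not SENSEVAL data.

```python
# Build a term-term co-occurrence matrix from labeled AND unlabeled text,
# diffuse it with a matrix exponential, and use x_i^T K x_j as a semantic
# kernel between bag-of-words contexts for an SVM with a precomputed kernel.
import numpy as np
from scipy.linalg import expm
from sklearn.svm import SVC

vocab = ["money", "deposit", "loan", "river", "water", "shore"]
docs = [
    "money deposit loan", "loan money", "deposit money money",   # financial "bank"
    "river water shore", "water river", "shore river water",     # river "bank"
]
labels = np.array([0, 0, 0, 1, 1, 1])

def bow(doc):
    return np.array([doc.split().count(w) for w in vocab], dtype=float)

X = np.stack([bow(d) for d in docs])

B = X.T @ X                         # term-term co-occurrence (unlabeled docs could be added)
B = B / np.linalg.norm(B)           # scale so the exponential stays well behaved
K_terms = expm(0.5 * B)             # exponential diffusion over the term graph

gram = X @ K_terms @ X.T            # semantic kernel between contexts
clf = SVC(kernel="precomputed").fit(gram, labels)
print(clf.predict(gram))            # sanity check on the training contexts
```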


Author(s):
Carlos Ramisch
Aline Villavicencio

In natural-language processing, multiword expressions (MWEs) have been the focus of much attention in their many forms, including idioms, nominal compounds, verbal expressions, and collocations. In addition to their relevance for lexicographic and terminographic work, their ubiquity in language affects the performance of tasks like parsing, word sense disambiguation, and natural-language generation. They lend a mark of naturalness and fluency to applications that can deal with them, ranging from machine translation to information retrieval. This chapter presents an overview of their linguistic characteristics and discusses a variety of proposals for incorporating them into language technology, covering type-based discovery, token-based identification, and MWE-aware language technology applications.
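
As a small, hedged illustration of the type-based discovery mentioned above, the sketch below ranks candidate bigrams by pointwise mutual information over a toy corpus; real discovery pipelines add frequency thresholds, part-of-speech filters, larger corpora, and better association measures.

```python
# Rank candidate bigrams by PMI as a first pass at type-based MWE discovery.
import math
from collections import Counter

corpus = [
    "he kicked the bucket last night",
    "she kicked the bucket too",
    "he kicked the ball over the fence",
]
unigrams = Counter(t for sent in corpus for t in sent.split())
bigrams = Counter()
for sent in corpus:
    toks = sent.split()
    bigrams.update(zip(toks, toks[1:]))
n_tokens = sum(unigrams.values())
n_bigrams = sum(bigrams.values())

def pmi(bigram):
    """Pointwise mutual information of a bigram under unigram independence."""
    w1, w2 = bigram
    p12 = bigrams[bigram] / n_bigrams
    p1, p2 = unigrams[w1] / n_tokens, unigrams[w2] / n_tokens
    return math.log2(p12 / (p1 * p2))

ranked = sorted(bigrams, key=pmi, reverse=True)
print(ranked[:3])   # raw PMI favors rare pairs; with filtering, idiomatic
                    # candidates like "kicked the bucket" surface
```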


2015
Vol 2015
pp. 1-6
Author(s):
Yuntong Liu
Hua Sun

In order to use semantics more effectively in natural language processing, a word sense disambiguation method for Chinese based on semantic computation is proposed. Word sense disambiguation for a Chinese clause can be achieved by solving the semantic model of the natural language; each step of the disambiguation process is discussed in detail, and the computational complexity of the process is analyzed. Finally, experiments were conducted to verify the effectiveness of the method.


2012
Vol 182-183
pp. 2109-2112
Author(s):
Lin Lin Yu
Deng Feng Xu
Li Fang Song
Guo Jie Li
Xu Dong Song

Word sense disambiguation (WSD) is a critical and difficult issue in natural language processing (NLP), and it is of great significance across many research areas of NLP. This paper presents a multi-strategy word sense disambiguation method. The method combines a corpus-based word-matching strategy with a strategy based on similarity and relevance, where the similarity and relevance calculations make full use of the sememe-tree information from HowNet. The experiments show that the proposed WSD method obtains better results.
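
A minimal sketch of sememe-tree similarity in the spirit of HowNet-based measures follows; the toy tree, the α parameter, and the formula sim = α / (α + distance) are illustrative assumptions, not the paper's exact calculation.

```python
# Similarity of two sememes decreases with their distance in the sememe tree.
ALPHA = 1.6
PARENT = {                       # child -> parent in a toy sememe tree
    "human": "animate", "animal": "animate",
    "animate": "entity", "artifact": "entity",
    "tool": "artifact", "entity": None,
}

def path_to_root(sememe):
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT.get(sememe)
    return path                  # sememe, parent, ..., root

def tree_distance(a, b):
    pa, pb = path_to_root(a), path_to_root(b)
    ancestors_b = {s: i for i, s in enumerate(pb)}
    for i, s in enumerate(pa):
        if s in ancestors_b:     # lowest common ancestor found
            return i + ancestors_b[s]
    return len(pa) + len(pb)

def sememe_similarity(a, b):
    return ALPHA / (ALPHA + tree_distance(a, b))

print(sememe_similarity("human", "animal"))   # siblings: distance 2
print(sememe_similarity("human", "tool"))     # farther apart: lower score
```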


Author(s):
Mr. Prashant Y. Itankar
Dr. Nikhat Raza

Word Sense Disambiguation (WSD) is one of the difficult tasks in the field of natural language processing (NLP). Building a sense-annotated corpus for multilingual WSD is out of reach for most languages, even when resources are available. In this paper we propose an unsupervised technique that uses word and sense embeddings to improve the performance of WSD systems from untagged corpora. We build two bags, namely a context bag and a wiki sense bag, and select the sense with the highest similarity. The wiki sense bag supplies the external knowledge the system needs to support disambiguation accuracy. We explore the Word2Vec model to generate the sense bag and observe a significant performance gain on our dataset.
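
Read this way, the core comparison can be sketched as below: average the word vectors of the context bag and of each wiki sense bag, then pick the sense with the highest cosine similarity. The random vectors stand in for a trained Word2Vec model, and the bags are hypothetical examples.

```python
# Context bag vs. wiki sense bags via averaged embeddings and cosine similarity.
import numpy as np

rng = np.random.default_rng(1)
DIM = 100

def vec(word, _cache={}):
    # Stand-in lookup; a real system would use trained Word2Vec vectors.
    if word not in _cache:
        _cache[word] = rng.normal(size=DIM)
    return _cache[word]

def bag_vector(words):
    return np.mean([vec(w) for w in words], axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def disambiguate(context_bag, wiki_sense_bags):
    c = bag_vector(context_bag)
    scores = {sense: cosine(c, bag_vector(words))
              for sense, words in wiki_sense_bags.items()}
    return max(scores, key=scores.get)

context_bag = ["deposited", "cheque", "account"]          # words around "bank"
wiki_sense_bags = {                                        # hypothetical sense bags
    "bank.financial": ["money", "loan", "account", "deposit"],
    "bank.river": ["water", "shore", "flood", "stream"],
}
print(disambiguate(context_bag, wiki_sense_bags))
```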

