word sense disambiguation
Recently Published Documents





Ali Saeed ◽  
Rao Muhammad Adeel Nawab ◽  
Mark Stevenson

Word Sense Disambiguation (WSD), the process of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. A range of approaches to the problem has been explored by the research community. The majority of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved to be extremely successful for a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Bidirectional Long Short-Term Memory, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. Results (Accuracy = 63.25% and F1-Measure = 0.49) show that a deep learning approach outperforms previously reported results for the Urdu All-Words WSD task, whereas the performance of deep learning approaches (Accuracy = 72.63% and F1-Measure = 0.60) is lower than previously reported results for the Urdu Lexical Sample task.

2022 ◽  
Vol 2022 ◽  
pp. 1-14
Chun-Xiang Zhang ◽  
Shu-Yang Pang ◽  
Xue-Yao Gao ◽  
Jia-Qi Lu ◽  
Bo Yu

To improve the disambiguation accuracy of biomedical words, this paper proposes a disambiguation method based on an attention neural network. The biomedical word is treated as the center. Morphology, part-of-speech, and semantic information from 4 adjacent lexical units are extracted as disambiguation features. The attention layer is used to generate a feature matrix. Average asymmetric convolutional neural networks (Av-ACNN) and bidirectional long short-term memory (Bi-LSTM) networks are used to extract features. The softmax function is applied to determine the semantic category of the biomedical word. For comparison, CNN, LSTM, and Bi-LSTM are also applied to biomedical WSD. The MSH corpus is used to optimize CNN, LSTM, Bi-LSTM, and the proposed method and to test their disambiguation performance. Experimental results show that the average disambiguation accuracy of the proposed method is higher than that of CNN, LSTM, and Bi-LSTM, reaching 91.38%.
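The feature window and the final softmax step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the split of the 4 adjacent units into two on each side, the `<pad>` token, the function names, and the example scores are all assumptions made for illustration, and the attention, Av-ACNN, and Bi-LSTM layers are omitted.

```python
import math

def context_window(tokens, center, left=2, right=2, pad="<pad>"):
    """Collect `left` units before and `right` units after the target word.

    Assumption: the 4 adjacent lexical units are split 2 + 2 around the center.
    Positions outside the sentence are filled with a padding token.
    """
    window = []
    for i in range(center - left, center + right + 1):
        if i == center:
            continue  # the ambiguous word itself is not part of its context
        window.append(tokens[i] if 0 <= i < len(tokens) else pad)
    return window

def softmax(scores):
    """Turn raw per-sense scores into a probability distribution."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

tokens = ["the", "patient", "had", "a", "cold", "last", "week"]
neighbors = context_window(tokens, tokens.index("cold"))

# Hypothetical scores for two candidate senses, standing in for network output.
probs = softmax([2.0, 0.5])
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In a full system, each neighbor would be mapped to morphology, part-of-speech, and semantic features before being passed to the network; here only the surface tokens are collected.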

Oleg Kalinin

The article reviews modern cognitive and discourse studies of metaphor. Drawing on an analysis and synthesis of foreign and domestic papers, the researcher classifies them from the ontological, axiological and epistemological points of view. The ontological level breaks down into two basic approaches, namely the metaphorical nature of discourse and the discursive nature of metaphors. The former analyses metaphors to understand the characteristics of discourse, while the latter studies metaphorical features in the context of discursive communication. The axiological aspect covers critical and descriptive studies, and the epistemological angle comprises quantitative and qualitative methods in metaphor studies. The paper also gives a thorough review of methods for identifying metaphors, including computer-assisted solutions (Word Sense Disambiguation, Categorisation, Metaphor Clusters), and of numerical analysis of the metaphorical nature of discourse: descriptor analysis, metaphor power index, cluster analysis, and complex metaphor power analysis. On the one hand, the conceptualization of research papers brings out the major features of the discursive approach to metaphors; on the other, multiple studies of metaphors in the context of discourse pave the way for a discursive trend in cognitive metaphorology.

2022 ◽  
Vol 2161 (1) ◽  
pp. 012035
Nemika Tyagi ◽  
Sudeshna Chakraborty ◽  
Jyotsna ◽  
Aditya Kumar ◽  
Nzanzu Katasohire Romeo

Word Sense Disambiguation (WSD) arises due to the presence of ambiguity in text during the semantic analysis of natural languages. It is a major unsolved problem in the area of Natural Language Processing (NLP) and its applications. This paper explores and reviews WSD algorithms that have contributed to, or created, state-of-the-art solutions in recent years. The paper also analyzes recent technological trends in the domain of WSD, which can help identify the possible future trajectory of the search for better WSD solutions.

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Muhammad Yasir ◽  
Li Chen ◽  
Amna Khatoon ◽  
Muhammad Amir Malik ◽  
Fazeel Abid

Mixed script identification is a hindrance for automated natural language processing systems. Mixing cursive scripts of different languages is a challenge because NLP methods like POS tagging and word sense disambiguation suffer from noisy text. This study tackles the challenge of mixed script identification for a mixed-code dataset consisting of Roman Urdu, Hindi, Saraiki, Bengali, and English. The language identification model is trained using word vectorization and RNN variants. Through experimental investigation, different architectures are optimized for the task: Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional Gated Recurrent Unit (Bi-GRU). The highest accuracy, 90.17, was achieved by Bi-GRU using learned word class features along with GloVe embeddings. The study also addresses issues related to multilingual environments, such as Roman words merged with English characters, generative spellings, and phonetic typing.

2021 ◽  
Vol 13 (6) ◽  
pp. 40-50
Abdo Ababor Abafogi ◽  

Language is the main means of communication used by humans. The same word can have different meanings depending on how it is used in a particular sentence, which makes it challenging for a computer to understand text at a human level. Word Sense Disambiguation (WSD), which aims to identify the correct sense of a given ambiguous word, is a long-standing problem in natural language processing (NLP). Because the major aim of WSD is to accurately understand the sense of a word in a particular context, it can be used for the correct labeling of words in natural language applications. In this paper, I propose a normalized statistical algorithm that performs the task of WSD for the Afaan Oromo language despite morphological analysis. The proposed algorithm can discriminate the senses of an ambiguous word without considering window size, without predefined rules, and without using an annotated dataset for training, which reduces a key challenge for under-resourced languages. The proposed system was tested on 249 sentences and evaluated with precision, recall, and F-measure. The overall effectiveness of the system is 80.76% in F-measure, which implies that the proposed system is promising for Afaan Oromo, one of the under-resourced languages spoken in East Africa. The algorithm can be extended to semantic text similarity with little or no modification. Furthermore, the suggested future directions could improve the performance of the proposed algorithm.
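For readers unfamiliar with the evaluation measures named above, the following Python sketch shows how precision, recall, and F-measure relate. The counts are hypothetical and are not taken from the paper; the balanced F1 form of the F-measure is assumed.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and balanced F-measure (F1) from raw counts.

    tp: correctly assigned senses; fp: wrongly assigned senses;
    fn: instances whose correct sense was missed.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts for illustration only.
p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall, so a system cannot score well by maximizing only one of them.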

Mr. Prashant Y. Itankar ◽  
Dr. Nikhat Raza

Implementation of Word Sense Disambiguation (WSD) is one of the difficult tasks in the area of Natural Language Processing (NLP). Generation of a sense-annotated corpus for multilingual WSD is out of reach for most languages, even when resources are available. In this paper we propose an unsupervised method that uses word and sense embeddings to improve the performance of WSD systems on untagged corpora. We create two bags, namely a context bag and a wiki sense bag, to find the senses with the highest similarity. The wiki sense bag provides the external knowledge the system needs to boost disambiguation accuracy. We explore the Word2Vec model to generate the sense bag and observe a significant performance gain on our dataset.
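The idea of comparing a context bag against per-sense bags can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' method: the two-dimensional vectors in `TOY_EMBEDDINGS` are invented stand-ins for Word2Vec embeddings, bags are represented by the mean of their word vectors, similarity is cosine similarity, and the sense labels and function names are hypothetical.

```python
import math

def mean_vector(words, embeddings):
    """Average the vectors of the words in a bag, skipping unknown words."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        raise ValueError("bag contains no words with embeddings")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def disambiguate(context_bag, sense_bags, embeddings):
    """Return the sense whose bag is most similar to the context bag."""
    ctx = mean_vector(context_bag, embeddings)
    scores = {
        sense: cosine(ctx, mean_vector(bag, embeddings))
        for sense, bag in sense_bags.items()
    }
    return max(scores, key=scores.get), scores

# Invented toy vectors; a real system would load trained embeddings.
TOY_EMBEDDINGS = {
    "river": [1.0, 0.0], "water": [0.9, 0.1], "shore": [0.8, 0.2],
    "money": [0.0, 1.0], "loan": [0.1, 0.9], "deposit": [0.2, 0.8],
}
sense_bags = {
    "bank/river": ["river", "shore"],
    "bank/finance": ["money", "loan"],
}
best, scores = disambiguate(["water", "shore"], sense_bags, TOY_EMBEDDINGS)
```

With these toy vectors, the context words "water" and "shore" lie closer to the river sense bag, so that sense is selected. No sense-tagged training data is involved, which is the point of the unsupervised setting.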

2021 ◽  
Nazreena Rahman ◽  
Bhogeswar Borah

This paper presents a query-based extractive text summarization method that uses a sense-oriented semantic relatedness measure. We propose a Word Sense Disambiguation (WSD) technique to find the exact sense of a word present in a sentence. It helps in extracting query-relevant sentences when calculating the sense-oriented semantic relatedness score between the query and each input text sentence. The proposed method uses five unique features to form clusters of query-relevant sentences. A redundancy removal technique is also put forward to eliminate redundant sentences. We evaluated the proposed WSD technique against existing methods using the Senseval and SemEval datasets. Experimental evaluation and discussion indicate that the proposed WSD method performs better than current systems in terms of F-score. We compare the proposed query-based extractive text summarization method with methods that participated in the Document Understanding Conference (DUC) as well as with current methods. Evaluation and comparison show that the proposed method outperforms many existing methods. As an unsupervised learning algorithm, it obtained the highest ROUGE (Recall-Oriented Understudy for Gisting Evaluation) score on all three DUC datasets (2005, 2006 and 2007). The proposed method is also comparable with supervised learning based algorithms. We also observe that the method can recognize query-relevant sentences that meet the query need.
