Word Sense Disambiguation Models Emerging Trends: A Comparative Analysis

Abstract. Word sense disambiguation (WSD) is needed because ambiguity in text complicates the semantic analysis of natural language. It remains a major unsolved problem in natural language processing (NLP) and its applications. This paper reviews WSD algorithms that have contributed to, or established, state-of-the-art solutions in recent years. It also analyzes recent technological trends in WSD to help identify the likely future trajectory of the search for better WSD solutions.

A Comparative Analysis of Supervised Word Sense Disambiguation in Information Retrieval

Communication and Intelligent Systems - Lecture Notes in Networks and Systems ◽

10.1007/978-981-16-1089-9_10 ◽

2021 ◽

pp. 111-120

Author(s):

Chandrakala Arya ◽

Manoj Diwakar ◽

Shobha Arya

Keyword(s):

Information Retrieval ◽

Comparative Analysis ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation

A critical analysis and explication of word sense disambiguation as approached by natural language processing

Lingua ◽

10.1016/j.lingua.2020.102896 ◽

2020 ◽

Vol 243 ◽

pp. 102896

Author(s):

Julie Mennes ◽

Stephan van der Waart van Gulik

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Critical Analysis ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation

deepBioWSD: effective deep neural word sense disambiguation of biomedical text data

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy189 ◽

2019 ◽

Vol 26 (5) ◽

pp. 438-446 ◽

Cited By ~ 3

Author(s):

Ahmad Pesaranghader ◽

Stan Matwin ◽

Marina Sokolova ◽

Ali Pesaranghader

Keyword(s):

Language Processing ◽

Short Term Memory ◽

Word Sense Disambiguation ◽

Training Data ◽

Biomedical Text ◽

Word Sense ◽

Vocabulary Size ◽

Unified Medical Language System ◽

Knowledge Based ◽

Sense Disambiguation

Abstract. Objective: In biomedicine, there is a wealth of information hidden in unstructured narratives such as research articles and clinical reports. To exploit these data properly, a word sense disambiguation (WSD) algorithm prevents downstream difficulties in the natural language processing applications pipeline. Supervised WSD algorithms largely outperform un- or semisupervised and knowledge-based methods; however, they train one separate classifier for each ambiguous term, necessitating a large amount of expert-labeled training data, an unattainable goal in medical informatics. To alleviate this need, a single model that shares statistical strength across all instances and scales well with the vocabulary size is desirable. Materials and Methods: Built on recent advances in deep learning, our deepBioWSD model leverages a single bidirectional long short-term memory network that makes sense predictions for any ambiguous term. In the model, the Unified Medical Language System sense embeddings are first computed from their text definitions; then, after the network is initialized with these embeddings, it is trained on all available training data collectively. The method also introduces a novel technique for automatically collecting training data from PubMed to (pre)train the network in an unsupervised manner. Results: We use the MSH WSD dataset to compare WSD algorithms, with macro and micro accuracies employed as evaluation metrics. deepBioWSD outperforms existing models in biomedical text WSD, achieving state-of-the-art performance of 96.82% macro accuracy. Conclusions: Apart from the disambiguation improvement and unsupervised training, deepBioWSD depends on considerably less expert-labeled data because it learns the target and the context terms jointly. These properties make deepBioWSD conveniently deployable in real-time biomedical applications.
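The difference between the two evaluation metrics named above can be illustrated with a small sketch. The abstract does not spell out the averaging scheme, so this assumes the common convention: micro accuracy pools all test instances, while macro accuracy averages the per-term accuracies. The function name and the toy records are hypothetical and are not taken from the MSH WSD dataset.

```python
from collections import defaultdict


def micro_macro_accuracy(records):
    """Return (micro, macro) accuracy.

    records: iterable of (ambiguous_term, gold_sense, predicted_sense).
    Assumption: macro accuracy is the unweighted mean of per-term accuracies.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for term, gold, predicted in records:
        total[term] += 1
        correct[term] += int(gold == predicted)

    # Micro: every instance counts equally, so frequent terms dominate.
    micro = sum(correct.values()) / sum(total.values())
    # Macro: every ambiguous term counts equally, regardless of frequency.
    macro = sum(correct[t] / total[t] for t in total) / len(total)
    return micro, macro


# Hypothetical toy predictions for two ambiguous terms.
toy_records = [
    ("cold", "temperature", "temperature"),
    ("cold", "illness", "illness"),
    ("cold", "illness", "temperature"),
    ("cold", "illness", "illness"),
    ("discharge", "release", "release"),
]
micro, macro = micro_macro_accuracy(toy_records)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

Under this convention the two numbers diverge whenever terms have different numbers of test instances, which is one reason both are usually reported together.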

Ontology Matching using BabelNet Dictionary and Word Sense Disambiguation Algorithms

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v5.i1.pp196-205 ◽

2017 ◽

Vol 5 (1) ◽

pp. 196 ◽

Cited By ~ 5

Author(s):

Mohamed Biniz ◽

Rachid El Ayachi ◽

Mohamed Fakir

Keyword(s):

Natural Language Processing ◽

Language Processing ◽

Word Sense Disambiguation ◽

Similarity Measures ◽

Ontology Matching ◽

Word Sense ◽

Sense Disambiguation ◽

Lesk Algorithm ◽

Reference Ontology ◽

Selection Of

Ontology matching refers to two things: first, the process of discovering correspondences between two different ontologies, and second, the result of this process, that is, the expression of those correspondences. It is a crucial task for solving the problems of merging and evolving heterogeneous ontologies in Semantic Web applications. The domain poses several challenges, among them the selection of appropriate similarity measures for discovering the correspondences. In this article, we study algorithms that calculate semantic similarity between two ontologies, namely the Adapted Lesk algorithm, the Wu & Palmer algorithm, the Resnik algorithm, the Leacock and Chodorow algorithm, and similarity flooding, using BabelNet as the reference ontology. We implement them and compare them experimentally. Overall, the most effective methods are Wu & Palmer and Adapted Lesk, which is widely used for Word Sense Disambiguation (WSD) in the field of Automatic Natural Language Processing (NLP).
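As an illustration of one of the measures listed above, the following is a minimal sketch of Wu & Palmer similarity on a tiny hand-made taxonomy. It uses the standard formula 2 * depth(LCS) / (depth(a) + depth(b)), where LCS is the lowest common subsumer, and it assumes the root has depth 1. The taxonomy and all names are hypothetical; the sketch does not use BabelNet and is not the authors' implementation.

```python
# Hypothetical toy taxonomy: child -> parent (the root maps to None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(node):
    """Return the chain [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (an assumption)."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest concept that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward from b, deepest first
        if node in ancestors_of_a:
            return node
    raise ValueError("concepts do not share a root")


def wu_palmer(a, b):
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wu_palmer("dog", "cat"))  # siblings under "animal"
print(wu_palmer("dog", "car"))  # related only through the root
```

Concepts that share a deep common ancestor score closer to 1, and concepts that meet only at the root score lowest.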

A Novel Approach to Word Sense Disambiguation Based on Topical and Semantic Association

The Scientific World JOURNAL ◽

10.1155/2013/586327 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 2

Author(s):

Xin Wang ◽

Wanli Zuo ◽

Ying Wang

Keyword(s):

Language Processing ◽

Fundamental Problem ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Semantic Features ◽

Word Sense ◽

Semantic Association ◽

Data Set ◽

Novel Approach ◽

Sense Disambiguation

Word sense disambiguation (WSD) is a fundamental problem in natural language processing, the objective of which is to identify the most appropriate sense for an ambiguous word in a given context. Although WSD has been researched over the years, the performance of existing algorithms in terms of accuracy and recall is still unsatisfactory. In this paper, we propose a novel approach to word sense disambiguation based on topical and semantic association. For a given document, supposing that its topic category is accurately discriminated, the correct sense of the ambiguous term is identified through the corresponding topic and semantic contexts. We first extract topic-discriminative terms from the document and construct a topical graph based on topic span intervals to implement topic identification. We then exploit syntactic features, topic span features, and semantic features to disambiguate nouns and verbs in the context of the ambiguous word. Finally, we conduct experiments on the standard data set SemCor to evaluate the performance of the proposed method, and the results indicate that our approach achieves relatively better performance than existing approaches.

A comparative analysis of Hindi word sense disambiguation and its approaches

International Conference on Computing, Communication & Automation ◽

10.1109/ccaa.2015.7148396 ◽

2015 ◽

Cited By ~ 3

Author(s):

Sarika ◽

Dilip Kumar Sharma

Keyword(s):

Comparative Analysis ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation

Comparing supervised learning algorithms for Spatial Nominal Entity recognition

AGILE: GIScience Series ◽

10.5194/agile-giss-1-15-2020 ◽

2020 ◽

Vol 1 ◽

pp. 1-18

Author(s):

Amine Medad ◽

Mauro Gaio ◽

Ludovic Moncla ◽

Sébastien Mustière ◽

Yannick Le Nir

Keyword(s):

Natural Language ◽

Supervised Learning ◽

Language Processing ◽

Word Sense Disambiguation ◽

Learning Algorithms ◽

Named Entity Recognition ◽

Entity Recognition ◽

Word Sense ◽

Sense Disambiguation ◽

Supervised Learning Algorithms

Abstract. Discourse may contain both named and nominal entities. Most common nouns or nominal mentions in natural language do not have a single, simple meaning but rather a number of related meanings. This form of ambiguity led to the development of a task in natural language processing known as Word Sense Disambiguation. Recognition and categorisation of named and nominal entities is an essential step for Word Sense Disambiguation methods. Up to now, named entity recognition and categorisation systems have mainly focused on the annotation, categorisation and identification of named entities. This paper focuses on the annotation and identification of spatial nominal entities. We explore the combination of the transfer learning principle and supervised learning algorithms in order to build a system that detects spatial nominal entities. For this purpose, different supervised learning algorithms are evaluated with three different context sizes on two manually annotated datasets built from Wikipedia articles and hiking description texts. The studied algorithms were selected for one or more specific properties that are potentially useful in solving our problem. The results of the first phase of experiments reveal that the selected algorithms perform similarly in their ability to detect spatial nominal entities. The study also confirms the importance of the size of the window used to describe the context when the word-embedding principle is used to represent the semantics of each word.
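The notion of a context window of a given size can be sketched as follows. This is only an illustration of the general idea, assuming a symmetric window measured in tokens on each side of the target word; the function name, the example sentence, and the window sizes are hypothetical and are not taken from the paper.

```python
def context_window(tokens, index, size):
    """Return (left, right) context of up to `size` tokens around tokens[index].

    The window is truncated at sentence boundaries rather than padded.
    """
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left, right


# Hypothetical hiking-style sentence; the target is the nominal mention "ridge".
sentence = "the trail follows the ridge above the lake".split()
target = sentence.index("ridge")

for size in (1, 2, 3):
    print(size, context_window(sentence, target, size))
```

A larger window gives a classifier more surrounding words to represent, at the cost of including words that may be unrelated to the target mention.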

Knowledge-Based Method for Word Sense Disambiguation by Using Hindi WordNet

Engineering, Technology & Applied Science Research ◽

10.48084/etasr.2596 ◽

2019 ◽

Vol 9 (2) ◽

pp. 3985-3989 ◽

Cited By ~ 1

Author(s):

P. Sharma ◽

N. Joshi

Keyword(s):

Natural Language ◽

Language Processing ◽

Speech Processing ◽

Text Processing ◽

Word Sense Disambiguation ◽

Problem Area ◽

Word Sense ◽

Knowledge Resources ◽

Knowledge Based ◽

Sense Disambiguation

The purpose of word sense disambiguation (WSD) is to computationally determine the proper meaning of a lexeme in its available context within the problem area, along with the relationship between lexicons. This is done using natural language processing (NLP) techniques which involve queries from machine translation (MT), NLP-specific documents, or output text. MT automatically translates text from one natural language into another. Application areas for WSD include information retrieval (IR), lexicography, MT, text processing, and speech processing. In this article, we investigate Hindi WSD using a knowledge-based technique, which incorporates word knowledge from external knowledge resources to resolve the ambiguity of words. In this experiment, we developed a WSD tool that takes a knowledge-based approach with the Hindi WordNet. The tool uses the knowledge-based Lesk algorithm for Hindi WSD. Our proposed system achieves an accuracy of about 71.4%.
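The core idea of the Lesk algorithm mentioned above can be sketched in a few lines: choose the sense whose dictionary gloss shares the most words with the context of the ambiguous word. This is the simplified variant of Lesk, shown on a hypothetical two-sense English inventory for readability; the sense labels, glosses, and stopword list are invented for illustration and do not come from the Hindi WordNet or from the authors' tool.

```python
# Hypothetical stopword list and sense inventory, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "that", "or", "at"}

SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits of money and lends money",
        "bank.river": "sloping land at the side of a river or other body of water",
    }
}


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, sentence):
    """Return the sense label whose gloss overlaps most with the sentence."""
    context = content_words(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:  # ties keep the first-listed sense
            best_sense, best_overlap = sense, overlap
    return best_sense


print(simplified_lesk("bank", "he sat on the bank of the river and watched the water"))
print(simplified_lesk("bank", "she went to the bank to deposit money"))
```

A real system would also need tokenization and morphological normalization suited to the language, since exact string overlap misses inflected forms such as "deposit" versus "deposits".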

Word-Sense Disambiguation

10.1093/oxfordhb/9780199276349.013.0013 ◽

2012 ◽

Cited By ~ 1

Author(s):

Mark Stevenson ◽

Yorick Wilks

Keyword(s):

Language Processing ◽

Word Sense Disambiguation ◽

Syntactic Analysis ◽

Word Sense ◽

Part Of Speech Tagging ◽

Part Of Speech ◽

Evaluation Strategies ◽

Sense Disambiguation ◽

Translation Systems ◽

Speech Tagging

Word-sense disambiguation (WSD) is the process of identifying the meanings of words in context. This article begins with discussing the origins of the problem in the earliest machine translation systems. Early attempts to solve the WSD problem suffered from a lack of coverage. The main approaches to tackle the problem were dictionary-based, connectionist, and statistical strategies. This article concludes with a review of evaluation strategies for WSD and possible applications of the technology. WSD is an ‘intermediate’ task in language processing: like part-of-speech tagging or syntactic analysis, it is unlikely that anyone other than linguists would be interested in its results for their own sake. ‘Final’ tasks produce results of use to those without a specific interest in language and often make use of ‘intermediate’ tasks. WSD is a long-standing and important problem in the field of language processing.

Supervised Word Sense Disambiguation on Polysemy with Neural Network Models: A Case Study of BUN in Taiwan Hakka

International Journal of Asian Language Processing ◽

10.1142/s2717554520500113 ◽

2021 ◽

pp. 2050011

Author(s):

Huei-Ling Lai ◽

Hsiao-Ling Hsu ◽

Jyi-Shane Liu ◽

Chia-Hung Lin ◽

Yanhong Chen

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Language Processing ◽

Word Sense Disambiguation ◽

Network Models ◽

Word Sense ◽

Neural Network Models ◽

Low Resource ◽

Sense Disambiguation

While word sense disambiguation (WSD) has been extensively studied in natural language processing, the task still receives little attention in low-resource languages. Findings based on a few dominant languages may lead to narrow applications. Language-specific WSD systems are needed for low-resource languages such as Taiwan Hakka. This study examines the performance of DNN and Bi-LSTM models in WSD tasks on the polysemous BUN in Taiwan Hakka. Both models are trained and tested on a small amount of hand-crafted labeled data. Two experiments are designed with four kinds of input features and two window spans to explore what information the models need to achieve their best performance. The results show that, to achieve their best performance, the DNN and Bi-LSTM models prefer different kinds of input features and window spans.
