Neural Network Models for Word Sense Disambiguation: An Overview

Abstract The following article presents an overview of the use of artificial neural networks for the task of Word Sense Disambiguation (WSD). More specifically, it surveys the advances in neural language models in recent years that have resulted in methods for the effective distributed representation of linguistic units. Such representations – word embeddings, context embeddings, sense embeddings – can be effectively applied for WSD purposes, as they encode rich semantic information, especially in conjunction with recurrent neural networks, which are able to capture long-distance relations encoded in word order, syntax, information structuring.

Download Full-text

Supervised Word Sense Disambiguation on Polysemy with Neural Network Models: A Case Study of BUN in Taiwan Hakka

International Journal of Asian Language Processing ◽

10.1142/s2717554520500113 ◽

2021 ◽

pp. 2050011

Author(s):

Huei-Ling Lai ◽

Hsiao-Ling Hsu ◽

Jyi-Shane Liu ◽

Chia-Hung Lin ◽

Yanhong Chen

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Language Processing ◽

Word Sense Disambiguation ◽

Network Models ◽

Word Sense ◽

Neural Network Models ◽

Low Resource ◽

Sense Disambiguation

While word sense disambiguation (WSD) has been extensively studied in natural language processing, such a task in low-resource languages still receives little attention. Findings based on a few dominant languages may lead to narrow applications. A language-specific WSD system is in need to implement in low-resource languages, for instance, in Taiwan Hakka. This study examines the performance of DNN and Bi-LSTM in WSD tasks on polysemous BUNin Taiwan Hakka. Both models are trained and tested on a small amount of hand-crafted labeled data. Two experiments are designed with four kinds of input features and two window spans to explore what information is needed for the models to achieve their best performance. The results show that to achieve the best performance, DNN and Bi-LSTM models prefer different kinds of input features and window spans.

Download Full-text

Spreading semantic information by Word Sense Disambiguation

Knowledge-Based Systems ◽

10.1016/j.knosys.2017.06.013 ◽

2017 ◽

Vol 132 ◽

pp. 47-61 ◽

Cited By ~ 5

Author(s):

Yoan Gutiérrez ◽

Sonia Vázquez ◽

Andrés Montoyo

Keyword(s):

Semantic Information ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation

Download Full-text

Toward Universal Word Sense Disambiguation Using Deep Neural Networks

IEEE Access ◽

10.1109/access.2019.2914921 ◽

2019 ◽

Vol 7 ◽

pp. 60264-60275 ◽

Cited By ~ 1

Author(s):

Hiram Calvo ◽

Arturo P. Rocha-Ramirez ◽

Marco A. Moreno-Armendariz ◽

Carlos A. Duchanoy

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation

Download Full-text

A Lexicographic Encoding for Word Sense Disambiguation with Evolutionary Neural Networks

AI*IA 2009: Emergent Perspectives in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-642-10291-2_20 ◽

2009 ◽

pp. 192-201 ◽

Cited By ~ 2

Author(s):

A. Azzini ◽

C. da Costa Pereira ◽

M. Dragoni ◽

A. G. B. Tettamanzi

Keyword(s):

Neural Networks ◽

Word Sense Disambiguation ◽

Word Sense ◽

Evolutionary Neural Networks ◽

Sense Disambiguation

Download Full-text

AutoExtend: Combining Word Embeddings with Semantic Resources

Computational Linguistics ◽

10.1162/coli_a_00294 ◽

2017 ◽

Vol 43 (3) ◽

pp. 593-617 ◽

Cited By ~ 4

Author(s):

Sascha Rothe ◽

Hinrich Schütze

Keyword(s):

Semantic Information ◽

State Of The Art ◽

Word Sense Disambiguation ◽

Input Word ◽

Word Sense ◽

Word Embeddings ◽

Training Corpus ◽

Context Similarity ◽

Sense Disambiguation ◽

Semantic Resources

We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings that incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The obtained embeddings live in the same vector space as the input word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet, GermaNet, and Freebase as semantic resources. AutoExtend achieves state-of-the-art performance on Word-in-Context Similarity and Word Sense Disambiguation tasks.

Download Full-text

Multi-Task Learning for Metaphor Detection with Graph Convolutional Neural Networks and Word Sense Disambiguation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6326 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8139-8146

Author(s):

Duong Le ◽

My Thai ◽

Thien Nguyen

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Word Sense Disambiguation ◽

Word Sense ◽

Knowledge Resources ◽

Useful Knowledge ◽

Learning Framework ◽

Task Learning ◽

Sense Disambiguation

The current deep learning works on metaphor detection have only considered this task independently, ignoring the useful knowledge from the related tasks and knowledge resources. In this work, we introduce two novel mechanisms to improve the performance of the deep learning models for metaphor detection. The first mechanism employs graph convolutional neural networks (GCN) with dependency parse trees to directly connect the words of interest with their important context words for metaphor detection. The GCN networks in this work also present a novel control mechanism to filter the learned representation vectors to retain the most important information for metaphor detection. The second mechanism, on the other hand, features a multi-task learning framework that exploits the similarity between word sense disambiguation and metaphor detection to transfer the knowledge between the two tasks. The extensive experiments demonstrate the effectiveness of the proposed techniques, yielding the state-of-the-art performance over several datasets.

Download Full-text

Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2017.08.001 ◽

2017 ◽

Vol 73 ◽

pp. 137-147 ◽

Cited By ~ 19

Author(s):

Antonio Jimeno Yepes

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Short Term Memory ◽

Word Sense Disambiguation ◽

Word Sense ◽

Word Embeddings ◽

Short Term ◽

Term Memory ◽

Sense Disambiguation ◽

Long Short Term Memory

Download Full-text

Evolving Neural Networks for Word Sense Disambiguation

10.1109/his.2008.88 ◽

2008 ◽

Cited By ~ 4

Author(s):

A. Azzini ◽

C. da Costa Pereira ◽

M. Dragoni ◽

A.G.B. Tettamanzi

Keyword(s):

Neural Networks ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation

Download Full-text

Analysis and Evaluation of Language Models for Word Sense Disambiguation

Computational Linguistics ◽

10.1162/coli_a_00405 ◽

2021 ◽

pp. 1-55

Author(s):

Daniel Loureiro ◽

Kiamehr Rezaee ◽

Mohammad Taher Pilehvar ◽

Jose Camacho-Collados

Keyword(s):

Feature Extraction ◽

Word Sense Disambiguation ◽

Language Model ◽

Training Data ◽

Fine Tuning ◽

Language Models ◽

Coarse Grained ◽

Word Sense ◽

Sense Disambiguation ◽

High Level

Abstract Transformer-based language models have taken many fields in NLP by storm. BERT and its derivatives dominate most of the existing evaluation benchmarks, including those for Word Sense Disambiguation (WSD), thanks to their ability in capturing context-sensitive semantic nuances. However, there is still little knowledge about their capabilities and potential limitations in encoding and recovering word senses. In this article, we provide an in-depth quantitative and qualitative analysis of the celebrated BERT model with respect to lexical ambiguity. One of the main conclusions of our analysis is that BERT can accurately capture high-level sense distinctions, even when a limited number of examples is available for each word sense. Our analysis also reveals that in some cases language models come close to solving coarse-grained noun disambiguation under ideal conditions in terms of availability of training data and computing resources. However, this scenario rarely occurs in real-world settings and, hence, many practical challenges remain even in the coarse-grained setting. We also perform an in-depth comparison of the two main language model based WSD strategies, i.e., fine-tuning and feature extraction, finding that the latter approach is more robust with respect to sense bias and it can better exploit limited available training data. In fact, the simple feature extraction strategy of averaging contextualized embeddings proves robust even using only three training sentences per word sense, with minimal improvements obtained by increasing the size of this training data.

Download Full-text

Investigating Semantic Augmentation in Virtual Environments for Image Segmentation Using Convolutional Neural Networks

Journal of Imaging ◽

10.3390/jimaging7080146 ◽

2021 ◽

Vol 7 (8) ◽

pp. 146

Author(s):

Joshua Ganter ◽

Simon Löffler ◽

Ron Metzger ◽

Katharina Ußling ◽

Christoph Müller

Keyword(s):

Neural Network ◽

Neural Networks ◽

Virtual Environments ◽

Semantic Information ◽

Synthetic Data ◽

Image Data ◽

Network Models ◽

Neural Network Models ◽

Real World Data ◽

Virtual Domain

Collecting real-world data for the training of neural networks is enormously time- consuming and expensive. As such, the concept of virtualizing the domain and creating synthetic data has been analyzed in many instances. This virtualization offers many possibilities of changing the domain, and with that, enabling the relatively fast creation of data. It also offers the chance to enhance necessary augmentations with additional semantic information when compared with conventional augmentation methods. This raises the question of whether such semantic changes, which can be seen as augmentations of the virtual domain, contribute to better results for neural networks, when trained with data augmented this way. In this paper, a virtual dataset is presented, including semantic augmentations and automatically generated annotations, as well as a comparison between semantic and conventional augmentation for image data. It is determined that the results differ only marginally for neural network models trained with the two augmentation approaches.

Download Full-text