Layered Multistep Bidirectional Long Short-Term Memory Networks for Biomedical Word Sense Disambiguation

Word Sense Disambiguation (WSD), the process of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. A range of approaches to the problem has been explored by the research community. The majority of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved to be extremely successful for a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Bidirectional Long Short-Term Memory, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. Results (Accuracy = 63.25% and F1-Measure = 0.49) show that a deep learning approach outperforms previously reported results for the Urdu All-Words WSD task, whereas performance using deep learning approaches (Accuracy = 72.63% and F1-Measure = 0.60) is lower than previously reported results for the Urdu Lexical Sample task.
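The abstract reports both Accuracy and F1-Measure, and the two can diverge sharply when sense frequencies are skewed. The following is a minimal sketch of how the two metrics might be computed. It assumes macro-averaged F1, which the abstract does not state, and the sense labels and predictions are invented toy data, not drawn from either corpus.

```python
from typing import List


def accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of instances whose predicted sense matches the gold sense."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: List[str], pred: List[str]) -> float:
    """Unweighted mean of per-sense F1 (assumption: macro averaging)."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A sense that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: a classifier that always predicts the most frequent sense.
gold = ["s1", "s1", "s1", "s1", "s2", "s3"]
pred = ["s1", "s1", "s1", "s1", "s1", "s1"]

acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
print(f"accuracy={acc:.4f} macro_f1={f1:.4f}")
```

In this toy case the majority-sense baseline scores well on accuracy but poorly on macro F1, because the two rare senses each contribute an F1 of zero.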

Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks

BMC Bioinformatics, 2019, Vol 20 (S16). DOI: 10.1186/s12859-019-3079-8. Cited By ~ 3

Author(s): Canlin Zhang, Daniel Biś, Xiuwen Liu, Zhe He

Keyword(s): Neural Network, Short Term Memory, State Of The Art, Word Sense Disambiguation, Word Sense, Short Term, Term Memory, Attention Model, Sense Disambiguation, Long Short Term Memory

Background: In recent years, deep learning methods have been applied to many natural language processing tasks to achieve state-of-the-art performance. However, in the biomedical domain, they have not outperformed supervised word sense disambiguation (WSD) methods based on support vector machines or random forests, possibly due to inherent similarities of medical word senses.

Results: In this paper, we propose two deep-learning-based models for supervised WSD: a model based on a bi-directional long short-term memory (BiLSTM) network, and an attention model based on a self-attention architecture. Our results show that the BiLSTM neural network model with a suitable upper layer structure performs even better than the existing state-of-the-art models on the MSH WSD dataset, while our attention model was 3 or 4 times faster than our BiLSTM model with good accuracy. In addition, we trained “universal” models in order to disambiguate all ambiguous words together. That is, we concatenate the embedding of the target ambiguous word to the max-pooled vector in the universal models, acting as a “hint”. The results show that our universal BiLSTM neural network model yielded about 90 percent accuracy.

Conclusion: Deep contextual models based on sequential information processing methods are able to capture the relative contextual information from pre-trained input word embeddings, in order to provide state-of-the-art results for supervised biomedical WSD tasks.
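The “hint” step in the universal models can be illustrated with a small sketch. It assumes the max-pooling is taken element-wise over the per-token hidden states of the contextual encoder, which the abstract does not spell out. The function names, vector sizes, and numbers below are hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]


def max_pool_over_time(hidden_states: List[Vector]) -> Vector:
    """Element-wise maximum across time steps (one vector per token)."""
    if not hidden_states:
        raise ValueError("need at least one time step")
    return [max(column) for column in zip(*hidden_states)]


def universal_features(hidden_states: List[Vector],
                       target_embedding: Vector) -> Vector:
    """Max-pooled context vector followed by the target word embedding.

    The appended embedding acts as the "hint" telling a shared classifier
    which ambiguous word is being disambiguated.
    """
    return max_pool_over_time(hidden_states) + list(target_embedding)


# Toy encoder outputs for a 3-token context, hidden size 3 (invented values).
hidden_states = [
    [0.1, -0.4, 0.3],
    [0.7, 0.2, -0.1],
    [0.0, 0.5, 0.2],
]
# Toy embedding of the ambiguous target word, size 2 (invented values).
target_embedding = [1.0, -1.0]

features = universal_features(hidden_states, target_embedding)
print(features)
```

The resulting vector has the hidden size plus the embedding size as its length, and would be passed to the classification layers in place of the pooled vector alone.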

Attention Neural Network for Biomedical Word Sense Disambiguation

Discrete Dynamics in Nature and Society, 2022, Vol 2022, pp. 1-14. DOI: 10.1155/2022/6182058

Author(s): Chun-Xiang Zhang, Shu-Yang Pang, Xue-Yao Gao, Jia-Qi Lu, Bo Yu

Keyword(s): Neural Network, Semantic Information, Short Term Memory, Word Sense Disambiguation, Semantic Category, Word Sense, Short Term, Part Of Speech, Sense Disambiguation, Long Short Term Memory

To improve the disambiguation accuracy of biomedical words, this paper proposes a disambiguation method based on an attention neural network. The ambiguous biomedical word is treated as the center of a context window. Morphology, part of speech, and semantic information from 4 adjacent lexical units are extracted as disambiguation features. The attention layer is used to generate a feature matrix. Average asymmetric convolutional neural networks (Av-ACNN) and bidirectional long short-term memory (Bi-LSTM) networks are used to extract features. The softmax function is applied to determine the semantic category of the biomedical word. For comparison, CNN, LSTM, and Bi-LSTM are also applied to biomedical WSD. The MSH corpus is used to optimize CNN, LSTM, Bi-LSTM, and the proposed method and to test their disambiguation performance. Experimental results show that the average disambiguation accuracy of the proposed method is improved compared with CNN, LSTM, and Bi-LSTM. The average disambiguation accuracy of the proposed method reaches 91.38%.
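Two steps of this pipeline, selecting the adjacent lexical units and choosing a semantic category with softmax, can be sketched as follows. The sketch assumes the 4 adjacent units are the two to the left and the two to the right of the ambiguous word, which the abstract does not specify. The padding token, example sentence, and class scores are invented for illustration. The attention layer, Av-ACNN, and Bi-LSTM feature extractors are not modeled here.

```python
import math
from typing import List


def context_window(tokens: List[str], center: int,
                   left: int = 2, right: int = 2,
                   pad: str = "<pad>") -> List[str]:
    """Return the lexical units around tokens[center], padding at the edges.

    Assumption: 2 units on each side, giving 4 adjacent units in total.
    """
    before = [tokens[i] if i >= 0 else pad
              for i in range(center - left, center)]
    after = [tokens[i] if i < len(tokens) else pad
             for i in range(center + 1, center + 1 + right)]
    return before + after


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over raw class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy sentence with the ambiguous word "cold" at index 2.
tokens = ["patients", "with", "cold", "symptoms", "recovered"]
window = context_window(tokens, center=2)
print(window)

# Hypothetical scores for three candidate semantic categories.
probs = softmax([2.0, 0.5, -1.0])
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

In a full system, each unit in the window would be mapped to its morphology, part-of-speech, and semantic features before being passed to the network.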
