Attention Neural Network for Biomedical Word Sense Disambiguation

2022 ◽  
Vol 2022 ◽  
pp. 1-14
Author(s):  
Chun-Xiang Zhang ◽  
Shu-Yang Pang ◽  
Xue-Yao Gao ◽  
Jia-Qi Lu ◽  
Bo Yu

To improve the disambiguation accuracy of biomedical words, this paper proposes a disambiguation method based on an attention neural network. The biomedical word is treated as the center, and morphology, part of speech, and semantic information from 4 adjacent lexical units are extracted as disambiguation features. An attention layer is used to generate a feature matrix. Average asymmetric convolutional neural networks (Av-ACNN) and bidirectional long short-term memory (Bi-LSTM) networks are used to extract features. The softmax function is applied to determine the semantic category of the biomedical word. For comparison, CNN, LSTM, and Bi-LSTM are also applied to biomedical WSD. The MSH corpus is used to optimize CNN, LSTM, Bi-LSTM, and the proposed method and to test their disambiguation performance. Experimental results show that the proposed method improves average disambiguation accuracy over CNN, LSTM, and Bi-LSTM, reaching 91.38%.
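The feature-extraction step starts from the lexical units around the ambiguous word. The sketch below shows one way to collect such a window. It assumes that "4 adjacent lexical units" means two units on each side of the target, which the abstract does not state; the function name `context_window`, the padding token, and the example sentence are hypothetical.

```python
from typing import List


def context_window(tokens: List[str], target_index: int,
                   left: int = 2, right: int = 2,
                   pad: str = "<PAD>") -> List[str]:
    """Return `left` tokens before and `right` tokens after the target.

    Assumption: 2 + 2 = 4 adjacent units; positions that fall outside
    the sentence are filled with a padding token.
    """
    offsets = list(range(-left, 0)) + list(range(1, right + 1))
    window = []
    for offset in offsets:
        i = target_index + offset
        window.append(tokens[i] if 0 <= i < len(tokens) else pad)
    return window


# Hypothetical example sentence with the ambiguous word "cold".
sentence = "the patient was kept in cold storage overnight".split()
print(context_window(sentence, sentence.index("cold")))
```

In a full system, each unit in the window would then be mapped to its morphology, part-of-speech, and semantic features before being passed to the attention layer.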

2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Canlin Zhang ◽  
Daniel Biś ◽  
Xiuwen Liu ◽  
Zhe He

Background: In recent years, deep learning methods have been applied to many natural language processing tasks to achieve state-of-the-art performance. However, in the biomedical domain, they have not outperformed supervised word sense disambiguation (WSD) methods based on support vector machines or random forests, possibly due to inherent similarities of medical word senses. Results: In this paper, we propose two deep-learning-based models for supervised WSD: a model based on a bi-directional long short-term memory (BiLSTM) network, and an attention model based on a self-attention architecture. Our results show that the BiLSTM neural network model with a suitable upper layer structure performs even better than the existing state-of-the-art models on the MSH WSD dataset, while our attention model was 3 or 4 times faster than our BiLSTM model with good accuracy. In addition, we trained "universal" models to disambiguate all ambiguous words together. In the universal models, we concatenate the embedding of the target ambiguous word to the max-pooled vector, where it acts as a "hint". The results show that our universal BiLSTM neural network model yielded about 90 percent accuracy. Conclusion: Deep contextual models based on sequential information processing methods are able to capture the relative contextual information from pre-trained input word embeddings and thereby provide state-of-the-art results for supervised biomedical WSD tasks.
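The "hint" used by the universal models can be illustrated with a small sketch: the per-token hidden states are max-pooled over time, and the target word's embedding is appended to the pooled vector. The function names `max_pool` and `add_hint` and all numeric values are hypothetical; a real model would use tensors produced by the BiLSTM or attention encoder rather than plain lists.

```python
from typing import List, Sequence


def max_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise maximum over time steps (rows)."""
    return [max(column) for column in zip(*hidden_states)]


def add_hint(hidden_states: Sequence[Sequence[float]],
             target_embedding: Sequence[float]) -> List[float]:
    """Concatenate the target word's embedding to the max-pooled vector."""
    return max_pool(hidden_states) + list(target_embedding)


# Two time steps with three hidden units each (illustrative values only).
hidden = [[0.1, 0.9, -0.3],
          [0.4, 0.2, 0.5]]
target = [1.0, 0.0]  # hypothetical embedding of the ambiguous word

features = add_hint(hidden, target)
print(features)
```

The concatenated vector is what a shared classifier layer would receive, so a single model can tell which ambiguous word it is being asked about.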


Author(s):  
Ali Saeed ◽  
Rao Muhammad Adeel Nawab ◽  
Mark Stevenson

Word Sense Disambiguation (WSD), the process of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. A range of approaches to the problem has been explored by the research community. The majority of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved to be extremely successful for a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long-Short Term Memory, Gated Recurrent Units, Bidirectional Long-Short Term Memory, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. Results (Accuracy = 63.25% and F1-Measure = 0.49) show that a deep learning approach outperforms previously reported results for the Urdu All-Words WSD task, whereas the performance of deep learning approaches (Accuracy = 72.63% and F1-Measure = 0.60) is low in comparison to previously reported results for the Urdu Lexical Sample task.


Author(s):  
Tian Shengwei ◽  
Li Dongbai ◽  
Yu Long ◽  
Feng Guanjun ◽  
Zhao Jianguo ◽  
...  

As a core subtask in anaphora resolution, anaphoricity determination has aroused the interest of researchers. However, recent work has not taken into account the influence of deep semantic information and the context of the coreference elements. In this paper, by incorporating the semantic features of Uygur, we establish a Convolutional Neural Network & Long Short-Term Memory (CNN_LSTM) model for determining the anaphoricity of Uygur pronouns. First, the deep negative semantic feature representation is extracted via word2vec. Second, the shallow explicit feature representation of coreference elements is extracted by our system. The two kinds of features are then combined to recognize whether a coreference element is referential or not. The results show that the method can distinguish coreference elements accurately: the ACC[Formula: see text] score is 90.18% and the ACC[Formula: see text] score is 89.93%, both higher than those of ANN (Artificial Neural Network) and SVM (Support Vector Machine).


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Chun-Xiang Zhang ◽  
Rui Liu ◽  
Xue-Yao Gao ◽  
Bo Yu

Word sense disambiguation (WSD) is an important research topic in natural language processing and is widely applied to text classification, machine translation, and information retrieval. To improve disambiguation accuracy, this paper proposes a WSD method based on the graph convolutional network (GCN). Word, part of speech, and semantic category are extracted from the contexts of the ambiguous word as discriminative features. The discriminative features and the sentence containing the ambiguous word are used as nodes to construct the WSD graph. The Word2Vec tool, the Doc2Vec tool, pointwise mutual information (PMI), and TF-IDF are applied to compute the embeddings of nodes and the edge weights. GCN is used to fuse the features of a node and its neighbors, and the softmax function is applied to determine the semantic category of the ambiguous word. The training corpus of SemEval-2007: Task #5 is used to optimize the proposed WSD classifier, and the test corpus of SemEval-2007: Task #5 is used to evaluate its performance. Experimental results show that the average accuracy of the proposed method is improved.
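Pointwise mutual information between two features x and y is log(p(x, y) / (p(x) p(y))). The sketch below estimates it from context-level co-occurrence counts and keeps only positive scores as edge weights. Both choices are assumptions made for illustration, since the abstract does not say how probabilities are estimated or whether negative values are discarded; the function name `pmi_edges` and the toy contexts are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def pmi_edges(contexts: List[List[str]]) -> Dict[Tuple[str, str], float]:
    """Compute PMI for every feature pair that co-occurs in a context.

    Probabilities are estimated as the fraction of contexts containing
    the feature (or the pair). Only pairs with positive PMI are kept
    (an assumption, not something the abstract specifies).
    """
    n = len(contexts)
    word_count: Counter = Counter()
    pair_count: Counter = Counter()
    for context in contexts:
        vocab = sorted(set(context))          # count each feature once per context
        word_count.update(vocab)
        pair_count.update(combinations(vocab, 2))

    edges = {}
    for (a, b), c in pair_count.items():
        score = math.log((c / n) / ((word_count[a] / n) * (word_count[b] / n)))
        if score > 0:
            edges[(a, b)] = score
    return edges


# Toy contexts (illustrative only).
contexts = [["cold", "virus"], ["cold", "virus"],
            ["cold", "weather"], ["warm", "weather"]]
edges = pmi_edges(contexts)
for pair, weight in sorted(edges.items()):
    print(pair, round(weight, 4))
```

Pairs whose observed co-occurrence is below what independence would predict get a negative score and are dropped here, so they produce no edge in the graph.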


2020 ◽  
Vol 13 (1) ◽  
pp. 104
Author(s):  
Dana-Mihaela Petroșanu ◽  
Alexandru Pîrjan

Accurate forecasting of hourly month-ahead electricity consumption is very important for non-household electricity consumers and system operators, and it is a key factor in energy efficiency and in achieving sustainable economic, business, and management operations. In this context, we have devised, developed, and validated an hourly month-ahead electricity consumption forecasting method. The method is based on a bidirectional long-short-term memory (BiLSTM) artificial neural network (ANN) enhanced with a multiple simultaneously decreasing delays approach and coupled with function fitting neural networks (FITNETs). It targets the hourly month-ahead total electricity consumption of a commercial center-type consumer and the hourly month-ahead consumption of its refrigerator storage room. The approach offers excellent forecasting results, as shown by the validation stage and the registered performance metrics: a root mean square error (RMSE) of 0.0495 for the total hourly month-ahead electricity consumption and 0.0284 for the refrigerator storage room. We aimed for and attained an hourly month-ahead prediction of consumed electricity without the significant drop in forecasting accuracy that usually tends to occur after the first two weeks. The result is a reliable method that satisfies the contractor's needs and can improve the contractor's activity from the economic, business, and management perspectives. Although the forecasting solution targets a commercial center-type consumer, its accuracy and generalization capability mean that it can also be a useful tool for other non-household electricity consumers.
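The reported performance metric, RMSE, is the square root of the mean squared difference between actual and forecast values. A minimal sketch follows; the function name `rmse` and the sample values are hypothetical and are not taken from the paper's data.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equally long series."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Illustrative hourly values only (not from the paper).
actual_load = [0.52, 0.61, 0.58, 0.49]
forecast_load = [0.50, 0.65, 0.55, 0.50]
print(rmse(actual_load, forecast_load))
```

Because RMSE is expressed in the same units as the series, values such as those reported are only comparable across series that share a scale or normalization.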

