Building neural network models for morphological and morpheme analysis of texts

Proceedings of the Institute for System Programming of RAS ◽

10.15514/ispras-2021-33(4)-9 ◽

2021 ◽

Vol 33 (4) ◽

pp. 117-130

Author(s):

Alexander Sergeevich Sapin

Keyword(s):

Neural Network ◽

Language Processing ◽

High Performance ◽

Morphological Analysis ◽

Morphological Characteristics ◽

Network Models ◽

Word Form ◽

Neural Network Models ◽

Russian Word ◽

Word Forms

Morphological analysis of text is one of the most important stages of natural language processing (NLP). Traditional and well-studied problems of morphological analysis include normalization (lemmatization) of a given word form, recognition of its morphological characteristics, and their disambiguation. Morphological analysis also involves the problem of morpheme segmentation of words (i.e., segmentation of words into constituent morphs and their classification), which is relevant in some NLP applications. In recent years, several machine learning models have been developed that increase the accuracy of traditional morphological analysis and morpheme segmentation, but the performance of such models is insufficient for many applied problems. For morpheme segmentation, high-precision models have been built only for lemmas (normalized word forms). This paper describes two new high-accuracy neural network models that implement morphemic segmentation of Russian word forms with sufficiently high performance. The first model is based on convolutional neural networks and shows state-of-the-art quality of morphemic segmentation for Russian word forms. The second model, besides performing morpheme segmentation of a word form, first refines its morphological characteristics, thereby disambiguating them. The performance of this joint morphological model is the best among the considered morpheme segmentation models, with comparable segmentation accuracy.
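Morpheme segmentation is often framed as per-character labeling, where a model assigns each character a label and the labels are then decoded into morphs. The following is a minimal sketch of the decoding step only, not the paper's implementation: the `B-`/`M-` position prefixes, the morph type names (`PREF`, `ROOT`, `SUFF`, `END`), and the function name `decode_segmentation` are all illustrative assumptions, and the labels are written by hand rather than predicted by a network.

```python
from typing import List, Tuple


def decode_segmentation(word: str, labels: List[str]) -> List[Tuple[str, str]]:
    """Group characters into (morph, type) pairs from per-character labels.

    Each label is assumed to look like "<position>-<type>", where position is
    B (the character begins a morph) or M (it continues the current morph).
    This label scheme is hypothetical and chosen only for illustration.
    """
    if len(word) != len(labels):
        raise ValueError("need exactly one label per character")
    morphs: List[Tuple[str, str]] = []
    for char, label in zip(word, labels):
        position, morph_type = label.split("-", 1)
        starts_new = position == "B" or not morphs or morphs[-1][1] != morph_type
        if starts_new:
            morphs.append((char, morph_type))
        else:
            text, _ = morphs[-1]
            morphs[-1] = (text + char, morph_type)
    return morphs


# Hand-written labels for one Russian word form (not model output).
word = "подводный"
labels = [
    "B-PREF", "M-PREF", "M-PREF",
    "B-ROOT", "M-ROOT", "M-ROOT",
    "B-SUFF",
    "B-END", "M-END",
]
segments = decode_segmentation(word, labels)
print("/".join(text for text, _ in segments))  # → под/вод/н/ый
```

In a full system, a network such as the convolutional model mentioned in the abstract would presumably supply the labels; this sketch covers only how labels map back to classified morphs.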

Automatic Detection of Hypoglycemic Events From the Electronic Health Record Notes of Diabetes Patients: Empirical Study (Preprint)

10.2196/preprints.14340 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yonghao Jin ◽

Fei Li ◽

Varsha G Vimalananda ◽

Hong Yu

Keyword(s):

Neural Network ◽

Language Processing ◽

High Performance ◽

Population Studies ◽

Network Models ◽

Automatic Detection ◽

Support Vector ◽

Neural Network Models ◽

Electronic Health ◽

Diabetes Patients

Background: Hypoglycemic events are common and potentially dangerous conditions among patients being treated for diabetes. Automatic detection of such events could improve patient care and is valuable in population studies. Electronic health records (EHRs) are valuable resources for the detection of such events. Objective: In this study, we aim to develop a deep-learning–based natural language processing (NLP) system to automatically detect hypoglycemic events from EHR notes. Our model is called the High-Performing System for Automatically Detecting Hypoglycemic Events (HYPE). Methods: Domain experts reviewed 500 EHR notes of diabetes patients to determine whether each sentence contained a hypoglycemic event or not. We used this annotated corpus to train and evaluate HYPE, the high-performance NLP system for hypoglycemia detection. We built and evaluated both a classical machine learning model (ie, support vector machines [SVMs]) and state-of-the-art neural network models. Results: We found that neural network models outperformed the SVM model. The convolutional neural network (CNN) model yielded the highest performance in a 10-fold cross-validation setting: mean precision=0.96 (SD 0.03), mean recall=0.86 (SD 0.03), and mean F1=0.91 (SD 0.03). Conclusions: Despite the challenges posed by small and highly imbalanced data, our CNN-based HYPE system still achieved a high performance for hypoglycemia detection. HYPE can be used for EHR-based hypoglycemia surveillance and population studies in diabetes patients.
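As a quick consistency check on the reported figures, F1 is the harmonic mean of precision and recall. The sketch below applies that formula to the reported means. Note the assumption: the abstract reports F1 averaged over folds, which in general need not equal the F1 computed from mean precision and mean recall, although here the two agree to two decimals. The function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Mean precision and mean recall as reported in the abstract.
f1_from_means = f1_score(0.96, 0.86)
print(round(f1_from_means, 2))  # → 0.91
```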

Automatic Detection of Hypoglycemic Events From the Electronic Health Record Notes of Diabetes Patients: Empirical Study

JMIR Medical Informatics ◽

10.2196/14340 ◽

2019 ◽

Vol 7 (4) ◽

pp. e14340 ◽

Cited By ~ 2

Author(s):

Yonghao Jin ◽

Fei Li ◽

Varsha G Vimalananda ◽

Hong Yu

Keyword(s):

Neural Network ◽

Language Processing ◽

High Performance ◽

Population Studies ◽

Network Models ◽

Automatic Detection ◽

Support Vector ◽

Neural Network Models ◽

Electronic Health ◽

Diabetes Patients

Background: Hypoglycemic events are common and potentially dangerous conditions among patients being treated for diabetes. Automatic detection of such events could improve patient care and is valuable in population studies. Electronic health records (EHRs) are valuable resources for the detection of such events. Objective: In this study, we aim to develop a deep-learning–based natural language processing (NLP) system to automatically detect hypoglycemic events from EHR notes. Our model is called the High-Performing System for Automatically Detecting Hypoglycemic Events (HYPE). Methods: Domain experts reviewed 500 EHR notes of diabetes patients to determine whether each sentence contained a hypoglycemic event or not. We used this annotated corpus to train and evaluate HYPE, the high-performance NLP system for hypoglycemia detection. We built and evaluated both a classical machine learning model (ie, support vector machines [SVMs]) and state-of-the-art neural network models. Results: We found that neural network models outperformed the SVM model. The convolutional neural network (CNN) model yielded the highest performance in a 10-fold cross-validation setting: mean precision=0.96 (SD 0.03), mean recall=0.86 (SD 0.03), and mean F1=0.91 (SD 0.03). Conclusions: Despite the challenges posed by small and highly imbalanced data, our CNN-based HYPE system still achieved a high performance for hypoglycemia detection. HYPE can be used for EHR-based hypoglycemia surveillance and population studies in diabetes patients.

The relational processing limits of classic and contemporary neural network models of language processing

10.32470/ccn.2019.1022-0 ◽

2019 ◽

Author(s):

Guillermo Puebla ◽

Andrea Martin ◽

Leonidas Doumas

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Relational Processing ◽

Neural Network Models

The relational processing limits of classic and contemporary neural network models of language processing

Language Cognition and Neuroscience ◽

10.1080/23273798.2020.1821906 ◽

2020 ◽

pp. 1-15

Author(s):

Guillermo Puebla ◽

Andrea E. Martin ◽

Leonidas A. A. Doumas

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Relational Processing ◽

Neural Network Models

Comparison of rule-based and neural network models for negation detection in radiology reports

Natural Language Engineering ◽

10.1017/s1351324920000509 ◽

2020 ◽

pp. 1-22 ◽

Cited By ~ 2

Author(s):

D. Sykes ◽

A. Grivas ◽

C. Grover ◽

R. Tobin ◽

C. Sudlow ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Language Processing ◽

Network Models ◽

Neural Network Models ◽

Test Set ◽

Rule Based ◽

Radiology Reports ◽

The Neural Network ◽

Negation Detection

Using natural language processing, it is possible to extract structured information from raw text in the electronic health record (EHR) at reasonably high accuracy. However, the accurate distinction between negated and non-negated mentions of clinical terms remains a challenge. EHR text includes cases where diseases are stated not to be present or only hypothesised, meaning a disease can be mentioned in a report when it is not being reported as present. This makes tasks such as document classification and summarisation more difficult. We have developed the rule-based EdIE-R-Neg, part of an existing text mining pipeline called EdIE-R (Edinburgh Information Extraction for Radiology reports, https://www.ltg.ed.ac.uk/software/edie-r/), developed to process brain imaging reports, and two machine learning approaches: one using a bidirectional long short-term memory network and another using a feedforward neural network. These were developed on data from the Edinburgh Stroke Study (ESS) and tested on data from routine reports from NHS Tayside (Tayside). Both datasets consist of written reports from medical scans. These models are compared with two existing rule-based models: pyConText (Harkema et al. 2009. Journal of Biomedical Informatics 42(5), 839–851), a Python implementation of a generalisation of NegEx, and NegBio (Peng et al. 2017. NegBio: A high-performance tool for negation and uncertainty detection in radiology reports. arXiv e-prints, p. arXiv:1712.05898), which identifies negation scopes through patterns applied to a syntactic representation of the sentence. On both the test set of the dataset from which our models were developed and the largely similar Tayside test set, the neural network models and our custom-built rule-based system outperformed the existing methods. EdIE-R-Neg scored highest on F1 score, particularly on the test set of the Tayside dataset, from which no development data were used in these experiments, showing the power of custom-built rule-based systems for negation detection on datasets of this size. The performance gap between the machine learning models and EdIE-R-Neg on the Tayside test set was reduced by adding Tayside development data to the ESS training set, demonstrating the adaptability of the neural network models.
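To make the idea of rule-based negation detection concrete, here is a minimal sketch in the spirit of trigger-based approaches such as NegEx. It is not the EdIE-R-Neg, pyConText, or NegBio implementation and does not use their APIs: the trigger list, the five-token scope window, and the function name `is_negated` are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical trigger phrases; real systems use much larger curated lists.
NEGATION_TRIGGERS: List[str] = ["no evidence of", "no", "without", "not"]
# Assumed scope: a trigger negates terms within this many following tokens.
SCOPE_WINDOW = 5


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep only alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_negated(sentence: str, term: str) -> bool:
    """Return True if a negation trigger occurs shortly before the term."""
    tokens = tokenize(sentence)
    term_tokens = tokenize(term)
    if not term_tokens:
        return False
    n = len(term_tokens)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] != term_tokens:
            continue
        # Look only at the tokens immediately preceding this mention.
        window = " ".join(tokens[max(0, start - SCOPE_WINDOW):start])
        padded = f" {window} "
        if any(f" {trigger} " in padded for trigger in NEGATION_TRIGGERS):
            return True
    return False


print(is_negated("There is no evidence of acute infarct.", "infarct"))  # → True
print(is_negated("Old infarct in the left hemisphere.", "infarct"))     # → False
```

A fixed window is a deliberate simplification: it ignores scope-terminating words and hypothesised mentions, and it would not handle syntax-based scopes of the kind the abstract attributes to NegBio.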

Finding Fuzziness in Neural Network Models of Language Processing

Explainable AI and Other Applications of Fuzzy Techniques - Lecture Notes in Networks and Systems ◽

10.1007/978-3-030-82099-2_25 ◽

2021 ◽

pp. 278-290

Author(s):

Kanishka Misra ◽

Julia Taylor Rayz

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Neural Network Models

A Primer on Neural Network Models for Natural Language Processing

Journal of Artificial Intelligence Research ◽

10.1613/jair.4992 ◽

2016 ◽

Vol 57 ◽

pp. 345-420 ◽

Cited By ~ 233

Author(s):

Yoav Goldberg

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Speech Processing ◽

Network Models ◽

Neural Network Models ◽

Convolutional Networks ◽

The Past ◽

Gradient Computation

Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models started to be applied also to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques. The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation graph abstraction for automatic gradient computation.

Double Multi-Head Attention-Based Capsule Network for Relation Classification

10.5121/csit.2021.110711 ◽

2021 ◽

Author(s):

Hongjun Heng ◽

Renjie Li

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Language Processing ◽

Layer Structure ◽

Single Layer ◽

Network Models ◽

Classification Model ◽

Neural Network Models ◽

Comparable Performance ◽

Relation Classification

Semantic relation classification is an important task in the field of natural language processing. Existing neural network relation classification models introduce attention mechanisms to increase the importance of significant features, but some of these attention models have only one head, which is not enough to capture more distinctive fine-grained features. Models based on recurrent neural networks (RNNs) usually use a single-layer structure and have limited feature extraction capability. Current RNN-based capsule networks handle noise improperly, which increases the complexity of the network. Therefore, we propose a capsule network relation classification model based on double multi-head attention. In this model, we introduce an auxiliary BiGRU (Bidirectional Gated Recurrent Unit) to compensate for the limited feature extraction performance of a single BiGRU, improve the bilinear attention through a double multi-head mechanism so that the model can obtain more sentence information from different representation subspaces, and instantiate capsules with sentence-level features to alleviate the impact of noise. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model outperforms most previous state-of-the-art neural network models and achieves comparable performance among capsule networks, with an F1 score of 85.3%.

Supervised Word Sense Disambiguation on Polysemy with Neural Network Models: A Case Study of BUN in Taiwan Hakka

International Journal of Asian Language Processing ◽

10.1142/s2717554520500113 ◽

2021 ◽

pp. 2050011

Author(s):

Huei-Ling Lai ◽

Hsiao-Ling Hsu ◽

Jyi-Shane Liu ◽

Chia-Hung Lin ◽

Yanhong Chen

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Language Processing ◽

Word Sense Disambiguation ◽

Network Models ◽

Word Sense ◽

Neural Network Models ◽

Low Resource ◽

Sense Disambiguation

While word sense disambiguation (WSD) has been extensively studied in natural language processing, the task still receives little attention in low-resource languages. Findings based on a few dominant languages may lead to narrow applications. Language-specific WSD systems need to be implemented for low-resource languages such as Taiwan Hakka. This study examines the performance of DNN and Bi-LSTM models in WSD tasks on the polysemous word BUN in Taiwan Hakka. Both models are trained and tested on a small amount of hand-crafted labeled data. Two experiments are designed with four kinds of input features and two window spans to explore what information the models need to achieve their best performance. The results show that, to achieve their best performance, DNN and Bi-LSTM models prefer different kinds of input features and window spans.

Probing Classifiers: Promises, Shortcomings, and Advances

Computational Linguistics ◽

10.1162/coli_a_00422 ◽

2021 ◽

pp. 1-12

Author(s):

Yonatan Belinkov

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Deep Neural Network ◽

Network Models ◽

Neural Network Models ◽

Linguistic Property

Probing classifiers have emerged as one of the prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple: a classifier is trained to predict some linguistic property from a model's representations. The approach has been used to examine a wide variety of models and properties. However, recent studies have demonstrated various methodological limitations of this approach. This article critically reviews the probing classifiers framework, highlighting its promises, shortcomings, and advances.
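The basic probing setup can be sketched in a few lines. The example below is an illustration under strong assumptions, not a method from the article: the "representations" are synthetic vectors in which one dimension is constructed to encode a binary property, the probe is a simple perceptron, and all names (`train_probe`, `make_synthetic_representations`, and so on) are hypothetical. In practice the vectors would come from a trained model's hidden states.

```python
import random
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], x: Sequence[float]) -> int:
    """Linear probe: the last weight is the bias term."""
    score = weights[-1] + sum(w * v for w, v in zip(weights, x))
    return 1 if score > 0 else 0


def train_probe(features: List[List[float]], labels: List[int],
                epochs: int = 20, lr: float = 0.1) -> List[float]:
    """Train a perceptron to predict the property from the vectors."""
    weights = [0.0] * (len(features[0]) + 1)
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = y - predict(weights, x)
            if error:
                for i, v in enumerate(x):
                    weights[i] += lr * error * v
                weights[-1] += lr * error
    return weights


def accuracy(weights: Sequence[float], features: List[List[float]],
             labels: List[int]) -> float:
    correct = sum(predict(weights, x) == y for x, y in zip(features, labels))
    return correct / len(labels)


def make_synthetic_representations(
        n: int, rng: random.Random) -> Tuple[List[List[float]], List[int]]:
    """Synthetic stand-in for model representations (an assumption).

    Dimension 0 encodes the binary property with noise; the other two
    dimensions are unrelated noise.
    """
    features, labels = [], []
    for i in range(n):
        label = i % 2
        signal = (1.0 if label else -1.0) + rng.uniform(-0.3, 0.3)
        features.append([signal, rng.uniform(-1, 1), rng.uniform(-1, 1)])
        labels.append(label)
    return features, labels


rng = random.Random(0)
train_x, train_y = make_synthetic_representations(150, rng)
test_x, test_y = make_synthetic_representations(50, rng)
probe = train_probe(train_x, train_y)
train_accuracy = accuracy(probe, train_x, train_y)
test_accuracy = accuracy(probe, test_x, test_y)
print(f"train accuracy: {train_accuracy:.2f}, held-out accuracy: {test_accuracy:.2f}")
```

High probe accuracy here only shows that the property is linearly decodable from these vectors. As the abstract notes, the approach has methodological limitations, so a result like this should not be read as evidence about how a model actually uses the information.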
