Deep learning approaches for extracting adverse events and indications of dietary supplements from clinical text

Author(s):  
Yadan Fan ◽  
Sicheng Zhou ◽  
Yifan Li ◽  
Rui Zhang

Abstract. Objective: We sought to demonstrate the feasibility of using deep learning models to extract safety signals related to the use of dietary supplements (DSs) from clinical text. Materials and Methods: Two tasks were performed in this study. For the named entity recognition (NER) task, Bi-LSTM-CRF (bidirectional long short-term memory with conditional random field) and BERT (bidirectional encoder representations from transformers) models were trained and compared against a CRF baseline for recognizing DS and event entities in clinical notes. For the relation extraction (RE) task, 2 deep learning models (an attention-based Bi-LSTM and a convolutional neural network) and a random forest model were trained to extract the relations between DSs and events, which were categorized into 3 classes: positive (ie, indication), negative (ie, adverse event), and not related. The best-performing NER and RE models were then applied to clinical notes mentioning 88 DSs to discover DS adverse events and indications, which were compared with a DS knowledge base. Results: For the NER task, the deep learning models outperformed CRF, with F1 scores above 0.860. The attention-based Bi-LSTM model performed best in the RE task, with an F1 score of 0.893. Comparing the DS-event pairs generated by the deep learning models with the knowledge base of DSs and events, we found both known and unknown pairs. Conclusions: Deep learning models can detect adverse events and indications of DSs in clinical notes and hold great potential for monitoring the safety of DS use.

2019 ◽  
Vol 26 (12) ◽  
pp. 1584-1591 ◽  
Author(s):  
Xue Shi ◽  
Yingping Yi ◽  
Ying Xiong ◽  
Buzhou Tang ◽  
Qingcai Chen ◽  
...  

Abstract. Objective: Extracting clinical entities and their attributes is a fundamental task of natural language processing (NLP) in the medical domain. This task is typically treated as 2 sequential subtasks in a pipeline: clinical entity or attribute recognition followed by entity-attribute relation extraction. One problem with pipeline methods is that errors from entity recognition are unavoidably passed on to relation extraction. We propose a novel joint deep learning method that recognizes clinical entities or attributes and extracts entity-attribute relations simultaneously. Materials and Methods: The proposed method integrates 2 state-of-the-art methods for named entity recognition and relation extraction, namely bidirectional long short-term memory with conditional random field and bidirectional long short-term memory, into a unified framework. The method also takes into account relation constraints between clinical entities and attributes, as well as the weights of the 2 subtasks. We compare the method with other related methods (ie, pipeline methods and other joint deep learning methods) on an existing English corpus from SemEval-2015 and a newly developed Chinese corpus. Results: Our proposed method achieves the best F1 of 74.46% on entity recognition and the best F1 of 50.21% on relation extraction on the English corpus, and 89.32% and 88.13%, respectively, on the Chinese corpus, outperforming the other methods on both tasks. Conclusions: The joint deep learning-based method could improve both entity recognition and relation extraction from clinical text in both English and Chinese, indicating that the approach is promising.


Author(s):  
Jun Xu ◽  
Zhiheng Li ◽  
Qiang Wei ◽  
Yonghui Wu ◽  
Yang Xiang ◽  
...  

Abstract. Background: To detect attributes of medical concepts in clinical text, a traditional method often consists of two steps: named entity recognition of attributes, followed by relation classification between medical concepts and attributes. Here we present a novel solution in which attribute detection for given concepts is converted into a sequence labeling problem, so that attribute entity recognition and relation classification are done simultaneously in one step. Methods: A neural architecture combining bidirectional long short-term memory networks and conditional random fields (Bi-LSTM-CRF) was adopted to detect various medical concept-attribute pairs efficiently. We then compared our deep learning-based sequence labeling approach with traditional two-step systems on three different attribute detection tasks: disease-modifier, medication-signature, and lab test-value. Results: The proposed method achieved higher accuracy than the traditional methods on all three medical concept-attribute detection tasks. Conclusions: This study demonstrates the efficacy of our sequence labeling approach using Bi-LSTM-CRF on the attribute detection task, indicating its potential to speed up practical clinical NLP applications.
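
To make the one-step idea concrete, here is a minimal sketch of how attributes of a single target concept might be encoded as one BIO tag sequence. The abstract does not give the actual tagging scheme, so the function name `tag_attributes`, the label names `Dosage` and `Frequency`, and the example sentence are all hypothetical.

```python
from typing import List, Tuple


def tag_attributes(tokens: List[str],
                   attribute_spans: List[Tuple[int, int, str]]) -> List[str]:
    """Encode the attributes of ONE target concept as BIO tags.

    attribute_spans holds (start, end_exclusive, label) triples for the
    attributes that belong to the target concept. Attributes of any other
    concept are deliberately left as "O".
    """
    tags = ["O"] * len(tokens)
    for start, end, label in attribute_spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical sentence with two medications.
tokens = ["aspirin", "81", "mg", "daily", "and", "metformin", "500", "mg"]

# Target concept: "aspirin" (token 0). Only its own attributes get tags;
# "500 mg" belongs to metformin, so it stays "O" in this sequence.
tags = tag_attributes(tokens, [(1, 3, "Dosage"), (3, 4, "Frequency")])
print(list(zip(tokens, tags)))
```

Because a tag is emitted only for attributes linked to the target concept, predicting this sequence covers both attribute recognition and relation classification in a single pass; a second target concept (here, metformin) would get its own tag sequence over the same tokens.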


Processes ◽  
2021 ◽  
Vol 9 (5) ◽  
pp. 832
Author(s):  
Lanfei Peng ◽  
Dong Gao ◽  
Yujie Bai

Hazard and operability analysis (HAZOP) is one of the most commonly used hazard analysis methods in the petrochemical industry. The large amount of unstructured data in HAZOP reports has created an information explosion and a pressing need for technologies that make this information easier to use. To address the difficulty of reusing and sharing such massive data, we propose a new deep learning framework that performs named entity recognition (NER) on Chinese HAZOP documents and targets their characteristics, such as polysemy, multi-entity nesting, and long-distance text. Specifically, the preprocessed data are fed into an embeddings from language models (ELMo) model and a double convolutional neural network (DCNN) model to extract rich character features. Meanwhile, a bidirectional long short-term memory (BiLSTM) network is used to extract long-distance semantic information. Finally, the results are decoded by a conditional random field (CRF) and output. Experiments were carried out on the HAZOP report of a coal seam indirect liquefaction project. In the best results for the proposed model, the accuracy rate reached 90.83%, the recall rate reached 92.46%, and the F-value peaked at 91.76%, a significant improvement over the other models.
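
The CRF decoding step can be sketched with Viterbi search, the standard way to find the best label path in a linear-chain CRF. The abstract does not describe the decoder, so this is an illustration under assumptions: the label set, the emission scores (standing in for BiLSTM outputs), and the transition scores are all made up.

```python
from typing import Dict, List


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]],
            labels: List[str]) -> List[str]:
    """Return the highest-scoring label path under additive scores."""
    # best[t][y] = best score of any path that ends in label y at position t
    best = [{y: emissions[0][y] for y in labels}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    last = max(labels, key=lambda y: best[-1][y])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


labels = ["O", "B", "I"]
# All transitions are neutral except O -> I, which is invalid in BIO tagging.
transitions = {p: {y: 0.0 for y in labels} for p in labels}
transitions["O"]["I"] = -10.0

# Made-up per-token scores; picking the top label per token gives O, I, I.
emissions = [
    {"O": 1.0, "B": 0.2, "I": 0.0},
    {"O": 0.3, "B": 0.4, "I": 0.5},
    {"O": 0.1, "B": 0.0, "I": 0.9},
]

path = viterbi(emissions, transitions, labels)
print(path)  # ['O', 'B', 'I']
```

Choosing the top-scoring label for each token independently would produce the invalid sequence O, I, I; scoring whole paths with transition terms lets the decoder recover a well-formed entity boundary instead.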


Author(s):  
Xi Yang ◽  
Tianchen Lyu ◽  
Qian Li ◽  
Chih-Yin Lee ◽  
Jiang Bian ◽  
...  

Abstract. Background: De-identification is a critical technology for facilitating the use of unstructured clinical text while protecting patient privacy and confidentiality. The clinical natural language processing (NLP) community has invested great effort in developing methods and corpora for de-identification of clinical notes. These annotated corpora are valuable resources for developing automated systems to de-identify clinical text at local hospitals. However, existing studies often used training and test data collected from the same institution, and few studies have explored automated de-identification in cross-institute settings. The goal of this study is to examine deep learning-based de-identification methods in a cross-institute setting, identify the bottlenecks, and provide potential solutions. Methods: We created a de-identification corpus from a total of 500 clinical notes from University of Florida (UF) Health, developed deep learning-based de-identification models using the 2014 i2b2/UTHealth corpus, and evaluated their performance on the UF corpus. We compared five different word embeddings trained on general English text, clinical text, and biomedical literature, explored lexical and linguistic features, and compared two strategies for customizing the deep learning models using UF notes and resources. Results: Pre-trained word embeddings from a general English corpus achieved better performance than embeddings from de-identified clinical text and biomedical literature. The performance of deep learning models trained using only the i2b2 corpus dropped significantly (strict and relaxed F1 scores dropped from 0.9547 and 0.9646 to 0.8568 and 0.8958) when applied to another corpus annotated at UF Health. Linguistic features could further improve de-identification performance in cross-institute settings. After customizing the models using UF notes and resources, the best model achieved strict and relaxed F1 scores of 0.9288 and 0.9584, respectively. Conclusions: It is necessary to customize de-identification models using local clinical text and other resources when they are applied in cross-institute settings. Fine-tuning is a potential solution for reusing pre-trained parameters and reducing the training time needed to customize deep learning-based de-identification models trained on a clinical corpus from a different institution.


2019 ◽  
Author(s):  
John Giorgi ◽  
Gary Bader

Motivation: Automatic biomedical named entity recognition (BioNER) is a key task in biomedical information extraction (IE). For some time, state-of-the-art BioNER has been dominated by machine learning methods, particularly conditional random fields (CRFs), with a recent focus on deep learning. However, recent work has suggested that the high performance of CRFs for BioNER may not generalize to corpora other than the one it was trained on. In our analysis, we find that a popular deep learning-based approach to BioNER, known as bidirectional long short-term memory network-conditional random field (BiLSTM-CRF), is correspondingly poor at generalizing, often dramatically overfitting the corpus it was trained on. To address this, we evaluate three modifications of BiLSTM-CRF for BioNER to alleviate overfitting and improve generalization: improved regularization via variational dropout, transfer learning, and multi-task learning. Results: We measure the effect that each strategy has when training/testing on the same corpus ("in-corpus" performance) and when training on one corpus and evaluating on another ("out-of-corpus" performance), our measure of the model's ability to generalize. We found that variational dropout improves out-of-corpus performance by an average of 4.62%, transfer learning by 6.48% and multi-task learning by 8.42%. The maximal increase we identified combines multi-task learning and variational dropout, which boosts out-of-corpus performance by 10.75%. Furthermore, we make available a new open-source tool, called Saber, that implements our best BioNER models. Availability: Source code for our biomedical IE tool is available at https://github.com/BaderLab/saber. Corpora and other resources used in this study are available at https://github.com/BaderLab/Towards-reliable-BioNER.


2019 ◽  
Vol 36 (1) ◽  
pp. 280-286 ◽  
Author(s):  
John M Giorgi ◽  
Gary D Bader

Abstract. Motivation: Automatic biomedical named entity recognition (BioNER) is a key task in biomedical information extraction. For some time, state-of-the-art BioNER has been dominated by machine learning methods, particularly conditional random fields (CRFs), with a recent focus on deep learning. However, recent work has suggested that the high performance of CRFs for BioNER may not generalize to corpora other than the one it was trained on. In our analysis, we find that a popular deep learning-based approach to BioNER, known as bidirectional long short-term memory network-conditional random field (BiLSTM-CRF), is correspondingly poor at generalizing. To address this, we evaluate three modifications of BiLSTM-CRF for BioNER to improve generalization: improved regularization via variational dropout, transfer learning and multi-task learning. Results: We measure the effect that each strategy has when training/testing on the same corpus (‘in-corpus’ performance) and when training on one corpus and evaluating on another (‘out-of-corpus’ performance), our measure of the model’s ability to generalize. We found that variational dropout improves out-of-corpus performance by an average of 4.62%, transfer learning by 6.48% and multi-task learning by 8.42%. The maximal increase we identified combines multi-task learning and variational dropout, which boosts out-of-corpus performance by 10.75%. Furthermore, we make available a new open-source tool, called Saber, that implements our best BioNER models. Availability and implementation: Source code for our biomedical IE tool is available at https://github.com/BaderLab/saber. Corpora and other resources used in this study are available at https://github.com/BaderLab/Towards-reliable-BioNER. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 11 (1) ◽  
pp. 17 ◽  
Author(s):  
Dong Xu ◽  
Ruping Ge ◽  
Zhihua Niu

A state-of-the-art entity recognition system relies on deep learning under data-driven conditions. In this paper, we combine deep learning with linguistic features and propose the long short-term memory-conditional random field model (LSTM-CRF model) with the integrity algorithm. This approach is primarily based on the use of part-of-speech (POS) syntactic rules to correct the boundaries of LSTM-CRF model annotations and improve its performance by raising the integrity of the elements. The method combines the advantages of the data-driven method and dependency syntax, and improves the precision of the elements without reducing recall. Experiments show that the integrity algorithm is not only easy to combine with other neural network models but also performs better overall than several advanced methods. In addition, we conducted cross-domain experiments based on a multi-industry corpus in the financial field. The results indicate that the method can be applied to other industries.


Author(s):  
Richa Sharma ◽  
Sudha Morwal ◽  
Basant Agarwal

This article presents a neural network-based approach to named entity recognition for Hindi text. The authors propose a deep learning architecture based on a convolutional neural network (CNN) and a bi-directional long short-term memory (Bi-LSTM) neural network. The skip-gram approach of the word2vec model is used in the proposed model to generate word vectors. Several deep learning models, namely a recurrent neural network (RNN), long short-term memory (LSTM), and Bi-LSTM, were developed and evaluated as baseline systems. These baseline systems were then extended into the proposed model by integrating CNN and conditional random field (CRF) layers. A comparative analysis of the results verifies the performance of the proposed model (i.e., Bi-LSTM-CNN-CRF), which achieves 61% precision, 56% recall, and 58% F-measure.
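
As a quick consistency check on the reported figures, the sketch below computes the F-measure from precision and recall. It assumes the abstract's F-measure is the balanced F1 score (the harmonic mean, beta = 1), which the abstract does not state; the function name `f_measure` is illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Reported values for the Bi-LSTM-CNN-CRF model: 61% precision, 56% recall.
f1 = f_measure(0.61, 0.56)
print(f"{f1:.4f}")  # 0.5839
```

Under that assumption the result rounds to 58%, which agrees with the reported F-measure.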


Author(s):  
B. Premjith ◽  
K. P. Soman

Morphological synthesis is one of the main components of Machine Translation (MT) frameworks, especially when either or both of the source and target languages are morphologically rich. Morphological synthesis is the process of combining two words or two morphemes according to the Sandhi rules of the morphologically rich language. Malayalam and Tamil are two languages of India that are morphologically rich as well as agglutinative. Morphological synthesis of a word in these two languages is challenging for the following reasons: (1) abundance in morphology; (2) complex Sandhi rules; (3) the possibility in Malayalam of forming words by combining words that belong to different syntactic categories (for example, noun and verb); and (4) the construction of a sentence by combining multiple words. We formulated the task of morphological generation of nouns and verbs in Malayalam and Tamil as a character-to-character sequence tagging problem. In this article, we used deep learning architectures such as the Recurrent Neural Network (RNN), Long Short-Term Memory network (LSTM), and Gated Recurrent Unit (GRU), along with their stacked and bidirectional versions, to implement morphological synthesis at the character level. In addition, we investigated the performance of combining these deep learning architectures with the Conditional Random Field (CRF) in the morphological synthesis of nouns and verbs in Malayalam and Tamil. We observed that adding CRF to the bidirectional LSTM/GRU architecture achieved more than 99% accuracy in the morphological synthesis of Malayalam and Tamil nouns and verbs.

