Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting

Abstract. Objective: To develop a natural language processing system that identifies relations of medications with adverse drug events from clinical narratives. This project is part of the 2018 n2c2 challenge. Materials and Methods: We developed a novel clinical named entity recognition method based on a recurrent convolutional neural network and compared it to a recurrent neural network implemented using the long short-term memory architecture, explored methods to integrate medical knowledge as embedding layers in neural networks, and investigated 3 machine learning models (support vector machines, random forests, and gradient boosting) for relation classification. The performance of our system was evaluated using annotated data and scripts provided by the 2018 n2c2 organizers. Results: Our system was among the top ranked. Our best model submitted during this challenge (based on recurrent neural networks and support vector machines) achieved lenient F1 scores of 0.9287 for concept extraction (ranked third), 0.9459 for relation classification (ranked fourth), and 0.8778 for end-to-end relation extraction (ranked second). We developed a novel named entity recognition model based on a recurrent convolutional neural network and further investigated gradient boosting for relation classification. The new methods improved the lenient F1 scores of the 3 subtasks to 0.9292, 0.9633, and 0.8880, respectively, which are comparable to the best performance reported in this challenge. Conclusion: This study demonstrated the feasibility of using machine learning methods to extract the relations of medications with adverse drug events from clinical narratives.
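The "lenient" F1 scores quoted above are usually read as crediting a predicted span whenever it overlaps a gold span of the same type, rather than requiring exact boundaries. The sketch below illustrates that reading only; it is not the official n2c2 evaluation script, and the `Span` layout and function names are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical span layout: (start offset, exclusive end offset, entity label).
Span = Tuple[int, int, str]


def overlaps(a: Span, b: Span) -> bool:
    """True when two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def lenient_prf(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 under overlap-based (lenient) matching."""
    # A prediction counts as correct if it overlaps any gold span.
    matched_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)
    # A gold span counts as found if any prediction overlaps it.
    matched_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

A strict variant would replace `overlaps` with an exact comparison of the three fields.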


ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition

BioMed Research International ◽

10.1155/2016/4248026 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 5

Author(s):

Abbas Akkasi ◽

Ekrem Varoğlu ◽

Nazife Dimililer

Keyword(s):

Conditional Random Fields ◽

Named Entity Recognition ◽

Classification Performance ◽

Entity Recognition ◽

Support Vector ◽

Learning Approaches ◽

Data Set ◽

Rule Based ◽

Named Entity ◽

Vector Machines

Named Entity Recognition (NER) from text constitutes the first step in many text mining applications. The most important preliminary step for NER systems using machine learning approaches is tokenization, where raw text is segmented into tokens. This study proposes an enhanced rule-based tokenizer, ChemTok, which utilizes rules extracted mainly from the training data set. The main novelty of ChemTok is the use of the extracted rules to merge the tokens split in the previous steps, thus producing longer and more discriminative tokens. ChemTok is compared to the tokenization methods utilized by ChemSpot and tmChem. Support Vector Machines and Conditional Random Fields are employed as the learning algorithms. The experimental results show that the classifiers trained on the output of ChemTok outperform all classifiers trained on the output of the other two tokenizers in terms of classification performance and the number of incorrectly segmented entities.
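The split-then-merge idea described above can be sketched as follows. This is an illustrative toy, not ChemTok itself: the fine-grained splitting pattern and the two merge rules are hypothetical stand-ins for rules that the paper extracts from training data.

```python
import re
from typing import List, Pattern

# Step 1 pattern: runs of alphanumerics, or any single non-space symbol.
FINE = re.compile(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]")

# Hypothetical merge rules (assumptions, not taken from the paper):
# a possibly unfinished locant prefix such as "1,2-", and a locant
# prefix followed by a chemical stem such as "1,2-dichloroethane".
EXAMPLE_RULES: List[Pattern[str]] = [
    re.compile(r"\d+(?:,\d+)*[,-]?"),
    re.compile(r"\d+(?:,\d+)*-[A-Za-z]+"),
]


def tokenize(text: str, merge_rules: List[Pattern[str]]) -> List[str]:
    """Split text finely, then re-join adjacent pieces allowed by a rule."""
    tokens = []  # entries are (token text, start offset, end offset)
    for m in FINE.finditer(text):
        if tokens:
            prev, prev_start, prev_end = tokens[-1]
            candidate = prev + m.group()
            # Merge only pieces with no whitespace between them.
            touching = prev_end == m.start()
            if touching and any(r.fullmatch(candidate) for r in merge_rules):
                tokens[-1] = (candidate, prev_start, m.end())
                continue
        tokens.append((m.group(), m.start(), m.end()))
    return [tok for tok, _, _ in tokens]
```

With `EXAMPLE_RULES`, `tokenize("2-propanol and 1,2-dichloroethane", EXAMPLE_RULES)` keeps each chemical name as a single token, whereas an empty rule list leaves the fine-grained pieces apart.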


Tuning support vector machines for biomedical named entity recognition

10.3115/1118149.1118150 ◽

2002 ◽

Cited By ~ 82

Author(s):

Jun'ichi Kazama ◽

Takaki Makino ◽

Yoshihiro Ohta ◽

Jun'ichi Tsujii

Keyword(s):

Support Vector Machines ◽

Named Entity Recognition ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Vector Machines ◽

Biomedical Named Entity Recognition


Ontology-Based Healthcare Named Entity Recognition from Twitter Messages Using a Recurrent Neural Network Approach

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph16193628 ◽

2019 ◽

Vol 16 (19) ◽

pp. 3628 ◽

Cited By ~ 5

Author(s):

Erdenebileg Batbaatar ◽

Keun Ho Ryu

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Adverse Drug Events ◽

Viterbi Algorithm ◽

Short Term Memory ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Health Related

Named Entity Recognition (NER) in the healthcare domain involves identifying and categorizing diseases, drugs, and symptoms for biosurveillance, extracting their related properties and activities, and identifying adverse drug events appearing in texts. These tasks are important challenges in healthcare. Analyzing user messages in social media networks such as Twitter can provide opportunities to detect and manage public health events. Twitter provides a broad range of short messages that contain interesting information for information extraction. In this paper, we present a Health-Related Named Entity Recognition (HNER) task using a healthcare-domain ontology that can recognize health-related entities from large numbers of user messages from Twitter. For this task, we employ a deep learning architecture which is based on a recurrent neural network (RNN) with little feature engineering. To achieve our goal, we collected a large number of Twitter messages containing health-related information, and detected biomedical entities from the Unified Medical Language System (UMLS). A bidirectional long short-term memory (BiLSTM) model learned rich context information, and a convolutional neural network (CNN) was used to produce character-level features. The conditional random field (CRF) model predicted a sequence of labels that corresponded to a sequence of inputs, and the Viterbi algorithm was used to detect health-related entities from Twitter messages. We provide comprehensive results giving valuable insights for identifying medical entities in Twitter for various applications. The BiLSTM-CRF model achieved a precision of 93.99%, recall of 73.31%, and F1-score of 81.77% for disease or syndrome HNER; a precision of 90.83%, recall of 81.98%, and F1-score of 87.52% for sign or symptom HNER; and a precision of 94.85%, recall of 73.47%, and F1-score of 84.51% for pharmacologic substance named entities. The ontology-based manual annotation results show that it is possible to perform high-quality annotation despite the complexity of medical terminology and the lack of context in tweets.
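The Viterbi decoding step mentioned in the abstract can be sketched in a few lines. This is a generic textbook version under stated assumptions: per-token emission scores and label-to-label transition scores are passed in as plain dictionaries, whereas in a BiLSTM-CRF they would come from the network and the CRF layer. The function name and data layout are illustrative, not the authors' implementation.

```python
from typing import Dict, List, Tuple


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence (scores are additive)."""
    if not emissions:
        return []
    # best[t][y]: score of the best path that ends in label y at position t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t - 1][y]: best previous label for y at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            # Missing transition entries are treated as a score of 0.
            prev = max(
                labels,
                key=lambda p: best[-1][p] + transitions.get((p, y), 0.0),
            )
            scores[y] = (
                best[-1][prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            )
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

A strongly negative transition score, for example on the pair `("O", "I")` in a BIO scheme, lets the decoder reject label sequences that a token-by-token argmax would otherwise pick.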


NAMED ENTITY RECOGNITION IN BIOMEDICAL LITERATURE USING TWO-LAYER SUPPORT VECTOR MACHINES

Proceedings of the Ninth International Conference on Enterprise Information Systems ◽

10.5220/0002357300390045 ◽

2007 ◽

Keyword(s):

Support Vector Machines ◽

Named Entity Recognition ◽

Biomedical Literature ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Vector Machines


Identifying interactions between chemical entities in biomedical text

Journal of Integrative Bioinformatics ◽

10.1515/jib-2014-247 ◽

2014 ◽

Vol 11 (3) ◽

pp. 1-16 ◽

Cited By ~ 6

Author(s):

Andre Lamurias ◽

João D. Ferreira ◽

Francisco M. Couto

Keyword(s):

Named Entity Recognition ◽

Relation Extraction ◽

Ensemble Classifier ◽

Entity Recognition ◽

Support Vector ◽

Biomedical Text ◽

Web Tool ◽

Named Entity ◽

Vector Machines ◽

Chemical Named Entity Recognition

Summary: Interactions between chemical compounds described in biomedical text can be of great importance to drug discovery and design, as well as pharmacovigilance. We developed a novel system, “Identifying Interactions between Chemical Entities” (IICE), to identify chemical interactions described in text. Kernel-based Support Vector Machines first identify the interactions and then an ensemble classifier validates and classifies the type of each interaction. This relation extraction module was evaluated with the corpus released for the DDI Extraction task of SemEval 2013, obtaining results comparable to state-of-the-art methods for this type of task. We integrated this module with our chemical named entity recognition module and made the whole system available as a web tool at www.lasige.di.fc.ul.pt/webtools/iice.


Use of support vector machines in extended named entity recognition

10.3115/1118853.1118882 ◽

2002 ◽

Cited By ~ 47

Author(s):

Koichi Takeuchi ◽

Nigel Collier

Keyword(s):

Support Vector Machines ◽

Named Entity Recognition ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Vector Machines


Conditional random fields and support vector machines for disorder named entity recognition in clinical texts

Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing - BioNLP '08 ◽

10.3115/1572306.1572326 ◽

2008 ◽

Cited By ~ 29

Author(s):

Dingcheng Li ◽

Karin Kipper-Schuler ◽

Guergana Savova

Keyword(s):

Support Vector Machines ◽

Random Fields ◽

Conditional Random Fields ◽

Named Entity Recognition ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Vector Machines


Chinese Named Entity Recognition using Support Vector Machines

2006 International Conference on Machine Learning and Cybernetics ◽

10.1109/icmlc.2006.258946 ◽

2006 ◽

Cited By ~ 2

Author(s):

Xu-dong Lin ◽

Hong Peng ◽

Bo Liu

Keyword(s):

Support Vector Machines ◽

Named Entity Recognition ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Vector Machines


COST-SENSITIVE STRUCTURED PERCEPTRON INCORPORATING CATEGORY HIERARCHY FOR NAMED ENTITY RECOGNITION

Journal of Information and Communication Technology ◽

10.32890/jict.14.2015.8153 ◽

2015 ◽

Author(s):

Shohei Higashiyama ◽

Blondel Mathieu ◽

Kazuhiro Seki ◽

Kuniaki Uehara

Keyword(s):

Cost Function ◽

Language Processing ◽

Named Entity Recognition ◽

Entity Recognition ◽

Superior Performance ◽

Support Vector ◽

Named Entity ◽

Vector Machines ◽

Representative Work ◽

Correct Category

Named Entity Recognition (NER) is a fundamental natural language processing task for the identification and classification of expressions into predefined categories, such as person and organization. Existing NER systems usually target about 10 categories and do not incorporate analysis of category relations. However, categories often belong naturally to some predefined hierarchy. In such cases, the distance between categories in the hierarchy becomes a rich source of information that can be exploited. This is intuitively useful particularly when the categories are numerous. On that account, this paper proposes an NER approach that can leverage category hierarchy information by introducing, in the structured perceptron framework, a cost function more strongly penalizing category predictions that are more distant from the correct category in the hierarchy. Experimental results on the GENIA biomedical text corpus indicate the effectiveness of the proposed approach as compared with the case where no cost function is utilized. In addition, the proposed approach demonstrates superior performance over a representative work using multi-class support vector machines on the same corpus. A possible direction to further improve the proposed approach is to investigate more elaborate cost functions than the simple additive cost adopted in this work.
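A hierarchy-aware cost of the kind described above can be illustrated with a tree distance between the predicted and the correct category, summed over a sequence. The sketch below is an assumption-laden toy: the `PARENT` hierarchy is invented for illustration and is not the GENIA ontology, and the paper's actual cost function may differ in detail.

```python
from typing import Dict, List, Optional

# Hypothetical category hierarchy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "protein": "amino_acid",
    "peptide": "amino_acid",
    "DNA": "nucleic_acid",
    "RNA": "nucleic_acid",
    "amino_acid": "substance",
    "nucleic_acid": "substance",
    "substance": None,
}


def ancestors(category: str) -> List[str]:
    """The category followed by each of its ancestors up to the root."""
    chain = []
    current: Optional[str] = category
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def tree_distance(a: str, b: str) -> int:
    """Number of edges on the path between two categories in the tree."""
    steps_from_b = {cat: i for i, cat in enumerate(ancestors(b))}
    for steps_from_a, cat in enumerate(ancestors(a)):
        if cat in steps_from_b:  # first shared ancestor
            return steps_from_a + steps_from_b[cat]
    raise ValueError("categories do not share a root")


def sequence_cost(gold: List[str], pred: List[str]) -> int:
    """Additive cost: sum of per-position hierarchy distances."""
    return sum(tree_distance(g, p) for g, p in zip(gold, pred))
```

Under this toy hierarchy, confusing `protein` with `peptide` costs 2 while confusing `protein` with `DNA` costs 4, so a cost-sensitive learner would penalize the second error more heavily.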


NAMED ENTITY RECOGNITION IN GREEK TEXTS WITH AN ENSEMBLE OF SVMS AND ACTIVE LEARNING

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213007003680 ◽

2007 ◽

Vol 16 (06) ◽

pp. 1015-1045 ◽

Cited By ~ 4

Author(s):

GIORGIO LUCARELLI ◽

XENOFON VASILAKOS ◽

ION ANDROUTSOPOULOS

Keyword(s):

Support Vector Machines ◽

Active Learning ◽

Named Entity Recognition ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Input Text ◽

Vector Machines ◽

Temporal Expressions ◽

General Collection

We present a freely available named-entity recognizer for Greek texts that identifies temporal expressions, person, and organization names. For temporal expressions, it relies on semi-automatically produced patterns. For person and organization names, it employs an ensemble of Support Vector Machines that scan the input text in two passes. The ensemble is trained using active learning, whereby the system itself proposes candidate training instances to be annotated by a human during training. The recognizer was evaluated on both a general collection of newspaper articles and a more focussed, in terms of topics, collection of financial articles.
