Lancet: a high precision medication event extraction system for clinical text

2010 ◽  
Vol 17 (5) ◽  
pp. 563-567 ◽  
Author(s):  
Zuofeng Li ◽  
Feifan Liu ◽  
Lamont Antieau ◽  
Yonggang Cao ◽  
Hong Yu
2020 ◽  
Author(s):  
Zining Yang ◽  
Siyu Zhan ◽  
Mengshu Hou ◽  
Xiaoyang Zeng ◽  
Hao Zhu

Recent pre-trained language models have achieved great success in many NLP tasks. In this paper, we propose an event extraction system based on the novel pre-trained language model BERT to extract both event triggers and arguments. Because the method is deep-learning-based, the size of the training dataset has a crucial impact on performance. To address the lack of training data for event extraction, we further train the pre-trained language model on a carefully constructed in-domain corpus to inject event knowledge into our event extraction system with minimal effort. Empirical evaluation on the ACE2005 dataset shows that injecting event knowledge can significantly improve the performance of event extraction.


BMJ Open ◽  
2019 ◽  
Vol 9 (4) ◽  
pp. e023232 ◽  
Author(s):  
Beata Fonferko-Shadrach ◽  
Arron S Lacey ◽  
Angus Roberts ◽  
Ashley Akbari ◽  
Simon Thompson ◽  
...  

Objective: Routinely collected healthcare data are a powerful research resource but often lack detailed disease-specific information that is collected in clinical free text, for example, clinic letters. We aim to use natural language processing techniques to extract detailed clinical information from epilepsy clinic letters to enrich routinely collected data.
Design: We used the general architecture for text engineering (GATE) framework to build an information extraction system, ExECT (extraction of epilepsy clinical text), combining rule-based and statistical techniques. We extracted nine categories of epilepsy information in addition to clinic date and date of birth across 200 clinic letters. We compared the results of our algorithm with a manual review of the letters by an epilepsy clinician.
Setting: De-identified and pseudonymised epilepsy clinic letters from a Health Board serving half a million residents in Wales, UK.
Results: We identified 1925 items of information with overall precision, recall and F1 score of 91.4%, 81.4% and 86.1%, respectively. Precision and recall for epilepsy-specific categories were: epilepsy diagnosis (88.1%, 89.0%), epilepsy type (89.8%, 79.8%), focal seizures (96.2%, 69.7%), generalised seizures (88.8%, 52.3%), seizure frequency (86.3%, 53.6%), medication (96.1%, 94.0%), CT (55.6%, 58.8%), MRI (82.4%, 68.8%) and electroencephalogram (81.5%, 75.3%).
Conclusions: We have built an automated clinical text extraction system that can accurately extract epilepsy information from free text in clinic letters. This can enhance routinely collected data for research in the UK. Information extracted with ExECT, such as epilepsy type, seizure frequency and neurological investigations, is often missing from routinely collected data. We propose that our algorithm can bridge this data gap, enabling further epilepsy research opportunities. While many of the rules in our pipeline were tailored to extract epilepsy-specific information, our methods can be applied to other diseases and can also be used in clinical practice to record patient information in a structured manner.
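
As a quick consistency check on the figures above, the overall F1 score can be recomputed from the reported precision and recall as their harmonic mean. The sketch below is illustrative only: the function name `f1_score` and the dictionary layout are assumptions, and the per-category F1 values it prints are derived here, not reported in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported in the abstract (percent).
overall_f1 = f1_score(91.4, 81.4)
print(f"overall F1: {overall_f1:.1f}%")  # prints "overall F1: 86.1%"

# A few per-category (precision, recall) pairs copied from the abstract.
# The F1 values printed for these are derived here, not taken from the source.
categories = {
    "epilepsy diagnosis": (88.1, 89.0),
    "generalised seizures": (88.8, 52.3),
    "medication": (96.1, 94.0),
}
for name, (p, r) in categories.items():
    print(f"{name}: F1 = {f1_score(p, r):.1f}%")
```

The recomputed overall value agrees with the reported 86.1%. A category with a large gap between precision and recall, such as generalised seizures, is pulled towards the lower of the two by the harmonic mean.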


2019 ◽  
Vol 23 (2) ◽  
pp. 953-965 ◽  
Author(s):  
Kailai Zhang ◽  
Ji Wu ◽  
Xiaofeng Tong ◽  
Yumeng Wang

2010 ◽  
Vol 17 (5) ◽  
pp. 507-513 ◽  
Author(s):  
Guergana K Savova ◽  
James J Masanz ◽  
Philip V Ogren ◽  
Jiaping Zheng ◽  
Sunghwan Sohn ◽  
...  

2019 ◽  
Vol 26 (11) ◽  
pp. 1364-1369 ◽  
Author(s):  
Majid Afshar ◽  
Dmitriy Dligach ◽  
Brihat Sharma ◽  
Xiaoyuan Cai ◽  
Jason Boyda ◽  
...  

Objective: Natural language processing (NLP) engines such as the clinical Text Analysis and Knowledge Extraction System are a solution for processing notes for research, but optimizing their performance for a clinical data warehouse (CDW) remains a challenge. We aim to develop a high throughput NLP architecture using the clinical Text Analysis and Knowledge Extraction System and present a predictive model use case.
Materials and Methods: The CDW comprised 1 103 038 patients across 10 years. The architecture was constructed using the Hadoop data repository for source data and 3 large-scale symmetric processing servers for NLP. Each named entity mention in a clinical document was mapped to the Unified Medical Language System concept unique identifier (CUI).
Results: The NLP architecture processed 83 867 802 clinical documents in 13.33 days and produced 37 721 886 606 CUIs across 8 standardized medical vocabularies. Performance of the architecture exceeded 500 000 documents per hour across 30 parallel instances of the clinical Text Analysis and Knowledge Extraction System, including 10 instances dedicated to documents greater than 20 000 bytes. In a use-case example for predicting 30-day hospital readmission, a CUI-based model had similar discrimination to n-grams, with an area under the receiver operating characteristic curve of 0.75 (95% CI, 0.74–0.76).
Discussion and Conclusion: Our health system’s high throughput NLP architecture may serve as a benchmark for large-scale clinical research using a CUI-based approach.
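
The reported totals can be turned into rough per-hour and per-document rates. The sketch below assumes the 13.33 days is continuous wall-clock time for the whole run; the abstract does not say so, and the function and constant names are illustrative only.

```python
# Totals as reported in the abstract.
DOCUMENTS = 83_867_802
CUIS = 37_721_886_606
DAYS = 13.33  # assumed here to be continuous wall-clock time for the full run


def docs_per_hour(documents: int, days: float) -> float:
    """Average throughput over the whole run, in documents per hour."""
    return documents / (days * 24)


def cuis_per_document(cuis: int, documents: int) -> float:
    """Average number of CUIs produced per clinical document."""
    return cuis / documents


print(f"average documents per hour: {docs_per_hour(DOCUMENTS, DAYS):,.0f}")
print(f"average CUIs per document: {cuis_per_document(CUIS, DOCUMENTS):.1f}")
```

Under that assumption the whole-run average comes to roughly 262 000 documents per hour, with about 450 CUIs per document. That average is lower than the "exceeded 500 000 documents per hour" figure, which may therefore describe peak or steady-state processing rather than the whole run. The abstract does not specify the measurement window.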


2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Valery Solovyev ◽  
Vladimir Ivanov

Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work in the development of an extraction system is indispensable, either in corpus annotation or in the creation of vocabularies and patterns for a knowledge-based system. Recent work has focused on adapting existing systems (for extraction from English texts) to new domains. Event extraction in other languages has not been studied, owing to the lack of resources and algorithms necessary for natural language processing. In this paper we define a set of linguistic resources that are necessary for developing a knowledge-based event extraction system for Russian: a vocabulary of subordination models, a vocabulary of event triggers, and a vocabulary of Frame Elements that are basic building blocks for semantic patterns. We propose a set of methods for creating such vocabularies in Russian and other languages using the Google Books Ngram Corpus. The methods are evaluated in the development of an event extraction system for Russian.

