Detection of Bleeding Events in Electronic Health Record Notes Using Convolutional Neural Network Models Enhanced With Recurrent Neural Network Autoencoders: Deep Learning Approach (Preprint)

2018
Author(s): Rumeng Li, Baotian Hu, Feifan Liu, Weisong Liu, Francesca Cunningham, ...

BACKGROUND Bleeding events are common and critical and may cause significant morbidity and mortality. High incidences of bleeding events are associated with cardiovascular disease in patients on anticoagulant therapy. Prompt and accurate detection of bleeding events is essential to prevent serious consequences. As bleeding events are often described in clinical notes, automatic detection of bleeding events from electronic health record (EHR) notes may improve drug-safety surveillance and pharmacovigilance. OBJECTIVE We aimed to develop a natural language processing (NLP) system to automatically classify whether an EHR note sentence contains a bleeding event. METHODS We expert-annotated 878 EHR notes (76,577 sentences and 562,630 word-tokens) to identify bleeding events at the sentence level. This annotated corpus was used to train and validate our NLP systems. We developed an innovative hybrid convolutional neural network (CNN) and long short-term memory (LSTM) autoencoder (HCLA) model that integrates a CNN architecture with a bidirectional LSTM (BiLSTM) autoencoder model to leverage large unlabeled EHR data. RESULTS HCLA achieved the best area under the receiver operating characteristic curve (0.957) and F1 score (0.938) to identify whether a sentence contains a bleeding event, thereby surpassing the strong baseline support vector machines and other CNN and autoencoder models. CONCLUSIONS By incorporating a supervised CNN model and a pretrained unsupervised BiLSTM autoencoder, the HCLA achieved high performance in detecting bleeding events.
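The sentence-level task this abstract describes can be illustrated with a trivial keyword baseline. This is not the authors' HCLA model, and the term list below is an assumed sample, not the study's annotation schema; it only shows what "classify whether a sentence contains a bleeding event" means operationally.

```python
import re

# Hypothetical bleeding-term list; a real system would learn these
# distinctions from the annotated corpus rather than match keywords.
BLEEDING_TERMS = re.compile(
    r"\b(bleed(?:ing)?|h(?:a)?emorrhage|hematemesis|melena|epistaxis|hematuria)\b",
    re.IGNORECASE,
)

def is_bleeding_sentence(sentence: str) -> bool:
    """Return True if the sentence mentions a bleeding-related keyword."""
    return bool(BLEEDING_TERMS.search(sentence))

print(is_bleeding_sentence("Patient presented with melena and anemia."))   # True
print(is_bleeding_sentence("Denies chest pain or shortness of breath."))   # False
```

A baseline like this is what neural models such as HCLA are evaluated against: it misses paraphrases and negations ("no evidence of bleeding"), which is where learned representations earn their AUC advantage.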

2016, Vol 24 (1), pp. 162-171
Author(s): Pedro L Teixeira, Wei-Qi Wei, Robert M Cronin, Huan Mo, Jacob P VanHouten, ...

Objective: Phenotyping algorithms applied to electronic health record (EHR) data enable investigators to identify large cohorts for clinical and genomic research. Algorithm development is often iterative, depends on fallible investigator intuition, and is time- and labor-intensive. We developed and evaluated 4 types of phenotyping algorithms and categories of EHR information to identify hypertensive individuals and controls and provide a portable module for implementation at other sites. Materials and Methods: We reviewed the EHRs of 631 individuals followed at Vanderbilt for hypertension status. We developed features and phenotyping algorithms of increasing complexity. Input categories included International Classification of Diseases, Ninth Revision (ICD9) codes, medications, vital signs, narrative-text search results, and Unified Medical Language System (UMLS) concepts extracted using natural language processing (NLP). We developed a module and tested portability by replicating 10 of the best-performing algorithms at the Marshfield Clinic. Results: Random forests using billing codes, medications, vitals, and concepts had the best performance with a median area under the receiver operating characteristic curve (AUC) of 0.976. Normalized sums of all 4 categories also performed well (0.959 AUC). The best non-NLP algorithm combined normalized ICD9 codes, medications, and blood pressure readings with a median AUC of 0.948. Blood pressure cutoffs or ICD9 code counts alone had AUCs of 0.854 and 0.908, respectively. Marshfield Clinic results were similar. Conclusion: This work shows that billing codes or blood pressure readings alone yield good hypertension classification performance. However, even simple combinations of input categories improve performance. The most complex algorithms classified hypertension with excellent recall and precision.
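AUC, the headline metric here, has a simple rank-based definition: the probability that a randomly chosen case scores higher than a randomly chosen control. A minimal sketch of that computation (not the study's evaluation code):

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney identity:
    fraction of (positive, negative) pairs where the positive
    outranks the negative, counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfectly separated toy scores give AUC 1.0
print(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # 1.0
```

This pairwise view explains why an AUC of 0.976 for the random forest versus 0.854 for blood pressure cutoffs is a substantial gap: it is the difference in how often each classifier ranks a true hypertensive above a control.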


2014, Vol 22 (1), pp. 155-165
Author(s): Christian M Rochefort, Aman D Verma, Tewodros Eguale, Todd C Lee, David L Buckeridge

Abstract Background Venous thromboembolisms (VTEs), which include deep vein thrombosis (DVT) and pulmonary embolism (PE), are associated with significant mortality, morbidity, and cost in hospitalized patients. To evaluate the success of preventive measures, accurate and efficient methods for monitoring VTE rates are needed. Therefore, we sought to determine the accuracy of statistical natural language processing (NLP) for identifying DVT and PE from electronic health record data. Methods We randomly sampled 2000 narrative radiology reports from patients with a suspected DVT/PE in Montreal (Canada) between 2008 and 2012. We manually identified DVT/PE within each report, which served as our reference standard. Using a bag-of-words approach, we trained 10 alternative support vector machine (SVM) models predicting DVT, and 10 predicting PE. SVM training and testing was performed with nested 10-fold cross-validation, and the average accuracy of each model was measured and compared. Results On manual review, 324 (16.2%) reports were DVT-positive and 154 (7.7%) were PE-positive. The best DVT model achieved an average sensitivity of 0.80 (95% CI 0.76 to 0.85), specificity of 0.98 (95% CI 0.97 to 0.99), positive predictive value (PPV) of 0.89 (95% CI 0.85 to 0.93), and an area under the curve (AUC) of 0.98 (95% CI 0.97 to 0.99). The best PE model achieved sensitivity of 0.79 (95% CI 0.73 to 0.85), specificity of 0.99 (95% CI 0.98 to 0.99), PPV of 0.84 (95% CI 0.75 to 0.92), and AUC of 0.99 (95% CI 0.98 to 1.00). Conclusions Statistical NLP can accurately identify VTE from narrative radiology reports.


2021, Vol 39 (28_suppl), pp. 324-324
Author(s): Isaac S. Chua, Elise Tarbi, Jocelyn H. Siegel, Kate Sciacca, Anne Kwok, ...

Background: Delivering goal-concordant care to patients with advanced cancer requires identifying eligible patients who would benefit from goals of care (GOC) conversations; training clinicians how to have these conversations; conducting conversations in a timely manner; and documenting GOC conversations that can be readily accessed by care teams. We used an existing, locally developed electronic cancer care clinical pathways system to guide oncologists toward these conversations. Methods: To identify eligible patients, pathways directors from 12 oncology disease centers identified therapeutic decision nodes for each pathway that corresponded to a predicted life expectancy of ≤1 year. When oncologists selected one of these pre-identified pathways nodes, the decision was captured in a relational database. From these patients, we sought evidence of GOC documentation within the electronic health record by extracting coded data from the advance care planning (ACP) module—a designated area within the electronic health record for clinicians to document GOC conversations. We also used rule-based natural language processing (NLP) to capture free text GOC documentation within these same patients’ progress notes. A domain expert reviewed all progress notes identified by NLP to confirm the presence of GOC documentation. Results: In a pilot sample obtained between March 20 and September 25, 2020, we identified a total of 21 pathway nodes conveying a poor prognosis, which represented 91 unique patients with advanced cancer. Among these patients, the mean age was 62 (SD 13.8) years old; 55 (60.4%) patients were female, and 69 (75.8%) were non-Hispanic White. The cancers most represented were thoracic (32 [35.2%]), breast (31 [34.1%]), and head and neck (13 [14.3%]). Within the 3 months leading up to the pathways decision date, a total of 62 (68.1%) patients had any GOC documentation.
Twenty-one (23.1%) patients had documentation in both the ACP module and NLP-identified progress notes; 5 (5.5%) had documentation in the ACP module only; and 36 (39.6%) had documentation in progress notes only. Twenty-two unique clinicians utilized the ACP module, of whom 1 (4.5%) was an oncologist and 21 (95.5%) were palliative care clinicians. Conclusions: Approximately two thirds of patients had any GOC documentation. A total of 26 (28.6%) patients had any GOC documentation in the ACP module, and only 1 oncologist documented using the ACP module, where care teams can most easily retrieve GOC information. These findings provide an important baseline for future quality improvement efforts (e.g., implementing serious illness communications training, increasing support around ACP module utilization, and incorporating behavioral nudges) to enhance oncologists’ ability to conduct and to document timely, high-quality GOC conversations.
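Rule-based NLP of the kind described here typically reduces to pattern matching over note text, followed by human review of the hits. The study's actual rules and phrase list are not given in the abstract; the patterns below are assumptions for illustration only.

```python
import re

# Assumed GOC-related phrases; a deployed rule set would be curated
# and validated against chart review, as the study describes.
GOC_PATTERNS = re.compile(
    r"goals of care|code status|comfort[- ]focused|hospice|advance directive",
    re.IGNORECASE,
)

def has_goc_documentation(note_text: str) -> bool:
    """Flag a progress note that appears to document a GOC conversation."""
    return bool(GOC_PATTERNS.search(note_text))

notes = [
    "Discussed goals of care with patient and family; DNR/DNI confirmed.",
    "Continue chemotherapy per pathway; follow up in 3 weeks.",
]
print([has_goc_documentation(n) for n in notes])  # [True, False]
```

The expert-review step in the study exists precisely because such rules over-trigger (e.g., "goals of care not yet discussed" would match), so NLP hits are candidates, not confirmations.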


2019
Author(s): Daniel M. Bean, James Teo, Honghan Wu, Ricardo Oliveira, Raj Patel, ...

Abstract Atrial fibrillation (AF) is the most common arrhythmia and significantly increases stroke risk. This risk is effectively managed by oral anticoagulation. Recent studies using national registry data indicate increased use of anticoagulation resulting from changes in guidelines and the availability of newer drugs. The aim of this study is to develop and validate an open source risk scoring pipeline for free-text electronic health record data using natural language processing. AF patients discharged from 1st January 2011 to 1st October 2017 were identified from discharge summaries (N=10,030, 64.6% male, average age 75.3 ± 12.3 years). A natural language processing pipeline was developed to identify risk factors in clinical text and calculate risk for ischaemic stroke (CHA2DS2-VASc) and bleeding (HAS-BLED). Scores were validated against two independent experts for 40 patients. Automatic risk scores were in strong agreement with the two independent experts for CHA2DS2-VASc (average kappa 0.78 vs experts, compared to 0.85 between experts). Agreement was lower for HAS-BLED (average kappa 0.54 vs experts, compared to 0.74 between experts). In high-risk patients (CHA2DS2-VASc ≥2), oral anticoagulant (OAC) use has increased significantly over the last 7 years, driven by the availability of direct oral anticoagulants (DOACs) and the transitioning of patients from antiplatelet medication alone to OAC. Factors independently associated with OAC use included components of the CHA2DS2-VASc and HAS-BLED scores as well as discharging specialty and frailty. OAC use was highest in patients discharged under cardiology (69%). Electronic health record text can be used for automatic calculation of clinical risk scores at scale. Open source tools are available today for this task but require further validation. Analysis of routinely-collected EHR data can replicate findings from large-scale curated registries.
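Once the NLP pipeline has extracted the risk factors, the CHA2DS2-VASc score itself is simple arithmetic over its published components. A minimal sketch of that final scoring step (the extraction step, which is the hard part, is omitted):

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 stroke_tia, vascular_disease):
    """CHA2DS2-VASc stroke-risk score: CHF 1, hypertension 1, age >=75 2,
    diabetes 1, prior stroke/TIA 2, vascular disease 1, age 65-74 1,
    female sex 1. Boolean/0-1 inputs except age."""
    score = 0
    score += 2 if age >= 75 else (1 if age >= 65 else 0)
    score += 1 if female else 0
    score += 2 if stroke_tia else 0
    score += int(chf) + int(hypertension) + int(diabetes) + int(vascular_disease)
    return score

# A 76-year-old woman with hypertension and a prior TIA: 2 + 1 + 1 + 2 = 6
print(cha2ds2_vasc(76, True, 0, 1, 0, True, 0))  # 6
```

The threshold used in the abstract, CHA2DS2-VASc ≥2, is the conventional cut-off at which oral anticoagulation is generally considered.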


2020, Vol 27 (6), pp. 917-923
Author(s): Liqin Wang, Suzanne V Blackley, Kimberly G Blumenthal, Sharmitha Yerneni, Foster R Goss, ...

Abstract Objective Incomplete and static reaction picklists in the allergy module led to free-text and missing entries that inhibit the clinical decision support intended to prevent adverse drug reactions. We developed a novel, data-driven, “dynamic” reaction picklist to improve allergy documentation in the electronic health record (EHR). Materials and Methods We split 3 decades of allergy entries in the EHR of a large Massachusetts healthcare system into development and validation datasets. We consolidated duplicate allergens and those with the same ingredients or allergen groups. We created a reaction value set via expert review of a previously developed value set and then applied natural language processing to reconcile reactions from structured and free-text entries. Three association rule-mining measures were used to develop a comprehensive reaction picklist dynamically ranked by allergen. The dynamic picklist was assessed using recall at top k suggested reactions, comparing performance to the static picklist. Results The modified reaction value set contained 490 reaction concepts. Among 4 234 327 allergy entries collected, 7463 unique consolidated allergens and 469 unique reactions were identified. Of the 3 dynamic reaction picklists developed, the 1 with the optimal ranking achieved recalls of 0.632, 0.763, and 0.822 at the top 5, 10, and 15, respectively, significantly outperforming the static reaction picklist ranked by reaction frequency. Conclusion The dynamic reaction picklist developed using EHR data and a statistical measure was superior to the static picklist and suggested proper reactions for allergy documentation. Further studies might evaluate the usability and impact on allergy documentation in the EHR.
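The recall-at-k evaluation used here measures how many of the reactions a clinician actually documented appear among the top k picklist suggestions. A small sketch with made-up reactions (not the study's data or value set):

```python
def recall_at_k(suggested, documented, k):
    """Fraction of documented reactions found in the top-k suggestions."""
    top = set(suggested[:k])
    hits = sum(1 for reaction in documented if reaction in top)
    return hits / len(documented)

# Hypothetical dynamic ranking for one allergen
ranked = ["hives", "rash", "anaphylaxis", "nausea", "itching", "swelling"]
truth = ["rash", "itching", "dyspnea"]
print(recall_at_k(ranked, truth, 5))  # 2 of 3 documented reactions in top 5
```

Under this metric, the reported recalls of 0.632/0.763/0.822 at k = 5/10/15 mean the dynamic allergen-specific ranking surfaces most true reactions within a single short list, which is what makes a picklist usable at the point of documentation.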


2020
Author(s): Tjardo D Maarseveen, Timo Meinderink, Marcel J T Reinders, Johannes Knitza, Tom W J Huizinga, ...

BACKGROUND Financial codes are often used to extract diagnoses from electronic health records. This approach is prone to false positives. Alternatively, queries are constructed, but these are highly center and language specific. A tantalizing alternative is the automatic identification of patients by employing machine learning on format-free text entries. OBJECTIVE The aim of this study was to develop an easily implementable workflow that builds a machine learning algorithm capable of accurately identifying patients with rheumatoid arthritis from format-free text fields in electronic health records. METHODS Two electronic health record data sets were employed: Leiden (n=3000) and Erlangen (n=4771). Using a portion of the Leiden data (n=2000), we compared 6 different machine learning methods and a naïve word-matching algorithm using 10-fold cross-validation. Performances were compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision recall curve (AUPRC), and F1 score was used as the primary criterion for selecting the best method to build a classifying algorithm. We selected the optimal threshold of positive predictive value for case identification based on the output of the best method in the training data. This validation workflow was subsequently applied to a portion of the Erlangen data (n=4293). For testing, the best performing methods were applied to the remaining data (Leiden n=1000; Erlangen n=478) for an unbiased evaluation. RESULTS For the Leiden data set, the word-matching algorithm demonstrated mixed performance (AUROC 0.90; AUPRC 0.33; F1 score 0.55), and 4 methods significantly outperformed word-matching, with support vector machines performing best (AUROC 0.98; AUPRC 0.88; F1 score 0.83).
Applying this support vector machine classifier to the test data resulted in a similarly high performance (F1 score 0.81; positive predictive value [PPV] 0.94), and with this method, we could identify 2873 patients with rheumatoid arthritis in less than 7 seconds out of the complete collection of 23,300 patients in the Leiden electronic health record system. For the Erlangen data set, gradient boosting performed best (AUROC 0.94; AUPRC 0.85; F1 score 0.82) in the training set, and applied to the test data, resulted once again in good results (F1 score 0.67; PPV 0.97). CONCLUSIONS We demonstrate that machine learning methods can extract the records of patients with rheumatoid arthritis from electronic health record data with high precision, allowing research on very large populations for limited costs. Our approach is language and center independent and could be applied to any type of diagnosis. We have developed our pipeline into a universally applicable and easy-to-implement workflow to equip centers with their own high-performing algorithm. This allows the creation of observational studies of unprecedented size covering different countries for low cost from already available data in electronic health record systems.
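The threshold-selection step described in the methods (choosing a decision threshold that achieves a target PPV on training data) can be sketched in a few lines. This is an illustrative reconstruction of that step, not the authors' published code:

```python
def threshold_for_ppv(scores, labels, target_ppv):
    """Scan candidate thresholds from highest classifier score downward
    and return the lowest threshold whose PPV still meets the target,
    maximizing the number of cases flagged at that precision."""
    best = None
    for t in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= t]
        ppv = sum(flagged) / len(flagged)
        if ppv >= target_ppv:
            best = t
        else:
            break
    return best

# Toy training scores and case labels (illustrative only)
scores = [0.95, 0.9, 0.8, 0.6, 0.4]
labels = [1, 1, 0, 1, 0]
print(threshold_for_ppv(scores, labels, 0.9))  # 0.9
```

Fixing the threshold on training data and then reporting test-set PPV (0.94 for Leiden, 0.97 for Erlangen) is what makes the reported precision an unbiased estimate rather than an artifact of threshold tuning.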

