Mining Interpretable and Predictive Diagnosis Codes from Multi-source Electronic Health Records

Author(s):  
Sanjoy Dey ◽  
Gyorgy Simon ◽  
Bonnie Westra ◽  
Michael Steinbach ◽  
Vipin Kumar
Author(s):  
Roselie A. Bright ◽  
Susan J. Bright-Ponte ◽  
Lee Anne Palmer ◽  
Summer K. Rankin ◽  
Sergey Blok

ABSTRACT Background: Electronic health records (EHRs) and big data tools offer the opportunity for surveillance of adverse events (patient harm associated with medical care). We chose the case of transfusion adverse events (TAEs) and potential TAEs (PTAEs) because (1) real dates were obscured in the study data, and (2) there was emerging recognition of new types during the study data period. Objective: We aimed to use the structured data in EHRs to find TAEs and PTAEs among adults. Methods: We used 49,331 adult admissions involving critical care at a major teaching hospital, 2001-2012, in the MIMIC-III EHRs database. We formed a T (defined as packed red blood cells, platelets, or plasma) group of 21,443 admissions vs. 25,468 comparison (C) admissions. The ICD-9-CM diagnosis codes were compared for T vs. C, described, and tested with statistical tools. Results: TAEs such as transfusion associated circulatory overload (TACO; 12 T cases; rate ratio (RR) 15.61; 95% CI 2.49 to 98) were found. There were also PTAEs similar to TAEs, such as fluid overload disorder (361 T admissions; RR 2.24; 95% CI 1.88 to 2.65), similar to TACO. Some diagnoses could have been sequelae of TAEs, including nontraumatic compartment syndrome of abdomen (52 T cases; RR 6.76; 95% CI 3.40 to 14.9) possibly being a consequence of TACO. Conclusions: Surveillance for diagnosis codes that could be TAE sequelae or unrecognized TAE might be useful supplements to existing medical product adverse event programs.
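To make the rate ratios and confidence intervals quoted above more concrete, the sketch below shows one common way such figures can be computed from two groups of admissions. The abstract does not say which statistical method the authors used, so the log-based (Katz) interval here is an assumption, and the function name `rate_ratio_ci` and the counts in the usage lines are hypothetical.

```python
import math


def rate_ratio_ci(cases_t, n_t, cases_c, n_c, z=1.96):
    """Rate ratio of a diagnosis code in group T vs. group C with a
    log-method confidence interval (an assumed method, not necessarily
    the one used in the study)."""
    if cases_t <= 0 or cases_c <= 0:
        raise ValueError("the log method needs at least one case per group")
    rr = (cases_t / n_t) / (cases_c / n_c)
    # Standard error of ln(RR) for two independent proportions.
    se_log = math.sqrt(1 / cases_t - 1 / n_t + 1 / cases_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts, chosen only for illustration (not study data).
rr, lower, upper = rate_ratio_ci(cases_t=30, n_t=1000, cases_c=10, n_c=1000)
print(f"RR {rr:.2f}; 95% CI {lower:.2f} to {upper:.2f}")
```

With very few cases in one group, as with the 12 TACO cases above, an interval of this kind becomes very wide, which is consistent with the broad interval reported in the abstract.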


2021 ◽  
Author(s):  
Roselie A. Bright ◽  
Katherine Dowdy ◽  
Summer K. Rankin ◽  
Sergey V. Blok ◽  
Lee Anne Palmer ◽  
...  

ABSTRACT Background: Text in electronic health records (EHRs) and big data tools offer the opportunity for surveillance of adverse events (patient harm associated with medical care) (AEs) in the unstructured notes. Writers may explicitly state an apparent association between treatment and adverse outcome (“attributed”) or state the simple treatment and outcome without an association (“unattributed”). We chose to study EHRs from 2006-2008 because of known heparin contamination during this timeframe. We hypothesized that the prevalence of adulterated heparin may have been widespread enough to manifest in EHRs through symptoms related to heparin adverse events, independent of clinicians’ documentation of attributed AEs. Objective: Use the Shakespeare Method, a new unsupervised set of tools, to identify attributed and unattributed potential AEs using the unstructured text of EHRs. Methods: We studied 21,287 adult critical care admissions divided into three time periods. Comparisons of period 3 (7/2007 to 6/2008) to period 2 (7/2006 to 6/2007) were used to find admissions notes to review for new or increased clinical events by generating Latent Dirichlet Allocation topics among words in period 3 that were distinct from period 2. These results were further explored with frequency analyses of periods 1 (7/2001 to 6/2006) through 3. Results: Topics represented unattributed heparin AEs, other medical AEs, rare medical diagnoses, and other clinical events; all were verified with EHRs notes review and frequency analysis. The heparin AEs were not attributed in the notes, diagnosis codes, or procedure codes. Somewhat different from our hypothesis, heparin AEs increased in prevalence from 2001 through 2007, and decreased starting in 2008 (when heparin AEs were being published). Conclusions: The Shakespeare Method could be a useful supplement to AE reporting and surveillance of structured EHRs data. Future improvements should include automation of the manual review process.


Hypertension ◽  
2020 ◽  
Vol 76 (Suppl_1) ◽  
Author(s):  
Priyanka Solanki ◽  
Imran Ajmal ◽  
Xiruo Ding ◽  
Jordana Cohen ◽  
Debbie Cohen ◽  
...  

Introduction: Apparent treatment-resistant hypertension (aTRH) affects 10-20% of hypertensive adults and increases risk of cardiovascular events and mortality. Fewer than half of these patients have true resistant hypertension. The majority experience pseudo-resistant hypertension due to inadequate medication adherence, white coat hypertension, and secondary causes of hypertension. We hypothesize that electronic health records can be leveraged to identify aTRH patients who would benefit from targeted counseling, medication reconciliation, and screening for secondary causes of hypertension. Methods: We studied electronic health record (EHR) data from 395 hypertensive adults in our primary care population who received longitudinal care between 2007 and 2017. Patients who met the 2008 AHA definition of resistant hypertension by chart review were considered to have aTRH. We also included 100 patients identified by heuristics targeting secondary hypertension. We extracted from the EHR demographics, vitals, laboratory results, diagnosis codes, and medications. Results outside of physiologic range were excluded and median imputation was used to handle missing data. Random forest model performance was assessed by 5-fold cross validation. Model discrimination was evaluated at an estimated positive predictive value of 75%. Results: The prevalence of aTRH in our randomly selected and full cohorts was 20.3% (n=295) and 25.8% (n=395), respectively. In cross-validation, the random forest model demonstrated a median sensitivity of 65% (IQR: 60% - 65%) and a median AUROC of 0.92 (IQR: 0.90 - 0.92). The most influential variables were related to the prescription of three or more hypertension medications; number of days on diuretics, angiotensin-converting enzyme inhibitors, or angiotensin II receptor blockers; systolic blood pressure measurements; and hypertension or diabetes diagnosis codes. Conclusion: EHR data can be used to accurately identify patients with aTRH. We expect that implementing a clinical decision support system leveraging such models could lead to improved care for aTRH patients.
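The abstract reports sensitivity at an estimated positive predictive value (PPV) of 75%. As a rough illustration of what that operating point means, the sketch below picks the score cutoff that maximizes sensitivity while keeping PPV at or above a target. It is a minimal sketch under stated assumptions: the function name `sensitivity_at_ppv`, the scores, and the labels are hypothetical, and the study's actual threshold-selection procedure is not described in the abstract.

```python
def sensitivity_at_ppv(scores, labels, target_ppv=0.75):
    """Return (threshold, sensitivity, ppv) for the cutoff with the highest
    sensitivity whose PPV is at least target_ppv, or None if no cutoff
    qualifies. A patient is predicted positive when score >= threshold."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best = None
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Evaluate only where the score changes, so tied scores stay together.
        at_boundary = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if not at_boundary:
            continue
        ppv = tp / (tp + fp)
        sensitivity = tp / total_pos
        if ppv >= target_ppv and (best is None or sensitivity > best[1]):
            best = (score, sensitivity, ppv)
    return best


# Hypothetical model scores and chart-review labels (1 = aTRH), not study data.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
labels = [1, 1, 0, 1, 1, 0, 0, 1]
print(sensitivity_at_ppv(scores, labels))
```

In practice the cutoff would be chosen on held-out folds rather than on the data used to fit the model, in line with the cross-validation described in the abstract.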


2019 ◽  
Vol 26 (8-9) ◽  
pp. 787-795 ◽  
Author(s):  
Tao Chen ◽  
Mark Dredze ◽  
Jonathan P Weiner ◽  
Hadi Kharrazi

Abstract. Objective: Geriatric syndromes such as functional disability and lack of social support are often not encoded in electronic health records (EHRs), thus obscuring the identification of vulnerable older adults in need of additional medical and social services. In this study, we automatically identify vulnerable older adult patients with geriatric syndrome based on clinical notes extracted from an EHR system, and demonstrate how contextual information can improve the process. Materials and Methods: We propose a novel end-to-end neural architecture to identify sentences that contain geriatric syndromes. Our model learns a representation of the sentence and augments it with contextual information: surrounding sentences, the entire clinical document, and the diagnosis codes associated with the document. We trained our system on annotated notes from 85 patients, tuned the model on another 50 patients, and evaluated its performance on the remaining 50 patients. Results: Contextual information improved classification, with the most effective context coming from the surrounding sentences. At sentence level, our best performing model achieved a micro-F1 of 0.605, significantly outperforming context-free baselines. At patient level, our best model achieved a micro-F1 of 0.843. Discussion: Our solution can be used to expand the identification of vulnerable older adults with geriatric syndromes. Since functional and social factors are often not captured by diagnosis codes in EHRs, the automatic identification of the geriatric syndrome can reduce disparities by ensuring consistent care across the older adult population. Conclusion: EHR free-text can be used to identify vulnerable older adults with a range of geriatric syndromes.
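The micro-F1 scores quoted above pool true positives, false positives, and false negatives across all syndrome labels before computing F1. The sketch below illustrates that standard definition for a multi-label setting. The function name `micro_f1`, the label names, and the toy annotations are hypothetical and are not taken from the study.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 for multi-label data.

    gold and predicted are equal-length lists of label sets, one set per
    sentence (or per patient). Counts are pooled over all labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)  # labels correctly predicted
        fp += len(p - g)  # labels predicted but not annotated
        fn += len(g - p)  # labels annotated but missed
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical annotations for four sentences (label names are made up).
gold = [{"falls"}, {"falls", "dementia"}, set(), {"malnutrition"}]
pred = [{"falls"}, {"dementia"}, {"falls"}, {"malnutrition"}]
print(micro_f1(gold, pred))  # 3 TP, 1 FP, 1 FN -> 0.75
```

Because counts are pooled, frequent labels dominate the micro-average, which is one reason sentence-level and patient-level scores can differ as much as they do in the abstract.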


2016 ◽  
Vol 34 (2) ◽  
pp. 163-165 ◽  
Author(s):  
William B. Ventres ◽  
Richard M. Frankel
