Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance

2015 ◽  
Vol 23 (e1) ◽  
pp. e20-e27 ◽  
Author(s):  
Wei-Qi Wei ◽  
Pedro L Teixeira ◽  
Huan Mo ◽  
Robert M Cronin ◽  
Jeremy L Warner ◽  
...  

Abstract Objective To evaluate the phenotyping performance of three major electronic health record (EHR) components: International Classification of Disease (ICD) diagnosis codes, primary notes, and specific medications. Materials and Methods We conducted the evaluation using de-identified Vanderbilt EHR data. We preselected ten diseases: atrial fibrillation, Alzheimer’s disease, breast cancer, gout, human immunodeficiency virus infection, multiple sclerosis, Parkinson’s disease, rheumatoid arthritis, and types 1 and 2 diabetes mellitus. For each disease, patients were classified into seven categories based on the presence of evidence in diagnosis codes, primary notes, and specific medications. Twenty-five patients per disease category (a total of 175 patients for each disease, 1750 patients for all ten diseases) were randomly selected for manual chart review. Review results were used to estimate the positive predictive value (PPV), sensitivity, and F-score for each EHR component alone and in combination. Results The PPVs of single components were inconsistent and inadequate for accurate phenotyping (0.06–0.71). Using two or more ICD codes improved the average PPV to 0.84. We observed a more stable and higher accuracy when using at least two components (mean ± standard deviation: 0.91 ± 0.08). Primary notes offered the best sensitivity (0.77). The sensitivity of ICD codes was 0.67. Again, two or more components provided a reasonably high and stable sensitivity (0.59 ± 0.16). Overall, the best performance (F-score: 0.70 ± 0.12) was achieved by using two or more components. Although the overall F-score of ICD codes alone (0.67 ± 0.14) was only slightly lower than that of two or more components, their PPV (0.71 ± 0.13) was substantially worse than the PPV of two or more components (0.91 ± 0.08). Conclusion Multiple EHR components provide a more consistent and higher performance than a single one for the selected phenotypes. We suggest considering multiple EHR components for future phenotyping design in order to obtain an ideal result.
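
The three metrics named in this abstract can be computed from chart-review tallies roughly as follows. This is a minimal sketch, not the authors' code: the `ReviewCounts` structure and the example counts are hypothetical, and it ignores the weighting that the study's stratified sampling (25 patients per category) would presumably require.

```python
from dataclasses import dataclass


@dataclass
class ReviewCounts:
    """Chart-review tallies for one phenotyping rule (hypothetical structure)."""
    true_positives: int   # flagged by the rule and confirmed on chart review
    false_positives: int  # flagged by the rule but not confirmed
    false_negatives: int  # confirmed cases that the rule missed


def ppv(c: ReviewCounts) -> float:
    """Positive predictive value: confirmed cases among all flagged patients."""
    return c.true_positives / (c.true_positives + c.false_positives)


def sensitivity(c: ReviewCounts) -> float:
    """Sensitivity: flagged cases among all confirmed cases."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def f_score(c: ReviewCounts) -> float:
    """F-score: harmonic mean of PPV and sensitivity."""
    p, s = ppv(c), sensitivity(c)
    return 2 * p * s / (p + s)


# Made-up counts, for illustration only (not data from the study).
example = ReviewCounts(true_positives=80, false_positives=20, false_negatives=40)
print(f"PPV={ppv(example):.2f} sensitivity={sensitivity(example):.2f} "
      f"F={f_score(example):.2f}")
```

Because the F-score is a harmonic mean, a rule with high sensitivity but low PPV (or the reverse) scores lower than one that balances the two, which is why a combination of components can outperform a single component overall.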

2016 ◽  
Vol 37 (10) ◽  
pp. 1247-1250 ◽  
Author(s):  
Kristen A. Feemster ◽  
Folasade M. Odeniyi ◽  
Russell Localio ◽  
Robert W. Grundmeier ◽  
Susan E. Coffin ◽  
...  

Compared to chart review, a definition based on the International Classification of Disease, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis code for healthcare-associated influenza-like illness (HA-ILI) among young children in a large pediatric network demonstrated high positive and negative predictive values. This finding suggests that electronic health record–based definitions for surveillance can accurately identify medically attended outpatient HA-ILI cases for research and surveillance. Infect Control Hosp Epidemiol 2016;1–4


Rheumatology ◽  
2019 ◽  
Vol 59 (5) ◽  
pp. 1059-1065 ◽  
Author(s):  
Sizheng Steven Zhao ◽  
Chuan Hong ◽  
Tianrun Cai ◽  
Chang Xu ◽  
Jie Huang ◽  
...  

Abstract Objectives To develop classification algorithms that accurately identify axial SpA (axSpA) patients in electronic health records, and compare the performance of algorithms incorporating free-text data against approaches using only International Classification of Diseases (ICD) codes. Methods An enriched cohort of 7853 eligible patients was created from electronic health records of two large hospitals using automated searches (⩾1 ICD codes combined with simple text searches). Key disease concepts from free-text data were extracted using NLP and combined with ICD codes to develop algorithms. We created both supervised regression-based algorithms—on a training set of 127 axSpA cases and 423 non-cases—and unsupervised algorithms to identify patients with high probability of having axSpA from the enriched cohort. Their performance was compared against classifications using ICD codes only. Results NLP extracted four disease concepts of high predictive value: ankylosing spondylitis, sacroiliitis, HLA-B27 and spondylitis. The unsupervised algorithm, incorporating both the NLP concept and ICD code for AS, identified the greatest number of patients. By setting the probability threshold to attain 80% positive predictive value, it identified 1509 axSpA patients (mean age 53 years, 71% male). Sensitivity was 0.78, specificity 0.94 and area under the curve 0.93. The two supervised algorithms performed similarly but identified fewer patients. All three outperformed traditional approaches using ICD codes alone (area under the curve 0.80–0.87). Conclusion Algorithms incorporating free-text data can accurately identify axSpA patients in electronic health records. Large cohorts identified using these novel methods offer exciting opportunities for future clinical research.
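
Setting a probability threshold to reach a target positive predictive value, as described in this abstract, can be sketched as below. This is an illustrative assumption about how such a cutoff might be chosen on a labelled validation set, not the authors' procedure; the function name and inputs are hypothetical.

```python
from typing import Optional, Sequence


def lowest_threshold_for_ppv(
    scores: Sequence[float],
    labels: Sequence[int],
    target_ppv: float = 0.80,
) -> Optional[float]:
    """Return the lowest score cutoff whose PPV meets the target, or None.

    scores: predicted probabilities from a classifier (hypothetical input).
    labels: 1 for a chart-confirmed case, 0 for a non-case.
    Patients with score >= cutoff are classified as cases.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best = None
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # Evaluate only at the end of a run of tied scores, so that every
        # patient sharing this score is counted at this cutoff.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie and true_pos / (i + 1) >= target_ppv:
            best = score  # later (lower) qualifying cutoffs overwrite this
    return best
```

PPV does not always fall smoothly as the cutoff is lowered, so the sketch scans every candidate cutoff. Taking the lowest qualifying one identifies the most patients while still meeting the target.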


2020 ◽  
Vol 7 (6) ◽  
Author(s):  
Takaaki Kobayashi ◽  
Brice Beck ◽  
Aaron Miller ◽  
Philip Polgreen ◽  
Amy M J O’Shea ◽  
...  

Abstract Background Prior studies have used International Classification of Disease (ICD) diagnosis codes in administrative data to identify patients with infective endocarditis (IE) associated with intravenous drug use (IVDU). Little is known about the accuracy of ICD codes for IVDU-IE. Methods We used 2 previously described algorithms to identify patients with potential IVDU-IE admitted to 125 Veterans Administration hospitals from January 2010 through December 2018. Algorithm A identified patients with concurrent ICD-9/10 codes for IE and drug use during the same admission. Algorithm B identified patients with drug use coded either during the IE admission or during outpatient or other visits within 6 months of admission. We reviewed 400 randomly selected patient charts to determine the positive predictive value (PPV) of each algorithm for clinical documentation of IE, any drug use, IVDU, and IVDU-IE, respectively. Results Algorithm A identified 788 patients, and B identified 1314 patients, a 68% increase. PPVs were high for clinical documentation of diagnoses of IE (86.5% for A and 82.6% for B) and any drug use (99.0% and 96.3%). PPVs were lower for documented IVDU (74.5% and 64.1%) and combined diagnoses of IVDU-IE (65.0% and 55.2%), partly because of a lack of ICD codes specific to IVDU. Among patients identified by algorithm B but not A, 72% had clinical documentation of drug use during the IE admission, indicating a failure of algorithm A to capture cases due to incomplete recording of inpatient ICD codes for drug use. Conclusions There is need for improved algorithms for IVDU-IE surveillance during the ongoing opioid epidemic.
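
The difference between the two case-finding algorithms can be sketched as follows. The `Encounter` record and its fields are hypothetical, and "within 6 months" is assumed here to mean 183 days on either side of the admission date; the abstract does not specify either detail.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Set


@dataclass(frozen=True)
class Encounter:
    """One coded visit (hypothetical record layout)."""
    patient_id: str
    when: date
    is_admission: bool
    has_ie_code: bool         # ICD-9/10 code for infective endocarditis
    has_drug_use_code: bool   # ICD-9/10 code for drug use


WINDOW = timedelta(days=183)  # assumed reading of "within 6 months"


def algorithm_a(encounters: Iterable[Encounter]) -> Set[str]:
    """Patients with IE and drug-use codes on the same admission."""
    return {
        e.patient_id
        for e in encounters
        if e.is_admission and e.has_ie_code and e.has_drug_use_code
    }


def algorithm_b(encounters: Iterable[Encounter]) -> Set[str]:
    """Patients with an IE admission and a drug-use code on that admission
    or on any other visit within the assumed window."""
    visits = list(encounters)
    flagged = set()
    for adm in visits:
        if not (adm.is_admission and adm.has_ie_code):
            continue
        for e in visits:
            if (e.patient_id == adm.patient_id and e.has_drug_use_code
                    and abs(e.when - adm.when) <= WINDOW):
                flagged.add(adm.patient_id)
                break
    return flagged
```

Under these assumptions every patient found by algorithm A is also found by algorithm B, which is consistent with B identifying the larger cohort.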


1997 ◽  
Vol 119 (3) ◽  
pp. 335-341 ◽  
Author(s):  
A. J. COLQUHOUN ◽  
K. G. NICHOLSON ◽  
J. L. BOTHA ◽  
N. T. RAYMOND

The effectiveness of influenza vaccination in reducing hospitalization of people with diabetes for influenza, pneumonia, or diabetic events during influenza epidemics was assessed in a case-control study in Leicestershire, England. Cases were 80 patients on the Leicestershire Diabetes Register who were admitted and discharged from hospital with International Classification of Disease codes for pneumonia, bronchitis, influenza, diabetic ketoacidosis, coma and diabetes, without mention of complications, during the influenza epidemics of 1989–90 and 1993. One hundred and sixty controls, who were not admitted to hospital during this period, were randomly selected from the Register. Immunization against influenza was assessed in 37 cases and 77 controls for whom consent was obtained to access their clinical notes and for whom notes were available. A significant association was detected between reduction in hospitalization and influenza vaccination during the period immediately preceding an epidemic. Multiple logistic regression analysis estimated that influenza vaccination reduced hospital admissions by 79% (95% CI 19–95%) during the two epidemics, after adjustment for potential confounders.


2019 ◽  
pp. 160-163
Author(s):  
Anusha G Bhat ◽  
Kevin White ◽  
Kyle Gobeil ◽  
Tara Lagu ◽  
Peter K Lindenauer ◽  
...  

Prior studies of stress cardiomyopathy (SCM) have used International Classification of Diseases (ICD) codes to identify patients in administrative databases without evaluating the validity of these codes. Between 2010 and 2016, we identified 592 patients discharged with a first known principal or secondary ICD code for SCM in our medical system. On chart review, 580 charts had a diagnosis of SCM (positive predictive value 98%; 95% CI: 96.4-98.8), although 38 (6.4%) did not have active clinical manifestations of SCM during the hospitalization. Moreover, only 66.8% underwent cardiac catheterization and 91.5% underwent echocardiography. These findings suggest that, although all but a few hospitalized patients with an ICD code for SCM had a diagnosis of SCM, some of these were chronic cases, and numerous patients with a new diagnosis of SCM did not undergo a complete diagnostic workup. Researchers should be mindful of these limitations in future studies involving administrative databases.
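
A positive predictive value with a 95% confidence interval, like the one reported here, can be computed along these lines. The abstract does not say which interval method was used; the Wilson score interval below is an assumption chosen for illustration, so its bounds may differ slightly from the published ones.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: 580 confirmed diagnoses among 592 coded patients.
confirmed, coded = 580, 592
low, high = wilson_interval(confirmed, coded)
print(f"PPV = {confirmed / coded:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

The Wilson interval is used here because it stays within 0 to 1 and behaves sensibly when the proportion is close to 1, as it is for this code.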


2020 ◽  
Vol 4 (Supplement_1) ◽  
Author(s):  
Lina Sulieman ◽  
Jing He ◽  
Robert Carroll ◽  
Lisa Bastarache ◽  
Andrea Ramirez

Abstract Electronic Health Records (EHR) contain rich data to identify and study diabetes. Many phenotype algorithms have been developed to identify research subjects with type 2 diabetes (T2D), but very few accurately identify type 1 diabetes (T1D) cases or more rare forms of monogenic and atypical metabolic presentations. Polygenic risk scores (PRS) use common genomic variants to quantify disease risk well for both T1D and T2D. In this study, we apply validated phenotyping algorithms to EHRs linked to a genomic biobank to understand the independent contribution of PRS to classification of diabetes etiology and generate additional novel markers to distinguish subtypes of diabetes in EHR data. Using a de-identified mirror of the medical center’s electronic health record, we applied published algorithms for T1D and T2D to identify cases, and used natural language processing and chart review strategies to identify cases of maturity onset diabetes of the young (MODY) and other more rare presentations. This novel approach included additional data types such as medication sequencing, ratio and temporality of insulin and non-insulin agents, clinical genetic testing, and ratios of diagnostic codes. Chart review was performed to validate etiology. To calculate PRS, we used genome-wide genotyping from our BioBank, the de-identified biobank linking EHR to genomic data, with coefficients of 65 published T1D SNPs and 76,996 T2D SNPs, computed using PLINK in Caucasian subjects. In the dataset, we identified 82,238 cases of T2D but only 130 cases of T1D using the most cited published algorithms. Adding novel structured elements and natural language processing identified an additional 138 cases of T1D and distinguished 354 cases as MODY. Among over 90,000 subjects with genotyping data available, we included 72,624 Caucasian subjects since PRS coefficients were generated in Caucasian cohorts. Among those subjects, 248, 6,488, and 21 subjects were identified as T1D, T2D, and MODY subjects respectively in our final PRS cohort. The T1D PRS discriminated significantly between cases and controls (Mann-Whitney p-value 3.4e-17). The PRS for T2D did not significantly discriminate between cases and controls using published algorithms. The atypical case count was too low to calculate PRS discrimination. Calculation of the PRS score was limited by quality inclusion of variants available, and discrimination may improve in larger data sets. Additionally, blinded physician case review is ongoing to validate the novel classification scheme and provide a gold standard for machine learning approaches that can be applied in validation sets.


2018 ◽  
Author(s):  
Lili Chan ◽  
Kelly Beers ◽  
Kinsuk Chauhan ◽  
Neha Debnath ◽  
Aparna Saha ◽  
...  

Abstract Background Identification of symptoms is challenging with surveys, which are time-intensive and low-throughput. Natural language processing (NLP) could be utilized to identify symptoms from narrative documentation in the electronic health record (EHR). Methods We utilized NLP to parse notes for maintenance hemodialysis (HD) patients from two EHR databases (BioMe and MIMIC-III) to identify fatigue, nausea/vomiting, anxiety, depression, cramping, itching, and pain. We compared NLP performance with International Classification of Diseases (ICD) codes and validated the performance of both NLP and codes against manual chart review in a representative subset. Results We identified 1034 and 929 HD patients from BioMe and MIMIC-III respectively. The most frequently identified symptoms by NLP from both cohorts were fatigue, pain, and nausea and/or vomiting. NLP was significantly more sensitive than ICD codes for nearly all symptoms. In the BioMe dataset, sensitivity for NLP ranged from 0.85-0.99 vs. 0.09-0.59 for ICD codes. In the MIMIC-III dataset, NLP sensitivity was 0.8-0.98 vs. 0.02-0.53 for ICD. ICD codes were significantly more specific for nausea and/or vomiting (NLP 0.57 vs. ICD 0.97, P=0.03) in BioMe and for depression (NLP 0.67 vs. ICD 0.99, P=0.002) in MIMIC-III. A majority of patients in both cohorts had ≥4 symptoms. The more encounters available for a patient, the more likely NLP was to identify a symptom. Conclusions NLP outperformed ICD codes for identification of symptoms on several test parameters, including sensitivity for a majority of symptoms. NLP may be useful for the high-throughput identification of patient-centered outcomes from EHR. Significance Statement Patients on maintenance hemodialysis experience a high frequency of symptoms. However, symptoms have been measured utilizing time-intensive surveys. This paper compares natural language processing (NLP) to administrative codes for the identification of seven key symptoms from two cohorts with electronic health records and validation through manual chart review. NLP identified high rates of symptoms; the most common were fatigue, pain, and nausea and/or vomiting. A majority of patients had ≥4 symptoms. NLP was significantly more sensitive at identifying symptoms compared to administrative codes for nearly all symptoms but specificity was not significantly different compared to codes. This paper demonstrates utility of a high-throughput method of identifying symptoms from EHR which may advance the field of patient-centered research in nephrology.


1997 ◽  
Vol 12 (1) ◽  
pp. 8-10 ◽  
Author(s):  
AHT Pang ◽  
GS Ungvari ◽  
CK Wong ◽  
T Leung

Summary In an attempt to assess the universal applicability of the International Classification of Disease (ICD-10), two psychiatrists from different socio-cultural backgrounds and training independently performed a chart review of 238 Chinese patients. Inter-rater reliability figures were comparable to those found in the WHO-coordinated ICD-10 field trials. The results suggest that ICD-10 has good ‘universality’ in routine clinical practice.


2019 ◽  
Vol 6 (Supplement_2) ◽  
pp. S117-S118
Author(s):  
Michael Haden ◽  
Mohammad Mahdee Sobhanie ◽  
Courtney Hebert ◽  
Clara Castillejo Becerra ◽  
Abigail N Turner

Abstract Background Opioid dependence and overdose are at epidemic levels in the United States. Ohio has the third highest rate of opioid-related overdose deaths. Infectious complications of intravenous drug use (IDU) include increased acquisition of hepatitis C, HIV and infective endocarditis. In this study, we aimed to characterize cases of infective endocarditis admitted to our healthcare system over a five-year period. We additionally sought to determine the validity of using ICD codes to identify infective endocarditis cases and IDU. Methods Patients with ICD-9 or ICD-10 discharge diagnosis codes for infective endocarditis were identified from our institution’s electronic health record. ICD codes pertaining to substance abuse were used to classify patients according to IDU status. Readmissions during the same episode of infective endocarditis were excluded. We compared chart review to ICD code for the identification of infective endocarditis and IDU in a random sample of 296 of 1590 cases. Results Of 296 charts reviewed, 133 (44.9%) were excluded because they did not meet criteria for definite infective endocarditis by the modified Duke criteria or because the episode was a readmission. A total of 163 (55.1%) cases met inclusion criteria, all of which were seen in consultation by the inpatient Infectious Disease service. Of these, 52 (31.9%) had ICD-9 or ICD-10 codes linked to substance abuse. Following manual chart review, we established that in fact 86 of these 163 cases (52.8%) had evidence of substance abuse. Conclusion Misclassification due to use of ICD codes is a well-established challenge to epidemiological research. However, the extent of misclassification in this analysis was greater than expected. If prior research on IDU and infective endocarditis has relied on medical record data alone without verification through manual chart review, the observed epidemiological trends may not be accurate. Disclosures All authors: No reported disclosures.


10.2196/17784 ◽  
2020 ◽  
Vol 8 (7) ◽  
pp. e17784 ◽  
Author(s):  
Jihad S Obeid ◽  
Jennifer Dahne ◽  
Sean Christensen ◽  
Samuel Howard ◽  
Tami Crawford ◽  
...  

Background Suicide is an important public health concern in the United States and around the world. There has been significant work examining machine learning approaches to identify and predict intentional self-harm and suicide using existing data sets. With recent advances in computing, deep learning applications in health care are gaining momentum. Objective This study aimed to leverage the information in clinical notes using deep neural networks (DNNs) to (1) improve the identification of patients treated for intentional self-harm and (2) predict future self-harm events. Methods We extracted clinical text notes from electronic health records (EHRs) of 835 patients with International Classification of Diseases (ICD) codes for intentional self-harm and 1670 matched controls who never had any intentional self-harm ICD codes. The data were divided into training and holdout test sets. We tested a number of algorithms on clinical notes associated with the intentional self-harm codes using the training set, including several traditional bag-of-words–based models and 2 DNN models: a convolutional neural network (CNN) and a long short-term memory model. We also evaluated the predictive performance of the DNNs on a subset of patients who had clinical notes 1 to 6 months before the first intentional self-harm event. Finally, we evaluated the impact of a pretrained model using Word2vec (W2V) on performance. Results The area under the receiver operating characteristic curve (AUC) for the CNN on the phenotyping task, that is, the detection of intentional self-harm in clinical notes concurrent with the events was 0.999, with an F1 score of 0.985. In the predictive task, the CNN achieved the highest performance with an AUC of 0.882 and an F1 score of 0.769. Although pretraining with W2V shortened the DNN training time, it did not improve performance. 
Conclusions The strong performance on the first task, namely, phenotyping based on clinical notes, suggests that such models could be used effectively for surveillance of intentional self-harm in clinical text in an EHR. The modest performance on the predictive task notwithstanding, the results using DNN models on clinical text alone are competitive with other reports in the literature using risk factors from structured EHR data.

