Using Machine Learning and the Electronic Health Record to Predict Complicated Clostridium difficile Infection

2019 ◽  
Vol 6 (5) ◽  
Author(s):  
Benjamin Y Li ◽  
Jeeheh Oh ◽  
Vincent B Young ◽  
Krishna Rao ◽  
Jenna Wiens

Abstract Background Clostridium (Clostridioides) difficile infection (CDI) is a health care–associated infection that can lead to serious complications. Potential complications include intensive care unit (ICU) admission, development of toxic megacolon, need for colectomy, and death. However, identifying the patients most likely to develop complicated CDI is challenging. To this end, we explored the utility of a machine learning (ML) approach for patient risk stratification for complications using electronic health record (EHR) data. Methods We considered adult patients diagnosed with CDI between October 2010 and January 2013 at the University of Michigan hospitals. Cases were labeled complicated if the infection resulted in ICU admission, colectomy, or 30-day mortality. Leveraging EHR data, we trained a model to predict subsequent complications on each of the 3 days after diagnosis. We compared our EHR-based model to one based on a small set of manually curated features. We evaluated model performance using a held-out data set in terms of the area under the receiver operating characteristic curve (AUROC). Results Of 1118 cases of CDI, 8% became complicated. On the day of diagnosis, the model achieved an AUROC of 0.69 (95% confidence interval [CI], 0.55–0.83). Using data extracted 2 days after CDI diagnosis, performance increased (AUROC, 0.90; 95% CI, 0.83–0.95), outperforming a model based on a curated set of features (AUROC, 0.84; 95% CI, 0.75–0.91). Conclusions Using EHR data, we can accurately stratify CDI cases according to their risk of developing complications. Such an approach could be used to guide future clinical studies investigating interventions that could prevent or mitigate complicated CDI.
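As an illustration of the kind of workflow this abstract describes, the sketch below trains a classifier on a synthetic stand-in for day-of-diagnosis EHR features and evaluates it by AUROC with a bootstrapped confidence interval on a held-out set. All names, data, and model choices here are placeholders, not the authors' code or feature set.

```python
# Minimal sketch (not the authors' pipeline): risk stratification of CDI cases
# from EHR-derived features, evaluated by AUROC on a held-out set.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# X: one row per CDI case with EHR features extracted on a given day after
# diagnosis; y: 1 if the case became complicated (ICU admission, colectomy,
# or 30-day mortality), else 0. Both are synthetic placeholders here.
rng = np.random.default_rng(0)
X = rng.normal(size=(1118, 50))
y = rng.binomial(1, 0.08, size=1118)          # ~8% complicated, as in the cohort

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
auroc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Bootstrap the held-out set to obtain a 95% confidence interval for the AUROC.
boot = []
for _ in range(1000):
    idx = rng.integers(0, len(y_test), len(y_test))
    if len(np.unique(y_test[idx])) < 2:
        continue
    boot.append(roc_auc_score(y_test[idx], model.predict_proba(X_test[idx])[:, 1]))
print(f"AUROC {auroc:.2f} (95% CI {np.percentile(boot, 2.5):.2f}-{np.percentile(boot, 97.5):.2f})")
```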

Author(s):  
Jeffrey G Klann ◽  
Griffin M Weber ◽  
Hossein Estiri ◽  
Bertrand Moal ◽  
Paul Avillach ◽  
...  

Abstract Introduction The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing COVID-19 with federated analyses of electronic health record (EHR) data. Objective We sought to develop and validate a computable phenotype for COVID-19 severity. Methods Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of six code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also piloted an alternative machine-learning approach and compared selected predictors of severity to the 4CE phenotype at one site. Results The full 4CE severity phenotype had a pooled sensitivity of 0.73 and a specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity varied widely, by up to 0.65 across sites. At one pilot site, the expert-derived phenotype had a mean AUC of 0.903 (95% CI: 0.886, 0.921), compared with an AUC of 0.956 (95% CI: 0.952, 0.959) for the machine-learning approach. Billing codes were poor proxies of ICU admission, with precision and recall as low as 49% compared with chart review. Discussion We developed a severity phenotype using six code classes that proved resilient to coding variability across international institutions. In contrast, machine-learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly due to heterogeneous pandemic conditions. Conclusion We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
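A rule-based phenotype of this kind reduces to checking code-class membership per patient and tabulating sensitivity and specificity against the ICU/death outcome. The sketch below illustrates that logic with invented code-class names; the real 4CE value sets are not reproduced here.

```python
# Minimal sketch: evaluating a rule-based severity phenotype (patient flagged
# severe if any qualifying code class is present) against ICU admission and/or
# death. Code classes and data are invented stand-ins, not the 4CE value sets.
import pandas as pd

SEVERITY_CODE_CLASSES = {            # hypothetical stand-ins for the six classes
    "icu_admission_code", "ventilation_procedure", "vasopressor_medication",
    "severe_lab_flag", "intubation_procedure", "ards_diagnosis",
}

def is_severe(code_classes_for_patient: set) -> bool:
    """Phenotype fires if the patient has any severity code class."""
    return bool(code_classes_for_patient & SEVERITY_CODE_CLASSES)

def sensitivity_specificity(df: pd.DataFrame):
    """df needs columns 'code_classes' (a set per patient) and 'icu_or_death' (0/1)."""
    pred = df["code_classes"].apply(is_severe)
    truth = df["icu_or_death"].astype(bool)
    tp = (pred & truth).sum()
    tn = (~pred & ~truth).sum()
    fp = (pred & ~truth).sum()
    fn = (~pred & truth).sum()
    return tp / (tp + fn), tn / (tn + fp)

# Tiny usage example on two synthetic patients.
example = pd.DataFrame({
    "code_classes": [{"ventilation_procedure"}, {"cough_diagnosis"}],
    "icu_or_death": [1, 0],
})
sens, spec = sensitivity_specificity(example)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```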


Author(s):  
Chunyang Li ◽  
Vikas Patil ◽  
Kelli M. Rasmussen ◽  
Christina Yong ◽  
Hsu-Chih Chien ◽  
...  

The most accurate prognostic approach for follicular lymphoma (FL), progression of disease at 24 months (POD24), requires two years’ observation after initiating first-line therapy (L1) to predict outcomes. We applied machine learning to structured electronic health record (EHR) data to predict individual survival at L1 initiation. We grouped 523 observations and 1933 variables from a nationwide cohort of FL patients diagnosed 2006–2014 in the Veterans Health Administration into sets of traditionally used prognostic variables (“curated”), commonly measured labs (“labs”), and International Classification of Diseases diagnostic codes (“ICD”). We compared the performance of random survival forests (RSF) versus a traditional Cox model using four datasets: curated, curated + labs, curated + ICD, and curated + ICD + labs, and also applied Cox to curated + POD24. We evaluated variable importance and partial dependence plots along with the area under the receiver operating characteristic curve (AUC). RSF with curated + labs performed best, with a mean AUC of 0.73 (95% CI: 0.71–0.75). It approximated, but did not surpass, Cox with POD24 (mean AUC 0.74 [95% CI: 0.71–0.77]). RSF using EHR data achieved better performance than traditional prognostic variables alone, setting the foundation for the incorporation of our algorithm into the EHR. It also points to possible future scenarios in which clinicians could be provided with an EHR-based tool that approximates the predictive ability of the most accurate known indicator, using information available 24 months earlier.
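The comparison described above, random survival forests against Cox regression on grouped EHR feature sets, can be sketched as follows. This assumes the scikit-survival package and entirely synthetic data, and it reports Harrell's concordance index rather than the time-dependent AUC used in the study; it is not the authors' pipeline.

```python
# Minimal sketch: random survival forest vs. Cox regression on a synthetic
# stand-in for a "curated + labs" feature matrix (requires scikit-survival).
import numpy as np
from sklearn.model_selection import train_test_split
from sksurv.ensemble import RandomSurvivalForest
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.util import Surv

rng = np.random.default_rng(0)
X = rng.normal(size=(523, 40))                # placeholder feature matrix
time = rng.exponential(scale=60.0, size=523)  # months of follow-up (synthetic)
event = rng.random(523) < 0.4                 # event indicator (synthetic)
y = Surv.from_arrays(event=event, time=time)  # structured survival outcome

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

rsf = RandomSurvivalForest(n_estimators=200, min_samples_leaf=10, random_state=0).fit(X_tr, y_tr)
cox = CoxPHSurvivalAnalysis(alpha=0.01).fit(X_tr, y_tr)

for name, model in [("RSF", rsf), ("Cox", cox)]:
    # .score returns Harrell's concordance index on the held-out data.
    print(name, "concordance index:", round(model.score(X_te, y_te), 3))
```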


2020 ◽  
Author(s):  
Sujeong Hur ◽  
Ryoung-Eun Ko ◽  
Junsang Yoo ◽  
Juhyung Ha ◽  
Won Chul Cha ◽  
...  

BACKGROUND Delirium occurs frequently among patients admitted to the intensive care unit (ICU). There is only limited evidence to support interventions to treat or resolve delirium in patients who have already developed it. Therefore, the early recognition and prevention of delirium is important in the management of critically ill patients. OBJECTIVE This study aimed to develop and validate the PRIDE (PRediction of ICU DElirium) model with machine learning, using electronic health record data, for delirium prediction within 24 hours of ICU admission. METHODS This is a retrospective cohort study performed at a tertiary referral hospital with 120 ICU beds. Machine learning-based PRIDE models were developed using patient data from the first 2 years of the study period and validated using patient data from the last 6 months. eXtreme Gradient Boosting (XGBoost), random forest (RF), deep neural network (DNN), and logistic regression (LR) were used. The PRIDE model was externally validated using MIMIC-III data. RESULTS We only included patients who were 18 years or older at the time of admission and who stayed in the medical or surgical ICU. A total of 37,543 cases were collected. After patient exclusion, 12,409 remained as our study population, of whom 3,816 (30.8%) experienced delirium incidents during the study period. In the MIMIC-III dataset, after applying the exclusion criteria, 2,061 of the 96,016 ICU admission cases were included, and 272 (13.2%) delirium incidents occurred. In the internal validation, the area under the receiver operating characteristic curve (AUROC) for XGBoost, RF, DNN, and LR was 0.919 (95% CI 0.919–0.919), 0.916 (95% CI 0.916–0.916), 0.881 (95% CI 0.878–0.884), and 0.875 (95% CI 0.875–0.875), respectively. In the external validation, the AUROC was 0.721 (95% CI 0.72–0.721), 0.697 (95% CI 0.695–0.699), 0.655 (95% CI 0.654–0.657), and 0.631 (95% CI 0.631–0.631) for RF, XGBoost, DNN, and LR, respectively. The Brier score of the XGBoost model was 0.094, indicating that it is well calibrated. CONCLUSIONS A machine learning approach based on electronic health record data can be used to predict delirium within 24 hours of ICU admission. CLINICALTRIAL N/A
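The model family and calibration metric named above can be illustrated with a minimal sketch on synthetic data, assuming the xgboost and scikit-learn packages; this is not the PRIDE implementation.

```python
# Minimal sketch: gradient-boosted prediction of ICU delirium within 24 hours
# of admission, reporting AUROC and the Brier score for calibration.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)
X = rng.normal(size=(12409, 60))        # placeholder EHR features from the first ICU day
y = rng.binomial(1, 0.308, size=12409)  # ~30.8% delirium incidence, as in the cohort

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

model = XGBClassifier(
    n_estimators=300, max_depth=4, learning_rate=0.05, random_state=0,
).fit(X_tr, y_tr)

proba = model.predict_proba(X_te)[:, 1]
print("AUROC:", roc_auc_score(y_te, proba))
print("Brier score:", brier_score_loss(y_te, proba))  # lower means better calibrated
```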


10.2196/23930 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e23930
Author(s):  
Tjardo D Maarseveen ◽  
Timo Meinderink ◽  
Marcel J T Reinders ◽  
Johannes Knitza ◽  
Tom W J Huizinga ◽  
...  

Background Financial codes are often used to extract diagnoses from electronic health records, but this approach is prone to false positives. Alternatively, queries are constructed, but these are highly center- and language-specific. A tantalizing alternative is the automatic identification of patients by applying machine learning to format-free text entries. Objective The aim of this study was to develop an easily implementable workflow that builds a machine learning algorithm capable of accurately identifying patients with rheumatoid arthritis from format-free text fields in electronic health records. Methods Two electronic health record data sets were employed: Leiden (n=3000) and Erlangen (n=4771). Using a portion of the Leiden data (n=2000), we compared 6 different machine learning methods and a naïve word-matching algorithm using 10-fold cross-validation. Performance was compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC), and the F1 score was used as the primary criterion for selecting the best method to build a classifying algorithm. We selected the optimal threshold of positive predictive value for case identification based on the output of the best method in the training data. This validation workflow was subsequently applied to a portion of the Erlangen data (n=4293). For testing, the best-performing methods were applied to the remaining data (Leiden n=1000; Erlangen n=478) for an unbiased evaluation. Results For the Leiden data set, the word-matching algorithm demonstrated mixed performance (AUROC 0.90; AUPRC 0.33; F1 score 0.55), and 4 methods significantly outperformed word matching, with support vector machines performing best (AUROC 0.98; AUPRC 0.88; F1 score 0.83). Applying this support vector machine classifier to the test data resulted in similarly high performance (F1 score 0.81; positive predictive value [PPV] 0.94), and with this method we could identify 2873 patients with rheumatoid arthritis in less than 7 seconds out of the complete collection of 23,300 patients in the Leiden electronic health record system. For the Erlangen data set, gradient boosting performed best in the training set (AUROC 0.94; AUPRC 0.85; F1 score 0.82) and, applied to the test data, once again yielded good results (F1 score 0.67; PPV 0.97). Conclusions We demonstrate that machine learning methods can extract the records of patients with rheumatoid arthritis from electronic health record data with high precision, enabling research on very large populations at limited cost. Our approach is language and center independent and could be applied to any type of diagnosis. We have developed our pipeline into a universally applicable and easy-to-implement workflow to equip centers with their own high-performing algorithm. This allows the creation of observational studies of unprecedented size, covering different countries, at low cost from data already available in electronic health record systems.
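For a sense of how such a free-text classifier can be assembled, the sketch below runs a TF-IDF plus linear SVM pipeline through 10-fold cross-validation and reports AUROC, AUPRC, and F1. The notes and labels are invented, and this particular pipeline is an assumption for illustration rather than the published workflow.

```python
# Minimal sketch: flagging rheumatoid arthritis from free-text EHR entries with
# a TF-IDF + linear SVM pipeline, compared by 10-fold cross-validation.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_validate, StratifiedKFold

notes = [
    "seropositive rheumatoid arthritis, started methotrexate",
    "osteoarthritis of the knee, no inflammatory signs",
] * 50                                    # placeholder corpus
labels = [1, 0] * 50                      # 1 = rheumatoid arthritis case

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC(C=1.0))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(
    clf, notes, labels, cv=cv,
    scoring={"auroc": "roc_auc", "auprc": "average_precision", "f1": "f1"},
)
for metric in ("auroc", "auprc", "f1"):
    print(metric, scores[f"test_{metric}"].mean())
```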


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 1511-1511
Author(s):  
Dylan J. Peterson ◽  
Nicolai P. Ostberg ◽  
Douglas W. Blayney ◽  
James D. Brooks ◽  
Tina Hernandez-Boussard

1511 Background: Acute care use is one of the largest drivers of cancer care costs. OP-35: Admissions and Emergency Department Visits for Patients Receiving Outpatient Chemotherapy is a CMS quality measure that will affect reimbursement based on unplanned inpatient admissions (IP) and emergency department (ED) visits. Targeted measures can reduce preventable acute care use, but identifying which patients might benefit remains challenging. Prior predictive models have made use of only a limited subset of the data available in the Electronic Health Record (EHR). We hypothesized that dense, structured EHR data could be used to train machine learning algorithms to predict the risk of preventable ED and IP visits. Methods: Patients treated at Stanford Health Care and affiliated community care sites between 2013 and 2015 who met inclusion criteria for OP-35 were selected from our EHR. Preventable ED or IP visits were identified using OP-35 criteria. Demographic, diagnosis, procedure, medication, laboratory, vital sign, and healthcare utilization data generated prior to chemotherapy treatment were obtained. A random split of 80% of the cohort was used to train a logistic regression model with least absolute shrinkage and selection operator (LASSO) regularization to predict the risk of acute care events within the first 180 days of chemotherapy. The remaining 20% were used to measure model performance by the area under the receiver operating characteristic curve (AUROC). Results: 8,439 patients were included, of whom 35% had one or more preventable events within 180 days of starting chemotherapy. Our LASSO model classified patients at risk for preventable ED or IP visits with an AUROC of 0.783 (95% CI: 0.761-0.806). Model performance was better for identifying risk for IP visits than ED visits. LASSO selected 125 of 760 possible features to use when classifying patients. These included prior acute care visits, cancer stage, race, laboratory values, and a diagnosis of depression. Conclusions: Machine learning models trained on a large number of routinely collected clinical variables can identify patients at risk for acute care events with promising accuracy. These models have the potential to improve cancer care outcomes, patient experience, and costs by allowing for targeted preventative interventions. Future work will include prospective and external validation in other healthcare systems.
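The modeling step described in the Methods can be sketched as an L1-regularized logistic regression with an 80/20 split and AUROC evaluation. Everything below is synthetic and illustrative, not the Stanford model or its feature set.

```python
# Minimal sketch: LASSO (L1-regularized) logistic regression for predicting
# preventable ED/inpatient visits within 180 days of chemotherapy start.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(8439, 760))        # 760 candidate EHR features (synthetic)
y = rng.binomial(1, 0.35, size=8439)    # ~35% had a preventable event

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.05, max_iter=2000)
lasso.fit(X_tr, y_tr)

n_selected = int(np.sum(lasso.coef_ != 0))   # features kept by the L1 penalty
auroc = roc_auc_score(y_te, lasso.predict_proba(X_te)[:, 1])
print(f"{n_selected} features selected; AUROC {auroc:.3f}")
```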


Author(s):  
Emily Kogan ◽  
Kathryn Twyman ◽  
Jesse Heap ◽  
Dejan Milentijevic ◽  
Jennifer H. Lin ◽  
...  

Abstract Background Stroke severity is an important predictor of patient outcomes and is commonly measured with the National Institutes of Health Stroke Scale (NIHSS). Because these scores are often recorded as free text in physician reports, structured real-world evidence databases seldom include the severity. The aim of this study was to use machine learning models to impute NIHSS scores for all patients with newly diagnosed stroke from multi-institution electronic health record (EHR) data. Methods NIHSS scores available in the Optum© de-identified Integrated Claims-Clinical dataset were extracted from physician notes by applying natural language processing (NLP) methods. The cohort analyzed in the study consisted of 7149 patients with an inpatient or emergency room diagnosis of ischemic stroke, hemorrhagic stroke, or transient ischemic attack and a corresponding NLP-extracted NIHSS score. A subset of these patients (n = 1033, 14%) was held out for independent validation of model performance, and the remaining patients (n = 6116, 86%) were used for training the model. Several machine learning models were evaluated, and parameters were optimized using cross-validation on the training set. The model with optimal performance, a random forest model, was ultimately evaluated on the holdout set. Results Leveraging machine learning, we identified the main factors in electronic health record data for assessing stroke severity, including death within the same month as stroke occurrence, length of hospital stay following stroke occurrence, aphagia/dysphagia diagnosis, hemiplegia diagnosis, and whether a patient was discharged to home or self-care. Comparing the imputed NIHSS scores to the NLP-extracted NIHSS scores on the holdout data set yielded an R2 (coefficient of determination) of 0.57, an R (Pearson correlation coefficient) of 0.76, and a root-mean-squared error of 4.5. Conclusions Machine learning models built on EHR data can be used to determine proxies for stroke severity. This enables severity to be incorporated in studies of stroke patient outcomes using administrative and EHR databases.
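A minimal sketch of the imputation setup, assuming scikit-learn and synthetic data: a random forest regressor is fit to structured features and its predictions are compared to the NLP-extracted scores with the same three metrics reported above. This is illustrative only, not the study's model.

```python
# Minimal sketch: random forest regression imputing NIHSS scores from structured
# EHR features, scored with R^2, Pearson r, and RMSE on a held-out set.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(7149, 30))                      # structured EHR features (synthetic)
nihss = np.clip(rng.normal(8, 6, size=7149), 0, 42)  # stand-in for NLP-extracted scores

X_tr, X_te, y_tr, y_te = train_test_split(X, nihss, test_size=0.14, random_state=0)

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)

print("R^2:", r2_score(y_te, pred))
print("Pearson r:", np.corrcoef(y_te, pred)[0, 1])
print("RMSE:", np.sqrt(mean_squared_error(y_te, pred)))
```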


2020 ◽  
Vol 31 (6) ◽  
pp. 1348-1357 ◽  
Author(s):  
Ibrahim Sandokji ◽  
Yu Yamamoto ◽  
Aditya Biswas ◽  
Tanima Arora ◽  
Ugochukwu Ugwuowo ◽  
...  

Background Timely prediction of AKI in children can allow for targeted interventions, but the wealth of data in the electronic health record poses unique modeling challenges. Methods We retrospectively reviewed the electronic medical records of all children younger than 18 years old who had at least two creatinine values measured during a hospital admission from January 2014 through January 2018. We divided the study population into derivation, internal validation, and external validation cohorts, and used five feature selection techniques to select 10 of 720 potentially predictive variables from the electronic health records. Model performance was assessed by the area under the receiver operating characteristic curve in the validation cohorts. The primary outcome was development of AKI (per the Kidney Disease: Improving Global Outcomes creatinine definition) within a moving 48-hour window. Secondary outcomes included severe AKI (stage 2 or 3), inpatient mortality, and length of stay. Results Among 8473 encounters studied, AKI occurred in 516 (10.2%), 207 (9%), and 27 (2.5%) encounters in the derivation, internal validation, and external validation cohorts, respectively. The highest-performing model used a machine learning-based genetic algorithm, with an area under the receiver operating characteristic curve in the internal validation cohort of 0.76 [95% confidence interval (CI), 0.72 to 0.79] for AKI, 0.79 (95% CI, 0.74 to 0.83) for severe AKI, and 0.81 (95% CI, 0.77 to 0.86) for neonatal AKI. To translate this prediction model into a clinical risk-stratification tool, we identified high- and low-risk threshold points. Conclusions Using various machine learning algorithms, we identified and validated a time-updated prediction model of ten readily available electronic health record variables to accurately predict imminent AKI in hospitalized children.
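As a rough illustration of the two-stage design (feature selection followed by classification), the sketch below selects 10 of many candidate variables with a simple univariate filter, used here in place of the genetic algorithm the study found best, and scores a gradient-boosted classifier by AUROC on synthetic data.

```python
# Minimal sketch: select 10 of many candidate EHR variables, then predict AKI
# within a moving 48-hour window and evaluate by AUROC. Data are synthetic.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(8473, 720))      # 720 candidate variables per encounter-window
y = rng.binomial(1, 0.10, size=8473)  # ~10% of encounters develop AKI

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)

model = make_pipeline(
    SelectKBest(f_classif, k=10),              # keep the 10 most informative variables
    GradientBoostingClassifier(random_state=0),
).fit(X_tr, y_tr)

print("AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```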

