Validation of an Internationally Derived Patient Severity Phenotype to Support COVID-19 Analytics from Electronic Health Record Data

Author(s):  
Jeffrey G Klann ◽  
Griffin M Weber ◽  
Hossein Estiri ◽  
Bertrand Moal ◽  
Paul Avillach ◽  
...  

Abstract Introduction The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing COVID-19 with federated analyses of electronic health record (EHR) data. Objective We sought to develop and validate a computable phenotype for COVID-19 severity. Methods Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of six code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also piloted an alternative machine-learning approach and compared selected predictors of severity to the 4CE phenotype at one site. Results The full 4CE severity phenotype had pooled sensitivity of 0.73 and specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity varied widely, by up to 0.65 across sites. At one pilot site, the expert-derived phenotype had mean AUC 0.903 (95% CI: 0.886, 0.921), compared to AUC 0.956 (95% CI: 0.952, 0.959) for the machine-learning approach. Billing codes were poor proxies of ICU admission, with as low as 49% precision and recall compared to chart review. Discussion We developed a severity phenotype using six code classes that proved resilient to coding variability across international institutions. In contrast, machine-learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly due to heterogeneous pandemic conditions. Conclusion We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
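The validation described above boils down to flagging each hospitalization as "severe" whenever any of its codes falls in one of the phenotype's code classes, then scoring that flag against ICU admission and/or death. A minimal sketch of that logic follows; the code classes, codes, and cohort below are hypothetical illustrations, not the actual 4CE code set.

```python
# Illustrative sketch (not the 4CE implementation): a code-class-based
# severity phenotype validated against an ICU-admission/death outcome.
# All code classes and patient records here are made-up examples.

SEVERITY_CODE_CLASSES = {
    "diagnoses": {"J80", "R65.21"},        # hypothetical: ARDS, septic shock
    "procedures": {"5A1935Z"},             # hypothetical: mechanical ventilation
    "medications": {"norepinephrine"},     # hypothetical: vasopressors
}

def is_severe(patient_codes):
    """Flag a hospitalization as severe if any code hits a severity class."""
    all_severity_codes = set().union(*SEVERITY_CODE_CLASSES.values())
    return bool(set(patient_codes) & all_severity_codes)

def sensitivity_specificity(cohort):
    """cohort: list of (codes, had_icu_admission_or_death) pairs."""
    tp = fp = tn = fn = 0
    for codes, outcome in cohort:
        predicted = is_severe(codes)
        if predicted and outcome:
            tp += 1
        elif predicted and not outcome:
            fp += 1
        elif not predicted and outcome:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

cohort = [
    ({"J80", "E11.9"}, True),
    ({"norepinephrine"}, True),
    ({"I10"}, False),
    ({"E11.9"}, True),       # severe outcome the phenotype misses
    ({"5A1935Z"}, False),    # phenotype fires without the outcome
]
sens, spec = sensitivity_specificity(cohort)
print(sens, spec)  # 0.666..., 0.5
```

Per-site sensitivity of a single code class can be computed the same way by restricting `is_severe` to one entry of the dictionary, which is how the cross-site variability reported above would surface.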



2020 ◽  
Author(s):  
Sujeong Hur ◽  
Ryoung-Eun Ko ◽  
Junsang Yoo ◽  
Juhyung Ha ◽  
Won Chul Cha ◽  
...  

BACKGROUND Delirium occurs frequently among patients admitted to the intensive care unit (ICU). There is only limited evidence to support interventions to treat or resolve delirium in patients who have already developed it. Therefore, early recognition and prevention of delirium are important in the management of critically ill patients. OBJECTIVE This study aimed to develop and validate the PRIDE (PRediction of ICU DElirium) model with machine learning using electronic health record data for delirium prediction within 24 hours of ICU admission. METHODS This is a retrospective cohort study performed at a tertiary referral hospital with 120 ICU beds. Machine learning-based PRIDE models were developed using patient data from the first 2 years of the study period and validated using patient data from the last 6 months. eXtreme Gradient Boosting (XGBoost), random forest (RF), deep neural network (DNN), and logistic regression (LR) were used. The PRIDE model was externally validated using MIMIC-III data. RESULTS We included only patients who were 18 years or older at the time of admission and who stayed in the medical or surgical ICU. A total of 37,543 cases were collected. After patient exclusion, 12,409 remained as our study population, of which 3,816 (30.8%) patients experienced delirium incidents during the study period. In the MIMIC-III dataset, after applying the exclusion criteria, 2,061 of 96,016 ICU admission cases were included, among which 272 (13.2%) delirium incidents occurred. In the internal validation, the area under the receiver operating characteristic curve (AUROC) for XGBoost, RF, DNN, and LR was 0.919 (95% CI 0.919-0.919), 0.916 (95% CI 0.916-0.916), 0.881 (95% CI 0.878-0.884), and 0.875 (95% CI 0.875-0.875), respectively. In the external validation, the AUROC was 0.721 (95% CI 0.72-0.721), 0.697 (95% CI 0.695-0.699), 0.655 (95% CI 0.654-0.657), and 0.631 (95% CI 0.631-0.631) for RF, XGBoost, DNN, and LR, respectively. The Brier score of the XGBoost model was 0.094, indicating good calibration. CONCLUSIONS A machine learning approach based on electronic health record data can be used to predict delirium within 24 hours of ICU admission. CLINICALTRIAL N/A
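The abstract reports two complementary metrics: AUROC (discrimination, a rank statistic) and the Brier score (calibration, a mean squared error on probabilities). A self-contained sketch of both, with made-up predictions rather than any PRIDE model output:

```python
# Sketch of the two evaluation metrics used above. AUROC is computed via
# the Mann-Whitney U interpretation: the probability that a randomly chosen
# positive case is ranked above a randomly chosen negative one (ties = 0.5).
# The labels and predicted probabilities are illustrative only.

def auroc(y_true, y_prob):
    """Fraction of positive/negative pairs ranked correctly."""
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and outcome; lower
    means better calibration."""
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)

y_true = [1, 1, 0, 1, 0, 0]                 # 1 = delirium within 24 hours
y_prob = [0.9, 0.7, 0.4, 0.3, 0.2, 0.1]     # hypothetical model outputs

print(auroc(y_true, y_prob))        # 0.888...
print(brier_score(y_true, y_prob))  # 0.1333...
```

Production pipelines would typically call library routines (e.g., scikit-learn's `roc_auc_score` and `brier_score_loss`) rather than hand-rolling these, but the definitions above are what those routines compute.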


Author(s):  
Emily Kogan ◽  
Kathryn Twyman ◽  
Jesse Heap ◽  
Dejan Milentijevic ◽  
Jennifer H. Lin ◽  
...  

Abstract Background Stroke severity is an important predictor of patient outcomes and is commonly measured with the National Institutes of Health Stroke Scale (NIHSS) scores. Because these scores are often recorded as free text in physician reports, structured real-world evidence databases seldom include the severity. The aim of this study was to use machine learning models to impute NIHSS scores for all patients with newly diagnosed stroke from multi-institution electronic health record (EHR) data. Methods NIHSS scores available in the Optum© de-identified Integrated Claims-Clinical dataset were extracted from physician notes by applying natural language processing (NLP) methods. The cohort analyzed in the study consisted of 7149 patients with an inpatient or emergency room diagnosis of ischemic stroke, hemorrhagic stroke, or transient ischemic attack and a corresponding NLP-extracted NIHSS score. A subset of these patients (n = 1033, 14%) was held out for independent validation of model performance, and the remaining patients (n = 6116, 86%) were used for training the model. Several machine learning models were evaluated, and their parameters optimized using cross-validation on the training set. The model with optimal performance, a random forest model, was ultimately evaluated on the holdout set. Results Leveraging machine learning, we identified the main factors in electronic health record data for assessing stroke severity, including death within the same month as stroke occurrence, length of hospital stay following stroke occurrence, aphasia/dysphagia diagnosis, hemiplegia diagnosis, and whether a patient was discharged to home or self-care. Comparing the imputed NIHSS scores to the NLP-extracted NIHSS scores on the holdout data set yielded an R2 (coefficient of determination) of 0.57, an R (Pearson correlation coefficient) of 0.76, and a root-mean-squared error of 4.5. Conclusions Machine learning models built on EHR data can be used to determine proxies for stroke severity. This enables severity to be incorporated in studies of stroke patient outcomes using administrative and EHR databases.
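The holdout comparison above rests on three standard agreement metrics between imputed and NLP-extracted scores: R², Pearson's r, and RMSE. A minimal sketch of all three, using made-up NIHSS score pairs rather than any study data:

```python
# Sketch of the three agreement metrics reported for imputed vs.
# NLP-extracted NIHSS scores. The score pairs are illustrative only.
import math

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def pearson_r(x, y):
    """Linear correlation: covariance normalized by both standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rmse(actual, predicted):
    """Root-mean-squared error, in the same units as the NIHSS (0-42)."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))

extracted = [2, 5, 10, 15, 20]   # hypothetical NLP-extracted scores
imputed   = [3, 4, 12, 13, 21]   # hypothetical model-imputed scores

print(r_squared(extracted, imputed))
print(pearson_r(extracted, imputed))
print(rmse(extracted, imputed))
```

Note that R² and r answer slightly different questions here: r measures how linearly related the two score sets are, while R² additionally penalizes any systematic offset between them, which is why both are worth reporting.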


2019 ◽  
Vol 6 (10) ◽  
pp. e688-e695 ◽  
Author(s):  
Julia L Marcus ◽  
Leo B Hurley ◽  
Douglas S Krakower ◽  
Stacey Alexeeff ◽  
Michael J Silverberg ◽  
...  
