Predicting Acute Kidney Injury: A Machine Learning Approach Using Electronic Health Records

Information ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 386
Author(s):  
Sheikh S. Abdullah ◽  
Neda Rostamzadeh ◽  
Kamran Sedig ◽  
Amit X. Garg ◽  
Eric McArthur

Acute kidney injury (AKI) is a common complication in hospitalized patients and can result in longer hospital stays, higher health-related costs, and increased mortality and morbidity. A number of recent studies have shown that AKI is predictable and avoidable if early risk factors can be identified by analyzing electronic health records (EHRs). In this study, we employ machine learning techniques to identify older patients at risk of readmission with AKI to the hospital or emergency department within 90 days after discharge. The study includes the records of one million patients who visited a hospital or emergency department in Ontario between 2014 and 2016. The predictor variables include patient demographics, comorbid conditions, medications, and diagnosis codes. We developed 31 prediction models based on different combinations of two sampling techniques, three ensemble methods, and eight classifiers. These models were evaluated through 10-fold cross-validation and compared using the area under the receiver operating characteristic curve (AUROC). The performances of these models were consistent, with the AUROC for predicting AKI ranging between 0.61 and 0.88 across the 31 models. In general, the ensemble-based methods performed better than cost-sensitive logistic regression. We also validated the features most relevant to predicting AKI with a healthcare expert to improve the performance and reliability of the models. This study predicts a patient’s risk of AKI after discharge, which gives healthcare providers enough time to intervene before the onset of AKI.
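As a rough illustration of the evaluation described above, the sketch below computes AUROC as the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one, and splits record indices into folds for cross-validation. The function names `auroc` and `kfold_indices` and the toy labels and scores are hypothetical and not taken from the study; the fold split is a plain contiguous split, whereas a real pipeline would normally shuffle and stratify.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def kfold_indices(n: int, k: int) -> List[List[int]]:
    """Split range(n) into k contiguous test folds of near-equal size."""
    base, extra = divmod(n, k)
    folds: List[List[int]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical toy data: 1 = readmitted with AKI, 0 = not readmitted.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.5, 0.9]

print(auroc(toy_labels, toy_scores))
print(kfold_indices(10, 3))
```

In a 10-fold setting, each fold would serve once as the held-out test set, and the AUROC values from the ten test folds would be averaged.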

2021 ◽  
Author(s):  
Horng-Ruey Chua ◽  
Kaiping Zheng ◽  
Anantharaman Vathsala ◽  
Kee-Yuan Ngiam ◽  
Hui-Kim Yap ◽  
...  

BACKGROUND Acute kidney injury (AKI) develops in 4% of hospitalized patients and is a marker of clinical deterioration and nephrotoxicity. AKI onset is highly variable in hospitals, which makes it difficult to time biomarker assessment in all patients for preemptive care.
OBJECTIVE The study sought to apply machine learning techniques to electronic health records and predict hospital-acquired AKI by a 48-hour lead time, with the aim to create an AKI surveillance algorithm that is deployable in real time.
METHODS The data were sourced from 20,732 case admissions in 16,288 patients over 1 year in our institution. We enhanced the bidirectional recurrent neural network model with a novel time-invariant and time-variant aggregated module to capture important clinical features temporal to AKI in every patient. Time-series features included laboratory parameters that preceded a 48-hour prediction window before AKI onset; the latter’s corresponding reference was the final in-hospital serum creatinine performed in case admissions without AKI episodes.
RESULTS The cohort was of mean age 53 (SD 25) years, of whom 29%, 12%, 12%, and 53% had diabetes, ischemic heart disease, cancers, and baseline eGFR <90 mL/min/1.73 m², respectively. There were 911 AKI episodes in 869 patients. We derived and validated an algorithm in the testing dataset with an AUROC of 0.81 (0.78-0.85) for predicting AKI. At a 15% prediction threshold, our model generated 699 AKI alerts with 2 false positives for every true AKI and predicted 26% of AKIs. A lowered 5% prediction threshold improved the recall to 60% but generated 3746 AKI alerts with 6 false positives for every true AKI. Representative interpretation results produced by our model alluded to the top-ranked features that predicted AKI that could be categorized in association with sepsis, acute coronary syndrome, nephrotoxicity, or multiorgan injury, specific to every case at risk.
CONCLUSIONS We generated an accurate algorithm from electronic health records through machine learning that predicted AKI by a lead time of at least 48 hours. The prediction threshold could be adjusted during deployment to optimize recall and minimize alert fatigue, while its precision could potentially be augmented by targeted AKI biomarker assessment in the high-risk cohort identified.
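The threshold trade-off reported above can be made concrete with a small sketch. A ratio of 2 false positives per true AKI corresponds to a precision of roughly 1/(1+2), or about 33%, and a ratio of 6 to roughly 1/(1+6), or about 14%. The function `alert_metrics` and the toy scores below are hypothetical and only illustrate how lowering the threshold raises recall while adding false alerts; they are not the study's data or code.

```python
from typing import Dict, Sequence


def alert_metrics(labels: Sequence[int], scores: Sequence[float],
                  threshold: float) -> Dict[str, float]:
    """Count alerts, true/false positives, precision and recall at a threshold."""
    # An alert fires whenever the predicted risk reaches the threshold.
    alerted = [y for y, s in zip(labels, scores) if s >= threshold]
    tp = sum(alerted)
    fp = len(alerted) - tp
    total_pos = sum(labels)
    precision = tp / len(alerted) if alerted else 0.0
    recall = tp / total_pos if total_pos else 0.0
    return {"alerts": len(alerted), "tp": tp, "fp": fp,
            "precision": precision, "recall": recall}


# Hypothetical predicted risks for ten admissions (1 = AKI occurred).
toy_labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_scores = [0.30, 0.20, 0.16, 0.08, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01]

for thr in (0.15, 0.05):
    print(thr, alert_metrics(toy_labels, toy_scores, thr))
```

On this toy data the 15% threshold produces 3 alerts with 2 false positives per true case, while the 5% threshold doubles both the alerts and the number of detected cases.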


10.2196/30805 ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. e30805
Author(s):  
Horng-Ruey Chua ◽  
Kaiping Zheng ◽  
Anantharaman Vathsala ◽  
Kee-Yuan Ngiam ◽  
Hui-Kim Yap ◽  
...  



2016 ◽  
Vol 3 ◽  
pp. 99 ◽  
Author(s):  
Scott M. Sutherland ◽  
Lakhmir S. Chawla ◽  
Sandra L. Kane-Gill ◽  
Raymond K. Hsu ◽  
Andrew A. Kramer ◽  
...  

2017 ◽  
Vol 44 (1) ◽  
pp. 68-76 ◽  
Author(s):  
Scott M. Sutherland ◽  
Stuart L. Goldstein ◽  
Sean M. Bagshaw

While acute kidney injury (AKI) has been poorly defined historically, a decade of effort has culminated in a standardized, consensus definition. In parallel, electronic health records (EHRs) have been adopted with greater regularity, clinical informatics approaches have been refined, and the field of EHR-enabled care improvement and research has burgeoned. Although both fields have matured in isolation, uniting the 2 has the capacity to redefine AKI-related care and research. This article describes how the application of a consistent AKI definition to the EHR dataset can accurately and rapidly diagnose and identify AKI events. Furthermore, this electronic, automated diagnostic strategy creates the opportunity to develop predictive approaches, optimize AKI alerts, and trace AKI events across institutions, care platforms, and administrative datasets.
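To illustrate how a consensus definition might be applied automatically to EHR laboratory data, the sketch below flags serum creatinine results that meet the KDIGO creatinine criteria (a rise of at least 0.3 mg/dL within 48 hours, or a value at least 1.5 times baseline within 7 days). The abstract does not say which definition or baseline rule the authors use, so both are assumptions here: the function `flag_aki` is hypothetical, it takes the lowest prior value in each window as the reference, and it ignores urine output criteria and staging.

```python
from typing import List, Tuple


def flag_aki(measurements: List[Tuple[float, float]]) -> List[bool]:
    """Flag each creatinine result that meets a simplified KDIGO criterion.

    measurements: (hours since first sample, serum creatinine in mg/dL),
    sorted by time. The reference value is the lowest earlier result in
    the relevant look-back window (an assumption made for this sketch).
    """
    flags: List[bool] = []
    for i, (t, scr) in enumerate(measurements):
        prior_48h = [v for (u, v) in measurements[:i] if t - u <= 48]
        prior_7d = [v for (u, v) in measurements[:i] if t - u <= 168]
        # Absolute criterion: rise of >= 0.3 mg/dL within 48 hours.
        abs_rise = bool(prior_48h) and scr - min(prior_48h) >= 0.3
        # Relative criterion: >= 1.5 times the reference within 7 days.
        rel_rise = bool(prior_7d) and scr >= 1.5 * min(prior_7d)
        flags.append(abs_rise or rel_rise)
    return flags


# Hypothetical creatinine series for one admission.
toy_series = [(0.0, 1.0), (24.0, 1.1), (48.0, 1.4), (200.0, 1.0)]
print(flag_aki(toy_series))
```

Because the rule needs only timestamps and creatinine values, the same function could in principle run over any dataset that records them, which is the kind of cross-institution tracing the article describes.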


2020 ◽  
Author(s):  
Janmajay Singh ◽  
Masahiro Sato ◽  
Tomoko Ohkuma

BACKGROUND Missing data in electronic health records is inevitable and considered to be nonrandom. Several studies have found that features indicating missing patterns (missingness) encode useful information about a patient’s health and advocate for their inclusion in clinical prediction models. But their effectiveness has not been comprehensively evaluated.
OBJECTIVE The goal of the research is to study the effect of including informative missingness features in machine learning models for various clinically relevant outcomes and explore robustness of these features across patient subgroups and task settings.
METHODS A total of 48,336 electronic health records from the 2012 and 2019 PhysioNet Challenges were used, and mortality, length of stay, and sepsis outcomes were chosen. The latter dataset was multicenter, allowing external validation. Gated recurrent units were used to learn sequential patterns in the data and classify or predict labels of interest. Models were evaluated on various criteria and across population subgroups evaluating discriminative ability and calibration.
RESULTS Generally improved model performance in retrospective tasks was observed on including missingness features. Extent of improvement depended on the outcome of interest (area under the curve of the receiver operating characteristic [AUROC] improved from 1.2% to 7.7%) and even patient subgroup. However, missingness features did not display utility in a simulated prospective setting, being outperformed (0.9% difference in AUROC) by the model relying only on pathological features. This was despite leading to earlier detection of disease (true positives), since including these features led to a concomitant rise in false positive detections.
CONCLUSIONS This study comprehensively evaluated effectiveness of missingness features on machine learning models. A detailed understanding of how these features affect model performance may lead to their informed use in clinical settings especially for administrative tasks like length of stay prediction where they present the greatest benefit. While missingness features, representative of health care processes, vary greatly due to intra- and interhospital factors, they may still be used in prediction models for clinically relevant outcomes. However, their use in prospective models producing frequent predictions needs to be explored further.
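A missingness feature of the kind discussed above can be as simple as a binary indicator that travels alongside each imputed value. The sketch below is a minimal illustration under stated assumptions: the function `add_missingness` is hypothetical, gaps are filled by carrying the last observed value forward, and a default is used before the first observation. The abstract does not describe the study's actual imputation or feature encoding.

```python
from typing import List, Optional, Tuple


def add_missingness(series: List[Optional[float]],
                    default: float = 0.0) -> Tuple[List[float], List[int]]:
    """Return (forward-filled values, missingness mask) for one variable.

    mask[i] is 1 when the value at time step i was not recorded.
    """
    filled: List[float] = []
    mask: List[int] = []
    last = default  # used until the first real observation arrives
    for value in series:
        if value is None:
            mask.append(1)
        else:
            mask.append(0)
            last = value
        filled.append(last)
    return filled, mask


# Hypothetical hourly lactate readings; None marks hours with no test ordered.
values, mask = add_missingness([None, 1.2, None, None, 0.9])
print(values)  # [0.0, 1.2, 1.2, 1.2, 0.9]
print(mask)    # [1, 0, 1, 1, 0]
```

A sequence model would then receive both lists as input channels, so that the pattern of which tests were ordered becomes a signal in its own right.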


2021 ◽  
Author(s):  
Oscar Garnica ◽  
Diego Gómez ◽  
Víctor Ramos ◽  
J. Ignacio Hidalgo ◽  
José M. Ruiz-Giardín

Background Predicting bacteraemia is relevant because sepsis is one of the most important causes of morbidity and mortality. Bacteraemia prognosis depends primarily on rapid diagnosis. Predicting bacteraemia could shorten the diagnosis by up to 6 days and, in conjunction with individual patient variables, should be considered when deciding on the early administration of personalised antibiotic treatment and medical services, the selection of specific diagnostic techniques, and the determination of additional treatments, such as surgery, that would prevent subsequent complications. Machine learning techniques could help physicians make these informed decisions by predicting bacteraemia using the data already available in electronic hospital records.
Objective This study presents the application of machine learning techniques to these records to predict the blood culture’s outcome, which would reduce the lag in starting a personalised antibiotic treatment and the medical costs associated with erroneous treatments due to conservative assumptions about blood culture outcomes.
Methods Six supervised classifiers were created using three machine learning techniques (Support Vector Machine, Random Forest, and K-Nearest Neighbours) on the electronic health records of hospital patients. The best approach to handle missing data was chosen and, for each machine learning technique, two classification models were created: the first uses the features known at the time of blood extraction, whereas the second uses four extra features revealed during the blood culture.
Results The six classifiers were trained and tested using a dataset of 4357 patients with 117 features per patient. In the best case, the models reach a state-of-the-art accuracy of 85.9%, a sensitivity of 87.4%, and an AUC of 0.93.
Conclusions Our results provide cutting-edge metrics of interest in predictive medical models with values that exceed the medical practice threshold and previous results in the literature using classical modelling techniques in specific types of bacteraemia. Additionally, the consistency of results is reasserted because the three classifiers’ importance ranking shows similar features that coincide with those that physicians use in their manual heuristics. Therefore, the efficacy of these machine learning techniques confirms their viability to assist in the aims of predictive and personalised medicine once the disease presents bacteraemia-compatible symptoms and to assist in improving the healthcare economy.
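Of the three techniques named in the abstract, K-Nearest Neighbours is the simplest to sketch: a new patient receives the majority label of the k most similar patients in the training set. The function `knn_predict`, the two-feature toy records, and the choice of Euclidean distance with k = 3 are all assumptions for illustration and do not reproduce the study's 117 features or its tuning.

```python
import math
from collections import Counter
from typing import List, Sequence


def knn_predict(train_X: Sequence[Sequence[float]], train_y: Sequence[int],
                x: Sequence[float], k: int = 3) -> int:
    """Predict the label of x by majority vote among its k nearest neighbours."""
    # Pair each training row's Euclidean distance to x with its label.
    dists = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled features (for example, two laboratory values).
toy_X: List[List[float]] = [
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],   # negative blood cultures
    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9],   # positive blood cultures
]
toy_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(toy_X, toy_y, [0.15, 0.10]))
print(knn_predict(toy_X, toy_y, [1.00, 0.95]))
```

The two models per technique described in the Methods would differ only in which columns are passed in: the features known at blood extraction, or those plus the four extra features revealed during the culture.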


Author(s):  
Weiqi Chen ◽  
Yong Hu ◽  
Xiangzhou Zhang ◽  
Lijuan Wu ◽  
Kang Liu ◽  
...  

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Sanmay Low ◽  
Anantharaman Vathsala ◽  
Tanusya Murali Murali ◽  
Long Pang ◽  
Graeme MacLaren ◽  
...  
