Abstract 39: Development of a Hypoglycemia Prediction Model for Veterans With Diabetes Using Supervised Machine Learning Applied to Electronic Health Record Data

Circulation ◽  
2020 ◽  
Vol 141 (Suppl_1) ◽  
Author(s):  
Sridharan Raghavan ◽  
Wenhui Liu ◽  
Anna Baron ◽  
David Saxon ◽  
Meg Plomondon ◽  
...  

Accurate assessment of hypoglycemia risk is critical for treatment selection in individuals with diabetes and cardiovascular disease (CVD), patients for whom hypoglycemia is particularly harmful. We developed and validated a hypoglycemia prediction model in diabetes patients with and without CVD using data routinely available in electronic health records (EHR) and compared its performance to that of a published prediction model. We studied 128,893 US Veterans with diabetes and angiographic assessment of CVD from 2005 to 2018. We used a random 2/3 of the sample for model development and the remaining 1/3 for validation. The primary outcome was severe hypoglycemia based on a previously validated algorithm that uses diagnosis codes and glucose measurements. We evaluated 33 potential predictors, including demographics, diabetes-related variables, comorbidities, and CVD risk factors. We sequentially used two machine learning algorithms for model development. First, we used multivariable adaptive regression splines, which can accommodate interactions and non-linearities for continuous variables, to select predictors. Second, we used adaptive elastic net, which can accommodate time-to-event outcomes, to fit a model with the selected variables. We tested model discrimination using the area under the ROC curve (AUC) and calibration by plotting predicted versus observed event rates in the independent validation cohort. The best-fitting prediction model included 18 predictors; a history of hypoglycemia was the strongest predictor (Table). In external validation, AUC was 0.729 for 2-year events, and the slope of the calibration curve was 1.05, exceeding performance of the published model in this patient population for both discrimination and calibration (Table).
Conclusions: Applying supervised machine learning to EHR data may provide an efficient approach to tailoring prediction of preventable clinical outcomes, e.g., hypoglycemia, for high risk patients receiving care in an integrated healthcare system.
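
The evaluation described in this abstract (a random 2/3 development and 1/3 validation split, discrimination by AUC, and calibration by comparing predicted with observed event rates) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the ten-bin grouping, and the least-squares slope are hypothetical choices, and the sketch treats the outcome as binary rather than time-to-event.

```python
import random
from statistics import mean


def split_development_validation(ids, frac=2 / 3, seed=0):
    """Randomly assign a fraction of ids to development, the rest to validation."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * frac)
    return shuffled[:cut], shuffled[cut:]


def auc(scores, labels):
    """Rank-based AUC: chance that a random event outscores a random non-event.

    Ties between an event and a non-event count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_slope(predicted, labels, n_bins=10):
    """Least-squares slope of observed event rate on mean predicted risk per bin.

    Assumption: equal-sized bins of sorted predictions; the abstract does not
    say how its calibration curve was constructed.
    """
    pairs = sorted(zip(predicted, labels))
    n = len(pairs)
    bins = [pairs[i * n // n_bins:(i + 1) * n // n_bins] for i in range(n_bins)]
    bins = [b for b in bins if b]  # drop empty bins when n < n_bins
    xs = [mean(p for p, _ in b) for b in bins]
    ys = [mean(y for _, y in b) for b in bins]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx  # raises ZeroDivisionError if all bins share one mean risk
```

A slope near 1 means predicted and observed rates rise together at the same pace; a slope below 1 suggests predictions that are too extreme.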

2020 ◽  
Vol 31 (6) ◽  
pp. 1348-1357 ◽  
Author(s):  
Ibrahim Sandokji ◽  
Yu Yamamoto ◽  
Aditya Biswas ◽  
Tanima Arora ◽  
Ugochukwu Ugwuowo ◽  
...  

Background: Timely prediction of AKI in children can allow for targeted interventions, but the wealth of data in the electronic health record poses unique modeling challenges. Methods: We retrospectively reviewed the electronic medical records of all children younger than 18 years old who had at least two creatinine values measured during a hospital admission from January 2014 through January 2018. We divided the study population into derivation, internal validation, and external validation cohorts, and used five feature selection techniques to select 10 of 720 potentially predictive variables from the electronic health records. Model performance was assessed by the area under the receiver operating characteristic curve in the validation cohorts. The primary outcome was development of AKI (per the Kidney Disease Improving Global Outcomes creatinine definition) within a moving 48-hour window. Secondary outcomes included severe AKI (stage 2 or 3), inpatient mortality, and length of stay. Results: Among 8473 encounters studied, AKI occurred in 516 (10.2%), 207 (9%), and 27 (2.5%) encounters in the derivation, internal validation, and external validation cohorts, respectively. The highest-performing model used a machine learning-based genetic algorithm, with an overall area under the receiver operating characteristic curve in the internal validation cohort of 0.76 [95% confidence interval (CI), 0.72 to 0.79] for AKI, 0.79 (95% CI, 0.74 to 0.83) for severe AKI, and 0.81 (95% CI, 0.77 to 0.86) for neonatal AKI. To translate this prediction model into a clinical risk-stratification tool, we identified high- and low-risk threshold points. Conclusions: Using various machine learning algorithms, we identified and validated a time-updated prediction model of ten readily available electronic health record variables to accurately predict imminent AKI in hospitalized children.
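
One plausible reading of "AKI within a moving 48-hour window" for a time-updated model is that each observation time is labeled positive when AKI onset falls in the following 48 hours. The sketch below illustrates that labeling only. It is an assumption about the setup, not the study's published procedure, and the function name and arguments are hypothetical.

```python
from datetime import datetime, timedelta


def label_within_window(obs_times, onset_time, window_hours=48):
    """Label each observation 1 if AKI onset falls in (t, t + window], else 0.

    obs_times: datetimes at which the model would be re-evaluated.
    onset_time: datetime of AKI onset, or None if AKI never occurred.
    """
    window = timedelta(hours=window_hours)
    labels = []
    for t in obs_times:
        in_window = onset_time is not None and t < onset_time <= t + window
        labels.append(1 if in_window else 0)
    return labels


# Usage: observations at hours 0, 24, 60, and 100 of an admission,
# with a hypothetical AKI onset at hour 96.
admit = datetime(2017, 1, 1)
times = [admit + timedelta(hours=h) for h in (0, 24, 60, 100)]
labels = label_within_window(times, admit + timedelta(hours=96))
```

Observations made after onset are labeled 0 here; a real pipeline would more likely drop them, since the event has already happened.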


Cancers ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 3817
Author(s):  
Shi-Jer Lou ◽  
Ming-Feng Hou ◽  
Hong-Tai Chang ◽  
Chong-Chi Chiu ◽  
Hao-Hsien Lee ◽  
...  

No studies have discussed machine learning algorithms to predict recurrence within 10 years after breast cancer surgery. This study aimed to compare the accuracy of forecasting models for predicting recurrence within 10 years after breast cancer surgery and to identify significant predictors of recurrence. Registry data for breast cancer surgery patients were allocated to a training dataset (n = 798) for model development, a testing dataset (n = 171) for internal validation, and a validating dataset (n = 171) for external validation. Global sensitivity analysis was then performed to evaluate the significance of the selected predictors. Demographic characteristics, clinical characteristics, quality of care, and preoperative quality of life were significantly associated with recurrence within 10 years after breast cancer surgery (p < 0.05). Artificial neural networks had the highest prediction performance indices. Additionally, surgeon volume was the best predictor of recurrence within 10 years after breast cancer surgery, followed by hospital volume and tumor stage. Accurate prediction of 10-year recurrence by machine learning algorithms may improve precision in managing patients after breast cancer surgery and improve understanding of risk factors for recurrence.


2020 ◽  
Author(s):  
Govinda KC ◽  
Giovanni Bocci ◽  
Srijan Verma ◽  
Mahmudulla Hassan ◽  
Jayme Holmes ◽  
...  

Strategies for drug discovery and repositioning are an urgent need with respect to COVID-19. We developed "REDIAL-2020", a suite of machine learning models for estimating small molecule activity from molecular structure, for a range of SARS-CoV-2 related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained using high throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), with multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (http://drugcentral.org/Redial). Acceptable input formats are: drug name, PubChem CID, or SMILES; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the most similar molecules to the query among experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through GitHub (https://github.com/sirimullalab/redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.
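
The abstract describes combining the best fingerprint, physicochemical, and pharmacophore models into an ensemble consensus predictor but does not state the combination rule. A simple majority vote is one common choice, sketched below as an assumption. The function name and the example predictions are hypothetical and do not come from the REDIAL-2020 code.

```python
def consensus_vote(predictions):
    """Return 1 (active) if a strict majority of models predict active, else 0.

    predictions: dict mapping a model name to its 0/1 prediction.
    """
    if not predictions:
        raise ValueError("need at least one model prediction")
    votes = list(predictions.values())
    return 1 if sum(votes) * 2 > len(votes) else 0


# Usage: three descriptor-based classifiers scoring one molecule for one assay.
example = {"fingerprint": 1, "physicochemical": 0, "pharmacophore": 1}
result = consensus_vote(example)
```

With three voters a tie cannot occur; with an even number of models, this rule resolves ties toward inactive.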


Sepsis is a life-threatening disease that causes tissue damage and organ failure and results in the death of millions of people. It is among the highest-risk diseases worldwide. A large proportion of these deaths occur in developing countries because hospitals are inaccessible or resources are lacking. Confirming sepsis requires blood tests, which depend on laboratory facilities and are time-consuming. The aim of this study is to develop a practical, non-invasive sepsis prediction model that detects sepsis using supervised machine learning algorithms. For this retrospective analysis, we used data available from the PhysioNet database.

