A Vital Sign-based Prediction Algorithm for Differentiating COVID-19 Versus Seasonal Influenza in Hospitalized Patients

2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Naveena Yanamala ◽  
Nanda H. Krishna ◽  
Quincy A. Hathaway ◽  
Aditya Radhakrishnan ◽  
Srinidhi Sunkara ◽  
...  

Abstract Patients with influenza and SARS-CoV2/Coronavirus disease 2019 (COVID-19) infections have different clinical courses and outcomes. We developed and validated a supervised machine learning pipeline to distinguish the two viral infections using the vital signs and demographic data available from the first hospital/emergency room encounters of 3,883 patients who had confirmed diagnoses of influenza A/B, COVID-19, or negative laboratory test results. The models achieved an area under the receiver operating characteristic curve (ROC AUC) of at least 97% using our multiclass classifier. The predictive models were externally validated on 15,697 encounters in 3,125 patients available in the TrinetX database, which contains patient-level data from different healthcare organizations. The influenza vs COVID-19-positive model had AUCs of 98.8% and 92.8% on the internal and external test sets, respectively. Our study illustrates the potential of machine-learning models for accurately distinguishing the two viral infections. The code is made available at https://github.com/ynaveena/COVID-19-vs-Influenza and may have utility as a frontline diagnostic tool to aid healthcare workers in triaging patients once the two viral infections start co-circulating in the community.
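
As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains a multiclass classifier on synthetic vital-sign and demographic features and scores it with a one-vs-rest ROC AUC. The feature set, labels, and the gradient-boosting choice are placeholder assumptions, not the authors' published pipeline (which is available at the linked repository).

```python
# Illustrative sketch only: a three-class classifier (negative / influenza / COVID-19)
# on vital-sign-like features, evaluated with one-vs-rest multiclass ROC AUC.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
# Hypothetical feature matrix: e.g. age, temperature, heart rate, respiratory rate, SpO2
X = rng.normal(size=(n, 5))
# Labels: 0 = negative, 1 = influenza A/B, 2 = COVID-19
y = rng.integers(0, 3, size=n)

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Multiclass ROC AUC: one-vs-rest, macro-averaged over the three classes
proba = clf.predict_proba(X_test)
print("OvR macro ROC AUC:", roc_auc_score(y_test, proba, multi_class="ovr"))
```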


2020 ◽  
Vol 66 (11) ◽  
pp. 1396-1404 ◽  
Author(s):  
He S Yang ◽  
Yu Hou ◽  
Ljiljana V Vasovic ◽  
Peter A D Steel ◽  
Amy Chadburn ◽  
...  

Abstract Background Accurate diagnostic strategies to rapidly identify SARS-CoV-2 positive individuals for management of patient care and protection of health care personnel are urgently needed. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available with a turn-around time (TAT) usually within 1-2 hours. Methods We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual’s SARS-CoV-2 infection status. Laboratory testing results obtained within 2 days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosting decision tree (GBDT) model from 3,356 SARS-CoV-2 RT-PCR-tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital. Results The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), validating the generalization of its use. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within 2 days. Conclusion This model, employing routine laboratory test results, offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.
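
A minimal sketch of the approach described, assuming synthetic stand-ins for the demographic and routine laboratory features; the column names and model settings are illustrative, not the study's actual pipeline. A histogram-based gradient boosting classifier is used here because it natively tolerates the missing values that are common in routine lab panels.

```python
# Illustrative GBDT on demographics plus routine lab results, scored by ROC AUC.
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "sex": rng.integers(0, 2, n),
    "wbc": rng.normal(7, 2, n),          # stand-ins for the 27 routine lab tests
    "lymphocyte_pct": rng.normal(30, 8, n),
    "crp": rng.normal(10, 5, n),
})
y = rng.integers(0, 2, n)                 # RT-PCR result (1 = positive)

X_tr, X_te, y_tr, y_te = train_test_split(df, y, stratify=y, random_state=1)
model = HistGradientBoostingClassifier(random_state=1).fit(X_tr, y_tr)
print("ROC AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```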


2018 ◽  
Vol 129 (4) ◽  
pp. 675-688 ◽  
Author(s):  
Samir Kendale ◽  
Prathamesh Kulkarni ◽  
Andrew D. Rosenberg ◽  
Jing Wang

Abstract Background Hypotension is a risk factor for adverse perioperative outcomes. Machine-learning methods allow large amounts of data to be used in developing robust predictive analytics. The authors hypothesized that machine-learning methods can predict the risk of postinduction hypotension. Methods Data were extracted from the electronic health record of a single quaternary care center from November 2015 to May 2016 for patients over age 12 who underwent general anesthesia, without procedure exclusions. Multiple supervised machine-learning classification techniques were attempted, with postinduction hypotension (mean arterial pressure less than 55 mmHg within 10 min of induction by any measurement) as the primary outcome, and preoperative medications, medical comorbidities, induction medications, and intraoperative vital signs as features. Discrimination was assessed using the cross-validated area under the receiver operating characteristic curve. The best-performing model was tuned and its final performance assessed using split-set validation. Results Out of 13,323 cases, 1,185 (8.9%) experienced postinduction hypotension. The area under the receiver operating characteristic curve was 0.71 (95% CI, 0.70 to 0.72) for logistic regression, 0.63 (95% CI, 0.58 to 0.60) for support vector machines, 0.69 (95% CI, 0.67 to 0.69) for naive Bayes, 0.64 (95% CI, 0.63 to 0.65) for k-nearest neighbor, 0.72 (95% CI, 0.71 to 0.73) for linear discriminant analysis, 0.74 (95% CI, 0.73 to 0.75) for random forest, 0.71 (95% CI, 0.69 to 0.71) for neural nets, and 0.76 (95% CI, 0.75 to 0.77) for the gradient boosting machine. Test-set area for the gradient boosting machine was 0.74 (95% CI, 0.72 to 0.77). Conclusions The success of this technique in predicting postinduction hypotension demonstrates the feasibility of machine-learning models for predictive analytics in anesthesiology, with performance dependent on model selection and appropriate tuning.
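
The model-comparison step described above can be sketched as follows: several classifiers are scored by cross-validated ROC AUC on a training split, and the best one is refit and checked on a held-out test split. The synthetic data and the specific estimators shown are illustrative assumptions, not the authors' implementation.

```python
# Compare classifiers by cross-validated ROC AUC, then evaluate the best on a test split.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.91, 0.09],
                           random_state=0)   # ~9% positives, similar to the hypotension rate
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, m in models.items():
    auc = cross_val_score(m, X_tr, y_tr, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated AUC = {auc:.3f}")

best = models["gradient boosting"].fit(X_tr, y_tr)   # refit the best-scoring model
print("test-set AUC:", round(roc_auc_score(y_te, best.predict_proba(X_te)[:, 1]), 3))
```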


Circulation ◽  
2021 ◽  
Vol 144 (Suppl_2) ◽  
Author(s):  
Tsung-Chien Lu ◽  
Eric H Chou ◽  
CHIH-HUNG WANG ◽  
Amir Mostafavi ◽  
Mario Tovar ◽  
...  

Introduction: Few models have been developed for stratifying the risk of cardiac arrest in COVID-19 patients presenting to the ED with suspected pneumonia. Using a machine learning (ML) approach, we aimed to develop and validate ML models to predict in-hospital cardiac arrest (IHCA) in patients admitted from the ED. Hypothesis: We hypothesized that an ML approach can serve as a valuable tool for identifying patients at risk of IHCA in a timely fashion. Methods: We included COVID-19 patients admitted from the EDs of five hospitals in Texas between March and November 2020. All adult (≥ 18 years) patients were included if they had a positive RT-PCR for SARS-CoV-2 and also received a CXR examination for suspected pneumonia. Patients’ demographics, past medical history, vital signs at ED triage, CXR findings, and laboratory results were retrieved from the EMR system. The primary outcome (IHCA) was identified via a resuscitation code. Patients presenting with OHCA or without any blood testing were excluded. A nonrandom splitting strategy based on location was used to divide the dataset into a training cohort (one urban and two suburban hospitals) and a testing cohort (one urban and one suburban hospital) at roughly a 2-to-1 ratio. Three supervised ML models were trained, and their performance was evaluated and compared with the National Early Warning Score (NEWS) by the area under the receiver operating characteristic curve (AUC). Results: We included 1,485 records for analysis. Of these, 190 (12.8%) developed IHCA. Of the constructed ML models, Random Forest outperformed the others with the best AUC (0.930, 95% CI: 0.896-0.958), followed by Gradient Boosting (0.929, 95% CI: 0.891-0.959) and the Extra Trees classifier (0.909, 95% CI: 0.875-0.943). All constructed ML models performed significantly better than the NEWS scoring system (AUC: 0.787, 95% CI: 0.725-0.840). The most important features selected included age, oxygen saturation at triage, and laboratory values of APTT, lactic acid, and LDH. Conclusions: The ML approach showed excellent discriminatory performance in identifying IHCA among patients with COVID-19 and suspected pneumonia. It has the potential to save more lives or inform end-of-life decision making if successfully implemented in the EMR system.
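
The nonrandom, site-based split can be sketched as below: encounters are assigned to the training or testing cohort by hospital rather than by random shuffling, and a random forest is evaluated on the held-out sites. Hospital labels, features, and outcomes here are synthetic placeholders, not the study data.

```python
# Illustrative site-based (nonrandom) train/test split with a random forest classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
n = 1500
hospital = rng.integers(0, 5, n)           # five sites, as in the study design
X = rng.normal(size=(n, 12))               # stand-ins for triage vitals, labs, CXR findings
y = rng.integers(0, 2, n)                  # in-hospital cardiac arrest (placeholder)

train_mask = np.isin(hospital, [0, 1, 2])  # three hospitals form the training cohort
test_mask = ~train_mask                    # the remaining two form the testing cohort

rf = RandomForestClassifier(n_estimators=500, random_state=2)
rf.fit(X[train_mask], y[train_mask])
auc = roc_auc_score(y[test_mask], rf.predict_proba(X[test_mask])[:, 1])
print("site-held-out AUC:", round(auc, 3))
```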


Hypertension ◽  
2020 ◽  
Vol 76 (Suppl_1) ◽  
Author(s):  
Sachin Aryal ◽  
Ahmad Alimadadi ◽  
Ishan Manandhar ◽  
Bina Joe ◽  
Xi Cheng

In recent years, the microbiome has been recognized as an important factor associated with cardiovascular disease (CVD), the leading cause of human mortality worldwide. Disparities in gut microbial composition between individuals with and without CVD have been reported; we therefore hypothesized that training supervised machine learning (ML) models on such microbiome data could be exploited as a new strategy for evaluating cardiovascular health. To test our hypothesis, we analyzed metagenomics data extracted from the American Gut Project. Specifically, 16S rRNA reads from stool samples of 478 CVD and 473 non-CVD control samples were analyzed using five supervised ML algorithms: random forest (RF), support vector machine with radial kernel (svmRadial), decision tree (DT), elastic net (ENet) and neural networks (NN). Thirty-nine differential bacterial taxa (LEfSe: LDA > 2) were identified between the CVD and non-CVD groups. ML classification using these taxonomic features achieved an AUC (area under the receiver operating characteristic curve) of ~0.58 (RF). However, choosing the top 500 high-variance features of operational taxonomic units (OTUs) for training the ML models improved the AUC to ~0.65 (RF). Further, limiting the selection to only the top 25 highly contributing OTU features, to reduce the dimensionality of the feature space, significantly enhanced the AUC to ~0.70 (RF). In summary, this study is the first to demonstrate the successful development of an ML model using microbiome-based datasets for systematic diagnostic screening of CVD.
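
A minimal sketch of the top-variance feature selection step, assuming a synthetic samples-by-OTU abundance matrix; the real analysis used American Gut Project 16S data and compared several additional models.

```python
# Keep the highest-variance OTU columns, then score a random forest by cross-validated AUC.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
otu = rng.poisson(2.0, size=(951, 3000)).astype(float)   # 951 samples x 3000 OTUs (placeholder)
labels = rng.integers(0, 2, 951)                          # CVD vs non-CVD (placeholder)

top_k = 500
keep = np.argsort(otu.var(axis=0))[::-1][:top_k]          # indices of the top-variance OTUs
auc = cross_val_score(RandomForestClassifier(random_state=3),
                      otu[:, keep], labels, cv=5, scoring="roc_auc").mean()
print("cross-validated AUC on top-variance OTUs:", round(auc, 3))
```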


Author(s):  
Wonju Seo ◽  
You-Bin Lee ◽  
Seunghyun Lee ◽  
Sang-Man Jin ◽  
Sung-Min Park

Abstract Background For an effective artificial pancreas (AP) system and improved therapeutic intervention with continuous glucose monitoring (CGM), accurately predicting the occurrence of hypoglycemia is very important. While many studies have reported successful algorithms for predicting nocturnal hypoglycemia, predicting postprandial hypoglycemia remains a challenge because of the extreme glucose fluctuations that occur around mealtimes. The goal of this study was to evaluate the feasibility of an easy-to-use, computationally efficient machine-learning algorithm for predicting postprandial hypoglycemia with a unique feature set. Methods We used retrospective CGM datasets of 104 people who had experienced at least one hypoglycemia alert value during a three-day CGM session. The algorithms were developed based on four machine learning models with a unique data-driven feature set: a random forest (RF), a support vector machine using a linear function or a radial basis function, a K-nearest neighbor, and a logistic regression. With 5-fold cross-subject validation, the average performance of each model was calculated for comparison. The area under the receiver operating characteristic curve (AUC) and the F1 score were used as the main criteria for evaluating performance. Results In predicting a hypoglycemia alert value with a 30-min prediction horizon, the RF model showed the best performance, with an average AUC of 0.966, average sensitivity of 89.6%, average specificity of 91.3%, and average F1 score of 0.543. In addition, the RF showed better predictive performance for postprandial hypoglycemic events than the other models. Conclusion We showed that machine-learning algorithms have potential for predicting postprandial hypoglycemia, and that the RF model is a strong candidate for further development of postprandial hypoglycemia prediction algorithms to advance CGM and AP technology.
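
The 5-fold cross-subject validation can be sketched with grouped folds, so that no subject's CGM windows appear in both the training and validation folds. The CGM-derived features below are random placeholders for the study's data-driven feature set.

```python
# Grouped (cross-subject) 5-fold validation of a random forest hypoglycemia classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(4)
n_windows = 5000
subject_id = rng.integers(0, 104, n_windows)    # 104 subjects, as in the study
X = rng.normal(size=(n_windows, 10))            # stand-ins for CGM-derived features
y = rng.integers(0, 2, n_windows)               # hypoglycemia alert value within 30 min

cv = GroupKFold(n_splits=5)                     # folds never split a subject across train/validation
aucs = cross_val_score(RandomForestClassifier(random_state=4), X, y,
                       groups=subject_id, cv=cv, scoring="roc_auc")
print("per-fold AUC:", np.round(aucs, 3))
```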


Author(s):  
He Sarina Yang ◽  
Ljiljana V. Vasovic ◽  
Peter Steel ◽  
Amy Chadburn ◽  
Yu Hou ◽  
...  

Abstract Background Accurate diagnostic strategies to rapidly identify SARS-CoV-2 positive individuals for management of patient care and protection of health care personnel are urgently needed. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results of this test are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available with a turn-around time (TAT) usually within 1-2 hours. Methods We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual’s SARS-CoV-2 infection status. Laboratory test results obtained within two days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosted decision tree (GBDT) model from 3,346 SARS-CoV-2 RT-PCR-tested patients (1,394 positive and 1,952 negative) evaluated at a large metropolitan hospital. Results The model achieved an area under the receiver operating characteristic curve (AUC) of 0.853 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), validating its generalization. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within two days. Conclusion This model, employing routine laboratory test results, offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. This may facilitate patient care and quarantine, indicate who requires retesting, and direct personal protective equipment use while awaiting definitive RT-PCR results.
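
The abstract reports AUCs with 95% confidence intervals. One common way to obtain such an interval (not necessarily the method used by the authors) is bootstrap resampling of the test-set predictions, sketched here with placeholder scores.

```python
# Bootstrap 95% confidence interval for a test-set ROC AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
y_true = rng.integers(0, 2, 800)                                    # placeholder labels
y_score = np.clip(y_true * 0.4 + rng.normal(0.3, 0.25, 800), 0, 1)  # placeholder scores

boot_aucs = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), len(y_true))   # resample indices with replacement
    if len(np.unique(y_true[idx])) < 2:               # skip resamples with a single class
        continue
    boot_aucs.append(roc_auc_score(y_true[idx], y_score[idx]))

lo, hi = np.percentile(boot_aucs, [2.5, 97.5])
print(f"AUC = {roc_auc_score(y_true, y_score):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```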


Diagnostics ◽  
2020 ◽  
Vol 10 (10) ◽  
pp. 743
Author(s):  
Tae Sik Hwang ◽  
Hyun Woo Park ◽  
Ha Young Park ◽  
Young Sook Park

The vital signs or laboratory test results of sepsis patients may change before clinical deterioration. This study examined differences in prognostic performance when systemic inflammatory response syndrome (SIRS), Sequential Organ Failure Assessment (SOFA), quick SOFA (qSOFA) scores, the National Early Warning Score (NEWS), and lactate levels were repeatedly measured. Scores were obtained at arrival to triage, 1 h after fluid resuscitation, 1 h after vasopressor prescription, and before leaving the emergency room (ER) in 165 patients with septic shock. The relationships between score changes and in-hospital mortality, mechanical ventilation, admission to the intensive care unit, and mortality within seven days were compared using areas under the receiver operating characteristic curve (AUROCs). Scores measured before leaving the ER had the highest AUROCs across all variables (SIRS score 0.827 [0.737–0.917], qSOFA score 0.754 [0.627–0.838], NEWS 0.888 [0.826–0.950], SOFA score 0.835 [0.766–0.904], and lactate 0.872 [0.805–0.939]). When combined, SIRS + lactate (0.882 [0.804–0.960]), qSOFA + lactate (0.872 [0.808–0.935]), NEWS + lactate (0.909 [0.855–0.963]), and SOFA + lactate (0.885 [0.832–0.939]) showed improved AUROCs. In patients with septic shock, the scoring systems show better predictive performance at time points reflecting changes in vital signs and laboratory test results than at the time of arrival, and combining them with lactate values increases their predictive power.
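
A minimal sketch of the "score + lactate" combination, assuming a simple logistic regression to fuse the two values into one probability whose AUROC can be compared with the score alone; the values below are random placeholders, and the in-sample AUROC shown is only illustrative.

```python
# Combine a severity score with lactate via logistic regression and compare AUROCs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(6)
n = 165
news = rng.integers(0, 18, n).astype(float)   # NEWS before leaving the ER (placeholder)
lactate = rng.gamma(2.0, 1.5, n)              # lactate in mmol/L (placeholder)
died = rng.integers(0, 2, n)                  # in-hospital mortality (placeholder)

features = np.column_stack([news, lactate])
combined = LogisticRegression().fit(features, died)
p = combined.predict_proba(features)[:, 1]    # in-sample, for illustration only

print("NEWS alone AUROC:", round(roc_auc_score(died, news), 3))
print("NEWS + lactate AUROC:", round(roc_auc_score(died, p), 3))
```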


2021 ◽  
Author(s):  
Duo Xu ◽  
Andre Neil Forbes ◽  
Sandra Cohen ◽  
Ann Palladino ◽  
Tatiana Karadimitriou ◽  
...  

Regulatory networks containing enhancer-to-gene edges define cellular state, and their rewiring is a hallmark of cancer. While efforts such as ENCODE have revealed these networks for reference tissues and cell lines by integrating multi-omics data, the same methods cannot be applied to large patient cohorts because of the constraints on generating ChIP-seq and three-dimensional data from the limited material in patient biopsies. We trained a supervised machine learning model on genomic 3D signatures of physical enhancer-gene connections that can predict accurate connections using data from ATAC-seq and RNA-seq assays only, which can be easily generated from patient biopsies. Our method overcomes a major limitation of correlation-based approaches, which cannot distinguish between distinct target genes of a given enhancer in different samples, a hallmark of network rewiring in cancer. Our model achieved an AUROC (area under the receiver operating characteristic curve) of 0.91 and, importantly, can distinguish between active regulatory elements with connections to target genes and poised elements with no connections to target genes. Our predicted regulatory elements are validated by multi-omics data, including histone modification marks from ENCODE, with an average specificity of 0.92. Application of our model to chromatin accessibility and transcriptomic data from 400 cancer patients across 22 cancer types revealed novel cancer-type- and subtype-specific enhancer-gene connections for known cancer genes. In one example, we identified two enhancers that regulate the expression of ESR1 in ER+ breast cancer (BRCA) samples but not in ER- samples. These enhancers are predicted to contribute to the high expression of ESR1 in 93% of ER+ BRCA samples. Functional validation using CRISPRi confirms that inhibition of these enhancers decreases the expression of ESR1 in ER+ samples.
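
One way to frame the prediction task described above: each candidate enhancer-gene pair becomes a feature vector (accessibility, expression, genomic distance, and so on) with a binary label for physical connection, and a classifier is scored by AUROC. The features, labels, and model below are assumptions for illustration, not the authors' trained model.

```python
# Illustrative enhancer-gene pair classifier on accessibility/expression-style features.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(8)
n_pairs = 10000
features = np.column_stack([
    rng.normal(size=n_pairs),        # enhancer accessibility (ATAC-seq signal, placeholder)
    rng.normal(size=n_pairs),        # candidate target gene expression (RNA-seq, placeholder)
    rng.uniform(0, 1e6, n_pairs),    # enhancer-gene distance in bp (placeholder)
])
connected = rng.integers(0, 2, n_pairs)   # label derived from 3D contact data (placeholder)

X_tr, X_te, y_tr, y_te = train_test_split(features, connected, random_state=8)
clf = GradientBoostingClassifier(random_state=8).fit(X_tr, y_tr)
print("AUROC:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```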


2012 ◽  
Vol 9 (73) ◽  
pp. 1934-1942 ◽  
Author(s):  
Philip J. Hepworth ◽  
Alexey V. Nefedov ◽  
Ilya B. Muchnik ◽  
Kenton L. Morgan

Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with an accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multivariable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems, could offer significant improvements in broiler health and welfare worldwide.
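
A minimal sketch of the comparison described, assuming synthetic stand-ins for the farm-management features: a support vector machine and a standard logistic regression are both scored by area under the receiver operating characteristic curve on held-out data.

```python
# SVM vs logistic regression on synthetic farm-management-style features, compared by AUC.
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1200, n_features=25, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=7)

svm = make_pipeline(StandardScaler(), SVC(probability=True)).fit(X_tr, y_tr)
logit = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)

for name, model in [("SVM", svm), ("logistic regression", logit)]:
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```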

