Updated Analysis: Novel Machine-Learning-Based Sepsis Prediction Model for Patients Undergoing Hematopoietic Stem Cell Transplantation (Early Sepsis Prediction/Identification for Transplant Recipients: ESPRIT)

Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 4477-4477
Author(s):  
Zahra Eftekhari ◽  
Sally Mokhtari ◽  
Tushondra Thomas ◽  
Dongyun Yang ◽  
Liana Nikolaenko ◽  
...  

Sepsis contributes significantly to early treatment-related mortality after hematopoietic cell transplantation (HCT). Since the clinical presentation and characteristics of sepsis immediately after HCT can differ from those seen in the general population or in patients receiving non-HCT chemotherapy, detecting early signs of sepsis in HCT recipients becomes critical. Herein, we extended our earlier analyses (Dadwal et al. ASH 2018) and evaluated a consecutive case series of 1806 patients who underwent HCT at City of Hope (2014-2017) to develop a machine-learning sepsis prediction model for HCT recipients, namely Early Sepsis Prediction/Identification for Transplant Recipients (ESPRIT), using variables within the Electronic Health Record (EHR) data. The primary clinical event was sepsis diagnosis within 100 days post-HCT, identified based on the use of the institutional "sepsis management order set" and mention of "sepsis" in the progress notes. The time of the sepsis order set was considered the time of sepsis for the analyses. Data from 2014 to 2016 (108 visits with and 1315 visits without sepsis, 8% sepsis prevalence) were used as the training set, and data from 2017 (24 visits with and 359 visits without sepsis, 6.6% sepsis prevalence) were kept as the holdout dataset for testing the model. From each patient visit, 61 variables were collected, with a total of 862,009 lab values, 3,284,561 vital sign values, and 249,982 medication orders for 1806 visits over the duration of HCT hospitalization (median: 24.1 days, range: 7-304). An ensemble of 100 random forest classification models was used to develop the prediction model. Last Observation Carried Forward (LOCF) imputation was used to fill missing values with the last observed value of that variable. For model development and optimization, we applied 5-fold stratified cross-validation on the training dataset. Variable importance for the 100 models was assessed using the Gini mean decrease accuracy method, and the values were averaged to produce the final variable importance. HCT was autologous in 798 and allogeneic in 1008 patients. An ablative conditioning regimen was delivered to 97.3% and 38.3% of patients in the autologous and allogeneic groups, respectively. When the impact of sepsis was analyzed as a time-dependent variable, sepsis development was associated with increased mortality (HR=2.79, 95%CI: 2.14-3.64, p<0.001) by a multivariable Cox regression model. Retrospective evaluation at 0, 4, 8, and 12 hours pre-sepsis showed areas under the ROC curve (AUCs) of 0.98, 0.91, 0.90, and 0.85, respectively (Fig 1a), outperforming the widely used Modified Early Warning Score (MEWS) (Fig 1b). We then simulated ESPRIT's performance on unselected real-world data by running the model every hour from admission until sepsis or discharge, whichever occurred first, which created an hourly risk score over that interval. ESPRIT achieved an AUC of 0.83 on the training dataset and an AUC of 0.82 on the holdout test dataset (Fig 2). An example of risk over time for a septic patient who was identified by the model with a 27-hour lead time at a threshold of 0.6 is shown in Fig 3. With an at-risk threshold of 0.6 (sensitivity: 0.4, specificity: 0.93), ESPRIT had a median lead time of 35 and 47 hours on the training and holdout test data, respectively. The model allows users to select any threshold (with the specific false positive/negative rates expected for a given population) to be used for specific purposes.
For example, a red flag can be assigned to a patient when the risk passes the threshold of 0.6; at this threshold, the false positive rate is only 7% and the true positive rate is 40%. A yellow flag can then be assigned at a threshold of 0.4, at which the model has a higher false positive rate (38%) but also a high true positive rate (90%). Using this two-step assessment/intervention system (red flag as an alarm and yellow flag as a warning sign to examine the patient to rule out sepsis), the model would achieve 90% sensitivity and 93% specificity in practice and overcome the low positive predictive value caused by the rare incidence of sepsis. In summary, we developed and validated a novel machine-learning monitoring system for sepsis prediction in HCT recipients. Our data strongly support further clinical validation of the ESPRIT model as a method to provide real-time sepsis predictions and timely initiation of preemptive antibiotic therapy according to the predicted risks in the era of the EHR. Disclosures Dadwal: Ansun biopharma: Research Funding; SHIRE: Research Funding; Janssen: Membership on an entity's Board of Directors or advisory committees; Merck: Membership on an entity's Board of Directors or advisory committees; Clinigen: Membership on an entity's Board of Directors or advisory committees. Nakamura: Kirin Kyowa: Other: support for an academic seminar in a university in Japan; Merck: Membership on an entity's Board of Directors or advisory committees; Celgene: Other: support for an academic seminar in a university in Japan; Alexion: Other: support for a lecture at a Japan Society of Transfusion/Cellular Therapy meeting.
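The modeling pipeline described in this abstract (LOCF imputation, an ensemble of random forest classifiers, stratified 5-fold cross-validation, and averaged variable importances) can be sketched as follows. This is a minimal illustration using scikit-learn and pandas; the column names, the ensemble construction (varying only the random seed), and the use of impurity-based importances are assumptions, not the authors' implementation.

```python
# Minimal sketch of an ESPRIT-style pipeline: LOCF imputation, a random forest ensemble,
# stratified cross-validation, and averaged variable importances. Illustrative only.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

def locf_impute(df: pd.DataFrame) -> pd.DataFrame:
    """Carry each visit's last observed value forward (rows assumed time-ordered per visit),
    then fill leading gaps with 0 as a placeholder."""
    return df.groupby("visit_id").ffill().fillna(0.0)

def fit_ensemble(X: np.ndarray, y: np.ndarray, n_models: int = 10, seed: int = 0):
    """Train an ensemble of random forests that differ only in their random seed
    (the paper's ensembling scheme may differ)."""
    return [RandomForestClassifier(n_estimators=200, random_state=seed + i,
                                   class_weight="balanced").fit(X, y)
            for i in range(n_models)]

def ensemble_risk(models, X: np.ndarray) -> np.ndarray:
    """Average the per-model predicted probability of sepsis."""
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)

def cross_validated_auc(X: np.ndarray, y: np.ndarray, n_splits: int = 5, n_models: int = 10):
    """Stratified k-fold AUC plus importances averaged over folds and ensemble members."""
    aucs, importances = [], []
    for tr, te in StratifiedKFold(n_splits, shuffle=True, random_state=0).split(X, y):
        models = fit_ensemble(X[tr], y[tr], n_models=n_models)
        aucs.append(roc_auc_score(y[te], ensemble_risk(models, X[te])))
        importances.append(np.mean([m.feature_importances_ for m in models], axis=0))
    return float(np.mean(aucs)), np.mean(importances, axis=0)
```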
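The two-threshold red/yellow flag policy described above can likewise be sketched as a simple mapping from an hourly risk score to an alert level. The 0.6 and 0.4 cut-offs and their quoted error rates come from the abstract; the function and data layout are illustrative assumptions.

```python
# Minimal sketch of the two-threshold alerting policy applied to an hourly risk series.
from typing import Optional

RED_THRESHOLD = 0.6     # ~7% false positive rate, 40% true positive rate (per the abstract)
YELLOW_THRESHOLD = 0.4  # ~38% false positive rate, 90% true positive rate

def flag_for_hour(risk: float) -> Optional[str]:
    """Map one hourly risk score to an alert level."""
    if risk >= RED_THRESHOLD:
        return "red"      # alarm: consider immediate sepsis work-up / preemptive antibiotics
    if risk >= YELLOW_THRESHOLD:
        return "yellow"   # warning: examine the patient to rule out sepsis
    return None           # no alert

# Example: a rising hourly risk trajectory crossing first the yellow, then the red threshold.
hourly_risks = [0.12, 0.18, 0.35, 0.47, 0.55, 0.63]
print([flag_for_hour(r) for r in hourly_risks])
# [None, None, None, 'yellow', 'yellow', 'red']
```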

Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 711-711
Author(s):  
Sanjeet Dadwal ◽  
Zahra Eftekhari ◽  
Tushondra Thomas ◽  
Deron Johnson ◽  
Dongyun Yang ◽  
...  

Abstract Sepsis and severe sepsis contribute significantly to early treatment-related mortality after hematopoietic cell transplantation (HCT), with reported mortality rates due to severe sepsis during the engraftment admission of 30% and 55% for autologous and allogeneic HCT, respectively. Since the clinical presentation and characteristics of sepsis immediately after HCT can differ from those seen in the general population or in patients receiving non-HCT chemotherapy, detecting early signs of sepsis in HCT recipients becomes critical. Herein, we developed and validated a machine-learning-based sepsis prediction model for patients who underwent HCT at City of Hope, using variables within the Electronic Health Record (EHR) data. We evaluated a consecutive case series of 1046 HCTs (autologous: n=491, allogeneic: n=555) at our center between 2014 and 2017. The median age at the time of HCT was 56 years (range: 18-78). For this analysis, the primary clinical event was sepsis diagnosis within 100 days post-HCT, identified based on the use of the institutional sepsis management order set and mention of "sepsis" in the progress notes. The time of the sepsis order set was considered the time of sepsis for the analyses. To train the model, 829 visits (104 septic and 725 non-septic) and their data were used, while 217 visits (31 septic and 186 non-septic) were used as a validation cohort. At each hour after HCT, when a new data point was available, 47 variables were calculated from each patient's data and a risk score was assigned to each time point. These variables consisted of patient demographics, transplant type, regimen intensity, disease status, the Hematopoietic Cell Transplantation-Specific Comorbidity Index, lab values, vital signs, medication orders, and comorbidities. For the 829 visits in the training dataset, the 47 variables were calculated at 220,889 different time points, resulting in a total of 10,381,783 data points. Lab values and vital signs were considered as changes from each individual patient's baseline at each time point; the baseline for each lab value and vital sign was the last value measured before HCT. An ensemble of 20 random forest binary classification models was trained to identify and learn patterns of data for HCT patients at high risk for sepsis and to differentiate them from patients at lower sepsis risk. To help the model learn patterns of data prior to sepsis, available data from septic patients within the 24 hours preceding the diagnosis of sepsis were used. For the septic visits in the training dataset, there were 5048 time points, each having 47 variables. Variable importance for the 20 models was assessed using the Gini mean decrease accuracy method. The sum of the importance values from each model was calculated for each variable as the final importance value. Figure 1a shows the importance of variables using this method. Testing the model on the validation cohort resulted in an AUC of 0.85 (Figure 1b). At a threshold of 0.6, our model was 0.32 sensitive and 0.96 specific. At this threshold, the model identified 10 out of 31 septic patients with a median lead time of 119.5 hours, of which 2 patients were flagged as high risk at the time of transplant and developed sepsis at 17 and 60 days post-HCT. This lead time is what sets this predictive model apart from detection models that use organ failure, organ dysfunction, or other deterioration metrics as their detection criteria. At a threshold of 0.4, our model had 0.9 sensitivity and 0.65 specificity.
In summary, a machine-learning sepsis prediction model can be tailored to HCT recipients to improve the quality of care, prevent sepsis-associated organ damage, and decrease mortality post-HCT. Our model significantly outperforms the widely used Modified Early Warning Score (MEWS), which has an AUC of 0.73 in the general population. Possible applications of our model include showing a "red flag" at a threshold of 0.6 (0.32 true positive rate and 0.04 false positive rate) for antibiotic initiation/modification, and a "yellow flag" at a threshold of 0.4 (0.9 true positive rate and 0.35 false positive rate) suggesting closer monitoring or less aggressive treatments for the patient. Figure 1. Disclosures Dadwal: Merck: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding, Speakers Bureau; Gilead: Research Funding; AiCuris: Research Funding; Shire: Research Funding.
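Two details of this earlier model lend themselves to a short sketch: encoding labs and vitals as changes from each patient's pre-HCT baseline, and measuring the alert lead time as the interval between the first above-threshold risk score and the charted sepsis time. The pandas layout and names below are assumptions for illustration, not the authors' code.

```python
# Minimal sketch: baseline-delta features and alert lead-time computation.
from typing import Optional
import pandas as pd

def delta_from_baseline(obs: pd.DataFrame, baseline: pd.Series) -> pd.DataFrame:
    """Subtract each patient's last pre-HCT measurement from every post-HCT observation.
    `baseline` is indexed by the measurement columns, e.g. ["temp", "heart_rate", "wbc"]."""
    value_cols = baseline.index
    return obs[value_cols] - baseline

def lead_time_hours(times: pd.Series, risks: pd.Series,
                    sepsis_time: pd.Timestamp, threshold: float = 0.6) -> Optional[float]:
    """Hours between the first above-threshold score and sepsis; None if never flagged."""
    flagged = times[risks >= threshold]
    if flagged.empty:
        return None
    return (sepsis_time - flagged.iloc[0]).total_seconds() / 3600.0
```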


Web use and digitized information are increasing every day, and the amount of data generated is growing with them. At the same time, security attacks create many threats to networks, websites, and the Internet. Intrusion detection in a high-speed network is a genuinely hard task. A Hadoop implementation is used to address this challenge, namely detecting intrusions in a big-data environment in real time. Machine learning approaches are used to classify anomalous packet flows. Naïve Bayes performs classification using a vector of feature values drawn from a finite set. The decision tree is another supervised machine learning classifier, structured as a flowchart-like tree. The J48 and Naïve Bayes algorithms are implemented in the Hadoop MapReduce framework for parallel processing, using the KDDCup corrected benchmark dataset records. The results obtained are an 89.9% true positive rate and a 0.04% false positive rate for the Naïve Bayes algorithm, and a 98.06% true positive rate and a 0.001% false positive rate for the decision tree algorithm.
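The classification step reported above (Naïve Bayes and a J48-style decision tree evaluated by true and false positive rates) can be sketched on a single machine with scikit-learn as below. The distributed Hadoop MapReduce execution described in the abstract is not reproduced, and the toy data merely stands in for KDDCup feature vectors.

```python
# Minimal single-machine sketch of the Naive Bayes vs. decision tree comparison.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

def tpr_fpr(y_true, y_pred):
    """True/false positive rates from a binary confusion matrix (1 = attack)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return tp / (tp + fn), fp / (fp + tn)

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))                  # stand-in for KDDCup feature vectors
y = (X[:, 0] + 0.5 * X[:, 1] > 1.0).astype(int)  # stand-in "attack" label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

for name, clf in [("Naive Bayes", GaussianNB()),
                  ("Decision Tree (J48-style)", DecisionTreeClassifier(random_state=0))]:
    clf.fit(X_tr, y_tr)
    tpr, fpr = tpr_fpr(y_te, clf.predict(X_te))
    print(f"{name}: TPR={tpr:.3f}, FPR={fpr:.3f}")
```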


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1894
Author(s):  
Chun Guo ◽  
Zihua Song ◽  
Yuan Ping ◽  
Guowei Shen ◽  
Yuhei Cui ◽  
...  

Remote Access Trojans (RATs) are among the most serious security threats that organizations face today. At present, the two major RAT detection approaches are host-based and network-based detection. To complement one another's strengths, this article proposes a phased RAT detection method combining double-side features (PRATD). In PRATD, both host-side and network-side features are combined to build detection models, which helps distinguish RATs from benign programs because RATs not only generate traffic on the network but also leave traces on the host at run time. In addition, PRATD trains two different detection models for the two runtime states of RATs to improve the true positive rate (TPR). Experiments on network and host records collected from five kinds of benign programs and 20 well-known RATs show that PRATD can effectively detect RATs: it achieves a TPR as high as 93.609% with a false positive rate (FPR) as low as 0.407% for known RATs, and a TPR of 81.928% with an FPR of 0.185% for unknown RATs, which suggests it is a competitive candidate for RAT detection.
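A rough sketch of the double-side feature idea, combining host-side and network-side features and keeping a separate detector per assumed RAT runtime state, is shown below. The fusion by concatenation, the random forest classifiers, and the state names are assumptions for illustration; the paper's actual PRATD features and models may differ.

```python
# Illustrative sketch of combined host/network features with one detector per runtime state.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def combine_features(host_feats: np.ndarray, net_feats: np.ndarray) -> np.ndarray:
    """Concatenate host-side and network-side feature vectors sample-wise."""
    return np.hstack([host_feats, net_feats])

class TwoStateDetector:
    """One detector per assumed RAT runtime state (state names are placeholders)."""
    def __init__(self):
        self.models = {state: RandomForestClassifier(n_estimators=100, random_state=0)
                       for state in ("state_a", "state_b")}

    def fit(self, X_by_state: dict, y_by_state: dict):
        for state, model in self.models.items():
            model.fit(X_by_state[state], y_by_state[state])
        return self

    def predict(self, X: np.ndarray, state: str) -> np.ndarray:
        """Score samples with the model matching their runtime state (1 = RAT)."""
        return self.models[state].predict(X)
```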


2019 ◽  
Author(s):  
Rayees Rahman ◽  
Arad Kodesh ◽  
Stephen Z Levine ◽  
Sven Sandin ◽  
Abraham Reichenberg ◽  
...  

Abstract. Importance: Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improves developmental course and outcome. Objective: Develop a machine learning (ML) method predicting the diagnosis of ASD in offspring in a general population sample, using parental electronic medical records (EMR) available before childbirth. Design: Prognostic study of EMR data within a single Israeli health maintenance organization, for the parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. The complete EMR record of the parents was used to develop various ML models to predict the risk of having a child with ASD. Main outcomes and measures: Routinely available parental sociodemographic information, medical histories, and prescribed medications data until the offspring's birth were used to generate features to train various machine learning algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross-validation, by computing C statistics, sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value, PPV). Results: All ML models tested had similar performance, achieving an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85% for predicting ASD in this dataset. Conclusion and relevance: ML algorithms combined with EMR capture early life ASD risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children. Key points. Question: Can autism risk in children be predicted using the pre-birth electronic medical record (EMR) of the parents? Findings: In this population-based study that included 1,397 children with autism spectrum disorder (ASD) and 94,741 non-ASD children, we developed a machine learning classifier for predicting the likelihood of a childhood diagnosis of ASD with an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85%. Meaning: The results presented serve as a proof-of-principle of the potential utility of EMR for the identification of a large proportion of future children at high risk of ASD.
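The evaluation protocol described here (10-fold cross-validation reporting the C statistic, sensitivity, specificity, accuracy, false positive rate, and PPV) can be sketched as below for one of the mentioned model families, logistic regression. The feature construction from parental EMR data is assumed and not reproduced, and the 0.5 decision threshold is an illustrative choice.

```python
# Minimal sketch of 10-fold cross-validated evaluation with the metrics listed above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score, confusion_matrix

def evaluate_10fold(X: np.ndarray, y: np.ndarray, threshold: float = 0.5) -> dict:
    metrics = {"c_stat": [], "sens": [], "spec": [], "acc": [], "fpr": [], "ppv": []}
    for tr, te in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
        model = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
        prob = model.predict_proba(X[te])[:, 1]
        pred = (prob >= threshold).astype(int)
        tn, fp, fn, tp = confusion_matrix(y[te], pred, labels=[0, 1]).ravel()
        metrics["c_stat"].append(roc_auc_score(y[te], prob))
        metrics["sens"].append(tp / (tp + fn))
        metrics["spec"].append(tn / (tn + fp))
        metrics["acc"].append((tp + tn) / (tp + tn + fp + fn))
        metrics["fpr"].append(fp / (fp + tn))
        metrics["ppv"].append(tp / (tp + fp) if (tp + fp) else float("nan"))
    return {name: float(np.mean(values)) for name, values in metrics.items()}
```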


2012 ◽  
pp. 830-850
Author(s):  
Abhilash Alexander Miranda ◽  
Olivier Caelen ◽  
Gianluca Bontempi

This chapter presents a comprehensive scheme for automated detection of colorectal polyps in computed tomography colonography (CTC), with particular emphasis on robust learning algorithms that differentiate polyps from non-polyp shapes. The authors' automated CTC scheme introduces two orientation-independent features that encode the shape characteristics which aid in classifying polyps and non-polyps with high accuracy, a low false positive rate, and low computational cost, making the scheme suitable for colorectal cancer screening initiatives. Experiments using state-of-the-art machine learning algorithms, viz., lazy learning, support vector machines, and naïve Bayes classifiers, reveal the robustness of the two features in detecting polyps with diameters greater than 10 mm at 100% sensitivity, while attaining low total false positive rates of, respectively, 3.05, 3.47, and 0.71 per CTC dataset at specificities above 99% when tested on 58 CTC datasets. The results were validated using colonoscopy reports provided by expert radiologists.


2019 ◽  
Vol 09 (03) ◽  
pp. e262-e267
Author(s):  
Henry Alexander Easley ◽  
Todd Michael Beste

Objectives: To evaluate the diagnostic accuracy of a multivariable prediction model, the Shoulder Screen (Perigen, Inc.), and compare it with the American College of Obstetricians and Gynecologists (ACOG) guidelines to prevent harm from shoulder dystocia. Study Design: The model was applied to two groups of 199 patients each who delivered during a 4-year period. One group experienced shoulder dystocia and the other group delivered without shoulder dystocia. The model's accuracy was analyzed, and its performance was compared with the ACOG guideline. Results: The sensitivity, specificity, positive predictive value, and negative predictive value of the model were 23.1%, 99.5%, 97.9%, and 56.4%, respectively. The sensitivity of the ACOG guideline was 10.1%. The false-positive rate of the model was 0.5%. The accuracy of the model was 61.3%. Conclusion: A multivariable prediction model can predict shoulder dystocia and is more accurate than the ACOG guidelines.
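The reported figures all follow from a 2x2 table of screen results versus outcomes. The sketch below shows that computation; the counts are a reconstruction chosen to reproduce the percentages quoted above (assuming 199 patients per group), not the study's published table.

```python
# Minimal sketch: standard screening metrics from a 2x2 confusion table.
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),              # true positive rate
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
        "false_positive_rate": fp / (fp + tn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Reconstructed counts consistent with the percentages above: 46 of 199 dystocia cases
# flagged, 198 of 199 controls correctly not flagged.
print(diagnostic_metrics(tp=46, fp=1, fn=153, tn=198))
```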


1979 ◽  
Vol 25 (12) ◽  
pp. 2034-2037 ◽  
Author(s):  
L B Sheiner ◽  
L A Wheeler ◽  
J K Moore

Abstract The percentage of mislabeled specimens detected (true-positive rate) and the percentage of correctly labeled specimens misidentified (false-positive rate) were computed for three previously proposed delta check methods and two linear discriminant functions. The true-positive rate was computed from a set of pairs of specimens, each having one member replaced by a member from another pair chosen at random. The relationship between true-positive and false-positive rates was similar among the delta check methods tested, indicating equal performance for all of them over the range of false-positive rates of interest. At a practical false-positive operating rate of about 5%, delta check methods detect only about 50% of mislabeled specimens; even if the actual mislabeling rate is moderate (e.g., 1%), only about 10% of specimens flagged by a delta check will actually have been mislabeled.
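A delta check flags a specimen whose result differs too much from the same patient's previous result. The sketch below shows one simple absolute-difference rule; the analyte and threshold are illustrative assumptions rather than any of the specific methods compared in this paper.

```python
# Minimal sketch of a simple absolute-difference delta check.
def delta_check(previous: float, current: float, max_delta: float) -> bool:
    """Flag a possible mislabeled specimen if the change exceeds an allowed delta."""
    return abs(current - previous) > max_delta

# Example: flag a potassium result that moves more than 1.0 mmol/L between draws.
print(delta_check(previous=4.1, current=5.6, max_delta=1.0))  # True  -> flag for review
print(delta_check(previous=4.1, current=4.6, max_delta=1.0))  # False -> accept
```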


2020 ◽  
Vol 63 (1) ◽  
Author(s):  
Rayees Rahman ◽  
Arad Kodesh ◽  
Stephen Z. Levine ◽  
Sven Sandin ◽  
Abraham Reichenberg ◽  
...  

Abstract Background. Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improves developmental course and outcome. The aim of the current study was to test the ability of machine learning (ML) models applied to electronic medical records (EMRs) to predict ASD early in life, in a general population sample. Methods. We used EMR data from a single Israeli Health Maintenance Organization, including EMR information for the parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. Routinely available parental sociodemographic information, parental medical histories, and prescribed medications data were used to generate features to train various ML algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross-validation by computing the area under the receiver operating characteristic curve (AUC; C-statistic), sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value [PPV]). Results. All ML models tested had similar performance. The average performance across all models had a C-statistic of 0.709, sensitivity of 29.93%, specificity of 98.18%, accuracy of 95.62%, false positive rate of 1.81%, and PPV of 43.35% for predicting ASD in this dataset. Conclusions. We conclude that ML algorithms combined with EMR capture early-life ASD risk and reveal previously unknown features associated with ASD risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children.


2009 ◽  
Vol 53 (7) ◽  
pp. 2949-2954 ◽  
Author(s):  
Isabel Cuesta ◽  
Concha Bielza ◽  
Pedro Larrañaga ◽  
Manuel Cuenca-Estrella ◽  
Fernando Laguna ◽  
...  

ABSTRACT European Committee on Antimicrobial Susceptibility Testing (EUCAST) breakpoints classify Candida strains with a fluconazole MIC ≤ 2 mg/liter as susceptible, those with a fluconazole MIC of 4 mg/liter as representing intermediate susceptibility, and those with a fluconazole MIC > 4 mg/liter as resistant. Machine learning models are supported by complex statistical analyses assessing whether the results have statistical relevance. The aim of this work was to use supervised classification algorithms to analyze the clinical data used to produce EUCAST fluconazole breakpoints. Five supervised classifiers (J48, Correlation and Regression Trees [CART], OneR, Naïve Bayes, and Simple Logistic) were used to analyze two cohorts of patients with oropharyngeal candidosis and candidemia. The target variable was the outcome of the infections, and the predictor variables consisted of values for the MIC or the proportion between the dose administered and the MIC of the isolate (dose/MIC). Statistical power was assessed by determining values for sensitivity and specificity, the false-positive rate, the area under the receiver operating characteristic (ROC) curve, and the Matthews correlation coefficient (MCC). CART obtained the best statistical power for a MIC > 4 mg/liter for detecting failures (sensitivity, 87%; false-positive rate, 8%; area under the ROC curve, 0.89; MCC index, 0.80). For dose/MIC determinations, the target was >75, with a sensitivity of 91%, a false-positive rate of 10%, an area under the ROC curve of 0.90, and an MCC index of 0.80. Other classifiers gave similar breakpoints with lower statistical power. EUCAST fluconazole breakpoints have been validated by means of machine learning methods. These computer tools must be incorporated in the process for developing breakpoints to avoid researcher bias, thus enhancing the statistical power of the model.
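The breakpoint analysis above scores a candidate MIC cut-off by its ability to predict treatment failure, using sensitivity, false-positive rate, area under the ROC curve, and the Matthews correlation coefficient. A minimal sketch of scoring one candidate breakpoint is shown below; the MIC values and outcomes are invented for illustration, not EUCAST or study data.

```python
# Minimal sketch: score a candidate MIC breakpoint against observed treatment outcomes.
import numpy as np
from sklearn.metrics import confusion_matrix, matthews_corrcoef, roc_auc_score

def score_breakpoint(mics: np.ndarray, failed: np.ndarray, breakpoint: float = 4.0) -> dict:
    """Treat MIC > breakpoint as 'predicted failure' and compare with observed outcomes
    (failed: 1 = clinical failure, 0 = success)."""
    predicted = (mics > breakpoint).astype(int)
    tn, fp, fn, tp = confusion_matrix(failed, predicted, labels=[0, 1]).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "mcc": matthews_corrcoef(failed, predicted),
        "roc_auc": roc_auc_score(failed, mics),   # MIC itself used as the ranking score
    }

mics = np.array([0.5, 1, 2, 2, 4, 8, 16, 32, 64, 0.25])   # made-up isolates (mg/liter)
failed = np.array([0,  0, 0, 0, 0, 1, 1,  1,  1,  0])     # made-up outcomes
print(score_breakpoint(mics, failed, breakpoint=4.0))
```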


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 348 ◽  
Author(s):  
Chang-Hee Han ◽  
Euijin Kim ◽  
Chang-Hwan Im

Asynchronous brain–computer interfaces (BCIs) based on electroencephalography (EEG) generally suffer from poor performance in terms of classification accuracy and false-positive rate (FPR). Thus, BCI toggle switches based on electrooculogram (EOG) signals were developed to toggle synchronous BCI systems on and off. Conventional BCI toggle switches exhibit fast responses with high accuracy; however, they have a high FPR or cannot be applied to patients with oculomotor impairments. To circumvent these issues, we developed a novel BCI toggle switch that users can employ to toggle synchronous BCIs on or off by holding their breath for a few seconds. Two states, normal breathing and breath holding, were classified using linear discriminant analysis with features extracted from respiration-modulated photoplethysmography (PPG) signals. A real-time BCI toggle switch was implemented, with calibration requiring only 1 min of PPG training data. We evaluated the performance of our PPG switch, in terms of true-positive rate and FPR, by combining it with a steady-state visual evoked potential-based BCI system designed to control four external devices. The parameters of the PPG switch were optimized through an offline experiment with five subjects, and the performance of the switch system was evaluated in an online experiment with seven subjects. All the participants successfully turned on the BCI by holding their breath for approximately 10 s (100% accuracy), and the switch system exhibited a very low FPR of 0.02 false operations per minute, which is the lowest FPR reported thus far. All participants could successfully control the external devices in the synchronous BCI mode. Our results demonstrate that the proposed PPG-based BCI toggle switch can be used to implement practical BCIs.
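The switch's core classification step, distinguishing breath holding from normal breathing with linear discriminant analysis on respiration-modulated PPG, can be sketched as below. The window features here (simple amplitude statistics of the PPG signal) are assumptions for illustration rather than the authors' feature set.

```python
# Minimal sketch: LDA classification of breath holding vs. normal breathing from PPG windows.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def window_features(ppg_window: np.ndarray) -> np.ndarray:
    """Respiration-related summary features of one PPG window (illustrative choices)."""
    return np.array([ppg_window.std(),                       # respiratory amplitude modulation
                     ppg_window.max() - ppg_window.min(),    # peak-to-peak swing
                     np.abs(np.diff(ppg_window)).mean()])    # short-term variability

def fit_breath_hold_detector(windows: list, labels: list) -> LinearDiscriminantAnalysis:
    """windows: list of 1-D PPG arrays; labels: 0 = normal breathing, 1 = breath holding."""
    X = np.vstack([window_features(w) for w in windows])
    return LinearDiscriminantAnalysis().fit(X, np.asarray(labels))

def is_breath_hold(model: LinearDiscriminantAnalysis, ppg_window: np.ndarray) -> bool:
    """Classify a new PPG window; True toggles the synchronous BCI on or off."""
    return bool(model.predict(window_features(ppg_window).reshape(1, -1))[0] == 1)
```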

