An Explainable Machine Learning Model for Early Prediction of Sepsis Using ICU Data

2021 ◽  
Author(s):  
Naimahmed Nesaragi ◽  
Shivnarayan Patidar

Early identification of individuals with sepsis is very useful in assisting clinical triage and decision-making, resulting in early intervention and improved outcomes. This study aims to develop an explainable machine learning model with clinical interpretability to predict sepsis 6 hours before onset and to validate it with improved prediction risk power for every time interval since admission to the ICU. The retrospective observational cohort study is carried out using PhysioNet Challenge 2019 ICU data from three distinct hospital systems, viz. A, B, and C. Data from A and B were shared publicly for training and validation, while sequestered data from all three cohorts were used for scoring. However, this study is limited to the publicly available training data. The training data contain 1,552,210 patient records of 40,336 ICU patients with up to 40 clinical variables (sourced for each hour of their ICU stay), divided into two datasets based on hospital systems A and B. The clinical feature exploration and interpretation for early prediction of sepsis is achieved using the proposed framework, viz. the explainable Machine Learning model for Early Prediction of Sepsis (xMLEPS). A total of 85 features, comprising the given 40 clinical variables augmented with 10 derived physiological features and 35 time-lag difference features, are fed to xMLEPS for the prediction of sepsis onset. A ten-fold cross-validation scheme is employed wherein an optimal prediction risk threshold is searched for each of the 10 LightGBM models. These optimum threshold values are later used by the corresponding models to refine the predictive power in terms of utility score for the prediction of labels in each fold. The entire framework is designed via Bayesian optimization and trained with the resultant feature set of 85 features, yielding an average normalized utility score of 0.4214 and an area under the receiver operating characteristic curve of 0.8591 on the publicly available training data.
This study establishes a practical and explainable sepsis onset prediction model for ICU data using an applied ML approach, mainly gradient boosting. The study highlights the clinical significance of physiological inter-relations among the given and proposed clinical signs via feature importance and SHapley Additive exPlanations (SHAP) plots for visualized interpretation.
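
The per-fold threshold search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract optimizes the PhysioNet utility score, whereas the sketch below assumes F1 as a stand-in metric, a uniform grid of candidate thresholds, and hypothetical function names (`f1_score`, `best_threshold`).

```python
from typing import Callable, Sequence, Tuple


def f1_score(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Stand-in metric (assumption); the study optimizes a utility score instead."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(
    labels: Sequence[int],
    risks: Sequence[float],
    score: Callable[[Sequence[int], Sequence[int]], float] = f1_score,
    steps: int = 100,
) -> Tuple[float, float]:
    """Grid-search the risk cutoff that maximizes `score` on one validation fold."""
    best_t, best_s = 0.0, float("-inf")
    for i in range(steps + 1):
        t = i / steps
        # A patient-hour is flagged when the predicted risk reaches the cutoff.
        preds = [1 if r >= t else 0 for r in risks]
        s = score(labels, preds)
        if s > best_s:  # strict '>' keeps the lowest threshold on ties
            best_t, best_s = t, s
    return best_t, best_s


# Hypothetical validation fold: true labels and predicted risks.
fold_labels = [0, 0, 1, 1]
fold_risks = [0.1, 0.4, 0.35, 0.8]
threshold, fold_score = best_threshold(fold_labels, fold_risks)
```

In a ten-fold scheme, a search of this kind would be repeated once per fold, and each model would then use its own cutoff when labels are predicted.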

2021 ◽  
Vol 12 ◽  
Author(s):  
Sijie Chen ◽  
Wenjing Zhou ◽  
Jinghui Tu ◽  
Jian Li ◽  
Bo Wang ◽  
...  

Purpose: Establish a suitable machine learning model, in an integrated learning approach, to identify the primary lesions of primary metastatic tumors more accurately and so improve the diagnostic efficiency for primary lesions. Methods: After deleting the features whose expression level is lower than the threshold, we use two methods to perform feature selection and use XGBoost for classification. After the optimal model is selected through 10-fold cross-validation, it is verified on an independent test set. Results: Selecting features with around 800 genes for training, the R2-score of a 10-fold CV of training data can reach 96.38%, and the R2-score of test data can reach 83.3%. Conclusion: These findings suggest that by combining tumor data with machine learning methods, each cancer has its corresponding classification accuracy, which can be used to predict primary metastatic tumors’ location. The machine-learning-based method can be used as an orthogonal diagnostic method to judge the machine learning model processing and clinical actual pathological conditions.
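
The low-expression filtering step mentioned in the Methods can be sketched as follows. The abstract does not say how "expression level" is summarized per feature, so the mean across samples is an assumption here, and the function name `filter_low_expression` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def filter_low_expression(
    matrix: Sequence[Sequence[float]],
    gene_names: Sequence[str],
    threshold: float,
) -> Tuple[List[List[float]], List[str]]:
    """Keep genes (columns) whose mean expression across samples is >= threshold.

    Using the mean is an assumption; the source only states that features
    with expression below a threshold are deleted.
    """
    n_samples = len(matrix)
    keep = [
        j
        for j in range(len(gene_names))
        if sum(row[j] for row in matrix) / n_samples >= threshold
    ]
    filtered = [[row[j] for j in keep] for row in matrix]
    return filtered, [gene_names[j] for j in keep]


# Toy data: two samples (rows), three genes (columns).
expression = [[0.0, 5.0, 2.0], [0.2, 7.0, 0.0]]
kept_matrix, kept_genes = filter_low_expression(expression, ["g1", "g2", "g3"], 1.0)
```

The reduced matrix would then go to feature selection and the classifier.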


2020 ◽  
Vol 11 (1) ◽  
pp. 223
Author(s):  
Minsoo Kim ◽  
Sarang Yi ◽  
Seokmoo Hong

Because water pipes are thin and difficult to fasten by welding or screws, they are fastened by a crimping joint method using a metal ring and a rubber ring. In the conventional crimping joint method, the metal ring and the rubber ring are arranged side by side. However, if water leaks from the rubber ring, the adjacent metal ring corrodes rapidly. In this study, to delay and minimize the corrosion of connected water pipes, we propose a spaced crimping joint method in which metal rings and rubber rings are separated at appropriate intervals. This not only improves the contact performance between the connected water pipes but also minimizes the load applied to the crimping jig during crimping, preventing damage to the jig. To this end, finite element analyses (FEA) were performed for the crimp tool and the crimping process, and the design parameters were set as the curling length at the top of the joint, the distance between the metal rings and rubber rings, and the crimp jig radius. Training data for machine learning were acquired through FEA of 100 cases. A machine learning model was then trained on these data and compared with a regression model to verify its performance. When the training set is small, the two methods perform similarly. However, the larger the training set, the higher the accuracy of the machine learning model's predictions. Finally, the spaced crimping joint with the derived optimal shape was manufactured, and the maximum pressure and pressure distribution applied during compression were measured using a pressure film. These values are close to those obtained by finite element analysis under the same conditions, which verifies the validity of the approach proposed in this study.


Author(s):  
Dr. Kalaivazhi Vijayaragavan ◽  
S. Prakathi ◽  
S. Rajalakshmi ◽  
M Sandhiya

Machine learning is a subfield of artificial intelligence in which algorithms learn to make decisions based on data and try to behave like a human being. Classification is one of the most fundamental concepts in machine learning. It is a process of recognizing, understanding, and grouping ideas and objects into pre-set categories or sub-populations. Using precategorized training datasets, machine learning uses a variety of algorithms to classify future datasets into categories. Classification algorithms use input training data to predict which of the predetermined categories subsequent data fall into. Designing a neural network is regarded as an effective way to improve classification accuracy. The design of a neural network usually considers a scaling layer, perceptron layers, and a probabilistic layer. In this paper, an enhanced model selection is evaluated with a training and testing strategy, and the classification accuracy is predicted. Finally, the predicted classification accuracy is compared across two popular machine learning frameworks, PyTorch and TensorFlow. Results demonstrate that the proposed method can predict with more accuracy. After deployment, the performance of our machine learning model was evaluated on the iris data set.


Critical Care ◽  
2021 ◽  
Vol 25 (1) ◽  
Author(s):  
Junzi Dong ◽  
Ting Feng ◽  
Binod Thapa-Chhetry ◽  
Byung Gu Cho ◽  
Tunu Shum ◽  
...  

Abstract Background Acute kidney injury (AKI) in pediatric critical care patients is diagnosed using elevated serum creatinine, which occurs only after kidney impairment. There are no treatments other than supportive care for AKI once it has developed, so it is important to identify patients at risk to prevent injury. This study develops a machine learning model to learn pre-disease patterns of physiological measurements and predict pediatric AKI up to 48 h earlier than the currently established diagnostic guidelines. Methods EHR data from 16,863 pediatric critical care patients between 1 month and 21 years of age from three independent institutions were used to develop a single machine learning model for early prediction of creatinine-based AKI using intelligently engineered predictors, such as creatinine rate of change, to automatically assess real-time AKI risk. The primary outcome is prediction of moderate to severe AKI (Stage 2/3), and secondary outcomes are prediction of any AKI (Stage 1/2/3) and requirement of renal replacement therapy (RRT). Predictions generate alerts allowing fast assessment and reduction of AKI risk, such as: “patient has 90% risk of developing AKI in the next 48 h” along with contextual information and suggested response such as “patient on aminoglycosides, suggest check level and review dose and indication”. Results The model was successful in predicting Stage 2/3 AKI prior to detection by conventional criteria with a median lead-time of 30 h at AUROC of 0.89. The model predicted 70% of subsequent RRT episodes, 58% of Stage 2/3 episodes, and 41% of any AKI episodes. The ratio of false to true alerts of any AKI episodes was approximately one-to-one (PPV 47%). Among patients predicted, 79% received potentially nephrotoxic medication after being identified by the model but before development of AKI.
Conclusions As the first multi-center validated AKI prediction model for all pediatric critical care patients, the machine learning model described in this study accurately predicts moderate to severe AKI up to 48 h in advance of AKI onset. The model may improve outcome of pediatric AKI by providing early alerting and actionable feedback, potentially preventing or reducing AKI by implementing early measures such as medication adjustment.
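
The link between the reported PPV and the "approximately one-to-one" ratio of false to true alerts follows from the definition PPV = TP / (TP + FP), so FP / TP = (1 - PPV) / PPV. A small worked check, with a hypothetical helper name:

```python
def false_to_true_alert_ratio(ppv: float) -> float:
    """Return FP/TP implied by a positive predictive value.

    From PPV = TP / (TP + FP) it follows that FP / TP = (1 - PPV) / PPV.
    """
    if not 0.0 < ppv <= 1.0:
        raise ValueError("ppv must be in (0, 1]")
    return (1.0 - ppv) / ppv


# PPV of 47% as reported in the abstract for any-AKI alerts.
ratio = false_to_true_alert_ratio(0.47)
print(f"{ratio:.2f} false alerts per true alert")  # prints "1.13 false alerts per true alert"
```

A PPV of 47% therefore corresponds to slightly more than one false alert per true alert.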


Author(s):  
Dr. M. P. Borawake

Abstract: The food we consume plays an important role in our daily life. It provides the energy we need to work, grow, be active, and to learn and think. Healthy food is essential for good health and nutrition. Light, oxygen, heat, humidity, temperature and spoilage bacteria can all affect both the safety and quality of perishable foods. Food kept at room temperature undergoes chemical reactions after a certain period of time, which affect its taste, texture and smell. Consuming spoiled food is harmful for consumers as it can lead to foodborne diseases. This project aims at detecting spoiled food using appropriate sensors and monitoring the gases released by the particular food item. Sensors will measure different parameters of the food such as pH, ammonia gas, oxygen level, moisture, etc. The microcontroller takes the readings from the sensors, and these readings are then given as input to a machine learning model which decides whether the food is spoilt or not based on a training data set. We also plan to implement a machine learning model which can calculate the lifespan of that food item. Index Terms: Arduino Uno, Food spoilage, IoT, Machine Learning, Sensors.


Author(s):  
Osval Antonio Montesinos López ◽  
Abelardo Montesinos López ◽  
Jose Crossa

Abstract: The overfitting phenomenon happens when a statistical machine learning model learns the noise as well as the signal that is present in the training data. On the other hand, underfitting occurs when only a few predictors are included in the statistical machine learning model, so that it represents the complete structure of the data pattern poorly. This problem also arises when the training data set is too small, and thus an underfitted model does a poor job of fitting the training data and unsatisfactorily predicts new data points. This chapter describes the importance of the trade-off between prediction accuracy and model interpretability, as well as the difference between explanatory and predictive modeling: explanatory modeling minimizes bias, whereas predictive modeling seeks to minimize the combination of bias and estimation variance. We assess the importance and different methods of cross-validation as well as the importance and strategies of tuning that are key to the successful use of some statistical machine learning methods. We explain the most important metrics for evaluating the prediction performance for continuous, binary, categorical, and count response variables.
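
As an illustration of the cross-validation idea this abstract refers to, the following sketch splits sample indices into k folds so that every sample is held out for validation exactly once. It is a generic example rather than the chapter's own code; the function name `k_fold_indices` is hypothetical, and the folds are contiguous and unshuffled for simplicity.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for k contiguous folds.

    Fold sizes differ by at most one. No shuffling is done here; in practice
    the indices are usually shuffled first.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


# Ten samples split into three folds of sizes 4, 3 and 3.
for train_idx, val_idx in k_fold_indices(10, 3):
    print(len(train_idx), val_idx)
```

Comparing the error on each training split with the error on its held-out fold is one common way to spot overfitting (low training error, high validation error) or underfitting (both errors high).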


Author(s):  
Essam A. Rashed ◽  
Akimasa Hirata

The significant health and economic effects of COVID-19 emphasize the requirement for reliable forecasting models to avoid the sudden collapse of healthcare facilities with overloaded hospitals. Several forecasting models have been developed based on the data acquired within the early stages of the virus spread. However, with the recent emergence of new virus variants, it is unclear how the new strains could influence the efficiency of forecasting using models adopted using earlier data. In this study, we analyzed daily positive cases (DPC) data using a machine learning model to understand the effect of new viral variants on morbidity rates. A deep learning model that considers several environmental and mobility factors was used to forecast DPC in six districts of Japan. From machine learning predictions with training data since the early days of COVID-19, high-quality estimation has been achieved for data obtained earlier than March 2021. However, a significant upsurge was observed in some districts after the discovery of the new COVID-19 variant B.1.1.7 (Alpha). An average increase of 20–40% in DPC was observed after the emergence of the Alpha variant and an increase of up to 20% has been recognized in the effective reproduction number. Approximately four weeks was needed for the machine learning model to adjust the forecasting error caused by the new variants. The comparison between machine-learning predictions and reported values demonstrated that the emergence of new virus variants should be considered within COVID-19 forecasting models. This study presents an easy yet efficient way to quantify the change caused by new viral variants with potential usefulness for global data analysis.


Author(s):  
Lydia T Tam ◽  
Kristen W Yeom ◽  
Jason N Wright ◽  
Alok Jaju ◽  
Alireza Radmanesh ◽  
...  

Abstract Background Diffuse intrinsic pontine gliomas (DIPGs) are lethal pediatric brain tumors. Presently, MRI is the mainstay of disease diagnosis and surveillance. We identify clinically significant computational features from MRI and create a prognostic machine learning model. Methods We isolated tumor volumes of T1-post contrast (T1) and T2-weighted (T2) MRIs from 177 treatment-naïve DIPG patients from an international cohort for model training and testing. The Quantitative Image Feature Pipeline and PyRadiomics were used for feature extraction. Ten-fold cross-validation of LASSO Cox regression selected optimal features to predict overall survival (OS) in the training dataset, and the resulting model was tested in the independent testing dataset. We analyzed model performance using clinical variables (age at diagnosis and sex) only, radiomics only, and radiomics plus clinical variables. Results All selected features were intensity- and texture-based features on the wavelet-filtered images (three T1 grey-level co-occurrence matrix (GLCM) texture features, one T2 GLCM texture feature, and T2 first-order mean). This multivariable Cox model demonstrated a concordance of 0.68 [95% CI: 0.61-0.74] in the training dataset, significantly outperforming the clinical-only model (C=0.57 [95% CI: 0.49-0.64]). Adding clinical features to radiomics slightly improved performance (C=0.70 [95% CI: 0.64-0.77]). The combined radiomics and clinical model was validated in the independent testing dataset (C=0.59 [95% CI: 0.51-0.67], Noether’s test p=0.02). Conclusion In this international study, we demonstrate the use of radiomic signatures to create a machine learning model for DIPG prognostication. Standardized, quantitative approaches that objectively measure DIPG changes, including computational MRI evaluation, could offer new approaches to assessing tumor phenotype and serve a future role for optimizing clinical trial eligibility and tumor surveillance.
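
The concordance values (C) quoted above measure how often a model ranks patients correctly by risk. The sketch below computes Harrell's concordance index in its textbook form for right-censored data. It is not the study's pipeline: the function name is hypothetical, the toy data are invented for illustration, and pairs with tied survival times are simply skipped.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] == 1) and a strictly shorter time than subject j.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (invented): survival times, event flags (0 = censored), model risk scores.
c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(c)  # prints 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.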

