Prediction Model of Anastomotic Leakage Among Esophageal Cancer Patients After Receiving an Esophagectomy: Machine Learning Approach

10.2196/27110 ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. e27110
Author(s):  
Ziran Zhao ◽  
Xi Cheng ◽  
Xiao Sun ◽  
Shanrui Ma ◽  
Hao Feng ◽  
...  

Background Anastomotic leakage (AL) is one of the severe postoperative adverse events (5%-30%), and it is related to increased medical costs in cancer patients who undergo esophagectomies. Machine learning (ML) methods show good performance at predicting risk for AL. However, no ML-based AL risk prediction model is available for the Chinese population. Objective This study uses ML techniques to develop and validate a risk prediction model to screen patients with emerging AL risk factors. Methods Analyses were performed using medical records from 710 patients who underwent esophagectomies at the National Clinical Research Center for Cancer between January 2010 and May 2015. We randomly split (9:1) the data set into a training data set of 639 patients and a testing data set of 71 patients using a computer algorithm. We assessed multiple classification tools to create a multivariate risk prediction model. The ML algorithms assessed were decision tree, random forest, naive Bayes, and logistic regression with least absolute shrinkage and selection operator. The optimal AL prediction model was selected based on model evaluation metrics. Results The final risk panel included 36 independent risk features. Of those, 10 features were identified as significant by the logistic model, including aortic calcification (OR 2.77, 95% CI 1.32-5.81), celiac trunk calcification (OR 2.79, 95% CI 1.20-6.48), forced expiratory volume 1% (OR 0.51, 95% CI 0.30-0.89), TLco (OR 0.56, 95% CI 0.27-1.18), peripheral vascular disease (OR 4.97, 95% CI 1.44-17.07), laparoscope (OR 3.92, 95% CI 1.23-12.51), postoperative length of hospital stay (OR 1.17, 95% CI 1.13-1.21), vascular permeability activity (OR 0.46, 95% CI 0.14-1.48), and fat liquefaction of incisions (OR 4.36, 95% CI 1.86-10.21). Logistic regression with least absolute shrinkage and selection operator offered the highest prediction quality, with an area under the receiver operating characteristic curve of 72% in the training data set. The model achieved similarly high performance on the testing data set. Conclusions Our model offered a prediction of AL with high accuracy, assisting in AL prevention and treatment. A personalized ML prediction model with a purely data-driven selection of features is feasible and effective in predicting AL in patients who underwent esophagectomy.
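As a reading aid for the odds ratios above, the following minimal sketch recovers the log-odds coefficient and its standard error from each reported OR and 95% CI, and flags whether the interval excludes 1. It assumes the intervals are Wald intervals (symmetric on the log scale), which the abstract does not state; the function name `wald_summary` is hypothetical.

```python
import math

# (feature, odds ratio, CI lower, CI upper), copied from the abstract above.
REPORTED = [
    ("aortic calcification", 2.77, 1.32, 5.81),
    ("celiac trunk calcification", 2.79, 1.20, 6.48),
    ("forced expiratory volume 1%", 0.51, 0.30, 0.89),
    ("TLco", 0.56, 0.27, 1.18),
    ("peripheral vascular disease", 4.97, 1.44, 17.07),
    ("laparoscope", 3.92, 1.23, 12.51),
    ("postoperative length of hospital stay", 1.17, 1.13, 1.21),
    ("vascular permeability activity", 0.46, 0.14, 1.48),
    ("fat liquefaction of incisions", 4.36, 1.86, 10.21),
]

Z95 = 1.959964  # two-sided 95% standard normal quantile


def wald_summary(odds_ratio, lower, upper):
    """Return (beta, standard error, interval_excludes_one).

    Assumption: the CI is exp(beta +/- Z95 * se), i.e. a Wald interval.
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    excludes_one = not (lower <= 1.0 <= upper)
    return beta, se, excludes_one


for name, odds_ratio, lower, upper in REPORTED:
    beta, se, excludes_one = wald_summary(odds_ratio, lower, upper)
    print(f"{name}: beta={beta:.3f}, se={se:.3f}, CI excludes 1: {excludes_one}")
```

Under this assumption, an interval that excludes 1 corresponds to a two-sided Wald test with P<.05 for that coefficient.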

10.2196/23128 ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. e23128
Author(s):  
Pan Pan ◽  
Yichao Li ◽  
Yongjiu Xiao ◽  
Bingchao Han ◽  
Longxiang Su ◽  
...  

Background Patients with COVID-19 in the intensive care unit (ICU) have a high mortality rate, and methods to assess patients’ prognosis early and administer precise treatment are of great significance. Objective The aim of this study was to use machine learning to construct a model for the analysis of risk factors and prediction of mortality among ICU patients with COVID-19. Methods In this study, 123 patients with COVID-19 in the ICU of Vulcan Hill Hospital were retrospectively selected from the database, and the data were randomly divided into a training data set (n=98) and test data set (n=25) with a 4:1 ratio. Significance tests, correlation analysis, and factor analysis were used to screen 100 potential risk factors individually. Conventional logistic regression methods and four machine learning algorithms were used to construct the risk prediction model for the prognosis of patients with COVID-19 in the ICU. The performance of these machine learning models was measured by the area under the receiver operating characteristic curve (AUC). Interpretation and evaluation of the risk prediction model were performed using calibration curves, SHapley Additive exPlanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), etc, to ensure its stability and reliability. The outcome was based on the ICU deaths recorded from the database. Results Layer-by-layer screening of 100 potential risk factors finally revealed 8 important risk factors that were included in the risk prediction model: lymphocyte percentage, prothrombin time, lactate dehydrogenase, total bilirubin, eosinophil percentage, creatinine, neutrophil percentage, and albumin level. Finally, an eXtreme Gradient Boosting (XGBoost) model established with the 8 important risk factors showed the best recognition ability in the training set of 5-fold cross validation (AUC=0.86) and the verification queue (AUC=0.92). 
The calibration curve showed that the risk predicted by the model was in good agreement with the actual risk. In addition, the SHAP and LIME algorithms were used to interpret both the features and the individual predictions of the black-box XGBoost model. The model was also translated into a web-based risk calculator that is freely available for public use. Conclusions The 8-factor XGBoost model predicts risk of death in ICU patients with COVID-19 well; it shows preliminary evidence of stability and can be used effectively to predict COVID-19 prognosis in ICU patients.
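The AUC used to compare the models above can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. A minimal sketch of that pairwise definition follows; the function name `auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the event (e.g. ICU death), 0 otherwise.
    scores: predicted risks, higher meaning more likely to have the event.
    Tied scores count as half a concordant pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Toy example (hypothetical values): 3 of the 4 positive-negative pairs
# are ranked correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))
```

The double loop is quadratic in the number of patients, which is fine for a cohort of 123; a rank-based formula would be the usual choice for large data sets.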


2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Jessica K. Sexton ◽  
Michael Coory ◽  
Sailesh Kumar ◽  
Gordon Smith ◽  
Adrienne Gordon ◽  
...  

Abstract Background Despite advances in the care of women and their babies in the past century, an estimated 1.7 million babies are born still each year throughout the world. A robust method to estimate a pregnant woman’s individualized risk of late-pregnancy stillbirth is needed to inform decision-making around the timing of birth to reduce the risk of stillbirth from 35 weeks of gestation in Australia, a high-resource setting. Methods This is a protocol for a cross-sectional study of all late-pregnancy births in Australia (2005–2015) from 35 weeks of gestation including 5188 stillbirths among 3.1 million births at an estimated rate of 1.7 stillbirths per 1000 births. A multivariable logistic regression model will be developed in line with current Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines to estimate the gestation-specific probability of stillbirth with prediction intervals. Candidate predictors were identified from systematic reviews and clinical consultation and will be described through univariable regression analysis. To generate a final model, elimination by backward stepwise multivariable logistic regression will be performed. The model will be internally validated using bootstrapping with 1000 repetitions and externally validated using a temporally unique dataset. Overall model performance will be assessed with R², calibration, and discrimination. Calibration will be reported using a calibration plot with 95% confidence intervals (α = 0.05). Discrimination will be measured by the C-statistic and area underneath the receiver-operator curves. Clinical usefulness will be reported as positive and negative predictive values, and a decision curve analysis will be considered. Discussion A robust method to predict a pregnant woman’s individualized risk of late-pregnancy stillbirth is needed to inform timely, appropriate care to reduce stillbirth. 
Among existing prediction models designed for obstetric use, few have been subject to internal and external validation and many fail to meet recommended reporting standards. In developing a risk prediction model for late-gestation stillbirth with both providers and pregnant women in mind, we endeavor to develop a validated model for clinical use in Australia that meets current reporting standards.
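The resampling step behind "bootstrapping with 1000 repetitions" can be sketched as below. This is a deliberate simplification: it computes a percentile interval for a simple statistic on toy data, whereas bootstrap internal validation of a prediction model would typically refit the model in each resample. The function names, the seed, and the toy outcomes are assumptions for illustration only.

```python
import random


def bootstrap_interval(data, statistic, n_rep=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(data).

    Draws n_rep resamples of len(data) with replacement and returns
    (lower, upper, sorted_replicates).
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_rep)
    )
    lower = replicates[int((alpha / 2) * n_rep)]
    upper = replicates[min(n_rep - 1, int((1 - alpha / 2) * n_rep))]
    return lower, upper, replicates


def proportion(values):
    """Share of 1s in a list of 0/1 outcomes."""
    return sum(values) / len(values)


# Toy 0/1 outcomes (hypothetical, not study data): 30 events in 200 records.
outcomes = [1] * 30 + [0] * 170
lo, hi, reps = bootstrap_interval(outcomes, proportion)
print(f"point estimate={proportion(outcomes):.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

Swapping `proportion` for a function that fits and scores a model would move this sketch closer to the validation the protocol describes.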


2019 ◽  
Vol 12 (1) ◽  
Author(s):  
Daichi Shigemizu ◽  
Shintaro Akiyama ◽  
Yuya Asanomi ◽  
Keith A. Boroevich ◽  
Alok Sharma ◽  
...  

Abstract Background Dementia with Lewy bodies (DLB) is the second most common subtype of neurodegenerative dementia in humans following Alzheimer’s disease (AD). Present clinical diagnosis of DLB has high specificity and low sensitivity and finding potential biomarkers of prodromal DLB is still challenging. MicroRNAs (miRNAs) have recently received a lot of attention as a source of novel biomarkers. Methods In this study, using serum miRNA expression of 478 Japanese individuals, we investigated potential miRNA biomarkers and constructed an optimal risk prediction model based on several machine learning methods: penalized regression, random forest, support vector machine, and gradient boosting decision tree. Results The final risk prediction model, constructed via a gradient boosting decision tree using 180 miRNAs and two clinical features, achieved an accuracy of 0.829 on an independent test set. We further predicted candidate target genes from the miRNAs. Gene set enrichment analysis of the miRNA target genes revealed 6 functional genes included in the DHA signaling pathway associated with DLB pathology. Two of them were further supported by gene-based association studies using a large number of single nucleotide polymorphism markers (BCL2L1: P = 0.012, PIK3R2: P = 0.021). Conclusions Our proposed prediction model provides an effective tool for DLB classification. Also, a gene-based association test of rare variants revealed that BCL2L1 and PIK3R2 were statistically significantly associated with DLB.


BMJ Open ◽  
2019 ◽  
Vol 9 (6) ◽  
pp. e026759 ◽  
Author(s):  
John T Y Soong ◽  
Jurgita Kaubryte ◽  
Danny Liew ◽  
Carol Jane Peden ◽  
Alex Bottle ◽  
...  

Objectives This study aimed to examine the prevalence of frailty coding within the Dr Foster Global Comparators (GC) international database. We then aimed to develop and validate a risk prediction model, based on frailty syndromes, for key outcomes using the GC data set. Design A retrospective cohort analysis of data from patients over 75 years of age from the GC international administrative data. A risk prediction model was developed from the initial analysis based on seven frailty syndrome groups and their relationship to outcome metrics. A weighting was then created for each syndrome group and summated to create the Dr Foster Global Frailty Score. Performance of the score for predictive capacity was compared with an established prognostic comorbidity model (Elixhauser) and tested on another administrative database, Hospital Episode Statistics (2011-2015), for external validation. Setting 34 hospitals from nine countries across Europe, Australia, the UK and USA. Results Of 6.7 million patient records in the GC database, 1.4 million (20%) were from patients aged 75 years or more. There was marked variation in coding of frailty syndromes between countries and hospitals. Frailty syndromes were coded in 2% to 24% of patient spells. Falls and fractures was the most common syndrome coded (24%). The Dr Foster Global Frailty Score was significantly associated with in-hospital mortality, 30-day non-elective readmission and long length of hospital stay. The score had significant predictive capacity beyond that of other known predictors of poor outcome in older persons, such as comorbidity and chronological age. The score’s predictive capacity was higher in the elective group compared with non-elective, and may reflect improved performance in lower acuity states. Conclusions Frailty syndromes can be coded in international secondary care administrative data sets. The Dr Foster Global Frailty Score significantly predicts key outcomes. 
This methodology may be feasibly utilised for case-mix adjustment for older persons internationally.
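The scoring step described above (one weight per syndrome group, summed over the groups coded for a patient) can be sketched as follows. The abstract reports neither the weights nor six of the seven group names, so every weight and the placeholder names `syndrome_group_2` to `syndrome_group_7` are hypothetical; only "falls and fractures" comes from the text.

```python
# Hypothetical weights: the abstract does not report the actual values.
HYPOTHETICAL_WEIGHTS = {
    "falls_and_fractures": 2,  # group named in the abstract; weight invented
    "syndrome_group_2": 1,     # placeholder names for the remaining groups
    "syndrome_group_3": 3,
    "syndrome_group_4": 1,
    "syndrome_group_5": 2,
    "syndrome_group_6": 1,
    "syndrome_group_7": 2,
}


def frailty_score(coded_groups, weights=HYPOTHETICAL_WEIGHTS):
    """Sum the weights of the distinct syndrome groups coded for a patient."""
    groups = set(coded_groups)  # a group coded twice is counted once
    unknown = groups - set(weights)
    if unknown:
        raise KeyError(f"unknown syndrome groups: {sorted(unknown)}")
    return sum(weights[g] for g in groups)


print(frailty_score(["falls_and_fractures", "syndrome_group_3"]))
```

Counting each group once per patient is itself an assumption; the abstract does not say how repeated codes within a spell were handled.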

