Machine Learning-Based Three-Month Outcome Prediction in Acute Ischemic Stroke: A Single Cerebrovascular-Specialty Hospital Study in South Korea

Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1909
Author(s):  
Dougho Park ◽  
Eunhwan Jeong ◽  
Haejong Kim ◽  
Hae Wook Pyun ◽  
Haemin Kim ◽  
...  

Background: Functional outcomes after acute ischemic stroke are of great concern to patients and their families, as well as to the physicians and surgeons who make clinical decisions. We developed machine learning (ML)-based functional outcome prediction models for acute ischemic stroke. Methods: This retrospective study used a prospective cohort database. A total of 1066 patients with acute ischemic stroke between January 2019 and March 2021 were included. Variables available at the time of admission, such as demographic factors, stroke-related factors, laboratory findings, and comorbidities, were used. Five ML algorithms were applied to predict a favorable functional outcome (modified Rankin Scale 0 or 1) at 3 months after stroke onset. Results: Regularized logistic regression showed the best performance, with an area under the receiver operating characteristic curve (AUC) of 0.86. Support vector machines achieved the second-highest AUC of 0.85 and the highest F1-score of 0.86, and all ML models applied achieved an AUC > 0.8. The National Institutes of Health Stroke Scale score at admission and age were consistently the top two important variables for generalized logistic regression, random forest, and extreme gradient boosting models. Conclusions: ML-based functional outcome prediction models for acute ischemic stroke were validated and proven to be readily applicable and useful.
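The abstracts in this listing compare models mainly by AUC and F1-score. As a reading aid, the sketch below shows how both metrics can be computed from first principles. It is a minimal illustration, not the authors' evaluation code: the function names `roc_auc` and `f1_score`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for the example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = favorable outcome, 0 = unfavorable outcome.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
predictions = [1 if s >= 0.5 else 0 for s in scores]

auc_value = roc_auc(labels, scores)
f1_value = f1_score(labels, predictions)
```

AUC depends only on how the scores rank the cases, whereas F1 depends on the chosen threshold, which is one reason a model can rank first on one metric and second on the other.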

Diagnostics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 80
Author(s):  
I-Min Chiu ◽  
Wun-Huei Zeng ◽  
Chi-Yung Cheng ◽  
Shih-Hsuan Chen ◽  
Chun-Hung Richard Lin

Prediction of functional outcome in ischemic stroke patients is useful for clinical decisions. Previous studies have mostly focused on the prediction of favorable outcomes. Miserable outcomes, which are usually defined as modified Rankin Scale (mRS) 5–6, should be considered as well before further invasive intervention. By using a machine learning algorithm, we aimed to develop a multiclass classification model for outcome prediction in acute ischemic stroke patients requiring reperfusion therapy. This was a retrospective study performed at a stroke medical center in Taiwan. Patients with acute ischemic stroke who visited between January 2016 and December 2019 and who were candidates for reperfusion therapy were included. Clinical outcomes were classified as favorable outcome, intermediate outcome, and miserable outcome. We developed four different multiclass machine learning models (Logistic Regression, Support Vector Machine, Random Forest, and Extreme Gradient Boosting) to predict clinical outcomes and compared their performance to the DRAGON score. A sample of 590 patients was included in this study. Of them, 180 (30.5%) had favorable outcomes and 152 (25.8%) had miserable outcomes. All selected machine learning models outperformed the DRAGON score on accuracy of outcome prediction (Logistic Regression: 0.70, Support Vector Machine: 0.67, Random Forest: 0.69, and Extreme Gradient Boosting: 0.67, vs. DRAGON: 0.51, p < 0.001). Among all selected models, Logistic Regression also had a better performance than the DRAGON score on positive predictive value, sensitivity, and specificity. Compared with the DRAGON score, the multiclass machine learning approach showed better performance on the prediction of the 3-month functional outcome of acute ischemic stroke patients requiring reperfusion therapy.


Stroke ◽  
2021 ◽  
Vol 52 (Suppl_1) ◽  
Author(s):  
Negar Darabi ◽  
Niyousha Hosseinichimeh ◽  
Anthony Noto ◽  
Ramin Zand ◽  
Vida Abedi

Background: At a personalized level, identification of patients at higher risk of 30-day readmission and in need of special clinical attention could lower their chances of readmission, while at a system level, reducing hospital readmission improves the overall quality of care delivery and reduces the associated cost burden. Objective: To enhance understanding of the predictors of 30-day readmission after ischemic stroke and identify high-risk individuals. We aimed to compare the performance and the predictive power of machine learning-based methods and identify the best model. Method: The electronic health records (EHR) of acute ischemic stroke patients were extracted from two tertiary centers within the Geisinger Health System between January 1, 2015, and October 7, 2018. A total of 61 variables, including clinical variables, demographic characteristics, discharge status, and type of health insurance, were used in this study. Patients were randomly split for model development (80%) and testing (20%). Random forest, gradient boosting machine, extreme gradient boosting (XGBoost), support vector machine, and logistic regression models were developed to predict 30-day readmission after stroke. The models were evaluated based on the area under the curve (AUC), sensitivity, specificity, and positive predictive value (PPV). Results: A total of 3,184 patients with ischemic stroke (mean age: 71±13.90 years, men: 51.06%) were included in this study. Of these, 301 (9.40%) were readmitted within 30 days. The best performance was obtained when XGBoost was used with ROSE-sampling. The AUC for the test set was 0.74 (95% CI: 0.64-0.78) with a PPV of 0.43. The top four predictors in the 30-day readmission model were a National Institutes of Health Stroke Scale score above 24, insertion of an indwelling urinary catheter, hypercoagulable state, and percutaneous gastrostomy. Conclusions: Machine learning models can be designed to predict 30-day readmission after stroke using structured data from EHR. Among the five algorithms analyzed, XGBoost had the best performance.


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Cheng Qu ◽  
Lin Gao ◽  
Xian-qiang Yu ◽  
Mei Wei ◽  
Guo-quan Fang ◽  
...  

Background. Acute kidney injury (AKI) has long been recognized as a common and important complication of acute pancreatitis (AP). In this study, machine learning (ML) techniques were used to establish predictive models for AKI in AP patients during hospitalization. This is a retrospective review of prospectively collected data of AP patients admitted within one week after the onset of abdominal pain to our department from January 2014 to January 2019. Eighty patients developed AKI after admission (AKI group) and 254 patients did not (non-AKI group). Using additional information such as demographic characteristics and laboratory data, support vector machine (SVM), random forest (RF), classification and regression tree (CART), and extreme gradient boosting (XGBoost) were used to build AKI prediction models, and their predictive performance was compared with that of the classic model using logistic regression (LR). XGBoost performed best in predicting AKI, with an AUC of 91.93% among the machine learning models. The AUC of the logistic regression analysis was 87.28%. The present findings suggest that, compared to the classical logistic regression model, machine learning models using features that can be easily obtained at admission had a better performance in predicting AKI in AP patients.


Processes ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 1563
Author(s):  
Chi-Jie Lu ◽  
Tian-Shyug Lee ◽  
Chien-Chih Wang ◽  
Wei-Jen Chen

Developing an effective sports performance analysis process is an attractive issue in sports team management. This study proposed an improved sports outcome prediction process by integrating adaptive weighted features and machine learning algorithms for basketball game score prediction. The feature engineering method is used to construct designed features based on game-lag information and adaptive weighting of variables in the proposed prediction process. These designed features are then applied to the five machine learning methods, including classification and regression trees (CART), random forest (RF), stochastic gradient boosting (SGB), eXtreme gradient boosting (XGBoost), and extreme learning machine (ELM) for constructing effective prediction models. The empirical results from National Basketball Association (NBA) data revealed that the proposed sports outcome prediction process could generate a promising prediction result compared to the competing models without adaptive weighting features. Our results also showed that the machine learning models with four game-lags information and adaptive weighting of power could generate better prediction performance.


2020 ◽  
Author(s):  
Haoyue Guo ◽  
Li Diao ◽  
Hui Qi ◽  
Chunlei Dai ◽  
Yu Chen ◽  
...  

Background: Targeted therapy and immune checkpoint inhibitors are the most promising treatments for lung cancers but still face multiple challenges, including resistance and individual differences. Therefore, patient-derived tumor xenograft (PDX) models are developed for drug discovery and screening. NOG mice lack a functional interleukin-2 (IL-2) receptor common gamma chain, which makes them appropriate for building PDX models to test immunotherapies. However, current studies have little understanding of the causes of genotype mismatches in PDX or NOG/PDX models, which leads to massive economic and time losses. Methods: Lung cancer tissues from 53 patients were obtained and engrafted into NOG mice. All of the patients' tumors and NOG/PDX models were tested for common gene mutations. Seventeen clinicopathological features were organized and input to stepwise logistic regression based on the lowest Akaike information criterion (AIC), least absolute shrinkage and selection operator (LASSO)-logistic regression, support vector machine recursive feature elimination (SVM-RFE), eXtreme Gradient Boosting (XGBoost), Gradient Boosting & Categorical Features (CatBoost), and synthetic minority over-sampling technique (SMOTE). Finally, the performance of all models was evaluated by the accuracy, area under the receiver operating characteristic curve (AUC), and F1 score in 100 testing groups. Results: Fifty-three lung cancer NOG/PDX models were successfully established, with a genotype matching rate of 79.2% (42/53). Two multivariable logistic regressions revealed that age, the number of driver mutations, epidermal growth factor receptor (EGFR) gene mutations, the type of prior chemotherapy, prior tyrosine kinase inhibitor (TKI) therapy, and the source were potent predictors. Moreover, CatBoost (mean accuracy=0.960; mean AUC=0.939; mean F1 score=0.908) and 8-feature SVM (mean accuracy=0.950; mean AUC=0.934; mean F1 score=0.903) showed the best performance compared with the other algorithms. In addition, the combination of SMOTE with SVM significantly improved the predictive capability (mean accuracy: 0.961 vs. 0.958, P=0.025; mean AUC: 0.940 vs. 0.935, P=0.045; mean F1 score: 0.909 vs. 0.903, P=0.047). Conclusions: We established an optimal predictive model to screen lung cancer patients for NOG/PDX models, and also offered a general approach for building prediction models in small unbalanced biomedical samples.
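The abstract above names SMOTE as its remedy for a small, unbalanced sample (42 matched versus 11 mismatched models). The sketch below illustrates the core idea of SMOTE under stated assumptions: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. The function name `smote`, the parameter defaults, and the two-feature toy data are hypothetical; this is not the authors' implementation and omits details that production libraries handle.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples with two numeric features each.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies between two real minority samples, oversampling of this kind is normally applied to the training data only, so that test-set metrics still reflect real cases.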


2021 ◽  
Vol 11 (9) ◽  
pp. 863
Author(s):  
Jeong-Myeong Choi ◽  
Soo-Young Seo ◽  
Pum-Jun Kim ◽  
Yu-Seop Kim ◽  
Sang-Hwa Lee ◽  
...  

Hemorrhagic transformation (HT) is one of the leading causes of poor prognosis after acute ischemic stroke (AIS). We compared the performance of several machine learning (ML) algorithms for predicting HT after AIS using only structured data. A total of 2028 patients with AIS who were admitted within seven days of symptom onset were included in this analysis. HT was defined based on the criteria of the European Co-operative Acute Stroke Study-II trial. The whole dataset was randomly divided into a training and a test dataset with a 7:3 ratio. Binary logistic regression, support vector machine, extreme gradient boosting, and artificial neural network (ANN) algorithms were used to predict the occurrence of HT after AIS. Five-fold cross validation and a grid search technique were used to optimize the hyperparameters of each ML model, whose performance was measured by the area under the receiver operating characteristic curve (AUROC). Among the included AIS patients, the mean age and number of male subjects were 69.6 years and 1183 (58.3%), respectively. HT was observed in 318 subjects (15.7%). There were no significant differences in corresponding variables between the training and test datasets. Among all the ML algorithms, the ANN algorithm showed the best performance in predicting the occurrence of HT in our dataset (AUROC 0.844). Feature scaling, including standardization and normalization, and the resampling strategy yielded no additional improvement in the ANN's performance. The ANN-based prediction of HT after AIS showed better performance than the conventional ML algorithms. Deep learning may be used to predict important outcomes for structured data-based prediction.
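The evaluation design described above (a random 7:3 split followed by five-fold cross validation on the training portion) can be sketched with index bookkeeping alone. The example below is an illustration under assumptions, not the study's code: the function names, the fixed seed, and the round-robin fold assignment are choices made for the sketch, and only the cohort size of 2028 comes from the abstract.

```python
import random


def train_test_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and hold out a fraction as the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (training, validation) index lists so that every index
    is used for validation exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# 2028 patients, as reported in the abstract; the indices stand in for patient records.
train_idx, test_idx = train_test_split(2028)
splits = list(k_fold(train_idx, k=5))
```

A grid search would repeat the five-fold loop once per candidate hyperparameter setting, keep the setting with the best mean validation AUROC, and only then score the chosen model on the held-out test indices.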


2021 ◽  
Vol 10 (6) ◽  
pp. 1286
Author(s):  
Vida Abedi ◽  
Venkatesh Avula ◽  
Durgesh Chaudhary ◽  
Shima Shahjouei ◽  
Ayesha Khan ◽  
...  

Background: The long-term risk of recurrent ischemic stroke, estimated to be between 17% and 30%, cannot be reliably assessed at an individual level. Our goal was to study whether machine learning models can be trained to predict stroke recurrence, to identify key clinical variables, and to assess whether performance metrics can be optimized. Methods: We used patient-level data from electronic health records, six interpretable algorithms (Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, Decision Tree), four feature selection strategies, five prediction windows, and two sampling strategies to develop 288 models for up to 5-year stroke recurrence prediction. We further identified important clinical features and different optimization strategies. Results: We included 2091 ischemic stroke patients. Model area under the receiver operating characteristic (AUROC) curve was stable for prediction windows of 1, 2, 3, 4, and 5 years, with the highest score for the 1-year (0.79) and the lowest score for the 5-year prediction window (0.69). A total of 21 (7%) models reached an AUROC above 0.73 while 110 (38%) models reached an AUROC greater than 0.7. Among the 53 features analyzed, age, body mass index, and laboratory-based features (such as high-density lipoprotein, hemoglobin A1c, and creatinine) had the highest overall importance scores. The balance between specificity and sensitivity improved through sampling strategies. Conclusion: All six selected algorithms could be trained to predict long-term stroke recurrence, and laboratory-based variables were highly associated with stroke recurrence. The latter could be targeted for personalized interventions. Model performance metrics could be optimized, and models can be implemented in the same healthcare system as intelligent decision support for targeted intervention.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sung-Hwi Hur ◽  
Eun-Young Lee ◽  
Min-Kyung Kim ◽  
Somi Kim ◽  
Ji-Yeon Kang ◽  
...  

Impacted mandibular third molars (M3M) are associated with the occurrence of distal caries on the adjacent mandibular second molars (DCM2M). In this study, we aimed to develop and validate five machine learning (ML) models designed to predict the occurrence of DCM2Ms due to proximity to M3Ms and to determine the relative importance of predictive variables for DCM2Ms that are important for clinical decision making. A total of 2642 mandibular second molars adjacent to M3Ms were analyzed and DCM2Ms were identified in 322 cases (12.2%). The models were trained using logistic regression, random forest, support vector machine, artificial neural network, and extreme gradient boosting ML methods and were subsequently validated using testing datasets. The performance of the ML models was significantly superior to that of single predictors. The area under the receiver operating characteristic curve of the machine learning models ranged from 0.88 to 0.89. Six features (sex, age, contact point at the cementoenamel junction, angulation of M3Ms, Winter's classification, and Pell and Gregory classification) were identified as relevant predictors. These prediction models could be used to detect patients at a high risk of developing DCM2M and ultimately contribute to caries prevention and treatment decision-making for impacted M3Ms.


Author(s):  
Mouhammad A Jumaa ◽  
Zeinab Zoghi ◽  
Syed Zaidi ◽  
Nils Mueller‐Kronast ◽  
Osama Zaidat ◽  
...  

Introduction: Machine learning algorithms have emerged as powerful predictive tools in the field of acute ischemic stroke. Here, we examine the predictive performance of a machine learning algorithm compared to logistic regression for predicting functional outcomes in the prospective Systematic Evaluation of Patients Treated With Neurothrombectomy Devices for Acute Ischemic Stroke (STRATIS) Registry. Methods: The STRATIS Registry was a prospective, observational study of the use of the Solitaire device in acute ischemic stroke patients. Patients with posterior circulation stroke or missing 90‐day mRS were excluded from the analysis. A statistical algorithm (logistic regression) and a machine learning algorithm (decision tree) were implemented on the preprocessed dataset using a 10‐fold cross‐validation method in which 80% of the data were fed into the models for training and the remaining 20% were used in the test phase to evaluate the performance of the models for prediction of the 90‐day mRS score as a dichotomous output. Results: Of the 938 STRATIS patients, 702 with 90‐day mRS were included. The machine learning model outperformed the logistic regression model, with an area under the curve (AUC) of 0.92±0.026 compared to 0.88±0.028 for logistic regression. Conclusions: Our machine learning model delivered improved performance in comparison with the statistical model in predicting 90‐day functional outcome. More studies are needed to understand and externally validate the predictive capacity of our machine learning model.

