Predicting 1-Hour Thrombolysis Effect of r-tPA in Patients With Acute Ischemic Stroke Using Machine Learning Algorithm

2022 ◽  
Vol 12 ◽  
Author(s):  
Bin Zhu ◽  
Jianlei Zhao ◽  
Mingnan Cao ◽  
Wanliang Du ◽  
Liuqing Yang ◽  
...  

Background: Thrombolysis with r-tPA is recommended for patients after acute ischemic stroke (AIS) within 4.5 h of symptom onset. However, only a few patients benefit from this therapeutic regimen. Thus, we aimed to develop an interpretable machine learning (ML)-based model to predict the thrombolysis effect of r-tPA at the super-early stage. Methods: A total of 353 patients with AIS were divided into training and test data sets. We then used six ML algorithms and a recursive feature elimination (RFE) method to explore the relationship among the clinical variables along with the NIH stroke scale score 1 h after thrombolysis treatment. Shapley additive explanations and local interpretable model-agnostic explanation algorithms were applied to interpret the ML models and determine the importance of the selected features. Results: Altogether, 353 patients with an average age of 63.0 (56.0–71.0) years were enrolled in the study. Of these patients, 156 showed a favorable thrombolysis effect and 197 showed an unfavorable effect. A total of 14 variables were included in the modeling, and 6 ML algorithms were used to predict the thrombolysis effect. After RFE screening, the gradient boosting decision tree (GBDT) model with seven variables (area under the curve = 0.81, specificity = 0.61, sensitivity = 0.9, and F1 score = 0.79) demonstrated the best performance. Of the seven variables, activated partial thromboplastin clotting time, B-type natriuretic peptide, and fibrin degradation products were the three most important clinical characteristics that might influence r-tPA efficiency. Conclusion: This study demonstrated that the GBDT model with the seven variables could better predict the early thrombolysis effect of r-tPA.
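The sensitivity, specificity, and F1 score quoted above are all functions of the four cells of a binary confusion matrix. The following is a minimal Python sketch of those definitions, not the authors' code: the helper name `classification_metrics` and the counts passed to it are hypothetical, since the abstract does not report the confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one actual positive,
    one actual negative, and one predicted positive).
    """
    sensitivity = tp / (tp + fn)       # true positive rate (recall)
    specificity = tn / (tn + fp)       # true negative rate
    precision = tp / (tp + fp)         # positive predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)   # harmonic mean of precision and recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts chosen only for illustration; not taken from the study.
metrics = classification_metrics(tp=9, fp=4, tn=6, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A model can pair high sensitivity with modest specificity, as in the reported GBDT result, when it favors catching true positives at the cost of more false positives.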

Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1909
Author(s):  
Dougho Park ◽  
Eunhwan Jeong ◽  
Haejong Kim ◽  
Hae Wook Pyun ◽  
Haemin Kim ◽  
...  

Background: Functional outcomes after acute ischemic stroke are of great concern to patients and their families, as well as physicians and surgeons who make the clinical decisions. We developed machine learning (ML)-based functional outcome prediction models in acute ischemic stroke. Methods: This retrospective study used a prospective cohort database. A total of 1066 patients with acute ischemic stroke between January 2019 and March 2021 were included. Variables such as demographic factors, stroke-related factors, laboratory findings, and comorbidities were utilized at the time of admission. Five ML algorithms were applied to predict a favorable functional outcome (modified Rankin Scale 0 or 1) at 3 months after stroke onset. Results: Regularized logistic regression showed the best performance with an area under the receiver operating characteristic curve (AUC) of 0.86. Support vector machines represented the second-highest AUC of 0.85 with the highest F1-score of 0.86, and finally, all ML models applied achieved an AUC > 0.8. The National Institute of Health Stroke Scale at admission and age were consistently the top two important variables for generalized logistic regression, random forest, and extreme gradient boosting models. Conclusions: ML-based functional outcome prediction models for acute ischemic stroke were validated and proven to be readily applicable and useful.
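The AUC values compared above can be read as the probability that a randomly chosen patient with a favorable outcome receives a higher model score than a randomly chosen patient without one. A minimal Python sketch of that pairwise definition follows; the function name `roc_auc` and the example scores are assumptions for illustration, not data or code from the study.

```python
def roc_auc(scores, labels):
    """Compute AUC as the fraction of positive/negative pairs ranked correctly.

    scores: model outputs, higher meaning "more likely positive".
    labels: 1 for positive cases, 0 for negative cases.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores and outcomes, for illustration only.
example_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(example_auc)
```

The quadratic pairwise loop is chosen for readability; a rank-based computation gives the same value more efficiently on large cohorts.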


Stroke ◽  
2020 ◽  
Vol 51 (Suppl_1) ◽  
Author(s):  
Masaki Ito ◽  
Satoshi Kuroda ◽  
Hidetsugu Asanoi ◽  
Taku Sugiyama ◽  
Takafumi Shindo ◽  
...  

Background: Outcomes of stroke with cancer-related coagulopathy (Trousseau syndrome) are predominantly attributed to cancer management; however, stroke management with anticoagulants can contribute to the best supportive care. We aimed to find predictors of the outcome by multivariate analysis, including machine-learning (ML) based feature engineering. Methods: A single-center retrospective study using a prospective cohort was conducted between April 2011 and June 2019. Out of the cumulative total of 110 acute ischemic stroke patients with malignancy, 65 were treated with anticoagulants, including warfarin (n=19), non-vitamin K dependent oral anticoagulants (NOAC, n=40), or subcutaneous heparin injections (n=6). Cancer-related coagulopathy was defined by elevated blood D-dimer levels at the onset of stroke with malignancy. The incidence of stroke recurrence was analyzed using 40 variables by logistic regression (LR) and in-house ML programs. Results: Out of 65 instances of cancer-related stroke, 12 (18.5%) stroke recurrences were observed during 455 ± 70 days (mean ± SEM). The stroke subtypes were cardioembolism (n=2), stroke with undetermined etiology (n=23), or other determined etiology (cancer-related coagulopathy, n=40). Multivariate LR revealed significant predictors of stroke recurrence, including NOAC usage and stroke subtype, whereas a combination of forward stepwise selection and Naïve Bayes (NB) or support vector machine found the blood D-dimer level to be an additional important predictor. Using the D-dimer level as an input in addition to NOAC usage and stroke subtype yielded the best area under the curve (AUC) for either LR or NB, compared with using warfarin or heparin usage as an input. The AUC of LR for these 3 variables was better than that of NB. Conclusion: This study suggests that the incidence of stroke recurrence is high in this clinical situation. NOAC usage, stroke subtype, and blood D-dimer level at the onset of stroke have predictive value for the outcome.


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e16801-e16801
Author(s):  
Daniel R Cherry ◽  
Qinyu Chen ◽  
James Don Murphy

Background: Pancreatic cancer has an insidious presentation with four-in-five patients presenting with disease not amenable to potentially curative surgery. Efforts to screen patients for pancreatic cancer using population-wide strategies have proven ineffective. We applied a machine learning approach to create an early prediction model drawing on the content of patients’ electronic health records (EHRs). Methods: We used patient data from OptumLabs which included de-identified data extracted from patient EHRs collected between 2009 and 2017. We identified patients diagnosed with pancreatic cancer at age 40 or later, whom we categorized into early-stage pancreatic cancer (ESPC; n = 3,322) and late-stage pancreatic cancer (LSPC; n = 25,908) groups. ESPC cases were matched to non-pancreatic cancer controls in a ratio of 1:16 based on diagnosis year and geographic division, and the cohort was divided into training (70%) and test (30%) sets. The prediction model was built using an eXtreme Gradient Boosting machine learning algorithm of ESPC patients’ EHRs in the year preceding diagnosis, with features including patient demographics, procedure and clinical diagnosis codes, clinical notes and medications. Model discrimination was assessed with sensitivity, specificity, positive predictive value (PPV) and area under the curve (AUC) with a score of 1.0 indicating perfect prediction. Results: The final AUC in the test set was 0.841, and the model included 583 features, of which 248 (42.5%) were physician note elements, 146 (25.0%) were procedure codes, 91 (15.6%) were diagnosis codes, 89 (15.3%) were medications and 9 (1.54%) were demographic features. The most important features were history of pancreatic disorders (not diabetes or cancer), age, income, biliary tract disease, education level, obstructive jaundice and abdominal pain. We evaluated model performance at varying classification thresholds. When applied to patients over 40, choosing a threshold with a sensitivity of 20% produced a specificity of 99.9% and a PPV of 2.5%. The model PPV increased with age; for patients over 80, PPV was 8.0%. LSPC patients identified by the model would have been detected a median of 4 months before their actual diagnosis, with a quarter of these patients identified at least 14 months earlier. Conclusions: Using EHR data to identify early-stage pancreatic cancer patients shows promise. While widespread use of this approach on an unselected population would produce high rates of false positives, this technique could be employed among high risk patients, or paired with other screening tools.
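The low PPV at 99.9% specificity follows from Bayes' rule: when a disease is rare, even a small false-positive rate produces more false alarms than true detections. The Python sketch below shows the relationship under the simplifying assumption of a single uniform prevalence in the screened population. The function names are hypothetical, and the back-calculated prevalence is an inference from the reported sensitivity, specificity, and PPV, not a figure stated in the abstract.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(sensitivity, specificity, target_ppv):
    """Prevalence that would yield target_ppv, obtained by inverting ppv()."""
    false_pos_rate = 1.0 - specificity
    return target_ppv * false_pos_rate / (
        sensitivity * (1.0 - target_ppv) + target_ppv * false_pos_rate
    )


# Reported operating point: sensitivity 20%, specificity 99.9%, PPV 2.5%.
# The prevalence below is back-calculated, not taken from the study.
prev = implied_prevalence(0.20, 0.999, 0.025)
print(f"implied prevalence: {prev:.6f}")
print(f"round-trip PPV: {ppv(0.20, 0.999, prev):.4f}")
```

The same formula is consistent with the reported rise in PPV with age: holding sensitivity and specificity fixed, PPV increases as prevalence increases.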


2021 ◽  
Author(s):  
Íris Viana dos Santos Santana ◽  
Andressa C. M. da Silveira ◽  
Álvaro Sobrinho ◽  
Lenardo Chaves e Silva ◽  
Leandro Dias da Silva ◽  
...  

Background: Controlling the COVID-19 outbreak in Brazil is considered a challenge of continental proportions due to the high population and urban density, weak implementation and maintenance of social distancing strategies, and limited testing capabilities. Objective: To contribute to addressing this challenge, we present the implementation and evaluation of supervised Machine Learning (ML) models to assist COVID-19 detection in Brazil based on early-stage symptoms. Methods: First, we conducted data preprocessing and applied the Chi-squared test to a Brazilian dataset, mainly composed of early-stage symptoms, to perform statistical analyses. Afterward, we implemented ML models using the Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), Decision Tree (DT), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost) algorithms. We evaluated the ML models using precision, accuracy score, recall, the area under the curve, and the Friedman and Nemenyi tests. Based on the comparison, we grouped the top five ML models and measured feature importance. Results: The MLP model presented the highest mean accuracy score, at more than 97.85%, compared with GBM (> 97.39%), RF (> 97.36%), DT (> 97.07%), XGBoost (> 97.06%), KNN (> 95.14%), and SVM (> 94.27%). Based on the statistical comparison, we grouped MLP, GBM, DT, RF, and XGBoost as the top five ML models because their evaluation results are statistically indistinguishable. The features that the ML models relied on during predictions varied in importance and included gender, profession, fever, sore throat, dyspnea, olfactory disorder, cough, runny nose, taste disorder, and headache. Conclusions: Supervised ML models effectively assist decision making in medical diagnosis and public administration (e.g., testing strategies) based on early-stage symptoms that do not require advanced and expensive exams.


Author(s):  
Isaac Kofi Nti ◽  
Owusu Nyarko-Boateng ◽  
Justice Aning

The numerical value of k in the k-fold cross-validation training technique for machine learning predictive models is an essential element that affects the model’s performance. A good choice of k results in better accuracy, while a poorly chosen value of k might degrade the model’s performance. In the literature, the most commonly used values of k are five (5) or ten (10), as these two values are believed to give test error rate estimates that suffer neither from extremely high bias nor from very high variance. However, there is no formal rule. To the best of our knowledge, few experimental studies have attempted to investigate the effect of diverse k values in training different machine learning models. This paper empirically analyses the prevalence and effect of distinct k values (3, 5, 7, 10, 15 and 20) on the validation performance of four well-known machine learning algorithms (Gradient Boosting Machine (GBM), Logistic Regression (LR), Decision Tree (DT) and K-Nearest Neighbours (KNN)). It was observed that the best value of k and the model validation performance differ from one machine learning algorithm to another for the same classification task. However, our empirical results suggest that k = 7 offers a slight increase in validation accuracy and area under the curve with lower computational complexity than k = 10 across most machine learning algorithms (MLAs). We discuss the study outcomes in detail and outline some guidelines for beginners in the machine learning field on selecting the best k value and machine learning algorithm for a given task.
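To make the role of k concrete, the Python sketch below partitions sample indices into k folds, so that each sample is held out for validation exactly once and used for training in the other k - 1 rounds. It is an illustration under stated assumptions rather than the authors' procedure: `kfold_indices` is a hypothetical helper, no shuffling or stratification is applied, and the sample count of 100 is arbitrary.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, validation_indices) pairs.
    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")

    base, extra = divmod(n_samples, k)
    splits = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first `extra` folds get one more
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        splits.append((train, validation))
        start = stop
    return splits


# The k values examined in the abstract, applied to a hypothetical 100 samples.
for k in (3, 5, 7, 10, 15, 20):
    splits = kfold_indices(100, k)
    sizes = [len(validation) for _, validation in splits]
    print(f"k={k}: {len(splits)} model fits, validation fold sizes {min(sizes)}-{max(sizes)}")
```

Larger k means more model fits, each trained on a larger share of the data and validated on a smaller fold, which is the cost and variance trade-off the abstract refers to.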


2018 ◽  
Vol 129 (4) ◽  
pp. 663-674 ◽  
Author(s):  
Feras Hatib ◽  
Zhongping Jian ◽  
Sai Buddi ◽  
Christine Lee ◽  
Jos Settels ◽  
...  

Background: With appropriate algorithms, computers can learn to detect patterns and associations in large data sets. The authors’ goal was to apply machine learning to arterial pressure waveforms and create an algorithm to predict hypotension. The algorithm detects early alteration in waveforms that can herald the weakening of cardiovascular compensatory mechanisms affecting preload, afterload, and contractility. Methods: The algorithm was developed with two different data sources: (1) a retrospective cohort, used for training, consisting of 1,334 patients’ records with 545,959 min of arterial waveform recording and 25,461 episodes of hypotension; and (2) a prospective, local hospital cohort used for external validation, consisting of 204 patients’ records with 33,236 min of arterial waveform recording and 1,923 episodes of hypotension. The algorithm relates a large set of features calculated from the high-fidelity arterial pressure waveform to the prediction of an upcoming hypotensive event (mean arterial pressure < 65 mmHg). Receiver-operating characteristic curve analysis evaluated the algorithm’s success in predicting hypotension, defined as mean arterial pressure less than 65 mmHg. Results: Using 3,022 individual features per cardiac cycle, the algorithm predicted arterial hypotension with a sensitivity and specificity of 88% (85 to 90%) and 87% (85 to 90%) 15 min before a hypotensive event (area under the curve, 0.95 [0.94 to 0.95]); 89% (87 to 91%) and 90% (87 to 92%) 10 min before (area under the curve, 0.95 [0.95 to 0.96]); 92% (90 to 94%) and 92% (90 to 94%) 5 min before (area under the curve, 0.97 [0.97 to 0.98]). Conclusions: The results demonstrate that a machine-learning algorithm can be trained, with large data sets of high-fidelity arterial waveforms, to predict hypotension in surgical patients’ records.


Author(s):  
Sanjay Kumar Singh ◽  
Anjali Goyal

Cervical cancer is the second most prevalent cancer in women worldwide, and the Pap smear is one of the most popular techniques used to diagnose cervical cancer at an early stage. Developing countries like India face the challenge of handling a growing number of cases day by day. In this article, various online and offline machine learning algorithms have been applied to benchmark data sets to detect cervical cancer. This article also addresses the problem of segmentation with hybrid techniques and optimizes the number of features using extra tree classifiers. Accuracy, precision score, recall score, and F1 score increase with the proportion of data used for training and reach up to 100% for some algorithms. An algorithm such as logistic regression with L1 regularization has an accuracy of 100%, but it is far more costly in terms of CPU time than some of the algorithms that obtain 99% accuracy with less CPU time. The key finding of this article is the selection of the best machine learning algorithm with the highest accuracy. Cost effectiveness in terms of CPU time is also analysed.


Diagnostics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 80
Author(s):  
I-Min Chiu ◽  
Wun-Huei Zeng ◽  
Chi-Yung Cheng ◽  
Shih-Hsuan Chen ◽  
Chun-Hung Richard Lin

Prediction of functional outcome in ischemic stroke patients is useful for clinical decisions. Previous studies mostly elaborate on the prediction of favorable outcomes. Miserable outcomes, which are usually defined as modified Rankin Scale (mRS) 5–6, should be considered as well before further invasive intervention. By using a machine learning algorithm, we aimed to develop a multiclass classification model for outcome prediction in acute ischemic stroke patients requiring reperfusion therapy. This was a retrospective study performed at a stroke medical center in Taiwan. Patients with acute ischemic stroke who visited between January 2016 and December 2019 and who were candidates for reperfusion therapy were included. Clinical outcomes were classified as favorable outcome, intermediate outcome, and miserable outcome. We developed four different multiclass machine learning models (Logistic Regression, Support Vector Machine, Random Forest, and Extreme Gradient Boosting) to predict clinical outcomes and compared their performance to the DRAGON score. A sample of 590 patients was included in this study. Of them, 180 (30.5%) had favorable outcomes and 152 (25.8%) had miserable outcomes. All selected machine learning models outperformed the DRAGON score on accuracy of outcome prediction (Logistic Regression: 0.70, Support Vector Machine: 0.67, Random Forest: 0.69, and Extreme Gradient Boosting: 0.67, vs. DRAGON: 0.51, p < 0.001). Among all selected models, Logistic Regression also had a better performance than the DRAGON score on positive predictive value, sensitivity, and specificity. Compared with the DRAGON score, the multiclass machine learning approach showed better performance on the prediction of the 3-month functional outcome of acute ischemic stroke patients requiring reperfusion therapy.

