60. Creation and Comparison of a Machine Learning Decision Tree and Traditional Risk Score to Predict Ceftriaxone Resistance in Cancer Patients with E. coli Bacteremia

2021 ◽  
Vol 8 (Supplement_1) ◽  
pp. S41-S41
Author(s):  
Courtney Moc ◽  
William Shropshire ◽  
Patrick McDaneld ◽  
Samuel A Shelburne ◽  
Samuel L Aitken ◽  
...  

Abstract Background Several clinical tools have been developed to predict the likelihood of extended-spectrum β-lactamase-producing Enterobacterales; however, the cohorts used to create these tools included few patients with cancer or other immunosuppression. The objectives of this retrospective cohort study were to develop a decision tree and a traditional risk score to predict ceftriaxone resistance in cancer patients with Escherichia coli (E. coli) bacteremia and to compare the predictive accuracy of the two tools. Methods Adults aged ≥ 18 years with E. coli bacteremia at The University of Texas MD Anderson Cancer Center from 1/2018 to 12/2019 were included. Isolates recovered within 1 week from the same patient were excluded. The decision tree was constructed using classification and regression tree analysis, with a minimum node size of 10. The risk score was created using a multivariable logistic regression model derived by stepwise variable selection with backward elimination at level 0.2. The statistical metrics of the decision tree and the risk score were compared. Results A total of 629 E. coli isolates were screened, of which 580 met criteria. Ceftriaxone-resistant (CRO-R) E. coli accounted for 36% of isolates. The machine learning-derived decision tree included 5 predictors, whereas the logistic regression-derived risk score included 7 predictors. A risk score cutoff of ≥ 5 points gave the best overall classification accuracy. The positive predictive value of the decision tree was higher than that of the risk score (88% vs 74%, respectively), but the area under the receiver operating characteristic curve and the model accuracy of the risk score were higher than those of the decision tree (0.85 vs 0.73 and 82% vs 74%, respectively). Figure 1. Clinical Decision Tree. Table 1. Regression Model and Assigned Points for Clinical Risk Score. Table 2. Statistical Metrics of Clinical Decision Tree and Clinical Risk Score. Conclusion The decision tree and risk score can be used to estimate the likelihood that a cancer patient with E. coli bacteremia has a CRO-R infection. In both clinical tools, the strongest predictor was a history of CRO-R E. coli colonization or infection in the last 6 months. The decision tree was more user-friendly, had fewer variables, and had a better positive predictive value than the risk score. However, the risk score had significantly better discrimination and model accuracy than the decision tree. Disclosures Samuel L. Aitken, PharmD, MPH, BCIDP, Melinta Therapeutics (Individual(s) Involved: Self): Consultant, Grant/Research Support
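To illustrate how a points-based risk score with a cutoff of ≥ 5 points can be applied and how a positive predictive value is computed, here is a minimal Python sketch. The abstract does not list the seven predictors or their assigned points, so every predictor name other than the prior CRO-R history, and every point value in `EXAMPLE_POINTS`, is a hypothetical placeholder rather than the authors' score.

```python
# Hypothetical predictors and point values, for illustration only.
# The abstract names prior CRO-R E. coli history as the strongest predictor
# but does not give the actual variables or weights of the risk score.
EXAMPLE_POINTS = {
    "prior_cro_r_e_coli_6mo": 4,
    "recent_antibiotic_exposure": 2,
    "recent_hospitalization": 1,
}

CUTOFF = 5  # the abstract reports >= 5 points as the best-performing cutoff


def risk_score(patient, points=EXAMPLE_POINTS):
    """Sum the points of every predictor that is present for this patient."""
    return sum(pts for name, pts in points.items() if patient.get(name, False))


def predict_resistant(patient, cutoff=CUTOFF):
    """Classify as likely ceftriaxone-resistant when the score reaches the cutoff."""
    return risk_score(patient) >= cutoff


def positive_predictive_value(predictions, truths):
    """PPV = true positives / all positive predictions."""
    tp = sum(1 for p, t in zip(predictions, truths) if p and t)
    fp = sum(1 for p, t in zip(predictions, truths) if p and not t)
    return tp / (tp + fp) if (tp + fp) else float("nan")
```

A score of this kind trades some precision for bedside usability: the clinician adds a few integers instead of evaluating a regression equation.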

Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This work develops an improved and robust machine learning model for predicting myocardial infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine learning-based computer-aided analysis system for early and accurate prediction of MI, using the Framingham Heart Study dataset for validation and evaluation. The proposed model is intended to help medical professionals predict myocardial infarction proficiently. Methods: The proposed model uses mean imputation to fill in missing values in the dataset, then applies principal component analysis (PCA) to extract the optimal features and enhance classifier performance. After PCA, the reduced features are partitioned into training and testing datasets: 70% of the data is used to train four popular classifiers (support vector machine, k-nearest neighbor, logistic regression, and decision tree), and the remaining 30% is used to evaluate the models with performance metrics including the confusion matrix, classifier accuracy, precision, sensitivity, F1-score, and AUC-ROC curve. Results: The classifier outputs were evaluated using these performance measures. Logistic regression provided higher accuracy than the k-NN, SVM, and decision tree classifiers, and PCA performed well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has a good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves of the classifiers (Figures 4 and 5) show that logistic regression exhibits a good AUC-ROC score, around 70%, compared with the k-NN and decision tree algorithms. Conclusion: From the result analysis, we infer that the proposed machine learning model can act as an optimal decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine learning-based prediction models. It can predict the presence of acute myocardial infarction in humans using heart disease risk factors, helping to decide when to start lifestyle modification and medical treatment to prevent heart disease.
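The preprocessing described in the Methods (mean imputation followed by a 70/30 train/test partition) can be sketched in plain Python as below. This is an illustrative sketch, not the authors' code: the function names, the use of `None` to mark missing values, and the fixed random seed are all assumptions, and the PCA step is omitted.

```python
import random


def mean_impute(rows):
    """Replace None in each column with the mean of that column's observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 for a column with no observed values (an assumption).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def train_test_split(rows, labels, train_fraction=0.7, seed=0):
    """Shuffle the sample indices and split them into training and test sets."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * len(idx)))
    train, test = idx[:cut], idx[cut:]
    return (
        [rows[i] for i in train],
        [labels[i] for i in train],
        [rows[i] for i in test],
        [labels[i] for i in test],
    )
```

Imputing before splitting, as sketched here, lets test-set values influence the column means; computing the means on the training portion only is a common way to avoid that leakage.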


2021 ◽  
Vol 10 (1) ◽  
pp. 99
Author(s):  
Sajad Yousefi

Introduction: Heart disease is often associated with conditions such as arteries clogged by sediment accumulation, which causes chest pain and heart attack. Many people die of heart disease annually. Most countries have a shortage of cardiovascular specialists, and thus a significant percentage of cases are misdiagnosed. Hence, predicting this disease is a serious issue. Using machine learning models applied to a multidimensional dataset, this article aims to find the most efficient and accurate machine learning models for disease prediction. Material and Methods: Several algorithms were used to predict heart disease, among which the Decision Tree, Random Forest, and KNN supervised machine learning algorithms are the most frequently mentioned. The algorithms were applied to a dataset of 294 samples taken from the UCI repository, which includes heart disease features. To enhance algorithm performance, these features were analyzed, and feature importance scores and cross-validation were considered. Results: The algorithms were compared with each other based on the ROC curve and on criteria such as accuracy, precision, sensitivity, and F1 score, evaluated for each model. The Decision Tree algorithm had an accuracy of 83% and an AUC ROC of 99%. The Logistic Regression algorithm, with an accuracy of 88% and an AUC ROC of 91%, performed better than the other algorithms. Therefore, these techniques can be useful for physicians to predict heart disease in patients and prescribe for them correctly. Conclusion: Machine learning techniques can be used in medicine to analyze data collections related to a disease and to predict it. The area under the ROC curve and the evaluation criteria of several machine learning classification algorithms for predicting heart disease were compared to determine the most appropriate classifier. As a result of the evaluation, better performance was observed in both the Decision Tree and Logistic Regression models.
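The evaluation criteria named in the Results (accuracy, precision, sensitivity, and F1 score) all derive from the four cells of a binary confusion matrix. The following Python sketch shows those definitions; the function names and the 0/1 label encoding are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels encoded as 0 or 1."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, sensitivity (recall), and F1 score."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }
```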


2019 ◽  
Author(s):  
Cheng-Sheng Yu ◽  
Yu-Jiun Lin ◽  
Chang-Hsien Lin ◽  
Sen-Te Wang ◽  
Shiyng-Yu Lin ◽  
...  

BACKGROUND Metabolic syndrome is a cluster of disorders that significantly influence the development and deterioration of numerous diseases. FibroScan is an ultrasound device that was recently shown to predict metabolic syndrome with moderate accuracy. However, previous research regarding prediction of metabolic syndrome in subjects examined with FibroScan has been mainly based on conventional statistical models. Alternatively, machine learning, whereby a computer algorithm learns from prior experience, has better predictive performance than conventional statistical modeling. OBJECTIVE We aimed to evaluate the accuracy of different decision tree machine learning algorithms to predict the state of metabolic syndrome in self-paid health examination subjects who were examined with FibroScan. METHODS Multivariate logistic regression was conducted for every known risk factor of metabolic syndrome. Principal components analysis was used to visualize the distribution of metabolic syndrome patients. We further applied various statistical machine learning techniques to visualize and investigate the pattern and relationship between metabolic syndrome and several risk variables. RESULTS Obesity, serum glutamic-oxaloacetic transaminase, serum glutamic pyruvic transaminase, controlled attenuation parameter score, and glycated hemoglobin emerged as significant risk factors in multivariate logistic regression. The area under the receiver operating characteristic curve values for classification and regression trees and for the random forest were 0.831 and 0.904, respectively. CONCLUSIONS Machine learning technology facilitates the identification of metabolic syndrome in self-paid health examination subjects with high accuracy.
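The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. A minimal Python sketch of that pairwise definition follows; the function name and the 0/1 label encoding are assumptions, and the quadratic loop is meant for clarity rather than speed.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Under this reading, a model that ranks every positive above every negative scores 1.0, and a model that assigns scores at random scores about 0.5.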


Author(s):  
M. Carr ◽  
V. Ravi ◽  
G. Sridharan Reddy ◽  
D. Veranna

This paper profiles mobile banking users using machine learning techniques, viz. Decision Tree, Logistic Regression, Multilayer Perceptron, and SVM, to test a research model with fourteen independent variables and one dependent variable (adoption). A survey was conducted and the results were analysed using these techniques. Using Decision Trees, the profile of the mobile banking adopter was identified. A comparison of the machine learning techniques found that Decision Trees outperformed Logistic Regression, Multilayer Perceptron, and SVM. Of all the techniques, Decision Tree is recommended for profiling studies because, apart from producing highly accurate results, it also yields ‘if–then’ classification rules. The classification rules provided here can be used to target potential customers to adopt mobile banking by offering them appropriate incentives.
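To show how a decision tree yields 'if-then' classification rules, the sketch below walks a small tree and emits one rule per leaf. The tree itself is invented for illustration: the feature names, thresholds, and labels in `EXAMPLE_TREE` are hypothetical and are not taken from the paper's survey or results.

```python
# Hypothetical tree: feature names, thresholds, and labels are invented.
EXAMPLE_TREE = {
    "feature": "uses_internet_banking",
    "threshold": 0.5,
    "left": {"label": "non-adopter"},
    "right": {
        "feature": "age",
        "threshold": 40,
        "left": {"label": "adopter"},
        "right": {"label": "non-adopter"},
    },
}


def extract_rules(node, conditions=()):
    """Return one 'IF ... THEN ...' rule for every root-to-leaf path."""
    if "label" in node:
        antecedent = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {antecedent} THEN {node['label']}"]
    feature, threshold = node["feature"], node["threshold"]
    # The left branch holds samples at or below the threshold, the right branch the rest.
    left_rules = extract_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right_rules = extract_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left_rules + right_rules
```

Each rule is the conjunction of the split conditions along one path, which is what makes tree output directly usable for customer targeting.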


2019 ◽  
Vol 15 (3) ◽  
Author(s):  
Grzegorz M. Wójcik ◽  
Andrzej Kawiak ◽  
Lukasz Kwasniewicz ◽  
Piotr Schneider ◽  
Jolanta Masiak

Abstract The Event-Related Potentials were investigated in a group of 70 participants using a dense array electroencephalographic amplifier with a photogrammetry geodesic station. The source localisation was computed for each participant. The activity of the Brodmann areas (BAs) involved in the brain cortical activity of each participant was measured. Then the mean electric charge flowing through particular areas was calculated. Five different machine learning tools (logistic regression, boosted decision tree, Bayes point machine, classic neural network, and averaged perceptron classifier) from the Azure ecosystem were trained, and their accuracy was tested in the task of distinguishing standard and target responses in the experiment. The efficiency of each tool was compared, and the best tools for our task were found to be logistic regression and the boosted decision tree. Such an approach can be useful in eliminating somatosensory responses in experimental psychology or even in establishing new communication protocols with mildly mentally disabled subjects.


2020 ◽  
Vol 10 (15) ◽  
pp. 5047 ◽  
Author(s):  
Viet-Ha Nhu ◽  
Danesh Zandi ◽  
Himan Shahabi ◽  
Kamran Chapi ◽  
Ataollah Shirzadi ◽  
...  

This paper aims to apply and compare the performance of three machine learning algorithms (support vector machine (SVM), Bayesian logistic regression (BLR), and alternating decision tree (ADTree)) to map landslide susceptibility along the mountainous road of the Salavat Abad saddle, Kurdistan province, Iran. Based on field surveys, we identified 66 shallow landslide locations by recording them with a Global Positioning System (GPS) device, Google Earth imagery, and black-and-white aerial photographs (scale 1:20,000), together with 19 landslide conditioning factors, and then tested these factors using the information gain ratio (IGR) technique. We checked the validity of the models using statistical metrics, including sensitivity, specificity, accuracy, kappa, root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC). We found that, although all three machine learning algorithms yielded excellent performance, the SVM algorithm (AUC = 0.984) slightly outperformed the BLR (AUC = 0.980) and ADTree (AUC = 0.977) algorithms. We observed not only that all three algorithms are useful and effective tools for identifying shallow landslide-prone areas, but also that the BLR algorithm can be used, like the SVM algorithm, as a soft computing benchmark for checking the performance of models in the future.
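Two of the validation metrics listed above, kappa and RMSE, are less familiar than accuracy and can be sketched briefly in Python. This is an illustrative sketch under assumptions: kappa is taken to be Cohen's kappa on hard class labels, RMSE is computed between 0/1 labels and predicted probabilities, and the function names are not from the paper.

```python
import math


def cohen_kappa(y_true, y_pred):
    """Agreement between two label lists, corrected for chance agreement."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    if expected == 1.0:
        return 1.0  # degenerate case: only one label appears in both lists
    return (observed - expected) / (1.0 - expected)


def rmse(y_true, y_prob):
    """Root mean square error between true labels and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_prob)]
    return math.sqrt(sum(squared) / len(squared))
```

Kappa is 0 when agreement is no better than chance and 1 when predictions match the labels exactly, which makes it more informative than raw accuracy on imbalanced landslide inventories.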


SinkrOn ◽  
2022 ◽  
Vol 7 (1) ◽  
pp. 59-65
Author(s):  
Artika Arista

Many people today are unsure whether they have COVID-19. Frequent fever, dry cough, and sore throat are all signs and symptoms of COVID-19. A person who has signs or symptoms of coronavirus disease 2019 (COVID-19) should see a doctor or go to a clinic as soon as possible. As a result, it is vital to learn and understand the fundamental differences. COVID-19 can cause a wide range of symptoms. The experiments were carried out using two machine learning classification algorithms, namely Decision Tree (DT) and Logistic Regression (LR). Both algorithms were written and analyzed in Python using Jupyter Notebook 6.4.5. In the experiments on the COVID-19 symptoms dataset, the DT model obtained the best average cross-validation and testing performance, with an accuracy of 98.0% in cross-validation and 98.0% in testing. The LR model obtained the second-best results, with an accuracy of 96.0% in cross-validation and 97.0% in testing. Consequently, the DT model outperforms the LR model on the COVID-19 symptoms dataset in both cross-validation and testing.
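The cross-validation accuracy reported above is an average over repeated train/test partitions of the same dataset. The sketch below shows a generic k-fold procedure in plain Python. The abstract does not state the number of folds or how the models were fitted, so `k=5`, the fixed seed, and the `fit`/`predict` callables are assumptions for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if k > n_samples:
        raise ValueError("k must not exceed the number of samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Average test accuracy over k folds.

    `fit(X_train, y_train)` returns a model; `predict(model, x)` returns a label.
    """
    scores = []
    for train, test in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(1 for i in test if predict(model, X[i]) == y[i])
        scores.append(correct / len(test))
    return sum(scores) / len(scores)
```

Because every sample is used for testing exactly once, the averaged score is less sensitive to a lucky or unlucky single split than one hold-out evaluation.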


10.2196/17110 ◽  
2020 ◽  
Vol 8 (3) ◽  
pp. e17110 ◽  
Author(s):  
Cheng-Sheng Yu ◽  
Yu-Jiun Lin ◽  
Chang-Hsien Lin ◽  
Sen-Te Wang ◽  
Shiyng-Yu Lin ◽  
...  

Background Metabolic syndrome is a cluster of disorders that significantly influence the development and deterioration of numerous diseases. FibroScan is an ultrasound device that was recently shown to predict metabolic syndrome with moderate accuracy. However, previous research regarding prediction of metabolic syndrome in subjects examined with FibroScan has been mainly based on conventional statistical models. Alternatively, machine learning, whereby a computer algorithm learns from prior experience, has better predictive performance than conventional statistical modeling. Objective We aimed to evaluate the accuracy of different decision tree machine learning algorithms to predict the state of metabolic syndrome in self-paid health examination subjects who were examined with FibroScan. Methods Multivariate logistic regression was conducted for every known risk factor of metabolic syndrome. Principal components analysis was used to visualize the distribution of metabolic syndrome patients. We further applied various statistical machine learning techniques to visualize and investigate the pattern and relationship between metabolic syndrome and several risk variables. Results Obesity, serum glutamic-oxaloacetic transaminase, serum glutamic pyruvic transaminase, controlled attenuation parameter score, and glycated hemoglobin emerged as significant risk factors in multivariate logistic regression. The area under the receiver operating characteristic curve values for classification and regression trees and for the random forest were 0.831 and 0.904, respectively. Conclusions Machine learning technology facilitates the identification of metabolic syndrome in self-paid health examination subjects with high accuracy.


2021 ◽  
Author(s):  
Matthew Nagy ◽  
Nathan Radakovich ◽  
Aziz Nazha

UNSTRUCTURED The rapid development of machine learning (ML) applications in healthcare promises to transform the healthcare landscape. For ML advancements to be used effectively in clinical care, the medical workforce must be prepared to handle these changes. Because physicians in training are exposed to a wide breadth of clinical tools during medical school, this period offers an ideal opportunity to introduce ML concepts. A foundational understanding of ML will not only be practically useful for clinicians but will also address ethical concerns in clinical decision making. While select medical schools have made efforts to integrate ML didactics and practice into their curricula, we argue that foundational ML principles should be taught broadly to medical students across the country.


Author(s):  
Krishna Kumar Mohbey

In any industry, attrition is a big problem, whether it is employee attrition at an organization or customer attrition at an e-commerce site. Accurately predicting which customers or employees will leave their current company or organization would save the employer much time, effort, and cost, help it hire or acquire substitutes in advance, and avoid disrupting the organization's ongoing progress. In this chapter, a comparative analysis of various machine learning approaches, namely Naïve Bayes, SVM, decision tree, random forest, and logistic regression, is presented. The presented results will help in identifying the behavior of employees who are likely to leave in the near future. Experimental results reveal that the logistic regression approach reaches up to 86% accuracy, outperforming the other machine learning approaches.

