Abstract PO-059: Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation based on machine learning algorithms

Author(s):  
Ja-Der Liang ◽  
Ta-Wei Yang ◽  
Po-When Chen ◽  
Cheng-Fu Chou ◽  
Yao-Ming Wu

The World Health Organization’s (WHO) 2018 report on diabetes states that the number of people with diabetes has risen from 108 million to 422 million since 1980. The fact sheet shows that prevalence among adults (18 years of age and older) increased from 4.7% to 8.5%. Major health complications of diabetes include kidney failure, heart disease, blindness, stroke, and lower limb amputation. This article applies supervised machine learning algorithms to the Pima Indian Diabetic dataset to explore risk patterns using predictive models. The models are built with the following supervised algorithms: Naïve Bayes, Decision Tree, Random Forest, Gradient Boosted Tree, and Tree Ensemble. The models are then compared on several performance measures: accuracy, precision, recall, and F-measure.
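The four performance measures named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name `classification_metrics` and the label vectors are hypothetical and chosen only for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` as the class of interest.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels (1 = diabetic, 0 = non-diabetic), not taken from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters on datasets where one class dominates, because accuracy alone can look high for a model that rarely predicts the minority class.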


Author(s):  
Yingjun Shen ◽  
Zhe Song ◽  
Andrew Kusiak

Abstract Wind farms need prediction models for predictive maintenance. There is a need to predict values of non-observable parameters beyond the ranges reflected in available data. A prediction model developed for one machine may not perform well on another, similar machine, usually because data-driven models lack generalizability. To increase the generalizability of predictive models, this research integrates data mining with first-principle knowledge. Physics-based principles are combined with machine learning algorithms through feature engineering, strong rules and divide-and-conquer. The proposed synergy concept is illustrated with wind turbine blade icing prediction and achieves significant prediction accuracy across different turbines. The proposed process is widely accepted by wind energy predictive maintenance practitioners because of its simplicity and efficiency. Furthermore, the testing scores of the KNN, CART and DNN algorithms are increased by 44.78%, 32.72% and 9.13%, respectively, with our proposed process. We demonstrate the importance of embedding physical principles within the machine learning process. We also highlight that the need for more complex machine learning algorithms in industrial big data mining is often much smaller than in other applications, making it essential to incorporate physics and follow the “Less is More” philosophy.


2013 ◽  
Vol 108 (11) ◽  
pp. 1723-1730 ◽  
Author(s):  
Amit G Singal ◽  
Ashin Mukherjee ◽  
Joseph B Elmunzer ◽  
Peter D R Higgins ◽  
Anna S Lok ◽  
...  

2016 ◽  
Vol 36 (suppl_1) ◽  
Author(s):  
Elsie G Ross ◽  
Nicholas Leeper ◽  
Nigam Shah

Introduction: Patients with peripheral artery disease (PAD) are at high risk of major adverse cardiac and cerebrovascular events (MACCE). However, no currently available risk scores accurately delineate which patients are most likely to sustain an event, creating a missed opportunity for more aggressive risk factor management. We set out to develop a novel predictive model, based on automated machine learning algorithms using electronic health record (EHR) data, with the aim of identifying which PAD patients are most likely to have an adverse outcome during follow-up. Methods: Data were derived from patients with a diagnosis of PAD at our institution. Novel machine-learning algorithms, including random forest and penalized regression predictive models, were developed using structured and unstructured data including lab values, diagnosis codes, medications, and clinical notes. Patients were matched for total follow-up time to remove length of patient records as a biasing factor in our predictive models. Results: After matching for length of follow-up, 3,807 patients were included in our models. A total of 1,269 patients had a MACCE after PAD diagnosis. The median time to MACCE was 2.8 years after PAD diagnosis. Using 1,492 different variables extracted from the EHR, our best predictive model predicted which patients would go on to have a MACCE after diagnosis of PAD with an AUC of 0.98 and a sensitivity, specificity and positive predictive value of 0.90, 0.96, and 0.93, respectively. Conclusions: Hypothesis-free machine-learning algorithms using freely available data in the EHR can accurately predict which PAD patients are most likely to go on to develop future MACCE. While these findings require validation in an independent data set, there is hope that these informatics approaches can be applied to the medical record in an automated fashion to risk stratify patients with vascular disease and identify, in real time, those who might benefit from more aggressive disease management.
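For readers unfamiliar with the AUC reported above, it can be read as the probability that a randomly chosen patient who had an event receives a higher risk score than a randomly chosen patient who did not. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the scores are hypothetical, and this is not the pipeline used in the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # event patient ranked above non-event patient
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores (1 = had an event, 0 = no event), for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of patients, so it suits small illustrations; rank-based implementations give the same value more efficiently on large cohorts.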


2017 ◽  
Vol 135 (3) ◽  
pp. 234-246 ◽  
Author(s):  
André Rodrigues Olivera ◽  
Valter Roesler ◽  
Cirano Iochpe ◽  
Maria Inês Schmidt ◽  
Álvaro Vigo ◽  
...  

ABSTRACT CONTEXT AND OBJECTIVE: Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task. DESIGN AND SETTING: Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil. METHODS: After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naïve Bayes, K-nearest neighbor and random forest. RESULTS: The best models were created using artificial neural networks and logistic regression. These achieved mean areas under the curve of 75.24% and 74.98%, respectively, in the error estimation step and 74.17% and 74.41% in the generalization testing step. CONCLUSION: Most of the predictive models produced similar results and demonstrated the feasibility of identifying individuals with the highest probability of having undiagnosed diabetes through easily obtained clinical data.
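The repeated tenfold cross-validation used in steps (i) to (iii) can be sketched as a generator of train/test index splits. This is an illustrative assumption about the mechanics only: the abstract does not say which software was used or whether folds were stratified, and the name `repeated_kfold`, the seed, and the sample size are hypothetical.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=3, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    if n_samples < n_splits:
        raise ValueError("need at least one sample per fold")
    rng = random.Random(seed)
    for _ in range(n_repeats):
        # Reshuffle before every repeat so each repeat uses a different partition.
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for fold in range(n_splits):
            # Every n_splits-th shuffled index, starting at `fold`, forms the test fold.
            test = indices[fold::n_splits]
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield train, test


# Tenfold cross-validation repeated three times on 50 hypothetical samples.
splits = list(repeated_kfold(n_samples=50, n_splits=10, n_repeats=3))
print(len(splits))
```

Each split would be used to fit a model on the training indices and score it on the test indices; averaging the scores over all splits gives the mean performance estimate, and repeating the partitioning reduces the dependence of that estimate on any single shuffle.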

