Risk prediction for malignant intraductal papillary mucinous neoplasm of the pancreas: logistic regression versus machine learning

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Jae Seung Kang ◽  
Chanhee Lee ◽  
Wookyeong Song ◽  
Wonho Choo ◽  
Seungyeoun Lee ◽  
...  

Abstract Most models for predicting malignant pancreatic intraductal papillary mucinous neoplasms were developed based on logistic regression (LR) analysis. Our study aimed to develop risk prediction models using machine learning (ML) and LR techniques and compare their performances. This was a multinational, multi-institutional, retrospective study. Clinical variables including age, sex, main duct diameter, cyst size, mural nodule, and tumour location were considered for model development (MD). After division into an MD set and a test set (2:1), the best ML and LR models were developed by training with the MD set using tenfold cross-validation. The areas under the receiver operating characteristic curve (AUCs) of the two models were calculated using an independent test set. A total of 3,708 patients were included. The stacked ensemble algorithm in the ML model and variable combinations containing all variables in the LR model were the most frequently chosen during 200 repetitions. After 200 repetitions, the mean AUCs of the ML and LR models were comparable (0.725 vs. 0.725). The LR model was more practical than its ML counterpart because of its convenience in clinical use and simple interpretability.
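The comparison above rests on the test-set AUC. As a minimal sketch of what that number measures, the function below computes the AUC as the fraction of (positive, negative) pairs that a model ranks correctly. The function name, the toy labels, and the toy scores are all hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = malignant, 0 = benign (not study data).
y_true = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]   # e.g. scores from one model
model_b = [0.2, 0.3, 0.6, 0.9, 0.5, 0.4]    # e.g. scores from another model
auc_a = auc(y_true, model_a)
auc_b = auc(y_true, model_b)
```

In this toy example the two models produce different scores but rank the same share of pairs correctly, so their AUCs coincide. That is one way two quite different models can end up with comparable discrimination.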

Author(s):  
Chenxi Huang ◽  
Shu-Xia Li ◽  
César Caraballo ◽  
Frederick A. Masoudi ◽  
John S. Rumsfeld ◽  
...  

Background: New methods such as machine learning techniques have been increasingly used to enhance the performance of risk predictions for clinical decision-making. However, commonly reported performance metrics may not be sufficient to capture the advantages of these newly proposed models for their adoption by health care professionals to improve care. Machine learning models often improve risk estimation for certain subpopulations that may be missed by these metrics. Methods and Results: This article addresses the limitations of commonly reported metrics for performance comparison and proposes additional metrics. Our discussions cover metrics related to overall performance, discrimination, calibration, resolution, reclassification, and model implementation. Models for predicting acute kidney injury after percutaneous coronary intervention are used to illustrate the use of these metrics. Conclusions: We demonstrate that commonly reported metrics may not have sufficient sensitivity to identify improvement of machine learning models and propose the use of a comprehensive list of performance metrics for reporting and comparing clinical risk prediction models.
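As a rough illustration of two of the metric families named above, overall performance and calibration, the sketch below computes a Brier score and a simple calibration-in-the-large gap. These are standard definitions, but the abstract does not list the article's exact metrics, so the choice of these two, the function names, and the toy data are assumptions.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted risk and observed outcome."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def calibration_in_the_large(labels, probs):
    """Mean predicted risk minus observed event rate.

    Zero means the model is right on average; a positive value means it
    over-predicts risk and a negative value means it under-predicts.
    """
    n = len(labels)
    return sum(probs) / n - sum(labels) / n


# Hypothetical toy data: 1 = event occurred, 0 = no event.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.2, 0.7, 0.8]
bs = brier_score(outcomes, predicted)
gap = calibration_in_the_large(outcomes, predicted)
```

Two models with the same discrimination can differ on either quantity, which is why a single summary metric may miss an improvement.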


2021 ◽  
Vol 12 (04) ◽  
pp. 778-787
Author(s):  
Rod L. Walker ◽  
Susan M. Shortreed ◽  
Rebecca A. Ziebell ◽  
Eric Johnson ◽  
Jennifer M. Boggs ◽  
...  

Abstract Background Suicide risk prediction models have been developed by using information from patients' electronic health records (EHR), but the time elapsed between model development and health system implementation is often substantial. Temporal changes in health systems and EHR coding practices necessitate the evaluation of such models in more contemporary data. Objectives A set of published suicide risk prediction models developed by using EHR data from 2009 to 2015 across seven health systems reported c-statistics of 0.85 for suicide attempt and 0.83 to 0.86 for suicide death. Our objective was to evaluate these models' performance with contemporary data (2014–2017) from these systems. Methods We evaluated performance using mental health visits (6,832,439 to mental health specialty providers and 3,987,078 to general medical providers) from 2014 to 2017 made by 1,799,765 patients aged 13+ across the health systems. No visits in our evaluation were used in the previous model development. Outcomes were suicide attempt (health system records) and suicide death (state death certificates) within 90 days following a visit. We assessed calibration and computed c-statistics with 95% confidence intervals (CI) and cut-point specific estimates of sensitivity, specificity, and positive/negative predictive value. Results Models were well calibrated; 46% of suicide attempts and 35% of suicide deaths in the mental health specialty sample were preceded by a visit (within 90 days) with a risk score in the top 5%. In the general medical sample, 53% of attempts and 35% of deaths were preceded by such a visit. Among these two samples, respectively, c-statistics were 0.862 (95% CI: 0.860–0.864) and 0.864 (95% CI: 0.860–0.869) for suicide attempt, and 0.806 (95% CI: 0.790–0.822) and 0.804 (95% CI: 0.782–0.829) for suicide death. 
Conclusion Performance of the risk prediction models in this contemporary sample was similar to historical estimates for suicide attempt but modestly lower for suicide death. These published models can inform clinical practice and patient care today.
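The cut-point specific estimates mentioned in the Methods can be sketched as follows. A visit is flagged when its risk score meets a threshold, and the four measures follow from the resulting 2x2 table. The function name, the threshold, and the toy data are hypothetical; in the study, the threshold would correspond to a percentile of the score distribution, such as the top 5%.

```python
def cutpoint_metrics(labels, scores, threshold):
    """Sensitivity, specificity, PPV and NPV when scores >= threshold are flagged."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged and y == 0:
            fp += 1
        elif not flagged and y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data: 1 = event within the follow-up window, 0 = none.
events = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
risk = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.4, 0.6, 0.05, 0.15]
m = cutpoint_metrics(events, risk, threshold=0.5)
```

Here sensitivity is the share of events preceded by a flagged visit, which is the kind of quantity reported in the Results for the top 5% of risk scores.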


2020 ◽  
Vol 9 (6) ◽  
pp. 1767 ◽  
Author(s):  
Charat Thongprayoon ◽  
Panupong Hansrivijit ◽  
Tarun Bathini ◽  
Saraschandra Vallabhajosyula ◽  
Poemlarp Mekraksakit ◽  
...  

Cardiac surgery-associated AKI (CSA-AKI) is common after cardiac surgery and has an adverse impact on short- and long-term mortality. Early identification of patients at high risk of CSA-AKI by applying risk prediction models allows clinicians to closely monitor these patients and initiate effective preventive and therapeutic approaches to lessen the incidence of AKI. Several risk prediction models and risk assessment scores have been developed for CSA-AKI. However, the definition of AKI and the variables utilized in these risk scores differ, making general utility complex. Recently, the utility of artificial intelligence coupled with machine learning has generated much interest and many studies in clinical medicine, including CSA-AKI. In this article, we discuss the evolution of models established by machine learning approaches to predict CSA-AKI.


Author(s):  
Isabelle Kaiser ◽  
Annette B. Pfahlberg ◽  
Wolfgang Uter ◽  
Markus V. Heppt ◽  
Marit B. Veierød ◽  
...  

The rising incidence of cutaneous melanoma over the past few decades has prompted substantial efforts to develop risk prediction models identifying people at high risk of developing melanoma to facilitate targeted screening programs. We review these models, regarding study characteristics, differences in risk factor selection and assessment, evaluation, and validation methods. Our systematic literature search revealed 40 studies comprising 46 different risk prediction models eligible for the review. Altogether, 35 different risk factors were part of the models with nevi being the most common one (n = 35, 78%); little consistency in other risk factors was observed. Results of an internal validation were reported for less than half of the studies (n = 18, 45%), and only 6 performed external validation. In terms of model performance, 29 studies assessed the discriminative ability of their models; other performance measures, e.g., regarding calibration or clinical usefulness, were rarely reported. Due to the substantial heterogeneity in risk factor selection and assessment as well as methodologic aspects of model development, direct comparisons between models are hardly possible. Uniform methodologic standards for the development and validation of risk prediction models for melanoma and reporting standards for the accompanying publications are necessary and need to be obligatory for that reason.


2021 ◽  
Author(s):  
Patricia J. Rodriguez ◽  
David L. Veenstra ◽  
Patrick J. Heagerty ◽  
Christopher H. Goss ◽  
Kathleen J. Ramos ◽  
...  

Author(s):  
Mirza Rizwan Sajid ◽  
Bader A. Almehmadi ◽  
Waqas Sami ◽  
Mansour K. Alzahrani ◽  
Noryanti Muhammad ◽  
...  

Criticism of the implementation of existing risk prediction models (RPMs) for cardiovascular diseases (CVDs) in new populations motivates researchers to develop regional models. The predominant usage of laboratory features in these RPMs is also causing reproducibility issues in low–middle-income countries (LMICs). Further, conventional logistic regression analysis (LRA) does not consider non-linear associations and interaction terms in developing these RPMs, which might oversimplify the phenomenon. This study aims to develop alternative machine learning (ML)-based RPMs that may perform better at predicting CVD status using nonlaboratory features in comparison to conventional RPMs. The data came from a case–control study conducted at the Punjab Institute of Cardiology, Pakistan. Data from 460 subjects, aged between 30 and 76 years, with (1:1) gender-based matching, were collected. We tested various ML models to identify the best model or models, with LRA as the baseline RPM. An artificial neural network and a linear support vector machine outperformed the conventional RPM on the majority of performance metrics. The predictive accuracies of the best-performing ML-based RPMs were between 80.86 and 81.09%, higher than the 79.56% of the baseline RPM. The discriminating capabilities of the ML-based RPMs were also comparable to those of the baseline RPM. Further, ML-based RPMs identified substantially different orders of features compared to the baseline RPM. This study concludes that nonlaboratory feature-based RPMs can be a good choice for early risk assessment of CVDs in LMICs. ML-based RPMs can identify a better order of features than the conventional approach, which subsequently provided models with improved prognostic capabilities.


2021 ◽  
Author(s):  
Ying Gao ◽  
Shu Li ◽  
Yujing Jin ◽  
Lengxiao Zhou ◽  
Shaomei Sun ◽  
...  

Background: Machine learning algorithms are well suited to cancer research, especially breast cancer, for the investigation and development of risk prediction models. Objective: To assess the performance of available machine learning-based breast cancer risk prediction models. Methods: As of June 9, 2021, articles on machine learning-based breast cancer risk prediction models were searched in PubMed, Embase, and Web of Science. Studies describing the development or validation of models for predicting future breast cancer risk were included. Pooled areas under the curve (AUCs) were calculated using the DerSimonian and Laird random-effects model. Results: A total of 8 studies with 10 datasets were included. Neural networks were the most common machine learning method for the development of risk prediction models. The pooled AUC of the optimal machine learning-based risk prediction model reported in each study was 0.73 (95% CI: 0.66-0.80), which was higher than that of traditional risk factor-based risk prediction models (all P for heterogeneity < 0.001). The pooled AUC of neural network-based risk prediction models was higher than that of non-neural network-based optimal risk prediction models (0.71 vs. 0.68). Subgroup analysis showed that models incorporating imaging features had a higher pooled AUC than models without imaging features (0.73 vs. 0.61; P for heterogeneity = 0.001). Conclusions: Machine learning-based breast cancer risk prediction models yielded good pooled prediction performance and promising results.
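The DerSimonian and Laird random-effects pooling named in the Methods can be sketched as below. This is a minimal illustration that pools estimates on their raw scale from per-study values and standard errors. Whether the authors pooled raw or transformed AUCs is not stated, so the raw scale is an assumption, and the function name and inputs are hypothetical.

```python
import math


def dersimonian_laird(estimates, std_errors, z=1.96):
    """Random-effects pooled estimate with the DerSimonian-Laird tau^2."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - z * se_pooled, pooled + z * se_pooled)
    return pooled, se_pooled, tau2, ci


# Hypothetical inputs: two studies reporting AUCs with equal standard errors.
pooled, se_pooled, tau2, ci = dersimonian_laird([0.6, 0.8], [0.1, 0.1])
```

When the studies disagree more than their standard errors would predict, tau^2 becomes positive, which widens the pooled confidence interval relative to a fixed-effect analysis.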


2021 ◽  
Vol 4 ◽  
Author(s):  
Samuel O. Danso ◽  
Zhanhang Zeng ◽  
Graciela Muniz-Terrera ◽  
Craig W. Ritchie

Alzheimer's disease (AD) has its onset many decades before dementia develops, and work is ongoing to characterise individuals at risk of decline on the basis of early detection through biomarker and cognitive testing as well as the presence/absence of identified risk factors. Risk prediction models for AD based on various computational approaches, including machine learning, are being developed with promising results. However, these approaches have been criticised as they are unable to generalise due to over-reliance on one data source, poor internal and external validations, and lack of understanding of prediction models, thereby limiting the clinical utility of these prediction models. We propose a framework that employs a transfer-learning paradigm with ensemble learning algorithms to develop explainable personalised risk prediction models for dementia. Our prediction models, known as source models, are initially trained and tested using a publicly available dataset (n = 84,856, mean age = 69 years) with 14 years of follow-up samples to predict the individual risk of developing dementia. The decision boundaries of the best source model are further updated by using an alternative dataset from a different and much younger population (n = 473, mean age = 52 years) to obtain an additional prediction model known as the target model. We further apply the SHapley Additive exPlanations (SHAP) algorithm to visualise the risk factors responsible for the prediction at both population and individual levels. The best source model achieves a geometric accuracy of 87%, specificity of 99%, and sensitivity of 76%. In comparison to a baseline model, our target model achieves better performance across several performance metrics, with increases in geometric accuracy of 16.9%, specificity of 2.7%, sensitivity of 19.1%, and area under the receiver operating characteristic curve (AUROC) of 11%, and a transfer learning efficacy rate of 20.6%.
The strength of our approach is the large sample size used in training the source model, transferring and applying the “knowledge” to another dataset from a different and undiagnosed population for the early detection and prediction of dementia risk, and the ability to visualise the interaction of the risk factors that drive the prediction. This approach has direct clinical utility.
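The abstract does not define "geometric accuracy". One common reading, assumed here, is the geometric mean of sensitivity and specificity. The sketch below checks that this reading is consistent with the reported source-model figures; the function name is hypothetical.

```python
import math


def geometric_accuracy(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sensitivity * specificity)


# Reported source-model figures: sensitivity 76%, specificity 99%.
g = geometric_accuracy(0.76, 0.99)
```

Under this assumption the value rounds to 0.87, matching the reported 87%. Because the geometric mean drops sharply when either component is low, it penalises a model that gains specificity by sacrificing sensitivity.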

