Development of Nonlaboratory-Based Risk Prediction Models for Cardiovascular Diseases Using Conventional and Machine Learning Approaches

Author(s):  
Mirza Rizwan Sajid ◽  
Bader A. Almehmadi ◽  
Waqas Sami ◽  
Mansour K. Alzahrani ◽  
Noryanti Muhammad ◽  
...  

Criticism of the implementation of existing risk prediction models (RPMs) for cardiovascular diseases (CVDs) in new populations motivates researchers to develop regional models. The predominant usage of laboratory features in these RPMs is also causing reproducibility issues in low–middle-income countries (LMICs). Further, conventional logistic regression analysis (LRA) does not consider non-linear associations and interaction terms in developing these RPMs, which might oversimplify the phenomenon. This study aims to develop alternative machine learning (ML)-based RPMs that may perform better at predicting CVD status using nonlaboratory features in comparison to conventional RPMs. The data came from a case–control study conducted at the Punjab Institute of Cardiology, Pakistan. Data from 460 subjects, aged between 30 and 76 years, with (1:1) gender-based matching, were collected. We tested various ML models to identify the best model or models, taking LRA as the baseline RPM. An artificial neural network and a linear support vector machine outperformed the conventional RPM on the majority of performance metrics. The predictive accuracies of the best-performing ML-based RPMs were between 80.86 and 81.09%, higher than the 79.56% of the baseline RPM. The discriminating capabilities of the ML-based RPMs were also comparable to those of the baseline RPM. Further, ML-based RPMs identified substantially different orders of features as compared to the baseline RPM. This study concludes that nonlaboratory feature-based RPMs can be a good choice for early risk assessment of CVDs in LMICs. ML-based RPMs can identify a better ordering of features than the conventional approach, which subsequently provided models with improved prognostic capabilities.

2021 ◽  
Author(s):  
Lily D Yan ◽  
Jean Lookens Pierre ◽  
Vanessa Rouzier ◽  
Michel Theard ◽  
Alexandra Apollon ◽  
...  

Background: Cardiovascular diseases (CVD) are rapidly increasing in low-middle income countries (LMICs). Accurate risk assessment is essential to reduce premature CVD by targeting primary prevention and risk factor treatment among high-risk groups. Available CVD risk prediction models are built on predominantly Caucasian, high-income country populations, and have not been evaluated in LMIC populations. Objective: To compare the predicted 10-year risk of CVD and identify high-risk groups for targeted prevention and treatment in Haiti. Methods: We used cross-sectional data within the Haiti CVD Cohort Study, including 653 adults ≥ 40 years without known history of CVD and with complete data. Six CVD risk prediction models were compared: pooled cohort equations (PCE), adjusted PCE with updated cohorts, Framingham CVD Lipids, Framingham CVD Body Mass Index (BMI), WHO Lipids, and WHO BMI. Risk factors were measured during clinical exams. Primary outcome was continuous and categorical predicted 10-year CVD risk. Secondary outcome was statin eligibility. Results: Seventy percent were female, 65.5% lived on a daily income of ≤1 USD, 57.0% had hypertension, 14.5% had hypercholesterolemia, 9.3% had diabetes mellitus, 5.5% were current smokers, and 2.0% had HIV. Predicted 10-year CVD risk ranged from 3.9% in adjusted PCE (IQR 1.7-8.4) to 9.8% in Framingham-BMI (IQR 5.0-17.8), and Spearman rank correlation coefficients ranged from 0.87 to 0.98. The percent of the cohort categorized as high risk using the uniform threshold of 10-year CVD risk ≥ 7.5% ranged from 28.8% in the adjusted PCE model to 62.0% in the Framingham-BMI model (χ2 = 331, p value < 0.001). Statin eligibility also varied widely. Conclusions: In the Haiti CVD Cohort, there was substantial variation in the proportion identified as high-risk and statin eligible using existing models, leading to very different treatment recommendations and public health implications depending on which prediction model is chosen. There is a need to design and validate CVD risk prediction tools for low-middle income countries that include locally relevant risk factors.
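
The effect of applying one uniform threshold to several models can be illustrated with a minimal sketch. The function name `high_risk_share` and the risk values below are hypothetical and are not taken from the Haiti cohort; only the 7.5% cut-off comes from the abstract.

```python
from typing import Dict, List


def high_risk_share(risks: List[float], threshold: float = 7.5) -> float:
    """Return the percentage of people whose predicted risk (in %) is >= threshold."""
    if not risks:
        raise ValueError("risks must not be empty")
    flagged = sum(1 for r in risks if r >= threshold)
    return 100.0 * flagged / len(risks)


# Hypothetical predicted 10-year risks (%) for the same five people under two models.
predictions: Dict[str, List[float]] = {
    "model_a": [2.0, 5.0, 7.5, 9.0, 12.0],
    "model_b": [4.0, 8.0, 10.0, 15.0, 20.0],
}

# Share of the (made-up) cohort labelled high risk by each model.
shares = {name: high_risk_share(r) for name, r in predictions.items()}
print(shares)
```

Even when two models rank people almost identically, a fixed cut-off can label very different fractions of the same cohort as high risk, which is the pattern the abstract reports.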


2020 ◽  
Vol 9 (6) ◽  
pp. 1767 ◽  
Author(s):  
Charat Thongprayoon ◽  
Panupong Hansrivijit ◽  
Tarun Bathini ◽  
Saraschandra Vallabhajosyula ◽  
Poemlarp Mekraksakit ◽  
...  

Cardiac surgery-associated AKI (CSA-AKI) is common after cardiac surgery and has an adverse impact on short- and long-term mortality. Early identification of patients at high risk of CSA-AKI by applying risk prediction models allows clinicians to closely monitor these patients and initiate effective preventive and therapeutic approaches to lessen the incidence of AKI. Several risk prediction models and risk assessment scores have been developed for CSA-AKI. However, the definition of AKI and the variables utilized in these risk scores differ, making general utility complex. Recently, the utility of artificial intelligence coupled with machine learning has generated much interest and many studies in clinical medicine, including CSA-AKI. In this article, we discuss the evolution of models established by machine learning approaches to predict CSA-AKI.


2021 ◽  
Vol 10 (4) ◽  
pp. 199
Author(s):  
Francisco M. Bellas Aláez ◽  
Jesus M. Torres Palenzuela ◽  
Evangelos Spyrakos ◽  
Luis González Vilas

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.
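
The trade-off between sensitivity, specificity and precision described above can be made concrete with a small sketch. The confusion-matrix counts are invented for illustration (20 bloom cases against 180 non-bloom cases) and do not come from the study; `detection_rates` is a hypothetical helper name.

```python
from typing import Dict


def detection_rates(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute sensitivity, specificity and precision from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    return {
        "sensitivity": tp / (tp + fn),  # share of true events that were detected
        "specificity": tn / (tn + fp),  # share of non-events correctly dismissed
        "precision": tp / (tp + fp),    # share of alarms that were real events
    }


# Hypothetical unbalanced dataset: 20 events, 180 non-events.
high_sensitivity_model = detection_rates(tp=18, fp=36, tn=144, fn=2)
balanced_model = detection_rates(tp=15, fp=9, tn=171, fn=5)

print(high_sensitivity_model)
print(balanced_model)
```

In this made-up case the first model catches more events but two thirds of its alarms are false, while the second gives up some sensitivity for far fewer false alarms.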


Author(s):  
Chenxi Huang ◽  
Shu-Xia Li ◽  
César Caraballo ◽  
Frederick A. Masoudi ◽  
John S. Rumsfeld ◽  
...  

Background: New methods such as machine learning techniques have been increasingly used to enhance the performance of risk predictions for clinical decision-making. However, commonly reported performance metrics may not be sufficient to capture the advantages of these newly proposed models for their adoption by health care professionals to improve care. Machine learning models often improve risk estimation for certain subpopulations that may be missed by these metrics. Methods and Results: This article addresses the limitations of commonly reported metrics for performance comparison and proposes additional metrics. Our discussions cover metrics related to overall performance, discrimination, calibration, resolution, reclassification, and model implementation. Models for predicting acute kidney injury after percutaneous coronary intervention are used to illustrate the use of these metrics. Conclusions: We demonstrate that commonly reported metrics may not have sufficient sensitivity to identify improvement of machine learning models and propose the use of a comprehensive list of performance metrics for reporting and comparing clinical risk prediction models.
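
Two of the metric families listed above can be sketched in a few lines: the Brier score as an overall performance measure and the area under the ROC curve (AUC) as a discrimination measure. This is an illustrative sketch, not the article's own code; the labels and probabilities are made up, and the AUC is computed by direct pairwise comparison, which is only practical for small samples.

```python
from typing import List


def brier_score(y_true: List[int], y_prob: List[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: List[int], y_prob: List[float]) -> float:
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and predicted risks for four patients.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]

print("Brier score:", brier_score(labels, probs))
print("AUC:", auc(labels, probs))
```

The two numbers answer different questions: AUC depends only on how cases are ranked, whereas the Brier score also penalises probabilities that are far from the observed outcome, so a model can improve on one without moving the other.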


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Lei Li ◽  
Desheng Wu

Purpose: The infraction of securities regulations (ISRs) by listed firms in their day-to-day operations and management has become a common problem. This paper proposes several machine learning approaches to forecast the risk of infractions by listed corporates, addressing the problem that supervision is currently neither effective nor precise. Design/methodology/approach: The proposed research framework for forecasting ISRs includes data collection and cleaning, feature engineering, data split, prediction approach application and model performance evaluation. We select Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Network and Long Short-Term Memory Networks (LSTMs) as ISRs prediction models. Findings: The research results show that the proposed models predict ISRs significantly better when prior infractions are included than when they are not, especially for large sample sets. The results also indicate that, when judging whether a company has infractions, we should pay attention to novel artificial intelligence methods, previous infractions of the company, and large data sets. Originality/value: The findings could be utilized, to a certain degree, to address the problem of identifying listed corporates' ISRs. Overall, the results elucidate the value of prior ISRs. This shows the importance of including more data sources when constructing distress models rather than focusing only on building increasingly complex models on the same data. This is also beneficial to the regulatory authorities.


2021 ◽  
Vol 297 ◽  
pp. 01073
Author(s):  
Sabyasachi Pramanik ◽  
K. Martin Sagayam ◽  
Om Prakash Jena

Cancer has been described as a diverse illness with several distinct subtypes that may occur simultaneously. As a result, early detection and prognosis of cancer types have become essential in cancer research, since they may help to improve the clinical treatment of cancer patients. The importance of classifying cancer patients into higher- or lower-risk categories has prompted numerous research teams from the bioscience and genomics fields to investigate the use of machine learning (ML) algorithms in cancer diagnosis and treatment. Because of this, these methods have been used with the goal of modelling the development and treatment of malignant diseases in humans. Furthermore, the capacity of machine learning techniques to identify important characteristics from complicated datasets demonstrates the significance of these technologies. These technologies include Bayesian networks and artificial neural networks, along with a number of other approaches. Decision Trees and Support Vector Machines, which have already been extensively used in cancer research for the creation of predictive models, also lead to accurate decision making. The application of machine learning techniques may undoubtedly enhance our knowledge of cancer development; nevertheless, a sufficient degree of validation is required before these approaches can be considered for use in daily clinical practice. An overview of current machine learning approaches used to model cancer development is presented in this paper. All of the supervised machine learning approaches described here, along with a variety of input characteristics and data samples, are used to build the prediction models. In light of the increasing trend towards the use of machine learning methods in biomedical research, we present the most recent papers that have used these approaches to predict cancer risk or patient outcomes in order to better understand cancer.


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
A Banerjee ◽  
S Chen ◽  
G Fatemifar ◽  
H Hemingway ◽  
T Lumbers ◽  
...  

Introduction: Heart failure (HF), acute coronary syndromes (ACS) and atrial fibrillation (AF) are among the commonest cardiovascular diseases (CVD), frequently co-exist and share pathophysiology. Definitions of diagnosis and prognosis are suboptimal. Machine learning (ML) is increasingly used in subtype definition and risk prediction, but the design, methods and results of studies have not been appraised. Purpose: To conduct a systematic review of ML for discovery of new subtypes and risk prediction in HF, ACS and AF. Methods: PubMed, MEDLINE, and Web of Science databases were searched (January 2000-August 2018) for English language publications with agreed search terms pertaining to machine learning, clustering, CVD, subtype and risk prediction. The baseline characteristics of the study population, the method of ML, covariates and results were extracted for each study. Results: Of 5012 identified studies, 43 met inclusion criteria. Of the 33 studies of unsupervised ML for disease clustering (mean n=2354; min 117, max 44886), there were 22 in HF, 9 in ACS and 2 in AF. 22/33 studies involved <1000 individuals and 24 were based in North America. Across diseases, 27 studies were in outpatients, and 5 used trial data. The mean number of covariates used was 26; most commonly demographic and symptom variables. The ML methods used were partitional (n=12), hierarchical (n=4), self-organising map (n=1) and hidden Markov model (n=1). Most studies used only one ML method (n=25). Only 15 studies validated or replicated findings. Most studies found 2–3 clusters (20/33) and most clusters were based on physical or physiological characteristics (30/33). Of the 10 studies of supervised ML for risk prediction (mean n=43003; min 228, max 378256), 4 were in HF, 5 in ACS and 1 in AF. 2/11 studies involved <1000 individuals and most were from North America (n=6). All studies had an observational design, used at least 2 ML methods and validated or replicated findings. The setting was varied: primary care (n=2), emergency department (n=2), inpatient (n=4) and mixed (n=2). The mean number of covariates was 102. The commonest ML methods were neural networks (n=5), random forest (n=4) and support vector machine (n=4). All studies showed positive findings, i.e. ML approaches improved risk prediction. Conclusions: Studies to date of ML in HF, ACS and AF have focused on North America (68.2%), and 50% included fewer than 1000 individuals. Moreover, there is heterogeneity in clinical setting, study designs for data collection and ML methods used. Comparison between methods of ML and validation are common to studies of risk prediction but not disease clustering. There is likely to be a publication bias of ML studies in HF, AF and ACS. ML may improve data-driven characterisation of CVD but consensus guidelines for reporting of research using ML are urgently needed to ensure the internal and external validity and applicability of study findings. Acknowledgement/Funding: Innovative Medicines Initiative (European Union)


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Jae Seung Kang ◽  
Chanhee Lee ◽  
Wookyeong Song ◽  
Wonho Choo ◽  
Seungyeoun Lee ◽  
...  

Most models for predicting malignant pancreatic intraductal papillary mucinous neoplasms were developed based on logistic regression (LR) analysis. Our study aimed to develop risk prediction models using machine learning (ML) and LR techniques and compare their performances. This was a multinational, multi-institutional, retrospective study. Clinical variables including age, sex, main duct diameter, cyst size, mural nodule, and tumour location were factors considered for model development (MD). After the division into an MD set and a test set (2:1), the best ML and LR models were developed by training with the MD set using tenfold cross-validation. The test areas under the receiver operating characteristic curves (AUCs) of the two models were calculated using an independent test set. A total of 3,708 patients were included. The stacked ensemble algorithm in the ML model and variable combinations containing all variables in the LR model were the most chosen during 200 repetitions. After 200 repetitions, the mean AUCs of the ML and LR models were comparable (0.725 vs. 0.725). The performances of the ML and LR models were comparable. The LR model was more practical than its ML counterpart, because of its convenience in clinical use and simple interpretability.
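
The data partitioning described above (a 2:1 split into development and test sets, then tenfold cross-validation inside the development set) can be sketched as follows. This is an illustration under assumptions: the abstract does not say whether the split was stratified or how the 200 repetitions were seeded, so the sketch uses a plain shuffled split with a hypothetical `seed` argument and only produces index sets, not fitted models.

```python
import random
from typing import Iterator, List, Tuple


def split_two_to_one(n: int, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and split them 2:1 into development and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = (2 * n) // 3
    return idx[:cut], idx[cut:]


def k_folds(indices: List[int], k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, folds[i]


# 3,708 is the cohort size from the abstract; the seed is an arbitrary assumption.
dev_idx, test_idx = split_two_to_one(3708, seed=0)
folds = list(k_folds(dev_idx, k=10))

print(len(dev_idx), len(test_idx), len(folds))
```

Repeating the procedure with a different seed each time would give the kind of repeated evaluation the abstract mentions, with the test indices never used during cross-validation.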


2019 ◽  
Vol 22 (3) ◽  
pp. 125-128 ◽  
Author(s):  
Daniel Whiting ◽  
Seena Fazel

Prediction models assist in stratifying and quantifying an individual’s risk of developing a particular adverse outcome, and are widely used in cardiovascular and cancer medicine. Whether these approaches are accurate in predicting self-harm and suicide has been questioned. We searched for systematic reviews in the suicide risk assessment field, and identified three recent reviews that have examined current tools and models derived using machine learning approaches. In this clinical review, we present a critical appraisal of these reviews, and highlight three major limitations that are shared between them. First, structured tools are not compared with the unstructured assessments that are routine in clinical practice. Second, they do not sufficiently consider a range of performance measures, including negative predictive value and calibration. Third, the potential role of these models as clinical adjuncts is not taken into consideration. We conclude by presenting the view that the role of prediction models for self-harm and suicide is currently not known, and discuss some methodological issues and implications of some machine learning and other analytic techniques for clinical utility.

