Machine Learning and Statistical Approaches for Classification of Risk of Coronary Artery Disease using Plasma Cytokines

2020 ◽  
Author(s):  
Seema Singh Saharan ◽  
Pankaj Nagar ◽  
Kate Townsend Creasy ◽  
Eveline O. Stock ◽  
James Feng ◽  
...  

Abstract

Background: As per the 2017 WHO fact sheet, Coronary Artery Disease (CAD) is the primary cause of death in the world and accounts for 31% of total fatalities. The unprecedented 17.6 million deaths caused by CAD in 2016 underscore the urgent need to facilitate proactive and accelerated pre-emptive diagnosis. The current research implemented K Nearest Neighbor (k-NN) and ensemble Random Forest Machine Learning algorithms to achieve a targeted “At Risk” CAD classification. To ensure better generalizability, mechanisms such as k-fold cross validation, hyperparameter tuning and statistical significance testing (p<.05) were employed. The classification is also unique in incorporating 35 cytokines as biomarkers within the predictive feature space of the Machine Learning algorithms.

Results: A total of seven classifiers were developed, four built using 35 cytokine predictive features and three built using the 9 cytokines that differed significantly (p<.05) between the CAD and Control groups, as determined by independent two-sample t tests. The best prediction accuracy of 100% was achieved by the Random Forest ensemble using the nine significant cytokines. Significant cytokines were selected to decrease the noise level of the data, allowing for better classification. Additionally, from the biomedical perspective, it was informative to observe empirically the interplay of the cytokines. Compared to Controls, the moderately correlated (correlation coefficient r=.5) cytokines “IL1-β” and “IL-10” were both significant and downregulated in the CAD group. Both cytokines were primarily responsible for the 100% classification generated by the Random Forest. In conjunction with Machine Learning (ML) algorithms, traditional statistical techniques such as correlation and t tests were leveraged to obtain insights that brought forth a role for cytokines in the investigation of CAD risk.

Conclusions: Presently, as large-scale efforts are gaining momentum to enable early detection of individuals at risk for CAD by the application of novel and powerful ML algorithms, detection can be further improved by incorporating additional biomarkers. Investigation of the emerging role of cytokines in CAD can materially enhance the detection of risk and the discovery of mechanisms of disease that can lead to new therapeutic approaches.
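
The feature-selection step described above relies on independent two-sample t tests between the CAD and Control groups. A minimal sketch of the test statistic follows. It assumes the pooled (equal-variance) form, which the abstract does not specify; the function name and the sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for an independent two-sample t test.

    Assumption: both groups share a common variance (Student's pooled form).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical cytokine levels in arbitrary units, for illustration only.
cad = [1.0, 2.0, 3.0, 4.0, 5.0]
control = [3.0, 4.0, 5.0, 6.0, 7.0]
t_stat, dof = pooled_t_statistic(cad, control)
print(t_stat, dof)
```

With 8 degrees of freedom the two-sided critical value at the .05 level is about 2.306, so a cytokine with these toy values would not pass the p<.05 filter. In practice a statistics library would supply the exact p-value.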

2020 ◽  
Author(s):  
Seema Singh Saharan ◽  
Pankaj Nagar ◽  
Kate Townsend Creasy ◽  
Eveline O. Stock ◽  
James Feng ◽  
...  

Abstract

Background: As per the 2017 WHO fact sheet, Coronary Artery Disease (CAD) is the primary cause of death in the world and accounts for 31% of total fatalities. The unprecedented 17.6 million deaths caused by CAD in 2016 underscore the urgent need to facilitate proactive and accelerated pre-emptive diagnosis. Innovative and emerging Machine Learning (ML) techniques can be leveraged to facilitate early detection of CAD, which is a crucial factor in saving lives. Standard techniques such as angiography, which provide reliable evidence, are invasive and typically very expensive and risky. In contrast, ML model generated diagnosis is non-invasive, fast, accurate and affordable. Therefore, it can be used as a supplement or precursor to the conventional methods. This research demonstrates the implementation of K Nearest Neighbor (k-NN) and Random Forest ML algorithms to achieve a targeted “At Risk” CAD classification using an emerging set of 35 cytokine biomarkers that are strongly indicative predictive variables and potential targets for therapy. To ensure better generalizability, mechanisms such as data balancing, k-fold cross validation for hyperparameter tuning, and feature selection via feature importance identification were integrated within the models.

Results: A total of 5 classifiers were developed, two built using 35 cytokine predictive features and three built using a subset of cytokines selected by variable importance techniques, namely Random Forest, ReliefF and Boruta. The best Area under Receiver Operating Characteristic (AUROC) based accuracy of .99 was achieved by the Random Forest classifier with 35 cytokine biomarkers. The second-best AUROC accuracy was achieved by the k-NN model using cytokines selected by the Random Forest variable importance selection mechanism.

Conclusions: Presently, as large-scale efforts are gaining momentum to enable early, fast, reliable, affordable, and accessible detection of individuals at risk for CAD, the application of powerful ML algorithms can be leveraged as a supplement to conventional methods such as angiography. Early detection can be further improved by incorporating 65 novel and sensitive cytokine biomarkers. Investigation of the emerging role of cytokines in CAD can materially enhance the detection of risk and the discovery of mechanisms of disease that can lead to new therapeutic modalities.
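
The use of k-fold cross validation to tune a k-NN hyperparameter, as mentioned above, can be sketched in plain Python. This is an illustrative sketch only, not the authors' pipeline: the helper names, the candidate values of k, the unstratified fold assignment, and the synthetic two-feature data are all assumptions.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[f::n_folds] for f in range(n_folds)]


def cv_accuracy(xs, ys, k, n_folds=5):
    """Mean held-out accuracy of k-NN across the folds."""
    fold_scores = []
    for fold in kfold_indices(len(xs), n_folds):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        hits = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in fold)
        fold_scores.append(hits / len(fold))
    return sum(fold_scores) / len(fold_scores)


# Synthetic data: two well-separated clusters (not data from the study).
rng = random.Random(42)
xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
xs += [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(30)]
ys = ["control"] * 30 + ["cad"] * 30

# Hyperparameter tuning: keep the k with the best cross-validated accuracy.
scores = {k: cv_accuracy(xs, ys, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

Every candidate k is scored on the same folds, so the comparison is not confounded by different splits. A stratified split would additionally keep the class ratio constant within each fold.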


Author(s):  
Harinder Singh ◽  
Tasneem Bano Rehman ◽  
Ch. Gangadhar ◽  
Rohit Anand ◽  
Nidhi Sindhwani ◽  
...  

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Seema Singh Saharan ◽  
Pankaj Nagar ◽  
Kate Townsend Creasy ◽  
Eveline O. Stock ◽  
James Feng ◽  
...  

Abstract

Background: As per the 2017 WHO fact sheet, Coronary Artery Disease (CAD) is the primary cause of death in the world and accounts for 31% of total fatalities. The unprecedented 17.6 million deaths caused by CAD in 2016 underscore the urgent need to facilitate proactive and accelerated pre-emptive diagnosis. Innovative and emerging Machine Learning (ML) techniques can be leveraged to facilitate early detection of CAD, which is a crucial factor in saving lives. Standard techniques such as angiography, which provide reliable evidence, are invasive and typically expensive and risky. In contrast, ML model generated diagnosis is non-invasive, fast, accurate and affordable. Therefore, ML algorithms can be used as a supplement or precursor to the conventional methods. This research demonstrates the implementation and comparative analysis of K Nearest Neighbor (k-NN) and Random Forest ML algorithms to achieve a targeted “At Risk” CAD classification using an emerging set of 35 cytokine biomarkers that are strongly indicative predictive variables and potential targets for therapy. To ensure better generalizability, mechanisms such as data balancing and repeated k-fold cross validation for hyperparameter tuning were integrated within the models. To determine how well the models separate “At Risk” CAD from Control, the Area under the Receiver Operating Characteristic curve (AUROC) metric is used, which summarizes the tradeoff between the false positive and true positive rates.

Results: A total of 2 classifiers were developed, both built using 35 cytokine predictive features. The best AUROC score of .99, with a 95% Confidence Interval (CI) of (.982, .999), was achieved by the Random Forest classifier using 35 cytokine biomarkers. The second-best AUROC score of .954, with a 95% CI of (.929, .979), was achieved by the k-NN model using 35 cytokines. A p-value of less than 7.481e-10 obtained by an independent t-test indicated that the Random Forest classifier performed significantly better than the k-NN classifier with regard to the AUROC score.

Conclusions: Presently, as large-scale efforts are gaining momentum to enable early, fast, reliable, affordable, and accessible detection of individuals at risk for CAD, the application of powerful ML algorithms can be leveraged as a supplement to conventional methods such as angiography. Early detection can be further improved by incorporating 65 novel and sensitive cytokine biomarkers. Investigation of the emerging role of cytokines in CAD can materially enhance the detection of risk and the discovery of mechanisms of disease that can lead to new therapeutic modalities.
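
The AUROC metric used above has a simple interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. The function name, labels and scores are hypothetical, and this is not the computation used in the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores: 1 = "At Risk" CAD, 0 = Control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
area = auroc(labels, scores)
print(round(area, 3))  # 8 of 9 pairs are ranked correctly
```

The pairwise form is quadratic in the number of samples, which is acceptable for small cohorts. Library implementations reach the same value by sorting the scores.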


2021 ◽  
Vol 8 ◽  
Author(s):  
Chen Wang ◽  
Yue Zhao ◽  
Bingyu Jin ◽  
Xuedong Gan ◽  
Bin Liang ◽  
...  

Early identification of coronary artery disease (CAD) can prevent the progression of CAD and effectively lower the mortality rate, so we set out to construct and validate a machine learning model to predict the risk of CAD based on conventional risk factors and lab test data. A total of 3,112 CAD patients and 3,182 controls were enrolled from three centers in China. We compared the baseline and clinical characteristics between the two groups. Then, the Random Forest algorithm was used to construct a model to predict CAD, and the model was assessed by the receiver operating characteristic (ROC) curve. In the development cohort, the Random Forest model showed a good AUC of 0.948 (95% CI: 0.941–0.954) in distinguishing CAD patients from controls, with a sensitivity of 90%, a specificity of 85.4%, a positive predictive value of 0.863 and a negative predictive value of 0.894. Validation of the model also yielded favorable discriminatory ability: the AUC, sensitivity, specificity, positive predictive value, and negative predictive value were 0.944 (95% CI: 0.934–0.955), 89.5%, 85.8%, 0.868, and 0.886 in validation cohort 1, and 0.940 (95% CI: 0.922–0.960), 79.5%, 94.3%, 0.932, and 0.823 in validation cohort 2. An easy-to-use tool that combines 15 indexes to assess CAD risk was constructed and validated using the Random Forest algorithm and showed favorable predictive capability (http://45.32.120.149:3000/randomforest). Our model is extremely valuable for clinical practice and will be helpful for the management and primary prevention of CAD.
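
The sensitivity, specificity, positive predictive value and negative predictive value reported above all derive from the four cells of a confusion matrix. A minimal sketch of those definitions follows. The function name and the counts are hypothetical, chosen for illustration, and are not the cohort counts from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that were detected
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are correct
        "npv": tn / (tn + fn),          # share of negative calls that are correct
    }


# Hypothetical counts: 100 CAD patients and 100 controls.
metrics = diagnostic_metrics(tp=90, fp=15, tn=85, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the classifier, whereas the predictive values also depend on the case-to-control ratio of the cohort, so they can shift between cohorts even when the model is unchanged.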


Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 961
Author(s):  
Yu-Cheng Hsu ◽  
I-Jung Tsai ◽  
Hung Hsu ◽  
Po-Wen Hsu ◽  
Ming-Hui Cheng ◽  
...  

Machine learning (ML) algorithms have been applied to predicting coronary artery disease (CAD). Our purpose was to use autoantibody isotypes against four different unmodified and malondialdehyde (MDA)-modified peptides among Taiwanese with CAD and healthy controls (HCs) for CAD prediction. In this study, levels of MDA, MDA-modified protein (MDA-protein) adducts, and autoantibody isotypes against unmodified peptides and MDA-modified peptides were measured with enzyme-linked immunosorbent assay (ELISA). To improve the performance of ML, we used decision tree (DT), random forest (RF), and support vector machine (SVM) classifiers coupled with five-fold cross validation and parameter optimization. Levels of plasma MDA and MDA-protein adducts were higher in CAD patients than in HCs. IgM anti-IGKC76–99 MDA and IgM anti-A1AT284–298 MDA decreased the most in patients with CAD compared to HCs. In the CAD prediction experiments, the decision tree classifier achieved an area under the curve (AUC) of 0.81, the random forest classifier achieved an AUC of 0.94, and the support vector machine achieved an AUC of 0.65 for differentiating between CAD patients with stenosis rates of 70% and HCs. In this study, we demonstrated that autoantibody isotypes imported into machine learning algorithms can lead to accurate models for clinical use.

