Identifying relatively high-risk group of coronary artery calcification based on progression rate: Statistical and machine learning methods

Author(s):  
Ha-Young Kim ◽  
Sanghyun Yoo ◽  
Jihyun Lee ◽  
Hye Jin Kam ◽  
Kyoung-Gu Woo ◽  
...  
2020 ◽  
Vol 21 (Supplement_1) ◽  
Author(s):  
D M Adamczak ◽  
M Bednarski ◽  
A Rogala ◽  
M Antoniak ◽  
T Kiebalo ◽  
...  

Abstract BACKGROUND Hypertrophic cardiomyopathy (HCM) is a heart disease characterized by hypertrophy of the left ventricular myocardium. The disease is the most common cause of sudden cardiac death (SCD) in young people and competitive athletes, owing to fatal ventricular arrhythmias, although in most patients HCM has a benign course. It is therefore of the utmost importance to evaluate patients properly and identify those who would benefit from implantation of a cardioverter-defibrillator (ICD). The HCM SCD-Risk Calculator is a useful tool for estimating the 5-year risk of SCD. The parameters included in the model at evaluation are: age, maximum left ventricular wall thickness, left atrial dimension, maximum gradient in the left ventricular outflow tract, family history of SCD, non-sustained ventricular tachycardia and unexplained syncope. A patient's risk of SCD is classified as low (<4%), intermediate (4-<6%) or high (≥6%). Patients in the high-risk group should receive an ICD; implantation can also be considered in the intermediate-risk group. However, the calculator still needs improvement, and machine learning (ML) has the potential to fulfill this task. An ML algorithm creates a model for solving a specific problem without explicit programming; instead, it relies only on the available data, discovering patterns and relations. METHODS 252 HCM patients (aged 20-88 years, 49.6% men) treated in our Department from 2005 to 2018 were enrolled. Follow-up lasted 0-13 years (average: 3.8 years). SCD was defined as sudden cardiac arrest (SCA) or an appropriate ICD intervention. All parameters of the HCM SCD-Risk Calculator were obtained, and the risk of SCD was calculated for all patients at the first echocardiographic evaluation. An ML model using the variables from the HCM SCD-Risk Calculator was created, and the two methods were compared. RESULTS 20 patients reached an SCD end-point: 1 patient died due to SCA and 19 had an appropriate ICD intervention. Among them, 6, 7 and 7 patients were in the low-, intermediate- and high-risk groups, respectively. The 1 patient who died had a low risk. The ML model correctly assessed the SCD event in only 1 patient. According to ML, a risk of SCD ≤2.07% was a negative predictor. CONCLUSIONS The study did not show an advantage of ML over the HCM SCD-Risk Calculator. Because of the characteristics of the dataset (approximately the same number of features and observations), the selection of machine learning algorithms was limited. The best results (evaluated using leave-one-out cross-validation, LOOCV) were achieved with a decision tree. We expect that a bigger dataset would allow model performance to improve, because the current setup requires strong regularization.
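The three risk bands quoted in the abstract (low <4%, intermediate 4-<6%, high ≥6%) can be written as a small lookup. This is only an illustrative sketch: the function name `classify_scd_risk` and the example inputs are assumptions, and the code does not compute the 5-year risk itself, which the abstract does not specify.

```python
def classify_scd_risk(five_year_risk_pct: float) -> str:
    """Map a 5-year SCD risk (in percent) to the band named in the abstract."""
    if five_year_risk_pct < 0:
        raise ValueError("risk must be non-negative")
    if five_year_risk_pct < 4.0:
        return "low"            # <4%
    if five_year_risk_pct < 6.0:
        return "intermediate"   # 4-<6%
    return "high"               # >=6%


# Hypothetical example risks, chosen to sit on and around the band edges.
example_risks = [2.07, 4.0, 5.9, 6.0]
example_bands = [classify_scd_risk(r) for r in example_risks]
print(example_bands)
```

Note that the boundary values 4% and 6% fall into the higher band, which matches the "4-<6%" and "≥6%" wording.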


Circulation ◽  
2015 ◽  
Vol 131 (suppl_2) ◽  
Author(s):  
Kazuyuki Naoi ◽  
Yuji Asami ◽  
Shinichi Takatsuki ◽  
Mitsuru Seki ◽  
Satoshi Ikehara ◽  
...  

Background: Although the RAISE study demonstrated that intravenous immunoglobulin (IVIG) plus prednisolone (PSL) therapy for refractory Kawasaki disease (KD) reduced coronary artery aneurysms, the study population included only a few patients with incomplete KD. The aim of this study was to evaluate the efficacy of the RAISE protocol for incomplete KD. Methods: Children with incomplete KD, defined as having 4 or fewer major symptoms of KD, were enrolled. Using the Kobayashi score, children were divided into low- and high-risk groups for IVIG resistance. Children in the low-risk group received IVIG monotherapy, whereas IVIG plus PSL therapy was administered to the high-risk group. We retrospectively assessed initial treatment response and coronary artery abnormalities (CAAs). Results: Overall, 63 children with incomplete KD (median age 17 months, range 1-115 months; boys:girls 37:26) were enrolled. The median day of illness at diagnosis was day 5 (range 2-10 days). The low-risk group included 52 cases (83%) and the remaining 11 cases formed the high-risk group. In the low-risk group, 87% of children (45 cases) responded to the initial treatment. All 7 non-responders responded to additional methylprednisolone pulse therapy. All 11 children in the high-risk group responded to the initial treatment. Five cases, all in the low-risk group, had a Z-score of 2.5 or greater. All CAAs regressed to normal coronary diameter within 1 month. Conclusions: The RAISE protocol was useful for the treatment of incomplete KD without any CAAs.


2014 ◽  
Vol 44 (15) ◽  
pp. 3289-3302 ◽  
Author(s):  
K. J. Wardenaar ◽  
H. M. van Loo ◽  
T. Cai ◽  
M. Fava ◽  
M. J. Gruber ◽  
...  

Background. Although variation in the long-term course of major depressive disorder (MDD) is not strongly predicted by existing symptom subtype distinctions, recent research suggests that prediction can be improved by using machine learning methods. However, it is not known whether these distinctions can be refined by added information about co-morbid conditions. The current report presents results on this question. Method. Data came from 8261 respondents with lifetime DSM-IV MDD in the World Health Organization (WHO) World Mental Health (WMH) Surveys. Outcomes included four retrospectively reported measures of persistence/severity of course (years in episode; years in chronic episodes; hospitalization for MDD; disability due to MDD). Machine learning methods (regression tree analysis; lasso, ridge and elastic net penalized regression) followed by k-means cluster analysis were used to augment previously detected subtypes with information about prior co-morbidity to predict these outcomes. Results. Predicted values were strongly correlated across outcomes. Cluster analysis of predicted values found three clusters with consistently high, intermediate or low values. The high-risk cluster (32.4% of cases) accounted for 56.6–72.9% of high persistence, high chronicity, hospitalization and disability. This high-risk cluster had both higher sensitivity and likelihood ratio positive (LR+; relative proportions of cases in the high-risk cluster versus other clusters having the adverse outcomes) than in a parallel analysis that excluded measures of co-morbidity as predictors. Conclusions. Although the results using the retrospective data reported here suggest that useful MDD subtyping distinctions can be made with machine learning and clustering across multiple indicators of illness persistence/severity, replication with prospective data is needed to confirm this preliminary conclusion.
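The step of clustering predicted values into low, intermediate and high groups can be illustrated with a minimal one-dimensional k-means (Lloyd's algorithm). This is a sketch under stated assumptions, not the study's procedure: the function `kmeans_1d`, the deterministic initialisation and the toy predicted values are all hypothetical, and the study clustered predictions across four outcomes rather than one.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Cluster scalar values into k groups; returns (labels, centroids)."""
    if k < 2 or len(values) < k:
        raise ValueError("need k >= 2 and at least k values")
    ordered = sorted(values)
    # Deterministic start: centroids spread evenly over the sorted values.
    centroids = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: abs(v - centroids[j])) for v in values
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == j]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[j]
            )
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical predicted risk values for nine respondents.
predicted = [0.10, 0.12, 0.15, 0.40, 0.45, 0.50, 0.80, 0.85, 0.90]
cluster_labels, cluster_centroids = kmeans_1d(predicted, k=3)
print(cluster_labels, cluster_centroids)
```

With this initialisation the cluster index increases with the predicted value, so index 2 plays the role of the high-risk cluster in the toy data.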


2019 ◽  
Vol 5 (suppl) ◽  
pp. 13-13
Author(s):  
Po-Jung SU ◽  
Yu-Ann Fang ◽  
Yung-Chun Chang ◽  
Yung-Chia Kuo ◽  
Yung-Chang Lin

Background: For patients with de novo metastatic prostate cancer (mPC), prognosis can differ widely. Some of these patients respond very well to hormone therapy with durable survival, but others do not. If patients with a poor prognosis could be identified as high-risk at diagnosis and given aggressive upfront chemotherapy or novel hormonal therapy, they might achieve better treatment outcomes. Methods: We used data on prostate cancer patients from 2000 to 2016 in the Chang Gung Research Database, which included 799 de novo mPC patients who underwent castration. We predicted the probability that these patients would progress to metastatic castration-resistant prostate cancer (mCRPC) within 1 year in order to identify the high-risk group. We then determined the best features for prediction from the best classifier using Recursive Feature Elimination. Results: De novo mPC patients who progressed to mCRPC within 1 year had a significantly worse median overall survival (mOS) than those who progressed to mCRPC beyond 1 year (21.9 months vs. 80.7 months; adjusted hazard ratio [aHR]: 6.43, P<0.001). Among all predictive models for high-risk patients, XGBoost had the best overall performance (AUC=0.7000, Accuracy=0.7143). Excluding features with more than 50% missing data and entering all other features in the model gave AUC=0.7042 and Accuracy=0.7239. However, the best performance was obtained with only 11 features selected by Recursive Feature Elimination: age, time from diagnosis to castration, nadir PSA, hemoglobin, eosinophil/white blood cell ratio, alkaline phosphatase, alanine transaminase, blood urea nitrogen, creatinine, prothrombin time, and secondary primary cancer (AUC=0.7131, Accuracy=0.7267). Conclusions: The predictive model achieved better predictive accuracy and shorter manuscript time with fewer features selected by Recursive Feature Elimination. This XGBoost model can identify the high-risk group among de novo mPC patients and support better clinical treatment decisions.
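The general shape of Recursive Feature Elimination is: score the remaining features, drop the weakest, and repeat until the target number is left. The sketch below shows only that loop. It is not the authors' pipeline: the importance function here is an absolute Pearson correlation used as a stand-in for the XGBoost feature importances, and the column names and values are hypothetical toy data.

```python
from statistics import mean


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def toy_importance(columns, labels):
    """Stand-in importance scorer (assumption: replaces a fitted model)."""
    return {name: abs_correlation(vals, labels) for name, vals in columns.items()}


def recursive_feature_elimination(columns, labels, n_keep, importance_fn):
    """Repeatedly re-score the remaining features and drop the weakest one."""
    remaining = dict(columns)
    eliminated = []
    while len(remaining) > n_keep:
        importances = importance_fn(remaining, labels)
        weakest = min(importances, key=importances.get)
        eliminated.append(weakest)
        del remaining[weakest]
    return list(remaining), eliminated


# Hypothetical toy data: 1 = progressed to mCRPC within 1 year.
toy_columns = {
    "nadir_psa": [0.1, 0.2, 5.0, 8.0, 0.3, 9.0],
    "hemoglobin": [14.0, 13.5, 10.0, 9.5, 13.0, 11.5],
    "noise": [1.0, 2.0, 1.0, 2.0, 2.0, 1.0],
}
toy_labels = [0, 0, 1, 1, 0, 1]
kept, dropped = recursive_feature_elimination(
    toy_columns, toy_labels, n_keep=2, importance_fn=toy_importance
)
print(kept, dropped)
```

Passing the scorer in as `importance_fn` keeps the elimination loop independent of the model, so a fitted classifier's importances could be substituted without changing the loop.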


2020 ◽  
Author(s):  
Anjiao Peng ◽  
Xiaorong Yang ◽  
Zhining Wen ◽  
Wanling Li ◽  
Yusha Tang ◽  
...  

Abstract Background: Stroke is one of the most important causes of epilepsy. We aimed to determine whether machine learning methods can identify, at the time of discharge, patients at high risk of developing post-stroke epilepsy (PSE). Methods: Patients with stroke were enrolled and followed for at least one year. Machine learning methods including support vector machine (SVM), random forest (RF) and logistic regression (LR) were trained on the data. Results: A total of 2730 patients with cerebral infarction and 844 patients with cerebral hemorrhage were enrolled; the one-year risk of PSE was 2.8% after cerebral infarction and 7.8% after cerebral hemorrhage. Machine learning methods showed good performance in predicting PSE. The area under the receiver operating characteristic curve (AUC) for SVM and RF in predicting PSE after cerebral infarction was close to 1, and it was 0.92 for LR. For predicting PSE after cerebral hemorrhage, SVM performed best, with an AUC close to 1, followed by RF (AUC = 0.99) and LR (AUC = 0.85). Conclusion: Machine learning methods could be used to identify patients at high risk of developing PSE, which would help to stratify these patients and start treatment earlier. Nevertheless, more work is needed before this intelligent predictive model can be applied in clinical practice.
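The AUC values reported here can be read as the probability that a randomly chosen patient who developed PSE receives a higher predicted score than a randomly chosen patient who did not. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the four toy scores are assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: 1 = developed PSE, 0 = did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)  # 3 of the 4 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.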


2014 ◽  
Vol 32 (4_suppl) ◽  
pp. 331-331
Author(s):  
Satoru Muto ◽  
Takeshi Ieda ◽  
Syou-ichiro Sugiura ◽  
Akiko Nakajima ◽  
Akira Horiuchi ◽  
...  

Background: EORTC risk tables are widely used worldwide to predict recurrence and progression of non-muscle invasive bladder cancer (NMIBC). However, they were developed on the basis of individual data from 2,596 NMIBC patients included in seven European Organization for Research and Treatment of Cancer trials. The efficacy of these risk tables in clinical practice, especially in Japan, is therefore unclear. We report recurrence and progression rates according to the EORTC risk tables in Japanese NMIBC patients. Methods: A retrospective analysis of 619 patients with NMIBC treated between January 1998 and 2012 was performed. Patients were divided into three groups on the basis of the EORTC risk tables. We compared recurrence- and progression-free survival rates between groups. Recurrence- and progression-free survival was estimated using the Kaplan-Meier method. Results: We evaluated the clinical outcome of 1,032 TUR-Bt procedures. The recurrence rate was 32.3% in the low-risk group (n=31), 44.5% in the intermediate-risk group (n=757), and 49.4% in the high-risk group (n=85). The median recurrence-free survival time was 87 months in the low-risk group, 35 months in the intermediate-risk group, and 25 months in the high-risk group. Recurrence-free survival time differed significantly between the low- and intermediate-risk groups (p=0.0351), but not between the intermediate- and high-risk groups (p=0.1871). The progression rate was 1.6% in the low-risk group (n=128), 5.8% in the intermediate-risk group (n=451), and 18.0% in the high-risk group (n=294). The median progression-free survival time was 176 months in the low-risk group, 131 months in the intermediate-risk group, and 109 months in the high-risk group. Progression-free survival time differed significantly between the low- and intermediate-risk groups (p=0.0138), and between the intermediate- and high-risk groups (p<0.0001). Conclusions: There is an urgent need to establish a standard recurrence risk classification in Japan.
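The Kaplan-Meier method named in the Methods multiplies, at each event time, the running survival estimate by the fraction of at-risk patients who did not have the event. A minimal sketch is given below; the function names and the six follow-up records are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one event.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if recurrence/progression was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        at_risk -= n_leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is at or below 50%."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical follow-up in months; 0 marks a censored patient.
toy_times = [5, 8, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
print(toy_curve, toy_median)
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the curve only steps down at months 5, 12 and 30 in the toy data.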


2020 ◽  
Author(s):  
Lili Chan ◽  
Girish N. Nadkarni ◽  
Fergus Fleming ◽  
James R. McCullough ◽  
Patti Connolly ◽  
...  

Abstract Importance: Diabetic kidney disease (DKD) is the leading cause of kidney failure in the United States, and predicting progression is necessary for improving outcomes. Objective: To develop and validate a machine-learned prognostic risk score (KidneyIntelX™) combining data from electronic health records (EHR) and circulating biomarkers to predict DKD progression. Design: Observational cohort study. Setting: Two EHR-linked biobanks: the Mount Sinai BioMe Biobank and the Penn Medicine Biobank. Participants: Patients with prevalent DKD (G3a-G3b with all grades of albuminuria (A1-A3), and G1 and G2 with A2-A3 level albuminuria) and banked plasma. Main outcomes and measures: The plasma biomarkers soluble tumor necrosis factor receptor 1/2 (sTNFR1, sTNFR2) and kidney injury molecule-1 (KIM-1) were measured at baseline. Patients were divided into derivation (60%) and validation (40%) sets. The composite primary end point, progressive decline in kidney function, comprised rapid kidney function decline (RKFD; estimated glomerular filtration rate (eGFR) decline of ≥5 ml/min/1.73 m2/year), ≥40% sustained decline, or kidney failure within 5 years. A machine learning model (random forest) was trained and its performance assessed using standard metrics. Results: In 1146 patients with DKD, the median age was 63, 51% were female, median baseline eGFR was 54 ml/min/1.73 m2, urine albumin to creatinine ratio (uACR) was 61 mg/g, and follow-up was 4.3 years. 241 patients (21%) experienced progressive decline in kidney function. On 10-fold cross validation in the derivation set (n=686), the risk model had an area under the curve (AUC) of 0.77 (95% CI 0.74-0.79). In validation (n=460), the AUC was 0.77 (95% CI 0.76-0.79). By comparison, the AUC for an optimized clinical model was 0.62 (95% CI 0.61-0.63) in derivation and 0.61 (95% CI 0.60-0.63) in validation. Using cutoffs from derivation, KidneyIntelX stratified 46%, 37% and 16.5% of the validation cohort into low-, intermediate- and high-risk groups, with a positive predictive value (PPV) of 62% (vs. a PPV of 37% for the clinical model and 40% for KDIGO; p < 0.001) in the high-risk group and a negative predictive value (NPV) of 91% in the low-risk group. The net reclassification index for events into the high-risk group was 41% (p<0.05). Conclusions and Relevance: A machine-learned model combining plasma biomarkers and EHR data improved prediction of progressive decline in kidney function within 5 years over KDIGO and standard clinical models in patients with early DKD.
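Stratifying by two score cutoffs and then reading off PPV in the high-risk group and NPV in the low-risk group can be sketched as follows. The cutoff values, scores and outcomes are hypothetical; the abstract does not report the actual KidneyIntelX cutoffs, and the function names are assumptions.

```python
def stratify(score, low_cut, high_cut):
    """Assign a risk group from a score and two cutoffs (assumed values)."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "intermediate"
    return "high"


def ppv_high_npv_low(scores, outcomes, low_cut, high_cut):
    """PPV among high-risk patients and NPV among low-risk patients.

    outcomes: 1 if the patient had progressive decline, else 0.
    """
    groups = [stratify(s, low_cut, high_cut) for s in scores]
    high = [y for g, y in zip(groups, outcomes) if g == "high"]
    low = [y for g, y in zip(groups, outcomes) if g == "low"]
    if not high or not low:
        raise ValueError("both the high- and low-risk groups must be non-empty")
    ppv = sum(high) / len(high)      # events among those flagged high risk
    npv = 1 - sum(low) / len(low)    # non-events among those flagged low risk
    return ppv, npv


# Hypothetical validation-set scores and outcomes.
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.50, 0.75, 0.80, 0.90, 0.95]
toy_outcomes = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1]
toy_ppv, toy_npv = ppv_high_npv_low(toy_scores, toy_outcomes, 0.25, 0.70)
print(toy_ppv, toy_npv)
```

Fixing the cutoffs on one dataset and applying them unchanged to another mirrors the abstract's "cutoffs from derivation" applied to the validation cohort.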


2021 ◽  
Author(s):  
Yafei Wu ◽  
Zhongquan Jiang ◽  
Shaowu Lin ◽  
Ya Fang

Abstract Background: Prediction of stroke based on individuals' risk factors, especially for a first stroke event, is of great significance for primary prevention in high-risk populations. Our study aimed to investigate the applicability of interpretable machine learning, compared with logistic regression, for predicting 2-year stroke occurrence in older adults. Methods: A total of 5960 participants consecutively surveyed from July 2011 to August 2013 in the China Health and Retirement Longitudinal Study were included in the analysis. We constructed a traditional logistic regression (LR) model and two machine learning models, namely random forest (RF) and extreme gradient boosting (XGBoost), to distinguish stroke occurrence from non-occurrence using data on demographics, lifestyle, disease history, and clinical variables. Grid search and 10-fold cross validation were used to tune the hyperparameters. Model performance was assessed by discrimination, calibration, decision curve and predictiveness curve analysis. Results: Among the 5960 participants, 131 (2.20%) developed stroke after an average follow-up of 2 years. Our prediction models distinguished stroke occurrence from non-occurrence with excellent performance. The AUCs of the machine learning methods (RF, 0.823 [95% CI, 0.759-0.886]; XGBoost, 0.808 [95% CI, 0.730-0.886]) were significantly higher than that of LR (0.718 [95% CI, 0.649-0.787], p<0.05). No significant difference was observed between RF and XGBoost (p>0.05). All prediction models had good calibration, with Brier scores of 0.022 (95% CI, 0.015-0.028) for LR, 0.019 (95% CI, 0.014-0.025) for RF, and 0.020 (95% CI, 0.015-0.026) for XGBoost. XGBoost had much higher net benefits within a wider threshold range in decision curve analysis, and was more capable of recognizing high-risk individuals in predictiveness curve analysis. A total of eight predictors, including gender, waist-to-height ratio, dyslipidemia, glycated hemoglobin, white blood cell count, blood glucose, triglycerides, and low-density lipoprotein cholesterol, ranked among the top 5 in the three prediction models. Conclusions: Machine learning methods, especially XGBoost, showed greater potential than traditional logistic regression for predicting stroke occurrence in older adults.
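Two of the metrics named in the Results have short closed forms: the Brier score is the mean squared difference between predicted probability and outcome, and the net benefit used in decision curve analysis is TP/n - FP/n × pt/(1 - pt) at threshold probability pt. The sketch below computes both on hypothetical toy predictions; the function names and data are assumptions, not values from the study.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def net_benefit(probs, outcomes, threshold):
    """Net benefit at a threshold probability strictly between 0 and 1.

    Patients with predicted probability >= threshold are treated as positive.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(probs)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))


# Hypothetical predicted stroke probabilities and observed outcomes.
toy_probs = [0.9, 0.1, 0.8, 0.3]
toy_outcomes = [1, 0, 1, 0]
toy_brier = brier_score(toy_probs, toy_outcomes)
toy_nb_half = net_benefit(toy_probs, toy_outcomes, 0.5)
toy_nb_fifth = net_benefit(toy_probs, toy_outcomes, 0.2)
print(toy_brier, toy_nb_half, toy_nb_fifth)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the treat-all and treat-none strategies. With an event rate near 2%, as in this cohort, even a constant prediction yields a small Brier score, so Brier scores are best compared between models on the same data.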

