Machine learning provides evidence that stroke risk is not linear: The non-linear Framingham stroke risk score

Background: Current stroke risk assessment tools presume that the impact of risk factors is linear and cumulative. However, both novel risk factors and the interplay among them that influences stroke incidence are difficult to reveal using traditional linear models. Objective: To improve upon the Revised Framingham Stroke Risk Score and design an interactive non-linear Stroke Risk Score (NSRS). Our work aimed at increasing the accuracy of event prediction and uncovering new relationships in an interpretable, user-friendly fashion. Methods: A two-phase approach was used to develop our stroke risk score predictor. First, clinical examinations of the Framingham Offspring cohort served as the training dataset for the predictive model (14,196 samples), with each clinical examination treated as an independent observation. Optimal Classification Trees (OCT) were used to train a model to predict 10-year stroke risk. Second, this model was validated with 17,527 observations from the Boston Medical Center. The NSRS was developed into an online, user-friendly application in the form of a questionnaire (http://www.mit.edu/~agniorf/files/questionnaire_Cohort2.html). Results: The algorithm suggests a key dichotomy between patients with and without a history of cardiovascular disease. While the model agrees with known findings, it also identified 23 unique stroke risk profiles and introduced new non-linear relationships, such as the role of T-wave abnormality on electrocardiography and of hematocrit levels in a patient’s risk profile. Our results in both the training and validation populations suggested that the non-linear approach significantly improves upon the existing revised Framingham stroke risk calculator in the c-statistic (training 87.43% (CI 0.85-0.90) vs. 73.74% (CI 0.70-0.76); validation 75.29% (CI 0.74-0.76) vs. 65.93% (CI 0.64-0.67)), even in multi-ethnicity populations.
Conclusions: We constructed a highly predictive, interpretable and user-friendly stroke risk calculator using novel machine learning, uncovering new risk factors, interactions and unique profiles. The clinical implications include prioritization of risk factor modification and personalized care, improving targeted intervention for stroke prevention.
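The c-statistic reported in this abstract can be read as the probability that a randomly chosen patient who had the event receives a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that pairwise definition for a binary outcome. The function name `c_statistic` and the toy data are illustrative assumptions, not part of the NSRS, and the study's own computation (including its confidence intervals) may differ.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Concordance (c-statistic / AUC) for a binary outcome.

    scores   -- predicted risks, higher meaning more likely to have the event
    outcomes -- 1 if the event occurred, 0 otherwise
    Ties in predicted risk count as half-concordant.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for event_score, non_event_score in product(events, non_events):
        if event_score > non_event_score:
            concordant += 1.0
        elif event_score == non_event_score:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: four patients, two of whom had the event.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
print(c_statistic(toy_scores, toy_outcomes))  # 3 of the 4 pairs are concordant
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the two calculators are compared.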

MACHINE LEARNING ISCHEMIA RISK SCORE FROM CORONARY CT ANGIOGRAPHY PREDICTS LESION-SPECIFIC ISCHEMIA AND IMPAIRED MYOCARDIAL BLOOD FLOW: RESULTS FROM THE PACIFIC TRIAL

Journal of the American College of Cardiology ◽ 10.1016/s0735-1097(21)02627-9 ◽ 2021 ◽ Vol 77 (18) ◽ pp. 1269

Author(s): Andrew Lin ◽ Pepijn Van Diemen ◽ Manish Motwani ◽ Priscilla McElhinney ◽ Yuka Otaki ◽ ...

Keyword(s): Machine Learning ◽ Blood Flow ◽ Myocardial Blood Flow ◽ Ct Angiography ◽ Risk Score ◽ Coronary Ct Angiography ◽ Coronary Ct ◽ The Pacific

Machine Learning Based Device Simulation Using Multi-variable Non-linear Regression to Assess the Impact of Device Parameter Variability on Threshold Voltage of Double Gate-All-Around (DGAA) MOSFET

2020 IEEE 2nd International Conference on Circuits and Systems (ICCS) ◽ 10.1109/iccs51219.2020.9336608 ◽ 2020

Author(s): Sandeep Moparthi ◽ Chandan Yadav ◽ Gopi Krishna Saramekala ◽ Pramod Kumar Tiwari

Keyword(s): Machine Learning ◽ Linear Regression ◽ Threshold Voltage ◽ Device Simulation ◽ Double Gate ◽ Device Parameter ◽ Non Linear ◽ The Impact

Prediction of preterm birth based on machine learning using bacterial risk score in cervicovaginal fluid

American Journal of Reproductive Immunology ◽ 10.1111/aji.13435 ◽ 2021

Author(s): Sunwha Park ◽ Daejoong Oh ◽ Hanna Heo ◽ Gain Lee ◽ Soo Min Kim ◽ ...

Keyword(s): Machine Learning ◽ Preterm Birth ◽ Risk Score ◽ Cervicovaginal Fluid

A machine-learning-based alloy design platform that enables both forward and inverse predictions for thermo-mechanically controlled processed (TMCP) steel alloys

Scientific Reports ◽ 10.1038/s41598-021-90237-z ◽ 2021 ◽ Vol 11 (1)

Author(s): Jin-Woong Lee ◽ Chaewon Park ◽ Byung Do Lee ◽ Joonseo Park ◽ Nam Hoon Goo ◽ ...

Keyword(s): Machine Learning ◽ Feature Space ◽ Research Strategy ◽ Nsga Ii ◽ Steel Alloys ◽ Non Linear ◽ Data Driven Approach ◽ Processing Information ◽ Set Up ◽ World Industry

Abstract Predicting mechanical properties such as yield strength (YS) and ultimate tensile strength (UTS) is an intricate undertaking in practice, notwithstanding a plethora of well-established theoretical and empirical models. A data-driven approach should be a fundamental exercise when making YS/UTS predictions. For this study, we collected 16 descriptors (attributes) that capture the compositional and processing information, together with the corresponding YS/UTS values, for 5473 thermo-mechanically controlled processed (TMCP) steel alloys. We set up an integrated machine-learning (ML) platform consisting of 16 ML algorithms to predict the YS/UTS based on the descriptors. The integrated ML platform involved regularization-based linear regression algorithms, ensemble ML algorithms, and some non-linear ML algorithms. Despite the dirty nature of most real-world industry data, we obtained acceptable holdout dataset test results, such as R² > 0.6 and MSE < 0.01, for seven non-linear ML algorithms. The seven fully trained non-linear ML models were used for the ensuing ‘inverse design (prediction)’ based on an elitist-reinforced, non-dominated sorting genetic algorithm (NSGA-II). The NSGA-II enabled us to predict solutions that exhibit desirable YS/UTS values for each ML algorithm. In addition, the NSGA-II-driven solutions in the 16-dimensional input feature space were visualized using holographic research strategy (HRS) in order to systematically compare and analyze the inverse-predicted solutions for each ML algorithm.
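The holdout acceptance criteria quoted in this abstract (R² > 0.6 and MSE < 0.01) can be checked with a few lines of code. The sketch below is illustrative only: the function names, the default thresholds as arguments, and the toy values are assumptions, and it further assumes the targets are scaled to a small range (the abstract does not say how YS/UTS were normalized).

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def passes_holdout_check(y_true, y_pred, r2_min=0.6, mse_max=0.01):
    """True if a model meets both thresholds on the holdout set."""
    return (r_squared(y_true, y_pred) > r2_min
            and mean_squared_error(y_true, y_pred) < mse_max)


# Hypothetical holdout targets, assumed scaled to [0, 1].
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
print(passes_holdout_check(observed, predicted))
```

Applying the same check to each candidate algorithm is one way a subset of models could be shortlisted for a later optimization stage.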

Physically interpretable machine learning algorithm on multidimensional non-linear fields

Journal of Computational Physics ◽ 10.1016/j.jcp.2020.110074 ◽ 2021 ◽ Vol 428 ◽ pp. 110074

Author(s): Rem-Sophia Mouradi ◽ Cédric Goeury ◽ Olivier Thual ◽ Fabrice Zaoui ◽ Pablo Tassi

Keyword(s): Machine Learning ◽ Learning Algorithm ◽ Machine Learning Algorithm ◽ Interpretable Machine Learning ◽ Non Linear

All-cause Dementia Prediction by Machine Learning: The Health, Aging, and Body Composition Study

Innovation in Aging ◽ 10.1093/geroni/igaa057.1575 ◽ 2020 ◽ Vol 4 (Supplement_1) ◽ pp. 487-487

Author(s): Chenkai Wu ◽ Xurui Jin

Keyword(s): Machine Learning ◽ Body Composition ◽ Risk Prediction ◽ Risk Score ◽ Body Composition Study ◽ C Statistic ◽ Dementia Risk ◽ Health Aging ◽ Composition Study ◽ Traditional Approaches

Abstract There are several shortcomings in the currently available risk prediction models for dementia. We developed a risk prediction model for dementia using a machine-learning approach and compared its performance with traditional approaches. Data were from the Health, Aging, and Body Composition Study, comprising 3,075 older adults (at least 70 years). Dementia was defined as (1) use of a prescribed dementia medication, (2) adjudicated dementia diagnosis, or (3) a race-stratified cognitive decline > 1.5 SDs from the baseline mean. We selected 275 predictors collected from questionnaires, imaging data, performance testing, and biospecimens. We used random survival forest (RSF) to build the full model and rank the importance of predictors. Subsequently, we built parsimonious models with the top-20 predictors using RSF and Cox regression. A dementia risk score was developed using top-ranked variables. We used the C-statistic for performance evaluation. Over a median of 11.4 years of follow-up, 659 cases of dementia (21.4%) occurred. The RSF model (both with all variables and with the top-20 variables) showed a higher C-statistic than the regression model. Digit symbol score, physical performance battery, finger tapping score, weight change since age 50, serum adiponectin, and APOE genotype were the top-6 variables. We created a dementia risk score (0-10) using the top-6 variables. A 1-unit increase in the risk score was associated with an 8% higher risk of dementia. The risk score demonstrated good discrimination (C-statistic=0.75). Machine learning methods offered improvement over traditional approaches in predicting dementia. The risk prediction score derived from a parsimonious model had good prediction performance.
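To give a sense of scale for the "8% higher risk per 1-unit increase" figure, the sketch below compounds that ratio across a score difference. It assumes the 8% is a multiplicative ratio (for example a hazard ratio of 1.08 per unit) that applies uniformly over the 0-10 range; the abstract does not state this, and the function name `relative_risk` is hypothetical.

```python
RATIO_PER_UNIT = 1.08  # from the abstract: 8% higher risk per 1-unit increase


def relative_risk(score_a, score_b, ratio_per_unit=RATIO_PER_UNIT):
    """Risk at score_a relative to score_b, assuming the per-unit ratio
    compounds multiplicatively (an assumption, not stated in the abstract)."""
    return ratio_per_unit ** (score_a - score_b)


# Comparing the two ends of the 0-10 score range under that assumption.
print(round(relative_risk(10, 0), 2))
```

Under this assumption, a person at the top of the scale would carry a bit more than twice the risk of a person at the bottom, since 1.08 raised to the 10th power is roughly 2.16.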

Stroke Risk Stratification and its Validation using Ultrasonic Echolucent Carotid Wall Plaque Morphology: A Machine Learning Paradigm

Computers in Biology and Medicine ◽ 10.1016/j.compbiomed.2016.11.011 ◽ 2017 ◽ Vol 80 ◽ pp. 77-96 ◽ Cited By ~ 21

Author(s): Tadashi Araki ◽ Pankaj K. Jain ◽ Harman S. Suri ◽ Narendra D. Londhe ◽ Nobutaka Ikeda ◽ ...

Keyword(s): Machine Learning ◽ Risk Stratification ◽ Stroke Risk ◽ Plaque Morphology ◽ Learning Paradigm ◽ Carotid Wall

Application of data driven machine learning approach for modelling of non-linear filtration through granular porous media

International Journal of Heat and Mass Transfer ◽ 10.1016/j.ijheatmasstransfer.2021.121650 ◽ 2021 ◽ Vol 179 ◽ pp. 121650

Author(s): Ashes Banerjee ◽ Srinivas Pasupuleti ◽ Koushik Mondal ◽ M. Mousavi Nezhad

Keyword(s): Machine Learning ◽ Porous Media ◽ Data Driven ◽ Learning Approach ◽ Machine Learning Approach ◽ Non Linear ◽ Granular Porous Media ◽ Linear Filtration

Effect of a Real-Time Risk Score on 30-day Readmission Reduction in Singapore

Applied Clinical Informatics ◽ 10.1055/s-0041-1726422 ◽ 2021 ◽ Vol 12 (02) ◽ pp. 372-382

Author(s): Christine Xia Wu ◽ Ernest Suresh ◽ Francis Wei Loong Phng ◽ Kai Pik Tai ◽ Janthorn Pakdeethai ◽ ...

Keyword(s): Machine Learning ◽ High Risk ◽ Real Time ◽ Risk Score ◽ Patient Specific ◽ Learning Models ◽ Medicine Department ◽ High Risk Patients ◽ Risk Patients ◽ Machine Learning Models

Abstract Objective: To develop a risk score for the real-time prediction of readmissions, using patient-specific information captured in electronic medical records (EMR) in Singapore, to enable the prospective identification of high-risk patients for enrolment in timely interventions. Methods: Machine-learning models were built to estimate the probability of a patient being readmitted within 30 days of discharge. EMR of 25,472 patients discharged from the medicine department at Ng Teng Fong General Hospital between January 2016 and December 2016 were extracted retrospectively for training and internal validation of the models. We developed and implemented real-time generation of a 30-day readmission risk score in the EMR system, which enabled the flagging of high-risk patients to care providers in the hospital. Based on the daily high-risk patient list, the various interfaces and flow sheets in the EMR were configured according to the information needs of the various stakeholders, such as the inpatient medical, nursing, case management, emergency department, and postdischarge care teams. Results: Overall, the machine-learning models achieved good performance, with area under the receiver operating characteristic curve ranging from 0.77 to 0.81. The models were used to proactively identify and attend to patients at risk of readmission before an actual readmission occurred. This approach successfully reduced the 30-day readmission rate for patients admitted to the medicine department from 11.7% in 2017 to 10.1% in 2019 (p < 0.01) after risk adjustment. Conclusion: Machine-learning models can be deployed in the EMR system to provide real-time forecasts for a more comprehensive outlook in the aspects of decision-making and care provision.
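The reported change in the 30-day readmission rate (11.7% to 10.1%) can be expressed both as an absolute and as a relative reduction. The sketch below does only that arithmetic on the two published figures; the function names are hypothetical, and it does not reproduce the risk adjustment or the significance test described in the abstract.

```python
def absolute_reduction(before_pct, after_pct):
    """Drop in the rate, in percentage points."""
    return before_pct - after_pct


def relative_reduction(before_pct, after_pct):
    """Drop in the rate as a fraction of the starting rate."""
    if before_pct == 0:
        raise ValueError("starting rate must be non-zero")
    return (before_pct - after_pct) / before_pct


# Risk-adjusted rates quoted in the abstract: 2017 versus 2019.
rate_2017, rate_2019 = 11.7, 10.1
print(f"{absolute_reduction(rate_2017, rate_2019):.1f} percentage points")
print(f"{relative_reduction(rate_2017, rate_2019):.1%} relative reduction")
```

The absolute figure is the more direct guide to how many readmissions were avoided per 100 discharges, while the relative figure is easier to compare across departments with different baseline rates.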
