Predictive models for identifying risk of readmission after index hospitalization for heart failure: A systematic review

2018 ◽  
Vol 17 (8) ◽  
pp. 675-689 ◽  
Author(s):  
Satish M Mahajan ◽  
Paul Heidenreich ◽  
Bruce Abbott ◽  
Ana Newton ◽  
Deborah Ward

Aims: Readmission rates for patients with heart failure have consistently remained high over the past two decades. As more electronic data, computing power, and newer statistical techniques become available, data-driven care could be achieved by creating predictive models for adverse outcomes such as readmissions. We therefore aimed to review models for predicting risk of readmission for patients admitted for heart failure. We also aimed to analyze and possibly group the predictors used across the models. Methods: Major electronic databases were searched to identify studies that examined the correlation between readmission for heart failure and risk factors using multivariate models. We rigorously followed the review process using PRISMA methodology and other established criteria for quality assessment of the studies. Results: We reviewed 334 papers in detail and found 25 multivariate predictive models built using data from either health systems or trials. Most models were built using multiple logistic regression, followed by Cox proportional hazards regression. Some newer studies ventured into non-parametric and machine learning methods. Overall predictive accuracy, measured by C-statistics, ranged from 0.59 to 0.84. We examined significant predictors across the studies using clinical, administrative, and psychosocial groups. Conclusions: Complex disease management and correspondingly increasing costs for heart failure are driving innovations in building risk prediction models for readmission. Large volumes of diverse electronic data and new statistical methods have improved the predictive power of the models over the past two decades. More work is needed for calibration, external validation, and deployment of such models for clinical use.
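
For readers unfamiliar with the C-statistic quoted above, the following is a minimal Python sketch of how it can be computed for a binary outcome such as readmission. It is an illustration only, not the method of any reviewed study; the function name `c_statistic` and the toy data are assumptions made for this example.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as half-concordant."""
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: 1 = readmitted, 0 = not readmitted.
readmitted = [1, 0, 1, 0, 0, 1]
predicted_risk = [0.80, 0.30, 0.55, 0.60, 0.20, 0.70]
print(c_statistic(readmitted, predicted_risk))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported range of 0.59 to 0.84 should be read.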

Author(s):  
Shaan Khurshid ◽  
Uri Kartoun ◽  
Jeffrey M. Ashburner ◽  
Ludovic Trinquart ◽  
Anthony Philippakis ◽  
...  

Background - Atrial fibrillation (AF) is associated with increased risks of stroke and heart failure. Electronic health record (EHR) based AF risk prediction may facilitate efficient deployment of interventions to diagnose or prevent AF altogether. Methods - We externally validated an EHR atrial fibrillation (EHR-AF) score in IBM Explorys Life Sciences, a multi-institutional dataset containing statistically de-identified EHR data for over 21 million individuals ("Explorys Dataset"). We included individuals with complete AF risk data, ≥2 office visits within two years, and no prevalent AF. We compared EHR-AF to existing scores including CHARGE-AF, C₂HEST, and CHA₂DS₂-VASc. We assessed association between AF risk scores and 5-year incident AF, stroke, and heart failure using Cox proportional hazards modeling, 5-year AF discrimination using c-indices, and calibration of predicted AF risk to observed AF incidence. Results - Of 21,825,853 individuals in the Explorys Dataset, 4,508,180 comprised the analysis (age 62.5, 56.3% female). AF risk scores were strongly associated with 5-year incident AF (hazard ratio [HR] per standard deviation [SD] increase 1.85 using CHA₂DS₂-VASc to 2.88 using EHR-AF), stroke (1.61 using C₂HEST to 1.92 using CHARGE-AF), and heart failure (1.91 using CHA₂DS₂-VASc to 2.58 using EHR-AF). EHR-AF (c-index 0.808 [95%CI 0.807-0.809]) demonstrated favorable AF discrimination compared to CHARGE-AF (0.806 [0.805-0.807]), C₂HEST (0.683 [0.682-0.684]), and CHA₂DS₂-VASc (0.720 [0.719-0.722]). Of the scores, EHR-AF demonstrated the best calibration to incident AF (calibration slope 1.002 [0.997-1.007]). In subgroup analyses, AF discrimination using EHR-AF was lower in individuals with stroke (c-index 0.696 [0.692-0.700]) and heart failure (0.621 [0.617-0.625]). Conclusions - EHR-AF demonstrates predictive accuracy for incident AF using readily ascertained EHR data. AF risk is associated with incident stroke and heart failure. Use of such risk scores may facilitate decision-support and population health management efforts focused on minimizing AF-related morbidity.


PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0244629
Author(s):  
Ali A. El-Solh ◽  
Yolanda Lawson ◽  
Michael Carter ◽  
Daniel A. El-Solh ◽  
Kari A. Mergenhagen

Objective Our objective is to compare the predictive accuracy of four recently established outcome models of patients hospitalized with coronavirus disease 2019 (COVID-19) published between January 1st and May 1st 2020. Methods We used data obtained from the Veterans Affairs Corporate Data Warehouse (CDW) between January 1st, 2020, and May 1st 2020 as an external validation cohort. The outcome measure was hospital mortality. Areas under the ROC (AUC) curves were used to evaluate discrimination of the four predictive models. The Hosmer–Lemeshow (HL) goodness-of-fit test and calibration curves assessed applicability of the models to individual cases. Results During the study period, 1634 unique patients were identified. The mean age of the study cohort was 68.8±13.4 years. Hypertension, hyperlipidemia, and heart disease were the most common comorbidities. The crude hospital mortality was 29% (95% confidence interval [CI] 0.27–0.31). Evaluation of the predictive models showed an AUC range from 0.63 (95% CI 0.60–0.66) to 0.72 (95% CI 0.69–0.74), indicating fair to poor discrimination across all models. There were no significant differences among the AUC values of the four prognostic systems. All models calibrated poorly, either overestimating or underestimating hospital mortality. Conclusions All four prognostic models examined in this study carry a high risk of bias. The performance of these scores needs to be interpreted with caution in hospitalized patients with COVID-19.
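
As an illustration of the Hosmer–Lemeshow goodness-of-fit test mentioned above, here is a minimal Python sketch of the test statistic. It is a simplified teaching example under stated assumptions, not the authors' analysis code: the function name and the equal-size grouping are choices made for this sketch, and the p-value step (comparing the statistic with a chi-square distribution, conventionally with `n_groups - 2` degrees of freedom) is left out to keep the example free of third-party dependencies.

```python
def hosmer_lemeshow_statistic(y_true, y_prob, n_groups=10):
    """Sum over risk-ordered groups of (observed - expected)^2 divided by
    n_g * mean_p * (1 - mean_p), where mean_p is the group's mean predicted risk."""
    pairs = sorted(zip(y_prob, y_true))  # order patients by predicted risk
    n = len(pairs)
    statistic = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        n_g = len(group)
        expected = sum(p for p, _ in group)   # expected deaths in the group
        observed = sum(y for _, y in group)   # observed deaths in the group
        mean_p = expected / n_g
        variance = n_g * mean_p * (1.0 - mean_p)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic


# Hypothetical toy data: every patient is assigned a 50% predicted mortality.
print(hosmer_lemeshow_statistic([1, 0, 1, 0], [0.5] * 4, n_groups=1))
```

Larger values indicate a bigger gap between predicted and observed mortality within risk groups, which is the pattern of over- or underestimation the abstract describes.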


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Matthew W Segar ◽  
Byron Jaeger ◽  
Kershaw V Patel ◽  
Vijay Nambi ◽  
Chiadi E Ndumele ◽  
...  

Introduction: Heart failure (HF) risk and the underlying biological risk factors vary by race. Machine learning (ML) may improve race-specific HF risk prediction but this has not been fully evaluated. Methods: The study included participants from 4 cohorts (ARIC, DHS, JHS, and MESA) aged > 40 years, free of baseline HF, and with adjudicated HF event follow-up. Black adults from JHS and white adults from ARIC were used to derive race-specific ML models to predict 10-year HF risk. The ML models were externally validated in subgroups of black and white adults from ARIC (excluding JHS participants) and pooled MESA/DHS cohorts and compared to prior established HF risk scores developed in ARIC and MESA. Harrell’s C-index and Greenwood-Nam-D’Agostino chi-square were used to assess discrimination and calibration, respectively. Results: In the derivation cohorts, 288 of 4141 (7.0%) black and 391 of 8242 (4.7%) white adults developed HF over 10 years. The ML models had excellent discrimination in both black and white participants (C-indices = 0.88 and 0.89). In the external validation cohorts for black participants from ARIC (excluding JHS, N = 1072) and MESA/DHS pooled cohorts (N = 2821), 131 (12.2%) and 115 (4.1%) developed HF. The ML model had adequate calibration and demonstrated superior discrimination compared to established HF risk models (Fig A). A consistent pattern was also observed in the external validation cohorts of white participants from the MESA/DHS pooled cohorts (N=3236; 100 [3.1%] HF events) (Fig A). The most important predictors of HF in both races were NP levels. Cardiac biomarkers and glycemic parameters were most important among blacks while LV hypertrophy and prevalent CVD and traditional CV risk factors were the strongest predictors among whites (Fig B). Conclusions: Race-specific and ML-based HF risk models that integrate clinical, laboratory, and biomarker data demonstrated superior performance when compared to traditional risk prediction models.
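
Harrell's C-index, used above to assess discrimination, extends the C-statistic to censored follow-up data. The sketch below is a minimal Python illustration under simplifying assumptions (tied event times are ignored, and the function name and toy data are hypothetical); it is not the study's code.

```python
def harrell_c_index(times, events, risk_scores):
    """Among comparable pairs (the subject with the shorter follow-up had an
    observed event), the share in which that subject also had the higher
    predicted risk; tied risks count as half-concordant."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years, 1 = HF event, 0 = censored.
follow_up = [2, 4, 6, 8]
hf_event = [1, 0, 1, 1]
predicted_risk = [0.9, 0.4, 0.1, 0.5]
print(harrell_c_index(follow_up, hf_event, predicted_risk))
```

In this toy example three of the four comparable pairs are ranked correctly, so the index is 0.75.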


2007 ◽  
Vol 25 (11) ◽  
pp. 1316-1322 ◽  
Author(s):  
Pierre I. Karakiewicz ◽  
Alberto Briganti ◽  
Felix K.-H. Chun ◽  
Quoc-Dien Trinh ◽  
Paul Perrotte ◽  
...  

Purpose We tested the hypothesis that the prediction of renal cancer–specific survival can be improved if traditional predictor variables are used within a prognostic nomogram. Patients and Methods Two cohorts of patients treated with either radical or partial nephrectomy for renal cortical tumors were used: one (n = 2,530) for nomogram development and for internal validation (200 bootstrap resamples), and a second (n = 1,422) for external validation. Cox proportional hazards regression analyses modeled the 2002 TNM stages, tumor size, Fuhrman grade, histologic subtype, local symptoms, age, and sex. The accuracy of the nomogram was compared with an established staging scheme. Results Cancer-specific mortality was observed in 598 (23.6%) patients, whereas 200 (7.9%) died as a result of other causes. Follow-up ranged from 0.1 to 286 months (median, 38.8 months). External validation of the nomogram at 1, 2, 5, and 10 years after nephrectomy revealed predictive accuracy of 87.8%, 89.2%, 86.7%, and 88.8%, respectively. Conversely, the alternative staging scheme predicting at 2 and 5 years was less accurate, as evidenced by 86.1% (P = .006) and 83.9% (P = .02) estimates. Conclusion The new nomogram is more contemporary, provides predictions that reach further in time and, compared with its alternative, which predicts at 2 and 5 years, generates 3.1% and 2.8% more accurate predictions, respectively.


2020 ◽  
Vol 48 (W1) ◽  
pp. W580-W585 ◽  
Author(s):  
Priyanka Banerjee ◽  
Mathias Dunkel ◽  
Emanuel Kemmler ◽  
Robert Preissner

Abstract Cytochrome P450 enzyme (CYP)-mediated drug metabolism influences drug pharmacokinetics and results in adverse outcomes in patients through drug–drug interactions (DDIs). Absorption, distribution, metabolism, excretion and toxicity (ADMET) issues are the leading causes of drug failure in clinical trials. As details on their metabolism are known for just half of the approved drugs, a tool for reliable prediction of CYP specificity is needed. The SuperCYPsPred web server is currently focused on five major CYP isoenzymes (CYP1A2, CYP2C19, CYP2D6, CYP2C9 and CYP3A4) that are responsible for more than 80% of the metabolism of clinical drugs. The prediction models for classification of CYP inhibition are based on well-established machine learning methods. The models were validated on both cross-validation and external validation sets and achieved good performance. The web server takes a 2D chemical structure as input and reports the CYP inhibition profile of the chemical for 10 models using different molecular fingerprints, along with confidence scores, similar compounds, known CYP information of drugs published in the literature, a detailed interaction profile of individual cytochromes including a DDIs table, and an overall CYP prediction radar chart (http://insilico-cyp.charite.de/SuperCYPsPred/). The web server does not require log in or registration and is free to use.


2021 ◽  
Author(s):  
Jonathan Hsijing Lu ◽  
Alison Callahan ◽  
Birju Patel ◽  
Keith Morse ◽  
Dev Dash ◽  
...  

Objective: To assess whether the documentation available for commonly used machine learning models developed by an electronic health record (EHR) vendor provides information requested by model reporting guidelines. Materials and Methods: We identified items requested for reporting from model reporting guidelines published in computer science, biomedical informatics, and clinical journals, and merged similar items into representative "atoms". Four independent reviewers and one adjudicator assessed the degree to which model documentation for 12 models developed by Epic Systems reported the details requested in each atom. We present summary statistics of consensus, interrater agreement, and reporting rates of all atoms for the 12 models. Results: We identified 220 unique atoms across 15 model reporting guidelines. After examining the documentation for the 12 most commonly used Epic models, the independent reviewers had an interrater agreement of 76%. After adjudication, the model documentations' median completion rate of applicable atoms was 39% (range: 31%-47%). Most of the commonly requested atoms had reporting rates of 90% or above, including atoms concerning outcome definition, preprocessing, AUROC, internal validation and intended clinical use. For individual reporting guidelines, the median adherence rate for an entire guideline was 54% (range: 15%-71%). Atoms reported half the time or less included those relating to fairness (summary statistics and subgroup analyses, including for age, race/ethnicity, or sex), usefulness (net benefit, prediction time, warnings on out-of-scope use and when to stop use), and transparency (model coefficients). Atoms reported the least often related to missingness (missing data statistics, missingness strategy), validation (calibration plot, external validation), and monitoring (how models are updated/tuned, prediction monitoring). 
Conclusion: There are many recommendations about what should be reported about predictive models used to guide care. Existing model documentation examined in this study provides less than half of applicable atoms, and entire reporting guidelines have low adherence rates. Half or less of the reviewed documentation reported information related to usefulness, reliability, transparency and fairness of models. There is a need for better operationalization of reporting recommendations for predictive models in healthcare.


2019 ◽  
Vol 2019 ◽  
pp. 1-8 ◽  
Author(s):  
Debora F. B. Leite ◽  
Jose G. Cecatti

The current and future burden of small-for-gestational-age (SGA) babies makes their screening in pregnancy a matter of major concern for clinicians and policymakers. Half of stillbirths are due to growth restriction in utero, and possibly a quarter of livebirths in low- and middle-income countries are SGA. A growing body of evidence shows their higher risk of adverse outcomes at any period of life, including increased rates of neurologic delay, noncommunicable chronic diseases (central obesity and metabolic syndrome), and mortality. Although there is no consensus regarding its definition, birthweight centile threshold, or follow-up, we believe birthweight <10th centile is the most suitable cutoff for clinical and epidemiological purposes. Maternal clinical factors have modest predictive accuracy; being born SGA appears to be of transgenerational heredity. Addition of ultrasound parameters improves prediction models, especially using estimated fetal weight and abdominal circumference in the 3rd trimester of pregnancy. Placental growth factor levels are decreased in SGA pregnancies, and it is the most promising biomarker for differentiating angiogenesis-related SGA from other causes. Unfortunately, only a few societies recommend universal screening. SGA evaluation is the first step of a multidimensional approach, which includes adequate management and long-term follow-up of these newborns. Beyond improving perinatal outcomes, we hypothesize that SGA screening is key to socioeconomic progress.


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
P Krisai ◽  
S Blum ◽  
S Aeschbacher ◽  
J H Beer ◽  
G Moschovitis ◽  
...  

Abstract Background Comprehensive information on the impact of atrial fibrillation (AF)-related symptoms and quality of life (QoL) on adverse outcomes is sparse. Purpose We aimed to investigate whether AF-related symptoms and/or QoL are associated with cardiovascular outcomes in a large cohort of AF patients. Methods A total of 3902 participants with documented AF from two nationwide prospective cohort studies in Switzerland were included. Information on AF-related symptoms was assessed yearly by standardized questionnaires, QoL was quantified using a visual analog scale (0–100, with higher scores indicating better QoL). The primary endpoint was a composite of stroke and systemic embolism. The secondary endpoint was a composite of cardiovascular death, hospitalization for heart failure and myocardial infarction. We assessed associations using multivariable, time-updated Cox proportional-hazards models including age, sex, study cohort, history of heart failure, hypertension, diabetes, prior stroke, prior myocardial infarction, vascular disease and prior catheter ablation for AF as covariates. Results Mean age was 72 years, and 72% were male. The median QoL score was 75 points, and 2572 (66%) participants had AF-related symptoms. Symptomatic individuals were younger (71 vs 75 years) and had more often paroxysmal AF (29 vs 23%) (p for both <0.001). The most frequent symptoms were palpitations (42%), dyspnea (25%) and fatigue (18%). In multivariable, time-updated models, the hazard ratio (HR) was 1.24 (95% confidence intervals (CI) 0.72; 2.11, p=0.43) for the primary endpoint and HR 0.83 (95% CI 0.65; 1.06, p=0.14) for the secondary endpoint in symptomatic vs non-symptomatic individuals. There was a significant, inverse association for a 5-point increase in the QoL score with both the primary (HR 0.94 (95% CI 0.88; 0.99), p=0.04) and secondary (HR 0.91 (95% CI 0.88; 0.93), p<0.0001) endpoints. 
Conclusions AF-related symptoms are not associated with adverse cardiovascular events in AF patients. In contrast, QoL is inversely associated with adverse cardiovascular outcomes.


2019 ◽  
Vol 37 (7_suppl) ◽  
pp. 414-414
Author(s):  
Ping Tan ◽  
Lu Yang ◽  
Hang Xu ◽  
Qiang Wei

Background: Recently, several postoperative nomograms for cancer-specific survival (CSS) after radical nephroureterectomy (RNU) were proposed, but they did not incorporate the same variables; moreover, many preoperative blood-based parameters recently reported to be related to survival were not included in those models. In addition, no nomogram for overall survival (OS) was available to date. Methods: The full data of 716 patients were available. The whole cohort was randomly divided into two cohorts: the training cohort for developing the nomograms (n = 508) and the validation cohort for validating the models (n = 208). Univariate and multivariate Cox proportional hazards regression models were used for establishing the prediction models. The discriminative accuracy of the nomograms was measured by Harrell’s concordance index (C-index). The clinical usefulness and net benefit of the predictive models were estimated and visualized using decision curve analysis (DCA). Results: The median follow-up time was 42.0 months (IQR: 18.0-76.0). For CSS, tumor size, grade and pT stage, lymph node metastasis, NLR, PLR and fibrinogen level were identified as independent risk factors in the final model, while tumor grade and pT stage, lymph node metastasis, PLR, Cys-C and fibrinogen level were identified as independent predictors in the OS model. The C-index for CSS prediction was 0.82 (95%CI: 0.79-0.85), and the OS nomogram model had an accuracy of 0.83 (95%CI: 0.80-0.86). The results of bootstrapping showed no deviation from the ideal. The calibration plots for the probability of CSS and OS at 3 or 5 years after RNU showed favorable agreement between the nomogram predictions and actual observation. In the external validation cohort, the C-indexes of the nomograms for predicting CSS and OS were 0.79 (95%CI: 0.74-0.84) and 0.80 (95%CI: 0.75-0.85), respectively. As indicated by calibration plots, optimal agreement was observed between prediction and observation in the external cohort. Conclusions: The nomograms developed and validated based on preoperative blood-based parameters were superior to any single variable for predicting CSS and OS after RNU.
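
The net benefit plotted in a decision curve analysis can be illustrated with a short Python sketch. This is a simplified version for a binary outcome at a fixed time horizon, not the survival-data analysis the abstract reports; the function name and toy data are assumptions. It uses the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk is at or
    above `threshold` (a probability strictly between 0 and 1)."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = died within the horizon, 0 = survived.
died = [1, 0, 1, 0]
predicted_risk = [0.9, 0.8, 0.3, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(died, predicted_risk, pt))
```

A decision curve repeats this calculation over a range of thresholds and compares the model with the "treat all" and "treat none" strategies.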


Circulation ◽  
2018 ◽  
Vol 138 (Suppl_1) ◽  
Author(s):  
Jenica N Upshaw ◽  
Jason Nelson ◽  
Benjamin Wessler ◽  
Benjamin Koethe ◽  
Christine Lundquist ◽  
...  

Introduction: Most heart failure (HF) clinical prediction models (CPMs) have not been independently externally validated. We sought to test the performance of HF models in a diverse population using a systematic approach. Methods: A systematic review identified CPMs predicting outcomes for patients with HF. Individual patient data from 5 large publicly available clinical trials enrolling patients with chronic HF were matched to published CPMs based on similarity in populations and available outcome and predictor variables in the clinical trial databases. CPM performance was evaluated for discrimination (c-statistic, % relative change in c-statistic) and calibration (Harrell’s E and E90, the mean and the 90% quantile of the error distribution from the smoothed loess observed value) for the original and recalibrated models. Results: Out of 135 HF CPMs reviewed, we identified 45 CPM-trial pairs including 13 unique CPMs. The outcome was mortality for all of the models with a trial match. During external validations, median c-statistic was 0.595 (IQR 0.563 to 0.630) with a median relative decrease in the c-statistic of -57% (IQR, -49% to -71%) compared to the c-statistic reported in the derivation cohort. Overall, the median Harrell’s E was 0.09 (IQR, 0.04 to 0.135) and E90 was 0.11 (IQR, 0.07 to 0.21). Recalibration of the intercept and slope led to substantially improved calibration with median change in Harrell’s E of -35% [IQR 0 to -75%] for the intercept and -56% [IQR -17% to -75%] for the intercept and slope. Refitting model covariates improved the median c-statistic by 38% to 0.629 [IQR 0.613 to 0.649]. Conclusion: For HF CPMs, independent external validations demonstrate that CPMs perform significantly worse than originally presented, although with significant heterogeneity. Recalibration of the intercept and slope improved model calibration. These results underscore the need to carefully consider the derivation cohort characteristics when using published CPMs.
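
Recalibrating the intercept and slope, as described above, can be sketched for a logistic-type model as follows. This is a minimal Python illustration that assumes the original model outputs a probability and refits only two parameters on the logit of that probability. The abstract does not state how the authors implemented recalibration, so the function names, the Newton-Raphson fitting routine, and the toy data are all assumptions.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def recalibrate(y_true, y_prob, n_iter=25):
    """Fit y ~ expit(a + b * logit(p)) by Newton-Raphson and return (a, b).
    a = 0 and b = 1 would mean the original predictions need no adjustment."""
    x = [logit(p) for p in y_prob]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for yi, xi in zip(y_true, x):
            mu = expit(a + b * xi)
            w = mu * (1.0 - mu)
            g_a += yi - mu            # gradient with respect to the intercept
            g_b += (yi - mu) * xi     # gradient with respect to the slope
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        if abs(det) < 1e-12:
            break
        # Newton step: solve the 2x2 system H * delta = gradient.
        d_a = (h_bb * g_a - h_ab * g_b) / det
        d_b = (h_aa * g_b - h_ab * g_a) / det
        a += d_a
        b += d_b
        if abs(d_a) + abs(d_b) < 1e-10:
            break
    return a, b


# Hypothetical toy validation data: observed deaths and original predictions.
died = [0, 0, 1, 0, 1, 1, 0, 1]
original_risk = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
a, b = recalibrate(died, original_risk)
recalibrated = [expit(a + b * logit(p)) for p in original_risk]
print(a, b)
```

Because only two parameters are refitted, the ranking of patients is normally unchanged; this is consistent with the abstract reporting that recalibration improved calibration, while improving the c-statistic required refitting the model covariates.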

