External validation of risk prediction models for incident colorectal cancer using UK Biobank

2018 ◽  
Vol 118 (5) ◽  
pp. 750-759 ◽  
Author(s):  
J A Usher-Smith ◽  
A Harshfield ◽  
C L Saunders ◽  
S J Sharp ◽  
J Emery ◽  
...  

Abstract Background: This study aimed to compare and externally validate risk scores developed to predict incident colorectal cancer (CRC) that include variables routinely available or easily obtainable via self-completed questionnaire. Methods: External validation of fourteen risk models from a previous systematic review in 373 112 men and women within the UK Biobank cohort with 5-year follow-up, no prior history of CRC and data for incidence of CRC through linkage to national cancer registries. Results: There were 1719 (0.46%) cases of incident CRC. The performance of the risk models varied substantially. In men, the QCancer10 model and models by Tao, Driver and Ma all had an area under the receiver operating characteristic curve (AUC) between 0.67 and 0.70. Discrimination was lower in women: the QCancer10, Wells, Tao, Guesmi and Ma models were the best performing with AUCs between 0.63 and 0.66. Assessment of calibration was possible for six models in men and women. All would require country-specific recalibration if estimates of absolute risks were to be given to individuals. Conclusions: Several risk models based on easily obtainable data have relatively good discrimination in a UK population. Modelling studies are now required to estimate the potential health benefits and cost-effectiveness of implementing stratified risk-based CRC screening.
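The AUC values quoted above can be read as the probability that a randomly chosen person who developed CRC received a higher risk score than a randomly chosen person who did not. The following is a minimal sketch of that pairwise definition, not the authors' code; the function name `auc` and the toy data are illustrative assumptions, and the quadratic loop is only practical for small samples.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risks (any numeric scale).
    labels: 1 for a case, 0 for a non-case.
    Ties between a case and a non-case count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical): 3 of the 4 case/non-case pairs are ordered correctly.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which puts the reported range of 0.63 to 0.70 in context.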

Gut ◽  
2018 ◽  
Vol 68 (4) ◽  
pp. 672-683 ◽  
Author(s):  
Todd Smith ◽  
David C Muller ◽  
Karel G M Moons ◽  
Amanda J Cross ◽  
Mattias Johansson ◽  
...  

Objective: To systematically identify and validate published colorectal cancer risk prediction models that do not require invasive testing in two large population-based prospective cohorts. Design: Models were identified through an update of a published systematic review and validated in the European Prospective Investigation into Cancer and Nutrition (EPIC) and the UK Biobank. The performance of the models to predict the occurrence of colorectal cancer within 5 or 10 years after study enrolment was assessed by discrimination (C-statistic) and calibration (plots of observed vs predicted probability). Results: The systematic review and its update identified 16 models from 8 publications (8 colorectal, 5 colon and 3 rectal). The number of participants included in each model validation ranged from 41 587 to 396 515, and the number of cases ranged from 115 to 1781. Eligible and ineligible participants across the models were largely comparable. Calibration of the models, where assessable, was very good and further improved by recalibration. The C-statistics of the models were largely similar between validation cohorts with the highest values achieved being 0.70 (95% CI 0.68 to 0.72) in the UK Biobank and 0.71 (95% CI 0.67 to 0.74) in EPIC. Conclusion: Several of these non-invasive models exhibited good calibration and discrimination within both external validation populations and are therefore potentially suitable candidates for the facilitation of risk stratification in population-based colorectal screening programmes. Future work should both evaluate this potential, through modelling and impact studies, and ascertain if further enhancement in their performance can be obtained.
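Calibration as described above compares observed event rates with predicted probabilities. The following is a minimal sketch of the numbers behind such a plot, assuming participants are grouped into equal-sized bins by predicted risk; the helper names `calibration_table` and `observed_expected_ratio` are hypothetical and the grouping scheme is an assumption, not the method reported in the paper.

```python
def calibration_table(predicted, observed, n_groups=10):
    """Return (mean predicted risk, observed event rate, group size) per risk group.

    predicted: predicted probabilities in [0, 1].
    observed:  1 if the event occurred, 0 otherwise.
    Participants are sorted by predicted risk and split into n_groups
    groups of (nearly) equal size.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    if n == 0:
        raise ValueError("no data")
    n_groups = min(n_groups, n)  # guarantees every group is non-empty

    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, obs_rate, len(chunk)))
    return rows


def observed_expected_ratio(predicted, observed):
    """Overall observed/expected ratio; 1.0 means the total count is predicted exactly."""
    return sum(observed) / sum(predicted)


# Toy data (hypothetical): the model under-predicts, so O/E is above 1.
table = calibration_table([0.1, 0.2, 0.3, 0.4], [0, 0, 1, 1], n_groups=2)
oe = observed_expected_ratio([0.1, 0.2, 0.3, 0.4], [0, 0, 1, 1])
```

Plotting the observed rate against the mean predicted risk for each group gives the calibration plot; points on the diagonal indicate good calibration.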


2020 ◽  
Vol 122 (10) ◽  
pp. 1572-1575
Author(s):  
J. A. Usher-Smith ◽  
A. Harshfield ◽  
C. L. Saunders ◽  
S. J. Sharp ◽  
J. Emery ◽  
...  

2020 ◽  
Author(s):  
Musa Saulawa Ibrahim ◽  
Dong Pang ◽  
Yannis Pappas ◽  
Gurch Randhawa

Abstract Background: Metabolic syndrome (MetS) is linked with increased risk of cardiovascular disease, diabetes and all-cause mortality. Despite the high number of models and scores for assessing the risk of developing MetS, hardly any are used in practice. Hence, we conducted a systematic review to determine the performance of risk models and scores for predicting metabolic syndrome. Methods: We systematically searched MEDLINE, CINAHL, PUBMED and Web of Science to identify studies that either derive or validate risk prediction models or scores for predicting the risk of metabolic syndrome. Data concerning models’ statistical properties as well as details of internal or external validations were extracted. Tables were used to compare various components of models and statistical properties. Finally, PROBAST was used to assess the methodological quality (risk of bias) of included studies. Results: A total of 15102 titles were scanned, 29 full papers were analysed in detail and 24 papers were included. The studies reported the development, validation or both of 40 MetS risk models; of these, 24 models were studied in detail. There is significant heterogeneity between studies in terms of geography/demographics, data type and methodological approach. The majority of the models or risk scores were developed or validated using data from cross-sectional studies, or routine data that were often assembled for other reasons. Various combinations of risk factors (predictors) were considered significant in the respective final model. Similarly, different criteria were used in the diagnosis of MetS, but the NCEP criteria, including modified versions, were by far the most widely used (32.5%). Reporting quality is generally poor across the studies, especially concerning statistical data. In nearly a fifth of the studies, internal validation was either not conducted or not reported. Only two (2) risk models or scores were externally validated. Conclusions: There is an abundance of MetS models in the literature, but their usefulness is doubtful due to limitations in methodology, poor reporting and lack of external validation and impact studies. Therefore, future research should focus more on externally validating or applying such models in a different setting. Protocol: The protocol of this study can be found at https://bmjopen.bmj.com/content/9/9/e027326. PROSPERO registration number: CRD42019139326


2020 ◽  
Vol 13 (6) ◽  
pp. 509-520 ◽  
Author(s):  
Catherine L. Saunders ◽  
Britt Kilian ◽  
Deborah J. Thompson ◽  
Luke J. McGeoch ◽  
Simon J. Griffin ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chi-Ming Chu ◽  
Huan-Ming Hsu ◽  
Chi-Wen Chang ◽  
Yuan-Kuei Li ◽  
Yu-Jia Chang ◽  
...  

Abstract Genetic co-expression network (GCN) analysis augments the understanding of breast cancer (BC). We aimed to propose GCN-based modeling for BC relapse-free survival (RFS) prediction and to discover novel biomarkers. We used GCN and Cox proportional hazard regression to create various prediction models using mRNA microarray data from 920 tumors and conducted external validation using independent data from 1056 tumors. GCNs of 34 identified candidate genes were plotted in various sizes. Compared to the reference model, the genetic predictors selected from bigger GCNs composed better prediction models. The prediction accuracy and AUC of 3- to 15-year RFS are 71.0–81.4% and 74.6–78% respectively (rfm, ACC 63.2–65.5%, AUC 61.9–74.9%). The hazard ratios of risk scores for developing relapse ranged from 1.89 to 3.32 (p < 10⁻⁸) over all models, controlling for node status. External validation showed consistent findings. We found that the top 12 co-expressed genes are relatively new or novel biomarkers that had not been explored in BC prognosis or other cancers until this decade. GCN-based modeling creates better prediction models and facilitates the exploration of novel genes in BC prognosis.


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Matthew W Segar ◽  
Byron Jaeger ◽  
Kershaw V Patel ◽  
Vijay Nambi ◽  
Chiadi E Ndumele ◽  
...  

Introduction: Heart failure (HF) risk and the underlying biological risk factors vary by race. Machine learning (ML) may improve race-specific HF risk prediction but this has not been fully evaluated. Methods: The study included participants from 4 cohorts (ARIC, DHS, JHS, and MESA) aged > 40 years, free of baseline HF, and with adjudicated HF event follow-up. Black adults from JHS and white adults from ARIC were used to derive race-specific ML models to predict 10-year HF risk. The ML models were externally validated in subgroups of black and white adults from ARIC (excluding JHS participants) and pooled MESA/DHS cohorts and compared to prior established HF risk scores developed in ARIC and MESA. Harrell’s C-index and Greenwood-Nam-D’Agostino chi-square were used to assess discrimination and calibration, respectively. Results: In the derivation cohorts, 288 of 4141 (7.0%) black and 391 of 8242 (4.7%) white adults developed HF over 10 years. The ML models had excellent discrimination in both black and white participants (C-indices = 0.88 and 0.89). In the external validation cohorts for black participants from ARIC (excluding JHS, N = 1072) and MESA/DHS pooled cohorts (N = 2821), 131 (12.2%) and 115 (4.1%) developed HF. The ML model had adequate calibration and demonstrated superior discrimination compared to established HF risk models (Fig A). A consistent pattern was also observed in the external validation cohorts of white participants from the MESA/DHS pooled cohorts (N=3236; 100 [3.1%] HF events) (Fig A). The most important predictors of HF in both races were NP levels. Cardiac biomarkers and glycemic parameters were most important among blacks while LV hypertrophy and prevalent CVD and traditional CV risk factors were the strongest predictors among whites (Fig B). Conclusions: Race-specific and ML-based HF risk models that integrate clinical, laboratory, and biomarker data demonstrated superior performance when compared to traditional risk prediction models.


2019 ◽  
Vol 6 (Supplement_2) ◽  
pp. S831-S832
Author(s):  
Donald A Perry ◽  
Daniel Shirley ◽  
Dejan Micic ◽  
Rosemary K B Putler ◽  
Pratish Patel ◽  
...  

Abstract Background: Annually in the US alone, Clostridioides difficile infection (CDI) afflicts nearly 500,000 patients, causing 29,000 deaths. Since early and aggressive interventions could save lives but are not optimally deployed in all patients, numerous studies have published predictive models for adverse outcomes. These models are usually developed at a single institution, and largely are not externally validated. The aim of this study was to validate the predictability for severe CDI with previously published risk scores in a multicenter cohort of patients with CDI. Methods: We conducted a retrospective study on four separate inpatient cohorts with CDI from three distinct sites: the Universities of Michigan (2010–2012 and 2016), Chicago (2012), and Wisconsin (2012). The primary composite outcome was admission to an intensive care unit, colectomy, and/or death attributed to CDI within 30 days of positive test. Structured query and manual chart review abstracted data from the medical record at each site. Published CDI severity scores were assessed and compared with each other and the IDSA guideline definition of severe CDI. Sensitivity, specificity, area under the receiver operator characteristic curve (AuROC), precision-recall curves, and net reclassification index (NRI) were calculated to compare models. Results: We included 3,775 patients from the four cohorts (Table 1) and evaluated eight severity scores (Table 2). The IDSA (baseline comparator) model showed poor performance across cohorts (Table 3). Of the binary classification models, including those that were most predictive of the primary composite outcome, Jardin, performed poorly with minimal to no NRI improvement compared with IDSA. The continuous score models, Toro and ATLAS, performed better, but the AuROC varied by site by up to 17% (Table 3). The Gujja model varied the most: from most predictive in the University of Michigan 2010–2012 cohort to having no predictive value in the 2016 cohort (Table 3). Conclusion: No published CDI severity score showed stable, acceptable predictive ability across multiple cohorts/institutions. To maximize performance and clinical utility, future efforts should focus on a multicenter-derived and validated scoring system, and/or incorporate novel biomarkers. Disclosures: All authors: No reported disclosures.


2020 ◽  
Author(s):  
Chundong Zhang ◽  
Zubing Mei ◽  
Junpeng Pei ◽  
Masanobu Abe ◽  
Xiantao Zeng ◽  
...  

Abstract Background: The American Joint Committee on Cancer (AJCC) 8th tumor/node/metastasis (TNM) classification for colorectal cancer (CRC) has limited ability to predict prognosis. Methods: We included 45,379 eligible stage I-III CRC patients from the Surveillance, Epidemiology, and End Results Program. Patients were randomly assigned individually to a training (N = 31,772) or an internal validation cohort (N = 13,607). External validation was performed in 10,902 additional patients. Patients were divided according to T and N stage permutations. Survival analyses were conducted by a Cox proportional hazard model and Kaplan-Meier analysis, with T1N0 as the reference. Area under receiver operating characteristic curve (AUC) and Akaike information criteria (AIC) were applied for prognostic discrimination and model-fitting, respectively. Clinical benefits were further assessed by decision curve analyses. Results: We created a modified TNM (mTNM) classification: stages I (T1-2N0-1a), IIA (T1N1b, T2N1b, T3N0), IIB (T1-2N2a-2b, T3N1a-1b, T4aN0), IIC (T3N2a, T4aN1a-2a, T4bN0), IIIA (T3N2b, T4bN1a), IIIB (T4aN2b, T4bN1b), and IIIC (T4bN2a-2b). In the internal validation cohort, compared to the AJCC 8th TNM classification, the mTNM classification showed superior prognostic discrimination (AUC = 0.675 vs. 0.667, respectively; two-sided P < 0.001) and better model-fitting (AIC = 70,937 vs. 71,238, respectively). Similar findings were obtained in the external validation cohort. Decision curve analyses revealed that the mTNM had superior net benefits over the AJCC 8th TNM classification in the internal and external validation cohorts. Conclusions: The mTNM classification provides better prognostic discrimination than AJCC 8th TNM classification, with good applicability in various populations and settings, to help better stratify stage I-III CRC patients into prognostic groups.


2016 ◽  
Vol 34 (21) ◽  
pp. 2534-2540 ◽  
Author(s):  
Kathleen F. Kerr ◽  
Marshall D. Brown ◽  
Kehao Zhu ◽  
Holly Janes

The decision curve is a graphical summary recently proposed for assessing the potential clinical impact of risk prediction biomarkers or risk models for recommending treatment or intervention. It was applied recently in an article in Journal of Clinical Oncology to measure the impact of using a genomic risk model for deciding on adjuvant radiation therapy for prostate cancer treated with radical prostatectomy. We illustrate the use of decision curves for evaluating clinical- and biomarker-based models for predicting a man’s risk of prostate cancer, which could be used to guide the decision to biopsy. Decision curves are grounded in a decision-theoretical framework that accounts for both the benefits of intervention and the costs of intervention to a patient who cannot benefit. Decision curves are thus an improvement over purely mathematical measures of performance such as the area under the receiver operating characteristic curve. However, there are challenges in using and interpreting decision curves appropriately. We caution that decision curves cannot be used to identify the optimal risk threshold for recommending intervention. We discuss the use of decision curves for miscalibrated risk models. Finally, we emphasize that a decision curve shows the performance of a risk model in a population in which every patient has the same expected benefit and cost of intervention. If every patient has a personal benefit and cost, then the curves are not useful. If subpopulations have different benefits and costs, subpopulation-specific decision curves should be used. As a companion to this article, we released an R software package called DecisionCurve for making decision curves and related graphics.
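A decision curve plots net benefit against the risk threshold. The abstract does not state the formula; the sketch below uses the standard definition from the decision-curve literature, net benefit = TP/n − (FP/n) × t/(1 − t), where t is the risk threshold. The function names and toy data are illustrative assumptions, and this is not the DecisionCurve R package.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold.

    True positives count as benefit; false positives are penalised by the
    odds of the threshold, threshold / (1 - threshold).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: intervene on every patient regardless of risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): compare the model with "treat all" at two thresholds.
risks = [0.05, 0.2, 0.6, 0.9]
outcomes = [0, 0, 1, 1]
curve = [(t, net_benefit(risks, outcomes, t), net_benefit_treat_all(outcomes, t))
         for t in (0.1, 0.5)]
```

The "treat none" strategy has a net benefit of zero at every threshold, so a model is only useful at thresholds where its curve lies above both reference strategies. Because predicted risks are compared directly with the threshold, a miscalibrated model can distort the curve, which is one of the cautions the authors raise.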


2020 ◽  
Author(s):  
Musa Saulawa Ibrahim ◽  
Dong Pang ◽  
Yannis Pappas ◽  
Gurch Randhawa

Abstract Background: Metabolic syndrome (MetS), ‘a clustering of risk factors which includes hypertension, central obesity, impaired glucose metabolism with insulin resistance, and dyslipidaemia’, is linked with increased risk of CVD, T2DM and all-cause mortality. Despite the high number of models and scores for assessing the risk of developing MetS, hardly any are used in research or practice. Hence, we conducted a systematic review to determine the performance of risk models and scores for predicting MetS. Methods: Following the methods proposed by EPPI-Centre, we systematically searched MEDLINE, CINAHL, PUBMED and Web of Science to identify studies that either derive or validate risk models or scores for predicting the risk of MetS. The search was originally done in September 2018 and updated in September 2020. Data concerning models’ statistical properties as well as details of internal or external validations were extracted. Tables were used to compare various components of models and statistical properties. Finally, PROBAST was used to assess the methodological quality (risk of bias) of included studies. Results: A total of 27 studies reporting the development, validation or both of MetS risk models were included. There is significant heterogeneity between studies in terms of geography/demographics, data type and methodological approach. The majority of the models or risk scores were developed or validated using data from cross-sectional studies, or routine data. Various combinations of risk factors (predictors) were considered significant in the respective final model. Similarly, different criteria were used in the diagnosis of MetS, but the NCEP criteria, including modified versions, were by far the most widely used (32.5%). Reporting quality is generally poor across the studies, especially concerning statistical data. In nearly a fifth of the studies, internal validation was either not conducted or not reported. Only two (2) risk models or scores were externally validated. Conclusions: There is an abundance of MetS models in the literature, but their usefulness is doubtful due to limitations in methodology, poor reporting and lack of external validation and impact studies. Therefore, future research should focus more on externally validating or applying such models in a different setting. Protocol: The protocol of this study can be found at https://bmjopen.bmj.com/content/9/9/e027326. PROSPERO registration number: CRD42019139326

