Study protocol: Comparison of different risk prediction modelling approaches for COVID-19 related death using the OpenSAFELY platform

2020 ◽  
Vol 5 ◽  
pp. 243 ◽  
Author(s):  
Elizabeth J. Williamson ◽  
John Tazare ◽  
Krishnan Bhaskaran ◽  
Alex J Walker ◽  
...  

On March 11th 2020, the World Health Organization characterised COVID-19 as a pandemic. Responses aimed at containing the spread of the virus have relied heavily on policies that restrict contact between people. Evolving policies regarding shielding and individual choices about restricting social contact will rely heavily on perceived risk of poor outcomes from COVID-19. In order to make informed decisions, both individual and collective, good predictive models are required. For outcomes related to an infectious disease, the performance of any risk prediction model will depend heavily on the underlying prevalence of infection in the population of interest. Incorporating measures of how this prevalence changes over time may result in important improvements in prediction model performance. This protocol reports details of a planned study to explore the extent to which incorporating time-varying measures of infection burden improves the quality of risk prediction models for COVID-19 death in a large population of adult patients in England. To achieve this aim, we will compare the performance of different modelling approaches to risk prediction, including static cohort approaches typically used in chronic disease settings and landmarking approaches incorporating time-varying measures of infection prevalence and policy change, using COVID-19 related deaths data linked to longitudinal primary care electronic health records data within the OpenSAFELY secure analytics platform.

BMC Cancer ◽  
2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Michele Sassano ◽  
Marco Mariani ◽  
Gianluigi Quaranta ◽  
Roberta Pastorino ◽  
Stefania Boccia

Abstract Background Risk prediction models incorporating single nucleotide polymorphisms (SNPs) could lead to individualized prevention of colorectal cancer (CRC). However, the added value of incorporating SNPs into models with only traditional risk factors is still not clear. Hence, our primary aim was to summarize literature on risk prediction models including genetic variants for CRC, while our secondary aim was to evaluate the improvement of discriminatory accuracy when adding SNPs to a prediction model with only traditional risk factors. Methods We conducted a systematic review on prediction models incorporating multiple SNPs for CRC risk prediction. We tested whether a significant trend in the increase of Area Under Curve (AUC) according to the number of SNPs could be observed, and estimated the correlation between AUC improvement and number of SNPs. We estimated pooled AUC improvement for SNP-enhanced models compared with non-SNP-enhanced models using random effects meta-analysis, and conducted meta-regression to investigate the association of specific factors with AUC improvement. Results We included 33 studies, 78.79% using genetic risk scores to combine genetic data. We found no significant trend in AUC improvement according to the number of SNPs (p for trend = 0.774), and no correlation between the number of SNPs and AUC improvement (p = 0.695). Pooled AUC improvement was 0.040 (95% CI: 0.035, 0.045), and the number of cases in the study and the AUC of the starting model were inversely associated with AUC improvement obtained when adding SNPs to a prediction model. In addition, models constructed in Asian individuals achieved better AUC improvement with the incorporation of SNPs compared with those developed among individuals of European ancestry. Conclusions Though not conclusive, our results provide insights on factors influencing discriminatory accuracy of SNP-enhanced models. 
Genetic variants might be useful to inform stratified CRC screening in the future, but further research is needed.
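
The random effects pooling mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator, which the abstract does not name, and the function name `dersimonian_laird` and the input values are hypothetical, not data from the review.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Pool study-level effects (e.g. AUC improvements) with a
    DerSimonian-Laird random-effects model (assumed estimator)."""
    k = len(effects)
    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, tau2


# Hypothetical AUC improvements and standard errors from three studies.
pooled, ci, tau2 = dersimonian_laird([0.02, 0.04, 0.06], [0.01, 0.01, 0.01])
print(pooled, ci, tau2)
```

When the studies disagree by more than their sampling error would suggest, `tau2` is positive and the confidence interval widens relative to a fixed-effect analysis.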


2021 ◽  
Author(s):  
Xuecheng Zhang ◽  
Kehua Zhou ◽  
Jingjing Zhang ◽  
Ying Chen ◽  
Hengheng Dai ◽  
...  

Abstract Background Nearly a third of patients with acute heart failure (AHF) die or are readmitted within three months after discharge, accounting for the majority of costs associated with heart failure-related care. A considerable number of risk prediction models, which predict mortality and readmission, have been developed and validated for patients with AHF. These models could help clinicians stratify patients by risk level, improve decision making, and direct specialist care and resources to high-risk patients. However, clinicians are sometimes reluctant to use these models, possibly due to their poor reliability, the variety of models, and/or the complexity of statistical methodologies. Here, we describe a protocol to systematically review extant risk prediction models. We will describe characteristics, compare performance, and critically appraise the reporting transparency and methodological quality of risk prediction models for AHF patients. Method Embase, PubMed, Web of Science, and the Cochrane Library will be searched from their inception onwards. A backward search will be performed on derivation studies to find relevant external validation studies. Multivariable prognostic models for mortality and/or readmission in patients with AHF will be eligible for review. Two reviewers will conduct title and abstract screening, full-text review, and data extraction independently. Included models will be summarized qualitatively and quantitatively. We will also provide an overview of critical appraisal of the methodological quality and reporting transparency of included studies using the Prediction model Risk of Bias Assessment Tool (PROBAST) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) statement.
Discussion The results of the systematic review could help clinicians better understand and use the prediction models for AHF patients, as well as make standardized decisions about more precise, risk-adjusted management. Systematic review registration: PROSPERO registration number CRD42021256416.


Author(s):  
Julie R. Palmer ◽  
Gary Zirpoli ◽  
Kimberly A. Bertrand ◽  
Tracy Battaglia ◽  
Leslie Bernstein ◽  
...  

PURPOSE Breast cancer risk prediction models are used to identify high-risk women for early detection, targeted interventions, and enrollment into prevention trials. We sought to develop and evaluate a risk prediction model for breast cancer in US Black women, suitable for use in primary care settings. METHODS Breast cancer relative risks and attributable risks were estimated using data from Black women in three US population-based case-control studies (3,468 breast cancer cases; 3,578 controls age 30-69 years) and combined with SEER age- and race-specific incidence rates, with incorporation of competing mortality, to develop an absolute risk model. The model was validated in prospective data among 51,798 participants of the Black Women's Health Study, including 1,515 who developed invasive breast cancer. A second risk prediction model was developed on the basis of estrogen receptor (ER)–specific relative risks and attributable risks. Model performance was assessed by calibration (expected/observed cases) and discriminatory accuracy (C-statistic). RESULTS The expected/observed ratio was 1.01 (95% CI, 0.95 to 1.07). Age-adjusted C-statistics were 0.58 (95% CI, 0.56 to 0.59) overall and 0.63 (95% CI, 0.58 to 0.68) among women younger than 40 years. These measures were almost identical in the model based on estrogen receptor–specific relative risks and attributable risks. CONCLUSION Discriminatory accuracy of the new model was similar to that of the most frequently used questionnaire-based breast cancer risk prediction models in White women, suggesting that effective risk stratification for Black women is now possible. This model may be especially valuable for risk stratification of young Black women, who are below the ages at which breast cancer screening is typically begun.
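
The two performance measures named in the abstract above, calibration as an expected/observed ratio and discrimination as a C-statistic, can be sketched as follows. This is an illustrative, unadjusted version (the study reports age-adjusted C-statistics), and the function names and toy inputs are assumptions rather than the authors' code or data.

```python
def expected_observed_ratio(predicted_risks, outcomes):
    """Calibration: expected cases (sum of predicted risks) divided by
    observed cases. Values near 1 indicate good overall calibration."""
    return sum(predicted_risks) / sum(outcomes)


def c_statistic(predicted_risks, outcomes):
    """Discrimination: probability that a randomly chosen case has a higher
    predicted risk than a randomly chosen non-case (ties count as 0.5)."""
    cases = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    controls = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and outcomes (1 = developed the disease).
risks = [0.10, 0.20, 0.30, 0.40]
observed = [0, 0, 1, 1]
print(expected_observed_ratio(risks, observed))
print(c_statistic(risks, observed))
```

A model can be well calibrated overall (ratio close to 1) while still discriminating only modestly between individuals, which is why the two measures are reported separately.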


2021 ◽  
Author(s):  
Gabriella Gatt ◽  
Nikolai Attard

Abstract Background Despite increasing prevalence, age-specific risk prediction models for erosive tooth wear in preschool-age children have not been developed. Identification of at-risk groups and the timely introduction of behavioural change or treatment will stop the progression of erosive wear in the permanent dentition. The aim of this study was to identify age-specific risk factors for erosive wear. Distinct risk prediction models for three-year-old and five-year-old children were developed. Methods A prospective cohort study included clinical examinations and parent-administered questionnaires for three- and five-year-old children. Chi-square tests explored categorical demographic variables, Spearman rank-order correlation tests examined changes in BEWE scores with changes in food frequencies, and Wilcoxon signed-rank tests evaluated the temporal effect of frequencies of consumption of dietary items. Mann-Whitney U tests compared changes in BEWE scores over time for the twenty-six bivariate categorical variables, and Kruskal-Wallis tests compared changes in BEWE scores over time across the remaining 55 categorical variables representing demographic factors, oral hygiene habits and dietary habits. Change in BEWE scores for continuous variables was investigated using the Spearman rho correlation coefficient. Variables showing a significant association with a difference in BEWE cumulative score over time were used to develop two risk prediction models. The models were evaluated by Receiver Operating Characteristic (ROC) analysis. Results Risk factors for the three-year-old cohort included erosive wear (χ2 (1, 92) = 12.829, p < 0.001), district (χ2 (5, 92) = 17.032, p = 0.004) and family size (χ2 (1, 92) = 4.547, p = 0.033).
Risk factors for the five-year-old cohort also included erosive wear (χ2 (1, 144) = 4.768, p = 0.029), gender (χ2 (1, 144) = 19.399, p < 0.001), consumption of iced tea (χ2 (1, 144) = 8.872, p = 0.003) and dry mouth (χ2 (1, 144) = 9.598, p = 0.002). Conclusions Predictive risk factors for three-year-old children are based on demographic factors and are distinct from those for the five-year-old cohort, which are based on biological and behavioural factors. The presence of erosive wear is a risk factor for further wear in both age cohorts.


PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0243189
Author(s):  
Michał Wieczorek ◽  
Jakub Siłka ◽  
Dawid Połap ◽  
Marcin Woźniak ◽  
Robertas Damaševičius

Since the outbreak in the early months of 2020, the spread of COVID-19 has grown rapidly in most countries and regions across the world. The outbreak was declared a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO) on January 30, 2020. Many scientists are therefore working on new methods to limit further growth in new cases and, through intelligent allocation of patients, to reduce the number of patients per doctor, which can lead to more successful treatment. However, to properly manage the spread of COVID-19 there is a need for real-time prediction models that can reliably support decisions at both national and international level. The difficulty in developing such a system is the lack of general knowledge about how the virus spreads and what the number of cases will be each day. A prediction model must therefore be able to infer the situation from past data so that its results show a future trend and relate as closely as possible to the real numbers. In our opinion, Artificial Intelligence makes this possible. In this article we present a model that can work as part of an online system as a real-time predictor to help estimate the spread of COVID-19. This prediction model is developed using Artificial Neural Networks (ANN) to estimate the future situation using geo-location and numerical data from the past 2 weeks. The results of our model are confirmed by comparing them with real data; during our research the model correctly predicted the trend and very closely matched the numbers of new cases each day.


Cancers ◽  
2021 ◽  
Vol 13 (22) ◽  
pp. 5672
Author(s):  
Vincent Bourbonne ◽  
Vincent Jaouen ◽  
Truong An Nguyen ◽  
Valentin Tissot ◽  
Laurent Doucet ◽  
...  

Significant advances in lymph node involvement (LNI) risk modeling in prostate cancer (PCa) have been achieved with the addition of visual interpretation of magnetic resonance imaging (MRI) data, but it is likely that quantitative analysis could further improve prediction models. In this study, we aimed to develop and internally validate a novel LNI risk prediction model based on radiomic features extracted from preoperative multimodal MRI. All patients who underwent a preoperative MRI and radical prostatectomy with extensive lymph node dissection at a single institution were retrospectively included. Patients were randomly divided into the training (60%) and testing (40%) sets. Radiomic features were extracted from the index tumor volumes, delineated on the apparent diffusion coefficient corrected map and the T2 sequences. A ComBat harmonization method was applied to account for inter-site heterogeneity. A prediction model was trained using a neural network approach (Multilayer Perceptron Network, SPSS v24.0©) combining clinical, radiomic and all features. It was then evaluated on the testing set and compared to the currently available models using the Receiver Operating Characteristic curve and the C-Index. Two hundred and eighty patients were included, with a median age of 65.2 y (45.3–79.6), a mean PSA level of 9.5 ng/mL (1.04–63.0) and 79.6% of ISUP ≥ 2 tumors. LNI occurred in 51 patients (18.2%), with a median number of extracted nodes of 15 (10–19). In the testing set, with their respective cutoffs applied, the Partin, Roach, Yale, MSKCC, Briganti 2012 and 2017 models resulted in a C-Index of 0.71, 0.66, 0.55, 0.67, 0.65 and 0.73, respectively, while our proposed combined model resulted in a C-Index of 0.89. Radiomic features extracted from the preoperative MRI scans and combined with clinical features through a neural network seem to provide added predictive performance compared to state-of-the-art models regarding LNI risk prediction in PCa.


2021 ◽  
Vol 15 (7) ◽  
pp. e0009491
Author(s):  
Hamidah Mahmud ◽  
Emma Landskroner ◽  
Abdou Amza ◽  
Solomon Aragie ◽  
William W. Godwin ◽  
...  

The World Health Organization (WHO) recommends continuing azithromycin mass drug administration (MDA) for trachoma until endemic regions drop below 5% prevalence of active trachoma in children aged 1–9 years. Azithromycin targets the ocular strains of Chlamydia trachomatis that cause trachoma. Regions with low prevalence of active trachoma may have little if any ocular chlamydia, and, thus, may not benefit from azithromycin treatment. Understanding what happens to active trachoma and ocular chlamydia prevalence after stopping azithromycin MDA may improve future treatment decisions. We systematically reviewed published evidence for community prevalence of both active trachoma and ocular chlamydia after cessation of azithromycin distribution. We searched electronic databases for all peer-reviewed studies published before May 2020 that included at least 2 post-MDA surveillance surveys of ocular chlamydia and/or the active trachoma marker, trachomatous inflammation–follicular (TF) prevalence. We assessed trends in the prevalence of both indicators over time after stopping azithromycin MDA. Of 140 identified studies, 21 met inclusion criteria and were used for qualitative synthesis. Post-MDA, we found a gradual increase in ocular chlamydia infection prevalence over time, while TF prevalence generally gradually declined. Ocular chlamydia infection may be a better measurement tool compared to TF for detecting trachoma recrudescence in communities after stopping azithromycin MDA. These findings may guide future trachoma treatment and surveillance efforts.


2019 ◽  
Vol 12 (1) ◽  
Author(s):  
Naoko Sasamoto ◽  
Ana Babic ◽  
Bernard A. Rosner ◽  
Renée T. Fortner ◽  
Allison F. Vitonis ◽  
...  

Abstract Background Cancer Antigen 125 (CA125) is currently the best available ovarian cancer screening biomarker. However, CA125 has been limited by low sensitivity and specificity, in part due to normal variation between individuals. Personal characteristics that influence CA125 could be used to improve its performance as a screening biomarker. Methods We developed and validated linear and dichotomous (≥35 U/mL) circulating CA125 prediction models in postmenopausal women without ovarian cancer who participated in one of five large population-based studies: Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO, n = 26,981), European Prospective Investigation into Cancer and Nutrition (EPIC, n = 861), the Nurses’ Health Studies (NHS/NHSII, n = 81), and the New England Case Control Study (NEC, n = 923). The prediction models were developed using stepwise regression in PLCO and validated in EPIC, NHS/NHSII and NEC. Results The linear CA125 prediction model, which included age, race, body mass index (BMI), smoking status and duration, parity, hysterectomy, age at menopause, and duration of hormone therapy (HT), explained 5% of the total variance of CA125. The correlation between measured and predicted CA125 was comparable in the PLCO testing dataset (r = 0.18) and the external validation datasets (r = 0.14). The dichotomous CA125 prediction model included age, race, BMI, smoking status and duration, hysterectomy, time since menopause, and duration of HT, with an AUC of 0.64 in PLCO and 0.80 in the validation dataset. Conclusions The linear prediction model explained a small portion of the total variability of CA125, suggesting the need to identify novel predictors of CA125. The dichotomous prediction model showed moderate discriminatory performance, which validated well in an independent dataset.
Our dichotomous model could be valuable in identifying healthy women who may have elevated CA125 levels, which may contribute to reducing false positive tests using CA125 as a screening biomarker.


2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Alexander Pate ◽  
Richard Emsley ◽  
Matthew Sperrin ◽  
Glen P. Martin ◽  
Tjeerd van Staa

Abstract Background Stability of risk estimates from prediction models may be highly dependent on the sample size of the dataset available for model derivation. In this paper, we evaluate the stability of cardiovascular disease risk scores for individual patients when using different sample sizes for model derivation; such sample sizes include those similar to models recommended in the national guidelines, and those based on a recently published sample size formula for prediction models. Methods We mimicked the process of sampling N patients from a population to develop a risk prediction model by sampling patients from the Clinical Practice Research Datalink. A cardiovascular disease risk prediction model was developed on this sample and used to generate risk scores for an independent cohort of patients. This process was repeated 1000 times, giving a distribution of risks for each patient. N = 100,000, 50,000, 10,000, Nmin (derived from the sample size formula) and Nepv10 (meets the 10 events per predictor rule) were considered. The 5–95th percentile range of risks across these models was used to evaluate instability. Patients were grouped by a risk derived from a model developed on the entire population (population-derived risk) to summarise results. Results For a sample size of 100,000, the median 5–95th percentile range of risks for patients across the 1000 models was 0.77%, 1.60%, 2.42% and 3.22% for patients with population-derived risks of 4–5%, 9–10%, 14–15% and 19–20% respectively; for N = 10,000, it was 2.49%, 5.23%, 7.92% and 10.59%, and for N using the formula-derived sample size, it was 6.79%, 14.41%, 21.89% and 29.21%. Restricting this analysis to models with high discrimination, good calibration or small mean absolute prediction error reduced the percentile range, but high levels of instability remained. Conclusions Widely used cardiovascular disease risk prediction models suffer from high levels of instability induced by sampling variation.
Many models will also suffer from overfitting (a closely linked concept), but at acceptable levels of overfitting, there may still be high levels of instability in individual risk. Stability of risk estimates should be a criterion when determining the minimum sample size to develop models.
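
The resampling idea in the abstract above (derive a model on many samples of size N, then take the 5–95th percentile range of the resulting risks) can be sketched in a heavily simplified form. The sketch below is an assumption-laden toy: the "model" is just the observed event proportion in a simulated sample, not a cardiovascular risk regression, and `simulate_instability`, the true risk of 10% and the sample sizes are all hypothetical.

```python
import random
import statistics


def percentile_range(values, lower=5, upper=95):
    """Width of the lower-to-upper percentile range of a list of values."""
    cuts = statistics.quantiles(values, n=100)  # 99 percentile cut points
    return cuts[upper - 1] - cuts[lower - 1]


def simulate_instability(true_risk, sample_size, n_models=200, seed=0):
    """Repeatedly draw a derivation sample, 'fit' a toy model (the observed
    event proportion), and summarise how much the estimated risk varies."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_models):
        events = sum(rng.random() < true_risk for _ in range(sample_size))
        estimates.append(events / sample_size)
    return percentile_range(estimates)


# Smaller derivation samples should give a wider 5-95th percentile range.
for n in (500, 5000):
    print(n, round(simulate_instability(0.10, n), 4))
```

Even in this toy setting, the range narrows as the derivation sample grows, which mirrors the direction of the results reported in the abstract without reproducing its numbers.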

