A framework for meta-analysis of prediction model studies with binary and time-to-event outcomes

2018 ◽  
Vol 28 (9) ◽  
pp. 2768-2786 ◽  
Author(s):  
Thomas PA Debray ◽  
Johanna AAG Damen ◽  
Richard D Riley ◽  
Kym Snell ◽  
Johannes B Reitsma ◽  
...  

It is widely recommended that any developed prediction model, diagnostic or prognostic, is externally validated in terms of its predictive performance, as measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings and reveals under which circumstances the model performs suboptimally and may need adjustment. We discuss how to undertake meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome. We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically based prior distributions to improve estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed R package “metamisc”.
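The pooling step described here (summarizing c-statistics across validation studies) is conventionally done on the logit scale with a random-effects model. A minimal Python sketch of DerSimonian-Laird pooling of logit-transformed c-statistics, illustrative only and not the metamisc implementation:

```python
import math

def pool_logit_cstat(c, se_logit):
    """DerSimonian-Laird random-effects pooling of logit-transformed
    c-statistics; returns the summary c-statistic and tau^2."""
    y = [math.log(ci / (1 - ci)) for ci in c]      # logit transform
    w = [1 / s**2 for s in se_logit]               # fixed-effect weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_fe)**2 for wi, yi in zip(w, y))
    c_dl = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(y) - 1)) / c_dl)     # DL estimator, truncated at 0
    w_re = [1 / (s**2 + tau2) for s in se_logit]   # random-effects weights
    mu_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return 1 / (1 + math.exp(-mu_re)), tau2        # back-transform to c scale

# e.g. pool_logit_cstat([0.72, 0.68, 0.75], [0.05, 0.06, 0.04])
```

The logit transform keeps the summary estimate inside (0, 1) and makes the normality assumption of the random-effects model more plausible.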

2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
A Youssef

Abstract Study question Which models that predict pregnancy outcome in couples with unexplained RPL exist, and what is the performance of the most used model? Summary answer We identified seven prediction models; none followed the recommended prediction model development steps. Moreover, the most used model showed poor predictive performance. What is known already RPL remains unexplained in 50–75% of couples. For these couples, there is no effective treatment option and clinical management rests on supportive care. An essential part of supportive care consists of counselling on the prognosis of subsequent pregnancies. Multiple prediction models exist, but their quality and validity vary. In addition, the prediction model developed by Brigham et al. is the most widely used model, but it has never been externally validated. Study design, size, duration We performed a systematic review to identify prediction models for pregnancy outcome after unexplained RPL. In addition, we performed an external validation of the Brigham model in a retrospective cohort consisting of 668 couples with unexplained RPL that visited our RPL clinic between 2004 and 2019. Participants/materials, setting, methods A systematic search was performed in December 2020 in PubMed, Embase, Web of Science and the Cochrane Library to identify relevant studies. Eligible studies were selected and assessed according to the TRIPOD guidelines, covering topics on model performance and validation. The performance of the Brigham model in predicting live birth was evaluated through calibration and discrimination, in which the observed pregnancy rates were compared to the predicted pregnancy rates. Main results and the role of chance Seven models were compared and assessed according to the TRIPOD statement. This resulted in two studies of low, three of moderate and two of above-average reporting quality. 
These studies did not follow the recommended steps for model development and did not calculate a sample size. Furthermore, the predictive performance of none of these models was internally or externally validated. We performed an external validation of the Brigham model. Calibration showed overestimation by the model and too-extreme predictions, with a negative calibration intercept of −0.52 (95% CI −0.68 to −0.36) and a calibration slope of 0.39 (95% CI 0.07 to 0.71). The discriminative ability of the model was very low, with a concordance statistic of 0.55 (95% CI 0.50 to 0.59). Limitations, reasons for caution Not all studies explicitly named their models prediction models; therefore, models may have been missed in the selection process. The external validation cohort used a retrospective design, in which only the first pregnancy after intake was registered. Follow-up time was not limited, which is important in counselling couples with unexplained RPL. Wider implications of the findings Currently, there are no suitable models that predict pregnancy outcome after RPL. Moreover, we need a model with several variables, such that the prognosis is individualized, and with factors from both the female and the male partner to enable a couple-specific prognosis. Trial registration number Not applicable
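The concordance (c) statistic used here to quantify the Brigham model's discrimination can be computed directly from predicted risks and observed outcomes. A minimal Python sketch, illustrative only and not the study's code:

```python
def c_statistic(p, y):
    """Concordance (c) statistic for binary outcomes: the probability that
    a randomly chosen event receives a higher predicted risk than a
    randomly chosen non-event (ties count as 0.5)."""
    events = [pi for pi, yi in zip(p, y) if yi == 1]
    nonevents = [pi for pi, yi in zip(p, y) if yi == 0]
    pairs = concordant = 0.0
    for e in events:
        for ne in nonevents:
            pairs += 1
            if e > ne:
                concordant += 1
            elif e == ne:
                concordant += 0.5
    return concordant / pairs
```

A value of 0.55, as reported for the Brigham model, means the model ranks an event above a non-event barely more often than chance (0.5).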


Water ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 2516 ◽  
Author(s):  
Changhyun Choi ◽  
Jeonghwan Kim ◽  
Jungwook Kim ◽  
Hung Soo Kim

Adequate forecasting and preparation for heavy rain can minimize damage to life and property. Some studies have been conducted on heavy rain damage prediction models (HDPM); however, most of these models are limited to linear regression models that simply describe the linear relation between rainfall data and damage. This study develops a combined heavy rain damage prediction model (CHDPM) in which a residual prediction model (RPM) is added to the HDPM. The predictive performance of the CHDPM is found to be 4–14% higher than that of the HDPM. Through this, we confirmed that the predictive performance of the model is improved by adding a machine-learning RPM to complement the linearity of the HDPM. The results of this study can be used as basic data for natural disaster management.
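The combined-model idea (a residual prediction model layered on a base regression) can be sketched generically. The abstract does not specify the machine-learning method, so the residual learner below is a hypothetical k-nearest-neighbour stand-in, not the paper's actual RPM:

```python
def fit_linear(x, y):
    """Ordinary least squares for a single predictor: y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return lambda xi: a + b * xi

def fit_residual_knn(x, resid, k=2):
    """Residual prediction model: average residual of the k nearest
    training points (a toy stand-in for the machine-learning RPM)."""
    def predict(xi):
        nearest = sorted(zip(x, resid), key=lambda t: abs(t[0] - xi))[:k]
        return sum(r for _, r in nearest) / k
    return predict

def combined_model(x, y, k=2):
    """HDPM + RPM: linear base model corrected by predicted residuals."""
    base = fit_linear(x, y)
    resid = [yi - base(xi) for xi, yi in zip(x, y)]
    rpm = fit_residual_knn(x, resid, k)
    return lambda xi: base(xi) + rpm(xi)
```

The base model captures the linear trend; the residual learner mops up whatever systematic nonlinearity the linear fit leaves behind.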


BMJ ◽  
2019 ◽  
pp. l4293 ◽  
Author(s):  
Mohammed T Hudda ◽  
Mary S Fewtrell ◽  
Dalia Haroun ◽  
Sooky Lum ◽  
Jane E Williams ◽  
...  

Abstract Objectives To develop and validate a prediction model for fat mass in children aged 4-15 years using routinely available risk factors of height, weight, and demographic information without the need for more complex forms of assessment. Design Individual participant data meta-analysis. Setting Four population based cross sectional studies and a fifth study for external validation, United Kingdom. Participants A pooled derivation dataset (four studies) of 2375 children and an external validation dataset of 176 children with complete data on anthropometric measurements and deuterium dilution assessments of fat mass. Main outcome measure Multivariable linear regression analysis, using backwards selection for inclusion of predictor variables and allowing non-linear relations, was used to develop a prediction model for fat-free mass (and subsequently fat mass by subtracting resulting estimates from weight) based on the four studies. Internal validation and then internal-external cross validation were used to examine overfitting and generalisability of the model’s predictive performance within the four development studies; external validation followed using the fifth dataset. Results Model derivation was based on a multi-ethnic population of 2375 children (47.8% boys, n=1136) aged 4-15 years. The final model containing predictor variables of height, weight, age, sex, and ethnicity had extremely high predictive ability (optimism-adjusted R²: 94.8%, 95% confidence interval 94.4% to 95.2%) with excellent calibration of observed and predicted values. The internal validation showed minimal overfitting and good model generalisability, with excellent calibration and predictive performance. External validation in 176 children aged 11-12 years showed promising generalisability of the model (R²: 90.0%, 95% confidence interval 87.2% to 92.8%) with good calibration of observed and predicted fat mass (slope: 1.02, 95% confidence interval 0.97 to 1.07). 
The mean difference between observed and predicted fat mass was −1.29 kg (95% confidence interval −1.62 to −0.96 kg). Conclusion The developed model accurately predicted levels of fat mass in children aged 4-15 years. The prediction model is based on simple anthropometric measures without the need for more complex forms of assessment and could improve the accuracy of assessments for body fatness in children (compared with those provided by body mass index) for effective surveillance, prevention, and management of clinical and public health obesity.


2019 ◽  
Vol 105 (5) ◽  
pp. 439-445 ◽  
Author(s):  
Bob Phillips ◽  
Jessica Elizabeth Morgan ◽  
Gabrielle M Haeusler ◽  
Richard D Riley

Background Risk-stratified approaches to managing cancer therapies and their consequent complications rely on accurate predictions to work effectively. The risk-stratified management of fever with neutropenia is one such very common area of management in paediatric practice. Such rules are frequently produced and promoted without adequate confirmation of their accuracy. Methods An individual participant data meta-analytic validation of the ‘Predicting Infectious ComplicatioNs In Children with Cancer’ (PICNICC) prediction model for microbiologically documented infection in paediatric fever with neutropenia was undertaken. Pooled estimates were produced using random-effects meta-analysis of the area under the receiver operating characteristic curve (AUC-ROC), calibration slope and ratio of expected versus observed cases (E/O). Results The PICNICC model was poorly predictive of microbiologically documented infection (MDI) in these validation cohorts. The pooled AUC-ROC was 0.59 (95% CI 0.41 to 0.78, tau²=0), compared with a derivation value of 0.72 (95% CI 0.71 to 0.76). Calibration was also poor, with a pooled calibration slope of 0.03 (95% CI −0.19 to 0.26) and pooled E/O ratio of 1.48 (95% CI 0.87 to 2.1). Three different simple recalibration approaches failed to improve performance meaningfully. Conclusion This meta-analysis shows the PICNICC model should not be used at admission to predict MDI. Further work should focus on validating alternative prediction models. Validation across multiple cohorts from diverse locations is essential before widespread clinical adoption of such rules, to avoid overtreating or undertreating children with fever with neutropenia.
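One of the simple recalibration approaches commonly tried in this situation is recalibration-in-the-large, which shifts the model's linear predictor by a single intercept so the mean predicted risk matches the observed event rate. A minimal sketch, not the study's code:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(x):
    return 1 / (1 + math.exp(-x))

def recalibrate_in_the_large(p_pred, y_obs):
    """Find the intercept shift that makes the mean predicted risk
    equal the observed event rate (bisection on the shift)."""
    target = sum(y_obs) / len(y_obs)             # observed event rate
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        mean_p = sum(expit(logit(p) + mid) for p in p_pred) / len(p_pred)
        if mean_p < target:                      # mean risk increases with shift
            lo = mid
        else:
            hi = mid
    return mid
```

Such a shift fixes calibration-in-the-large but cannot repair a poor calibration slope or poor discrimination, which is consistent with recalibration failing to rescue the model here.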


2021 ◽  
Author(s):  
Donald Ray Williams ◽  
Josue E. Rodriguez ◽  
Paul - Christian Bürkner

We shed much-needed light on a critical assumption that is often overlooked, or not considered at all, in random-effects meta-analysis: namely, that between-study variance is constant across all studies, which implies they are drawn from the same population. Yet it is not hard to imagine a situation where there are several populations of studies rather than one, perhaps differing in their between-study variance (i.e., heteroskedasticity). The objective is then to make inference given that heterogeneity itself varies. There is an immediate problem, however, in that modeling heterogeneous variance components is not straightforward to do in a general way. To this end, we propose novel methodology, termed Bayesian location-scale meta-analysis, that can accommodate moderators for both the overall effect (location) and the between-study variance (scale). After introducing the model, we extend heterogeneity statistics, prediction intervals, and hierarchical shrinkage, all of which customarily assume constant heterogeneity, to allow for variations therein. With these new tools in hand, we demonstrate that quite literally everything changes when between-study variance is not constant across studies. The changes were not small and easily passed the interocular trauma test: the importance hits right between the eyes. Examples include (but are not limited to) inference on the overall effect, a compromised predictive distribution, and improper shrinkage of the study-specific effects. Further, we provide an illustrative example where heterogeneity was not considered a mere nuisance, showing that modeling variance for its own sake can provide unique inferences, in this case into discrimination across nine countries. The discussion includes several ideas for future research. We have implemented the proposed methodology in the R package blsmeta.
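The core of a location-scale model is that each study gets its own between-study variance, driven by a scale moderator. A minimal sketch of the resulting weighting, with the scale coefficients supplied rather than estimated (blsmeta estimates them in a Bayesian framework; this is only an illustration of the structure):

```python
import math

def location_scale_fit(y, se, z, g0, g1):
    """Weighted estimate of the overall effect when between-study variance
    varies with a scale moderator z: log(tau_i^2) = g0 + g1 * z_i.
    (g0, g1 would normally be estimated from the data.)"""
    tau2 = [math.exp(g0 + g1 * zi) for zi in z]    # study-specific tau^2
    w = [1 / (s**2 + t) for s, t in zip(se, tau2)] # inverse-variance weights
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return mu, tau2
```

With g1 = 0 this reduces to the usual constant-heterogeneity random-effects estimate; with g1 != 0, studies in the high-heterogeneity subpopulation are down-weighted, which is why the overall effect, prediction intervals, and shrinkage all change.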


F1000Research ◽  
2019 ◽  
Vol 7 ◽  
pp. 1505 ◽  
Author(s):  
Michail Tsagris ◽  
Ioannis Tsamardinos

Feature (or variable) selection is the process of identifying the minimal set of features with the highest predictive performance on the target variable of interest. Numerous feature selection algorithms have been developed over the years, but only a few have been implemented in R and made publicly available as R packages, and those offer few options. The R package MXM offers a variety of feature selection algorithms and has unique features that make it advantageous over its competitors: a) it contains feature selection algorithms that can treat numerous types of target variables, including continuous, percentages, time-to-event (survival), binary, nominal, ordinal, clustered, counts, and left-censored, among others; b) it contains a variety of regression models that can be plugged into the feature selection algorithms (for example, with time-to-event data the user can choose among Cox, Weibull, log-logistic or exponential regression); c) it includes an algorithm for detecting multiple solutions (many sets of statistically equivalent features; in plain terms, two features can carry statistically equivalent information when substituting one for the other does not affect the inference or the conclusions); and d) it includes memory-efficient algorithms for high-volume data that cannot be loaded into R (on a machine with 16 GB of RAM, for example, R cannot directly load a 16 GB dataset; by utilizing the proper package, we load the data and then perform feature selection). In this paper, we qualitatively compare MXM with other relevant feature selection packages and discuss its advantages and disadvantages. Further, we provide a demonstration of MXM’s algorithms using real high-dimensional data from various applications.
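The multiple-solutions idea in point c) can be illustrated with a toy selector that reports every feature whose score ties with the best. This is a simplified stand-in for the concept, not MXM's actual algorithm:

```python
def corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def select_with_equivalences(features, target, eps=1e-6):
    """Pick the best feature by |correlation| with the target, and report
    every feature whose score ties within eps as statistically equivalent."""
    scores = {name: abs(corr(col, target)) for name, col in features.items()}
    best = max(scores.values())
    return sorted(name for name, s in scores.items() if best - s <= eps)
```

Returning the full equivalence set instead of an arbitrary single winner is what lets downstream users see that several feature sets carry the same information.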


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Kaci L Pickett ◽  
Krithika Suresh ◽  
Kristen R Campbell ◽  
Scott Davis ◽  
Elizabeth Juarez-Colunga

Abstract Background Risk prediction models for time-to-event outcomes play a vital role in personalized decision-making. A patient’s biomarker values, such as medical lab results, are often measured over time but traditional prediction models ignore their longitudinal nature, using only baseline information. Dynamic prediction incorporates longitudinal information to produce updated survival predictions during follow-up. Existing methods for dynamic prediction include joint modeling, which often suffers from computational complexity and poor performance under misspecification, and landmarking, which has a straightforward implementation but typically relies on a proportional hazards model. Random survival forests (RSF), a machine learning algorithm for time-to-event outcomes, can capture complex relationships between the predictors and survival without requiring prior specification and has been shown to have superior predictive performance. Methods We propose an alternative approach for dynamic prediction using random survival forests in a landmarking framework. With a simulation study, we compared the predictive performance of our proposed method with Cox landmarking and joint modeling in situations where the proportional hazards assumption does not hold and the longitudinal marker(s) have a complex relationship with the survival outcome. We illustrated the use of the RSF landmark approach in two clinical applications to assess the performance of various RSF model building decisions and to demonstrate its use in obtaining dynamic predictions. Results In simulation studies, RSF landmarking outperformed joint modeling and Cox landmarking when a complex relationship between the survival and longitudinal marker processes was present. It was also useful in application when there were several predictors for which the clinical relevance was unknown and multiple longitudinal biomarkers were present. 
Individualized dynamic predictions can be obtained from this method, and the variable importance metric is useful for examining the changing predictive power of variables over time. In addition, RSF landmarking is easily implementable in standard software and, using the suggested specifications, requires less computation time than joint modeling. Conclusions RSF landmarking is a nonparametric, machine-learning alternative to current methods for obtaining dynamic predictions when complex or unknown relationships are present. It requires little upfront decision-making, has comparable predictive performance, and has preferable computational speed.
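The landmarking framework underlying the proposed method starts by constructing a landmark dataset: subjects still at risk at the landmark time, their most recent biomarker value carried forward, and follow-up administratively censored at the prediction horizon. A minimal sketch of that construction (illustrative, not the authors' code):

```python
def build_landmark_data(subjects, landmark, horizon):
    """Construct a landmark dataset: keep subjects still at risk at the
    landmark time, carry forward their most recent biomarker value (LOCF),
    and administratively censor follow-up at landmark + horizon."""
    rows = []
    for s in subjects:
        if s["time"] <= landmark:        # event/censoring before landmark
            continue
        # last biomarker measurement at or before the landmark
        past = [v for t, v in s["marker"] if t <= landmark]
        if not past:
            continue
        time = min(s["time"], landmark + horizon)
        event = s["event"] if s["time"] <= landmark + horizon else 0
        rows.append({"id": s["id"], "marker": past[-1],
                     "time": time, "event": event})
    return rows
```

A survival model, here a random survival forest, is then fit to this dataset; repeating the construction over a grid of landmark times is what yields predictions that update as new biomarker values arrive.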


GigaScience ◽  
2020 ◽  
Vol 9 (10) ◽  
Author(s):  
Magali Jaillard ◽  
Mattia Palmieri ◽  
Alex van Belkum ◽  
Pierre Mahé

Abstract Background Recent years have witnessed the development of several k-mer–based approaches aiming to predict phenotypic traits of bacteria on the basis of their whole-genome sequences. While often convincing in terms of predictive performance, the underlying models are in general not straightforward to interpret, the interplay between the actual genetic determinant and its translation as k-mers being generally hard to decipher. Results We propose a simple and computationally efficient strategy allowing one to cope with the high correlation inherent to k-mer–based representations in supervised machine learning models, leading to concise and easily interpretable signatures. We demonstrate the benefit of this approach on the task of predicting the antibiotic resistance profile of a Klebsiella pneumoniae strain from its genome, where our method leads to signatures defined as weighted linear combinations of genetic elements that can easily be identified as genuine antibiotic resistance determinants, with state-of-the-art predictive performance. Conclusions By enhancing the interpretability of genomic k-mer–based antibiotic resistance prediction models, our approach improves their clinical utility and hence will facilitate their adoption in routine diagnostics by clinicians and microbiologists. While antibiotic resistance was the motivating application, the method is generic and can be transposed to any other bacterial trait. An R package implementing our method is available at https://gitlab.com/biomerieux-data-science/clustlasso.
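The correlation described here is extreme in k-mer representations: many k-mers share an identical presence/absence pattern across strains. A simplified illustration of collapsing such perfectly correlated columns before model fitting (the clustlasso approach also handles near-perfect correlation; this toy version only groups exact duplicates):

```python
def collapse_identical_kmers(kmer_matrix):
    """Group k-mers with identical presence/absence patterns across strains
    into a single representative column, reducing the perfectly correlated
    features typical of k-mer representations."""
    groups = {}
    for kmer, pattern in kmer_matrix.items():
        groups.setdefault(tuple(pattern), []).append(kmer)
    # representative = lexicographically smallest k-mer in each group
    return {min(kmers): sorted(kmers) for kmers in groups.values()}
```

Fitting a sparse model on the collapsed columns yields one weight per group rather than an arbitrary split across duplicated k-mers, which is what makes the resulting signature easy to map back to a genetic determinant.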


BMJ Open ◽  
2019 ◽  
Vol 9 (8) ◽  
pp. e025579 ◽  
Author(s):  
Mohammad Ziaul Islam Chowdhury ◽  
Fahmida Yeasmin ◽  
Doreen M Rabi ◽  
Paul E Ronksley ◽  
Tanvir C Turin

Objective Stroke is a major cause of disability and death worldwide. People with diabetes are at a twofold to fivefold increased risk of stroke compared with people without diabetes. This study systematically reviews the literature on available stroke prediction models specifically developed or validated in patients with diabetes and assesses their predictive performance through meta-analysis. Design Systematic review and meta-analysis. Data sources A detailed search was performed in MEDLINE, PubMed and EMBASE (from inception to 22 April 2019) to identify studies describing stroke prediction models. Eligibility criteria All studies that developed stroke prediction models in populations with diabetes were included. Data extraction and synthesis Two reviewers independently identified eligible articles and extracted data. Random-effects meta-analysis was used to obtain a pooled C-statistic. Results Our search retrieved 26 202 relevant papers and finally yielded 38 stroke prediction models, of which 34 were specifically developed for patients with diabetes and 4 were developed in general populations but validated in patients with diabetes. Among the models developed in those with diabetes, 9 reported stroke as their outcome, 23 reported a composite cardiovascular disease (CVD) outcome of which stroke was a component, and 2 did not initially report stroke as their outcome but were later validated for stroke as the outcome in other studies. C-statistics varied from 0.60 to 0.92, with a median C-statistic of 0.71 (for stroke as the outcome) and 0.70 (for stroke as part of a composite CVD outcome). Seventeen models were externally validated in diabetes populations, with a pooled C-statistic of 0.68. Conclusions Overall, the performance of these diabetes-specific stroke prediction models was not satisfactory. Research is needed to identify and incorporate new risk factors to improve the models’ predictive ability, and further external validation of the existing models in diverse populations is needed to improve generalisability.


2021 ◽  
Author(s):  
Constanza L Andaur Navarro ◽  
Johanna AA Damen ◽  
Toshihiko Takada ◽  
Steven WJ Nijman ◽  
Paula Dhiman ◽  
...  

ABSTRACT Objective. While many studies have consistently found incomplete reporting of regression-based prediction model studies, evidence is lacking for machine learning-based prediction model studies. Our aim is to systematically review the adherence of Machine Learning (ML)-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement. Study design and setting: We included articles reporting on development or external validation of a multivariable prediction model (either diagnostic or prognostic) developed using supervised ML for individualized predictions across all medical fields (PROSPERO, CRD42019161764). We searched PubMed from 1 January 2018 to 31 December 2019. Data extraction was performed using the 22-item checklist for reporting of prediction model studies (www.TRIPOD-statement.org). We measured the overall adherence per article and per TRIPOD item. Results: Our search identified 24 814 articles, of which 152 articles were included: 94 (61.8%) prognostic and 58 (38.2%) diagnostic prediction model studies. Overall, articles adhered to a median of 38.7% (IQR 31.0-46.4) of TRIPOD items. No articles fully adhered to complete reporting of the abstract and very few reported the flow of participants (3.9%, 95% CI 1.8 to 8.3), appropriate title (4.6%, 95% CI 2.2 to 9.2), blinding of predictors (4.6%, 95% CI 2.2 to 9.2), model specification (5.2%, 95% CI 2.4 to 10.8), and model's predictive performance (5.9%, 95% CI 3.1 to 10.9). There was often complete reporting of source of data (98.0%, 95% CI 94.4 to 99.3) and interpretation of the results (94.7%, 95% CI 90.0 to 97.3). Conclusion. Similar to studies using conventional statistical techniques, the completeness of reporting is poor. Essential information to decide to use the model (i.e. model specification and its performance) is rarely reported. 
However, some items and sub-items of TRIPOD might be less suitable for ML-based prediction model studies and thus, TRIPOD requires extensions. Overall, there is an urgent need to improve the reporting quality and usability of research to avoid research waste.
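The per-item confidence intervals reported above are of the kind produced by a score interval for a proportion. As an illustration, a Wilson interval for 6 of 152 articles (an assumed count consistent with the reported 3.9%; the abstract does not state the exact numerator) reproduces the reported 1.8% to 8.3% for the flow-of-participants item:

```python
import math

def wilson_ci(k, n, z=1.96):
    """Wilson score interval for a proportion k/n at confidence level
    corresponding to z (z=1.96 gives approximately 95%)."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half
```

Unlike the simple Wald interval, the Wilson interval behaves sensibly for the small proportions that dominate these adherence results (it never extends below zero).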

