Re-evaluation of the comparative effectiveness of bootstrap-based optimism correction methods in the development of multivariable clinical prediction models

2021 · Vol 21 (1)
Author(s): Katsuhiro Iba, Tomohiro Shinozaki, Kazushi Maruo, Hisashi Noma

Abstract

Background: Multivariable prediction models are important statistical tools for providing synthetic diagnostic and prognostic algorithms based on patients' multiple characteristics. Their apparent measures of predictive accuracy usually carry overestimation biases (known as 'optimism') relative to their actual performance in external populations. Existing statistical evidence and guidelines suggest that three bootstrap-based bias correction methods are preferable in practice, namely Harrell's bias correction and the .632 and .632+ estimators. Although Harrell's method has been widely adopted in clinical studies, simulation-based evidence indicates that the .632+ estimator may perform better than the other two methods. However, the actual comparative effectiveness of these methods is still unclear owing to limited numerical evidence.

Methods: We conducted extensive simulation studies to compare the effectiveness of these three bootstrap methods, in particular under various model-building strategies: conventional logistic regression, stepwise variable selection, Firth's penalized likelihood method, and ridge, lasso, and elastic-net regression. We generated the simulation data based on the Global Utilization of Streptokinase and Tissue plasminogen activator for Occluded coronary arteries (GUSTO-I) trial Western dataset and considered how the events per variable, event fraction, number of candidate predictors, and regression coefficients of the predictors affected performance. The internal validity of C-statistics was evaluated.

Results: Under relatively large sample settings (roughly, events per variable ≥ 10), the three bootstrap-based methods were comparable and performed well. However, all three methods were biased under small sample settings, and the directions and sizes of the biases were inconsistent. In general, Harrell's method and the .632 estimator had overestimation biases as the event fraction became larger, while the .632+ estimator had a slight underestimation bias when the event fraction was very small. Although the bias of the .632+ estimator was relatively small, its root mean squared error (RMSE) was comparable to, or sometimes larger than, those of the other two methods, especially for the regularized estimation methods.

Conclusions: In general, the three bootstrap estimators were comparable, but the .632+ estimator performed relatively well under small sample settings, except when the regularized estimation methods were adopted.
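
As a rough illustration of the optimism-correction idea compared here, the following is a minimal sketch of Harrell's bootstrap bias correction for the C-statistic, assuming a logistic regression model and scikit-learn; the .632 and .632+ estimators differ in how they weight out-of-sample performance, and this sketch is not the authors' simulation code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def harrell_corrected_auc(X, y, n_boot=200, seed=0):
    """Optimism-corrected C-statistic via Harrell's bootstrap."""
    rng = np.random.default_rng(seed)
    model = LogisticRegression(max_iter=1000).fit(X, y)
    apparent = roc_auc_score(y, model.predict_proba(X)[:, 1])
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))            # resample with replacement
        Xb, yb = X[idx], y[idx]
        if yb.min() == yb.max():                         # skip resamples with one class
            continue
        mb = LogisticRegression(max_iter=1000).fit(Xb, yb)
        auc_boot = roc_auc_score(yb, mb.predict_proba(Xb)[:, 1])  # apparent AUC in resample
        auc_test = roc_auc_score(y, mb.predict_proba(X)[:, 1])    # same model on original data
        optimism.append(auc_boot - auc_test)
    return apparent - np.mean(optimism)                  # bias-corrected estimate
```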

2017 · Vol 47 (7) · pp. 1163-1178
Author(s): E. Studerus, A. Ramyead, A. Riecher-Rössler

Background: To enhance indicated prevention in patients at clinical high risk (CHR) for psychosis, recent research efforts have increasingly been directed towards estimating the risk of developing psychosis at the individual level using multivariable clinical prediction models. The aim of this study was to systematically review the methodological quality and reporting of studies developing or validating such models.

Method: A systematic literature search was carried out (up to 14 March 2016) to find all studies that developed or validated a clinical prediction model predicting the transition to psychosis in CHR patients. Data were extracted using a comprehensive item list based on current methodological recommendations.

Results: A total of 91 studies met the inclusion criteria. None of the retrieved studies performed a true external validation of an existing model. Only three studies (3.5%) had an events-per-variable ratio of at least 10, which is the recommended minimum to avoid overfitting. Internal validation was performed in only 14 studies (15%), and seven of these used biased internal validation strategies. Other frequently observed modeling approaches not recommended by methodologists included univariable screening of candidate predictors, stepwise variable selection, categorization of continuous variables, and poor handling and reporting of missing data.

Conclusions: Our systematic review revealed that poor methods and reporting are widespread in psychosis prediction research. Since most studies relied on small sample sizes, did not perform internal or external cross-validation, and used poor model development strategies, most published models are probably overfitted and their reported predictive accuracy is likely to be overoptimistic.
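
To make the events-per-variable (EPV) criterion above concrete, here is a toy calculation, assuming a binary transition outcome and a fixed pool of candidate predictors (the numbers are hypothetical):

```python
import numpy as np

def events_per_variable(y, n_candidate_predictors):
    # EPV uses the count of the rarer outcome class
    events = int(min(np.sum(y), len(y) - np.sum(y)))
    return events / n_candidate_predictors

y = np.array([0] * 80 + [1] * 20)    # hypothetical CHR cohort: 20 transitions in 100 patients
print(events_per_variable(y, 8))     # 2.5 -> far below the recommended EPV >= 10
```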


2020 · Vol 26 (33) · pp. 4195-4205
Author(s): Xiaoyu Ding, Chen Cui, Dingyan Wang, Jihui Zhao, Mingyue Zheng, ...

Background: Enhancing a compound's biological activity is the central task of lead optimization in small-molecule drug discovery. However, it is laborious to perform many iterative rounds of compound synthesis and bioactivity testing. To address this issue, high-quality in silico bioactivity prediction approaches are in great demand, to prioritize the more active compound derivatives and reduce the trial-and-error process.

Methods: Two kinds of bioactivity prediction models based on a large-scale structure-activity relationship (SAR) database were constructed. The first is based on the similarity of substituents and realized by matched molecular pair analysis, including SA, SA_BR, SR, and SR_BR. The second is based on SAR transferability and realized by matched molecular series analysis, including Single MMS pair, Full MMS series, and Multi single MMS pairs. We also defined the applicability domain of the models using a distance-based threshold.

Results: Among the seven individual models, the Multi single MMS pairs bioactivity prediction model showed the best performance (R2 = 0.828, MAE = 0.406, RMSE = 0.591), and the baseline model (SA) produced the lowest prediction accuracy (R2 = 0.798, MAE = 0.446, RMSE = 0.637). The predictive accuracy could be further improved by consensus modeling (R2 = 0.842, MAE = 0.397, RMSE = 0.563).

Conclusion: An accurate prediction model for bioactivity was built with a consensus method, which was superior to all individual models. Our model should be a valuable tool for lead optimization.
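
The consensus step can be illustrated with a short sketch that simply averages the predictions of individual models and scores the result; the model names, values, and pIC50 scale here are hypothetical placeholders, not the authors' MMP/MMS implementations:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def consensus_predict(predictions):
    """predictions: dict mapping model name -> array of predicted activities."""
    return np.mean(np.column_stack(list(predictions.values())), axis=1)

y_true = np.array([6.1, 7.3, 5.8, 8.0])                      # observed pIC50 values
preds = {"SA":               np.array([5.9, 7.0, 6.2, 7.6]),
         "Multi_single_MMS": np.array([6.2, 7.4, 5.6, 8.1])}
y_cons = consensus_predict(preds)
print(r2_score(y_true, y_cons),
      mean_absolute_error(y_true, y_cons),
      np.sqrt(mean_squared_error(y_true, y_cons)))
```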


2021 · pp. 1-27
Author(s): Brandon de la Cuesta, Naoki Egami, Kosuke Imai

Abstract

Conjoint analysis has become popular among social scientists for measuring multidimensional preferences. When analyzing such experiments, researchers often focus on the average marginal component effect (AMCE), which represents the causal effect of a single profile attribute while averaging over the remaining attributes. What has been overlooked, however, is the fact that the AMCE critically relies upon the distribution of the other attributes used for the averaging. Although most experiments employ the uniform distribution, which weights each profile equally, both the actual distribution of profiles in the real world and the distribution of theoretical interest are often far from uniform. This mismatch can severely compromise the external validity of conjoint analysis. We empirically demonstrate that estimates of the AMCE can be substantially different when averaging over the target profile distribution instead of the uniform one. We propose new experimental designs and estimation methods that incorporate substantive knowledge about the profile distribution. We illustrate our methodology through two empirical applications, one using a real-world distribution and the other based on a counterfactual distribution motivated by a theoretical consideration. The proposed methodology is implemented in an open-source software package.
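
The reweighting idea can be sketched as follows: each profile is weighted by the ratio of its target probability to its uniform design probability before the AMCE difference in means is taken. The attribute names, target shares, and simple two-attribute design below are hypothetical; the authors' open-source package implements the general designs and estimators.

```python
import numpy as np
import pandas as pd

def weighted_amce(df, attr, level, baseline, other_attr, target_pmf):
    # weight = target probability / uniform design probability of the other attribute
    n_levels = df[other_attr].nunique()
    w = df[other_attr].map(target_pmf) * n_levels
    treated = df[df[attr] == level]
    control = df[df[attr] == baseline]
    return (np.average(treated["y"], weights=w[treated.index])
            - np.average(control["y"], weights=w[control.index]))

rng = np.random.default_rng(1)
df = pd.DataFrame({"gender": ["F", "M"] * 200,          # uniform design attribute
                   "party":  rng.choice(["D", "R"], 400),
                   "y":      rng.integers(0, 2, 400)})  # profile chosen or not
print(weighted_amce(df, "gender", "F", "M", "party", {"D": 0.7, "R": 0.3}))
```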


Mathematics · 2021 · Vol 9 (16) · pp. 1850
Author(s): Rashad A. R. Bantan, Farrukh Jamal, Christophe Chesneau, Mohammed Elgarhy

Unit distributions are commonly used in probability and statistics to describe quantities with values between 0 and 1, such as proportions, probabilities, and percentages. Some unit distributions are defined in a natural analytical manner, while others are derived through the transformation of an existing distribution defined on a larger domain. In this article, we introduce the unit gamma/Gompertz distribution, founded on the inverse-exponential scheme and the gamma/Gompertz distribution. The gamma/Gompertz distribution is known to be a very flexible three-parameter lifetime distribution, and we aim to transpose this flexibility to the unit interval. First, we verify this aspect through the analytical behavior of the primary functions. It is shown that the probability density function can be increasing, decreasing, "increasing-decreasing" and "decreasing-increasing", with pliant asymmetric properties. On the other hand, the hazard rate function has monotonically increasing, decreasing, or constant shapes. We complete the theoretical part with some propositions on stochastic ordering, moments, quantiles, and the reliability coefficient. Practically, the maximum likelihood method is used to estimate the model parameters from unit data, and we present simulation results to evaluate this method. Two applications using real data sets, one on trade shares and the other on flood levels, demonstrate the importance of the new model when compared to other unit models.
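
As a sketch of the estimation step, the following fits the model by maximum likelihood, assuming the "inverse-exponential scheme" means U = exp(-X) with X following the gamma/Gompertz law with pdf f(x) = b s β^s e^{bx} / (β - 1 + e^{bx})^{s+1}; this parameterization is an assumption and should be checked against the paper before use.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(log_params, u):
    b, s, beta = np.exp(log_params)        # log-parameterization keeps b, s, beta > 0
    x = -np.log(u)                         # invert u = exp(-x); x > 0 for u in (0, 1)
    log_fx = (np.log(b) + np.log(s) + s * np.log(beta) + b * x
              - (s + 1) * np.log(beta - 1.0 + np.exp(b * x)))
    return -np.sum(log_fx - np.log(u))     # change-of-variables Jacobian |dx/du| = 1/u

u = np.random.default_rng(0).uniform(0.05, 0.95, 200)   # placeholder unit data
fit = minimize(neg_loglik, x0=np.zeros(3), args=(u,), method="Nelder-Mead")
print("estimated (b, s, beta):", np.exp(fit.x))
```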


2021 · Vol 21 (1)
Author(s): Menelaos Pavlou, Gareth Ambler, Rumana Z. Omar

Abstract

Background: Clustered data arise in research when patients are clustered within larger units. Generalised Estimating Equations (GEE) and Generalised Linear Mixed Models (GLMM) can be used to provide marginal and cluster-specific inference and predictions, respectively.

Methods: Confounding by cluster (CBC) and informative cluster size (ICS) are two complications that may arise when modelling clustered data. CBC can arise when the distribution of a predictor variable (termed 'exposure') varies between clusters, causing confounding of the exposure-outcome relationship. ICS means that the cluster size conditional on covariates is not independent of the outcome. In both situations, standard GEE and GLMM may provide biased or misleading inference, and modifications have been proposed. However, both CBC and ICS are routinely overlooked in the context of risk prediction, and their impact on the predictive ability of models has been little explored. We study the effect of CBC and ICS on the predictive ability of risk models for binary outcomes when GEE and GLMM are used. We examine whether two simple approaches to handle CBC and ICS, which involve adjusting for the cluster mean of the exposure and the cluster size, respectively, can improve the accuracy of predictions.

Results: Both CBC and ICS can be viewed as violations of the assumptions of the standard GLMM: the random effects are correlated with the exposure under CBC and with the cluster size under ICS. Based on these principles, we simulated data subject to CBC/ICS. The simulation studies suggested that the predictive ability of models derived using standard GLMM and GEE while ignoring CBC/ICS was affected: marginal predictions were found to be miscalibrated. Adjusting for the cluster mean of the exposure or the cluster size improved the calibration, discrimination, and overall predictive accuracy of marginal predictions by explaining part of the between-cluster variability. The presence of CBC/ICS did not affect the accuracy of conditional predictions. We illustrate these concepts using real data from a multicentre study with potential CBC.

Conclusion: Ignoring CBC and ICS when developing prediction models for clustered data can affect the accuracy of marginal predictions. Adjusting for the cluster mean of the exposure or the cluster size can improve the predictive accuracy of marginal predictions.
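
A minimal sketch of the adjustment strategy described above: a marginal logistic model fitted by GEE with the cluster mean of the exposure and the cluster size added as covariates. The data-generating step and variable names are hypothetical stand-ins, and statsmodels is assumed.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
sizes = rng.integers(5, 15, 40)                               # varying cluster sizes
df = pd.DataFrame({"cluster": np.repeat(np.arange(40), sizes)})
df["x"] = rng.normal(df["cluster"] % 3, 1.0)                  # exposure distribution varies by cluster
df["y"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 * df["x"] - 1.0))))
df["x_cmean"] = df.groupby("cluster")["x"].transform("mean")  # cluster mean of exposure (for CBC)
df["csize"] = df.groupby("cluster")["x"].transform("size")    # cluster size (for ICS)

gee = smf.gee("y ~ x + x_cmean + csize", groups="cluster", data=df,
              family=sm.families.Binomial(),
              cov_struct=sm.cov_struct.Exchangeable())
print(gee.fit().summary())
```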


2020 · Vol 2020 · pp. 1-11
Author(s): Ferhat Bingöl

Wind farm siting relies on in situ measurements and statistical analysis of the wind distribution. Current statistical methods are based on fitted distribution functions, of which the Weibull distribution is known to provide the best fit to the nature of the wind. It is relatively straightforward to parameterize wind resources with the Weibull function when the distribution fits what the function represents, but the estimation process becomes complicated when the wind distribution is diverse in terms of speed and direction. In this study, data from a 101 m meteorological mast were used to test several estimation methods. The available data display seasonal variations, with low wind speeds in different seasons and the effects of moderately complex surroundings. The results show that the maximum likelihood method is considerably more successful than the industry-standard WAsP method when diverse winds with a high proportion of low wind speeds occur.
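
For concreteness, a minimal sketch of the maximum likelihood fit of a Weibull distribution to wind speeds with SciPy; the data here are simulated placeholders rather than the 101 m mast measurements, and WAsP's moment-style fit is not reproduced.

```python
import numpy as np
from scipy import stats

# synthetic wind speeds [m/s]; shape k = 2, scale A = 8 is a typical coastal regime
wind = stats.weibull_min.rvs(2.0, scale=8.0, size=5000, random_state=0)

k, loc, A = stats.weibull_min.fit(wind, floc=0)   # MLE with the location pinned at 0
print(f"Weibull k = {k:.2f}, A = {A:.2f} m/s")
```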


2002 · Vol 53 (5) · pp. 869
Author(s): Richard McGarvey, Andrew H. Levings, Janet M. Matthews

The growth of Australian giant crabs, Pseudocarcinus gigas, has not previously been studied. A tagging program was undertaken in four Australian states where the species is subject to commercial exploitation. Fishers reported a recapture sample of 1372 females and 383 males from the commercial harvest, of which 190 females and 160 males had moulted at least once. Broad-scale modes of growth increment were readily identified and interpreted as 0, 1, and 2 moults during time at large. Single-moult increments were normally distributed for six of seven data sets. Moult increments were constant with length for males and declined slowly for three of four female data sets. Seasonality of moulting in South Australia was inferred from monthly proportions captured with newly moulted shells: female moulting peaked strongly in winter (June and July), while males moulted in summer (November and December). Intermoult period estimates for P. gigas varied from 3 to 4 years for juvenile males and females (80–120 mm carapace length, CL), with the time between moulting events lengthening rapidly to approximately seven years for females and four and a half years for males at the legal minimum length of 150 mm CL. New moulting growth estimation methods include a generalization of the anniversary method for estimating the intermoult period that uses (rather than rejects) most capture–recapture data, and a multiple-likelihood method for assigning recaptures to their most probable number of moults during time at large.
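
The moult-assignment step can be sketched as a simple likelihood comparison: each recapture's length increment is scored under 0, 1, or 2 moults using normal increment densities. The per-moult mean, increment SD, and measurement SD below are illustrative values, not the paper's estimates.

```python
import numpy as np
from scipy.stats import norm

def most_probable_moults(increment_mm, mean_inc=20.0, sd_inc=4.0, sd_meas=1.5):
    # likelihood of the observed increment under k = 0, 1, 2 moults
    liks = [norm.pdf(increment_mm, loc=k * mean_inc,
                     scale=np.sqrt(k * sd_inc**2 + sd_meas**2))
            for k in range(3)]
    return int(np.argmax(liks))

for inc in (1.0, 22.0, 41.0):
    print(inc, "mm ->", most_probable_moults(inc), "moult(s)")
```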


Author(s): Ting Bai, Fan Wu, Shuhan Yan, Feng Zhang, Xujuan Xu

Objectives: The aim of the study was to construct and evaluate a rat model of postpartum fatigue.

Design: This is an animal model-building study.

Methods: Sprague-Dawley rats on the 1st day after delivery were randomized into a control group and a fatigue group. The deep sleep of the rats was disrupted by forcing them to stand in water, making them experience mental and physical fatigue. To maintain milk secretion and lactation, dams and pups were caged together for 90 min after every 3 h of separation. The control group was separated routinely without any stimulus. The model was evaluated for mental and physical fatigue on the 8th and 15th days. Mental fatigue was evaluated by a water maze test and the rats' 5-hydroxytryptamine (5-HT) level in the hippocampus, while physical fatigue was evaluated using the lactic acid level in serum and the duration of weight-loaded forced swimming.

Results: In both the 7-day and 14-day modeling groups, compared with the control group, the success rate of water maze landing was significantly decreased, the time to water maze landing was significantly prolonged, and the 5-HT level in the hippocampus was significantly decreased in the fatigue group. With respect to physical fatigue, in both the 7-day and 14-day modeling groups, the serum lactic acid level in the fatigue group was significantly increased, and the duration of exhaustive swimming was significantly shortened.

Limitations: A small sample size was the main limitation of this study.

Conclusions: We successfully constructed a rat model of postpartum fatigue by forcing postpartum rats to stand in water, which imposed a level of stress similar to that contributing to the development of postpartum fatigue. Our model opens the door for future studies evaluating the effectiveness of pharmacological and behavioral therapies.


2012 · Vol 36 (2) · pp. 88-103
Author(s): Lai-Fa Hung

Rasch used a Poisson model to analyze errors and speed in reading tests. An important property of the Poisson distribution is that its mean and variance are equal. However, in social science research it is very common for the variance to be greater than the mean (i.e., the data are overdispersed). This study embeds the Rasch model within an overdispersion framework and proposes new estimation methods. The parameters of the proposed model can be estimated using the Markov chain Monte Carlo method implemented in WinBUGS and the marginal maximum likelihood method implemented in SAS. An empirical example is examined in which the proposed models are fitted to real data and the results are discussed.
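
The overdispersion problem can be illustrated with a short sketch: simulated error counts whose variance exceeds their mean are fitted noticeably better by a gamma-Poisson (negative binomial) model than by a pure Poisson. The data are simulated, and statsmodels stands in for the WinBUGS/SAS estimation described in the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
errors = rng.negative_binomial(n=2, p=0.25, size=300)   # overdispersed reading-error counts
print(errors.mean(), errors.var())                      # variance clearly exceeds the mean

X = np.ones((len(errors), 1))                           # intercept-only design
poisson_fit = sm.GLM(errors, X, family=sm.families.Poisson()).fit()
nb_fit = sm.NegativeBinomial(errors, X).fit(disp=0)
print("AIC Poisson:", poisson_fit.aic, " AIC NegBin:", nb_fit.aic)
```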

