Confidence Interval Coverage: Recently Published Documents

Total documents: 34 (last five years: 18). H-index: 7 (last five years: 1).

2021 · pp. 107699862110520
Authors: Jin Liu, Robert A. Perera, Le Kang, Roy T. Sabo, Robert M. Kirkpatrick

This study proposes transformation functions and matrices between the coefficients of the original and reparameterized parameter spaces of an existing linear-linear piecewise model, so that interpretable coefficients directly related to the underlying change pattern can be derived. Additionally, the study extends the existing model to allow individual measurement occasions and investigates predictors of individual differences in change patterns. We present the proposed methods with simulation studies and a real-world data analysis. The simulation study demonstrates that the method generally provides unbiased and accurate point estimates with appropriate confidence interval coverage for each parameter. The empirical analysis shows that the model can estimate growth factor coefficients and path coefficients directly related to the underlying developmental process, thereby providing meaningful interpretation.
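The paper's transformation functions and matrices are specific to its model; as a hedged illustration of the general idea, the sketch below (the function names and this particular reparameterization are assumptions, not the paper's exact definitions) converts between two common parameterizations of a linear-linear (bilinear spline) trajectory: one with directly interpretable slopes before and after the knot, and one using the level at the knot, the mean slope, and half the slope difference, which is often more convenient for estimation.

```python
import numpy as np

def to_estimation_space(b0, b1, b2, knot):
    """Map (intercept at t=0, slope before knot, slope after knot)
    to (level at the knot, mean slope, half slope difference).
    One common reparameterization; the paper's own transformation
    matrices may differ."""
    eta0 = b0 + b1 * knot        # trajectory value at the knot
    eta1 = (b1 + b2) / 2.0       # average of the two slopes
    eta2 = (b2 - b1) / 2.0       # half the slope difference
    return eta0, eta1, eta2

def to_original_space(eta0, eta1, eta2, knot):
    """Inverse map back to the directly interpretable coefficients."""
    b1 = eta1 - eta2             # slope before the knot
    b2 = eta1 + eta2             # slope after the knot
    b0 = eta0 - b1 * knot        # intercept at t = 0
    return b0, b1, b2

# round-trip check
b = to_original_space(*to_estimation_space(1.0, 0.5, -0.2, knot=3.0), knot=3.0)
print(np.allclose(b, (1.0, 0.5, -0.2)))  # True
```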


2021 · Vol 18 (1)
Authors: Kanako Fuyama, Yasuhiro Hagiwara, Yutaka Matsuyama

Background: Risk ratio is a popular effect measure in epidemiological research. Although previous research has suggested that logistic regression may provide biased odds ratio estimates when the number of events is small and there are multiple confounders, the performance of risk ratio estimation has yet to be examined in the presence of multiple confounders.

Methods: We conducted a simulation study to evaluate the statistical performance of three regression approaches for estimating risk ratios: (1) risk ratio interpretation of logistic regression coefficients, (2) modified Poisson regression, and (3) regression standardization using logistic regression. We simulated 270 scenarios with systematically varied sample size, number of binary confounders, exposure proportion, risk ratio, and outcome proportion. Performance was evaluated in terms of convergence proportion, bias, standard error estimation, and confidence interval coverage.

Results: With a sample size of 2500 and an outcome proportion of 1%, both logistic regression and modified Poisson regression at times failed to converge, and the three approaches were comparably biased. As the outcome proportion or sample size increased, modified Poisson regression and regression standardization yielded unbiased risk ratio estimates with appropriate confidence intervals irrespective of the number of confounders. The risk ratio interpretation of logistic regression coefficients, by contrast, became substantially biased as the outcome proportion increased.

Conclusions: Regression approaches for estimating risk ratios should be used cautiously when the number of events is small. With an adequate number of events, risk ratios are validly estimated by modified Poisson regression and regression standardization, irrespective of the number of confounders.
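As a hedged sketch of approaches (2) and (3) on simulated data (not the authors' code; parameter values are made up), modified Poisson regression fits a Poisson GLM to the binary outcome with a robust sandwich variance, and regression standardization averages logistic predictions with the exposure set to 1 versus 0 for everyone:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5000
c = rng.binomial(1, 0.4, n)                # binary confounder
x = rng.binomial(1, 0.3 + 0.2 * c)         # exposure depends on c
p = 0.05 * np.exp(0.7 * x + 0.4 * c)       # true risk ratio = exp(0.7)
y = rng.binomial(1, p)
df = pd.DataFrame({"y": y, "x": x, "c": c})

# (2) modified Poisson regression: Poisson GLM + robust (sandwich) SEs
pois = smf.glm("y ~ x + c", data=df,
               family=sm.families.Poisson()).fit(cov_type="HC0")
print("modified Poisson RR:", np.exp(pois.params["x"]),
      "95% CI:", np.exp(pois.conf_int().loc["x"].values))

# (3) regression standardization: average predicted risks from a
# logistic model with exposure set to 1 vs. 0 for all subjects
logit = smf.glm("y ~ x + c", data=df,
                family=sm.families.Binomial()).fit()
r1 = logit.predict(df.assign(x=1)).mean()
r0 = logit.predict(df.assign(x=0)).mean()
print("standardized RR:", r1 / r0)  # CI via bootstrap or delta method
```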


2021 · pp. 096228022110417
Authors: Martina Mittlböck, Ulrike Pötschger, Harald Heinzl

Generalised pseudo-values have been suggested for evaluating the impact of allogeneic stem cell transplantation on childhood leukaemia. The approach compares the long-term survival of two cohorts defined by the availability or non-availability of a suitable donor for stem cell transplantation. A patient's cohort membership becomes known only after a completed donor search, with or without an identified donor. If a patient suffers an event during the donor search, stem cell transplantation is no longer indicated; in that case the donor search is ceased and cohort membership remains unknown. The generalised pseudo-values approach treats donor identification as a binary time-dependent covariate and uses inverse-probability-of-censoring weighting to adjust for non-identified donors. The approach leads to time-consuming computations due to multiple redefinitions of the risk set for pseudo-value calculation and an explicit adjustment for waiting-time bias. Here, the problem is looked at from a different angle. By considering the probability that a donor would have been identified after the donor search ceased, weights for common pseudo-values are defined. This leads to a faster alternative approach, as only a single risk set is necessary. Extensive computer simulations show that both the generalised and the new weighted pseudo-values approaches provide approximately unbiased estimates. Confidence interval coverage is satisfactory for typical clinical scenarios. In situations where donor identification takes considerably longer than usual, the weighted pseudo-values approach is preferable. The two approaches complement each other, as they have different potential for addressing further aspects of the underlying medical question.
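As background for the pseudo-value machinery, here is a minimal sketch (using lifelines; toy data, not the paper's construction) of ordinary jackknife pseudo-values for survival at a horizon tau. The paper's weighted approach would additionally weight these common pseudo-values by the probability that a donor would have been identified; that step is omitted here.

```python
import numpy as np
from lifelines import KaplanMeierFitter

def km_pseudo_values(time, event, tau):
    """Jackknife pseudo-values for S(tau):
    pv_i = n * S_full(tau) - (n - 1) * S_{-i}(tau),
    where S_{-i} is the Kaplan-Meier estimate leaving out subject i."""
    time, event = np.asarray(time, float), np.asarray(event, int)
    n = len(time)
    s_full = KaplanMeierFitter().fit(time, event).predict(tau)
    pv = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        s_i = KaplanMeierFitter().fit(time[keep], event[keep]).predict(tau)
        pv[i] = n * s_full - (n - 1) * s_i
    return pv

# toy data: follow-up times with event indicators (1 = event, 0 = censored)
t = [2.0, 3.5, 4.1, 5.0, 6.2, 7.7, 8.4, 9.9]
e = [1, 0, 1, 1, 0, 1, 0, 1]
print(km_pseudo_values(t, e, tau=5.0))
```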


Authors: Md Belal Hossain, Lucy Mosquera, Mohammad Karim

Introduction: Instrumental variable (IV)-based methods (e.g., two-stage least squares [2SLS], two-stage residual inclusion [2SRI], and the nonparametric causal bound [NPCB]) can be used to address non-adherence in pragmatic trials. These methods can handle unmeasured confounding but require other assumptions, such as the exclusion restriction. The inverse-probability-weighted per-protocol (IPW-PP) method is useful in the same setting but rests on a different assumption (no unmeasured confounding). Although all of these methods aim to address the same problem, comprehensive simulations comparing their performance are absent from the literature. We performed extensive simulations when (1) confounding is present, (2) the confounder is unmeasured but the exclusion restriction is met, (3) the exclusion restriction is violated, and (4) non-adherence is one-sided and differential.

Methods: We compared performance in terms of bias, standard error (SE), mean squared error (MSE), and 95% confidence interval coverage probability.

Results: For setting 1, IPW-PP outperforms the IV methods in terms of bias, SE, MSE, and coverage when non-adherence is below 80%, but produces high bias beyond that point. For setting 2, IPW-PP is highly biased, whereas 2SLS and 2SRI work well. For setting 3, 2SLS and 2SRI perform the worst in all scenarios; IPW-PP produces unbiased estimates when the necessary confounders are measured and adjusted for. For setting 4, IPW-PP has less bias, while 2SLS and 2SRI have higher SE and MSE. NPCB has wider bounds in all scenarios. We also analyze a two-arm trial to estimate the effect of vitamin A supplementation on childhood mortality after addressing non-adherence.

Conclusion: IPW-PP should be used cautiously when non-adherence is very high or unmeasured confounding is strong, and the IV methods should be avoided when the exclusion restriction is violated or differential non-adherence is high. Since the assumptions of IPW-PP and the IV methods differ and are often untestable, we suggest analyzing data with both for a robust conclusion.
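A minimal 2SLS sketch on simulated non-adherence data (variable names and parameter values are illustrative assumptions, not the authors' simulation design). The randomized arm serves as the instrument for the treatment actually received:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 10_000
u = rng.normal(size=n)                        # unmeasured confounder
z = rng.binomial(1, 0.5, n)                   # randomized arm (instrument)
adhere_p = 1 / (1 + np.exp(-(-1 + 2.5 * z + u)))
d = rng.binomial(1, adhere_p)                 # treatment actually received
y = 1.0 * d + 1.5 * u + rng.normal(size=n)    # true causal effect of d is 1.0

# naive regression on received treatment is confounded by u
print("naive OLS:", sm.OLS(y, sm.add_constant(d)).fit().params[1])

# 2SLS: stage 1 predicts d from z; stage 2 regresses y on predicted d
d_hat = sm.OLS(d, sm.add_constant(z)).fit().fittedvalues
print("2SLS:", sm.OLS(y, sm.add_constant(d_hat)).fit().params[1])
# note: manually computed second-stage SEs are not valid; use a
# dedicated IV routine (e.g. linearmodels' IV2SLS) for inference
```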


Authors: Jin Liu, Le Kang, Roy T. Sabo, Robert M. Kirkpatrick, Robert A. Perera

Empirical researchers are often interested in the impact of baseline covariates when uncovering sample heterogeneity and separating samples into more homogeneous groups. However, many studies in the structural equation modeling (SEM) framework start with vague hypotheses about heterogeneity and its possible causes. This suggests that (1) the determination and specification of a proper model with covariates is not straightforward, and (2) the exploration process may be computationally intensive, given that models in the SEM framework are usually complicated and the pool of candidate covariates is typically large in the psychological and educational domains where the framework is widely employed. Following Bakk and Kuha (2017), this article presents a two-step growth mixture model (GMM) that examines the relationship between latent classes of nonlinear trajectories and baseline characteristics. Our simulation studies demonstrate that the proposed model can cluster nonlinear change patterns and estimate the parameters of interest without bias and with precision, while exhibiting appropriate confidence interval coverage. Because the pool of candidate covariates is typically large and highly correlated, this study also proposes implementing exploratory factor analysis (EFA) to reduce the dimension of the covariate space. We illustrate how to use the hybrid method, two-step GMM plus EFA, to efficiently explore the heterogeneity of nonlinear trajectories in longitudinal mathematics achievement data.
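The paper's two-step GMM is an SEM-based procedure; as a loose analogue under stated assumptions (scikit-learn stand-ins, simulated data, made-up parameter values), the sketch below reduces a correlated covariate pool with factor analysis, clusters repeated-measures trajectories with a Gaussian mixture, and then relates class membership to the factor scores:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n, t, p = 600, 6, 12

# two latent classes with different nonlinear (quadratic) mean trajectories
cls = rng.binomial(1, 0.5, n)
occ = np.arange(t)
mu = np.where(cls[:, None] == 1,
              2 + 1.2 * occ - 0.10 * occ**2,   # class 1 trajectory
              1 + 0.3 * occ + 0.05 * occ**2)   # class 0 trajectory
Y = mu + rng.normal(scale=0.8, size=(n, t))

# correlated candidate covariates driven by one factor tied to class
f = cls + rng.normal(scale=0.7, size=n)
X = np.outer(f, rng.uniform(0.5, 1.0, p)) + rng.normal(size=(n, p))

# step 1: cluster trajectories (stand-in for the growth mixture model)
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(Y)

# reduce the covariate pool to factor scores via EFA
scores = FactorAnalysis(n_components=1, random_state=0).fit_transform(X)

# step 2: relate class membership to the factor scores
print(LogisticRegression().fit(scores, labels).coef_)
```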


Authors: Heather Kitada Smalley, Sarah C. Emerson, Virginia Lesser

In this chapter, we develop theory and methodology to support mode adjustment and hindcasting/forecasting in the presence of different possible mode effect types, including additive effects and odds-multiplicative effects. Mode adjustment is particularly important when the ultimate goal is to report one aggregate estimate of response parameters, and also when comparing to historical surveys performed with different modes. The effect type has important consequences for inferential validity when the baseline response changes over time (i.e., when there is a time trend). We present a methodology that provides inference for additive and odds-multiplicative effect types and demonstrate its performance in a simulation study. We also show that if the wrong effect type is assumed, the resulting inference can be invalid: confidence interval coverage is greatly reduced and estimates can be biased.
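To make the distinction concrete, a small sketch (illustrative numbers only, not the chapter's methodology) of how the two effect types act on a baseline response proportion: an additive effect shifts the proportion by a constant, while an odds-multiplicative effect scales the odds, so its impact on the proportion depends on the baseline level, which matters when the baseline drifts over time.

```python
import numpy as np

def with_mode_effect(p, effect, kind):
    """Apply a mode effect to baseline response proportion(s) p."""
    p = np.asarray(p, float)
    if kind == "additive":
        return np.clip(p + effect, 0.0, 1.0)       # p -> p + delta
    if kind == "odds-multiplicative":
        odds = p / (1 - p)                         # odds -> gamma * odds
        return effect * odds / (1 + effect * odds)
    raise ValueError(kind)

# a baseline proportion that drifts over time (a time trend)
p_t = np.array([0.2, 0.4, 0.6, 0.8])
print(with_mode_effect(p_t, 0.10, "additive"))            # constant shift
print(with_mode_effect(p_t, 1.5, "odds-multiplicative"))  # shift varies with p
```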


Authors: David B Richardson, Stephen R Cole, Rachael K Ross, Charles Poole, Haitao Chu, ...

Meta-analyses are undertaken to combine information from a set of studies, often in settings where some of the individual study-specific estimates are based on relatively small study samples. Finite-sample bias may occur when maximum likelihood estimates of associations are obtained by fitting logistic regression models to sparse data sets. Here we show that combining information from small studies through a meta-analytical summary of logistic regression estimates can propagate such sparse-data bias. In simulations, we illustrate two challenges encountered in meta-analyses of logistic regression results under sparse data: (1) bias in the summary meta-analytical result, and (2) confidence interval coverage that can worsen rather than improve, falling further below the nominal level as the number of studies in the meta-analysis increases.
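A minimal simulation sketch of the mechanism (assumed parameter values, not the authors' design): each small study's logistic estimate carries sparse-data bias, and fixed-effect inverse-variance pooling concentrates the summary around the biased value rather than the truth.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
true_log_or = 1.0
n_per_study, k_studies = 100, 40

est, var = [], []
for _ in range(k_studies):
    x = rng.binomial(1, 0.5, n_per_study)
    # rare outcome -> sparse 2x2 cells within each small study
    y = rng.binomial(1, 1 / (1 + np.exp(-(-3.5 + true_log_or * x))))
    try:
        fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
    except Exception:
        continue                           # separation / non-convergence
    b, se = fit.params[1], fit.bse[1]
    if np.isfinite(b) and np.isfinite(se) and se < 10:
        est.append(b)
        var.append(se**2)

# fixed-effect inverse-variance pooling of the per-study log ORs
w = 1 / np.asarray(var)
pooled = np.sum(w * np.asarray(est)) / np.sum(w)
print("pooled log OR:", pooled, "(truth:", true_log_or, ")")
# adding more sparse studies tightens the interval around the
# biased summary, so coverage deteriorates as k grows
```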


2020 · Vol 30 (Supplement_5)
Authors: M Mahendran, G Bauer, D Lizotte, Y Zhu

Introduction: This study evaluated seven quantitative methods for their predictive accuracy for intersectionally defined subgroups via a simulation study. The methods were single-level regression with interaction terms, cross-classification, multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA), and four decision tree methods: classification and regression trees (CART), conditional inference trees (CTree), chi-square automatic interaction detector (CHAID), and random forest. We also evaluated how well the methods identified variables relevant to the outcome. An example analysis will be presented using data from the U.S. National Health and Nutrition Examination Survey.

Methods: The simulated datasets varied by outcome variable type (binary and continuous), input variable types, sample size, and the size and direction of the effects. Accuracy was evaluated using mean squared error or mean absolute percentage error. The secondary outcome was evaluated via significance and confidence interval coverage of regression terms and via the variable selection of the machine learning methods.

Results: Predictive accuracy improved with increasing sample size for all methods except CART. At small sample sizes, random forest and MAIHDA generally produced the most precise predictions. Variable selection consistently suffered a high type 1 error rate for CTree and CHAID. Although random forest and MAIHDA performed well for prediction, variable selection by random forest was suboptimal, as were the confidence interval coverage and power of the MAIHDA main-effects coefficients.

Discussion: From this study emerge recommendations for applying quantitative methods in intersectionality research. Different methodologies are optimal for different purposes; for example, while random forest and MAIHDA performed well for prediction, they were less reliable for variable identification. In our discussion, we will work through how to select, apply, and interpret methodologies to achieve analytic goals that align with intersectionality theory.
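As a small illustration of the kind of comparison described (scikit-learn, simulated subgroups defined by crossing two binary attributes; not the study's simulation code), the sketch below contrasts a regression that is given an explicit interaction term with a random forest that must discover the interaction from the raw attributes:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(5)
n = 2000
a = rng.binomial(1, 0.5, n)            # first binary attribute
b = rng.binomial(1, 0.5, n)            # second binary attribute
# the interaction term plays the role of an intersectional effect
y = 1.0 * a + 0.5 * b + 1.5 * a * b + rng.normal(size=n)

X = np.column_stack([a, b, a * b])     # regression gets the interaction column
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

ols = LinearRegression().fit(Xtr, ytr)
# the forest sees only the raw attributes; trees can learn the
# interaction without it being specified in advance
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xtr[:, :2], ytr)

print("OLS MSE:", mean_squared_error(yte, ols.predict(Xte)))
print("RF  MSE:", mean_squared_error(yte, rf.predict(Xte[:, :2])))
```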

