Avoiding bias and incorrect confidence interval coverage in prescription drug labeling

Propensity score-based estimators are increasingly used for causal inference in observational studies. However, model selection for propensity score estimation in high-dimensional data has received little attention. In these settings, propensity score models have traditionally been selected based on the goodness-of-fit for the treatment mechanism itself, without consideration of the causal parameter of interest. Collaborative minimum loss-based estimation is a novel methodology for causal inference that takes into account information on the causal parameter of interest when selecting a propensity score model. This “collaborative learning” considers variable associations with both treatment and outcome when selecting a propensity score model in order to minimize a bias-variance tradeoff in the estimated treatment effect. In this study, we introduce a novel approach for collaborative model selection when using the LASSO estimator for propensity score estimation in high-dimensional covariate settings. To demonstrate the importance of selecting the propensity score model collaboratively, we designed quasi-experiments based on a real electronic healthcare database, where only the potential outcomes were manually generated, and the treatment and baseline covariates remained unchanged. Results showed that the collaborative minimum loss-based estimation algorithm outperformed other competing estimators for both point estimation and confidence interval coverage. In addition, the propensity score model selected by collaborative minimum loss-based estimation could be applied to other propensity score-based estimators, which also resulted in substantive improvement for both point estimation and confidence interval coverage. We illustrate the discussed concepts through an empirical example comparing the effects of non-selective nonsteroidal anti-inflammatory drugs with selective COX-2 inhibitors on gastrointestinal complications in a population of Medicare beneficiaries.
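As a rough illustration of why a propensity score-based estimator can remove confounding bias, the sketch below compares a naive difference in means with a normalized (Hajek) inverse-probability-weighted estimate on simulated data. It is not the collaborative minimum loss-based estimator or the LASSO selection procedure described in the abstract: the data-generating process, the function names, and the use of the true propensity score in place of an estimated one are all assumptions made for illustration.

```python
import random


def simulate(n, seed=0):
    """Simulate (w, a, y, g) rows with one binary confounder w.

    Hypothetical data-generating process: the true treatment effect is 2.0,
    and w raises both the chance of treatment and the outcome.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        g = 0.8 if w == 1 else 0.2          # true propensity score P(A=1 | W=w)
        a = 1 if rng.random() < g else 0
        y = 2.0 * a + 3.0 * w + rng.gauss(0.0, 1.0)
        rows.append((w, a, y, g))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for _, a, y, _ in rows if a == 1]
    control = [y for _, a, y, _ in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def hajek_ipw(rows):
    """Normalized inverse-probability-weighted estimate of the average effect."""
    num1 = sum(y / g for _, a, y, g in rows if a == 1)
    den1 = sum(1.0 / g for _, a, _, g in rows if a == 1)
    num0 = sum(y / (1.0 - g) for _, a, y, g in rows if a == 0)
    den0 = sum(1.0 / (1.0 - g) for _, a, _, g in rows if a == 0)
    return num1 / den1 - num0 / den0


rows = simulate(20000, seed=1)
naive = naive_difference(rows)   # confounded by w
ipw = hajek_ipw(rows)            # weights balance w across treatment arms
```

In this setup the naive contrast absorbs the effect of the confounder, while the weighted contrast should land near the simulated effect of 2.0. In practice the propensity score has to be estimated, which is where the model selection question raised in the abstract comes in.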


Bias and Confidence Interval Coverage of Creel Survey Estimators Evaluated by Simulation

Transactions of the American Fisheries Society, 1998, Vol 127 (3), pp. 469-480. DOI: 10.1577/1548-8659(1998)127<0469:bacico>2.0.co;2. Cited By ~ 31

Author(s): Paul W. Rasmussen, Michael D. Staggs, T. Douglas Beard, Steven P. Newman

Keyword(s): Confidence Interval, Confidence Interval Coverage, Creel Survey, Interval Coverage


Evaluating quantitative methods for intercategorical-intersectionality research: a simulation study

European Journal of Public Health, 2020, Vol 30 (Supplement_5). DOI: 10.1093/eurpub/ckaa165.745

Author(s): M Mahendran, G Bauer, D Lizotte, Y Zhu

Keyword(s): Random Forest, Confidence Interval, Variable Selection, Sample Size, Simulation Study, Quantitative Methods, Predictive Accuracy, Conditional Inference, Confidence Interval Coverage, Interval Coverage

Abstract. Introduction: This study evaluated seven quantitative methods for their predictive accuracy for intersectionally defined subgroups, via a simulation study. The methods were single-level regression with interaction terms, cross-classification, multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA), and four decision tree methods: classification and regression trees (CART), conditional inference trees (CTree), chi-square automatic interaction detector (CHAID), and random forest. Also evaluated was how well the methods identified variables relevant to the outcome. An example analysis will be presented using data from the U.S. National Health and Nutrition Examination Survey. Methods: The simulated datasets varied by outcome variable type (binary and continuous), input variable types, sample size, and size and direction of the effects. Accuracy was evaluated using mean squared error or mean absolute percentage error. The secondary outcome was evaluated via significance and confidence interval coverage of regression terms and variable selection of the machine learning methods. Results: Predictive accuracy improved with increasing sample size for all methods except CART. At small sample sizes, random forest and MAIHDA generally produced the most precise predictions. Variable selection with CTree and CHAID consistently showed a high type I error rate. Although random forest and MAIHDA performed well for prediction, random forest's variable selection and the confidence interval coverage and power of MAIHDA main-effect coefficients were suboptimal. Discussion: From this study emerge recommendations for applying methods in quantitative intersectionality. Different methodologies are optimal for different purposes; for example, while random forest and MAIHDA performed well for prediction, they were less reliable for variable identification. In our discussion, we will work through how to select, apply, and interpret methodologies to achieve analytic goals that align with intersectionality theory.
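The two accuracy measures named in the Methods can be written out as a minimal sketch. The function names and the toy inputs are hypothetical, and the abstract does not say how the authors handled zero-valued outcomes in the percentage error, so this version simply rejects them.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute error as a percentage of the observed value."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("percentage error is undefined when an observed value is 0")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Toy values, chosen only to show the arithmetic.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])           # (0 + 0 + 4) / 3
mape = mean_absolute_percentage_error([100.0, 200.0], [110.0, 180.0])  # both errors are 10%
```

Mean squared error penalizes large misses more heavily, whereas the percentage error is scale-free, which may be why a study would report one or the other depending on the outcome type.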


Effect of Heterogeneous Survival on Bird-Banding Model Confidence Interval Coverage Rates

Journal of Wildlife Management, 1992, Vol 56 (1), pp. 111. DOI: 10.2307/3808798. Cited By ~ 6

Author(s): Richard J. Barker

Keyword(s): Confidence Interval, Model Confidence, Confidence Interval Coverage, Coverage Rates, Interval Coverage


Estimating Stage-Specific Daily Survival Probabilities of Nests When Nest age is Unknown

The Auk, 2004, Vol 121 (1), pp. 134-147. DOI: 10.1093/auk/121.1.134

Author(s): Thomas R. Stanley

Keyword(s): Monte Carlo, Confidence Interval, Survival Probabilities, Daily Survival, Confidence Interval Coverage, Sources Of Variation, Nominal Rate, Avian Populations, Interval Coverage, Nest Type

Abstract. Estimation of daily survival probabilities of nests is common in studies of avian populations. Since the introduction of Mayfield's (1961, 1975) estimator, numerous models have been developed to relax Mayfield's assumptions and account for biologically important sources of variation. Stanley (2000) presented a model for estimating stage-specific (e.g. incubation stage, nestling stage) daily survival probabilities of nests that conditions on “nest type” and requires that nests be aged when they are found. Because aging nests typically requires handling the eggs, there may be situations where nests cannot or should not be aged and the Stanley (2000) model will be inapplicable. Here, I present a model for estimating stage-specific daily survival probabilities that conditions on nest stage for active nests, thereby obviating the need to age nests when they are found. Specifically, I derive the maximum-likelihood function for the model, evaluate the model's performance using Monte Carlo simulations, and provide software for estimating parameters (along with an example). For sample sizes as low as 50 nests, bias was small and confidence interval coverage was close to the nominal rate, especially when a reduced-parameter model was used for estimation.
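The general idea of checking confidence interval coverage by Monte Carlo simulation can be sketched with a much simpler setting than the stage-specific model in the abstract. The example below uses a Mayfield-type estimator (one minus failures per nest-day of exposure) with a single constant daily survival probability, daily nest checks, and a Wald-type interval that treats nest-days as independent Bernoulli trials. All of these are simplifying assumptions, and the function names and parameter values are hypothetical.

```python
import math
import random


def simulate_nests(n_nests, daily_survival, period_days, rng):
    """Return (exposure_days, failures) for nests checked once per day."""
    exposure = 0
    failures = 0
    for _ in range(n_nests):
        for _ in range(period_days):
            exposure += 1                        # one more nest-day at risk
            if rng.random() >= daily_survival:   # nest fails on this day
                failures += 1
                break
    return exposure, failures


def mayfield_estimate(exposure, failures):
    """Daily survival estimate and Wald-type standard error."""
    s_hat = 1.0 - failures / exposure
    se = math.sqrt(s_hat * (1.0 - s_hat) / exposure)
    return s_hat, se


def coverage(n_reps, n_nests, daily_survival, period_days, z=1.96, seed=0):
    """Fraction of simulated intervals that contain the true daily survival."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        exposure, failures = simulate_nests(n_nests, daily_survival, period_days, rng)
        s_hat, se = mayfield_estimate(exposure, failures)
        if s_hat - z * se <= daily_survival <= s_hat + z * se:
            hits += 1
    return hits / n_reps


# 50 nests per replicate mirrors the smallest sample size mentioned in the abstract;
# the survival probability and period length are illustrative choices.
estimated_coverage = coverage(n_reps=500, n_nests=50, daily_survival=0.95, period_days=20)
```

Comparing `estimated_coverage` with the nominal 0.95 is the kind of check the abstract refers to; a fuller study would also track bias and vary the sample size and model structure.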
