Doubly robust pseudo-likelihood for incomplete hierarchical binary data

2018 ◽  
Vol 20 (1) ◽  
pp. 42-57
Author(s):  
Lisa Hermans ◽  
Anna Ivanova ◽  
Cristina Sotto ◽  
Geert Molenberghs ◽  
Geert Verbeke ◽  
...  

Missing data are almost inevitable in correlated-data studies. For non-Gaussian outcomes with moderate to large sequences, direct-likelihood methods can involve complex, hard-to-manipulate likelihoods. Popular alternative approaches, like generalized estimating equations, that are frequently used to circumvent the computational complexity of full likelihood, are less suitable when scientific interest, at least in part, is placed on the association structure; pseudo-likelihood (PL) methods are then a viable alternative. When the missing data are missing at random, Molenberghs et al. (2011, Statistica Sinica, 21, 187–206) proposed a suite of corrections to the standard form of PL, taking the form of singly and doubly robust estimators. They provided the basis and exemplified it in insightful yet primarily illustrative examples. We here consider the important case of marginal models for hierarchical binary data, provide an effective implementation and illustrate it using data from an analgesic trial. Our doubly robust estimator is more convenient than the classical doubly robust estimators. The ideas are illustrated using a marginal model for a binary response, more specifically a Bahadur model.

2018 ◽  
Vol 19 (2) ◽  
pp. 174-193 ◽  
Author(s):  
José LP da Silva ◽  
Enrico A Colosimo ◽  
Fábio N Demarqui

Generalized estimating equations (GEEs) are a well-known method for the analysis of categorical longitudinal data. This method presents computational simplicity and provides consistent parameter estimates that have a population-averaged interpretation. However, with missing data, the resulting parameter estimates are consistent only under the strong assumption of missing completely at random (MCAR). Some corrections can be done when the missing data mechanism is missing at random (MAR): inverse probability weighting GEE (WGEE) and multiple imputation GEE (MIGEE). A recent method combining ideas of these two approaches has a doubly robust property in the sense that one only needs to correctly specify the weight or the imputation model in order to obtain consistent estimates for the parameters. In this work, a proportional odds model is assumed and a doubly robust estimator is proposed for the analysis of ordinal longitudinal data with intermittently missing responses and covariates under the MAR mechanism. In addition, the association structure is modelled by means of either the correlation coefficient or local odds ratio. The performance of the proposed method is compared to both WGEE and MIGEE through a simulation study. The method is applied to a dataset related to rheumatic mitral stenosis.


2019 ◽  
Vol 7 (4) ◽  
pp. 465-497
Author(s):  
Yaoyuan V Tan ◽  
Carol A C Flannagan ◽  
Michael R Elliott

Examples of “doubly robust” estimators for missing data include augmented inverse probability weighting (AIPWT) and penalized splines of propensity prediction (PSPP). Doubly robust estimators have the property that, if either the response propensity or the mean is modeled correctly, a consistent estimator of the population mean is obtained. However, doubly robust estimators can perform poorly when modest misspecification is present in both models. Here we consider extensions of the AIPWT and PSPP that use Bayesian additive regression trees (BART) to provide highly robust propensity and mean model estimation. We term these “robust-squared” in the sense that the propensity score, the means, or both can be estimated with minimal model misspecification, and applied to the doubly robust estimator. We consider their behavior via simulations where propensities and/or mean models are misspecified. We apply our proposed method to impute missing instantaneous velocity (delta-v) values from the 2014 National Automotive Sampling System Crashworthiness Data System dataset and missing Blood Alcohol Concentration values from the 2015 Fatality Analysis Reporting System dataset. We found that BART, applied to PSPP and AIPWT, provides a more robust estimate compared with PSPP and AIPWT.
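
For orientation, the augmented inverse probability weighted estimator of a population mean is usually written in the textbook form below. The notation is an assumption introduced here, not taken from the abstract: $R_i$ is the response indicator, $X_i$ the fully observed covariates, $\hat\pi(X_i)$ the fitted response propensity, and $\hat m(X_i)$ the fitted outcome mean.

```latex
\hat\mu_{\mathrm{AIPW}}
  = \frac{1}{n}\sum_{i=1}^{n}
    \left\{
      \frac{R_i Y_i}{\hat\pi(X_i)}
      - \frac{R_i - \hat\pi(X_i)}{\hat\pi(X_i)}\,\hat m(X_i)
    \right\}
```

The first term is the inverse probability weighted mean. The second term has expectation zero when the propensity model is correct, and it removes the bias of the first term when the outcome model is correct, which is the source of the double robustness property.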


2019 ◽  
Vol 6 (1) ◽  
pp. 125-148
Author(s):  
Edward H. Kennedy ◽  
Jacqueline A. Mauro ◽  
Michael J. Daniels ◽  
Natalie Burns ◽  
Dylan S. Small

In instrumental variable studies, missing instrument data are very common. For example, in the Wisconsin Longitudinal Study, one can use genotype data as a Mendelian randomization–style instrument, but this information is often missing when subjects do not contribute saliva samples or when the genotyping platform output is ambiguous. Here we review missing at random assumptions one can use to identify instrumental variable causal effects, and discuss various approaches for estimation and inference. We consider likelihood-based methods, regression and weighting estimators, and doubly robust estimators. The likelihood-based methods yield the most precise inference and are optimal under the model assumptions, while the doubly robust estimators can attain the nonparametric efficiency bound while allowing flexible nonparametric estimation of nuisance functions (e.g., instrument propensity scores). The regression and weighting estimators can sometimes be easiest to describe and implement. Our main contribution is an extensive review of this wide array of estimators under varied missing-at-random assumptions, along with discussion of asymptotic properties and inferential tools. We also implement many of the estimators in an analysis of the Wisconsin Longitudinal Study, to study effects of impaired cognitive functioning on depression.


2016 ◽  
Vol 12 (1) ◽  
pp. 253-282 ◽  
Author(s):  
Karel Vermeulen ◽  
Stijn Vansteelandt

Doubly robust estimators have now been proposed for a variety of target parameters in the causal inference and missing data literature. These consistently estimate the parameter of interest under a semiparametric model when one of two nuisance working models is correctly specified, regardless of which. The recently proposed bias-reduced doubly robust estimation procedure aims to partially retain this robustness in more realistic settings where both working models are misspecified. These so-called bias-reduced doubly robust estimators make use of special (finite-dimensional) nuisance parameter estimators that are designed to locally minimize the squared asymptotic bias of the doubly robust estimator in certain directions of these finite-dimensional nuisance parameters under misspecification of both parametric working models. In this article, we extend this idea to incorporate the use of data-adaptive estimators (infinite-dimensional nuisance parameters), by exploiting the bias reduction estimation principle in the direction of only one nuisance parameter. We additionally provide an asymptotic linearity theorem which gives the influence function of the proposed doubly robust estimator under correct specification of a parametric nuisance working model for the missingness mechanism/propensity score but a possibly misspecified (finite- or infinite-dimensional) outcome working model. Simulation studies confirm the desirable finite-sample performance of the proposed estimators relative to a variety of other doubly robust estimators.


Biometrika ◽  
2009 ◽  
Vol 96 (3) ◽  
pp. 723-734 ◽  
Author(s):  
Weihua Cao ◽  
Anastasios A. Tsiatis ◽  
Marie Davidian

Considerable recent interest has focused on doubly robust estimators for a population mean response in the presence of incomplete data, which involve models for both the propensity score and the regression of outcome on covariates. The usual doubly robust estimator may yield severely biased inferences if neither of these models is correctly specified and can exhibit nonnegligible bias if the estimated propensity score is close to zero for some observations. We propose alternative doubly robust estimators that achieve comparable or improved performance relative to existing methods, even with some estimated propensity scores close to zero.
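
As a rough illustration of the "usual" doubly robust estimator of a population mean, the following sketch simulates data that are missing at random and compares the complete-case mean with the standard augmented inverse probability weighted estimate under correct and deliberately misspecified working models. Everything here is an assumption made for illustration: the data-generating model, the helper names `expit`, `fit_logistic` and `aipw_mean`, and the choice of constant models as the misspecified alternatives. It is not the alternative estimator proposed in the paper.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(design, r, n_iter=25):
    """Logistic regression by Newton-Raphson; design must contain an intercept column."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        grad = design.T @ (r - p)
        hess = (design * (p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def aipw_mean(y, r, pi_hat, m_hat):
    """Standard AIPW estimate: mean of R*Y/pi - (R - pi)/pi * m."""
    y_filled = np.where(r == 1, y, 0.0)  # missing outcomes never enter the sum
    return float(np.mean(r * y_filled / pi_hat - (r - pi_hat) / pi_hat * m_hat))


# Hypothetical simulation: true population mean of Y is 1.0, response depends on x (MAR).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)
r = rng.binomial(1, expit(0.5 + x))
y = np.where(r == 1, y_full, np.nan)

design = np.column_stack([np.ones(n), x])
obs = r == 1

# Propensity working models: logistic in x (correct) and a constant (misspecified).
pi_good = expit(design @ fit_logistic(design, r))
pi_bad = np.full(n, r.mean())

# Outcome working models: linear in x (correct) and a constant (misspecified).
coef, *_ = np.linalg.lstsq(design[obs], y[obs], rcond=None)
m_good = design @ coef
m_bad = np.full(n, y[obs].mean())

mu_cc = float(y[obs].mean())                             # complete-case mean
mu_both = aipw_mean(y, r, pi_good, m_good)               # both models correct
mu_bad_outcome = aipw_mean(y, r, pi_good, m_bad)         # only propensity correct
mu_bad_propensity = aipw_mean(y, r, pi_bad, m_good)      # only outcome model correct

print(mu_cc, mu_both, mu_bad_outcome, mu_bad_propensity)
```

In this setup the complete-case mean is biased upward because respondents tend to have larger `x`, while each AIPW variant with at least one correct working model stays close to the true mean. The division by `pi_hat` also shows why estimated propensities near zero are a concern: a few observations can then receive very large weights.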


Biometrika ◽  
2020 ◽  
Vol 107 (4) ◽  
pp. 919-933
Author(s):  
Wei Li ◽  
Yuwen Gu ◽  
Lan Liu

For estimating the population mean of a response variable subject to ignorable missingness, a new class of methods, called multiply robust procedures, has been proposed. The advantage of multiply robust procedures over the traditional doubly robust methods is that they permit the use of multiple candidate models for both the propensity score and the outcome regression, and they are consistent if any one of the multiple models is correctly specified, a property termed multiple robustness. This paper shows that, somewhat surprisingly, multiply robust estimators are special cases of doubly robust estimators, where the final propensity score and outcome regression models are certain combinations of the candidate models. To further improve model specifications in the doubly robust estimators, we adapt a model mixing procedure as an alternative method for combining multiple candidate models. We show that multiple robustness and asymptotic normality can also be achieved by our mixing-based doubly robust estimator. Moreover, our estimator and its theoretical properties are not confined to parametric models. Numerical examples demonstrate that the proposed estimator is comparable to and can even outperform existing multiply robust estimators.


2018 ◽  
Vol 28 (10-11) ◽  
pp. 3057-3073 ◽  
Author(s):  
Garrett M Fitzmaurice ◽  
Stuart R Lipsitz ◽  
Roger D Weiss

Conventional approaches for handling missingness in substance use disorder trials commonly rely upon a single deterministic “worst value” imputation that posits a perfect relationship between missingness and drug use (“missing value = presumed drug use”); this yields biased estimates of treatment effects and their standard errors. Instead, deterministic imputations should be replaced by probabilistic versions that encode researchers' prior beliefs that those with missing data are more likely to be using drugs at those occasions. Motivated by this problem, we present a method for handling non-monotone missing binary data in longitudinal studies. Specifically, we consider a joint model that combines a not missing at random (NMAR) selection model with a generalized linear mixed model for longitudinal binary data. The selection model links the distribution of a missing outcome to the corresponding distribution of the outcome for those observed at that occasion via a fixed and known sensitivity parameter. The mixed model for longitudinal binary data assumes the random effects have bridge distributions; the latter yields regression parameters that have both subject-specific and marginal interpretations. This approach is completely transparent about what is being assumed about missing data and can be used as the basis for sensitivity analysis.

