Combining Multiple Imputation and Inverse-Probability Weighting for Analyzing Response with Missing in the Presence of Covariates

Trial Data ◽

Data Sets ◽

Probability Weighting ◽

Inverse Probability ◽

Doubly Robust

Introduction: Missing values are frequently seen in the data sets of research studies, especially medical studies. It is therefore essential that data, especially in medical research, be evaluated in terms of the structure of the missingness. This study aims to provide new statistical methods for analyzing such data. Methods: Multiple imputation (MI) and inverse-probability weighting (IPW) are two common methods used to deal with missing data. MI is more effective and more complex than IPW. MI requires a model for the joint distribution of the missing data given the observed data, while IPW needs only a model for the probability that a subject has full data. Inadequacy of either model may cause serious bias if the amount of missing data is large. Another method combines these approaches to give a doubly robust estimator. In addition, the use of these methods is demonstrated on clinical trial data related to postpartum bleeding. Results: In this article, we examine the performance of IPW/MI relative to MI and IPW alone in terms of bias and efficiency. According to the simulation results, IPW/MI has advantages over the alternatives. The real-data analysis also showed that the results of MI/MI do not differ significantly from those of IPW/MI. Conclusion: Missing data are a problem in many studies, causing bias and reduced efficiency in the model. In this study, after comparing the results of these techniques, it was concluded that the IPW/MI method performs better than the other methods.
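The abstract above does not say how the analyses of the imputed data sets were pooled. As a minimal sketch, assuming the usual Rubin's rules were applied, the pooled estimate and its total variance can be computed as follows. The function name `pool_rubin` and the example numbers are hypothetical and not taken from the paper.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m completed-data analyses with Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total

if __name__ == "__main__":
    # Hypothetical estimates and variances from three imputed data sets
    q, t = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(q, t)
```

The between-imputation term is what carries the extra uncertainty due to the missing values; with a single imputation it cannot be estimated at all.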

Analysis of Incomplete Data Using Inverse Probability Weighting and Doubly Robust Estimators

Methodology ◽

10.1027/1614-2241/a000005 ◽

2010 ◽

Vol 6 (1) ◽

pp. 37-48 ◽

Cited By ~ 36

Author(s):

Stijn Vansteelandt ◽

James Carpenter ◽

Michael G. Kenward

Keyword(s):

Incomplete Data ◽

Missing At Random ◽

Estimation Methods ◽

Data Sets ◽

Probability Weighting ◽

Inverse Probability ◽

Doubly Robust Estimation ◽

Doubly Robust ◽

Standard Software

This article reviews inverse probability weighting methods and doubly robust estimation methods for the analysis of incomplete data sets. We first consider methods for estimating a population mean when the outcome is missing at random, in the sense that measured covariates can explain whether or not the outcome is observed. We then sketch the rationale of these methods and elaborate on their usefulness in the presence of influential inverse weights. We finally outline how to apply these methods in a variety of settings, such as for fitting regression models with incomplete outcomes or covariates, emphasizing the use of standard software programs.
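The estimators named in this abstract can be illustrated for the simplest case, a population mean with an outcome that is missing at random given one binary covariate. The sketch below is not the article's code: the simulated data, the function names, and the choice to estimate both working models by within-stratum proportions and means are all assumptions made for illustration.

```python
import random

def simulate(n=20000, seed=0):
    """Hypothetical data: binary covariate x, response indicator r, outcome y (None if missing)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # population mean of y is 2.0
        p_obs = 0.3 if x == 1 else 0.9           # missingness depends on x only (MAR)
        r = 1 if rng.random() < p_obs else 0
        rows.append((x, r, y if r else None))
    return rows

def fit_stratum_models(rows):
    """Estimate P(R=1 | x) and E[Y | x, R=1] by proportions and means within each x."""
    pi, m = {}, {}
    for x in (0, 1):
        n_x = sum(1 for xi, _, _ in rows if xi == x)
        obs = [y for xi, r, y in rows if xi == x and r == 1]
        pi[x] = len(obs) / n_x
        m[x] = sum(obs) / len(obs)
    return pi, m

def complete_case_mean(rows):
    """Naive mean of the observed outcomes, ignoring the missingness mechanism."""
    obs = [y for _, r, y in rows if r == 1]
    return sum(obs) / len(obs)

def ipw_mean(rows, pi):
    """Weight each observed outcome by 1 / P(observed | x)."""
    return sum(y / pi[x] for x, r, y in rows if r == 1) / len(rows)

def dr_mean(rows, pi, m):
    """Augmented IPW: outcome-model prediction plus a weighted residual for observed cases."""
    total = 0.0
    for x, r, y in rows:
        total += m[x]
        if r == 1:
            total += (y - m[x]) / pi[x]
    return total / len(rows)

if __name__ == "__main__":
    rows = simulate()
    pi, m = fit_stratum_models(rows)
    wrong_pi = {0: 0.6, 1: 0.6}  # deliberately misspecified response model
    print("complete case:", complete_case_mean(rows))
    print("IPW, fitted weights:", ipw_mean(rows, pi))
    print("IPW, wrong weights:", ipw_mean(rows, wrong_pi))
    print("DR, wrong weights, fitted outcome model:", dr_mean(rows, wrong_pi, m))
```

In this toy setup the complete-case mean and the IPW estimate with the wrong weights should both fall well below the true mean of 2.0, while the doubly robust estimate stays close to it as long as one of the two working models is right. Because both models here are saturated in a single binary covariate, IPW with fitted weights and the doubly robust estimate coincide; with continuous covariates and parametric models they generally differ.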

Robust inference when combining inverse-probability weighting and multiple imputation to address missing data with application to an electronic health records-based study of bariatric surgery

The Annals of Applied Statistics ◽

10.1214/20-aoas1386 ◽

2021 ◽

Vol 15 (1) ◽

Author(s):

Tanayott Thaweethai ◽

David E. Arterburn ◽

Karen J. Coleman ◽

Sebastien Haneuse

Keyword(s):

Bariatric Surgery ◽

Missing Data ◽

Electronic Health Records ◽

Multiple Imputation ◽

Robust Inference ◽

Probability Weighting ◽

Health Records ◽

Inverse Probability ◽

Electronic Health

Adjusting for selection bias due to missing data in electronic health records-based research

10.1177/09622802211027601 ◽

2021 ◽

Vol 30 (10) ◽

pp. 2221-2238

Author(s):

Sarah B Peskoe ◽

David Arterburn ◽

Karen J Coleman ◽

Lisa J Herrinton ◽

Michael J Daniels ◽

...

Keyword(s):

Missing Data ◽

Electronic Health Records ◽

Selection Bias ◽

Small Sample ◽

Data Provenance ◽

Probability Weighting ◽

Health Records ◽

Inverse Probability ◽

Electronic Health

While electronic health records data provide unique opportunities for research, numerous methodological issues must be considered. Among these, selection bias due to incomplete/missing data has received far less attention than other issues. Unfortunately, standard missing data approaches (e.g. inverse-probability weighting and multiple imputation) generally fail to acknowledge the complex interplay of heterogeneous decisions made by patients, providers, and health systems that govern whether specific data elements in the electronic health records are observed. This, in turn, renders the missing-at-random assumption difficult to believe in standard approaches. In the clinical literature, the collection of decisions that gives rise to the observed data is referred to as the data provenance. Building on a recently-proposed framework for modularizing the data provenance, we develop a general and scalable framework for estimation and inference with respect to regression models based on inverse-probability weighting that allows for a hierarchy of missingness mechanisms to better align with the complex nature of electronic health records data. We show that the proposed estimator is consistent and asymptotically Normal, derive the form of the asymptotic variance, and propose two consistent estimators. Simulations show that naïve application of standard methods may yield biased point estimates, that the proposed estimators have good small-sample properties, and that researchers may have to contend with a bias-variance trade-off as they consider how to handle missing data. The proposed methods are motivated by an on-going, electronic health records-based study of bariatric surgery.

Responsiveness-informed multiple imputation and inverse probability-weighting in cohort studies with missing data that are non-monotone or not missing at random

10.1177/0962280216628902 ◽

2016 ◽

Vol 27 (2) ◽

pp. 352-363 ◽

Cited By ~ 8

Author(s):

James C Doidge

Keyword(s):

Missing Data ◽

Data Collection ◽

Cohort Studies ◽

Multiple Imputation ◽

Missing At Random ◽

Probability Weighting ◽

Inverse Probability ◽

Not Missing At Random ◽

Over Time

Population-based cohort studies are invaluable to health research because of the breadth of data collection over time, and the representativeness of their samples. However, they are especially prone to missing data, which can compromise the validity of analyses when data are not missing at random. Having many waves of data collection presents opportunity for participants’ responsiveness to be observed over time, which may be informative about missing data mechanisms and thus useful as an auxiliary variable. Modern approaches to handling missing data such as multiple imputation and maximum likelihood can be difficult to implement with the large numbers of auxiliary variables and large amounts of non-monotone missing data that occur in cohort studies. Inverse probability-weighting can be easier to implement but conventional wisdom has stated that it cannot be applied to non-monotone missing data. This paper describes two methods of applying inverse probability-weighting to non-monotone missing data, and explores the potential value of including measures of responsiveness in either inverse probability-weighting or multiple imputation. Simulation studies are used to compare methods and demonstrate that responsiveness in longitudinal studies can be used to mitigate bias induced by missing data, even when data are not missing at random.

Scaled Inverse Probability Weighting: A Method to Assess Potential Bias Due to Event Nonreporting in Ecological Momentary Assessment Studies

Journal of Educational and Behavioral Statistics ◽

10.3102/1076998617738241 ◽

2017 ◽

Vol 43 (3) ◽

pp. 354-381 ◽

Cited By ~ 2

Author(s):

Stephanie A. Kovalchik ◽

Steven C. Martino ◽

Rebecca L. Collins ◽

William G. Shadel ◽

Elizabeth J. D’Amico ◽

...

Keyword(s):

Missing Data ◽

Ecological Momentary Assessment ◽

Assessment Method ◽

Probability Weighting ◽

Inverse Probability ◽

Traditional Assessment ◽

Urge To Smoke ◽

Ecological Momentary ◽

Momentary Assessment

Ecological momentary assessment (EMA) is a popular assessment method in psychology that aims to capture events, emotions, and cognitions in real time, usually repeatedly throughout the day. Because EMA typically involves more intensive monitoring than traditional assessment methods, missing data are commonly an issue and this missingness may bias results. EMA can involve two types of missing data: known missingness, arising from nonresponse to scheduled prompts, and hidden missingness, arising from nonreporting of focal events (e.g., an urge to smoke or a meal). Prior research on missing data in EMA has focused almost exclusively on nonresponse to scheduled prompts. In this study, we introduce a scaled inverse probability weighting approach to assess the risk of bias due to nonreporting of events due to fatigue on estimates of exposure or correlates of exposure. In our proposed approach, the inverse probability is the estimated probability of compliance with random prompts from a model that uses participant and contextual factors to predict this compliance and a fatigue factor that adjusts for attrition in event reporting over time. We demonstrate the use and utility of our bias assessment method with the Tracking and Recording Alcohol Communications Study, an EMA study of adolescent exposure to alcohol advertising.

Impact of Outcome Model Misspecification on Regression and Doubly-Robust Inverse Probability Weighting to Estimate Causal Effect

The International Journal of Biostatistics ◽

10.2202/1557-4679.1207 ◽

2010 ◽

Vol 6 (2) ◽

Cited By ~ 1

Author(s):

Geneviève Lefebvre ◽

Paul Gustafson

Keyword(s):

Causal Effect ◽

Model Misspecification ◽

Probability Weighting ◽

Inverse Probability ◽

Doubly Robust

G-computation and doubly robust standardisation for continuous-time data: A comparison with inverse probability weighting

10.1177/09622802211047345 ◽

2021 ◽

pp. 096228022110473

Author(s):

Arthur Chatton ◽

Florent Le Borgne ◽

Clémence Leyrat ◽

Yohann Foucher

Keyword(s):

Continuous Time ◽

Model Specification ◽

Practical Implementation ◽

Probability Weighting ◽

Survival Times ◽

Time Data ◽

Inverse Probability ◽

Real World Datasets ◽

Doubly Robust

In time-to-event settings, g-computation and doubly robust estimators are based on discrete-time data. However, many biological processes are evolving continuously over time. In this paper, we extend the g-computation and the doubly robust standardisation procedures to a continuous-time context. We compare their performance to the well-known inverse-probability-weighting estimator for the estimation of the hazard ratio and restricted mean survival times difference, using a simulation study. Under a correct model specification, all methods are unbiased, but g-computation and the doubly robust standardisation are more efficient than inverse-probability-weighting. We also analyse two real-world datasets to illustrate the practical implementation of these approaches. We have updated the R package RISCA to facilitate the use of these methods and their dissemination.

On weighting approaches for missing data

10.1177/0962280211403597 ◽

2011 ◽

Vol 22 (1) ◽

pp. 14-30 ◽

Cited By ~ 36

Author(s):

Lingling Li ◽

Changyu Shen ◽

Xiaochun Li ◽

James M Robins

Keyword(s):

Missing Data ◽

Selection Bias ◽

Probability Weighting ◽

Full Data ◽

Inverse Probability ◽

Intuitive Idea ◽

Complex Settings ◽

Conceptual Overview

We review the class of inverse probability weighting (IPW) approaches for the analysis of missing data under various missing data patterns and mechanisms. The IPW methods rely on the intuitive idea of creating a pseudo-population of weighted copies of the complete cases to remove selection bias introduced by the missing data. However, different weighting approaches are required depending on the missing data pattern and mechanism. We begin with a uniform missing data pattern (i.e. a scalar missing indicator indicating whether or not the full data is observed) to motivate the approach. We then generalise to more complex settings. Our goal is to provide a conceptual overview of existing IPW approaches and illustrate the connections and differences among these approaches.
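For the uniform missing data pattern mentioned in this abstract, the pseudo-population idea can be written out in a few lines. The notation is an assumption made here, not taken from the paper: R is the scalar indicator that the full data are observed, Y is the outcome, X are fully observed covariates, and π(X) = Pr(R = 1 | X) is assumed to be strictly positive.

```latex
\hat{\mu}_{\mathrm{IPW}}
  = \frac{1}{n}\sum_{i=1}^{n} \frac{R_i\,Y_i}{\pi(X_i)},
\qquad
\pi(X) = \Pr(R = 1 \mid X) > 0 .

% Unbiasedness under missing at random, i.e. E(R \mid X, Y) = \pi(X):
E\!\left[\frac{R\,Y}{\pi(X)}\right]
  = E\!\left[\frac{Y}{\pi(X)}\,E(R \mid X, Y)\right]
  = E\!\left[\frac{Y}{\pi(X)}\,\pi(X)\right]
  = E[Y].
```

Each complete case with covariates X stands in for 1/π(X) subjects, which is why the weighted complete cases reproduce the full-data mean in expectation.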

Review of inverse probability weighting for dealing with missing data

10.1177/0962280210395740 ◽

2011 ◽

Vol 22 (3) ◽

pp. 278-295 ◽

Cited By ~ 515

Author(s):

Shaun R Seaman ◽

Ian R White

Keyword(s):

Missing Data ◽

Probability Weighting ◽

Inverse Probability

A comparison of confounding adjustment methods with an application to early life determinants of childhood obesity

Journal of Developmental Origins of Health and Disease ◽

10.1017/s2040174414000415 ◽

2014 ◽

Vol 5 (6) ◽

pp. 435-447 ◽

Cited By ~ 9

Author(s):

L. Li ◽

K. Kleinman ◽

M. W. Gillman

Keyword(s):

Cesarean Section ◽

The Body ◽

Birth Cohort Study ◽

Probability Weighting ◽

Regression Estimate ◽

Inverse Probability ◽

Doubly Robust Estimation ◽

Doubly Robust ◽

Confounding Adjustment

We implemented six confounding adjustment methods: (1) covariate-adjusted regression, (2) propensity score (PS) regression, (3) PS stratification, (4) PS matching with two calipers, (5) inverse probability weighting and (6) doubly robust estimation to examine the associations between the body mass index (BMI) z-score at 3 years and two separate dichotomous exposure measures: exclusive breastfeeding v. formula only (n=437) and cesarean section v. vaginal delivery (n=1236). Data were drawn from a prospective pre-birth cohort study, Project Viva. The goal is to demonstrate the necessity and usefulness of, and approaches for, using multiple confounding adjustment methods to analyze observational data. Unadjusted (univariate) and covariate-adjusted linear regression associations of breastfeeding with BMI z-score were −0.33 (95% CI −0.53, −0.13) and −0.24 (−0.46, −0.02), respectively. The other approaches resulted in smaller n (204–276) because of poor overlap of covariates, but CIs were of similar width except for inverse probability weighting (75% wider) and PS matching with a wider caliper (76% wider). Point estimates ranged widely, however, from −0.01 to −0.38. For cesarean section, because of better covariate overlap, the covariate-adjusted regression estimate (0.20) was remarkably robust to all adjustment methods, and the widths of the 95% CIs differed less than in the breastfeeding example. Choice of covariate adjustment method can matter. Lack of overlap in covariate structure between exposed and unexposed participants in observational studies can lead to erroneous covariate-adjusted estimates and confidence intervals. We recommend inspecting covariate overlap and using multiple confounding adjustment methods. Similar results bring reassurance. Contradictory results suggest issues with either the data or the analytic method.
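The abstract recommends inspecting covariate overlap but does not say how this was done in the study. One common check, shown here only as a sketch under that caveat, is to compare the range of estimated propensity scores in the exposed and unexposed groups and count how many participants fall inside the shared range. The function names and the example scores are hypothetical.

```python
def common_support(ps_exposed, ps_unexposed):
    """Return the interval of propensity scores covered by both groups."""
    lo = max(min(ps_exposed), min(ps_unexposed))
    hi = min(max(ps_exposed), max(ps_unexposed))
    return lo, hi

def trim_to_overlap(ps, exposed):
    """ps: propensity scores; exposed: matching 0/1 exposure flags.

    Returns the common-support bounds and the indices of participants inside them.
    """
    e = [p for p, a in zip(ps, exposed) if a == 1]
    u = [p for p, a in zip(ps, exposed) if a == 0]
    lo, hi = common_support(e, u)
    keep = [i for i, p in enumerate(ps) if lo <= p <= hi]
    return lo, hi, keep

if __name__ == "__main__":
    # Hypothetical propensity scores for four unexposed and four exposed participants
    ps = [0.1, 0.2, 0.4, 0.6, 0.3, 0.5, 0.8, 0.9]
    exposed = [0, 0, 0, 0, 1, 1, 1, 1]
    lo, hi, keep = trim_to_overlap(ps, exposed)
    print(lo, hi, len(keep), "of", len(ps), "participants in the overlap region")
```

A large drop from the full sample to the overlap region is the kind of signal that would explain a reduced n for the propensity-score-based methods.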