Improving upon the efficiency of complete case analysis when covariates are MNAR

Biostatistics ◽  
2014 ◽  
Vol 15 (4) ◽  
pp. 719-730 ◽  
Author(s):  
Jonathan W. Bartlett ◽  
James R. Carpenter ◽  
Kate Tilling ◽  
Stijn Vansteelandt

Abstract Missing values in covariates of regression models are a pervasive problem in empirical research. Popular approaches for analyzing partially observed datasets include complete case analysis (CCA), multiple imputation (MI), and inverse probability weighting (IPW). In the case of missing covariate values, these methods (as typically implemented) are valid under different missingness assumptions. In particular, CCA is valid under missing not at random (MNAR) mechanisms in which missingness in a covariate depends on the value of that covariate, but is conditionally independent of outcome. In this paper, we argue that in some settings such an assumption is more plausible than the missing at random assumption underpinning most implementations of MI and IPW. When the former assumption holds, although CCA gives consistent estimates, it does not make use of all observed information. We therefore propose an augmented CCA approach which makes the same conditional independence assumption for missingness as CCA, but which improves efficiency through specification of an additional model for the probability of missingness, given the fully observed variables. The new method is evaluated using simulations and illustrated through application to data on reported alcohol consumption and blood pressure from the US National Health and Nutrition Examination Survey, in which data are likely MNAR independent of outcome.
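
The validity claim for CCA in the abstract above can be checked with a small simulation. The sketch below is illustrative only and is not the augmented CCA estimator proposed in the paper: the data-generating model (a linear outcome with true slope 2), the two logistic missingness mechanisms, and all function names are assumptions introduced here. It contrasts covariate missingness that depends on the covariate alone with missingness that depends on the outcome.

```python
import math
import random


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ols_slope(pairs):
    """Least-squares slope of y on x (one covariate plus intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


def simulate_cca(n=20000, seed=1):
    """Return complete-case slopes under two hypothetical missingness mechanisms."""
    rng = random.Random(seed)
    full = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # true slope is 2
        full.append((x, y))

    # Mechanism A (assumed): x is observed with a probability that depends
    # on x only, i.e. MNAR in x but independent of y given x.
    cc_a = [(x, y) for x, y in full if rng.random() < expit(-2.0 * x)]

    # Mechanism B (assumed): x is observed with a probability that depends on y.
    cc_b = [(x, y) for x, y in full if rng.random() < expit(-2.0 * (y - 1.0))]

    return ols_slope(cc_a), ols_slope(cc_b)


slope_a, slope_b = simulate_cca()
print(f"covariate-dependent missingness: slope = {slope_a:.3f}")
print(f"outcome-dependent missingness:   slope = {slope_b:.3f}")
```

Under these assumptions the first slope should land close to 2, while the second is pulled away from it. This is consistent with the statement that CCA remains valid when missingness depends on the covariate but is conditionally independent of the outcome.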

2019 ◽  
Vol 12 (1) ◽  
pp. 45-55
Author(s):  
Mwiche Musukuma ◽  
Brian Sonkwe ◽  
Isaac Fwemba ◽  
Patrick Musonda

Background: With the increasing use of secondary data in epidemiological studies, the question of how to manage missing data has become more relevant. Our study applied imputation techniques to traumatic spinal cord injury data, a medical area where data are generally sporadic. Traumatic spinal cord injuries due to blunt force cause widespread physiological impairments and both medical and non-medical problems. The effects of spinal cord injuries are a burden not only to the victims but also to their families and to the entire health system of a country. This study also evaluated the causes of traumatic spinal cord injuries in patients admitted to the University Teaching Hospital and the factors associated with clinical complications in these patients. Methods: The study used data from the medical records of patients admitted to the University Teaching Hospital in Lusaka, Zambia. Patients presenting with traumatic spinal cord injuries between 1st January 2013 and 31st December 2017 were included. The data were first analysed using complete case analysis; multiple imputation techniques were then applied to account for the missing data. Thereafter, both descriptive and inferential analyses were performed on the imputed data. Results: During the study period, a total of 176 patients were identified as having suffered spinal cord injuries. Road traffic accidents accounted for 56% (101) of the injuries. Clinical complications suffered by these patients included paralysis, death, bowel and bladder dysfunction, and pressure sores, among others. Eighty-eight (50%) patients had paralysis. Patients with cervical spine injuries had 87% lower odds of suffering clinical complications than patients with thoracic spine injuries (OR=0.13, 95% CI {0.08, 0.22}, p<.0001). Being paraplegic at discharge increased the odds of developing a clinical complication about eightfold (OR=8.01, 95% CI {2.74, 23.99}, p<.001). Undergoing an operation increased the odds of having a clinical complication (OR=3.71, 95% CI {1.99, 6.88}, p<.0001). A patient who presented with Frankel Grade C or E had a 96% reduction in the odds of having a clinical complication (OR=0.04, 95% CI {0.02, 0.09} and {0.02, 0.12} respectively, p<.0001) compared to a patient who presented with Frankel Grade A. Conclusion: A comparison of estimates obtained from complete case analysis and from multiple imputation revealed that, when there are many missing values, estimates obtained from complete case analysis are unreliable and lack power. Efforts should be made to use methods for handling missing values, such as multiple imputation. The most common cause of traumatic spinal cord injuries was road traffic accidents. Findings suggest that paralysis had the greatest negative effect on clinical complications. As Frankel Grade increased from A to E, a patient became less likely to develop clinical complications. No evidence of an association was found between age or sex and developing a clinical complication.


2020 ◽  
Vol 4 (2) ◽  
pp. 9-12
Author(s):  
Dler H. Kadir

Increasing the response rate and minimizing non-response are among the primary challenges facing researchers in longitudinal and cohort research. This is most obvious in the area of paediatric medicine. When data are missing, complete case analysis can produce biased findings. Inverse Probability Weighting (IPW) is one of many available approaches for reducing the bias of a complete case analysis. Here, each complete case is weighted by the inverse of its probability of being a complete case. The data for this work were collected from the neonatal intensive care unit at Erbil maternity hospital for the years 2012 to 2017. In total, 570 babies (288 male and 282 female) were born very preterm. The aim of this paper is to apply inverse probability weighting to a Bayesian logistic model of developmental outcome. The Mental Development Index (MDI) is used to assess the cognitive development of those born very preterm. Almost half of the information for the babies was missing, meaning that we do not know whether or not they have cognitive development issues. We obtained greater precision in the results: the standard deviations of the parameter estimates were smaller in the weighted posterior model than in the frequentist analysis.
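
The weighting step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's Bayesian logistic model: it estimates a simple mean, the response probability is estimated as the observed fraction within each level of a fully observed covariate `z`, and the toy records and function names are hypothetical.

```python
from collections import defaultdict


def complete_case_mean(records):
    """Unweighted mean of the outcome over records where it was observed."""
    observed = [y for _, y in records if y is not None]
    return sum(observed) / len(observed)


def ipw_mean(records):
    """Weighted complete-case mean with weights 1 / P(observed | z).

    The response probability is estimated as the observed fraction within
    each level of the fully observed covariate z (an assumed, simple model).
    """
    total = defaultdict(int)
    seen = defaultdict(int)
    for z, y in records:
        total[z] += 1
        if y is not None:
            seen[z] += 1

    weighted_sum = 0.0
    weight_total = 0.0
    for z, y in records:
        if y is None:
            continue  # incomplete cases contribute only through the weights
        weight = total[z] / seen[z]  # inverse of the estimated response probability
        weighted_sum += weight * y
        weight_total += weight
    return weighted_sum / weight_total


# Toy data: (z, y) pairs; None marks a missing outcome.
records = [
    (0, 1.0), (0, 3.0), (0, None), (0, None),
    (1, 10.0), (1, 10.0), (1, 12.0), (1, 12.0),
]
print(complete_case_mean(records))  # 8.0
print(ipw_mean(records))            # 6.5
```

In the toy data, half of the outcomes in stratum `z = 0` are missing, so the unweighted complete-case mean over-represents stratum `z = 1`. Weighting each complete case by the inverse of its estimated response probability restores the equal stratum sizes of the full sample.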


Author(s):  
Tra My Pham ◽  
Irene Petersen ◽  
James Carpenter ◽  
Tim Morris

ABSTRACT Background: Ethnicity is an important factor to be considered in health research because of its association with inequality in disease prevalence and the utilisation of healthcare. Ethnicity recording has been incorporated in primary care electronic health records, and hence is available in large UK primary care databases such as The Health Improvement Network (THIN). However, since primary care data are routinely collected for clinical purposes, a large amount of data that are relevant for research, including ethnicity, is often missing. A popular approach for missing data is multiple imputation (MI). However, the conventional MI method assuming data are missing at random does not give plausible estimates of the ethnicity distribution in THIN compared to the general UK population. This might be due to the fact that ethnicity data in primary care are likely to be missing not at random. Objectives: I propose a new MI method, termed ‘weighted multiple imputation’, to deal with data that are missing not at random in categorical variables. Methods: Weighted MI combines MI and probability weights which are calculated using external data sources. Census summary statistics for ethnicity can be used to form weights in weighted MI such that the correct marginal ethnic breakdown is recovered in THIN. I conducted a simulation study to examine weighted MI when ethnicity data are missing not at random. In this simulation study, which resembled a THIN dataset, ethnicity was an independent variable in a survival model alongside other covariates. Weighted MI was compared to conventional MI and other traditional missing data methods, including complete case analysis and single imputation. Results: While a small bias was still present in ethnicity coefficient estimates under weighted MI, it was less severe compared to MI assuming missing at random. Complete case analysis and single imputation were inadequate to handle data that are missing not at random in ethnicity. Conclusions: Although not a total cure, weighted MI represents a pragmatic approach that has potential applications not only in ethnicity but also in other incomplete categorical health indicators in electronic health records.


2012 ◽  
Vol 40 (6) ◽  
pp. 3031-3049 ◽  
Author(s):  
Hira L. Koul ◽  
Ursula U. Müller ◽  
Anton Schick

2020 ◽  
Vol 189 (12) ◽  
pp. 1583-1589
Author(s):  
Rachael K Ross ◽  
Alexander Breskin ◽  
Daniel Westreich

Abstract When estimating causal effects, careful handling of missing data is needed to avoid bias. Complete-case analysis is commonly used in epidemiologic analyses. Previous work has shown that covariate-stratified effect estimates from complete-case analysis are unbiased when missingness is independent of the outcome conditional on the exposure and covariates. Here, we assess the bias of complete-case analysis for adjusted marginal effects when confounding is present under various causal structures of missing data. We show that estimation of the marginal risk difference requires an unbiased estimate of the unconditional joint distribution of confounders and any other covariates required for conditional independence of missingness and outcome. The dependence of missing data on these covariates must be considered to obtain a valid estimate of the covariate distribution. If none of these covariates are effect-measure modifiers on the absolute scale, however, the marginal risk difference will equal the stratified risk differences and the complete-case analysis will be unbiased when the stratified effect estimates are unbiased. Estimation of unbiased marginal effects in complete-case analysis therefore requires close consideration of causal structure and effect-measure modification.
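
The role of effect-measure modification in the abstract above can be shown with a short arithmetic sketch. All numbers and names below are hypothetical and chosen only for illustration: a single binary covariate `z`, a full-population distribution of `z`, and a distorted distribution among complete cases. The marginal risk difference is computed by standardizing the stratum-specific risk differences over each distribution.

```python
def marginal_risk_difference(stratum_rd, stratum_prob):
    """Standardize stratum-specific risk differences over a covariate distribution."""
    if abs(sum(stratum_prob.values()) - 1.0) > 1e-9:
        raise ValueError("stratum probabilities must sum to 1")
    return sum(stratum_rd[z] * stratum_prob[z] for z in stratum_rd)


# Hypothetical distributions of a binary covariate z.
full_population = {0: 0.5, 1: 0.5}
complete_cases = {0: 0.2, 1: 0.8}  # missingness depends on z

# Case 1: z modifies the risk difference on the absolute scale.
modified_rd = {0: 0.1, 1: 0.3}
print(marginal_risk_difference(modified_rd, full_population))
print(marginal_risk_difference(modified_rd, complete_cases))

# Case 2: no effect-measure modification on the absolute scale.
constant_rd = {0: 0.2, 1: 0.2}
print(marginal_risk_difference(constant_rd, full_population))
print(marginal_risk_difference(constant_rd, complete_cases))
```

With effect-measure modification, the complete-case covariate distribution shifts the marginal risk difference from about 0.20 to about 0.26 even though every stratum-specific estimate is unbiased. With a constant risk difference, both distributions give about 0.20.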


Trials ◽  
2015 ◽  
Vol 16 (S2) ◽  
Author(s):  
Sofia Bazakou ◽  
Robin Henderson ◽  
Linda Sharples ◽  
John Matthews
