Missing covariate data within cancer prognostic studies: a review of current reporting and proposed guidelines

This paper examines the problem of identification and inference in a conditional moment condition model with missing data, with special focus on the case where the conditioning covariates are missing. We impose no assumption on the distribution of the missing data, and we confront the missing data problem with a worst-case scenario approach. We characterize the sharp identified set and argue that this set is usually too complex to compute or to use for inference. Given this difficulty, we consider the construction of outer identified sets (i.e., supersets of the identified set) that are easier to compute and can still characterize the parameter of interest. Two different outer identification strategies are proposed. Both strategies are shown to have nontrivial identifying power, and both are relatively easy to use and to combine for inferential purposes.
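As a rough illustration of the worst-case idea only, and not of the paper's conditional moment setting, the sketch below bounds the mean of a variable that is known to lie in an interval when some of its values are missing. Each missing value is set to the lowest and then to the highest admissible value, with no assumption about why it is missing. The function name `worst_case_mean_bounds` and the toy data are hypothetical.

```python
def worst_case_mean_bounds(values, lower, upper):
    """Worst-case bounds on a mean when some values are missing.

    values: list of numbers, with None marking a missing entry.
    lower, upper: known logical bounds on the variable (an assumption
    of this sketch; without them the bounds would be uninformative).
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    observed = [v for v in values if v is not None]
    n_missing = n - len(observed)
    observed_sum = sum(observed)
    # Fill every missing entry with the smallest admissible value,
    # then with the largest, to get the two ends of the interval.
    low_end = (observed_sum + n_missing * lower) / n
    high_end = (observed_sum + n_missing * upper) / n
    return low_end, high_end


# Toy binary variable with two of five entries missing.
bounds = worst_case_mean_bounds([1, 0, None, 1, None], lower=0, upper=1)
print(bounds)
```

The width of the interval equals the missing fraction times `upper - lower`, which shows why such bounds widen quickly as missingness grows and why easier-to-compute outer sets can still be informative.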


The performance of multiple imputation for missing covariate data within the context of regression relative survival analysis

Statistics in Medicine ◽ 10.1002/sim.3476 ◽ 2008 ◽ Vol 27 (30) ◽ pp. 6310-6331 ◽ Cited By ~ 21

Author(s): Roch Giorgi ◽ Aurélien Belot ◽ Jean Gaudart ◽ Guy Launoy

Keyword(s): Survival Analysis ◽ Multiple Imputation ◽ Relative Survival ◽ Covariate Data ◽ Missing Covariate Data


A robust imputation method for missing responses and covariates in sample selection models

Statistical Methods in Medical Research ◽ 10.1177/0962280217715663 ◽ 2017 ◽ Vol 28 (1) ◽ pp. 102-116 ◽ Cited By ~ 3

Author(s): Emmanuel O Ogundimu ◽ Gary S Collins

Keyword(s): Economic Status ◽ Sample Selection ◽ Imputation Method ◽ Covariate Data ◽ Missing Responses ◽ Missing Covariate Data ◽ Parent Selection ◽ Partially Observed ◽ Exclusion Restrictions ◽ Better Than

Sample selection arises when the outcome of interest is partially observed in a study. Although sophisticated parametric and non-parametric statistical methods have been proposed to solve this problem, it remains unclear how to deal with selectively missing covariate data using simple multiple imputation techniques, especially in the absence of exclusion restrictions and under deviation from normality. Motivated by the 2003–2004 NHANES data, where previous authors have studied the effect of socio-economic status on blood pressure with missing data on the income variable, we propose a robust imputation technique based on the selection-t sample selection model. The imputation method, which is developed within the frequentist framework, is compared with competing alternatives in a simulation study. The results indicate that the robust alternative is not susceptible to the absence of exclusion restrictions, a property inherited from the parent selection-t model, and that it performs better than models based on the normal assumption even when the data are generated from the normal distribution. Applications to missing outcome and covariate data further corroborate the robustness properties of the proposed method. We implemented the proposed approach within the MICE environment in the R statistical software.
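Multiple imputation, whatever model generates the imputed values, usually ends with a pooling step that combines the per-dataset results using Rubin's rules. The sketch below shows only that generic pooling step; it is not the selection-t imputation model described above, and the function name `pool_rubin` and the toy numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets with Rubin's rules.

    estimates: point estimate from each imputed dataset.
    variances: squared standard error from each imputed dataset.
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations of equal length")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total, math.sqrt(total)


# Toy results from three imputed datasets (illustrative numbers only).
pooled, total, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total, se)
```

The `(1 + 1 / m)` factor inflates the between-imputation variance to account for using a finite number of imputations, so the pooled standard error reflects the uncertainty caused by the missing data as well as ordinary sampling error.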
