missingness mechanism
Recently Published Documents


TOTAL DOCUMENTS: 16 (FIVE YEARS 11)
H-INDEX: 2 (FIVE YEARS 1)

Entropy ◽ 2020 ◽ Vol 22 (10) ◽ pp. 1154
Author(s): Jiwei Zhao ◽ Chi Chen

We study how to conduct statistical inference in a regression model where the outcome variable is prone to missing values and the missingness mechanism is unknown. The model may be a traditional setting or a modern high-dimensional setting, where a sparsity assumption is usually imposed and regularization is commonly used. Motivated by the fact that the missingness mechanism, although usually treated as a nuisance, is difficult to specify correctly, we adopt the conditional likelihood approach so that the nuisance can be completely ignored throughout our procedure. We establish the asymptotic theory of the proposed estimator and develop an easy-to-implement algorithm via a data manipulation strategy. In particular, under the high-dimensional setting where regularization is needed, we propose a data perturbation method for post-selection inference. The proposed methodology is especially appealing when the true missingness mechanism is likely to be missing not at random, e.g., in patient-reported outcomes or real-world data such as electronic health records. The performance of the proposed method is evaluated by comprehensive simulation experiments as well as a study of the albumin level in the MIMIC-III database.
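
To make the idea of a conditional likelihood that ignores the missingness mechanism concrete, here is a minimal sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes a Gaussian linear model with a single covariate, unit error variance, and a missingness probability that depends on the outcome only. The function names `simulate` and `fit_pairwise_conditional` are hypothetical. Under these assumptions, conditioning each observed pair on its unordered outcome values cancels the unknown mechanism, and the slope divided by the error variance can be estimated from the pairwise products d_ij = (y_i - y_j)(x_i - x_j) alone.

```python
import numpy as np


def simulate(n=4000, beta=0.5, seed=0):
    """Draw (x, y, r) with y = 1 + beta * x + N(0, 1) noise; r flags observed outcomes."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + beta * x + rng.normal(size=n)
    # Assumed MNAR mechanism: the chance of observing y depends on y itself.
    p_obs = 1.0 / (1.0 + np.exp(-(1.0 - y)))
    r = rng.random(n) < p_obs
    return x, y, r


def fit_pairwise_conditional(x, y, n_iter=25):
    """Maximize sum over i<j of log sigmoid(gamma * d_ij) by Newton's method.

    gamma corresponds to beta / sigma^2 in the assumed Gaussian linear model.
    The missingness mechanism never enters the objective.
    """
    i, j = np.triu_indices(len(y), k=1)
    d = (y[i] - y[j]) * (x[i] - x[j])
    gamma = 0.0
    for _ in range(n_iter):
        s = 1.0 / (1.0 + np.exp(-gamma * d))
        grad = np.sum(d * (1.0 - s))
        hess = -np.sum(d * d * s * (1.0 - s))
        step = grad / hess
        gamma -= step
        if abs(step) < 1e-10:
            break
    return float(gamma)


x, y, r = simulate()
print("pairwise conditional estimate:", fit_pairwise_conditional(x[r], y[r]))
print("complete-case OLS slope:", np.polyfit(x[r], y[r], 1)[0])
```

With unit error variance the target value is the true slope (0.5 in this simulation). The complete-case least-squares slope is printed only for comparison; because selection here depends on the outcome, it would be expected to be biased.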


Author(s): Roderick J. Little

I review assumptions about the missing-data mechanism that underlie methods for the statistical analysis of data with missing values. I describe Rubin's original definition of missing at random (MAR), its motivation and criticisms, and his sufficient conditions for ignoring the missingness mechanism for likelihood-based, Bayesian, and frequentist inference. Related definitions, including missing completely at random, always MAR, always missing completely at random, and partially MAR, are also covered. I present a formal argument for weakening Rubin's sufficient conditions for frequentist maximum likelihood inference with precision based on the observed information. Some simple examples of MAR are described, together with an example where the missingness mechanism can be ignored even though MAR does not hold. Alternative approaches to statistical inference based on the likelihood function are reviewed, along with non-likelihood frequentist approaches, including weighted generalized estimating equations. Connections with the causal inference literature are also discussed. Finally, alternatives to Rubin's MAR definition are discussed, including informative missingness, informative censoring, and coarsening at random. The intent is to provide a relatively nontechnical discussion, although some of the underlying issues are challenging and touch on fundamental questions of statistical inference. Expected final online publication date for the Annual Review of Statistics, Volume 8 is March 7, 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
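
For readers who want the definitions in symbols, the following is the common textbook form, not a quotation from the review. The notation is an assumption: Y = (Y_obs, Y_mis) is the complete data, R the missingness indicators, θ the data-model parameter, and φ the mechanism parameter. The conditions are stated for all values of the missing data, which corresponds to the "always" variants mentioned above and is stronger than a condition required only at the realized data.

```latex
\begin{align*}
\text{MCAR:}\quad & f(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = f(R \mid \phi), \\
\text{MAR:}\quad  & f(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = f(R \mid Y_{\mathrm{obs}}, \phi)
                    \quad \text{for all } Y_{\mathrm{mis}}, \\
\text{MNAR:}\quad & f(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) \text{ depends on } Y_{\mathrm{mis}}.
\end{align*}

% Why MAR permits ignoring the mechanism in likelihood inference:
\begin{align*}
f(Y_{\mathrm{obs}}, R \mid \theta, \phi)
  &= \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta)\,
          f(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi)\, dY_{\mathrm{mis}} \\
  &= f(R \mid Y_{\mathrm{obs}}, \phi)\, f(Y_{\mathrm{obs}} \mid \theta)
  \qquad \text{(under MAR)}.
\end{align*}
```

When θ and φ are also distinct parameters, the factor involving φ carries no information about θ, so likelihood-based inference for θ can use f(Y_obs | θ) alone.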


2020 ◽ Vol 117 (32) ◽ pp. 19045-19053
Author(s): Alexander M. Franks ◽ Edoardo M. Airoldi ◽ Donald B. Rubin

Data analyses typically rely upon assumptions about the missingness mechanisms that lead to observed versus missing data, assumptions that are typically unassessable. We explore an approach where the joint distribution of observed data and missing data is specified in a nonstandard way. In this formulation, which traces back to a representation of the joint distribution of the data and missingness mechanism, apparently first proposed by J. W. Tukey, the modeling assumptions about the distributions are either assessable or are designed to allow relatively easy incorporation of substantive knowledge about the problem at hand, thereby offering a possibly realistic portrayal of the data, both observed and missing. We develop Tukey’s representation for exponential-family models, propose a computationally tractable approach to inference in this class of models, and offer some general theoretical comments. We then illustrate the utility of this approach with an example in systems biology.
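
As a rough guide to what such a representation looks like, the identity below follows from Bayes' rule alone. The notation is an assumption, not taken from the paper: y is a data value and R = 1 marks an observed value, R = 0 a missing one. The identity expresses the distribution of the missing data through the observed-data density, which can be checked against data, and the selection odds, which is where substantive knowledge can enter.

```latex
\begin{align*}
f(y \mid R = r) &= \frac{P(R = r \mid y)\, f(y)}{P(R = r)}, \qquad r \in \{0, 1\}, \\[4pt]
f(y \mid R = 0) &= f(y \mid R = 1)\,
                   \frac{P(R = 0 \mid y)}{P(R = 1 \mid y)}\,
                   \frac{P(R = 1)}{P(R = 0)} .
\end{align*}
```

The second line is the ratio of the first line at r = 0 and r = 1, in which the marginal density f(y) cancels.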


Author(s): Samah Zakaria ◽ Mai Sherif Hafez ◽ Ahmed Mahmoud Gad

Latent variable models are widely used in social sciences for measuring constructs (latent variables) such as ability, attitude, behavior, and wellbeing. Those unobserved constructs are measured through a number of observed items (variables). The observed variables are often subject to item nonresponse, which may be nonignorable. Incorporating a missingness mechanism within the model used to analyze data with nonresponse is crucial to obtain valid estimates for parameters, especially when the missingness is nonignorable. In this paper, we propose a latent class model (LCM) where a categorical latent variable is used to capture a latent phenomenon, and another categorical latent variable is used to summarize response propensity. The proposed model incorporates a missingness mechanism. Bayesian estimation using Markov Chain Monte Carlo (MCMC) methods is used for fitting this LCM. Real data with binary items from the 2014 Egyptian Demographic and Health Survey (EDHS14) are used. Different levels of missingness are artificially created in order to study the results of the model under low, moderate, and high levels of missingness.

