An Alternative Sensitivity Approach for Longitudinal Analysis with Dropout

2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Amal Almohisen ◽  
Robin Henderson ◽  
Arwa M. Alshingiti

In any longitudinal study, dropout before the final timepoint can rarely be avoided. The chosen dropout model is commonly one of four types: Missing Completely at Random (MCAR), Missing at Random (MAR), Missing Not at Random (MNAR), and Shared Parameter (SP). In this paper we estimate the parameters of the longitudinal model for simulated and real data using the Linear Mixed Effects (LME) method. We investigate the consequences of misspecifying the missingness mechanism by deriving the so-called least false values: the values to which the parameter estimates converge when the assumed mechanism is wrong. Knowledge of the least false values allows us to conduct a sensitivity analysis, which we illustrate. This method provides an alternative to a local misspecification sensitivity procedure developed for likelihood-based analysis. We compare the results obtained by the proposed method with those found using the local misspecification method, and we apply both the local misspecification and least false methods to estimate the bias and sensitivity of parameter estimates in a clinical trial example.
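The idea of a least false value can be made concrete with a toy simulation: under MNAR dropout, a complete-case estimate converges to a value other than the true parameter as the sample grows. The sketch below is not the authors' LME analysis; the random-intercept model, the dropout rule, and all parameter values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000  # large n, so estimates are near their limiting ("least false") values

# Random-intercept model: Y_it = beta0 + beta1 * t + b_i + e_it, with t in {0, 1}
beta0, beta1 = 2.0, 1.0
b = rng.normal(0.0, 1.0, n)                       # subject-specific random intercepts
y1 = beta0 + beta1 + b + rng.normal(0.0, 0.5, n)  # outcome at the final timepoint

# MNAR dropout: the chance of missing y1 increases with the unobserved y1 itself
p_drop = 1.0 / (1.0 + np.exp(-(y1 - 3.0)))
observed = rng.random(n) > p_drop

# A complete-case mean (implicitly assuming MCAR/MAR) converges to a
# "least false" value, not to the true time-1 mean beta0 + beta1 = 3.0
naive_mean = y1[observed].mean()
print(f"true time-1 mean: {beta0 + beta1:.2f}")
print(f"complete-case (least false) limit, approximately: {naive_mean:.2f}")
```

With a large sample the complete-case mean sits well below the true time-1 mean of 3.0; that limiting value is exactly the kind of least false quantity on which the sensitivity analysis is built.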

2020 ◽  
Vol 29 (10) ◽  
pp. 3076-3092 ◽  
Author(s):  
Susan Gachau ◽  
Matteo Quartagno ◽  
Edmund Njeru Njagi ◽  
Nelson Owuor ◽  
Mike English ◽  
...  

Missing information is a major drawback in analyzing data collected in many routine health care settings. Multiple imputation assuming a missing at random mechanism is a popular method to handle missing data. The missing at random assumption cannot be confirmed from the observed data alone, hence the need for sensitivity analysis to assess the robustness of inference. However, sensitivity analysis is rarely conducted and reported in practice. We analyzed routine paediatric data collected during a cluster randomized trial conducted in Kenyan hospitals. We imputed missing patient- and clinician-level variables assuming the missing at random mechanism. We also imputed missing clinician-level variables assuming a missing not at random mechanism, incorporating opinions from 15 clinical experts in the form of prior distributions and shift parameters in the delta adjustment method. An interaction between trial intervention arm and follow-up time, together with hospital-, clinician- and patient-level factors, was included in a proportional odds random-effects analysis model. We performed these analyses using R functions derived from the jomo package. Parameter estimates from multiple imputation under the missing at random mechanism were similar to multiple imputation estimates assuming the missing not at random mechanism. Our inferences were insensitive to departures from the missing at random assumption using either the prior distributions or the shift parameters sensitivity analysis approach.
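The delta adjustment method can be sketched in a few lines: impute under MAR, then shift the imputations by a sensitivity parameter delta to represent an MNAR mechanism in which non-responders do systematically worse. This is a simplified single-imputation illustration, not the jomo-based multiple imputation used in the paper; the data-generating model and delta values are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Fully observed covariate x; outcome y is MAR given x
x = rng.normal(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n)
missing = rng.random(n) < 1.0 / (1.0 + np.exp(-x))  # depends on observed x only

# MAR step: regression imputation fitted to the observed cases
slope, intercept = np.polyfit(x[~missing], y[~missing], 1)

def delta_adjusted_mean(delta):
    # Delta adjustment: shift the MAR imputations by delta to represent an
    # MNAR mechanism in which non-responders do systematically worse
    y_filled = np.where(missing, intercept + slope * x + delta, y)
    return y_filled.mean()

for delta in (0.0, -0.5, -1.0):
    print(f"delta = {delta:+.1f}  estimated mean = {delta_adjusted_mean(delta):.3f}")
```

Varying delta and re-estimating traces out how far the mechanism must depart from MAR before the substantive conclusions change, which is the logic of the shift-parameter sensitivity analysis described above.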


2012 ◽  
Vol 19 (6) ◽  
pp. 775-781 ◽  
Author(s):  
Jiameng Zhang ◽  
Emmanuelle Waubant ◽  
Gary Cutter ◽  
Jerry S Wolinsky ◽  
Robert Glanzman

Background: Baseline Expanded Disability Status Scale (EDSS) is usually based on a single measurement. Here we evaluated whether a baseline EDSS derived from two pre-treatment measurements improves the detection of progression events and the ability to demonstrate a therapeutic effect in delaying MS disability progression. Methods: Real data from OLYMPUS, a phase II/III randomized, placebo-controlled trial of rituximab in patients with primary progressive multiple sclerosis (PPMS), as well as simulated data were analyzed. Several definitions of baseline EDSS were used to capture sustained disability progression (SDP) events. Variations in the EDSS were estimated by linear mixed-effect models. Results: Selecting the higher of two baseline EDSS scores lowered the number of SDP events in both treatment groups, thereby decreasing sensitivity, and reduced the number of false SDP events, thereby increasing specificity. Conversely, selecting the lower of two baseline scores increased sensitivity but decreased specificity. Increased power (~7% based on the simulation study) was observed when the average of the screening and Week 0 EDSS scores was used as the baseline. Conclusion: A baseline EDSS derived from two pre-treatment measurements may enhance the ability to detect a therapeutic effect in slowing disability progression in PPMS. This strategy could be implemented in future clinical trials in patients with MS.
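The sensitivity/specificity trade-off among baseline definitions can be illustrated with a rough simulation in which there is no true progression, so every detected SDP event is false. This is not the OLYMPUS analysis; the noise level and the 1.0-point progression threshold are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Two noisy pre-treatment EDSS measurements around a true baseline of 4.0,
# and a follow-up score with no true progression: every detected event is false
true_base = 4.0
screen = true_base + rng.normal(0.0, 0.5, n)
week0 = true_base + rng.normal(0.0, 0.5, n)
follow = true_base + rng.normal(0.0, 0.5, n)

def false_sdp_rate(baseline):
    # Flag "progression" when follow-up exceeds baseline by at least 1.0 point
    return np.mean(follow >= baseline + 1.0)

lower_rate = false_sdp_rate(np.minimum(screen, week0))
average_rate = false_sdp_rate((screen + week0) / 2.0)
higher_rate = false_sdp_rate(np.maximum(screen, week0))
print(f"lower of two:  false SDP rate {lower_rate:.3f}")
print(f"average:       false SDP rate {average_rate:.3f}")
print(f"higher of two: false SDP rate {higher_rate:.3f}")
```

Because the lower of two noisy scores understates the true baseline it inflates false progression events, the higher overstates it and suppresses them, and the average sits in between, matching the trade-off described in the abstract.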


2020 ◽  
Author(s):  
Suzie Cro ◽  
Tim P Morris ◽  
Brennan C Kahan ◽  
Victoria R Cornelius ◽  
James R Carpenter

Abstract Background: The coronavirus pandemic (Covid-19) presents a variety of challenges for ongoing clinical trials, including an inevitably higher rate of missing outcome data, with new and non-standard reasons for missingness. International drug trial guidelines recommend trialists review plans for handling missing data in the conduct and statistical analysis, but clear recommendations are lacking. Methods: We present a four-step strategy for handling missing outcome data in the analysis of randomised trials that are ongoing during a pandemic. We consider handling missing data arising due to (i) participant infection, (ii) treatment disruptions and (iii) loss to follow-up. We consider settings where treatment effects for both a ‘pandemic-free world’ and a ‘world including a pandemic’ are of interest. Results: In any trial, investigators should: (1) clarify the treatment estimand of interest with respect to the occurrence of the pandemic; (2) establish what data are missing for the chosen estimand; (3) perform the primary analysis under the most plausible missing data assumptions; and (4) perform sensitivity analysis under alternative plausible assumptions. To obtain an estimate of the treatment effect in a ‘pandemic-free world’, participant data that are clinically affected by the pandemic (directly due to infection or indirectly via treatment disruptions) are not relevant and can be set to missing. For the primary analysis, a missing-at-random assumption that conditions on all observed data expected to be associated with both the outcome and missingness may be most plausible. For the treatment effect in the ‘world including a pandemic’, all participant data are relevant and should be included in the analysis. For the primary analysis, a missing-at-random assumption – potentially incorporating a pandemic time-period indicator and participant infection status – or a missing-not-at-random assumption with a poorer response may be most relevant, depending on the setting.
In all scenarios, sensitivity analysis under credible missing-not-at-random assumptions should be used to evaluate the robustness of results. We highlight controlled multiple imputation as an accessible tool for conducting sensitivity analyses. Conclusions: Missing data problems will be exacerbated for trials active during the Covid-19 pandemic. This four-step strategy will facilitate clear thinking about the appropriate analysis for relevant questions of interest.
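The first three steps can be sketched for a ‘pandemic-free world’ estimand: flag pandemic-affected outcomes, set them to missing, and run a primary analysis under an assumed missingness mechanism. This is a deliberately minimal illustration, not the authors' recommended implementation; the trial model, disruption rates, and the complete-case analysis (valid here only because the simulated disruptions are independent of the outcome) are all invented.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400

# Simulated two-arm trial with a true treatment effect of -1.0
arm = rng.integers(0, 2, n)
outcome = 5.0 - 1.0 * arm + rng.normal(0.0, 1.0, n)

# Steps 1-2: for a 'pandemic-free world' estimand, outcomes clinically affected
# by the pandemic (infection, treatment disruption) are not relevant -> set missing
infected = rng.random(n) < 0.05
disrupted = rng.random(n) < 0.10
y = np.where(infected | disrupted, np.nan, outcome)

# Step 3: primary analysis; a complete-case difference in means is valid here
# only because the simulated disruptions are independent of the outcome
cc = ~np.isnan(y)
effect = y[cc & (arm == 1)].mean() - y[cc & (arm == 0)].mean()
print(f"estimated treatment effect (pandemic-free estimand): {effect:.2f}")
```

Step 4 would then re-run the analysis under missing-not-at-random assumptions, for example via the delta-adjusted controlled multiple imputation the authors highlight.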


2015 ◽  
Vol 26 (3) ◽  
pp. 1093-1109 ◽  
Author(s):  
Osval A Montesinos-López ◽  
Abelardo Montesinos-López ◽  
Kent Eskridge ◽  
José Crossa

Because pools are tested instead of individuals in group testing, this technique is helpful for estimating prevalence in a population or for classifying a large number of individuals into two groups at low cost. For this reason, group testing is a well-known means of saving costs and producing precise estimates. In this paper, we developed a mixed-effect group testing regression model that is useful when the data-collecting process is performed using inverse sampling. This model allows covariate information at the individual level to be included, incorporating heterogeneity among individuals and identifying which covariates are associated with positive individuals. We present an approach to fitting this model using maximum likelihood, and we performed a simulation study to evaluate the quality of the estimates. Based on the simulation study, we found that the proposed regression method for inverse sampling with group testing produces parameter estimates with low bias when the pre-specified number of positive pools (r) used to stop the sampling process is at least 10 and the number of clusters in the sample is also at least 10. We present an application with real data and provide NLMIXED code that researchers can use to implement this method.
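The inverse sampling scheme itself (testing pools until a pre-specified number r of positive pools is observed) can be illustrated outside the regression setting. The sketch below uses a textbook negative-binomial estimator for the pool-positivity probability, (r-1)/(N-1), and back-transforms to individual prevalence for a common pool size k; it is not the authors' mixed-effect model or their NLMIXED code, and all numerical settings are invented.

```python
import numpy as np

rng = np.random.default_rng(11)

p_true = 0.05  # individual prevalence
k = 5          # pool size
r = 10         # inverse sampling: stop once r positive pools are observed

def inverse_sampling_estimate():
    positives, n_pools = 0, 0
    while positives < r:
        pool_positive = (rng.random(k) < p_true).any()  # pool is positive if
        positives += pool_positive                      # any member is positive
        n_pools += 1
    pi_hat = (r - 1) / (n_pools - 1)  # unbiased for the pool-positivity prob.
    return 1.0 - (1.0 - pi_hat) ** (1.0 / k)  # back-transform to individual level

est = np.mean([inverse_sampling_estimate() for _ in range(2000)])
print(f"true prevalence {p_true}, mean estimate over 2000 replicates {est:.3f}")
```

Averaged over replicates the estimator lands close to the true prevalence, consistent with the abstract's finding that bias is low once r is at least 10.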


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Giulia Carreras ◽  
◽  
Guido Miccinesi ◽  
Andrew Wilcock ◽  
Nancy Preston ◽  
...  

Abstract Background Missing data are common in end-of-life care studies, but there is still relatively little exploration of which method is best for dealing with them and, in particular, of whether the missing at random (MAR) assumption is valid or missing not at random (MNAR) mechanisms should be assumed. In this paper we investigated this issue through a sensitivity analysis within the ACTION study, a multicenter cluster randomized controlled trial testing advance care planning in patients with advanced lung or colorectal cancer. Methods Multiple imputation procedures under MAR and MNAR assumptions were implemented. Possible violation of the MAR assumption was addressed with reference to variables measuring quality of life and symptoms. The MNAR model assumed that patients with worse health were more likely to have missing questionnaires, distinguishing between single missing items, which were assumed to satisfy the MAR assumption, and missing values due to a completely missing questionnaire, for which an MNAR mechanism was hypothesized. We explored the sensitivity to possible departures from MAR on gender differences in key indicators and on simple correlations. Results Up to 39% of follow-up data were missing. Results under MAR reflected that missingness was related to poorer health status. Correlations between variables, although very small, changed according to the imputation method, as did the differences in scores by gender, indicating a certain sensitivity of the results to violation of the MAR assumption. Conclusions The findings confirm the importance of undertaking this kind of analysis in end-of-life care studies.


Author(s):  
Salvatore D. Tomarchio ◽  
Paul D. McNicholas ◽  
Antonio Punzo

Abstract Finite mixtures of regressions with fixed covariates are a commonly used model-based clustering methodology to deal with regression data. However, they assume assignment independence, i.e., the allocation of data points to the clusters is made independently of the distribution of the covariates. To take into account the latter aspect, finite mixtures of regressions with random covariates, also known as cluster-weighted models (CWMs), have been proposed in the univariate and multivariate literature. In this paper, the CWM is extended to matrix data, e.g., those data where a set of variables are simultaneously observed at different time points or locations. Specifically, the cluster-specific marginal distribution of the covariates and the cluster-specific conditional distribution of the responses given the covariates are assumed to be matrix normal. Maximum likelihood parameter estimates are derived using an expectation-conditional maximization algorithm. Parameter recovery, classification assessment, and the capability of the Bayesian information criterion to detect the underlying groups are investigated using simulated data. Finally, two real data applications concerning educational indicators and the Italian non-life insurance market are presented.
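The matrix normal assumption at the core of the model can be made concrete: a p × t matrix observation X has a density governed by a p × p row (variable) covariance U and a t × t column (occasion) covariance V, and is equivalent to a pt-variate normal with Kronecker covariance V ⊗ U on vec(X). A minimal numerical check of this equivalence, which is not the authors' ECM algorithm and uses invented dimensions and covariances:

```python
import numpy as np

rng = np.random.default_rng(5)
p, t = 3, 4  # p variables observed at t occasions -> one matrix-valued observation

def matrix_normal_logpdf(X, M, U, V):
    # Matrix normal MN(M, U, V): U is the p x p row (variable) covariance,
    # V the t x t column (occasion) covariance
    p, t = X.shape
    R = X - M
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
    logdet = t * np.linalg.slogdet(U)[1] + p * np.linalg.slogdet(V)[1]
    return -0.5 * (p * t * np.log(2 * np.pi) + logdet + quad)

# Sanity check against the equivalent pt-variate normal on vec(X),
# whose covariance is the Kronecker product V (x) U (column stacking)
X = rng.normal(size=(p, t))
M = np.zeros((p, t))
U = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]])
V = 0.5 * np.eye(t)
lp = matrix_normal_logpdf(X, M, U, V)

Sigma = np.kron(V, U)
vecX = X.reshape(-1, order="F")
mvn_lp = -0.5 * (len(vecX) * np.log(2 * np.pi)
                 + np.linalg.slogdet(Sigma)[1]
                 + vecX @ np.linalg.solve(Sigma, vecX))
print(f"matrix normal logpdf {lp:.6f}  vs  vec-MVN logpdf {mvn_lp:.6f}")
```

The Kronecker factorization is what keeps the parameter count manageable when clustering matrix-valued observations.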


2020 ◽  
Vol 18 (2) ◽  
pp. 2-6
Author(s):  
Thomas R. Knapp

Rubin (1976, and elsewhere) claimed that there are three kinds of “missingness”: missing completely at random; missing at random; and missing not at random. He gave examples of each. The article that now follows takes an opposing view by arguing that almost all missing data are missing not at random.


2007 ◽  
Vol 7 (3) ◽  
pp. 325-338 ◽  
Author(s):  
J. Scott Granberg-Rademacker

This article compares three approaches to handling missing data at the state level under three distinct conditions. Using Monte Carlo simulation experiments, I compare the results from a linear model using listwise deletion (LD), Markov Chain Monte Carlo with the Gibbs sampler algorithm (MCMC), and multiple imputation by chained equations (MICE) as approaches to dealing with different severity levels of missing data: missing completely at random (MCAR), missing at random (MAR), and nonignorable missingness (NI). I compare the results from each of these approaches under each condition for missing data to the results from the fully observed dataset. I conclude that the MICE algorithm performs best under most missing data conditions, MCMC provides the most stable parameter estimates across the missing data conditions (but often produced estimates that were moderately biased), and LD performs worst under most missing data conditions. I conclude with recommendations for handling missing data in state-level analysis.
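The qualitative contrast between listwise deletion and imputation under MAR can be reproduced in a single toy draw: when missingness in y depends on an observed covariate x, the complete-case mean is biased while a regression imputation that uses x is not. This is a one-equation stand-in for MICE, not the article's Monte Carlo design; all parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# MAR: missingness in y depends only on the fully observed covariate x
x = rng.normal(0.0, 1.0, n)
y = 3.0 + 1.5 * x + rng.normal(0.0, 1.0, n)
miss = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * x))  # high-x units go missing

# Listwise deletion: biased for the mean of y under this MAR mechanism
ld_mean = y[~miss].mean()

# One regression-imputation pass (a single step of a chained-equations scheme)
slope, intercept = np.polyfit(x[~miss], y[~miss], 1)
imp_mean = np.where(miss, intercept + slope * x, y).mean()

print(f"true mean 3.0 | listwise deletion {ld_mean:.2f} | imputation {imp_mean:.2f}")
```

Listwise deletion drops the high-x (and hence high-y) units and so underestimates the mean, while the imputation recovers it by exploiting the observed covariate, mirroring the article's conclusion that LD performs worst under most missing data conditions.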



