An alternative characterization of MAR in shared parameter models for incomplete longitudinal data and its utilization for sensitivity analysis

Statistical Modelling ◽

10.1177/1471082x20927114 ◽

2020 ◽

pp. 1471082X2092711

Author(s):

Grigorios Papageorgiou ◽

Dimitris Rizopoulos

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Missing At Random ◽

Joint Models ◽

Specific Level ◽

Conditional Independence Assumption ◽

Wide Range ◽

Subject Specific ◽

Alternative Characterization

Dropout is a common complication in longitudinal studies, especially since the distinction between missing not at random (MNAR) and missing at random (MAR) dropout is intractable. Consequently, one starts with an analysis that is valid under MAR and then performs a sensitivity analysis by considering MNAR departures from it. To this end, specific classes of joint models, such as pattern-mixture models (PMMs) and selection models (SeMs), have been proposed. In contrast, shared-parameter models (SPMs) have received less attention, possibly because they do not embody a characterization of MAR. A few approaches to achieving MAR in SPMs exist, but they are difficult to implement in existing software. In this article, we focus on SPMs for incomplete longitudinal and time-to-dropout data and propose an alternative characterization of MAR by exploiting the conditional independence assumption, under which outcome and missingness are independent given a set of random effects. By doing so, the censoring distribution can be utilized to cover a wide range of assumptions for the missing data mechanism on the subject-specific level. This approach offers substantial advantages over its counterparts and can be easily implemented in existing software. More specifically, it offers flexibility over the assumption for the missing data generating mechanism that governs dropout by allowing subject-specific perturbations of the censoring distribution, whereas in PMMs and SeMs dropout is considered MNAR strictly.
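
As a rough illustration, the conditional independence assumption referred to in this abstract can be written as below. The notation is assumed here for illustration and is not taken from the article: $y_i$ denotes the longitudinal outcomes of subject $i$, $T_i$ the time to dropout, and $b_i$ the shared random effects.

```latex
\[
f(y_i, T_i \mid b_i) = f(y_i \mid b_i)\, f(T_i \mid b_i),
\qquad
f(y_i, T_i) = \int f(y_i \mid b_i)\, f(T_i \mid b_i)\, f(b_i)\, \mathrm{d}b_i .
\]
```

The first identity states that outcome and dropout are independent given the random effects; the second shows how they become dependent marginally once the random effects are integrated out.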


P427 A hybrid approach of handling missing data in inflammatory bowel disease (IBD) trials: results from VISIBLE 1 and VARSITY

Journal of Crohn's and Colitis ◽

10.1093/ecco-jcc/jjz203.556 ◽

2020 ◽

Vol 14 (Supplement_1) ◽

pp. S388-S389

Author(s):

J Chen ◽

S Hunter ◽

K Kisfalvi ◽

R A Lirio

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Statistical Power ◽

Hybrid Approach ◽

Missing At Random ◽

P Value ◽

Two Phase ◽

Treatment Difference ◽

Mayo Score ◽

The Impact

Background: Missing data are common in IBD trials. Depending on their volume and nature, missing data can reduce statistical power for detecting a treatment difference, introduce bias and invalidate conclusions. Non-responder imputation (NRI), where patients with missing data are considered treatment failures, is widely used to handle missing data for dichotomous efficacy endpoints in IBD trials. However, it does not consider the mechanisms leading to missing data and can potentially underestimate the treatment effect. We proposed a hybrid (HI) approach combining NRI and multiple imputation (MI) as an alternative to NRI in the analyses of two phase 3 trials of vedolizumab (VDZ) in patients with moderate-to-severe UC: VISIBLE 1 [1] and VARSITY [2]. Methods: VISIBLE 1 and VARSITY assessed efficacy using dichotomous endpoints based on the complete Mayo score. Full methodologies have been reported previously [1,2]. Our proposed HI approach is aimed at imputing missing Mayo scores, instead of imputing the missing dichotomous efficacy endpoint. To assess the impact of dropouts for different missing data mechanisms (categorised as ‘missing not at random [MNAR]’ and ‘missing at random [MAR]’), HI was implemented as a potential sensitivity analysis, where dropouts owing to safety or lack of efficacy were imputed using NRI (assuming MNAR) and other missing data were imputed using MI (assuming MAR). For MI, each component of the Mayo score was imputed via a multivariate stepwise approach using a fully conditional specification ordinal logistic method. Missing baseline scores were imputed using baseline characteristics data. Missing scores from each subsequent visit were imputed using all previous visits in a stepwise fashion. Fifty imputed datasets were generated for each component of the Mayo score. The complete Mayo score and relevant efficacy endpoints were derived subsequently. The analysis was performed within each imputed dataset to determine treatment difference, 95% CI and p-value, which were then combined via Rubin’s rules [3]. Results: Tables 1 and 2 show a comparison of efficacy in the two studies using the primary NRI analysis vs. the alternative HI approach for handling missing data. Conclusion: HI and NRI approaches can provide consistent efficacy analyses in IBD trials. The HI approach can serve as a useful sensitivity analysis to assess the impact of dropouts under different missing data mechanisms and evaluate the robustness of efficacy conclusions.
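
The pooling step named in this abstract, combining per-dataset results via Rubin's rules, can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `pool_rubin` and the example numbers are hypothetical, and the sketch pools only a point estimate and its variance (it omits the degrees-of-freedom adjustment used for confidence intervals and p-values).

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances via Rubin's rules.

    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total, math.sqrt(total)


# Hypothetical treatment differences from three imputed datasets.
q, t, se = pool_rubin([0.10, 0.12, 0.14], [0.0004, 0.0004, 0.0004])
print(f"pooled estimate={q:.3f}, standard error={se:.4f}")
```

The total variance adds the average within-imputation variance to the between-imputation variance inflated by a factor of (1 + 1/m), so that uncertainty due to the missing values is reflected in the pooled standard error.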


A four-step strategy for handling missing outcome data in randomised trials affected by a pandemic

10.21203/rs.3.rs-32455/v2 ◽

2020 ◽

Author(s):

Suzie Cro ◽

Tim P Morris ◽

Brennan C Kahan ◽

Victoria R Cornelius ◽

James R Carpenter

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Treatment Effect ◽

Missing At Random ◽

Outcome Data ◽

Sensitivity Analyses ◽

Free World ◽

Randomised Trials ◽

Primary Analysis ◽

Missing Not At Random

Background: The coronavirus pandemic (Covid-19) presents a variety of challenges for ongoing clinical trials, including an inevitably higher rate of missing outcome data, with new and non-standard reasons for missingness. International drug trial guidelines recommend trialists review plans for handling missing data in the conduct and statistical analysis, but clear recommendations are lacking. Methods: We present a four-step strategy for handling missing outcome data in the analysis of randomised trials that are ongoing during a pandemic. We consider handling missing data arising due to (i) participant infection, (ii) treatment disruptions and (iii) loss to follow-up. We consider both settings where treatment effects for a ‘pandemic-free world’ and ‘world including a pandemic’ are of interest. Results: In any trial, investigators should: (1) clarify the treatment estimand of interest with respect to the occurrence of the pandemic; (2) establish what data are missing for the chosen estimand; (3) perform primary analysis under the most plausible missing data assumptions; followed by (4) sensitivity analysis under alternative plausible assumptions. To obtain an estimate of the treatment effect in a ‘pandemic-free world’, participant data that are clinically affected by the pandemic (directly due to infection or indirectly via treatment disruptions) are not relevant and can be set to missing. For primary analysis, a missing-at-random assumption that conditions on all observed data that are expected to be associated with both the outcome and missingness may be most plausible. For the treatment effect in the ‘world including a pandemic’, all participant data are relevant and should be included in the analysis. For primary analysis, a missing-at-random assumption (potentially incorporating a pandemic time-period indicator and participant infection status) or a missing-not-at-random assumption with a poorer response may be most relevant, depending on the setting. In all scenarios, sensitivity analysis under credible missing-not-at-random assumptions should be used to evaluate the robustness of results. We highlight controlled multiple imputation as an accessible tool for conducting sensitivity analyses. Conclusions: Missing data problems will be exacerbated for trials active during the Covid-19 pandemic. This four-step strategy will facilitate clear thinking about the appropriate analysis for relevant questions of interest.


Fitting Ordinal Factor Analysis Models With Missing Data: A Comparison Between Pairwise Deletion and Multiple Imputation

Educational and Psychological Measurement ◽

10.1177/0013164419845039 ◽

2019 ◽

Vol 80 (1) ◽

pp. 41-66 ◽

Cited By ~ 3

Author(s):

Dexin Shi ◽

Taehun Lee ◽

Amanda J. Fairchild ◽

Alberto Maydeu-Olivares

Keyword(s):

Factor Analysis ◽

Missing Data ◽

Multiple Imputation ◽

Missing At Random ◽

Complete Data ◽

Model Fit ◽

Parameter Estimates ◽

Fit Indices ◽

Wide Range ◽

Analysis Models

This study compares two missing data procedures in the context of ordinal factor analysis models: pairwise deletion (PD; the default setting in Mplus) and multiple imputation (MI). We examine which procedure yields parameter estimates and model fit indices closer to those of complete data. The performance of PD and MI is compared under a wide range of conditions, including number of response categories, sample size, percent of missingness, and degree of model misfit. Results indicate that both PD and MI yield parameter estimates similar to those from analysis of complete data under conditions where the data are missing completely at random (MCAR). When the data are missing at random (MAR), PD parameter estimates are shown to be severely biased across parameter combinations in the study. When the percentage of missingness is less than 50%, MI yields parameter estimates that are similar to results from complete data. However, the fit indices (i.e., χ², RMSEA, and WRMR) yield estimates that suggest a worse fit than results observed in complete data. We recommend that applied researchers use MI when fitting ordinal factor models with missing data. We further recommend interpreting model fit based on the TLI and CFI incremental fit indices.


Using the CES-D scale in a large cohort study and dealing with missing data: Application to the French E3N cohort

European Psychiatry ◽

10.1016/s0924-9338(11)72279-9 ◽

2011 ◽

Vol 26 (S2) ◽

pp. 572-572

Author(s):

N. Resseguier ◽

H. Verdoux ◽

F. Clavel-Chapelon ◽

X. Paoletti

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Multiple Imputation ◽

Missing Values ◽

Large Population ◽

Missing At Random ◽

Population Based ◽

Missing Value ◽

Perform Sensitivity Analysis ◽

The Impact

Introduction: The CES-D scale is commonly used to assess depressive symptoms (DS) in large population-based studies. Missing values in items of the scale may create biases. Objectives: To explore reasons for not completing items of the CES-D scale and to perform sensitivity analysis of the prevalence of DS to assess the impact of different missing data hypotheses. Methods: 71,412 women included in the French E3N cohort returned in 2005 a questionnaire containing the CES-D scale. 45% presented at least one missing value in the scale. An interview study was carried out on a random sample of 204 participants to examine the different hypotheses for the missing value mechanism. The prevalence of DS was estimated according to different methods for handling missing values: complete case analysis, single imputation, and multiple imputation under MAR (missing at random) and MNAR (missing not at random) assumptions. Results: The interviews showed that participants were not embarrassed to fill in questions about DS. Potential reasons for nonresponse were identified. MAR and MNAR hypotheses remained plausible and were explored. Among complete responders, the prevalence of DS was 26.1%. After multiple imputation under the MAR assumption, it was 28.6%, 29.8% and 31.7% among women presenting up to 4, up to 10 and up to 20 missing values, respectively. The estimates were robust after applying various scenarios of MNAR data for the sensitivity analysis. Conclusions: The CES-D scale can easily be used to assess DS in large cohorts. Multiple imputation under the MAR assumption allows missing values to be handled reliably.


The Influence of Missing Data on Disabilities in Patients Treated with High-Dose Spinal Cord Stimulation: A Tipping Point Sensitivity Analysis

Journal of Clinical Medicine ◽

10.3390/jcm10214897 ◽

2021 ◽

Vol 10 (21) ◽

pp. 4897

Author(s):

Lisa Goudman ◽

Geert Molenberghs ◽

Rui V. Duarte ◽

Maarten Moens

Keyword(s):

Spinal Cord ◽

Sensitivity Analysis ◽

Missing Data ◽

Spinal Cord Stimulation ◽

Missing At Random ◽

Tipping Point ◽

High Dose ◽

Shift Parameter ◽

Disability Score ◽

Failed Back Surgery

New waveforms have changed the field of Spinal Cord Stimulation (SCS) to optimize therapy outcomes, among which is High-Dose SCS (HD-SCS). Missing observations are often encountered when conducting clinical trials in this field. In this study, different approaches with varying assumptions were constructed to evaluate how conclusions may be influenced by these assumptions. The aim is to perform a tipping point sensitivity analysis to evaluate the influence of missing data on the overall conclusion regarding the effectiveness of HD-SCS on disability. Data from the Discover study were used, in which 185 patients with Failed Back Surgery Syndrome were included. Disability was evaluated before SCS and after 1, 3 and 12 months of HD-SCS. During the second, third and fourth visit, data from 130, 114 and 90 patients were available, respectively. HD-SCS resulted in a significant decrease in disability scores based on the analysis of observed data and with multiple imputations. The tipping point sensitivity analysis revealed that the shift parameter was 17. Thus, the conclusion concerning the time effect under a “missing at random” mechanism is robust when the shift parameter for the disability score is 17. From a clinical point of view, a shift of 17 points on disability is not very plausible. Therefore we tend to consider the conclusions drawn under “missing at random” as being robust.
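
A tipping point analysis of the kind described in this abstract can be sketched in Python as follows. This is a deliberately simplified illustration and not the study's method: it uses a single imputed dataset rather than multiple imputation, a plain z-type test on the mean change, and hypothetical function names and data. The idea it shows is the delta adjustment: values imputed under MAR are shifted by an increasing amount until the conclusion no longer holds.

```python
import math
from statistics import mean, stdev


def is_significant(changes, z_crit=1.96):
    """True if the mean change is significantly below zero (z-type test)."""
    se = stdev(changes) / math.sqrt(len(changes))
    return mean(changes) / se < -z_crit


def tipping_point(observed, imputed_mar, deltas):
    """Return the first shift at which the decrease is no longer significant.

    observed    -- observed changes in disability score (negative = improvement)
    imputed_mar -- values imputed for dropouts under MAR (hypothetical here)
    deltas      -- increasing candidate shifts added to the imputed values
    """
    for delta in deltas:
        shifted = [value + delta for value in imputed_mar]
        if not is_significant(observed + shifted):
            return delta
    return None  # the conclusion held for every shift that was tried


# Hypothetical data: five completers and three dropouts imputed under MAR.
observed = [-10, -12, -8, -11, -9]
imputed = [-10, -9, -11]
print(tipping_point(observed, imputed, range(0, 41)))
```

Whether the resulting shift is clinically plausible is then a judgement made outside the code; an implausibly large tipping point supports the robustness of the MAR-based conclusion.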


A review of the use of controlled multiple imputation in randomised controlled trials with missing outcome data

BMC Medical Research Methodology ◽

10.1186/s12874-021-01261-6 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Ping-Tee Tan ◽

Suzie Cro ◽

Eleanor Van Vogt ◽

Matyas Szigeti ◽

Victoria R. Cornelius

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Multiple Imputation ◽

Randomised Controlled Trials ◽

Missing At Random ◽

Sensitivity Analyses ◽

Controlled Trials ◽

Primary Analysis ◽

Randomised Controlled ◽

The Impact

Abstract Background Missing data are common in randomised controlled trials (RCTs) and can bias results if not handled appropriately. A statistically valid analysis under the primary missing-data assumptions should be conducted, followed by sensitivity analysis under alternative justified assumptions to assess the robustness of results. Controlled Multiple Imputation (MI) procedures, including delta-based and reference-based approaches, have been developed for analysis under missing-not-at-random assumptions. However, it is unclear how often these methods are used, how they are reported, and what their impact is on trial results. This review evaluates the current use and reporting of MI and controlled MI in RCTs. Methods A targeted review of phase II-IV RCTs (non-cluster randomised) published in two leading general medical journals (The Lancet and New England Journal of Medicine) between January 2014 and December 2019 using MI. Data was extracted on imputation methods, analysis status, and reporting of results. Results of primary and sensitivity analyses for trials using controlled MI analyses were compared. Results A total of 118 RCTs (9% of published RCTs) used some form of MI. MI under missing-at-random was used in 110 trials; this was for primary analysis in 43/118 (36%), and in sensitivity analysis for 70/118 (59%) (3 used in both). Sixteen studies performed controlled MI (1.3% of published RCTs), either with a delta-based (n = 9) or reference-based approach (n = 7). Controlled MI was mostly used in sensitivity analysis (n = 14/16). Two trials used controlled MI for primary analysis, including one reporting no sensitivity analysis whilst the other reported similar results without imputation. Of the 14 trials using controlled MI in sensitivity analysis, 12 yielded comparable results to the primary analysis whereas 2 demonstrated contradicting results. 
Only 5/110 (5%) trials using missing-at-random MI and 5/16 (31%) trials using controlled MI reported complete details on MI methods. Conclusions Controlled MI enabled the impact of accessible contextually relevant missing data assumptions to be examined on trial results. The use of controlled MI is increasing but is still infrequent and poorly reported where used. There is a need for improved reporting on the implementation of MI analyses and choice of controlled MI parameters.


A four-step strategy for handling missing outcome data in randomised trials affected by a pandemic

10.21203/rs.3.rs-32455/v1 ◽

2020 ◽

Author(s):

Suzie Cro ◽

Tim P Morris ◽

Brennan C Kahan ◽

Victoria R Cornelius ◽

James R Carpenter

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Treatment Effect ◽

Missing At Random ◽

Outcome Data ◽

Sensitivity Analyses ◽

Free World ◽

Randomised Trials ◽

Primary Analysis ◽

Missing Not At Random

Background: The coronavirus pandemic (Covid-19) presents a variety of challenges for ongoing clinical trials, including an inevitably higher rate of missing outcome data, with new and non-standard reasons for missingness. International drug trial guidelines recommend trialists review plans for handling missing data in the conduct and statistical analysis, but clear recommendations are lacking. Methods: We present a four-step strategy for handling missing outcome data in the analysis of randomised trials that are ongoing during a pandemic. We consider handling missing data arising due to (i) participant infection, (ii) treatment disruptions and (iii) loss to follow-up. We consider both settings where treatment effects for a ‘pandemic-free world’ and ‘world including a pandemic’ are of interest. Results: In any trial, investigators should: (1) clarify the treatment estimand of interest; (2) establish what data are missing for the estimand at hand; (3) perform primary analysis under the most plausible missing data assumptions; followed by (4) sensitivity analysis under alternative plausible assumptions. To obtain an estimate of the treatment effect in a ‘pandemic-free world’, data from participants clinically affected by the pandemic (directly via infection or indirectly via treatment disruptions) are not relevant and can be set to missing. For primary analysis, a missing-at-random assumption that conditions on all observed data that are expected to be associated with both the outcome and missingness may be most plausible. For the treatment effect in the ‘world including a pandemic’, all participant data are relevant and should be included in the analysis. For primary analysis, a missing-at-random assumption (potentially incorporating a pandemic time-period indicator and participant infection status) or a missing-not-at-random assumption with a poorer response may be most relevant, depending on the setting. In all scenarios, sensitivity analysis under credible missing-not-at-random assumptions should be used to evaluate the robustness of results. We highlight controlled multiple imputation as an accessible tool for conducting sensitivity analyses. Conclusions: Missing data problems will be exacerbated for trials active during the Covid-19 pandemic. This four-step strategy will facilitate clear thinking about the appropriate analysis for relevant questions of interest.


Handling missing data in modelling quality of clinician-prescribed routine care: Sensitivity analysis of departure from missing at random assumption

Statistical Methods in Medical Research ◽

10.1177/0962280220918279 ◽

2020 ◽

Vol 29 (10) ◽

pp. 3076-3092 ◽

Cited By ~ 1

Author(s):

Susan Gachau ◽

Matteo Quartagno ◽

Edmund Njeru Njagi ◽

Nelson Owuor ◽

Mike English ◽

...

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Multiple Imputation ◽

Missing At Random ◽

Parameter Estimates ◽

Analysis Model ◽

Major Drawback ◽

Missing Not At Random ◽

Prior Distributions ◽

Random Mechanism

Missing information is a major drawback in analyzing data collected in many routine health care settings. Multiple imputation assuming a missing at random mechanism is a popular method to handle missing data. The missing at random assumption cannot be confirmed from the observed data alone, hence the need for sensitivity analysis to assess robustness of inference. However, sensitivity analysis is rarely conducted and reported in practice. We analyzed routine paediatric data collected during a cluster randomized trial conducted in Kenyan hospitals. We imputed missing patient and clinician-level variables assuming the missing at random mechanism. We also imputed missing clinician-level variables assuming a missing not at random mechanism. We incorporated opinions from 15 clinical experts in the form of prior distributions and shift parameters in the delta adjustment method. An interaction between trial intervention arm and follow-up time, hospital, clinician and patient-level factors were included in a proportional odds random-effects analysis model. We performed these analyses using R functions derived from the jomo package. Parameter estimates from multiple imputation under the missing at random mechanism were similar to multiple imputation estimates assuming the missing not at random mechanism. Our inferences were insensitive to departures from the missing at random assumption using either the prior distributions or shift parameters sensitivity analysis approach.


A four-step strategy for handling missing outcome data in randomised trials affected by a pandemic

10.31219/osf.io/d2tnc ◽

2020 ◽

Author(s):

Suzie Cro ◽

Tim P Morris ◽

Brennan Kahan ◽

Victoria R Cornelius ◽

James Carpenter

Keyword(s):

Sensitivity Analysis ◽

Missing Data ◽

Treatment Effect ◽

Missing At Random ◽

Outcome Data ◽

Free World ◽

Randomised Trials ◽

Primary Analysis ◽

Missing Not At Random ◽

Clear Thinking

Background: The coronavirus pandemic (Covid-19) presents a variety of challenges for ongoing clinical trials, including an inevitably higher rate of missing outcome data, with new and non-standard reasons for missingness. International drug trial guidelines recommend trialists review plans for handling missing data in the conduct and statistical analysis, but clear recommendations are lacking. Methods: We present a four-step strategy for handling missing outcome data in the analysis of randomised trials that are ongoing during a pandemic. We consider handling missing data arising due to (i) participant infection, (ii) treatment disruptions and (iii) loss to follow-up. We consider both settings where treatment effects for a ‘pandemic-free world’ and ‘world including a pandemic’ are of interest. Results: In any trial, investigators should: (1) clarify the treatment estimand of interest with respect to the occurrence of the pandemic; (2) establish what data are missing for the chosen estimand; (3) perform primary analysis under the most plausible missing data assumptions; followed by (4) sensitivity analysis under alternative plausible assumptions. To obtain an estimate of the treatment effect in a ‘pandemic-free world’, participant data that are clinically affected by the pandemic (directly due to infection or indirectly via treatment disruptions) are not relevant and can be set to missing. For primary analysis, a missing-at-random assumption that conditions on all observed data that are expected to be associated with both the outcome and missingness may be most plausible. For the treatment effect in the ‘world including a pandemic’, all participant data are relevant and should be included in the analysis. For primary analysis, a missing-at-random assumption (potentially incorporating a pandemic time-period indicator and participant infection status) or a missing-not-at-random assumption with a poorer response may be most relevant, depending on the setting. In all scenarios, sensitivity analysis under credible missing-not-at-random assumptions should be used to evaluate the robustness of results. We highlight controlled multiple imputation as an accessible tool for conducting sensitivity analyses. Conclusions: Missing data problems will be exacerbated for trials active during the Covid-19 pandemic. This four-step strategy will facilitate clear thinking about the appropriate analysis for relevant questions of interest.


Missing at random: a stochastic process perspective

Biometrika ◽

10.1093/biomet/asab002 ◽

2021 ◽

Author(s):

D Farewell ◽

R Daniel ◽

S Seaman

Keyword(s):

Stochastic Process ◽

Missing Data ◽

Continuous Time ◽

Missing At Random ◽

Likelihood Ratios ◽

Explanatory Variables ◽

Data Framework ◽

Stopping Set ◽

Process Perspective

Abstract We offer a natural and extensible measure-theoretic treatment of missingness at random. Within the standard missing data framework, we give a novel characterization of the observed data as a stopping-set sigma algebra. We demonstrate that the usual missingness at random conditions are equivalent to requiring particular stochastic processes to be adapted to a set-indexed filtration. These measurability conditions ensure the usual factorization of likelihood ratios. We illustrate how the theory extends easily to incorporate explanatory variables, to describe longitudinal data in continuous time, and to admit more general coarsening of observations.
