Missing data in randomised controlled trials of rheumatoid arthritis drug therapy are substantial and handled inappropriately

RMD Open ◽  
2021 ◽  
Vol 7 (2) ◽  
pp. e001708
Author(s):  
Nasim A Khan ◽  
Karina D Torralba ◽  
Fawad Aslam

Objectives: To analyse the amount, reporting and handling of missing data, the approach to applying the intention-to-treat (ITT) principle and the use of sensitivity analysis in randomised clinical trials (RCTs) of rheumatoid arthritis (RA), and to assess the trend in such reporting 10 years apart (2006 and 2016). Methods: Parallel group drug therapy RA RCTs with a clinical primary endpoint. Results: 176 studies enrolling a median of 160 (IQR 62–339) patients were eligible. In terms of actual analysis, 81 (46%) RCTs conducted ITT, 42 (23.9%) conducted modified ITT and 53 (30.1%) conducted non-ITT analysis. Only 58 of 97 (59.8%) RCTs reporting an ITT analysis actually performed it. The median (IQR) proportions of participants completing the trial and included in the analysis of the primary outcome were 86% (74%–91%) and 100% (97.1%–100%), respectively. 53 (32.7%) and 65 (40.1%) RCTs had >20% and 10%–20% missing primary outcome data, respectively. Missing data handling was unreported by 58 of 171 (33.9%) RCTs. When it was reported, the vast majority used simple imputation methods. No significant trend towards improved reporting was seen between 2006 and 2016. Use of sensitivity analysis increased numerically from 2006 to 2016 (14.7% vs 21.4%). Conclusions: There is a significant discrepancy between the reported and the actually performed analysis in RA drug therapy RCTs. Nearly one-third of RCTs had >20% missing data. The reporting and methods of missing data handling remain inadequate, with high usage of non-preferred simple imputation methods. Sensitivity analysis utilisation was low. No trend towards better missing data reporting and handling was seen.

2018 ◽  
Author(s):  
Jean Gaudart ◽  
Pascal Adalian ◽  
George Leonetti

Abstract Introduction: In many studies, covariates are not always fully observed because of a missing data process. Usually, subjects with missing data are excluded from the analysis, but when many subjects are removed the number of covariates can exceed the sample size. Subjective selection or imputation procedures are then used, but these lead to biased or underpowered models. The aim of our study was to develop a method based on the selection of the nearest covariate to the centroid of a homogeneous cluster of covariates. We applied this method to a forensic medicine data set to estimate the age of aborted fetuses. Analysis Methods: We measured 46 biometric covariates on 50 aborted fetuses, but the covariates were complete for only 18 fetuses. First, to obtain homogeneous clusters of covariates, we used a hierarchical cluster analysis. Second, for each cluster obtained, we selected the nearest covariate to the centroid of the cluster, maximizing the sum of correlations (the centroid criterion). Third, with the covariates selected this way, the sample size was sufficient to compute a classical linear regression model. We have shown the almost sure convergence of the centroid criterion, and simulations were performed to build its empirical distribution. We compared our method to a subjective deletion method, two simple imputation methods and the multiple imputation method. Results: The hierarchical cluster analysis built 2 clusters of covariates and left 6 remaining covariates. After selecting the nearest covariate to the centroid of each cluster, we computed a stepwise linear regression model. The model was adequate (R² = 90.02%) and the cross-validation showed low prediction errors (2.23 × 10⁻³). The empirical distribution of the criterion provided an empirical mean (31.91) and median (32.07) close to the theoretical value (32.03). The comparisons showed that the deletion and simple imputation methods provided models of lower quality than the multiple imputation method and the centroid method. Conclusion: When the number of continuous covariates is greater than the sample size because of a missing data process, the usual procedures are biased. Our selection procedure based on the centroid criterion is a valid alternative to compose a set of predictors.
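The selection step described in the abstract above can be illustrated with a minimal Python sketch. It assumes the clusters have already been formed (the hierarchical clustering step is omitted), computes Pearson correlations on pairwise-complete observations, and treats the covariate with the largest sum of correlations with the other members of its cluster as the one nearest the centroid. The covariate names and values are hypothetical, and the authors' exact definition of the centroid criterion may differ.

```python
import math


def pearson(x, y):
    """Pearson correlation using only rows where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return 0.0  # assumption: too few shared rows is treated as no evidence
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant covariate carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def centroid_covariate(cluster):
    """Name of the covariate with the largest sum of correlations
    with the other covariates in the same cluster."""
    def score(name):
        return sum(
            pearson(cluster[name], cluster[other])
            for other in cluster
            if other != name
        )
    return max(cluster, key=score)


# Hypothetical cluster of three biometric covariates; None marks a missing value.
cluster = {
    "femur": [10.0, 12.0, 14.0, None, 18.0, 20.0],
    "tibia": [9.0, 11.5, 13.0, 15.0, None, 19.0],
    "humerus": [9.5, 10.0, 14.5, 14.0, 18.5, 17.0],
}
selected = centroid_covariate(cluster)
print("covariate kept for the regression model:", selected)
```

Keeping one representative per cluster reduces the number of predictors, so more subjects have complete data on the retained covariates than on the full set.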


2020 ◽  
Author(s):  
Suzie Cro ◽  
Tim P Morris ◽  
Brennan C Kahan ◽  
Victoria R Cornelius ◽  
James R Carpenter

Abstract Background: The coronavirus pandemic (Covid-19) presents a variety of challenges for ongoing clinical trials, including an inevitably higher rate of missing outcome data, with new and non-standard reasons for missingness. International drug trial guidelines recommend trialists review plans for handling missing data in the conduct and statistical analysis, but clear recommendations are lacking. Methods: We present a four-step strategy for handling missing outcome data in the analysis of randomised trials that are ongoing during a pandemic. We consider handling missing data arising due to (i) participant infection, (ii) treatment disruptions and (iii) loss to follow-up. We consider both settings where treatment effects for a ‘pandemic-free world’ and ‘world including a pandemic’ are of interest. Results: In any trial, investigators should: (1) clarify the treatment estimand of interest with respect to the occurrence of the pandemic; (2) establish what data are missing for the chosen estimand; (3) perform primary analysis under the most plausible missing data assumptions; followed by (4) sensitivity analysis under alternative plausible assumptions. To obtain an estimate of the treatment effect in a ‘pandemic-free world’, participant data that are clinically affected by the pandemic (directly due to infection or indirectly via treatment disruptions) are not relevant and can be set to missing. For primary analysis, a missing-at-random assumption that conditions on all observed data that are expected to be associated with both the outcome and missingness may be most plausible. For the treatment effect in the ‘world including a pandemic’, all participant data are relevant and should be included in the analysis. For primary analysis, a missing-at-random assumption (potentially incorporating a pandemic time-period indicator and participant infection status) or a missing-not-at-random assumption with a poorer response may be most relevant, depending on the setting. In all scenarios, sensitivity analysis under credible missing-not-at-random assumptions should be used to evaluate the robustness of results. We highlight controlled multiple imputation as an accessible tool for conducting sensitivity analyses. Conclusions: Missing data problems will be exacerbated for trials active during the Covid-19 pandemic. This four-step strategy will facilitate clear thinking about the appropriate analysis for relevant questions of interest.
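One common form of controlled multiple imputation is delta adjustment: missing outcomes are first imputed under a missing-at-random model, then shifted by a fixed amount (delta) that represents an assumed departure from that model. The Python sketch below is a rough illustration only, not the authors' procedure. It assumes a two-arm trial with a continuous outcome, a deliberately simplified imputation model (a normal distribution fitted to the observed values in each arm, with no draw of the model parameters), and hypothetical data. Variance pooling is omitted.

```python
import random
import statistics


def delta_adjusted_effect(control, active, delta, n_imputations=20, seed=1):
    """Mean difference (active - control) after delta-adjusted imputation.

    Missing outcomes (None) are drawn from a normal distribution fitted to the
    observed values in the same arm. Imputed values in the active arm are then
    shifted by `delta`, the assumed missing-not-at-random departure.
    """
    rng = random.Random(seed)  # same seed for every delta, so only the shift differs

    def impute(arm, shift):
        observed = [y for y in arm if y is not None]
        mu = statistics.mean(observed)
        sd = statistics.stdev(observed)
        return [y if y is not None else rng.gauss(mu, sd) + shift for y in arm]

    estimates = []
    for _ in range(n_imputations):
        imputed_control = impute(control, 0.0)
        imputed_active = impute(active, delta)
        estimates.append(
            statistics.mean(imputed_active) - statistics.mean(imputed_control)
        )
    return statistics.mean(estimates)  # pooled point estimate


# Hypothetical trial data; None marks a missing primary outcome.
control = [5.1, 4.8, None, 5.5, 4.9, 5.0, None, 5.2]
active = [6.0, None, 6.4, 5.8, None, 6.1, 6.3, None]

for delta in (0.0, -0.5, -1.0):
    effect = delta_adjusted_effect(control, active, delta)
    print(f"delta = {delta:+.1f}: estimated effect = {effect:.3f}")
```

Scanning a range of deltas in this way shows how far the missing-at-random assumption would have to be wrong before the trial's conclusion changes.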


2021 ◽  
Author(s):  
Maoxin Luo

The Food Nutrition Environment Survey (FNES) is a survey of New Zealand early childhood centres and schools and of the food and nutritional services that they provide for their pupils. The 2007 and 2009 FNES surveys were managed by the Ministry of Health. Like other social surveys, the FNES suffers from unit and item non-response; in other words, it has missing data. In this thesis, we survey a wide variety of missing data handling techniques and apply most of them to the FNES datasets. The thesis can be roughly divided into two parts. In the first part, we investigate the different missing data mechanisms and the common imputation methods, using the Synthetic Unit Record File (SURF) developed by Statistics New Zealand for educational purposes. Comparing these imputation methods, we find the Bayesian Multiple Imputation (MI) method to be the preferred option for imputing missing data, in terms of reducing non-response bias and properly propagating imputation uncertainty. Because the samples selected for the 2007 and 2009 FNES surveys overlap, we found that Bayesian MI can be improved by incorporating the matched dataset. Hence, we propose a couple of new approaches that use the extra information from the matched dataset. We believe that adapting Bayesian MI to use this extra information is a preferable strategy for imputing the FNES missing data, because the matched dataset gives the imputation model more predictive power.
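The phrase "properly propagating imputation uncertainty" refers to how results from the multiple imputed datasets are combined. A standard way to do this is Rubin's rules, sketched below in Python as a hedged illustration. The thesis does not state which pooling formulas it used, and the input numbers are hypothetical.

```python
import statistics


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances with Rubin's rules.

    Returns the pooled estimate and its total variance, which adds the
    between-imputation variance to the average within-imputation variance.
    """
    m = len(estimates)
    pooled_estimate = statistics.mean(estimates)
    within = statistics.mean(variances)       # average sampling variance
    between = statistics.variance(estimates)  # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Hypothetical results from m = 5 imputed datasets.
estimates = [0.42, 0.45, 0.39, 0.47, 0.41]
variances = [0.010, 0.011, 0.009, 0.012, 0.010]

pooled, total_variance = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, standard error = {total_variance ** 0.5:.3f}")
```

Single imputation treats filled-in values as if they had been observed, so it has no between-imputation term and tends to understate the standard error.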



2020 ◽  
Author(s):  
Suzie Cro ◽  
Tim P Morris ◽  
Brennan C Kahan ◽  
Victoria R Cornelius ◽  
James R Carpenter

Abstract Background: The coronavirus pandemic (Covid-19) presents a variety of challenges for ongoing clinical trials, including an inevitably higher rate of missing outcome data, with new and non-standard reasons for missingness. International drug trial guidelines recommend trialists review plans for handling missing data in the conduct and statistical analysis, but clear recommendations are lacking. Methods: We present a four-step strategy for handling missing outcome data in the analysis of randomised trials that are ongoing during a pandemic. We consider handling missing data arising due to (i) participant infection, (ii) treatment disruptions and (iii) loss to follow-up. We consider both settings where treatment effects for a ‘pandemic-free world’ and ‘world including a pandemic’ are of interest. Results: In any trial, investigators should: (1) clarify the treatment estimand of interest; (2) establish what data are missing for the estimand at hand; (3) perform primary analysis under the most plausible missing data assumptions; followed by (4) sensitivity analysis under alternative plausible assumptions. To obtain an estimate of the treatment effect in a ‘pandemic-free world’, data from participants clinically affected by the pandemic (directly via infection or indirectly via treatment disruptions) are not relevant and can be set to missing. For primary analysis, a missing-at-random assumption that conditions on all observed data that are expected to be associated with both the outcome and missingness may be most plausible. For the treatment effect in the ‘world including a pandemic’, all participant data are relevant and should be included in the analysis. For primary analysis, a missing-at-random assumption (potentially incorporating a pandemic time-period indicator and participant infection status) or a missing-not-at-random assumption with a poorer response may be most relevant, depending on the setting. In all scenarios, sensitivity analysis under credible missing-not-at-random assumptions should be used to evaluate the robustness of results. We highlight controlled multiple imputation as an accessible tool for conducting sensitivity analyses. Conclusions: Missing data problems will be exacerbated for trials active during the Covid-19 pandemic. This four-step strategy will facilitate clear thinking about the appropriate analysis for relevant questions of interest.


Author(s):  
Anna Chaimani ◽  
Dimitris Mavridis ◽  
Georgia Salanti ◽  
Julian P. T. Higgins ◽  
Ian R. White

Missing outcome data can invalidate the results of randomized trials and their meta-analysis. However, addressing missing data is often a challenging issue because it requires untestable assumptions. The impact of missing outcome data on the meta-analysis summary effect can be explored by assuming a relationship between the outcome in the observed and the missing participants via an informative missingness parameter. The informative missingness parameters cannot be estimated from the observed data, but they can be specified, with associated uncertainty, using evidence external to the meta-analysis, such as expert opinion. The use of informative missingness parameters in pairwise meta-analysis of aggregate data with binary outcomes has been previously implemented in Stata by the metamiss command. In this article, we present the new command metamiss2, which is an extension of metamiss for binary or continuous data in pairwise or network meta-analysis. The command can be used to explore the robustness of results to different assumptions about the missing data via sensitivity analysis.
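For binary outcomes, one widely used informative missingness parameter is the informative missingness odds ratio (IMOR): the odds of the event among participants with missing outcomes divided by the odds among those observed. The Python sketch below shows how a single assumed IMOR changes the estimated risk in one trial arm. It is an illustration of the idea only, not the metamiss2 implementation; the function name and the counts are hypothetical, and uncertainty about the IMOR is ignored.

```python
def adjusted_risk(events, n_observed, n_missing, imor):
    """Event risk in one arm after assuming an IMOR for the missing participants.

    imor = 1 means missing participants have the same odds of the event as
    observed participants, so the risk stays at its observed value.
    """
    p_observed = events / n_observed
    odds_missing = imor * p_observed / (1 - p_observed)
    p_missing = odds_missing / (1 + odds_missing)
    total = n_observed + n_missing
    return (n_observed * p_observed + n_missing * p_missing) / total


# Hypothetical arm: 20 events among 80 observed participants, 20 with missing outcomes.
for imor in (0.5, 1.0, 2.0, 3.0):
    risk = adjusted_risk(events=20, n_observed=80, n_missing=20, imor=imor)
    print(f"IMOR = {imor}: adjusted risk = {risk:.3f}")
```

Repeating the meta-analysis over a plausible range of IMOR values is one way to check how robust the summary effect is to assumptions about the missing data.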



