Return‐to‐baseline multiple imputation for missing values in clinical trials

2022 ◽  
Author(s):  
Yongming Qu ◽  
Biyue Dai
2019 ◽  
Author(s):  
Donna Coffman ◽  
Jiangxiu Zhou ◽  
Xizhen Cai

Background: Causal effect estimation with observational data is subject to bias due to confounding, which is often controlled for using propensity scores. One unresolved issue in propensity score estimation is how to handle missing values in covariates. Method: Several approaches have been proposed for handling covariate missingness, including multiple imputation (MI), multiple imputation with missingness pattern (MIMP), and treatment mean imputation. However, there are other potentially useful approaches that have not been evaluated, including single imputation (SI) + prediction error (PE), SI+PE + parameter uncertainty (PU), and Generalized Boosted Modeling (GBM), which is a nonparametric approach for estimating propensity scores in which missing values are automatically handled in the estimation using a surrogate split method. To evaluate the performance of these approaches, a simulation study was conducted. Results: Results suggested that SI+PE, SI+PE+PU, MI, and MIMP perform almost equally well and better than treatment mean imputation and GBM in terms of bias; however, MI and MIMP account for the additional uncertainty of imputing the missingness. Conclusions: Applying GBM to the incomplete data and relying on the surrogate split approach resulted in substantial bias. Imputation prior to implementing GBM is recommended.
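The abstract notes that MI-based approaches account for the additional uncertainty introduced by imputation. One standard way this is done is with Rubin's rules, which combine the average within-imputation variance with the between-imputation variance. The sketch below is illustrative only and is not taken from the study: the function name `pool_rubin` and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M point estimates and their squared standard errors (Rubin's rules).

    Hypothetical helper for illustration; not the authors' code.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    pooled_estimate = mean(estimates)   # average of the M estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Made-up effect estimates from M = 5 imputed datasets
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]
pooled, total_var = pool_rubin(estimates, variances)
print(round(pooled, 3), round(total_var, 3))
```

The total variance is never smaller than the average within-imputation variance, which is the sense in which multiple imputation reflects imputation uncertainty that a single imputed dataset would hide.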


Author(s):  
Byron C. Jaeger ◽  
Ryan Cantor ◽  
Venkata Sthanam ◽  
Rongbing Xie ◽  
James K. Kirklin ◽  
...  

Background: Risk prediction models play an important role in clinical decision making. When developing risk prediction models, practitioners often impute missing values to the mean. We evaluated the impact of applying other strategies to impute missing values on the prognostic accuracy of downstream risk prediction models, that is, models fitted to the imputed data. A secondary objective was to compare the accuracy of imputation methods based on artificially induced missing values. To complete these objectives, we used data from the Interagency Registry for Mechanically Assisted Circulatory Support. Methods: We applied 12 imputation strategies in combination with 2 different modeling strategies for mortality and transplant risk prediction following surgery to receive mechanical circulatory support. Model performance was evaluated using Monte-Carlo cross-validation and measured based on outcomes 6 months following surgery using the scaled Brier score, concordance index, and calibration error. We used Bayesian hierarchical models to compare model performance. Results: Multiple imputation with random forests emerged as a robust strategy to impute missing values, increasing model concordance by 0.0030 (25th–75th percentile: 0.0008–0.0052) compared with imputation to the mean for mortality risk prediction using a downstream proportional hazards model. The posterior probability that single and multiple imputation using random forests would improve concordance versus mean imputation was 0.464 and >0.999, respectively. Conclusions: Selecting an optimal strategy to impute missing values such as random forests and applying multiple imputation can improve the prognostic accuracy of downstream risk prediction models.
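One of the performance measures named above is the scaled Brier score. A minimal sketch of the usual definition for a binary outcome follows: one minus the ratio of the model's Brier score to that of a reference model that always predicts the observed event rate. This is a simplification and an assumption on our part; the study evaluated outcomes at 6 months after surgery, which in practice would require handling censoring, and that is not shown here. Function names and data are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """1 - Brier(model) / Brier(reference), where the reference model
    predicts the observed event rate for everyone (no censoring handled)."""
    event_rate = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [event_rate] * len(y_true))
    if reference == 0:
        raise ValueError("outcome has no variation; scaled score is undefined")
    return 1 - brier_score(y_true, y_prob) / reference


# Made-up outcomes (1 = event) and predicted risks
y = [1, 0, 0, 1]
p = [0.8, 0.2, 0.3, 0.7]
score = scaled_brier_score(y, p)
print(round(score, 2))
```

A score of 0 means the model is no better than predicting the event rate for every patient, and a score of 1 means perfect predictions.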


Author(s):  
Thelma Dede Baddoo ◽  
Zhijia Li ◽  
Samuel Nii Odai ◽  
Kenneth Rodolphe Chabi Boni ◽  
Isaac Kwesi Nooni ◽  
...  

Reconstructing missing streamflow data can be challenging when additional data are not available, and studies that impute real-world datasets in order to investigate how to ascertain the accuracy of imputation algorithms for such datasets are lacking. This study investigated the necessary complexity of missing data reconstruction schemes to obtain the relevant results for a real-world single-station streamflow observation to facilitate its further use. This investigation was implemented by applying different missing data approaches, ranging from univariate algorithms to multiple imputation methods designed for multivariate data, with time taken as an explicit variable. The accuracy of these schemes was assessed using the total error measurement (TEM) and a localized error measurement (LEM) recommended in this study. The results show that univariate missing value algorithms, which are specially developed to handle univariate time series, provide satisfactory results, but the ones that provide the best results are usually time and computationally intensive. Also, multiple imputation algorithms that consider the surrounding observed values and/or can capture the characteristics of the data provide similar results to the univariate missing data algorithms and, in some cases, perform better without the added time and computational downsides when time is taken as an explicit variable. Furthermore, the LEM would be especially useful when the missing data are in specific portions of the dataset or where very large gaps of ‘missingness’ occur. Finally, proper handling of missing values of real-world hydroclimatic datasets depends on extensive study of the particular dataset to be imputed.
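As a rough illustration of the simplest kind of univariate approach, the sketch below fills gaps in a single series by linear interpolation and then scores the result only at artificially removed positions. It is not one of the algorithms evaluated in the study, and the error function is a plain root mean squared error, not the study's TEM or LEM, whose definitions are not given in the abstract. All names and values are hypothetical.

```python
import math


def interpolate_gaps(series):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; gaps at either edge take the nearest observed value."""
    observed = [i for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def rmse_at(truth, filled, positions):
    """Root mean squared error restricted to the given positions."""
    squared = sum((truth[i] - filled[i]) ** 2 for i in positions)
    return math.sqrt(squared / len(positions))


# Made-up flow values with three artificially removed observations
truth = [10.0, 12.0, 15.0, 14.0, 11.0, 9.0]
masked = [10.0, None, None, 14.0, 11.0, None]
filled = interpolate_gaps(masked)
gap = [i for i, v in enumerate(masked) if v is None]
error = rmse_at(truth, filled, gap)
```

Scoring only at the masked positions mirrors the general idea of inducing missingness artificially so that imputed values can be compared with known ones.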


2018 ◽  
Vol 11 (16) ◽  
pp. 1-11
Author(s):  
Tlhalitshi Volition Montshiwa ◽  
Ntebo Moroke ◽  
Elias Munapo ◽  
...  

2019 ◽  
Vol 6 (1) ◽  
pp. e000348 ◽  
Author(s):  
Mimi Kim ◽  
Joan T Merrill ◽  
Cuiling Wang ◽  
Shankar Viswanathan ◽  
Ken Kalunian ◽  
...  

Objective: A common problem in clinical trials is missing data due to participant dropout and loss to follow-up, an issue which continues to receive considerable attention in the clinical research community. Our objective was to examine and compare current and alternative methods for handling missing data in SLE trials with a particular focus on multiple imputation, a flexible technique that has been applied in different disease settings but not to address missing data in the primary outcome of an SLE trial. Methods: Data on 279 patients with SLE randomised to standard of care (SoC) and also receiving mycophenolate mofetil (MMF), azathioprine or methotrexate were obtained from the Lupus Foundation of America-Collective Data Analysis Initiative Database. Complete case analysis (CC), last observation carried forward (LOCF), non-responder imputation (NRI) and multiple imputation (MI) were applied to handle missing data in an analysis to assess differences in SLE Responder Index-5 (SRI-5) response rates at 52 weeks between patients on SoC treated with MMF versus other immunosuppressants (non-MMF). Results: The rates of missing data were 32% in the MMF and 23% in the non-MMF groups. As expected, the NRI missing data approach yielded the lowest estimated response rates. The smallest and least significant estimates of differences between groups were observed with LOCF, and precision was lowest with the CC method. Estimated between-group differences were magnified with the MI approach, and imputing SRI-5 directly versus deriving SRI-5 after separately imputing its individual components yielded similar results. Conclusion: The potential advantages of applying MI to address missing data in an SLE trial include reduced bias when estimating treatment effects, and measures of precision that properly reflect uncertainty in the imputations. However, results can vary depending on the imputation model used, and the underlying assumptions should be plausible. Sensitivity analysis should be conducted to demonstrate robustness of results, especially when missing data proportions are high.
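To make the three simpler approaches concrete, the sketch below computes a response rate at the final visit under complete case analysis, LOCF, and NRI for a single arm. It is a toy illustration under our own assumptions, not the study's analysis: the data layout (one list of 0/1/None responses per patient, last entry being the endpoint), the function names, and the values are all hypothetical.

```python
def complete_case_rate(final):
    """Response rate among patients with an observed final response."""
    observed = [v for v in final if v is not None]
    return sum(observed) / len(observed)


def nri_rate(final):
    """Non-responder imputation: a missing final response counts as 0."""
    return sum(v if v is not None else 0 for v in final) / len(final)


def locf_rate(visits):
    """Last observation carried forward: use each patient's latest
    observed response as the final response."""
    carried = []
    for patient in visits:
        seen = [v for v in patient if v is not None]
        carried.append(seen[-1] if seen else None)
    return complete_case_rate(carried)


# Made-up responses per patient at three visits (None = missing)
visits = [
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, None],
    [0, 0, None],
    [0, None, None],
    [1, 1, 1],
]
final = [patient[-1] for patient in visits]

cc = complete_case_rate(final)
locf = locf_rate(visits)
nri = nri_rate(final)
print(round(cc, 3), round(locf, 3), round(nri, 3))
```

On this toy data NRI gives the lowest rate (2 of 6), LOCF gives 3 of 6, and complete case analysis gives 2 of 3 using only half the patients, which matches the direction of the NRI and CC findings reported in the abstract.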

