missing at random
Recently Published Documents



2022 ◽  
Vol 2022 ◽  
pp. 1-8
Author(s):  
Xiaoying Lv ◽  
Ruonan Zhao ◽  
Tongsheng Su ◽  
Liyun He ◽  
Rui Song ◽  
...  

Objective. To explore the optimal fitting path for missing scale data so that the fitted data are close to patients' real data. Methods. Based on the complete SDS data set of 507 patients with stroke, simulated data sets were constructed in R under three missing mechanisms, Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR), with missing rates of 5%, 10%, 15%, 20%, 25%, 30%, 35%, and 40% under each mechanism. Mean substitution (MS), random forest regression (RFR), and predictive mean matching (PMM) were used to fit the data. Root mean square error (RMSE), the width of the 95% confidence interval (95% CI), and the Spearman correlation coefficient (SCC) were used to evaluate the fitting effect and determine the optimal fitting path. Results. When dealing with missing data in scales, the optimal fitting path is as follows. ① Under the MCAR mechanism, when the missing proportion is less than 20%, the MS method is the most convenient; when the missing proportion is greater than 20%, the RFR algorithm is the best fitting method. ② Under the MAR mechanism, when the missing proportion is less than 35%, the MS method is the most convenient; when the missing proportion is greater than 35%, RFR has a better correlation. ③ Under the MNAR mechanism, RFR is the best data fitting method, especially when the missing proportion is greater than 30%. In practice, when the missing proportion is small, complete case deletion is the most commonly used method, but the RFR algorithm can greatly expand the usable sample and save clinical research costs when the missing proportion is less than 30%. The best way to handle missing data should be based on the missing mechanism and proportion in the actual data, balancing the research team's statistical analysis ability, the effectiveness of the method, and readers' understanding.
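
The simulation design in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R code: the data are synthetic, the covariate `age`, the exponential weighting used to induce MAR and MNAR, and all function names are assumptions made for illustration, and only mean substitution is shown.

```python
import math
import random
import statistics


def zscores(xs):
    """Standardise a list of numbers."""
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]


def make_missing(scores, covariate, mechanism, rate, rng):
    """Copy `scores`, setting roughly `rate` of the entries to None."""
    if mechanism == "MCAR":    # missingness ignores all data values
        weights = [1.0] * len(scores)
    elif mechanism == "MAR":   # driven only by a fully observed covariate
        weights = [math.exp(z) for z in zscores(covariate)]
    elif mechanism == "MNAR":  # driven by the value that goes missing
        weights = [math.exp(z) for z in zscores(scores)]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    mean_w = statistics.fmean(weights)
    # Clipping at 1 means the realised rate can fall slightly below `rate`.
    probs = [min(1.0, rate * w / mean_w) for w in weights]
    return [None if rng.random() < p else s for s, p in zip(scores, probs)]


def mean_substitute(values):
    """Replace every None with the mean of the observed values."""
    fill = statistics.fmean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def rmse_on_missing(truth, holed, imputed):
    """RMSE between true and imputed values, over the deleted entries only."""
    errs = [(t - i) ** 2 for t, h, i in zip(truth, holed, imputed) if h is None]
    return math.sqrt(statistics.fmean(errs))


# Synthetic stand-in for a complete scale data set (hypothetical values).
rng = random.Random(0)
age = [rng.gauss(60, 10) for _ in range(507)]
score = [40 + 0.3 * a + rng.gauss(0, 8) for a in age]

results = {}
for mech in ("MCAR", "MAR", "MNAR"):
    holed = make_missing(score, age, mech, 0.20, rng)
    results[mech] = rmse_on_missing(score, holed, mean_substitute(holed))
    print(mech, round(results[mech], 2))
```

Looping over several missing rates and swapping `mean_substitute` for other imputers would reproduce the general shape of the comparison, though not the reported results.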


Author(s):  
Samantha Wilkinson ◽  
Alind Gupta ◽  
Eric Mackay ◽  
Paul Arora ◽  
Kristian Thorlund ◽  
...  

Introduction. The German health technology assessment (HTA) rejected additional benefit of alectinib for second line (2L) ALK+ NSCLC, citing possible biases from missing ECOG performance status data and unmeasured confounding in real-world evidence (RWE) for 2L ceritinib that was submitted as a comparator to the single arm alectinib trial. Alectinib was approved in the US and therefore US post-launch RWE can be used to evaluate this HTA decision. Methods. We compared the real-world effectiveness of alectinib with ceritinib in 2L post-crizotinib ALK+ NSCLC using the nationwide Flatiron Health electronic health record (EHR)-derived de-identified database. Using quantitative bias analysis (QBA), we estimated the strength of (i) unmeasured confounding and (ii) deviation from missing-at-random (MAR) assumptions needed to nullify any overall survival (OS) benefit. Results. Alectinib had significantly longer median OS than ceritinib in complete case analysis. The estimated effect size (Hazard Ratio: 0.55) was robust to risk ratios of unmeasured confounder-outcome and confounder-exposure associations of <2.4. Based on tipping point analysis, missing baseline ECOG performance status for ceritinib-treated patients (49% missing) would need to be more than 3.4-times worse than expected under MAR to nullify the OS benefit observed for alectinib. Conclusions. Only implausible levels of bias reversed our conclusions. These methods could provide a framework to explore uncertainty and aid decision-making for HTAs to enable patient access to innovative therapies.


2021 ◽  
pp. 096228022110605
Author(s):  
Ujjwal Das ◽  
Ranojoy Basu

We consider partially observed binary matched-pair data. We assume that the incomplete subjects are missing at random. Within this missing framework, we propose an EM-algorithm based approach to construct an interval estimator of the proportion difference incorporating all the subjects. In conjunction with our proposed method, we also present two improvements to the interval estimator through some correction factors. The performances of the three competing methods are then evaluated through extensive simulation. Recommendation for the method is given based on the ability to preserve type-I error for various sample sizes. Finally, the methods are illustrated in two real-world data sets. An R-function is developed to implement the three proposed methods.
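
For orientation, the textbook EM iteration for a 2×2 table with partially classified pairs can be sketched as below. This is a generic illustration under an ignorable (MAR) mechanism, not the authors' R function, and it gives only the point estimate, not their interval estimator or correction factors. The function name, argument layout, and counts are hypothetical.

```python
def em_matched_pairs(n, u, v, tol=1e-10, max_iter=1000):
    """EM estimate of the cell probabilities p[a][b] of a 2x2 paired table.

    n[a][b]: pairs with both responses observed (first = a, second = b)
    u[a]:    pairs with only the first response observed
    v[b]:    pairs with only the second response observed
    """
    total = sum(map(sum, n)) + sum(u) + sum(v)
    p = [[0.25, 0.25], [0.25, 0.25]]
    for _ in range(max_iter):
        # E-step: spread each incomplete pair over the cells it could occupy,
        # in proportion to the current conditional probabilities.
        m = [[float(n[a][b]) for b in range(2)] for a in range(2)]
        for a in range(2):
            row = p[a][0] + p[a][1]
            if row > 0:
                for b in range(2):
                    m[a][b] += u[a] * p[a][b] / row
        for b in range(2):
            col = p[0][b] + p[1][b]
            if col > 0:
                for a in range(2):
                    m[a][b] += v[b] * p[a][b] / col
        # M-step: multinomial maximum likelihood from the expected counts.
        new_p = [[m[a][b] / total for b in range(2)] for a in range(2)]
        change = max(abs(new_p[a][b] - p[a][b])
                     for a in range(2) for b in range(2))
        p = new_p
        if change < tol:
            break
    return p


# Hypothetical counts, for illustration only.
n = [[40, 10], [20, 30]]
u = [12, 8]
v = [5, 15]
p = em_matched_pairs(n, u, v)
# Difference of marginal success proportions: P(first = 1) - P(second = 1).
diff = p[1][0] - p[0][1]
print(round(diff, 4))
```

With no incomplete pairs the iteration reduces to the usual complete-data proportions, which is a convenient sanity check.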


2021 ◽  
Author(s):  
Amy Jennings

This thesis presents a comparison of maternal outcomes for births in New Zealand District Health Boards (DHBs). This is carried out through analysis of the National Minimum Dataset collected by the Ministry of Health for 2007. The outcome compared is postpartum haemorrhage (PPH). The results are displayed using funnel plots, a useful tool for displaying unbiased information on performance outcomes when comparing institutions. Exploration of the data found that there are differences in the demographics and in maternal and birth characteristics among DHBs. The rates of PPH differ, and the population mixes are made up of different proportions of ethnic groups, ages and deprivation indexes. The exploratory analysis found that a large number of factors are associated with PPH, and that birth weight, parity and gestation had a large number of missing observations. These factors are not missing at random and require imputing prior to constructing the funnel plots. Results show that there is divergence amongst DHBs in the postpartum haemorrhage rate. First a raw PPH rate was plotted, and the results indicated there were differences among DHBs. As there are many potential predictors for PPH, a logistic regression model was applied to find the most important factors related to PPH. This allows us to apply an adjusted rate for the funnel plot. The risk-adjusted funnel plot also indicated differences among DHBs. Two approaches are taken to account for the overdispersion: a winsorised estimate and a winsorised estimate with a random effects term are applied to the data. The approaches produced different results. The winsorised estimate widened the control limits and the random effects term narrowed the control limits. All four plots identified an extreme outlier; this was later removed from the analysis and the winsorisation funnel plots were rerun. The influential outlier made a difference, and from this we can conclude that 2 out of 20 DHBs lie outside the 95% control limits. These two DHBs could be stated as having a very low rate of PPH.


Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2834
Author(s):  
José Antonio Roldán-Nofuentes ◽  
Saad Bouh Regad

The average kappa coefficient of a binary diagnostic test is a parameter that measures the average beyond-chance agreement between the diagnostic test and the gold standard. This parameter depends on the accuracy of the diagnostic test and also on the disease prevalence. This article studies the comparison of the average kappa coefficients of two binary diagnostic tests when the gold standard is not applied to all individuals in a random sample. In this situation, known as partial disease verification, the disease status of some individuals is a missing piece of data. Assuming that the missing data mechanism is missing at random, the comparison of the average kappa coefficients is solved by applying two computational methods: the EM algorithm and the SEM algorithm. With the EM algorithm the parameters are estimated and with the SEM algorithm their variances-covariances are estimated. Simulation experiments have been carried out to study the sizes and powers of the hypothesis tests studied, obtaining that the proposed method has good asymptotic behavior. A function has been written in R to solve the proposed problem, and the results obtained have been applied to the diagnosis of Alzheimer's disease.


2021 ◽  
Vol 10 (21) ◽  
pp. 4897
Author(s):  
Lisa Goudman ◽  
Geert Molenberghs ◽  
Rui V. Duarte ◽  
Maarten Moens

New waveforms have changed the field of Spinal Cord Stimulation (SCS) to optimize therapy outcomes, among which is High-Dose SCS (HD-SCS). Missing observations are often encountered when conducting clinical trials in this field. In this study, different approaches with varying assumptions were constructed to evaluate how conclusions may be influenced by these assumptions. The aim is to perform a tipping point sensitivity analysis to evaluate the influence of missing data on the overall conclusion regarding the effectiveness of HD-SCS on disability. Data from the Discover study were used, in which 185 patients with Failed Back Surgery Syndrome were included. Disability was evaluated before SCS and after 1, 3 and 12 months of HD-SCS. During the second, third and fourth visit, data from 130, 114 and 90 patients were available, respectively. HD-SCS resulted in a significant decrease in disability scores based on the analysis of observed data and with multiple imputations. The tipping point sensitivity analysis revealed that the shift parameter was 17. Thus, the conclusion concerning the time effect under a “missing at random” mechanism is robust when the shift parameter for the disability score is 17. From a clinical point of view, a shift of 17 points on disability is not very plausible. Therefore we tend to consider the conclusions drawn under “missing at random” as being robust.
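
The idea of a shift-parameter tipping point analysis can be sketched in Python as follows. This is a simplified illustration, not the analysis performed in the study: the change scores are synthetic (only the group sizes echo the abstract), imputations are drawn from a normal distribution fitted to the observed changes without drawing its parameters, and a normal-approximation interval replaces the usual t reference distribution. All names are hypothetical.

```python
import math
import random
import statistics


def pooled_mean_change(observed, n_missing, delta, n_imputations=20, seed=0):
    """Mean change with a 95% interval after delta-adjusted imputation."""
    rng = random.Random(seed)
    mu, sd = statistics.fmean(observed), statistics.stdev(observed)
    estimates, variances = [], []
    for _ in range(n_imputations):
        # Draw as if MAR held, then shift every imputed value by delta.
        imputed = [rng.gauss(mu, sd) + delta for _ in range(n_missing)]
        full = observed + imputed
        estimates.append(statistics.fmean(full))
        variances.append(statistics.variance(full) / len(full))
    # Rubin's rules: within-imputation plus inflated between-imputation variance.
    q = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    total = within + (1 + 1 / n_imputations) * between
    half = 1.96 * math.sqrt(total)
    return q, q - half, q + half


def tipping_point(observed, n_missing, deltas):
    """Smallest shift at which the decrease is no longer significant."""
    for delta in deltas:
        _, _, upper = pooled_mean_change(observed, n_missing, delta)
        if upper >= 0:
            return delta
    return None


# Synthetic change scores (negative = less disability); values are made up.
rng = random.Random(42)
observed = [rng.gauss(-10, 12) for _ in range(90)]
n_missing = 185 - 90

tip = tipping_point(observed, n_missing, range(0, 41))
print("tipping point shift:", tip)
```

Whether the resulting shift is clinically plausible is then a judgment made outside the computation, as in the abstract's closing argument.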


2021 ◽  
pp. 096228022110473
Author(s):  
Lauren J Beesley ◽  
Irina Bondarenko ◽  
Michael R Elliot ◽  
Allison W Kurian ◽  
Steven J Katz ◽  
...  

Multiple imputation is a well-established general technique for analyzing data with missing values. A convenient way to implement multiple imputation is sequential regression multiple imputation, also called chained equations multiple imputation. In this approach, we impute missing values using regression models for each variable, conditional on the other variables in the data. This approach, however, assumes that the missingness mechanism is missing at random, and it is not well-justified under not-at-random missingness without additional modification. In this paper, we describe how we can generalize the sequential regression multiple imputation procedure to handle missingness not at random in the setting where missingness may depend on other variables that are also missing but not on the missing variable itself, conditioning on fully observed variables. We provide algebraic justification for several generalizations of standard sequential regression multiple imputation using Taylor series and other approximations of the target imputation distribution under missingness not at random. Resulting regression model approximations include indicators for missingness, interactions, or other functions of the not-at-random missingness model and observed data. In a simulation study, we demonstrate that the proposed sequential regression multiple imputation modifications result in reduced bias in the final analysis compared to standard sequential regression multiple imputation, with an approximation strategy involving inclusion of an offset in the imputation model performing the best overall. The method is illustrated in a breast cancer study, where the goal is to estimate the prevalence of a specific genetic pathogenic variant.
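
To make the baseline procedure concrete, here is a minimal Python sketch of standard sequential regression (chained equations) imputation for two continuous variables. It shows only the ordinary MAR version, not the paper's not-at-random generalizations, and it produces a single imputed data set without drawing the regression parameters, so it is a simplification of proper multiple imputation. Function names and data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least squares fit of ys on xs; returns (intercept, slope, residual sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, (rss / (len(xs) - 2)) ** 0.5


def chained_imputation(x, y, n_cycles=10, seed=0):
    """Impute None entries of x and y by cycling regressions x~y and y~x."""
    rng = random.Random(seed)
    cols = [list(x), list(y)]
    miss = [[v is None for v in col] for col in cols]
    # Initialise every missing entry with the observed mean of its variable.
    for col in cols:
        fill = statistics.fmean(v for v in col if v is not None)
        col[:] = [fill if v is None else v for v in col]
    for _ in range(n_cycles):
        for target in (0, 1):
            pred = 1 - target
            # Fit on rows where the target was observed, using the current
            # (possibly imputed) values of the other variable as predictor.
            rows = [i for i, m in enumerate(miss[target]) if not m]
            a, b, sd = fit_line([cols[pred][i] for i in rows],
                                [cols[target][i] for i in rows])
            # Redraw each missing target value: prediction plus residual noise.
            for i, m in enumerate(miss[target]):
                if m:
                    cols[target][i] = a + b * cols[pred][i] + rng.gauss(0, sd)
    return cols[0], cols[1]


# Synthetic example data with holes in both variables.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [2 * v + rng.gauss(0, 0.5) for v in xs]
x_holed = [None if i % 5 == 0 else v for i, v in enumerate(xs)]
y_holed = [None if i % 7 == 3 else v for i, v in enumerate(ys)]

x_imp, y_imp = chained_imputation(x_holed, y_holed)
print(sum(v is None for v in x_imp), sum(v is None for v in y_imp))
```

Running the function with several seeds and pooling the analyses would give multiple imputations; the modifications described in the abstract would add extra terms, such as missingness indicators, to each regression.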


Methodology ◽  
2021 ◽  
Vol 17 (3) ◽  
pp. 205-230
Author(s):  
Kristian Kleinke ◽  
Markus Fritsch ◽  
Mark Stemmler ◽  
Jost Reinecke ◽  
Friedrich Lösel

Quantile regression (QR) is a valuable tool for data analysis and multiple imputation (MI) of missing values – especially when standard parametric modelling assumptions are violated. Yet, Monte Carlo simulations that systematically evaluate QR-based MI in a variety of different practically relevant settings are still scarce. In this paper, we evaluate the method regarding the imputation of ordinal data and compare the results with other standard and robust imputation methods. We then apply QR-based MI to an empirical dataset, where we seek to identify risk factors for corporal punishment of children by their fathers. We compare the modelling results with previously published findings based on complete cases. Our Monte Carlo results highlight the advantages of QR-based MI over fully parametric imputation models: QR-based MI yields unbiased statistical inferences across large parts of the conditional distribution, when parametric modelling assumptions, such as normal and homoscedastic error terms, are violated. Regarding risk factors for corporal punishment, our MI results support previously published findings based on complete cases. Our empirical results indicate that the identified “missing at random” processes in the investigated dataset are negligible.

