A comparative study of R functions for clustered data analysis

Trials ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Wei Wang ◽  
Michael O. Harhay

Abstract
Background: Clustered or correlated outcome data are common in medical research, for example in analyses of national or international disease registries, or in cluster-randomized trials, where groups of trial participants, rather than individual participants, are randomized to interventions. The within-group correlation in studies with clustered data requires specific statistical methods, such as generalized estimating equations and mixed-effects models, to account for this correlation and support unbiased statistical inference.
Methods: We compare different approaches to estimating generalized estimating equations and mixed-effects models for a continuous outcome in R through a simulation study and a data example. The methods are implemented through four popular functions of the statistical software R: “geese”, “gls”, “lme”, and “lmer”. In the simulation study, we compare the mean squared error of the estimates of all model parameters and the coverage proportion of the 95% confidence intervals. In the data analysis, we compare the estimates of the intervention effect and of the intra-class correlation.
Results: In the simulation study, the function “lme” takes the least computation time. There is no difference in mean squared error among the four functions. The “lmer” function provides better coverage of the fixed effects when the number of clusters is as small as 10. The “gls” function produces confidence intervals for the intra-class correlation with close-to-nominal coverage. In the data analysis, the “gls” function yields a positive estimate of the intra-class correlation, while the “geese” function gives a negative estimate; neither confidence interval contains the value zero.
Conclusions: The “gls” function efficiently produces an estimate of the intra-class correlation with a confidence interval. When the within-group correlation is as high as 0.5, however, the confidence interval is not always obtainable.
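As a concrete illustration of the four interfaces compared in the study, here is a minimal sketch, assuming a data frame df with a continuous outcome y, an intervention indicator trt, and a cluster identifier id (variable names are illustrative, not from the paper):

    # Sketch of the four model fits; df is assumed sorted by cluster for geese.
    library(geepack)  # geese
    library(nlme)     # gls, lme
    library(lme4)     # lmer

    # GEE with an exchangeable working correlation
    fit_geese <- geese(y ~ trt, id = id, data = df,
                       family = gaussian, corstr = "exchangeable")

    # Marginal model with a compound-symmetry correlation structure
    fit_gls <- gls(y ~ trt, data = df,
                   correlation = corCompSymm(form = ~ 1 | id))

    # Linear mixed models with a random intercept per cluster
    fit_lme  <- lme(y ~ trt, random = ~ 1 | id, data = df)
    fit_lmer <- lmer(y ~ trt + (1 | id), data = df)

    # Intra-class correlation from the gls fit; intervals(fit_gls) also
    # reports a confidence interval for this correlation parameter.
    icc_gls <- coef(fit_gls$modelStruct$corStruct, unconstrained = FALSE)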

2021 ◽  
pp. 096228022110342
Author(s):  
Denis Talbot ◽  
Awa Diop ◽  
Mathilde Lavigne-Robichaud ◽  
Chantal Brisson

Background: The change in estimate is a popular approach for selecting confounders in epidemiology. Epidemiologic textbooks and articles recommend it over significance testing of coefficients, but concerns have been raised about its validity, and few simulation studies have investigated its performance.
Methods: An extensive simulation study was conducted to compare different implementations of the change-in-estimate method. The implementations were also compared when estimating the association of body mass index with diastolic blood pressure in the PROspective Québec Study on Work and Health.
Results: In at least some scenarios, all methods were liable to introduce substantial bias and to produce confidence intervals that included the true effect much less often than expected. Overall, mixed results were obtained regarding the accuracy of the estimators, as measured by the mean squared error. No implementation adequately differentiated confounders from non-confounders. In the real data analysis, none of the implementations decreased the estimated standard error.
Conclusion: Based on these results, it is questionable whether change-in-estimate methods are beneficial in general, given their limited ability to improve the precision of estimates without introducing bias, and their inability to yield valid confidence intervals or to identify true confounders.
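A schematic version of the basic change-in-estimate step, assuming a data frame dat with outcome dbp, exposure bmi, and a single candidate confounder age; the 10% cutoff is one common convention, not necessarily the one evaluated in the paper:

    # Fit the exposure-outcome model with and without the candidate confounder
    fit_crude    <- lm(dbp ~ bmi, data = dat)
    fit_adjusted <- lm(dbp ~ bmi + age, data = dat)

    b_crude    <- coef(fit_crude)["bmi"]
    b_adjusted <- coef(fit_adjusted)["bmi"]

    # Relative change in the exposure coefficient upon adjustment
    rel_change <- abs(b_adjusted - b_crude) / abs(b_crude)
    keep_age   <- rel_change > 0.10  # retain age if the change exceeds 10%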


2021 ◽  
Vol 19 (1) ◽  
pp. 2-21
Author(s):  
Talha Omer ◽  
Zawar Hussain ◽  
Muhammad Qasim ◽  
Said Farooq Shah ◽  
Akbar Ali Khan

Shrinkage estimators are introduced for the scale parameter of the Rayleigh distribution using two different shrinkage techniques. The mean squared error properties of the proposed estimators are derived. The proposed classes of estimators are compared with the corresponding conventional unbiased estimators in terms of mean squared error in a simulation study. The simulation results show that the proposed shrinkage estimators yield smaller mean squared error than the existing unbiased estimators.
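A toy illustration of the idea (not the paper's exact estimators): shrinking the conventional unbiased estimator of the squared Rayleigh scale toward a prior guess, with the weight and prior value assumed for the example:

    # For X ~ Rayleigh(sigma), E[X^2] = 2*sigma^2, so sum(x^2)/(2n) is
    # unbiased for sigma^2; the shrinkage estimator pulls it toward a guess.
    set.seed(1)
    sigma2_true  <- 4      # true sigma^2
    sigma2_prior <- 3.5    # prior guess (assumed close to the truth)
    k <- 0.7               # shrinkage weight on the sample estimate
    n <- 20; reps <- 10000

    unbiased <- shrunk <- numeric(reps)
    for (r in seq_len(reps)) {
      # Rayleigh(sigma) variates: X = sigma * sqrt(-2 * log(U))
      x <- sqrt(sigma2_true) * sqrt(-2 * log(runif(n)))
      unbiased[r] <- sum(x^2) / (2 * n)
      shrunk[r]   <- k * unbiased[r] + (1 - k) * sigma2_prior
    }
    mse <- function(est) mean((est - sigma2_true)^2)
    c(mse_unbiased = mse(unbiased), mse_shrunk = mse(shrunk))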


Author(s):  
Tarek Mahmoud Omara

In this paper, we introduce a new biased estimator to deal with the problem of multicollinearity. This estimator can be considered a modification of the Two-Parameter Ridge-Liu estimator based on ridge estimation. Furthermore, the superiority of the new estimator over the Ridge, Liu, and Two-Parameter Ridge-Liu estimators is discussed. We use the mean squared error matrix (MSEM) criterion to verify the superiority of the new estimator. In addition, we illustrate the performance of the new estimator under several factors through a simulation study.
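For orientation, a minimal sketch of the classical estimators this family builds on, assuming a centered and scaled design matrix X and response y; the paper's specific modification is not reproduced here:

    # Ridge: (X'X + kI)^{-1} X'y
    ridge_est <- function(X, y, k) {
      solve(crossprod(X) + k * diag(ncol(X)), crossprod(X, y))
    }
    # Liu: (X'X + I)^{-1} (X'y + d * b_OLS)
    liu_est <- function(X, y, d) {
      b_ols <- solve(crossprod(X), crossprod(X, y))
      solve(crossprod(X) + diag(ncol(X)), crossprod(X, y) + d * b_ols)
    }
    # A standard two-parameter (k, d) Ridge-Liu combination
    ridge_liu_est <- function(X, y, k, d) {
      b_ols <- solve(crossprod(X), crossprod(X, y))
      solve(crossprod(X) + k * diag(ncol(X)), crossprod(X, y) + k * d * b_ols)
    }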


2021 ◽  
Vol 7 (1) ◽  
pp. 28
Author(s):  
Rebeca Peláez Suárez ◽  
Ricardo Cao Abad ◽  
Juan M. Vilar Fernández

This work proposes a resampling technique to approximate the smoothing parameter of Beran’s estimator. It is based on smoothed bootstrap resampling and on minimising the bootstrap approximation of the mean integrated squared error (MISE) to find the bootstrap bandwidth. The behaviour of this method has been tested by simulation on several models. Bootstrap confidence intervals are also addressed in this research, and their performance is analysed in the simulation study.
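The bootstrap-MISE criterion can be sketched schematically; Beran’s conditional estimator for censored data is more involved, so this sketch only illustrates the bandwidth search itself on a kernel density estimator, with the data vector x, the bandwidth grid h_grid, and the pilot bandwidth g assumed:

    # Smoothed-bootstrap approximation of the MISE for each candidate bandwidth
    boot_mise <- function(x, h_grid, g, B = 200,
                          grid = seq(min(x), max(x), length.out = 64)) {
      n <- length(x)
      # Pilot estimate with bandwidth g plays the role of the "truth"
      pilot <- density(x, bw = g, n = length(grid),
                       from = min(grid), to = max(grid))$y
      sapply(h_grid, function(h) {
        mean(replicate(B, {
          # Smoothed bootstrap: resample, then add N(0, g^2) noise
          xb <- sample(x, n, replace = TRUE) + rnorm(n, 0, g)
          fb <- density(xb, bw = h, n = length(grid),
                        from = min(grid), to = max(grid))$y
          mean((fb - pilot)^2)
        }))
      })
    }
    # The bootstrap bandwidth minimises the bootstrap MISE over the grid:
    # h_boot <- h_grid[which.min(boot_mise(x, h_grid, g = 1.5 * bw.nrd(x)))]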


2020 ◽  
Vol 17 (1(Suppl.)) ◽  
pp. 0361
Author(s):  
Mustafa Ismaeel Naif Alheety

This paper proposes new estimators that depend on both the sample and prior information, covering the cases where the two sources are equally important in the model and where they are not. The prior information is described as linear stochastic restrictions. We study the properties and performance of these estimators compared to other common estimators, using the mean squared error as the criterion for goodness of fit. A numerical example and a simulation study are presented to illustrate the performance of the estimators.
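One standard way to fold stochastic linear restrictions r = R beta + e, e ~ (0, V), into a linear model is the mixed (Theil-Goldberger) estimator; a minimal sketch, with X, y, R, r, V, and the error variance sigma2 assumed (the paper's own estimators are not reproduced here):

    # Mixed estimator combining sample and stochastic prior information:
    # beta = (X'X/sigma2 + R'V^{-1}R)^{-1} (X'y/sigma2 + R'V^{-1}r)
    mixed_estimator <- function(X, y, R, r, V, sigma2) {
      RtVinv <- t(R) %*% solve(V)
      solve(crossprod(X) / sigma2 + RtVinv %*% R,
            crossprod(X, y) / sigma2 + RtVinv %*% r)
    }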


Author(s):  
Paloma María Teresa Martínez-Sánchez ◽  
Carlos Hernán Fajardo-Toro ◽  
Úrsula Gomes Rosa Maruyama ◽  
Paola Andrea Sánchez Sánchez

This article presents a case study focused on establishing and improving the service level of the central warehouse of a hospital organization in Colombia, which provides medical products, clothing, anesthetics, and supplies to nineteen medical departments. The warehouse is managed by a single person, who handles planning, purchasing, reception, and inventory administration, and who reports to the institution's administrative and financial sub-directorate. Through interviews and surveys conducted in the different departments, as well as information collected in the field, we found problems with product availability stemming from purchase planning that was done empirically, without data analysis. Given these problems, we compared several forecasting models, using the Mean Squared Error (MSE) and forecast accuracy to determine the best model for each product analyzed.
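The model-comparison step can be illustrated with a minimal sketch: score candidate forecasts for one product's demand series by MSE on a holdout period (the demand figures below are invented for illustration):

    # Hypothetical monthly demand for one product; last 3 months held out
    demand <- c(120, 135, 128, 150, 160, 148, 170, 165, 180, 175, 190, 200)
    train <- demand[1:9]; test <- demand[10:12]

    # Model 1: naive forecast (last observed value carried forward)
    f_naive <- rep(tail(train, 1), length(test))
    # Model 2: 3-month moving average carried forward
    f_ma3 <- rep(mean(tail(train, 3)), length(test))

    mse <- function(f, a) mean((f - a)^2)
    c(naive = mse(f_naive, test), ma3 = mse(f_ma3, test))  # pick the smaller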


2018 ◽  
Vol 5 (331) ◽  
pp. 169-183 ◽  
Author(s):  
Małgorzata Karolina Krzciuk

We consider the problem of estimating the mean squared error (MSE) of a domain mean predictor under the Fay-Herriot model. In the simulation study we analyse the properties of eight MSE estimators, including estimators based on the jackknife method (Jiang, Lahiri, Wan, 2002; Chen, Lahiri, 2002; 2003) and on the parametric bootstrap (González-Manteiga et al., 2008; Butar, Lahiri, 2003). The standard Fay-Herriot model assumes independent random effects, and the biases of the MSE estimators are small for a large number of domains. The aim of the paper is to compare, in a simulation study, the properties of the MSE estimators for different numbers of domains and under misspecification of the model due to correlated random effects.
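A compact sketch of the parametric-bootstrap idea for the Fay-Herriot model y_i = x_i'beta + v_i + e_i, with v_i ~ N(0, A) and known sampling variances D_i; the moment estimator of A below is a crude stand-in, not one of the eight estimators studied in the paper:

    # Crude moment estimator of the random-effect variance A
    fh_est_A <- function(y, X, D) {
      res <- y - drop(X %*% solve(crossprod(X), crossprod(X, y)))
      max(0, sum(res^2) / (length(y) - ncol(X)) - mean(D))
    }
    # EBLUP of the domain means for a given A
    fh_eblup <- function(y, X, D, A) {
      w    <- 1 / (A + D)
      beta <- solve(crossprod(X, w * X), crossprod(X, w * y))
      gam  <- A / (A + D)
      list(beta = beta, theta = gam * y + (1 - gam) * drop(X %*% beta))
    }
    # Parametric bootstrap MSE of the EBLUP, re-estimating A each replicate
    fh_boot_mse <- function(y, X, D, B = 200) {
      A_hat <- fh_est_A(y, X, D)
      mu    <- drop(X %*% fh_eblup(y, X, D, A_hat)$beta)
      err   <- replicate(B, {
        theta_b <- mu + rnorm(length(y), 0, sqrt(A_hat))   # true domain means
        y_b     <- theta_b + rnorm(length(y), 0, sqrt(D))  # direct estimates
        (fh_eblup(y_b, X, D, fh_est_A(y_b, X, D))$theta - theta_b)^2
      })
      rowMeans(err)  # per-domain bootstrap MSE
    }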


Econometrics ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 40
Author(s):  
Erhard Reschenhofer ◽  
Manveer K. Mangat

For typical sample sizes occurring in economic and financial applications, the squared bias of estimators for the memory parameter is small relative to the variance. Smoothing is therefore a suitable way to improve the performance in terms of the mean squared error. However, in an analysis of financial high-frequency data, where the estimates are obtained separately for each day and then combined by averaging, the variance decreases with the sample size but the bias remains fixed. This paper proposes a method of smoothing that does not entail an increase in the bias. This method is based on the simultaneous examination of different partitions of the data. An extensive simulation study is carried out to compare it with conventional estimation methods. In this study, the new method outperforms its unsmoothed competitors with respect to the variance and its smoothed competitors with respect to the bias. Using the results of the simulation study for the proper interpretation of the empirical results obtained from a financial high-frequency dataset, we conclude that significant long-range dependencies are present only in the intraday volatility but not in the intraday returns. Finally, the robustness of these findings against daily and weekly periodic patterns is established.
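A schematic log-periodogram (GPH-type) estimator of the memory parameter d, together with a simple block average of the kind that motivates the smoothing discussion: averaging per-block estimates reduces the variance while the bias of each block estimate stays fixed. The series x is assumed; the paper's exact partition scheme is not reproduced:

    # Log-periodogram regression: log I(lambda_j) ~ c - 2 d log(lambda_j)
    gph_d <- function(x, m = floor(sqrt(length(x)))) {
      n     <- length(x)
      pgram <- (Mod(fft(x - mean(x)))^2 / (2 * pi * n))[2:(m + 1)]
      lam   <- 2 * pi * (1:m) / n      # first m Fourier frequencies
      unname(coef(lm(log(pgram) ~ I(-2 * log(lam))))[2])  # slope = d
    }
    # Average the estimates over K non-overlapping blocks of the series
    gph_d_blocks <- function(x, K = 4) {
      blocks <- split(x, cut(seq_along(x), K, labels = FALSE))
      mean(vapply(blocks, gph_d, numeric(1)))
    }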

