Effects of Compounded Nonnormality of Residuals in Hierarchical Linear Modeling

2021 ◽  
pp. 001316442110102
Author(s):  
Kaiwen Man ◽  
Randall Schumacker ◽  
Monica Morell ◽  
Yurou Wang

While hierarchical linear modeling is often used in social science research, the assumption of normally distributed residuals at the individual and cluster levels can be violated in empirical data. Previous studies have focused on the effects of nonnormality at either the lower or the higher level(s) separately. However, violating the normality assumption simultaneously across all levels could bias parameter estimates in unforeseen ways. This article aims to raise awareness of the drawbacks associated with compounded nonnormal residuals across levels when the number of clusters ranges from small to large. The effects of the breach of the normality assumption at both individual and cluster levels were explored. A simulation study was conducted to evaluate the relative bias and the root mean square of the model parameter estimates by manipulating the normality of the data. The results indicate that nonnormal residuals have a larger impact on the random effects than on the fixed effects, especially when the number of clusters and the cluster size are small. In addition, for a simple random-effects structure, the use of restricted maximum likelihood estimation is recommended to improve parameter estimates when compounded residuals across levels show moderate nonnormality, with a combination of a small number of clusters and a large cluster size.
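
The two evaluation criteria named in this abstract can be illustrated with a minimal Python sketch. It assumes the common textbook definitions of relative bias and root mean square error; the abstract does not give the article's exact formulas, and the function names and replication values below are hypothetical.

```python
import math

def relative_bias(estimates, true_value):
    """Mean deviation of the estimates from the true value, as a fraction of the true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return (mean_estimate - true_value) / true_value

def rmse(estimates, true_value):
    """Root mean square error of the estimates around the true value."""
    squared_errors = [(est - true_value) ** 2 for est in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical estimates of a variance component whose true value is 0.50,
# one per simulated replication (made-up numbers, not taken from the study).
replication_estimates = [0.40, 0.60, 0.50, 0.70]
bias = relative_bias(replication_estimates, 0.50)
error = rmse(replication_estimates, 0.50)
```

In a simulation study of this kind, both quantities would typically be computed per parameter and per design cell (for example, each combination of number of clusters and cluster size).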

2021 ◽  
Vol 12 ◽  
Author(s):  
Soyoung Kim ◽  
Yoonhwa Jeong ◽  
Sehee Hong

The present study investigated estimate biases in cross-classified random effect modeling (CCREM) and hierarchical linear modeling (HLM) when a crossed factor in CCREM is ignored, considering the impact of the feeder and the magnitude of coefficients. There were six simulation factors: the magnitude of the coefficient, the correlation between the level 2 residuals, the number of groups, the average number of individuals sampled from each group, the intra-unit correlation coefficient, and the number of feeders. The coefficients of interest were four fixed effects and two random effects. The results showed that ignoring a crossed factor in cross-classified data causes a parameter bias for the random effects of level 2 predictors and a standard error bias for the fixed effects of intercepts, level 1 predictors, and level 2 predictors. Bayesian information criteria generally outperformed Akaike information criteria in detecting the correct model.


2021 ◽  
Vol 99 (Supplement_1) ◽  
pp. 158-159
Author(s):  
Chad A Russell ◽  
E J Pollak ◽  
Matthew L Spangler

The commercial beef cattle industry relies heavily on the use of natural service sires. Either due to the size of breeding herds or to safeguard against injury during the breeding season, multiple-sire breeding pastures are utilized. Although each bull might be given an equal opportunity to produce offspring, evidence suggests that there is substantial variation in the number of calves sired by each bull in a breeding pasture. DNA-based paternity assignment enables correct assignment of calves to their respective sires in multi-sire pastures and presents an opportunity to investigate the degree to which this trait complex is under genetic control. Field data from a large commercial ranch were used to estimate genetic parameters for calf count (CC; n=623) and yearling scrotal circumference (SC; n=1962) using univariate and bivariate animal models. Average CC and SC were 12.1±11.1 calves and 35.4±2.30 cm, respectively. The average number of breeding seasons per bull and of bulls per contemporary group were 1.40 and 24.9, respectively. The model for CC included fixed effects of age during the breeding season (in years) and contemporary group (concatenation of breeding pasture and year). Random effects included additive genetic and permanent environmental effects, and a residual. The model for SC included fixed effects of age (in days) and contemporary group (concatenation of month and year of measurement). Random effects included an additive genetic effect and a residual. Univariate model heritability estimates for CC and SC were 0.237±0.156 and 0.456±0.072, respectively. Similarly, the bivariate model resulted in heritability estimates for CC and SC of 0.240±0.155 and 0.461±0.072, respectively. Repeatability estimates for CC from univariate and bivariate models were 0.517±0.054 and 0.518±0.053, respectively. The estimate of genetic correlation between CC and SC was 0.270±0.220. 
Parameter estimates suggest that both CC and SC would respond favorably to selection and that CC is moderately repeatable.
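
For readers unfamiliar with how heritability and repeatability relate to the random effects listed above, the following Python sketch uses the standard variance-ratio definitions for a repeatability animal model. It is an illustration under that assumption, not the authors' code, and the variance components are hypothetical values chosen for easy arithmetic.

```python
def heritability(var_additive, var_perm_env, var_residual):
    """Share of phenotypic variance explained by additive genetic effects."""
    total = var_additive + var_perm_env + var_residual
    return var_additive / total

def repeatability(var_additive, var_perm_env, var_residual):
    """Share of phenotypic variance explained by all permanent animal effects
    (additive genetic plus permanent environmental)."""
    total = var_additive + var_perm_env + var_residual
    return (var_additive + var_perm_env) / total

# Hypothetical variance components (not estimates from the abstract).
h2 = heritability(2.0, 3.0, 5.0)
rep = repeatability(2.0, 3.0, 5.0)
```

Under these definitions repeatability can never be smaller than heritability, which is consistent with the pattern of estimates reported for CC.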


2021 ◽  
Author(s):  
Dylan G.E. Gomes

As generalized linear mixed-effects models (GLMMs) have become a widespread tool in ecology, the need to guide the use of such tools is increasingly important. One common guideline is that one needs at least five levels of a random effect. Having so few levels makes the estimation of the variance of random effects terms (such as ecological sites, individuals, or populations) difficult, but it need not muddy one’s ability to estimate fixed effects terms, which are often of primary interest in ecology. Here, I simulate ecological datasets and fit simple models and show that having too few levels of a random effects term does not influence the parameter estimates or uncertainty around those estimates for fixed effects terms. Thus, it should be acceptable to use fewer levels of random effects if one is not interested in making inference about the random effects terms (i.e. they are ‘nuisance’ parameters used to group non-independent data). I also use simulations to assess the potential for pseudoreplication in (generalized) linear models (LMs) when random effects are explicitly ignored, and find that LMs do not show increased type-I errors compared to their mixed-effects model counterparts. Instead, LM uncertainty (and p values) appears to be more conservative in an analysis with a real ecological dataset presented here. These results challenge the view that it is never appropriate to model random effects terms with fewer than five levels, specifically when inference is not being made for the random effects, but suggest that in simple cases LMs might be robust to ignored random effects terms. Given the widespread accessibility of GLMMs in ecology and evolution, future simulation studies and further assessments of these statistical methods are necessary to understand the consequences of both violating and blindly following simple guidelines.
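
The kind of simulation described here can be sketched in a few lines of Python. This is a simplified illustration and not the author's code: it only generates data with a small number of site-level random intercepts and fits an ordinary least-squares slope that ignores the grouping. It does not fit a mixed model, and every name and parameter value is a hypothetical choice.

```python
import random

def simulate_grouped_data(n_sites, n_per_site, slope, site_sd, noise_sd, rng):
    """Simulate y = slope * x + site effect + noise, with one random intercept per site."""
    xs, ys, sites = [], [], []
    for site in range(n_sites):
        site_effect = rng.gauss(0.0, site_sd)  # random intercept shared within a site
        for _ in range(n_per_site):
            x = rng.uniform(0.0, 10.0)
            y = slope * x + site_effect + rng.gauss(0.0, noise_sd)
            xs.append(x)
            ys.append(y)
            sites.append(site)
    return xs, ys, sites

def ols_slope(xs, ys):
    """Least-squares slope of y on x, ignoring the grouping structure."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)  # fixed seed so the sketch is reproducible
# Four sites, i.e. fewer than the five levels of the common guideline.
xs, ys, sites = simulate_grouped_data(4, 25, 2.0, 1.0, 1.0, rng)
slope_hat = ols_slope(xs, ys)
```

Repeating this over many seeds and comparing the spread of `slope_hat` with that of a mixed-model estimate would be one way to approach the comparison the abstract describes.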


2019 ◽  
Author(s):  
Joakim Nyberg ◽  
E. Niclas Jonsson ◽  
Mats O. Karlsson ◽  
Jonas Häggström ◽  

Two full model approaches were compared with respect to their ability to handle missing covariate information. The reference data analysis approach was the full model method in which the covariate effects are estimated conventionally using fixed effects, and missing covariate data are imputed with the median of the non-missing covariate information. This approach was compared to a novel full model method which treats the covariate data as observed data and estimates the covariates as random effects. A consequence of this way of handling the covariates is that no covariate imputation is required and that any missingness in the covariates is handled implicitly. The comparison between the two analysis methods was based on simulated data from a model of height for age z-scores as a function of age. Data were simulated with increasing degrees of randomly missing covariate information (0-90%) and analyzed using each of the two analysis approaches. Not surprisingly, the precision in the parameter estimates from both methods decreased with increasing degrees of missing covariate information. However, while the bias in the parameter estimates increased in a similar fashion for the reference method, the full random effects approach provided unbiased estimates for all degrees of covariate missingness.
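
The imputation step of the reference method can be illustrated with a short Python sketch. It assumes missing covariate values are encoded as `None`; the function name and the data are hypothetical, and the sketch does not cover the full random effects approach.

```python
import statistics

def impute_with_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    median_value = statistics.median(observed)
    return [median_value if v is None else v for v in values]

# Hypothetical covariate with two missing records.
covariate = [3.0, None, 5.0, 9.0, None]
imputed = impute_with_median(covariate)
```

Because every missing record receives the same value, the imputed covariate has less variability than the true one, which is one intuition for why bias can grow as the share of missing data increases.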


2005 ◽  
Vol 19 (4) ◽  
pp. 387-403 ◽  
Author(s):  
Samuel Y. Todd ◽  
T. Russell Crook ◽  
Anthony G. Barilla

Most data involving organizations are hierarchical in nature and often contain variables measured at multiple levels of analysis. Hierarchical linear modeling (HLM) is a relatively new and innovative statistical method that organizational scientists have used to alleviate some common problems associated with multilevel data, thus advancing our understanding of organizations. This article presents a broad overview of HLM’s logic through an empirical analysis and outlines how its use can strengthen sport management research. For illustration purposes, we use both HLM and the traditional linear regression model to analyze how organizational and individual factors in Major League Baseball impact individual players’ salaries. A key implication is that, depending on the method, parameter estimates differ because of the multilevel data structure and, thus, findings differ. We explain these differences and conclude by presenting theoretical discussions from strategic management and consumer behavior to provide a potential research agenda for sport management scholars.


2020 ◽  
Vol 42 ◽  
pp. e49916
Author(s):  
Roney Peterson Pereira ◽  
Terezinha Aparecida Guedes ◽  
Érika Cristina Ferreira ◽  
Silvana Marques de Araújo ◽  
Larissa Aparecida Ricardini ◽  
...  

The use of linear mixed models for nested structure longitudinal data is called hierarchical linear modeling. This modeling takes into account the dependence of existing data within each level and between hierarchical levels. The process of modeling, estimation and diagnostic analysis was illustrated through data on the weights of mice experimentally infected by Trypanosoma cruzi, divided into different treatment groups, with the purpose of verifying the evolution of their body weight as a result of using different types of biotherapeutics produced from Gallus gallus domesticus (chicken) serum to treat Trypanosoma cruzi. Through the model selection criteria AIC and BIC and the likelihood ratio test, a model was chosen to describe the data correctly. Model diagnostics were then performed by means of residual analysis for both levels and an analysis of influential observations to verify whether any observations were flagged as influencing the fixed effects, the components of variance and the adjusted values. After the analysis, it was possible to notice that the observations flagged as influential had little impact on the model chosen initially, so it was maintained, with no differences being evidenced between the treatments with the biotherapeutics tested; only the time variable and the random intercept were necessary to describe the weight of the mice.
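
The three selection tools mentioned in this abstract follow standard formulas, sketched below in Python. The log-likelihoods, parameter counts and sample size are hypothetical, and taking the sample size in BIC to be the number of observations is an assumption; software for mixed models may use a different convention.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

def lr_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for two nested models: 2 (ln L_full - ln L_reduced)."""
    return 2 * (loglik_full - loglik_reduced)

# Hypothetical fits: a reduced model with 4 parameters and a full model with 6,
# both fitted to the same 100 observations.
aic_reduced = aic(-120.0, 4)
aic_full = aic(-115.0, 6)
bic_reduced = bic(-120.0, 4, 100)
bic_full = bic(-115.0, 6, 100)
lr = lr_statistic(-120.0, -115.0)
```

The likelihood ratio statistic is usually compared against a chi-square distribution with degrees of freedom equal to the difference in parameter counts (2 in this example), although tests on variance components need extra care because the null value lies on the boundary of the parameter space.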


Author(s):  
C A Russell ◽  
E J Pollak ◽  
M L Spangler

The commercial beef cattle industry relies heavily on the use of natural service sires. When artificial insemination is deemed difficult to implement, multi-sire breeding pastures are used to increase reproductive rates in large breeding herds or to safeguard against bull injury during the breeding season. Although each bull might be given an equal opportunity to produce offspring, evidence suggests that there is substantial variation in the number of calves sired by each bull in a breeding pasture. With the use of DNA-based paternity testing, correctly assigning calves to their respective sires in multi-sire pastures is possible and presents an opportunity to investigate the degree to which this trait complex is under genetic control. Field data from a large commercial ranch were used to estimate genetic parameters for calf count (CC; 574 records from 443 sires) and yearling scrotal circumference (SC; n=1961) using univariate and bivariate animal models. Calf counts averaged 12.2±10.7 and SC averaged 35.4±2.30 cm. Bulls had an average of 1.30 records and there were 23.9±11.1 bulls per contemporary group. The model for CC included fixed effects of age during the breeding season (in years) and contemporary group (concatenation of breeding pasture and year). Random effects included additive genetic and permanent environmental effects, and a residual. The model for SC included fixed effects of age (in days) and contemporary group (concatenation of month and year of measurement). Random effects included an additive genetic effect and a residual. Univariate model heritability estimates for CC and SC were 0.178±0.142 and 0.455±0.072, respectively. Similarly, the bivariate model resulted in heritability estimates for CC and SC of 0.184±0.142 and 0.457±0.072, respectively. Repeatability estimates for CC from univariate and bivariate models were 0.315±0.080 and 0.317±0.080, respectively. The estimate of genetic correlation between CC and SC was 0.268±0.274. 
Heritability estimates suggest that both CC and SC would respond favorably to selection. Moreover, CC has low repeatability and, although the two traits are favorably correlated, SC appears to be only weakly associated with CC.


2019 ◽  
Vol 18 (2) ◽  
pp. 106-111
Author(s):  
Fong-Yi Lai ◽  
Szu-Chi Lu ◽  
Cheng-Chen Lin ◽  
Yu-Chin Lee

The present study proposed that, unlike prior leader–member exchange (LMX) research, which often implicitly assumed that each leader develops equal-quality relationships with their supervisors (leader’s LMX; LLX), every leader develops different relationships with their supervisors and, in turn, receives different amounts of resources. Moreover, these differentiated relationships with superiors will influence how leader–member relationship quality affects team members’ voice and creativity. We adopted a multi-temporal (three wave) and multi-source (leaders and employees) research design. Hypotheses were tested on a sample of 227 bank employees working in 52 departments. Results of the hierarchical linear modeling (HLM) analysis showed that LLX moderates the relationship between LMX and team members’ voice behavior and creative performance. Strengths, limitations, practical implications, and directions for future research are discussed.

