Multiple imputation of missing covariate values in multilevel models with random slopes: a cautionary note

2015 ◽  
Vol 48 (2) ◽  
pp. 640-649 ◽  
Author(s):  
Simon Grund ◽  
Oliver Lüdtke ◽  
Alexander Robitzsch
SAGE Open ◽  
2016 ◽  
Vol 6 (4) ◽  
pp. 215824401666822 ◽  
Author(s):  
Simon Grund ◽  
Oliver Lüdtke ◽  
Alexander Robitzsch

The treatment of missing data can be difficult in multilevel research because state-of-the-art procedures such as multiple imputation (MI) may require advanced statistical knowledge or a high degree of familiarity with certain statistical software. In the missing data literature, pan has been recommended for MI of multilevel data. In this article, we provide an introduction to MI of multilevel missing data using the R package pan, and we discuss its possibilities and limitations in accommodating typical questions in multilevel research. To make pan more accessible to applied researchers, we make use of the mitml package, which provides a user-friendly interface to the pan package and several tools for managing and analyzing multiply imputed data sets. We illustrate the use of pan and mitml with two empirical examples that represent common applications of multilevel models, and we discuss how these procedures may be used in conjunction with other software.
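As a rough illustration of what "analyzing multiply imputed data sets" involves, the sketch below pools one parameter estimate across imputed data sets using Rubin's rules. It is not the mitml or pan interface; the function name `pool_rubin` and the input values are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average sampling variance
    between = variance(estimates)      # variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical estimates from three imputed data sets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what separates a pooled standard error from one computed on a single completed data set: it carries the extra uncertainty due to the missing values.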


2018 ◽  
Author(s):  
Jan Paul Heisig ◽  
Merlin Schaeffer

Mixed effects multilevel models are often used to investigate cross-level interactions, a specific type of context effect that may be understood as an upper-level variable moderating the association between a lower-level predictor and the outcome. We argue that multilevel models involving cross-level interactions should always include random slopes on the lower-level components of those interactions. Failure to do so will usually result in severely anti-conservative statistical inference. Monte Carlo simulations and illustrative empirical analyses highlight the practical relevance of the issue. Using European Social Survey data, we examine a total of 30 cross-level interactions. Introducing a random slope term on the lower-level variable involved in a cross-level interaction reduces the absolute t-ratio by 31% or more in three quarters of cases, with an average reduction of 42%. Many practitioners seem to be unaware of these issues. Roughly half of the cross-level interaction estimates published in the European Sociological Review between 2011 and 2016 are based on models that omit the crucial random slope term. Detailed analysis of the associated test statistics suggests that many of the estimates would not meet conventional standards of statistical significance if estimated using the correct specification. This raises the question of how much robust evidence of cross-level interactions sociology has actually produced over the past decades.
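For orientation, a two-level model with a cross-level interaction and the recommended random slope can be written as follows. The notation is an assumption (conventional multilevel symbols, not taken from the article): $x_{ij}$ is the lower-level predictor, $z_j$ the upper-level moderator, and $u_{1j}$ the random slope on $x_{ij}$.

```latex
y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + \gamma_{01} z_j + \gamma_{11} x_{ij} z_j
         + u_{0j} + u_{1j} x_{ij} + e_{ij}
```

Omitting the term $u_{1j} x_{ij}$ gives the random-intercept-only specification that the abstract describes as leading to anti-conservative inference about $\gamma_{11}$.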


2019 ◽  
Vol 2 (3) ◽  
pp. 288-311 ◽  
Author(s):  
Lesa Hoffman

The increasing availability of software with which to estimate multivariate multilevel models (also called multilevel structural equation models) makes it easier than ever before to leverage these powerful techniques to answer research questions at multiple levels of analysis simultaneously. However, interpretation can be tricky given that different choices for centering model predictors can lead to different versions of what appear to be the same parameters; this is especially the case when the predictors are latent variables created through model-estimated variance components. A further complication is a recent change to Mplus (Version 8.1), a popular software program for estimating multivariate multilevel models, in which the selection of Bayesian estimation instead of maximum likelihood results in different lower-level predictors when random slopes are requested. This article provides a detailed explication of how the parameters of multilevel models differ as a function of the analyst’s decisions regarding centering and the form of lower-level predictors (i.e., observed or latent), the method of estimation, and the variant of program syntax used. After explaining how different methods of centering lower-level observed predictor variables result in different higher-level effects within univariate multilevel models, this article uses simulated data to demonstrate how these same concepts apply in specifying multivariate multilevel models with latent lower-level predictor variables. Complete data, input, and output files for all of the example models have been made available online to further aid readers in accurately translating these central tenets of multivariate multilevel modeling into practice.
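To make the centering choices concrete, the following sketch computes grand-mean-centered and cluster-mean-centered versions of an observed lower-level predictor, along with the cluster means that would enter as a higher-level predictor. The function name `center_predictor` and the toy data are hypothetical; this only illustrates the observed-variable case, not the latent-variable decomposition or any Mplus behaviour discussed in the article.

```python
from collections import defaultdict
from statistics import mean


def center_predictor(values, clusters):
    """Return (grand-mean centered, cluster-mean centered, cluster means).

    values:   lower-level predictor, one value per observation
    clusters: cluster identifier for each observation
    """
    grand_mean = mean(values)

    # Collect the observations belonging to each cluster.
    by_cluster = defaultdict(list)
    for value, cluster in zip(values, clusters):
        by_cluster[cluster].append(value)
    cluster_means = {c: mean(vs) for c, vs in by_cluster.items()}

    grand_centered = [v - grand_mean for v in values]
    within = [v - cluster_means[c] for v, c in zip(values, clusters)]
    between = [cluster_means[c] for c in clusters]
    return grand_centered, within, between


# Toy data: four observations in two clusters.
gc, within, between = center_predictor([1.0, 3.0, 5.0, 7.0], ["a", "a", "b", "b"])
print(gc)       # deviations from the overall mean
print(within)   # deviations from each cluster's own mean
print(between)  # cluster means, repeated per observation
```

The cluster-mean-centered version contains only within-cluster variation, whereas the grand-mean-centered version still mixes within- and between-cluster variation, which is why the accompanying higher-level coefficient takes on a different meaning under each choice.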


2018 ◽  
Author(s):  
Kevin Michael King ◽  
Dale S. Kim ◽  
Connor McCabe ◽  
Sean P. Lane

In multilevel models, stepwise methods are commonly used to test cross-level interactions, in which a cluster-level variable explains differences in the effect of an observation-level variable on the outcome. Researchers often wish to establish that there is between-cluster variance in slopes before testing whether an observed cluster-level variable explains that variance. In the stepwise method, between-cluster slope variance (i.e., random slopes) is required before a cross-level interaction is tested. We argue that this requirement imposes an unnecessary constraint that reduces the power to detect true cross-level interactions. In short, the stepwise approach would only be valid if, in the same data, it always identified the true interactions that are also identified by a direct test of the interaction model. Using Monte Carlo simulations, we demonstrate that this is not the case. The power to detect a true interaction was especially low when the residual slope variance (i.e., variance unexplained by the interaction), the variance of the moderator, the number of observations per cluster, or the number of clusters was small. We recommend that researchers directly test interactions that are of interest, regardless of the presence of random slope variance.


2017 ◽  
Vol 21 (1) ◽  
pp. 111-149 ◽  
Author(s):  
Simon Grund ◽  
Oliver Lüdtke ◽  
Alexander Robitzsch

2020 ◽  
pp. 1471082X2094971
Author(s):  
Leonardo Grilli ◽  
Maria Francesca Marino ◽  
Omar Paccagnella ◽  
Carla Rampichini

The article is motivated by the analysis of the relationship between university student ratings and teacher practices and attitudes, which are measured via a set of binary and ordinal items collected by an innovative survey. The analysis is conducted through a two-level random intercept model, where student ratings are nested within teachers. The analysis must face two issues about the items measuring teacher practices and attitudes, which are level 2 predictors: (a) the items are severely affected by missingness due to teacher non-response and (b) there is redundancy in both the number of items and the number of categories of their measurement scale. We tackle the missing data issue by considering a multiple imputation strategy exploiting information at both student and teacher levels. For the redundancy issue, we rely on regularization techniques for ordinal predictors, also accounting for the multilevel data structure. The proposed solution addresses the problem at hand in an original way, and it can be applied whenever it is required to select level 2 predictors affected by missing values. The results obtained with the final model indicate that ratings on teacher ability to motivate students are related to certain teacher practices and attitudes.


2020 ◽  
Vol 23 ◽  
Author(s):  
Pablo García-Patos ◽  
Ricardo Olmos

Although modern approaches to handling missing data have been well established since the 1970s, researchers still face challenges when they encounter this problem in multilevel models. First, a variety of software exists for handling missing data through multiple imputation (MI), which experts currently regard as the most promising strategy. Second, the two principal paradigms of MI are joint modelling (JM) and fully conditional specification (FCS), which adds a further complication because their usefulness depends on the combination of multilevel model and the estimated parameters affected by missing data. The technical literature does little to reduce the number of decisions that researchers have to make. Given these difficulties, the present paper has three objectives: (1) to present a thorough review of the most recently developed software and functions for multiple imputation in multilevel models; (2) to derive a set of suggestions, recommendations, and guidelines to help researchers handle missing data, including a list of key questions to consider when analyzing multilevel models; and (3) based on these questions, to present two detailed examples using the recommended R packages, so that researchers can more easily apply multiple imputation in multilevel models.


2014 ◽  
Vol 39 (6) ◽  
pp. 524-549 ◽  
Author(s):  
Michael David Bates ◽  
Katherine E. Castellano ◽  
Sophia Rabe-Hesketh ◽  
Anders Skrondal

Author(s):  
Simon Grund ◽  
Oliver Lüdtke ◽  
Alexander Robitzsch

Multilevel models often include nonlinear effects, such as random slopes or interaction effects. The estimation of these models can be difficult when the underlying variables contain missing data. Although several methods for handling missing data such as multiple imputation (MI) can be used with multilevel data, conventional methods for multilevel MI often do not properly take the nonlinear associations between the variables into account. In the present paper, we propose a sequential modeling approach based on Bayesian estimation techniques that can be used to handle missing data in a variety of multilevel models that involve nonlinear effects. The main idea of this approach is to decompose the joint distribution of the data into several parts that correspond to the outcome and explanatory variables in the intended analysis, thus generating imputations in a manner that is compatible with the substantive analysis model. In three simulation studies, we evaluate the sequential modeling approach and compare it with conventional as well as other substantive-model-compatible approaches to multilevel MI. We implemented the sequential modeling approach in the R package and provide a worked example to illustrate its application.
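The decomposition idea can be sketched generically as a factorization of the joint distribution. The symbols are assumptions introduced here for illustration: $y$ is the outcome, $x$ an incompletely observed explanatory variable, and $z$ the fully observed variables; the article's own notation and model details may differ.

```latex
p(y, x \mid z) = p(y \mid x, z)\, p(x \mid z)
```

Here $p(y \mid x, z)$ would be the substantive analysis model (for example, one including a random slope or an interaction involving $x$), and $p(x \mid z)$ a model for the explanatory variable. Drawing missing values of $x$ from a distribution proportional to this product is what keeps the imputations compatible with the nonlinear terms in the analysis model.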

