Multiple Imputation of missing values in exploratory factor analysis of multidimensional scales: estimating latent trait scores

2016 ◽  
Vol 32 (2) ◽  
pp. 596 ◽  
Author(s):  
Urbano Lorenzo-Seva ◽  
Joost R. Van Ginkel

Researchers frequently have to analyze scales in which some participants have failed to respond to some items. In this paper we focus on the exploratory factor analysis of multidimensional scales (i.e., scales that consist of a number of subscales), where each subscale is made up of a number of Likert-type items and the aim of the analysis is to estimate participants’ scores on the corresponding latent traits. Our approach uses the following steps: (1) multiple imputation creates several copies of the data, in which the missing values are imputed; (2) each copy of the data is subjected to independent factor analysis, and the same number of factors is extracted from all copies; (3) all factor solutions are simultaneously orthogonally (or obliquely) rotated so that they are both (a) factorially simple and (b) as similar to one another as possible; (4) latent trait scores are estimated for ordinal data in each copy; and (5) participants’ scores on the latent traits are estimated as the average of the estimates obtained in the copies. We applied the approach to a real dataset in which missing responses were artificially introduced following a real pattern of non-response, and to a simulation study based on artificial datasets. The results show that our approach was able to compute factor score estimates even for participants with missing data.
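
The five steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it expects a numeric data matrix with NaNs marking missing responses, uses scikit-learn's `IterativeImputer` for the multiple imputations, treats the Likert items as continuous (rather than using the ordinal estimator of step 4), and approximates step (3) by Procrustes-aligning every solution to the first copy instead of jointly rotating for simplicity and agreement.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.decomposition import FactorAnalysis

def mi_efa_scores(X, n_factors, n_copies=5, seed=0):
    rng = np.random.RandomState(seed)
    loadings, scores = [], []
    for _ in range(n_copies):
        # (1) one stochastic imputation per copy of the data
        imputer = IterativeImputer(sample_posterior=True,
                                   random_state=rng.randint(1 << 30))
        Xm = imputer.fit_transform(X)
        # (2) independent factor analysis of each copy, same n_factors;
        #     items are treated as continuous here, unlike the ordinal
        #     estimator used in the paper
        fa = FactorAnalysis(n_components=n_factors, random_state=0).fit(Xm)
        loadings.append(fa.components_.T)   # p x k loading matrix
        scores.append(fa.transform(Xm))     # (4) factor scores for this copy
    # (3) align all solutions by orthogonal Procrustes rotation to copy 0
    target = loadings[0]
    for m in range(1, n_copies):
        R, _ = orthogonal_procrustes(loadings[m], target)
        scores[m] = scores[m] @ R
    # (5) average the latent-trait estimates over the copies
    return np.mean(scores, axis=0)

# tiny demo: two-factor data with ~10% of cells missing completely at random
rng = np.random.RandomState(1)
F = rng.normal(size=(100, 2))
L = np.kron(np.eye(2), np.full((3, 1), 0.8))   # simple-structure loadings
X = F @ L.T + 0.6 * rng.normal(size=(100, 6))
X[rng.rand(100, 6) < 0.10] = np.nan
est = mi_efa_scores(X, n_factors=2)
print(est.shape)   # (100, 2)
```

Every participant receives a score even when some of their items are missing, because each copy of the data is complete after imputation.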

2021 ◽  
pp. 001316442110220
Author(s):  
David Goretzko

Determining the number of factors in exploratory factor analysis is arguably the most crucial decision a researcher faces when conducting the analysis. While several simulation studies exist that compare various so-called factor retention criteria under different data conditions, little is known about the impact of missing data on this process. Hence, in this study, we evaluated the accuracy with which different factor retention criteria determine the number of factors when data are missing: the Factor Forest, parallel analysis based on a principal component analysis, parallel analysis based on the common factor model, and the comparison data approach. Each criterion was combined with different missing data methods, namely an expectation-maximization algorithm called Amelia, predictive mean matching and random forest imputation within the multiple imputation by chained equations (MICE) framework, and pairwise deletion. Data were simulated for different sample sizes, numbers of factors, numbers of manifest variables (indicators), between-factor correlations, missing data mechanisms, and proportions of missing values. In the majority of conditions and for all factor retention criteria except the comparison data approach, the missing data mechanism had little impact on the accuracy, and pairwise deletion performed comparably to the more sophisticated imputation methods. In some conditions, however, especially small-sample cases in which comparison data were used to determine the number of factors, random forest imputation was preferable to the other missing data methods. Accordingly, depending on data characteristics and the selected factor retention criterion, choosing an appropriate missing data method is crucial to obtaining a valid estimate of the number of factors to extract.
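
As one concrete instance of the criteria compared, PCA-based parallel analysis under pairwise deletion can be sketched as follows. This is a simplified illustration, not the study's code; the function name and simulation settings are invented for the example, and the pairwise-complete correlation matrix comes from pandas' default `DataFrame.corr` behavior.

```python
import numpy as np
import pandas as pd

def parallel_analysis_pairwise(X, n_sims=100, pct=95, seed=0):
    """PCA-based parallel analysis on a pairwise-complete correlation matrix."""
    n, p = X.shape
    # pairwise deletion: each correlation uses all complete pairs of cases
    R = pd.DataFrame(X).corr().to_numpy()
    obs = np.sort(np.linalg.eigvalsh(R))[::-1]
    rng = np.random.RandomState(seed)
    sim = np.empty((n_sims, p))
    for s in range(n_sims):
        Z = rng.normal(size=(n, p))
        sim[s] = np.sort(np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False)))[::-1]
    thresh = np.percentile(sim, pct, axis=0)
    # retain components while the observed eigenvalue beats the random one
    k = 0
    while k < p and obs[k] > thresh[k]:
        k += 1
    return k

# demo: two orthogonal factors, six indicators, 5% of cells set to missing
rng = np.random.RandomState(2)
F = rng.normal(size=(300, 2))
L = np.kron(np.eye(2), np.full((3, 1), 0.8))
X = F @ L.T + 0.6 * rng.normal(size=(300, 6))
X[rng.rand(300, 6) < 0.05] = np.nan
k = parallel_analysis_pairwise(X)
print(k)
```

With strong simple structure and modest missingness, the pairwise-deleted eigenvalues remain well separated from the random thresholds, which is consistent with the abstract's finding that pairwise deletion often performs comparably to imputation.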


2015 ◽  
Vol 1 (311) ◽  
Author(s):  
Piotr Tarka

Abstract: The objective of this article is a comparative analysis of Likert rating scales with 5, 7, 9, and 11 response categories in the context of the factor extraction process in exploratory factor analysis (EFA). The problem addressed in the article is primarily methodological: both the selection of the optimal number of response categories for the measured items (constituting the Likert scale) and the identification of possible changes, differences, or similarities (resulting from the impact of the four types of scales) in the extraction and determination of the appropriate number of factors in the EFA model.
Keywords: exploratory factor analysis, Likert scale, experimental research, marketing
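
The kind of experiment described can be mimicked in a short simulation: discretize continuous indicators into 5, 7, 9, and 11 equal-probability Likert categories and observe the dominant eigenvalue of the resulting correlation matrix, the raw material of factor extraction. This is an illustrative sketch under invented settings, not the author's design.

```python
import numpy as np

def likert_cut(x, k):
    # discretize a continuous item into k equal-probability categories 1..k
    cuts = np.quantile(x, np.linspace(0, 1, k + 1)[1:-1])
    return np.digitize(x, cuts) + 1

# one strong common factor measured by six continuous items (loading 0.7)
rng = np.random.RandomState(3)
f = rng.normal(size=(2000, 1))
cont = f @ np.full((1, 6), 0.7) + np.sqrt(1 - 0.49) * rng.normal(size=(2000, 6))

first_eig = {}
for k in (5, 7, 9, 11):
    Xk = np.column_stack([likert_cut(cont[:, j], k) for j in range(6)])
    R = np.corrcoef(Xk, rowvar=False)
    first_eig[k] = np.linalg.eigvalsh(R)[-1]   # dominant eigenvalue
print(first_eig)
```

Coarser response scales attenuate the Pearson correlations slightly, so the dominant eigenvalue shrinks a little at 5 categories relative to 11, while the one-factor structure itself remains clearly recoverable at every scale length.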


2016 ◽  
Vol 8 (1) ◽  
pp. 4-16 ◽  
Author(s):  
Manfred Hauben ◽  
Eric Hung ◽  
Wen-Yaw Hsieh

Background: Severe cutaneous adverse reactions (SCARs) are prominent in pharmacovigilance (PhV). They have some commonalities, such as their nonimmediate nature and T-cell mediation, and rare overlap syndromes have been documented, most commonly involving acute generalized exanthematous pustulosis (AGEP) and drug rash with eosinophilia and systemic symptoms (DRESS), and DRESS and toxic epidermal necrolysis (TEN). However, they display diverse clinical phenotypes and variations in specific T-cell immune response profiles, plus some specific genotype–phenotype associations. A question is whether causation of a given SCAR by a given drug supports causality of the same drug for other SCARs. If so, we might expect significant intercorrelations between SCARs with respect to overall drug-reporting patterns. SCARs with significant intercorrelations may reflect a unified underlying concept. Methods: We used exploratory factor analysis (EFA) on data from the United States Food and Drug Administration Adverse Event Reporting System (FAERS) to assess reporting intercorrelations between six SCARs [AGEP, DRESS, erythema multiforme (EM), Stevens–Johnson syndrome (SJS), TEN, and exfoliative dermatitis (ExfolDerm)]. We screened the data using visual inspection of scatterplot matrices for problematic data patterns. We assessed factorability via Bartlett’s test of sphericity, the Kaiser-Meyer-Olkin (KMO) statistic, initial estimates of communality, and the anti-image correlation matrix. We extracted factors via principal axis factoring (PAF). The number of factors was determined by scree plot/Kaiser’s rule. We also examined solutions with an additional factor. We applied various oblique rotations. We assessed the strength of the solution by the percentage of variance explained, the minimum number of variables loading per major factor, the magnitude of the communalities, loadings and crossloadings, and the reproduced and residual correlations.
Results: The data were generally adequate for factor analysis but the amount of variance explained, shared variance, and communalities were low, suggesting caution in general against extrapolating causality between SCARs. SJS and TEN displayed most shared variance. AGEP and DRESS, the other SCAR pair most often observed in overlap syndromes, demonstrated modest shared variance, along with maculopapular rash (MPR). DRESS and TEN, another of the more commonly diagnosed pairs in overlap syndromes, did not. EM was uncorrelated with SJS and TEN. Conclusions: The notion that causality of a drug for one SCAR bolsters support for causality of the same drug with other SCARs was generally not supported.
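
The factorability checks named in the Methods, Bartlett's test of sphericity and the overall KMO statistic, have simple closed forms. The sketch below is a hedged NumPy/SciPy illustration on simulated data, not the authors' code or the FAERS data.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(X):
    """Chi-square test that the correlation matrix is an identity matrix."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)

def kmo(X):
    """Kaiser-Meyer-Olkin measure of sampling adequacy (overall value)."""
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    # anti-image / partial correlations from the inverse correlation matrix
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    P = -inv / d
    off = ~np.eye(R.shape[0], dtype=bool)
    r2, p2 = (R[off] ** 2).sum(), (P[off] ** 2).sum()
    return r2 / (r2 + p2)

# demo: one common factor, six items, n = 500 -> clearly factorable data
rng = np.random.RandomState(4)
f = rng.normal(size=(500, 1))
X = f @ np.full((1, 6), 0.8) + 0.6 * rng.normal(size=(500, 6))
stat, pval = bartlett_sphericity(X)
K = kmo(X)
print(round(K, 2), pval)
```

A tiny Bartlett p-value rejects sphericity, and a KMO value well above 0.5 indicates the correlations are dominated by shared rather than partial variance, the same adequacy screen the Methods describe.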


2020 ◽  
pp. 001316442094289
Author(s):  
Amanda K. Montoya ◽  
Michael C. Edwards

Model fit indices are being increasingly recommended and used to select the number of factors in an exploratory factor analysis. Growing evidence suggests that the recommended cutoff values for common model fit indices are not appropriate for use in an exploratory factor analysis context. A particularly prominent problem in scale evaluation is the ubiquity of correlated residuals and imperfect model specification. Our research focuses on a scale evaluation context and the performance of four standard model fit indices: root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker–Lewis index (TLI), and two equivalence test-based model fit indices: RMSEAt and CFIt. We use Monte Carlo simulation to generate and analyze data based on a substantive example using the Positive and Negative Affect Schedule (N = 1,000). We systematically vary the number and magnitude of correlated residuals as well as nonspecific misspecification to evaluate the impact on model fit indices when fitting a two-factor exploratory factor analysis. Our results show that all fit indices, except SRMR, are overly sensitive to correlated residuals and nonspecific error, resulting in solutions that are overfactored. SRMR performed well, consistently selecting the correct number of factors; however, previous research suggests it does not perform well with categorical data. In general, we do not recommend using model fit indices to select the number of factors in a scale evaluation framework.
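
Two of the indices studied, RMSEA and SRMR, can be computed directly from the sample and model-implied correlation matrices of a maximum-likelihood factor analysis. The sketch below uses scikit-learn's `FactorAnalysis` as a stand-in for the authors' estimation pipeline, so the details (degrees-of-freedom formula, standardization, no rotation) are illustrative assumptions rather than their procedure.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

def efa_fit_indices(X, k):
    """RMSEA and SRMR for a k-factor ML factor analysis of standardized data."""
    n, p = X.shape
    Xs = (X - X.mean(0)) / X.std(0)
    S = np.corrcoef(Xs, rowvar=False)
    fa = FactorAnalysis(n_components=k, random_state=0).fit(Xs)
    Sigma = fa.components_.T @ fa.components_ + np.diag(fa.noise_variance_)
    # ML discrepancy -> chi-square; EFA degrees of freedom ((p-k)^2-(p+k))/2
    F = (np.log(np.linalg.det(Sigma)) + np.trace(S @ np.linalg.inv(Sigma))
         - np.log(np.linalg.det(S)) - p)
    chi_sq = (n - 1) * F
    df = ((p - k) ** 2 - (p + k)) / 2
    rmsea = np.sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))
    srmr = np.sqrt(np.mean((S - Sigma)[np.tril_indices(p)] ** 2))
    return rmsea, srmr

# demo: two orthogonal factors; the 2-factor model should fit, 1-factor not
rng = np.random.RandomState(5)
F2 = rng.normal(size=(500, 2))
L = np.kron(np.eye(2), np.full((3, 1), 0.8))
X = F2 @ L.T + 0.6 * rng.normal(size=(500, 6))
rmsea2, srmr2 = efa_fit_indices(X, 2)
rmsea1, srmr1 = efa_fit_indices(X, 1)
print(round(srmr2, 3), round(srmr1, 3))
```

In this clean case both indices separate the correct and underfactored models; the abstract's point is that once correlated residuals and nonspecific misspecification are added, most of these indices push toward extracting too many factors.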


2020 ◽  
pp. 1-27
Author(s):  
Erik-Jan van Kesteren ◽  
Rogier A. Kievit

Dimension reduction is widely used and often necessary to make network analyses and their interpretation tractable by reducing high-dimensional data to a small number of underlying variables. Techniques such as exploratory factor analysis (EFA) are used by neuroscientists to reduce measurements from a large number of brain regions to a tractable number of factors. However, dimension reduction often ignores relevant a priori knowledge about the structure of the data. For example, it is well established that the brain is highly symmetric. In this paper, we (a) show the adverse consequences of ignoring a priori structure in factor analysis, (b) propose a technique to accommodate structure in EFA by using structured residuals (EFAST), and (c) apply this technique to three large and varied brain-imaging network datasets, demonstrating the superior fit and interpretability of our approach. We provide an R software package to enable researchers to apply EFAST to other suitable datasets.



