Relative Accuracy of Two Modified Parallel Analysis Methods that Use the Proper Reference Distribution

2017 ◽  
Vol 78 (4) ◽  
pp. 589-604 ◽  
Author(s):  
Samuel Green ◽  
Yuning Xu ◽  
Marilyn S. Thompson

Parallel analysis (PA) assesses the number of factors in exploratory factor analysis. Traditionally, PA compares the eigenvalues of a sample correlation matrix with the eigenvalues of correlation matrices for 100 comparison datasets generated such that the variables are independent, but this approach uses the wrong reference distribution. The proper reference distribution of eigenvalues assesses the kth factor based on comparison datasets with k−1 underlying factors. Two methods that use the proper reference distribution are revised PA (R-PA) and the comparison data method (CDM). We compared the accuracies of these methods using Monte Carlo methods, manipulating the factor structure, factor loadings, factor correlations, and number of observations. In the 17 conditions in which CDM was more accurate than R-PA, both methods evidenced high accuracies (i.e., >94.5%), and CDM's advantage was slight (mean difference of 1.6%). In contrast, in the remaining 25 conditions, R-PA evidenced higher accuracies (mean difference of 12.1%, and considerably higher for some conditions). We consider these findings in conjunction with previous research investigating PA methods and conclude that R-PA tends to offer somewhat stronger results. Nevertheless, further research is required. Given that both CDM and R-PA involve hypothesis testing, we argue that future research should explore effect size statistics to augment these methods.
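The traditional PA procedure criticized above can be sketched in a few lines: retain factor k while the kth sample eigenvalue exceeds the mean kth eigenvalue of comparison datasets with independent variables. This is a minimal illustration of the classical method only (not the R-PA or CDM corrections the abstract studies); function and parameter names are illustrative.

```python
import numpy as np

def parallel_analysis(data, n_datasets=100, seed=0):
    """Traditional PA: retain factor k while the kth sample eigenvalue
    exceeds the mean kth eigenvalue of independent comparison data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Sample eigenvalues, sorted in descending order
    sample_eigs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    ref = np.empty((n_datasets, p))
    for i in range(n_datasets):
        sim = rng.standard_normal((n, p))  # independent variables
        ref[i] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    mean_ref = ref.mean(axis=0)
    k = 0
    for s, r in zip(sample_eigs, mean_ref):
        if s > r:
            k += 1
        else:
            break
    return k
```

R-PA differs in that the comparison datasets for the kth eigenvalue are generated from a fitted (k−1)-factor model rather than from independent variables.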

1979 ◽  
Vol 49 (1) ◽  
pp. 223-226 ◽  
Author(s):  
Charles B. Crawford ◽  
Penny Koopman

The inter-rater reliability of Cattell's scree and Linn's mean square ratio test of the number of factors was studied. Sample correlation matrices were generated from a population correlation matrix by means of standard Monte Carlo procedures such that there were 100 samples based on each of 3 sample sizes. Each matrix was factored and the scree test and the mean square ratio test were each applied by five raters. For both tests, the inter-rater reliabilities were very low. These results suggest that inexperienced factor analysts should be wary of these tests of the number of factors.


2021 ◽  
pp. 001316442098205
Author(s):  
André Beauducel ◽  
Norbert Hilger

Methods for optimal factor rotation of two-facet loading matrices have recently been proposed. However, the problem of the correct number of factors to retain for rotation of two-facet loading matrices has rarely been addressed in the context of exploratory factor analysis. Most previous studies were based on the observation that two-facet loading matrices may be rank deficient when the salient loadings of each factor have the same sign. It was shown here that full-rank two-facet loading matrices are, in principle, possible when some factors have positive and negative salient loadings. Accordingly, the current simulation study on the number of factors to extract for two-facet models was based on rank-deficient and full-rank two-facet population models. The number of factors to extract was estimated from traditional parallel analysis based on the mean of the unreduced eigenvalues, as well as from nine other rather traditional versions of parallel analysis (based on the 95th percentile of eigenvalues, on reduced eigenvalues, and on eigenvalue differences). Parallel analysis based on the mean eigenvalues of the correlation matrix with the squared multiple correlations of each variable with the remaining variables inserted in the main diagonal had the highest detection rates for most of the two-facet factor models. Recommendations for the identification of the correct number of factors are based on the simulation results, on the results of an empirical example data set, and on the conditions for approximately rank-deficient and full-rank two-facet models.
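The best-performing variant above reduces the correlation matrix before extracting eigenvalues: each diagonal element is replaced by that variable's squared multiple correlation (SMC) with the remaining variables, computable as 1 − 1/(R⁻¹)ᵢᵢ. A minimal sketch of this reduction step (the function name is illustrative; the full PA comparison loop is omitted):

```python
import numpy as np

def smc_reduced_eigenvalues(R):
    """Eigenvalues of the correlation matrix with squared multiple
    correlations (SMCs) inserted in the main diagonal."""
    # SMC of variable i with the remaining variables: 1 - 1/(R^-1)_ii
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)
    # Descending order; reduced matrices can have negative eigenvalues
    return np.sort(np.linalg.eigvalsh(R_reduced))[::-1]
```

Note that the reduced matrix is generally not positive semidefinite, so some eigenvalues can be negative.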


Author(s):  
Sadamori Kojaku ◽  
Naoki Masuda

Network analysis has been applied to various correlation matrix data. Thresholding on the value of the pairwise correlation is probably the most straightforward and common method to create a network from a correlation matrix. However, there have been criticisms of this thresholding approach, such as an inability to filter out spurious correlations, which have led to proposals of alternative methods to overcome some of the problems. We propose a method to create networks from correlation matrices based on optimization with regularization, where we place an edge between each pair of nodes if and only if the edge is unexpected from a null model. The proposed algorithm is advantageous in that it can be combined with different types of null models. Moreover, the algorithm can select the most plausible null model from a set of candidate null models using a model selection criterion. For three economic datasets, we find that the configuration model for correlation matrices is often preferred to standard null models. For country-level product export data, the present method better predicts the main products exported from countries than sample correlation matrices do.
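For contrast with the authors' regularized, null-model-based method, the thresholding baseline they criticize is trivial to state: connect i and j whenever |r_ij| exceeds a cutoff. A minimal sketch (the function name and cutoff are illustrative):

```python
import numpy as np

def threshold_network(R, tau=0.5):
    """Binary adjacency matrix from a correlation matrix:
    edge (i, j) iff |r_ij| >= tau, with no self-loops."""
    A = (np.abs(R) >= tau).astype(int)
    np.fill_diagonal(A, 0)  # remove self-loops
    return A
```

The weakness noted in the abstract is visible here: the cutoff tau is applied uniformly, so spurious correlations above tau become edges while genuine but weaker dependencies are discarded.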


2021 ◽  
pp. 001316442110220
Author(s):  
David Goretzko

Determining the number of factors in exploratory factor analysis is arguably the most crucial decision a researcher faces when conducting the analysis. While several simulation studies exist that compare various so-called factor retention criteria under different data conditions, little is known about the impact of missing data on this process. Hence, in this study, we evaluated the performance of different factor retention criteria—the Factor Forest, parallel analysis based on a principal component analysis, parallel analysis based on the common factor model, and the comparison data approach—in combination with different missing data methods, namely an expectation-maximization algorithm called Amelia, predictive mean matching, and random forest imputation within the multiple imputation by chained equations (MICE) framework, as well as pairwise deletion, with regard to their accuracy in determining the number of factors when data are missing. Data were simulated for different sample sizes, numbers of factors, numbers of manifest variables (indicators), between-factor correlations, missing data mechanisms, and proportions of missing values. In the majority of conditions and for all factor retention criteria except the comparison data approach, the missing data mechanism had little impact on the accuracy, and pairwise deletion performed comparably to the more sophisticated imputation methods. In some conditions, however, especially in small-sample cases and when comparison data were used to determine the number of factors, random forest imputation was preferable to other missing data methods. Accordingly, depending on data characteristics and the selected factor retention criterion, choosing an appropriate missing data method is crucial to obtain a valid estimate of the number of factors to extract.
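The pairwise deletion strategy that performed surprisingly well here simply computes each pairwise correlation from the cases observed on both variables. A minimal sketch using pandas, whose corr method applies pairwise deletion by default (the data and missingness pattern are made up for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 4))
# Make variables a and b correlated (population r = 0.8)
X[:, 1] = 0.8 * X[:, 0] + 0.6 * rng.standard_normal(200)
df = pd.DataFrame(X, columns=list("abcd"))

# Introduce ~10% missing values completely at random
mask = rng.random(df.shape) < 0.10
df = df.mask(mask)

# Each pairwise correlation uses only the rows observed on both
# variables (pairwise deletion); NaNs are skipped automatically
R = df.corr()
```

A factor retention criterion such as parallel analysis would then be run on R; note that pairwise-deleted correlation matrices can be indefinite, which is one reason imputation sometimes wins in small samples.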


2006 ◽  
Author(s):  
Jinyan Fan ◽  
Felix James Lopez ◽  
Jennifer Nieman ◽  
Robert C. Litchfield ◽  
Robert S. Billings

Author(s):  
David Mendonça ◽  
William A. Wallace ◽  
Barbara Cutler ◽  
James Brooks

Large-scale disasters can produce profound disruptions in the fabric of interdependent critical infrastructure systems such as water, telecommunications and electric power. The work of post-disaster infrastructure restoration typically requires information sharing and close collaboration across these sectors; yet – due to a number of factors – the means to investigate decision making phenomena associated with these activities are limited. This paper motivates and describes the design and implementation of a computer-based synthetic environment for investigating collaborative information seeking in the performance of a (simulated) infrastructure restoration task. The main contributions of this work are twofold. First, it develops a set of theoretically grounded measures of collaborative information seeking processes and embeds them within a computer-based system. Second, it suggests how these data may be organized and modeled to yield insights into information seeking processes in the performance of a complex, collaborative task. The paper concludes with a discussion of implications of this work for practice and for future research.


2021 ◽  
pp. 001316442199283
Author(s):  
Yan Xia

Despite the existence of many methods for determining the number of factors, none outperforms the others under every condition. This study compares traditional parallel analysis (TPA), revised parallel analysis (RPA), Kaiser's rule, minimum average partial, sequential χ2, and sequential root mean square error of approximation, comparative fit index, and Tucker–Lewis index under a realistic scenario in behavioral studies, where researchers employ a close-fitting parsimonious model with K factors to approximate a population model, leading to a trivial model–data misfit. Results show that while TPA and RPA both stand out when zero population-level misfit exists, the accuracy of RPA substantially deteriorates when a K-factor model can only closely approximate the population. TPA is the least sensitive to trivial misfits and results in the highest accuracy across most simulation conditions. This study suggests the use of TPA for the investigated models. Results also imply that RPA requires further revision to accommodate a degree of model–data misfit that can be tolerated.
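Of the criteria compared above, Kaiser's rule is the simplest to state: retain as many factors as there are eigenvalues of the correlation matrix greater than 1. A minimal sketch (the function name is illustrative; the sequential fit-index criteria in the abstract require full model estimation and are not shown):

```python
import numpy as np

def kaiser_rule(R):
    """Kaiser's rule: number of eigenvalues of the
    correlation matrix that exceed 1."""
    return int((np.linalg.eigvalsh(R) > 1.0).sum())
```

Its simplicity is also its weakness: the cutoff of 1 ignores sampling variability, which is exactly what the parallel analysis methods above are designed to account for.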


2019 ◽  
Vol 208 (2) ◽  
pp. 507-534 ◽  
Author(s):  
Natalia Bailey ◽  
M. Hashem Pesaran ◽  
L. Vanessa Smith
