Parallel analysis with categorical variables: Impact of category probability proportions on dimensionality assessment accuracy.

Cognitive diagnosis models (CDMs) allow classifying respondents into a set of discrete attribute profiles. The internal structure of the test is specified in a Q-matrix, whose correct specification is necessary to achieve an accurate attribute profile classification. Several empirical Q-matrix estimation and validation methods have been proposed with the aim of providing well-specified Q-matrices. However, these methods require the number of attributes to be set in advance. No systematic studies of dimensionality assessment for CDMs have been conducted, which contrasts with the vast existing literature for the factor analysis framework. To address this gap, the present study evaluates the performance of several dimensionality assessment methods from the factor analysis literature in determining the number of attributes in the context of CDMs. The explored methods were parallel analysis, minimum average partial, very simple structure, DETECT, empirical Kaiser criterion, exploratory graph analysis, and a machine learning factor forest model. Additionally, a model comparison approach was considered, which consists of comparing the model fit of empirically estimated Q-matrices. The performance of these methods was assessed by means of a comprehensive simulation study that varied the generating number of attributes, item quality, sample size, ratio of number of items to attributes, correlations among the attributes, attribute thresholds, and generating CDM. Results showed that parallel analysis (with Pearson correlations and mean eigenvalue criterion), the factor forest model, and model comparison (with AIC) are suitable alternatives to determine the number of attributes in CDM applications, with correct estimates in more than 76% of the conditions. The accuracy increased to 97% when these three methods agreed on the number of attributes. In short, the present study supports the use of these three methods in assessing the dimensionality of CDMs. This makes it possible to test the assumption of correct dimensionality present in the Q-matrix estimation and validation methods, as well as to gather evidence of validity to support the use of the scores obtained with these models. The findings of this study are illustrated using real data from an intelligence test to provide guidelines for assessing the dimensionality of CDM data in applied settings.
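As an illustration of the parallel analysis variant named in the abstract (Pearson correlations, mean eigenvalue criterion), the following is a minimal sketch and not the authors' implementation. The function name `parallel_analysis`, the number of reference datasets, and the use of column permutation to generate reference data (which keeps each item's category proportions intact) are assumptions made for this example.

```python
import numpy as np


def parallel_analysis(data, n_sims=100, seed=0):
    """Suggest a number of dimensions via parallel analysis.

    Hypothetical helper: compares the eigenvalues of the observed Pearson
    correlation matrix with the mean eigenvalues obtained from reference
    datasets in which every column is permuted independently.
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n_cols = data.shape[1]

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues of reference data with the same marginals but no structure.
    reference = np.empty((n_sims, n_cols))
    for i in range(n_sims):
        shuffled = np.column_stack(
            [rng.permutation(data[:, j]) for j in range(n_cols)]
        )
        reference[i] = np.linalg.eigvalsh(np.corrcoef(shuffled, rowvar=False))[::-1]

    # Mean eigenvalue criterion: retain leading eigenvalues above the mean.
    threshold = reference.mean(axis=0)
    retained = 0
    for obs_value, ref_value in zip(observed, threshold):
        if obs_value <= ref_value:
            break
        retained += 1
    return retained
```

Calling `parallel_analysis(responses)` on a respondents-by-items array of dichotomous scores returns the suggested number of dimensions. Other choices (polychoric correlations, a 95th percentile criterion) would change the result and are not shown here.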


A Bootstrap Generalization of Modified Parallel Analysis for IRT Dimensionality Assessment

Applied Measurement in Education ◽

10.1080/08957340801926102 ◽

2008 ◽

Vol 21 (2) ◽

pp. 119-140 ◽

Cited By ~ 9

Author(s):

Holmes Finch ◽

Patrick Monahan

Keyword(s):

Parallel Analysis ◽

Dimensionality Assessment


Dimensionality assessment of ordered polytomous items with parallel analysis.

Psychological Methods ◽

10.1037/a0023353 ◽

2011 ◽

Vol 16 (2) ◽

pp. 209-220 ◽

Cited By ~ 373

Author(s):

Marieke E. Timmerman ◽

Urbano Lorenzo-Seva

Keyword(s):

Parallel Analysis ◽

Polytomous Items ◽

Dimensionality Assessment


Influence of General and Crystallized Intelligence on Vocabulary Test Performance

European Journal of Psychological Assessment ◽

10.1027//1015-5759.18.1.78 ◽

2002 ◽

Vol 18 (1) ◽

pp. 78-84 ◽

Cited By ~ 10

Author(s):

Eva Ullstadius ◽

Jan-Eric Gustafsson ◽

Berit Carlstedt

Keyword(s):

Hierarchical Model ◽

Test Performance ◽

Verbal Ability ◽

Testing Procedure ◽

Intellectual Ability ◽

Analysis Of Covariance ◽

Categorical Variables ◽

Vocabulary Test ◽

Crystallized Intelligence ◽

General Ability

Summary: Vocabulary tests, part of most test batteries of general intellectual ability, measure both verbal and general ability. Newly developed techniques for confirmatory factor analysis of dichotomous variables make it possible to analyze the influence of different abilities on the performance on each item. In the testing procedure of the Computerized Swedish Enlistment test battery, eight different subtests of a new vocabulary test were given randomly to subsamples of a representative sample of 18-year-old male conscripts (N = 9001). Three central dimensions of a hierarchical model of intellectual abilities, general (G), verbal (Gc'), and spatial ability (Gv') were estimated under different assumptions of the nature of the data. In addition to an ordinary analysis of covariance matrices, assuming linearity of relations, the item variables were treated as categorical variables in the Mplus program. All eight subtests fit the hierarchical model, and the items were found to load about equally on G and Gc'. The results also indicate that if nonlinearity is not taken into account, the G loadings for the easy items are underestimated. These items, moreover, appear to be better measures of G than the difficult ones. The practical utility of the outcome for item selection and the theoretical implications for the question of the origin of verbal ability are discussed.


An Alternative to Cohen's κ

European Psychologist ◽

10.1027/1016-9040.11.1.12 ◽

2006 ◽

Vol 11 (1) ◽

pp. 12-24 ◽

Cited By ~ 19

Author(s):

Alexander von Eye

Keyword(s):

Simulation Study ◽

Null Hypothesis ◽

Categorical Variables ◽

Alternative Measure ◽

Rater Agreement ◽

Verbal Processing ◽

Heavy Tailed ◽

Applicant Selection

At the level of manifest categorical variables, a large number of coefficients and models for the examination of rater agreement have been proposed and used. The most popular of these is Cohen's κ. In this article, a new coefficient, κₛ, is proposed as an alternative measure of rater agreement. Both κ and κₛ allow researchers to determine whether agreement in groups of two or more raters is significantly beyond chance. Stouffer's z is used to test the null hypothesis that κₛ = 0. In addition to evaluating rater agreement in a fashion parallel to κ, the coefficient κₛ allows one to (1) examine subsets of cells in agreement tables, (2) examine cells that indicate disagreement, (3) consider alternative chance models, (4) take covariates into account, and (5) compare independent samples. Results from a simulation study are reported, which suggest that (a) the four measures of rater agreement, Cohen's κ, Brennan and Prediger's κₙ, raw agreement, and κₛ, are sensitive to the same data characteristics when evaluating rater agreement and (b) both the z-statistic for Cohen's κ and Stouffer's z for κₛ are unimodally and symmetrically distributed, but slightly heavy-tailed. Examples use data from verbal processing and applicant selection.
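For orientation, three of the established measures named in the abstract (raw agreement, Cohen's κ, and Brennan and Prediger's κₙ) can be computed from a square agreement table as in the sketch below. The function name `agreement_measures` and the example table are assumptions made for illustration. The new coefficient proposed in the article is not defined in the abstract and is therefore not implemented.

```python
import numpy as np


def agreement_measures(table):
    """Return raw agreement, Cohen's kappa, and Brennan-Prediger kappa_n.

    `table` is a square cross-classification of two raters' judgments:
    rows are rater A's categories, columns are rater B's categories.
    """
    table = np.asarray(table, dtype=float)
    total = table.sum()
    n_categories = table.shape[0]

    # Proportion of cases on the main diagonal (both raters agree).
    raw = np.trace(table) / total

    # Chance agreement under independence of the two raters' marginals.
    chance = (table.sum(axis=1) @ table.sum(axis=0)) / total**2
    kappa = (raw - chance) / (1.0 - chance)

    # Brennan and Prediger: chance agreement fixed at 1 / number of categories.
    uniform = 1.0 / n_categories
    kappa_n = (raw - uniform) / (1.0 - uniform)

    return raw, kappa, kappa_n


# Hypothetical 2 x 2 agreement table for two raters.
raw, kappa, kappa_n = agreement_measures([[20, 5], [10, 15]])
```

The two κ variants differ only in the chance model: Cohen's κ derives chance agreement from the observed marginals, while κₙ assumes all categories are equally likely.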


Number of Factors Decision: Parallel Analysis Is Not the Panacea

PsycEXTRA Dataset ◽

10.1037/e518572013-186 ◽

2006 ◽

Author(s):

Jinyan Fan ◽

Felix James Lopez ◽

Jennifer Nieman ◽

Robert C. Litchfield ◽

Robert S. Billings

Keyword(s):

Parallel Analysis ◽

Number Of Factors


Blood Management in Total Knee Arthroplasty: A Nationwide Analysis from 2011 to 2018

The Journal of Knee Surgery ◽

10.1055/s-0040-1721414 ◽

2020 ◽

Author(s):

Jared A. Warren ◽

John P. McLaughlin ◽

Robert M. Molloy ◽

Carlos A. Higuera ◽

Jonathan L. Schaffer ◽

...

Keyword(s):

Total Knee Arthroplasty ◽

Platelet Count ◽

Knee Arthroplasty ◽

Categorical Variables ◽

Bleeding Disorders ◽

Blood Management ◽

Improvement Program ◽

Preoperative Anemia ◽

High Incidence ◽

Total Knee

Abstract: Advances in perioperative blood management, anesthesia, and surgical technique have improved transfusion rates following primary total knee arthroplasty (TKA) and have driven substantial change in preoperative blood ordering protocols. Blood management in TKA has therefore seen substantial changes with the implementation of preoperative screening, patient optimization, and intra- and postoperative advances. The purpose of this study was to examine changes in blood management in primary TKA in a nationwide sample, to assess gaps and opportunities. The American College of Surgeons National Surgical Quality Improvement Program database was used to identify TKA cases (n = 337,160) from 2011 to 2018. The following variables were examined: preoperative hematocrit (HCT), anemia (HCT <35.5% for females and <38.5% for males), platelet count, thrombocytopenia (platelet count < 150,000/µL), international normalized ratio (INR), INR > 2.0, bleeding disorders, and preoperative and postoperative transfusions. Analyses of variance were used to examine changes in continuous variables, and chi-squared tests were used for categorical variables. There was a substantial decrease in postoperative transfusions from a high of 18.3% in 2011 to a low of 1.0% in 2018 (p < 0.001), as well as in preoperative anemia from a high of 13.3% in 2011 to a low of 9.5% in 2016 to 2017 (p < 0.001). There were statistically significant but clinically irrelevant changes in the other variables examined. There was an HCT high of 41.2 in 2016 and a low of 40.4 in 2011 to 2012 (p < 0.001). There was a platelet count high of 247,400 in 2018 and a low of 242,700 in 201 (p < 0.001). There was a high incidence of thrombocytopenia of 5.2% in 2017 and a low of 4.4% in 2018 (p < 0.001). There was a high INR of 1.037 in 2011 and a low of 1.021 in 2013 (p < 0.001). There was a high incidence of INR >2.0 of 1.0% in 2012 to 2015 and a low of 0.8% in 2016 to 2018 (p = 0.027). There was a high incidence of bleeding disorders of 2.9% in 2013 and a low of 1.8% in 2017 to 2018 (p < 0.001). There was a high incidence of preoperative transfusions of 0.1% in 2011 to 2014 and a low of <0.1% in 2015 to 2018 (p = 0.021). From 2011 to 2018, there have been substantial decreases in patients receiving postoperative transfusions after primary TKA. Similarly, although a decrease in patients with anemia was seen, 1 out of 10 patients still has preoperative anemia, highlighting the opportunity to further improve and address this potentially modifiable risk factor before surgery. These findings may reflect changes in TKA patient selection, optimization, or management, and emphasize the need to further advance multimodal approaches for perioperative blood management of TKA patients. This is a Level III study.


Bridging the Gap between Quantitative and Qualitative Historical Research: an application of multiple regression analysis and homogeneity analysis with alternating least squares

History and Computing ◽

10.3366/hac.1996.8.3.133 ◽

1996 ◽

Vol 8 (3) ◽

pp. 133-144 ◽

Cited By ~ 1

Author(s):

María del Mar del Pozo Andrés ◽

Jacques F A Braster

Keyword(s):

Least Squares ◽

Multiple Regression ◽

Historical Research ◽

Alternating Least Squares ◽

Categorical Variables ◽

Two Dimensional ◽

Homogeneity Analysis ◽

Regression Approach ◽

Dimensional Picture ◽

Selection Of

In this article we propose two research techniques that can bridge the gap between quantitative and qualitative historical research. These are: (1) a multiple regression approach that gives information about general patterns between numerical variables and the selection of outliers for qualitative analysis; (2) a homogeneity analysis with alternating least squares that results in a two-dimensional picture in which the relationships between categorical variables are graphically presented.


Identifying Key Fraud Indicators in the Automobile Insurance Industry Using SQL Server Analysis Services

Studia Universitatis Babeș-Bolyai Oeconomica ◽

10.2478/subboec-2019-0009 ◽

2019 ◽

Vol 64 (2) ◽

pp. 53-71

Author(s):

Botond Benedek ◽

Ede László

Keyword(s):

Neural Network ◽

Decision Tree ◽

Naive Bayes ◽

Insurance Industry ◽

Naïve Bayes ◽

Sql Server ◽

Categorical Variables ◽

Automobile Insurance ◽

Price Determination ◽

Mining Tool

Abstract: Customer segmentation represents a true challenge in the automobile insurance industry, as datasets are large, multidimensional, and unbalanced, and segmentation also requires a unique price determination based on the risk profile of the customer. Furthermore, the price determination of an insurance policy or the validity of a compensation claim must, in most cases, be an instant decision. Therefore, the purpose of this research is to identify an easily usable data mining tool that is capable of identifying key automobile insurance fraud indicators, facilitating the segmentation. In addition, the methods used by the tool should be based primarily on numerical and categorical variables, as there is no well-functioning text mining tool for Central Eastern European languages. Hence, we chose the SQL Server Analysis Services (SSAS) tool and compared the performance of the decision tree, neural network, and Naïve Bayes methods. The results suggest that decision tree and neural network are more suitable than Naïve Bayes; however, the best conclusion can be drawn if the decision tree and neural network are used together.
