Parallel analysis with categorical variables: Impact of category probability proportions on dimensionality assessment accuracy.

2019 ◽  
Vol 24 (3) ◽  
pp. 339-351 ◽  
Author(s):  
Dirk Lubbe
2021 ◽  
Vol 12 ◽  
Author(s):  
Pablo Nájera ◽  
Francisco José Abad ◽  
Miguel A. Sorrel

Cognitive diagnosis models (CDMs) classify respondents into a set of discrete attribute profiles. The internal structure of the test is specified in a Q-matrix, whose correct specification is necessary for accurate attribute profile classification. Several empirical Q-matrix estimation and validation methods have been proposed with the aim of providing well-specified Q-matrices. However, these methods require the number of attributes to be set in advance. No systematic studies of dimensionality assessment for CDMs have been conducted, which contrasts with the vast literature in the factor analysis framework. To address this gap, the present study evaluates the performance of several dimensionality assessment methods from the factor analysis literature in determining the number of attributes in the context of CDMs. The explored methods were parallel analysis, minimum average partial, very simple structure, DETECT, empirical Kaiser criterion, exploratory graph analysis, and a machine learning factor forest model. Additionally, a model comparison approach was considered, which consists of comparing the model fit of empirically estimated Q-matrices. The performance of these methods was assessed by means of a comprehensive simulation study that varied the generating number of attributes, item quality, sample size, ratio of the number of items to attributes, correlations among the attributes, attribute thresholds, and generating CDM. Results showed that parallel analysis (with Pearson correlations and the mean eigenvalue criterion), the factor forest model, and model comparison (with AIC) are suitable alternatives for determining the number of attributes in CDM applications, with an overall percentage of correct estimates above 76% of the conditions. The accuracy increased to 97% when these three methods agreed on the number of attributes. In short, the present study supports the use of these three methods in assessing the dimensionality of CDMs. This will make it possible to test the assumption of correct dimensionality made by Q-matrix estimation and validation methods, as well as to gather validity evidence supporting the use of the scores obtained with these models. The findings of this study are illustrated using real data from an intelligence test to provide guidelines for assessing the dimensionality of CDM data in applied settings.
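
As a rough illustration of the parallel analysis variant named in the abstract above (Pearson correlations with the mean eigenvalue criterion), the following Python sketch may help. It is not the authors' implementation: the function name parallel_analysis, the use of uncorrelated normal data as the reference, the number of simulations, and the simulated two-attribute binary responses are all assumptions made for the example.

```python
import numpy as np


def parallel_analysis(data, n_sims=100, seed=0):
    """Sketch of parallel analysis: Pearson correlations, mean-eigenvalue criterion.

    Assumption: reference eigenvalues come from uncorrelated standard-normal
    data with the same number of rows and columns as `data`.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = data.shape

    # Eigenvalues of the observed Pearson correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    # Eigenvalues of correlation matrices computed from pure-noise data sets.
    random_eigs = np.empty((n_sims, n_cols))
    for s in range(n_sims):
        noise = rng.standard_normal((n_rows, n_cols))
        eigs = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        random_eigs[s] = np.sort(eigs)[::-1]
    reference = random_eigs.mean(axis=0)  # mean-eigenvalue criterion

    # Count leading observed eigenvalues that exceed their reference value.
    n_dims = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:
            break
        n_dims += 1
    return n_dims, observed, reference


# Hypothetical data: 1000 respondents, 10 binary items, two uncorrelated
# latent dimensions with five items each (values chosen for illustration only).
rng = np.random.default_rng(42)
factors = rng.standard_normal((1000, 2))
loadings = np.zeros((10, 2))
loadings[:5, 0] = 0.8
loadings[5:, 1] = 0.8
latent = factors @ loadings.T + 0.6 * rng.standard_normal((1000, 10))
responses = (latent > 0).astype(int)  # dichotomize at zero

n_dims, observed, reference = parallel_analysis(responses)
print("suggested number of dimensions:", n_dims)
```

The sketch stops at the first observed eigenvalue that does not exceed its reference value, which is one common reading of the retention rule; other variants (for example, percentile-based references or resampling the observed data) would change only the reference step.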


2011 ◽  
Vol 16 (2) ◽  
pp. 209-220 ◽  
Author(s):  
Marieke E. Timmerman ◽  
Urbano Lorenzo-Seva

2002 ◽  
Vol 18 (1) ◽  
pp. 78-84 ◽  
Author(s):  
Eva Ullstadius ◽  
Jan-Eric Gustafsson ◽  
Berit Carlstedt

Summary: Vocabulary tests, part of most test batteries of general intellectual ability, measure both verbal and general ability. Newly developed techniques for confirmatory factor analysis of dichotomous variables make it possible to analyze the influence of different abilities on the performance on each item. In the testing procedure of the Computerized Swedish Enlistment test battery, eight different subtests of a new vocabulary test were given randomly to subsamples of a representative sample of 18-year-old male conscripts (N = 9001). Three central dimensions of a hierarchical model of intellectual abilities, general (G), verbal (Gc'), and spatial ability (Gv') were estimated under different assumptions of the nature of the data. In addition to an ordinary analysis of covariance matrices, assuming linearity of relations, the item variables were treated as categorical variables in the Mplus program. All eight subtests fit the hierarchical model, and the items were found to load about equally on G and Gc'. The results also indicate that if nonlinearity is not taken into account, the G loadings for the easy items are underestimated. These items, moreover, appear to be better measures of G than the difficult ones. The practical utility of the outcome for item selection and the theoretical implications for the question of the origin of verbal ability are discussed.


2006 ◽  
Vol 11 (1) ◽  
pp. 12-24 ◽  
Author(s):  
Alexander von Eye

At the level of manifest categorical variables, a large number of coefficients and models for the examination of rater agreement have been proposed and used. The most popular of these is Cohen's κ. In this article, a new coefficient, κ_s, is proposed as an alternative measure of rater agreement. Both κ and κ_s allow researchers to determine whether agreement in groups of two or more raters is significantly beyond chance. Stouffer's z is used to test the null hypothesis that κ_s = 0. In addition to evaluating rater agreement in a fashion parallel to κ, the coefficient κ_s allows one to (1) examine subsets of cells in agreement tables, (2) examine cells that indicate disagreement, (3) consider alternative chance models, (4) take covariates into account, and (5) compare independent samples. Results from a simulation study are reported, which suggest that (a) the four measures of rater agreement, Cohen's κ, Brennan and Prediger's κ_n, raw agreement, and κ_s, are sensitive to the same data characteristics when evaluating rater agreement and (b) both the z-statistic for Cohen's κ and Stouffer's z for κ_s are unimodally and symmetrically distributed, but slightly heavy-tailed. Examples use data from verbal processing and applicant selection.
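
For orientation, three of the agreement measures named in the abstract above (raw agreement, Cohen's κ, and Brennan and Prediger's κ_n) can be computed for two raters as in the following Python sketch. The proposed coefficient κ_s is not implemented here, since the abstract does not give its formula. The function name agreement_coefficients, the restriction to two raters, the choice to take the number of categories from the observed ratings, and the example ratings are assumptions made for illustration.

```python
from collections import Counter


def agreement_coefficients(ratings_a, ratings_b):
    """Raw agreement, Cohen's kappa, and Brennan-Prediger kappa_n for two raters.

    Assumption: the number of categories k is taken from the observed ratings.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of objects")
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are needed")

    # Observed proportion of objects on which the raters agree.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement under independence, from each rater's marginals.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / n**2

    kappa = (p_o - p_e) / (1 - p_e)        # Cohen's kappa
    kappa_n = (p_o - 1 / k) / (1 - 1 / k)  # Brennan and Prediger's kappa_n
    return {"raw": p_o, "kappa": kappa, "kappa_n": kappa_n}


# Hypothetical ratings of ten objects by two raters (illustration only).
rater_a = ["y", "y", "y", "n", "n", "n", "y", "n", "y", "n"]
rater_b = ["y", "y", "n", "n", "n", "y", "y", "n", "y", "n"]
result = agreement_coefficients(rater_a, rater_b)
print(result)
```

In this example both raters use each category equally often, so the chance term of Cohen's κ coincides with the uniform chance term 1/k of κ_n and the two coefficients are equal; with unbalanced marginals they would differ.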


2006 ◽  
Author(s):  
Jinyan Fan ◽  
Felix James Lopez ◽  
Jennifer Nieman ◽  
Robert C. Litchfield ◽  
Robert S. Billings

Author(s):  
Jared A. Warren ◽  
John P. McLaughlin ◽  
Robert M. Molloy ◽  
Carlos A. Higuera ◽  
Jonathan L. Schaffer ◽  
...  

Abstract: Advances in perioperative blood management, anesthesia, and surgical technique have improved transfusion rates following primary total knee arthroplasty (TKA), and have driven substantial change in preoperative blood ordering protocols. Therefore, blood management in TKA has seen substantial changes with the implementation of preoperative screening, patient optimization, and intra- and postoperative advances. Thus, the purpose of this study was to examine changes in blood management in primary TKA in a nationwide sample, to assess gaps and opportunities. The American College of Surgeons National Surgical Quality Improvement Program database was used to identify TKA (n = 337,160) cases from 2011 to 2018. The following variables were examined: preoperative hematocrit (HCT), anemia (HCT <35.5% for females and <38.5% for males), platelet count, thrombocytopenia (platelet count < 150,000/µL), international normalized ratio (INR), INR > 2.0, bleeding disorders, and preoperative and postoperative transfusions. Analyses of variance were used to examine changes in continuous variables, and Chi-squared tests were used for categorical variables. There was a substantial decrease in postoperative transfusions from a high of 18.3% in 2011 to a low of 1.0% in 2018 (p < 0.001), as well as in preoperative anemia from a high of 13.3% in 2011 to a low of 9.5% in 2016 to 2017 (p < 0.001). There were statistically significant, but clinically irrelevant, changes in the other variables examined. There was a HCT high of 41.2 in 2016 and a low of 40.4 in 2011 to 2012 (p < 0.001). There was a platelet count high of 247,400 in 2018 and a low of 242,700 in 201 (p < 0.001). There was a high incidence of thrombocytopenia of 5.2% in 2017 and a low of 4.4% in 2018 (p < 0.001). There was a high INR of 1.037 in 2011 and a low of 1.021 in 2013 (p < 0.001). There was a high incidence of INR >2.0 of 1.0% in 2012 to 2015 and a low of 0.8% in 2016 to 2018 (p = 0.027). There was a high incidence of bleeding disorders of 2.9% in 2013 and a low of 1.8% in 2017 to 2018 (p < 0.001). There was a high incidence of preoperative transfusions of 0.1% in 2011 to 2014 and a low of <0.1% in 2015 to 2018 (p = 0.021). From 2011 to 2018, there have been substantial decreases in patients receiving postoperative transfusions after primary TKA. Similarly, although a decrease in patients with anemia was seen, 1 out of 10 patients still has preoperative anemia, highlighting the opportunity to further improve and address this potentially modifiable risk factor before surgery. These findings may reflect changes during TKA patient selection, optimization, or management, and emphasize the need to further advance multimodal approaches for perioperative blood management of TKA patients. This is a Level III study.
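
To make concrete the kind of Chi-squared test the abstract above mentions for categorical variables, here is a minimal Python sketch of Pearson's test of independence on a contingency table. It is not the study's analysis: the function name chi_squared_independence is hypothetical, the counts are invented for illustration and are not taken from the study, no continuity correction is applied, and a p-value is computed only for the one-degree-of-freedom case.

```python
import math


def chi_squared_independence(table):
    """Pearson chi-squared test of independence for a contingency table.

    Returns (statistic, degrees of freedom, p-value). The p-value is computed
    only when there is one degree of freedom; otherwise it is None.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("table has an empty row or column")
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2)) if dof == 1 else None
    return stat, dof, p_value


# Invented counts for illustration only (NOT from the study):
# rows are two time periods, columns are event / no event.
table = [
    [30, 70],
    [10, 90],
]
stat, dof, p_value = chi_squared_independence(table)
print(f"chi2 = {stat:.2f}, dof = {dof}, p = {p_value:.4g}")
```

A comparison across all eight years would use a table with eight rows and seven degrees of freedom, in which case the p-value would need the general Chi-squared survival function from a statistics library rather than the one-degree-of-freedom shortcut used here.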


1996 ◽  
Vol 8 (3) ◽  
pp. 133-144 ◽  
Author(s):  
María del Mar del Pozo Andrés ◽  
Jacques F A Braster

In this article we propose two research techniques that can bridge the gap between quantitative and qualitative historical research. These are: (1) a multiple regression approach that gives information about general patterns between numerical variables and the selection of outliers for qualitative analysis; (2) a homogeneity analysis with alternating least squares that results in a two-dimensional picture in which the relationships between categorical variables are graphically presented.


2019 ◽  
Vol 64 (2) ◽  
pp. 53-71
Author(s):  
Botond Benedek ◽  
Ede László

Abstract: Customer segmentation is a true challenge in the automobile insurance industry: datasets are large, multidimensional, and unbalanced, and pricing must be determined individually based on the risk profile of the customer. Furthermore, the price determination of an insurance policy or the validity of a compensation claim must, in most cases, be an instant decision. Therefore, the purpose of this research is to identify an easily usable data mining tool that is capable of identifying key automobile insurance fraud indicators, facilitating the segmentation. In addition, the methods used by the tool should be based primarily on numerical and categorical variables, as there is no well-functioning text mining tool for Central Eastern European languages. Hence, we decided on the SQL Server Analysis Services (SSAS) tool and compared the performance of the decision tree, neural network, and Naïve Bayes methods. The results suggest that the decision tree and neural network are more suitable than Naïve Bayes; however, the best conclusions can be drawn if the decision tree and neural network are used together.

