Is It Worthy to Take Account of the “Guessing” in the Performance of the Raven Test? Calling for the Principle of Parsimony for Test Validation

2020 ◽  
pp. 073428292093092 ◽  
Author(s):  
Patrícia Silva Lúcio ◽  
Joachim Vandekerckhove ◽  
Guilherme V. Polanczyk ◽  
Hugo Cogo-Moreira

The present study compares the fit of two- and three-parameter logistic (2PL and 3PL) models of item response theory in the performance of preschool children on Raven's Colored Progressive Matrices. Raven's test is widely used for evaluating nonverbal intelligence (factor g). Studies comparing models with real data are scarce in the literature, and this is the first to compare two- and three-parameter models for Raven's test, evaluating the informational gain from modeling guessing probability. Participants were 582 Brazilian preschool children (Mage = 57 months; SD = 7 months; 46% female) who responded individually to the instrument. The model fit indices suggested that the 2PL model fit the data better. The difficulty and ability parameters were similar between the models, with almost perfect correlations. Differences were observed in terms of discrimination and test information. The principle of parsimony should be invoked when comparing models.
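The difference between the two models under comparison can be made concrete. As a minimal sketch (the parameter values below are illustrative, not the study's estimates), the 3PL item response function adds only a lower asymptote c, the "guessing" parameter, to the 2PL curve:

```python
import numpy as np

def p_2pl(theta, a, b):
    # 2PL: probability of a correct response given ability theta,
    # discrimination a, and difficulty b
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def p_3pl(theta, a, b, c):
    # 3PL adds a lower asymptote c (the "guessing" parameter)
    return c + (1.0 - c) * p_2pl(theta, a, b)

theta = np.linspace(-3, 3, 7)
two = p_2pl(theta, a=1.2, b=0.0)
three = p_3pl(theta, a=1.2, b=0.0, c=0.2)
# With c > 0, the 3PL curve never drops below c, so low-ability
# examinees retain a nonzero success probability.
```

Parsimony favors the 2PL whenever the extra asymptote parameter does not buy a meaningful improvement in fit.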

2022 ◽  
Author(s):  
Neil Hester ◽  
Jordan Axt ◽  
Eric Hehman

Racial attitudes, beliefs, and motivations lie at the center of many of the most influential theories of prejudice and discrimination. The extent to which such theories can meaningfully explain behavior hinges on accurate measurement of these latent constructs. We evaluated the validity properties of 25 race-related scales in a sample of 1,031,207 respondents using modern approaches such as dynamic fit indices, Item Response Theory, and nomological nets. Despite showing adequate internal reliability, many scales demonstrated poor model fit and had latent score distributions showing clear floor or ceiling effects, results that illustrate deficiencies in measures’ ability to capture their intended construct. Nomological nets further suggested that the theoretical space of “racial prejudice” is crowded with scales that may not actually capture meaningfully distinct latent constructs. We provide concrete recommendations for scale selection and renovation and outline implications for overlooking measurement issues in the study of prejudice and discrimination.
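One of the deficiencies reported, floor or ceiling effects in latent score distributions, is simple to screen for. A hedged sketch (the data, scale bounds, and the 15% rule of thumb are illustrative assumptions, not values from the study):

```python
import numpy as np

def floor_ceiling_rates(scores, low, high):
    # Fraction of respondents sitting exactly at the scale minimum
    # and at the scale maximum.
    scores = np.asarray(scores)
    return (scores == low).mean(), (scores == high).mean()

# Hypothetical 1-7 Likert scores piled up at the floor:
scores = [1, 1, 1, 1, 2, 3, 1, 1, 4, 1]
floor_rate, ceil_rate = floor_ceiling_rates(scores, low=1, high=7)

# A common rule of thumb flags a measure when more than ~15% of
# respondents sit at either extreme of the score distribution.
flagged = floor_rate > 0.15 or ceil_rate > 0.15
```

A scale whose scores bunch at one extreme cannot discriminate among the respondents it was designed to distinguish, regardless of its internal reliability.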


2018 ◽  
Vol 15 (4) ◽  
pp. 2407
Author(s):  
Yeşim Bayrakdaroglu ◽  
Dursun Katkat

The purpose of this study is to investigate how the marketing activities of international sports organizations are performed and to develop a scale measuring the effects of image management on the public. Spectators of the interuniversity World Winter Olympics held in Erzurum in 2011 participated in the research. Exploratory and confirmatory factor analyses and a reliability analysis were performed on the data obtained. All model fit indices for the 25-item, four-factor structure of the perceived quality-image scale for sports organizations were found to be at a good level. In line with the findings from the exploratory and confirmatory factor analyses and the reliability analysis, it can be stated that the scale is a valid and reliable measurement tool for use in field research.
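The reliability analysis mentioned above typically reports Cronbach's alpha. A self-contained sketch of the standard formula (the toy data are hypothetical, not the study's responses):

```python
import numpy as np

def cronbach_alpha(items):
    # items: respondents x items matrix of scores
    # alpha = k/(k-1) * (1 - sum of item variances / total-score variance)
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars / total_var)

# Toy data: three items that roughly track one another
data = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 3, 2], [1, 2, 1]]
alpha = cronbach_alpha(data)
```

Values around .70 or higher are conventionally read as acceptable internal consistency, though alpha alone does not establish the factor structure that the EFA/CFA steps address.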


2018 ◽  
Vol 18 (3) ◽  
Author(s):  
Pablo Ezequiel Flores-Kanter ◽  
Sergio Dominguez-Lara ◽  
Mario Alberto Trógolo ◽  
Leonardo Adrián Medrano

Bifactor models have gained increasing popularity in the literature concerned with personality, psychopathology, and assessment. Empirical studies using bifactor analysis generally judge the estimated model using SEM model fit indices, which may lead to erroneous interpretations and conclusions. To address this problem, several researchers have proposed multiple criteria to assess bifactor models, such as (a) conceptual grounds, (b) overall model fit indices, and (c) specific bifactor model indicators. In this article, we provide a brief summary of these criteria. An example using data gathered from a recently published research article is also provided to show how taking into account all criteria, rather than solely SEM model fit indices, may prevent researchers from drawing wrong conclusions.
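Two widely used specific bifactor indicators of the kind referred to above are the explained common variance (ECV) and omega hierarchical. A sketch from standardized loadings (the loading values are hypothetical, and the omega formula assumes uncorrelated factors and unit-variance items):

```python
import numpy as np

# Hypothetical standardized loadings for 6 items on a general factor
# and two specific factors (zeros where an item does not load).
general = np.array([0.6, 0.7, 0.5, 0.6, 0.7, 0.5])
spec1   = np.array([0.4, 0.3, 0.5, 0.0, 0.0, 0.0])
spec2   = np.array([0.0, 0.0, 0.0, 0.3, 0.4, 0.2])

# ECV: share of common variance attributable to the general factor
common = (general**2).sum() + (spec1**2).sum() + (spec2**2).sum()
ecv = (general**2).sum() / common

# Omega hierarchical: general-factor variance over total score variance
uniqueness = 1 - general**2 - spec1**2 - spec2**2
total_var = general.sum()**2 + spec1.sum()**2 + spec2.sum()**2 + uniqueness.sum()
omega_h = general.sum()**2 / total_var
```

High ECV and omega hierarchical support interpreting the total score as reflecting mainly the general factor, which is exactly the question that overall SEM fit indices cannot answer on their own.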


2020 ◽  
Vol 44 (5) ◽  
pp. 362-375
Author(s):  
Tyler Strachan ◽  
Edward Ip ◽  
Yanyan Fu ◽  
Terry Ackerman ◽  
Shyh-Huei Chen ◽  
...  

As a method to derive a “purified” measure along a dimension of interest from response data that are potentially multidimensional in nature, the projective item response theory (PIRT) approach requires first fitting a multidimensional item response theory (MIRT) model to the data before projecting onto a dimension of interest. This study aims to explore how accurate the PIRT results are when the estimated MIRT model is misspecified. Specifically, we focus on using a (potentially misspecified) two-dimensional (2D)-MIRT for projection because of its advantages, including interpretability, identifiability, and computational stability, over higher dimensional models. Two large simulation studies (I and II) were conducted. Both studies examined whether the fitting of a 2D-MIRT is sufficient to recover the PIRT parameters when multiple nuisance dimensions exist in the test items, which were generated, respectively, under compensatory MIRT and bifactor models. Various factors were manipulated, including sample size, test length, latent factor correlation, and number of nuisance dimensions. The results from simulation studies I and II showed that the PIRT was overall robust to a misspecified 2D-MIRT. Smaller third and fourth simulation studies were done to evaluate recovery of the PIRT model parameters when the correctly specified higher dimensional MIRT or bifactor model was fitted with the response data. In addition, a real data set was used to illustrate the robustness of PIRT.
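The compensatory 2D-MIRT model used as the projection basis has a simple response function. A sketch with illustrative parameters (the projection step itself, which integrates out the nuisance dimension, is omitted here):

```python
import numpy as np

def p_mirt2d(theta, a, d):
    # Compensatory 2D-MIRT item response function:
    # P(correct) = logistic(a1*theta1 + a2*theta2 + d)
    z = np.dot(a, theta) + d
    return 1.0 / (1.0 + np.exp(-z))

a = np.array([1.5, 0.4])        # strong primary, weak nuisance loading
theta_hi = np.array([1.0, 0.0]) # high on the dimension of interest
theta_lo = np.array([-1.0, 0.0])
p_hi = p_mirt2d(theta_hi, a, d=0.0)
p_lo = p_mirt2d(theta_lo, a, d=0.0)
```

When nuisance loadings are modest relative to the primary loading, the 2D fit captures most of the response behavior, which is consistent with the robustness the simulations report.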


2019 ◽  
Vol 45 (3) ◽  
pp. 274-296
Author(s):  
Yang Liu ◽  
Xiaojing Wang

Parametric methods, such as autoregressive models or latent growth modeling, are often too inflexible to model the dependence and nonlinear effects in the changes of latent traits when time gaps are irregular and recorded time points vary across individuals. Often in practice, the growth trend of latent traits is subject to certain monotonicity and smoothness conditions. To incorporate such conditions and to alleviate the strong parametric assumptions placed on latent trajectories, a flexible nonparametric prior is introduced to model the dynamic changes of latent traits in item response theory models over the study period. Suitable Bayesian computation schemes are developed for the analysis of longitudinal, dichotomous item responses. Simulation studies and a real data example from educational testing illustrate the proposed methods.
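The monotonicity condition on growth can be illustrated without the article's nonparametric prior. A toy sketch (not the authors' model): build a trajectory from a baseline plus cumulative nonnegative increments, so the latent trait can never decrease between time points:

```python
import numpy as np

def monotone_trajectory(baseline, raw_increments):
    # Softplus maps unconstrained values to strictly positive steps,
    # so the cumulative sum yields a monotonically increasing path.
    steps = np.log1p(np.exp(np.asarray(raw_increments)))
    return baseline + np.concatenate([[0.0], np.cumsum(steps)])

traj = monotone_trajectory(-1.0, [-2.0, 0.5, -1.0])
# np.diff(traj) is strictly positive: the trait never decreases,
# regardless of the sign of the raw increments.
```

A Bayesian treatment like the one described would place a prior over such constrained paths rather than fixing them, but the constraint itself is the same.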


2017 ◽  
Vol 41 (7) ◽  
pp. 512-529 ◽  
Author(s):  
William R. Dardick ◽  
Brandi A. Weiss

This article introduces three new variants of entropy to detect person misfit (Ei, EMi, and EMRi), and provides preliminary evidence that these measures are worthy of further investigation. Previously, entropy has been used as a measure of approximate data–model fit to quantify how well individuals are classified into latent classes, and to quantify the quality of classification and separation between groups in logistic regression models. In the current study, entropy is explored through conceptual examples and a Monte Carlo simulation comparing entropy with established measures of person fit in item response theory (IRT) such as lz, lz*, U, and W. Simulation results indicated that EMi and EMRi successfully detected aberrant response patterns when comparing contaminated and uncontaminated subgroups of persons. In addition, EMi and EMRi performed similarly in showing separation between the contaminated and uncontaminated subgroups. However, EMRi may be advantageous over other measures when subtests include a small number of items. EMi and EMRi are recommended for use as approximate person-fit measures for IRT models. These measures of approximate person fit may be useful in making relative judgments about persons whose response patterns do not fit the theoretical model.
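The abstract does not define Ei, EMi, or EMRi, but the underlying idea of entropy over model-implied response probabilities can be sketched generically (this is an illustrative Shannon entropy under a 2PL, not the article's statistics):

```python
import numpy as np

def p_2pl(theta, a, b):
    # 2PL probability of a correct response
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def response_entropy(theta, a, b):
    # Shannon entropy of the model-implied response probabilities,
    # summed over items: -sum[p*log(p) + (1-p)*log(1-p)]
    p = p_2pl(theta, a, b)
    return -np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))

a = np.ones(5)
b = np.linspace(-2, 2, 5)
# Entropy is largest where responses are maximally uncertain
# (p near .5), i.e., where item difficulty matches ability.
h_matched = response_entropy(0.0, a, b)
h_extreme = response_entropy(3.0, a, b)
```

A person-fit variant would compare an examinee's observed response pattern against such model-implied uncertainty, flagging patterns that are improbably inconsistent with it.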


2014 ◽  
Vol 22 (1) ◽  
pp. 115-129 ◽  
Author(s):  
Anthony J. McGann

This article provides an algorithm to produce a time-series estimate of the political center (or median voter) from aggregate survey data, even when the same questions are not asked in most years. This is compared to the existing Stimson dyad ratios approach, which has been applied to various questions in political science. Unlike the dyad ratios approach, the model developed here is derived from an explicit model of individual behavior: the widely used item response theory model. I compare the results of both techniques using data on public opinion in the United Kingdom from 1947 to 2005 from Bartle, Dellepiane-Avellaneda, and Stimson. Measures of overall model fit are provided, as well as techniques for testing the model's assumptions and the fit of individual items. Full code is provided for estimation with the free software WinBUGS and JAGS.
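The core idea, treating yearly agreement proportions as outcomes of an IRT model with a latent "mood" parameter, can be sketched for a single year (a toy grid-search MLE with hypothetical proportions and item difficulties, not the article's dynamic WinBUGS/JAGS model):

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def estimate_mood(prop_yes, difficulty):
    # Grid-search MLE for one year's latent "mood" theta, given
    # observed proportions agreeing with items of known difficulty.
    # Model: P(agree with item j) = logistic(theta - b_j)
    grid = np.linspace(-3, 3, 601)
    prop_yes = np.asarray(prop_yes)
    ll = [np.sum(prop_yes * np.log(logistic(t - difficulty)) +
                 (1 - prop_yes) * np.log(1 - logistic(t - difficulty)))
          for t in grid]
    return grid[int(np.argmax(ll))]

difficulty = np.array([-1.0, 0.0, 1.0])
theta_hat = estimate_mood([0.88, 0.73, 0.50], difficulty)
```

Because item difficulties are shared across years, years that ask different questions remain on a common scale, which is what allows a continuous time series even with irregular question coverage.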

