Invariance of Comparisons—Separation of Person and Item Parameters

Author(s):  
David Andrich ◽  
Ida Marais
1999 ◽  
Vol 15 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Lutz F. Hornke

Summary: Item parameters for several hundred items were estimated from empirical data on several thousand subjects. Estimates under the one-parameter (1PL) and two-parameter (2PL) logistic models were evaluated. Model fit showed that only a subset of items complied sufficiently; these items were assembled into well-fitting item banks. In several simulation studies, responses for 5,000 simulees were generated, along with their person parameters, in accordance with a computerized adaptive testing (CAT) procedure. A reliability of .80, corresponding to a standard error of measurement of .44, was used as the stopping rule to end CAT testing. How often each item was used across all simulees was also recorded. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For all 1PL-fitting item banks, most simulees needed more than 20 but fewer than 30 items to reach the preset level of measurement error. However, testing based on item banks fitting the 2PL showed that, on average, only 10 items sufficed to end testing at the same measurement-error level. Both results clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations in everyday use will show whether these trends hold up in practice; if so, CAT becomes feasible and reasonable with some 150 well-calibrated 2PL items.
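The stopping rule above translates directly into an adaptive loop: administer the most informative remaining item, rescore, and stop once the error of measurement falls to the target (a reliability of .80 corresponds to an SEM of about .44 on a unit-variance trait scale, since SEM ≈ √(1 − reliability)). The sketch below is a minimal illustration of such a loop under the 2PL, not Hornke's implementation; the simulated item bank, maximum-information item selection, and EAP scoring are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2PL item bank: discriminations a, difficulties b.
a = rng.uniform(0.8, 2.0, size=150)
b = rng.normal(0.0, 1.0, size=150)

def p_correct(theta, a_j, b_j):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a_j * (theta - b_j)))

def eap(responses, items, grid=np.linspace(-4, 4, 81)):
    """EAP estimate of theta and its posterior SD under a N(0,1) prior."""
    post = np.exp(-0.5 * grid**2)
    for x, j in zip(responses, items):
        p = p_correct(grid, a[j], b[j])
        post = post * (p if x else 1.0 - p)
    post = post / post.sum()
    mean = float((grid * post).sum())
    sd = float(np.sqrt(((grid - mean) ** 2 * post).sum()))
    return mean, sd

def run_cat(true_theta, sem_target=0.44, max_items=50):
    """Administer items until the posterior SD falls below the SEM target."""
    used, responses = [], []
    theta_hat, sem = 0.0, float("inf")
    while sem > sem_target and len(used) < max_items:
        p_all = p_correct(theta_hat, a, b)
        info = a**2 * p_all * (1.0 - p_all)  # 2PL item information
        info[used] = -np.inf                 # exclude already-used items
        j = int(np.argmax(info))
        used.append(j)
        responses.append(rng.random() < p_correct(true_theta, a[j], b[j]))
        theta_hat, sem = eap(responses, used)
    return theta_hat, len(used)

theta_hat, n_items = run_cat(true_theta=0.5)
```

Because a highly discriminating 2PL bank yields more information per item, fewer items are needed to reach the same SEM target, which is consistent with the 10-versus-20-to-30-item contrast reported above.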


2019 ◽  
Vol 35 (1) ◽  
pp. 98-108 ◽  
Author(s):  
Michael J. Burtscher ◽  
Jeannette Oostlander

Abstract. Team cognition plays an important role in predicting team processes and outcomes. Thus far, research has focused on structured cognition while paying little attention to perceptual cognition. The lack of research on perceptual team cognition can be attributed to the absence of an appropriate measure. To address this gap, we introduce the construct of perceived mutual understanding (PMU) as a type of perceptual team cognition and describe the development of a corresponding measure, the PMU-scale. Based on three samples from different team settings (NTotal = 566), our findings show that the scale has good psychometric properties at both the individual and the team level. Item parameters were improved in a multistage process. Exploratory as well as confirmatory factor analyses indicate that PMU is a one-dimensional construct. The scale demonstrates sufficient internal reliability. Correlational analyses provide initial evidence of construct validity. Finally, common indicators of inter-rater reliability and inter-rater agreement suggest that treating PMU as a team-level construct is justified. The PMU-scale is a convenient and versatile measure that should foster empirical research on perceptual team cognition and thereby contribute to the advancement of team cognition research in general.
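The abstract does not name the aggregation indices used; two conventional choices for justifying a team-level construct are the within-group agreement index rwg (computed against a uniform null distribution) and ICC(1) from a one-way ANOVA. The sketch below illustrates these generic indices rather than the authors' analysis; the balanced-teams assumption and the variable names are mine.

```python
import numpy as np

def rwg(ratings, n_options):
    """Within-group agreement for one item: 1 - observed/expected variance,
    with the expected variance taken from a uniform null distribution."""
    s2 = np.var(ratings, ddof=1)
    sigma2_e = (n_options**2 - 1) / 12.0  # uniform null for n_options categories
    return 1.0 - s2 / sigma2_e

def icc1(groups):
    """ICC(1) from a one-way ANOVA; `groups` is a list of 1-D arrays,
    one array of individual scores per team (balanced design assumed)."""
    k = len(groups[0])                                    # members per team
    msb = k * np.var([g.mean() for g in groups], ddof=1)  # between-team MS
    msw = np.mean([np.var(g, ddof=1) for g in groups])    # within-team MS
    return (msb - msw) / (msb + (k - 1) * msw)
```

High rwg values within teams and a non-trivial ICC(1) are the usual grounds for the kind of aggregation the authors report.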


Methodology ◽  
2006 ◽  
Vol 2 (4) ◽  
pp. 142-148 ◽  
Author(s):  
Pere J. Ferrando

In the IRT person-fluctuation model, the individual trait levels fluctuate within a single test administration whereas the items have fixed locations. This article studies the relations between the person and item parameters of this model and two central properties of item and test scores: temporal stability and external validity. For temporal stability, formulas are derived for predicting and interpreting item response changes in a test-retest situation on the basis of the individual fluctuations. As for validity, formulas are derived for obtaining disattenuated estimates and for predicting changes in validity in groups with different levels of fluctuation. These latter formulas are related to previous research in the person-fit domain. The results obtained and the relations discussed are illustrated with an empirical example.
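The derived formulas are not reproduced in the abstract; for orientation, the classical disattenuation formula that such model-based derivations generalize corrects an observed validity coefficient for unreliability in both measures:

```latex
r_{T_x T_y} = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}
```

where r_xy is the observed test–criterion correlation and r_xx, r_yy are the reliabilities. In the person-fluctuation model, the corresponding quantities are instead derived from the model's person and item parameters, including the individual fluctuation levels, as the abstract describes.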


Author(s):  
Brian Wesolowski

This chapter presents an introductory overview of concepts that underscore the general framework of item response theory. "Item response theory" is a broad umbrella term for a family of mathematical measurement models that treat observed test scores as a function of latent, unobservable constructs. Most musical constructs cannot be measured directly and are therefore unobservable; they can only be inferred from secondary, observable behaviors. Item response theory defines latent constructs by modeling observable behaviors as probabilistic response distributions, expressed as a logistic function of person and item parameters. This chapter describes philosophical, theoretical, and applied perspectives on item response theory in the context of measuring musical behaviors.
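One standard member of that family is the two-parameter logistic (2PL) model, which expresses the probability that person i responds correctly to (or endorses) item j as

```latex
P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}}
```

where θ_i is the person's latent trait level, b_j the item's location (difficulty), and a_j its discrimination; the Rasch (1PL) model is the special case in which all a_j are equal.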


2021 ◽  
pp. 014662162110131
Author(s):  
Leah Feuerstahler ◽  
Mark Wilson

In between-item multidimensional item response models, it is often desirable to compare individual latent trait estimates across dimensions. These comparisons are only justified if the model dimensions are scaled relative to each other. Traditionally, this scaling is done using approaches such as standardization—fixing the latent mean and standard deviation to 0 and 1 for all dimensions. However, approaches such as standardization do not guarantee that Rasch model properties hold across dimensions. Specifically, for between-item multidimensional Rasch family models, the unique ordering of items holds within dimensions, but not across dimensions. Previously, Feuerstahler and Wilson described the concept of scale alignment, which aims to enforce the unique ordering of items across dimensions by linearly transforming item parameters within dimensions. In this article, we extend the concept of scale alignment to the between-item multidimensional partial credit model and to models fit using incomplete data. We illustrate this method in the context of the Kindergarten Individual Development Survey (KIDS), a multidimensional survey of kindergarten readiness used in the state of Illinois. We also present simulation results that demonstrate the effectiveness of scale alignment in the context of polytomous item response models and missing data.
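The core operation, a dimension-specific linear transformation of item parameters, can be sketched generically. The snippet below only shows the transformation step; the slope and intercept for each dimension, and the criterion by which the article chooses them so that item orderings agree across dimensions, are placeholders, not the authors' algorithm.

```python
import numpy as np

def align_dimension(deltas, slope, intercept):
    """Linearly transform the item (or threshold) locations of one dimension.
    `slope` and `intercept` are the dimension-specific alignment constants
    (hypothetical values below; the article's criterion chooses them)."""
    return slope * np.asarray(deltas) + intercept

# Hypothetical two-dimension example: after alignment, locations from both
# dimensions are intended to lie on a common scale with a consistent ordering.
dim1 = align_dimension([-1.2, 0.1, 0.9], slope=1.00, intercept=0.00)   # reference
dim2 = align_dimension([-0.8, 0.4, 1.5], slope=0.85, intercept=-0.10)  # aligned
```

Choosing these constants is the substantive part of scale alignment; this sketch deliberately leaves that criterion abstract.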


2015 ◽  
Vol 7 (1) ◽  
pp. 59 ◽  
Author(s):  
Solange Muglia Wechsler ◽  
Tatiana De Cassia Nakano ◽  
Simone Ferreira da Silva Domingues ◽  
Helena Rinaldi Rosa ◽  
Roselaine Berenice Ferreira da Silva ◽  
...  

This study aimed to determine whether performance on tests of crystallized intelligence is affected by gender and to ascertain whether differential item parameters could account for gender disparities. The sample comprised 1,191 individuals (55% women) between 16 and 77 years of age (M = 22, SD = 9.5). The participants were primarily college students (58.3%) living in four Brazilian states. Four verbal tests measuring crystallized intelligence (vocabulary, synonyms, antonyms, and verbal analogies) were constructed and administered in a group setting. An analysis of variance revealed no significant difference in overall performance between men and women. However, a differential item functioning (DIF) analysis indicated significant differences on 8.7% of the items, pointing to gender bias at the item level. Because bias can limit women's access to social opportunities, the results underscore the importance of reducing item bias in cognitive measures to ensure the accuracy of test results.
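The abstract does not state which DIF procedure was used; a common choice for dichotomous items is the logistic regression approach of Swaminathan and Rogers, which tests whether group membership predicts item responses beyond a matching total score. The sketch below illustrates that generic approach with a likelihood-ratio test for uniform DIF, not the authors' analysis.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

def lr_dif_test(item, total, group):
    """Uniform-DIF likelihood-ratio test for one dichotomous item.
    item: 0/1 responses; total: matching score; group: 0/1 (e.g., gender)."""
    base = sm.Logit(item, sm.add_constant(np.column_stack([total]))).fit(disp=0)
    full = sm.Logit(item, sm.add_constant(np.column_stack([total, group]))).fit(disp=0)
    lr = 2 * (full.llf - base.llf)   # likelihood-ratio statistic
    return lr, chi2.sf(lr, df=1)     # p-value on 1 degree of freedom
```

A significant group term beyond the matching score indicates uniform DIF; adding a total × group interaction to the full model would additionally test for non-uniform DIF.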

