An Extension of Testlet-Based Equating to the Polytomous Testlet Response Theory Model

2022, Vol 12
Author(s): Feifei Huang, Zhe Li, Ying Liu, Jingan Su, Li Yin, ...

Educational tests are often constructed using testlets because of the flexibility they offer to probe various aspects of cognitive activity and to sample content broadly. However, when tests are built from testlet items, violation of the local item independence assumption is inevitable. In this study, simulations are conducted to evaluate the performance of item response theory models and testlet response theory models for both dichotomous and polytomous items in the context of equating tests composed of testlets. We also examine the impact of the testlet effect, testlet length, and sample size on the estimation of item and person parameters. The results show that testlet response theory models consistently produced more accurate estimates than item response theory models across the studies, which supports the use of testlet response theory models when equating tests composed of testlets. Further, the results indicate that when the sample size is large, item response theory models performed similarly to testlet response theory models across all studies.
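For context, one common formulation of the testlet effect (a hedged sketch of the standard two-parameter testlet model and a partial-credit-style polytomous extension, not necessarily the exact parameterization used in this study) adds a person-by-testlet interaction $\gamma_{i\,d(j)}$ for the testlet $d(j)$ containing item $j$; fixing $\gamma_{i\,d(j)} = 0$ recovers the corresponding item response theory model:

$$
P(y_{ij} = 1 \mid \theta_i) = \frac{\exp\{a_j(\theta_i - b_j - \gamma_{i\,d(j)})\}}{1 + \exp\{a_j(\theta_i - b_j - \gamma_{i\,d(j)})\}}, \qquad \gamma_{i\,d(j)} \sim N(0, \sigma^2_{d(j)}),
$$

$$
P(y_{ij} = k \mid \theta_i) = \frac{\exp\sum_{v=1}^{k} a_j(\theta_i - b_{jv} - \gamma_{i\,d(j)})}{\sum_{c=0}^{m_j}\exp\sum_{v=1}^{c} a_j(\theta_i - b_{jv} - \gamma_{i\,d(j)})}, \qquad k = 0, 1, \dots, m_j,
$$

where the empty sum for $k = 0$ is taken to be zero. The testlet variance $\sigma^2_{d(j)}$ quantifies the local dependence among items sharing a testlet, which is the kind of quantity manipulated as the "testlet effect" condition in simulations of this type.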

2017, Vol 6 (4), pp. 113
Author(s): Esin Yilmaz Kogar, Hülya Kelecioglu

The purpose of this research is first to estimate the item and ability parameters, and the standard errors of those parameters, obtained from unidimensional item response theory (UIRT), bifactor (BIF), and testlet response theory (TRT) models in tests containing testlets, as the number of testlets, the number of independent items, and the sample size change, and then to compare the results. The mathematics test in PISA 2012 was employed as the data collection tool, and 36 items were used to construct six data sets containing different numbers of testlets and independent items. Subsequently, three sample sizes of 250, 500, and 1,000 persons were randomly drawn from these data sets. The findings show that, in general, the lowest mean error values were obtained from UIRT, and that TRT yielded lower mean error estimates than BIF. Under all conditions, models that take local dependence into account provided better model-data fit than UIRT; there was generally no meaningful difference between BIF and TRT, so either model can be used for these data sets. When a meaningful difference between the two models did occur, BIF generally yielded the better result. In addition, for each sample size and data set, the correlations between the item and ability parameter estimates, and between the errors of those parameters, were generally high.
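To make the comparison concrete, the three models can be written side by side in the dichotomous case (a sketch of standard parameterizations, not necessarily the exact specifications estimated in this study). A bifactor model lets each item load on a general ability and on one testlet-specific dimension; TRT constrains those loadings so that, within each testlet, the specific slope is proportional to the general slope; UIRT drops the specific dimension entirely:

$$
\text{UIRT:}\quad \operatorname{logit} P(y_{ij} = 1) = a_j\theta_i + c_j,
$$
$$
\text{BIF:}\quad \operatorname{logit} P(y_{ij} = 1) = a_j^{(g)}\theta_i + a_j^{(s)} u_{i\,d(j)} + c_j,
$$
$$
\text{TRT:}\quad \operatorname{logit} P(y_{ij} = 1) = a_j\bigl(\theta_i - \gamma_{i\,d(j)}\bigr) + c_j,
$$

where $u_{i\,d(j)}$ and $\gamma_{i\,d(j)}$ are person-specific effects for the testlet $d(j)$ containing item $j$. Viewed this way, TRT is a constrained bifactor model, and UIRT is the further constraint that all testlet-specific effects are zero, which is why the models that allow local dependence are expected to fit better whenever testlet effects are present.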


2011, Vol 31 (11-12), pp. 1277-1290
Author(s): Jean-Benoit Hardouin, Sarah Amri, Mohand-Larbi Feddag, Véronique Sébille

2019, Vol 6 (4), pp. 205316801987956
Author(s): Kyle L. Marquardt, Daniel Pemstein, Brigitte Seim, Yi-ting Wang

Experts code latent quantities for many influential political science datasets. Although scholars are aware of the importance of accounting for variation in expert reliability when aggregating such data, they have not systematically explored either the factors affecting expert reliability or the degree to which these factors influence estimates of latent concepts. Here we provide a template for examining potential correlates of expert reliability, using coder-level data for six randomly selected variables from a cross-national panel dataset. We aggregate these data with an ordinal item response theory model that parameterizes expert reliability, and regress the resulting reliability estimates on both expert demographic characteristics and measures of their coding behavior. We find little evidence of a consistent substantial relationship between most expert characteristics and reliability, and these null results extend to potentially problematic sources of bias in estimates, such as gender. The exceptions to these results are intuitive, and provide baseline guidance for expert recruitment and retention in future expert coding projects: attentive and confident experts who have contextual knowledge tend to be more reliable. Taken as a whole, these findings reinforce arguments that item response theory models are a relatively safe method for aggregating expert-coded data.
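One standard way to parameterize expert reliability in an ordinal item response theory model (a hedged sketch of the general approach, not necessarily the authors' exact specification) gives each expert $e$ a discrimination parameter $\beta_e$ and expert-specific thresholds $\tau_{e,k}$, so that the probability of expert $e$ placing case $c$ in ordinal category $k$ depends on the latent concept $z_c$:

$$
P(y_{ce} = k) = \Phi\bigl(\tau_{e,k} - \beta_e z_c\bigr) - \Phi\bigl(\tau_{e,k-1} - \beta_e z_c\bigr), \qquad \tau_{e,0} = -\infty,\ \tau_{e,K} = \infty,
$$

where $\Phi$ is the standard normal distribution function. A larger $\beta_e$ means the expert's ratings track the latent concept more closely, so estimated reliabilities of this form are the kind of quantity that can then be regressed on expert characteristics and coding behavior.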

