True Score Equating
Recently Published Documents

TOTAL DOCUMENTS: 27 (FIVE YEARS: 7)
H-INDEX: 7 (FIVE YEARS: 1)

2021 · Vol 12
Author(s): Patrícia Silva Lúcio, Fausto Coutinho Lourenço, Hugo Cogo-Moreira, Deborah Bandalos, Carolina Alves Ferreira de Carvalho, et al.

Equating is used to directly compare alternate forms of tests. We describe the equating of two alternate forms of a reading comprehension test for Brazilian children (2nd to 5th grade), Form A (n = 427) and Form B (n = 321). We employed a nonequivalent random groups design with internal anchor items. Local independence was assessed via Pearson bivariate correlations of the standardized residuals. First, from 176 items, we selected 42 for each form (33 unique and 9 in common) using the two-parameter logistic (2PL) model, a unidimensional item response theory (IRT) model. Using the equateIRT package for R, the anchor items were used to link the two forms. Linking coefficients were estimated under two methods (Haebara and Stocking–Lord), and scores were then equated by two methods: observed score equating (OSE) and true score equating (TSE). We provide age-specific reference intervals for the sample. The final version was informative across a wide range of theta abilities. We conclude that the forms can be used interchangeably.
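
The linking-and-equating chain the abstract describes can be illustrated compactly. Below is a minimal Python sketch of Stocking–Lord linking followed by true score equating under the 2PL model; the handful of anchor-item parameters is invented for illustration (the study itself used the equateIRT package in R, and full forms would have far more items).

```python
import numpy as np
from scipy.optimize import minimize, brentq

def p2pl(theta, a, b):
    """2PL response probabilities on a grid of theta values."""
    return 1.0 / (1.0 + np.exp(-a * (np.asarray(theta)[:, None] - b)))

# Hypothetical anchor-item parameters, each on its own form's scale.
a_A = np.array([1.2, 0.8, 1.5]); b_A = np.array([-0.5, 0.3, 1.0])
a_B = np.array([1.1, 0.9, 1.4]); b_B = np.array([-0.2, 0.6, 1.3])

grid = np.linspace(-4, 4, 41)                 # quadrature grid
w = np.exp(-0.5 * grid**2); w /= w.sum()      # N(0, 1) weights

def sl_loss(coef):
    """Stocking-Lord criterion: squared gap between anchor TCCs after
    rescaling Form B parameters onto the Form A scale."""
    A, B = coef
    tcc_A = p2pl(grid, a_A, b_A).sum(axis=1)
    tcc_B = p2pl(grid, a_B / A, A * b_B + B).sum(axis=1)
    return np.sum(w * (tcc_A - tcc_B) ** 2)

A, B = minimize(sl_loss, x0=[1.0, 0.0]).x     # linking coefficients

def tcc(theta, a, b):
    return float(p2pl([theta], a, b).sum())

def true_score_equate(tau_B):
    """TSE: invert Form B's TCC at tau_B, transform theta, read off Form A."""
    theta_B = brentq(lambda t: tcc(t, a_B, b_B) - tau_B, -8.0, 8.0)
    return tcc(A * theta_B + B, a_A, b_A)

print(f"A = {A:.3f}, B = {B:.3f}")
print(f"Form B true score 1.5 -> Form A true score {true_score_equate(1.5):.2f}")
```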


2021 · pp. 014662162110131
Author(s): Zhonghua Zhang

In this study, the delta method was applied to estimate the standard errors of true score equating when using the characteristic curve methods with the generalized partial credit model under the common-item nonequivalent groups equating design. Simulation studies were further conducted to compare the performance of the delta method with that of the bootstrap method and the multiple imputation method. The results indicated that the standard errors produced by the delta method were very close to the empirical criterion standard errors, as well as to those yielded by the bootstrap and multiple imputation methods, under all manipulated conditions.
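
The delta method itself is generic: for any smooth function f of the estimated item parameters, the standard error is approximately sqrt(gᵀ Σ g), with g the gradient of f and Σ the parameter covariance matrix. A minimal Python sketch with a numerical gradient follows; the one-item 2PL function and covariance matrix are invented stand-ins, not the paper's GPCM equating function.

```python
import numpy as np

def delta_method_se(f, params, cov, eps=1e-5):
    """Standard error of f(params) via the delta method, using central
    differences for the gradient g and SE = sqrt(g' cov g)."""
    g = np.zeros_like(params, dtype=float)
    for i in range(len(params)):
        step = np.zeros_like(g); step[i] = eps
        g[i] = (f(params + step) - f(params - step)) / (2 * eps)
    return float(np.sqrt(g @ cov @ g))

# Toy f: a single 2PL item's true score at theta = 0, parameters (a, b).
f = lambda p: 1.0 / (1.0 + np.exp(-p[0] * (0.0 - p[1])))
params = np.array([1.2, 0.4])                      # estimates
cov = np.array([[0.010, 0.002], [0.002, 0.008]])   # their covariance
print(f"delta-method SE: {delta_method_se(f, params, cov):.4f}")
```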


2019 · Vol 44 (4) · pp. 296-310
Author(s): Yong He, Zhongmin Cui

Item parameter estimates of a common item on a new test form may change abnormally for reasons such as item overexposure or curriculum change. A common item whose change does not fit the pattern implied by the normally behaved common items is defined as an outlier. Although detecting and eliminating outliers improves equating accuracy, it may cause a content imbalance among the common items. Robust scale transformation methods have recently been proposed to solve this problem when only one outlier is present in the data, although multiple outliers are not uncommon in practice. In this simulation study, the authors examined the robust scale transformation methods under conditions with multiple outlying common items. Results indicated that the robust scale transformation methods could reduce the influence of multiple outliers on scale transformation and equating. The robust methods performed similarly to a traditional outlier detection and elimination method in reducing the influence of outliers while maintaining adequate content balance.
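
As a generic illustration of the idea (not necessarily the authors' estimator), a scale transformation becomes outlier-resistant when its moments are replaced by robust counterparts. The Python sketch below is a median/MAD variant of the mean-sigma method, with invented common-item difficulties in which one item drifts.

```python
import numpy as np

def robust_mean_sigma(b_new, b_old):
    """Slope A and intercept B placing new-form difficulties on the
    old-form scale, with medians/MADs bounding each item's influence."""
    mad = lambda x: np.median(np.abs(x - np.median(x)))
    A = mad(b_old) / mad(b_new)
    B = np.median(b_old) - A * np.median(b_new)
    return A, B

# Common-item difficulties; the last item drifts badly on the new form.
b_old = np.array([-1.0, -0.3, 0.2, 0.8, 1.1])
b_new = np.array([-0.9, -0.2, 0.3, 0.9, 2.6])
A, B = robust_mean_sigma(b_new, b_old)
print(f"A = {A:.3f}, B = {B:.3f}")   # barely affected by the outlier
```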


2019 · Vol 44 (3) · pp. 215-218
Author(s): Kyung Yong Kim, Uk Hyun Cho

Item response theory (IRT) true-score equating for the bifactor model is often conducted by first numerically integrating the specific factors out of the item response function and then applying the unidimensional IRT true-score equating method to the marginalized bifactor model. An alternative way to obtain the marginalized bifactor model, however, is to project the nuisance dimensions of the bifactor model onto the dominant dimension. Projection, which can be viewed as an approximation to numerical integration, has the advantage of providing item parameters for the marginalized bifactor model; projection can therefore be used with existing equating software packages that require item parameters. In this paper, IRT true-score equating results obtained with projection are compared to those obtained with numerical integration. Simulation results show that the two procedures provide very similar equating results.
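
The numerical-integration route is easy to sketch: integrate the specific factor out of a bifactor item response function with Gauss–Hermite quadrature. The Python sketch below uses an invented bifactor 2PL item; the final comment notes the closed-form marginal slope that motivates the projection approach in a normal-ogive metric.

```python
import numpy as np

# Probabilists' Gauss-Hermite nodes/weights, normalized to N(0, 1).
nodes, weights = np.polynomial.hermite_e.hermegauss(21)
weights = weights / weights.sum()

def marginal_irf(theta_g, a_gen, a_spec, d):
    """P(correct | general factor), specific factor integrated out."""
    z = a_gen * theta_g + a_spec * nodes + d    # logit at each node
    return float(np.sum(weights / (1.0 + np.exp(-z))))

print(marginal_irf(theta_g=0.5, a_gen=1.4, a_spec=0.8, d=-0.2))

# Projection instead yields closed-form marginal parameters; e.g., in a
# normal-ogive metric the marginal slope is a_gen / sqrt(1 + a_spec**2).
```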


2019 · Vol 80 (1) · pp. 91-125
Author(s): Stella Y. Kim, Won-Chan Lee, Michael J. Kolen

A theoretical and conceptual framework for true-score equating using a simple-structure multidimensional item response theory (SS-MIRT) model is developed. A true-score equating method, referred to as the SS-MIRT true-score equating (SMT) procedure, is also developed. SS-MIRT has several advantages over more complex multidimensional IRT models, including improved estimation efficiency and straightforward interpretability. The performance of the SMT procedure was examined and evaluated in four studies using different data types. In these studies, results from the SMT procedure were compared with results from four other equating methods to assess the relative benefits of SMT. In general, SMT produced more accurate equating results than traditional unidimensional IRT (UIRT) equating when the data were multidimensional. The more accurate performance of SMT over UIRT true-score equating was observed consistently across the studies, which supports the benefits of a multidimensional approach when equating multidimensional data. SMT also performed similarly to an SS-MIRT observed-score method across all studies.
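
Simple structure is what keeps SS-MIRT tractable: each item loads on exactly one factor, so the test characteristic surface is just a sum of per-dimension unidimensional TCCs. A small Python sketch with invented parameters:

```python
import numpy as np

a = np.array([1.0, 1.3, 0.9, 1.1])    # discriminations
b = np.array([-0.4, 0.2, 0.5, 1.0])   # difficulties
dim = np.array([0, 0, 1, 1])          # each item's single dimension

def ss_mirt_true_score(theta):
    """Expected raw score at ability vector theta (one theta per factor);
    each item's 2PL ICC depends only on its own dimension."""
    th = np.asarray(theta)[dim]        # pick each item's dimension
    return float(np.sum(1.0 / (1.0 + np.exp(-a * (th - b)))))

print(ss_mirt_true_score([0.3, -0.1]))
```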


2016 · Vol 76 (6) · pp. 954-975
Author(s): Dimiter M. Dimitrov

This article describes an approach to test scoring, referred to as delta scoring (D-scoring), for tests with dichotomously scored items. D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the examinee's response vector weighted by the expected difficulties (not "easiness") of the test items. The expected difficulty of each item is obtained as an analytic function of its IRT parameters. Because they are based on expected item difficulties, D-scores are independent of the sample of test-takers. It is shown that the D-scale performs considerably better than the IRT logit scale on criteria of scale intervalness. To equate D-scales, it is sufficient to rescale the item parameters, thus avoiding the tedious and error-prone procedure of mapping test characteristic curves under IRT true score equating, which is often used in large-scale testing practice. The proposed D-scaling has proved promising in its current piloting with large-scale assessments, and the hope is that it can efficiently complement IRT procedures in large-scale testing in education and psychology.
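
A rough Python sketch of the scoring rule as the abstract describes it (the weighting and normalization here are illustrative guesses, not Dimitrov's exact formulas): each item's expected difficulty is one minus its 2PL ICC averaged over an N(0, 1) ability distribution, and the D-score is the difficulty-weighted sum of correct responses, rescaled to [0, 1].

```python
import numpy as np

nodes, weights = np.polynomial.hermite_e.hermegauss(31)   # N(0,1) quadrature
weights = weights / weights.sum()

def expected_difficulty(a, b):
    """delta_j = 1 - E[P_j(theta)] for a 2PL item, theta ~ N(0, 1)."""
    p = 1.0 / (1.0 + np.exp(-a * (nodes - b)))
    return 1.0 - float(np.sum(weights * p))

a = np.array([1.2, 0.8, 1.5, 1.0]); b = np.array([-1.0, 0.0, 0.5, 1.5])
delta = np.array([expected_difficulty(ai, bi) for ai, bi in zip(a, b)])

u = np.array([1, 1, 0, 1])                 # one examinee's responses
d_score = np.dot(delta, u) / delta.sum()   # sample-free: expected deltas only
print(f"D-score: {d_score:.3f}")
```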


2015 · Vol 1 (1) · pp. 100
Author(s): Rahmawati Rahmawati, Djemari Mardapi

This study aimed to: (1) revise the criterion used in the Robust Z method for detecting item parameter drift (IPD), (2) identify the strengths and weaknesses of the modified Robust Z method, and (3) investigate the effect of IPD on examinees' classification consistency using empirical data. The study used two types of data. The simulated data were responses of 20,000 students to 40 dichotomous items, generated by manipulating six variables: (1) ability distribution, (2) differences in ability between groups, (3) type of drift, (4) magnitude of drift, (5) anchor test length, and (6) number of drifting items. The empirical data were the responses of 4,187,444 students who took one of 41 test forms of the 2011 UN SD/MI (the Indonesian national elementary school examination) in Indonesian language, mathematics, and science. The modified Robust Z method was used to detect IPD, and the IRT true score equating method was used to analyze classification consistency. The results show that: (1) a criterion of a 0.5-point raw-score difference between test characteristic curves (TCCs) leads to 100% consistency in passing classification; (2) the modified Robust Z method accurately detects b- and ab-drift when the anchor test is at least 25% of the test length; and (3) IPD in the empirical data affected the passing status of more than 2,000 students.
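
For reference, the unmodified Robust Z statistic commonly used for IPD screening standardizes each anchor item's between-administration b-parameter difference with the median and 0.74 × IQR (a robust sigma estimate). A minimal Python sketch with invented differences and the conventional 1.645 cutoff (the study's point is precisely to revise such criteria):

```python
import numpy as np

def robust_z(d):
    """Robust Z for between-administration parameter differences d."""
    iqr = np.subtract(*np.percentile(d, [75, 25]))
    return (d - np.median(d)) / (0.74 * iqr)

d = np.array([0.05, -0.08, 0.02, 0.10, -0.04, 0.95])  # last item drifts
flagged = np.where(np.abs(robust_z(d)) > 1.645)[0]
print(f"flagged items: {flagged}")
```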

