Item Parameter Estimates
Recently Published Documents

TOTAL DOCUMENTS: 34 (five years: 8)
H-INDEX: 10 (five years: 0)
Psych ◽  
2021 ◽  
Vol 3 (3) ◽  
pp. 279-307
Author(s):  
Jan Steinfeld ◽  
Alexander Robitzsch

There is some debate in the psychometric literature about item parameter estimation in multistage designs. It is occasionally argued that the conditional maximum likelihood (CML) method is superior to the marginal maximum likelihood (MML) method because no assumptions about the trait distribution have to be made. However, CML estimation in its original formulation leads to biased item parameter estimates. Zwitser and Maris (2015, Psychometrika) proposed a modified conditional maximum likelihood estimation method for multistage designs that provides practically unbiased item parameter estimates. In this article, the differences between estimation approaches for multistage designs were investigated in a simulation study. Four estimation conditions (CML, CML with the respective MST design taken into account, MML assuming a normal trait distribution, and MML with log-linear smoothing) were examined, varying the multistage design, the number of items, the sample size, and the trait distribution. The results showed that when the normal distribution was substantially violated, the CML method seemed preferable to MML estimation employing a misspecified normal trait distribution, especially as the number of items and the sample size increased. However, MML estimation using log-linear smoothing led to results that were very similar to those of the CML method with the MST design taken into account.
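As background for the CML/MML distinction, the following is a brief sketch in standard Rasch notation, added here for orientation rather than quoted from the article. Under the Rasch model the raw score is a sufficient statistic for the trait, so conditioning on it removes the trait from the likelihood:

```latex
% Rasch model for person v and item i
P(X_{vi} = 1 \mid \theta_v, \beta_i)
  = \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)}

% Conditional likelihood given the raw scores r_v: \theta_v cancels,
% so no assumption about the trait distribution is needed
L_{\mathrm{C}}(\boldsymbol{\beta})
  = \prod_{v} \frac{\exp\bigl(-\sum_i x_{vi}\,\beta_i\bigr)}{\gamma_{r_v}(\boldsymbol{\beta})}
```

Here γ_r(β) is the elementary symmetric function of order r of exp(-β_1), ..., exp(-β_I). MML, by contrast, integrates the likelihood over an assumed trait distribution, which is exactly where the normality assumption, or its relaxation via log-linear smoothing, enters.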


2021 ◽  
pp. 001316442110204
Author(s):  
Kang Xue ◽  
Anne Corinne Huggins-Manley ◽  
Walter Leite

In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates for the test items in the VLE. Without formal piloting of the items, one can expect a large amount of nonignorable missing data in VLE log files, which negatively affects the accuracy of IRT item parameter estimation and, in turn, any future ability estimates used in the VLE. In the psychometric literature, methods for handling missing data have been studied mostly under conditions in which neither the data sets nor the proportions of missing data are as large as those arising from VLEs. In this article, we introduce a semisupervised learning method for dealing with the large proportion of missingness in VLE data from which unbiased item parameter estimates must be obtained. First, we explored the factors relating to the missing data. Then we implemented a semisupervised learning method under the two-parameter logistic IRT model to estimate the latent abilities of students. Last, we applied two adjustment methods designed to reduce bias in item parameter estimates. The proposed framework showed its potential for obtaining unbiased item parameter estimates that can then be fixed in the VLE in order to obtain ongoing ability estimates for operational purposes.
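A minimal sketch of the kind of two-parameter logistic (2PL) likelihood computation involved, with missing responses simply masked out; the semisupervised ability estimation and the bias adjustments described in the article are not reproduced here, and all function and variable names are illustrative:

```python
import numpy as np

def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def masked_loglik(theta, a, b, responses):
    """Log-likelihood over persons x items, with NaN marking a missing response.

    Skipping missing entries like this treats them as ignorable, which the
    article argues is not adequate for VLE log data; hence the semisupervised
    estimation of abilities and the subsequent bias adjustments.
    """
    theta = np.asarray(theta, dtype=float)[:, None]   # persons x 1
    a = np.asarray(a, dtype=float)[None, :]           # 1 x items
    b = np.asarray(b, dtype=float)[None, :]
    p = p_2pl(theta, a, b)                            # persons x items
    observed = ~np.isnan(responses)
    x = np.where(observed, responses, 0.0)
    ll = x * np.log(p) + (1.0 - x) * np.log(1.0 - p)
    return float(np.sum(ll[observed]))
```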


2021 ◽  
Author(s):  
Jan Steinfeld ◽  
Alexander Robitzsch

This article describes conditional maximum likelihood-based item parameter estimation in probabilistic multistage designs. In probabilistic multistage designs, routing into a module is not based solely on a raw score j, a cut score c, and a deterministic rule such as j < c or j ≤ c, but on a routing probability p(j) specified for each raw score j. It can be shown that the use of conventional conditional maximum likelihood estimation in multistage designs leads to severely biased item parameter estimates. Zwitser and Maris (2013) showed that with deterministic routing, integrating the design into the item parameter estimation leads to unbiased estimates. This article extends this approach to probabilistic routing, which at the same time constitutes a generalization of the deterministic case. A simulation study shows that the proposed item parameter estimation in probabilistic designs leads to unbiased item parameter estimates.
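A small illustration of the difference between deterministic and probabilistic routing; module names and the form of p(j) are assumptions for the sketch, not taken from the article:

```python
import numpy as np

rng = np.random.default_rng(1)

def route_deterministic(raw_score, cut_score):
    """Classical MST routing: the module depends only on j < c vs. j >= c."""
    return "easy" if raw_score < cut_score else "hard"

def route_probabilistic(raw_score, p_hard):
    """Probabilistic MST routing: each raw score j has a prespecified
    probability p_hard[j] of being routed to the harder module."""
    return "hard" if rng.random() < p_hard[raw_score] else "easy"

# Illustration: 5 routing items, routing probability increasing in the raw score
p_hard = {j: j / 5 for j in range(6)}       # p(0) = 0.0, ..., p(5) = 1.0
print(route_deterministic(3, cut_score=3))  # always 'hard'
print(route_probabilistic(3, p_hard))       # 'hard' with probability 0.6
```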


2020 ◽  
pp. 014662162097768
Author(s):  
Wenchao Ma ◽  
Zhehan Jiang

Despite their increasing popularity, cognitive diagnosis models have been criticized for their limited utility with small samples. In this study, the authors proposed using Bayes modal (BM) estimation and monotonic constraints to stabilize item parameter estimation and facilitate person classification in small samples under the generalized deterministic input noisy “and” gate (G-DINA) model. Both a simulation study and a real data analysis were used to assess the utility of BM estimation and the monotonic constraints. Results showed that in small samples, (a) the G-DINA model with BM estimation is more likely to converge successfully; (b) when prior distributions are specified reasonably and monotonicity is not violated, BM estimation with monotonicity tends to produce more stable item parameter estimates and more accurate person classification; and (c) the G-DINA model using BM estimation with monotonicity is less likely to overfit the data and shows higher predictive power.
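A rough sketch of the two stabilizing ingredients named above, Bayes modal (MAP) estimation and a monotonicity constraint, for a single item whose success probabilities are indexed by latent classes; the prior choice, the class ordering, and all names are illustrative assumptions rather than the article's exact specification:

```python
import numpy as np
from scipy.stats import beta

def penalized_negloglik(p_class, n_correct, n_total, prior_a=2.0, prior_b=2.0):
    """Negative log-posterior kernel for one item's class-specific success
    probabilities.  Beta(prior_a, prior_b) priors shrink extreme estimates,
    which is the stabilizing effect of Bayes modal (MAP) estimation."""
    p_class = np.asarray(p_class, dtype=float)
    ll = np.sum(n_correct * np.log(p_class)
                + (n_total - n_correct) * np.log(1.0 - p_class))
    log_prior = np.sum(beta.logpdf(p_class, prior_a, prior_b))
    return -(ll + log_prior)

def satisfies_monotonicity(p_class_ordered):
    """Monotonicity constraint: the success probability never decreases as
    more of the required attributes are mastered (classes ordered by mastery)."""
    return bool(np.all(np.diff(p_class_ordered) >= 0.0))
```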


2019 ◽  
Vol 44 (4) ◽  
pp. 296-310
Author(s):  
Yong He ◽  
Zhongmin Cui

Item parameter estimates of a common item on a new test form may change abnormally for reasons such as item overexposure or a change of curriculum. A common item whose change does not fit the pattern implied by the normally behaving common items is defined as an outlier. Although it improves equating accuracy, detecting and eliminating outliers may cause a content imbalance among the common items. Robust scale transformation methods have recently been proposed to solve this problem when only one outlier is present in the data, although it is not uncommon to see multiple outliers in practice. In this simulation study, the authors examined the robust scale transformation methods under conditions in which there were multiple outlying common items. Results indicated that the robust scale transformation methods could reduce the influence of multiple outliers on scale transformation and equating. The robust methods performed similarly to a traditional outlier detection and elimination method in terms of reducing the influence of outliers while maintaining adequate content balance.
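As a rough illustration of the issue rather than the authors' specific robust procedure, a mean/sigma transformation of difficulty parameters can be made less sensitive to outlying common items by replacing means and standard deviations with medians and a median-absolute-deviation spread estimate; all names below are illustrative:

```python
import numpy as np

def mean_sigma(b_old, b_new):
    """Classical mean/sigma transformation placing the old-form difficulties
    on the new-form scale: b_new is approximated by A * b_old + B."""
    A = np.std(b_new, ddof=1) / np.std(b_old, ddof=1)
    B = np.mean(b_new) - A * np.mean(b_old)
    return A, B

def robust_mean_sigma(b_old, b_new):
    """Median/MAD analogue: outlying common items barely move the estimates."""
    mad_old = np.median(np.abs(b_old - np.median(b_old)))
    mad_new = np.median(np.abs(b_new - np.median(b_new)))
    A = mad_new / mad_old
    B = np.median(b_new) - A * np.median(b_old)
    return A, B
```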


2019 ◽  
Vol 45 (4) ◽  
pp. 383-402
Author(s):  
Paul A. Jewsbury ◽  
Peter W. van Rijn

In large-scale educational assessment data consistent with a simple-structure multidimensional item response theory (MIRT) model, where every item measures only one latent variable, separate unidimensional item response theory (UIRT) models for each latent variable are often calibrated for practical reasons. While this approach can be valid for data from a linear test, unacceptable item parameter estimates are obtained when data arise from a multistage test (MST). We explore this situation from a missing data perspective and show mathematically that MST data will be problematic for calibrating multiple UIRT models but not MIRT models. This occurs because some of the items used in the routing decision measure a different latent variable and are therefore excluded from the separate UIRT models. Both simulated and real data from the National Assessment of Educational Progress are used to further confirm and explore the unacceptable item parameter estimates. The theoretical and empirical results confirm that only MIRT models are valid for item calibration of multidimensional MST data.
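The missing-data argument can be stated in one line. Let M denote the missingness indicators for the not-administered modules and Y the item responses; the routing-induced missingness is ignorable for a given calibration model only if a missing-at-random condition holds with respect to the responses included in that model (a standard formulation, phrased here by the editor rather than quoted from the article):

```latex
P(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(M \mid Y_{\mathrm{obs}})
```

Under a MIRT calibration the routing items are part of the observed data in the model, so the condition can be satisfied; under a separate UIRT calibration of another latent variable the routing items are excluded, the missingness then depends on data outside the model, and the item parameter estimates become biased.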


2018 ◽  
Vol 43 (7) ◽  
pp. 512-526
Author(s):  
Kyung Yong Kim

When calibrating items using multidimensional item response theory (MIRT) models, item response theory (IRT) calibration programs typically set the probability density of latent variables to a multivariate standard normal distribution to handle three types of indeterminacies: (a) the location of the origin, (b) the unit of measurement along each coordinate axis, and (c) the orientation of the coordinate axes. However, by doing so, item parameter estimates obtained from two independent calibration runs on nonequivalent groups are on two different coordinate systems. To handle this issue and place all the item parameter estimates on a common coordinate system, a process called linking is necessary. Although various linking methods have been introduced and studied for the full MIRT model, little research has been conducted on linking methods for the bifactor model. Thus, the purpose of this study was to provide detailed descriptions of two separate calibration methods and the concurrent calibration method for the bifactor model and to compare the three linking methods through simulation. In general, the concurrent calibration method provided more accurate linking results than the two separate calibration methods, demonstrating better recovery of the item parameters, item characteristic surfaces, and expected score distribution.
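To make the indeterminacy concrete, consider a compensatory MIRT model with linear predictor a'θ + d. Rescaling and shifting the latent coordinates by θ* = Aθ + B, with A diagonal and positive (the rotational indeterminacy would additionally require a full transformation matrix), leaves the model unchanged provided the item parameters are transformed accordingly; this standard identity is stated here for illustration and is not quoted from the article:

```latex
\mathbf{a}^{*} = A^{-1}\mathbf{a},
\qquad
d^{*} = d - \mathbf{a}^{*\prime}\mathbf{B},
\qquad\Rightarrow\qquad
\mathbf{a}^{*\prime}\boldsymbol{\theta}^{*} + d^{*}
  = \mathbf{a}^{\prime}\boldsymbol{\theta} + d
```

Separate calibration linking estimates such transformation constants from the common items, whereas concurrent calibration sidesteps that step by estimating all item parameters in a single run on a common coordinate system.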


2016 ◽  
Vol 77 (3) ◽  
pp. 389-414 ◽  
Author(s):  
Yin Lin ◽  
Anna Brown

A fundamental assumption in computerized adaptive testing is that item parameters are invariant with respect to context—items surrounding the administered item. This assumption, however, may not hold in forced-choice (FC) assessments, where explicit comparisons are made between items included in the same block. We empirically examined the influence of context on item parameters by comparing parameter estimates from two FC instruments. The first instrument was composed of blocks of three items, whereas in the second, the context was manipulated by adding one item to each block, resulting in blocks of four. The item parameter estimates were highly similar. However, a small number of significant deviations were observed, confirming the importance of context when designing adaptive FC assessments. Two patterns of such deviations were identified, and methods to reduce their occurrences in an FC computerized adaptive testing setting were proposed. It was shown that with a small proportion of violations of the parameter invariance assumption, score estimation remained stable.
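The invariance assumption being tested can be stated compactly: for any item i administered in two different blocks (contexts) B and B', its parameters are required to be identical. The notation below is introduced here for illustration and does not reproduce the article's specific forced-choice model:

```latex
\boldsymbol{\xi}_i^{(B)} = \boldsymbol{\xi}_i^{(B')}
\quad \text{for all blocks } B, B' \text{ that contain item } i
```

The deviations reported in the study are violations of this equality for a small subset of items.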

