About the Equivalence of the Latent D-Scoring Model and the Two-Parameter Logistic Item Response Model

Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1465
Author(s):  
Alexander Robitzsch

This article shows that the recently proposed latent D-scoring model of Dimitrov is statistically equivalent to the two-parameter logistic item response model. An analytical derivation and a numerical illustration are employed for demonstrating this finding. Hence, estimation techniques for the two-parameter logistic model can be used for estimating the latent D-scoring model. In an empirical example using PISA data, differences of country ranks are investigated when using different metrics for the latent trait. In the example, the choice of the latent trait metric matters for the ranking of countries. Finally, it is argued that an item response model with bounded latent trait values like the latent D-scoring model might have advantages for reporting results in terms of interpretation.
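As a hedged illustration of the kind of equivalence described above (not the authors' actual derivation), the sketch below evaluates a standard two-parameter logistic (2PL) item response function and maps the unbounded trait theta onto a bounded (0, 1) scale with a logistic transformation; the specific transformation and parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def irf_2pl(theta, a, b):
    """Two-parameter logistic item response function P(X = 1 | theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def bounded_trait(theta):
    """Map the unbounded 2PL trait onto (0, 1).

    This logistic map is an illustrative assumption, not Dimitrov's
    exact D-scale transformation."""
    return 1.0 / (1.0 + np.exp(-theta))

theta = np.linspace(-3, 3, 7)
print(irf_2pl(theta, a=1.2, b=0.5))   # response probabilities on the theta metric
print(bounded_trait(theta))           # the same persons on a bounded (0, 1) metric
```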


2019 ◽  
Vol 80 (3) ◽  
pp. 604-612
Author(s):  
Tenko Raykov ◽  
George A. Marcoulides

This note raises the caution that a marked pseudo-guessing parameter estimate for an item under a three-parameter item response model could be spurious in a population with substantial unobserved heterogeneity. A numerical example is presented in which, within each of two latent classes, the two-parameter logistic model is used to generate the data on a multi-item measuring instrument, yet the fitted three-parameter logistic model yields a considerable pseudo-guessing parameter estimate for an item. The implications of the reported results for empirical educational research are subsequently discussed.
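A minimal sketch of the two-class data-generating setup the note describes, assuming class proportions and 2PL item parameters chosen purely for illustration (the note's actual values are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

def irf_2pl(theta, a, b):
    # 2PL item response function
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

n_per_class, n_items = 1000, 10
a = rng.uniform(0.8, 2.0, n_items)      # discriminations (illustrative)
b1 = rng.normal(-0.5, 1.0, n_items)     # class-1 difficulties
b2 = b1 + 1.5                           # class 2: uniformly harder items

# Generate responses from a 2PL model within each latent class.
theta = rng.normal(0, 1, 2 * n_per_class)
b_mix = np.vstack([np.tile(b1, (n_per_class, 1)),
                   np.tile(b2, (n_per_class, 1))])
p = irf_2pl(theta[:, None], a, b_mix)
responses = (rng.uniform(size=p.shape) < p).astype(int)

# Fitting a 3PL to `responses` while ignoring class membership can produce
# a sizable pseudo-guessing estimate for some items, per the note's argument.
print(responses.mean(axis=0))  # marginal item proportions in the mixed population
```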


Author(s):  
Martin Kanovský ◽  
Júlia Halamová ◽  
David C. Zuroff ◽  
Nicholas A. Troop ◽  
Paul Gilbert ◽  
...  

Abstract. The aim of this study was to test a multilevel multidimensional finite mixture item response model of the Forms of Self-Criticising/Attacking and Self-Reassuring Scale (FSCRS) in order to cluster respondents and countries across 13 samples (N = 7,714) from 12 countries. The practical goal was to learn how many discrete classes exist at the level of individuals (i.e., how many cut-offs should be used) and at the level of countries (i.e., the magnitude of similarities and dissimilarities among them). We employed the multilevel multidimensional finite mixture approach, which is based on an extended class of multidimensional latent class Item Response Theory (IRT) models. Individuals and countries are partitioned into discrete latent classes with different levels of self-criticism and self-reassurance, while taking into account the multidimensional structure of the construct. This approach was applied to the analysis of the relationships between observed characteristics and the latent trait at different levels (individuals and countries) and across different dimensions, using the three-dimensional measure of the FSCRS. Results showed that respondents’ scores depended on unobserved (latent class) individual and country membership and on the multidimensional structure of the instrument, which justified the use of a multilevel multidimensional finite mixture item response model in the comparative psychological assessment of individuals and countries. Latent class analysis of the FSCRS showed that individual participants and countries could be divided into discrete classes. Together with previous findings that the FSCRS is psychometrically robust, we can recommend the FSCRS for measuring self-criticism.
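A minimal, hedged sketch of the core idea of finite-mixture IRT at the individual level, assuming dichotomous items, two latent classes each represented by a single trait level, and known item parameters purely for illustration (the FSCRS is a Likert-type scale and the paper's model is multilevel and multidimensional, which this sketch does not reproduce):

```python
import numpy as np

def irf_2pl(theta, a, b):
    # 2PL item response function
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def class_posteriors(responses, class_theta, a, b, class_weights):
    """Posterior class membership probabilities for each respondent.

    Each latent class is summarized here by a single trait level (class_theta);
    real finite-mixture IRT models integrate over a within-class trait
    distribution."""
    like = np.ones((responses.shape[0], len(class_theta)))
    for k, theta_k in enumerate(class_theta):
        p = irf_2pl(theta_k, a, b)                               # item probabilities in class k
        like[:, k] = np.prod(np.where(responses == 1, p, 1 - p), axis=1)
    post = like * class_weights
    return post / post.sum(axis=1, keepdims=True)

a = np.array([1.0, 1.3, 0.9, 1.5])
b = np.array([-0.5, 0.0, 0.5, 1.0])
resp = np.array([[1, 1, 1, 0], [0, 0, 1, 0]])
print(class_posteriors(resp, class_theta=[-1.0, 1.0], a=a, b=b,
                       class_weights=np.array([0.5, 0.5])))
```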


2018 ◽  
Vol 43 (1) ◽  
pp. 84-88
Author(s):  
Insu Paek ◽  
Jie Xu ◽  
Zhongtian Lin

When considering the two-parameter or the three-parameter logistic model for item responses from a multiple-choice test, one may want to assess the need for lower asymptote parameters in the item response function and thereby justify the use of the three-parameter item response model. This study reports the degree of sensitivity of the overall model fit statistic M2 in detecting the presence of nonzero lower asymptotes in the item response function under normal and nonnormal ability distribution conditions.
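For context, a short sketch contrasting the 2PL and 3PL item response functions; the nonzero lower asymptote c is what the M2-based test discussed above is being asked to detect (the parameter values are illustrative assumptions, and no M2 computation is shown):

```python
import numpy as np

def irf_3pl(theta, a, b, c=0.0):
    """3PL item response function; c = 0 reduces it to the 2PL."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-4, 4, 9)
p_2pl = irf_3pl(theta, a=1.5, b=0.0, c=0.0)    # no guessing
p_3pl = irf_3pl(theta, a=1.5, b=0.0, c=0.2)    # nonzero lower asymptote
print(np.round(p_2pl, 3))
print(np.round(p_3pl, 3))  # probabilities never drop below c = 0.2
```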


2004 ◽  
Vol 35 (4) ◽  
pp. 475-487 ◽  
Author(s):  
STEVEN H. AGGEN ◽  
MICHAEL C. NEALE ◽  
KENNETH S. KENDLER

Background. Expert committees of clinicians have chosen diagnostic criteria for psychiatric disorders with little guidance from measurement theory or modern psychometric methods. The DSM-III-R criteria for major depression (MD) are examined to determine the degree to which latent trait item response models can extract additional useful information. Method. The dimensionality and measurement properties of the 9 DSM-III-R criteria plus duration are evaluated using dichotomous factor analysis and the Rasch and two-parameter logistic item response models. Quantitative liability scales are compared with a binary DSM-III-R diagnostic algorithm variable to determine the ramifications of using each approach. Results. Factor and item response model results indicated that the 10 MD criteria defined a reasonably coherent unidimensional scale of liability. However, person risk measurement was not optimal: criteria thresholds were unevenly spaced, leaving regions of the scale poorly measured, and criteria varied in how well they discriminated levels of risk. Compared to a binary MD diagnosis, item response model (IRM) liability scales performed far better in (i) elucidating the relationship between MD symptoms and liability, (ii) predicting the personality trait of neuroticism and future depressive episodes, and (iii) more precisely estimating heritability parameters. Conclusions. The criteria for MD largely defined a single dimension of disease liability, although the quality of person risk measurement was less clear. The quantitative item response scales were statistically superior in predicting relevant outcomes and estimating twin model parameters. Item response models that treat symptoms as ordered indicators of risk rather than as counts towards a diagnostic threshold more fully exploit the information available in symptom endorsement data patterns.
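As a hedged illustration of how unevenly spaced criterion thresholds can leave regions of the liability scale poorly measured, the sketch below computes a 2PL test information function from a small set of made-up item parameters (these are not the study's estimates):

```python
import numpy as np

def test_information(theta, a, b):
    """2PL test information function: sum of a_j^2 * P_j * (1 - P_j) over items."""
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))   # shape: (len(theta), n_items)
    return np.sum(a**2 * p * (1.0 - p), axis=1)

# Illustrative parameters: thresholds (difficulties) bunched at the high end
a = np.array([1.2, 1.0, 1.4, 1.1, 1.3])
b = np.array([1.0, 1.2, 1.5, 1.8, 2.0])

theta = np.linspace(-3, 3, 13)
info = test_information(theta, a, b)
print(np.round(info, 2))  # little information at low theta, most near the clustered thresholds
```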


2012 ◽  
Vol 2012 ◽  
pp. 1-14 ◽  
Author(s):  
Yanyan Sheng ◽  
Todd C. Headrick

Current procedures for estimating compensatory multidimensional item response theory (MIRT) models using Markov chain Monte Carlo (MCMC) techniques are inadequate in that they do not directly model the interrelationship between latent traits. This limits the implementation of the model in various applications and further prevents the development of other types of IRT models that offer advantages not realized in existing models. In view of this, an MCMC algorithm is proposed for MIRT models so that the actual latent structure is directly modeled. It is demonstrated that the algorithm performs well in recovering model parameters as well as intertrait correlations and that the MIRT model can be used to explore the relative importance of a latent trait in answering each test item.
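A minimal sketch of a compensatory MIRT response probability in which the latent traits are drawn from a multivariate normal distribution with a nonzero intertrait correlation; the number of dimensions, the item parameters, and the correlation value are illustrative assumptions, and no MCMC estimation is shown:

```python
import numpy as np

rng = np.random.default_rng(7)

def compensatory_mirt_prob(theta, a, d):
    """Compensatory MIRT: P(X = 1 | theta) = logistic(a . theta + d)."""
    return 1.0 / (1.0 + np.exp(-(theta @ a + d)))

# Correlated two-dimensional latent traits (intertrait correlation 0.6, assumed).
cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])
theta = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5)

a = np.array([1.2, 0.4])   # discriminations: trait 1 matters more for this item
d = -0.3                   # item intercept
print(np.round(compensatory_mirt_prob(theta, a, d), 3))
```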


Foundations ◽  
2021 ◽  
Vol 1 (1) ◽  
pp. 116-144
Author(s):  
Alexander Robitzsch

This article investigates the comparison of two groups based on the two-parameter logistic item response model. It is assumed that there is random differential item functioning (DIF) in item difficulties and item discriminations. The group difference is estimated using separate calibration with subsequent linking, as well as concurrent calibration. The following linking methods are compared: mean-mean linking, log-mean-mean linking, invariance alignment, Haberman linking, asymmetric and symmetric Haebara linking, different recalibration linking methods, anchored item parameters, and concurrent calibration. It is analytically shown that log-mean-mean linking and mean-mean linking provide consistent estimates if random DIF effects have zero means. The performance of the linking methods was evaluated through a simulation study. It turned out that (log-)mean-mean and Haberman linking performed best, followed by symmetric Haebara linking and a newly proposed recalibration linking method. Interestingly, linking methods frequently found in applications (i.e., asymmetric Haebara linking, recalibration linking used in a variant in current large-scale assessment studies, anchored item parameters, concurrent calibration) performed worse in the presence of random differential item functioning. In line with the previous literature, differences between linking methods turned out to be negligible in the absence of random differential item functioning. The different linking methods were also applied in an empirical example that performed a linking of PISA 2006 to PISA 2009 for Austrian students. This application showed that estimated trends in the means and standard deviations depended on the chosen linking method and the employed item response model.
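A brief sketch of mean-mean linking under the 2PL, assuming separately calibrated item parameters for two groups; the linear transformation follows the standard mean-mean formulas as commonly presented, and the toy parameter values are assumptions for illustration, so the paper's exact implementation may differ:

```python
import numpy as np

def mean_mean_linking(a_ref, b_ref, a_foc, b_foc):
    """Mean-mean linking of a focal group's 2PL item parameters onto the
    reference group's scale (theta_ref = A * theta_foc + B)."""
    A = np.mean(a_foc) / np.mean(a_ref)
    B = np.mean(b_ref) - A * np.mean(b_foc)
    return A, B

# Toy separate-calibration output (illustrative values; each group scaled to N(0, 1))
a_ref = np.array([1.0, 1.2, 0.9, 1.4])
b_ref = np.array([-0.4, 0.1, 0.6, 1.0])
a_foc = np.array([1.1, 1.2, 0.8, 1.5])
b_foc = np.array([-0.9, -0.3, 0.2, 0.5])

A, B = mean_mean_linking(a_ref, b_ref, a_foc, b_foc)
# With the focal group standardized to mean 0 on its own scale, B estimates the
# focal-minus-reference mean difference on the reference metric.
print(round(A, 3), round(B, 3))
```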


SAGE Open ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 215824402110525
Author(s):  
Chanjin Zheng ◽  
Shaoyang Guo ◽  
Justin L Kern

There is a rekindled interest in the four-parameter logistic item response model (4PLM) after three decades of neglect within the psychometrics community. Recent breakthroughs in item calibration include a Gibbs sampler specially designed for the 4PLM and the Bayes modal estimation (BME) method as implemented in the R package mirt. Unfortunately, MCMC is often time-consuming, while the BME method suffers from instability due to the prior settings. This paper proposes an alternative BME method, the Bayesian Expectation-Maximization-Maximization-Maximization (BE3M) method, which is developed by combining an augmented-variable formulation of the 4PLM with a mixture-model conceptualization of the 3PLM. The simulation shows that BE3M can produce estimates as accurate as those of the Gibbs sampling method and as fast as the EM algorithm. A real data example is also provided.
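For reference, a short sketch of the 4PLM item response function, which adds an upper asymptote d to the 3PL's lower asymptote c (the parameter values are illustrative assumptions; no BE3M estimation is shown):

```python
import numpy as np

def irf_4pl(theta, a, b, c, d):
    """4PL item response function with lower asymptote c and upper asymptote d."""
    return c + (d - c) / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-4, 4, 9)
# c = 0.15 allows guessing; d = 0.95 allows slips even at high ability.
print(np.round(irf_4pl(theta, a=1.3, b=0.2, c=0.15, d=0.95), 3))
```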


2020 ◽  
Vol 42 ◽  
pp. e58
Author(s):  
Ricardo Alberti ◽  
Fernando De Jesus Moreira Junior ◽  
Silvio José Lemos Vasconcellos ◽  
Felipe Coelho Argolo ◽  
Nathalia Ruviaro ◽  
...  

Background: The Emotion Recognition Index (ERI) is a psychometric tool developed to evaluate emotional perception. Its items were selected using Classical Test Theory. Objectives and Methods: To evaluate the performance of ERI items under current psychometric theories, using Item Response Theory (IRT) through a two-parameter logistic model. Results: Twelve items were removed from the Face Module to fulfill the unidimensionality assumption, and 20 items were eliminated from the Voice Module because they presented low discrimination values. Both modules showed information curves peaking at low latent trait values. Discussion: The findings suggest that the ERI is mostly composed of low-difficulty items and is best used to evaluate individuals with low emotional perception. The authors suggest that further studies use IRT to select items for instruments evaluating emotional perception.
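A small sketch of why low-difficulty 2PL items yield information curves that peak at low latent trait values: under the 2PL, an item's information a^2 * P * (1 - P) is maximized at theta = b (the parameters below are illustrative assumptions, not ERI estimates):

```python
import numpy as np

def item_information_2pl(theta, a, b):
    """2PL item information: a^2 * P(theta) * (1 - P(theta)), maximal at theta = b."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

theta = np.linspace(-4, 4, 81)
info = item_information_2pl(theta, a=1.5, b=-1.5)   # an easy (low-difficulty) item
print(round(theta[np.argmax(info)], 2))             # peak lies near b = -1.5
```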

