Clustering students according to their proficiency: a comparison between different approaches based on item response theory models

2021, pp. 43-48
Author(s): Rosa Fabbricatore, Francesco Palumbo

Evaluating learners' competencies is a crucial concern in education, and structured tests, administered at home or in the classroom, are an effective assessment tool. Structured tests consist of sets of items that can refer to several abilities or to more than one topic. Several statistical approaches allow evaluating students by considering the items in a multidimensional way, accounting for their structure. Depending on the final aim of the evaluation, the assessment process either assigns a final grade to each student or clusters students into homogeneous groups according to their level of mastery and ability. The latter is a helpful tool for developing tailored recommendations and remediation for each group. To this aim, latent class models are a natural reference. Within the item response theory (IRT) paradigm, multidimensional latent class IRT models, which relax both the traditional constraints of unidimensionality and the continuous nature of the latent trait, make it possible to detect sub-populations of homogeneous students according to their proficiency level while also accounting for the multidimensional nature of their ability. Moreover, the semi-parametric formulation offers several practical advantages: it avoids normality assumptions that may not hold and reduces the computational burden. This study compares the results of the multidimensional latent class IRT models with those obtained by a two-step procedure, which consists of first fitting a multidimensional IRT model to estimate students' abilities and then applying a clustering algorithm to classify students accordingly. For the latter, both parametric and non-parametric approaches were considered. Data refer to the admission test for the degree course in psychology administered in 2014 at the University of Naples Federico II. The sample comprised N = 944 students, and their ability dimensions were defined according to the domains assessed by the entrance exam, namely Humanities, Reading and Comprehension, Mathematics, Science, and English. In particular, a multidimensional two-parameter logistic IRT model for dichotomously scored items was used to estimate students' abilities.
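As an illustration of the two-step procedure described above, the sketch below fits a between-item multidimensional 2PL with the mirt package and then clusters the estimated abilities with k-means (non-parametric) and a Gaussian mixture via mclust (parametric). The item-to-domain assignment, the number of clusters, and the object name resp are assumptions for illustration, not the authors' actual setup.

```r
# Minimal sketch of the two-step procedure, assuming the mirt and mclust
# packages; 'resp' is an N x 50 matrix of 0/1 responses and the item-to-domain
# mapping below is hypothetical.
library(mirt)
library(mclust)

spec <- mirt.model("
  HUM  = 1-10
  READ = 11-20
  MATH = 21-30
  SCI  = 31-40
  ENG  = 41-50
")

# Step 1: multidimensional two-parameter logistic (2PL) IRT model
fit   <- mirt(resp, spec, itemtype = "2PL")
theta <- fscores(fit)                  # estimated abilities on the five domains

# Step 2a: non-parametric clustering of the ability estimates
km <- kmeans(theta, centers = 4, nstart = 25)

# Step 2b: parametric (model-based) clustering via Gaussian mixtures
mb <- Mclust(theta)

table(km$cluster, mb$classification)   # compare the two partitions
```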

2019, Vol. 45 (3), pp. 339-368
Author(s): Chun Wang, Steven W. Nydick

Recent work on measuring growth with categorical outcome variables has combined the item response theory (IRT) measurement model with the latent growth curve model and extended the assessment of growth to multidimensional IRT models and higher order IRT models. However, there is a lack of synthetic studies that clearly evaluate the strengths and limitations of the different multilevel IRT models for measuring growth. This study introduces the various longitudinal IRT models, including the longitudinal unidimensional IRT model, the longitudinal multidimensional IRT model, and the longitudinal higher order IRT model, which cover a broad range of applications in education and social science. Following a comparison of the parameterizations, identification constraints, strengths, and weaknesses of the different models, a real data example illustrates the application of the different longitudinal IRT models to students' growth trajectories on multiple latent abilities.
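For reference, one common parameterization of the longitudinal unidimensional IRT model pairs a 2PL measurement model at each occasion with a linear latent growth model for ability; the notation below is illustrative and not necessarily the one used in the article.

```latex
% Illustrative longitudinal unidimensional 2PL: person i, item j, occasion t
P(Y_{ijt} = 1 \mid \theta_{it})
  = \frac{\exp\{a_j(\theta_{it} - b_j)\}}{1 + \exp\{a_j(\theta_{it} - b_j)\}},
\qquad
\theta_{it} = \eta_{0i} + \eta_{1i}\, T_t + \varepsilon_{it}
```

Here $\eta_{0i}$ and $\eta_{1i}$ are the person-specific intercept and slope (growth factors), $T_t$ is a time score, and identification is typically achieved by fixing the latent mean and variance at the first occasion and constraining item parameters to be equal across occasions.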


2018, Vol. 29 (1), pp. 35-44
Author(s): Nell Sedransk

This article concerns Federal Motor Carrier Safety Administration (FMCSA) data and its analysis. The article responds to the two-part question: How does an Item Response Theory (IRT) model work differently . . . or better than any other model? The response to the first part is a careful, completely non-technical exposition of the fundamentals of IRT models. It differentiates IRT models from other models by providing the rationale underlying IRT modeling and by using graphs to illustrate two key properties of data items. The response to the second part of the question, about the superiority of an IRT model, is "it depends." For FMCSA data, serious challenges arise from the complexity of the data and from the heterogeneity of the carrier industry. Questions are posed that will need to be addressed to determine the success of the actual model developed and of the scoring system.


2017, Vol. 78 (3), pp. 384-408
Author(s): Yong Luo, Hong Jiao

Stan is a new Bayesian statistical software program that implements the powerful and efficient Hamiltonian Monte Carlo (HMC) algorithm. To date, there is no single source that systematically provides Stan code for the various item response theory (IRT) models. This article provides Stan code for three representative IRT models: the three-parameter logistic (3PL) IRT model, the graded response model, and the nominal response model. We demonstrate how IRT model comparison can be conducted with Stan and how the provided Stan code for simple IRT models can be easily extended to their multidimensional and multilevel cases.
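To give a sense of what such Stan code looks like, the sketch below fits a simple two-parameter logistic model from R via rstan. It is a minimal template rather than the article's code, which covers the 3PL, graded response, and nominal response models; the object resp is assumed to be an N x J matrix of 0/1 responses.

```r
# Minimal 2PL in Stan called from R; a simplified template, not the
# article's code. Requires the rstan package.
library(rstan)

stan_2pl <- "
data {
  int<lower=1> N;                          // persons
  int<lower=1> J;                          // items
  array[N, J] int<lower=0, upper=1> y;     // responses (use int y[N, J]; on Stan < 2.26)
}
parameters {
  vector[N] theta;                         // latent ability
  vector<lower=0>[J] a;                    // discrimination
  vector[J] b;                             // difficulty
}
model {
  theta ~ normal(0, 1);
  a ~ lognormal(0, 1);
  b ~ normal(0, 2);
  for (n in 1:N)
    y[n] ~ bernoulli_logit(a .* (theta[n] - b));
}
"

fit <- stan(model_code = stan_2pl,
            data = list(N = nrow(resp), J = ncol(resp), y = resp),
            chains = 4, iter = 2000)
print(fit, pars = c("a", "b"))
```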


2021, Vol. 117, pp. 106849
Author(s): Danilo Carrozzino, Kaj Sparle Christensen, Giovanni Mansueto, Fiammetta Cosci

2021, Vol. 8 (3), pp. 672-695
Author(s): Thomas DeVaney

This article presents a discussion and illustration of Mokken scale analysis (MSA), a nonparametric form of item response theory (IRT), in relation to common IRT models such as Rasch and Guttman scaling. The procedure can be used for the dichotomous and ordinal polytomous data commonly collected with questionnaires. The assumptions of MSA are discussed, as well as the characteristics that differentiate a Mokken scale from a Guttman scale. MSA is illustrated using the mokken package in RStudio and a data set of over 3,340 responses to a modified version of the Statistical Anxiety Rating Scale. Issues addressed in the illustration include monotonicity, scalability, and invariant ordering. The R script for the illustration is included.
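A minimal sketch of that workflow with the mokken package is given below; sars stands in for the questionnaire responses (items in columns) and the scalability lower bound of 0.3 is the conventional default, both assumptions for illustration rather than the article's exact script.

```r
# Core MSA checks with the mokken package; 'sars' is a respondent-by-item
# matrix of (polytomous) questionnaire scores.
library(mokken)

coefH(sars)                        # scalability coefficients H_ij, H_i, and H
aisp(sars, lowerbound = 0.3)       # automated item selection into Mokken scales
summary(check.monotonicity(sars))  # monotonicity of item (step) response functions
summary(check.iio(sars))           # invariant item ordering
```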


2020, Vol. 44 (7-8), pp. 566-567
Author(s): Shaoyang Guo, Chanjin Zheng, Justin L. Kern

A recently released R package, IRTBEMM, is presented in this article. The package brings together several new estimation algorithms (Bayesian EMM, Bayesian E3M, and their maximum likelihood versions) for item response theory (IRT) models with guessing and slipping parameters (e.g., the 3PL, 4PL, 1PL-G, and 1PL-AG models). IRTBEMM should be of interest to researchers working on IRT estimation and to those applying IRT models with guessing and slipping effects to real datasets.
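A usage sketch is given below; the fitting-function names follow the package description (one function per model family) but should be verified against the package documentation, and resp is an assumed 0/1 response matrix.

```r
# Hedged sketch of fitting guessing/slipping IRT models with IRTBEMM;
# check the package help pages for the exact function and argument names.
library(IRTBEMM)

fit_3pl <- BEMM.3PL(resp)   # 3PL: guessing parameter, Bayesian EMM estimation
fit_4pl <- BEMM.4PL(resp)   # 4PL: guessing and slipping parameters
```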


2017, Vol. 43 (3), pp. 259-285
Author(s): Yang Liu, Ji Seung Yang

The uncertainty arising from item parameter estimation is often non-negligible and must be accounted for when calculating latent variable (LV) scores in item response theory (IRT). This is particularly so when the calibration sample size is limited and/or the calibration IRT model is complex. In the current work, we treat two-stage IRT scoring as a predictive inference problem: the target of prediction is a random variable that follows the true posterior of the LV conditional on the response pattern being scored. Various Bayesian, fiducial, and frequentist prediction intervals of LV scores, all obtainable from a simple yet generic Monte Carlo recipe, are evaluated and contrasted via simulations based on several measures of prediction quality. An empirical data example is also presented to illustrate the use of the candidate methods.
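A generic Monte Carlo recipe of this kind can be sketched as follows: repeatedly draw item parameters from their sampling (or posterior) distribution, draw a latent score from the posterior of theta implied by each draw, and take quantiles of the pooled draws. The sketch below does this for a unidimensional 2PL with a grid approximation to the posterior; the estimates est, their covariance V, and the response pattern y are placeholders, not values from the article.

```r
# Illustrative Monte Carlo prediction interval for a latent score under a 2PL,
# propagating item-parameter uncertainty; all numbers are placeholders.
set.seed(1)
J    <- 5
est  <- c(rep(1, J), seq(-1, 1, length.out = J))   # stacked (a_1..a_J, b_1..b_J)
V    <- diag(0.05, 2 * J)                          # their sampling covariance
y    <- c(1, 0, 1, 1, 0)                           # response pattern to score
grid <- seq(-4, 4, length.out = 201)               # quadrature grid for theta

posterior_weights <- function(pars) {
  a <- pars[1:J]; b <- pars[(J + 1):(2 * J)]
  loglik <- sapply(grid, function(th)
    sum(dbinom(y, 1, plogis(a * (th - b)), log = TRUE)))
  w <- exp(loglik) * dnorm(grid)                   # standard normal prior on theta
  w / sum(w)
}

R <- 500
draws <- replicate(R, {
  pars <- MASS::mvrnorm(1, est, V)                 # draw item parameters
  sample(grid, 1, prob = posterior_weights(pars))  # then draw theta given them
})

quantile(draws, c(0.025, 0.975))                   # 95% prediction interval
```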

