Item calibration: Recently Published Documents
Total documents: 43 (five years: 13) · H-index: 7 (five years: 1)

2021 · Author(s): Wim J. van der Linden

Abstract: Constrained adaptive testing is reviewed as an instance of discrete maximization, with the shadow-test approach delivering its solution. The approach may look counterintuitive in that its basic operation is the sequential assembly of full test forms, but it always produces real-time solutions that are optimal and satisfy the set of specifications in effect for the test. Equally importantly, it can be used to run testing programs with different degrees of adaptation for the same set of specifications, and/or as a tool to manage programs with simultaneous processes such as adaptive item calibration, time management, and item-security monitoring.
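
To make the sequential-assembly idea concrete, the following is a minimal sketch of a shadow-test loop, written in Python under simplifying assumptions: a toy 2PL item pool, a single items-per-content-area specification, and a greedy assembly routine standing in for the mixed-integer solver an operational shadow-test implementation would use. All names (build_shadow_test, next_item, etc.) are hypothetical and not drawn from the paper.

```python
import numpy as np

# Toy 2PL item pool: discrimination a, difficulty b, and a content-area label.
rng = np.random.default_rng(1)
N_ITEMS = 200
a = rng.uniform(0.8, 2.0, N_ITEMS)
b = rng.normal(0.0, 1.0, N_ITEMS)
area = rng.integers(0, 4, N_ITEMS)     # four content areas
TEST_LEN = 20
PER_AREA = 5                           # specification: five items per area


def info(theta, idx):
    """Fisher information of items idx at ability theta under the 2PL."""
    p = 1.0 / (1.0 + np.exp(-a[idx] * (theta - b[idx])))
    return a[idx] ** 2 * p * (1.0 - p)


def build_shadow_test(theta, administered):
    """Assemble a full form that meets the content specification.

    Items already administered are forced into the form; remaining slots are
    filled greedily with the most informative eligible items.  An operational
    shadow-test implementation would solve this step as a MIP instead.
    """
    form = list(administered)
    counts = np.bincount(area[form], minlength=4) if form else np.zeros(4, int)
    ranked = np.argsort(-info(theta, np.arange(N_ITEMS)))
    for i in ranked:
        if len(form) == TEST_LEN:
            break
        if i not in form and counts[area[i]] < PER_AREA:
            form.append(int(i))
            counts[area[i]] += 1
    return form


def next_item(theta, administered):
    """Administer the most informative free item from the current shadow test."""
    shadow = build_shadow_test(theta, administered)
    free = [i for i in shadow if i not in administered]
    return max(free, key=lambda i: float(info(theta, np.array([i]))[0]))


# One simulated administration; a real CAT would update the ability estimate
# from the responses after each item instead of keeping it fixed.
theta_hat, administered = 0.0, []
for _ in range(TEST_LEN):
    administered.append(next_item(theta_hat, administered))
print("items per content area:", np.bincount(area[administered], minlength=4))
```

Because every shadow test is a complete form satisfying all specifications, the items actually administered (the form completed at the final step) satisfy them as well, which is the property the abstract describes as real-time optimal and feasible solutions.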


2021 · pp. 107699862098390 · Author(s): Seungwon Chung, Li Cai

In the research reported here, we propose a new method for scale alignment and test scoring in the context of supporting students with disabilities. In educational assessment, students from these special populations take modified tests because of a demonstrated disability that requires more assistance than standard testing accommodations provide. Updated federal education legislation and guidance require that these students be assessed and included in state education accountability systems, and that their achievement be reported against the same rigorous content and achievement standards the state has adopted. Routine item calibration and linking methods are not feasible because these special populations tend to be small. We develop a unified cross-classified random effects model that uses item response data from the general population as well as judge-provided data from subject matter experts to obtain revised item parameter estimates for scoring modified tests. We extend the Metropolis–Hastings Robbins–Monro algorithm to estimate the parameters of this model. The proposed method is applied to Braille test forms in a large operational multistate English language proficiency assessment program. Our work not only allows a broader range of modifications than is routinely considered in large-scale educational assessments but also directly incorporates input from the subject matter experts who work directly with the students needing support. Their structured and informed feedback deserves more attention from the psychometric community.
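
The Metropolis–Hastings Robbins–Monro (MH-RM) algorithm mentioned above alternates a stochastic imputation of the latent variables with a stochastic-approximation update of the parameters. The sketch below is offered only as an illustration of that general pattern: it fits a plain 2PL model rather than the authors' cross-classified random effects model, and it replaces the information-matrix scaling of the full MH-RM with a simple two-stage gain schedule. Every name in it is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 2PL data: P(y = 1 | theta) = sigmoid(a * (theta - b)).
N, J = 1000, 10
true_a = rng.uniform(0.8, 1.8, J)
true_b = rng.normal(0.0, 1.0, J)
theta_true = rng.normal(0.0, 1.0, N)
p_true = 1 / (1 + np.exp(-true_a * (theta_true[:, None] - true_b)))
Y = (rng.random((N, J)) < p_true).astype(float)


def mh_sweep(theta, a, b, scale=1.0):
    """One Metropolis-Hastings sweep over the latent traits (N(0, 1) prior)."""
    def logpost(t):
        p = np.clip(1 / (1 + np.exp(-a * (t[:, None] - b))), 1e-9, 1 - 1e-9)
        return (Y * np.log(p) + (1 - Y) * np.log(1 - p)).sum(axis=1) - 0.5 * t ** 2
    prop = theta + rng.normal(0.0, scale, theta.shape)
    accept = np.log(rng.random(theta.shape)) < logpost(prop) - logpost(theta)
    return np.where(accept, prop, theta)


# Robbins-Monro cycles: impute theta, then move (a, b) along the
# complete-data score, with a constant gain first and a decreasing gain later.
a_hat, b_hat, theta = np.ones(J), np.zeros(J), np.zeros(N)
for k in range(1, 601):
    theta = mh_sweep(theta, a_hat, b_hat)
    p = 1 / (1 + np.exp(-a_hat * (theta[:, None] - b_hat)))
    resid = Y - p                                       # N x J residuals
    grad_a = (resid * (theta[:, None] - b_hat)).mean(axis=0)
    grad_b = (-a_hat * resid).mean(axis=0)
    gamma = 0.5 if k <= 100 else 1.0 / (k - 100)        # gain schedule
    a_hat += gamma * grad_a
    b_hat += gamma * grad_b

print(np.round(np.c_[true_a, a_hat, true_b, b_hat], 2))
```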


2020 · pp. 003329412097993 · Author(s): Jacinto Jardim, Anabela Pereira, Paula Vagos, Inês Direito, Sónia Galinha

When attending and participating in Higher Education, students face a multitude of personal, social, and work-related challenges, which may increase the risk of developing psychopathological symptoms. To date, no instrument captures the non-technical skills that may help prepare students to respond to these challenges. This paper presents the development and psychometric properties of the Soft Skills Inventory (SSI). The inventory was developed based on theoretical and empirical findings on the skills associated with academic and professional success, and on students' perceptions. The SSI was tested with 2030 Portuguese students (77.1% female) using a two-stage approach: item calibration and model generation (n = 1033), followed by model validation (n = 997). Item calibration analyses led to retaining 49 items organized into six factors: self-determination, resilience, empathy, assertiveness, social support, and teamwork. This measurement model was further validated and proved to be invariant, and thus a credible tool for comparing male and female students on these skills. All measures attained good internal consistency, with alphas ranging from .76 to .88. Female students scored significantly higher than males on self-determination, empathy, social support, and teamwork, whereas male students scored significantly higher on resilience. No significant differences were found between men and women for assertiveness. Psychometric analysis showed that the SSI is a reliable and valid instrument for evaluating students' intra- and interpersonal skills. The SSI may help identify gaps in soft skills and guide targeted interventions to support a more positive student experience in Higher Education.
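
As a rough illustration of the split-sample workflow and internal-consistency check described above (not the authors' analysis code), the Python sketch below splits a sample into calibration and validation halves and computes Cronbach's alpha for one subscale. The data and column names are placeholders.

```python
import numpy as np
import pandas as pd


def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)


# Placeholder responses: 2030 students, 8 Likert-type items of one subscale.
# (Random data give alpha near zero; real responses are scored the same way.)
rng = np.random.default_rng(42)
data = pd.DataFrame(
    rng.integers(1, 6, size=(2030, 8)),
    columns=[f"teamwork_{i}" for i in range(1, 9)],
)

# Random split into a calibration half (model generation) and a validation half.
calib = data.sample(frac=0.5, random_state=1)
valid = data.drop(calib.index)

print("alpha (calibration):", round(cronbach_alpha(calib), 2))
print("alpha (validation):", round(cronbach_alpha(valid), 2))
```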


2020 · Vol 9 (5) · pp. 233 · Author(s): Olimpia Gómez, Benilde García-Cabrero, Michael L. Hoover, Sandra Castañeda-Figueiras, Yolanda Guevara Benítez

The Inventory of Emotions Experienced by Adolescents when Solving Mathematical Tasks (INETAM, for its acronym in Spanish) measures four emotions that influence math performance: Enthusiasm, Frustration, Enjoyment, and Boredom. Content validity, construct validation, and item calibration analyses were performed to obtain evidence of its validity, using a sample of 448 adolescents enrolled in ninth grade. Factor analysis showed adequate reliability coefficients and fit indices. Calibration analysis showed that the items are highly informative and discriminate between response levels. Regression analysis indicated that emotions are predictors of math achievement. The INETAM is psychometrically sound, suitable for measuring adolescents' academic emotions, and can contribute to improving our understanding of their influence on academic achievement in mathematics.
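
The statement that the items "are highly informative and discriminate between response levels" refers to item-level information under a polytomous IRT model. As a hedged illustration (the abstract does not state which model was used), the sketch below computes item information under Samejima's graded response model for one hypothetical Likert-type item.

```python
import numpy as np


def grm_item_information(theta, a, thresholds):
    """Item information for a graded response model item.

    Boundary curves P*_k(theta) = sigmoid(a * (theta - b_k)); category
    probabilities are differences of adjacent boundaries; information is
    sum_k (dP*_k - dP*_{k+1})^2 / (P*_k - P*_{k+1}), with dP* = a P* (1 - P*).
    """
    b = np.asarray(thresholds, dtype=float)
    p_star = 1 / (1 + np.exp(-a * (theta - b)))          # boundaries k = 1..m-1
    p_star = np.concatenate(([1.0], p_star, [0.0]))      # pad P*_0 = 1, P*_m = 0
    d_star = a * p_star * (1 - p_star)                   # boundary derivatives
    cat_p = p_star[:-1] - p_star[1:]                     # category probabilities
    return float(np.sum((d_star[:-1] - d_star[1:]) ** 2 / cat_p))


# Hypothetical 5-category item (e.g., a frustration item on a 1-5 scale).
a, thresholds = 1.7, [-1.5, -0.5, 0.6, 1.8]
for theta in (-2, -1, 0, 1, 2):
    print(f"theta = {theta:+d}: information = {grm_item_information(theta, a, thresholds):.3f}")
```

Items with higher discrimination and well-spread thresholds yield more information across the trait range, which is what makes an item "highly informative" in this sense.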


Psychometrika · 2020 · Vol 85 (2) · pp. 301-321 · Author(s): Wim J. van der Linden, Bingnan Jiang

2020 · Vol 44 (7-8) · pp. 563-565 · Author(s): Hwanggyu Lim, Craig S. Wells

The R package irtplay provides practical tools for unidimensional item response theory (IRT) models, enabling users to conduct many IRT-related analyses conveniently. For example, irtplay includes functions for calibrating online items, scoring test-takers' proficiencies, evaluating IRT model-data fit, and importing item and/or proficiency parameter estimates from the output of popular IRT software. In addition, the package supports mixed-item formats consisting of dichotomous and polytomous items.
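
Online item calibration, one of the tasks listed above, typically means estimating the parameters of new (field-test) items while treating examinees' proficiency estimates as fixed. The Python sketch below illustrates that idea with a fixed-theta maximum-likelihood fit of a single 2PL item; it is not irtplay code (irtplay is an R package), and all names and values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)

# Proficiency estimates of examinees who saw the new field-test item,
# treated as known (e.g., scored from the operational test).
theta = rng.normal(0.0, 1.0, 800)

# Simulated responses to the new item with true a = 1.3, b = 0.4.
p_true = 1 / (1 + np.exp(-1.3 * (theta - 0.4)))
y = (rng.random(theta.size) < p_true).astype(float)


def neg_loglik(params):
    """Negative 2PL log-likelihood with theta held fixed (online calibration)."""
    a, b = params
    p = 1 / (1 + np.exp(-a * (theta - b)))
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


result = minimize(neg_loglik, x0=[1.0, 0.0], method="L-BFGS-B",
                  bounds=[(0.1, 4.0), (-4.0, 4.0)])
print("estimated a, b:", np.round(result.x, 2))
```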


2019 · Vol 44 (4) · pp. 311-326 · Author(s): Christoph König, Christian Spoden, Andreas Frey

Accurate item calibration in item response theory (IRT) models requires rather large samples. For instance, [Formula: see text] respondents are typically recommended for the two-parameter logistic (2PL) model. Hence, this model is considered a large-scale application, and its use in small-sample contexts is limited. Hierarchical Bayesian approaches are frequently proposed to reduce the sample size requirements of the 2PL. This study compared the small-sample performance of an optimized Bayesian hierarchical 2PL (H2PL) model to its standard inverse Wishart specification, its nonhierarchical counterpart, and both unweighted and weighted least squares estimators (ULSMV and WLSMV) in terms of sampling efficiency and accuracy of estimation of the item parameters and their variance components. To alleviate shortcomings of hierarchical models, the optimized H2PL (a) was reparametrized to simplify the sampling process, (b) used a strategy to separate the item parameter covariances from their variance components, and (c) placed Cauchy and exponential hyperprior distributions on the variance components. Results show that when these elements are combined in the optimized H2PL, accurate item parameter estimates and trait scores are obtained even in samples as small as [Formula: see text]. This indicates that the 2PL can also be applied to the smaller sample sizes encountered in practice. The results of this study are discussed in the context of a recently proposed multiple imputation method to account for item calibration error in trait estimation.
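
For readers unfamiliar with the ingredients named in the abstract, the sketch below shows what a hierarchical Bayesian 2PL with a non-centered (reparametrized) item structure and heavy-tailed hyperpriors on the variance components can look like in PyMC. It is a sketch under assumptions, not the authors' exact model, priors, or covariance-separation strategy; the data are simulated placeholders.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(3)

# Placeholder data: 100 respondents x 20 dichotomous items.
N, J = 100, 20
theta_true = rng.normal(0.0, 1.0, N)
a_true = np.exp(rng.normal(0.2, 0.3, J))
b_true = rng.normal(0.0, 1.0, J)
p = 1 / (1 + np.exp(-a_true * (theta_true[:, None] - b_true)))
Y = (rng.random((N, J)) < p).astype(int)

with pm.Model() as h2pl:
    # Person parameters.
    theta = pm.Normal("theta", mu=0.0, sigma=1.0, shape=N)

    # Hyperpriors on the item-parameter variance components: half-Cauchy for
    # the difficulty scale, exponential for the log-discrimination scale.
    sigma_b = pm.HalfCauchy("sigma_b", beta=1.0)
    sigma_loga = pm.Exponential("sigma_loga", lam=2.0)
    mu_b = pm.Normal("mu_b", mu=0.0, sigma=1.0)
    mu_loga = pm.Normal("mu_loga", mu=0.0, sigma=0.5)

    # Non-centered ("reparametrized") item parameters to ease sampling.
    z_b = pm.Normal("z_b", mu=0.0, sigma=1.0, shape=J)
    z_loga = pm.Normal("z_loga", mu=0.0, sigma=1.0, shape=J)
    b = pm.Deterministic("b", mu_b + sigma_b * z_b)
    a = pm.Deterministic("a", pm.math.exp(mu_loga + sigma_loga * z_loga))

    # 2PL response probabilities and Bernoulli likelihood.
    eta = a[None, :] * (theta[:, None] - b[None, :])
    pm.Bernoulli("Y", p=pm.math.sigmoid(eta), observed=Y)

    idata = pm.sample(1000, tune=1000, target_accept=0.9, random_seed=3)
```

The non-centered z_b/z_loga parameterization and the heavy-tailed hyperpriors are the kinds of choices the abstract refers to as simplifying the sampling process and stabilizing the variance components in small samples.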


2019 · Author(s): Øystein Guttersrud, Christopher Le, Kjell Sverre Pettersen, Sølvi Helseth, Hanne Søberg Finbråten

Abstract

Background: The self-reported European Health Literacy Survey Questionnaire (HLS-EU-Q47) is a widely used measure of population health literacy. Based on confirmatory factor analyses and Rasch modelling, the short form HLS-Q12 was developed to meet the expectations of the unidimensional Rasch measurement model. After its publication, there was a worldwide call to identify HLS-Q12 cutoff scores and to establish clearly delineated standards for the skills assessed. This study therefore aims to identify the HLS-Q12 scores associated with statistically distinct levels of proficiency and to construct a proficiency scale indicating what individuals typically know and can do at increasingly sophisticated levels of health literacy.

Methods: We applied the unidimensional Rasch measurement model for polytomous items to responses from 900 randomly sampled individuals and 388 individuals with type 2 diabetes. Using Rasch-based item calibration, we constructed a proficiency scale by locating the ordered item thresholds along the scale. Applying Wright's method for the maximum number of strata, we determined the cutoff scores for significantly different levels. By referring directly to the item content that people at or above each cutoff score viewed as 'easy', we suggested what these gradually more advanced levels of health literacy might mean in terms of item content.

Results: Analysing the population sample, we identified statistically distinct levels of health literacy at the empirically identified cutoff scores 27, 33 and 39, and we confirmed them by analysing the responses from individuals with diabetes. Using item calibration, the resulting HLS-Q12 proficiency scale expresses typical knowledge and skills at these three statistically distinct levels. The scale's cumulative nature indicates what it may mean qualitatively to move from low to high health literacy.

Conclusions: By identifying levels of health literacy, we may initiate the improvement of current models of health literacy. Determining how to adapt information to patients' health literacy level is a possible clinical outcome. A substantial methodological outcome is the indispensability of Rasch modelling in such measurement. We found that Wright's method identified rating-scale cutoff scores consistently across independent samples. To reveal sources of potential bias, threats to validity, and imprecision of benchmarks, replication of our study in other contexts is required.
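
Wright's method for the maximum number of statistically distinct strata, mentioned in the Methods, is commonly computed from the person separation index G as strata = (4G + 1) / 3, where G is the ratio of the error-corrected spread of the person measures to their root-mean-square measurement error. A minimal Python sketch follows, with made-up person measures and standard errors rather than the authors' data.

```python
import numpy as np


def wright_strata(measures, std_errors):
    """Person separation G and Wright's number of distinct strata, (4G + 1) / 3."""
    measures = np.asarray(measures, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    obs_var = measures.var(ddof=1)                 # observed variance of person measures
    error_var = np.mean(std_errors ** 2)           # mean square measurement error
    true_var = max(obs_var - error_var, 0.0)       # error-corrected ("true") variance
    separation = np.sqrt(true_var / error_var)
    return separation, (4 * separation + 1) / 3


# Hypothetical Rasch person measures (logits) and their standard errors.
rng = np.random.default_rng(11)
measures = rng.normal(0.5, 1.2, 500)
std_errors = rng.uniform(0.35, 0.55, 500)

G, strata = wright_strata(measures, std_errors)
print(f"separation G = {G:.2f}, distinct strata = {strata:.2f}")
```

For example, a separation of G = 2 yields (4 * 2 + 1) / 3 = 3 statistically distinct strata.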

