Applied Psychological Measurement
Latest Publications

Total documents: 2034 (five years: 87)
H-index: 87 (five years: 2)

Published by SAGE Publications
ISSN: 0146-6216

2021, Vol. 46(1), pp. 53-67
Author(s): James Soland, Megan Kuhfeld

Researchers in the social sciences often obtain ratings of a construct of interest from multiple raters. While using multiple raters helps avoid the subjectivity of any single person's responses, rater disagreement can be a problem. A variety of models exist to address rater disagreement in both structural equation modeling and item response theory frameworks. Bauer et al. (2013) developed the "trifactor model" to give applied researchers a straightforward way of estimating scores that are purged of rater-idiosyncratic variance. Although the model is intended to be usable and interpretable, little is known about the circumstances under which it performs well and those under which it does not. We conduct simulation studies to examine the performance of the trifactor model under a range of sample sizes and model specifications and then compare model fit, bias, and convergence rates.
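
The abstract describes a trifactor decomposition of multi-rater data into a common factor, rater perspective factors, and item-specific factors. Below is a minimal, hedged sketch of how such a structure might be specified in lavaan; the three raters (parent, teacher, self), the variable names (y1_p, ..., y3_s), and the simulated data are all illustrative assumptions, not the authors' design.

```r
library(lavaan)

# Hypothetical data: 500 targets, each rated on three items (y1-y3) by
# parent (_p), teacher (_t), and self (_s); purely simulated for illustration.
set.seed(4)
n <- 500
common <- rnorm(n)
rater  <- replicate(3, rnorm(n))   # parent, teacher, self perspectives
item   <- replicate(3, rnorm(n))   # item-specific components
dat <- data.frame(matrix(NA, n, 9))
names(dat) <- c("y1_p","y2_p","y3_p","y1_t","y2_t","y3_t","y1_s","y2_s","y3_s")
for (r in 1:3) for (i in 1:3) {
  dat[[paste0("y", i, "_", c("p","t","s")[r])]] <-
    0.7 * common + 0.5 * rater[, r] + 0.4 * item[, i] + rnorm(n, sd = 0.5)
}

model <- '
  # common (target) factor: loads on all ratings from all raters
  common  =~ y1_p + y2_p + y3_p + y1_t + y2_t + y3_t + y1_s + y2_s + y3_s
  # rater perspective factors: one per rater
  parent  =~ y1_p + y2_p + y3_p
  teacher =~ y1_t + y2_t + y3_t
  self    =~ y1_s + y2_s + y3_s
  # item-specific factors: same item rated by different raters
  item1   =~ y1_p + y1_t + y1_s
  item2   =~ y2_p + y2_t + y2_s
  item3   =~ y3_p + y3_t + y3_s
'

# All factors are kept orthogonal, as in a trifactor-style specification.
fit <- cfa(model, data = dat, orthogonal = TRUE, std.lv = TRUE)
summary(fit, fit.measures = TRUE)
```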


2021, pp. 014662162110428
Author(s): Katherine G. Jonas

New measures of test information, termed global information, quantify test information relative to the entire range of the trait being assessed. Estimating global information relative to a non-informative prior distribution results in a measure of how much information could be gained by administering the test to an unspecified examinee. Currently, such measures have been developed only for unidimensional tests. This study introduces measures of multidimensional global test information and validates them in simulated data. Then, the utility of global test information is tested in neuropsychological data collected as part of Rush University’s Memory and Aging Project. These measures allow for direct comparison of complex tests calibrated in different samples, facilitating test development and selection.
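
The article's multidimensional measures are not reproduced here, but the basic idea of averaging test information over a prior can be illustrated with a simple unidimensional analogue. The sketch below is an assumption-laden illustration: 2PL item parameters are made up, and the "global" quantity is just the expected Fisher information under a standard normal prior.

```r
# Unidimensional sketch of "global" test information: expected Fisher
# information of a 2PL test averaged over a prior on theta.
# Item parameters below are made up for illustration.
a <- c(1.2, 0.8, 1.5, 1.0, 0.9)    # discriminations
b <- c(-1.0, -0.3, 0.2, 0.8, 1.5)  # difficulties

test_info <- function(theta) {
  # sum of 2PL item informations a^2 * P * (1 - P) at a single theta
  p <- plogis(a * (theta - b))
  sum(a^2 * p * (1 - p))
}

# Average the test information over a standard normal prior
# (simple numerical quadrature on a grid).
grid <- seq(-4, 4, length.out = 201)
w <- dnorm(grid)
w <- w / sum(w)
global_info <- sum(sapply(grid, test_info) * w)
global_info
```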


2021, pp. 014662162110428
Author(s): Steffi Pohl, Daniel Schulze, Eric Stets

When measurement invariance does not hold, researchers aim for partial measurement invariance by identifying anchor items that are assumed to be measurement invariant. In this paper, we build on Bechger and Maris's approach to anchor item identification. Instead of identifying differential item functioning (DIF)-free items, they propose identifying sets of items whose item parameters are invariant within the same set. We extend their approach with an additional step that allows for the identification of homogeneously functioning item sets. We evaluate the performance of the extended cluster approach under various conditions and compare it to that of previous approaches, namely the equal-mean-difficulty (EMD) approach and the iterative forward approach. We show that the EMD and iterative forward approaches perform well in conditions with balanced DIF or when DIF is small. In conditions with large and unbalanced DIF, they fail to recover the true group mean differences. With appropriate threshold settings, the cluster approach identified a cluster that resulted in unbiased mean difference estimates in all conditions. Compared to previous approaches, the cluster approach allows for a variety of different assumptions and depicts the uncertainty in the results that stems from the choice of assumption. Using a real data set, we illustrate how the assumptions of the previous approaches may be incorporated in the cluster approach and how the chosen assumption impacts the results.
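
The sketch below is not the authors' implementation; it only illustrates the clustering intuition under simplifying assumptions: items whose between-group difficulty shifts are homogeneous (so their pairwise relative differences are near zero) form a cluster, and that cluster is a candidate anchor set. All difficulty estimates and the threshold are hypothetical.

```r
# Simplified illustration of the clustering idea (not the authors' code).
# Hypothetical Rasch difficulty estimates in a reference and a focal group.
b_ref   <- c(-1.2, -0.5, 0.0, 0.4, 1.1, -0.8, 0.9, 0.2)
b_focal <- c(-0.9, -0.2, 0.3, 0.7, 1.4, -1.3, 1.6, 0.9)

delta <- b_focal - b_ref              # per-item shift between groups
d <- dist(delta)                      # pairwise differences in shift
cl <- hclust(d, method = "average")
groups <- cutree(cl, h = 0.2)         # threshold: max within-cluster spread

# Candidate anchor set: the largest homogeneous cluster
anchor <- which(groups == as.integer(names(which.max(table(groups)))))
anchor
```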


2021, pp. 014662162110517
Author(s): Seang-Hwane Joo, Philseok Lee, Stephen Stark

Collateral information has been used to address subpopulation heterogeneity and increase estimation accuracy in some large-scale cognitive assessments. However, methodology that takes collateral information into account has not been developed or explored in published research with models designed specifically for noncognitive measurement. Because accurate noncognitive measurement is becoming increasingly important, we sought to examine the benefits of using collateral information in latent trait estimation with an item response theory model that has proven valuable for noncognitive testing, namely the generalized graded unfolding model (GGUM). We introduce an extension of the GGUM that incorporates collateral information, henceforth called the Explanatory GGUM. We then present a simulation study that examined Explanatory GGUM latent trait estimation as a function of sample size, test length, number of background covariates, and the correlation between the covariates and the latent trait. Results indicated that the Explanatory GGUM approach provides scoring accuracy and precision superior to traditional expected a posteriori (EAP) and full Bayesian (FB) methods. Implications and recommendations are discussed.
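
The core idea of collateral information is that background covariates shift the prior used when scoring an examinee. The sketch below is a conceptual illustration only: it uses a 2PL likelihood for brevity rather than the GGUM, the latent regression coefficients are assumed known, and all numbers are made up.

```r
# Conceptual sketch of collateral information in EAP scoring:
# the prior mean of theta is shifted by covariates, theta ~ N(x'beta, sigma^2).
a <- c(1.0, 1.3, 0.8, 1.1)          # illustrative 2PL discriminations
b <- c(-0.5, 0.0, 0.6, 1.0)         # illustrative 2PL difficulties
resp <- c(1, 1, 0, 0)               # one examinee's responses
x <- c(1, 0.5)                      # intercept + one background covariate
beta <- c(0.2, 0.4)                 # latent regression coefficients (assumed known)
sigma <- 0.9                        # residual SD of theta given x

grid <- seq(-4, 4, length.out = 201)
prior <- dnorm(grid, mean = sum(x * beta), sd = sigma)
lik <- sapply(grid, function(th) {
  p <- plogis(a * (th - b))
  prod(p^resp * (1 - p)^(1 - resp))
})
post <- prior * lik / sum(prior * lik)
eap <- sum(grid * post)             # covariate-informed EAP estimate
eap
```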


2021, pp. 014662162110468
Author(s): Irina Grabovsky, Jesse Pace, Christopher Runyon

We model pass/fail examinations with the aim of providing a systematic tool for minimizing classification errors. We use the method of cut-score operating functions to generate specific cut-scores on the basis of minimizing several important misclassification measures. The goal of this research is to examine the combined effects of a known distribution of examinee abilities and uncertainty in the standard setting on the optimal choice of cut-score. In addition, we describe an online application that allows others to use the cut-score operating function for their own standard settings.
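
A minimal sketch of the underlying computation, under assumptions not taken from the article: abilities are N(0, 1), the true standard and the standard error of measurement are fixed illustrative values, and the "operating function" is simply the total misclassification rate as a function of the observed-score cut.

```r
# Expected misclassification rate as a function of the observed-score cut,
# given a N(0, 1) ability distribution and normal measurement error.
true_cut <- 0.5    # illustrative true standard on the theta scale
sem <- 0.35        # illustrative standard error of measurement

misclass <- function(cut) {
  integrand <- function(theta) {
    fail_prob <- pnorm(cut, mean = theta, sd = sem)   # P(observed < cut | theta)
    pass_prob <- 1 - fail_prob                        # P(observed >= cut | theta)
    # false negatives above the standard, false positives below it
    ifelse(theta >= true_cut, fail_prob, pass_prob) * dnorm(theta)
  }
  integrate(integrand, -5, 5)$value
}

cuts <- seq(-1, 2, by = 0.01)
rates <- sapply(cuts, misclass)
cuts[which.min(rates)]   # cut-score minimizing total misclassification
```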


2021, pp. 014662162110517
Author(s): Joseph A. Rios, Jiayi Deng

An underlying threat to the validity of reliability measures is the introduction of systematic variance into examinee scores from unintended constructs that differ from those being assessed. One construct-irrelevant behavior that has gained increased attention in the literature is rapid guessing (RG), which occurs when examinees answer quickly with intentional disregard for item content. To examine the degree of distortion in coefficient alpha due to RG, this study compared alpha estimates between conditions in which simulees engaged in full solution behavior (i.e., no RG) versus partial RG behavior. This was done by conducting a simulation study that manipulated the percentage and ability characteristics of rapid responders as well as the percentage and pattern of RG. After controlling for test length and difficulty, the average degree of distortion in estimates of coefficient alpha due to RG ranged from −.04 to .02 across 144 conditions. Although slight differences were noted between conditions differing in RG pattern and RG responder ability, the findings from this study suggest that estimates of coefficient alpha are largely robust to the presence of RG arising from cognitive fatigue and a low perceived probability of success.
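
A minimal simulation in the spirit of the study, not a reproduction of its design: the proportion of guessers, the affected items, and the guessing success rate are all arbitrary choices made for illustration.

```r
# Compare coefficient alpha with and without rapid guessing (RG)
# injected into a subset of simulees' responses.
set.seed(1)
n <- 1000; k <- 20
a <- runif(k, 0.8, 1.6); b <- rnorm(k)
theta <- rnorm(n)

p <- plogis(sweep(outer(theta, b, "-"), 2, a, "*"))   # 2PL probabilities
full <- matrix(rbinom(n * k, 1, p), n, k)             # full-solution responses

# 10% of simulees guess rapidly on the last 25% of items (P(correct) = .25)
rg <- full
guessers <- sample(n, n * 0.10)
rg_items <- (k - k / 4 + 1):k
rg[guessers, rg_items] <- rbinom(length(guessers) * length(rg_items), 1, 0.25)

alpha <- function(x) {
  k <- ncol(x)
  k / (k - 1) * (1 - sum(apply(x, 2, var)) / var(rowSums(x)))
}
c(full = alpha(full), with_rg = alpha(rg))
```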


2021, pp. 014662162110517
Author(s): Mengtong Li, Tianjun Sun, Bo Zhang

Recently, there has been increasing interest in adopting the forced-choice (FC) test format in noncognitive assessments because, when well designed, it is resistant to faking. However, traditional or manual approaches to pairing items for FC test construction are time- and effort-intensive and often rest on incomplete consideration of item characteristics. To address these issues, we developed the new open-source autoFC R package to facilitate automated and optimized item pairing strategies. The autoFC package is intended as a practical tool for FC test construction: users can obtain automatically optimized FC tests simply by inputting the item characteristics of interest, and can customize the matching rules and the behavior of the optimization process. The autoFC package should be of interest to researchers and practitioners constructing FC scales with potentially many metrics to match on and/or many items to pair, essentially relieving users of the burden of manual item pairing and reducing the computational costs and biases introduced by simple ranking methods.
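
The sketch below does not use the autoFC API (see the package documentation for its actual functions); it only illustrates, with made-up item characteristics, the kind of matching problem the package automates: pairing statements from different traits while minimizing differences in social desirability within each pair.

```r
# Conceptual sketch of forced-choice pairing by greedy matching on
# social desirability (not the autoFC package's implementation).
set.seed(2)
items <- data.frame(
  id = paste0("item", 1:10),
  desirability = round(rnorm(10, mean = 4, sd = 0.8), 2),
  trait = rep(c("A", "B"), each = 5)
)

a_items <- items[items$trait == "A", ]
b_items <- items[items$trait == "B", ]
pairs <- data.frame()
for (i in seq_len(nrow(a_items))) {
  # pair each trait-A item with the unused trait-B item closest in desirability
  gap <- abs(b_items$desirability - a_items$desirability[i])
  j <- which.min(gap)
  pairs <- rbind(pairs, data.frame(first = a_items$id[i],
                                   second = b_items$id[j],
                                   gap = gap[j]))
  b_items <- b_items[-j, ]
}
pairs
```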


2021, pp. 014662162110492
Author(s): Seung W. Choi, Sangdon Lim, Luping Niu, Sooyong Lee, Christina M. Schneider, ...

Multiple Administrations Adaptive Testing (MAAT) is an extension of the shadow-test approach to CAT for assessment frameworks in which multiple tests are administered periodically throughout the year. The maat package uses multiple item pools vertically scaled across grades and multiple phases (stages) within each test administration, allowing a transition from one item pool to another as deemed necessary to further enhance the quality of assessment.
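
The snippet below is purely illustrative and is not the maat package's API or routing logic; it sketches one simple way a between-pool transition rule could look, with hypothetical thresholds on an interim theta estimate at the end of a phase.

```r
# Illustrative pool-transition rule for vertically scaled item pools
# (hypothetical thresholds; not taken from the maat package).
choose_pool <- function(theta_hat, current_grade,
                        upper = 0.8, lower = -0.8) {
  if (theta_hat > upper)      current_grade + 1   # move to above-grade pool
  else if (theta_hat < lower) current_grade - 1   # move to below-grade pool
  else                        current_grade       # stay on-grade
}

choose_pool(theta_hat = 1.1, current_grade = 5)   # returns 6
```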


2021, pp. 014662162110404
Author(s): James D. Weese

The R package DIFSIB provides a direct translation of the SIBTEST, Crossing-SIBTEST, and POLYSIBTEST procedures, which were last updated and released in 2005. Translating these functions from Fortran into R allows researchers and practitioners to easily access the most recent versions of these procedures when conducting differential item functioning analyses, and makes the software easier to maintain and improve.
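
For orientation only, the sketch below illustrates the core SIBTEST idea in a stripped-down form: examinees are matched on the rest score and a weighted between-group difference on the studied item is accumulated. It omits the regression correction used by the actual procedures, and the simulated data and DIF effect are arbitrary.

```r
# Simplified illustration of the uncorrected SIBTEST statistic
# (not the DIFSIB package's implementation).
set.seed(3)
n <- 2000
group <- rep(c("ref", "foc"), each = n / 2)
theta <- rnorm(n)
p_rest <- plogis(outer(theta, rnorm(19), "-"))
rest <- rowSums(matrix(rbinom(n * 19, 1, p_rest), n, 19))   # matching subtest
# studied item is harder for the focal group (uniform DIF)
p_item <- plogis(theta - 0.3 - 0.4 * (group == "foc"))
item <- rbinom(n, 1, p_item)

beta_uni <- 0
for (s in sort(unique(rest))) {
  r <- item[group == "ref" & rest == s]
  f <- item[group == "foc" & rest == s]
  if (length(r) > 0 && length(f) > 0) {
    w <- sum(rest == s) / n               # proportion at this score stratum
    beta_uni <- beta_uni + w * (mean(r) - mean(f))
  }
}
beta_uni   # positive values indicate DIF favoring the reference group
```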

