Examining Measurement Invariance and Differential Item Functioning With Discrete Latent Construct Indicators: A Note on a Multiple Testing Procedure

2016 ◽  
Vol 78 (2) ◽  
pp. 343-352 ◽  
Author(s):  
Tenko Raykov ◽  
Dimiter M. Dimitrov ◽  
George A. Marcoulides ◽  
Tatyana Li ◽  
Natalja Menold

A latent variable modeling method is outlined for studying measurement invariance when latent constructs are evaluated with multiple binary or binary-scored items with no guessing. The approach extends the continuous-indicator procedure described by Raykov and colleagues, similarly utilizes the false discovery rate approach to multiple testing, and permits one to locate violations of measurement invariance in loading or threshold parameters. The method does not require selection of a reference observed variable and is directly applicable to studying differential item functioning with one- or two-parameter item response models. The extended procedure is illustrated on an empirical data set.
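To make the multiple-testing step concrete, the following is a minimal sketch of the Benjamini-Hochberg false discovery rate procedure applied to per-parameter invariance tests, assuming p-values from two-group comparisons of loadings and thresholds are already available (the p-values and level shown here are illustrative, not from the paper).

import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of rejected hypotheses at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Largest k with p_(k) <= (k/m) * q; reject the k smallest p-values.
    below = ranked <= (np.arange(1, m + 1) / m) * q
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

# Hypothetical p-values for loading/threshold differences of six items
p_vals = [0.001, 0.020, 0.150, 0.004, 0.300, 0.048]
print(benjamini_hochberg(p_vals, q=0.05))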

2021 ◽  
pp. 014662162110428
Author(s):  
Steffi Pohl ◽  
Daniel Schulze ◽  
Eric Stets

When measurement invariance does not hold, researchers aim for partial measurement invariance by identifying anchor items that are assumed to be measurement invariant. In this paper, we build on Bechger and Maris's approach to the identification of anchor items. Instead of identifying differential item functioning (DIF)-free items, they propose identifying different sets of items whose item parameters are invariant within the same set. We extend their approach with an additional step that allows identification of homogeneously functioning item sets. We evaluate the performance of the extended cluster approach under various conditions and compare it to that of previous approaches, namely the equal-mean-difficulty (EMD) approach and the iterative forward approach. We show that the EMD and iterative forward approaches perform well in conditions with balanced DIF or when DIF is small. In conditions with large and unbalanced DIF, they fail to recover the true group mean differences. With appropriate threshold settings, the cluster approach identified a cluster that resulted in unbiased mean difference estimates in all conditions. Compared to previous approaches, the cluster approach accommodates a variety of different assumptions and depicts the uncertainty in the results that stems from the choice of assumption. Using a real data set, we illustrate how the assumptions of the previous approaches may be incorporated in the cluster approach and how the chosen assumption affects the results.
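As an illustration of the clustering idea (not the authors' implementation), the sketch below groups items by the similarity of their between-group difficulty shifts and returns the largest group as a candidate anchor set; the difficulty estimates, threshold, and data are hypothetical.

import numpy as np

def candidate_anchor_cluster(b_ref, b_foc, threshold=0.3):
    """Single-linkage grouping of item difficulty shifts; returns the largest cluster."""
    shifts = np.asarray(b_foc, dtype=float) - np.asarray(b_ref, dtype=float)
    order = np.argsort(shifts)
    clusters, current = [], [int(order[0])]
    for prev, nxt in zip(order, order[1:]):
        if shifts[nxt] - shifts[prev] < threshold:
            current.append(int(nxt))
        else:
            clusters.append(current)
            current = [int(nxt)]
    clusters.append(current)
    return max(clusters, key=len)

b_ref = np.array([-1.0, 0.2, 0.8, -0.3, 1.1, 0.0])
b_foc = np.array([-0.9, 0.3, 1.6, -0.2, 1.2, 0.9])   # items 2 and 5 shift between groups
print(candidate_anchor_cluster(b_ref, b_foc))        # -> [0, 1, 3, 4]

Varying the threshold corresponds loosely to varying the assumption about how homogeneous an anchor set must be, which is one way the uncertainty mentioned in the abstract can be made explicit.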


2017 ◽  
Vol 78 (5) ◽  
pp. 781-804 ◽  
Author(s):  
Stella Bollmann ◽  
Moritz Berger ◽  
Gerhard Tutz

Various methods to detect differential item functioning (DIF) in item response models are available. However, most of these methods assume that responses are binary, so methods for ordered response categories are scarce. In the present article, DIF in the widely used partial credit model is investigated. An item-focused tree is proposed that allows the detection of DIF items, which might affect the performance of the partial credit model. The method uses tree methodology, yielding a tree for each item that is detected as a DIF item. The visualization as trees makes the results easily accessible, as the obtained trees show which variables induce DIF and in which way. The new method is compared with alternative approaches, and simulations demonstrate its performance.
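A much-simplified stand-in for the per-item split search is sketched below: for one item, candidate cut points of a covariate are scanned and the item-score distributions in the two resulting nodes are compared with a chi-square test (the paper instead tests group differences in partial credit model item parameters); all names and data are illustrative.

import numpy as np
from scipy.stats import chi2_contingency

def best_split_for_item(item_scores, covariate, n_categories):
    """Return (cut point, p-value) of the most discriminating binary split
    for one polytomous item with integer scores 0..n_categories-1."""
    item_scores = np.asarray(item_scores, dtype=int)
    covariate = np.asarray(covariate)
    best_cut, best_p = None, 1.0
    for cut in np.unique(covariate)[:-1]:
        left = covariate <= cut
        table = np.array([
            np.bincount(item_scores[left], minlength=n_categories),
            np.bincount(item_scores[~left], minlength=n_categories),
        ])
        table = table[:, table.sum(axis=0) > 0]          # drop empty categories
        if table.shape[1] < 2 or (table.sum(axis=1) == 0).any():
            continue
        p = chi2_contingency(table)[1]
        if p < best_p:
            best_cut, best_p = cut, p
    return best_cut, best_p

scores = np.array([0, 1, 2, 2, 1, 0, 2, 1, 0, 2, 2, 1])
age = np.array([25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80])
print(best_split_for_item(scores, age, n_categories=3))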


2010 ◽  
Vol 35 (2) ◽  
pp. 174-193 ◽  
Author(s):  
Matthias von Davier ◽  
Sandip Sinharay

This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models extend item response theory (IRT) to a latent variable model with covariates serving as predictors of the conditional distribution of ability. Applications to estimating latent regression models for National Assessment of Educational Progress (NAEP) data from the 2000 Grade 4 mathematics assessment and the 2002 Grade 8 reading assessment are presented, and results of the proposed method are compared with results obtained using current operational procedures.
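The following toy sketch, under strong simplifications (known item parameters, fixed residual variance), illustrates the general shape of such an algorithm: a Metropolis-Hastings sweep draws each examinee's ability given the current latent regression, and a Robbins-Monro average of the draws feeds the regression update. It is not the operational NAEP procedure; all names and data are illustrative.

import numpy as np

rng = np.random.default_rng(0)

def irt_loglik(theta, y, a, b):
    # 2PL log-likelihood of one examinee's responses y at ability theta
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def mh_sweep(theta, y, X, beta, sigma, a, b, step=0.5):
    # One Metropolis-Hastings update of every examinee's ability
    mu = X @ beta
    for i in range(theta.size):
        prop = theta[i] + step * rng.standard_normal()
        log_ratio = (irt_loglik(prop, y[i], a, b)
                     - irt_loglik(theta[i], y[i], a, b)
                     - 0.5 * ((prop - mu[i]) ** 2 - (theta[i] - mu[i]) ** 2) / sigma ** 2)
        if np.log(rng.random()) < log_ratio:
            theta[i] = prop
    return theta

def saem_latent_regression(y, X, a, b, n_iter=100, sigma=1.0):
    # Stochastic-approximation EM: MH draws of ability, Robbins-Monro
    # averaging, then a least-squares update of the regression coefficients
    theta = np.zeros(y.shape[0])
    theta_bar = np.zeros_like(theta)
    beta = np.zeros(X.shape[1])
    for t in range(1, n_iter + 1):
        theta = mh_sweep(theta, y, X, beta, sigma, a, b)
        gamma = 1.0 / t                                   # decreasing gain
        theta_bar += gamma * (theta - theta_bar)
        beta = np.linalg.lstsq(X, theta_bar, rcond=None)[0]
    return beta

# Hypothetical data: 300 examinees, 10 items, one covariate plus an intercept
n, k = 300, 10
a_true, b_true = np.ones(k), np.linspace(-1.5, 1.5, k)
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
theta_true = X @ np.array([0.2, 0.5]) + rng.standard_normal(n)
y = (rng.random((n, k)) < 1.0 / (1.0 + np.exp(-(theta_true[:, None] - b_true)))).astype(int)
print(saem_latent_regression(y, X, a_true, b_true))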


2015 ◽  
Vol 14 (1) ◽  
pp. 1-19 ◽  
Author(s):  
Rosa J. Meijer ◽  
Thijmen J.P. Krebs ◽  
Jelle J. Goeman

We present a multiple testing method for hypotheses that are ordered in space or time. Given such hypotheses, the elementary hypotheses as well as regions of consecutive hypotheses are of interest. These region hypotheses not only have intrinsic meaning, but testing them also has the advantage that (potentially small) signals across a region are combined in one test. Because the expected number and length of potentially interesting regions are usually not available beforehand, we propose a method that tests all possible region hypotheses as well as all individual hypotheses in a single multiple testing procedure that controls the familywise error rate. We start by testing the global null hypothesis, and when this hypothesis can be rejected we continue by specifying the exact location or locations of the effect. The method is implemented in the
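A deliberately simplified sketch of the top-down idea follows: the global region is tested first, and sub-regions (here, recursive halves) are only examined once their parent has been rejected. Fisher's method is used purely for illustration; the paper's region definitions, combination rule, and familywise error rate control are more refined, so this sketch only conveys the hierarchical structure.

import numpy as np
from scipy.stats import combine_pvalues

def test_regions(p_values, alpha=0.05, lo=0, hi=None, rejected=None):
    """Recursively test region [lo, hi) and, if rejected, its two halves."""
    if hi is None:
        hi = len(p_values)
    if rejected is None:
        rejected = []
    region_p = combine_pvalues(p_values[lo:hi], method="fisher")[1]
    if region_p <= alpha:
        rejected.append((lo, hi))
        if hi - lo > 1:
            mid = (lo + hi) // 2
            test_regions(p_values, alpha, lo, mid, rejected)
            test_regions(p_values, alpha, mid, hi, rejected)
    return rejected

p = np.array([0.40, 0.35, 0.001, 0.003, 0.002, 0.50, 0.45, 0.60])
print(test_regions(p))   # nested regions zooming in on the signal around indices 2-4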


The purpose of this study was to examine differences in the sensitivity of three methods, IRT-Likelihood Ratio (IRT-LR), Mantel-Haenszel (MH), and Logistic Regression (LR), in detecting gender differential item functioning (DIF) on the National Mathematics Examination (Ujian Nasional: UN) for the 2014/2015 academic year in North Sumatera Province, Indonesia. A DIF item indicates unfairness: it advantages test takers from certain groups and disadvantages those from other groups even when they have the same ability. DIF was examined by gender, with men as the reference group (R) and women as the focal group (F). The study used an experimental 3x1 design with one factor (method) and three treatments, the three DIF detection methods. There are five UN Mathematics 2015 test packages (codes 1107, 2207, 3307, 4407, and 5507). The 2207 package was taken as the sample data, comprising 5,000 participants (3,067 women, 1,933 men) and 40 UN items. Item selection based on classical test theory (CTT) retained 32 of the 40 items, and selection based on item response theory (IRT) retained 18 items. Using R 3.3.3 and IRTLRDIF 2.0, 5 items were flagged as DIF by the IRT-Likelihood Ratio method (IRT-LR), 4 by the Logistic Regression method (LR), and 3 by the Mantel-Haenszel method (MH). Because a single DIF detection run is not sufficient to assess sensitivity, six analysis groups were formed: (4400,40), (4400,32), (4400,18), (3000,40), (3000,32), and (3000,18); 40 random data sets (without repetition) were generated in each group, and DIF detection was conducted on each data set. Although the data show imperfect model fit, the three-parameter logistic model (3PL) was chosen as the most suitable model. Based on Tukey's HSD post hoc test, the IRT-LR method was more sensitive than the MH and LR methods in the (4400,40) and (3000,40) groups. IRT-LR was no longer more sensitive than LR in the (4400,32) and (3000,32) groups, but remained more sensitive than MH. In the (4400,18) and (3000,18) groups, IRT-LR was more sensitive than LR but not significantly more sensitive than MH. The LR method was consistently more sensitive than the MH method across all analysis groups.
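For the Mantel-Haenszel method compared above, the core computation is the common odds ratio across score strata. The sketch below shows it for a single dichotomous item, with examinees stratified by rest score (total score excluding the studied item); variable names and layout are illustrative rather than tied to the UN data.

import numpy as np

def mantel_haenszel_alpha(correct, is_ref, rest_score):
    # Mantel-Haenszel common odds ratio for one item; alpha > 1 means the
    # reference group outperforms the focal group at comparable ability
    correct = np.asarray(correct, dtype=bool)
    is_ref = np.asarray(is_ref, dtype=bool)
    rest_score = np.asarray(rest_score)
    num = den = 0.0
    for s in np.unique(rest_score):
        stratum = rest_score == s
        a = np.sum(correct & is_ref & stratum)       # reference, correct
        b = np.sum(~correct & is_ref & stratum)      # reference, incorrect
        c = np.sum(correct & ~is_ref & stratum)      # focal, correct
        d = np.sum(~correct & ~is_ref & stratum)     # focal, incorrect
        total = a + b + c + d
        if total > 0:
            num += a * d / total
            den += b * c / total
    return num / den if den > 0 else float("inf")

# ETS reports DIF on the delta scale, delta_MH = -2.35 * ln(alpha_MH);
# |delta_MH| of roughly 1.5 or more (and significant) is conventionally treated as large DIF.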


2021 ◽  
pp. 23-42
Author(s):  
Karen H. Larwin ◽  
Milton Harvey

The current investigation uses latent variable modeling to investigate Subjective Well-Being (SWB). As a follow-up to Larwin, Harvey, and Constantinou (2020), subjective well-being is presented through a third-order factor model that explains two second-order factors, SWB and Interpersonal Experiences (IES), while incorporating measures of relationship and resiliency self-evaluations. Additionally, the current investigation considers differential item functioning not addressed in the existing SWB literature.
JEL classification numbers: C1, C3, C4, C9.
Keywords: Subjective Well-Being, Satisfaction with Life Scale, Subjective Happiness Scale, Brief Resiliency Scale, Relationship Assessment Scale, Multiple Indicators Multiple Causes (MIMIC), Weighted least squares mean variance adjusted estimator (WLSMV).


2011 ◽  
Vol 35 (8) ◽  
pp. 604-622 ◽  
Author(s):  
Hirotaka Fukuhara ◽  
Akihito Kamata

A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into account, thus estimating DIF magnitude appropriately when a test is composed of testlets. A fully Bayesian estimation method was adopted for parameter estimation. The recovery of parameters was evaluated for the proposed DIF model. Simulation results revealed that the proposed bifactor MIRT DIF model produced better estimates of DIF magnitude and higher DIF detection rates than the traditional IRT DIF model for all simulation conditions. A real data analysis was also conducted by applying the proposed DIF model to a statewide reading assessment data set.
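The following illustrative response function shows the kind of bifactor testlet structure the abstract describes: each item loads on a general ability and on its testlet factor, and a group-specific shift on the studied item represents uniform DIF (parameter names here are generic, not the authors' notation).

import numpy as np

def bifactor_dif_prob(theta_g, theta_testlet, a_g, a_t, b, dif, focal):
    """P(correct) under a 2PL-type bifactor testlet model with a uniform DIF
    shift `dif` applied to the item difficulty for focal-group members."""
    logit = a_g * theta_g + a_t * theta_testlet - (b + dif * focal)
    return 1.0 / (1.0 + np.exp(-logit))

# Same abilities, but focal membership lowers P(correct) when dif > 0
print(bifactor_dif_prob(0.5, -0.2, a_g=1.2, a_t=0.6, b=0.1, dif=0.4, focal=0))
print(bifactor_dif_prob(0.5, -0.2, a_g=1.2, a_t=0.6, b=0.1, dif=0.4, focal=1))

Ignoring the testlet factor (setting a_t to zero, as a traditional unidimensional DIF model effectively does) folds shared testlet variance into the DIF estimate, which is the distortion the proposed model is designed to avoid.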


2016 ◽  
Vol 6 (2) ◽  
pp. 30-41
Author(s):  
Mark Chang ◽  
Xuan Deng ◽  
John Balser

2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Zhongquan Li ◽  
Xia Zhao ◽  
Ang Sheng ◽  
Li Wang

Background: Anxiety symptoms are pervasive among elderly populations around the world. The Geriatric Anxiety Inventory (GAI) has been developed and widely used to screen those suffering from severe symptoms. Although debates about its dimensionality have largely been resolved by Molde et al. (2019) with bifactor modeling, evidence regarding its measurement invariance across sex and somatic diseases is still missing.
Methods: This study attempted to provide complementary evidence on the dimensionality of the GAI with Mokken scale analysis and to examine its measurement invariance across sex and somatic diseases by conducting differential item functioning (DIF) analysis in a sample of older Chinese adults. The data were the responses of a large representative sample (N = 1314) from the Chinese National Survey Data Archive, focusing on the mental health of elderly adults.
Results: Mokken scale analysis confirmed the unidimensionality of the GAI, and DIF analysis indicated measurement invariance of the inventory across sex and somatic diseases, with only a few items exhibiting item bias, all of it negligible.
Conclusions: These findings support the use of the inventory among older Chinese adults to screen for anxiety symptoms and to make comparisons across sex and somatic diseases.
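For the Mokken scale analysis step, the central quantity is Loevinger's scalability coefficient H. A rough sketch of the scale-level computation for dichotomous items is given below; dedicated software (e.g., the mokken package in R) implements the complete analysis, and the data here are random and purely illustrative.

import numpy as np

def loevinger_H(X):
    """Scale-level scalability H for a persons-by-items 0/1 matrix
    (values near 0 mean no scale; 0.3 is a common lower bound for a Mokken scale)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    p = X.mean(axis=0)                      # item popularities
    observed = expected = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            easy, hard = (i, j) if p[i] >= p[j] else (j, i)
            # Guttman error: harder item passed while the easier item is failed
            observed += np.sum((X[:, hard] == 1) & (X[:, easy] == 0))
            expected += n * p[hard] * (1 - p[easy])
    return 1.0 - observed / expected

# Hypothetical 0/1 responses of 200 people to five anxiety items
X = np.random.default_rng(1).integers(0, 2, size=(200, 5))
print(loevinger_H(X))   # near 0 for random data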

