item functioning
Recently Published Documents



2022 ◽  
pp. 001316442110684
Author(s):  
Natalie A. Koziol ◽  
J. Marc Goodrich ◽  
HyeonJin Yoon

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A simulation study was performed to compare the new framework with traditional logistic regression, with respect to Type I error and power rates of the uniform DIF test statistics and bias and root mean square error of the corresponding effect size estimators. The new framework better controlled the Type I error rate and demonstrated minimal bias but suffered from low power and lack of precision. Implications for practice are discussed.
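For readers unfamiliar with the traditional logistic regression approach that the abstract uses as its comparison method, here is a minimal sketch of a uniform DIF test. It does not implement the proposed regression discontinuity framework. The simulated data, the noisy matching variable, the DIF size of 0.8 logits, and all function names are illustrative assumptions, not taken from the article.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def grad_and_info(X, y, beta):
    """Score vector and Fisher information of the logistic log-likelihood."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        p = 1.0 / (1.0 + math.exp(-eta))
        for r in range(k):
            grad[r] += (yi - p) * xi[r]
            for c in range(k):
                info[r][c] += p * (1.0 - p) * xi[r] * xi[c]
    return grad, info

def fit_logistic(X, y, iters=25):
    """Newton-Raphson maximum likelihood fit; returns (coefficients, standard errors)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad, info = grad_and_info(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    _, info = grad_and_info(X, y, beta)
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))  # sqrt of j-th diagonal of the inverse
    return beta, se

# Hypothetical simulated data: the studied item is 0.8 logits harder for group 1.
random.seed(1)
X, y = [], []
for i in range(2000):
    group = i % 2
    theta = random.gauss(0.0, 1.0)            # latent ability
    match = theta + random.gauss(0.0, 0.5)    # noisy stand-in for a total test score
    p = 1.0 / (1.0 + math.exp(-(theta - 0.8 * group)))
    X.append([1.0, match, float(group)])      # intercept, matching variable, group
    y.append(1 if random.random() < p else 0)

beta, se = fit_logistic(X, y)
wald_z = beta[2] / se[2]  # Wald statistic for the group effect (uniform DIF)
print("group coefficient:", round(beta[2], 3), "Wald z:", round(wald_z, 2))
```

The group coefficient is the uniform DIF effect size on the logit scale. Adding a group-by-matching-variable interaction column to X would extend the same fit to a non-uniform DIF test.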


PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0261865
Author(s):  
Linda J. Resnik ◽  
Mathew L. Borgia ◽  
Melissa A. Clark ◽  
Emily Graczyk ◽  
Jacob Segil ◽  
...  

Recent advances in upper limb prosthetics include sensory restoration techniques and osseointegration technology that introduce additional risks, higher costs, and longer periods of rehabilitation. To inform regulatory and clinical decision making, validated patient reported outcome measures are required to understand the relative benefits of these interventions. The Patient Experience Measure (PEM) was developed to quantify psychosocial outcomes for research studies on sensory-enabled upper limb prostheses. While the PEM was responsive to changes in prosthesis experience in prior studies, its psychometric properties had not been assessed. Here, the PEM was examined for structural validity and reliability across a large sample of people with upper limb loss (n = 677). The PEM was modified and tested in three phases: initial refinement and cognitive testing, pilot testing, and field testing. Exploratory factor analysis (EFA) was used to discover the underlying factor structure of the PEM items and confirmatory factor analysis (CFA) verified the structure. Rasch partial credit modeling evaluated monotonicity, fit, and magnitude of differential item functioning by age, sex, and prosthesis use for all scales. EFA resulted in a seven-factor solution that was reduced to the following six scales after CFA: social interaction, self-efficacy, embodiment, intuitiveness, wellbeing, and self-consciousness. After removal of two items during Rasch analyses, the overall model fit was acceptable (CFI = 0.973, TLI = 0.979, RMSEA = 0.038). The social interaction, self-efficacy and embodiment scales had strong person reliability (0.81, 0.80 and 0.77), Cronbach’s alpha (0.90, 0.80 and 0.71), and intraclass correlation coefficients (0.82, 0.85 and 0.74), respectively. 
The large sample size and use of contemporary measurement methods enabled identification of unidimensional constructs, differential item functioning by participant characteristics, and the rank ordering of the difficulty of each item in the scales. The PEM enables quantification of critical psychosocial impacts of advanced prosthetic technologies and provides a rigorous foundation for future studies of clinical and prosthetic interventions.


2021 ◽  
Author(s):  
Matthias von Davier ◽  
Ummugul Bezirhan

Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical assumptions such as the monotonicity and population independence of item functions are present even in classical test theory but are more explicitly stated when using item response theory or other latent variable models for the assessment of item fit. The work presented here provides an alternative approach that does not assume perfect model data fit, but rather uses Tukey’s concept of contaminated distributions and proposes an application of robust outlier detection in order to flag items for which adequate model data fit cannot be established.
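The idea of flagging items by robust outlier detection can be illustrated with a small sketch. The abstract does not specify the authors' procedure, so the median and MAD rule, the cutoff of 3, and the example misfit statistics below are assumptions chosen only to show the general technique.

```python
import statistics

def flag_outlying_items(fit_stats, cutoff=3.0):
    """Return indices of items whose fit statistic is a robust outlier.

    Uses the median as the center and the scaled median absolute deviation
    (MAD) as the spread, so a few badly fitting items do not inflate the
    yardstick used to judge the rest.
    """
    center = statistics.median(fit_stats)
    deviations = [abs(s - center) for s in fit_stats]
    mad = statistics.median(deviations)
    if mad == 0:
        return []  # no spread to measure against; nothing is flagged in this sketch
    scale = 1.4826 * mad  # makes the MAD comparable to a normal standard deviation
    return [i for i, d in enumerate(deviations) if d / scale > cutoff]

# Hypothetical per-item misfit statistics (for example, group differences in item difficulty).
item_stats = [0.02, -0.05, 0.01, 0.04, -0.03, 0.00, 0.85, 0.03, -0.01, -0.62]
flagged = flag_outlying_items(item_stats)
print("flagged item indices:", flagged)
```

Because the median and MAD are barely affected by the two extreme values, the remaining items define the reference distribution, which matches the contaminated-distribution view described in the abstract.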


Crisis ◽  
2021 ◽  
Author(s):  
Jenny Mei Yiu Huen ◽  
Paul Siu Fai Yip ◽  
Augustine Osman ◽  
Angel Nga Man Leung

Abstract. Background: Despite the widespread use of the Suicidal Behaviors Questionnaire–Revised (SBQ-R) and advances in item response theory (IRT) modeling, item-level analysis with the SBQ-R has been minimal. Aims: This study extended IRT modeling strategies to examine the response parameters and potential differential item functioning (DIF) of the individual SBQ-R items in samples of US (N = 320) and Chinese (N = 298) undergraduate students. Method: Responses to the items were calibrated using the unidimensional graded response IRT model. Goodness-of-fit, item parameters, and DIF were evaluated. Results: The unidimensional graded response IRT model provided a good fit to the sample data. Results showed that the SBQ-R items had various item discrimination parameters and item severity parameters. Also, each SBQ-R item functioned similarly between the US and Chinese respondents. In particular, Item 1 (history of attempts) demonstrated high discrimination and severity of suicide-related thoughts and behaviors (STBs). Limitations: The use of cross-sectional data from convenience samples of undergraduate students could be considered a major limitation. Conclusion: The findings from the IRT analysis provided empirical support that each SBQ-R item taps into STBs and that scores for Item 1 can be used for screening purposes.
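To make the roles of the discrimination and severity parameters concrete, the following sketch computes response category probabilities under the standard graded response model. The parameter values are hypothetical and are not estimates from the study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta      : latent trait level of the respondent
    a          : item discrimination
    thresholds : ascending severity (threshold) parameters b_1 < ... < b_{K-1}

    P(X >= k) is a logistic function of a * (theta - b_k); each category
    probability is the difference between adjacent cumulative probabilities.
    """
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Hypothetical four-category item with discrimination 1.5.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

In this framing, an item shows DIF when respondents at the same theta in two groups need different a or threshold values to describe their responses.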


Author(s):  
Lennart Schneider ◽  
Carolin Strobl ◽  
Achim Zeileis ◽  
Rudolf Debelak

Abstract. The detection of differential item functioning (DIF) is a central topic in psychometrics and educational measurement. In the past few years, a new family of score-based tests of measurement invariance has been proposed, which allows the detection of DIF along arbitrary person covariates in a variety of item response theory (IRT) models. This paper illustrates the application of these tests within the R system for statistical computing, making them accessible to a broad range of users. This presentation also includes IRT models for which these tests have not previously been investigated, such as the generalized partial credit model. The paper has three goals: First, we review the ideas behind score-based tests of measurement invariance. Second, we describe the implementation of these tests within the R system for statistical computing, which is based on the interaction of the R packages mirt, psychotools and strucchange. Third, we illustrate the application of this software and the interpretation of its output in two empirical datasets. The complete R code for reproducing our results is reported in the paper.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Vahid Ebrahimi ◽  
Zahra Bagheri ◽  
Zahra Shayan ◽  
Peyman Jafari

Assessing differential item functioning (DIF) using the ordinal logistic regression (OLR) model depends heavily on the asymptotic sampling distribution of the maximum likelihood (ML) estimators. The ML estimation method, which is often used to estimate the parameters of the OLR model for DIF detection, may be substantially biased with small samples. This study is aimed at proposing a new application of the elastic net regularized OLR model, as a special type of machine learning method, for assessing DIF between two groups with small samples. Accordingly, a simulation study was conducted to compare the power and type I error rates of the regularized and nonregularized OLR models in detecting DIF under various conditions including moderate and severe magnitudes of DIF (DIF = 0.4 and 0.8), sample size (N), sample size ratio (R), scale length (I), and weighting parameter (w). The simulation results revealed that for I = 5 and regardless of R, the elastic net regularized OLR model with w = 0.1, as compared with the nonregularized OLR model, increased the power of detecting moderate uniform DIF (DIF = 0.4) by approximately 35% and 21% for N = 100 and 150, respectively. Moreover, for I = 10 and severe uniform DIF (DIF = 0.8), the average power of the elastic net regularized OLR model with 0.03 ≤ w ≤ 0.06, as compared with the nonregularized OLR model, increased by approximately 29.3% and 11.2% for N = 100 and 150, respectively. In these cases, the type I error rates of the regularized and nonregularized OLR models were below or close to the nominal level of 0.05. In general, this simulation study showed that the elastic net regularized OLR model outperformed the nonregularized OLR model, especially in groups with extremely small sample sizes. Furthermore, the present research provided a guideline and some recommendations for researchers who conduct DIF studies with small sample sizes.
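As a rough illustration of what elastic net regularization adds to the likelihood, the sketch below computes the usual elastic net penalty. It assumes that the weighting parameter w plays the role of the standard mixing weight between the lasso (L1) and ridge (L2) terms; the abstract does not define w, so this mapping, the overall strength lam, and the coefficient values are assumptions.

```python
def elastic_net_penalty(coefs, lam, w):
    """Elastic net penalty: lam * (w * L1 norm + (1 - w) / 2 * squared L2 norm).

    coefs : regression coefficients being penalized (for example, DIF terms)
    lam   : overall penalty strength (hypothetical name)
    w     : mixing weight, assumed here to correspond to the abstract's w;
            w = 1 gives the lasso penalty and w = 0 gives the ridge penalty
    """
    l1 = sum(abs(c) for c in coefs)
    l2_squared = sum(c * c for c in coefs)
    return lam * (w * l1 + (1.0 - w) / 2.0 * l2_squared)

# Hypothetical coefficients for a group effect and a group-by-score interaction.
dif_coefs = [1.0, -2.0]
lasso_like = elastic_net_penalty(dif_coefs, lam=0.5, w=1.0)
ridge_like = elastic_net_penalty(dif_coefs, lam=0.5, w=0.0)
mostly_ridge = elastic_net_penalty(dif_coefs, lam=0.5, w=0.1)
print(lasso_like, ridge_like, mostly_ridge)
```

A regularized fit would subtract this quantity from the OLR log-likelihood before maximizing, which shrinks small-sample estimates toward zero.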


2021 ◽  
Vol Publish Ahead of Print ◽  
Author(s):  
Jonathan D. Rubright ◽  
Michael Jodoin ◽  
Stephanie Woodward ◽  
Michael A. Barone

Healthcare ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 1717
Author(s):  
Pitchapat Chinnarasri ◽  
Nahathai Wongpakaran ◽  
Tinakon Wongpakaran

Background: Aging can be stressful, especially for people with narcissistic personality disorder. Nevertheless, no tool for detecting its symptoms is yet available for older Thai individuals. The study aimed to develop a tool to detect symptoms of narcissistic personality, and to validate its psychometric properties among older Thai adults. Methods: The Narcissistic Personality Scale (NPS), consisting of 80 items, was developed based on the nine symptom domains of narcissistic personality disorder from the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5). The original scale was field-tested using Rasch analysis for item reduction, rendering a final 43 items. NPS was further investigated among 296 seniors aged 60 years old. Rasch analysis was used to assess its construct validity. Results: Of the 43 items, 17 more were removed because their infit or outfit mean square exceeded 1.5. The final 26-item NPS met all necessary criteria of unidimensionality and local independence, showed no differential item functioning due to age and sex, and was well targeted to the subjects. Person and item reliability were 0.88 and 0.95, respectively. No disordered threshold or category was found. Conclusions: The NPS is a promising tool with a proven construct validity based on the Rasch measurement model among Thai seniors. This new questionnaire can be used as an outcome measure in clinical practice.
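The infit and outfit mean squares used as the removal criterion can be sketched as follows. For simplicity this assumes a dichotomous Rasch model with known person and item parameters, whereas a rating scale such as the NPS would need a polytomous model; the data and names are hypothetical.

```python
import math

def rasch_prob(theta, b):
    """Probability of endorsing an item of difficulty b at trait level theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_fit(responses, thetas, b):
    """Infit and outfit mean squares for one dichotomous item.

    outfit: unweighted mean of squared standardized residuals
    infit : sum of squared residuals divided by the sum of model variances
            (information-weighted, so less sensitive to off-target persons)
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, b)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

# Hypothetical item answered in the "wrong" direction: low-trait persons endorse it.
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]
infit, outfit = item_fit([1, 1, 0, 0, 0], thetas, b=0.0)
misfitting = infit > 1.5 or outfit > 1.5  # the removal rule stated in the abstract
print(round(infit, 2), round(outfit, 2), misfitting)
```

Values near 1 indicate responses about as noisy as the model expects, so a cutoff of 1.5 removes items whose responses are considerably more erratic than predicted.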

