OPTIMISING ASSESSMENT SYSTEM IN THE ESP COURSE THROUGH THE USE OF THE METHODS OF DIFFERENTIAL ITEM FUNCTIONING AND DIFFERENTIAL TEST FUNCTIONING IN FINAL TEST DESIGN

Author(s):  
K. I. Shykhnenko
2021 ◽  
pp. 001316442110015
Author(s):  
Dimiter M. Dimitrov ◽  
Dimitar V. Atanasov

This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as the D-scoring method (DSM). Under the proposed approach, called the P–Z method of testing for DIF, the item response functions of two groups (reference and focal) are compared by transforming their probabilities of correct item response, estimated under the DSM, into Z-scale normal deviates. Using the linear relationship between such Z-deviates, the testing for DIF is reduced to testing two basic statistical hypotheses about equal variances and equal means of the Z-deviates for the reference and focal groups. The results from a simulation study support the efficiency (low Type I error and high power) of the proposed P–Z method. Furthermore, it is shown that the P–Z method is directly applicable in testing for differential test functioning. Recommendations for practical use and future research, including possible applications of the P–Z method in an IRT context, are also provided.
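The transformation step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not give the test statistics, so the sketch only maps probabilities to Z-deviates with the inverse standard-normal CDF and reports the two quantities the hypotheses concern (difference in means and ratio of variances). The function names and the probability values are hypothetical.

```python
from statistics import NormalDist, mean, variance


def z_deviates(probs):
    """Map probabilities of correct response to standard-normal deviates."""
    nd = NormalDist()  # standard normal: mean 0, standard deviation 1
    return [nd.inv_cdf(p) for p in probs]  # each p must lie strictly in (0, 1)


def compare_groups(p_ref, p_focal):
    """Summarise the Z-deviates of the reference and focal groups.

    Returns the mean difference and the variance ratio; a formal DIF test
    would attach significance tests to these, which is not attempted here.
    """
    z_ref = z_deviates(p_ref)
    z_foc = z_deviates(p_focal)
    return {
        "mean_diff": mean(z_ref) - mean(z_foc),
        "var_ratio": variance(z_ref) / variance(z_foc),
    }


# Hypothetical probabilities of a correct response at five score levels.
p_ref = [0.20, 0.35, 0.50, 0.65, 0.80]
p_focal_same = list(p_ref)                        # identical item functioning
p_focal_shift = [0.10, 0.22, 0.35, 0.50, 0.68]    # item harder for focal group

no_dif = compare_groups(p_ref, p_focal_same)
shifted = compare_groups(p_ref, p_focal_shift)
```

With identical probabilities in both groups the mean difference is zero and the variance ratio is one; when the item is uniformly harder for the focal group, the mean difference becomes positive.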


2013 ◽  
Vol 34 (3) ◽  
pp. 170-183 ◽  
Author(s):  
Eunike Wetzel ◽  
Benedikt Hell

Large mean differences are consistently found in the vocational interests of men and women. These differences may be attributable to real differences in the underlying traits. However, they may also depend on the properties of the instrument being used. It is conceivable that, in addition to the intended dimension, items assess a second dimension that differentially influences responses by men and women. This question is addressed in the present study by analyzing a widely used German interest inventory (Allgemeiner Interessen-Struktur-Test, AIST-R) regarding differential item functioning (DIF) using a DIF estimate in the framework of item response theory. Furthermore, the impact of DIF at the scale level is investigated using differential test functioning (DTF) analyses. Several items on the AIST-R’s scales showed significant DIF, especially on the Realistic, Social, and Enterprising scales. Removal of DIF items reduced gender differences on the Realistic scale, though gender differences on the Investigative, Artistic, and Social scales remained practically unchanged. Thus, responses to some AIST-R items appear to be influenced by a secondary dimension apart from the interest domain the items were intended to measure.


2019 ◽  
Vol 38 (5) ◽  
pp. 627-641
Author(s):  
Beyza Aksu Dunya ◽  
Clark McKown ◽  
Everett Smith

Emotion recognition (ER) involves understanding what others are feeling by interpreting nonverbal behavior, including facial expressions. The purpose of this study is to evaluate the psychometric properties of a web-based social ER assessment designed for children in kindergarten through third grade. Data were collected from two separate samples of children. The first sample included 3,224 children and the second sample included 4,419 children. Data were calibrated using the Rasch dichotomous model. Differential item and test functioning were also evaluated across gender and ethnicity. Across both samples, we found consistent item fit, unidimensional item structure, and adequate item targeting. Analyses of differential item functioning (DIF) found six out of 111 items displaying DIF across gender and no items demonstrating DIF across ethnicity. The analyses of person measure calibrations with and without DIF items yielded no evidence of differential test functioning (DTF) across gender and ethnicity groups in both samples.
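For readers unfamiliar with the Rasch dichotomous model named in this abstract, a minimal Python sketch of its standard form follows. The abstract does not describe how DIF was quantified, so treating DIF as a gap between group-specific item difficulties is an assumption made here for illustration; the function names and numeric values are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def difficulty_gap(b_ref, b_focal):
    """Assumed DIF summary: focal-group difficulty minus reference-group difficulty."""
    return b_focal - b_ref


# Hypothetical item calibrated separately in two groups.
b_reference = 0.0
b_focal = 0.6
ability = 0.5  # same ability in both groups

p_reference = rasch_probability(ability, b_reference)
p_focal = rasch_probability(ability, b_focal)
gap = difficulty_gap(b_reference, b_focal)
```

Two test takers of equal ability have different success probabilities only when the item difficulty differs between groups, which is what a nonzero gap represents in this sketch.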


2020 ◽  
pp. 073428292094552
Author(s):  
Maryellen Brunson McClain ◽  
Bryn Harris ◽  
Sarah E. Schwartz ◽  
Megan E. Golson

Although the racial/ethnic demographics in the United States are changing, few studies evaluate the cultural and linguistic responsiveness of commonly used autism spectrum disorder screening and diagnostic assessment measures. The purpose of this study is to evaluate item and test functioning of the Autism Spectrum Rating Scales (ASRS) in a sample of racially/ethnically diverse parents of children (nonclinical) between the ages of 6 and 18 (N = 806). This study is a follow-up to a prior publication examining the factor structure of the ASRS among a similar sample. The present study furthers the examination of measurement invariance of the ASRS in racially/ethnically diverse populations by conducting differential item functioning and differential test functioning analyses with a larger sample. Results indicate test-level invariance; however, five items are noninvariant across parent reporters from different racial/ethnic groups. Implications for practice and directions for future research are discussed.


2005 ◽  
Vol 74 ◽  
pp. 135-145
Author(s):  
Tamara van Schilt-Mol ◽  
Ton Vallen ◽  
Henny Uiterwijk

Previous research has shown that the Dutch 'Final Test of Primary Education' contains a number of unintentionally, and therefore unwantedly, difficult test items, leading to Differential Item Functioning (DIF) for immigrant minority students whose parents' dominant language is Turkish or Arabic/Berber. Two statistical procedures were used to identify DIF items in the Final Test of 1997. Subsequently, five experiments were conducted to detect causes of DIF, revealing a number of hypotheses concerning possible linguistic, cultural, and textual sources. These hypotheses were used to manipulate original DIF items into intentionally DIF-free items. The article discusses three possible sources of DIF: (1) the use of fixed (misleading) answer options and (2) of misleading illustrations (both to the disadvantage of the minority students), and (3) the fact that questions concerning the past tense often lead to DIF (to their advantage).


Assessment ◽  
2017 ◽  
Vol 26 (6) ◽  
pp. 1162-1175 ◽  
Author(s):  
Christine DiStefano ◽  
Fred W. Greer ◽  
Erin Dowdy

This study examined the Behavior Assessment System for Children–Third Edition Behavioral and Emotional Screening System, Parent Form–Preschool, which is used to identify children at risk for emotional/behavioral problems. Rasch rating scale methods were used to provide information about psychometric properties of items, parent raters, and the response scale. Using the norming sample of 459 children, we investigated the structure of the screener, performance of the scale, and coverage of the maladaptive behavior construct to identify children with emerging behavioral difficulties. Most items showed good fit to the rating scale methods model with broad coverage of the latent domain. Items measuring internalizing and externalizing problems indicated a greater tendency for at-risk behavior, and items measuring adaptive skills offered a minimal contribution to measuring risk status. No differential item functioning was observed due to child gender; however, differential item functioning was observed between ratings for Caucasian and Hispanic or African American students for select items.

