item bias
Recently Published Documents

TOTAL DOCUMENTS: 160 (five years: 16)
H-INDEX: 33 (five years: 1)

2021 · Vol 7 · pp. e784
Author(s): Savas Okyay, Sercan Aygun

Recommender systems span a broad range of applications and deal with subjective preferences, so their recommendations naturally vary. As a field of data science and machine learning, recommender systems require both statistical rigor and careful performance monitoring. In this paper, we propose diversified similarity measurements and observe recommendation performance using generic metrics. In a user-based collaborative filtering setting, we measure the probability of an item being preferred by any user. After examining the best neighbor counts, we verify the test-item bias phenomenon for similarity equations. Because statistical parameters are typically computed in a global scope, the literature often leaves implicit whether those parameters statically include the data of the focal user. For each dynamic prediction, user-wise parameters should instead be generated at runtime, excluding the item of interest; this yields more reliable results and is better suited to real-time systems. Furthermore, we underline the effect of significance weighting by examining the similarities between a user of interest and its neighbors. Overall, this study uniquely combines significance weighting with test-item bias mitigation by inspecting the fine-tuned neighborhood. The results reveal adequate combinations of similarity weights and performance metrics. The source code of our architecture is available at https://codeocean.com/capsule/1427708/tree/v1.
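The CodeOcean capsule linked above contains the authors' full architecture; as a minimal illustration of the two ideas this abstract combines, the Python sketch below computes a Pearson similarity that excludes the test item at prediction time (test-item bias mitigation) and damps similarities estimated from few co-rated items (significance weighting). All names and the gamma = 50 cutoff are illustrative assumptions, not the authors' code.

    import numpy as np

    def pearson_excluding_item(u, v, target):
        # Co-rated items of users u and v, minus the item being predicted
        # (the test-item bias fix: the target never feeds its own prediction)
        mask = ~np.isnan(u) & ~np.isnan(v)
        mask[target] = False
        uu, vv = u[mask], v[mask]
        n = uu.size
        if n < 2:
            return 0.0, n
        du, dv = uu - uu.mean(), vv - vv.mean()
        denom = np.sqrt((du ** 2).sum() * (dv ** 2).sum())
        return (0.0 if denom == 0 else float((du * dv).sum() / denom)), n

    def predict(R, user, item, k=20, gamma=50):
        # R: users x items rating matrix, np.nan marks unrated entries
        neighbors = []
        for other in range(R.shape[0]):
            if other == user or np.isnan(R[other, item]):
                continue
            s, n = pearson_excluding_item(R[user], R[other], item)
            s *= min(n, gamma) / gamma  # significance weighting: shrink thin-evidence similarities
            neighbors.append((s, R[other, item]))
        neighbors.sort(key=lambda t: abs(t[0]), reverse=True)
        top = neighbors[:k]
        den = sum(abs(s) for s, _ in top)
        if den == 0:
            return float(np.nanmean(R[user]))  # fall back to the user's own mean rating
        return sum(s * r for s, r in top) / den

Because the focal user's mean is recomputed per prediction without the target item, the statistical parameters stay dynamic rather than static, matching the runtime behavior the abstract argues for.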


2021 · Vol 5 (1) · pp. 16
Author(s): A. Alaba Adediwura, Asowo A. Patricia

This study examined the nature of item bias in students' performance on the dichotomously scored items of the 2017 National Examinations Council (NECO) senior school certificate mathematics examination in Nigeria. The study adopted an ex-post-facto research design. A sample of 256,039 candidates was randomly selected from the population of 1,034,629 students who took the test. The instrument for data collection was 'Student Results' (SR). Data were analysed using the R language environment and independent t-tests. Results showed that the 2017 NECO mathematics test was essentially unidimensional (DETECT = -0.28 (< .20), ASSI = -0.31 (< 0.25), and RATIO = -0.31 (< 0.36)). Results also showed that the bias encountered was a mean difference in scores: 86% (52 items), 79.1% (34 items), and 96% (56 items) of items were biased against male students, urban students, and public-school students, respectively. It was concluded that item bias is a notable factor affecting the validity of the 2017 NECO mathematics test and the conclusions drawn from its scores in Nigeria. It was therefore recommended that, before tests are administered for public use, examination bodies carefully review them through dimensionality assessment at the development stage to eliminate any sources of test inequity among examinees.
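The abstract reports mean-difference bias detected with independent t-tests. A minimal sketch of that per-item screen follows, in Python with SciPy rather than the R environment the authors used; the function name and alpha level are illustrative.

    import numpy as np
    from scipy import stats

    def flag_mean_difference_bias(group_a, group_b, alpha=0.05):
        # group_a, group_b: candidates x items arrays of dichotomous (0/1) item scores
        flagged = []
        for j in range(group_a.shape[1]):
            # Welch's t-test on the item-score means of the two groups
            t, p = stats.ttest_ind(group_a[:, j], group_b[:, j], equal_var=False)
            if p < alpha:
                flagged.append((j, float(t), float(p)))
        return flagged  # items whose mean scores differ significantly between groups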


2021
Author(s): Katharina Groskurth, Matthias Bluemke, Clemens M. Lechner, Tenko Raykov

When scalar invariance does not hold, which is often the case in applied settings, the bias introduced by non-invariance may or may not be consequential for observed mean comparisons. So far, only a few attempts have been made to quantify the extent of bias due to measurement non-invariance. Building on Millsap and Olivera-Aguilar (2012), we derived a new effect size measure, the Measurement Invariance Violation Index (MIVI), from first principles. MIVI merely assumes partial scalar invariance for a set of items forming a scale; it quantifies the intercept difference of one non-invariant item (at the item-score level) or of several non-invariant items (at the scale-score level) as a share (i.e., proportion) of the total observed scale-score difference between groups. By using directional rather than absolute terms, one can inspect cancellation effects of item bias at the scale-score level. We provide computational code and exemplify MIVI in simulated contexts.
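The abstract defines MIVI only verbally; one plausible formalization consistent with that description (the notation here is ours, not necessarily the authors') is, for a non-invariant item j with intercepts \nu_j^{(g)} in groups g = 1, 2 and observed mean scale scores \bar{X}^{(g)}:

    \mathrm{MIVI}_j = \frac{\nu_j^{(2)} - \nu_j^{(1)}}{\bar{X}^{(2)} - \bar{X}^{(1)}}

Summing the numerator over all non-invariant items would yield the scale-score version, and keeping the signs of the intercept differences (directional rather than absolute terms) makes cancellation of opposing item biases visible.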


2021
Author(s): John Marc Goodrich, Natalie Koziol, HyeonJin Yoon

When measuring academic skills among students whose primary language is not English, standardized assessments are often provided in languages other than English (Tabaku, Carbuccia-Abbott, & Saavedra, 2018). The degree to which alternate-language test items function equivalently must be evaluated, but traditional methods of investigating measurement equivalence may be confounded by group differences on characteristics other than ability level and language form. The primary purposes of this study were to investigate differential item functioning (DIF) and item bias across the Spanish and English forms of an assessment of early mathematics skills. Secondary purposes were to investigate the presence of selection bias and to demonstrate a novel approach to investigating DIF that uses a regression discontinuity design framework to control for selection bias. Data were drawn from 1,750 Spanish-speaking kindergartners participating in the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99, who were administered either the Spanish or the English version of the mathematics assessment based on their performance on an English language screening measure. Results indicated that a minority of items functioned differently across the Spanish and English forms, and subsequent scrutiny of item content yielded no plausible evidence of item bias. Evidence of selection bias (differences between groups in SES, age, and country of birth, in addition to mathematics ability and form language) highlighted the limitations of a traditional approach to investigating DIF that controls only for ability. Fewer items exhibited DIF when controlling for selection bias (11% vs. 25%), and the type and direction of DIF differed once selection bias was controlled.
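As a rough illustration of the study's logic (not the authors' actual models): in a logistic-regression DIF framework, controlling for selection bias amounts to adding the assignment variable, the English screener score that determined which language form a child received, to the usual ability-plus-group model. Variable names in the Python sketch below are hypothetical.

    import numpy as np
    import statsmodels.api as sm

    def dif_group_effects(response, ability, form, screener):
        # Traditional DIF: item response modeled from ability and language form only
        X_trad = sm.add_constant(np.column_stack([ability, form]))
        trad = sm.Logit(response, X_trad).fit(disp=0)
        # RDD-informed DIF: also condition on the screener score (the running
        # variable that assigned children to the Spanish or English form)
        X_rdd = sm.add_constant(np.column_stack([ability, form, screener]))
        rdd = sm.Logit(response, X_rdd).fit(disp=0)
        # Compare the language-form coefficient under the two specifications
        return trad.params[2], rdd.params[2]

A form coefficient that shrinks or changes sign once the screener is conditioned on would mirror the paper's finding that apparent DIF partly reflects selection rather than item bias.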


2020
Author(s): Ellyn Butler, Tyler M. Moore, Monica E. Calkins, Kosha Ruparel, Ruben C. Gur, ...

Background: Depression and anxiety are common in adolescence, but adolescents' reports of their symptoms are often at odds with their parents' reports. The extent to which this reporting difference is a function of gender and context (i.e., clinical or community) has yet to be established. Since discrepancies have been found to predict poor long-term health outcomes, it is critical to determine the extent of the problem in a community sample and to measure the disagreement itself with the utmost precision. Methods: Lifetime symptoms of depression, generalized anxiety, and social anxiety were modeled with a bifactor structure, and various quality assessments were performed. Adolescent and caregiver reports of the adolescents' symptoms came from the GOASSESS on the Philadelphia Neurodevelopmental Cohort (4,812 adolescent-caregiver pairs; adolescents aged 11-17 years; 52.2% female; 57.1% White, 31.7% Black). Results: Controlling for lifetime internalizing severity, females still endorsed crying more frequently than males, which indicates item bias (χ² = 134.13, Bonferroni-corrected p < .05). The caregivers' reports for the adolescents evidenced the same pattern, but to a lesser extent (χ² = 29.49, Bonferroni-corrected p < .05). Caregivers tended to underestimate the severity of adolescents' internalizing symptoms, and when they disagreed with adolescents on a specific symptom, it was often twice as likely that the adolescent endorsed the symptom and the caregiver denied it than the reverse. This effect was markedly more pronounced for female than for male adolescents. Conclusions: Researchers should build on this work by exploring potential sources of the gender differences in disagreement and by using measures that assess recent symptomatology with Likert scales.
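The reported χ² values come from tests of whether gender still predicts endorsement of a symptom once overall internalizing severity is held constant. A minimal likelihood-ratio sketch of that kind of test follows (illustrative Python, not the authors' bifactor-model machinery).

    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    def item_bias_lrt(endorsed, severity, female):
        # Null model: symptom endorsement depends only on internalizing severity
        null = sm.Logit(endorsed, sm.add_constant(severity)).fit(disp=0)
        # Alternative: gender shifts endorsement at equal severity, i.e., item bias
        X = sm.add_constant(np.column_stack([severity, female]))
        alt = sm.Logit(endorsed, X).fit(disp=0)
        chi2 = 2 * (alt.llf - null.llf)  # likelihood-ratio statistic, 1 df
        return chi2, stats.chi2.sf(chi2, df=1)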


2020 · Vol 32 (6) · pp. 553-567
Author(s): Kelsie T. Forbush, Q. Chelsea Song, Louis Tay, Sara R. Gould, Danielle A. N. Chapa, ...


Author(s): Betül Alatlı

This study aimed to investigate the cross-cultural measurement invariance of the PISA 2015 (Programme for International Student Assessment) science literacy test and its items, and to carry out a bias study on the items that violate measurement invariance. The study used a descriptive review model. The sample consisted of 2,224 students from Australia, France, Singapore, and Turkey who took the S12 test booklet. Measurement invariance analyses for the test were conducted using Multi-Group Confirmatory Factor Analysis (MGCFA). Differential Item Functioning (DIF), in other words measurement invariance of the test items, was analyzed using the item response theory log-likelihood ratio (IRTLR), Hierarchical Generalized Linear Model (HGLM), and Simultaneous Item Bias Test (SIBTEST) methods.

According to the findings, the test exhibited structural invariance across cultures. The highest proportion of items showing DIF, 35%, was observed in the Australia-Singapore and Australia-France comparisons. The proportion of DIF items in pairwise comparisons involving Turkey, the only country taking a translated form, was 24% and did not differ significantly from the other comparisons. The lowest proportion of DIF items, 12%, was obtained for the Singapore-France comparison, while 18% of items showed DIF in the France-Turkey comparison. Overall, 35% of the items showed cross-cultural measurement invariance. An item bias study, based on expert opinions, was then carried out on the released items identified as showing DIF in the comparisons of Turkey with Australia and Singapore.

According to the expert findings, the bias in the skills measured by these items was attributed to translation-induced differences in the items, one culture group's greater familiarity with item content, polysemy in the expressions or words used in the items, and the format or stylistic characteristics of the items.
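SIBTEST and IRTLR implementations live mainly in R; as an illustration of the same score-matched DIF logic in Python, the following sketch applies the closely related Mantel-Haenszel procedure, stratifying examinees from two countries by total score (variable and function names are illustrative, and this is a stand-in for, not an implementation of, SIBTEST).

    import numpy as np
    from statsmodels.stats.contingency_tables import StratifiedTable

    def mantel_haenszel_dif(correct, country, total_score):
        # correct: 0/1 item scores; country: 0/1 group labels; total_score: matching criterion
        tables = []
        for s in np.unique(total_score):
            sel = total_score == s
            c, g = correct[sel], country[sel]
            t = np.array([[((g == 0) & (c == 1)).sum(), ((g == 0) & (c == 0)).sum()],
                          [((g == 1) & (c == 1)).sum(), ((g == 1) & (c == 0)).sum()]])
            if t.sum(axis=1).min() > 0:  # skip strata where a group is absent
                tables.append(t)
        st = StratifiedTable(tables)
        # A pooled odds ratio near 1 and a non-significant test suggest no DIF
        return st.oddsratio_pooled, st.test_null_odds().pvalue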

