Test Fairness
Recently Published Documents

TOTAL DOCUMENTS: 51 (five years: 12)
H-INDEX: 10 (five years: 1)

2021 ◽  
Vol 1 (3) ◽  
pp. 158-166
Author(s):  
Abdul-Wahab Ibrahim ◽  
Abdullahi Iliyasu

The study examined the perceived conduct of e-assessment of undergraduate courses in Nigerian universities and compared access to e-assessment among undergraduate students in the country's universities. It also determined the relationship between e-assessment-based accountability and test fairness in the conduct of e-assessment in Nigerian universities, with a view to improving assessment outcomes and certifying the quality of education provided by Nigerian universities. The study adopted a mixed-methods approach combining descriptive survey and focus group designs. The population consisted of all undergraduate students registered for degree courses at government and privately owned universities during the 2019/2020 academic session. The sample consisted of intact classes of 450 Part 2, 3, and 4 undergraduate students registered for degree courses. A 32-item self-developed instrument was used, and qualitative data were collected via focus groups. Data were analysed using the independent t-test and the Pearson Product Moment Correlation. The results showed a significant difference in students' perception of the conduct of e-assessment in Nigerian universities, as well as a significant difference in access to e-assessment among undergraduate students in those universities. Further, a significant relationship existed between e-assessment-based accountability and test fairness in the conduct of e-assessment. The study concluded that improper conduct of e-assessment poses a major threat to the fairness and validity of online assessment of students. It was recommended that university management employ equitable strategies in the design, development, and administration of e-assessment on campuses across the country.
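
As a rough illustration of the two analyses this abstract names, the R sketch below runs an independent t-test and a Pearson correlation on simulated placeholder data. All column names (perception, university_type, accountability, fairness) are hypothetical, not the study's actual variables.

```r
# Simulated placeholder data standing in for the 450-student sample
set.seed(42)
df <- data.frame(
  perception      = rnorm(450, mean = 3.5, sd = 0.6),
  university_type = sample(c("government", "private"), 450, replace = TRUE),
  accountability  = rnorm(450, mean = 3.0, sd = 0.7),
  fairness        = rnorm(450, mean = 3.2, sd = 0.7)
)

# Independent t-test: difference in perceived conduct between university types
t.test(perception ~ university_type, data = df)

# Pearson Product Moment Correlation: accountability vs. test fairness
cor.test(df$accountability, df$fairness, method = "pearson")
```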


2021 ◽  
Vol 12 ◽  
Author(s):  
Linyu Liao ◽  
Don Yao

Differential Item Functioning (DIF) analysis is an indispensable methodology for detecting item and test bias in language testing. This study investigated grade-related DIF in the General English Proficiency Test-Kids (GEPT-Kids) listening section. Quantitative data were test scores collected from 791 test takers (Grade 5 = 398; Grade 6 = 393) in eight Chinese-speaking cities, and qualitative data were expert judgments collected from two primary school English teachers in Guangdong province. Two R packages, "difR" and "difNLR", were used to perform five types of DIF analysis (two-parameter item response theory [2PL IRT] based Lord's chi-square and Raju's area tests, Mantel-Haenszel [MH], logistic regression [LR], and nonlinear regression [NLR] methods) on the test scores, which together identified 16 DIF items. The ShinyItemAnalysis package was used to draw item characteristic curves (ICCs) for the 16 items in RStudio, which revealed four different types of DIF effect. In addition, the two experts identified reasons or sources for the DIF effect of four items. The study may therefore shed some light on the sustainable development of test fairness in language testing: methodologically, a mixed-methods sequential explanatory design was adopted, which can guide further test fairness research using flexible methods; practically, the results indicate that DIF does not necessarily imply bias. Rather, a DIF flag serves as an alarm that calls test developers' attention to further examine the appropriateness of test items.
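
The abstract names the difR package's procedures explicitly; a minimal sketch of the MH and LR screening steps might look like the following, with placeholder response data (the real study additionally ran Lord's chi-square, Raju's area test, and difNLR's nonlinear-regression method).

```r
# Grade-related DIF screening with difR, as described in the abstract.
# 'resp' is a placeholder 0/1 item-response matrix; 'grade' gives the
# group labels ("5" = focal group, "6" = reference group).
library(difR)

set.seed(1)
resp  <- matrix(rbinom(791 * 20, 1, 0.6), nrow = 791)  # placeholder data
grade <- c(rep("5", 398), rep("6", 393))

# Mantel-Haenszel DIF test
mh <- difMH(Data = resp, group = grade, focal.name = "5")

# Logistic-regression DIF test (both uniform and non-uniform effects)
lr <- difLogistic(Data = resp, group = grade, focal.name = "5", type = "both")

print(mh)
print(lr)
```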


2021 ◽  
Vol 20 (1) ◽  
pp. 55-62
Author(s):  
Anthony Pius Effiom

This study used an Item Response Theory (IRT) approach to assess Differential Item Functioning (DIF) and detect item bias in a Mathematics Achievement Test (MAT). The MAT was administered to 1,751 SS2 students in public secondary schools in Cross River State. An instrumentation research design was used to develop and validate the 50-item instrument. Data were analysed using the maximum likelihood estimation technique of the BILOG-MG V3 software. The results revealed that 6% of the items exhibited differential item functioning between male and female students, indicating sex bias on some of the test items in the MAT. DIF analysis, which seeks to eliminate irrelevant factors and sources of bias of any kind so that a test yields valid results, is among the best methods currently available for this purpose. Test developers and policymakers are therefore advised to exercise care in fair test practice by dedicating effort to unbiased test development and decision making. Examination bodies should adopt Item Response Theory in educational testing, and test developers should be mindful of test items that can cause biased response patterns between male and female students or any other subgroup of interest.
Keywords: Assessment, Differential Item Functioning, Validity, Reliability, Test Fairness, Item Bias, Item Response Theory.
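
The study itself used BILOG-MG V3; as an illustrative R stand-in (not the authors' workflow), difR's implementation of Lord's chi-square test fits a 2PL model per group and flags items whose parameters differ by gender. The data below are simulated placeholders, with fewer items than the real 50-item MAT to keep the sketch light.

```r
# IRT-based (Lord's chi-square) DIF check between male and female groups.
library(difR)

set.seed(2)
resp <- matrix(rbinom(1751 * 20, 1, 0.55), nrow = 1751)  # 20 placeholder items
sex  <- sample(c("male", "female"), 1751, replace = TRUE)

lord <- difLord(Data = resp, group = sex, focal.name = "female", model = "2PL")
print(lord)  # flagged items are candidates for expert review, not automatic removal
```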


2021 ◽  
Author(s):  
Johanna Hartung ◽  
Florian Schmitz ◽  
Oliver Wilhelm

Objective: We investigated the predictive power of bachelor grades and a subject-specific admission test used to select students applying to a German master's program in psychology. Methods: Analyses are based on data from 2,264 university applicants across five cohorts. Results: For external applicants, bachelor grades were not significantly correlated with master grades or with test scores, whereas both relationships were significant for internal applicants. In contrast, the correlation between test scores and master grades was comparable across groups, which can be seen as an indicator of test fairness. Regression analysis showed that the admission test was a valid predictor of master grades. Furthermore, both predictors, bachelor grades and test scores, were incrementally valid. Discussion: In sum, this study illustrates the benefits of using a standardized test for master student admission. We discuss issues of using coherent admission criteria across institutions.
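
The incremental-validity claim can be illustrated with a hierarchical regression sketch in R: compare a model predicting master grades from bachelor grades alone against one that adds the test score. All variables below are simulated placeholders, not the study's data.

```r
# Incremental validity: does the admission test add predictive power
# over and above bachelor grades?
set.seed(3)
n        <- 500
bachelor <- rnorm(n)
test     <- 0.3 * bachelor + rnorm(n)            # test correlates with grades
master   <- 0.2 * bachelor + 0.4 * test + rnorm(n)

m1 <- lm(master ~ bachelor)         # baseline predictor only
m2 <- lm(master ~ bachelor + test)  # add the admission test

anova(m1, m2)                       # significant F => incremental validity
summary(m2)$r.squared - summary(m1)$r.squared  # R-squared gained by the test
```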


2021 ◽  
pp. 001316442110120
Author(s):  
Xiaowen Liu ◽  
H. Jane Rogers

Test fairness is critical to the validity of group comparisons involving gender, ethnicity, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting them, ignoring the DIF, multiple-group modeling, and modeling DIF as a secondary dimension. The results indicate which treatment is appropriate for DIF items across a wide range of testing environments.
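
One of the four treatments, multiple-group modeling, can be sketched with the mirt package (an assumption; the abstract does not name the authors' software): fit a 2PL jointly in both groups, constrain all non-DIF items equal across groups, and free the parameters of the flagged item. Item and group names below are hypothetical placeholders.

```r
# Multiple-group IRT modeling: Item3 is treated as the flagged DIF item,
# so its parameters are left free while the other items anchor the scale.
library(mirt)

set.seed(4)
resp <- matrix(rbinom(1000 * 10, 1, 0.5), nrow = 1000)  # placeholder data
colnames(resp) <- paste0("Item", 1:10)
group <- rep(c("ref", "focal"), each = 500)

mg <- multipleGroup(
  data  = resp,
  model = 1,                                 # one latent dimension
  group = group,
  invariance = c(colnames(resp)[-3],         # all items equal except Item3
                 "free_means", "free_var")   # identify focal-group distribution
)
coef(mg, simplify = TRUE)
```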


Author(s):  
Julie Levacher ◽  
Marco Koch ◽  
Johanna Hissbach ◽  
Frank M. Spinath ◽  
Nicolas Becker

Abstract. Due to their high item difficulties and excellent psychometric properties, construction-based figural matrices tasks are of particular interest for high-stakes testing. An important prerequisite is that test preparation, which is likely to occur in this context, does not impair test fairness or item properties. The goal of this study was to provide initial evidence concerning the influence of test preparation. We administered test items to a sample of N = 882 participants divided into two groups, only one of which was given information about the rules employed in the test items. The probability of solving the items was significantly higher in the test preparation group than in the control group (M = 0.61, SD = 0.19 vs. M = 0.41, SD = 0.25; t(54) = 3.42, p = .001; d = 0.92). Nevertheless, a multigroup confirmatory factor analysis, as well as a differential item functioning analysis, indicated no differences in item properties between the two groups. The results suggest that construction-based figural matrices are suitable for high-stakes testing when all participants are provided with test preparation material so that test fairness is ensured.
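
The multigroup CFA check described here can be sketched with lavaan (a stand-in; the abstract does not name the software): fit a one-factor model in the prepared and control groups, then test increasingly strict equality constraints on the item parameters. Data and variable names below are simulated placeholders.

```r
# Multigroup CFA with measurement-invariance tests across preparation groups.
library(lavaan)

set.seed(5)
n <- 882
g <- rnorm(n)  # common ability factor for the simulated items
items <- as.data.frame(sapply(1:6, function(i) 0.7 * g + rnorm(n)))
names(items) <- paste0("x", 1:6)
items$prep <- rep(c("prepared", "control"), length.out = n)

model <- 'f =~ x1 + x2 + x3 + x4 + x5 + x6'

fit_config <- cfa(model, data = items, group = "prep")
fit_metric <- cfa(model, data = items, group = "prep",
                  group.equal = "loadings")
fit_scalar <- cfa(model, data = items, group = "prep",
                  group.equal = c("loadings", "intercepts"))

# Non-significant chi-square differences support equal item properties
anova(fit_config, fit_metric, fit_scalar)
```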


2020 ◽  
Vol 24 (2) ◽  
Author(s):  
Ari Arifin Danuwijaya ◽  
Adiyo Roebianto

Test fairness is an aspect that needs to be considered when developing a test instrument. The instrument should not be biased against any test takers, which requires ensuring that items do not behave differently for male and female test takers. This study examines the extent to which items in an English proficiency test function differently across gender. Fifty reading items were examined and analysed using a statistical method for detecting DIF: each item was tested for gender DIF using Rasch model analysis in ConQuest. The results showed that six items exhibited DIF, three of which were basic comprehension items and three of which were vocabulary items. Possible ways of dealing with DIF items are also discussed.
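
The study ran its Rasch analysis in ConQuest; as a rough R stand-in, the eRm package fits a Rasch model and its Waldtest performs per-item comparisons between groups defined by a split criterion such as gender. Data below are simulated placeholders.

```r
# Rasch-based gender DIF check: fit a joint Rasch model, then test each
# item's difficulty estimate between the male and female subgroups.
library(eRm)

set.seed(6)
resp   <- matrix(rbinom(600 * 15, 1, 0.5), nrow = 600)  # placeholder data
gender <- sample(0:1, 600, replace = TRUE)              # 0 = male, 1 = female

rm_fit <- RM(resp)                  # joint Rasch calibration
Waldtest(rm_fit, splitcr = gender)  # per-item z-tests for gender DIF
```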

