The Optimal Setting of A/B Exam Papers without Item Pools: A Hybrid Approach of IRT and BGP

Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1290
Author(s):  
Zheng-Yun Zhuang ◽  
Chi-Kit Ho ◽  
Paul Juinn Bing Tan ◽  
Jia-Ming Ying ◽  
Jin-Hua Chen

The administration of A/B exams usually involves the use of items. Issues arise when the pre-establishment of a question bank is necessary and the inconsistency in the knowledge points to be tested (in the two exams) reduces the exams’ ‘fairness’. These issues are critical for a large multi-teacher course wherein the teachers are changed such that the course and examination content are altered every few years. However, a fair test with randomly participating students should still be guaranteed even with no item pool. Through data-driven decision-making, this study collected data related to a term test for a compulsory general course for empirical assessments, pre-processed the data and used item response theory to statistically estimate the difficulty, discrimination and lower asymptote for each item in the two exam papers. Binary goal programming was finally used to analyze and balance the fairness of A/B exams without an item pool. As a result, pairs of associated questions in the two exam papers were optimized in terms of their overall balance in three dimensions (as the goals) through the paired exchanges of items. These exam papers guarantee their consistency (in the tested knowledge points) and also ensure the fairness of the term test (a key psychological factor that motivates continued studies). Such an application is novel as the teacher(s) did not have a pre-set question bank and could formulate the fairest strategy for the A/B exam papers. The model can be employed to address similar teaching practice issues.
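
The idea of balancing two papers through paired exchanges can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not give the goal-programming formulation, so the code below swaps in a brute-force search over binary swap decisions, weights the three dimensions equally, and uses hypothetical names (`p_correct_3pl`, `best_swaps`) and made-up item parameters. The three-parameter logistic function is the standard IRT form with discrimination `a`, difficulty `b` and lower asymptote `c`.

```python
import math
from itertools import product


def p_correct_3pl(theta, a, b, c):
    """Standard 3PL model: probability of a correct answer at ability theta.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def best_swaps(paper_a, paper_b):
    """Find paired exchanges that balance two papers (illustrative only).

    paper_a[i] and paper_b[i] are (a, b, c) tuples for the i-th pair of
    associated items. Each pair gets a binary decision: 0 = keep, 1 = swap.
    The objective is an assumption: the sum of absolute differences between
    the papers' totals in the three dimensions, equally weighted.
    Exhaustive search is only practical for a small number of pairs.
    """
    best = None
    for flags in product((0, 1), repeat=len(paper_a)):
        totals_a = [0.0, 0.0, 0.0]
        totals_b = [0.0, 0.0, 0.0]
        for swap, item_a, item_b in zip(flags, paper_a, paper_b):
            if swap:
                item_a, item_b = item_b, item_a
            for k in range(3):
                totals_a[k] += item_a[k]
                totals_b[k] += item_b[k]
        deviation = sum(abs(x - y) for x, y in zip(totals_a, totals_b))
        if best is None or deviation < best[1]:
            best = (flags, deviation)
    return best


# Made-up parameters: paper A starts out harder and more discriminating.
paper_a = [(2.0, 1.0, 0.2), (2.0, 1.0, 0.2)]
paper_b = [(1.0, 0.0, 0.2), (1.0, 0.0, 0.2)]
flags, deviation = best_swaps(paper_a, paper_b)
```

With these made-up values, exchanging exactly one of the two pairs equalizes the totals in all three dimensions. A real application would replace the enumeration with a binary programming solver and with deviation variables for each goal.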

1982 ◽  
Vol 6 (4) ◽  
pp. 473-492 ◽  
Author(s):  
David J. Weiss

Approaches to adaptive (tailored) testing based on item response theory are described and research results summarized. Through appropriate combinations of item pool design and use of different test termination criteria, adaptive tests can be designed (1) to improve both measurement quality and measurement efficiency, resulting in measurements of equal precision at all trait levels; (2) to improve measurement efficiency for test batteries using item pools designed for conventional test administration; and (3) to improve the accuracy and efficiency of testing for classification (e.g., mastery testing). Research results show that tests based on item response theory (IRT) can achieve measurements of equal precision at all trait levels, given an adequately designed item pool; these results contrast with those of conventional tests which require a tradeoff of bandwidth for fidelity/precision of measurements. Data also show reductions in bias, inaccuracy, and root mean square error of ability estimates. Improvements in test fidelity observed in simulation studies are supported by live-testing data, which showed adaptive tests requiring half the number of items as that of conventional tests to achieve equal levels of reliability, and almost one-third the number to achieve equal levels of validity. When used with item pools from conventional tests, both simulation and live-testing results show reductions in test battery length from conventional tests, with no reductions in the quality of measurements. Adaptive tests designed for dichotomous classification also represent improvements over conventional tests designed for the same purpose. Simulation studies show reductions in test length and improvements in classification accuracy for adaptive vs. conventional tests; live-testing studies in which adaptive tests were compared with "optimal" conventional tests support these findings. 
Thus, the research data show that IRT-based adaptive testing takes advantage of the capabilities of IRT to improve the quality and/or efficiency of measurement for each examinee.
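
As a rough illustration of how item selection and a termination criterion interact in an adaptive test, the sketch below uses the two-parameter logistic model, maximum-information item selection and a standard-error stopping rule. These are common textbook choices and are assumptions here, not necessarily the procedures evaluated in the studies summarized above; the function names and the item pool are hypothetical.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_next_item(theta, pool, used):
    """Return the index of the unused item that is most informative at theta.

    pool is a list of (a, b) tuples; used is a set of indices already
    administered. Raises ValueError if every item has been used.
    """
    candidates = [i for i in range(len(pool)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))


def standard_error(theta, administered):
    """Approximate standard error of the ability estimate: 1 / sqrt(total information)."""
    total = sum(item_information(theta, a, b) for a, b in administered)
    return 1.0 / math.sqrt(total)


# Hypothetical pool of (discrimination, difficulty) pairs.
pool = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
next_index = select_next_item(1.5, pool, used=set())
```

Stopping each examinee's test once `standard_error` falls below a fixed target is one way to aim for equal precision at all trait levels, provided the pool contains informative items across the whole ability range.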


1995 ◽  
Vol 11 (2) ◽  
pp. 133-137 ◽  
Author(s):  
Juan Fernández ◽  
Miguel A. Mateo ◽  
José Muñiz

The conditions are investigated in which Spanish university teachers carry out their teaching and research functions. 655 teachers from the University of Oviedo took part in this study by completing the Academic Setting Evaluation Questionnaire (ASEQ). Of the three dimensions assessed in the ASEQ, Satisfaction received the lowest ratings, Social Climate was rated higher, and Relations with students was rated the highest. These results are similar to those found in two studies carried out in the academic years 1986/87 and 1989/90. Their relevance for higher education is twofold because these data can be used as a complement of those obtained by means of students' opinions, and the crossing of both types of data can facilitate decision making in order to improve the quality of the work (teaching and research) of the university institutions.


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 135770-135783
Author(s):  
Alka Agrawal ◽  
Abhishek Kumar Pandey ◽  
Abdullah Baz ◽  
Hosam Alhakami ◽  
Wajdi Alhakami ◽  
...  
