ITEM SELECTION IN ADAPTIVE TEST DESIGN BASED ON EFFICIENCY BALANCED INFORMATION (PEMILIHAN BUTIR SOAL PADA RANCANGAN TES ADAPTIF BERDASARKAN EFFICIENCY BALANCED INFORMATION)

One of the most popular item selection methods in adaptive test design is the Maximum Information method, which selects and administers the item that provides maximum information at a given ability level. The method has two weaknesses: it is less accurate in estimating examinees' ability at the beginning of the test, and it tends to select items with high discrimination parameters over items with low discrimination parameters, which creates problems in maintaining the item bank. An alternative was therefore sought. This study aimed to determine the estimation performance of the Efficiency Balanced Information (EBI) method in an adaptive test design. The research was carried out as a simulation study in the setting of the final semester examinations of Universitas Terbuka (UT, the Open University). The item bank for the simulation was generated under the three-parameter Item Response Theory model and contained 900 items with ideal item parameter specifications. Two item selection criteria were simulated, Maximum Information and Maximum EBI, and both were designed to satisfy content balancing. This ensures that the resulting algorithm fits the modular learning applied at UT, meaning that the items of each module are proportionally represented and match the test blueprint. The stopping rule used a standard error of estimation (SEE) of 0.3. The study concluded that the adaptive test algorithm with the EBI criterion estimated examinee ability more accurately than the Maximum Information criterion, as indicated by its smaller bias and smaller standard deviation of measurement. Another advantage of the Maximum EBI criterion is more optimal use of the item bank, because items with low discrimination are also selected, especially at the beginning of the test. The Maximum Information criterion, although more efficient in terms of test length, makes less optimal use of the item bank.
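
The Maximum Information criterion and the SEE stopping rule mentioned in the abstract can be sketched as follows. This is a minimal illustration using the standard 3PL item information formula, not the study's algorithm: the item parameters, the scaling constant D = 1.7, the fixed ability estimate, and all function names are assumptions, and neither the EBI criterion nor content balancing is implemented.

```python
import math

D = 1.7  # scaling constant; an assumption, some treatments use D = 1.0


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Fisher information of one 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_max_info(theta, bank, used):
    """Index of the unused item with the largest information at theta."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


def see(theta, bank, used):
    """Standard error of estimation from the administered items."""
    total = sum(item_information(theta, *bank[i]) for i in used)
    return 1.0 / math.sqrt(total)


# Hypothetical (a, b, c) triples, not the study's 900-item bank.
bank = [(0.6, 0.0, 0.2), (1.8, 0.1, 0.2), (1.2, -1.5, 0.2), (1.5, 1.0, 0.2)]
theta_hat = 0.0  # held fixed here; a real CAT re-estimates it after each response
used = []
while len(used) < len(bank):
    used.append(select_max_info(theta_hat, bank, used))
    if see(theta_hat, bank, used) <= 0.3:  # stopping rule from the abstract
        break
```

With these made-up parameters the first item picked is the one with the highest discrimination (a = 1.8), which illustrates the selection tendency the abstract describes. Four items are far too few to reach an SEE of 0.3, so the loop simply exhausts this toy bank.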

Applying Item Response Theory in Language Test Item Bank Building

10.3726/978-3-653-01167-8 ◽ 2008 ◽ Cited By ~ 3

Author(s): Gábor Szabó

Keyword(s): Item Response Theory ◽ Item Response ◽ Test Item ◽ Item Bank ◽ Language Test ◽ Response Theory

A Bifactor Multidimensional Item Response Theory Model for Differential Item Functioning Analysis on Testlet-Based Items

Applied Psychological Measurement ◽ 10.1177/0146621611428447 ◽ 2011 ◽ Vol 35 (8) ◽ pp. 604-622 ◽ Cited By ~ 16

Author(s): Hirotaka Fukuhara ◽ Akihito Kamata

Keyword(s): Item Response Theory ◽ Differential Item Functioning ◽ Item Response ◽ Estimation Method ◽ Multidimensional Item Response Theory ◽ Multidimensional Item Response ◽ Response Theory ◽ Data Set ◽ Detection Rates ◽ Item Functioning

A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into account, thus estimating DIF magnitude appropriately when a test is composed of testlets. A fully Bayesian estimation method was adopted for parameter estimation. The recovery of parameters was evaluated for the proposed DIF model. Simulation results revealed that the proposed bifactor MIRT DIF model produced better estimates of DIF magnitude and higher DIF detection rates than the traditional IRT DIF model for all simulation conditions. A real data analysis was also conducted by applying the proposed DIF model to a statewide reading assessment data set.

STANDARD ERROR OF AN EQUATING BY ITEM RESPONSE THEORY

ETS Research Report Series ◽ 10.1002/j.2333-8504.1981.tb01276.x ◽ 1981 ◽ Vol 1981 (2) ◽ pp. 463-471 ◽ Cited By ~ 2

Author(s): Frederic M. Lord

Keyword(s): Item Response Theory ◽ Item Response ◽ Standard Error ◽ Response Theory

Standard Error of an Equating by Item Response Theory

10.21236/ada108875 ◽ 1981 ◽ Cited By ~ 2

Author(s): Frederic M. Lord

Keyword(s): Item Response Theory ◽ Item Response ◽ Standard Error ◽ Response Theory

Using Item Response Theory to Evaluate Self-directed Learning Readiness Scale

Journal of Educational and Developmental Psychology ◽ 10.5539/jedp.v8n1p14 ◽ 2017 ◽ Vol 8 (1) ◽ pp. 14

Author(s): Rana Th. Momani

Keyword(s): Item Response Theory ◽ Item Response ◽ Response Theory ◽ Self Directed Learning ◽ Learning Readiness ◽ Scale Levels ◽ Directed Learning ◽ Model Misfit ◽ Discrimination Parameter ◽ Item Scale

Item Response Theory has become one of the most popular methods for instrument development and evaluation. This baseline study used the graded response model (GRM) to evaluate a 40-item self-directed learning readiness (SDLR) scale at the item and scale levels, with data from 648 randomly selected female undergraduate psychology students attending Qassim University in Saudi Arabia. The results provide more detailed diagnostic information for modifying the scale. The GRM analysis detected two locally dependent items, one item with a low discrimination parameter, and 15 misfitting items. The scale tends to measure low and moderate levels of SDLR. Further psychometric evaluation is needed, and the SDLR scale should be reviewed on the basis of quantitative and qualitative analysis.

Reevaluation of the Amsterdam Inventory for Auditory Disability and Handicap Using Item Response Theory

Journal of Speech Language and Hearing Research ◽ 10.1044/2015_jslhr-h-15-0156 ◽ 2016 ◽ Vol 59 (2) ◽ pp. 373-383 ◽ Cited By ~ 14

Author(s): J. Mirjam Boeschen Hospers ◽ Niels Smits ◽ Cas Smits ◽ Mariska Stam ◽ Caroline B. Terwee ◽ ...

Keyword(s): Item Response Theory ◽ Item Response ◽ Standard Error ◽ Item Information ◽ Response Model ◽ Graded Response Model ◽ Response Theory ◽ Hearing Disability ◽ Hearing Ability ◽ Graded Response

Purpose: We reevaluated the psychometric properties of the Amsterdam Inventory for Auditory Disability and Handicap (AIADH; Kramer, Kapteyn, Festen, & Tobi, 1995) using item response theory. Item response theory describes item functioning along an ability continuum. Method: Cross-sectional data from 2,352 adults with and without hearing impairment, ages 18–70 years, were analyzed. They completed the AIADH in the web-based prospective cohort study “Netherlands Longitudinal Study on Hearing.” A graded response model was fitted to the AIADH data. Category response curves, item information curves, and the standard error as a function of self-reported hearing ability were plotted. Results: The graded response model showed a good fit. Item information curves were most reliable for adults who reported having hearing disability and less reliable for adults with normal hearing. The standard error plot showed that self-reported hearing ability is most reliably measured for adults reporting mild up to moderate hearing disability. Conclusions: This is one of the few item response theory studies on audiological self-reports. All AIADH items could be hierarchically placed on the self-reported hearing ability continuum, meaning they measure the same construct. This provides a promising basis for developing a clinically useful computerized adaptive test, where item selection adapts to the hearing ability of individuals, resulting in efficient assessment of hearing disability.
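
The category response curves mentioned in the abstract come from the graded response model's category probabilities. The sketch below shows the standard logistic form of that computation for a single item; the discrimination, thresholds, and ability value are hypothetical and are not AIADH estimates, and the function name is an assumption.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one graded response model item.

    thresholds must be in ascending order; K thresholds give K + 1 categories.
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1.0 (lowest category or higher) and 0.0 (above the highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


# Hypothetical item with four response categories.
probs = grm_category_probs(theta=0.5, a=1.4, thresholds=[-1.0, 0.0, 1.2])
```

Evaluating the function over a grid of ability values and plotting each category's probability gives that item's category response curves.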

PROMIS Pediatric Pain Interference Scale: An Item Response Theory Analysis of the Pediatric Pain Item Bank

Journal of Pain ◽ 10.1016/j.jpain.2010.02.005 ◽ 2010 ◽ Vol 11 (11) ◽ pp. 1109-1119 ◽ Cited By ~ 151

Author(s): James W. Varni ◽ Brian D. Stucky ◽ David Thissen ◽ Esi Morgan DeWitt ◽ Debra E. Irwin ◽ ...

Keyword(s): Item Response Theory ◽ Item Response ◽ Item Bank ◽ Pain Interference ◽ Pediatric Pain ◽ Response Theory ◽ Theory Analysis ◽ Item Response Theory Analysis

A Comparison of Item Parameter Standard Error Estimation Procedures for Unidimensional and Multidimensional Item Response Theory Modeling

Educational and Psychological Measurement ◽ 10.1177/0013164413500277 ◽ 2013 ◽ Vol 74 (1) ◽ pp. 58-76 ◽ Cited By ~ 19

Author(s): Insu Paek ◽ Li Cai

Keyword(s): Item Response Theory ◽ Item Response ◽ Error Estimation ◽ Standard Error ◽ Multidimensional Item Response Theory ◽ Item Parameter ◽ Multidimensional Item Response ◽ Response Theory ◽ Estimation Procedures ◽ Standard Error Estimation

PSYCHOMETRIC EVALUATION OF FLOOD DISASTER MANAGEMENT QUESTIONNAIRE: CONFIRMATORY FACTOR ANALYSIS AND ITEM RESPONSE THEORY ANALYSIS

Malaysian Journal of Public Health Medicine ◽ 10.37268/mjphm/vol.21/no.2/art.859 ◽ 2021 ◽ Vol 21 (2) ◽ pp. 133-140

Author(s): Mohamad Masykurin Mafauzy ◽ Tuan Hairulnizam Tuan Kamauzaman ◽ Wan Nor Arifin ◽ Hadi Fadhil Mat Said ◽ Fatimah Ismail ◽ ...

Keyword(s): Factor Analysis ◽ Item Response Theory ◽ Confirmatory Factor Analysis ◽ Item Response ◽ Estimation Method ◽ Flood Disaster ◽ Model Fit ◽ Response Theory ◽ Attitude And Practice ◽ Confirmatory Factor

Flood disaster is the commonest natural disaster, with a huge impact on healthcare services in Malaysia. The FloodDMQ-BM© questionnaire was developed as a tool to assess the knowledge, attitude, and practice of healthcare providers regarding patient management during a flood disaster. We aimed to further validate the FloodDMQ-BM© questionnaire using Confirmatory Factor Analysis (CFA) and Item Response Theory (IRT). This cross-sectional study involved doctors, nurses, and paramedics working in the Emergency Department of Hospital Universiti Sains Malaysia, Hospital Raja Perempuan Zainab II, and Hospital Kuala Krai. Respondents were required to complete the FloodDMQ-BM© questionnaire. The responses were analysed using CFA and IRT to establish the questionnaire's validity and reliability. A total of 215 respondents participated in this study. CFA of the attitude and practice components, with Maximum Likelihood Robust as the estimation method, resulted in good factor loadings (>0.5) for nearly all items and excellent model fit indices (CFI = 0.96-0.98, TLI = 0.95-0.96, SRMR = 0.04-0.05, RMSEA = 0.07). Meanwhile, IRT analysis of the knowledge section showed a good two-way marginal fit based on S-X², and a good model fit with an RMSEA of 0.08. In the IRT assessment of the knowledge section under the 2PL model, one item (K3) was removed (chi-squared residual >4), which improved model fit. The retained items had good standardized loadings (>0.3) and a marginal reliability of 0.651. Our results confirmed that the FloodDMQ-BM© questionnaire displayed valid and reliable psychometric properties.

The accuracy of the cheating detection methods in large-scale tests: mathematics national examination

Jurnal Penelitian dan Evaluasi Pendidikan ◽ 10.21831/pep.v22i2.14930 ◽ 2018 ◽ Vol 22 (2) ◽ pp. 130-142

Author(s): Thomas Mbenu Nulangi ◽ Djemari Mardapi

Keyword(s): Item Response Theory ◽ Item Response ◽ Large Scale ◽ Accurate Method ◽ School Level ◽ Detection Methods ◽ Response Theory ◽ Index Method ◽ Maximum Information ◽ National Examination

This study aimed to describe (1) the characteristics of items based on Item Response Theory, (2) the level of cheating in the national examination as detected by Angoff's B-Index method, the Pair 1 method, the Pair 2 method, the Modified Error Similarity Analysis (MESA) method, and the G2 method, and (3) the most accurate method for detecting cheating in the mathematics national examination at the senior secondary school level in the 2015/2016 academic year in East Nusa Tenggara Province. The item response theory analysis showed that 17 (42.5%) items of the mathematics national examination fit the 3-PL model, with a maximum of the information function of 58.0128 at an ability level of 1.6 and a measurement error of 0.1313. Angoff's B-Index method detected 63 cheating pairs, the Pair 1 method 52 pairs, the Pair 2 method 141 pairs, the MESA method 67 pairs, and the G2 method 183 pairs. Ranked by the number of cheating pairs detected, the methods were, in that order, the G2 method, the Pair 2 method, the MESA method, Angoff's B-Index method, and the Pair 1 method. Ranked by accuracy based on the computed standard error, the methods were, in that order, Angoff's B-Index method, the G2 method, the MESA method, the Pair 1 method, and the Pair 2 method.
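
The two figures reported for the test information function are consistent with the standard IRT relation in which the standard error of measurement is the reciprocal square root of the information. The check below assumes the abstract's measurement error was obtained this way, which the abstract itself does not state.

```python
import math

max_information = 58.0128  # maximum of the information function, from the abstract
sem = 1.0 / math.sqrt(max_information)  # standard error at that ability level
print(round(sem, 4))  # 0.1313, matching the reported measurement error
```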
