Varying the Valuating Function and the Presentable Bank in Computerized Adaptive Testing

2011 ◽  
Vol 14 (1) ◽  
pp. 500-508 ◽  
Author(s):  
Juan Ramón Barrada ◽  
Francisco José Abad ◽  
Julio Olea

In computerized adaptive testing, the most commonly used valuating function is the Fisher information function. When the goal is to keep item bank security at a maximum, the valuating function that seems most convenient is the matching criterion, which valuates the distance between the estimated trait level and the point where the maximum of the information function is located. Recently, it has been proposed not to keep the same valuating function constant for all the items in the test. In this study we expand the idea of combining the matching criterion with the Fisher information function. We also manipulate the number of strata into which the bank is divided. We find that manipulating the number of items administered with each function makes it possible to move from the pole of high accuracy and low security to the opposite pole. It is possible to greatly improve item bank security with much smaller losses in accuracy by selecting several items with the matching criterion. In general, it seems more appropriate not to stratify the bank.
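As a rough illustration of the two valuating functions contrasted in this abstract, the following Python sketch assumes a three-parameter logistic (3PL) model with the scaling constant omitted. The abstract does not state the model, so this is an assumption, and the function names and the `(a, b, c)` tuple layout are hypothetical.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b, c):
    """Fisher information of one 3PL item at trait level theta."""
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def theta_max(a, b, c):
    """Trait level at which the 3PL information function peaks."""
    return b + math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / a

def select_by_information(theta_hat, bank, administered):
    """Pick the unused item with maximum Fisher information at theta_hat."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: fisher_info(theta_hat, *bank[i]))

def select_by_matching(theta_hat, bank, administered):
    """Pick the unused item whose information peak is closest to theta_hat."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return min(candidates, key=lambda i: abs(theta_hat - theta_max(*bank[i])))
```

Under these assumptions, the information rule favours highly discriminating items, whereas the matching criterion ignores discrimination and looks only at where each item's information peaks, which is one way to see why it spreads exposure more evenly across the bank.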

Methodology ◽  
2009 ◽  
Vol 5 (1) ◽  
pp. 7-17 ◽  
Author(s):  
Juan Ramón Barrada ◽  
Julio Olea ◽  
Vicente Ponsoda ◽  
Francisco José Abad

The item selection rule (ISR) most commonly used in computerized adaptive testing (CAT) is to select the item with maximum Fisher information for the current trait estimate (PFI). Several alternative ISRs have been proposed. Among them, Fisher information considered in an interval (FI*I), Fisher information weighted with the likelihood function (FI*L), Kullback-Leibler information considered in an interval (KL*I) and Kullback-Leibler information weighted with the likelihood function (KL*L) have shown greater precision of trait estimation at the early stages of CAT. A new ISR is proposed, Fisher information by interval with geometric mean (FI*IG), which tries to rectify some problems detected in FI*I. We evaluate accuracy and item bank security for these six ISRs. FI*IG is the only ISR that simultaneously outperforms PFI in both variables. For the other ISRs, there seems to be a trade-off between accuracy and security, with PFI showing worse accuracy and greater security, and the ISRs using the likelihood function showing the reverse.
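To make two of the alternative rules named above more concrete, here is a hedged Python sketch of an interval-based criterion (in the spirit of FI*I) and a likelihood-weighted criterion (in the spirit of FI*L). It assumes a 3PL model, a fixed interval half-width `delta`, integration bounds of -4 to 4, and simple trapezoidal integration. None of these choices is taken from the abstract, and all function names are hypothetical. FI*IG is not sketched because the abstract does not define it.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b, c):
    """Fisher information of one 3PL item at trait level theta."""
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def trapezoid(f, lo, hi, n=200):
    """Composite trapezoidal rule with n equal sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

def interval_information(theta_hat, item, delta=1.0):
    """Information integrated over [theta_hat - delta, theta_hat + delta]."""
    return trapezoid(lambda t: fisher_info(t, *item),
                     theta_hat - delta, theta_hat + delta)

def likelihood(theta, responses):
    """Likelihood of the responses so far; responses = [(item, u), ...]."""
    value = 1.0
    for item, u in responses:
        p = p_3pl(theta, *item)
        value *= p if u == 1 else 1.0 - p
    return value

def likelihood_weighted_information(item, responses, lo=-4.0, hi=4.0):
    """Information weighted by the likelihood, integrated over [lo, hi]."""
    return trapezoid(lambda t: likelihood(t, responses) * fisher_info(t, *item),
                     lo, hi)
```

Either function can serve as the key when choosing among candidate items. Both average information over a range of trait values rather than evaluating it at the point estimate, which is consistent with the abstract's remark that these rules help most early in the test, when the estimate is least reliable.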


2020 ◽  
Author(s):  
Menghua She ◽  
Yaling Li ◽  
Dongbo Tu ◽  
Yan Cai

Abstract Background: As more and more people suffer from sleep disorders, developing an efficient, cheap and accurate assessment tool for screening sleep disorders is becoming more urgent. This study developed a computerized adaptive test for sleep disorders (CAT-SD). Methods: A large sample of 1,304 participants was recruited to construct the item pool of the CAT-SD and to investigate its psychometric characteristics. First, analyses of unidimensionality, model fit, item fit, item discrimination and differential item functioning (DIF) were conducted to construct a final item pool that meets the requirements of item response theory (IRT) measurement. Then, a simulated CAT study using the participants' real response data was performed to investigate the psychometric characteristics of the CAT-SD, including reliability, validity and predictive utility (sensitivity and specificity). Results: The final unidimensional item bank of the CAT-SD had good item fit, high discrimination and no DIF, and it also had acceptable reliability, validity and predictive utility. Conclusions: The CAT-SD could be used as an effective and accurate assessment tool for measuring the severity of individuals' sleep disorders, and it offers a brand-new perspective for screening sleep disorders with psychological scales.


Author(s):  
Louise C. Mâsse ◽  
Teresia M. O’Connor ◽  
Yingyi Lin ◽  
Sheryl O. Hughes ◽  
Claire N. Tugault-Lafleur ◽  
...  

Abstract Purpose: There has been a call to improve the measurement rigour and standardization of food parenting practices measures, as well as to align the measurement of food parenting practices with the parenting literature. Drawing from an expert-informed conceptual framework assessing three key domains of food parenting practices (autonomy promotion, control, and structure), this study combined factor analytic methods with Item Response Modeling (IRM) methodology to psychometrically validate responses to the Food Parenting Practice item bank. Methods: A sample of 799 Canadian parents of 5–12-year-old children completed the Food Parenting Practice item bank (129 items measuring 17 constructs). The factorial structure of the responses to the item bank was assessed with confirmatory factor analysis (CFA), confirmatory bi-factor item analysis, and IRM. Following these analyses, Differential Item Functioning (DIF) and Differential Response Functioning (DRF) analyses were used to test invariance properties by parents' sex, income and ethnicity. Finally, the efficiency of the item bank was examined using computerized adaptive testing simulations to identify the items to include in a short form. Results: Overall, the expert-informed conceptual framework was predominantly supported by the CFA, which retained the same 17 constructs included in the conceptual framework, with the exception of the access/availability and permissive constructs, which were renamed covert control and accommodating the child, respectively, to better reflect the content of the final solution. The bi-factor item analyses and IRM analyses revealed that the solution could be simplified to 11 unidimensional constructs. The full item bank included 86 items (empirical reliability from 0.78 to 0.96, except for 1 construct) and the short form had 48 items. Conclusion: Overall, the food parenting practice item bank has excellent psychometric properties. The item bank includes an expanded version and a short version to meet various study needs. This study provides more efficient tools for assessing how food parenting practices influence child dietary behaviours. Next steps are to use the IRM-calibrated item bank and draw on computerized adaptive testing methodology to administer the item bank and provide flexibility in item selection.


2014 ◽  
Vol 17 ◽  
Author(s):  
Juan Ramón Barrada ◽  
Francisco José Abad ◽  
Julio Olea

Abstract Test security can be a major problem in computerized adaptive testing, as examinees can share information about the items they receive. Of the different item selection rules proposed to alleviate this risk, stratified methods are among those that have received the most attention. In these methods, only items with low discrimination can be presented at the beginning of the test, and the mean information of the items increases as the test goes on. To do so, the item bank must be divided into several strata according to the information of the items. To date, there is no clear guidance about the optimal number of strata into which the item bank should be split. In this study, we simulate conditions with different numbers of strata, from 1 (no stratification) to a number of strata equal to test length (maximum level of stratification), while manipulating the maximum exposure rate that no item should surpass (rmax) over its whole domain. In this way, we can plot the relation between test security and accuracy, making it possible to determine the number of strata that leads to better security while holding measurement accuracy constant. Our data indicate that the best option is to stratify into as many strata as possible.
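The stratification scheme described above can be sketched roughly as follows. This Python example assumes that items are `(a, b)` tuples, that strata are formed by sorting on the discrimination parameter `a`, and that within a stratum the item with difficulty `b` closest to the current estimate is chosen. These are illustrative assumptions, not details given in the abstract. The exposure-rate ceiling rmax is not modelled here, and all function names are hypothetical.

```python
def stratify(bank, n_strata):
    """Split item indices into n_strata groups of increasing discrimination."""
    order = sorted(range(len(bank)), key=lambda i: bank[i][0])
    strata = [[] for _ in range(n_strata)]
    for rank, idx in enumerate(order):
        # Integer arithmetic spreads items as evenly as possible.
        strata[rank * n_strata // len(bank)].append(idx)
    return strata

def stratum_for_position(position, test_length, n_strata):
    """Map a 0-based item position in the test to a stratum index."""
    return position * n_strata // test_length

def select_item(theta_hat, bank, strata, position, test_length, administered):
    """Pick the unused item in the active stratum with b closest to theta_hat."""
    s = stratum_for_position(position, test_length, len(strata))
    candidates = [i for i in strata[s] if i not in administered]
    return min(candidates, key=lambda i: abs(theta_hat - bank[i][1]))
```

With `n_strata=1` every position draws from the whole bank (no stratification), and with `n_strata` equal to the test length each position has its own stratum, which matches the two extremes simulated in the study. The sketch assumes each stratum holds enough unused items for the positions assigned to it.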


2014 ◽  
Vol 4 (1) ◽  
Author(s):  
Kamaruddin Kamaruddin ◽  
Haryanto Haryanto

Development of Learning Outcomes Assessment Systems for Electric Circuits Analyzing Subjects Based on Computerized Adaptive Testing

Abstract This research and development project aims to: (1) produce a standardized test instrument for the subject Analyzing Electric Circuits, stored in an electronic item bank; (2) produce software for a CAT-based assessment system (SPBCAT) that is able to administer tests adaptively; and (3) ensure that SPBCAT carries out the testing process with adequate performance. The project was conducted in four stages: (1) definition; (2) design; (3) development; and (4) testing. The conclusions are: (1) the standardized test instrument that was developed consists of 105 multiple-choice items arranged in four KD (basic competencies), with empirical reliabilities of 0.9086, 0.9067, 0.9087, and 0.9086, respectively, and is stored in the electronic item bank; (2) SPBCAT is able to administer tests adaptively and to display test results correctly; (3) the SPBCAT software performed adequately when testing 28 examinees simultaneously and was rated very good by its users.


Author(s):  
Patrícia Nunes da Silva ◽  
Renata Cardoso Pires de Abreu ◽  
Carlos Frederico Fragoso de Barros e Vasconcellos

We present a computerized adaptive testing tool, termed DIA, for assessing students and providing them with feedback from a formative evaluation perspective. We use Brazilian governmental guidelines for teaching mathematics (Brazil, 1998; Brazil, 1997) to construct a scale whose goals are ordered increasingly by the vertical development of mathematical knowledge. We construct a simulated item bank that relates meaningfully to our scale through item response theory. We also analyze the feedback given by DIA.


2020 ◽  
pp. 014662162097768
Author(s):  
Miguel A. Sorrel ◽  
Francisco José Abad ◽  
Pablo Nájera

Decisions on how to calibrate an item bank might have major implications in the subsequent performance of the adaptive algorithms. One of these decisions is model selection, which can become problematic in the context of cognitive diagnosis computerized adaptive testing, given the wide range of models available. This article aims to determine whether model selection indices can be used to improve the performance of adaptive tests. Three factors were considered in a simulation study, that is, calibration sample size, Q-matrix complexity, and item bank length. Results based on the true item parameters, and general and single reduced model estimates were compared to those of the combination of appropriate models. The results indicate that fitting a single reduced model or a general model will not generally provide optimal results. Results based on the combination of models selected by the fit index were always closer to those obtained with the true item parameters. The implications for practical settings include an improvement in terms of classification accuracy and, consequently, testing time, and a more balanced use of the item bank. An R package was developed, named cdcatR, to facilitate adaptive applications in this context.

