Comparative Analysis of Classical Test Theory and Item Response Theory Based Item Parameter Estimates of Senior School Certificate Mathematics Examination

2016 ◽  
Vol 12 (28) ◽  
pp. 263 ◽  
Author(s):  
Awopeju, O. A. ◽  
Afolabi, E. R. I.

The study compared Classical Test Theory (CTT) and Item Response Theory (IRT) estimates of item difficulty and item discrimination indices in relation to the ability of examinees in the Senior School Certificate Examination (SSCE) in Mathematics, with a view to providing an empirical basis for informed decisions on the appropriateness of statistical and psychometric tests. The study adopted an ex post facto design. A sample of 6,000 students was selected from the population of 35,262 students who sat for the NECO SSCE Mathematics Paper 1 in 2008 in Osun State, Nigeria. The instrument was the May/June 2008 NECO SSCE Mathematics Paper 1, consisting of 60 multiple-choice items. Three sampling plans (random, gender and ability) were employed to study the behaviour of the examinees' scores under the CTT and IRT measurement frameworks. BILOG-MG 3 was used to estimate the item parameters and SPSS 20 was used to compare CTT- and IRT-based item parameters. The results showed that CTT-based item difficulty estimates and one-parameter IRT item difficulty estimates were comparable (the correlations were generally in the -0.702 to -0.988 range in the large sample and the -0.622 to -0.989 range in the small sample). Results also indicated that CTT-based and two-parameter IRT-based item discrimination estimates were comparable (the correlations were in the 0.430 to 0.880 range in the large sample and the 0.531 to 0.950 range in the small sample). The study concluded that CTT and IRT were comparable in estimating item characteristics of statistical and psychometric tests and thus could be used as complementary procedures in the development of national examinations.
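The CTT side of such a comparison can be sketched in a few lines. The Python example below is a minimal illustration, not the authors' procedure (the study used BILOG-MG 3 and SPSS 20): it computes the CTT item difficulty as the proportion of correct responses and the CTT item discrimination as the correlation between an item and the rest score. The function names and the tiny response matrix are hypothetical. Because a higher proportion correct means an easier item, while a higher IRT difficulty parameter means a harder item, the negative sign of the difficulty correlations reported above is the expected direction.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation; returns NaN if either variable is constant."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return float("nan")
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def ctt_item_stats(responses):
    """Return (difficulty, discrimination) per item.

    responses: list of examinee rows, each a list of 0/1 item scores.
    difficulty: proportion of examinees answering the item correctly.
    discrimination: correlation of the item with the rest score
    (total score minus the item itself), an assumed but common choice.
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [t - x for t, x in zip(totals, item)]
        stats.append((mean(item), pearson(item, rest)))
    return stats


# Hypothetical data: 4 examinees, 3 items.
demo = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
for j, (p, r) in enumerate(ctt_item_stats(demo), start=1):
    print(f"item {j}: difficulty={p:.2f}, discrimination={r:.2f}")
```

The rest score is used here instead of the total score so that an item is not correlated with itself; other discrimination indices (for example, upper-lower group differences) would serve the same illustrative purpose.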

2019 ◽  
Vol 9 (2) ◽  
pp. 133-146
Author(s):  
Yance Manoppo ◽  
Djemari Mardapi

This study aimed to reveal: (1) the characteristics of items of the Chemistry Test in the National Examination using classical test theory and item response theory; (2) the amount of cheating that occurred, as detected by Angoff's B-index Method, Pair 1 Method, Pair 2 Method, Modified Error Similarity Analysis (MESA) Method, and G2 Method; (3) the methods that detect more cheating in the implementation of the Chemistry Test in the National Examination for high schools in the year 2011/2012 in Maluku Province. The results of the analysis with the classical test theory approach show that 77.5% of items have well-functioning item difficulty, 55% of items have discrimination yet qualified and 70% of items have distractors that work well, with a test reliability index of 0.772. The analysis using the item response theory approach shows that 14 (35%) items fit the model, the maximum of the information function is 11.4069 at θ = -1.6, and the magnitude of the error of measurement is 2.296. The number of pairs suspected of cheating is as follows: 13 pairs according to Angoff's B-index Method, 212 pairs according to Pair 1 Method, 444 pairs according to Pair 2 Method, 7 pairs according to MESA Method, and 102 pairs according to G2 Method. In descending order of the number of pairs detected, the methods are Pair 2, Pair 1, G2, Angoff's B-index, and MESA.


2019 ◽  
Vol 23 (4) ◽  
pp. 275-283
Author(s):  
Ling Wang ◽  
John W. Nelson

The aim of the study is to evaluate the psychometric properties of the Chinese version of the Caring Factor Survey-Caring of Manager (CFS-CM) using classical test theory (CTT) and item response theory (IRT). CTT analyses included internal consistency reliability, test–retest reliability and construct validity. IRT analyses were conducted to test unidimensionality, item fit, item difficulty, reliability, and rating scale functioning. CTT showed good psychometric properties of the CFS-CM. However, IRT revealed some problems at the category level. Taking this issue into consideration, it could be beneficial to improve the CFS-CM in the future.


2019 ◽  
Vol 13 (1) ◽  
pp. 1-16
Author(s):  
Muh Syahrul Sarea ◽  
Rosnia Ruslan

This research aims to describe the characteristics of the UAS items for theme 1 in the fourth grade of primary school in Paramasan Bawah village in terms of item difficulty and discrimination. The sample of this research was 37 students who took the final examination in the 2018/2019 academic year. The objects of this research were the question items and the answer sheets of the final exam, obtained from 3 different schools in Paramasan Bawah village. The data analysis technique used in this research was empirical analysis carried out with the Bilog and Iteman programs. This analysis was used to determine the characteristics of the items based on Item Response Theory and Classical Test Theory. The results showed that, according to Item Response Theory, 30 items had good discrimination and 33 items had good item difficulty, while according to Classical Test Theory, 15 items had good discrimination and 27 items had good item difficulty.

Keywords: characteristics of items, item difficulties, discrimination


2021 ◽  
Vol 11 (13) ◽  
pp. 6048
Author(s):  
Jaroslav Melesko ◽  
Simona Ramanauskaite

Feedback is a crucial component of effective, personalized learning, and is usually provided through formative assessment. Introducing formative assessment into a classroom can be challenging because of test creation complexity and the need to provide time for assessment. The newly proposed formative assessment algorithm uses multivariate Elo rating and multi-armed bandit approaches to solve these challenges. In the case study involving 106 students of the Cloud Computing course, the algorithm shows double learning path recommendation precision compared to classical test theory based assessment methods. The algorithm usage approaches item response theory benchmark precision with greatly reduced quiz length without the need for item difficulty calibration.
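To make the Elo idea concrete, the Python sketch below shows a plain single-skill Elo update between one student and one quiz item. It is a deliberately simplified illustration and not the authors' algorithm, which is described as multivariate and combined with a multi-armed bandit. The function names, the logistic expected-score form and the step size `k` are assumptions chosen for the example.

```python
import math


def expected_correct(ability, difficulty):
    """Logistic probability that the student answers the item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def elo_update(ability, difficulty, correct, k=0.4):
    """Return updated (ability, difficulty) after one response.

    correct: 1 if the answer was right, 0 otherwise.
    k: hypothetical step size controlling how fast ratings move.
    """
    surprise = correct - expected_correct(ability, difficulty)
    # The student rating moves toward the outcome; the item moves the other way.
    return ability + k * surprise, difficulty - k * surprise


# A student and an item both start at 0; the student answers correctly.
print(elo_update(0.0, 0.0, correct=1))
```

Because both ratings are updated after every response, item difficulty is learned online as students answer, which is consistent with the abstract's point that no separate difficulty calibration step is needed.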


Psychometrika ◽  
2021 ◽  
Author(s):  
Ron D. Hays ◽  
Karen L. Spritzer ◽  
Steven P. Reise

The reliable change index has been used to evaluate the significance of individual change in health-related quality of life. We estimate reliable change for two measures (physical function and emotional distress) in the Patient-Reported Outcomes Measurement Information System (PROMIS®) 29-item health-related quality of life measure (PROMIS-29 v2.1). Using two waves of data collected 3 months apart in a longitudinal observational study of chronic low back pain and chronic neck pain patients receiving chiropractic care, and simulations, we compare estimates of reliable change from classical test theory fixed standard errors with item response theory standard errors from the graded response model. We find that unless true change in the PROMIS physical function and emotional distress scales is substantial, classical test theory estimates of significant individual change are much more optimistic than estimates of change based on item response theory.
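The two kinds of standard error can be contrasted with a short Python sketch. It assumes the usual form of the reliable change index, the score difference divided by the standard error of the difference, and is not the authors' code. Under classical test theory a single standard error of measurement, SD × sqrt(1 − reliability), applies to every score; under item response theory each score carries its own standard error. The function names and all numbers are hypothetical.

```python
import math


def rci_ctt(score1, score2, sd, reliability):
    """Reliable change index with one fixed CTT standard error of measurement."""
    sem = sd * math.sqrt(1.0 - reliability)
    se_diff = sem * math.sqrt(2.0)  # same SEM assumed at both time points
    return (score2 - score1) / se_diff


def rci_irt(score1, score2, se1, se2):
    """Reliable change index with score-specific IRT standard errors."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    return (score2 - score1) / se_diff


# Hypothetical T-scores at two waves; |RCI| > 1.96 is a common cutoff.
print(rci_ctt(40.0, 46.0, sd=10.0, reliability=0.91))
print(rci_irt(40.0, 46.0, se1=3.0, se2=4.0))
```

When the score-specific standard errors are larger than the fixed value, the same observed change yields a smaller index under item response theory, which is the direction of the difference the abstract reports.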

