Test Length and Sample Size for Item-Difficulty Parameter Estimation in Item Response Theory

2019 ◽  
Author(s):  
Alper Köse ◽  
C. Deha Doğan

The aim of this study was to examine the precision of item parameter estimation across different sample sizes and test lengths under the three-parameter logistic (3PL) item response theory (IRT) model, when the trait measured by a test was not normally distributed or had a skewed distribution. In the study, the number of response categories (1-0) and the item response model were fixed conditions, while sample size, test length, and ability distribution were manipulated conditions. This is a simulation study, so data simulation and analysis were carried out with packages in the R programming language. The results showed that item parameter estimates obtained under a normal ability distribution were considerably more accurate and less biased than those obtained under a non-normal distribution. Moreover, sample size had a limited positive effect on parameter estimation, whereas test length had no effect on parameter estimation. The findings highlight the importance of the normality assumption for IRT models and are discussed in light of the relevant literature.
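To make the simulation design concrete, the following is a minimal sketch (not the authors' code, which used R packages) of the 3PL response function and of generating 0/1 response data under a normal versus a skewed ability distribution. The item parameter values and the exponential-based skewed distribution are illustrative assumptions.

```python
import math
import random

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def simulate_responses(thetas, items, rng):
    """Simulate a 0/1 response for each examinee-by-item pair."""
    return [[1 if rng.random() < p_correct_3pl(t, a, b, c) else 0
             for (a, b, c) in items]
            for t in thetas]

rng = random.Random(0)
# Hypothetical item bank: (discrimination a, difficulty b, guessing c)
items = [(1.2, 0.0, 0.2), (0.8, -1.0, 0.15), (1.5, 1.0, 0.25)]
# Normally distributed abilities vs. a positively skewed set
thetas_normal = [rng.gauss(0.0, 1.0) for _ in range(1000)]
thetas_skewed = [rng.expovariate(1.0) - 1.0 for _ in range(1000)]
data_normal = simulate_responses(thetas_normal, items, rng)
data_skewed = simulate_responses(thetas_skewed, items, rng)
```

Fitting the 3PL model back to `data_normal` and `data_skewed` and comparing the recovered (a, b, c) values to the generating ones is the kind of bias comparison the study reports.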


Author(s):  
Riswan Riswan

An Item Response Theory (IRT) model contains one or more parameters, which are unknown and must be estimated. This paper aims (1) to determine the effect of sample size (N) on the stability of item parameter estimates, (2) to determine the effect of test length (n) on the stability of examinee parameter estimates, (3) to determine the effect of the model on the stability of item and examinee parameter estimates, (4) to determine the joint effect of sample size and test length on item and examinee parameter estimates, and (5) to determine the joint effect of sample size, test length, and model on item and examinee parameter estimates. This is a simulation study in which the latent trait (θ) samples were drawn from a standard normal population, θ ~ N(0, 1), for specific sample sizes (N) and test lengths (n) under the 1PL, 2PL, and 3PL models using WinGen. Item analysis was carried out using both the classical test theory and modern test theory (IRT) approaches, and the data were analyzed in R with the ltm package. The results showed that the larger the sample size (N), the more stable the item parameter estimates; likewise, the greater the test length (n), the more stable the examinee parameter (θ) estimates.
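The stability-versus-sample-size finding can be illustrated with a small sketch. This is not the paper's analysis (which used WinGen and the R ltm package); it uses a deliberately crude Rasch-style difficulty estimate, the negative logit of the proportion correct, and shows that the estimate varies less across replications when N is larger.

```python
import math
import random

def crude_difficulty(responses):
    """Crude Rasch-style difficulty estimate: -logit of proportion correct."""
    p = sum(responses) / len(responses)
    p = min(max(p, 1e-6), 1 - 1e-6)   # guard against 0 or 1 proportions
    return -math.log(p / (1.0 - p))

def replicate_sd(true_b, n_examinees, n_reps, rng):
    """Std. dev. of the difficulty estimate across simulated replications."""
    estimates = []
    for _ in range(n_reps):
        thetas = [rng.gauss(0.0, 1.0) for _ in range(n_examinees)]
        resp = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(t - true_b))) else 0
                for t in thetas]
        estimates.append(crude_difficulty(resp))
    mean = sum(estimates) / len(estimates)
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / len(estimates))

rng = random.Random(1)
sd_small = replicate_sd(0.5, 100, 200, rng)    # N = 100 examinees
sd_large = replicate_sd(0.5, 2000, 200, rng)   # N = 2000 examinees
# sd_large comes out smaller: the estimate is more stable at larger N.
```

The same replication logic, with a proper maximum-likelihood estimator, underlies the stability comparisons reported for N, n, and model choice.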


Author(s):  
Yousef A. Al Mahrouq

This study explored the effect of item difficulty and sample size on the accuracy of equating using item response theory, based on simulated data. The equating method was evaluated against a criterion equating using two indices: the standard error of equating (SEE) between the criterion scores and the equated scores, and the root mean square error of equating (RMSE). The results indicated that a large sample size reduces the standard error of equating and reduces the residuals. The results also showed that forms of different difficulty tended to produce smaller standard errors and RMSE values, and forms of similar difficulty tended to produce decreasing standard errors and RMSE values.
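The two evaluation indices have standard definitions that can be sketched directly; this is an illustration of those definitions, not the study's exact computation, and the score values below are made up. RMSE summarizes the distance between equated and criterion scores, while the SEE at each score point is the standard deviation of the equated score across replications.

```python
import math

def rmse(equated, criterion):
    """Root mean square error between equated and criterion scores."""
    return math.sqrt(sum((e - c) ** 2 for e, c in zip(equated, criterion))
                     / len(criterion))

def see(equated_replications):
    """Standard error of equating at each score point: the std. dev. of
    the equated score across replications around its mean."""
    out = []
    for point in zip(*equated_replications):
        m = sum(point) / len(point)
        out.append(math.sqrt(sum((x - m) ** 2 for x in point) / len(point)))
    return out

criterion = [10.0, 12.0, 14.0]                       # hypothetical criterion scores
replications = [[10.2, 11.9, 14.1], [9.8, 12.1, 13.9]]  # hypothetical equatings
overall_rmse = rmse(replications[0], criterion)
per_point_see = see(replications)
```

Because the SEE is a replication-to-replication standard deviation, larger calibration samples shrink it, which matches the study's finding that large samples reduce the standard error of equating.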


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Yunsoo Lee ◽  
Ji Hoon Song ◽  
Soo Jung Kim

Purpose
This paper aims to validate the Korean version of the decent work scale and examine the relationship between decent work and work engagement.

Design/methodology/approach
After completing translation and back translation, the authors surveyed 266 Korean employees from various organizations via network sampling. They assessed Rasch's model based on item response theory. In addition, they used classical test theory to evaluate the decent work scale's validity and reliability.

Findings
The authors found that the current version of the decent work scale has good validity, reliability and item difficulty, and that decent work has a positive relationship with work engagement. However, the item response theory assessment showed that three of the items are extremely similar to another item within the same dimension, implying that these items cannot discriminate among individual trait levels.

Originality/value
This study validated the decent work scale in a Korean work environment using Rasch's (1960) model from the perspective of item response theory.
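The redundancy finding, items whose difficulty estimates nearly coincide within a dimension, can be sketched as follows. This is an illustrative check, not the authors' procedure; the difficulty values and the 0.05 tolerance are assumptions.

```python
import math

def rasch_p(theta, b):
    """Rasch (1PL) probability of endorsing an item with difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def redundant_pairs(difficulties, tol=0.05):
    """Flag item pairs whose difficulty estimates are nearly identical;
    such items add little extra discrimination along the trait."""
    pairs = []
    for i in range(len(difficulties)):
        for j in range(i + 1, len(difficulties)):
            if abs(difficulties[i] - difficulties[j]) < tol:
                pairs.append((i, j))
    return pairs

# Hypothetical Rasch difficulty estimates for one dimension of the scale
b_hat = [-0.8, -0.78, 0.1, 0.9]
flags = redundant_pairs(b_hat)   # items 0 and 1 are flagged as redundant
```

Under the Rasch model, two items with (almost) the same difficulty yield (almost) the same response probability at every trait level, which is why such items cannot separate individuals from one another.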


2017 ◽  
Vol 6 (4) ◽  
pp. 113
Author(s):  
Esin Yilmaz Kogar ◽  
Hülya Kelecioglu

The purpose of this research is first to estimate the item and ability parameters, and the standard errors of those parameters, obtained from the Unidimensional Item Response Theory (UIRT), bifactor (BIF), and Testlet Response Theory (TRT) models in tests containing testlets, as the number of testlets, the number of independent items, and the sample size change, and then to compare the results. The mathematics test in PISA 2012 was employed as the data collection tool, and 36 items were used to constitute six different data sets containing different numbers of testlets and independent items. Subsequently, three different sample sizes of 250, 500, and 1,000 persons were drawn randomly from these data sets. The findings showed that, in general, the lowest mean error values were those obtained from UIRT, and that TRT yielded a lower mean estimation error than BIF. Under all conditions, the models that take local dependence into account provided better model-data fit than UIRT; in general there was no meaningful difference between BIF and TRT, and either model can be used for these data sets. When there was a meaningful difference between the two models, BIF generally yielded the better result. In addition, for each sample size and data set, the correlations of the item and ability parameter estimates, and of their errors, were generally high.
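The local dependence that TRT captures can be sketched with one common parameterization: a person-specific testlet effect shifts the effective ability for every item in the same testlet. This is an illustration only (sign and scaling conventions for the testlet effect vary across formulations), not the models as fitted in the study.

```python
import math

def p_2pl_testlet(theta, gamma, a, b):
    """2PL testlet-model probability. The person-specific testlet effect
    gamma shifts the effective ability for all items in one testlet,
    inducing local dependence among them:
    P = 1 / (1 + exp(-a * (theta + gamma - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta + gamma - b)))

# With gamma = 0 the model reduces to the ordinary unidimensional 2PL,
# so UIRT is the special case that ignores the testlet effect.
p_uirt = p_2pl_testlet(0.5, 0.0, 1.0, 0.0)
p_with_testlet = p_2pl_testlet(0.5, 0.4, 1.0, 0.0)
```

Because all items in a testlet share the same gamma, their responses remain correlated even after conditioning on theta, which is exactly the dependence that UIRT ignores and that BIF and TRT model explicitly.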

