THE EFFECT OF SAMPLE SIZE (N) AND TEST LENGTH (n) ON ITEM PARAMETER ESTIMATES AND EXAMINEE PARAMETER ESTIMATES: A SIMULATION STUDY

2013 ◽  
Vol 12 (1) ◽  
pp. 25
Author(s):  
R. BUDIARTI

The study of measurement problems in education, and of methods for solving them, has developed into a specialized discipline known as test theory. Test theory provides a general framework for viewing the process of constructing test instruments (test items). Item analysis can be carried out with the Classical Test Theory (CTT) approach or with the modern approach known as Item Response Theory (IRT). There are several item response models, which differ in the number of parameters they contain. All IRT models include one or more item parameters and one or more examinee parameters. This paper focuses on an item response model with one examinee parameter and two item parameters. These parameters are unknown and must therefore be estimated; a sufficient sample size is needed for the estimates to be relatively stable and accurate. The aims of this paper are (1) to investigate the effect of sample size (N) on the stability of item parameter estimates, and (2) to investigate the effect of test length (n) on the stability of examinee parameter estimates. The stability of the item parameter estimates (a and b) is influenced by sample size, while the stability of the examinee parameter estimate (θ) is influenced by test length: the larger the sample size, the more stable the item parameter estimates, and the longer the test, the more stable the examinee parameter estimates.
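The model described above, with one examinee parameter (θ) and two item parameters (discrimination a and difficulty b), is the standard two-parameter logistic (2PL) model. A minimal sketch of its item response function:

```python
import math

def p_correct_2pl(theta, a, b):
    """2PL item response function: probability of a correct response
    given ability theta, item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# When ability equals the item difficulty (theta == b), the
# probability of a correct response is exactly 0.5.
print(round(p_correct_2pl(0.0, 1.0, 0.0), 3))  # 0.5
```

A larger discrimination a makes the curve steeper around b, so the item separates examinees near that difficulty more sharply.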

Author(s):  
Riswan Riswan

An Item Response Theory (IRT) model contains one or more parameters. These parameters are unknown, so they must be estimated. This paper aims (1) to determine the effect of sample size (N) on the stability of item parameter estimates, (2) to determine the effect of test length (n) on the stability of examinee parameter estimates, (3) to determine the effect of the model on the stability of item and examinee parameter estimates, (4) to determine the joint effect of sample size and test length on the stability of item and examinee parameter estimates, and (5) to determine the joint effect of sample size, test length, and model on those estimates. This paper is a simulation study in which samples of the latent trait (θ) are drawn from a standard normal population, θ ~ N(0, 1), for specific sample sizes (N) and test lengths (n) under the 1PL, 2PL, and 3PL models, with data generated using Wingen. Item analysis was carried out using both the classical test theory approach and the modern test theory approach (Item Response Theory), and the data were analyzed in R with the ltm package. The results showed that the larger the sample size (N), the more stable the item parameter estimates, and the longer the test (n), the more stable the examinee parameter (θ) estimates.
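The data-generation step described above can be sketched without specialized software: draw abilities from N(0, 1) and sample dichotomous responses from the model's probabilities. The sketch below assumes a 2PL model with hypothetical item parameters; the study itself used Wingen and the 1PL/2PL/3PL models.

```python
import math
import random

def simulate_responses(n_examinees, a, b, seed=0):
    """Simulate a dichotomous (0/1) response matrix under a 2PL model,
    with abilities drawn from a standard normal, theta ~ N(0, 1).
    a and b are parallel lists of item discriminations and difficulties."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)
        row = []
        for a_j, b_j in zip(a, b):
            p = 1.0 / (1.0 + math.exp(-a_j * (theta - b_j)))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data

# N = 500 examinees, n = 2 items (one easy, one hard).
data = simulate_responses(500, a=[1.0, 1.5], b=[-0.5, 0.5])
```

A matrix like this is what would then be passed to an estimation routine (e.g., the ltm package in R) to recover the item and examinee parameters.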


Author(s):  
Alper Köse ◽  
C. Deha Doğan

The aim of this study was to examine the precision of item parameter estimation for different sample sizes and test lengths under the three-parameter logistic (3PL) item response theory (IRT) model, when the trait measured by a test was not normally distributed or had a skewed distribution. In the study, the number of response categories (1-0) and the item response model were fixed conditions, while sample size, test length, and ability distribution were manipulated. This is a simulation study, so data simulation and analysis were done via packages in the R programming language. The results showed that item parameter estimates obtained under a normal ability distribution were much stronger and less biased than those obtained under a non-normal distribution. Moreover, sample size had a limited positive effect on parameter estimation, whereas test length had no effect. As a result, the importance of the normality assumption for IRT models was highlighted, and the findings were discussed in light of the relevant literature.
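The 3PL model studied here extends the 2PL with a guessing parameter c, which sets a lower asymptote on the probability of a correct response. A minimal sketch:

```python
import math

def p_correct_3pl(theta, a, b, c):
    """3PL item response function: the guessing parameter c is a lower
    asymptote, so even very low-ability examinees answer correctly with
    probability at least c (e.g., by guessing on multiple choice)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# With c = 0.2, the probability at theta == b is 0.2 + 0.8 * 0.5 = 0.6,
# not 0.5 as it would be under the 2PL.
print(round(p_correct_3pl(0.0, 1.0, 0.0, 0.2), 3))  # 0.6
```

Estimating c is what makes the 3PL demand larger samples than the 1PL or 2PL, which is why sample size and ability distribution matter in studies like this one.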


Author(s):  
DongGun Park ◽  
MyungOk Choi ◽  
WonSun Lee ◽  
HyeMin Lee ◽  
JunHee Lee

The present study investigated the utility of two types of item response process models (the dominance model and the ideal point model) for personality item parameter estimation and scoring. The authors developed scales for four personality traits (achievement, fairness, cooperation, and honesty) using classical test theory, a dominance item response theory (IRT) method, and an ideal point IRT method, and compared the methods in terms of model-data fit, information, and criterion validity. The results show that the fit of the ideal point IRT model was better than that of the dominance IRT model, but the difference between the two was very slight. The test information functions of the ideal point and dominance IRT models for the honesty and cooperation scales were very similar. The criterion-related validity based on individual ability estimates and grades was not significant for any of the three methods, and the validity of the ideal point method was no better than that of the dominance IRT model. Implications and limitations of the findings are discussed.
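The contrast between the two model families can be sketched with toy response functions. These are illustrative kernels only, not the specific models estimated in the study: the dominance function is 2PL-style (monotone in θ), while the ideal point function here uses a simple squared-distance kernel so that endorsement peaks when the person's trait level matches the item location.

```python
import math

def dominance_prob(theta, a, b):
    """Dominance model (2PL-style): endorsement probability rises
    monotonically with the trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ideal_point_prob(theta, b, scale=1.0):
    """Ideal point model (toy squared-distance kernel): endorsement is
    highest when theta is closest to the item location b and falls off
    on both sides — 'I agree because the item describes me'."""
    return math.exp(-((theta - b) ** 2) / (2.0 * scale ** 2))
```

Under a dominance model, a very high-θ person always endorses a positive item; under an ideal point model, they may reject a moderate item ("I am somewhat organized") because it undershoots their self-description.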


2021 ◽  
Vol 11 (13) ◽  
pp. 6048
Author(s):  
Jaroslav Melesko ◽  
Simona Ramanauskaite

Feedback is a crucial component of effective, personalized learning, and is usually provided through formative assessment. Introducing formative assessment into a classroom can be challenging because of the complexity of test creation and the need to set aside time for assessment. The newly proposed formative assessment algorithm uses multivariate Elo rating and multi-armed bandit approaches to address these challenges. In a case study involving 106 students of a Cloud Computing course, the algorithm showed twice the learning-path recommendation precision of classical test theory based assessment methods. The algorithm approaches the precision of an item response theory benchmark with a greatly reduced quiz length and without the need for item difficulty calibration.
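The appeal of Elo rating here is that it needs no offline calibration: student and item ratings are updated after every response. A minimal sketch of a single update step, using chess-style constants rather than the paper's multivariate variant:

```python
def elo_update(r_student, r_item, score, k=32.0):
    """One Elo step after a student attempts an item.
    score is 1.0 for a correct answer, 0.0 for incorrect.
    The student's rating rises (and the item's falls) when the student
    beats the expectation, and vice versa."""
    expected = 1.0 / (1.0 + 10.0 ** ((r_item - r_student) / 400.0))
    delta = k * (score - expected)
    return r_student + delta, r_item - delta

# Evenly matched (expected = 0.5): a correct answer moves both
# ratings by k/2 = 16 points in opposite directions.
print(elo_update(1500.0, 1500.0, 1.0))  # (1516.0, 1484.0)
```

Because both sides of the update are symmetric, item difficulty estimates emerge from ordinary classroom use instead of a separate pretest, which is the calibration-free property the abstract highlights.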


Psychometrika ◽  
2021 ◽  
Author(s):  
Ron D. Hays ◽  
Karen L. Spritzer ◽  
Steven P. Reise

The reliable change index has been used to evaluate the significance of individual change in health-related quality of life. We estimate reliable change for two measures (physical function and emotional distress) in the Patient-Reported Outcomes Measurement Information System (PROMIS®) 29-item health-related quality of life measure (PROMIS-29 v2.1). Using two waves of data collected 3 months apart in a longitudinal observational study of chronic low back pain and chronic neck pain patients receiving chiropractic care, and simulations, we compare estimates of reliable change from classical test theory fixed standard errors with item response theory standard errors from the graded response model. We find that unless true change in the PROMIS physical function and emotional distress scales is substantial, classical test theory estimates of significant individual change are much more optimistic than estimates of change based on item response theory.
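The classical-test-theory side of this comparison rests on a fixed standard error of measurement applied across the whole score range. A minimal sketch of the conventional (Jacobson-Truax style) reliable change index, with illustrative inputs:

```python
import math

def reliable_change_index(score_t1, score_t2, sd, reliability):
    """Classical test theory reliable change index: the standard error
    of measurement (SEM) is assumed constant across the score range,
    unlike IRT standard errors, which vary with the trait level."""
    sem = sd * math.sqrt(1.0 - reliability)        # fixed SEM
    se_diff = sem * math.sqrt(2.0)                 # SE of a difference score
    return (score_t2 - score_t1) / se_diff

# |RCI| > 1.96 is conventionally taken as significant individual change.
# Example with hypothetical values: SD = 10, reliability = 0.91.
rci = reliable_change_index(50.0, 60.0, sd=10.0, reliability=0.91)
```

Because IRT standard errors grow toward the extremes of the scale, the fixed-SEM classical index can flag change as significant there when the IRT-based criterion would not, which is the pattern the abstract reports.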

