Analyzing Item Difficulty and Discrimination in a Dichotomously Scored Writing Test: Focus on Classical Testing Theorem and Item Response Theory

Purpose This paper aims to validate the Korean version of the decent work scale and examine the relationship between decent work and work engagement. Design/methodology/approach After completing translation and back translation, the authors surveyed 266 Korean employees from various organizations via network sampling. They assessed Rasch’s model based on item response theory. In addition, they used classical test theory to evaluate the decent work scale’s validity and reliability. Findings The authors found that the current version of the decent work scale has good validity, reliability and item difficulty, and decent work has a positive relationship with work engagement. However, based on item response theory, the assessment showed that three of the items are extremely similar to another item within the same dimension, implying that the items are unable to discriminate among individual traits. Originality/value This study validated the decent work scale in a Korean work environment using Rasch’s (1960) model from the perspective of item response theory.

Download Full-text

Item Response Theory in Affective Instrument Development: An Illustration

Journal of Nursing Measurement ◽

10.1891/1061-3749.9.1.5 ◽

2001 ◽

Vol 9 (1) ◽

pp. 5-22 ◽

Cited By ~ 8

Author(s):

Cheryl T. Beck ◽

Robert K. Gable

Keyword(s):

Item Response Theory ◽

Item Response ◽

Instrument Development ◽

Development Process ◽

Item Difficulty ◽

Empirical Support ◽

Measurement Model ◽

Response Theory ◽

Critical Function ◽

Screening Scale

The benefits of item response theory (IRT) analysis in obtaining empirical support for construct validity make it an essential step in the instrument development process. IRT analysis can result in finer construct interpretations that lead to more thorough descriptions of low- and high-scoring respondents. A critical function of IRT is its ability to determine the adequacy with which the attitude continuum underlying each dimension is assessed by the respective items in an instrument. Many nurse researchers, however, are not reaping the benefits of IRT in the development of affective instruments. The purpose of this article is to familiarize nurse researchers with this valuable approach through a description of the Facets computer program. Facets uses a one parameter (i.e., item difficulty) Rasch measurement model. Data from a survey of 525 new mothers that assessed the psychometric properties of the Postpartum Depresssion Screening Scale are used to illustrate the Facets program. It is hoped that IRT will gain increased prominence in affective instrument development as more nurse researchers become aware of computer programs such as Facets to assist in analysis.

Download Full-text

Item Response Theory Modeling of the Philadelphia Naming Test

Journal of Speech Language and Hearing Research ◽

10.1044/2015_jslhr-l-14-0249 ◽

2015 ◽

Vol 58 (3) ◽

pp. 865-877 ◽

Cited By ~ 13

Author(s):

Gerasimos Fergadiotis ◽

Stacey Kellough ◽

William D. Hula

Keyword(s):

Item Response Theory ◽

Item Response ◽

Word Length ◽

Logistic Model ◽

Item Difficulty ◽

Age Of Acquisition ◽

Response Theory ◽

Contextual Diversity ◽

Severity Scores ◽

Naming Test

Purpose In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating explanatory variables to item difficulty. This article describes the statistical model underlying the computer adaptive PNT presented in a companion article (Hula, Kellough, & Fergadiotis, 2015). Method Using archival data, we evaluated the fit of the PNT to 1- and 2-parameter logistic models and examined the precision of the resulting parameter estimates. We regressed the item difficulty estimates on three predictor variables: word length, age of acquisition, and contextual diversity. Results The 2-parameter logistic model demonstrated marginally better fit, but the fit of the 1-parameter logistic model was adequate. Precision was excellent for both person ability and item difficulty estimates. Word length, age of acquisition, and contextual diversity all independently contributed to variance in item difficulty. Conclusions Item-response-theory methods can be productively used to analyze and quantify anomia severity in aphasia. Regression of item difficulty on lexical variables supported the validity of the PNT and interpretation of anomia severity scores in the context of current word-finding models.

Download Full-text

أثر صعوبة الفقرة و حجم العينة في دقة معادلة درجات الاختبارات باستخدام نظرية استجابة الفقرة IRT = Effect of Item Difficulty and Sample Size on the Accuracy of Equating by Using Item Response Theory

Journal of Educational and Psychological Studies [JEPS] ◽

10.12816/0026667 ◽

2016 ◽

Vol 10 (1) ◽

pp. 182-200

Author(s):

يوسف عبد العاطي المحروق

Keyword(s):

Item Response Theory ◽

Sample Size ◽

Item Response ◽

Item Difficulty ◽

Response Theory

Download Full-text

ANALISIS METODE CHEATING PADA TES BERSKALA BESAR

Molluca Journal of Chemistry Education (MJoCE) ◽

10.30598/mjocevol9iss2pp133-146 ◽

2019 ◽

Vol 9 (2) ◽

pp. 133-146

Author(s):

Yance Manoppo ◽

Djemari Mardapi

Keyword(s):

Item Response Theory ◽

Item Response ◽

Item Difficulty ◽

Classical Test Theory ◽

Test Theory ◽

Theory Approach ◽

Response Theory ◽

Index Method ◽

National Examination ◽

Classical Test

This study aimed to reveal: (1) the characteristics of items of Chemistry Test in National Examination by using the classical test theory and item response theory; (2) the amount of cheating which occured by using Angoff's B-index Method, Pair 1 Method, Pair 2 Method, Modified Error Similarity Analysis (MESA) Method, and G2 Method; (3) the methods that detect more cheating in the implementation of the Chemistry Test in National Examination for high schools in the year 2011/2012 in Maluku Province. The results of the analysis with the classical test theory approach show that 77.5% items have item difficulty functioning well, 55% items have discrimination yet qualified and 70% items have distractor that works well with the index reliability test of 0,772. The analysis using the item response theory approach shows that 14 (35%) items fit with the model, the maximum function information is 11,4069 at θ = -1,6, and the magnitude of the error of measurement is 2,296. The number of pairs who are suspected of cheating is as follows: 13 pairs according to Angoff's B-index Method, 212 pairs according to Pair 1 Method, 444 pairs according to Pair 2 Method, 7 pairs according to MESA Method, and 102 pairs according to G2 Method. The most widely detecting cheating in a row is a Pair 2, Pair 1, G2, Angoff's B-index, and MESA.

Download Full-text

Evaluation Properties of the Chinese Version of the Caring Factor Survey-Caring of Manager According to Classical Test Theory and Item Response Theory

International Journal for Human Caring ◽

10.20467/1091-5710.23.4.275 ◽

2019 ◽

Vol 23 (4) ◽

pp. 275-283

Author(s):

Ling Wang ◽

John W. Nelson

Keyword(s):

Item Response Theory ◽

Psychometric Properties ◽

Item Response ◽

Rating Scale ◽

Item Difficulty ◽

Classical Test Theory ◽

Chinese Version ◽

Test Theory ◽

Response Theory ◽

Classical Test

The aim of the study is to evaluate psychometric properties of the Chinese version of Caring Factor Survey-Caring of Manager (CFS-CM), which evaluated by using with classical test theory (CTT) and item response theory (IRT). CTT analyses evaluate include internal consistence reliability, test–retest reliability and construct validity. IRT analyses were conducted to test the unidimensionality, item fit, item difficulty, the reliability, and rating scale analysis. CTT showed good psychometric properties of the CFS-CM. However, IRT revealed some problems of category level. Taking the above issue into consideration, it could be beneficial to perfect the CFS-CM in the future.

Download Full-text

Taking Inventory of the Creative Behavior Inventory: An Item Response Theory Analysis of the CBI

10.31234/osf.io/b7cfd ◽

2021 ◽

Author(s):

Paul Silvia ◽

Rebekah Rodriguez ◽

James C. Kaufman ◽

Roni Reiter-Palmon ◽

Jeb S. Puryear

Keyword(s):

Item Response Theory ◽

Item Response ◽

Item Difficulty ◽

Self Report ◽

Response Theory ◽

Response Scale ◽

Creative Behavior ◽

Item Response Theory Analysis ◽

Gender Based ◽

Psychometric Investigation

The original 90-item Creative Behavior Inventory (CBI) was a landmark self-report scale in creativity research, and the 28-item brief form developed nearly 20 years ago is a popular measure of everyday creativity. Relatively little is known, however, about the psychometric properties of this widely used scale. In the current research, we conduct a detailed psychometric investigation into the 28-item CBI by applying methods from item response theory using a sample of 2,082 adults. Our investigation revealed several strengths of the current scale: excellent reliability, suitable dimensionality, appropriate item difficulty, and reasonably good item discrimination. Several areas for improvement were highlighted as well: (1) the four-point response scale should probably have fewer options; (2) a handful of items showed gender-based differential item functioning, indicating some item bias; and (3) local dependence statistics revealed clusters of items that are probably redundant. These analyses support the continued use of the CBI for assessing engagement in everyday creative behaviors and suggest that the CBI could benefit from thoughtful revision.

Download Full-text

MPP's Item Difficulty and Motor Performance Abilities of the Students with Intellectual Disability and Autism: Item Response Theory Application

Journal of adapted physical activity and exercise ◽

10.17006/kjapa.2014.22.4.21 ◽

2014 ◽

Vol 22 (4) ◽

pp. 21-32

Author(s):

김태형

Keyword(s):

Intellectual Disability ◽

Item Response Theory ◽

Item Response ◽

Motor Performance ◽

Item Difficulty ◽

Response Theory ◽

Performance Abilities ◽

Theory Application

Download Full-text

Effect of Item Difficulty and Sample Size on the Accuracy of Equating by Using Item Response Theory

Journal of Educational and Psychological Studies [JEPS] ◽

10.24200/jeps.vol10iss1pp182-200 ◽

2016 ◽

Vol 10 (1) ◽

pp. 182

Author(s):

Yousef A. Al Mahrouq

Keyword(s):

Item Response Theory ◽

Sample Size ◽

Item Response ◽

Standard Error ◽

Item Difficulty ◽

Standard Errors ◽

Large Sample Size ◽

Mean Square ◽

Response Theory ◽

Simulation Data

This study explored the effect of item difficulty and sample size on the accuracy of equating by using item response theory. This study used simulation data. The equating method was evaluated using an equating criterion (SEE, RMSE). Standard error of equating between the criterion scores and equated scores, and root mean square error of equating (RMSE) were used as measures to compare the method to the criterion equating. The results indicated that the large sample size reduces the standard error of the equating and reduces residuals. The results also showed that different difficulty models tend to produce smaller standard errors and the values of RMSE. The similar difficulty models tend to produce decreasing standard errors and the values of RMSE.

Download Full-text

Item Response Theory Analysis: PROMIS® Anxiety Form and Generalized Anxiety Disorder Scale

Western Journal of Nursing Research ◽

10.1177/01939459211015985 ◽

2021 ◽

pp. 019394592110159

Author(s):

Wen Liu ◽

Lilian Dindo ◽

Katherine Hadlandsmyth ◽

George Jay Unick ◽

M. Bridget Zimmerman ◽

...

Keyword(s):

Anxiety Disorder ◽

Item Response Theory ◽

Item Response ◽

Generalized Anxiety Disorder ◽

Item Difficulty ◽

Short Form ◽

Generalized Anxiety ◽

Response Theory ◽

Patient Reported ◽

Item Scale

Little research has compared item functioning of the Patient-Reported Outcomes Measurement Information System (PROMIS®) anxiety short form 6a and the generalized anxiety disorder 7-item scale using item response theory models. This was a secondary analysis of self-reported assessments from 67 at-risk U.S. military veterans. The two measures performed comparably well with data fitting adequately to models, acceptable item discriminations, and item and test information curves being unimodal and symmetric. The PROMIS® anxiety short form 6a performed better in that item difficulty estimates had a wider range and distributed more evenly and all response categories had less floor effect, while the third category in most items of the generalized anxiety disorder 7-item scale were rarely used. While both measures may be appropriate, findings provided preliminary information supporting use of the PROMIS® anxiety short form 6a as potentially preferable, especially for veterans with low-to-moderate anxiety. Further testing is needed in larger, more diverse samples.

Download Full-text