Estimating Interval Scale Values for Survey Item Response Categories

1979 ◽ Vol 23 (3) ◽ pp. 627 ◽ Author(s): Carl Hensler, Brian Stipak

2021 ◽ Author(s): Benjamin Domingue, Dimiter Dimitrov

A recently developed framework of measurement, referred to as the Delta-scoring (or D-scoring) method (DSM; e.g., Dimitrov 2016, 2018, 2020), is gaining attention in the field of educational measurement and is widely used in large-scale assessments at the National Center for Assessment in Saudi Arabia. D-scores obtained under the DSM range from 0 to 1 and indicate how much (what proportion) of the ability measured by a test of binary items the examinee demonstrates. This study examines whether the D-scale is an interval scale and how D-scores compare to IRT ability scores (thetas) in terms of intervalness by testing the axioms of additive conjoint measurement (ACM). The testing approach is ConjointChecks (Domingue, 2014), which implements a Bayesian method for evaluating whether the axioms are violated in a given empirical item response data set. The results indicate that D-scores computed under the DSM produce fewer violations of the ordering axioms of ACM than do the IRT "theta" scores. The conclusion is that the DSM produces a dependable D-scale with respect to the essential property of intervalness.
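To make the ordering checks concrete, below is a minimal Python sketch on a small hypothetical matrix of proportions correct (rows: ability groups ordered low to high; columns: items ordered hard to easy). It is a simplified, deterministic stand-in for the axiom checks that ConjointChecks carries out with a Bayesian treatment of sampling error; the data and function names are illustrative only.

# Simplified, non-Bayesian illustration of the ordering (single cancellation)
# checks behind ACM testing; ConjointChecks itself evaluates these axioms
# with a Bayesian account of sampling error, which this sketch omits.
import numpy as np

# Hypothetical matrix: rows = ability groups (low -> high),
# columns = items (hard -> easy), entries = observed proportion correct.
P = np.array([
    [0.22, 0.35, 0.48],
    [0.30, 0.41, 0.55],
    [0.45, 0.58, 0.71],
    [0.60, 0.66, 0.83],
])

def row_order_violations(P):
    """Count adjacent-row pairs whose ordering reverses on some item."""
    v = 0
    for a in range(P.shape[0] - 1):
        diffs = P[a + 1] - P[a]          # higher group minus lower group
        v += int(np.any(diffs < 0))      # any item reversing the row order
    return v

def col_order_violations(P):
    """Count adjacent-column pairs whose ordering reverses for some group."""
    v = 0
    for j in range(P.shape[1] - 1):
        diffs = P[:, j + 1] - P[:, j]    # easier item minus harder item
        v += int(np.any(diffs < 0))      # any group reversing the column order
    return v

print("row-order violations:", row_order_violations(P))
print("column-order violations:", col_order_violations(P))

With data satisfying the axioms, both counts are zero; observed reversals in an empirical table are what the Bayesian procedure weighs against sampling variability.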


2020 ◽ Vol 29 (4) ◽ pp. 724-730 ◽ Author(s): Melissa Rittase, Elizabeth Kirkland, Daniela M. Dudas, Alpa V. Patel

2011 ◽ Vol 72 (3) ◽ pp. 453-468 ◽ Author(s): Ting Xu, Clement A. Stone

It has been argued that item response theory (IRT) trait estimates, rather than number-right (NR) or summated scale (SS) scores, should be used in analyses. Thissen and Orlando postulated that IRT scaling tends to produce trait estimates that are linearly related to the underlying trait being measured; IRT trait estimates can therefore be more useful than summated scores when examining relationships between test scores and external variables. In addition, when the model holds, IRT trait estimates possess an interval scale, a property assumed for dependent variables by most statistical procedures used in educational research. The objective of this study was to use Monte Carlo methods to compare the performance of IRT trait estimates and SS scores in predicting outcome variables in the context of health and behavioral assessment. Scores based on the graded-response model were compared with summated scores. Results indicated that IRT-based scores and summated scores are comparable when evaluating the relationships between test scores and outcome measures. Thus, applied researchers could use summated scores in predictive studies and circumvent evaluating the assumptions underlying the use of IRT-based scores.
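As a rough illustration of this kind of comparison (not the study's actual design), the sketch below simulates graded-response-model data, scores respondents by summated scores and by EAP trait estimates computed on a quadrature grid from the true generating item parameters (a simplification of fitting the model to data), and compares each score's correlation with a simulated outcome. Sample sizes, item counts, and parameter values are hypothetical.

# Minimal Monte Carlo sketch (hypothetical setup, not the study's design):
# simulate graded-response-model (GRM) data, score respondents by (a) summated
# scores and (b) EAP trait estimates computed on a quadrature grid using the
# true item parameters, then compare how well each predicts a simulated outcome.
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items, n_cats = 1000, 10, 5

theta = rng.normal(size=n_persons)                           # latent trait
a = rng.uniform(1.0, 2.0, n_items)                           # discriminations
b = np.sort(rng.normal(size=(n_items, n_cats - 1)), axis=1)  # ordered thresholds

def grm_probs(t, a, b):
    """Category probabilities (items x categories) at trait level t under the GRM."""
    p_star = 1 / (1 + np.exp(-a[:, None] * (t - b)))         # P(X >= k), k = 1..K-1
    p_star = np.hstack([np.ones((len(a), 1)), p_star, np.zeros((len(a), 1))])
    return p_star[:, :-1] - p_star[:, 1:]

# Simulate item responses in categories 0..n_cats-1.
X = np.empty((n_persons, n_items), dtype=int)
for i in range(n_persons):
    probs = grm_probs(theta[i], a, b)
    X[i] = [rng.choice(n_cats, p=p) for p in probs]

sum_score = X.sum(axis=1)                                    # summated scores

# EAP trait estimates on a quadrature grid with a standard normal prior.
grid = np.linspace(-4, 4, 81)
prior = np.exp(-grid**2 / 2)
log_lik = np.zeros((n_persons, len(grid)))
for g, t in enumerate(grid):
    probs = grm_probs(t, a, b)
    log_lik[:, g] = np.log(probs[np.arange(n_items), X]).sum(axis=1)
log_lik -= log_lik.max(axis=1, keepdims=True)                # numerical stability
post = np.exp(log_lik) * prior
eap = (post * grid).sum(axis=1) / post.sum(axis=1)

# A simulated external outcome that depends linearly on the true trait.
y = 0.6 * theta + rng.normal(scale=0.8, size=n_persons)

print("corr(summated score, outcome):", round(np.corrcoef(sum_score, y)[0, 1], 3))
print("corr(EAP theta, outcome):     ", round(np.corrcoef(eap, y)[0, 1], 3))

Under setups like this one the two correlations are typically very close, which is the pattern the study reports; the comparison of course depends on the simulation choices.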


2009 ◽ Vol 4 (4) ◽ pp. 351-383 ◽ Author(s): Dale Ballou

Conventional value-added assessment requires that achievement be reported on an interval scale. While many metrics do not have this property, application of item response theory (IRT) is said to produce interval scales. However, it is difficult to confirm that the requisite conditions are met. Even when they are, the properties of the data that make a test IRT-scalable may not be the properties we seek to represent in an achievement scale, as shown by the lack of surface plausibility of many scales resulting from the application of IRT. An alternative, ordinal data analysis, is presented, and value-added estimates are shown to be sensitive to the choice of ordinal methods over conventional techniques. Value-added practitioners should ask themselves whether they are confident enough in the metric properties of conventional scales to attribute such differences to the superiority of the conventional estimates.
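A toy example of the underlying scale-sensitivity problem (hypothetical numbers, not taken from the article): with students matched on pre-test score, a mean-gain comparison between two classrooms can reverse under a monotone rescaling of the score metric, while an ordinal comparison of matched post-test scores is unchanged.

# Hypothetical data: three students per classroom, matched on pre-test score.
import numpy as np

pre    = np.array([10.0, 30.0, 50.0])   # matched pre-test scores
post_A = np.array([24.0, 44.0, 58.0])   # classroom A post-test
post_B = np.array([18.0, 40.0, 70.0])   # classroom B post-test

def report(f, label):
    """Compare classrooms after applying the monotone transformation f."""
    gain_A = (f(post_A) - f(pre)).mean()
    gain_B = (f(post_B) - f(pre)).mean()
    wins_A = int((f(post_A) > f(post_B)).sum())   # ordinal matched-pair winners
    print(f"{label}: mean gain A={gain_A:.2f}, B={gain_B:.2f}; "
          f"A outranks B in {wins_A} of {len(pre)} matched pairs")

report(lambda x: x, "raw metric    ")   # B shows the larger mean gain
report(np.sqrt,     "sqrt rescaling")   # mean-gain ordering flips to A

The matched-pair count is invariant under any strictly increasing transformation of the scale, which is the sense in which ordinal methods avoid the intervalness assumption that mean gains require.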


2018 ◽ Vol 20 (6) ◽ pp. 519-521 ◽ Author(s): Mikhail Saltychev, Cherian K. Kandathil, Mohamed Abdelwahab, Emily A. Spataro, Sami P. Moubayed, ...

2020 ◽ Vol 63 (6) ◽ pp. 1916-1932 ◽ Author(s): Haiying Yuan, Christine Dollaghan

Purpose: No diagnostic tools exist for identifying social (pragmatic) communication disorder (SPCD), a new Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition category for individuals with social communication deficits but not the repetitive, restricted behaviors and interests (RRBIs) that would qualify them for a diagnosis of autism spectrum disorder (ASD). We explored the value of items from a widely used screening measure of ASD for distinguishing SPCD from typical controls (TC; Aim 1) and from ASD (Aim 2).
Method: We applied item response theory (IRT) modeling to Social Communication Questionnaire–Lifetime (Rutter, Bailey, & Lord, 2003) records available in the National Database for Autism Research. We defined records from putative SPCD (n = 54), ASD (n = 278), and TC (n = 274) groups retrospectively, based on National Database for Autism Research classifications and Autism Diagnostic Interview–Revised responses. After assessing model assumptions, estimating model parameters, and measuring model fit, we identified items in the social communication and RRBI domains that were maximally informative in differentiating the groups.
Results: IRT modeling identified a set of seven social communication items that distinguished SPCD from TC with sensitivity and specificity > 80%. A set of five RRBI items was less successful in distinguishing SPCD from ASD (sensitivity and specificity < 70%).
Conclusion: The IRT modeling approach and the Social Communication Questionnaire–Lifetime item sets it identified may be useful in efforts to construct screening and diagnostic measures for SPCD.
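The sketch below illustrates the general kind of computation involved, using entirely hypothetical two-parameter logistic (2PL) item parameters and cut score rather than the SCQ-Lifetime items or the study's estimates: candidate items are ranked by Fisher information near a target trait level, and sensitivity and specificity are computed for a simple cut on the resulting short-form sum score.

# Hedged sketch with hypothetical 2PL parameters and cut score (not the SCQ
# items or the study's estimates): select maximally informative items, then
# compute sensitivity/specificity of a cut score on the short-form sum score.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2PL parameters for 10 candidate binary items.
a = rng.uniform(0.5, 2.5, 10)     # discrimination
b = rng.normal(0.0, 1.0, 10)      # difficulty

def info_2pl(theta, a, b):
    """Fisher information of 2PL items at trait level theta."""
    p = 1 / (1 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

# Keep the five items most informative near the trait level assumed to
# separate the groups (theta = 0 here, purely for illustration).
keep = np.argsort(info_2pl(0.0, a, b))[::-1][:5]

def simulate(theta):
    """Short-form sum scores for a sample with the given trait levels."""
    p = 1 / (1 + np.exp(-a[keep] * (theta[:, None] - b[keep])))
    return (rng.random(p.shape) < p).sum(axis=1)

scores_case    = simulate(rng.normal(1.0, 1.0, 500))    # cases: higher trait
scores_control = simulate(rng.normal(-1.0, 1.0, 500))   # controls: lower trait

cut = 3                                                 # hypothetical cut score
sensitivity = (scores_case >= cut).mean()
specificity = (scores_control < cut).mean()
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")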

