Development and Validation of a Cognitive Diagnostic Assessment with Ordered Multiple-Choice Items for Addition of Time

Author(s): Huan Chin, Cheng Meng Chew, Hooi Lian Lim, Lei Mee Thien

2006, Vol. 11 (1), pp. 33-63
Author(s): Derek Briggs, Alicia Alonzo, Cheryl Schwab, Mark Wilson

2021, pp. 026553222199547
Author(s): Shangchao Min, Lianzhen He

In this study, we present the development of individualized feedback for a large-scale listening assessment by combining standard setting and cognitive diagnostic assessment (CDA) approaches. We used performance data from 3358 students' item-level responses to a field test of a national EFL test primarily intended for tertiary-level EFL learners. The results showed that the proficiency classifications and subskill mastery classifications were generally of acceptable reliability, and that the two kinds of classifications aligned with each other at both the individual and group levels. The outcome of the study is a set of descriptors characterizing each test taker's ability to understand oral texts at a given level, together with his or her cognitive performance. By illustrating the feasibility of combining standard setting and CDA approaches to produce individualized feedback, the study contributes to improved score reporting and addresses the long-standing criticism that large-scale language assessments fail to provide individualized feedback linking assessment with instruction.
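
The subskill mastery classifications referred to in this abstract are typically produced by a cognitive diagnostic model. The sketch below is a minimal, hypothetical illustration of that idea using a DINA-style classifier: the Q-matrix, slip and guess values, and response vector are invented for the example, and in an operational CDA the item parameters would be estimated from the data rather than fixed.

# Minimal sketch of DINA-style subskill mastery classification (illustrative only;
# the Q-matrix, slip/guess values, and responses below are hypothetical).
from itertools import product
import numpy as np

# Q-matrix: rows = items, columns = subskills (1 = item requires that subskill)
Q = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
])
slip = np.full(Q.shape[0], 0.10)    # P(incorrect | all required subskills mastered)
guess = np.full(Q.shape[0], 0.20)   # P(correct | some required subskill not mastered)

def classify(responses):
    """Return the most likely subskill mastery profile for one response vector."""
    best_profile, best_loglik = None, -np.inf
    for profile in product([0, 1], repeat=Q.shape[1]):
        alpha = np.array(profile)
        # eta[j] = 1 iff the profile masters every subskill that item j requires
        eta = np.all(alpha >= Q, axis=1).astype(float)
        p_correct = (1 - slip) ** eta * guess ** (1 - eta)
        loglik = np.sum(responses * np.log(p_correct)
                        + (1 - responses) * np.log(1 - p_correct))
        if loglik > best_loglik:
            best_profile, best_loglik = profile, loglik
    return best_profile

# Example: an examinee who answers items 1, 2, and 4 correctly
x = np.array([1, 1, 0, 1, 0, 0])
print(classify(x))   # (1, 1, 0): subskills 1 and 2 classified as mastered

A mastery profile such as (1, 1, 0) is the kind of result that individualized feedback reports translate into descriptors of what a test taker can and cannot yet do.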


Psychometrika, 2021
Author(s): Qian Wu, Monique Vanerum, Anouk Agten, Andrés Christiansen, Frank Vandenabeele, ...

1987, Vol. 47 (2), pp. 513-522
Author(s): Steven V. Owen, Robin D. Froman

1990, Vol. 1990 (1), pp. i-29
Author(s): Randy Elliot Bennett, Donald A. Rock, Minhwei Wang

2021, pp. 001316442098810
Author(s): Stefanie A. Wind, Yuan Ge

Practical constraints in rater-mediated assessments limit the availability of complete data. Instead, most scoring procedures collect one or two ratings for each performance and rely on overlapping performances across raters, or on linking sets of multiple-choice items, to facilitate model estimation. These incomplete scoring designs present challenges for detecting rater biases, or differential rater functioning (DRF). The purpose of this study is to illustrate and explore the sensitivity of DRF indices in realistic sparse rating designs documented in the literature, which differ in the type and level of connectivity among raters and students. The results indicated that it is possible to detect DRF in sparse rating designs, but that the sensitivity of DRF indices varies across designs. We consider the implications of our findings for practice related to monitoring raters in performance assessments.
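
One design feature this abstract highlights is connectivity among raters and students, since a rater whose ratings are not linked to the rest of the design cannot be compared with the other raters. The sketch below is a hypothetical illustration of checking that property: the rating design is invented, and the check is a generic bipartite-graph connectivity test, not any DRF index used in the study.

# Minimal sketch: check whether a sparse rating design is connected
# (hypothetical design; not data or methods from the study).
from collections import defaultdict, deque

# Each tuple is (rater, student): the rater scored that student's performance.
ratings = [
    ("R1", "S1"), ("R1", "S2"),
    ("R2", "S2"), ("R2", "S3"),
    ("R3", "S4"), ("R3", "S5"),   # R3 shares no students with R1 or R2
]

def connected_components(pairs):
    """Group raters and students into linked subsets of the bipartite rating graph."""
    graph = defaultdict(set)
    for rater, student in pairs:
        graph[("rater", rater)].add(("student", student))
        graph[("student", student)].add(("rater", rater))
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        queue, component = deque([node]), set()
        while queue:
            current = queue.popleft()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            queue.extend(graph[current])
        components.append(component)
    return components

parts = connected_components(ratings)
print(len(parts))   # 2: R3's ratings are not linked to the rest of the design

In a fully linked design the function returns a single component; the more fragmented the design, the harder it is to separate rater effects, including any DRF, from differences among the students each rater happened to score.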

