Interjudge Reliability and Ratings Variability in Music Performance Assessment

Author(s):  
Daniel Massoth

When technology is used for assessment in music, certain considerations can affect the validity, reliability, and depth of analysis. This chapter explores factors present in the three phases of the assessment process: recognition, analysis, and display of a musical performance assessment. Each phase has inherent challenges embedded in internal and external factors. The goal here is not to provide an exhaustive analysis of every aspect of assessment but rather to present the rationale for and history of using technology in music assessment and to examine the attendant philosophical and practical considerations. A discussion of possible future directions for product research and development concludes the chapter.


2009 · Vol. 57 (1) · pp. 5-15
Author(s):  
Charles R. Ciorba
Neal Y. Smith

Recent policy initiatives instituted by major accrediting bodies require the implementation of specific assessment tools to provide evidence of student achievement in a number of areas, including applied music study. The purpose of this research was to investigate the effectiveness of a multidimensional assessment rubric, which was administered to all students performing instrumental and vocal juries at a private Midwestern university during one semester (N = 359). Interjudge reliability coefficients indicated a moderate to high level of agreement among judges. Results also revealed that performance achievement was positively related to participants' year in school (freshman, sophomore, junior, and senior), which indicates that a multidimensional assessment rubric can effectively measure students' achievement in solo music performance.


2003 · Vol. 51 (2) · pp. 137-150
Author(s):  
Martin J. Bergee

Assessment of music performance in authentic contexts remains an underinvestigated area of research. This study examined one such context: the interjudge reliability of faculty evaluation of end-of-semester applied music performances. Brass (n = 4), percussion (n = 2), woodwind (n = 5), voice (n = 5), piano (n = 3), and string (n = 5) instructors evaluating a recent semester's applied music juries at a large university participated in the study. Each evaluator completed a criterion-specific rating scale for each performer and assigned each performance a global letter grade not shared with other evaluators or with the performer. Interjudge reliability was determined for each group's rating scale total scores, subscale scores, and letter-grade assessment. All possible panels of two, three, and four evaluators were also examined for interjudge reliability, and averaged correlations, standard deviations, and ranges were determined. Full-panel interjudge reliability was consistently good regardless of panel size. All total score reliability coefficients were statistically significant, as were all coefficients for the global letter-grade assessment. All subscale reliabilities for all groups were statistically significant, except in percussion (which, with an n of 2, had a stringent significance criterion) and for the Suitability subscale in voice. For larger panels (ns of 4 and 5), rating scale total score reliability was consistently, though not greatly, higher than reliability for the letter-grade assessment. Average reliability did not decrease as panel size incrementally decreased; panels of two and three evaluators, however, tended on average to exhibit more variability, greater range, and less uniformity than panels of four and five. No differences in reliability were noted among levels of experience or between teaching assistants and faculty members. Use of a minimum of five adjudicators for performance evaluation in this context was recommended.
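
To make the panel-size analysis concrete, here is a minimal sketch, in Python with invented ratings data, of the kind of computation the abstract describes: enumerate every judge panel of a given size, estimate interjudge reliability for each panel, and summarize the mean, standard deviation, and range across panels. The ratings matrix, noise levels, and the use of average pairwise Pearson correlation as the reliability index are illustrative assumptions, not the study's actual data or method.

```python
# Illustrative sketch only: invented data, with average pairwise Pearson
# correlation standing in for the study's reliability index.
import itertools
import numpy as np

def mean_pairwise_r(panel_ratings: np.ndarray) -> float:
    """Average pairwise Pearson correlation among judges (columns)."""
    corr = np.corrcoef(panel_ratings, rowvar=False)
    return corr[np.triu_indices_from(corr, k=1)].mean()

def panel_reliabilities(ratings: np.ndarray, panel_size: int) -> np.ndarray:
    """Reliability estimate for every possible judge panel of a given size."""
    n_judges = ratings.shape[1]
    return np.array([
        mean_pairwise_r(ratings[:, list(panel)])
        for panel in itertools.combinations(range(n_judges), panel_size)
    ])

# Hypothetical jury: 30 performers rated by 5 judges, with each score built
# from a shared "true quality" signal plus judge-specific noise.
rng = np.random.default_rng(0)
true_quality = rng.normal(80.0, 8.0, size=30)
ratings = true_quality[:, None] + rng.normal(0.0, 6.0, size=(30, 5))

for size in (2, 3, 4):
    r = panel_reliabilities(ratings, size)
    print(f"panels of {size}: mean r = {r.mean():.2f}, "
          f"sd = {r.std(ddof=1):.2f}, range = [{r.min():.2f}, {r.max():.2f}]")
```

Run on data like this, smaller panels show wider spread across panels even when the mean changes little, which is the pattern the abstract reports for panels of two and three evaluators.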


2007 · Vol. 55 (4) · pp. 344-358
Author(s):  
Martin J. Bergee

This study examined performer, rater, occasion, and sequence as sources of variability in music performance assessment. Generalizability theory served as the study's basis. Performers were 8 high school wind instrumentalists who had recently performed a solo. The author audio-recorded performers playing excerpts from their solo three times, establishing an occasion variable. To establish a rater variable, 10 certified adjudicators were asked to rate the performances from 0 (poor) to 100 (excellent). Raters were randomly assigned to one of five performance sequences, thus nesting raters within a sequence variable. Two G (generalizability) studies established that occasion and sequence produced virtually no measurement error. Raters were a strong source of error. D (decision) studies established the one-rater, one-occasion scenario as unreliable. In scenarios using the generalizability coefficient as a criterion, 5 hypothetical raters were necessary to reach the .80 benchmark. Using the dependability index, 17 hypothetical raters were necessary to reach .80.
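
The arithmetic behind those D-study projections is compact enough to sketch. The following shows, for a performer-by-rater design, how the generalizability coefficient (relative decisions) and the dependability index (absolute decisions) grow with the number of hypothetical raters. The variance components below are invented, chosen only so the .80 crossings land near the abstract's reported 5 and 17 raters; they are not the study's estimates.

```python
# Illustrative D-study arithmetic for a performer-x-rater design.
# All variance components are assumed values, tuned so the .80 thresholds
# fall near the rater counts reported in the abstract.
VAR_PERFORMER = 100.0  # universe-score (performer) variance -- assumed
VAR_RATER = 300.0      # rater main effect (leniency/severity) -- assumed
VAR_RESIDUAL = 120.0   # performer-by-rater interaction + residual -- assumed

def g_coefficient(n_raters: int) -> float:
    """E-rho^2: relative error ignores the rater main effect."""
    relative_error = VAR_RESIDUAL / n_raters
    return VAR_PERFORMER / (VAR_PERFORMER + relative_error)

def dependability_index(n_raters: int) -> float:
    """Phi: absolute error also charges the rater main effect."""
    absolute_error = (VAR_RATER + VAR_RESIDUAL) / n_raters
    return VAR_PERFORMER / (VAR_PERFORMER + absolute_error)

for n in (1, 5, 17):
    print(f"{n:2d} raters: E-rho^2 = {g_coefficient(n):.3f}, "
          f"Phi = {dependability_index(n):.3f}")
# With these assumed components, E-rho^2 first reaches .80 at 5 raters
# and Phi first reaches .80 at 17 raters.
```

The gap between the two criteria follows from where rater leniency or severity is counted: it inflates absolute error but leaves relative error untouched, which is why far more raters are needed to reach .80 on the dependability index than on the generalizability coefficient.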


2014 · Vol. 101 (1) · pp. 70-76
Author(s):  
Christopher DeLuca
Benjamin Bolden
