Exploring decision consistency and decision accuracy across rating designs in rater-mediated music performance assessments
Music performance assessments frequently involve panels of raters who evaluate the quality of musical performances using rating scales. For practical reasons, it is often not possible to obtain ratings from every rater on every performance (i.e., a complete rating design). When raters differ in severity and not all raters rate all performances, ratings of musical performances and the resulting classification decisions (e.g., pass or fail) depend on the “luck of the rater draw.” In this study, we explored the implications of different types of incomplete rating designs for the classification of musical performances in rater-mediated music performance assessments. We present a procedure that researchers and practitioners can use to adjust student scores for differences in rater severity when incomplete rating designs are used, and we consider the effects of this adjustment procedure across different types of rating designs. Our results suggest that differences in rater severity have large practical consequences for ratings of musical performances, and that these consequences affect individual students and groups of students differently. Furthermore, our findings suggest that it is possible to adjust musical performance ratings for differences in rater severity as long as there are common raters across scoring panels. We discuss the implications of our findings for music assessment research and practice.
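The abstract does not detail the adjustment procedure itself. As a minimal illustrative sketch only (not the authors' method, which would typically rest on a measurement model such as a many-facet Rasch model), the core idea of using common raters to link scoring panels can be expressed with a simple additive model, rating ≈ performance quality + rater leniency, fit by alternating averages. All identifiers below (adjust_for_severity, the rater and performance labels) are hypothetical, and the sketch assumes the panels are connected by at least one shared rater.

```python
# A minimal sketch of severity adjustment via common raters,
# assuming the additive model: rating = quality[p] + leniency[r] + error.
# The model is identified by centering rater leniencies at zero; a
# connected design (common raters across panels) is required.

from collections import defaultdict

def adjust_for_severity(ratings, n_iter=100):
    """ratings: list of (performance_id, rater_id, score) tuples."""
    quality = defaultdict(float)   # estimated performance quality
    leniency = defaultdict(float)  # positive = lenient, negative = severe
    for _ in range(n_iter):
        # Re-estimate each performance's quality from leniency-adjusted scores.
        by_perf = defaultdict(list)
        for p, r, s in ratings:
            by_perf[p].append(s - leniency[r])
        for p, vals in by_perf.items():
            quality[p] = sum(vals) / len(vals)
        # Re-estimate each rater's leniency from the residuals.
        by_rater = defaultdict(list)
        for p, r, s in ratings:
            by_rater[r].append(s - quality[p])
        for r, vals in by_rater.items():
            leniency[r] = sum(vals) / len(vals)
        # Center leniencies so the solution is identified.
        mean_len = sum(leniency.values()) / len(leniency)
        for r in leniency:
            leniency[r] -= mean_len
    return dict(quality), dict(leniency)

# Hypothetical data: two panels linked by common rater "B".
ratings = [
    ("p1", "A", 3), ("p1", "B", 4),   # panel 1 rates p1, p2
    ("p2", "A", 2), ("p2", "B", 3),
    ("p3", "B", 5), ("p3", "C", 3),   # panel 2 rates p3, p4
    ("p4", "B", 4), ("p4", "C", 2),
]
quality, leniency = adjust_for_severity(ratings)
# Severe raters (here "C") receive negative leniency, so the adjusted
# quality estimates for the performances they rated rise accordingly.
```

Because the two panels share rater B, the procedure can separate rater effects from performance quality; without that overlap, severity differences between panels would be confounded with the performances assigned to each panel, which is why the study conditions the adjustment on common raters across scoring panels.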