Perceptual evaluation of pathological voice quality: A comparative analysis between the RASATI and GRBASI scales

Purpose The purpose of this study was to develop a program to concatenate acoustic vowel segments that were selected with the moving window technique, a previously developed technique used to segment and select the least perturbed segment from a sustained vowel segment. The concatenated acoustic segments were compared with the nonconcatenated, short, individual acoustic segments for their ability to differentiate normal and pathological voices. The concatenation process sometimes created a clicking noise or beat, which was also analyzed to determine any confounding effects. Method A program was developed to concatenate the moving window segments. Listeners with no previous rating experience were trained and then rated 20 normal and 20 pathological voice segments, both concatenated (2 s) and short (0.2 s), for a total of 80 segments. Listeners evaluated these segments on both the Grade, Roughness, Breathiness, Asthenia, and Strain scale (GRBAS; 8 listeners) and the Consensus Auditory-Perceptual Evaluation of Voice (Kempster, Gerratt, Abbott, Barkmeier-Kraemer, & Hillman, 2009) scale (7 listeners). The sensitivity and specificity of these ratings were analyzed using a receiver-operating characteristic curve. To evaluate whether the beat increased ratings on particular criteria, differences between beat and nonbeat ratings were compared using a 2-tailed analysis of variance. Results Concatenated segments had a higher sensitivity and specificity for distinguishing pathological and normal voices than short segments. Compared with nonbeat segments, the beat had statistically similar increases for all criteria across Consensus Auditory-Perceptual Evaluation of Voice and GRBAS scales, except pitch and loudness. Conclusions The concatenated moving window method showed improved sensitivity and specificity for detecting voice disorders using auditory-perceptual analysis, compared with the short moving window segment. It is a helpful tool for perceptual analytic protocols, allowing for voice evaluation using standardized and automated voice-segmenting procedures. Supplemental Material https://doi.org/10.23641/asha.7100951
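
The receiver-operating characteristic analysis mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the threshold rule (a voice is called pathological when its rating is at or above a cutoff), and the ratings below are all hypothetical, chosen only to show how sensitivity, specificity, and the area under the curve relate.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call a voice pathological when its rating is >= threshold.

    labels: 1 = pathological, 0 = normal (both classes must be present).
    Returns (sensitivity, specificity).
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per cutoff."""
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, t)
        points.append((1.0 - spec, sens))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical severity ratings (0 = normal ... 3 = severe), not study data.
ratings = [0, 1, 0, 1, 2, 3, 1, 2]
is_pathological = [0, 0, 0, 0, 1, 1, 1, 1]

print(sensitivity_specificity(ratings, is_pathological, threshold=2))
print(auc(roc_points(ratings, is_pathological)))
```

A larger area under the curve means the ratings separate the two groups better across all cutoffs, which is the sense in which one segment type can have "higher sensitivity and specificity" than another.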

Cultural and Linguistic Adaptation of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) Into Hindi

Journal of Speech Language and Hearing Research ◽

10.1044/2020_jslhr-20-00348 ◽

2020 ◽

Vol 63 (12) ◽

pp. 3974-3981

Author(s):

Ashwini Joshi ◽

Isha Baheti ◽

Vrushali Angadi

Keyword(s):

Strong Correlation ◽

Concurrent Validity ◽

Interrater Reliability ◽

Voice Quality ◽

Weak Correlation ◽

Voice Assessment ◽

Perceptual Evaluation ◽

Severity Grade ◽

Normal Voice ◽

Group A

Aim The purpose of this study was to develop and assess the reliability of a Hindi version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). Reliability was assessed by comparing Hindi CAPE-V ratings with English CAPE-V ratings and with ratings on the Grade, Roughness, Breathiness, Asthenia and Strain (GRBAS) scale. Method Hindi sentences were created to match the phonemic load of the corresponding English CAPE-V sentences. The Hindi sentences were adapted for linguistic content. The original English and adapted Hindi CAPE-V and GRBAS were completed for 33 bilingual individuals with normal voice quality. Additionally, the Hindi CAPE-V and GRBAS were completed for 13 Hindi speakers with disordered voice quality. The agreement of CAPE-V ratings was assessed between language versions, GRBAS ratings, and two rater pairs (three raters in total). Pearson product–moment correlation was computed for all comparisons. Results A strong correlation (r > .8, p < .01) was found between the Hindi CAPE-V scores and the English CAPE-V scores for most variables in normal voice participants. A weak correlation was found for the variable of strain (r < .2, p = .400) in the normative group. A strong correlation (r > .6, p < .01) was found between the overall severity/grade, roughness, and breathiness scores in the GRBAS scale and the CAPE-V scale in normal and disordered voice samples. Significant interrater reliability (r > .75) was present in overall severity and breathiness. Conclusions The Hindi version of the CAPE-V demonstrates good interrater reliability and concurrent validity with the English CAPE-V and the GRBAS. The Hindi CAPE-V can be used for the auditory-perceptual voice assessment of Hindi speakers.
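
As a rough illustration of the Pearson product–moment correlation used for the comparisons above, the sketch below computes r for two lists of paired scores. The function name and the scores are hypothetical and are not taken from the study; the sketch also omits the p value that the abstract reports alongside each r.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical overall-severity scores for the same five speakers,
# rated once with each language version (illustrative values only).
hindi_scores = [10, 20, 30, 40, 50]
english_scores = [12, 18, 33, 39, 52]

print(round(pearson_r(hindi_scores, english_scores), 3))
```

Values of r near 1 indicate that the two versions rank and space the speakers almost identically, which is how the abstract's thresholds (r > .8, r > .6) should be read.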

Automatic Estimation of Pathological Voice Quality Based on Recurrent Neural Network Using Amplitude and Phase Spectrogram

10.21437/interspeech.2020-3228 ◽

2020 ◽

Author(s):

Shunsuke Hidaka ◽

Yogaku Lee ◽

Kohei Wakamiya ◽

Takashi Nakagawa ◽

Tokihiko Kaburagi

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Voice Quality ◽

Automatic Estimation ◽

Pathological Voice

Measuring Vocal Fatigue in Sports Coaches

Journal of Clinical Speech and Language Studies ◽

10.3233/acs-2017-23104 ◽

2017 ◽

Vol 23 (1) ◽

pp. 1-20

Author(s):

Kathy Connaughton ◽

Irena Yanushevskaya

Keyword(s):

Acoustic Analysis ◽

Voice Quality ◽

Professional Sports ◽

Muscle Adaptation ◽

Voice Change ◽

Acoustic Measures ◽

Perceptual Evaluation ◽

Coaching Session ◽

Sports Coaches ◽

Voice Use

Objective: This study explores the immediate impact of prolonged voice use by professional sports coaches. Method: Speech samples including sustained phonation of vowel /a/ and a short read passage were collected from two professional sports coaches. The audio recordings were made within an hour before and after a coaching session, over three sessions. Perceptual evaluation of voice quality was done using the GRBAS scale. The speech samples were subsequently analyzed using Praat. The acoustic measures included fundamental frequency (f0), jitter, shimmer, harmonics-to-noise ratio, and cepstral peak prominence. Main results: The results of perceptual and acoustic analysis suggest a slight shift towards a tenser phonation post-coaching session, which is a likely consequence of laryngeal muscle adaptation to prolonged voice use. This tendency was similar in sustained vowels and connected speech. Conclusion: Acoustic measures used in this study can be useful to capture the voice change post-coaching session. It is desirable, however, that tools for voice assessment and monitoring that are more sophisticated and robust, yet intuitive and easy to use, be made available to clinicians and professional voice users.
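
Two of the acoustic measures listed above, jitter and shimmer, share one simple definition in their "local" form: the mean absolute difference between consecutive cycles divided by the mean value, applied to cycle periods for jitter and to cycle peak amplitudes for shimmer. The sketch below illustrates only that arithmetic. It is not Praat's implementation, it assumes the per-cycle periods and amplitudes have already been extracted, and the function name and values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value.

    Applied to glottal cycle periods this gives local jitter; applied to
    cycle peak amplitudes it gives local shimmer. Returns a ratio
    (multiply by 100 for a percentage).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_abs_diff / mean_value


# Hypothetical per-cycle measurements from a sustained /a/ (not study data).
periods_ms = [5.0, 5.1, 4.9, 5.0]        # cycle durations in milliseconds
amplitudes = [0.80, 0.82, 0.79, 0.81]    # cycle peak amplitudes, arbitrary units

jitter_local = local_perturbation(periods_ms)
shimmer_local = local_perturbation(amplitudes)
print(f"jitter: {jitter_local:.2%}, shimmer: {shimmer_local:.2%}")
```

A perfectly periodic signal gives 0 for both measures, so higher values indicate more cycle-to-cycle irregularity in timing or amplitude.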

A Randomized Controlled Trial of Two Semi-Occluded Vocal Tract Voice Therapy Protocols

Journal of Speech Language and Hearing Research ◽

10.1044/2015_jslhr-s-13-0231 ◽

2015 ◽

Vol 58 (3) ◽

pp. 535-549 ◽

Cited By ~ 55

Author(s):

Mara R. Kapsner-Smith ◽

Eric J. Hunter ◽

Kimberly Kirkham ◽

Karin Cox ◽

Ingo R. Titze

Keyword(s):

Quality Of Life ◽

Vocal Tract ◽

Controlled Trial ◽

Voice Quality ◽

Voice Handicap Index ◽

Voice Therapy ◽

Control Group ◽

Perceptual Evaluation ◽

Therapy Program

Purpose Although there is a long history of use of semi-occluded vocal tract gestures in voice therapy, including phonation through thin tubes or straws, the efficacy of phonation through tubes has not been established. This study compares results from a therapy program on the basis of phonation through a flow-resistant tube (FRT) with Vocal Function Exercises (VFE), an established set of exercises that utilize oral semi-occlusions. Method Twenty subjects (16 women, 4 men) with dysphonia and/or vocal fatigue were randomly assigned to 1 of 4 treatment conditions: (a) immediate FRT therapy, (b) immediate VFE therapy, (c) delayed FRT therapy, or (d) delayed VFE therapy. Subjects receiving delayed therapy served as a no-treatment control group. Results Voice Handicap Index (Jacobson et al., 1997) scores showed significant improvement for both treatment groups relative to the no-treatment group. Comparison of the effect sizes suggests FRT therapy is noninferior to VFE in terms of reduction in Voice Handicap Index scores. Significant reductions in Roughness on the Consensus Auditory-Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were found for the FRT subjects, with no other significant voice quality findings. Conclusions VFE and FRT therapy may improve voice quality of life in some individuals with dysphonia. FRT therapy was noninferior to VFE in improving voice quality of life in this study.

Perceptual Evaluation of Voice Disorder in Children Who Have Had Laryngotracheal Reconstruction Surgery and the Relationship Between Clinician Perceptual Rating of Voice Quality and Parent Proxy/Child Self-Report of Voice-Related Quality of Life

Journal of Voice ◽

10.1016/j.jvoice.2018.07.009 ◽

2019 ◽

Vol 33 (6) ◽

pp. 945.e27-945.e35 ◽

Cited By ~ 1

Author(s):

Wendy Cohen ◽

Susan Lloyd ◽

David M. Wynne ◽

Richard B Townsley

Keyword(s):

Voice Quality ◽

Self Report ◽

Voice Disorder ◽

Reconstruction Surgery ◽

Perceptual Evaluation ◽

Laryngotracheal Reconstruction ◽

Related Quality ◽

The Relationship ◽

Parent Proxy

Vocal fold injury following endotracheal intubation

The Journal of Laryngology & Otology ◽

10.1258/002221505774481192 ◽

2005 ◽

Vol 119 (10) ◽

pp. 825-827 ◽

Cited By ~ 15

Author(s):

Satoshi Kitahara ◽

Yukihiro Masuda ◽

Yoko Kitagawa

Keyword(s):

Endotracheal Intubation ◽

Fibrin Glue ◽

Vocal Fold ◽

Case Reports ◽

Voice Quality ◽

Fibrous Tissue ◽

Normal Result ◽

Perceptual Evaluation ◽

Vibratory Pattern

Vocal fold scarring results in the formation of fibrous tissue which disturbs the vibratory pattern of the fold during phonation. However, vocal fold scarring in humans is poorly understood because of the lack of clear case reports focusing on voice quality. The authors present a case of vocal fold scarring with changes in voice quality. At the time of injury the pedicle mucosa was cemented with fibrin glue. Phonation was inhibited for two weeks and tranilast (300 mg/day) was given for 3 months. Sixty-nine days later, perceptual evaluation showed a normal result and the phonation time became better, but the mucosal vibration was still lacking. Ninety-seven days later, mucosal vibration was finally restored. We suggest that characterization of vocal fold scarring in humans may be different from that in animals, and recommend that surgical management should be avoided for at least three months after injury.

Age Differences in Voice Evaluation: From Auditory-Perceptual Evaluation to Social Interactions

Journal of Speech Language and Hearing Research ◽

10.1044/2017_jslhr-s-16-0202 ◽

2018 ◽

Vol 61 (2) ◽

pp. 227-245 ◽

Cited By ~ 3

Author(s):

Catherine L. Lortie ◽

Isabelle Deschamps ◽

Matthieu J. Guitton ◽

Pascale Tremblay

Keyword(s):

Social Interactions ◽

Age Differences ◽

Smoking Status ◽

Voice Quality ◽

Cross Sectional Study ◽

Cross Sectional ◽

Younger Adults ◽

Perceptual Evaluation ◽

Younger Age ◽

The Voice

Purpose The factors that influence the evaluation of voice in adulthood, as well as the consequences of such evaluation on social interactions, are not well understood. Here, we examined the effect of listeners' age and the effect of talker age, sex, and smoking status on the auditory-perceptual evaluation of voice, voice-related psychosocial attributions, and perceived speech tempo. We also examined the voice dimensions affecting the propensity to engage in social interactions. Method Twenty-five younger (age 19–37 years) and 25 older (age 51–74 years) healthy adults participated in this cross-sectional study. Their task was to evaluate the voice of 80 talkers. Results Statistical analyses revealed limited effects of the age of the listener on voice evaluation. Specifically, older listeners provided relatively more favorable voice ratings than younger listeners, mainly in terms of roughness. In contrast, the age of the talker had a broader impact on voice evaluation, affecting auditory-perceptual evaluations, psychosocial attributions, and perceived speech tempo. Some of these talker differences were dependent upon the sex of the talker and his or her smoking status. Finally, the results also show that voice-related psychosocial attribution was more strongly associated with the propensity of the listener to engage in social interactions with a person than auditory-perceptual dimensions and perceived speech tempo, especially for the younger adults. Conclusions These results suggest that age has a broad influence on voice evaluation, with a stronger impact for talker age compared with listener age. While voice-related psychosocial attributions may be an important determinant of social interactions, perceived voice quality and speech tempo appear to be less influential. Supplemental Materials https://doi.org/10.23641/asha.5844102

Intercultural differences in evaluation of pathological voice quality: perceptual and acoustical comparisons between RASATI and GRBASI scales

10.21437/interspeech.2009-741 ◽

2009 ◽

Author(s):

Emi Juliana Yamauchi ◽

Satoshi Imaizumi ◽

Hagino Maruyama ◽

Tomoyuki Haji

Keyword(s):

Voice Quality ◽

Intercultural Differences ◽

Pathological Voice

The Effects of Endoscopic Sinus Surgery on Voice Characteristics in Chronic Rhinosinusitis Patients

Annals of Otology Rhinology & Laryngology ◽

10.1177/0003489419861124 ◽

2019 ◽

Vol 128 (12) ◽

pp. 1129-1133

Author(s):

Danny B. Jandali ◽

Ashwin Ganti ◽

Inna A. Husain ◽

Pete S. Batra ◽

Bobby A. Tajudeen

Keyword(s):

Quality Of Life ◽

Chronic Rhinosinusitis ◽

Endoscopic Sinus Surgery ◽

Tertiary Care ◽

Voice Quality ◽

Upper Airway ◽

Sinus Surgery ◽

Perceptual Evaluation ◽

SNOT-22

Objectives: Functional endoscopic sinus surgery (FESS) is a standard treatment modality for patients with chronic rhinosinusitis (CRS) who have failed appropriate medical therapy. However, FESS entails modification of the upper airway tract that may alter phonatory resonance and produce voice changes. The effects of FESS on postoperative voice characteristics in patients with CRS have yet to be quantitatively assessed. Methods: Patients with severe CRS who underwent FESS at a tertiary care referral center between May and October 2017 were prospectively enrolled. The Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) and the Voice Handicap Index (VHI) were used to quantitatively evaluate voice characteristics and quality of life, respectively. Preoperative CAPE-V and VHI scores were compared with postoperative scores for each patient. Sino-Nasal Outcome Test (SNOT-22) scores were also obtained to assess changes in patient symptoms. Results: 18 CRS patients undergoing FESS were enrolled. The average preoperative Lund-Mackay score was 14, indicating baseline severe CRS. Postoperative assessments demonstrated a statistically significant decrease in CAPE-V (from 45 to 27, p = .005) and VHI (from 10 to 4.7, p < .001) scores. These correlated with a statistically significant decrease in SNOT-22 scores (from 42 to 13, p < .001). Conclusions: Patients with CRS experience a significant improvement in voice characteristics and vocal quality of life following FESS. Furthermore, this appears to correlate with a significant decrease in self-reported disease severity. These findings may extend the discussion of the potential benefits of FESS to a new domain, voice quality.
