The effects of audio compression on voice quality measurements

Perceptual voice quality measurement can be defined as an objective quantification of an overall impression of the perceived stimulus. An alternative to laborious subjective testing is objective predictive modelling, which employs a perceptual model of the human auditory and cognitive system to predict the human response to a voice signal in terms of its quality. This chapter describes subjective and automated objective testing methods, and provides a test case scenario for measuring voice quality.

Adverse Effects of Environmental Noise on Acoustic Voice Quality Measurements

Journal of Voice ◽ 10.1016/j.jvoice.2004.07.003 ◽ 2005 ◽ Vol 19 (1) ◽ pp. 15-28 ◽ Cited By ~ 76

Author(s): Dimitar D. Deliyski ◽ Heather S. Shaw ◽ Maegan K. Evans

Keyword(s): Adverse Effects ◽ Voice Quality ◽ Environmental Noise ◽ Quality Measurements

Age Norms for Auditory-Perceptual Neurophonetic Parameters: A Prerequisite for the Assessment of Childhood Dysarthria

Journal of Speech Language and Hearing Research ◽ 10.1044/2020_jslhr-19-00114 ◽ 2020 ◽ Vol 63 (4) ◽ pp. 1071-1082

Author(s): Theresa Schölderle ◽ Elisabet Haas ◽ Wolfram Ziegler

Keyword(s): Assessment Tool ◽ Developmental Trajectories ◽ Voice Quality ◽ Typically Developing ◽ Typically Developing Children ◽ Age Norms ◽ Elementary School Age ◽ Speech Characteristics ◽ Substantial Progress ◽ Computer Based

Purpose: The aim of this study was to collect auditory-perceptual data on established symptom categories of dysarthria from typically developing children between 3 and 9 years of age, for the purpose of creating age norms for dysarthria assessment.

Method: One hundred forty-four typically developing children (3;0–9;11 [years;months], 72 girls and 72 boys) participated. We used a computer-based game specifically designed for this study to elicit sentence repetitions and spontaneous speech samples. Speech recordings were analyzed using the auditory-perceptual criteria of the Bogenhausen Dysarthria Scales, a standardized German assessment tool for dysarthria in adults. The Bogenhausen Dysarthria Scales (scales and features) cover clinically relevant dimensions of speech and allow for an evaluation of well-established symptom categories of dysarthria.

Results: The typically developing children exhibited a number of speech characteristics overlapping with established symptom categories of dysarthria (e.g., breathy voice, frequent inspirations, reduced articulatory precision, decreased articulation rate). Substantial progress was observed between 3 and 9 years of age, but with different developmental trajectories across different dimensions. In several areas (e.g., respiration, voice quality), 9-year-olds still presented with salient developmental speech characteristics, while in other dimensions (e.g., prosodic modulation), features typically associated with dysarthria occurred only exceptionally, even in the 3-year-olds.

Conclusions: The acquisition of speech motor functions is a prolonged process that is not yet complete at 9 years of age. Various developmental influences (e.g., anatomic–physiological changes) shape children's speech specifically. Our findings are a first step toward establishing auditory-perceptual norms for dysarthria in children of kindergarten and elementary school age.

Supplemental Material: https://doi.org/10.23641/asha.12133380

Evaluation of Acoustic Analyses of Voice in Nonoptimized Conditions

Journal of Speech Language and Hearing Research ◽ 10.1044/2020_jslhr-20-00212 ◽ 2020 ◽ Vol 63 (12) ◽ pp. 3991-3999

Author(s): Benjamin van der Woerd ◽ Min Wu ◽ Vijay Parsa ◽ Philip C. Doyle ◽ Kevin Fung

Keyword(s): Repeated Measures ◽ Voice Quality ◽ Data Sets ◽ Acoustic Measurements ◽ Sample Collection ◽ Experimental Conditions ◽ Environment Analysis ◽ Acoustic Measures ◽ Recording Conditions ◽ Cepstral Peak Prominence

Objectives: This study aimed to evaluate the effect of a smartphone microphone and of the recording environment on the fidelity and accuracy of acoustic measurements of voice.

Method: This was a prospective cohort proof-of-concept study. Two sets of prerecorded samples, (a) sustained vowels (/a/) and (b) a Rainbow Passage sentence, were played for recording via the internal iPhone microphone and the Blue Yeti USB microphone in two recording environments: a sound-treated booth and a quiet office setting. Recordings were presented using a calibrated mannequin speaker with a fixed signal intensity (69 dBA) at a fixed distance (15 in.). Each set of recordings (iPhone in audio booth, Blue Yeti in audio booth, iPhone in office, and Blue Yeti in office) was time-windowed to ensure the same signal was evaluated for each condition. Acoustic measures of voice, including fundamental frequency (fo), jitter, shimmer, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP), were generated using a widely used analysis program (Praat Version 6.0.50). The data gathered were compared using a repeated measures analysis of variance. Two separate data sets were used. The set of vowel samples included both pathologic (n = 10) and normal (n = 10), male (n = 5) and female (n = 15) speakers. The set of sentence stimuli ranged in perceived voice quality from normal to severely disordered, with an equal number of male (n = 12) and female (n = 12) speakers evaluated.

Results: The vowel analyses indicated that jitter, shimmer, HNR, and CPP were significantly different based on microphone choice, and that shimmer, HNR, and CPP were significantly different based on the recording environment. Analysis of sentences revealed a statistically significant impact of recording environment and microphone type on HNR and CPP. While statistically significant, the differences across the experimental conditions for a subset of the acoustic measures (viz., jitter and CPP) fell within their respective normative ranges.

Conclusions: Both microphone and recording setting resulted in significant differences across several acoustic measurements. However, a subset of the acoustic measures that were statistically significant across the recording conditions showed small overall differences that are unlikely to have clinical significance in interpretation. For these acoustic measures, the present data suggest that, although a sound-treated setting is ideal for voice sample collection, a smartphone microphone can capture acceptable recordings for acoustic signal analysis.
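For readers unfamiliar with the perturbation measures named in this abstract, the following is a minimal sketch of how local jitter and local shimmer are commonly defined: the mean absolute difference between consecutive glottal cycles, divided by the mean value. It assumes the cycle durations and peak amplitudes have already been extracted; the function name `local_perturbation` and the sample values are hypothetical and are not taken from the study, which used Praat for its measurements.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value.

    Applied to cycle durations this gives local jitter; applied to cycle
    peak amplitudes it gives local shimmer (both as a ratio, not a percent).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical glottal cycle durations (seconds) and peak amplitudes (arbitrary units).
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]

jitter_percent = 100 * local_perturbation(periods)
shimmer_percent = 100 * local_perturbation(amplitudes)

print(f"jitter (local): {jitter_percent:.2f}%")
print(f"shimmer (local): {shimmer_percent:.2f}%")
```

Because both measures depend on cycle-to-cycle differences, added noise from a microphone or room can plausibly inflate them, which is consistent with the sensitivity to recording conditions reported above.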

Cultural and Linguistic Adaptation of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) Into Hindi

Journal of Speech Language and Hearing Research ◽ 10.1044/2020_jslhr-20-00348 ◽ 2020 ◽ Vol 63 (12) ◽ pp. 3974-3981

Author(s): Ashwini Joshi ◽ Isha Baheti ◽ Vrushali Angadi

Keyword(s): Strong Correlation ◽ Concurrent Validity ◽ Interrater Reliability ◽ Voice Quality ◽ Weak Correlation ◽ Voice Assessment ◽ Perceptual Evaluation ◽ Severity Grade ◽ Normal Voice ◽ Group A

Aim: The purpose of this study was to develop and assess the reliability of a Hindi version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). Reliability was assessed by comparing Hindi CAPE-V ratings with English CAPE-V ratings and with ratings on the Grade, Roughness, Breathiness, Asthenia and Strain (GRBAS) scale.

Method: Hindi sentences were created to match the phonemic load of the corresponding English CAPE-V sentences. The Hindi sentences were adapted for linguistic content. The original English and adapted Hindi CAPE-V and GRBAS were completed for 33 bilingual individuals with normal voice quality. Additionally, the Hindi CAPE-V and GRBAS were completed for 13 Hindi speakers with disordered voice quality. The agreement of CAPE-V ratings was assessed between language versions, GRBAS ratings, and two rater pairs (three raters in total). Pearson product–moment correlation was completed for all comparisons.

Results: A strong correlation (r > .8, p < .01) was found between the Hindi CAPE-V scores and the English CAPE-V scores for most variables in normal voice participants. A weak correlation was found for the variable of strain (r < .2, p = .400) in the normative group. A strong correlation (r > .6, p < .01) was found between the overall severity/grade, roughness, and breathiness scores in the GRBAS scale and the CAPE-V scale in normal and disordered voice samples. Significant interrater reliability (r > .75) was present in overall severity and breathiness.

Conclusions: The Hindi version of the CAPE-V demonstrates good interrater reliability and concurrent validity with the English CAPE-V and the GRBAS. The Hindi CAPE-V can be used for the auditory-perceptual voice assessment of Hindi speakers.
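As an illustration of the Pearson product-moment correlation used for these comparisons, here is a minimal sketch that computes r for two sets of ratings of the same speakers. The function name `pearson_r` and the rating values are hypothetical and are not data from the study; the sketch only shows the statistic itself, not the significance testing reported in the abstract.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and root sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical overall-severity ratings for five speakers,
# one list per language version of the rating form.
english_ratings = [12, 25, 8, 40, 30]
hindi_ratings = [10, 28, 9, 38, 33]

r = pearson_r(english_ratings, hindi_ratings)
print(f"r = {r:.3f}")
```

A value of r near 1 indicates that the two versions rank and space speakers similarly, which is the sense in which the abstract describes agreement between language versions.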

Adolescent Voice Quality Aberrations: Personality and Social Status

Journal of Speech and Hearing Research ◽ 10.1044/jshr.1103.576 ◽ 1968 ◽ Vol 11 (3) ◽ pp. 576-582 ◽ Cited By ~ 4

Author(s): John R. Muma ◽ Ronald L. Laeder ◽ Clarence E. Webb

Keyword(s): Social Status ◽ Voice Quality ◽ Personality Characteristics ◽ Control Group ◽ Peer Evaluations ◽ Adolescent Voice

Seventy-eight subjects, identified as possessing voice quality aberrations for six months, constituted four experimental groups: breathiness, harshness, hoarseness, and nasality. A control group included 38 subjects. The four experimental groups were compared with the control group according to personality characteristics and peer evaluations. The results of these comparisons indicated that there was no relationship between voice quality aberration and either personality characteristics or peer evaluations.
