Acoustic Signal Typing for Evaluation of Voice Quality in Tracheoesophageal Speech

2006 ◽  
Vol 20 (3) ◽  
pp. 355-368 ◽  
Author(s):  
Corina J. van As-Brooks ◽  
Florien J. Koopmans-van Beinum ◽  
Louis C.W. Pols ◽  
Frans J.M. Hilgers
1997 ◽  
Vol 106 (4) ◽  
pp. 279-285 ◽  
Author(s):  
David G. Hanson ◽  
Judy Chen ◽  
Jack J. Jiang ◽  
Barbara Roa Pauloski

Sixteen patients who had symptoms and signs of chronic posterior laryngitis were evaluated before, during, and after treatment with omeprazole and nocturnal antireflux precautions. Data were analyzed for patients who complained of some hoarseness, who had no smoking history, and who completed all of the voice recording protocol. The patients' voices were recorded before, during, and following treatment with omeprazole and nocturnal antireflux precautions. Voice quality was analyzed by perceptual analysis, and acoustic signal data were measured for jitter, shimmer, and signal-to-noise ratio. Measures of jitter, shimmer, and signal-to-noise ratio changed significantly with treatment of posterior laryngitis (p < .01 for change in each of the measures). Acoustic measures showed some trend of deterioration with cessation of treatment, although the overall improvement in acoustic measures of voice quality was still statistically significant after treatment with omeprazole was discontinued. Although perceived abnormality of voice increased and decreased with the magnitude of measured perturbation of the acoustic signal for some patients, the perceptual assessments were not highly correlated with acoustic measures for individual patients, and the perceptual analysis group data did not show a significant change with time during treatment, in contrast to the significance of change in acoustic measures. The data demonstrate that acoustic measures of jitter, shimmer, and signal-to-noise ratio improve significantly with antisecretory and antireflux treatment of chronic posterior laryngitis, and that for individual patients, these are changes that are detected by trained listeners, but not at statistically high levels of confidence.


2016 ◽  
Vol 37 ◽  
pp. 1-10 ◽  
Author(s):  
Renee P. Clapham ◽  
Jean-Pierre Martens ◽  
Rob J.J.H. van Son ◽  
Frans J.M. Hilgers ◽  
Michiel W.M. van den Brekel ◽  
...  

2003 ◽  
Vol 46 (4) ◽  
pp. 947-959 ◽  
Author(s):  
Corina J. van As ◽  
Florien J. Koopmans-van Beinum ◽  
Louis C. W. Pols ◽  
Frans J. M. Hilgers

The present study was conducted to investigate voice quality in tracheoesophageal speech by means of perceptual evaluations and to develop a clinically useful subset of perceptual scales sufficient for these perceptual evaluations. The perceptual ratings were obtained from both naive and trained raters (speech-language pathologists [SLPs]) after listening to a read-aloud text. The perceptual evaluations were performed by means of 19 semantic bipolar 7-point scales for the naive raters and 20 semantic bipolar 7-point scales for the trained raters. The trained raters were also asked to judge the overall voice quality as good, reasonable, or poor. Both naive listeners and trained SLPs were able to perform reliable perceptual judgments. Naive raters judged the tracheoesophageal voice as more deviant than the trained raters did. Naive raters made judgments based on 2 underlying perceptual dimensions (voice quality and pitch), whereas the trained raters made judgments based on 4 underlying perceptual dimensions (voice quality, tonicity, pitch, and tempo). These perceptual dimensions were further subdivided into a subset of 4 perceptual scales for the naive raters and a subset of 8 perceptual scales for the trained raters. These subsets appeared to provide sufficient coverage of the underlying perceptual dimensions used by the listeners.


2005 ◽  
Vol 19 (3) ◽  
pp. 360-372 ◽  
Author(s):  
Corina J. van As-Brooks ◽  
Frans J.M. Hilgers ◽  
Florien J. Koopmans-van Beinum ◽  
Louis C.W. Pols

2015 ◽  
Vol 29 (4) ◽  
pp. 517.e23-517.e29 ◽  
Author(s):  
Renee P. Clapham ◽  
Corina J. van As-Brooks ◽  
Rob J.J.H. van Son ◽  
Frans J.M. Hilgers ◽  
Michiel W.M. van den Brekel

1994 ◽  
Vol 103 (12) ◽  
pp. 929-936 ◽  
Author(s):  
Daniel G. Deschler ◽  
E. Thomas Doherty ◽  
James P. Anthony ◽  
Charles G. Reed ◽  
Mark I. Singer

Tracheoesophageal voice restoration after laryngectomy is possible with a variety of neopharyngeal reconstructions. We have used the tubed radial forearm free flap for neopharyngeal reconstruction since 1991. Six patients have undergone voice restoration with the Blom-Singer prosthesis and were available for quantitative and qualitative speech analysis. These patients were compared to five laryngectomy patients with standard pharyngeal closures and similar voice restorations. The free flap patients produced similar loudness levels compared to the standards with soft speech (52.06 dB and 47.19 dB, respectively) and loud speech (62.66 dB and 60.91 dB, respectively). The free flap patients demonstrated adequate intelligibility, with fundamental frequencies comparable to standards (124.82 Hz and 135.66 Hz, respectively), although with increased jitter (5.00% versus 1.96%). No differences were statistically significant, but evaluation by trained and naive listeners demonstrated significant differences in voice quality. This quantitative and qualitative analysis of tracheoesophageal speech after radial forearm free flap reconstruction of the neopharynx demonstrates that acceptable voice can be achieved, but with limitations.


2020 ◽  
Vol 63 (4) ◽  
pp. 1071-1082
Author(s):  
Theresa Schölderle ◽  
Elisabet Haas ◽  
Wolfram Ziegler

Purpose The aim of this study was to collect auditory-perceptual data on established symptom categories of dysarthria from typically developing children between 3 and 9 years of age, for the purpose of creating age norms for dysarthria assessment. Method One hundred forty-four typically developing children (3;0–9;11 [years;months], 72 girls and 72 boys) participated. We used a computer-based game specifically designed for this study to elicit sentence repetitions and spontaneous speech samples. Speech recordings were analyzed using the auditory-perceptual criteria of the Bogenhausen Dysarthria Scales, a standardized German assessment tool for dysarthria in adults. The Bogenhausen Dysarthria Scales (scales and features) cover clinically relevant dimensions of speech and allow for an evaluation of well-established symptom categories of dysarthria. Results The typically developing children exhibited a number of speech characteristics overlapping with established symptom categories of dysarthria (e.g., breathy voice, frequent inspirations, reduced articulatory precision, decreased articulation rate). Substantial progress was observed between 3 and 9 years of age, but with different developmental trajectories across different dimensions. In several areas (e.g., respiration, voice quality), 9-year-olds still presented with salient developmental speech characteristics, while in other dimensions (e.g., prosodic modulation), features typically associated with dysarthria occurred only exceptionally, even in the 3-year-olds. Conclusions The acquisition of speech motor functions is a prolonged process that is not yet complete at 9 years of age. Various developmental influences (e.g., anatomic–physiological changes) shape children's speech specifically. Our findings are a first step toward establishing auditory-perceptual norms for dysarthria in children of kindergarten and elementary school age. Supplemental Material https://doi.org/10.23641/asha.12133380


2020 ◽  
Vol 63 (12) ◽  
pp. 3991-3999
Author(s):  
Benjamin van der Woerd ◽  
Min Wu ◽  
Vijay Parsa ◽  
Philip C. Doyle ◽  
Kevin Fung

Objectives This study aimed to evaluate the fidelity and accuracy of a smartphone microphone and recording environment on acoustic measurements of voice. Method A prospective cohort proof-of-concept study. Two sets of prerecorded samples, (a) sustained vowels (/a/) and (b) a Rainbow Passage sentence, were played for recording via the internal iPhone microphone and the Blue Yeti USB microphone in two recording environments: a sound-treated booth and a quiet office setting. Recordings were presented using a calibrated mannequin speaker with a fixed signal intensity (69 dBA) at a fixed distance (15 in.). Each set of recordings (iPhone—audio booth, Blue Yeti—audio booth, iPhone—office, and Blue Yeti—office) was time-windowed to ensure the same signal was evaluated for each condition. Acoustic measures of voice, including fundamental frequency (fo), jitter, shimmer, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP), were generated using a widely used analysis program (Praat Version 6.0.50). The data gathered were compared using a repeated measures analysis of variance. Two separate data sets were used. The set of vowel samples included both pathologic (n = 10) and normal (n = 10), male (n = 5) and female (n = 15) speakers. The set of sentence stimuli ranged in perceived voice quality from normal to severely disordered, with an equal number of male (n = 12) and female (n = 12) speakers evaluated. Results The vowel analyses indicated that jitter, shimmer, HNR, and CPP were significantly different based on microphone choice, and that shimmer, HNR, and CPP were significantly different based on the recording environment. Analysis of sentences revealed a statistically significant impact of recording environment and microphone type on HNR and CPP. Although statistically significant, the differences across the experimental conditions for a subset of the acoustic measures (viz., jitter and CPP) fell within their respective normative ranges. Conclusions Both microphone and recording setting resulted in significant differences across several acoustic measurements. However, a subset of the acoustic measures that were statistically significant across the recording conditions showed small overall differences that are unlikely to have clinical significance in interpretation. For these acoustic measures, the present data suggest that, although a sound-treated setting is ideal for voice sample collection, a smartphone microphone can capture acceptable recordings for acoustic signal analysis.


2020 ◽  
Vol 63 (12) ◽  
pp. 3974-3981
Author(s):  
Ashwini Joshi ◽  
Isha Baheti ◽  
Vrushali Angadi

Aim The purpose of this study was to develop and assess the reliability of a Hindi version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). Reliability was assessed by comparing Hindi CAPE-V ratings with English CAPE-V ratings and by the Grade, Roughness, Breathiness, Asthenia and Strain (GRBAS) scale. Method Hindi sentences were created to match the phonemic load of the corresponding English CAPE-V sentences. The Hindi sentences were adapted for linguistic content. The original English and adapted Hindi CAPE-V and GRBAS were completed for 33 bilingual individuals with normal voice quality. Additionally, the Hindi CAPE-V and GRBAS were completed for 13 Hindi speakers with disordered voice quality. The agreement of CAPE-V ratings was assessed between language versions, GRBAS ratings, and two rater pairs (three raters in total). Pearson product–moment correlation was completed for all comparisons. Results A strong correlation (r > .8, p < .01) was found between the Hindi CAPE-V scores and the English CAPE-V scores for most variables in normal voice participants. A weak correlation was found for the variable of strain (r < .2, p = .400) in the normative group. A strong correlation (r > .6, p < .01) was found between the overall severity/grade, roughness, and breathiness scores in the GRBAS scale and the CAPE-V scale in normal and disordered voice samples. Significant interrater reliability (r > .75) was present in overall severity and breathiness. Conclusions The Hindi version of the CAPE-V demonstrates good interrater reliability and concurrent validity with the English CAPE-V and the GRBAS. The Hindi CAPE-V can be used for the auditory-perceptual voice assessment of Hindi speakers.


1968 ◽  
Vol 11 (3) ◽  
pp. 576-582 ◽  
Author(s):  
John R. Muma ◽  
Ronald L. Laeder ◽  
Clarence E. Webb

Seventy-eight subjects, identified as possessing voice quality aberrations for six months, constituted four experimental groups: breathiness, harshness, hoarseness, and nasality. A control group included 38 subjects. The four experimental groups were compared with the control group according to personality characteristics and peer evaluations. The results of these comparisons indicated that there was no relationship between voice quality aberration and either personality characteristics or peer evaluations.

