Computing scores of voice quality and speech intelligibility in tracheoesophageal speech for speech stimuli of varying lengths

2016 ◽  
Vol 37 ◽  
pp. 1-10 ◽  
Author(s):  
Renee P. Clapham ◽  
Jean-Pierre Martens ◽  
Rob J.J.H. van Son ◽  
Frans J.M. Hilgers ◽  
Michiel M.W. van den Brekel ◽  
...  
2016 ◽  
Vol 25 (4) ◽  
pp. 561-575 ◽  
Author(s):  
Paul M. Evitts ◽  
Heather Starmer ◽  
Kristine Teets ◽  
Christen Montgomery ◽  
Lauren Calhoun ◽  
...  

Purpose: There is currently minimal information on the impact of dysphonia secondary to phonotrauma on listeners. Considering the high incidence of voice disorders among professional voice users, it is important to understand the impact of a dysphonic voice on their audiences. Methods: Ninety-one healthy listeners (39 men, 52 women; mean age = 23.62 years) were presented with speech stimuli from 5 healthy speakers and 5 speakers diagnosed with dysphonia secondary to phonotrauma. Dependent variables included processing speed (reaction time [RT] ratio), speech intelligibility, and listener comprehension. Voice quality ratings were also obtained for all speakers from 3 expert listeners. Results: Statistical results showed significant differences in RT ratio and number of speech intelligibility errors between healthy and dysphonic voices. There was no significant difference in listener comprehension errors. Multiple regression analyses showed that voice quality ratings from the Consensus Auditory-Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were able to predict RT ratio and speech intelligibility but not listener comprehension. Conclusions: Results of the study suggest that although listeners require more time to process and make more intelligibility errors when presented with speech stimuli from speakers with dysphonia secondary to phonotrauma, listener comprehension may not be affected.


2013 ◽  
Vol 278-280 ◽  
pp. 1124-1128
Author(s):  
Yi Long You ◽  
Fei Zhang ◽  
Bu Lei Zuo ◽  
Feng Xiang You

Although traditional algorithms can suppress the noise in a speech signal, distortion of the voice is inevitable. This paper introduces speech signal enhancement with an improved threshold method. By comparing the method with a traditional enhancement algorithm in MATLAB simulation experiments, the paper aims to verify that it can effectively remove noise from the signal, enhance voice quality, and improve speech intelligibility.


2015 ◽  
Vol 27 (3) ◽  
pp. 533-545 ◽  
Author(s):  
Rebecca E. Millman ◽  
Sam R. Johnson ◽  
Garreth Prendergast

The temporal envelope of speech is important for speech intelligibility. Entrainment of cortical oscillations to the speech temporal envelope is a putative mechanism underlying speech intelligibility. Here we used magnetoencephalography (MEG) to test the hypothesis that phase-locking to the speech temporal envelope is enhanced for intelligible compared with unintelligible speech sentences. Perceptual “pop-out” was used to change the percept of physically identical tone-vocoded speech sentences from unintelligible to intelligible. The use of pop-out dissociates changes in phase-locking to the speech temporal envelope arising from acoustical differences between un/intelligible speech from changes in speech intelligibility itself. Novel and bespoke whole-head beamforming analyses, based on significant cross-correlation between the temporal envelopes of the speech stimuli and phase-locked neural activity, were used to localize neural sources that track the speech temporal envelope of both intelligible and unintelligible speech. Location-of-interest analyses were carried out in a priori defined locations to measure the representation of the speech temporal envelope for both un/intelligible speech in both the time domain (cross-correlation) and frequency domain (coherence). Whole-brain beamforming analyses identified neural sources phase-locked to the temporal envelopes of both unintelligible and intelligible speech sentences. Crucially there was no difference in phase-locking to the temporal envelope of speech in the pop-out condition in either the whole-brain or location-of-interest analyses, demonstrating that phase-locking to the speech temporal envelope is not enhanced by linguistic information.
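The time-domain measure mentioned above, cross-correlation between a stimulus envelope and neural activity, can be illustrated with a minimal sketch. This is not the authors' beamforming pipeline: the function names, the integer sample lags, and the toy signals are all assumptions made for illustration, and both inputs are assumed to be equal-length sequences sampled at the same rate.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def lagged_cross_correlation(stimulus_env, response, max_lag):
    """Correlate the stimulus envelope with the response delayed by 0..max_lag samples.

    Returns a dict mapping each lag (in samples) to a correlation coefficient.
    """
    n = len(stimulus_env)
    result = {}
    for lag in range(max_lag + 1):
        x = stimulus_env[: n - lag]   # stimulus at time t
        y = response[lag:]            # response at time t + lag
        result[lag] = pearson(x, y)
    return result


# Toy data (hypothetical): the "response" is the envelope delayed by 3 samples.
stimulus_env = [abs(math.sin(0.07 * i)) + 0.5 * abs(math.sin(0.19 * i)) for i in range(200)]
response = [0.0] * 3 + stimulus_env[:-3]

correlations = lagged_cross_correlation(stimulus_env, response, max_lag=10)
best_lag = max(correlations, key=correlations.get)
print(best_lag, correlations[best_lag])
```

The lag with the highest coefficient gives a rough estimate of the delay between the stimulus envelope and the activity that tracks it. Real analyses would typically add filtering and significance testing, which this sketch omits.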


2006 ◽  
Vol 20 (3) ◽  
pp. 355-368 ◽  
Author(s):  
Corina J. van As-Brooks ◽  
Florien J. Koopmans-van Beinum ◽  
Louis C.W. Pols ◽  
Frans J.M. Hilgers

1996 ◽  
Vol 83 (2) ◽  
pp. 658-658
Author(s):  
Sakina S. Drummond ◽  
Kathy Krueger ◽  
Jess Dancer ◽  
Gretchen Spring

With 8 men, two methods of alaryngeal speech production, tracheoesophageal and electrolaryngeal, were compared on perceptual and acoustic measures of speech intelligibility. Measures consistently identified the tracheoesophageal speech as superior to electrolaryngeal speech.


2020 ◽  
Vol 10 (1) ◽  
pp. 26
Author(s):  
John J. Sidtis ◽  
Diana Van Lancker Sidtis ◽  
Ritesh Ramdhani ◽  
Michele Tagliati

Deep brain stimulation (DBS) of the subthalamic nucleus (STN) has become an effective and widely used tool in the treatment of Parkinson’s disease (PD). STN-DBS has varied effects on speech. Clinical speech ratings suggest worsening following STN-DBS, but quantitative intelligibility, perceptual, and acoustic studies have produced mixed and inconsistent results. Improvements in phonation and declines in articulation have frequently been reported during different speech tasks under different stimulation conditions. Questions remain about preferred STN-DBS stimulation settings. Seven right-handed, native speakers of English with PD treated with bilateral STN-DBS were studied off medication at three stimulation conditions: stimulators off, 60 Hz (low-frequency stimulation, LFS), and the typical clinical setting of 185 Hz (high-frequency stimulation, HFS). Spontaneous speech was recorded in each condition and excerpts were prepared for transcription (intelligibility) and difficulty judgements. Separate excerpts were prepared for listeners to rate abnormalities in voice, articulation, fluency, and rate. Intelligibility for spontaneous speech was reduced at both HFS and LFS when compared to STN-DBS off. On average, speech produced at HFS was more intelligible than that produced at LFS, but HFS made the intelligibility task (transcription) subjectively more difficult. Both voice quality and articulation were judged to be more abnormal with DBS on. STN-DBS reduced the intelligibility of spontaneous speech at both LFS and HFS, but lowering the frequency did not improve intelligibility. Voice quality ratings with STN-DBS were correlated with the ratings made without stimulation. This was not true for articulation ratings. STN-DBS exacerbated existing voice problems and may have introduced new articulatory abnormalities. The results from individual DBS subjects showed that intelligibility both improved and declined as a function of DBS, with perceived changes in voice appearing to be more reflective of intelligibility than perceived changes in articulation.


2003 ◽  
Vol 46 (4) ◽  
pp. 947-959 ◽  
Author(s):  
Corina J. van As ◽  
Florien J. Koopmans-van Beinum ◽  
Louis C. W. Pols ◽  
Frans J. M. Hilgers

The present study was conducted to investigate voice quality in tracheoesophageal speech by means of perceptual evaluations and to develop a clinically useful subset of perceptual scales sufficient for these perceptual evaluations. The perceptual ratings were obtained from both naive and trained raters (speech-language pathologists [SLPs]) after listening to a read-aloud text. The perceptual evaluations were performed by means of 19 semantic bipolar 7-point scales for the naive raters and 20 semantic bipolar 7-point scales for the trained raters. The trained raters were also asked to judge the overall voice quality as good, reasonable, or poor. Both naive listeners and trained SLPs were able to perform reliable perceptual judgments. Naive raters judged the tracheoesophageal voice as more deviant than the trained raters did. Naive raters made judgments based on 2 underlying perceptual dimensions (voice quality and pitch), whereas the trained raters made judgments based on 4 underlying perceptual dimensions (voice quality, tonicity, pitch, and tempo). These perceptual dimensions were further subdivided into a subset of 4 perceptual scales for the naive raters and a subset of 8 perceptual scales for the trained raters. This appeared to provide sufficient coverage of the underlying perceptual dimensions used by the listeners.


2018 ◽  
Author(s):  
Jonas Vanthornhout ◽  
Lien Decruy ◽  
Jan Wouters ◽  
Jonathan Z. Simon ◽  
Tom Francart

Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation and memory. Very often electrophysiological measures of hearing give insight into the neural processing of sound. However, most methods use non-speech stimuli, making it hard to relate the results to behavioral measures of speech intelligibility. The use of natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift that makes it possible to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system, and will allow the development of closed-loop auditory prostheses that automatically adapt to individual users.
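The idea of decoding an envelope from the electroencephalogram and correlating it with the stimulus envelope can be sketched as follows. The abstract does not specify the decoder, so the weighted sum of channels used here is an assumption, as are the function names, the weights, and the toy data. In practice the weights would be fitted on separate training data.

```python
import math


def decode_envelope(eeg, weights):
    """Reconstruct an envelope as a weighted sum of EEG channels.

    eeg: list of channels, each a list of samples of equal length.
    weights: one weight per channel (hypothetical values, assumed already fitted).
    """
    n_samples = len(eeg[0])
    return [
        sum(w * channel[t] for w, channel in zip(weights, eeg))
        for t in range(n_samples)
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): channel 0 carries a scaled copy of the envelope,
# channel 1 carries an unrelated alternating signal.
stimulus_envelope = [0.1, 0.4, 0.9, 0.5, 0.2, 0.7]
eeg = [
    [2 * v for v in stimulus_envelope],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]

reconstructed = decode_envelope(eeg, weights=[0.5, 0.1])
score = pearson(reconstructed, stimulus_envelope)
print(score)
```

The resulting correlation plays the role of the electrophysiological measure: the more faithfully the envelope can be reconstructed from the recording, the closer the score is to 1.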

