Dichotic presentation of speech signal with critical band filtering for improving speech perception

Author(s):  
D.S. Chaudhari ◽  
P.C. Pandey


1963 ◽  
Vol 6 (3) ◽  
pp. 207-222 ◽  
Author(s):  
J. M. Pickett ◽  
B. Horenstein Pickett

Tests of tactual speech perception were conducted using a special frequency-analyzing vocoder. The vocoder presented a running frequency analysis of speech mapped into a spatial array of tactual vibrations which were applied to the fingers of the receiving subject. Ten vibrators were used, one for each finger. The position of a vibrator represented a given frequency region of speech energy; the total range covered was 210 to 7 700 cps; all the vibrations had a frequency of 300 cps; the vibration amplitudes represented the energy distribution over the various frequencies. Discrimination and identification tests were performed with various sets of test vowels; consonant discrimination tests were performed with certain consonants including those that might be difficult to lipread. Performance with vowels appeared to be related to formant structure and duration as measured on the test vowels, and to tactual masking effects. Consonant discrimination was good between stops and continuants; consonant features of nasality, voicing, and affrication were also discriminated to some extent. It is concluded that the skin offers certain capacities for transmitting speech information which may be used to complement speech communication where only an impoverished speech signal is normally received. This research was conducted at the Speech Transmission Laboratory, Royal Institute of Technology, Stockholm, Sweden.
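The band-to-finger mapping described above can be sketched as follows. The geometric spacing of the ten bands across 210 to 7 700 cps is an assumption for illustration; the abstract gives only the overall range, the fixed 300-cps carrier, and the rule that vibration amplitude encodes per-band energy.

```python
def band_edges(f_lo=210.0, f_hi=7700.0, n_bands=10):
    """Split [f_lo, f_hi] into n_bands geometrically spaced bands,
    one per finger (assumed spacing; the abstract states only the range)."""
    ratio = (f_hi / f_lo) ** (1.0 / n_bands)
    return [(f_lo * ratio ** i, f_lo * ratio ** (i + 1)) for i in range(n_bands)]

def band_energies_to_amplitudes(energies, e_max):
    """Map per-band speech energy to vibration amplitude on each finger.
    Every vibrator runs at the fixed 300-cps carrier; only amplitude varies."""
    return [e / e_max for e in energies]

edges = band_edges()
print(len(edges))           # ten bands, one vibrator per finger
print(round(edges[0][0]))   # 210
print(round(edges[-1][1]))  # 7700
```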


1998 ◽  
Vol 21 (2) ◽  
pp. 275-275 ◽  
Author(s):  
Dominic W. Massaro

Sussman et al. describe an ecological property of the speech signal that is putatively functional in perception. An important issue, however, is whether their putative cue is an emerging feature or whether the second formant (F2) onset and the F2 vowel actually provide independent cues to perceptual categorization. Regardless of the outcome of this issue, an important goal of speech research is to understand how multiple cues are evaluated and integrated to achieve categorization.
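One concrete account of how multiple independent cues could be evaluated and integrated is Massaro's own fuzzy logical model of perception (FLMP), in which each cue contributes a truth value in [0, 1] and categorization follows a relative-goodness rule. A minimal sketch; the cue values below are hypothetical:

```python
def flmp_integrate(cue_a, cue_b):
    """FLMP-style integration of two independent cues, each a truth value
    in [0, 1] supporting one category: multiply supports and normalize
    against the support for the alternative category."""
    support = cue_a * cue_b
    counter = (1.0 - cue_a) * (1.0 - cue_b)
    return support / (support + counter)

# Hypothetical values: F2 onset weakly supports the category (0.6),
# the F2 vowel strongly supports it (0.9).
p = flmp_integrate(0.6, 0.9)
print(round(p, 3))  # integrated support exceeds either cue alone
```

Equal, ambiguous cues (0.5, 0.5) yield 0.5, while two cues pointing the same way reinforce each other, which is the signature prediction that lets the model be tested against independent-cue alternatives.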


Author(s):  
Nele Salveste

Categorical perception, or the hypothesis of how we perceive linguistic units. The acoustic signal of everyday speech is highly variable, but this variability seldom disrupts normal speech communication. This motivates the hypothesis that speech perception has developed a special mechanism for extracting phonemes from a highly variable speech signal. This mechanism extracts phonemes so efficiently and quickly that we are usually unaware of it. We would like to call this mechanism "categorical perception of speech", but since perceptual processes are only indirectly accessible to investigation, the term refers rather to a theoretical model or an experimental method for investigating our perceptual ability to distinguish phonemes in the speech signal (Schouten et al. 2003). This paper discusses categorical perception as a model and experimental method whose theoretical assumptions have underpinned the design and conclusions of perception experiments conducted in Estonian as well as in other languages.
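In the categorical-perception paradigm discussed above, listeners label stimuli drawn from an acoustic continuum, and the identification function shows an abrupt category boundary rather than a gradual change. A minimal sketch with a logistic identification function; the continuum steps, slope, and boundary location are hypothetical:

```python
import math

def identification(step, boundary=5.0, slope=2.0):
    """P(category A) for a stimulus at a given continuum step.
    A steep logistic produces the step-like identification curve
    characteristic of categorical perception."""
    return 1.0 / (1.0 + math.exp(slope * (step - boundary)))

# Labels stay near 1.0, drop abruptly around the boundary, then stay near 0.0.
curve = [round(identification(s), 2) for s in range(1, 10)]
print(curve)
```

In a full experiment this identification curve is paired with a discrimination task: discrimination should peak near the boundary if perception is truly categorical.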


2011 ◽  
Vol 403-408 ◽  
pp. 970-975
Author(s):  
P.A. Dhulekar ◽  
S.L. Nalbalwar ◽  
J.J. Chopade

Simultaneous masking occurs when a sound is rendered inaudible by a masker: a noise or unwanted sound of the same duration as the original sound. An innovative approach to speech processing in the cochlea is investigated. Differently from traditional filter-bank spectral analysis strategies, the proposed method analyses the speech signal by means of wavelet packets. Splitting the speech signal by filtering and down-sampling at each decomposition level, using wavelet packets with different wavelet functions, helps to reduce the effect of simultaneous masking. The performance of the proposed method is experimentally evaluated with vowel-consonant-vowel syllables for fifteen English consonants. The dichotic presentation of the processed speech signals effectively reduces simultaneous masking and thereby improves auditory perception.
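The filtering-and-downsampling step at each wavelet-packet decomposition level can be sketched with the simplest case: a one-level Haar split into low-frequency (approximation) and high-frequency (detail) half-bands. The Haar wavelet is used here only for illustration; the paper evaluates several different wavelet functions.

```python
def haar_split(signal):
    """One wavelet-packet decomposition level with the orthonormal Haar
    filters: low-pass (pairwise sum) and high-pass (pairwise difference),
    each down-sampled by 2."""
    s = 2 ** -0.5  # orthonormal Haar scaling factor
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

# In a wavelet-packet tree, BOTH half-bands are split again at the next
# level, unlike the plain wavelet transform which splits only the approximation.
a, d = haar_split([4.0, 2.0, 1.0, 3.0])
```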


2020 ◽  
Author(s):  
Emmanuel Biau ◽  
Benjamin G. Schultz ◽  
Thomas C. Gunter ◽  
Sonja A. Kotz

ABSTRACT
During multimodal speech perception, slow delta oscillations (~1 - 3 Hz) in the listener's brain synchronize with the speech signal, likely reflecting signal decomposition in the service of comprehension. In particular, fluctuations imposed onto the speech amplitude envelope by a speaker's prosody seem to temporally align with articulatory and body gestures, thus providing two complementary cues to the speech signal's temporal structure. Further, endogenous delta oscillations in the left motor cortex align with the beat of speech and music, suggesting a role in the temporal integration of (quasi-)rhythmic stimulation. We propose that delta activity facilitates the temporal alignment of a listener's oscillatory activity with the prosodic fluctuations in a speaker's speech during multimodal speech perception. We recorded EEG responses in an audiovisual synchrony detection task while participants watched videos of a speaker. To test the temporal alignment of visual and auditory prosodic features, we filtered the speech signal to remove verbal content. Results confirm that (i) participants accurately detected audiovisual synchrony; (ii) audiovisual asynchrony elicited greater delta power in left frontal motor regions, an effect that correlated with behavioural performance; and (iii) delta-beta coupling in the left frontal motor regions decreased when listeners could not accurately integrate visual and auditory prosodies. Together, these findings suggest that endogenous delta oscillations align fluctuating prosodic information conveyed by distinct sensory modalities onto a common temporal organisation in multimodal speech perception.
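Removing verbal content while keeping the slow prosodic fluctuations amounts to low-pass filtering the speech amplitude envelope down into the delta range. A minimal sketch using full-wave rectification and a moving-average smoother; the window length and the implied sampling rate are hypothetical, and the study's actual filtering procedure is not specified in this abstract:

```python
def prosodic_envelope(samples, win=50):
    """Full-wave rectify, then moving-average smooth: a crude low-pass
    envelope follower that retains only slow (delta-range) amplitude
    fluctuations and discards the fast fine structure carrying verbal content."""
    rect = [abs(x) for x in samples]
    out = []
    for i in range(len(rect)):
        lo = max(0, i - win // 2)
        hi = min(len(rect), i + win // 2 + 1)
        out.append(sum(rect[lo:hi]) / (hi - lo))
    return out

# A constant-amplitude alternating signal yields a flat envelope:
# the fast alternation is removed, the slow amplitude level survives.
env = prosodic_envelope([1.0, -1.0] * 100)
```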


1985 ◽  
Vol 50 (1) ◽  
pp. 60-65 ◽  
Author(s):  
John Greer Clark

Past investigations of alaryngeal speech intelligibility have focused on comparative intelligibility as perceived by young normally hearing adults. However, the spouses and social companions of laryngectomees may have significantly different auditory capabilities compared to young listeners. This report presents a comparison of alaryngeal and laryngeal speech identification performance for a group of young normally hearing listeners and a group of older adult listeners representative of the age of the laryngectomee's social companions. The speech signals investigated included normal laryngeal speech, artificial larynx speech, traditional esophageal speech, and tracheoesophageal speech. The results obtained reveal not only differences in speech signals but also a difference in the proficiency of speech perception for the two groups, favoring the younger listeners. The results of the speech identification measures in the presence of auditory competition revealed greatest intelligibility for the artificial larynx speech signal and poorest for the tracheoesophageal speech signal.

