Speech Perception through Face Masks by Children and Adults

2021 ◽  
Author(s):  
Julia Schwarz ◽  
Katrina (Kechun) Li ◽  
Jasper Hong Sim ◽  
Yixin Zhang ◽  
Elizabeth Buchanan-Worster ◽  
...  

Face masks can cause speech processing difficulties. However, it is unclear to what extent these difficulties are caused by the visual obstruction of the speaker’s mouth or by changes of the acoustic signal, and whether the effects can be found regardless of semantic context. In the present study, children and adults performed a cued shadowing task online, repeating the last word of English sentences. Target words were embedded in sentence-final position and manipulated visually, acoustically, and by semantic context (cloze probability). First results from 16 children and 16 adults suggest that processing language through face masks leads to slower responses in both groups, but visual, acoustic, and semantic cues all significantly reduce the mask effect. Although children were less proficient in predictive speech processing overall, they were still able to use semantic cues to compensate for face mask effects in a similar fashion to adults.

2021 ◽  
Author(s):  
Chandra Leon Haider ◽  
Nina Suess ◽  
Anne Hauswald ◽  
Hyojin Park ◽  
Nathan Weisz

Face masks have become a prevalent measure during the Covid-19 pandemic to counteract the transmission of SARS-CoV-2. An unintended “side effect” of face masks is their adverse influence on speech perception, especially in challenging listening situations. So far, behavioural studies have not pinpointed exactly which feature(s) of speech processing face masks affect in such listening situations. We conducted an audiovisual (AV) multi-speaker experiment using naturalistic speech (i.e. an audiobook). In half of the trials, the target speaker wore a (surgical) face mask, while we measured the brain activity of normal-hearing participants via magnetoencephalography (MEG). A decoding model was trained on the clear AV speech (i.e. no additional speaker and target speaker not wearing a face mask) and used to reconstruct crucial speech features in each condition. We found significant main effects of face masks on the reconstruction of acoustic features, such as the speech envelope and spectral speech features (i.e. pitch and formant frequencies), while reconstruction of higher-level features of speech segmentation (phoneme and word onsets) was especially impaired by masks in difficult listening situations, i.e. when a distracting speaker was also presented. Our findings demonstrate the detrimental impact face masks have on listening and speech perception, thus extending previous behavioural results. Supporting the idea of visual facilitation of speech is the fact that we used surgical face masks, which have only mild effects on speech acoustics. This idea is in line with recent research, also by our group, showing that visual cortical regions track spectral modulations. Since hearing impairment usually affects higher frequencies, the detrimental effect of face masks might pose a particular challenge for individuals who likely need the visual information about higher frequencies (e.g. formants) to compensate.


2004 ◽  
Vol 16 (3) ◽  
pp. 154-159 ◽  
Author(s):  
Seung-Hwan Lee ◽  
Young-Cho Chung ◽  
Jong-Chul Yang ◽  
Yong-Ku Kim ◽  
Kwang-Yoon Suh

Background: The neurobiological mechanism of auditory hallucination (AH) in schizophrenia remains elusive, but AH can be caused by an abnormality in the speech perception system, based on the speech perception neural network model. Objectives: The purpose of this study was to investigate whether schizophrenic patients with AH have a speech processing impairment compared with schizophrenic patients without AH, and whether speech perception ability improves after AH has subsided. Methods: Twenty-four schizophrenic patients with AH were compared with 25 schizophrenic patients without AH. Narrative speech perception was assessed using a masked speech tracking (MST) task with three levels of superimposed phonetic noise. A sentence repetition task (SRT) and an auditory continuous performance task (CPT) were used to assess grammar-dependent verbal working memory and non-language attention, respectively. These tests were administered before and after treatment in both groups. Results: Before treatment, schizophrenic patients with AH showed significant impairments in MST compared with those without AH. There were no significant differences in SRT and CPT correct (CPT-C) rates between the groups, but the CPT incorrect (CPT-I) rate showed a significant difference. The low-score CPT-I group showed a significant difference in MST performance between the two groups, while the high-score CPT-I group did not. After treatment (after AH subsided), the hallucinating schizophrenic patients still showed significant impairment in MST performance compared with non-hallucinating schizophrenic patients. Conclusions: Our results support the claim that schizophrenic patients with AH are likely to have a disturbance of the speech perception system. Moreover, our data suggest that non-language attention might be a key factor influencing speech perception ability and that speech perception dysfunction might be a trait marker in schizophrenia with AH.


2010 ◽  
pp. 447-473
Author(s):  
Pedro Gómez-Vilda ◽  
José Manuel Ferrández-Vicente ◽  
Victoria Rodellar-Biarge ◽  
Rafael Martínez-Olalla ◽  
Víctor Nieto-Lluis ◽  
...  

Current trends in the search for improvements to well-established technologies that imitate human abilities, such as speech perception, look for inspiration in the explanation of capabilities of the natural system that are not yet well understood. A typical case is speech recognition, where the semantic gap from spectral time-frequency representations to the symbolic translation into phonemes and words, and on to the construction of morpho-syntactic and semantic structures, involves many phenomena that are still poorly understood. The present chapter explores some of these facts at a simplified level from two points of view: the top-down analysis provided by speech perception, and the symmetric bottom-up synthesis provided by the biological architecture of the auditory pathways. An application-driven design of a Neuromorphic Speech Processing Architecture is presented and its performance analyzed. Simulation details from a parallel implementation of the architecture on a supercomputer are also shown and discussed.


Languages ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. 54
Author(s):  
Sokolova ◽  
Slabakova

The article investigates non-native sentence processing and examines the existing scholarly approaches to L2 processing with a population of L3 learners of English, whose native language is Russian. In a self-paced reading experiment, native speakers of Russian and English, as well as (low) intermediate L3 learners of English, read ambiguous relative clauses (RC) and decided on their attachment interpretation: high attachment (HA) or low attachment (LA). In the two-by-two design, linguistic decision-making was prompted by lexical semantic cues vs. a structural change caused by a certain type of matrix verb. The results show that whenever a matrix verb caused a change of syntactic modification, which entailed HA, both native and non-native speakers abandoned the default English-like LA and chose HA. Lexical semantic cues did not have any significant effect in RC attachment resolution. The study provides experimental evidence in favor of the similarity of native and non-native processing strategies. Both native speakers and L3 learners of English apply structural processing strategies and show similar sensitivity to a linguistic prompt that shapes RC resolution. Native and non-native processing is found to be prediction-based; structure building is performed in a top-down manner.


2004 ◽  
Vol 27 (6) ◽  
pp. 797-798 ◽  
Author(s):  
Ralph E. Hoffman ◽  
Daniel H. Mathalon ◽  
Judith M. Ford ◽  
John H. Krystal

Transcranial magnetic stimulation, EEG, and behavioral studies by our group implicate spurious activation of speech perception neurocircuitry in the genesis of auditory hallucinations in schizophrenia. The neurobiological basis of these abnormalities remains uncertain, however. We review our ongoing studies, which suggest that altered cortical coupling underlies speech processing in schizophrenia and is expressed via disrupted gamma resonances and impaired corollary discharge function of self-generated verbal thought.


2020 ◽  
Vol 32 (2) ◽  
pp. 226-240 ◽  
Author(s):  
Benedikt Zoefel ◽  
Isobella Allard ◽  
Megha Anil ◽  
Matthew H. Davis

Several recent studies have used transcranial alternating current stimulation (tACS) to demonstrate a causal role of neural oscillatory activity in speech processing. In particular, it has been shown that the ability to understand speech in a multi-speaker scenario or background noise depends on the timing of speech presentation relative to simultaneously applied tACS. However, it is possible that tACS did not change actual speech perception but rather auditory stream segregation. In this study, we tested whether the phase relation between tACS and the rhythm of degraded words, presented in silence, modulates word report accuracy. We found strong evidence for a tACS-induced modulation of speech perception, but only if the stimulation was applied bilaterally using ring electrodes (not for unilateral left hemisphere stimulation with square electrodes). These results were only obtained when data were analyzed using a statistical approach that was identified as optimal in a previous simulation study. The effect was driven by a phasic disruption of word report scores. Our results suggest a causal role of neural entrainment for speech perception and emphasize the importance of optimizing stimulation protocols and statistical approaches for brain stimulation research.
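The phasic effect this abstract describes — word report accuracy varying with the phase relation between tACS and the speech rhythm — is commonly quantified by fitting a sinusoid to accuracy as a function of stimulation phase. The sketch below shows one simple way to do that; it is an illustration of the general idea, not the specific statistical approach the study identified as optimal, and the bin count and toy numbers are assumptions.

```python
import numpy as np

def phase_modulation_depth(phases, accuracy):
    """Fit accuracy = m + a*cos(phase) + b*sin(phase) by least squares and
    return the modulation depth sqrt(a^2 + b^2) -- a simple measure of how
    strongly performance depends on tACS phase (illustrative analysis)."""
    D = np.column_stack([np.ones_like(phases), np.cos(phases), np.sin(phases)])
    coef, *_ = np.linalg.lstsq(D, accuracy, rcond=None)
    return np.hypot(coef[1], coef[2])

# Toy data: accuracy dips at one stimulation phase (a "phasic disruption")
phases = np.linspace(0, 2 * np.pi, 8, endpoint=False)   # 8 phase bins
accuracy = 0.7 - 0.1 * np.cos(phases)
depth = phase_modulation_depth(phases, accuracy)        # recovers depth 0.1
```

A null distribution for such a depth measure is typically built by permuting the phase labels, which is where the choice of statistical approach the abstract mentions becomes critical.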


1994 ◽  
Vol 5 (1) ◽  
pp. 42-46 ◽  
Author(s):  
Lynne C Nygaard ◽  
Mitchell S Sommers ◽  
David B Pisoni

To determine how familiarity with a talker's voice affects perception of spoken words, we trained two groups of subjects to recognize a set of voices over a 9-day period. One group then identified novel words produced by the same set of talkers at four signal-to-noise ratios. Control subjects identified the same words produced by a different set of talkers. The results showed that the ability to identify a talker's voice improved intelligibility of novel words produced by that talker. The results suggest that speech perception may involve talker-contingent processes whereby perceptual learning of aspects of the vocal source facilitates the subsequent phonetic analysis of the acoustic signal.


2010 ◽  
Vol 21 (01) ◽  
pp. 052-065 ◽  
Author(s):  
Richard S. Tyler ◽  
Shelley A. Witt ◽  
Camille C. Dunn ◽  
Ann Perreau ◽  
Aaron J. Parkinson ◽  
...  

Objectives: The purpose of this investigation was to determine if adult bilateral cochlear implant recipients could benefit from using a speech processing strategy in which the input spectrum was interleaved among electrodes across the two implants. Design: Two separate experiments were conducted. In both experiments, subjects were tested using a control speech processing strategy and a strategy in which the full input spectrum was filtered so that only the output of half of the filters was audible to one implant, while the output of the alternate filters was audible to the other implant. The filters were interleaved in a way that created alternate frequency “holes” between the two cochlear implants. Results: In experiment one, four subjects were tested on consonant recognition. Results indicated that one of the four subjects performed better with the interleaved strategy, one subject received a binaural advantage with the interleaved strategy that they did not receive with the control strategy, and two subjects showed no decrement in performance when using the interleaved strategy. In the second experiment, 11 subjects were tested on word recognition, sentences in noise, and localization (it should be noted that not all subjects participated in all tests). Results showed that for speech perception testing, one subject achieved significantly better scores with the interleaved strategy on all tests, and seven subjects showed a significant improvement with the interleaved strategy on at least one test. Only one subject showed a decrement in performance on all speech perception tests with the interleaved strategy. Out of nine subjects, one subject preferred the sound quality of the interleaved strategy. No one performed better on localization with the interleaved strategy.
Conclusion: Data from this study indicate that some adult bilateral cochlear implant recipients can benefit from using a speech processing strategy in which the input spectrum is interleaved among electrodes across the two implants. It is possible that the subjects in this study who showed a significant improvement with the interleaved strategy did so because of less channel interaction; however, this hypothesis was not directly tested.
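The band-assignment scheme at the heart of the interleaved strategy can be sketched schematically: alternate analysis bands go to opposite implants, so each ear receives a spectrum with complementary “holes.” This is only an illustration of the assignment logic described above — band count and indexing are hypothetical, and real processors map bands to electrodes with device-specific filter tables.

```python
def interleave_bands(n_bands):
    """Split analysis-band indices between the two implants so that each
    implant hears every other band (schematic of the interleaved strategy)."""
    left = [b for b in range(n_bands) if b % 2 == 0]   # even bands -> left
    right = [b for b in range(n_bands) if b % 2 == 1]  # odd bands -> right
    return left, right

left, right = interleave_bands(8)
# Together the two implants still cover the full spectrum,
# but neither implant receives adjacent bands.
```

The hypothesis that this reduces channel interaction follows directly from the assignment: within each implant, active bands are spaced twice as far apart as in the control strategy.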


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Uppunda Ajith Kumar ◽  
A. V. Sangamanatha ◽  
Jai Vikas

The purpose of this study was to assess the temporal processing and speech perception abilities of older adults who had been practicing meditation for more than five years. Participants comprised three groups: 30 young adults (“YA”) aged 20–30 years, 30 older adults aged 50–65 years who had practiced meditation for five years or more (effective meditators, “EM”), and 51 age-matched older adults with no meditation experience (non-meditators, “NM”). Temporal processing was evaluated using gap detection in noise, duration discrimination, modulation detection, backward masking, and duration pattern tests. Speech perception was measured in the presence of four-talker babble at a −5 dB signal-to-noise ratio and with vocoded stimuli. Results revealed that the EM group performed significantly better than the NM group on all psychophysical and speech perception tasks except the gap detection task, on which the two groups did not differ significantly. Furthermore, the EM group showed significantly better modulation detection thresholds than the YA group. The results demonstrate that the practice of meditation not only offsets the decline in temporal and speech processing abilities due to aging but also improves the ability to perceive modulations compared to young adults.


2012 ◽  
Vol 34 (3) ◽  
pp. 415-444 ◽  
Author(s):  
Sahyang Kim ◽  
Mirjam Broersma ◽  
Taehong Cho

The artificial language learning paradigm was used to investigate to what extent the use of prosodic features is universally applicable or specifically language driven in learning an unfamiliar language, and how nonnative prosodic patterns can be learned. Listeners of unrelated languages—Dutch (n= 100) and Korean (n= 100)—participated. The words to be learned varied with prosodic cues: no prosody, fundamental frequency (F0) rise in initial and final position, final lengthening, and final lengthening plus F0 rise. Both listener groups performed well above chance level with the final lengthening cue, confirming its crosslinguistic use. As for final F0 rise, however, Dutch listeners did not use it until the second exposure session, whereas Korean listeners used it at initial exposure. Neither group used initial F0 rise. On the basis of these results, F0 and durational cues appear to be universal in the sense that they are used across languages for their universally applicable auditory-perceptual saliency, but how they are used is language specific and constrains the use of available prosodic cues in processing a nonnative language. A discussion on how these findings bear on theories of second language (L2) speech perception and learning is provided.

