Do You See What I Am Saying? Exploring Visual Enhancement of Speech Comprehension in Noisy Environments

2006 ◽  
Vol 17 (5) ◽  
pp. 1147-1153 ◽  
Author(s):  
L. A. Ross ◽  
D. Saint-Amour ◽  
V. M. Leavitt ◽  
D. C. Javitt ◽  
J. J. Foxe

2007 ◽  
Vol 97 (1-3) ◽  
pp. 173-183 ◽  
Author(s):  
Lars A. Ross ◽  
Dave Saint-Amour ◽  
Victoria M. Leavitt ◽  
Sophie Molholm ◽  
Daniel C. Javitt ◽  
...  

2018 ◽  
Author(s):  
Bohan Dai ◽  
James M. McQueen ◽  
René Terporten ◽  
Peter Hagoort ◽  
Anne Kösem

Abstract
Listening to speech is difficult in noisy environments, and it is even harder when the interfering noise consists of intelligible speech rather than unintelligible sounds. This suggests that the ignored speech is not fully ignored, and that competing linguistic information interferes with the neural processing of target speech. We tested this hypothesis using magnetoencephalography (MEG) while participants listened to clear target speech in the presence of distracting noise-vocoded signals. Crucially, the noise-vocoded distractors were initially unintelligible but were perceived as intelligible speech after a short training session. We compared participants’ performance in the speech-in-noise task before and after training, as well as neural entrainment to both the target and the distracting speech. Comprehension of the clear target speech was reduced in the presence of intelligible distractors compared to when they were unintelligible. Neural entrainment to target speech in the delta range (1–4 Hz) was reduced in strength in the presence of an intelligible distractor. In contrast, neural entrainment to the distracting signals was not significantly modulated by intelligibility. These results support and extend previous findings, showing, first, that the masking effects of distracting speech originate from the degradation of the linguistic representation of target speech, and second, that delta entrainment reflects linguistic processing of speech.

Significance Statement
Comprehension of speech in noisy environments is impaired by interference from background sounds. The magnitude of this interference depends on the intelligibility of the distracting speech signals. In a magnetoencephalography experiment with a highly controlled training paradigm, we show that the linguistic information of distracting speech imposes higher-order interference on the processing of target speech, as indexed by a decline in comprehension of target speech and a reduction of delta entrainment to target speech. This work demonstrates the importance of neural oscillations for speech processing. It shows that delta oscillations reflect linguistic analysis during speech comprehension, which can be critically affected by the presence of other speech.
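To make the entrainment measure concrete, below is a minimal, self-contained sketch of how delta-band (1–4 Hz) coupling between a speech envelope and a neural channel can be quantified as spectral coherence. The signals, sampling rate, and filter settings are illustrative assumptions, not the authors’ MEG pipeline.

```python
# Minimal sketch: quantify delta-band (1-4 Hz) "entrainment" as
# magnitude-squared coherence between a speech envelope and a neural signal.
# All signals here are synthetic; this is NOT the authors' analysis pipeline.
import numpy as np
from scipy.signal import butter, filtfilt, coherence

fs = 250.0                      # sampling rate in Hz (assumed)
t = np.arange(0, 60, 1 / fs)    # 60 s of data

# Synthetic "speech envelope": slow fluctuation around 2 Hz plus noise
rng = np.random.default_rng(0)
envelope = np.sin(2 * np.pi * 2.0 * t) + 0.5 * rng.standard_normal(t.size)

# Synthetic "MEG channel": a lagged, noisier copy of the envelope
lag = int(0.1 * fs)             # assume a 100 ms neural lag
meg = np.roll(envelope, lag) + 1.0 * rng.standard_normal(t.size)

# Band-pass both signals to the delta range (1-4 Hz)
b, a = butter(4, [1.0, 4.0], btype="bandpass", fs=fs)
env_delta = filtfilt(b, a, envelope)
meg_delta = filtfilt(b, a, meg)

# Coherence spectrum; average over the delta band as a summary measure
f, coh = coherence(env_delta, meg_delta, fs=fs, nperseg=int(4 * fs))
delta_coh = coh[(f >= 1.0) & (f <= 4.0)].mean()
print(f"Mean delta-band coherence: {delta_coh:.2f}")
```

In a design like the one described above, the same summary statistic would simply be compared across conditions (intelligible versus unintelligible distractor).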


2020 ◽  
Author(s):  
Luuk P.H. van de Rijt ◽  
A. John van Opstal ◽  
Marc M. van Wanrooij

Abstract
The cochlear implant (CI) allows profoundly deaf individuals to partially recover hearing. Still, because of the coarse acoustic information provided by the implant, CI users have considerable difficulty recognizing speech, especially in noisy environments, even years after implantation. CI users therefore rely heavily on visual cues to augment speech comprehension, more so than normal-hearing individuals. However, it is unknown how attention to one (focused) or both (divided) modalities plays a role in multisensory speech recognition. Here we show that unisensory speech listening and speech reading were negatively impacted in divided-attention tasks for CI users, but not for normal-hearing individuals. Our psychophysical experiments revealed that, as expected, listening thresholds were consistently better for the normal-hearing group, while lipreading thresholds were largely similar for the two groups. Moreover, audiovisual speech recognition for normal-hearing individuals could be described well by probabilistic summation of auditory and visual speech recognition, while CI users were better integrators than expected from statistical facilitation alone. Our results suggest, however, that this benefit in integration comes at a cost. Unisensory speech recognition is degraded for CI users when attention needs to be divided across modalities, i.e., in situations with uncertainty about the upcoming stimulus modality. We conjecture that CI users exhibit an integration-attention trade-off: they focus solely on a single modality during focused-attention tasks, but need to divide their limited attentional resources across modalities during divided-attention tasks. We argue that, in order to determine the benefit of a CI for speech comprehension, situational factors need to be discounted by presenting speech in realistic or complex audiovisual environments.

Significance Statement
Deaf individuals using a cochlear implant require significant effort to listen in noisy environments because of their impoverished hearing. Lipreading can benefit them and reduce the burden of listening by providing an additional source of information. Here we show that the improved speech recognition under audiovisual stimulation comes at a cost, however, as cochlear-implant users now need to listen and speech-read simultaneously, paying attention to both modalities. The data suggest that cochlear-implant users run into the limits of their attentional resources, and we argue that they, unlike normal-hearing individuals, always need to consider whether a multisensory benefit outweighs the unisensory cost in everyday environments.
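The probabilistic-summation benchmark referred to above has a simple closed form: if auditory and visual recognition were statistically independent, the predicted audiovisual score would be P_AV = 1 − (1 − P_A)(1 − P_V). The sketch below uses illustrative (not reported) scores to show how an observed audiovisual score can be compared against this prediction.

```python
# Minimal sketch: probability summation as a benchmark for audiovisual
# speech recognition. Observed AV scores above this prediction suggest
# integration beyond statistical facilitation. All numbers are illustrative.

def probability_summation(p_auditory: float, p_visual: float) -> float:
    """Predicted AV recognition rate if the two channels are independent."""
    return 1.0 - (1.0 - p_auditory) * (1.0 - p_visual)

# Illustrative unisensory scores (proportion of words recognized)
p_a, p_v = 0.40, 0.30
p_av_pred = probability_summation(p_a, p_v)
print(f"Predicted AV score under probability summation: {p_av_pred:.2f}")
# An observed AV score clearly above this value would indicate
# "better-than-expected" integration, as reported here for CI users.
```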


2017 ◽  
Vol 26 (5) ◽  
pp. 451-457 ◽  
Author(s):  
Lucy C. Erickson ◽  
Rochelle S. Newman

The goal of this review is to provide a high-level, selected overview of the consequences of background noise on health, perception, cognition, and learning during early development, with a specific focus on how noise may impair speech comprehension and language learning (e.g., via masking). Although much of the existing literature has focused on adults, research shows that infants and young children are relatively disadvantaged at listening in noise. Consequently, a major goal is to consider how background noise may affect young children, who must learn and develop language in noisy environments despite being simultaneously less equipped to do so.


2020 ◽  
Vol 33 (3) ◽  
pp. 277-294 ◽  
Author(s):  
Niti Jaha ◽  
Stanley Shen ◽  
Jess R. Kerlin ◽  
Antoine J. Shahin

Abstract
Lip-reading improves intelligibility in noisy acoustic environments. We hypothesized that watching mouth movements benefits speech comprehension in a ‘cocktail party’ by strengthening the encoding of the neural representations of the visually paired speech stream. In an audiovisual (AV) task, EEG was recorded as participants watched and listened to videos of a speaker uttering a sentence while also hearing a concurrent sentence spoken by a speaker of the opposite gender. A key manipulation was that each audio sentence had a 200-ms segment replaced by white noise. To assess comprehension, subjects were tasked with transcribing the AV-attended sentence on randomly selected trials. In the auditory-only trials, subjects listened to the same sentences and completed the same task while watching a static picture of a speaker of either gender. Subjects directed their listening to the voice of the gender of the speaker in the video. We found that the N1 auditory-evoked potential (AEP) time-locked to white-noise onsets was significantly more inhibited for the AV-attended sentences than for the auditorily attended (A-attended) and AV-unattended sentences. N1 inhibition to noise onsets has been shown to index restoration of phonemic representations of degraded speech. These results underscore that attention and congruency in the AV setting help streamline the complex auditory scene, partly by reinforcing the neural representations of the visually attended stream, heightening the perception of continuity and comprehension.
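For readers unfamiliar with the analysis, the sketch below shows the generic logic of extracting an N1 auditory-evoked potential time-locked to noise onsets: epoch the continuous recording around each onset, baseline-correct, average across trials, and read out the negative deflection around 100 ms. The data, sampling rate, and windows are assumptions for illustration, not the authors’ preprocessing.

```python
# Minimal sketch: epoch EEG around white-noise onsets and average to obtain
# an auditory-evoked potential, then read out N1 amplitude (~100 ms).
# Data are synthetic; this is NOT the authors' preprocessing pipeline.
import numpy as np

fs = 500                                   # sampling rate in Hz (assumed)
rng = np.random.default_rng(1)
eeg = rng.standard_normal(fs * 120)        # 2 min of one-channel "EEG"

# Assumed sample indices of the 200-ms white-noise onsets
onsets = np.arange(2 * fs, eeg.size - fs, 3 * fs)

pre, post = int(0.2 * fs), int(0.5 * fs)   # -200 ms to +500 ms epochs
epochs = []
for onset in onsets:
    epoch = eeg[onset - pre:onset + post].copy()
    epoch -= epoch[:pre].mean()            # baseline correction
    epochs.append(epoch)

erp = np.mean(epochs, axis=0)              # average across trials
times = (np.arange(-pre, post) / fs) * 1000.0

# N1: most negative deflection in an 80-150 ms window after onset
win = (times >= 80) & (times <= 150)
n1_amp = erp[win].min()
print(f"N1 amplitude (arbitrary units): {n1_amp:.3f}")
```

The contrast of interest would then be the N1 amplitude compared across the AV-attended, A-attended, and AV-unattended conditions.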


2020 ◽  
Vol 32 (10) ◽  
pp. 1975-1983 ◽  
Author(s):  
Esti Blanco-Elorrieta ◽  
Nai Ding ◽  
Liina Pylkkänen ◽  
David Poeppel

Understanding speech in noise is a fundamental challenge for speech comprehension. This perceptual demand is amplified in a second language: It is a common experience in bars, train stations, and other noisy environments that degraded signal quality severely compromises second language comprehension. Through a novel design, paired with a carefully selected participant profile, we independently assessed signal-driven and knowledge-driven contributions to the brain bases of first versus second language processing. We were able to dissociate the neural processes driven by the speech signal from the processes that come from speakers' knowledge of their first versus second languages. The neurophysiological data show that, in combination with impaired access to top-down linguistic information in the second language, the locus of bilinguals' difficulty in understanding second language speech in noisy conditions arises from a failure to successfully perform a basic, low-level process: cortical entrainment to speech signals above the syllabic level.


2019 ◽  
Author(s):  
Esti Blanco-Elorrieta ◽  
Nai Ding ◽  
Liina Pylkkänen ◽  
David Poeppel

Abstract
Understanding speech in noise is a fundamental challenge for speech comprehension. This perceptual demand is amplified in a second language: it is a common experience in bars, train stations, and other noisy environments that degraded signal quality severely compromises second language comprehension. Through a novel design, paired with a carefully selected participant profile, we independently assessed signal-driven and knowledge-driven contributions to the brain bases of first versus second language processing. We were able to dissociate the neural processes driven by the speech signal from the processes that come from speakers’ knowledge of their first versus second languages. The neurophysiological data show that, in combination with impaired access to top-down linguistic information in the second language, the locus of bilinguals’ difficulty in understanding second language speech in noisy conditions arises from a failure to successfully perform a basic, low-level process: cortical entrainment to speech signals above the syllabic level.

Significance Statement
Over half of the world’s population is multilingual. Although proficiency in a second language improves over time, even to the point of reaching native-like proficiency, one persistent difficulty is understanding a second language in noisy environments (e.g., in crowds or train stations). We examined this processing impairment using a novel frequency-tagging paradigm with magnetoencephalography (MEG). We found that bilinguals’ difficulty in understanding second language speech in noisy conditions stems from a failure to neurally track the operations that reflect structure (i.e., phrase) building.
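The frequency-tagging logic can be illustrated with a toy simulation: if syllables arrive at a fixed rate (assumed 4 Hz here) and phrases at half that rate, tracking of linguistic structure shows up as spectral power at the phrase rate over and above the syllabic rate. The rates, signal construction, and the "L1 versus L2" contrast below are illustrative assumptions, not the study's stimuli or MEG analysis.

```python
# Minimal sketch of the frequency-tagging logic: syllables at 4 Hz,
# phrases at 2 Hz. Peaks in the response spectrum below the syllabic rate
# index tracking of linguistic structure above the syllable level.
# Signals are synthetic; this is NOT the authors' MEG analysis.
import numpy as np
from scipy.signal import welch

fs = 200.0
t = np.arange(0, 120, 1 / fs)             # 120 s of "neural" signal
rng = np.random.default_rng(2)

syllable = np.sin(2 * np.pi * 4.0 * t)    # acoustic/syllabic rhythm
phrase = np.sin(2 * np.pi * 2.0 * t)      # phrase-level structure
noise = rng.standard_normal(t.size)

# Toy contrast: phrase-rate tracking preserved in "L1", attenuated in "L2"
l1_response = syllable + 0.8 * phrase + noise
l2_response = syllable + 0.1 * phrase + noise

for label, sig in [("L1", l1_response), ("L2", l2_response)]:
    f, pxx = welch(sig, fs=fs, nperseg=int(10 * fs))   # 0.1 Hz resolution
    peak_2hz = pxx[np.argmin(np.abs(f - 2.0))]
    print(f"{label}: power at 2 Hz (phrase rate) = {peak_2hz:.2f}")
```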


Author(s):  
Avanti Dey

A significant problem in the area of speech perception is that noisy listening environments often make it difficult to understand what is being said. Furthermore, speech contains many ambiguous words that carry multiple meanings, which can make speech comprehension even more difficult. Previous research has found that spoken sentences containing ambiguous words (e.g., “the woman was told that the mint was used for making coins”) are harder to understand in noise than matched sentences without such words; we call this phenomenon the “ambiguity effect”. The current study examined individuals’ understanding of speech in noisy environments when that speech contains ambiguous words, and how context can influence this understanding. By manipulating the context in which sentences are presented, I examined whether listeners’ interpretation of a sentence can be biased towards a particular meaning, thereby affecting intelligibility. Participants listened to noisy sentences, each of which was preceded by a priming word intended to provide a particular context for the sentence. Two main predictions follow. First, I predict that listeners will understand less of sentences that contain ambiguous words than of matched sentences that do not. Second, I predict that priming words (particularly related primes) will benefit listeners more for sentences with ambiguous words than for sentences without them. Preliminary findings will be discussed in the presentation. This work will contribute to the current literature on how we use semantic information to understand speech in challenging listening environments.


2009 ◽  
Vol 21 (9) ◽  
pp. 1790-1804 ◽  
Author(s):  
Christopher W. Bishop ◽  
Lee M. Miller

In noisy environments, listeners tend to hear a speaker's voice yet struggle to understand what is said. The most effective way to improve intelligibility in such conditions is to watch the speaker's mouth movements. Here we identify the neural networks that distinguish understanding from merely hearing speech, and determine how the brain applies visual information to improve intelligibility. Using functional magnetic resonance imaging, we show that understanding speech-in-noise is supported by a network of brain areas including the left superior parietal lobule, the motor/premotor cortex, and the left anterior superior temporal sulcus (STS), a likely apex of the acoustic processing hierarchy. Multisensory integration likely improves comprehension through enhanced communication between the left temporal-occipital boundary, the left medial-temporal lobe, and the left STS. This demonstrates how the brain uses information from multiple modalities to improve speech comprehension in naturalistic, acoustically adverse conditions.

