audiovisual speech recognition Latest Research Papers

AbstractObjectivesPeople with Schizophrenia (SZ) show deficits in auditory and audiovisual speech recognition. It is possible that these deficits are related to aberrant early sensory processing, combined with an impaired ability to utilize visual cues to improve speech recognition. In this electroencephalography study we tested this by having SZ and healthy controls (HC) identify different unisensory auditory and bisensory audiovisual syllables at different auditory noise levels.MethodsSZ (N = 24) and HC (N = 21) identified one of three different syllables (/da/, /ga/, /ta/) at three different noise levels (no, low, high). Half the trials were unisensory auditory and the other half provided additional visual input of moving lips. Task-evoked mediofrontal N1 and P2 brain potentials triggered to the onset of the auditory syllables were derived and related to behavioral performance.ResultsIn comparison to HC, SZ showed speech recognition deficits for unisensory and bisensory stimuli. These deficits were primarily found in the no noise condition. Paralleling these observations, reduced N1 amplitudes to unisensory and bisensory stimuli in SZ were found in the no noise condition. In HC the N1 amplitudes were positively related to the speech recognition performance, whereas no such relationships were found in SZ. Moreover, no group differences in multisensory speech recognition benefits and N1 suppression effects for bisensory stimuli were observed.ConclusionOur study shows that reduced N1 amplitudes relate to auditory and audiovisual speech processing deficits in SZ. The findings that the amplitude effects were confined to salient speech stimuli and the attenuated relationship with behavioral performance, compared to HC, indicates a diminished decoding of the auditory speech signals in SZs. Our study also revealed intact multisensory benefits in SZs, which indicates that the observed auditory and audiovisual speech recognition deficits were primarily related to aberrant auditory speech processing.HighlightsSpeech processing deficits in schizophrenia related to reduced N1 amplitudes Audiovisual suppression effect in N1 preserved in schizophrenia Schizophrenia showed weakened P2 components in specifically audiovisual processing

Download Full-text

Predicting children's audiovisual speech recognition thresholds based on unimodal performance

The Journal of the Acoustical Society of America ◽

10.1121/10.0007962 ◽

2021 ◽

Vol 150 (4) ◽

pp. A153-A153

Author(s):

Kaylah Lalonde

Keyword(s):

Speech Recognition ◽

Audiovisual Speech ◽

Audiovisual Speech Recognition

Download Full-text

Multisensory Integration-Attention Trade-Off in Cochlear-Implanted Deaf Individuals

Frontiers in Neuroscience ◽

10.3389/fnins.2021.683804 ◽

2021 ◽

Vol 15 ◽

Author(s):

Luuk P. H. van de Rijt ◽

A. John van Opstal ◽

Marc M. van Wanrooij

Keyword(s):

Speech Recognition ◽

Visual Cues ◽

Normal Hearing ◽

Situational Factors ◽

Visual Speech ◽

Noisy Environments ◽

Trade Off ◽

Visual Speech Recognition ◽

Attention Tasks ◽

Audiovisual Speech Recognition

The cochlear implant (CI) allows profoundly deaf individuals to partially recover hearing. Still, due to the coarse acoustic information provided by the implant, CI users have considerable difficulties in recognizing speech, especially in noisy environments. CI users therefore rely heavily on visual cues to augment speech recognition, more so than normal-hearing individuals. However, it is unknown how attention to one (focused) or both (divided) modalities plays a role in multisensory speech recognition. Here we show that unisensory speech listening and reading were negatively impacted in divided-attention tasks for CI users—but not for normal-hearing individuals. Our psychophysical experiments revealed that, as expected, listening thresholds were consistently better for the normal-hearing, while lipreading thresholds were largely similar for the two groups. Moreover, audiovisual speech recognition for normal-hearing individuals could be described well by probabilistic summation of auditory and visual speech recognition, while CI users were better integrators than expected from statistical facilitation alone. Our results suggest that this benefit in integration comes at a cost. Unisensory speech recognition is degraded for CI users when attention needs to be divided across modalities. We conjecture that CI users exhibit an integration-attention trade-off. They focus solely on a single modality during focused-attention tasks, but need to divide their limited attentional resources in situations with uncertainty about the upcoming stimulus modality. We argue that in order to determine the benefit of a CI for speech recognition, situational factors need to be discounted by presenting speech in realistic or complex audiovisual environments.

Download Full-text