Time Course of Early Audiovisual Interactions during Speech and Nonspeech Central Auditory Processing: A Magnetoencephalography Study

2009 ◽  
Vol 21 (2) ◽  
pp. 259-274 ◽  
Author(s):  
Ingo Hertrich ◽  
Klaus Mathiak ◽  
Werner Lutzenberger ◽  
Hermann Ackermann

Cross-modal fusion phenomena suggest specific interactions of auditory and visual sensory information within both the speech and nonspeech domains. Using whole-head magnetoencephalography, this study recorded M50 and M100 fields evoked by ambiguous acoustic stimuli that were visually disambiguated to perceived /ta/ or /pa/ syllables. As in natural speech, visual motion onset preceded the acoustic signal by 150 msec. Control conditions included visual and acoustic nonspeech signals as well as visual-only and acoustic-only stimuli. (a) Both speech and nonspeech motion yielded a consistent attenuation of the auditory M50 field, suggesting a visually induced “preparatory baseline shift” at the level of the auditory cortex. (b) Within the temporal domain of the auditory M100 field, visual speech and nonspeech motion gave rise to different response patterns (nonspeech: M100 attenuation; visual /pa/: left-hemisphere M100 enhancement; /ta/: no effect). (c) These interactions could be further decomposed using a six-dipole model. One of its three dipole pairs (V270) was fitted to motion-induced activity at a latency of 270 msec after motion onset, that is, the time domain of the auditory M100 field, and could be attributed to the posterior insula. This dipole source responded to nonspeech motion and visual /pa/, but was suppressed in the case of visual /ta/. Such a nonlinear interaction might reflect the operation of a binary distinction between the marked phonological feature “labial” and its underspecified competitor “coronal.” Thus, visual processing seems to be shaped by linguistic data structures even prior to its fusion with the auditory channel.

2021 ◽  
Author(s):  
Min Zhang ◽  
Rachel N Denison ◽  
Denis G Pelli ◽  
Thuy Tien C Le ◽  
Antje Ihlefeld

In noisy or cluttered environments, sensory cortical mechanisms help combine auditory or visual features into perceived objects. Knowing that individuals vary greatly in their ability to suppress unwanted sensory information, and knowing that the sizes of auditory and visual cortical regions are correlated, we wondered whether there might be a corresponding relation between an individual’s ability to suppress auditory vs. visual interference. In auditory masking, background sound makes spoken words unrecognizable. When masking arises due to interference at central auditory processing stages, beyond the cochlea, it is called informational masking (IM). A strikingly similar phenomenon in vision, called visual crowding, occurs when nearby clutter makes a target object unrecognizable, despite being resolved at the retina. We here compare susceptibilities to auditory IM and visual crowding in the same participants. Surprisingly, across participants, we find a negative correlation (R = −0.7) between IM susceptibility and crowding susceptibility: Participants who have low susceptibility to IM tend to have high susceptibility to crowding, and vice versa. This reveals a mid-level trade-off between auditory and visual processing.
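The reported relation is a simple across-participant correlation between two susceptibility measures. Below is a minimal sketch of that comparison on simulated per-participant scores; all values and variable names are illustrative, not the study's data.

```python
# Illustrative sketch (not the authors' code): across-participant correlation
# between informational-masking (IM) susceptibility and crowding susceptibility.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical per-participant susceptibility scores (higher = more susceptible).
# In the study these would be derived from masking thresholds and crowding
# distances; here they are simulated just to show the analysis step.
n_participants = 20
crowding = rng.normal(0.0, 1.0, n_participants)
im = -0.7 * crowding + rng.normal(0.0, 0.7, n_participants)  # negative relation built in

r, p = pearsonr(im, crowding)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")  # a negative r mirrors the reported R = -0.7
```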


2021 ◽  
pp. 174702182199003
Author(s):  
Andy J Kim ◽  
David S Lee ◽  
Brian A Anderson

Previously reward-associated stimuli have consistently been shown to involuntarily capture attention in the visual domain. Although previously reward-associated but currently task-irrelevant sounds have also been shown to interfere with visual processing, it remains unclear whether such stimuli can interfere with the processing of task-relevant auditory information. To address this question, we modified a dichotic listening task to measure interference from task-irrelevant but previously reward-associated sounds. In a training phase, participants were simultaneously presented with a spoken letter and number in different auditory streams and learned to associate the correct identification of each of three letters with high, low, and no monetary reward, respectively. In a subsequent test phase, participants were again presented with the same auditory stimuli but were instead instructed to report the number while ignoring spoken letters. In both the training and test phases, response time measures demonstrated that attention was biased in favour of the auditory stimulus associated with high value. Our findings demonstrate that attention can be biased towards learned reward cues in the auditory domain, interfering with goal-directed auditory processing.


Author(s):  
Wessam Mostafa Essawy

Background: Amblyaudia is a weakness in the listener’s binaural processing of auditory information. Subjects with amblyaudia also demonstrate binaural integration deficits and may display similar patterns in their evoked responses in terms of latency and amplitude. The purpose of this study was to identify the presence of amblyaudia in a population of young children and to measure mismatch negativity (MMN), P300, and cortical auditory evoked potentials (CAEPs) in those individuals.
Methods: Subjects were divided into two groups: a control group of 20 normal-hearing subjects with normal developmental milestones and normal speech development, and a study group (GII) of 50 subjects with central auditory processing disorders (CAPDs) diagnosed by central auditory screening tests.
Results: Using dichotic tests, including the dichotic digits test (DDT) and the competing sentence test (CST), the cases were classified as normal, dichotic dysaudia, amblyaudia, and amblyaudia plus (40%, 14%, 38%, and 8%, respectively). Using event-related potentials, we found that P300 and MMN are more specific in detecting neurocognitive dysfunction related to the allocation of attentional resources and immediate memory in these cases.
Conclusions: Amblyaudia is present in cases of central auditory processing disorders (CAPDs), and event-related potentials provide an objective tool for diagnosis, prognosis, and follow-up after rehabilitation.
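The four-way classification above turns on how each ear performs on the dichotic tests and on how large the interaural asymmetry is. The sketch below is a hedged illustration of such a decision rule; the cutoff values are hypothetical placeholders, not the study's normative criteria.

```python
# Hedged sketch of the dichotic classification logic described in the abstract
# (normal / dichotic dysaudia / amblyaudia / amblyaudia plus). Cutoffs are
# hypothetical placeholders, not the study's norms.
def classify_dichotic(dominant_ear: float, nondominant_ear: float,
                      ear_norm: float = 70.0, asymmetry_norm: float = 10.0) -> str:
    """Classify a listener from dichotic test scores (percent correct per ear)."""
    asymmetry = dominant_ear - nondominant_ear
    low_dominant = dominant_ear < ear_norm
    large_asymmetry = asymmetry > asymmetry_norm

    if large_asymmetry and not low_dominant:
        return "amblyaudia"        # dominant ear normal, non-dominant ear lags behind
    if large_asymmetry and low_dominant:
        return "amblyaudia plus"   # both ears depressed and strongly asymmetric
    if low_dominant:
        return "dichotic dysaudia" # both ears depressed, roughly symmetric
    return "normal"

print(classify_dichotic(dominant_ear=90, nondominant_ear=60))  # -> amblyaudia
```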


2019 ◽  
Author(s):  
Samson Chota ◽  
Rufin VanRullen

It has long been debated whether visual processing is, at least partially, a discrete process. Although vision appears to be a continuous stream of sensory information, sophisticated experiments reveal periodic modulations of perception and behavior. Previous work has demonstrated that the phase of endogenous neural oscillations in the 10 Hz range predicts the “lag” of the flash lag effect, a temporal visual illusion in which a static object is perceived to be lagging in time behind a moving object. Consequently, it has been proposed that the flash lag illusion could be a manifestation of a periodic, discrete sampling mechanism in the visual system. In this experiment we set out to causally test this hypothesis by entraining the visual system to a periodic 10 Hz stimulus and probing the flash lag effect (FLE) at different time points during entrainment. We hypothesized that the perceived FLE would be modulated over time, at the same frequency as the entrainer (10 Hz). A frequency analysis of the average FLE time-course indeed reveals a significant peak at 10 Hz as well as a strong phase consistency between subjects (N=26). Our findings provide evidence for a causal relationship between alpha oscillations and fluctuations in temporal perception.
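The key analysis is a frequency decomposition of each subject's FLE time course followed by a test of phase alignment across subjects. Below is a minimal sketch of that kind of analysis on simulated data; the sampling rate, trial structure, and values are assumptions for illustration only.

```python
# Illustrative sketch (not the authors' pipeline): detecting a 10 Hz modulation
# in flash-lag-effect (FLE) time courses and measuring phase consistency across
# subjects. Data are simulated with a common-phase 10 Hz component.
import numpy as np

rng = np.random.default_rng(1)
fs = 100.0                       # Hz, hypothetical sampling of probe times
t = np.arange(0, 1.0, 1.0 / fs)  # 1 s of probe positions relative to entrainer onset
n_subjects = 26

# Simulated per-subject FLE time courses (rows = subjects).
fle = np.sin(2 * np.pi * 10 * t) + rng.normal(0, 1.0, (n_subjects, t.size))

# Frequency analysis of each subject's (demeaned) FLE time course.
spectrum = np.fft.rfft(fle - fle.mean(axis=1, keepdims=True), axis=1)
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
idx_10hz = np.argmin(np.abs(freqs - 10.0))

amp_10hz = np.abs(spectrum[:, idx_10hz]).mean()           # group amplitude at 10 Hz
phases = np.angle(spectrum[:, idx_10hz])
phase_consistency = np.abs(np.mean(np.exp(1j * phases)))  # inter-subject phase locking, 0..1

print(f"10 Hz amplitude: {amp_10hz:.2f}, phase consistency: {phase_consistency:.2f}")
```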


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Min Zhang ◽  
Rachel N Denison ◽  
Denis G Pelli ◽  
Thuy Tien C Le ◽  
Antje Ihlefeld

Sensory cortical mechanisms combine auditory or visual features into perceived objects. This is difficult in noisy or cluttered environments. Knowing that individuals vary greatly in their susceptibility to clutter, we wondered whether there might be a relation between an individual’s auditory and visual susceptibilities to clutter. In auditory masking, background sound makes spoken words unrecognizable. When masking arises due to interference at central auditory processing stages, beyond the cochlea, it is called informational masking. A strikingly similar phenomenon in vision, called visual crowding, occurs when nearby clutter makes a target object unrecognizable, despite being resolved at the retina. We here compare susceptibilities to auditory informational masking and visual crowding in the same participants. Surprisingly, across participants, we find a negative correlation (R = –0.7) between susceptibility to informational masking and crowding: Participants who have low susceptibility to auditory clutter tend to have high susceptibility to visual clutter, and vice versa. This reveals a tradeoff in the brain between auditory and visual processing.


Author(s):  
Jamileh Chupani ◽  
Mohanna Javanbakht ◽  
Yones Lotfi

Background and Aim: The majority of the world’s population is bilingual. Bilingualism is a form of sensory enrichment that translates to gains in cognitive abilities; these cognitive gains in attention and memory are known to modulate subcortical processing of auditory stimuli. Second language acquisition has a broad impact on various psychological, cognitive, memory, and linguistic processes. Central auditory processing (CAP) is the perceptual processing of auditory information. Because of its importance in bilingualism, this study aimed to review the CAP of bilinguals.
Recent Findings: CAP was studied in three areas: dichotic listening, temporal processing, and speech in noise perception. Regarding dichotic listening, studies have shown that bilinguals perform better than monolinguals on the staggered spondaic word (SSW) test, the consonant-vowel dichotic test, the dichotic digits test (DDT), and the disyllable dichotic test, although similar results between groups have also been reported for the SSW and DDT. Regarding temporal processing, the results of bilinguals do not differ from those of monolinguals, although in some cases they are better in bilinguals. Regarding speech in noise perception, the results for bilinguals and monolinguals vary depending on the amount of linguistic information available in the stimuli.
Conclusion: Bilingualism has a positive effect on dichotic processing, no effect on temporal processing, and a varied effect on speech in noise perception. Bilinguals perform worse with meaningful speech and better with meaningless speech.
Keywords: Central auditory processing; bilingual; dichotic listening; temporal processing; speech in noise perception


2021 ◽  
Vol 15 ◽  
Author(s):  
Ruxandra I. Tivadar ◽  
Robert T. Knight ◽  
Athina Tzovara

The human brain has the astonishing capacity to integrate streams of sensory information from the environment and to form predictions about future events in an automatic way. Although predictive coding was initially developed for visual processing, the bulk of subsequent research has focused on auditory processing, with the famous mismatch negativity signal as possibly the most studied signature of a surprise or prediction error (PE) signal. Auditory PEs are present during various states of consciousness. Intriguingly, their presence and characteristics have been linked with residual levels of consciousness and the return of awareness. In this review we first give an overview of the neural substrates of predictive processes in the auditory modality and their relation to consciousness. Then, we focus on different states of consciousness (wakefulness, sleep, anesthesia, coma, meditation, and hypnosis) and on what predictive processing has been able to disclose about brain functioning in such states. We review studies investigating how the neural signatures of auditory predictions are modulated by states of reduced or absent consciousness. As a future outlook, we propose combining electrophysiological and computational techniques to investigate which facets of sensory predictive processes are maintained when consciousness fades.


2019 ◽  
Author(s):  
Patrick J. Karas ◽  
John F. Magnotti ◽  
Brian A. Metzger ◽  
Lin L. Zhu ◽  
Kristen B. Smith ◽  
...  

Vision provides a perceptual head start for speech perception because most speech is “mouth-leading”: visual information from the talker’s mouth is available before auditory information from the voice. However, some speech is “voice-leading” (auditory before visual). Consistent with a model in which vision modulates subsequent auditory processing, there was a larger perceptual benefit of visual speech for mouth-leading vs. voice-leading words (28% vs. 4%). The neural substrates of this difference were examined by recording broadband high-frequency activity from electrodes implanted over auditory association cortex in the posterior superior temporal gyrus (pSTG) of epileptic patients. Responses were smaller for audiovisual vs. auditory-only mouth-leading words (34% difference) while there was little difference (5%) for voice-leading words. Evidence for cross-modal suppression of auditory cortex complements our previous work showing enhancement of visual cortex (Ozker et al., 2018b) and confirms that multisensory interactions are a powerful modulator of activity throughout the speech perception network.
Impact Statement: Human perception and brain responses differ between words in which mouth movements are visible before the voice is heard and words for which the reverse is true.
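The suppression effect is expressed as a percent difference between audiovisual and auditory-only high-frequency responses, computed separately for mouth-leading and voice-leading words. The sketch below shows that computation on simulated electrode values; the numbers are illustrative, not the recorded data.

```python
# Hedged sketch: percent difference in broadband high-frequency (high-gamma)
# response between audiovisual (AV) and auditory-only (A) trials, computed
# separately for mouth-leading and voice-leading words. Values are simulated.
import numpy as np

rng = np.random.default_rng(2)
n_electrodes = 12

# Simulated mean high-gamma amplitudes (percent signal change) per pSTG electrode.
a_mouth = rng.normal(100, 10, n_electrodes)   # auditory-only, mouth-leading
av_mouth = a_mouth * 0.66                     # ~34% suppression built in
a_voice = rng.normal(100, 10, n_electrodes)   # auditory-only, voice-leading
av_voice = a_voice * 0.95                     # ~5% difference built in

def percent_suppression(auditory_only, audiovisual):
    """Percent reduction of the AV response relative to the auditory-only response."""
    return 100 * (auditory_only - audiovisual).mean() / auditory_only.mean()

print(f"mouth-leading: {percent_suppression(a_mouth, av_mouth):.0f}% suppression")
print(f"voice-leading: {percent_suppression(a_voice, av_voice):.0f}% suppression")
```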


2012 ◽  
Vol 25 (0) ◽  
pp. 194
Author(s):  
Carolina Sánchez-García ◽  
Sonia Kandel ◽  
Christophe Savariaux ◽  
Nara Ikumi ◽  
Salvador Soto-Faraco

When both are present, visual and auditory information are combined to decode the speech signal. Past research has addressed the extent to which visual information helps distinguish confusable speech sounds, but has usually ignored the continuous nature of speech perception. Here we trace the time course of the contribution of visual and auditory information during speech perception. To this end, we designed an audio–visual gating task with videos recorded with a high-speed camera. Participants were asked to identify gradually longer fragments of pseudowords varying in the central consonant. Spanish consonant phonemes with different degrees of visual and acoustic saliency were included and tested in visual-only, auditory-only, and audio–visual trials. The data showed different patterns of contribution of unimodal and bimodal information during identification, depending on the visual saliency of the presented phonemes. In particular, for phonemes that are clearly more salient in one modality than the other, audio–visual performance equals that of the better unimodal condition. For phonemes with more balanced saliency, audio–visual performance was better than both unimodal conditions. These results shed new light on the time course of audio–visual speech integration.
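The two outcome patterns can be made concrete by comparing audiovisual accuracy with the better unimodal accuracy and with an independent-cues benchmark, one common way to formalize a multisensory benefit. The accuracies below are illustrative values, not the study's data.

```python
# Hedged sketch: comparing audiovisual (AV) identification accuracy with the two
# patterns described in the abstract, using made-up accuracies for two phonemes.
def independent_cues_prediction(p_auditory: float, p_visual: float) -> float:
    """Accuracy expected if auditory and visual cues fail independently."""
    return 1 - (1 - p_auditory) * (1 - p_visual)

phonemes = {
    # phoneme type: (auditory-only, visual-only, audiovisual) accuracy; illustrative
    "highly salient in one modality": (0.95, 0.40, 0.95),
    "balanced saliency":              (0.60, 0.55, 0.85),
}

for label, (p_a, p_v, p_av) in phonemes.items():
    best_unimodal = max(p_a, p_v)
    pred = independent_cues_prediction(p_a, p_v)
    print(f"{label}: AV={p_av:.2f}, best unimodal={best_unimodal:.2f}, "
          f"independent-cues prediction={pred:.2f}")
```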

