Time Course of Early Audiovisual Interactions during Speech and Nonspeech Central Auditory Processing: A Magnetoencephalography Study

2009 ◽  
Vol 21 (2) ◽  
pp. 259-274 ◽  
Author(s):  
Ingo Hertrich ◽  
Klaus Mathiak ◽  
Werner Lutzenberger ◽  
Hermann Ackermann

Cross-modal fusion phenomena suggest specific interactions of auditory and visual sensory information within both the speech and nonspeech domains. Using whole-head magnetoencephalography, this study recorded M50 and M100 fields evoked by ambiguous acoustic stimuli that were visually disambiguated to perceived /ta/ or /pa/ syllables. As in natural speech, visual motion onset preceded the acoustic signal by 150 msec. Control conditions included visual and acoustic nonspeech signals as well as visual-only and acoustic-only stimuli. (a) Both speech and nonspeech motion yielded a consistent attenuation of the auditory M50 field, suggesting a visually induced “preparatory baseline shift” at the level of the auditory cortex. (b) Within the temporal domain of the auditory M100 field, visual speech and nonspeech motion gave rise to different response patterns (nonspeech: M100 attenuation; visual /pa/: left-hemisphere M100 enhancement; /ta/: no effect). (c) These interactions could be further decomposed using a six-dipole model. One of its three dipole pairs (V270) was fitted to motion-induced activity at a latency of 270 msec after motion onset, that is, the time domain of the auditory M100 field, and could be attributed to the posterior insula. This dipole source responded to nonspeech motion and visual /pa/, but was suppressed in the case of visual /ta/. Such a nonlinear interaction might reflect the operation of a binary distinction between the marked phonological feature “labial” and its underspecified competitor “coronal.” Thus, visual processing seems to be shaped by linguistic data structures even prior to its fusion with the auditory channel.

2021 ◽  
Author(s):  
Min Zhang ◽  
Rachel N Denison ◽  
Denis G Pelli ◽  
Thuy Tien C Le ◽  
Antje Ihlefeld

In noisy or cluttered environments, sensory cortical mechanisms help combine auditory or visual features into perceived objects. Knowing that individuals vary greatly in their ability to suppress unwanted sensory information, and knowing that the sizes of auditory and visual cortical regions are correlated, we wondered whether there might be a corresponding relation between an individual’s ability to suppress auditory vs. visual interference. In auditory masking, background sound makes spoken words unrecognizable. When masking arises due to interference at central auditory processing stages, beyond the cochlea, it is called informational masking (IM). A strikingly similar phenomenon in vision, called visual crowding, occurs when nearby clutter makes a target object unrecognizable, despite being resolved at the retina. We here compare susceptibilities to auditory IM and visual crowding in the same participants. Surprisingly, across participants, we find a negative correlation (R = −0.7) between IM susceptibility and crowding susceptibility: Participants who have low susceptibility to IM tend to have high susceptibility to crowding, and vice versa. This reveals a mid-level trade-off between auditory and visual processing.
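The reported relation is a simple across-participant correlation between two susceptibility measures. Below is a minimal sketch of that comparison on simulated per-participant scores; all values and variable names are illustrative, not the study's data.

```python
# Illustrative sketch (not the authors' code): across-participant correlation
# between informational-masking (IM) susceptibility and crowding susceptibility.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical per-participant susceptibility scores (higher = more susceptible).
# In the study these would be derived from masking thresholds and crowding
# distances; here they are simulated just to show the analysis step.
n_participants = 20
crowding = rng.normal(0.0, 1.0, n_participants)
im = -0.7 * crowding + rng.normal(0.0, 0.7, n_participants)  # negative relation built in

r, p = pearsonr(im, crowding)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")  # a negative r mirrors the reported R = -0.7
```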


2021 ◽  
pp. 174702182199003
Author(s):  
Andy J Kim ◽  
David S Lee ◽  
Brian A Anderson

Previously reward-associated stimuli have consistently been shown to involuntarily capture attention in the visual domain. Although previously reward-associated but currently task-irrelevant sounds have also been shown to interfere with visual processing, it remains unclear whether such stimuli can interfere with the processing of task-relevant auditory information. To address this question, we modified a dichotic listening task to measure interference from task-irrelevant but previously reward-associated sounds. In a training phase, participants were simultaneously presented with a spoken letter and number in different auditory streams and learned to associate the correct identification of each of three letters with high, low, and no monetary reward, respectively. In a subsequent test phase, participants were again presented with the same auditory stimuli but were instead instructed to report the number while ignoring spoken letters. In both the training and test phases, response time measures demonstrated that attention was biased in favour of the auditory stimulus associated with high value. Our findings demonstrate that attention can be biased towards learned reward cues in the auditory domain, interfering with goal-directed auditory processing.


Author(s):  
Wessam Mostafa Essawy

Background: Amblyaudia is a weakness in the listener’s binaural processing of auditory information. Subjects with amblyaudia also demonstrate binaural integration deficits and may display similar patterns in their evoked responses in terms of latency and amplitude. The purpose of this study was to identify the presence of amblyaudia in a population of young children and to measure mismatch negativity (MMN), P300, and cortical auditory evoked potentials (CAEPs) in those individuals.
Methods: Subjects were divided into two groups: a control group of 20 normal-hearing subjects with normal developmental milestones and normal speech development, and a study group (GII) of 50 subjects with central auditory processing disorders (CAPDs) diagnosed by central auditory screening tests.
Results: Using dichotic tests, including the dichotic digits test (DDT) and the competing sentence test (CST), the cases were classified as normal, dichotic dysaudia, amblyaudia, and amblyaudia plus (40%, 14%, 38%, and 8%, respectively). Using event-related potentials, we found that P300 and MMN are more specific in detecting neurocognitive dysfunction related to the allocation of attentional resources and immediate memory in these cases.
Conclusions: Amblyaudia is present in cases of central auditory processing disorders (CAPDs), and event-related potentials provide an objective tool for diagnosis, prognosis, and follow-up after rehabilitation.
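The four-way classification above turns on how each ear performs on the dichotic tests and on how large the interaural asymmetry is. The sketch below is a hedged illustration of such a decision rule; the cutoff values are hypothetical placeholders, not the study's normative criteria.

```python
# Hedged sketch of the dichotic classification logic described in the abstract
# (normal / dichotic dysaudia / amblyaudia / amblyaudia plus). Cutoffs are
# hypothetical placeholders, not the study's norms.
def classify_dichotic(dominant_ear: float, nondominant_ear: float,
                      ear_norm: float = 70.0, asymmetry_norm: float = 10.0) -> str:
    """Classify a listener from dichotic test scores (percent correct per ear)."""
    asymmetry = dominant_ear - nondominant_ear
    low_dominant = dominant_ear < ear_norm
    large_asymmetry = asymmetry > asymmetry_norm

    if large_asymmetry and not low_dominant:
        return "amblyaudia"        # dominant ear normal, non-dominant ear lags behind
    if large_asymmetry and low_dominant:
        return "amblyaudia plus"   # both ears depressed and strongly asymmetric
    if low_dominant:
        return "dichotic dysaudia" # both ears depressed, roughly symmetric
    return "normal"

print(classify_dichotic(dominant_ear=90, nondominant_ear=60))  # -> amblyaudia
```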


2019 ◽  
Author(s):  
Samson Chota ◽  
Rufin VanRullen

It has long been debated whether visual processing is, at least partially, a discrete process. Although vision appears to be a continuous stream of sensory information, sophisticated experiments reveal periodic modulations of perception and behavior. Previous work has demonstrated that the phase of endogenous neural oscillations in the 10 Hz range predicts the “lag” of the flash lag effect, a temporal visual illusion in which a static object is perceived to be lagging in time behind a moving object. Consequently, it has been proposed that the flash lag illusion could be a manifestation of a periodic, discrete sampling mechanism in the visual system. In this experiment we set out to causally test this hypothesis by entraining the visual system to a periodic 10 Hz stimulus and probing the flash lag effect (FLE) at different time points during entrainment. We hypothesized that the perceived FLE would be modulated over time, at the same frequency as the entrainer (10 Hz). A frequency analysis of the average FLE time-course indeed reveals a significant peak at 10 Hz as well as a strong phase consistency between subjects (N=26). Our findings provide evidence for a causal relationship between alpha oscillations and fluctuations in temporal perception.
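The key analysis is a frequency decomposition of each subject's FLE time course followed by a test of phase alignment across subjects. Below is a minimal sketch of that kind of analysis on simulated data; the sampling rate, trial structure, and values are assumptions for illustration only.

```python
# Illustrative sketch (not the authors' pipeline): detecting a 10 Hz modulation
# in flash-lag-effect (FLE) time courses and measuring phase consistency across
# subjects. Data are simulated with a common-phase 10 Hz component.
import numpy as np

rng = np.random.default_rng(1)
fs = 100.0                       # Hz, hypothetical sampling of probe times
t = np.arange(0, 1.0, 1.0 / fs)  # 1 s of probe positions relative to entrainer onset
n_subjects = 26

# Simulated per-subject FLE time courses (rows = subjects).
fle = np.sin(2 * np.pi * 10 * t) + rng.normal(0, 1.0, (n_subjects, t.size))

# Frequency analysis of each subject's (demeaned) FLE time course.
spectrum = np.fft.rfft(fle - fle.mean(axis=1, keepdims=True), axis=1)
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
idx_10hz = np.argmin(np.abs(freqs - 10.0))

amp_10hz = np.abs(spectrum[:, idx_10hz]).mean()           # group amplitude at 10 Hz
phases = np.angle(spectrum[:, idx_10hz])
phase_consistency = np.abs(np.mean(np.exp(1j * phases)))  # inter-subject phase locking, 0..1

print(f"10 Hz amplitude: {amp_10hz:.2f}, phase consistency: {phase_consistency:.2f}")
```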


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Min Zhang ◽  
Rachel N Denison ◽  
Denis G Pelli ◽  
Thuy Tien C Le ◽  
Antje Ihlefeld

Sensory cortical mechanisms combine auditory or visual features into perceived objects. This is difficult in noisy or cluttered environments. Knowing that individuals vary greatly in their susceptibility to clutter, we wondered whether there might be a relation between an individual’s auditory and visual susceptibilities to clutter. In auditory masking, background sound makes spoken words unrecognizable. When masking arises due to interference at central auditory processing stages, beyond the cochlea, it is called informational masking. A strikingly similar phenomenon in vision, called visual crowding, occurs when nearby clutter makes a target object unrecognizable, despite being resolved at the retina. We here compare susceptibilities to auditory informational masking and visual crowding in the same participants. Surprisingly, across participants, we find a negative correlation (R = –0.7) between susceptibility to informational masking and crowding: Participants who have low susceptibility to auditory clutter tend to have high susceptibility to visual clutter, and vice versa. This reveals a tradeoff in the brain between auditory and visual processing.


Author(s):  
Jamileh Chupani ◽  
Mohanna Javanbakht ◽  
Yones Lotfi

Background and Aim: The majority of the world’s population is bilingual. Bilingualism is a form of sensory enrichment that translates to gains in cognitive abilities; these cognitive gains in attention and memory are known to modulate subcortical processing of auditory stimuli. Second language acquisition has a broad impact on various psychological, cognitive, memory, and linguistic processes. Central auditory processing (CAP) is the perceptual processing of auditory information. Because of its importance in bilingualism, this study aimed to review the CAP of bilinguals.
Recent Findings: CAP was studied in three areas: dichotic listening, temporal processing, and speech in noise perception. Regarding dichotic listening, studies have shown that bilinguals perform better than monolinguals on the staggered spondaic word (SSW) test, the consonant-vowel dichotic test, the dichotic digits test (DDT), and the disyllable dichotic test, although similar results between groups have also been reported for the SSW and DDT. Regarding temporal processing, the results of bilinguals do not differ from those of monolinguals, although in some cases they are better in bilinguals. Regarding speech in noise perception, the results for bilinguals and monolinguals vary depending on the amount of linguistic information available in the stimuli.
Conclusion: Bilingualism has a positive effect on dichotic processing, no effect on temporal processing, and a varied effect on speech in noise perception. Bilinguals perform worse with meaningful speech and better with meaningless speech.
Keywords: Central auditory processing; bilingual; dichotic listening; temporal processing; speech in noise perception


2021 ◽  
Vol 15 ◽  
Author(s):  
Ruxandra I. Tivadar ◽  
Robert T. Knight ◽  
Athina Tzovara

The human brain has the astonishing capacity to integrate streams of sensory information from the environment and to form predictions about future events in an automatic way. Although predictive coding was initially developed for visual processing, the bulk of subsequent research has focused on auditory processing, with the famous mismatch negativity signal as possibly the most studied signature of a surprise or prediction error (PE) signal. Auditory PEs are present during various states of consciousness. Intriguingly, their presence and characteristics have been linked with residual levels of consciousness and the return of awareness. In this review we first give an overview of the neural substrates of predictive processes in the auditory modality and their relation to consciousness. Then, we focus on different states of consciousness (wakefulness, sleep, anesthesia, coma, meditation, and hypnosis) and on what predictive processing has been able to disclose about brain functioning in such states. We review studies investigating how the neural signatures of auditory predictions are modulated by states of reduced or absent consciousness. As a future outlook, we propose combining electrophysiological and computational techniques to investigate which facets of sensory predictive processes are maintained when consciousness fades.


2019 ◽  
Author(s):  
Patrick J. Karas ◽  
John F. Magnotti ◽  
Brian A. Metzger ◽  
Lin L. Zhu ◽  
Kristen B. Smith ◽  
...  

Vision provides a perceptual head start for speech perception because most speech is “mouth-leading”: visual information from the talker’s mouth is available before auditory information from the voice. However, some speech is “voice-leading” (auditory before visual). Consistent with a model in which vision modulates subsequent auditory processing, there was a larger perceptual benefit of visual speech for mouth-leading vs. voice-leading words (28% vs. 4%). The neural substrates of this difference were examined by recording broadband high-frequency activity from electrodes implanted over auditory association cortex in the posterior superior temporal gyrus (pSTG) of epileptic patients. Responses were smaller for audiovisual vs. auditory-only mouth-leading words (34% difference) while there was little difference (5%) for voice-leading words. Evidence for cross-modal suppression of auditory cortex complements our previous work showing enhancement of visual cortex (Ozker et al., 2018b) and confirms that multisensory interactions are a powerful modulator of activity throughout the speech perception network.
Impact Statement: Human perception and brain responses differ between words in which mouth movements are visible before the voice is heard and words for which the reverse is true.
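The suppression effect is expressed as a percent difference between audiovisual and auditory-only high-frequency responses, computed separately for mouth-leading and voice-leading words. The sketch below shows that computation on simulated electrode values; the numbers are illustrative, not the recorded data.

```python
# Hedged sketch: percent difference in broadband high-frequency (high-gamma)
# response between audiovisual (AV) and auditory-only (A) trials, computed
# separately for mouth-leading and voice-leading words. Values are simulated.
import numpy as np

rng = np.random.default_rng(2)
n_electrodes = 12

# Simulated mean high-gamma amplitudes (percent signal change) per pSTG electrode.
a_mouth = rng.normal(100, 10, n_electrodes)   # auditory-only, mouth-leading
av_mouth = a_mouth * 0.66                     # ~34% suppression built in
a_voice = rng.normal(100, 10, n_electrodes)   # auditory-only, voice-leading
av_voice = a_voice * 0.95                     # ~5% difference built in

def percent_suppression(auditory_only, audiovisual):
    """Percent reduction of the AV response relative to the auditory-only response."""
    return 100 * (auditory_only - audiovisual).mean() / auditory_only.mean()

print(f"mouth-leading: {percent_suppression(a_mouth, av_mouth):.0f}% suppression")
print(f"voice-leading: {percent_suppression(a_voice, av_voice):.0f}% suppression")
```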


2012 ◽  
Vol 25 (0) ◽  
pp. 194
Author(s):  
Carolina Sánchez-García ◽  
Sonia Kandel ◽  
Christophe Savariaux ◽  
Nara Ikumi ◽  
Salvador Soto-Faraco

When both are present, visual and auditory information are combined to decode the speech signal. Past research has addressed the extent to which visual information helps distinguish confusable speech sounds, but has usually ignored the continuous nature of speech perception. Here we trace the time course of the contribution of visual and auditory information during speech perception. To this end, we designed an audio–visual gating task with videos recorded with a high-speed camera. Participants were asked to identify gradually longer fragments of pseudowords varying in the central consonant. Spanish consonant phonemes with different degrees of visual and acoustic saliency were included and tested in visual-only, auditory-only, and audio–visual trials. The data showed different patterns of contribution of unimodal and bimodal information during identification, depending on the visual saliency of the presented phonemes. In particular, for phonemes that are clearly more salient in one modality than the other, audio–visual performance equals that of the better unimodal condition. For phonemes with more balanced saliency, audio–visual performance was better than both unimodal conditions. These results shed new light on the time course of audio–visual speech integration.
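The two outcome patterns can be made concrete by comparing audiovisual accuracy with the better unimodal accuracy and with an independent-cues benchmark, one common way to formalize a multisensory benefit. The accuracies below are illustrative values, not the study's data.

```python
# Hedged sketch: comparing audiovisual (AV) identification accuracy with the two
# patterns described in the abstract, using made-up accuracies for two phonemes.
def independent_cues_prediction(p_auditory: float, p_visual: float) -> float:
    """Accuracy expected if auditory and visual cues fail independently."""
    return 1 - (1 - p_auditory) * (1 - p_visual)

phonemes = {
    # phoneme type: (auditory-only, visual-only, audiovisual) accuracy; illustrative
    "highly salient in one modality": (0.95, 0.40, 0.95),
    "balanced saliency":              (0.60, 0.55, 0.85),
}

for label, (p_a, p_v, p_av) in phonemes.items():
    best_unimodal = max(p_a, p_v)
    pred = independent_cues_prediction(p_a, p_v)
    print(f"{label}: AV={p_av:.2f}, best unimodal={best_unimodal:.2f}, "
          f"independent-cues prediction={pred:.2f}")
```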

