Visual Anticipatory Information Modulates Multisensory Interactions of Artificial Audiovisual Stimuli

2010, Vol. 22 (7), pp. 1583-1596
Author(s): Jean Vroomen, Jeroen J. Stekelenburg

The neural activity of speech sound processing (the N1 component of the auditory ERP) can be suppressed if a speech sound is accompanied by concordant lip movements. Here we demonstrate that this audiovisual interaction is neither speech specific nor linked to humanlike actions but can be observed with artificial stimuli if their timing is made predictable. In Experiment 1, a pure tone synchronized with a deformation of a rectangle induced a smaller auditory N1 than auditory-only presentations if the temporal occurrence of this audiovisual event was made predictable by two moving disks that touched the rectangle. Local autoregressive average source estimation indicated that this audiovisual interaction may be related to integrative processing in auditory areas. When the moving disks did not precede the audiovisual stimulus—making the onset unpredictable—there was no N1 reduction. In Experiment 2, the predictability of the leading visual signal was manipulated by introducing a temporal asynchrony between the audiovisual event and the collision of moving disks. Audiovisual events occurred either at the moment, before (too “early”), or after (too “late”) the disks collided on the rectangle. When asynchronies varied from trial to trial—rendering the moving disks unreliable temporal predictors of the audiovisual event—the N1 reduction was abolished. These results demonstrate that the N1 suppression is induced by visual information that both precedes and reliably predicts audiovisual onset, without a necessary link to human action-related neural mechanisms.

2007, Vol. 555, pp. 177-182
Author(s): Snezana Pašalić, P.B. Jovanić, B. Bugarski

Many strategies have been developed for evaluating emulsion stability, aimed at determining the life cycle of emulsions. Most of them are based on the rheological properties of emulsions; very few are based on direct observation. In this paper we present a method for evaluating emulsion stability by direct observation of the emulsion's optical properties, and we propose the fractal dimension as a quantitative measure of stability. The method is based on measuring the emulsion's transmittance properties, which depend directly on its stability at the moment of measurement. An oil-in-water emulsion was used as the test system; it is classified as a stable emulsion, and our aim was to find the moment when it starts to break. Transmittance was measured with a visual-information acquisition system based on a CCD camera and a fast PC equipped with capturing software. The acquired sets of visual information were analyzed with the OZARIA software package, and fractal dimensions were determined by the box-counting method using 100 boxes of different sizes. The experimental emulsions were measured 7, 14, 21 and 28 days after creation. A slight increase in fractal dimension was observed, indicating that the emulsions were still in the stable region: from the fractal point of view they remained regular, with no significant irregularities. These first experiments show that the methodology is sensitive enough to be used for emulsion stability evaluation.
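
The box-counting estimate referred to above can be illustrated with a short Python sketch. This is not the OZARIA implementation or the authors' acquisition pipeline; the synthetic thresholded image, the threshold, and the box sizes are illustrative stand-ins only.

```python
import numpy as np

def box_counting_dimension(binary_img, box_sizes):
    """Estimate the fractal (box-counting) dimension of a 2-D binary image.

    For each box size s, the image is tiled with s x s boxes and the number
    of boxes containing at least one foreground pixel is counted. The
    dimension is the negative slope of log(count) versus log(s).
    """
    counts = []
    for s in box_sizes:
        # Trim the image so it divides evenly into s x s boxes.
        h = (binary_img.shape[0] // s) * s
        w = (binary_img.shape[1] // s) * s
        trimmed = binary_img[:h, :w]
        # Reshape into blocks and test each block for any foreground pixel.
        blocks = trimmed.reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())
    # Linear fit in log-log space: log N(s) ~ -D * log s + c.
    slope, _ = np.polyfit(np.log(box_sizes), np.log(counts), 1)
    return -slope

# Usage with a stand-in for a thresholded CCD transmittance frame
# (image size, threshold and box sizes are illustrative).
rng = np.random.default_rng(0)
image = rng.random((512, 512)) > 0.6
sizes = np.array([2, 4, 8, 16, 32, 64])
print(f"Estimated fractal dimension: {box_counting_dimension(image, sizes):.3f}")
```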


2008, Vol. 35 (4), pp. 809-822
Author(s): Sabine van Linden, Jean Vroomen

In order to examine whether children adjust their phonetic speech categories, children of two age groups, five-year-olds and eight-year-olds, were exposed to a video of a face saying /aba/ or /ada/ accompanied by an ambiguous auditory speech sound halfway between /b/ and /d/. The effect of exposure to these audiovisual stimuli was measured on subsequently delivered auditory-only speech identification trials. Results were compared to a control condition in which the audiovisual exposure stimuli contained non-ambiguous and congruent sounds /aba/ or /ada/. The older children learned to categorize the initially ambiguous speech sound in accord with the previously seen lip-read information (i.e. recalibration), but this was not the case for the younger age group. Moreover, all children displayed a tendency to report the stimulus that they were exposed to during the exposure phase. Methodological improvements for correcting such a response bias are discussed.


Author(s): Welber Marinovic, Annaliese M. Plooy, James R. Tresilian

When intercepting a moving target, accurate timing depends, in part, upon starting to move at the right moment. It is generally believed that this is achieved by triggering motor command generation when a visually perceived quantity such as the target’s time-to-arrival reaches a specific criterion value. An experimental method that could be used to determine the moment when this visual event happens was introduced by Whiting and coworkers in the 1970s, and it involves occluding the vision of the target at different times prior to the time of movement onset (MO). This method is limited because the experimenter has no control over MO time. We suggest a method which provides the needed control by having people make interceptive movements of a specific duration. We tested the efficacy of this method in two experiments in which the accuracy of interception was examined under different occlusion conditions. In the first experiment, we examined the effect of changing the timing of an occlusion period (OP) of fixed duration (200 ms). In the second experiment, we varied the duration of the OP (180–430 ms) as well as its timing. The results demonstrated the utility of the proposed method and showed that performance deteriorated only when the participants had their vision occluded from 200 ms prior to MO. The results of Experiment 2 were able to narrow down the critical interval to trigger the interceptive action to within the period from 200 to 150 ms prior to MO, probably closer to 150 ms. In addition, the results showed that the execution of brief interceptive movements (180 ms) was not affected by the range of OPs used in the experiments. This indicates that the whole movement was prepared in advance and triggered by a visual stimulus event that occurred at about 150 ms before onset.


2012, Vol. 367 (1591), pp. 965-976
Author(s): Anahita Basirat, Jean-Luc Schwartz, Marc Sato

The verbal transformation effect (VTE) refers to perceptual switches while listening to a speech sound repeated rapidly and continuously. It is a specific case of perceptual multistability providing a rich paradigm for studying the processes underlying the perceptual organization of speech. While the VTE has been mainly considered as a purely auditory effect, this paper presents a review of recent behavioural and neuroimaging studies investigating the role of perceptuo-motor interactions in the effect. Behavioural data show that articulatory constraints and visual information from the speaker's articulatory gestures can influence verbal transformations. In line with these data, functional magnetic resonance imaging and intracranial electroencephalography studies demonstrate that articulatory-based representations play a key role in the emergence and the stabilization of speech percepts during a verbal transformation task. Overall, these results suggest that perceptuo (multisensory)-motor processes are involved in the perceptual organization of speech and the formation of speech perceptual objects.


2020
Author(s): José Moya-Díaz, Ben James, Leon Lagnado

Multivesicular release (MVR) allows retinal bipolar cells to transmit visual signals as changes in both the rate and amplitude of synaptic events. How do neuromodulators regulate this vesicle code? By imaging larval zebrafish, we find that the variability of calcium influx is a major source of synaptic noise. Dopamine increases synaptic gain up to 15-fold, while Substance P reduces it 7-fold, both by acting on the presynaptic calcium transient to alter the distribution of amplitudes of multivesicular events. An increase in gain is accompanied by a decrease in the temporal precision of transmission and a reduction in the efficiency with which vesicles transfer visual information. The decrease in gain caused by Substance P was also associated with a shift in temporal filtering from band-pass to low-pass. This study demonstrates how neuromodulators act on the synaptic transformation of the visual signal to alter the way information is coded with vesicles.


2021, pp. 1-21
Author(s): Xinyue Wang, Clemens Wöllner, Zhuanghua Shi

Compared to vision, audition has been considered the dominant sensory modality for temporal processing. Recent research, however, suggests the opposite: the apparent inferiority of visual information in tempo judgements might be due to the low ecological validity of the experimental stimuli, and reliable visual movements may be able to shift the perceived timing of auditory inputs. To explore the roles of audition and vision in overall time perception, the current study developed audiovisual stimuli with various degrees of temporal congruence. We investigated which sensory modality weighs more in holistic tempo judgements when audiovisual information conflicts, and whether biological motion (point-light displays of dancers) rather than auditory cues (rhythmic beats) dominates judgements of tempo. A bisection experiment found that participants relied more on visual than on auditory tempo in overall tempo judgements. For fast tempi (150 to 180 BPM), participants judged ‘fast’ significantly more often with visual cues regardless of the auditory tempo, whereas for slow tempi (60 to 90 BPM), they did so significantly less often. Our results support the notion that visual stimuli with higher ecological validity can drive the holistic perception of tempo up or down.
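
Bisection data of this kind are commonly summarized by fitting a psychometric function to the proportion of ‘fast’ responses and reading off the bisection point (point of subjective equality). The minimal Python sketch below illustrates such a fit; the tempi and response proportions are made up for illustration and are not the study's data.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(tempo, pse, slope):
    """Psychometric function: probability of a 'fast' response as a function of tempo."""
    return 1.0 / (1.0 + np.exp(-(tempo - pse) / slope))

# Illustrative data: proportion of 'fast' judgements at each tempo (BPM).
tempi = np.array([60, 75, 90, 105, 120, 135, 150, 165, 180])
p_fast = np.array([0.05, 0.10, 0.20, 0.35, 0.55, 0.70, 0.85, 0.92, 0.97])

# Fit the point of subjective equality (PSE) and slope; p0 gives rough starting values.
(pse, slope), _ = curve_fit(logistic, tempi, p_fast, p0=[120, 15])
print(f"Bisection point (PSE): {pse:.1f} BPM, slope: {slope:.1f}")
```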


2021
Author(s): Andrei Amatuni, Sara Schroer, Yayun Zhang, Ryan Ernest Peters, Alimoor Reza, ...

Infants learn the meaning of words from accumulated experiences of real-time interactions with their caregivers. To study the effects of visual sensory input on word learning, we recorded infants' view of the world using head-mounted eye trackers during free-flowing play with a caregiver. While playing, infants were exposed to novel label-object mappings, and learning outcomes for these items were tested after the play session. In this study we use a classification-based approach to link properties of infants' visual scenes during naturalistic labeling moments to their word learning outcomes. We find that a model which integrates both highly informative and ambiguous sensory evidence fits infants' individual learning outcomes better than models using either type of evidence alone, and that raw labeling frequency cannot account for the word learning differences we observe. Here we demonstrate how a computational model, using only raw pixels from the egocentric scene image, can yield insights into human language learning.


2021, Vol. 2021, pp. 1-10
Author(s): Yanna Ren, Yawei Hou, Jiayu Huang, Fanghong Li, Tao Wang, ...

The modulation of attentional load on the perception of auditory and visual information has been widely reported; however, whether attentional load alters audiovisual integration (AVI) has seldom been investigated. Here, to explore the effect of sustained auditory attentional load on AVI and the effects of aging, 19 older and 20 younger adults performed an AV discrimination task with a rapid serial auditory presentation task competing for attentional resources. The results showed that responses to audiovisual stimuli were significantly faster than those to auditory and visual stimuli (AV > V ≥ A, all p < 0.001), and the younger adults were significantly faster than the older adults under all attentional load conditions (all p < 0.001). The race model analysis showed that AVI was decreased and delayed with the addition of sustained auditory attention (no_load > load_1 > load_2 > load_3 > load_4) for both older and younger adults. In addition, AVI was lower and more delayed in older adults than in younger adults in all attentional load conditions. These results suggest that sustained auditory attentional load decreased AVI and that AVI was reduced in older adults.
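
Race-model analyses of this kind typically rest on Miller's race-model inequality, which bounds the audiovisual reaction-time CDF by the sum of the two unisensory CDFs; exceeding the bound is taken as evidence of integration beyond statistical facilitation. The Python sketch below shows one way to compute that comparison; the reaction-time samples and time grid are illustrative, not the study's data.

```python
import numpy as np

def ecdf(rts, t):
    """Empirical cumulative probability P(RT <= t) for an array of reaction times."""
    rts = np.sort(np.asarray(rts))
    return np.searchsorted(rts, t, side="right") / rts.size

def race_model_violation(rt_a, rt_v, rt_av, times):
    """Difference between the audiovisual CDF and Miller's race-model bound.

    The race model predicts P(RT_AV <= t) <= P(RT_A <= t) + P(RT_V <= t);
    positive values of the returned difference indicate violations of the bound.
    """
    cdf_av = np.array([ecdf(rt_av, t) for t in times])
    bound = np.array([min(1.0, ecdf(rt_a, t) + ecdf(rt_v, t)) for t in times])
    return cdf_av - bound

# Illustrative reaction times in milliseconds (not the study's data).
rng = np.random.default_rng(1)
rt_a = rng.normal(420, 60, 200)   # auditory-only trials
rt_v = rng.normal(440, 60, 200)   # visual-only trials
rt_av = rng.normal(370, 50, 200)  # audiovisual trials
times = np.linspace(250, 600, 15)
violation = race_model_violation(rt_a, rt_v, rt_av, times)
print(f"Max race-model violation: {violation.max():.3f}")
```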


2016, Vol. 28 (1), pp. 1-7
Author(s): Claudia S. Lüttke, Matthias Ekman, Marcel A. J. van Gerven, Floris P. de Lange

Auditory speech perception can be altered by concurrent visual information. The superior temporal cortex is an important combining site for this integration process and was previously found to be sensitive to audiovisual congruency. However, the direction of this congruency effect (i.e., stronger or weaker activity for congruent compared to incongruent stimulation) has been more equivocal. Here, we used fMRI to examine the neural responses of human participants during the McGurk illusion (in which auditory /aba/ and visual /aga/ inputs are fused into a perceived /ada/) in a large homogeneous sample of participants who consistently experienced this illusion. This enabled us to compare neuronal responses during congruent audiovisual stimulation with those during incongruent audiovisual stimulation leading to the McGurk illusion, while avoiding the possible confound of sensory surprise that can occur when McGurk stimuli are only occasionally perceived. We found larger activity for congruent audiovisual stimuli than for incongruent (McGurk) stimuli in bilateral superior temporal cortex, extending into primary auditory cortex. This finding suggests that the superior temporal cortex responds preferentially when auditory and visual input support the same representation.

