Speech and non-speech measures of audiovisual integration are not correlated

2021 · Author(s): Jonathan Wilbiks, Julia Feld Strand, Violet Aurora Brown

Many natural events generate both visual and auditory signals, and humans are remarkably adept at integrating information from those sources. However, individuals appear to differ markedly in their ability or propensity to combine what they hear with what they see. Individual differences in audiovisual integration have been established using a range of materials, including speech stimuli (seeing and hearing a talker) and simpler audiovisual stimuli (seeing flashes of light combined with tones). Although there are multiple tasks in the literature that are referred to as "measures of audiovisual integration," the tasks themselves differ widely with respect to both the type of stimuli used (speech versus non-speech) and the nature of the tasks (e.g., some tasks use conflicting auditory and visual stimuli whereas others use congruent stimuli). It is not clear whether these varied tasks are actually measuring the same underlying construct: audiovisual integration. This study tested the convergent validity of four commonly used measures of audiovisual integration, two of which use speech stimuli (susceptibility to the McGurk effect and a measure of audiovisual benefit), and two of which use non-speech stimuli (the sound-induced flash illusion and audiovisual integration capacity). We replicated previous work showing large individual differences in each measure, but found no significant correlations between any of the measures. These results suggest that tasks that are commonly referred to as measures of audiovisual integration may not be tapping into the same underlying construct.

2020 · Vol 82 (7) · pp. 3490-3506 · Author(s): Jonathan Tong, Lux Li, Patrick Bruns, Brigitte Röder

According to the Bayesian framework of multisensory integration, audiovisual stimuli associated with a stronger prior belief that they share a common cause (i.e., causal prior) are predicted to result in a greater degree of perceptual binding and therefore greater audiovisual integration. In the present psychophysical study, we systematically manipulated the causal prior while keeping sensory evidence constant. We paired auditory and visual stimuli during an association phase to be spatiotemporally either congruent or incongruent, with the goal of driving the causal prior in opposite directions for different audiovisual pairs. Following this association phase, every pairwise combination of the auditory and visual stimuli was tested in a typical ventriloquism-effect (VE) paradigm. The size of the VE (i.e., the shift of auditory localization towards the spatially discrepant visual stimulus) indicated the degree of multisensory integration. Results showed that exposure to an audiovisual pairing as spatiotemporally congruent compared to incongruent resulted in a larger subsequent VE (Experiment 1). This effect was further confirmed in a second VE paradigm, where the congruent and the incongruent visual stimuli flanked the auditory stimulus, and a VE in the direction of the congruent visual stimulus was shown (Experiment 2). Since the unisensory reliabilities for the auditory or visual components did not change after the association phase, the observed effects are likely due to changes in multisensory binding by association learning. As suggested by Bayesian theories of multisensory processing, our findings support the existence of crossmodal causal priors that are flexibly shaped by experience in a changing world.
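The prediction that a stronger causal prior yields a larger ventriloquism effect can be illustrated with a minimal numerical model in the style of Bayesian causal inference (Körding et al., 2007). All parameter values below are illustrative assumptions, not the paper's fitted values.

```python
import math

def ventriloquism_shift(x_a, x_v, sigma_a, sigma_v, sigma_p, p_common):
    """Predicted shift of auditory localization toward a visual stimulus,
    given noisy cues x_a, x_v, a zero-mean spatial prior (s.d. sigma_p),
    and a causal prior p_common = p(C=1)."""
    var_a, var_v, var_p = sigma_a**2, sigma_v**2, sigma_p**2
    # Likelihood of the cue pair under a common cause (C=1)
    var1 = var_a * var_v + var_a * var_p + var_v * var_p
    like1 = math.exp(-((x_a - x_v)**2 * var_p + x_a**2 * var_v
                       + x_v**2 * var_a) / (2 * var1)) / (2 * math.pi * math.sqrt(var1))
    # Likelihood under independent causes (C=2)
    like2 = (math.exp(-x_a**2 / (2 * (var_a + var_p))) / math.sqrt(2 * math.pi * (var_a + var_p))
             * math.exp(-x_v**2 / (2 * (var_v + var_p))) / math.sqrt(2 * math.pi * (var_v + var_p)))
    post_common = like1 * p_common / (like1 * p_common + like2 * (1 - p_common))
    # Reliability-weighted fused estimate vs. auditory-alone estimate
    s_fused = (x_a / var_a + x_v / var_v) / (1 / var_a + 1 / var_v + 1 / var_p)
    s_alone = (x_a / var_a) / (1 / var_a + 1 / var_p)
    s_hat = post_common * s_fused + (1 - post_common) * s_alone  # model averaging
    return s_hat - s_alone  # shift toward the visual stimulus = VE

# Same sensory evidence, different causal priors (as after incongruent vs.
# congruent association phases): the stronger prior predicts a larger VE.
weak = ventriloquism_shift(x_a=0.0, x_v=10.0, sigma_a=5.0, sigma_v=1.0,
                           sigma_p=20.0, p_common=0.2)
strong = ventriloquism_shift(x_a=0.0, x_v=10.0, sigma_a=5.0, sigma_v=1.0,
                             sigma_p=20.0, p_common=0.8)
assert 0 < weak < strong
```

Because the unisensory reliabilities (sigma_a, sigma_v) are held constant, the change in predicted shift comes entirely from p_common, mirroring the study's manipulation.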


2021 · Author(s): Iliza M Butera, Ryan A Stevenson, René H Gifford, Mark T Wallace

The reduction in spectral resolution imposed by cochlear implants often requires complementary visual speech cues to aid understanding. Despite substantial clinical characterization of auditory-only speech outcome measures, relatively little is known about the audiovisual integrative abilities that most cochlear implant (CI) users rely on for daily speech comprehension. In this study, we tested audiovisual integration in 63 CI users and 69 normal-hearing (NH) controls using the McGurk and sound-induced flash illusions. This study is the largest to date to measure the McGurk effect in this population and the first to test the sound-induced flash illusion. When presented with conflicting audiovisual speech stimuli (i.e., the phoneme "ba" dubbed onto the viseme "ga"), 55 CI users (87%) reported a fused percept of "da" or "tha" on at least one trial. Overall, however, CI users experienced the McGurk effect less often than controls, a result concordant with the sound-induced flash illusion, where pairing a single circle flashing on the screen with multiple beeps produced fewer illusory flashes for CI users. While illusion perception in these two tasks appears to be uncorrelated among CI users, we identified a negative correlation in the NH group. Because neither illusion appears to further explain variability in CI outcome measures, more research is needed to determine how these findings relate to CI users' speech understanding, particularly in ecological listening conditions that are naturally multisensory.


2014 · Vol 57 (6) · pp. 2322-2331 · Author(s): Julia Strand, Allison Cooperman, Jonathon Rowe, Andrea Simenstad

Purpose: Prior studies (e.g., Nath & Beauchamp, 2012) report large individual variability in the extent to which participants are susceptible to the McGurk effect, a prominent audiovisual (AV) speech illusion. The current study evaluated whether susceptibility to the McGurk effect (MGS) is related to lipreading skill and whether multiple measures of MGS that have been used previously are correlated. In addition, it evaluated the test–retest reliability of individual differences in MGS.
Method: Seventy-three college-age participants completed 2 tasks measuring MGS and 3 measures of lipreading skill. Fifty-eight participants returned for a 2nd session (approximately 2 months later) in which MGS was tested again.
Results: The current study demonstrated that MGS shows high test–retest reliability and is correlated with some measures of lipreading skill. In addition, susceptibility measures derived from identification tasks were moderately related to the ability to detect instances of AV incongruity.
Conclusions: Although MGS is often cited as a demonstration of AV integration, the results suggest that perceiving the illusion depends in part on individual differences in lipreading skill and detecting AV incongruity. Therefore, individual differences in susceptibility to the illusion are not solely attributable to individual differences in AV integration ability.


2019 · Author(s): Paola Perone, David Vaughn Becker, Joshua M. Tybur

Multiple studies report that disgust-eliciting stimuli are perceived as salient and subsequently capture selective attention. In the current study, we aimed to better understand the nature of temporal attentional biases toward disgust-eliciting stimuli and to investigate the extent to which these biases are sensitive to contextual and trait-level pathogen avoidance motives. Participants (N=116) performed an Emotional Attentional Blink (EAB) task in which task-irrelevant disgust-eliciting, fear-eliciting, or neutral images preceded a target by 200, 500, or 800 milliseconds (i.e., lags two, five, and eight, respectively). They did so twice - once while not exposed to an odor, and once while exposed to either an odor that elicited disgust or an odor that did not - and completed a measure of disgust sensitivity. Results indicate that disgust-eliciting visual stimuli produced a greater attentional blink than neutral visual stimuli at lag two, and a greater attentional blink than fear-eliciting visual stimuli at both lag two and lag five. Neither the odor manipulations nor the individual differences measure moderated this effect. We propose that visual attention is engaged for a longer period of time following disgust-eliciting stimuli because covert processes automatically initiate the evaluation of pathogen threats. The fact that state and trait pathogen avoidance did not influence this temporal attentional bias suggests that early attentional processing of pathogen cues is initiated independently of the context in which such cues are perceived.


Perception · 10.1068/p5304 · 2005 · Vol 34 (11) · pp. 1315-1324 · Author(s): J Farley Norman, Charles E Crabtree, Anna Marie Clayton, Hideko F Norman

The ability of observers to perceive distances and spatial relationships in outdoor environments was investigated in two experiments. In experiment 1, the observers adjusted triangular configurations to appear equilateral, while in experiment 2, they adjusted the depth of triangles to match their base width. The results of both experiments revealed that there are large individual differences in how observers perceive distances in outdoor settings. The observers' judgments were greatly affected by the particular task they were asked to perform. The observers who had shown no evidence of perceptual distortions in experiment 1 (with binocular vision) demonstrated large perceptual distortions in experiment 2 when the task was changed to match distances in depth to frontal distances perpendicular to the observers' line of sight. Considered as a whole, the results indicate that there is no single relationship between physical and perceived space that is consistent with observers' judgments of distances in ordinary outdoor contexts.


2005 · Vol 101 (2) · pp. 487-497 · Author(s): Yoshinori Nagasawa, Shinichi Demura

The present purposes were to examine the characteristics of controlled force exertion in 28 developmentally delayed young people (14 men, 14 women) and to compare sex differences against 28 normal young students (14 men, 14 women). The subjects matched their submaximal grip strength to changing demand values displayed in a bar chart on the display of a personal computer. The total sum of the differences between the demand value and the grip exertion value over 25 sec. was used as the evaluation parameter for the test. Controlled force exertion was significantly poorer for the developmentally delayed group than for controls, and there were large individual differences. The developmentally delayed men scored more poorly than the women in coordination. As with the controls, means did not decrease significantly between trials, indicating that for these developmentally delayed subjects performance did not improve after only a few trials. The controlled force-exertion test is useful as a voluntary movement-function test for developmentally delayed subjects.


2021 · Vol 11 (1) · Author(s): Bruno Laeng, Sarjo Kuyateh, Tejaswinee Kelkar

Cross-modal integration is ubiquitous within perception and, in humans, the McGurk effect demonstrates that seeing a person articulating speech can change what we hear into a new auditory percept. It remains unclear whether cross-modal integration of sight and sound generalizes to other visible vocal articulations, like those made by singers. We surmise that perceptual integrative effects should involve music deeply, since there is ample indeterminacy and variability in its auditory signals. We show that switching videos of sung musical intervals systematically changes the estimated distance between the two notes of a musical interval: pairing the video of a smaller sung interval with a relatively larger auditory interval led to compression effects on rated intervals, whereas the reverse led to a stretching effect. In addition, after seeing a visually switched video of an equally tempered sung interval and then hearing the same interval played on the piano, the two intervals were often judged to be different even though they differed only in instrument. These findings reveal spontaneous cross-modal integration of vocal sounds and clearly indicate that strong integration of sound and sight can occur beyond the articulations of natural speech.


1983 · Vol 35 (2) · pp. 411-421 · Author(s): J. I. Laszlo, P. J. Bairstow

This paper reviews studies which demonstrate the importance of kinaesthesis in the acquisition and performance of motor skills. A recently developed method of measuring kinaesthetic sensitivity in children and adults is briefly described. Developmental trends in kinaesthetic perception are discussed, and large individual differences were found within age groups. It was shown that kinaesthetically undeveloped children can be trained to perceive and memorize kinaesthetic information with greatly improved accuracy. Furthermore, perceptual training facilitates the performance of a drawing skill. On the basis of these results, an argument is made for the importance of kinaesthesis in skilled motor behaviour.


2020 · Vol 10 (1) · Author(s): Raphaël Thézé, Mehdi Ali Gadiri, Louis Albert, Antoine Provost, Anne-Lise Giraud, ...

Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has been widely applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized to computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e., /v/) with a bilabial occlusive phoneme (i.e., /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results conclusively demonstrate that computer-generated speech stimuli are a judicious choice, and that they can supplement natural speech with higher control over stimulus timing and content.

