Auditory signal dominates visual in the perception of emotional social interactions

2012 ◽  
Vol 25 (0) ◽  
pp. 112 ◽  
Author(s):  
Lukasz Piwek ◽  
Karin Petrini ◽  
Frank E. Pollick

Multimodal perception of emotions has typically been examined using displays of a solitary character (e.g., the face–voice and/or body–sound of one actor). We extend investigation to more complex, dyadic point-light displays combined with speech. A motion and voice capture system was used to record twenty actors interacting in couples with happy, angry and neutral emotional expressions. The obtained stimuli were validated in a pilot study and used in the present study to investigate multimodal perception of emotional social interactions. Participants were required to categorize happy and angry expressions displayed visually, auditorily, or using emotionally congruent and incongruent bimodal displays. In a series of cross-validation experiments we found that sound dominated the visual signal in the perception of emotional social interaction. Although participants’ judgments were faster in the bimodal condition, the accuracy of judgments was similar for both bimodal and auditory-only conditions. When participants watched emotionally mismatched bimodal displays, they predominantly oriented their judgments towards the auditory rather than the visual signal. This auditory dominance persisted even when the reliability of the auditory signal was decreased with noise, although visual information had some effect on judgments of emotions when it was combined with a noisy auditory signal. Our results suggest that when judging emotions from an observed social interaction, we rely primarily on vocal cues from the conversation rather than visual cues from the actors’ body movements.
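One way to formalise this pattern is reliability-weighted cue combination, in which each modality contributes in proportion to the inverse of its variance; adding noise to the voice inflates the auditory variance and shifts weight towards vision. The sketch below is a minimal illustration with hypothetical estimates and variances, not the model or data from this study.

```python
def combine_cues(mu_a, var_a, mu_v, var_v):
    """Reliability-weighted (inverse-variance) combination of two cues."""
    w_a = (1 / var_a) / (1 / var_a + 1 / var_v)   # auditory weight
    w_v = 1 - w_a                                 # visual weight
    mu = w_a * mu_a + w_v * mu_v                  # combined estimate
    var = 1 / (1 / var_a + 1 / var_v)             # combined variance
    return mu, var, w_a, w_v

# Hypothetical 'anger intensity' estimates (arbitrary units): a clear voice is
# more reliable (lower variance) than point-light body motion alone...
print(combine_cues(mu_a=0.8, var_a=0.1, mu_v=0.3, var_v=0.4))
# ...but adding noise inflates the auditory variance and shifts weight to vision.
print(combine_cues(mu_a=0.8, var_a=0.5, mu_v=0.3, var_v=0.4))
```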

2007 ◽  
Vol 44 (5) ◽  
pp. 518-522 ◽  
Author(s):  
Shelley Von Berg ◽  
Douglas McColl ◽  
Tami Brancamp

Objective: This study investigated observers’ intelligibility for the spoken output of an individual with Moebius syndrome (MoS) with and without visual cues. Design: An audiovisual recording of the speaker’s output was obtained for 50 Speech Intelligibility in Noise sentences consisting of 25 high-predictability and 25 low-predictability sentences. Stimuli were presented to observers under two conditions: audiovisual and audio only. Data were analyzed using a multivariate repeated measures model. Observers: Twenty students and faculty affiliated with the Department of Speech Pathology and Audiology at the University of Nevada, Reno. Results: A mixed-design ANOVA revealed that intelligibility in the audio-only condition was significantly greater than intelligibility in the audiovisual condition, and that accuracy for high-predictability sentences was significantly greater than accuracy for low-predictability sentences. Conclusions: The compensatory substitutional placements for phonemes produced by MoS speakers may detract from the intelligibility of speech. This is similar to the McGurk-MacDonald effect, whereby an illusory auditory signal is perceived when visual information from lip movements does not match the auditory information from speech. It also suggests that observers use contextual cues, more than the acoustic signal alone, to arrive at accurate recognition of the message of speakers with MoS. Therefore, speakers with MoS should be counseled in the top-down approach of auditory closure: when the speech signal is degraded, predictable messages are more easily understood than unpredictable ones. It is also important to confirm the speaking partner’s understanding of the topic before proceeding.
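The analysis reported above crosses two within-subject factors (presentation modality and sentence predictability) over the same 20 observers. The sketch below runs such a two-way repeated-measures ANOVA on simulated scores; the effect directions follow the abstract, but the numbers and the `AnovaRM` setup are illustrative assumptions, not the authors' analysis.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Simulated intelligibility scores (percent correct) for 20 observers, with two
# within-subject factors: presentation modality x sentence predictability.
# Effect directions follow the abstract; all numbers are made up.
rng = np.random.default_rng(0)
rows = []
for subj in range(20):
    for modality in ('audio_only', 'audiovisual'):
        for predictability in ('high', 'low'):
            score = 62.0
            score += 8.0 if modality == 'audio_only' else 0.0    # audio-only advantage
            score += 6.0 if predictability == 'high' else 0.0    # predictability advantage
            rows.append({'subject': subj, 'modality': modality,
                         'predictability': predictability,
                         'intelligibility': score + rng.normal(0, 5)})
df = pd.DataFrame(rows)

# Two-way repeated-measures ANOVA (both factors within-subject).
res = AnovaRM(df, depvar='intelligibility', subject='subject',
              within=['modality', 'predictability']).fit()
print(res.anova_table)
```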


2021 ◽  
pp. 1-21
Author(s):  
Xinyue Wang ◽  
Clemens Wöllner ◽  
Zhuanghua Shi

Abstract Compared to vision, audition has been considered the dominant sensory modality for temporal processing. Nevertheless, recent research suggests the opposite: the apparent inferiority of visual information in tempo judgements might be due to the lack of ecological validity of experimental stimuli, and reliable visual movements may have the potential to alter the temporal location of perceived auditory inputs. To explore the role of audition and vision in overall time perception, audiovisual stimuli with various degrees of temporal congruence were developed in the current study. We investigated which sensory modality weighs more in holistic tempo judgements with conflicting audiovisual information, and whether biological motion (point-light displays of dancers) rather than auditory cues (rhythmic beats) dominates judgements of tempo. A bisection experiment found that participants relied more on visual tempo than on auditory tempo in overall tempo judgements. For fast tempi (150 to 180 BPM), participants judged ‘fast’ significantly more often with visual cues regardless of the auditory tempo, whereas for slow tempi (60 to 90 BPM), they did so significantly less often. Our results support the notion that visual stimuli with higher ecological validity have the potential to drive up or down the holistic perception of tempo.
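In a bisection task of this kind, the proportion of 'fast' responses is typically summarised with a psychometric function whose midpoint (the point of subjective equality) marks where judgements flip. The sketch below fits a logistic function to hypothetical response proportions; the values and fitting choices are assumptions for illustration, not the study's data or analysis.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(tempo, pse, slope):
    """Probability of a 'fast' judgement as a function of tempo (BPM)."""
    return 1.0 / (1.0 + np.exp(-(tempo - pse) / slope))

# Hypothetical proportions of 'fast' responses per visual tempo (BPM),
# pooled over auditory tempi -- illustrative values only.
tempi = np.array([60.0, 90.0, 120.0, 150.0, 180.0])
p_fast = np.array([0.08, 0.25, 0.55, 0.82, 0.95])

(pse, slope), _ = curve_fit(logistic, tempi, p_fast, p0=[120.0, 20.0])
print(f"Point of subjective equality: {pse:.1f} BPM, slope: {slope:.1f} BPM")
```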


2018 ◽  
Author(s):  
Katarzyna Jaworska ◽  
Fei Yi ◽  
Robin A.A. Ince ◽  
Nicola J. van Rijsbergen ◽  
Philippe G. Schyns ◽  
...  

Abstract Fast and accurate face processing is critical for everyday social interactions, but it declines and becomes delayed with age, as measured by both neural and behavioural responses. Here, we addressed the critical challenge of understanding how ageing changes neural information processing mechanisms to delay behaviour. Young (20-36 years) and older (60-86 years) adults performed a basic social interaction task (detecting a face vs. noise) while we recorded their electroencephalogram (EEG). In each participant, using a new information theoretic framework, we reconstructed the features supporting face detection behaviour, and also where, when and how EEG activity represents them. We found that occipital-temporal pathway activity dynamically represents the eyes of the face images for behaviour ∼170 ms post-stimulus, with a 40 ms delay in older adults that underlies their 200 ms behavioural deficit of slower reaction times. Our results therefore demonstrate how ageing can change the neural information processing mechanisms that underlie behavioural slowing.

Author summary: Older adults are consistently slower than young adults in a variety of behavioural perceptual tasks. So far, it has been unclear whether the underlying cause of the behavioural delay relates to attentional or perceptual differences in encoding visual information, to slower neural processing speed, or to other neural factors. Our study addresses these questions by showing that in a basic social interaction task (discriminating faces from noise), young and older adults encoded the same visual information (the eyes of the face) to perform the task. Moreover, early brain activity (within 200 ms following stimulus onset) encoded the same visual information (again, the eyes of the face) in both groups, but was delayed and weaker in older adults. These early delays in information encoding were directly related to the observed behavioural slowing in older adults, showing that differences in early perceptual brain processes can contribute to the slowing of the motor response.
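The information theoretic framework referred to above quantifies how strongly EEG activity represents a stimulus feature. As a simplified stand-in for that framework, the sketch below computes binned mutual information between a hypothetical eye-visibility variable and single-electrode EEG amplitudes; the noise levels mimicking young vs. older responses are assumptions for illustration only, not the authors' method.

```python
import numpy as np

def binned_mi(x, y, bins=8):
    """Mutual information (bits) between two variables, via histogram binning."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Hypothetical trials: eye-region visibility vs. EEG amplitude ~170 ms after
# stimulus onset. Noisier ("older-adult-like") responses yield lower MI,
# mimicking a weaker representation of the task-relevant feature.
rng = np.random.default_rng(1)
eye_visibility = rng.uniform(0, 1, 500)
eeg_young = eye_visibility + rng.normal(0, 0.3, 500)
eeg_older = eye_visibility + rng.normal(0, 0.8, 500)
print(binned_mi(eye_visibility, eeg_young), binned_mi(eye_visibility, eeg_older))
```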


2020 ◽  
Vol 82 (7) ◽  
pp. 3544-3557 ◽  
Author(s):  
Jemaine E. Stacey ◽  
Christina J. Howard ◽  
Suvobrata Mitra ◽  
Paula C. Stacey

Abstract Seeing a talker’s face can aid audiovisual (AV) integration when speech is presented in noise. However, few studies have simultaneously manipulated auditory and visual degradation. We aimed to establish how degrading the auditory and visual signal affected AV integration. Where people look on the face in this context is also of interest; Buchan, Paré and Munhall (Brain Research, 1242, 162–171, 2008) found fixations on the mouth increased in the presence of auditory noise whilst Wilson, Alsius, Paré and Munhall (Journal of Speech, Language, and Hearing Research, 59(4), 601–615, 2016) found mouth fixations decreased with decreasing visual resolution. In Condition 1, participants listened to clear speech, and in Condition 2, participants listened to vocoded speech designed to simulate the information provided by a cochlear implant. Speech was presented in three levels of auditory noise and three levels of visual blurring. Adding noise to the auditory signal increased McGurk responses, while blurring the visual signal decreased McGurk responses. Participants fixated the mouth more on trials when the McGurk effect was perceived. Adding auditory noise led to people fixating the mouth more, while visual degradation led to people fixating the mouth less. Combined, the results suggest that modality preference and where people look during AV integration of incongruent syllables varies according to the quality of information available.
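Presenting speech "in three levels of auditory noise" amounts to mixing the target and a noise signal at fixed signal-to-noise ratios. The sketch below shows one standard way to do that scaling; the sampling rate, stand-in signals and the −3 dB level are illustrative assumptions, not the study's exact stimuli.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the speech-plus-noise mixture has the requested SNR (dB)."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

# Hypothetical 1-second signals at 16 kHz; a pure tone stands in for a clean token.
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 220 * t)
noise = np.random.default_rng(2).normal(0.0, 1.0, fs)
noisy = mix_at_snr(speech, noise, snr_db=-3)   # e.g. one of several noise levels
```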


2018 ◽  
Author(s):  
Keri V. Langridge ◽  
Claudia Wilke ◽  
Olena Riabinina ◽  
Misha Vorobyev ◽  
Natalie Hempel de Ibarra

Summary Gaze direction is closely coupled with body movement in insects and other animals. If movement patterns interfere with the acquisition of visual information, insects can actively adjust them to seek relevant cues. Alternatively, where multiple visual cues are available, an insect’s movements may influence how it perceives a scene. We show that the way a foraging bumblebee approaches a floral pattern could determine what it learns about the pattern. When trained to vertical bicoloured patterns, bumblebees consistently approached from below centre in order to land in the centre of the target where the reward was located. In subsequent tests, the bees preferred the colour of the lower half of the pattern that they predominantly faced during the approach and landing sequence. A predicted change of learning outcomes occurred when the contrast line was moved up or down off-centre: learned preferences again reflected relative frontal exposure to each colour during the approach, independent of the overall ratio of colours. This mechanism may underpin learning strategies in both simple and complex visual discriminations, highlighting that morphology and action patterns determine how animals solve sensory learning tasks. The deterministic effect of movement on visual learning may have substantially influenced the evolution of floral signals, particularly where plants depend on fine-scaled movements of pollinators on flowers.


2019 ◽  
Vol 62 (10) ◽  
pp. 3860-3875 ◽  
Author(s):  
Kaylah Lalonde ◽  
Lynne A. Werner

Purpose This study assessed the extent to which 6- to 8.5-month-old infants and 18- to 30-year-old adults detect and discriminate auditory syllables in noise better in the presence of visual speech than in auditory-only conditions. In addition, we examined whether visual cues to the onset and offset of the auditory signal account for this benefit. Method Sixty infants and 24 adults were randomly assigned to speech detection or discrimination tasks and were tested using a modified observer-based psychoacoustic procedure. Each participant completed 1–3 conditions: auditory-only, with visual speech, and with a visual signal that only cued the onset and offset of the auditory syllable. Results Mixed linear modeling indicated that infants and adults benefited from visual speech on both tasks. Adults relied on the onset–offset cue for detection, but the same cue did not improve their discrimination. The onset–offset cue benefited infants for both detection and discrimination. Whereas the onset–offset cue improved detection similarly for infants and adults, the full visual speech signal benefited infants to a lesser extent than adults on the discrimination task. Conclusions These results suggest that infants' use of visual onset–offset cues is mature, but their ability to use more complex visual speech cues is still developing. Additional research is needed to explore differences in audiovisual enhancement (a) of speech discrimination across speech targets and (b) with increasingly complex tasks and stimuli.
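The mixed linear modelling mentioned above can be approximated with a random-intercept model over participants, with condition, age group and their interaction as fixed effects. The sketch below sets this up on simulated scores using `statsmodels`; the group sizes follow the abstract, but the scores, effect sizes and model specification are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated detection scores (proportion correct) for the two age groups across
# the three cue conditions described in the abstract; all numbers are made up.
rng = np.random.default_rng(3)
benefit = {'auditory_only': 0.00, 'onset_offset_cue': 0.05, 'visual_speech': 0.10}
rows = []
for group, n in (('infant', 60), ('adult', 24)):
    for subj in range(n):
        for condition, b in benefit.items():
            rows.append({'group': group,
                         'subject': f'{group}_{subj}',
                         'condition': condition,
                         'score': 0.70 + b + rng.normal(0, 0.05)})
df = pd.DataFrame(rows)

# Random-intercept mixed model: fixed effects of condition, group and their
# interaction; one random intercept per participant.
result = smf.mixedlm('score ~ condition * group', df, groups=df['subject']).fit()
print(result.summary())
```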


2021 ◽  
Vol 12 ◽  
Author(s):  
Keri V. Langridge ◽  
Claudia Wilke ◽  
Olena Riabinina ◽  
Misha Vorobyev ◽  
Natalie Hempel de Ibarra

Gaze direction is closely coupled with body movement in insects and other animals. If movement patterns interfere with the acquisition of visual information, insects can actively adjust them to seek relevant cues. Alternatively, where multiple visual cues are available, an insect’s movements may influence how it perceives a scene. We show that the way a foraging bumblebee approaches a floral pattern could determine what it learns about the pattern. When trained to vertical bicoloured patterns, bumblebees consistently approached from below centre in order to land in the centre of the target where the reward was located. In subsequent tests, the bees preferred the colour of the lower half of the pattern that they predominantly faced during the approach and landing sequence. A predicted change of learning outcomes occurred when the contrast line was moved up or down off-centre: learned preferences again reflected relative frontal exposure to each colour during the approach, independent of the overall ratio of colours. This mechanism may underpin learning strategies in both simple and complex visual discriminations, highlighting that morphology and action patterns determine how animals solve sensory learning tasks. The deterministic effect of movement on visual learning may have substantially influenced the evolution of floral signals, particularly where plants depend on fine-scaled movements of pollinators on flowers.


2021 ◽  
Vol 15 ◽  
Author(s):  
Thorben Hülsdünker ◽  
David Riedel ◽  
Hannes Käsbauer ◽  
Diemo Ruhnow ◽  
Andreas Mierau

Although vision is the dominating sensory system in sports, many situations require multisensory integration. Faster processing of auditory information in the brain may facilitate time-critical abilities such as reaction speed; however, previous research was limited by generic auditory and visual stimuli that did not consider audio-visual characteristics in ecologically valid environments. This study investigated reaction speed in response to sport-specific monosensory (visual and auditory) and multisensory (audio-visual) stimulation. Neurophysiological analyses identified the neural processes contributing to differences in reaction speed. Nineteen elite badminton players participated in this study. In a first recording phase, the sound profile and shuttle speed of smash and drop strokes were identified on a badminton court using high-speed video cameras and binaural recordings. The speed and sound characteristics were transferred into auditory and visual stimuli and presented in a lab-based experiment, where participants reacted in response to sport-specific monosensory or multisensory stimulation. Auditory signal presentation was delayed by 26 ms to account for realistic audio-visual signal interaction on the court. N1 and N2 event-related potentials, as indicators of auditory and visual information perception/processing, respectively, were identified using a 64-channel EEG. Despite the 26 ms delay, auditory reactions were significantly faster than visual reactions (236.6 ms vs. 287.7 ms, p < 0.001) but still slower when compared to multisensory stimulation (224.4 ms, p = 0.002). Across conditions, response times to smashes were faster than to drops (233.2 ms vs. 265.9 ms, p < 0.001). Faster reactions were paralleled by a lower latency and higher amplitude of the auditory N1 and visual N2 potentials. The results emphasize the potential of auditory information to accelerate the reaction time in sport-specific multisensory situations. This highlights auditory processes as a promising target for training interventions in racquet sports.
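A plausible reading of the 26 ms auditory delay (the abstract does not spell out the derivation) is that it matches the time sound needs to travel from the opponent's racquet to the receiving player. Under that assumption, with a player-to-player distance of roughly 9 m:

```python
# Back-of-envelope check (an assumption, not stated in the abstract): 26 ms is
# roughly the time sound needs to cross an assumed 9 m player-to-player distance.
SPEED_OF_SOUND = 343.0        # m/s in air at ~20 degrees C
distance_m = 9.0              # assumed opponent-to-receiver distance on court
delay_ms = distance_m / SPEED_OF_SOUND * 1000
print(f"{delay_ms:.1f} ms")   # ~26.2 ms
```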


Author(s):  
Alisha Lola Jones

Flaming?: The Peculiar Theopolitics of Fire and Desire in Black Male Gospel Performance examines the rituals and social interactions of African American men who use gospel music-making as a means of worshiping God and performing gendered identities. Prompted by the popular term “flaming” that is used to identify over-the-top or peculiar performance of identity, Flaming? argues that these men wield and interweave a variety of multivalent aural-visual cues, including vocal style, gesture, attire, and homiletics, to position themselves along a spectrum of gender identities. These multisensory enactments empower artists (i.e., “peculiar people”) to demonstrate modes of “competence” that affirm their fitness to minister through speech and song. Through a progression of transcongregational case studies, Flaming? observes the ways in which African American men traverse tightly knit social networks to negotiate their identities through and beyond the worship experience. Coded and “read” as either hypermasculine, queer, or sexually ambiguous, peculiar gospel performances are often a locus of nuanced protest, facilitating a critique of heteronormative theology while affording African American men opportunities for greater visibility and access to leadership. Same-sex relationships among men constitute an open secret that is carefully guarded by those who elect to remain silent in the face of traditional theology, but musically performed by those compelled to worship “in Spirit and in truth.” This book thus examines the performative mechanisms through which black men acquire an aura of sexual ambiguity, exhibit an ostensible absence of sexual preference, and thereby gain social and ritual prestige in gospel music circles.


2021 ◽  
Vol 15 ◽  
Author(s):  
Yi Yuan ◽  
Yasneli Lleo ◽  
Rebecca Daniel ◽  
Alexandra White ◽  
Yonghee Oh

Speech perception often takes place in noisy environments, where multiple auditory signals compete with one another. The addition of visual cues such as talkers’ faces or lip movements to an auditory signal can help improve the intelligibility of speech in those suboptimal listening environments. These improvements are referred to as audiovisual benefits. The current study aimed to delineate the signal-to-noise ratio (SNR) conditions under which visual presentations of the acoustic amplitude envelopes have their most significant impact on speech perception. Seventeen adults with normal hearing were recruited. Participants were presented with spoken sentences in babble noise either in auditory-only or auditory-visual conditions with various SNRs at −7, −5, −3, −1, and 1 dB. The visual stimulus applied in this study was a sphere that varied in size in sync with the amplitude envelope of the target speech signals. Participants were asked to transcribe the sentences they heard. Results showed that a significant improvement in accuracy in the auditory-visual condition versus the auditory-only condition was obtained at SNRs of −3 and −1 dB, but no improvement was observed at the other SNRs. These results show that dynamic temporal visual information can benefit speech perception in noise, and that the optimal facilitative effects of the visual amplitude envelope can be observed within an intermediate SNR range.
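The visual stimulus described above tracks the acoustic amplitude envelope of the target sentence. The sketch below extracts a low-pass-filtered Hilbert envelope and maps it onto a sphere radius; the stand-in signal, cutoff frequency and radius range are illustrative assumptions, not the study's implementation.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def amplitude_envelope(signal, fs, cutoff_hz=10):
    """Low-pass-filtered Hilbert envelope of a speech signal."""
    env = np.abs(hilbert(signal))
    b, a = butter(2, cutoff_hz / (fs / 2))
    return filtfilt(b, a, env)

# Stand-in for a spoken sentence: an amplitude-modulated tone at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 200 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
env = amplitude_envelope(speech, fs)

# Map the envelope onto a sphere radius between assumed minimum and maximum sizes.
r_min, r_max = 20.0, 100.0   # pixels, illustrative values
radius = r_min + (env - env.min()) / (env.max() - env.min()) * (r_max - r_min)
```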

