Eye Gaze and Perceptual Adaptation to Audiovisual Degraded Speech

2020 ◽  
Author(s):  
Briony Banks ◽  
Emma Gowen ◽  
Kevin Munro ◽  
Patti Adank

Visual cues from a speaker’s face may improve perceptual adaptation to degraded speech over time, but current evidence is limited. We aimed to replicate results from previous studies and extend them to more demanding speech stimuli (sentences), to better represent real-life, challenging speech comprehension. In addition, we investigated whether particular eye gaze patterns towards the speaker’s mouth were related to adaptation, hypothesising that listeners who looked more at the speaker’s mouth would show greater adaptation. A group of listeners were presented with noise-vocoded sentences in audiovisual format while a control group were presented with the audio signal only, presented congruently with a still image of the speaker’s face. Results of previous adaptation studies were partially replicated: the audiovisual group had better recognition throughout and adapted slightly more rapidly, but both groups showed an equal amount of improvement overall (after exposure to 90 sentences). Longer fixations on the speaker’s mouth in the audiovisual group were related to better overall accuracy, although evidence for this relationship was relatively weak. An exploratory analysis further showed that the duration of fixations to the speaker’s mouth decreased over time. The results suggest that the benefits from visual cues to adaptation to unfamiliar speech vary more than previously thought. Longer fixations on a speaker’s mouth may play a role in successfully decoding these cues, but more evidence is needed to fully establish how patterns of eye gaze are related to audiovisual speech recognition.

Author(s):  
Briony Banks ◽  
Emma Gowen ◽  
Kevin J. Munro ◽  
Patti Adank

Purpose: Visual cues from a speaker's face may benefit perceptual adaptation to degraded speech, but current evidence is limited. We aimed to replicate results from previous studies to establish the extent to which visual speech cues can lead to greater adaptation over time, extending existing results to a real-time adaptation paradigm (i.e., without a separate training period). A second aim was to investigate whether eye gaze patterns toward the speaker's mouth were related to better perception, hypothesizing that listeners who looked more at the speaker's mouth would show greater adaptation. Method: A group of listeners (n = 30) was presented with 90 noise-vocoded sentences in audiovisual format, whereas a control group (n = 29) was presented with the audio signal only. Recognition accuracy was measured throughout and eye tracking was used to measure fixations toward the speaker's eyes and mouth in the audiovisual group. Results: Previous studies were partially replicated: The audiovisual group had better recognition throughout and adapted slightly more rapidly, but both groups showed an equal amount of improvement overall. Longer fixations on the speaker's mouth in the audiovisual group were related to better overall accuracy. An exploratory analysis further demonstrated that the duration of fixations to the speaker's mouth decreased over time. Conclusions: The results suggest that visual cues may not benefit adaptation to degraded speech as much as previously thought. Longer fixations on a speaker's mouth may play a role in successfully decoding visual speech cues; however, this will need to be confirmed in future research to fully understand how patterns of eye gaze are related to audiovisual speech recognition. All materials, data, and code are available at https://osf.io/2wqkf/.


2021 ◽  
Author(s):  
Rachel Newey ◽  
Kami Koldewyn ◽  
Richard Ramsey

A variety of subtle social cues, including gaze behaviour, are used to form impressions of others. For example, if another’s eye-gaze reliably helps or hinders us while we complete a task, we incidentally form a positive or negative impression about them. In real life, people are rarely so consistent in their behaviour, and they are often encountered in dynamic group contexts. To date, however, it is not known how incidental impressions are affected either by changes in a target individual’s behaviour over time or by the group’s behaviour. To better understand how impressions are formed when subtle social behaviours change valence over time, we manipulated helping behaviour both at the level of the individual (Experiments 1-3) and the wider group (Experiments 4 & 5). Contrary to the idea that first impressions are hard to change, we found no evidence that impressions were driven by initial behaviour (primacy effects). Rather, people tended to form impressions based on the most recent behaviour, with some influence from the overall, average behaviour. In addition, we found that individuals’ behaviours appear to be viewed more or less favourably, depending on the behaviour of the wider group. Overall, we demonstrate that impression formation based on subtle social cues is not dominated by a single process, but instead reflects a complex product of cognitive mechanisms that integrate average valence over time, the direction of behaviour changes, the recency of observed behaviour, and the group context in which the behaviour is observed.


2013 ◽  
Vol 25 (8) ◽  
pp. 1383-1395 ◽  
Author(s):  
Antje Strauß ◽  
Sonja A. Kotz ◽  
Jonas Obleser

Under adverse listening conditions, speech comprehension profits from the expectancies that listeners derive from the semantic context. However, the neurocognitive mechanisms of this semantic benefit are unclear: How are expectancies formed from context and adjusted as a sentence unfolds over time under various degrees of acoustic degradation? In an EEG study, we modified auditory signal degradation by applying noise-vocoding (severely degraded: four-band, moderately degraded: eight-band, and clear speech). Orthogonal to that, we manipulated the extent of expectancy: strong or weak semantic context (±con) and context-based typicality of the sentence-last word (high or low: ±typ). This allowed calculation of two distinct effects of expectancy on the N400 component of the evoked potential. The sentence-final N400 effect was taken as an index of the neural effort of automatic word-into-context integration; it varied in peak amplitude and latency with signal degradation and was not reliably observed in response to severely degraded speech. Under clear speech conditions in a strong context, typical and untypical sentence completions seemed to fulfill the neural prediction, as indicated by N400 reductions. In response to moderately degraded signal quality, however, the formed expectancies appeared more specific: Only typical (+con +typ), but not the less typical (+con −typ) context–word combinations led to a decrease in the N400 amplitude. The results show that adverse listening “narrows,” rather than broadens, the expectancies about the perceived speech signal: limiting the perceptual evidence forces the neural system to rely on signal-driven expectancies, rather than more abstract expectancies, while a sentence unfolds over time.


2019 ◽  
Author(s):  
Keren Shavit Cohen ◽  
Elana Zion Golumbic

Focusing attention on one speaker against a background of other irrelevant speech can be a challenging feat. A longstanding question in attention research is whether and how frequently individuals shift their attention towards task-irrelevant speech, arguably leading to occasional detection of words in a so-called unattended message. However, this has been difficult to gauge empirically, particularly when participants attend to continuous natural speech, due to the lack of appropriate metrics for detecting shifts in internal attention. Here we introduce a new experimental platform for studying the dynamic deployment of attention among concurrent speakers, utilizing a unique combination of Virtual Reality and Eye-Tracking technology. We created a Virtual Café in which participants sit across from and attend to the narrative of a target speaker. We manipulate the number and location of distractor speakers, manifest as additional patrons throughout the Virtual Café. By monitoring participants’ eye-gaze dynamics, we studied the patterns of overt shifts of attention among the concurrent speakers as well as the consequences of these shifts on speech comprehension. Our results reveal important individual differences in the gaze-pattern displayed during selective attention to speech. While some participants stayed fixated on a target speaker throughout the entire experiment, approximately 30% of participants frequently shifted their gaze toward distractor speakers or other locations in the environment, regardless of the severity of audiovisual distraction. Critically, the tendency for frequent gaze-shifts negatively impacted comprehension of the target speaker.
We also found that gaze-shifts occurred primarily during gaps in the acoustic input, suggesting they are prompted by momentary unmasking of the competing audio, in line with ‘glimpsing’ theories of processing speech in noise. These results open a new window into understanding the dynamics of attention as they wax and wane over time, and the different listening patterns employed for dealing with the influx of sensory input in multisensory environments. Moreover, the novel approach developed here for tracking the locus of momentary attention in a naturalistic virtual-reality environment holds high promise for extending the study of human behavior and cognition and bridging the gap between the laboratory and real-life.


2014 ◽  
Vol 23 (3) ◽  
pp. 132-139 ◽  
Author(s):  
Lauren Zubow ◽  
Richard Hurtig

Children with Rett Syndrome (RS) are reported to use multiple modalities to communicate although their intentionality is often questioned (Bartolotta, Zipp, Simpkins, & Glazewski, 2011; Hetzroni & Rubin, 2006; Sigafoos et al., 2000; Sigafoos, Woodyatt, Tucker, Roberts-Pennell, & Pittendreigh, 2000). This paper will present results of a study analyzing the unconventional vocalizations of a child with RS. The primary research question addresses the ability of familiar and unfamiliar listeners to interpret unconventional vocalizations as “yes” or “no” responses. This paper will also address the acoustic analysis and perceptual judgments of these vocalizations. Pre-recorded isolated vocalizations of “yes” and “no” were presented to 5 listeners (mother, father, 1 unfamiliar, and 2 familiar clinicians) and the listeners were asked to rate the vocalizations as either “yes” or “no.” The ratings were compared to the original identification made by the child's mother during the face-to-face interaction from which the samples were drawn. Findings of this study suggest, in this case, the child's vocalizations were intentional and could be interpreted by familiar and unfamiliar listeners as either “yes” or “no” without contextual or visual cues. The results suggest that communication partners should be trained to attend to eye-gaze and vocalizations to ensure the child's intended choice is accurately understood.


1999 ◽  
Vol 4 (4) ◽  
pp. 205-218 ◽  
Author(s):  
David Magnusson

A description of two cases from my time as a school psychologist in the middle of the 1950s forms the background to the following question: Has anything important happened since then in psychological research to help us to a better understanding of how and why individuals think, feel, act, and react as they do in real life and how they develop over time? The studies serve as a background for some general propositions about the nature of the phenomena that concern us in developmental research, for a summary description of the developments in psychological research over the last 40 years as I see them, and for some suggestions about future directions.


1998 ◽  
Vol 3 (4) ◽  
pp. 271-280 ◽  
Author(s):  
Hannah Steinberg ◽  
Briony R. Nicholls ◽  
Elizabeth A. Sykes ◽  
N. LeBoutillier ◽  
Nerina Ramlakhan ◽  
...  

Mood improvement immediately after a single bout of exercise is well documented, but less is known about successive and longer term effects. In a “real-life” field investigation, four kinds of exercise class (Beginners, Advanced, Body Funk and Callanetics) met once a week for up to 7 weeks. Before and after each class the members assessed how they felt by completing a questionnaire listing equal numbers of “positive” and “negative” mood words. Subjects who had attended at least five times were included in the analysis, which led to groups consisting of 18, 20, 16, and 16 subjects, respectively. All four kinds of exercise significantly increased positive and decreased negative feelings, and this result was surprisingly consistent in successive weeks. However, exercise seemed to have a much greater effect on positive than on negative moods. The favorable moods induced by each class seemed to have worn off by the following week, to be reinstated by the class itself. In the Callanetics class, positive mood also improved significantly over time. The Callanetics class involved “slower,” more demanding exercises, not always done to music. The Callanetics and Advanced classes also showed significantly greater preexercise negative moods in the first three sessions. However, these differences disappeared following exercise. Possibly, these two groups had become more “tolerant” to the mood-enhancing effects of physical exercise; this may in part have been due to “exercise addiction.”


2016 ◽  
Vol 24 (2) ◽  
pp. 159-169
Author(s):  
M L Mojapelo

Storytelling consists of an interaction between a narrator and a listener, both of whom assign meaning to the story as a whole and its component parts. The meaning assigned to the narrative changes over time under the influence of the recipient’s changing precepts and perceptions which seem to be simplistic in infancy and more nuanced with age. It becomes more philosophical in that themes touching on the more profound questions of human existence tend to become more prominently discernible as the subject moves into the more reflective or summative phases of his or her existence. The aim of this article is to demonstrate the metaphorical character of a story, as reflected in changing patterns of meaning assigned to the narrative in the course of the subjective receiver’s passage through the various stages of life. This was done by analysing meaning, from a particular storytelling session, at different stages of a listener’s personal development. Meaning starts as literal and evolves through re-interpretation to abstract and deeper levels towards application in real life.


2013 ◽  
Vol 8 (1) ◽  
pp. 76
Author(s):  
Mathew Stone

A Review of: Gardois, P., Calabrese, R., Colombi, N., Lingua, C., Longo, F., Villanacci, M., Miniero, R., & Piga, A. (2011). Effectiveness of bibliographic searches performed by paediatric residents and interns assisted by librarian. A randomised controlled trial. Health Information and Libraries Journal, 28(4), 273-284. doi: 10.1111/j.1471-1842.2011.00957.x Objective – To establish whether the assistance of an experienced biomedical librarian delivers an improvement in the searching of bibliographic databases as performed by medical residents and interns. Design – Randomized controlled trial. Setting – The pediatrics department of a large Italian teaching hospital. Subjects – 18 pediatric residents and interns. Methods – 23 residents and interns from the pediatrics department of a large Italian teaching hospital were invited to participate in this study, of which 18 agreed. Subjects were then randomized into two groups and asked to spend between 30 and 90 minutes searching bibliographic databases for evidence to answer a real-life clinical question which was randomly allocated to them. Each member of the intervention group was provided with an experienced biomedical librarian to provide assistance throughout the search session. The control group received no assistance. The outcome of the search was then measured using an assessment tool adapted for the purpose of this study from the Fresno test of competence in evidence based medicine. This adapted assessment tool rated the “global success” of the search and included criteria such as appropriate question formulation, number of PICO terms translated into search terms, use of Boolean logic, use of subject headings, use of filters, use of limits, and the percentage of citations retrieved that matched a gold standard set of citations found in a prior search by two librarians (who were not involved in assisting the subjects) together with an expert clinician. 
Main Results – The intervention group scored a median average of 73.6 points out of a possible 100, compared with the control group which scored 50.4. The difference of 23.2 points in favour of the librarian assisted group was a statistically significant result (p value = 0.013) with a 95% confidence interval of between 4.8 and 33.2. Conclusion – This study presents credible evidence that assistance provided by an experienced biomedical librarian improves the quality of the bibliographic database searches performed by residents and interns using real-life clinical scenarios.

