Cognitive and Physiological Measures of Listening Effort During Degraded Speech Perception: Relating Dual-Task and Pupillometry Paradigms

Author(s):  
Sarah Colby ◽  
Bob McMurray

Purpose: Listening effort is quickly becoming an important metric for assessing speech perception in less-than-ideal situations. However, the relationship between the construct of listening effort and the measures used to assess it remains unclear. We compared two measures of listening effort: a cognitive dual task and a physiological pupillometry task. We sought to investigate the relationship between these measures of effort and whether engaging effort impacts speech accuracy. Method: In Experiment 1, 30 participants completed a dual task and a pupillometry task that were carefully matched in stimuli and design. The dual task consisted of a spoken word recognition task and a visual match-to-sample task. In the pupillometry task, pupil size was monitored while participants completed a spoken word recognition task. Both tasks presented words at three levels of listening difficulty (unmodified, eight-channel vocoding, and four-channel vocoding) and provided response feedback on every trial. We refined the pupillometry task in Experiment 2 (n = 31); crucially, participants no longer received response feedback. Finally, we ran a new group of subjects on both tasks in Experiment 3 (n = 30). Results: In Experiment 1, accuracy in the visual task decreased with increased signal degradation in the dual task, but pupil size was sensitive to accuracy and not vocoding condition. After removing feedback in Experiment 2, changes in pupil size were predicted by listening condition, suggesting the task was now sensitive to engaged effort. Both tasks were sensitive to listening difficulty in Experiment 3, but there was no relationship between the tasks and neither task predicted speech accuracy. Conclusions: Consistent with previous work, we found little evidence for a relationship between different measures of listening effort. We also found no evidence that effort predicts speech accuracy, suggesting that engaging more effort does not lead to improved speech recognition. Cognitive and physiological measures of listening effort are likely sensitive to different aspects of the construct of listening effort. Supplemental Material: https://doi.org/10.23641/asha.16455900
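The eight- and four-channel conditions above refer to noise vocoding, which removes spectral detail while preserving each band's temporal envelope. A minimal sketch of that manipulation follows; the band edges, filter orders, and envelope cutoff are illustrative assumptions, not the authors' stimulus parameters.

```python
# Minimal noise-vocoder sketch (illustrative parameters, not the authors' stimulus code).
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(speech, fs, n_channels=8, lo=100.0, hi=8000.0, env_cutoff=30.0):
    """Replace spectral detail with band-limited noise modulated by each band's envelope."""
    rng = np.random.default_rng(0)
    edges = np.geomspace(lo, hi, n_channels + 1)          # log-spaced band edges (assumed)
    env_sos = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros_like(speech, dtype=float)
    for k in range(n_channels):
        band_sos = butter(4, [edges[k], edges[k + 1]], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, speech)               # analysis band
        env = sosfiltfilt(env_sos, np.abs(hilbert(band)))  # smoothed amplitude envelope
        carrier = sosfiltfilt(band_sos, rng.standard_normal(len(speech)))  # noise carrier
        out += np.clip(env, 0.0, None) * carrier
    return out / (np.max(np.abs(out)) + 1e-9)              # normalize to avoid clipping

# Fewer channels (e.g., n_channels=4) remove more spectral detail and make listening harder.
```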

2020 ◽  
Author(s):  
Sarah Elizabeth Margaret Colby ◽  
Bob McMurray

Purpose: Listening effort is quickly becoming an important metric for assessing speech perception in less-than-ideal situations. However, the relationship between the construct of listening effort and the measures used to assess it remains unclear. We compared two measures of listening effort: a cognitive dual task and a physiological pupillometry task. We sought to investigate the relationship between these measures of effort and whether engaging effort impacts speech accuracy. Method: In Experiment 1, 30 participants completed a dual task and a pupillometry task that were carefully matched in stimuli and design. The dual task consisted of a spoken word recognition task and a visual match-to-sample task. In the pupillometry task, pupil size was monitored while participants completed a spoken word recognition task. Both tasks presented words at three levels of listening difficulty (unmodified, 8-channel vocoding, and 4-channel vocoding) and provided response feedback on every trial. We refined the pupillometry task in Experiment 2 (n = 31); crucially, participants no longer received response feedback. Finally, we ran a new group of subjects on both tasks in Experiment 3 (n = 30). Results: In Experiment 1, accuracy in the visual task decreased with increased listening difficulty in the dual task, but pupil size was sensitive to accuracy and not listening difficulty. After removing feedback in Experiment 2, changes in pupil size were predicted by listening difficulty, suggesting the task was now sensitive to engaged effort. Both tasks were sensitive to listening difficulty in Experiment 3, but there was no relationship between the tasks and neither task predicted speech accuracy. Conclusions: Consistent with previous work, we found little evidence for a relationship between different measures of listening effort. We also found no evidence that effort predicts speech accuracy, suggesting that engaging more effort does not lead to improved speech recognition. Cognitive and physiological measures of listening effort are likely sensitive to different aspects of the construct of listening effort.


2019 ◽  
Author(s):  
Violet Aurora Brown ◽  
Julia Feld Strand

It is widely accepted that seeing a talker improves a listener’s ability to understand what that talker is saying in background noise (e.g., Erber, 1969; Sumby & Pollack, 1954). The literature is mixed, however, regarding the influence of the visual modality on the listening effort required to recognize speech (e.g., Fraser, Gagné, Alepins, & Dubois, 2010; Sommers & Phelps, 2016). Here, we present data showing that even when the visual modality robustly benefits recognition, processing audiovisual speech can still result in greater cognitive load than processing speech in the auditory modality alone. We show using a dual-task paradigm that the costs associated with audiovisual speech processing are more pronounced in easy listening conditions, in which speech can be recognized at high rates in the auditory modality alone—indeed, effort did not differ between audiovisual and audio-only conditions when the background noise was presented at a more difficult level. Further, we show that though these effects replicate with different stimuli and participants, they do not emerge when effort is assessed with a recall paradigm rather than a dual-task paradigm. Together, these results suggest that the widely cited audiovisual recognition benefit may come at a cost under more favorable listening conditions, and add to the growing body of research suggesting that various measures of effort may not be tapping into the same underlying construct (Strand et al., 2018).
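In a dual-task paradigm like the one used here, listening effort is indexed by how much a concurrent secondary task suffers. A rough sketch of that computation with hypothetical trial data follows; the column names and values are placeholders, not the studies' data.

```python
# Hypothetical dual-task cost computation (columns and values are placeholders).
import pandas as pd

# One row per secondary-task trial: condition, reaction time (ms), accuracy (0/1).
trials = pd.DataFrame({
    "condition": ["baseline", "baseline", "audio_only", "audio_only", "audiovisual", "audiovisual"],
    "rt_ms":     [450, 470, 520, 540, 600, 610],
    "correct":   [1, 1, 1, 0, 1, 1],
})

summary = trials.groupby("condition")[["rt_ms", "correct"]].mean()
# Dual-task cost: slowing relative to performing the secondary task alone,
# taken as a proxy for the effort the listening task draws away.
summary["rt_cost_ms"] = summary["rt_ms"] - summary.loc["baseline", "rt_ms"]
print(summary)
```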


1989 ◽  
Vol 17 (5) ◽  
pp. 525-535 ◽  
Author(s):  
Monique Radeau ◽  
José Morais ◽  
Agnès Dewier

2015 ◽  
Vol 19 (2) ◽  
pp. 294-310 ◽  
Author(s):  
Luis Morales ◽  
Daniela Paolieri ◽  
Paola E. Dussias ◽  
Jorge R. Valdés Kroff ◽  
Chip Gerfen ◽  
...  

We investigate the ‘gender-congruency’ effect during a spoken-word recognition task using the visual world paradigm. Eye movements of Italian–Spanish bilinguals and Spanish monolinguals were monitored while they viewed a pair of objects on a computer screen. Participants listened to instructions in Spanish (encuentra la bufanda / ‘find the scarf’) and clicked on the object named in the instruction. The grammatical gender of the objects’ names was manipulated so that pairs of objects had the same (congruent) or different (incongruent) gender in Italian, whereas gender in Spanish was always the same. Results showed that bilinguals, but not monolinguals, looked at target objects less when they were incongruent in gender, suggesting a between-language gender-competition effect. In addition, bilinguals looked at target objects more when the definite article in the spoken instructions provided a valid cue to anticipate target selection (different-gender condition). The temporal dynamics of gender processing and cross-language activation in bilinguals are discussed.
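The visual world measure reported here reduces to the proportion of fixations on the target object over time. A minimal sketch with hypothetical sample-level eye-tracking data follows; the column names and values are assumptions.

```python
# Proportion of looks to the target per 50 ms bin (hypothetical sample-level data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
samples = pd.DataFrame({
    "time_ms":   np.tile(np.arange(0, 1000, 20), 2),                      # 20 ms samples
    "roi":       rng.choice(["target", "competitor", "none"], size=100),  # fixated object
    "condition": ["congruent"] * 50 + ["incongruent"] * 50,
})

samples["bin"] = (samples["time_ms"] // 50) * 50
prop_target = (
    samples.assign(on_target=samples["roi"].eq("target"))
           .groupby(["condition", "bin"])["on_target"]
           .mean()                     # proportion of samples on the target per bin
           .unstack("condition")
)
print(prop_target.head())
```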


2020 ◽  
Author(s):  
Kristin J. Van Engen ◽  
Avanti Dey ◽  
Nichole Runge ◽  
Brent Spehar ◽  
Mitchell S. Sommers ◽  
...  

This study assessed the effects of age, lexical frequency, and noise on the time course of lexical activation during spoken word recognition. Participants (41 young adults and 39 older adults) performed a visual world word recognition task while we monitored their gaze position. On each trial, four phonologically unrelated pictures appeared on the screen. A target word was presented following a carrier phrase (“Click on the ________”), at which point participants were instructed to use the mouse to click on the picture that corresponded to the target word. High- and low-frequency words were presented in quiet and in noise at a signal-to-noise ratio (SNR) of +3 dB. Results show that, even in the absence of phonological competitors in the visual array, high-frequency words were fixated more quickly than low-frequency words by both listener groups. Young adults were generally faster to fixate on targets than older adults, but the pattern of interactions among noise, lexical frequency, and listener age shows that the behavior of young adults in a small amount of noise largely matches older adult behavior.
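Presenting words in noise at an SNR of +3 dB amounts to scaling the noise so that the ratio of speech power to noise power is 10^(3/10). A minimal mixing sketch follows; the function name and inputs are assumptions, not the authors' stimulus code.

```python
# Mix speech and noise at a target SNR in dB (illustrative, not the authors' stimulus code).
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals snr_db, then add."""
    noise = noise[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    target_noise_power = p_speech / (10 ** (snr_db / 10.0))
    return speech + noise * np.sqrt(target_noise_power / p_noise)

# e.g., mixed = mix_at_snr(word_waveform, babble_waveform, snr_db=3.0)
```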


2004 ◽  
Vol 47 (3) ◽  
pp. 496-508 ◽  
Author(s):  
Elizabeth A. Collison ◽  
Benjamin Munson ◽  
Arlene Earley Carney

This study examined spoken word recognition in adults with cochlear implants (CIs) to determine the extent to which linguistic and cognitive abilities predict variability in speech-perception performance. Both a traditional consonant-vowel-consonant (CVC)-repetition measure and a gated-word recognition measure (F. Grosjean, 1996) were used. Stimuli in the gated-word-recognition task varied in neighborhood density. Adults with CIs repeated CVC words less accurately than did age-matched adults with normal hearing sensitivity (NH). In addition, adults with CIs required more acoustic information to recognize gated words than did adults with NH. Neighborhood density had a smaller influence on gated-word recognition by adults with CIs than on recognition by adults with NH. With the exception of 1 outlying participant, standardized, norm-referenced measures of cognitive and linguistic abilities were not correlated with word-recognition measures. Taken together, these results do not support the hypothesis that cognitive and linguistic abilities predict variability in speech-perception performance in a heterogeneous group of adults with CIs. Findings are discussed in light of the potential role of auditory perception in mediating relations among cognitive and linguistic skill and spoken word recognition.
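In a gated-word-recognition task (Grosjean, 1996), listeners hear successively longer onset fragments of a word until they can identify it, so the dependent measure is how much acoustic information is needed. A minimal sketch of constructing such gates follows; the gate duration is an assumed parameter.

```python
# Cut successive onset "gates" from a word waveform (gate size is an assumption).
import numpy as np

def make_gates(word, fs, gate_ms=50):
    """Return fragments containing the first gate_ms, 2*gate_ms, ... of samples, up to the full word."""
    step = int(fs * gate_ms / 1000)
    return [word[:end] for end in range(step, len(word) + step, step)]

# e.g., gates = make_gates(word_waveform, fs=44100, gate_ms=50); listeners respond after each gate,
# and recognition is scored by the gate at which the word is first identified correctly.
```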


2020 ◽  
pp. 026765832094154
Author(s):  
Wenyi Ling ◽  
Theres Grüter

Successful listening in a second language (L2) involves learning to identify the relevant acoustic–phonetic dimensions that differentiate between words in the L2, and then using these cues to access lexical representations during real-time comprehension. This is a particularly challenging goal to achieve when the relevant acoustic–phonetic dimensions in the L2 differ from those in the L1, as is the case for the L2 acquisition of Mandarin, a tonal language, by speakers of non-tonal languages like English. Previous work shows that tone in an L2 is perceived less categorically (Shen and Froud, 2019) and weighted less in word recognition (Pelzl et al., 2019) than in an L1. However, little is known about the link between categorical perception of tone and use of tone in real-time L2 word recognition at the level of the individual learner. This study presents evidence from 30 native speakers and 29 L1-English learners of Mandarin who completed a real-time spoken word recognition task and a tone identification task. Results show that L2 learners differed from native speakers both in the extent to which they perceived tone categorically and in their ability to use tonal cues to distinguish between words in real-time comprehension. Critically, learners who reliably distinguished between words differing by tone alone in the word recognition task also showed more categorical perception of tone on the identification task. Moreover, within this group, performance on the two tasks was strongly correlated. This provides the first direct evidence that the ability to perceive tone categorically is related to the weighting of tonal cues during spoken word recognition, thus contributing to a better understanding of the link between phonemic and lexical processing, which has been argued to be a key component in the L2 acquisition of tone (Wong and Perrachione, 2007).
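The individual-level link reported here rests on two per-learner scores: the steepness of the identification function along a tone continuum (an index of categorical perception) and accuracy at distinguishing words by tone, which are then related across learners. A rough sketch under those assumptions, with hypothetical data:

```python
# Relate categoricality of tone identification to tone-cued word recognition (hypothetical data).
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr

def logistic(x, x0, k):
    """Identification function over a tone continuum; k indexes categoricality."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

continuum = np.linspace(0.0, 1.0, 7)        # 7-step tone continuum (assumed)
# Proportion of "Tone A" responses at each step for one hypothetical learner.
id_props = np.array([0.05, 0.08, 0.20, 0.55, 0.85, 0.93, 0.97])
(x0, k), _ = curve_fit(logistic, continuum, id_props, p0=[0.5, 5.0])
print(f"identification slope k = {k:.1f}")  # steeper = more categorical

# With one slope and one tone-word accuracy per learner, the group-level question
# is simply their correlation (values below are placeholders).
slopes   = np.array([3.1, 5.4, 7.8, 9.2, 11.0])
accuracy = np.array([0.55, 0.61, 0.70, 0.72, 0.80])
r, p = pearsonr(slopes, accuracy)
print(f"r = {r:.2f}, p = {p:.3f}")
```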


2010 ◽  
Vol 21 (7) ◽  
pp. 908-913 ◽  
Author(s):  
Nivedita Mani ◽  
Kim Plunkett

Do infants implicitly name visually fixated objects whose names are known, and does this information influence their preference for looking at other objects? We presented 18-month-old infants with a picture-based phonological priming task and examined their recognition of named targets in primed (e.g., dog-door) and unrelated (e.g., dog-boat) trials. Infants showed better recognition of the target object in primed than in unrelated trials across three measures. As the prime image was never explicitly named during the experiment, the only explanation for the systematic influence of the prime image on target recognition is that infants, like adults, can implicitly name visually fixated images and that these implicitly generated names can prime infants’ subsequent responses in a paired visual-object spoken-word-recognition task.

