Bottom-up and top-down neural signatures of disordered multi-talker speech perception in adults with normal hearing

eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Aravindakshan Parthasarathy ◽  
Kenneth E Hancock ◽  
Kara Bennett ◽  
Victor DeGruttola ◽  
Daniel B Polley

In social settings, speech waveforms from nearby speakers mix together in our ear canals. Normally, the brain unmixes the attended speech stream from the chorus of background speakers using a combination of fast temporal processing and cognitive active listening mechanisms. Of >100,000 patient records, ~10% of adults visited our clinic because of reduced hearing, only to learn that their hearing was clinically normal and should not cause communication difficulties. We found that multi-talker speech intelligibility thresholds varied widely in normal-hearing adults, but could be predicted from neural phase-locking to frequency modulation (FM) cues measured with ear canal EEG recordings. Combining neural temporal fine structure processing, pupil-indexed listening effort, and behavioral FM thresholds accounted for 78% of the variability in multi-talker speech intelligibility. The disordered bottom-up and top-down markers of poor multi-talker speech perception identified here could inform the design of next-generation clinical tests for hidden hearing disorders.
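As a rough illustration of the kind of multi-predictor model described above (not the authors' published analysis), the sketch below fits an ordinary least-squares regression combining three hypothetical predictors, neural TFS phase-locking, pupil-indexed listening effort, and a behavioral FM threshold, to a synthetic speech intelligibility outcome and reports the variance explained. All variable names, values, and the sample size are illustrative assumptions.

```python
# Illustrative sketch only: multiple linear regression predicting a multi-talker
# speech reception threshold from three hypothetical listener-level predictors.
import numpy as np

rng = np.random.default_rng(0)
n = 23  # hypothetical number of listeners

# Synthetic predictor matrix: [TFS phase-locking, pupil effort index, FM threshold]
X = np.column_stack([
    rng.normal(0.6, 0.1, n),    # EEG phase-locking value to FM cues
    rng.normal(0.2, 0.05, n),   # pupil dilation re: baseline (a.u.)
    rng.lognormal(1.0, 0.4, n), # behavioral FM detection threshold (Hz)
])
# Synthetic outcome: speech reception threshold in multi-talker babble (dB SNR)
y = 2.0 - 8.0 * X[:, 0] + 15.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.5, n)

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef

ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
print(f"Variance explained (R^2): {r_squared:.2f}")
```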

2019 ◽  
Author(s):  
Aravindakshan Parthasarathy ◽  
Kenneth E Hancock ◽  
Kara Bennett ◽  
Victor DeGruttola ◽  
Daniel B Polley

In social settings, speech waveforms from nearby speakers mix together in our ear canals. The brain unmixes the attended speech stream from the chorus of background speakers using a combination of fast temporal processing and cognitive active listening mechanisms. Multi-talker speech perception is vulnerable to aging or auditory abuse. We found that ∼10% of adult visitors to our clinic have no measurable hearing loss, yet offer a primary complaint of poor hearing. Multi-talker speech intelligibility in these adults was strongly correlated with neural phase locking to frequency modulation (FM) cues, as determined from ear canal EEG recordings. Combining neural temporal fine structure (TFS) processing with pupil-indexed measures of cognitive listening effort could predict most of the individual variance in speech intelligibility thresholds. These findings identify a confluence of disordered bottom-up and top-down processes that predict poor multi-talker speech perception and could be useful in next-generation tests of hidden hearing disorders.
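A minimal sketch of one way neural phase locking to an FM cue could be quantified from EEG epochs, using an inter-trial phase coherence (ITPC) metric at the modulation rate. The sampling rate, FM rate, and data below are assumptions for illustration, not the published pipeline.

```python
# Hedged sketch: inter-trial phase coherence (ITPC) at the FM rate as a proxy
# for neural phase locking, computed from epoched EEG of shape (trials, samples).
import numpy as np

fs = 500.0      # EEG sampling rate (Hz), assumed
fm_rate = 2.0   # stimulus FM rate (Hz), assumed
n_trials, n_samp = 200, 1000
rng = np.random.default_rng(1)

t = np.arange(n_samp) / fs
# Synthetic epochs: a weak response phase-locked to the FM rate, buried in noise
epochs = 0.3 * np.sin(2 * np.pi * fm_rate * t) + rng.normal(0.0, 1.0, (n_trials, n_samp))

spectra = np.fft.rfft(epochs, axis=1)
freqs = np.fft.rfftfreq(n_samp, d=1.0 / fs)
bin_idx = np.argmin(np.abs(freqs - fm_rate))

# ITPC: length of the mean unit phasor across trials at the FM-rate bin
unit_phasors = spectra[:, bin_idx] / np.abs(spectra[:, bin_idx])
itpc = np.abs(unit_phasors.mean())
print(f"ITPC at {freqs[bin_idx]:.1f} Hz: {itpc:.2f}")
```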


2020 ◽  
Author(s):  
Matthew Winn ◽  
Katherine H. Teece

Speech perception and listening effort are complicated and interrelated concepts. One might assume that intelligibility performance (percent correct) is a proxy for listening effort, but there are reasons to challenge whether that is actually true. Correct responses in speech perception tests could reflect effortful mental processing, and a completely wrong answer could evoke very little effort, especially if the misperception itself is linguistically well-formed and sensible. This paper presents evidence that listening effort is not a function of the proportion of words correct, but is rather driven by the types of errors, the position of errors within a sentence, and the need to resolve ambiguity, reflecting how easily the listener can make sense of a perception. We offer a taxonomy of error types that is both intuitive and consistent with data from two experiments measuring listening effort with careful controls to either elicit specific kinds of mistakes or to track specific mistakes retrospectively. Participants included individuals with normal hearing or with cochlear implants. In two experiments of sentence repetition, listening effort, indexed by changes in pupil size, was found to scale with the amount of perceptual restoration needed (phoneme versus whole word) and with the sensibility of responses, but not with the number of intelligibility errors. Although mental corrective action and number of mistakes can scale together in many experiments, it is possible to dissociate them in order to advance toward a more explanatory (rather than correlational) account of listening effort.
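The summary below is a hedged sketch of the general pupillometry approach described above: baseline-correct each trial's pupil trace and group peak dilation by response category rather than by percent correct. The sampling rate, trial structure, and category labels are illustrative assumptions, not the authors' analysis code.

```python
# Hedged sketch: peak pupil dilation (re: pre-stimulus baseline) averaged by
# response category instead of by number of words correct.
import numpy as np

fs = 60.0  # eye-tracker sampling rate (Hz), assumed
rng = np.random.default_rng(2)

def peak_dilation(trace, baseline_samples=int(0.5 * fs)):
    """Peak pupil size relative to the mean of the pre-stimulus baseline."""
    baseline = trace[:baseline_samples].mean()
    return (trace - baseline).max()

# Hypothetical trials: (pupil trace over the sentence, response category)
categories = ["fully correct", "sensible error", "nonsense error"]
trials = [(rng.normal(3.0, 0.05, 300) + rng.uniform(0.0, 0.4), rng.choice(categories))
          for _ in range(90)]

by_category = {c: [] for c in categories}
for trace, category in trials:
    by_category[category].append(peak_dilation(trace))

for category, values in by_category.items():
    print(f"{category:>15}: mean peak dilation = {np.mean(values):.3f} mm")
```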


2010 ◽  
Vol 10 ◽  
pp. 329-339 ◽  
Author(s):  
Torsten Rahne ◽  
Michael Ziese ◽  
Dorothea Rostalski ◽  
Roland Mühler

This paper describes a logatome discrimination test for assessing speech perception in cochlear implant (CI) users, based on a multilingual speech database, the Oldenburg Logatome Corpus, which was originally recorded for comparing human and automated speech recognition. The task presents 100 logatome pairs (i.e., nonsense syllables) with balanced representation of alternating "vowel-replacement" and "consonant-replacement" paradigms in order to assess phoneme confusions. Thirteen adult normal-hearing listeners and eight adult CI users, spanning both good and poor performers, took part in the study and completed the test after their speech intelligibility had been evaluated with an established sentence test in noise. Discrimination abilities were also measured electrophysiologically by recording the mismatch negativity (MMN) as a component of auditory event-related potentials. A clear MMN response was obtained only for normal-hearing listeners and for CI users with good performance, correlating with their logatome discrimination abilities. Discrimination scores were higher for the vowel-replacement paradigm than for the consonant-replacement paradigm. We conclude that the logatome discrimination test is well suited to monitor the speech perception skills of CI users. Given the large number of available spoken logatome items, the Oldenburg Logatome Corpus appears to provide a useful and powerful basis for further development of speech perception tests for CI users.
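A small illustrative sketch of how per-paradigm discrimination scores for such a test might be tallied; the trial structure and responses are hypothetical, not the published scoring procedure.

```python
# Hypothetical scoring sketch: percent-correct discrimination per replacement
# paradigm across 100 logatome pairs for a single listener.
import random
from collections import defaultdict

random.seed(3)

# Each trial: (paradigm, listener judged "different", pair actually differed)
trials = [("vowel" if i % 2 == 0 else "consonant",
           random.random() < 0.8,   # listener's "different" judgment
           i % 4 != 3)              # a few "same" catch trials mixed in
          for i in range(100)]

correct = defaultdict(int)
total = defaultdict(int)
for paradigm, judged_different, is_different in trials:
    total[paradigm] += 1
    correct[paradigm] += int(judged_different == is_different)

for paradigm in ("vowel", "consonant"):
    pct = 100.0 * correct[paradigm] / total[paradigm]
    print(f"{paradigm}-replacement discrimination: {pct:.0f}% correct")
```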


2021 ◽  
Vol 69 (1) ◽  
pp. 77-85
Author(s):  
Cheol-Ho Jeong ◽  
Wan-Ho Cho ◽  
Ji-Ho Chang ◽  
Sung-Hyun Lee ◽  
Chang-Wook Kang ◽  
...  

Hearing-impaired people require more stringent acoustic and noise conditions than normal-hearing people to achieve comparable speech intelligibility and listening effort. Multiple guidelines recommend a maximum reverberation time of 0.4 s in classrooms, signal-to-noise ratios (SNRs) greater than 15 dB, and ambient noise levels lower than 35 dBA. We measured noise levels and room acoustic parameters of 12 classrooms in two schools for hearing-impaired pupils, a dormitory apartment for the hearing-impaired, and a church serving mainly hearing-impaired congregants in the Republic of Korea. Additionally, subjective speech clarity and quality of verbal communication were evaluated through questionnaires and interviews with hearing-impaired students in one school. Large differences in subjective speech perception were found between younger primary school pupils and older pupils. Subjective data from the questionnaire and the interview were inconsistent; major challenges in obtaining reliable subjective speech perception ratings and limitations of the results are discussed.
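The check below is a minimal sketch that compares measured room parameters against the guideline values quoted above (reverberation time of at most 0.4 s, SNR of at least 15 dB, ambient noise of at most 35 dBA); room names and measurements are hypothetical.

```python
# Sketch: screen hypothetical room measurements against the quoted guideline values.
GUIDELINE = {"rt_s_max": 0.4, "snr_db_min": 15.0, "noise_dba_max": 35.0}

rooms = [
    {"name": "Classroom A", "rt_s": 0.55, "snr_db": 12.0, "noise_dba": 41.0},
    {"name": "Classroom B", "rt_s": 0.38, "snr_db": 17.5, "noise_dba": 33.0},
]

for room in rooms:
    ok = (room["rt_s"] <= GUIDELINE["rt_s_max"]
          and room["snr_db"] >= GUIDELINE["snr_db_min"]
          and room["noise_dba"] <= GUIDELINE["noise_dba_max"])
    print(f"{room['name']}: {'meets' if ok else 'fails'} guideline "
          f"(RT={room['rt_s']} s, SNR={room['snr_db']} dB, noise={room['noise_dba']} dBA)")
```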


2015 ◽  
Vol 26 (06) ◽  
pp. 572-581 ◽  
Author(s):  
Stanley Sheft ◽  
Min-Yu Cheng ◽  
Valeriy Shafiro

Background: Past work has shown that low-rate frequency modulation (FM) may help preserve signal coherence, aid segmentation at word and syllable boundaries, and benefit speech intelligibility in the presence of a masker. Purpose: This study evaluated whether difficulties in speech perception by cochlear implant (CI) users relate to a deficit in the ability to discriminate among stochastic low-rate patterns of FM. Research Design: This is a correlational study assessing the association between the ability to discriminate stochastic patterns of low-rate FM and the intelligibility of speech in noise. Study Sample: Thirteen postlingually deafened adult CI users participated in this study. Data Collection and Analysis: Using modulators derived from 5-Hz lowpass noise applied to a 1-kHz carrier, discrimination thresholds were measured for frequency excursion (both in quiet and with a speech-babble masker present), for stimulus duration, and for signal-to-noise ratio in the presence of the speech-babble masker. Speech perception ability was assessed in the presence of the same speech-babble masker. Relationships were evaluated with Pearson product-moment correlation analysis, corrected for family-wise error, and with commonality analysis to determine the unique and common contributions of the psychoacoustic variables to the association with speech ability. Results: Significant correlations were obtained between masked speech intelligibility and three metrics of FM discrimination involving either signal-to-noise ratio or stimulus duration, with shared variance among the three measures accounting for much of the effect. Compared to past results from young normal-hearing adults and older adults with either normal hearing or a mild-to-moderate hearing loss, mean FM discrimination thresholds obtained from CI users were higher in all conditions. Conclusions: The ability to process the pattern of frequency excursions of stochastic FM may, in part, have a common basis with speech perception in noise. Discrimination of differences in the temporally distributed place coding of the stimulus could serve as this common basis for CI users.
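As a hedged sketch of the stimulus described in the design (a 1-kHz carrier frequency-modulated by 5-Hz lowpass noise), the code below generates one such stochastic FM token; the sampling rate, duration, and excursion value are assumptions for illustration, not the study's exact synthesis code.

```python
# Sketch: stochastic low-rate FM stimulus -- 1-kHz carrier modulated by 5-Hz
# lowpass noise, with peak frequency excursion as the discrimination variable.
import numpy as np

fs = 44100            # audio sampling rate (Hz), assumed
dur = 1.0             # stimulus duration (s), one of the varied parameters
excursion_hz = 50.0   # peak frequency excursion (Hz), assumed threshold variable
carrier_hz = 1000.0
rng = np.random.default_rng(4)

n = int(fs * dur)
# 5-Hz lowpass noise modulator: zero FFT bins above 5 Hz, normalize to +/- 1
noise = rng.normal(size=n)
spec = np.fft.rfft(noise)
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
spec[freqs > 5.0] = 0.0
mod = np.fft.irfft(spec, n)
mod /= np.max(np.abs(mod))

# Frequency modulation: integrate the instantaneous frequency to obtain phase
inst_freq = carrier_hz + excursion_hz * mod
phase = 2 * np.pi * np.cumsum(inst_freq) / fs
stimulus = np.sin(phase)
print(f"Generated {dur:.1f}-s stochastic FM token, RMS = {np.sqrt(np.mean(stimulus**2)):.2f}")
```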


NeuroImage ◽  
2014 ◽  
Vol 102 ◽  
pp. 637-645 ◽  
Author(s):  
Juan Wang ◽  
Danqi Gao ◽  
Duan Li ◽  
Amy S. Desroches ◽  
Li Liu ◽  
...  
Keyword(s):  
Top Down ◽  

2021 ◽  
Vol 15 ◽  
Author(s):  
Stephen Grossberg

All perceptual and cognitive circuits in the human cerebral cortex are organized into layers. Specializations of a canonical laminar network of bottom-up, horizontal, and top-down pathways carry out multiple kinds of biological intelligence across different neocortical areas. This article describes what this canonical network is and notes that it can support processes as different as 3D vision and figure-ground perception; attentive category learning and decision-making; speech perception; and cognitive working memory (WM), planning, and prediction. These processes take place within and between multiple parallel cortical streams that obey computationally complementary laws. The interstream interactions that are needed to overcome these complementary deficiencies mix cell properties so thoroughly that some authors have noted the difficulty of determining what exactly constitutes a cortical stream and the differences between streams. The models summarized herein explain how these complementary properties arise, and how their interstream interactions overcome their computational deficiencies to support effective goal-oriented behaviors.

