The cafeteria study: Effects of facial masks, hearing protection, and real-world noise on speech recognition

2021 ◽  
Vol 150 (6) ◽  
pp. 4244-4255
Author(s):  
Mary E. Barrett ◽  
Sandra Gordon-Salant ◽  
Douglas S. Brungart

2020 ◽  
pp. 1237-1247
Author(s):  
Xiangdong Wang ◽  
Yang Yang ◽  
Hong Liu ◽  
Yueliang Qian ◽  
Duan Jia

In real-world applications of speech recognition, recognition errors are inevitable, and manual correction is necessary. This paper presents an approach for refining Mandarin speech recognition results by exploiting user feedback. An interface incorporating character-based candidate lists and feedback-driven updating of the candidate lists is introduced. For dynamic updating of candidate lists, a novel method based on lattice modification and rescoring is proposed. By adding to the lattice words whose pronunciations are similar to those of the candidates adjacent to the corrected character, and then rescoring the modified lattice, the proposed method can improve the accuracy of the candidate lists even if the correct characters are not in the original lattice, at much lower computational cost than speech re-recognition methods. Experimental results show that the proposed method can reduce user inputs by 24.03% and improve average candidate rank by 25.31%.


1997 ◽  
Vol 33 (1) ◽  
pp. 12 ◽  
Author(s):  
Doh-Suk Kim ◽  
Soo-Young Lee ◽  
Rhee M. Kil ◽  
Xuelong Zhu

2004 ◽  
Vol 115 (5) ◽  
pp. 2378-2379 ◽  
Author(s):  
William A. Ahroon ◽  
Martin B. Robinette

2013 ◽  
Vol 24 (07) ◽  
pp. 616-634 ◽  
Author(s):  
Terrin N. Tamati ◽  
Jaimie L. Gilbert ◽  
David B. Pisoni

Background: Previous studies investigating speech recognition in adverse listening conditions have found extensive variability among individual listeners. However, little is currently known about the core underlying factors that influence speech recognition abilities. Purpose: To investigate sensory, perceptual, and neurocognitive differences between good and poor listeners on the Perceptually Robust English Sentence Test Open-set (PRESTO), a new high-variability sentence recognition test under adverse listening conditions. Research Design: Participants who fell in the upper quartile (HiPRESTO listeners) or lower quartile (LoPRESTO listeners) on key word recognition on sentences from PRESTO in multitalker babble completed a battery of behavioral tasks and self-report questionnaires designed to investigate real-world hearing difficulties, indexical processing skills, and neurocognitive abilities. Study Sample: Young, normal-hearing adults (N = 40) from the Indiana University community participated in the current study. Data Collection and Analysis: Participants' assessment of their own real-world hearing difficulties was measured with a self-report questionnaire on situational hearing and hearing health history. Indexical processing skills were assessed using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. Neurocognitive abilities were measured with the Auditory Digit Span Forward (verbal short-term memory) and Digit Span Backward (verbal working memory) tests, the Stroop Color and Word Test (attention/inhibition), the WordFam word familiarity test (vocabulary size), the Behavioral Rating Inventory of Executive Function–Adult Version (BRIEF-A) self-report questionnaire on executive function, and two performance subtests of the Wechsler Abbreviated Scale of Intelligence (WASI) Performance Intelligence Quotient (IQ; nonverbal intelligence). Scores on self-report questionnaires and behavioral tasks were tallied and analyzed by listener group (HiPRESTO and LoPRESTO). Results: The extreme groups did not differ overall on self-reported hearing difficulties in real-world listening environments. However, an item-by-item analysis of questions revealed that LoPRESTO listeners reported significantly greater difficulty understanding speakers in a public place. HiPRESTO listeners were significantly more accurate than LoPRESTO listeners at gender discrimination and regional dialect categorization, but they did not differ on talker discrimination accuracy or response time, or gender discrimination response time. HiPRESTO listeners also had longer forward and backward digit spans, higher word familiarity ratings on the WordFam test, and lower (better) scores for three individual items on the BRIEF-A questionnaire related to cognitive load. The two groups did not differ on the Stroop Color and Word Test or either of the WASI performance IQ subtests. Conclusions: HiPRESTO listeners and LoPRESTO listeners differed in indexical processing abilities, short-term and working memory capacity, vocabulary size, and some domains of executive functioning. These findings suggest that individual differences in the ability to encode and maintain highly detailed episodic information in speech may underlie the variability observed in speech recognition performance in adverse listening conditions using high-variability PRESTO sentences in multitalker babble.

