Speech Stimuli
Recently Published Documents

TOTAL DOCUMENTS: 401 (FIVE YEARS: 94)
H-INDEX: 41 (FIVE YEARS: 3)

2021, pp. 1-15
Author(s): Pierre Reynard, Josée Lagacé, Charles-Alexandre Joly, Léon Dodelé, Evelyne Veuillet, et al.

Background: Difficulty understanding speech in background noise is the main reason for consultation among people who seek help for their hearing. With the increased use of speech-in-noise (SpIN) testing, audiologists and otologists are expected to document disability in a greater number of patients with sensorineural hearing loss. The purpose of this study was to list the validated SpIN tests available for the French-speaking population. Summary: A review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The PubMed and Scopus databases were searched using a combination of 4 keywords: speech, audiometry, noise, and French. Ten validated SpIN tests dedicated to the Francophone adult population were available at the time of the review. Some use digit triplets as speech stimuli and were originally designed for hearing screening. The others cover a broader range of indications, including diagnosis and research, determination of functional capacity and fitness for duty, and assessment of hearing-amplification benefit. Key Messages: Because there is a SpIN test for almost every clinical or rehabilitation need, both accuracy and test duration should be considered when choosing among them. To meet the needs of a rapidly aging population, fast adaptive procedures can be favored for screening large groups, limiting the risk of missing the early signs of presbycusis and allowing appropriate audiological counseling.
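Several of the tests mentioned above use digit triplets with fast adaptive procedures for screening. Purely as an illustration of that kind of procedure, and not as a description of any specific validated French test, here is a minimal sketch of a 1-up/1-down adaptive SNR staircase for a digits-in-noise screen; the starting SNR, step size, trial count, and scoring rule are assumptions, and `present_trial` is a hypothetical callback supplied by the experimenter.

```python
import random

def digit_triplet_screen(present_trial, n_trials=24, start_snr_db=0.0, step_db=2.0):
    """Minimal 1-up/1-down adaptive SNR staircase for a digits-in-noise screen.

    present_trial(triplet, snr_db) must play the three digits at the given SNR
    and return True if the listener repeats all three correctly.
    All parameter values are illustrative, not taken from any validated test.
    """
    snr = start_snr_db
    track = []
    for _ in range(n_trials):
        triplet = [random.randint(0, 9) for _ in range(3)]
        correct = present_trial(triplet, snr)
        track.append(snr)
        snr += -step_db if correct else step_db  # harder after a hit, easier after a miss
    # Speech reception threshold (SRT) estimate: mean SNR of the last 10 trials.
    return sum(track[-10:]) / len(track[-10:])
```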


2021, Vol. 15
Author(s): Florine L. Bachmann, Ewen N. MacDonald, Jens Hjortkjær

Linearized encoding models are increasingly employed to model cortical responses to running speech. Recent extensions to subcortical responses suggest clinical applications, potentially complementing the auditory brainstem responses (ABRs) or frequency-following responses (FFRs) that are current clinical standards. However, while it is well known that the auditory brainstem responds both to transient amplitude variations and to the stimulus periodicity that gives rise to pitch, these features co-vary in running speech. Here, we discuss challenges in disentangling the features that drive the subcortical response to running speech. Cortical and subcortical electroencephalographic (EEG) responses to running speech from 19 normal-hearing listeners (12 female) were analyzed. Using forward regression models, we confirm that responses to the rectified broadband speech signal yield temporal response functions consistent with wave V of the ABR, as shown in previous work. Peak latency and amplitude of the speech-evoked brainstem response were correlated with standard click-evoked ABRs recorded at the vertex electrode (Cz). Similar responses could be obtained using the fundamental frequency (F0) of the speech signal as the model predictor. However, simulations indicated that dissociating responses to temporal fine structure at the F0 from responses to broadband amplitude variations is not possible, given the high covariance of the features and the poor signal-to-noise ratio (SNR) of subcortical EEG responses. In cortex, both simulations and data replicated previous findings that envelope tracking on frontal electrodes can be dissociated from responses to slow variations in F0 (relative pitch). Yet, no association between subcortical F0 tracking and cortical responses to relative pitch could be detected. These results indicate that while subcortical speech responses are comparable to click-evoked ABRs, dissociating pitch-related processing in the auditory brainstem may be challenging with natural speech stimuli.
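As a rough illustration of the linearized forward (encoding) approach described above, the sketch below estimates a temporal response function (TRF) by ridge-regularized regression of EEG onto time-lagged copies of a rectified broadband speech predictor. This is not the authors' pipeline; the lag window, regularization strength, and data layout are assumptions.

```python
import numpy as np

def lag_matrix(x, min_lag, max_lag):
    """Design matrix of time-lagged copies of a 1-D predictor (zero-padded)."""
    n = len(x)
    lags = range(min_lag, max_lag + 1)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = x[:n - lag]
        else:
            X[:n + lag, j] = x[-lag:]
    return X

def fit_forward_trf(stimulus, eeg, fs, tmin=-0.005, tmax=0.040, alpha=1e2):
    """Forward TRF (one weight per lag and channel) via ridge regression.

    stimulus : 1-D predictor (e.g., rectified broadband speech), sampled at fs
    eeg      : array of shape (n_samples, n_channels), same fs
    Returns weights of shape (n_lags, n_channels).
    """
    X = lag_matrix(stimulus, int(tmin * fs), int(tmax * fs))
    XtX = X.T @ X + alpha * np.eye(X.shape[1])   # ridge-regularized covariance
    return np.linalg.solve(XtX, X.T @ eeg)
```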


2021, Vol. 15
Author(s): Moïra-Phoebé Huet, Christophe Micheyl, Etienne Parizet, Etienne Gaudrain

During the past decade, several studies have identified electroencephalographic (EEG) correlates of selective auditory attention to speech. In these studies, listeners are typically instructed to focus on one of two concurrent speech streams (the “target”) while ignoring the other (the “masker”). EEG signals are recorded while participants perform this task and are subsequently analyzed to recover the attended stream. An assumption often made in these studies is that the participant’s attention remains focused on the target throughout the test. To check this assumption, and to assess when a participant’s attention in a concurrent speech listening task was directed toward the target, the masker, or neither, we designed a behavioral listen-then-recall task (the Long-SWoRD test). After listening to two simultaneous short stories, participants had to identify, on a computer screen, keywords from the target story randomly interspersed among words from the masker story and words from neither story. To modulate task difficulty, and hence the likelihood of attentional switches, masker stories were originally uttered by the same talker as the target stories; the masker voice parameters were then manipulated to parametrically control the similarity of the two streams, from clearly dissimilar to almost identical. While participants listened to the stories, EEG signals were recorded and subsequently analyzed using a temporal response function (TRF) model to reconstruct the speech stimuli. Responses in the behavioral recall task were used to infer, retrospectively, when attention was directed toward the target, the masker, or neither. During the model-training phase, the results of these behavioral-data-driven inferences were used as inputs to the model, in addition to the EEG signals, to determine whether this additional information would improve stimulus-reconstruction accuracy relative to models trained under the assumption that the listener’s attention was unwaveringly focused on the target. Results from 21 participants show that information regarding the actual, as opposed to assumed, attentional focus can be used advantageously during model training to enhance the subsequent (test-phase) accuracy of EEG-based auditory stimulus reconstruction. This is especially the case in challenging listening situations, where participants’ attention is less likely to remain focused entirely on the target talker. In situations where the two competing voices are clearly distinct and easily separated perceptually, the assumption that listeners are able to stay focused on the target is reasonable. The behavioral recall protocol introduced here provides experimenters with a means to behaviorally track fluctuations in auditory selective attention, including in combined behavioral and neurophysiological studies.
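For orientation, the sketch below shows the backward (stimulus-reconstruction) counterpart of a TRF analysis: a ridge-regularized decoder maps time-lagged multichannel EEG back to a speech envelope, and reconstruction accuracy is scored as a Pearson correlation. This is a generic illustration, not the Long-SWoRD analysis; the number of lags and the regularization value are assumptions.

```python
import numpy as np

def lagged_eeg(eeg, n_lags):
    """Stack time-lagged copies of every EEG channel into one design matrix."""
    n, ch = eeg.shape
    X = np.zeros((n, ch * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * ch:(lag + 1) * ch] = eeg[:n - lag]
    return X

def train_decoder(eeg, envelope, n_lags=32, alpha=1e3):
    """Fit a backward model that reconstructs the attended envelope from EEG."""
    X = lagged_eeg(eeg, n_lags)
    XtX = X.T @ X + alpha * np.eye(X.shape[1])   # ridge regularization
    return np.linalg.solve(XtX, X.T @ envelope)

def reconstruction_accuracy(eeg, envelope, weights, n_lags=32):
    """Pearson correlation between the reconstructed and the actual envelope."""
    recon = lagged_eeg(eeg, n_lags) @ weights
    return np.corrcoef(recon, envelope)[0, 1]
```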


PLoS ONE, 2021, Vol. 16 (11), pp. e0260090
Author(s): Emanuele Perugia, Ghada BinKhamis, Josef Schlittenlacher, Karolina Kluk

Current clinical strategies to assess benefit from hearing aids (HAs) are based on self-reported questionnaires and speech-in-noise (SIN) tests, which require behavioural cooperation. In contrast, objective measures based on auditory brainstem responses (ABRs) to speech stimuli would not require the individual's cooperation. Here, we re-analysed an existing dataset to predict behavioural measures from speech-ABRs using regression trees. Ninety-two HA users completed a self-reported questionnaire (SSQ-Speech) and performed two aided SIN tests: sentences in noise (BKB-SIN) and vowel-consonant-vowels (VCV) in noise. Speech-ABRs were evoked by a 40 ms [da] and recorded in 2 × 2 conditions: aided vs. unaided and quiet vs. background noise. For each recording condition, two sets of features were extracted: 1) amplitudes and latencies of speech-ABR peaks, and 2) amplitudes and latencies of speech-ABR F0 encoding. Two regression trees were fitted for each of the three behavioural measures, one per feature set, with age, forward and backward digit span, and pure tone average (PTA) as additional candidate predictors. PTA was the only predictor selected in the SSQ-Speech trees. In the BKB-SIN trees, performance was predicted by the aided latency of peak F in quiet for participants with PTAs between 43 and 61 dB HL. In the VCV trees, performance was predicted by the aided F0-encoding latency and the aided amplitude of peak VA in quiet for participants with PTAs ≤ 47 dB HL. These findings indicate that PTA was more informative than any speech-ABR measure, as the latter were relevant only for a subset of the participants. Therefore, speech-ABRs evoked by a 40 ms [da] are not a clinical predictor of behavioural measures in HA users.
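To make the modelling step concrete, here is a minimal sketch of fitting a regression tree to predict an aided behavioural score from speech-ABR features together with age, digit span, and PTA, in the spirit of the analysis described above. The scikit-learn calls are standard, but the placeholder data, feature ordering, and tree settings are assumptions rather than the authors' code.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical feature table, one row per hearing-aid user:
# [peak V latency, peak V amplitude, F0-encoding latency, age,
#  digit span forward, digit span backward, PTA]
rng = np.random.default_rng(0)
X = rng.normal(size=(92, 7))     # placeholder predictors for 92 participants
y = rng.normal(size=92)          # placeholder behavioural scores (e.g., BKB-SIN)

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=10, random_state=0)
cv_error = -cross_val_score(tree, X, y, cv=5,
                            scoring="neg_mean_absolute_error").mean()
tree.fit(X, y)
print(f"cross-validated mean absolute error: {cv_error:.2f}")
```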


2021
Author(s): Laurianne Cabrera, Bonnie K. Lau

The processing of auditory temporal information is important for extracting voice pitch and linguistic information, as well as the overall temporal structure of speech. However, many aspects of its early development remain poorly understood. This paper reviews the development of different aspects of auditory temporal processing during the first year of life, when infants are acquiring their native language. First, potential mechanisms of neural immaturity are discussed in the context of neurophysiological studies. Next, what is known about infant auditory capabilities is considered, with a focus on psychophysical studies that use non-speech stimuli to investigate the perception of temporal fine structure and envelope cues. This is followed by a review of studies involving speech stimuli, including those that present vocoded signals as a method of degrading the spectro-temporal information available to infant listeners. Finally, we highlight key findings from the cochlear implant literature that illustrate the importance of temporal cues in speech perception.
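Because the review highlights vocoded signals as a way of degrading spectro-temporal information, the following minimal sketch shows how a basic noise vocoder works: the signal is split into frequency bands, the temporal envelope of each band is extracted and used to modulate band-limited noise, and the temporal fine structure is discarded. The band count, filter order, and envelope cutoff are assumptions, and published vocoders differ in these details.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_bands=8, f_lo=100.0, f_hi=8000.0, env_cutoff=50.0):
    """Keep per-band envelopes, replace temporal fine structure with noise.

    Assumes fs is comfortably above 2 * f_hi so all band edges are valid.
    """
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)               # log-spaced band edges
    env_sos = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros_like(speech, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, speech)
        envelope = sosfiltfilt(env_sos, np.abs(hilbert(band)))  # smoothed envelope
        carrier = sosfiltfilt(band_sos, np.random.randn(len(speech)))
        out += np.clip(envelope, 0.0, None) * carrier
    return out / np.max(np.abs(out))                             # peak normalization
```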


Author(s): Luodi Yu, Jiajing Zeng, Suiping Wang, Yang Zhang

Purpose This study aimed to examine whether abstract knowledge of word-level linguistic prosody is independent of or integrated with phonetic knowledge. Method Event-related potential (ERP) responses were measured from 18 adult listeners while they listened to native and nonnative word-level prosody in speech and in nonspeech. The prosodic phonology (speech) conditions included disyllabic pseudowords spoken in Chinese and in English, matched for syllabic structure, duration, and intensity. The prosodic acoustic (nonspeech) conditions were hummed versions of the speech stimuli, which eliminated the phonetic content while preserving the acoustic prosodic features. Results We observed a language-specific effect on the ERP: native stimuli elicited a larger late negative response (LNR) amplitude than nonnative stimuli in the prosodic phonology conditions. However, no such effect was observed in the phoneme-free prosodic acoustic control conditions. Conclusions The results support the integration view, namely that word-level linguistic prosody likely relies on the phonetic content in which the acoustic cues are embedded. It remains to be examined whether the LNR may serve as a neural signature for language-specific processing of prosodic phonology, beyond auditory processing of the critical acoustic cues at the suprasyllabic level.


Sensors, 2021, Vol. 21 (21), pp. 7044
Author(s): Leah Fostick, Nir Fink

The purpose of the current study was to test sound localization of a spoken word, which has rarely been studied in the context of localization, compared to pink noise and a gunshot, while taking into account the source position and the effect of different hearing protection devices (HPDs) used by the listener. Ninety participants were divided into three groups using different HPDs. Participants were tested twice, with and without HPDs, and were asked to localize the different stimuli, each delivered from one of eight speakers evenly distributed around them (starting at 22.5°). Localization of the word stimulus was more difficult than that of the other stimuli. HPD usage resulted in a larger mean root-mean-square error (RMSE) and more mirror-image reversal errors for all stimuli. In addition, HPD usage increased the mean RMSE and mirror-image reversal errors more for stimuli delivered from the front and back than for stimuli delivered from the left and right. HPDs affect localization, both through attenuation and through limitation of pinna cues when earmuffs are worn. The difficulty of localizing a spoken word should be considered when assessing auditory functionality, and should be investigated further with HPDs of different attenuation spectra and levels and with further types of speech stimuli.
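For readers less familiar with the error measures above, this short sketch computes the root-mean-square (RMS) angular error with wrap-around on the circle and counts front-back mirror-image reversals for a speaker ring. The reversal criterion used here (response closer to the source's reflection across the interaural axis than to the source itself) is one common convention and not necessarily the exact criterion used in the study.

```python
import numpy as np

def angular_error(resp_deg, src_deg):
    """Smallest signed angle (degrees) between response and source azimuth."""
    return (np.asarray(resp_deg) - np.asarray(src_deg) + 180.0) % 360.0 - 180.0

def localization_rmse(resp_deg, src_deg):
    """Root-mean-square localization error over all trials."""
    return float(np.sqrt(np.mean(angular_error(resp_deg, src_deg) ** 2)))

def mirror_reversals(resp_deg, src_deg):
    """Count trials where the response lies closer to the front-back mirror
    of the source (reflection across the 90-270 degree interaural axis)
    than to the source itself."""
    src = np.asarray(src_deg)
    mirror = (180.0 - src) % 360.0
    closer_to_mirror = (np.abs(angular_error(resp_deg, mirror))
                        < np.abs(angular_error(resp_deg, src)))
    return int(np.sum(closer_to_mirror))

# Example: sources at 22.5° and 157.5°, responses at 30° and 20° (one reversal).
print(localization_rmse([30.0, 20.0], [22.5, 157.5]),
      mirror_reversals([30.0, 20.0], [22.5, 157.5]))
```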


2021, Vol. 12
Author(s): Chloe Jones, Elizabeth Collin, Olga Kepinska, Roeland Hancock, Jocelyn Caballero, et al.

Perception of low-level auditory cues such as frequency modulation (FM) and rise time (RT) is crucial for the development of phonemic representations, segmentation of word boundaries, and attunement to prosodic patterns in language. While learning an additional language, children may develop increased sensitivity to these cues to extract relevant information from multiple types of linguistic input. How children learning another language perform on auditory processing tasks such as FM and RT perception is, however, unknown. Here we examine 92 English-speaking 7- to 8-year-olds in the U.S., comparing performance on FM and RT perceptual tasks at the end of the second year of Cantonese or Spanish dual-language immersion with that of children in general English education programs. Results demonstrate that children in immersion programs have greater sensitivity to FM, but not RT, controlling for various factors. The immersion-program students also showed better phonological awareness. However, individual differences in FM sensitivity were not associated with phonological awareness, an association typically observed in monolinguals. These preliminary findings suggest a possible impact of formal language immersion on low-level auditory processing. Additional research is warranted to understand causal relationships and the ultimate impact on language skills in multilinguals.


2021, Vol. 11 (1)
Author(s): Iris Berent, Irene de la Cruz-Pavía, Diane Brentari, Judit Gervain

Infants readily extract linguistic rules from speech. Here, we ask whether this advantage extends to linguistic stimuli that do not rely on the spoken modality. To address this question, we first examine whether infants can differentially learn rules from linguistic signs. We show that, despite having no previous experience with a sign language, six-month-old infants can extract the reduplicative rule (AA) from dynamic linguistic signs, and the neural response to reduplicative linguistic signs differs from reduplicative visual controls, matched for the dynamic spatiotemporal properties of signs. We next demonstrate that the brain response for reduplicative signs is similar to the response to reduplicative speech stimuli. Rule learning, then, apparently depends on the linguistic status of the stimulus, not its sensory modality. These results suggest that infants are language-ready. They possess a powerful rule system that is differentially engaged by all linguistic stimuli, speech or sign.


2021, Vol. 64 (10), pp. 4014-4029
Author(s): Kathy R. Vander Werff, Christopher E. Niemczak, Kenneth Morse

Purpose Background noise has been categorized as energetic masking, due to spectrotemporal overlap of the target and masker at the auditory periphery, or informational masking, due to cognitive-level interference from relevant content such as speech. The effects of masking on cortical and sensory auditory processing can be objectively studied with the cortical auditory evoked potential (CAEP). However, whether effects on neural response morphology are due to energetic spectrotemporal differences or to informational content is not fully understood. The current multi-experiment series was designed to assess the effects of speech versus nonspeech maskers on the neural encoding of speech information in the central auditory system, specifically the effects of speech babble maskers varying in talker number. Method CAEPs were recorded from normal-hearing young adults in response to speech syllables in the presence of energetic maskers (white or speech-shaped noise) and varying amounts of informational masking (speech babble maskers). The primary manipulation of informational masking was the number of talkers in the speech babble, and CAEP results were compared to those obtained with nonspeech maskers of different temporal and spectral characteristics. Results Even when the nonspeech noise maskers were spectrally shaped and temporally modulated to match the speech babble maskers, notable changes in the typical morphology of the CAEP in response to speech stimuli were identified in the presence of both the primarily energetic maskers and the speech babble maskers with varying numbers of talkers. Conclusions While differences in CAEP outcomes did not reach significance by number of talkers, neural components were significantly affected by speech babble maskers compared to nonspeech maskers. These results suggest an informational masking influence on the neural encoding of speech information at the sensory cortical level of auditory processing, even without active participation on the part of the listener.
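As context for how a nonspeech masker can be "spectrally shaped and temporally modulated" to match speech babble, here is a minimal sketch that imposes the long-term magnitude spectrum of a babble recording on random-phase noise and then applies the babble's smoothed broadband envelope. This is a generic technique sketch, not the authors' stimulus-generation code; the envelope cutoff and normalization are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def speech_shaped_modulated_noise(babble, fs, env_cutoff=16.0):
    """Noise with the long-term spectrum and broadband envelope of `babble`."""
    n = len(babble)
    # Spectral shaping: keep the babble magnitude spectrum, randomize the phase.
    mag = np.abs(np.fft.rfft(babble))
    phase = np.exp(1j * 2.0 * np.pi * np.random.rand(len(mag)))
    phase[0] = 1.0          # keep DC and Nyquist bins real
    phase[-1] = 1.0
    shaped = np.fft.irfft(mag * phase, n=n)
    # Temporal modulation: apply the babble's smoothed broadband envelope.
    sos = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    envelope = sosfiltfilt(sos, np.abs(hilbert(babble)))
    noise = shaped * np.clip(envelope, 0.0, None)
    return noise / np.max(np.abs(noise))     # peak normalization
```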

