speech stream
Recently Published Documents

TOTAL DOCUMENTS: 76 (FIVE YEARS: 17)
H-INDEX: 18 (FIVE YEARS: 2)

2021 ◽ pp. 26-61 ◽ Author(s): Sylvia Sierra

While scholars have explored the importance of quoting media in accomplishing relationship and identity work in conversation, there is little work on how people actually phonetically and paralinguistically signal media references in the speech stream. This chapter demonstrates how speakers make 148 media references recognizable across five audio-recorded everyday conversations among Millennial friends in their late twenties. Five ways that media references are signaled in talk are identified: word stress and intonation, pitch shifts, smiling and laughter, performing stylized accents, and singing. This systematic analysis of the contextualization cues used to signal media references in everyday talk contributes to understanding how speakers participate in intertextual processes. This chapter also introduces how signaling playful media references often (but not always) serves to negotiate epistemic, or knowledge, imbalances as well as interactional dilemmas, or awkward and unpleasant moments in interaction; this will be explored in more detail in chapters 4 and 5. Also woven in are analyses of the identity work being constructed with the media references, as well as of the media stereotypes that are repeated in some of them.


2021 ◽ Vol 10 ◽ Author(s): Iris Broedelet, Paul Boersma, Judith Rispens

Since Saffran, Aslin and Newport (1996) showed that infants were sensitive to transitional probabilities between syllables after being exposed to a few minutes of fluent speech, there has been ample research on statistical learning. Word segmentation studies usually test learning by making use of “offline methods” such as forced-choice tasks. However, cognitive factors besides statistical learning possibly influence performance on those tasks. The goal of the present study was to improve a method for measuring word segmentation online. Click sounds were added to the speech stream, both between words and within words. Stronger expectations for the next syllable within words as opposed to between words were expected to result in slower detection of clicks within words, revealing sensitivity to word boundaries. Unexpectedly, we did not find evidence for learning in multiple groups of adults and child participants. We discuss possible methodological factors that could have influenced our results.
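
As a concrete illustration of the statistical regularity involved, the sketch below computes transitional probabilities between adjacent syllables in a Saffran-style concatenated stream. The toy lexicon, syllables, and stream length are invented placeholders, not the study's stimuli.

```python
# A minimal sketch, assuming a Saffran-style artificial language:
# the words and syllables below are hypothetical, not the study's materials.
import random
from collections import Counter
from itertools import pairwise  # Python 3.10+

words = [("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku")]  # invented trisyllabic words
stream = [syl for _ in range(200) for syl in random.choice(words)]    # concatenated "fluent" stream

bigrams = Counter(pairwise(stream))
firsts = Counter(stream[:-1])

def transitional_probability(a, b):
    """P(next syllable = b | current syllable = a), estimated from the stream."""
    return bigrams[(a, b)] / firsts[a]

# Within-word transitions approach 1.0; transitions across a word boundary hover
# around 1/3 with three words. The click-detection logic rests on this contrast:
# a click interrupting a highly predictable (within-word) transition should be
# detected more slowly than one at a low-probability (between-word) transition.
print(transitional_probability("tu", "pi"))   # within-word
print(transitional_probability("ro", "go"))   # across a boundary
```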


PLoS ONE ◽ 2021 ◽ Vol 16 (6) ◽ pp. e0253039 ◽ Author(s): Viridiana L. Benitez, Jenny R. Saffran

To acquire the words of their language, learners face the challenge of tracking regularities at multiple levels of abstraction from continuous speech. In the current study, we examined adults’ ability to track two types of regularities from a continuous artificial speech stream: the individual words in the speech stream (token level information), and a phonotactic pattern shared by a subset of those words (type level information). We additionally manipulated exposure time to the language to examine the relationship between the acquisition of these two regularities. Using a ratings test procedure, we found that adults can extract both the words in the language and their phonotactic patterns from continuous speech in as little as 3.5 minutes of listening time. Results from a 2AFC testing method provide converging evidence that adults rapidly learn both words and their phonotactic patterns. Together, the findings suggest that adults are capable of concurrently tracking regularities at multiple levels of abstraction from brief exposures to a continuous stream of speech.
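
To make the token/type distinction concrete, here is a small hypothetical sketch of how test items could be scored at the two levels described above. The lexicon, the shared onset pattern, and the test items are invented and do not reproduce the study's materials or its ratings/2AFC procedures.

```python
# Token level: was this exact word present in the exposure language?
# Type level: does the item conform to the phonotactic pattern shared by a
# subset of the words? (Here, a hypothetical shared onset "p".)
WORDS = {"podi", "pogu", "pabe", "tiku"}   # invented exposure lexicon
PATTERN_ONSET = "p"                        # invented shared phonotactic property

def token_familiarity(item):
    """1.0 if the item is literally a word of the exposure language."""
    return 1.0 if item in WORDS else 0.0

def type_familiarity(item):
    """1.0 if the item matches the phonotactic regularity (onset 'p')."""
    return 1.0 if item.startswith(PATTERN_ONSET) else 0.0

for item in ["podi", "pemu", "tiku", "kemu"]:
    print(item, token_familiarity(item), type_familiarity(item))
# A learner tracking both levels should rate "pemu" (novel but pattern-conforming)
# above "kemu" (novel and pattern-violating), and familiar words above both.
```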


2021 ◽ Vol 6 (1) ◽ pp. 966 ◽ Author(s): Abram Clear, Anya Hogoboom

Formant transitions from a high front vowel to a non-high, non-front vowel mimic the formant signature of a canonical [j], resulting in the perception of an acoustic glide (Hogoboom 2020). We ask whether listeners may still perceive a glide when canonical formant transitions are absent. We investigated the mapping of an Appalachian English (AE) monophthongal [aɪ] in hiatus sequences, monophthongal [aɪ.a]. If participants map this monophthongal [aɪ] to a high front position, they might perceive a glide that is not supported by the acoustic signal, which we call a phantom glide. Ninety-six participants (45 of whom were native AE speakers) heard 30 different English words ending in [i], [ə], or monophthongal [aɪ] (i.e. tree, coma, pie) that had been suffixed with either [-a] or [-ja]. They were asked to identify which suffixed form they heard. Participants in both dialect groups sometimes perceived a glide that was truly absent from the speech stream. In these cases, participants mapped static formants in monophthongal [aɪ.a] stimuli to a diphthongal /aɪ/ with a high front endpoint, causing the perception of the necessary F1 fall and subsequent rise of a [j]. Using recent models of speech processing, which encode both social and acoustic representations of speech (e.g. Sumner et al. 2014), we discuss the mapping of monophthongal [aɪ] to a privileged diphthongal underlying form.
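
A rough numeric sketch of the acoustic reasoning: a diphthongal /aɪ/ endpoint supplies a high front target (F1 fall), and the transition from that target into a following [a] then produces the F1 rise and F2 fall associated with a canonical [j]; a monophthongal [aɪ] supplies neither. The formant values below are textbook approximations, not measurements from the study.

```python
# Interpolating F1/F2 between vowel targets to show the [j]-like trajectory
# created by a high-front offglide; values are approximate and illustrative only.
import numpy as np

FORMANTS = {"i": (300, 2300), "a": (750, 1200)}   # (F1, F2) in Hz, rough textbook values

def transition(v_from, v_to, steps=5):
    """Linear formant trajectory from one vowel target to another."""
    return np.linspace(FORMANTS[v_from], FORMANTS[v_to], steps)

# Diphthongal [aɪ] + [a]: the offglide reaches an [i]-like target, so the
# following transition into [a] mimics a canonical [ja] sequence (F1 rises,
# F2 falls). A monophthongal [aɪ] stays low/central, so this trajectory is
# absent from the signal; any glide heard there is the "phantom glide" above.
print(transition("i", "a"))
```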


2021 ◽ Author(s): Mahmoud Keshavarzi, Enrico Varano, Tobias Reichenbach

Abstract
Understanding speech in background noise is a difficult task. The tracking of speech rhythms such as the rate of syllables and words by cortical activity has emerged as a key neural mechanism for speech-in-noise comprehension. In particular, recent investigations have used transcranial alternating current stimulation (tACS) with the envelope of a speech signal to influence the cortical speech tracking, demonstrating that this type of stimulation modulates comprehension and therefore evidencing a functional role of the cortical tracking in speech processing. Cortical activity has been found to track the rhythms of a background speaker as well, but the functional significance of this neural response remains unclear. Here we employ a speech-comprehension task with a target speaker in the presence of a distractor voice to show that tACS with the speech envelope of the target voice as well as tACS with the envelope of the distractor speaker both modulate the comprehension of the target speech. Because the envelope of the distractor speech does not carry information about the target speech stream, the modulation of speech comprehension through tACS with this envelope evidences that the cortical tracking of the background speaker affects the comprehension of the foreground speech signal. The phase dependency of the resulting modulation of speech comprehension is, however, opposite to that obtained from tACS with the envelope of the target speech signal. This suggests that the cortical tracking of the ignored speech stream and that of the attended speech stream may compete for neural resources.

Significance Statement
Loud environments such as busy pubs or restaurants can make conversation difficult. However, they also allow us to eavesdrop on other conversations that occur in the background. In particular, we often notice when somebody else mentions our name, even if we have not been listening to that person. However, the neural mechanisms by which background speech is processed remain poorly understood. Here we employ transcranial alternating current stimulation, a technique through which neural activity in the cerebral cortex can be influenced, to show that cortical responses to rhythms in the distractor speech modulate the comprehension of the target speaker. Our results evidence that the cortical tracking of background speech rhythms plays a functional role in speech processing.
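
For readers unfamiliar with the stimulus used in envelope-tACS work, the sketch below shows one common way to derive a slow speech envelope from an audio signal (Hilbert transform plus low-pass filtering). It is an assumed pipeline for illustration, not the authors' exact processing, and the input signal is a random placeholder.

```python
# Deriving a syllable-rate speech envelope that could serve as an
# envelope-tACS template; parameters are assumptions, not the study's values.
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def speech_envelope(audio, fs, cutoff_hz=10.0):
    """Return a slow amplitude envelope of the audio signal."""
    env = np.abs(hilbert(audio))                 # instantaneous amplitude
    b, a = butter(4, cutoff_hz / (fs / 2))       # low-pass at ~10 Hz (syllable rates)
    return filtfilt(b, a, env)

# Shifting the phase of this envelope relative to the acoustic stimulus is what
# allows such studies to probe phase-dependent effects on comprehension.
fs = 16000
audio = np.random.randn(fs * 2)                  # placeholder for a real recording
print(speech_envelope(audio, fs).shape)
```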


2021 ◽ Vol 15 ◽ Author(s): Björn Holtze, Manuela Jaeger, Stefan Debener, Kamil Adiloğlu, Bojana Mirkovic

Difficulties in selectively attending to one among several speakers have mainly been associated with the distraction caused by ignored speech. Thus, in the current study, we investigated the neural processing of ignored speech in a two-competing-speaker paradigm. For this, we recorded participants' brain activity using electroencephalography (EEG) to track the neural representation of the attended and ignored speech envelopes. To provoke distraction, we occasionally embedded the participant's first name in the ignored speech stream. Retrospective reports as well as the presence of a P3 component in response to the name indicate that participants noticed the occurrence of their name. As predicted, the neural representation of the ignored speech envelope increased after the name was presented therein, suggesting that the name had attracted the participants' attention. Interestingly, and contrary to our hypothesis, the neural tracking of the attended speech envelope also increased after the name occurrence. We therefore conclude that the name might have distracted participants only briefly, if at all, and instead alerted them to refocus on their actual task. These observations remained robust even when the sound intensity of the ignored speech stream, and thus of the name, was attenuated.
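
One common way to quantify the "neural representation of the speech envelope" is a backward (stimulus-reconstruction) model that maps EEG channels onto an envelope and compares how well the reconstruction correlates with the attended versus ignored envelopes. The sketch below illustrates that idea with random placeholder data and a plain ridge regression; real analyses typically add time-lagged copies of the EEG and cross-validate, and the study's estimator may differ.

```python
# Toy stimulus-reconstruction (backward) model; all data are random placeholders.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_samples, n_channels = 5000, 32
eeg = rng.standard_normal((n_samples, n_channels))    # placeholder EEG (time x channels)
attended_env = rng.standard_normal(n_samples)         # placeholder attended-speech envelope
ignored_env = rng.standard_normal(n_samples)          # placeholder ignored-speech envelope

# Fit a linear mapping from EEG to the attended envelope on one half of the
# data and evaluate on the other half (real pipelines also add time lags).
half = n_samples // 2
model = Ridge(alpha=1.0).fit(eeg[:half], attended_env[:half])
reconstruction = model.predict(eeg[half:])

track_attended = np.corrcoef(reconstruction, attended_env[half:])[0, 1]
track_ignored = np.corrcoef(reconstruction, ignored_env[half:])[0, 1]
print(track_attended, track_ignored)   # with real data, attended > ignored
```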


Author(s): Francisco Velasco-Álvarez, Álvaro Fernández-Rodríguez, M. Teresa Medina-Juliá, Ricardo Ron-Angevin

2021 ◽ Vol 3 (3) ◽ Author(s): Marc Vander Ghinst, Mathieu Bourguignon, Vincent Wens, Gilles Naeije, Cecile Ducène, ...

Abstract
Impaired speech perception in noise despite normal peripheral auditory function is a common problem in young adults. Despite a growing body of research, the pathophysiology of this impairment remains unknown. This magnetoencephalography study characterizes the cortical tracking of speech in a multi-talker background in a group of highly selected adult subjects with impaired speech perception in noise without peripheral auditory dysfunction. Magnetoencephalographic signals were recorded from 13 subjects with impaired speech perception in noise (six females, mean age: 30 years) and matched healthy subjects while they were listening to 5 different recordings of stories merged with a multi-talker background at different signal-to-noise ratios (No Noise, +10, +5, 0 and −5 dB). The cortical tracking of speech was quantified with coherence between magnetoencephalographic signals and the temporal envelope of (i) the global auditory scene (i.e. the attended speech stream and the multi-talker background noise), (ii) the attended speech stream only and (iii) the multi-talker background noise. Functional connectivity was then estimated between brain areas showing altered cortical tracking of speech in noise in subjects with impaired speech perception in noise and the rest of the brain. All participants demonstrated a selective cortical representation of the attended speech stream in noisy conditions, but subjects with impaired speech perception in noise displayed reduced cortical tracking of speech at the syllable rate (i.e. 4–8 Hz) in all noisy conditions. Increased functional connectivity was observed in subjects with impaired speech perception in noise, in both the noiseless and speech-in-noise conditions, between supratemporal auditory cortices and left-dominant brain areas involved in semantic and attention processes. The difficulty in understanding speech in a multi-talker background in subjects with impaired speech perception in noise thus appears to be related to inaccurate auditory cortex tracking of speech at the syllable rate. The increased functional connectivity between supratemporal auditory cortices and language/attention-related neocortical areas probably serves to support speech perception and subsequent recognition in adverse auditory scenes. Overall, this study argues for a central origin of impaired speech perception in noise in the absence of any peripheral auditory dysfunction.
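
The cortical-tracking measure described here is a spectral coherence between MEG signals and the speech envelope, summarized in the syllable-rate band. Below is a minimal sketch with random placeholder data; the sampling rate and windowing are assumptions, not the study's parameters.

```python
# Magnitude-squared coherence between one MEG sensor and a speech envelope,
# averaged over the 4-8 Hz (syllable-rate) band; data are random placeholders.
import numpy as np
from scipy.signal import coherence

fs = 200.0                                    # assumed sampling rate after downsampling
rng = np.random.default_rng(1)
meg = rng.standard_normal(int(fs * 60))       # placeholder: one minute of one sensor
envelope = rng.standard_normal(int(fs * 60))  # placeholder: attended-speech envelope

freqs, coh = coherence(meg, envelope, fs=fs, nperseg=int(fs * 2))
syllable_band = (freqs >= 4) & (freqs <= 8)
print(coh[syllable_band].mean())              # reduced in the impaired group per the abstract
```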


Open Mind ◽ 2020 ◽ Vol 4 ◽ pp. 1-12 ◽ Author(s): Adam King, Andrew Wedel

There has been much work over the last century on optimization of the lexicon for efficient communication, with a particular focus on the form of words as an evolving balance between production ease and communicative accuracy. Zipf’s law of abbreviation, the cross-linguistic trend for less-probable words to be longer, represents some of the strongest evidence that the lexicon is shaped by a pressure for communicative efficiency. However, the various sounds that make up words do not all contribute the same amount of disambiguating information to a listener. Rather, the information a sound contributes depends in part on what specific lexical competitors exist in the lexicon. In addition, because the speech stream is perceived incrementally, early sounds in a word contribute on average more information than later sounds. Using a dataset of diverse languages, we demonstrate that, above and beyond containing more sounds, less-probable words contain sounds that convey more disambiguating information overall. We show further that this pattern tends to be strongest at word-beginnings, where sounds can contribute the most information.
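
The quantity at issue, how much disambiguating information a sound contributes given the lexical competitors still consistent with the word so far, can be sketched as an incremental, cohort-style surprisal. The mini-lexicon and frequencies below are invented for illustration; the paper's corpora and estimators are more elaborate.

```python
# Per-segment surprisal given the preceding segments, computed over a toy
# frequency-weighted lexicon (all words and counts are hypothetical).
import math

LEXICON = {"cat": 50, "cap": 30, "can": 60, "dog": 80}   # word -> frequency

def segment_information(word):
    """Surprisal (bits) of each segment given the segments heard so far."""
    infos = []
    for i, seg in enumerate(word):
        prefix = word[:i]
        cohort = {w: f for w, f in LEXICON.items() if w.startswith(prefix)}
        consistent = {w: f for w, f in cohort.items() if w.startswith(prefix + seg)}
        p = sum(consistent.values()) / sum(cohort.values())
        infos.append(-math.log2(p))
    return infos

print(segment_information("cat"))
# Early segments rule out more competitors and so tend to carry more bits;
# the paper's finding is that low-probability words pack more such information,
# especially word-initially.
```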


2020 ◽ Vol 5 (1) ◽ pp. 599 ◽ Author(s): Megan Rouch, Anya Lunden

The right edge of the word is a known domain for processes like phonological devoicing. This has been argued to be the effect of analogy from higher prosodic domains, rather than an in situ motivated change (Hock 1999, Hualde and Eager 2016). Phonetic word-level phenomena of final lengthening and final devoicing have been found to occur natively word-finally (Lunden 2006, 2017, Nakai et al. 2009) despite claims that they have no natural phonetic pressure originating in this position (Hock 1999). We present the results of artificial language learning studies that seek to answer the question of whether phonetic-level cues to the word-final position can aid in language parsing. If they do, it provides evidence that listeners can make use of word-level phonetic phenomena, which, together with studies that have found them to be present, speaks to their inherent presence at the word level. We find that adult listeners are better able to recognize the words they heard in a speech stream, and better able to reject words that they did not hear, when final lengthening was present at the right edge of the word. Final devoicing was not found to give the same boost to parsing.
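
A toy sketch of the manipulation: build a syllable stream from invented words, lengthen word-final syllables, and let a simple parser posit a boundary after any lengthened syllable. All words, durations, and thresholds are hypothetical; the point is only how final lengthening can serve as a boundary cue for segmentation.

```python
# Artificial-language stream with word-final lengthening, plus a parser that
# treats lengthening as a word-boundary cue; all values are invented.
import random

WORDS = [["ba", "du", "ki"], ["mo", "te", "gu"], ["ne", "pa", "so"]]  # invented trisyllabic words
BASE_MS, FINAL_STRETCH = 200, 1.5                                     # assumed durations

def build_stream(n_words=50, lengthen_final=True):
    stream = []
    for _ in range(n_words):
        word = random.choice(WORDS)
        for i, syl in enumerate(word):
            dur = BASE_MS * (FINAL_STRETCH if lengthen_final and i == len(word) - 1 else 1)
            stream.append((syl, dur))
    return stream

def parse_by_lengthening(stream, threshold_ms=250):
    """Insert a word boundary after any syllable noticeably longer than the base duration."""
    words, current = [], []
    for syl, dur in stream:
        current.append(syl)
        if dur > threshold_ms:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words

print(parse_by_lengthening(build_stream())[:5])   # recovers the intended words
# Without lengthening, this cue-based parser has nothing to work with, mirroring
# the weaker segmentation the study reports for streams lacking the cue.
```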

