Perceptual strategies in prelingual speech segmentation

1993 ◽  
Vol 20 (2) ◽  
pp. 229-252 ◽  
Author(s):  
Jan V. Goodsitt ◽  
James L. Morgan ◽  
Patricia K. Kuhl

Previous work has suggested that infants may segment continuous speech by a BRACKETING STRATEGY that segregates portions of the speech stream based on prosodic cues to their endpoints. The two present studies were designed to assess whether infants can also deploy a CLUSTERING STRATEGY that exploits asymmetries in transitional probabilities between successive elements, aggregating elements with high transitional probabilities and identifying points of low transitional probabilities as boundaries between units. These studies examined effects of the structure and redundancy of speech context on infants' discrimination of two target syllables using an operant head-turning procedure. After discrimination training on the target syllables in isolation, discrimination maintenance was tested when the target syllables were embedded in one of three contexts. Invariant Order contexts were structured to promote clustering, whereas the Redundant and Variable Order contexts were not. Thirty-six seven-month-olds were tested in Experiment 1, in which stimuli were produced with varying intonation contours; 36 eight-month-olds were tested in Experiment 2, in which stimuli were produced with comparable flat pitch contours. In both experiments, performance of the three groups was equivalent in an initial 20-trial test. However, in a second 20-trial test, significant improvements in performance were shown by infants in the Invariant Order condition. No such gains were shown by infants in the other two conditions. These studies suggest that clustering may complement bracketing in infants' discovery of units of language.
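
The clustering idea described in this abstract (group elements joined by high transitional probabilities, place boundaries where the probability drops) can be illustrated with a minimal sketch. This is not the authors' procedure, which tested infants behaviourally. The syllables, the three-"word" stream, and the 0.75 threshold are all hypothetical values chosen for illustration, and the forward transitional probability is assumed to be estimated as count(current, next) / count(current).

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate forward TP(next | current) from adjacent pairs in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it is followed by another syllable.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold=0.75):
    """Cluster syllables joined by high TPs; split where the TP is below threshold.

    The threshold is an illustrative assumption, not a value from the study.
    """
    if not syllables:
        return []
    tps = transitional_probabilities(syllables)
    units, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tps[(prev, nxt)] < threshold:
            # Low TP: treat this transition as a boundary between units.
            units.append(current)
            current = []
        current.append(nxt)
    units.append(current)
    return units


# Hypothetical three-syllable "words" concatenated in varying order.
A, B, C = ["bi", "da", "ku"], ["pa", "do", "ti"], ["go", "la", "bu"]
stream = A + B + C + A + C + B + A
print(segment(stream))
```

In this toy stream every within-word transition has a probability of 1.0 and every between-word transition has a probability of 0.5, so the threshold separates them cleanly. Real speech would give noisier estimates.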

2021 ◽  
Vol 12 ◽  
Author(s):  
Theresa Matzinger ◽  
Nikolaus Ritt ◽  
W. Tecumseh Fitch

A prerequisite for spoken language learning is segmenting continuous speech into words. Amongst many possible cues to identify word boundaries, listeners can use both transitional probabilities between syllables and various prosodic cues. However, the relative importance of these cues remains unclear, and previous experiments have not directly compared the effects of contrasting multiple prosodic cues. We used artificial language learning experiments, where native German speaking participants extracted meaningless trisyllabic “words” from a continuous speech stream, to evaluate these factors. We compared a baseline condition (statistical cues only) to five test conditions, in which word-final syllables were either (a) followed by a pause, (b) lengthened, (c) shortened, (d) changed to a lower pitch, or (e) changed to a higher pitch. To evaluate robustness and generality we used three tasks varying in difficulty. Overall, pauses and final lengthening were perceived as converging with the statistical cues and facilitated speech segmentation, with pauses helping most. Final-syllable shortening hindered baseline speech segmentation, indicating that when cues conflict, prosodic cues can override statistical cues. Surprisingly, pitch cues had little effect, suggesting that duration may be more relevant for speech segmentation than pitch in our study context. We discuss our findings with regard to the contribution to speech segmentation of language-universal boundary cues vs. language-specific stress patterns.


2017 ◽  
Vol 61 (1) ◽  
pp. 84-96 ◽  
Author(s):  
David M. Gómez ◽  
Peggy Mok ◽  
Mikhail Ordin ◽  
Jacques Mehler ◽  
Marina Nespor

Research has demonstrated distinct roles for consonants and vowels in speech processing. For example, consonants have been shown to support lexical processes, such as the segmentation of speech based on transitional probabilities (TPs), more effectively than vowels. Theory and data so far, however, have considered only non-tone languages, that is to say, languages that lack contrastive lexical tones. In the present work, we provide a first investigation of the role of consonants and vowels in statistical speech segmentation by native speakers of Cantonese, as well as assessing how tones modulate the processing of vowels. Results show that Cantonese speakers are unable to use statistical cues carried by consonants for segmentation, but they can use cues carried by vowels. This difference becomes more evident when considering tone-bearing vowels. Additional data from speakers of Russian and Mandarin suggest that the ability of Cantonese speakers to segment streams with statistical cues carried by tone-bearing vowels extends to other tone languages, but is much reduced in speakers of non-tone languages.


2014 ◽  
Vol 281 (1787) ◽  
pp. 20140480 ◽  
Author(s):  
Michelle J. Spierings ◽  
Carel ten Cate

Variation in pitch, amplitude and rhythm adds crucial paralinguistic information to human speech. Such prosodic cues can reveal information about the meaning or emphasis of a sentence or the emotional state of the speaker. To examine the hypothesis that sensitivity to prosodic cues is language independent and not human specific, we tested prosody perception in a controlled experiment with zebra finches. Using a go/no-go procedure, subjects were trained to discriminate between speech syllables arranged in XYXY patterns with prosodic stress on the first syllable and XXYY patterns with prosodic stress on the final syllable. To systematically determine the salience of the various prosodic cues (pitch, duration and amplitude) to the zebra finches, they were subjected to five tests with different combinations of these cues. The zebra finches generalized the prosodic pattern to sequences that consisted of new syllables and used prosodic features over structural ones to discriminate between stimuli. This strong sensitivity to the prosodic pattern was maintained when only a single prosodic cue was available. The change in pitch was treated as more salient than changes in the other prosodic features. These results show that zebra finches are sensitive to the same prosodic cues known to affect human speech perception.


2015 ◽  
Vol 6 ◽  
Author(s):  
Ruth de Diego-Balaguer ◽  
Antoni Rodríguez-Fornells ◽  
Anne-Catherine Bachoud-Lévi

Author(s):  
Louise Goyet ◽  
Séverine Millotte ◽  
Anne Christophe ◽  
Thierry Nazzi

The present chapter focuses on fluent speech segmentation abilities in early language development. We first review studies exploring the early use of major prosodic boundary cues which allow infants to cut full utterances into smaller-sized sequences like clauses or phrases. We then summarize studies showing that word segmentation abilities emerge around 8 months, and rely on infants’ processing of various bottom-up word boundary cues and top-down known word recognition cues. Given that most of these cues are specific to the language infants are acquiring, we emphasize how the development of these abilities varies cross-linguistically, and explore their developmental origin. In particular, we focus on two cues that might allow bootstrapping of these abilities: transitional probabilities and rhythmic units.


Phonology ◽  
2021 ◽  
Vol 38 (2) ◽  
pp. 203-239
Author(s):  
Eleanor Glewwe

This paper presents the results of a corpus study and an online loanword adaptation experiment examining the tonal adaptation of English loanwords in Mandarin. Using maximum entropy models, I control for the substantial influences of lexical tone distributions and standardisation, and uncover phonological determinants of tone beyond these lexical and conventional factors. The most important phonological determinant of tone in the corpus was English voicing, while in the experiment it was English stress-aligned pitch contours. I argue that these distinct tonal adaptation patterns constitute two different perceptual mappings, one from F0 perturbations to tone and the other from English intonation to tone, both arising due to particular borrowing contexts. I suggest that increasingly close contact between English and Mandarin may lead to more intonation-driven tonal adaptation in the latest wave of borrowing. The maximum entropy approach holds promise for the analysis of complex cases of tonal adaptation in other languages.


Author(s):  
Artur Ferreira Tramontin ◽  
Fernando Klitzke Borszcz ◽  
Vitor Costa

This study investigated the influence of different warm-up protocols on functional threshold power. Twenty-one trained cyclists (V̇O2max = 60.2 ± 6.8 ml·kg⁻¹·min⁻¹) performed an incremental test and four 20-min time trials preceded by different warm-up protocols. Two warm-up protocols lasted 45 min, with a 5-min time trial performed either 15 min (Traditional) or 25 min (Reverse) before the 20-min time trial. The other two warm-up protocols lasted 25 min (High Revolutions-per-minute) and 10 min (Self-selected), including three fast accelerations and self-selected intensity, respectively. The power outputs achieved during the 20-min time trial preceded by the Traditional and Reverse warm-up protocols were significantly lower than those preceded by the High Revolutions-per-minute and Self-selected protocols (256 ± 30; 257 ± 30; 270 ± 30; 270 ± 30 W, respectively). Participants chose a conservative pacing strategy at the onset (negative) for the Traditional and Reverse protocols but implemented a fast-start strategy (U-shaped) for the High Revolutions-per-minute and Self-selected warm-up protocols. In conclusion, 20-min time-trial performance and pacing are affected by different warm-ups. Consequently, the resultant functional threshold power may differ depending on whether the original protocol with a 5-min time trial is followed.


2000 ◽  
Vol 17 (4) ◽  
pp. 461-479 ◽  
Author(s):  
Carol L. Krumhansl

Sensitivity to tone distributions has been proposed as a mechanism underlying tonality induction. This sensitivity is considered in a cross-cultural context using two styles of music, Finnish spiritual folk hymns and North Sami yoiks. Previous research on melodic continuation judgments showed strong correlations with the statistics of the musical style, specifically, the tone distributions and two- and three-tone transitions. This article develops models using these three kinds of statistics to categorize short initial segments as coming from one style or the other. The model using tone distributions was found to make numerous categorization errors, which can be understood because the tone distributions for these styles are similar. However, categorization was better for the models that used two- and three-tone transitions. The major differences between the transitional probabilities in the styles were analyzed, and these differences were used to account for the cases that the models found difficult. These results point to listeners' sensitivity to higher order transition information and its utility for style identification.
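
One plausible way to categorize a short segment by style from two-tone transitions is sketched below. The abstract does not specify how the article's models were built, so everything here is an assumption for illustration: the add-one smoothing, the log-likelihood comparison, the four-tone alphabet, and the toy "stepwise" and "leaping" melodies, which are invented and are not the hymn or yoik data.

```python
import math
from collections import Counter


def bigram_model(melodies, alphabet, alpha=1.0):
    """Smoothed two-tone transition probabilities P(next | current)."""
    pairs, firsts = Counter(), Counter()
    for melody in melodies:
        for a, b in zip(melody, melody[1:]):
            pairs[(a, b)] += 1
            firsts[a] += 1
    k = len(alphabet)
    # Additive smoothing keeps unseen transitions from having zero probability.
    return {(a, b): (pairs[(a, b)] + alpha) / (firsts[a] + alpha * k)
            for a in alphabet for b in alphabet}


def log_likelihood(segment, model):
    """Sum of log transition probabilities along the segment."""
    return sum(math.log(model[(a, b)]) for a, b in zip(segment, segment[1:]))


def categorize(segment, models):
    """Return the style whose transition model makes the segment most likely."""
    return max(models, key=lambda style: log_likelihood(segment, models[style]))


# Toy training data (hypothetical): one style moves by step, the other by leap.
alphabet = ["C", "D", "E", "G"]
models = {
    "stepwise": bigram_model([["C", "D", "E", "D", "C", "D", "E"]], alphabet),
    "leaping": bigram_model([["C", "G", "C", "E", "G", "C", "G"]], alphabet),
}
print(categorize(["C", "D", "E"], models))
print(categorize(["C", "G", "C"], models))
```

Both toy styles use overlapping sets of tones, so a model built on tone frequencies alone would have less to work with than one built on transitions, which is the kind of contrast the abstract reports for its two musical styles.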

