Transitional probabilities and positional frequency phonotactics in a hierarchical model of speech segmentation

Research has demonstrated distinct roles for consonants and vowels in speech processing. For example, consonants have been shown to support lexical processes, such as the segmentation of speech based on transitional probabilities (TPs), more effectively than vowels. Theory and data so far, however, have considered only non-tone languages, that is, languages that lack contrastive lexical tones. In the present work, we provide a first investigation of the role of consonants and vowels in statistical speech segmentation by native speakers of Cantonese, and we assess how tones modulate the processing of vowels. Results show that Cantonese speakers are unable to use statistical cues carried by consonants for segmentation, but they can use cues carried by vowels. This difference becomes more evident when considering tone-bearing vowels. Additional data from speakers of Russian and Mandarin suggest that the ability of Cantonese speakers to segment streams with statistical cues carried by tone-bearing vowels extends to other tone languages, but is much reduced in speakers of non-tone languages.

Processing Continuous Speech in Infancy

10.1093/oxfordhb/9780199601264.013.8 ◽

2016 ◽

Author(s):

Louise Goyet ◽

Séverine Millotte ◽

Anne Christophe ◽

Thierry Nazzi

Keyword(s):

Word Recognition ◽

Language Development ◽

Word Segmentation ◽

Speech Segmentation ◽

Top Down ◽

Early Language ◽

Early Language Development ◽

Early Use ◽

Prosodic Boundary ◽

Transitional Probabilities

The present chapter focuses on fluent speech segmentation abilities in early language development. We first review studies exploring the early use of major prosodic boundary cues which allow infants to cut full utterances into smaller-sized sequences like clauses or phrases. We then summarize studies showing that word segmentation abilities emerge around 8 months, and rely on infants’ processing of various bottom-up word boundary cues and top-down known word recognition cues. Given that most of these cues are specific to the language infants are acquiring, we emphasize how the development of these abilities varies cross-linguistically, and explore their developmental origin. In particular, we focus on two cues that might allow bootstrapping of these abilities: transitional probabilities and rhythmic units.

The Influence of Different Prosodic Cues on Word Segmentation

Frontiers in Psychology ◽

10.3389/fpsyg.2021.622042 ◽

2021 ◽

Vol 12 ◽

Author(s):

Theresa Matzinger ◽

Nikolaus Ritt ◽

W. Tecumseh Fitch

Keyword(s):

Language Learning ◽

Speech Segmentation ◽

Continuous Speech ◽

Artificial Language Learning ◽

Prosodic Cues ◽

German Speaking ◽

Final Syllable ◽

Language Universal ◽

Study Context ◽

Transitional Probabilities

A prerequisite for spoken language learning is segmenting continuous speech into words. Amongst many possible cues to identify word boundaries, listeners can use both transitional probabilities between syllables and various prosodic cues. However, the relative importance of these cues remains unclear, and previous experiments have not directly compared the effects of contrasting multiple prosodic cues. We used artificial language learning experiments, where native German speaking participants extracted meaningless trisyllabic “words” from a continuous speech stream, to evaluate these factors. We compared a baseline condition (statistical cues only) to five test conditions, in which word-final syllables were either (a) followed by a pause, (b) lengthened, (c) shortened, (d) changed to a lower pitch, or (e) changed to a higher pitch. To evaluate robustness and generality we used three tasks varying in difficulty. Overall, pauses and final lengthening were perceived as converging with the statistical cues and facilitated speech segmentation, with pauses helping most. Final-syllable shortening hindered baseline speech segmentation, indicating that when cues conflict, prosodic cues can override statistical cues. Surprisingly, pitch cues had little effect, suggesting that duration may be more relevant for speech segmentation than pitch in our study context. We discuss our findings with regard to the contribution to speech segmentation of language-universal boundary cues vs. language-specific stress patterns.

Perceptual strategies in prelingual speech segmentation

Journal of Child Language ◽

10.1017/s0305000900008266 ◽

1993 ◽

Vol 20 (2) ◽

pp. 229-252 ◽

Cited By ~ 72

Author(s):

Jan V. Goodsitt ◽

James L. Morgan ◽

Patricia K. Kuhl

Keyword(s):

The Other ◽

Speech Segmentation ◽

Variable Order ◽

Order Condition ◽

Head Turning ◽

Prosodic Cues ◽

Invariant Order ◽

Trial Test ◽

Pitch Contours ◽

Transitional Probabilities

Previous work has suggested that infants may segment continuous speech by a BRACKETING STRATEGY that segregates portions of the speech stream based on prosodic cues to their endpoints. The two present studies were designed to assess whether infants also can deploy a CLUSTERING STRATEGY that exploits asymmetries in transitional probabilities between successive elements, aggregating elements with high transitional probabilities and identifying points of low transitional probabilities as boundaries between units. These studies examined effects of the structure and redundancy of speech context on infants' discrimination of two target syllables using an operant head-turning procedure. After discrimination training on the target syllables in isolation, discrimination maintenance was tested when the target syllables were embedded in one of three contexts. Invariant Order contexts were structured to promote clustering, whereas the Redundant and Variable Order contexts were not. Thirty-six seven-month-olds were tested in Experiment 1, in which stimuli were produced with varying intonation contours; 36 eight-month-olds were tested in Experiment 2, in which stimuli were produced with comparable flat pitch contours. In both experiments, performance of the three groups was equivalent in an initial 20-trial test. However, in a second 20-trial test, significant improvements in performance were shown by infants in the Invariant Order condition. No such gains were shown by infants in the other two conditions. These studies suggest that clustering may complement bracketing in infants' discovery of units of language.
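
The clustering strategy described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the syllable inventory, the word order, the function names, and the 0.9 threshold are all hypothetical choices made for illustration. The sketch estimates the forward transitional probability of each adjacent syllable pair as count(a, b) / count(a) and places a boundary wherever that probability falls below the threshold.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Forward TP of each adjacent pair: count(a, b) / count(a as a first element)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllables, threshold):
    """Group syllables joined by high TPs; cut where the TP drops below threshold."""
    tps = transitional_probabilities(syllables)
    units, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            units.append("".join(current))
            current = []
        current.append(b)
    units.append("".join(current))
    return units

# Hypothetical three-word artificial language and presentation order.
words = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
order = "ABCACBABCBAC"
stream = [syllable for w in order for syllable in words[w]]

print(segment(stream, threshold=0.9))
```

In this toy stream every within-word transition has a probability of 1.0, while transitions across word boundaries are at most 2/3, so a single fixed threshold recovers the words. Natural speech would not separate this cleanly, and a local-minimum criterion is a common alternative to a fixed threshold.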

How Looking While Listening Affects Speech Segmentation

Inquiry@Queen's Undergraduate Research Conference Proceedings ◽

10.24908/iqurcp.8574 ◽

2018 ◽

Author(s):

Jaime Leung

Keyword(s):

Visual Information ◽

Speech Segmentation ◽

Visual Context ◽

Preliminary Results ◽

Natural Context ◽

How People Learn ◽

Visual Boundaries ◽

Segmentation Task ◽

Transitional Probabilities ◽

Do So

This study looks at the mechanisms behind how people learn the words of a new language. Syllables that occur within words have a higher chance of occurring together than syllables that span word boundaries. Both infants and adults use these transitional probabilities to extract the words of a language. However, previous research has examined speech segmentation only when learners are presented with speech alone. In natural contexts, we look while we listen, and what we see is correlated with what we hear. The goal of my study was to explore how visual context affects adult speech segmentation. To do so, the study used three conditions: one in which adults were presented with only a word stream, one in which adults saw animations that corresponded to the words they heard while listening, and one in which the animations did not correspond to the words they heard. One hypothesis is that participants in the audio-visual conditions perform better at the segmentation task because the statistical boundaries in the audio are reinforced by the visual boundaries between animations. However, it is also possible that the visual information impairs performance because learners engage in learning the meanings of words in addition to speech segmentation. Preliminary results support the latter hypothesis.

Influence of General and Crystallized Intelligence on Vocabulary Test Performance

European Journal of Psychological Assessment ◽

10.1027//1015-5759.18.1.78 ◽

2002 ◽

Vol 18 (1) ◽

pp. 78-84 ◽

Cited By ~ 10

Author(s):

Eva Ullstadius ◽

Jan-Eric Gustafsson ◽

Berit Carlstedt

Keyword(s):

Hierarchical Model ◽

Test Performance ◽

Verbal Ability ◽

Testing Procedure ◽

Intellectual Ability ◽

Analysis Of Covariance ◽

Categorical Variables ◽

Vocabulary Test ◽

Crystallized Intelligence ◽

General Ability

Summary: Vocabulary tests, part of most test batteries of general intellectual ability, measure both verbal and general ability. Newly developed techniques for confirmatory factor analysis of dichotomous variables make it possible to analyze the influence of different abilities on the performance on each item. In the testing procedure of the Computerized Swedish Enlistment test battery, eight different subtests of a new vocabulary test were given randomly to subsamples of a representative sample of 18-year-old male conscripts (N = 9001). Three central dimensions of a hierarchical model of intellectual abilities, general (G), verbal (Gc'), and spatial ability (Gv') were estimated under different assumptions of the nature of the data. In addition to an ordinary analysis of covariance matrices, assuming linearity of relations, the item variables were treated as categorical variables in the Mplus program. All eight subtests fit the hierarchical model, and the items were found to load about equally on G and Gc'. The results also indicate that if nonlinearity is not taken into account, the G loadings for the easy items are underestimated. These items, moreover, appear to be better measures of G than the difficult ones. The practical utility of the outcome for item selection and the theoretical implications for the question of the origin of verbal ability are discussed.

Rapid Serial Auditory Presentation

Experimental Psychology (formerly Zeitschrift für Experimentelle Psychologie) ◽

10.1027/1618-3169/a000295 ◽

2015 ◽

Vol 62 (5) ◽

pp. 346-351 ◽

Cited By ~ 15

Author(s):

Ana Franco ◽

Julia Eberlen ◽

Arnaud Destrebecqz ◽

Axel Cleeremans ◽

Julie Bertels

Keyword(s):

Statistical Learning ◽

Reaction Times ◽

Visual Presentation ◽

Detection Task ◽

Speech Segmentation ◽

Auditory Presentation ◽

Indirect Measure ◽

Speech Stream ◽

Adult Participants ◽

Presentation Procedure

Abstract. The Rapid Serial Visual Presentation procedure is a method widely used in visual perception research. In this paper we propose an adaptation of this method which can be used with auditory material and enables assessment of statistical learning in speech segmentation. Adult participants were exposed to an artificial speech stream composed of statistically defined trisyllabic nonsense words. They were subsequently instructed to perform a detection task in a Rapid Serial Auditory Presentation (RSAP) stream in which they had to detect a syllable in a short speech stream. Results showed that reaction times varied as a function of the statistical predictability of the syllable: second and third syllables of each word were responded to faster than first syllables. This result suggests that the RSAP procedure provides a reliable and sensitive indirect measure of auditory statistical learning.
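
The notion of statistical predictability by syllable position can be made concrete with a short Python sketch. It does not reproduce the study's materials or analysis: the four nonsense words, the stream length, the random seed, and the function names are assumptions made for illustration. The sketch builds a stream of trisyllabic words with no immediate repetition and averages the forward transitional probability leading into the first, second, and third syllable of each word.

```python
import random
from collections import Counter

# Hypothetical inventory of trisyllabic nonsense words (not the study's stimuli).
WORDS = [("bu", "pa", "da"), ("do", "ti", "go"), ("ki", "la", "mu"), ("se", "no", "ve")]

def build_stream(n_words, seed=0):
    """Concatenate randomly chosen words, never repeating a word back to back."""
    rng = random.Random(seed)
    stream, previous = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != previous])
        stream.extend(word)
        previous = word
    return stream

def mean_tp_by_position(stream):
    """Average forward TP leading into the 1st, 2nd and 3rd syllable of a word."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    totals, counts = [0.0] * 3, [0] * 3
    for i in range(1, len(stream)):
        a, b = stream[i - 1], stream[i]
        position = i % 3  # valid because every word is exactly three syllables
        totals[position] += pair_counts[(a, b)] / first_counts[a]
        counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]

stream = build_stream(300)
mean_tp = mean_tp_by_position(stream)
print(mean_tp)  # index 0: into first syllables; 1 and 2: into second and third
```

Because each syllable belongs to only one word in this toy language, the second and third syllables are fully predictable from the preceding syllable, whereas first syllables follow a word boundary and are much less predictable. That asymmetry is the kind of difference the reaction-time measure is said to track.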

Masked Translation Priming Effects With Highly Proficient Simultaneous Bilinguals

Experimental Psychology (formerly Zeitschrift für Experimentelle Psychologie) ◽

10.1027/1618-3169/a000013 ◽

2010 ◽

Vol 57 (2) ◽

pp. 98-107 ◽

Cited By ~ 90

Author(s):

Jon Andoni Duñabeitia ◽

Manuel Perea ◽

Manuel Carreiras

Keyword(s):

Lexical Decision ◽

Hierarchical Model ◽

Priming Effect ◽

Masked Priming ◽

The Other ◽

Memory Organization ◽

Priming Effects ◽

Bilingual Memory ◽

Translation Priming ◽

Masked Translation Priming

One essential issue for models of bilingual memory organization is to what degree the representation from one of the languages is shared with the other language. In this study, we examine whether there is a symmetrical translation priming effect with highly proficient, simultaneous bilinguals. We conducted a masked priming lexical decision experiment with cognate and noncognate translation equivalents. Results showed a significant masked translation priming effect for both cognates and noncognates, with a greater priming effect for cognates. Furthermore, the magnitude of the translation priming was similar in the two directions. Thus, highly fluent bilinguals do develop symmetrical between-language links, as predicted by the Revised Hierarchical model and the BIA+ model. We examine the implications of these results for models of bilingual memory.
