What constrains distributional learning in adults?

2020 ◽  
Author(s):  
Dave F Kleinschmidt

One of the many remarkable features of human language is its flexibility: during acquisition, any normally developing human infant can acquire any human language, and during adulthood, language users quickly and flexibly adapt to a wide range of talker variation. Both language acquisition in infants and adaptation in adults have been hypothesized to be forms of distributional learning, where flexibility is driven by sensitivity to statistical properties of sensory stimuli and the corresponding underlying linguistic structures. Despite the similarities between these forms of linguistic flexibility, there are obvious differences as well, chief among them being that adults have a much harder time acquiring the same unfamiliar languages that they would have picked up naturally during infancy. This suggests that there are strong constraints on distributional learning during adulthood. This paper provides further, direct evidence for these constraints by showing that American English listeners struggle to learn voice-onset time (VOT) distributions that are atypical of American English. Moreover, computational modeling shows that the pattern of distributional learning (or lack thereof) across different VOT distributions is consistent with Bayesian belief-updating, starting from prior beliefs that are very similar to the VOT distributions produced by a typical talker of American English. Together, these results suggest that distributional learning in adults is constrained by prior experience with other talkers, and that distributional learning may be a computational principle of human language that operates throughout the lifespan.
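The belief-updating idea can be illustrated with a minimal conjugate-updating sketch; this is not the paper's actual model, and all numbers (prior mean, pseudo-count, exposure distribution) are hypothetical. The prior on a category's mean VOT is treated as worth kappa_0 pseudo-observations, so a strong prior built from typical American English talkers resists an atypical exposure distribution.

```python
# Minimal sketch (not the paper's model): normal-normal belief updating for the
# mean VOT of a /p/-like category. kappa_0 encodes how many pseudo-observations
# of prior talker experience the listener brings to the task. All values are
# hypothetical placeholders.
import numpy as np

def update_vot_belief(prior_mean, kappa_0, observations):
    """Return the posterior mean VOT (ms) after seeing new observations."""
    n = len(observations)
    post_mean = (kappa_0 * prior_mean + np.sum(observations)) / (kappa_0 + n)
    return post_mean

rng = np.random.default_rng(0)
typical_prior_mean = 60.0                       # roughly typical English /p/ VOT (assumed)
exposure = rng.normal(100.0, 10.0, size=50)     # hypothetical atypical talker

weak = update_vot_belief(typical_prior_mean, kappa_0=5, observations=exposure)
strong = update_vot_belief(typical_prior_mean, kappa_0=500, observations=exposure)
print(f"weak prior  -> posterior mean {weak:.1f} ms")   # shifts toward 100 ms
print(f"strong prior -> posterior mean {strong:.1f} ms")  # stays near 60 ms
```

With a strong prior, the posterior barely moves even after substantial exposure, which is the qualitative pattern the abstract describes as constrained distributional learning.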

1991 ◽  
Vol 13 (4) ◽  
pp. 471-492 ◽  
Author(s):  
Z. S. Bond ◽  
Joann Fokes

Recordings of naturally produced stop-vowel English words with a wide range of voice onset time values were used to investigate nonnative perception of voicing categories. L2 learners of English from nine language groups and native speakers of American English served as subjects. The responses of the language learners suggested that they were using a hybrid perceptual system, one in which the English voicing categories were not yet fully established.


2020 ◽  
Vol 63 (2) ◽  
pp. 405-420 ◽  
Author(s):  
Victoria S. McKenna ◽  
Jennifer A. Hylkema ◽  
Monique C. Tardif ◽  
Cara E. Stepp

Purpose: This study examined vocal hyperfunction (VH) using voice onset time (VOT). We hypothesized that speakers with VH would produce shorter VOTs, indicating increased laryngeal tension, and more variable VOTs, indicating disordered vocal motor control.
Method: We enrolled 32 adult women with VH (aged 20–74 years) and 32 age- and sex-matched controls. All were speakers of American English. Participants produced vowel–consonant–vowel combinations that varied by vowel (ɑ/u) and plosive (p/b, t/d, k/g). VOT, measured from the release of the plosive to the initiation of voicing, was averaged over three repetitions of each vowel–consonant–vowel combination. The coefficient of variation (CoV), a measure of VOT variability, was also computed for each combination.
Results: The mean VOTs were not significantly different between the two groups; however, the CoVs were significantly greater in speakers with VH compared to controls. Voiceless CoV values were moderately correlated with clinical ratings of dysphonia (r = .58) in speakers with VH.
Conclusion: Speakers with VH exhibited greater variability in phonemic voicing targets compared to vocally healthy speakers, supporting the hypothesis of disordered vocal motor control in VH. We suggest that future work incorporate VOT measures when assessing auditory discrimination and auditory–motor integration deficits in VH.
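The two measures described above reduce to a mean and a coefficient of variation per vowel-consonant-vowel combination. A small sketch with made-up numbers (not study data) shows the computation.

```python
# Illustrative sketch: mean VOT and coefficient of variation (CoV = SD / mean)
# over three repetitions of one vowel-consonant-vowel combination.
# The VOT values below are hypothetical, not measurements from the study.
import statistics

vot_ms = [12.4, 15.1, 10.8]                  # three repetitions of /ɑpɑ/ (made up)
mean_vot = statistics.mean(vot_ms)
cov = statistics.stdev(vot_ms) / mean_vot    # sample SD divided by the mean
print(f"mean VOT = {mean_vot:.1f} ms, CoV = {cov:.2f}")
```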


Author(s):  
Thea Knowles ◽  
Scott G. Adams ◽  
Mandar Jog

Purpose: The purpose of this study was to quantify changes in acoustic distinctiveness in two groups of talkers with Parkinson's disease as they modified their speech across a wide range of speaking rates.
Method: People with Parkinson's disease with and without deep brain stimulation and older healthy controls read 24 carrier phrases at different speech rates. Target nonsense words in the carrier phrases were designed to elicit stop consonants and corner vowels. Participants spoke at seven self-selected speech rates from very slow to very fast, elicited via magnitude production. Speech rate was measured in absolute words per minute and as a proportion of each talker's habitual rate. Measures of segmental distinctiveness included a temporal consonant measure, voice onset time, and a spectral vowel measure, the vowel articulation index.
Results: All talkers successfully modified their rate of speech from slow to fast. Talkers with Parkinson's disease and deep brain stimulation demonstrated greater baseline speech impairment and produced smaller proportional changes at the fast end of the continuum. Increasingly slower speaking rates were associated with increased temporal contrasts (voice onset time) but not spectral contrasts (vowel articulation). Faster speech was associated with decreased contrasts in both domains. Talkers with deep brain stimulation demonstrated more aberrant productions across all speaking rates.
Conclusions: The findings suggest that temporal and spectral segmental distinctiveness are asymmetrically affected by speaking rate modifications in Parkinson's disease. Talkers with deep brain stimulation warrant further investigation with regard to the speech changes they make as they adjust their speaking rate.
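A brief sketch of the two measures mentioned above, with hypothetical numbers. The vowel articulation index (VAI) formula shown is one common formulation from the acoustic literature and is an assumption here, not necessarily the exact computation used in this study.

```python
# Sketch of proportional speech rate and a commonly used vowel articulation
# index (VAI) formula. Formant and rate values are hypothetical placeholders.

def proportional_rate(words_per_minute, habitual_wpm):
    """Speech rate expressed as a proportion of the talker's habitual rate."""
    return words_per_minute / habitual_wpm

def vowel_articulation_index(f1_i, f2_i, f1_u, f2_u, f1_a, f2_a):
    """VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/), formants in Hz."""
    return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)

print(proportional_rate(90, 150))                                  # slow condition: 0.6
print(vowel_articulation_index(300, 2300, 350, 900, 750, 1300))    # about 1.07
```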


2019 ◽  
Vol 63 (3) ◽  
pp. 526-549
Author(s):  
Yoonjeong Lee ◽  
Elsi Kaiser ◽  
Louis Goldstein

This study uses a mouse-tracking response paradigm to examine the role of sub-phonemic information in online lexical ambiguity resolution of continuous speech. We examine listeners’ sensitivity to the sub-phonemic information that is specific to ambiguous internal open-juncture /s/-stop sequences in American English (e.g., “place kin” vs. “play skin”), that is, voice onset time (VOT) indicating different degrees of aspiration (e.g., long VOT for “kin” vs. short VOT for “skin”) in connected speech contexts. A cross-splicing method was used to create two-word sequences (e.g., “place kin” or “play skin”) with matching VOTs (long for “kin”; short for “skin”) or mismatching VOTs (short for “kin”; long for “skin”). Participants (n = 20) heard the two-word sequences while looking at computer displays with the second word in the left/right corner (“KIN” and “SKIN”), and their click responses and mouse movement trajectories were recorded. Click responses show significant effects of the VOT manipulation, while mouse trajectories do not. Our results show that stop-release information, whether temporal or spectral, can (mis)guide listeners’ interpretation of the possible location of a word boundary between /s/ and a following stop, even when other aspects of the acoustic signal (e.g., duration of /s/) point to the alternative segmentation. Taken together, our results suggest that segmentation and lexical access are highly attuned to bottom-up phonetic information; they have implications for models of spoken language recognition with position-specific representations available at the prelexical level, and also allude to the possibility that detailed phonetic information may be stored in listeners’ lexicons.
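The cross-splicing design crosses the two-word sequence with whether the spliced-in release has the VOT appropriate to that parse. A small sketch enumerates the resulting cells; the "long"/"short" labels stand in for the study's actual VOT values, which are not given here.

```python
# Hypothetical sketch of the cross-splicing design described above: sequence
# ("place kin" vs. "play skin") crossed with matching vs. mismatching VOT on
# the stop release. Labels are illustrative, not the study's stimulus values.
from itertools import product

sequences = {"place kin": "kin", "play skin": "skin"}
vot_for = {"kin": "long", "skin": "short"}      # aspiration appropriate to each parse

stimuli = []
for (seq, second_word), match in product(sequences.items(), ["match", "mismatch"]):
    appropriate = vot_for[second_word]
    spliced = appropriate if match == "match" else ("short" if appropriate == "long" else "long")
    stimuli.append({"sequence": seq, "target": second_word,
                    "spliced_vot": spliced, "condition": match})

for cell in stimuli:
    print(cell)
```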


1980 ◽  
Vol 7 (1) ◽  
pp. 41-74 ◽  
Author(s):  
Marlys A. Macken ◽  
David Barton

This paper reports on a longitudinal study of the acquisition of the voicing contrast in American English word-initial stop consonants, as measured by voice onset time. Four monolingual children were recorded at two-week intervals, beginning when the children were about 1;6. Data provide evidence for three general stages: (1) the child has no contrast; (2) the child has a contrast but one that falls within the adult perceptual boundaries of one (usually voiced) phoneme and thus is presumably not perceptible to adults; and (3) the child has a contrast that resembles the adult contrast. The rate and nature of the developmental process are discussed in relation to two competing models for phonological acquisition and two hypotheses regarding the skills being learned.
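The three stages can be paraphrased as a simple decision rule over a child's voiced and voiceless VOT measurements. The sketch below is a rough illustration, not the authors' analysis; the 25 ms adult perceptual boundary for labials and the 10 ms minimum separation are assumed placeholders.

```python
# Rough sketch of the three developmental stages described above, applied to a
# child's word-initial /b/ and /p/ VOTs (ms). Thresholds are assumptions.
import statistics

ADULT_BOUNDARY_MS = 25.0      # assumed adult /b/-/p/ perceptual boundary (labials)

def contrast_stage(voiced_vots, voiceless_vots, min_gap_ms=10.0):
    b_mean = statistics.mean(voiced_vots)
    p_mean = statistics.mean(voiceless_vots)
    if p_mean - b_mean < min_gap_ms:
        return "stage 1: no contrast"
    if p_mean < ADULT_BOUNDARY_MS:
        return "stage 2: contrast within the adult voiced category"
    return "stage 3: adult-like contrast"

print(contrast_stage([8, 10, 12], [11, 13, 14]))   # stage 1
print(contrast_stage([5, 8, 10], [18, 20, 22]))    # stage 2
print(contrast_stage([5, 8, 10], [50, 60, 70]))    # stage 3
```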


2017 ◽  
Vol 62 (1) ◽  
pp. 61-79 ◽  
Author(s):  
Joseph C Toscano ◽  
Charissa R Lansing

Listeners weight acoustic cues in speech according to their reliability, but few studies have examined how cue weights change across the lifespan. Previous work has suggested that older adults have deficits in auditory temporal discrimination, which could affect the reliability of temporal phonetic cues, such as voice onset time (VOT), and in turn, impact speech perception in real-world listening environments. We addressed this by examining younger and older adults’ use of VOT and onset F0 (a secondary phonetic cue) for voicing judgments (e.g., /b/ vs. /p/), using both synthetic and naturally produced speech. We found age-related differences in listeners’ use of the two voicing cues, such that older adults relied more heavily on onset F0 than younger adults, even though this cue is less reliable in American English. These results suggest that phonetic cue weights continue to change across the lifespan.
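One common way to quantify relative cue weights of the kind described above is a logistic regression of voicing responses on the two cues; the sketch below uses that approach with simulated data and is not necessarily the authors' analysis.

```python
# Hedged sketch: estimate relative weights on VOT and onset F0 from binary
# /b/-/p/ judgments via logistic regression. Responses are simulated from a
# hypothetical listener who relies mostly on VOT.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
vot = rng.uniform(0, 60, n)        # ms
f0 = rng.uniform(90, 140, n)       # Hz at vowel onset
logit = 0.25 * (vot - 30) + 0.05 * (f0 - 115)        # simulated decision rule
resp_p = rng.random(n) < 1 / (1 + np.exp(-logit))    # True = "p" response

# Standardize cues so the fitted coefficients are comparable cue weights.
X = np.column_stack([(vot - vot.mean()) / vot.std(), (f0 - f0.mean()) / f0.std()])
model = LogisticRegression().fit(X, resp_p)
w_vot, w_f0 = model.coef_[0]
print(f"standardized weights: VOT = {w_vot:.2f}, onset F0 = {w_f0:.2f}")
```

An age-related shift of the kind reported would show up as a larger onset-F0 weight relative to the VOT weight in the older group.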


2018 ◽  
Vol 62 (3) ◽  
pp. 494-508
Author(s):  
Jeffrey J Holliday

Previous studies have shown that non-native speakers of Korean not only have difficulty producing the word-initial three-way stop contrast, but also exhibit a wide range of production patterns. Because these studies have investigated only native (L1) speakers of English and Mandarin, and given the overall paucity of research on non-native Korean, it is not yet clear how dependent these findings are on the particular native language under investigation. The current paper broadens this empirical base by extending the investigation to L1 speakers of Japanese. It is shown that although naïve Japanese listeners consistently perceive Korean fortis stops as voiced, and Korean lenis and aspirated stops as voiceless, novice second language learners do not produce any significant difference among the three stop categories, despite producing clear differences between their native Japanese stop categories. Unlike in previous studies of L1 speakers of English and Mandarin, there was very little inter-speaker variation, and all speakers produced all Korean stops with long-lag voice onset time.
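The claim that learners produce no significant difference among the three categories is, in essence, a test of category separation in their VOTs. A minimal sketch with simulated data (not the study's measurements) shows one way such a test could look, using a one-way ANOVA.

```python
# Illustrative sketch with simulated VOTs: test whether a novice learner's
# productions differentiate the three Korean stop categories at all.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
fortis = rng.normal(70, 15, 20)       # ms; all three produced with long-lag VOT
lenis = rng.normal(72, 15, 20)
aspirated = rng.normal(75, 15, 20)

f_stat, p_val = stats.f_oneway(fortis, lenis, aspirated)
print(f"F = {f_stat:.2f}, p = {p_val:.3f}")   # large p: no reliable three-way contrast
```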


2000 ◽  
Vol 12 (6) ◽  
pp. 1038-1055 ◽  
Author(s):  
Colin Phillips ◽  
Thomas Pellathy ◽  
Alec Marantz ◽  
Elron Yellin ◽  
Kenneth Wexler ◽  
...  

The studies presented here use an adapted oddball paradigm to show evidence that representations of discrete phonological categories are available to the human auditory cortex. Brain activity was recorded using a 37-channel biomagnetometer while eight subjects listened passively to synthetic speech sounds. In the phonological condition, which contrasted stimuli from an acoustic /dæ/-/tæ/ continuum, a magnetic mismatch field (MMF) was elicited in a sequence of stimuli in which phonological categories occurred in a many-to-one ratio, but no acoustic many-to-one ratio was present. In order to isolate the contribution of phonological categories to the MMF responses, the acoustic parameter of voice onset time, which distinguished standard and deviant stimuli, was also varied within the standard and deviant categories. No MMF was elicited in the acoustic condition, in which the acoustic distribution of stimuli was identical to the first experiment, but the many-to-one distribution of phonological categories was removed. The design of these studies makes it possible to demonstrate the all-or-nothing property of phonological category membership. This approach contrasts with a number of previous studies of phonetic perception using the mismatch paradigm, which have demonstrated the graded property of enhanced acoustic discrimination at or near phonetic category boundaries.
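The key design property described above is that the many-to-one ratio holds only at the level of phonological category, while VOT varies within each category. A short sketch (not the authors' code; category probabilities and VOT steps are illustrative) generates a sequence with that structure.

```python
# Hedged sketch of the phonological-condition logic: /dae/ standards outnumber
# /tae/ deviants, but each category is realized with several VOT values so no
# single acoustic token stands in a many-to-one ratio. Values are illustrative.
import random

random.seed(0)
dae_vots = [0, 8, 16, 24]      # ms, within-category variation (assumed steps)
tae_vots = [48, 56, 64, 72]

sequence = []
for _ in range(100):
    if random.random() < 0.875:                          # standard category /dae/
        sequence.append(("dae", random.choice(dae_vots)))
    else:                                                # deviant category /tae/
        sequence.append(("tae", random.choice(tae_vots)))

n_dae = sum(1 for cat, _ in sequence if cat == "dae")
print(f"/dae/ standards: {n_dae}, /tae/ deviants: {len(sequence) - n_dae}")
```

Removing the category-level imbalance while keeping the same acoustic distribution yields the acoustic control condition, in which no MMF was observed.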


1975 ◽  
Vol 2 (2) ◽  
pp. 223-231 ◽  
Author(s):  
Paula Menyuk ◽  
Mary Klatt

Voice onset time (VOT) characteristics of stops in initial clusters in American English words produced by children and adults were studied. Words were spoken in isolation and in sentence context by eleven three- and four-year-old children, and by a male and a female adult. Spectrograms were made and VOT duration measurements taken. Three experienced listeners transcribed the isolated words and sentences. Analyses showed that overall timing characteristics were similar for children and adults. Speakers differed in their [±voice] boundary and there was no absolute time distinction between [±voice] stops; [+voice] stops showed less variability than [−voice]. VOT generally increased from labial to dental to velar clusters, and was shorter in sentence context and longer in clusters than in singletons. Children's VOT averages were generally, but not significantly, longer than adults' in all contexts, and coarticulation constraints affected the accuracy with which children produced the stop and liquid portion of a particular cluster.
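The reported comparisons (place of articulation, isolation vs. sentence context) amount to tabulating mean VOT by cell. A tiny sketch with hypothetical measurements shows the shape of that summary.

```python
# Illustrative summary (hypothetical measurements, not study data): mean VOT
# by place of articulation and speaking context.
from collections import defaultdict
from statistics import mean

measurements = [                       # (place, context, VOT in ms), made up
    ("labial", "isolation", 55), ("labial", "sentence", 48),
    ("dental", "isolation", 62), ("dental", "sentence", 54),
    ("velar",  "isolation", 74), ("velar",  "sentence", 66),
]

by_cell = defaultdict(list)
for place, context, vot in measurements:
    by_cell[(place, context)].append(vot)

for (place, context), vots in by_cell.items():
    print(f"{place:6s} {context:9s} mean VOT = {mean(vots):.0f} ms")
```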

