Electromyogram Based Prediction of Spoken Syllable Duration

Kiyotaka MIYASAKA; Yuji SAKAMOTO; Takahiro YAMANOI

doi:10.3156/jsoft.33.3_718

Distinct developmental profiles in typical speech acquisition

Journal of Neurophysiology, 2012, Vol 107 (10), pp. 2885-2900. Cited By ~ 11

doi:10.1152/jn.00337.2010

Author(s): Jennell C. Vick; Thomas F. Campbell; Lawrence D. Shriberg; Jordan R. Green; Hervé Abdi; ...

Keyword(s): Speech Development; Developmental Pathways; Subgroup Discovery; Typically Developing; Speech Acquisition; Typically Developing Children; Lower Lip; Syllable Duration; High Level; Syllable Stress

Three- to five-year-old children produce speech that is characterized by a high level of variability within and across individuals. This variability, which is manifest in speech movements, acoustics, and overt behaviors, can be input to subgroup discovery methods to identify cohesive subgroups of speakers or to reveal distinct developmental pathways or profiles. This investigation characterized three distinct groups of typically developing children and provided normative benchmarks for speech development. These speech development profiles, identified among 63 typically developing preschool-aged speakers (ages 36–59 mo), were derived from the children's performance on multiple measures. The profiles were obtained by submitting 72 measures to a k-means cluster analysis. The measures spanned three levels of speech analysis: behavioral (e.g., task accuracy, percentage of consonants correct), acoustic (e.g., syllable duration, syllable stress), and kinematic (e.g., variability of movements of the upper lip, lower lip, and jaw). Two of the discovered group profiles were distinguished by measures of variability but not by phonemic accuracy; the third group of children was characterized by relatively low phonemic accuracy but not by an increase in measures of variability. Analyses revealed that, of the original 72 measures, 8 key measures were sufficient to distinguish the 3 profile groups.
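
As a rough illustration of the clustering step described in this abstract, the sketch below runs a plain k-means (Lloyd's algorithm) on a toy table of speakers by measures. Everything in it is an assumption made for illustration: the two columns, the numbers, the z-score standardization, and the farthest-first initialization are not taken from the study, which used 72 measures from 63 children and whose exact procedure is not given in the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def zscore_columns(rows):
    """Standardize each column so measures on different scales are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def farthest_first(points, k):
    """Deterministic initialization (an assumption, not the study's method)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical speakers: [movement variability index, proportion of consonants correct].
toy = [
    [0.10, 0.90], [0.12, 0.88], [0.11, 0.92],  # low variability, high accuracy
    [0.30, 0.91], [0.32, 0.89], [0.31, 0.93],  # high variability, high accuracy
    [0.11, 0.60], [0.13, 0.62], [0.12, 0.58],  # low variability, low accuracy
]
labels, centroids = kmeans(zscore_columns(toy), k=3)
print(labels)
```

The toy groups mirror the pattern the abstract reports: two profiles separated by variability and a third separated by accuracy. With real data, the number of clusters and the initialization would need to be checked for stability rather than fixed in advance.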

Lack of Syllable Duration as a Post-Lexical Acoustic Cue in Spanish in Contact with Maya

Languages, 2019, Vol 4 (4), pp. 84

doi:10.3390/languages4040084

Author(s): Nuria Martínez García; Melanie Uth

Keyword(s): Native Speakers; Mixed Effects; Mixed Effects Models; Yucatec Maya; Linear Mixed Effects Models; Linear Mixed Effects; Contrastive Focus; Bilingual Speakers; Syllable Duration; Focus Marking

This paper focuses on the duration of stressed syllables in broad versus contrastive focus in Yucatecan Spanish and examines its connection with Spanish–Maya bilingualism. We examine the claim that phonemic vowel length in one language prevents the use of syllable duration as a post-lexical acoustic cue in another. We study the duration of stressed syllables of nouns in subject and object position in subject-verb-object (SVO) sentences (broad and contrastive focus) of a semi-spontaneous production task. One thousand one hundred and twenty-six target syllables of 34 mono- and bilingual speakers were measured and submitted to linear mixed-effects models. Although the target syllables were slightly longer in contrastive focus, the difference in duration was not significant, nor was the effect of bilingualism. The results point to duration not constituting a cue to focus marking in Yucatecan Spanish. Finally, we discuss how this result relates to the strong influence of Yucatec Maya on Yucatecan Spanish prosody observed by both scholars and native speakers of Yucatecan Spanish and other Mexican varieties of Spanish.

Effects of syllable duration on the perception of the Mandarin Tone 2/Tone 3 distinction: evidence of auditory enhancement

Journal of Phonetics, 1990, Vol 18 (1), pp. 37-49. Cited By ~ 63

doi:10.1016/s0095-4470(19)30357-2

Author(s): Deborah L. Blicher; Randy L. Diehl; Leslie B. Cohen

Keyword(s): Auditory Enhancement; Syllable Duration; Mandarin Tone

Adaptive and selective production of syllable duration and fundamental frequency as word segmentation cues by French-English bilinguals

The Journal of the Acoustical Society of America, 2019, Vol 146 (6), pp. 4255-4272

doi:10.1121/1.5134781

Author(s): Annie C. Gilbert; Max Wolpert; Haruka Saito; Shanna Kousaie; Inbal Itzhak; ...

Keyword(s): Fundamental Frequency; Word Segmentation; Syllable Duration

The effect of focus on trisyllabic syllable duration in Mandarin

2019 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), 2019

doi:10.1109/o-cocosda46868.2019.9041173

Author(s): Ziyu Xiong; Qiguang Lin; Maolin Wang; Zhouyu Chen

Keyword(s): Syllable Duration

A scheme of syllable duration prediction and F0-contour generation to synthesize Chinese speech

Proceedings of the 2003 International Conference on Neural Networks and Signal Processing, 2003

doi:10.1109/icnnsp.2003.1280745

Author(s): Wei Feng; Yunbiao Xu; Li Zhao; Y. Niimi

Keyword(s): Syllable Duration; F0 Contour; Duration Prediction

Acoustic correlates of rhythm in New Zealand English: A diachronic study

Language Variation and Change, 2012, Vol 24 (1), pp. 1-31. Cited By ~ 7

doi:10.1017/s0954394512000051

Author(s): Jacqui Nokes; Jennifer Hay

Keyword(s): New Zealand; Large Scale; Speech Rate; Intensity Variation; Acoustic Correlates; Syllable Duration; Diachronic Study; Variability Index; The Mean; Vowel Shift

This paper reports on a large-scale diachronic investigation into the timing of New Zealand English (NZE), which points to changes in its rhythmic structure. The Pairwise Variability Index (PVI) was used to measure the mean variation in duration, intensity, and pitch of successive vowels in the speech of over 500 New Zealanders, born between 1851 and 1988. Normalized vocalic PVIs for duration have reduced over time, after allowing for changes in speech rate, supporting existing findings that stressed and unstressed vowels are less differentiated by duration in modern NZE than in other varieties of English. Rhythmically, syllable duration may be playing a reduced role in signalling prominence in NZE. This is supported by the finding that there have been contemporaneous changes in pitch and intensity variation. We discuss external and internal influences on the timing of NZE, including contact with Māori, the emergence of Māori English, and diachronic vowel shift.
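
For readers unfamiliar with the measure, the sketch below computes a raw and a normalized Pairwise Variability Index over a sequence of successive measurements. It follows the commonly cited definition (mean absolute difference between neighbours, normalized by the pair mean and scaled by 100); the abstract does not give the authors' exact formula or their speech-rate correction, and the duration values are hypothetical.

```python
def rpvi(values):
    """Raw PVI: mean absolute difference between successive measurements."""
    if len(values) < 2:
        raise ValueError("need at least two successive measurements")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


def npvi(values):
    """Normalized PVI: each difference is divided by the mean of its pair,
    then the average is scaled by 100. Assumes strictly positive values."""
    if len(values) < 2:
        raise ValueError("need at least two successive measurements")
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(values, values[1:])]
    return 100 * sum(terms) / len(terms)


# Hypothetical vowel durations in milliseconds (illustrative, not data from the study).
alternating = [60, 180, 60, 180, 60]  # strong long/short contrast
even = [110, 120, 115, 118, 112]      # nearly equal durations

print(npvi(alternating), npvi(even))
```

Because each difference is divided by the local pair mean, the normalized index does not change when every duration is scaled by the same factor, which is why it is the usual choice for durations that vary with speaking rate. A lower value for a sequence means its neighbouring vowels are more alike in duration.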

Speech Recognition Using Syllable Duration Ratio Model

2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, 2006

doi:10.1109/icassp.2006.1660027

Author(s): M. Ariu; T. Masuko; S. Tanaka; A. Kawamura

Keyword(s): Speech Recognition; Ratio Model; Syllable Duration

Erratum: Syllable Duration and Pausing in the Speech of Chinese ESL Speakers

TESOL Quarterly, 1995, Vol 29 (1), pp. 196

doi:10.2307/3587818

Keyword(s): Syllable Duration

Phonetics of Singing in Western Classical Style

Oxford Research Encyclopedia of Linguistics, 2018

doi:10.1093/acrefore/9780199384655.013.412

Author(s): Johan Sundberg

Keyword(s): Fundamental Frequency; Vocal Fold; Vocal Tract; Formant Frequency; Vowel Identification; Fundamental Frequencies; Vocal Fold Vibration; Syllable Duration; Classical Singing; The Voice

The function of the voice organ is basically the same in classical singing as in speech. However, loud orchestral accompaniment has necessitated the use of the voice in an economical way. As a consequence, the vowel sounds tend to deviate considerably from those in speech. Male voices cluster formants three, four, and five, so that a marked peak is produced in the spectrum envelope near 3,000 Hz. This helps them to be heard through a loud orchestral accompaniment. They seem to achieve this effect by widening the lower pharynx, which makes the vowels more centralized than in speech. Singers often sing at fundamental frequencies higher than the normal first formant frequency of the vowel in the lyrics. In such cases they raise the first formant frequency so that it gets somewhat higher than the fundamental frequency. This is achieved by reducing the degree of vocal tract constriction or by widening the lip and jaw openings, constricting the vocal tract at the pharyngeal end and widening it in the mouth. These deviations from speech cause difficulties in vowel identification, particularly at high fundamental frequencies. Actually, vowel identification is almost impossible above 700 Hz (pitch F5). Another great difference between vocal sound produced in speech and the classical singing tradition concerns female voices, which need to reduce the timbral differences between voice registers. Females normally speak in modal or chest register, and the transition to falsetto tends to happen somewhere above 350 Hz. The great timbral differences between these registers are avoided by establishing control over the register function, that is, over the vocal fold vibration characteristics, so that seamless transitions are achieved. In many other respects, there are more or less close similarities between speech and singing. Thus, marking phrase structure, emphasizing important events, and emotional coloring are common principles, which may make vocal artists deviate considerably from the score’s nominal description of fundamental frequency and syllable duration.
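
The abstract pairs frequencies with musical pitches (700 Hz with F5). A minimal sketch of that conversion follows, assuming twelve-tone equal temperament with A4 tuned to 440 Hz; the tuning reference and the function name are assumptions, not something the abstract specifies.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Name of the equal-tempered note closest to freq_hz (scientific pitch notation)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note number: 69 is A4, and each semitone is a factor of 2 ** (1 / 12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


for hz in (350, 700):
    print(hz, "Hz is closest to", nearest_note(hz))
```

Under these assumptions 700 Hz rounds to F5, in line with the abstract's parenthetical, and the 350 Hz register transition it mentions falls one octave lower, near F4.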
