Mandarin-English bilingual phone modeling and combining MPE based Discriminative training for cross-language speech recognition

Abstract: This study examined the development of vowel categories in young Mandarin-English bilingual children. The participants included 35 children aged between 3 and 4 years old (15 Mandarin-English bilinguals, six English monolinguals, and 14 Mandarin monolinguals). The bilingual children were divided into two groups: one group had a shorter duration (<1 year) of intensive immersion in English (Bi-low group) and one group had a longer duration (>1 year) of intensive immersion in English (Bi-high group). The participants were recorded producing one list of Mandarin words containing the vowels /a, i, u, y, ɤ/ and/or one list of English words containing the vowels /i, ɪ, e, ɛ, æ, u, ʊ, o, ɑ, ʌ/. Formant frequency values were extracted at five equidistant time locations (the 20%, 35%, 50%, 65%, and 80% points) over the course of vowel duration. Cross-language and within-language comparisons were conducted on the midpoint formant values and formant trajectories. The results showed that children in the Bi-low group produced their English vowels in clusters and showed positional deviations from the monolingual targets. However, they maintained the phonetic features of their native vowel sounds well and mainly used an assimilatory process to organize the vowel systems. Children in the Bi-high group separated their English vowels well. They used both assimilatory and dissimilatory processes to construct and refine the two vowel systems. These bilingual children approximated monolingual English children more closely than the children in the Bi-low group did. However, when compared to their monolingual peers, they demonstrated observable deviations in both L1 and L2.
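The five sampling locations described in the abstract can be illustrated with a short Python sketch. It is not the authors' extraction procedure: the function name `sample_times` and the onset and offset values are hypothetical, and the sketch only computes where in a vowel the formants would be measured, not the formant values themselves.

```python
# Minimal sketch: time points at 20/35/50/65/80% of a vowel's duration.
# The function name and the example onset/offset are assumptions for illustration.

def sample_times(onset, offset, proportions=(0.20, 0.35, 0.50, 0.65, 0.80)):
    """Return the absolute times (in seconds) at the given proportions of the vowel."""
    if offset <= onset:
        raise ValueError("offset must be later than onset")
    duration = offset - onset
    return [onset + p * duration for p in proportions]


# Hypothetical vowel starting at 0.100 s and ending at 0.300 s.
times = sample_times(0.100, 0.300)
midpoint = times[2]  # the 50% point, used for the midpoint comparisons
```

The points are spaced 15% of the vowel duration apart, and the third one is the midpoint that the cross-language comparisons rely on.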

How Chinese–English Bilingual Fourth Graders Draw on Syntactic Awareness in Reading Comprehension: Within‐ and Cross‐Language Effects

Reading Research Quarterly ◽

10.1002/rrq.400 ◽

2021 ◽

Author(s):

Xiuhong Tong ◽

Joyce Lok Yin Kwan ◽

Shelley Xiuli Tong ◽

S. Hélène Deacon

Keyword(s):

Reading Comprehension ◽

Fourth Graders ◽

Syntactic Awareness ◽

English Bilingual ◽

Language Effects ◽

Cross Language

Discriminative Training using Heterogeneous Feature Vector for Hindi Automatic Speech Recognition System

2017 International Conference on Computer and Applications (ICCA) ◽

10.1109/comapp.2017.8079777 ◽

2017 ◽

Cited By ~ 3

Author(s):

Mohit Dua ◽

Rajesh Kumar Aggarwal ◽

Mantosh Biswas

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Feature Vector ◽

Recognition System ◽

Discriminative Training ◽

Speech Recognition System ◽

Automatic Speech Recognition System ◽

Heterogeneous Feature

Language contact within the speaker: Phonetic variation and crosslinguistic influence

10.31219/osf.io/jhsfc ◽

2021 ◽

Author(s):

Khia A. Johnson ◽

Molly Babel

Keyword(s):

Language Contact ◽

Mutual Influence ◽

Spontaneous Speech ◽

Recent Model ◽

Phonetic Variation ◽

English Bilingual ◽

Speech Communities ◽

Crosslinguistic Influence ◽

Cross Language ◽

Language Influence

A recent model of sound change posits that the direction of change is determined, at least in part, by the distribution of variation within speech communities (Harrington, Kleber, Reubold, Schiel, & Stevens, 2018; Harrington & Schiel, 2017). We explore this model in the context of bilingual speech, asking whether the less variable language constrains phonetic variation in the more variable language, using a corpus of spontaneous speech from early Cantonese-English bilinguals (Johnson, Babel, Fong, & Yiu, 2020). As predicted, given the phonetic distributions of stop obstruents in Cantonese compared to English, intervocalic English /b d g/ were produced with less voicing by Cantonese-English bilinguals, and word-final English /t k/ were more likely to be unreleased, compared to spontaneous speech from two monolingual English control corpora (Pitt, Johnson, Hume, Kiesling, & Raymond, 2005; Swan, 2016). Cantonese phonology is more gradient in terms of voicing initial obstruents (Clumeck, Barton, Macken, & Huntington, 1981; W. Y. P. Wong, 2006) than in permitting releases of final obstruents, which is categorically prohibited (Bauer & Benedict, 2011; Khouw & Ciocca, 2006). Neither Cantonese-English bilingual initial voicing nor word-final stop release patterns were significantly impacted by language mode. These results provide evidence that the phonetic variation in crosslinguistically linked categories in bilingual speech is shaped by the distribution of phonetic variation within each language, thus suggesting a mechanistic account for why some segments are more susceptible to cross-language influence than others in studies of mutual influence.

Discriminative training for speech recognition is compensating for statistical dependence in the HMM framework

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2012.6288979 ◽

2012 ◽

Cited By ~ 4

Author(s):

Dan Gillick ◽

Steven Wegmann ◽

Larry Gillick

Keyword(s):

Speech Recognition ◽

Discriminative Training ◽

Statistical Dependence

Cross-Language End-to-End Speech Recognition Research Based on Transfer Learning for the Low-Resource Tujia Language

Symmetry ◽

10.3390/sym11020179 ◽

2019 ◽

Vol 11 (2) ◽

pp. 179 ◽

Cited By ~ 4

Author(s):

Chongchong Yu ◽

Yunbing Chen ◽

Yueqiao Li ◽

Meng Kang ◽

Shixuan Xu ◽

...

Keyword(s):

Speech Recognition ◽

Transfer Learning ◽

Short Term Memory ◽

Recognition System ◽

Language Recognition ◽

Low Resource ◽

End To End ◽

The Cross ◽

Hidden Layer ◽

Cross Language

To rescue and preserve an endangered language, this paper studied an end-to-end speech recognition model based on sample transfer learning for the low-resource Tujia language. From the perspective of the Tujia language international phonetic alphabet (IPA) label layer, using a Chinese corpus as an extension of the Tujia language can effectively solve the problem of an insufficient corpus in the Tujia language, constructing a cross-language corpus and an IPA dictionary that is unified between the Chinese and Tujia languages. The convolutional neural network (CNN) and bi-directional long short-term memory (BiLSTM) network were used to extract the cross-language acoustic features and train shared hidden layer weights for the Tujia language and Chinese phonetic corpus. In addition, the automatic speech recognition function of the Tujia language was realized using the end-to-end method that consists of symmetric encoding and decoding. Furthermore, transfer learning was used to establish the model of the cross-language end-to-end Tujia language recognition system. The experimental results showed that the recognition error rate of the proposed model is 46.19%, which is 2.11% lower than that of the model that only used the Tujia language data for training. Therefore, this approach is feasible and effective.

Polish–English bilingual children overuse referential markers: MLU inflation in Polish-language narratives

First Language ◽

10.1177/0142723720933769 ◽

2020 ◽

pp. 014272372093376

Author(s):

Agnieszka Otwinowska ◽

Marcin Opacki ◽

Karolina Mieszkowska ◽

Marta Białecka-Pikul ◽

Zofia Wodniecka ◽

...

Keyword(s):

English Language ◽

Bilingual Children ◽

Overt Pronouns ◽

English Bilingual ◽

Cohesive Devices ◽

Dp Structure ◽

Polish Language ◽

The Uk ◽

Mean Length Of Utterance ◽

Cross Language

Polish and English differ in the surface realization of the underlying Determiner Phrase (DP): Polish lacks an article system, whereas English makes use of articles for both grammatical and pragmatic reasons. This difference has an impact on how referentiality is rendered in both languages. In this article, the authors investigate the use of referential markers by Polish–English bilingual children and Polish monolingual children. Using the LITMUS-MAIN picture stories, the authors collected speech samples of Polish–English bilinguals raised in the UK (n = 92, mean age 5;7) and compared them with matched Polish monolinguals (n = 92, mean age 5;7). The analyses revealed that the bilinguals’ mean length of utterance (MLU) in Polish was significantly higher than that of the monolinguals because the bilinguals produced significantly more referential markers (especially pronouns), which inflated their MLU. The authors posit that the non-standard referentiality used by the bilinguals in Polish is caused by cross-language transfer at the syntax–pragmatics interface. When producing narratives in Polish, Polish–English bilinguals overuse referential markers as cohesive devices in their stories, which is not ungrammatical, but pragmatically odd in Polish. Bilinguals tend to do this because they are immersed in English-language input, rich in overt pronouns. Thus, in the process of realizing the surface features of the Polish DP they partly rely on an underlying English DP structure.
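The MLU inflation described above can be illustrated with a small Python sketch. It rests on stated assumptions: MLU is counted here in whitespace-separated words (the abstract does not say whether the study counted words or morphemes), the function name `mean_length_of_utterance` is hypothetical, and the utterances are invented English glosses, not data from the study.

```python
# Minimal sketch: MLU as the average number of words per utterance.
# Counting unit (words) and all example utterances are assumptions for illustration.

def mean_length_of_utterance(utterances):
    """Return the mean number of whitespace-separated words per non-empty utterance."""
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        raise ValueError("at least one non-empty utterance is required")
    return sum(lengths) / len(lengths)


# Invented glosses: the same two clauses with and without overt subject pronouns.
with_pronouns = ["he sees the bird", "he takes it"]
without_pronouns = ["sees the bird", "takes it"]

mlu_with = mean_length_of_utterance(with_pronouns)        # (4 + 3) / 2
mlu_without = mean_length_of_utterance(without_pronouns)  # (3 + 2) / 2
```

Each overt pronoun adds one word to its utterance while the number of utterances stays fixed, so the average rises even though the narrative content is unchanged.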
