Mandarin-English bilingual phone modeling and combining MPE based Discriminative training for cross-language speech recognition

Author(s):  
Yanmin Qian ◽  
Jia Liu
Phonetica ◽  
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Jing Yang

This study examined the development of vowel categories in young Mandarin-English bilingual children. The participants included 35 children aged between 3 and 4 years old (15 Mandarin-English bilinguals, six English monolinguals, and 14 Mandarin monolinguals). The bilingual children were divided into two groups: one group had a shorter duration (<1 year) of intensive immersion in English (Bi-low group) and one group had a longer duration (>1 year) of intensive immersion in English (Bi-high group). The participants were recorded producing one list of Mandarin words containing the vowels /a, i, u, y, ɤ/ and/or one list of English words containing the vowels /i, ɪ, e, ɛ, æ, u, ʊ, o, ɑ, ʌ/. Formant frequency values were extracted at five equidistant time locations (the 20%, 35%, 50%, 65%, and 80% points) over the course of vowel duration. Cross-language and within-language comparisons were conducted on the midpoint formant values and formant trajectories. The results showed that children in the Bi-low group produced their English vowels in clusters and showed positional deviations from the monolingual targets. However, they maintained the phonetic features of their native vowel sounds well and mainly used an assimilatory process to organize the vowel systems. Children in the Bi-high group separated their English vowels well. They used both assimilatory and dissimilatory processes to construct and refine the two vowel systems. These bilingual children approximated monolingual English children more closely than the children in the Bi-low group did. However, when compared to their monolingual peers, they demonstrated observable deviations in both L1 and L2.
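
The five equidistant measurement locations described above can be illustrated with a minimal Python sketch. It is not the author's analysis script: the function name `formant_sample_times`, the onset and offset values, and the assumption that vowel boundaries are given in seconds are all hypothetical and serve only to show how the 20%, 35%, 50%, 65%, and 80% points map onto a vowel interval.

```python
def formant_sample_times(onset, offset,
                         proportions=(0.20, 0.35, 0.50, 0.65, 0.80)):
    """Return the time points (same unit as onset/offset) that fall at
    the given proportions of a vowel's duration.

    The default proportions are the five equidistant locations named in
    the abstract; onset and offset are hypothetical vowel boundaries.
    """
    if offset <= onset:
        raise ValueError("offset must be later than onset")
    duration = offset - onset
    return [onset + p * duration for p in proportions]


# Hypothetical vowel spanning 0.10 s to 0.30 s (a 200 ms vowel).
times = formant_sample_times(0.10, 0.30)

# The third location (50%) is the midpoint used for the static comparisons.
midpoint = times[2]
```

For the hypothetical 200 ms vowel, the sampled locations fall roughly 30 ms apart, with the midpoint at about 0.20 s. Formant values would then be read from a separate formant tracker at these times; that step is outside this sketch.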


2021 ◽  
Author(s):  
Khia A. Johnson ◽  
Molly Babel

A recent model of sound change posits that the direction of change is determined, at least in part, by the distribution of variation within speech communities (Harrington, Kleber, Reubold, Schiel, & Stevens, 2018; Harrington & Schiel, 2017). We explore this model in the context of bilingual speech, asking whether the less variable language constrains phonetic variation in the more variable language, using a corpus of spontaneous speech from early Cantonese-English bilinguals (Johnson, Babel, Fong, & Yiu, 2020). As predicted, given the phonetic distributions of stop obstruents in Cantonese compared to English, intervocalic English /b d g/ were produced with less voicing by Cantonese-English bilinguals, and word-final English /t k/ were more likely to be unreleased, compared to spontaneous speech from two monolingual English control corpora (Pitt, Johnson, Hume, Kiesling, & Raymond, 2005; Swan, 2016). Cantonese phonology is more gradient in the voicing of initial obstruents (Clumeck, Barton, Macken, & Huntington, 1981; W. Y. P. Wong, 2006) than in the release of final obstruents, which is categorically prohibited (Bauer & Benedict, 2011; Khouw & Ciocca, 2006). Neither initial voicing nor word-final stop release patterns in the Cantonese-English bilinguals were significantly affected by language mode. These results provide evidence that the phonetic variation in crosslinguistically linked categories in bilingual speech is shaped by the distribution of phonetic variation within each language, thus suggesting a mechanistic account for why some segments are more susceptible to cross-language influence than others in studies of mutual influence.


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 179 ◽  
Author(s):  
Chongchong Yu ◽  
Yunbing Chen ◽  
Yueqiao Li ◽  
Meng Kang ◽  
Shixuan Xu ◽  
...  

To rescue and preserve an endangered language, this paper studied an end-to-end speech recognition model based on sample transfer learning for the low-resource Tujia language. At the level of the international phonetic alphabet (IPA) labels, using a Chinese corpus as an extension of the Tujia language data can effectively address the insufficient Tujia corpus, yielding a cross-language corpus and an IPA dictionary that is unified between the Chinese and Tujia languages. A convolutional neural network (CNN) and a bi-directional long short-term memory (BiLSTM) network were used to extract the cross-language acoustic features and train shared hidden layer weights for the Tujia language and Chinese phonetic corpus. In addition, automatic speech recognition for the Tujia language was realized using an end-to-end method that consists of symmetric encoding and decoding. Furthermore, transfer learning was used to establish the model of the cross-language end-to-end Tujia language recognition system. The experimental results showed that the recognition error rate of the proposed model is 46.19%, which is 2.11% lower than that of the model that only used the Tujia language data for training. Therefore, this approach is feasible and effective.


2020 ◽  
pp. 014272372093376
Author(s):  
Agnieszka Otwinowska ◽  
Marcin Opacki ◽  
Karolina Mieszkowska ◽  
Marta Białecka-Pikul ◽  
Zofia Wodniecka ◽  
...  

Polish and English differ in the surface realization of the underlying Determiner Phrase (DP): Polish lacks an article system, whereas English makes use of articles for both grammatical and pragmatic reasons. This difference has an impact on how referentiality is rendered in both languages. In this article, the authors investigate the use of referential markers by Polish–English bilingual children and Polish monolingual children. Using the LITMUS-MAIN picture stories, the authors collected speech samples of Polish–English bilinguals raised in the UK (n = 92, mean age 5;7) and compared them with matched Polish monolinguals (n = 92, mean age 5;7). The analyses revealed that the bilinguals’ mean length of utterance (MLU) in Polish was significantly higher than that of the monolinguals because the bilinguals produced significantly more referential markers (especially pronouns) which inflated their MLU. The authors posit that the non-standard referentiality used by the bilinguals in Polish is caused by cross-language transfer at the syntax–pragmatics interface. When producing narratives in Polish, Polish–English bilinguals overuse referential markers as cohesive devices in their stories, which is not ungrammatical, but pragmatically odd in Polish. Bilinguals tend to do this because they are immersed in English-language input, rich in overt pronouns. Thus, in the process of realizing the surface features of the Polish DP they partly rely on an underlying English DP structure.

