chinese dialects
Recently Published Documents


TOTAL DOCUMENTS

200
(FIVE YEARS 71)

H-INDEX

11
(FIVE YEARS 2)

Author(s):  
Fan Xu ◽  
Yangjie Dan ◽  
Keyu Yan ◽  
Yong Ma ◽  
Mingwen Wang

Chinese dialect discrimination is a challenging natural language processing task because annotated resources are scarce. In this article, we develop a novel Chinese dialect discrimination framework with transfer learning and data augmentation (CDDTLDA) to overcome this shortage. Specifically, we first use a relatively large Chinese dialect corpus to train a source-side automatic speech recognition (ASR) model. We then apply a simple but effective data augmentation method (speed, pitch, and noise perturbation) to the target-side low-resource Chinese dialects, and fine-tune a target ASR model from the source-side ASR model. Meanwhile, a self-attention mechanism captures the potential common semantic features shared by the source-side and target-side ASR models. Finally, we extract the hidden semantic representation from the target ASR model to perform Chinese dialect discrimination. Extensive experimental results demonstrate that our model significantly outperforms state-of-the-art methods on two benchmark Chinese dialect corpora.
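The augmentation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear-interpolation resampler, the Gaussian noise model, and the signal-to-noise parameter are all assumptions made for illustration, and pitch shifting is left out because it normally requires a dedicated audio library.

```python
import math
import random


def change_speed(samples, factor):
    """Resample by linear interpolation; factor > 1 gives a faster, shorter signal.

    Hypothetical helper: a real pipeline would likely use an audio library.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(len(samples) / factor))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise at an assumed signal-to-noise ratio in dB."""
    power = sum(s * s for s in samples) / len(samples)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def augment(samples, speed_factors=(0.9, 1.1), snr_db=20.0, seed=0):
    """Return the original signal plus speed-perturbed and noisy variants."""
    rng = random.Random(seed)
    variants = [list(samples)]
    variants.extend(change_speed(samples, f) for f in speed_factors)
    variants.append(add_noise(samples, snr_db, rng))
    return variants


if __name__ == "__main__":
    # One second of a synthetic 440 Hz tone at 16 kHz stands in for an utterance.
    tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    for variant in augment(tone):
        print(len(variant))
```

Each variant would then be treated as an extra training utterance for the low-resource target dialect; the specific factors and noise level here are placeholders, not values reported in the article.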


2021 ◽  
Vol 23 (1) ◽  
pp. 4-19
Author(s):  
Jiangping Kong

This paper studies phonemic cognitive ability using databases of living spoken Sino-Tibetan languages, including 20 Chinese dialects, 6 Tibetan dialects, 5 Miao dialects, Mian, Zhuang, Thai, Li, Dai, Yi, Burmese, Zaiwa, and Achang. It applies statistical and information-entropy methods and proposes the concepts of the actual syllabic space, the theoretical syllabic space, and the redundancy rate. The results show that: (1) statistical methods can be used in the study of phonemic cognition; (2) the actual syllabic space in spoken Sino-Tibetan languages reflects human phonemic cognitive ability; (3) the theoretical syllabic space composed of initial, final, and tone in the Sino-Tibetan languages reflects the dynamic process of a phoneme system in language contact and evolution; (4) a redundancy rate of 60% is the lower limit for oral communication in the Sino-Tibetan languages. The study concludes that the Active Syllable Average Limit of 1,000 not only reflects human phonemic cognitive ability but also reflects the interdependence of phonemic cognition and semantic cognition, and reveals an important link in the language chain from semantic to phonemic transformation, which is of theoretical significance for the study of language cognition.
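The abstract does not give formulas for these quantities, so the following is only a sketch under stated assumptions: the theoretical syllabic space is taken as the product of the numbers of initials, finals, and tones, the redundancy rate as the unused share of that space, and entropy as Shannon entropy over syllable tokens. The inventory sizes in the example are invented for illustration and do not describe any language in the study.

```python
import math
from collections import Counter


def theoretical_space(n_initials, n_finals, n_tones):
    """Assumed definition: every initial-final-tone combination counts once."""
    return n_initials * n_finals * n_tones


def redundancy_rate(n_actual, n_theoretical):
    """Assumed definition: share of the theoretical space that is not attested."""
    return 1.0 - n_actual / n_theoretical


def shannon_entropy(tokens):
    """Shannon entropy in bits of the empirical distribution over tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical inventory: 21 initials, 35 finals, 4 tones, 1200 attested syllables.
    space = theoretical_space(21, 35, 4)
    print(space, round(redundancy_rate(1200, space), 3))
    # Toy corpus of syllable tokens, also invented.
    print(shannon_entropy(["ma1", "ma3", "ma1", "shi4"]))
```

Under these assumed definitions, a language whose attested syllables fill 40% of the theoretical space would have a redundancy rate of 60%, the threshold the abstract reports.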


2021 ◽  
Vol 23 (1) ◽  
pp. 20-46
Author(s):  
Ling Zhang

Cantonese is a syllable-timed language: that is, the syllable is the isochronous unit of speech. However, Cantonese has a type of closed syllable with the stop codas [-p], [-t], or [-k] (i.e. syllables with the so-called “entering tones”) that sounds much shorter than other syllables. On the surface, the shorter duration of stop syllables seems to conflict with the general prosodic feature of syllable isochrony. This study conducted acoustic investigations of stop syllables in Cantonese in different contexts (i.e. in isolated form, in disyllabic words, and in disyllabic words located at the beginning, middle, and final positions of sentences). The results showed that stop syllables by themselves are shorter than non-stop syllables in all of these contexts. However, in disyllabic words or in sentences, there is a compensatory lengthening effect immediately after the stop syllables: the acoustic gap is longer, and in some circumstances the initial of the following syllable is lengthened. Therefore, we propose that the phonetic realization of syllable isochrony extends beyond the syllable itself in Cantonese. The results and discussion of this study may also shed light on the disappearance of “entering tones” from various Chinese dialects.


2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. 200-200
Author(s):  
XinQi Dong ◽  
Dexia Kong

This paper describes the study design of a unique dyadic dataset of older Chinese Americans and their adult children, and presents the sample characteristics of the dyads. A total of 807 older parents were matched with their adult children (characteristics of matched versus unmatched participants will be compared). On average, adult children were 48 years old, had 12 years of education, lived in a household of 3 persons, had 2 children, and had lived in the U.S. for 17 years. Approximately 65% of the adult children were female, 82% were married, 93% preferred to speak Chinese dialects, and over 97% were foreign-born immigrants. Older parents, by comparison, were on average 74 years old, had 7 years of education, lived in a household of 3 persons, had 3 children, and had lived in the U.S. for 17 years. About 60% of the older parents were female, 73% were married, and over 99% were foreign-born immigrants who preferred to speak Chinese dialects.


2021 ◽  
Vol 29 (1) ◽  
pp. 161-171
Author(s):  
Min Wang

This study examines the ability to identify different Chinese dialects through the English language and evaluates how often respondents attend to phonological features and rate of speech when explaining their categorizations. The research includes 100 Chinese undergraduate students and 100 young people without advanced degrees, aged 20 to 25. Discrete independent data samples collected during participant interviews are analyzed with statistical methods such as Student's t-test, the Mann-Whitney U-test, and Wilcoxon's test. The results indirectly show the ability of respondents to identify native and non-native English speakers around the world, as well as to determine their nationality. The paper explains who, in general, categorizes Chinese dialects better and which dialects are the most recognizable. The data reveal a high degree of stereotyping of various dialects, especially the Beijing and U dialects. Moreover, based on the data obtained, it can be concluded that speaking rate significantly affects the perception and classification of a speaker as coming from a particular province of China.
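As a rough illustration of one of the tests named in this abstract, the sketch below computes the Mann-Whitney U statistic for two independent samples directly from its pairwise definition. The function name and the scores are invented for the example and are not taken from the study; no p-value is computed.

```python
def mann_whitney_u(xs, ys):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (x, y) with x > y, with ties counted as one half.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    u_x = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u_x += 1.0
            elif x == y:
                u_x += 0.5
    u_y = len(xs) * len(ys) - u_x
    return u_x, u_y


if __name__ == "__main__":
    # Hypothetical identification scores for two respondent groups.
    students = [7, 9, 6, 8, 9]
    non_students = [5, 6, 7, 4, 6]
    print(mann_whitney_u(students, non_students))
```

The smaller of the two values is usually compared against a critical value or converted to a p-value; a statistics package would normally handle that step.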


2021 ◽  
Vol 50 (2) ◽  
pp. 135-206
Author(s):  
Giorgio Francesco ARCODIA

The received view that the differences among Sinitic languages are mostly limited to their phonology and, to a lesser extent, to the lexicon (Chao 1968) has been challenged in recent years, with many studies showing that Chinese ‘dialects’ are, indeed, diverse at all levels, including morphology and (morpho-)syntax (see Chappell 2015a for an overview). Some major differences within the Sinitic branch follow areal patterns, in which contact is often claimed to play a crucial role. In our contribution, we propose that there is an area within Northern China, spread over the Shanxi, Henan, Hebei, and Shandong provinces, in which we find Sinitic languages possessing some features not seen (or, at least, uncommon) elsewhere. These include: 1. reduced/nonconcatenative morphology (see Arcodia 2013, 2015; Lamarre 2015); 2. object markers based on speech act verbs (see Chappell 2013); and 3. structural particles with an l-initial (see Chen A. 2013, a.o.). Based on our own survey of a sample of 96 dialects, we discuss the distribution of these features, as well as their possible origins.


Author(s):  
Amelia Amanda ◽  
Anggraeni Anggraeni ◽  
Retno Purnama Irawati ◽  
Ria Riski Marsuki

Mandarin is the language with the most speakers in the world and is also the national language used in Taiwan. Although the Mandarin of China and the Mandarin of Taiwan derive from the same source, Beifanghua, differences can be found between them, especially in phonology and lexicon. The researchers therefore used films as a data source to examine these differences. The objectives of the study were: (1) to describe the phonological differences between the Chinese and Taiwanese dialects of Mandarin found in the films, and (2) to describe the lexical differences between the Chinese and Taiwanese dialects of Mandarin found in the films. From a total of 85 vocabulary items of phonological data found in the films The Ex-File 3: The Return of the Exes (China) and Our Times (Taiwan), the study found changes in consonants and tones: the consonants zh [tʂ], ch [tʂ‘], and sh [ʂ] are pronounced in Taiwanese Mandarin like z [ts], c [ts‘], and s [s]; the consonant r [ʐ] changes to l [l]; the consonant g [k] is reduced in Taiwanese Mandarin; and the tones differ, with Chinese Mandarin dominated by light tones while Taiwanese Mandarin is more varied, without changing the meaning of the word.

