Pronunciation Variation
Recently Published Documents


TOTAL DOCUMENTS

77
(FIVE YEARS 6)

H-INDEX

11
(FIVE YEARS 0)

Author(s):  
Yanhua Long ◽  
Shuang Wei ◽  
Jie Lian ◽  
Yijie Li

Abstract Code-switching (CS) refers to the use of more than one language within a single utterance. It presents great challenges to automatic speech recognition (ASR) because of the switching within an utterance, the pronunciation variation of embedded-language words, and severe training-data sparsity. This paper focuses on the Mandarin-English CS ASR task. We aim to handle the pronunciation variation and alleviate the data sparsity of code-switches by using pronunciation augmentation methods. An English-to-Mandarin mix-language phone mapping approach is first proposed to obtain a language-universal CS lexicon. Based on this lexicon, an acoustic data-driven lexicon learning framework is further proposed to learn new pronunciations that cover the accents, mispronunciations, and pronunciation variations of the embedded English words. Experiments are performed on real CS ASR tasks. The effectiveness of the proposed methods is examined on conventional hybrid as well as recent end-to-end speech recognition systems. Experimental results show that both the learned phone mapping and the augmented pronunciations significantly improve the performance of code-switching speech recognition.
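The mix-language phone mapping described above can be pictured as a simple table-driven rewrite of lexicon entries. The sketch below is illustrative only: the phone symbols and the specific English-to-Mandarin correspondences are invented placeholders, not the mapping learned in the paper.

```python
# Hypothetical mapping from English (ARPAbet-style) phones onto a
# Mandarin-style phone inventory, so embedded English words can be
# transcribed with one language-universal phone set.
EN_TO_CMN_PHONE_MAP = {
    "AE": "a",   # English /ae/ approximated by Mandarin /a/
    "IY": "i",
    "UW": "u",
    "SH": "sh",
    "NG": "ng",
}

def map_pronunciation(en_phones):
    """Rewrite an English phone sequence with Mandarin phones,
    keeping any phone that has no mapping unchanged."""
    return [EN_TO_CMN_PHONE_MAP.get(p, p) for p in en_phones]

# e.g. a lexicon entry "SHE  SH IY" becomes ["sh", "i"]
print(map_pronunciation(["SH", "IY"]))
```

A data-driven lexicon learning step would then add further pronunciation variants for each mapped entry based on acoustic evidence.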


2021 ◽  
Vol 7 (3) ◽  
pp. 136-143
Author(s):  
Balakrishnan Sivakumar ◽  
Praveen Kadakola Biligirirangaiah

To improve recognition performance, the accuracy of the transcription used in training is very important. In continuous speech, pronunciation variation is an essential characteristic of different speakers: over-emphasized or inadequately emphasized words can result in waveform misalignment at sub-word unit boundaries. Such deviations in articulation lead to misalignment when the speech is compared against the pronunciation dictionary, so deletion or insertion of sub-word units becomes necessary; this happens because the transcription of each utterance is not precise. This paper presents corrections to the transcription at the sub-word level using acoustic cues present in the waveform. The transcription of a word is corrected using sentence-level transcriptions with reference to the phonemes that make up the word. Specifically, it is shown that vowels are either deleted or inserted. To support the proposed argument, errors in continuous speech are validated using machine learning and signal-processing tools. An automatic data-driven annotator exploiting the inferences drawn from the analysis is used to correct transcription errors. The results show that corrected pronunciations lead to a higher likelihood for training utterances in the TIMIT corpus.


2020 ◽  
Vol 54 (4) ◽  
pp. 975-998
Author(s):  
Eiman Alsharhan ◽  
Allan Ramsay

Abstract Research in Arabic automatic speech recognition (ASR) is constrained by datasets of limited size, and of highly variable content and quality. Arabic-language resources vary in the attributes that affect language resources in other languages (noise, channel, speaker, genre), but also vary significantly in the dialect and level of formality of the spoken Arabic they capture. Many languages suffer similar levels of cross-dialect and cross-register acoustic variability, but these effects have been under-studied. This paper is an experimental analysis of the interaction between classical ASR corpus-compensation methods (feature selection, data selection, gender-dependent acoustic models) and the dialect-dependent/register-dependent variation among Arabic ASR corpora. The first interaction studied is that between acoustic recording quality and discrete pronunciation variation. Discrete pronunciation variation can be compensated for by using grapheme-based instead of phone-based acoustic models, and by filtering out speakers with insufficient training data; the latter technique also helps to compensate for poor recording quality, which is further compensated for by eliminating delta-delta acoustic features. All three techniques together reduce Word Error Rate (WER) by between 3.24% and 5.35%. The second aspect of dialect and register variation considered is variation in the fine-grained acoustic pronunciations of each phoneme in the language. Experimental results show that gender and dialect are the principal components of variation in speech; therefore, building gender- and dialect-specific models leads to substantial decreases in WER.
To further explore the degree of acoustic difference between the phone models required for each Arabic dialect, cross-dialect experiments are conducted to measure how far apart the dialects are acoustically, in order to make a better-informed decision about the minimal number of recognition systems needed to cover all dialectal Arabic. Finally, the research addresses an important question: how much training data is needed to build efficient speaker-independent ASR systems? This includes developing learning curves to find out how large the training set must be to achieve acceptable performance.


2020 ◽  
pp. 72-79
Author(s):  
Ibrahim El-Henawy ◽  
Marwa Abo-Elazm

Arabic is a phonetically complex language, and creating an accurate speech recognition system for it is a challenging task. The phonetic dictionary is an essential component of an automatic speech recognition (ASR) system. Pronunciation variations in Arabic are tangible and have been investigated widely using data-driven or knowledge-based approaches. Phonological rules are used to derive the pronunciation of each word accurately, reducing the mismatch between the actual phoneme realization of the spoken words and the ASR dictionary. Several studies of Arabic ASR systems have been conducted using different numbers of phonological rules. In this paper we focus on the rules that handle within-word pronunciation variation and cross-word pronunciation variation. The experimental results indicate that handling within-word pronunciation variation with phonological rules does not enhance recognition performance, but using these rules to handle cross-word variation provides good performance.
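Applying phonological rules to generate pronunciation variants can be sketched as pattern rewriting over a phoneme transcription. The rules below are invented placeholders to show the mechanism (a "#" marks a word boundary), not the rule set used in the paper.

```python
import re

# (pattern, replacement) pairs over a space-separated phoneme string.
# Both rules here are illustrative cross-word rules only.
CROSS_WORD_RULES = [
    (r"(\w+) # \1", r"\1:"),   # merge identical consonants across the boundary (gemination)
    (r"n # b", r"m # b"),      # place assimilation of /n/ before /b/
]

def apply_rules(phones):
    """Return the transcription after applying each rewrite rule."""
    for pattern, replacement in CROSS_WORD_RULES:
        phones = re.sub(pattern, replacement, phones)
    return phones

print(apply_rules("b a n # b a"))   # "b a m # b a"
```

A variant generated this way would be added to the ASR dictionary as an alternative pronunciation of the word pair, alongside the canonical one.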


2018 ◽  
Vol 10 (4) ◽  
pp. 111-119 ◽  
Author(s):  
Jeong-Uk Bang ◽  
Sang-Hun Kim ◽  
Oh-Wook Kwon

2015 ◽  
Vol 10 (3) ◽  
pp. 313-338 ◽  
Author(s):  
Clara Cohen

A small but growing body of research on English and Dutch has found that the pronunciation of affixes in a word form is sensitive to paradigmatic probability, i.e., the probability of using that form rather than other words in the same morphological paradigm. Yet it remains unclear (a) how paradigmatic probability is best measured; (b) whether an increase in paradigmatic probability leads to phonetic enhancement or reduction; and (c) by what mechanism paradigmatic probability can affect pronunciation. The current work examines pronunciation variation of Russian verbal agreement suffixes. I show that there are two distinct patterns of variation, corresponding to two different measures of paradigmatic probability. One measure, pairwise paradigmatic probability, is associated with a pronunciation pattern that resembles phonetic enhancement. The second measure, lexeme paradigmatic probability, can show enhancement effects, but can also yield reduction effects more similar to those of contextual probability. I propose that these two patterns can be explained in an exemplar model of lexical storage. Reduction effects are the consequence of faster retrieval and encoding of an articulatory target, while effects that resemble enhancement result when the pronunciation target of each member of a pair of competing word forms is shifted towards the more frequent of the two.
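The two probability measures contrasted above can be made concrete with corpus counts. The sketch below uses made-up counts for a hypothetical verb paradigm; the form labels and numbers are illustrative only, not data from the study.

```python
# Invented token counts for the agreement forms of one verb lexeme.
paradigm_counts = {"3sg": 120, "3pl": 30, "1sg": 25, "2sg": 10}

def pairwise_probability(form, competitor, counts):
    """P(form | {form, competitor}): relative frequency against a
    single competing member of the paradigm."""
    return counts[form] / (counts[form] + counts[competitor])

def lexeme_probability(form, counts):
    """P(form | lexeme): relative frequency over the whole paradigm."""
    return counts[form] / sum(counts.values())

print(pairwise_probability("3sg", "3pl", paradigm_counts))  # 0.8
print(lexeme_probability("3sg", paradigm_counts))           # ~0.649
```

The two measures can rank the same form differently, which is why the study can tie each one to a distinct pronunciation pattern.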

