Impact of pronunciation variation in speech recognition

2012 International Conference on Signal Processing and Communications (SPCOM) ◽

10.1109/spcom.2012.6290037 ◽

2012 ◽

Author(s):

R. Golda Brunet ◽

Hema A Murthy

Keyword(s):

Speech Recognition ◽

Pronunciation Variation

Weighted finite-state transducer-based dysarthric speech recognition error correction using context-dependent pronunciation variation modelling

International Journal of Engineering Systems Modelling and Simulation ◽

10.1504/ijesms.2014.058418 ◽

2014 ◽

Vol 6 (1/2) ◽

pp. 4

Author(s):

Woo Kyeong Seong ◽

Ji Hun Park

Keyword(s):

Speech Recognition ◽

Error Correction ◽

Recognition Error ◽

Pronunciation Variation ◽

Finite State ◽

Finite State Transducer ◽

Dysarthric Speech ◽

Context Dependent

Cross-word Arabic pronunciation variation modeling for speech recognition

International Journal of Speech Technology ◽

10.1007/s10772-011-9098-0 ◽

2011 ◽

Vol 14 (3) ◽

pp. 227-236 ◽

Author(s):

Dia AbuZeina ◽

Wasfi Al-Khatib ◽

Moustafa Elshafei ◽

Husni Al-Muhtaseb

Keyword(s):

Speech Recognition ◽

Pronunciation Variation

Morpheme-Based Modeling of Pronunciation Variation for Large Vocabulary Continuous Speech Recognition in Korean

IEICE Transactions on Information and Systems ◽

10.1093/ietisy/e90-d.7.1063 ◽

2007 ◽

Vol E90-D (7) ◽

pp. 1063-1072 ◽

Author(s):

K.-N. LEE ◽

M. CHUNG

Keyword(s):

Speech Recognition ◽

Continuous Speech ◽

Continuous Speech Recognition ◽

Large Vocabulary ◽

Pronunciation Variation

Investigating the effects of gender, dialect, and training size on the performance of Arabic speech recognition

Language Resources and Evaluation ◽

10.1007/s10579-020-09505-5 ◽

2020 ◽

Vol 54 (4) ◽

pp. 975-998

Author(s):

Eiman Alsharhan ◽

Allan Ramsay

Keyword(s):

Speech Recognition ◽

Arabic Language ◽

Learning Curves ◽

Training Data ◽

Language Resources ◽

Acoustic Models ◽

Pronunciation Variation ◽

Register Variation ◽

Fine Grained ◽

Dialectal Arabic

Abstract: Research in Arabic automatic speech recognition (ASR) is constrained by datasets of limited size and of highly variable content and quality. Arabic-language resources vary in the attributes that affect language resources in other languages (noise, channel, speaker, genre), but also vary significantly in the dialect and level of formality of the spoken Arabic they capture. Many languages suffer similar levels of cross-dialect and cross-register acoustic variability, but these effects have been under-studied. This paper is an experimental analysis of the interaction between classical ASR corpus-compensation methods (feature selection, data selection, gender-dependent acoustic models) and the dialect-dependent/register-dependent variation among Arabic ASR corpora. The first interaction studied in this paper is that between acoustic recording quality and discrete pronunciation variation. Discrete pronunciation variation can be compensated for by using grapheme-based instead of phone-based acoustic models, and by filtering out speakers with insufficient training data; the latter technique also helps to compensate for poor recording quality, which is further compensated for by eliminating delta-delta acoustic features. All three techniques, together, reduce Word Error Rate (WER) by between 3.24% and 5.35%. The second aspect of dialect and register variation to be considered is variation in the fine-grained acoustic pronunciations of each phoneme in the language. Experimental results show that gender and dialect are the principal components of variation in speech; therefore, building gender- and dialect-specific models leads to substantial decreases in WER. To further explore the degree of acoustic difference between the phone models required for each dialect of Arabic, cross-dialect experiments are conducted to measure how far apart Arabic dialects are acoustically, so as to make a better decision about the minimal number of recognition systems needed to cover all dialectal Arabic. Finally, the research addresses an important question: how much training data is needed to build efficient speaker-independent ASR systems? This includes developing learning curves to find out how large the training set must be to achieve acceptable performance.

MLLR/MAP adaptation using pronunciation variation for non-native speech recognition

2009 IEEE Workshop on Automatic Speech Recognition & Understanding ◽

10.1109/asru.2009.5373299 ◽

2009 ◽

Author(s):

Yoo Rhee Oh ◽

Hong Kook Kim

Keyword(s):

Speech Recognition ◽

Pronunciation Variation ◽

MAP Adaptation

Modeling Syllable-Based Pronunciation Variation for Accented Mandarin Speech Recognition

2010 20th International Conference on Pattern Recognition ◽

10.1109/icpr.2010.397 ◽

2010 ◽

Author(s):

Shilei Zhang ◽

Qin Shi ◽

Yong Qin

Keyword(s):

Speech Recognition ◽

Pronunciation Variation ◽

Mandarin Speech Recognition

Non-Native Pronunciation Variation Modeling for Automatic Speech Recognition

Advances in Speech Recognition ◽

10.5772/10112 ◽

2010 ◽

Author(s):

Hong Kook Kim ◽

Mina Kim ◽

Yoo Rhee Oh

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Pronunciation Variation

Probabilistic Pronunciation Variation Model Based on Bayesian Network for Conversational Speech Recognition

2008 Second International Symposium on Universal Communication ◽

10.1109/isuc.2008.33 ◽

2008 ◽

Author(s):

Sakriani Sakti ◽

Konstantin Markov ◽

Satoshi Nakamura

Keyword(s):

Speech Recognition ◽

Bayesian Network ◽

Conversational Speech ◽

Pronunciation Variation ◽

Model Based ◽

Variation Model

Implicit pronunciation variation model for automatic speech recognition

Machine Learning and Data Analysis ◽

10.21469/22233792.2.4.01 ◽

2016 ◽

Vol 2 (4) ◽

pp. 370-377 ◽

Author(s):

Chuchupal V. J.

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Pronunciation Variation ◽

Variation Model

Incorporating linguistic theories of pronunciation variation into speech-recognition models

Philosophical Transactions of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rsta.2000.0589 ◽

2000 ◽

Vol 358 (1769) ◽

pp. 1325-1338 ◽

Author(s):

Mari Ostendorf

Keyword(s):

Speech Recognition ◽

Pronunciation Variation
