Improving N-gram language modeling for code-switching speech recognition

Author(s):  
Zhiping Zeng ◽  
Haihua Xu ◽  
Tze Yuang Chong ◽  
Eng-Siong Chng ◽  
Haizhou Li
2012 ◽  
Vol 20 (2) ◽  
pp. 235-259 ◽  
Author(s):  
Martha Yifiru Tachbelie ◽  
Solomon Teferra Abate ◽  
Wolfgang Menzel

Abstract: This paper presents morpheme-based language models developed for Amharic (a morphologically rich Semitic language) and their application to a speech recognition task. A substantial reduction in the out-of-vocabulary (OOV) rate has been observed as a result of using subwords or morphemes, thus addressing a severe problem of morphologically rich languages. Moreover, lower perplexity values have been obtained with morpheme-based language models than with word-based models. However, when quality is compared on the basis of the probability assigned to the test sets, word-based models seem to fare better. We have studied the utility of morpheme-based language models in speech recognition systems and found that the performance of a relatively small-vocabulary (5k) speech recognition system improved significantly as a result of using morphemes as language modeling and dictionary units. However, as the vocabulary size increases (20k or more), the morpheme-based systems suffer from acoustic confusability and do not achieve a significant improvement over a word-based system with an equivalent vocabulary size, even with the use of higher-order (quadrogram) n-gram language models.
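The OOV reduction described in this abstract can be illustrated with a small sketch. The following Python snippet is an illustration of the general idea only, not the authors' system: segment() is a hypothetical rule-based stand-in for a real morphological segmenter (e.g. an unsupervised Morfessor-style model), and the toy data are English rather than Amharic. It compares word-level and morpheme-level OOV rates on the same text.

```python
# Minimal sketch, not the authors' code: why morpheme/subword units shrink the
# out-of-vocabulary (OOV) rate. segment() is a hypothetical rule-based stand-in
# for a real morphological segmenter.

SUFFIXES = ("ing", "ed", "s")

def segment(word):
    """Toy segmenter: split off a known suffix so stems and suffixes are reused."""
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) > len(suf) + 2:
            return [word[: -len(suf)], "+" + suf]
    return [word]

def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occurred in the training data."""
    vocab = set(train_tokens)
    unseen = sum(1 for t in test_tokens if t not in vocab)
    return unseen / max(len(test_tokens), 1)

train_words = "walk walked talking jump".split()
test_words = "walking talked jumped walks".split()

word_oov = oov_rate(train_words, test_words)
morph_oov = oov_rate(
    [m for w in train_words for m in segment(w)],
    [m for w in test_words for m in segment(w)],
)
print(f"word-level OOV rate:     {word_oov:.2f}")   # 1.00 -- every test form is new
print(f"morpheme-level OOV rate: {morph_oov:.2f}")  # 0.12 -- stems/suffixes are known
```

Because inflected test forms share stems and suffixes with the training data, most test tokens stay in-vocabulary at the morpheme level, which is the effect the abstract reports for Amharic.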


2021 ◽  
Vol 11 (6) ◽  
pp. 2866
Author(s):  
Damheo Lee ◽  
Donghyun Kim ◽  
Seung Yun ◽  
Sanghun Kim

In this paper, we propose a new method for code-switching (CS) automatic speech recognition (ASR) in Korean. First, the phonetic variations of English words as pronounced by Korean speakers should be considered, so we sought a unified pronunciation model based on phonetic knowledge and deep learning. Second, we extracted CS sentences semantically similar to the target domain and applied language model (LM) adaptation to counter the bias toward Korean caused by the imbalanced training data. In this experiment, the training data were AI Hub (1033 h) in Korean and Librispeech (960 h) in English. Compared to the baseline, the proposed method achieved an error reduction rate (ERR) of up to 11.6% with phonetic variant modeling and 17.3% when semantically similar sentences were applied to LM adaptation. Considering only English words, the word correction rate improved by up to 24.2% over the baseline. The proposed method appears to be very effective for CS speech recognition.
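The LM-adaptation step described above relies on pulling out CS sentences that are semantically close to the target domain. The sketch below is an assumed illustration, not the paper's implementation: it uses a plain bag-of-words cosine similarity rather than the authors' similarity measure, and the romanized Korean-English sentences are invented for the example.

```python
# Minimal sketch, not the paper's code: select code-switching (CS) sentences that
# are semantically close to a target domain, to be used for LM adaptation.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def select_similar(candidates, target_sentences, top_k=2):
    """Rank candidate CS sentences by similarity to the target-domain text."""
    target_bow = Counter(w for s in target_sentences for w in s.lower().split())
    scored = [(cosine(Counter(s.lower().split()), target_bow), s) for s in candidates]
    return [s for _, s in sorted(scored, key=lambda x: x[0], reverse=True)[:top_k]]

# Invented target domain and a pool of romanized Korean-English CS sentences.
target = ["please reboot the server and check the network log"]
pool = [
    "seobeo reboot hago network log hwaginhae juseyo",  # server/network related
    "ojeone coffee han jan masyeosseoyo",               # unrelated small talk
    "log file eul check haeya hamnida",                 # log-file related
]
print(select_similar(pool, target))  # keeps the two domain-related sentences
```

The selected sentences would then be added to, or up-weighted in, the adaptation corpus used to re-estimate the LM, reducing the bias toward the dominant (Korean) language.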


Author(s):  
Daoyuan Li ◽  
Tegawendé F. Bissyandé ◽  
Sylvain Kubler ◽  
Jacques Klein ◽  
Yves Le Traon

2013 ◽  
Author(s):  
Haşim Sak ◽  
Yun-hsuan Sung ◽  
Françoise Beaufays ◽  
Cyril Allauzen

2004 ◽  
Author(s):  
Dimitra Vergyri ◽  
Katrin Kirchhoff ◽  
Kevin Duh ◽  
Andreas Stolcke

10.5772/6380 ◽  
2008 ◽  
Author(s):  
Ebru Arısoy ◽  
Mikko Kurimo ◽  
Murat Saraçlar ◽  
Teemu Hirsimäki ◽  
Janne Pylkkönen ◽  
...  
