Topic-Specific Language Model Based on Graph Spectral Approach for Speech Recognition

Author(s):  
Shinya Takahashi
2021 ◽  
Author(s):  
Cheng Yi ◽  
Jianzong Wang ◽  
Ning Cheng ◽  
Shiyu Zhou ◽  
Bo Xu

With the Internet and World Wide Web revolution, large corpora in a variety of forms are growing continuously and can be regarded as big data. One important use of such corpora is language modeling for large-vocabulary continuous speech recognition. Language modeling is an indispensable module in a speech recognition architecture and plays a vital role in reducing the search space during recognition. Moreover, a language model that is close to the domain of the speech can shrink the search space and raise recognition accuracy. In this paper, an efficient search mechanism for domain-specific document retrieval from large corpora is presented, using Elasticsearch, a distributed and efficient search engine for big data. This helped us tune the language model to the domain and reduced the search time by more than 90% compared with the conventional search and retrieval mechanism used in our earlier work. Word-level and phrase-level retrieval processes for creating a domain-specific language model have been implemented. The system is evaluated on the basis of the word error rate (WER) and perplexity (PPL) of the speech recognition system. The results show a nearly 10% decrease in WER and a major reduction in PPL, which helped boost the performance of the speech recognition process. From the results, it can be concluded that Elasticsearch is an efficient alternative to topic modeling toolkits for domain-specific document retrieval from large corpora.
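The two evaluation measures named in this abstract, WER and PPL, can be illustrated with a minimal sketch. It uses the standard textbook definitions (word-level edit distance divided by reference length, and the exponential of the average negative log-probability); it is not the authors' evaluation code, and the function names `word_error_rate` and `perplexity` are hypothetical.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def perplexity(word_probs) -> float:
    """PPL = exp(-(1/N) * sum(log p_i)) for the N per-word model probabilities."""
    probs = list(word_probs)
    if not probs or any(p <= 0.0 for p in probs):
        raise ValueError("need at least one probability, all strictly positive")
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words, and a model that assigns probability 0.25 to every word has a perplexity of about 4.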


2016 ◽  
Vol 25 (02) ◽  
pp. 1650006
Author(s):  
Aleksander Smywinski-Pohl ◽  
Bartosz Ziółko

In this paper we investigate the usefulness of morphosyntactic information as well as clustering in modeling Polish for automatic speech recognition. Polish is an inflectional language, so we investigate the usefulness of an N-gram model based on morphosyntactic features. We present how individual types of features influence the model and which types of features are best suited for building a language model for automatic speech recognition. We compare the results of applying them with a class-based model that is automatically derived from the training corpus. We show that our approach to clustering performs significantly better than the frequently used SRI LM clustering method. However, this difference is apparent only for smaller corpora.
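The idea of a class-based N-gram model can be sketched with the standard bigram factorization P(w2 | w1) ≈ P(c2 | c1) · P(w2 | c2), where c1 and c2 are the classes of the two words. The sketch below is an illustration only, not the model from this paper: it uses unsmoothed maximum-likelihood estimates, and the function name `train_class_bigram` and the word-to-class mapping are hypothetical (the classes could be morphosyntactic tags or automatically derived clusters).

```python
from collections import Counter


def train_class_bigram(sentences, word_to_class):
    """Return prob(prev_word, word) under a class-based bigram model.

    Assumes every word belongs to exactly one class (hypothetical mapping).
    """
    class_bigrams = Counter()  # counts of (previous class, current class)
    class_history = Counter()  # counts of a class appearing as the history
    word_counts = Counter()    # counts of each word
    class_counts = Counter()   # counts of each class over all tokens

    for sentence in sentences:
        words = sentence.split()
        classes = [word_to_class[w] for w in words]
        for w, c in zip(words, classes):
            word_counts[w] += 1
            class_counts[c] += 1
        for c1, c2 in zip(classes, classes[1:]):
            class_bigrams[(c1, c2)] += 1
            class_history[c1] += 1

    def prob(prev_word, word):
        c1, c2 = word_to_class[prev_word], word_to_class[word]
        if class_history[c1] == 0 or class_counts[c2] == 0:
            return 0.0  # no smoothing in this sketch
        p_class = class_bigrams[(c1, c2)] / class_history[c1]  # P(c2 | c1)
        p_word = word_counts[word] / class_counts[c2]           # P(word | c2)
        return p_class * p_word

    return prob


# Toy usage with made-up data.
toy_classes = {"dogs": "NOUN", "cats": "NOUN", "bark": "VERB", "sleep": "VERB"}
toy_prob = train_class_bigram(["dogs bark", "cats sleep", "dogs sleep"], toy_classes)
```

In this toy corpus the word pair "cats bark" never occurs, yet `toy_prob("cats", "bark")` is nonzero because the class pair NOUN, VERB was observed. Sharing statistics across words of the same class in this way is the usual motivation for class-based models when data is sparse, as with the many inflected forms of a language like Polish.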

