A Chinese Small Vocabulary Offline Speech Recognition System Based on Pocketsphinx in Android Platform

2014 ◽  
Vol 623 ◽  
pp. 267-273
Author(s):  
Xin Fei Liu ◽  
Hui Zhou

This paper describes a Chinese small-vocabulary offline speech recognition system based on PocketSphinx. Its acoustic models are regenerated by improving the existing Sphinx models, and its language model is generated with the online LMTool. The offline recognition system is built in an Android development environment on Linux and runs on Android smartphones. Experimental results show that the system, used to recognize voice commands for a cell phone, has good recognition performance.

Author(s):  
Sonal Anilkumar Tiwari

Abstract: It is intriguing to think that we can give commands to inanimate objects. This is possible with the help of automatic speech recognition (ASR) systems. A speech recognition system lets humans talk to machines. Speech recognition is now so widespread that many people depend on it for their everyday work and have made a habit of it: on a mobile phone, for example, we can issue a voice command instead of typing. This reduces our effort and saves a lot of our time. Keywords: Speech, Speech Recognition, ASR, Corpus, PRAAT


Author(s):  
Qi Yue ◽  
Weiliang Shi ◽  
Yi He ◽  
Jing Chu ◽  
Zhan Han ◽  
...  

2009 ◽  
Vol 2 (4) ◽  
pp. 67-80 ◽  
Author(s):  
Mohamed Ali ◽  
Moustafa Elshafei ◽  
Mansour Al-Ghamdi ◽  
Husni Al-Muhtaseb

Phonetic dictionaries are essential components of large-vocabulary speaker-independent speech recognition systems. This paper presents a rule-based technique to generate phonetic dictionaries for a large-vocabulary Arabic speech recognition system. The system used conventional Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, as well as some common dialectal cases. The paper explains these rules in detail and gives their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The rules and the phone set were tested and evaluated on an Arabic speech recognition system. The system was trained on 4.3 hours of the 5.4-hour Arabic broadcast news corpus and tested on the remaining 1.1 hours. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The language model contains both bi-grams and tri-grams. The Word Error Rate (WER) came to 9.0%.
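For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance between the reference transcript and the recognizer output (substitutions, deletions, and insertions), divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation code used in the paper; the function name `word_error_rate` and the example sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); assumes whitespace-separated words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("open the camera now", "open camera now"))
```

Under this definition, a WER of 9.0% corresponds to roughly nine word errors per hundred reference words in the test set.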

