Speech recognition based on zero crossing rate and energy

1985 ◽  
Vol 33 (1) ◽  
pp. 320-323 ◽  
Author(s):  
Yiu-Kei Lau ◽  
Chok-Ki Chan
Author(s):  
Kai Zhao ◽  
Dan Wang

To address the low recognition rate of existing speech recognition methods, a speech recognition method for a multi-layer perceptual network environment is proposed. In this environment, the speech signal is first filtered according to the transfer function of the filter. The signal is then divided into frames and windowed, and its silent segments are removed. The average energy and the zero crossing rate of the speech signal are computed to extract its features. Based on an analysis of the principle of speech signal recognition, the recognition process is designed and speech recognition in the multi-layer perceptual network environment is realized. The experimental results show that the proposed method has good speech recognition performance.
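The framing, windowing, silence removal, and feature steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Hamming window, the frame length, the hop size, and the `energy_floor` silence threshold are all assumptions chosen for illustration, and the pre-filtering stage is omitted.

```python
import math

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames; a trailing partial frame is dropped."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def hamming(n):
    """Hamming window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def average_energy(frame, window):
    """Mean squared amplitude of the windowed frame."""
    return sum((s * w) ** 2 for s, w in zip(frame, window)) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(x, frame_len=200, hop=100, energy_floor=1e-4):
    """Return (energy, zcr) for every frame that is not treated as silence."""
    window = hamming(frame_len)
    feats = []
    for frame in frame_signal(x, frame_len, hop):
        e = average_energy(frame, window)
        if e < energy_floor:
            continue  # hypothetical silence rule: drop very low-energy frames
        feats.append((e, zero_crossing_rate(frame)))
    return feats
```

The zero crossing rate is computed on the unwindowed frame here, since a positive window does not change the sign of any sample.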


2015 ◽  
Vol 14 (4) ◽  
pp. 5607-5615
Author(s):  
Eman Karam Elsayed

The importance of oral exams follows from the importance of the knowledge conveyed in speech. In this paper we integrate BOW (Bag of Words), LSA (Latent Semantic Analysis), ASR (automatic speech recognition), zero crossing rate, and an ontology-based approach to automate online oral exams, especially in Arabic, while taking the authentication problem into consideration. The proposed method faces many challenges in Arabic because there is no semantic dictionary comparable to WordNet for English or HowNet for Chinese, and because Arabic has complicated synonyms. Our proposal can help improve meaningfulness. Finally, the proposed method also automates feedback for determining learning disability.


2021 ◽  
Vol 39 (1B) ◽  
pp. 1-10
Author(s):  
Iman H. Hadi ◽  
Alia K. Abdul-Hassan

Speaker recognition depends on specific predefined steps, the most important of which are feature extraction and feature matching. In addition, the category of speaker voice features affects the recognition process. The proposed speaker recognition method uses biometric (voice) attributes to recognize the identity of the speaker. Long-term features such as maximum frequency, pitch, and zero crossing rate (ZCR) are used. In the feature matching step, the fuzzy inner product between feature vectors is used to compute the matching value between a claimed speaker's voice utterance and test voice utterances. The experiments were carried out on the ELSDSR data set and showed a recognition accuracy of 100% for text-dependent speaker recognition.
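The abstract does not define its fuzzy inner product. One common definition in fuzzy set theory is the max-min form, where the score is the largest of the element-wise minima. The sketch below assumes that definition and assumes the features have already been normalized to [0, 1]; both are assumptions, not details taken from the paper.

```python
def fuzzy_inner_product(a, b):
    """Max-min inner product of two equal-length vectors with entries in [0, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return max(min(x, y) for x, y in zip(a, b))

# Hypothetical normalized features: (maximum frequency, pitch, ZCR)
claimed = [0.2, 0.9, 0.5]
test = [0.3, 0.8, 0.4]
score = fuzzy_inner_product(claimed, test)
```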


1995 ◽  
Vol 88 (7) ◽  
pp. 851-855
Author(s):  
Koji MIYATA ◽  
Kazuhiko SHOJI ◽  
Hisayoshi KOJIMA ◽  
Shigeru HIRANO ◽  
Shogo SHINHOARA ◽  
...  

2012 ◽  
Vol 239-240 ◽  
pp. 409-414
Author(s):  
Shi Lin Liu ◽  
Zheng Pei

An improved scheme based on decision trees is proposed for robust endpoint detection in noisy environments. First, the noise level of the environment is estimated by wavelet decomposition, and this level determines whether denoising is applied. Next, the thresholds for the signal are obtained by decision trees. Finally, endpoints are detected by double thresholds that weight the energy and the zero-crossing rate (ZCR) differently according to the situation. The simulation results indicate that the proposed method based on noise estimation achieves the same accuracy as the decision-tree method with less computation.
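For orientation, a classic double-threshold endpoint detector on per-frame energy and ZCR can be sketched as follows. This is the textbook scheme, not the paper's method: the thresholds `e_high`, `e_low`, and `z_thresh` are hand-picked, hypothetical parameters, whereas the paper obtains its thresholds from decision trees after noise estimation.

```python
def detect_endpoints(energy, zcr, e_high, e_low, z_thresh):
    """Return (start, end) frame indices of the first speech segment, or None.

    energy, zcr: per-frame short-time energy and zero-crossing rate.
    e_high, e_low, z_thresh: assumed, hand-picked thresholds.
    """
    # 1. Find the first frame that is clearly speech (energy above e_high).
    core = next((i for i, e in enumerate(energy) if e > e_high), None)
    if core is None:
        return None
    # 2. Grow the segment in both directions while energy stays above e_low.
    start = core
    while start > 0 and energy[start - 1] > e_low:
        start -= 1
    end = core
    while end < len(energy) - 1 and energy[end + 1] > e_low:
        end += 1
    # 3. Extend further while ZCR is high, to keep low-energy unvoiced sounds.
    while start > 0 and zcr[start - 1] > z_thresh:
        start -= 1
    while end < len(zcr) - 1 and zcr[end + 1] > z_thresh:
        end += 1
    return start, end
```

The high threshold guards against false triggers on noise, while the low energy threshold and the ZCR test recover the weaker onset and offset of the utterance.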

