Comparison of Hidden Markov Model and Recurrent Neural Network in Automatic Speech Recognition

2020 ◽  
Vol 5 (8) ◽  
pp. 958-965
Author(s):  
Akshay Madhav Deshmukh

Understanding human speech precisely by machine has been a major challenge for many years. Automatic Speech Recognition (ASR) is decades old, and although the technology has not yet reached the point where machines understand all speech, it is used regularly in many applications and services. Hence, to advance research it is important to identify significant research directions, specifically those that have not been pursued or funded in the past. The performance of ASR systems, traditionally built upon a Hidden Markov Model (HMM), has improved due to the application of Deep Neural Networks (DNNs). Despite this progress, building an ASR system remains a challenging task requiring multiple resources and training stages. The use of DNNs for ASR has moved from being a single component in a pipeline to forming the basis of systems built mainly on such a network. This paper provides a literature survey of state-of-the-art research on two major models, namely the Deep Neural Network - Hidden Markov Model (DNN-HMM) and Recurrent Neural Networks trained with Connectionist Temporal Classification (RNN-CTC). It also describes the differences between these two models at the architectural level.
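One architectural difference mentioned in the abstract above can be illustrated at the output stage: an RNN-CTC system emits one label distribution per frame, including a special blank label, and the transcript is obtained by collapsing repeated labels and dropping blanks. The following is a minimal sketch of greedy (best-path) CTC decoding, not the method of the surveyed paper; the function name `ctc_greedy_decode`, the label set, and the toy frame probabilities are assumptions made for illustration.

```python
def ctc_greedy_decode(frame_probs, labels, blank=0):
    """Greedy CTC decoding: take the best label per frame, merge repeats, drop blanks.

    frame_probs: list of per-frame probability lists (one entry per label).
    labels: list of label strings; labels[blank] is the CTC blank symbol.
    """
    # Pick the most probable label index in every frame.
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]

    decoded = []
    prev = None
    for idx in best:
        # Emit a label only when it differs from the previous frame and is not blank.
        if idx != prev and idx != blank:
            decoded.append(labels[idx])
        prev = idx
    return "".join(decoded)


# Hypothetical toy example: 3 labels (blank, "a", "b") over 7 frames.
labels = ["-", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],  # a
    [0.2, 0.7, 0.1],  # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.1, 0.8, 0.1],  # a
    [0.1, 0.1, 0.8],  # b
    [0.1, 0.2, 0.7],  # b (repeat, merged)
    [0.8, 0.1, 0.1],  # blank
]
print(ctc_greedy_decode(frames, labels))
```

The blank between the two runs of "a" is what lets the decoder keep a genuine double letter, whereas a DNN-HMM system would instead rely on an explicit state-transition model and a separate alignment step.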

This paper presents a brief review of Automatic Speech Recognition (ASR) and provides a technical understanding of an ASR system. The objective of this review is to elaborate on one of the best techniques in the field of speech recognition, the hidden Markov model (HMM). The HMM is a very popular technique for speech recognition because a speech signal behaves like a piecewise stationary or short-time stationary signal, and because these models can be trained easily and are computationally feasible. The paper therefore presents an implementation of the hidden Markov model. After many years of research, the main challenge in the speech recognition field remains accuracy. A speech recognition system includes feature extraction, building word templates, comparing words, and selecting the best match by maximum likelihood. The paper thus aims to contribute to the understanding of the concepts behind ASR systems and the hidden Markov model.
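The maximum-likelihood selection step described above can be sketched with the forward algorithm: each candidate word has its own HMM, the likelihood of the observed feature sequence is computed under every model, and the word with the highest likelihood is chosen. This is a minimal illustration under simplifying assumptions (discrete observation symbols, tiny hand-written models), not the implementation from the paper; the names `forward_log_likelihood`, `recognize`, and the two toy word models are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM using the scaled forward algorithm.

    obs:   list of observation symbol indices
    start: start[i] = P(first state is i)
    trans: trans[i][j] = P(next state is j | current state is i)
    emit:  emit[i][k] = P(symbol k | state i)
    """
    n_states = len(start)
    # Initial forward variables for the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale at every step to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def recognize(obs, word_models):
    """Pick the word whose HMM assigns the highest likelihood to obs."""
    return max(word_models, key=lambda w: forward_log_likelihood(obs, *word_models[w]))


# Hypothetical two-state, left-to-right models over two quantized feature symbols.
start = [1.0, 0.0]
trans = [[0.6, 0.4], [0.0, 1.0]]
word_models = {
    "A": (start, trans, [[0.9, 0.1], [0.8, 0.2]]),  # mostly emits symbol 0
    "B": (start, trans, [[0.1, 0.9], [0.2, 0.8]]),  # mostly emits symbol 1
}
print(recognize([0, 0, 0], word_models))
```

In a real recognizer the observation symbols would come from the feature-extraction stage (for example, quantized or continuous acoustic features), and the model parameters would be trained rather than written by hand.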


2016 ◽  
Vol 7 (2) ◽  
pp. 76-82
Author(s):  
Hugeng Hugeng ◽  
Edbert Hansel

We have built a speech recognition application for an Indonesian geography dictionary on the Android operating system, named GAIA. The application uses a smartphone to receive input in the form of a word spoken by the user. Recognition uses the Hidden Markov Model approach provided by the Pocketsphinx library. The phonemes used follow Indonesian phoneme rules. An advantage of this application is that it can be used without internet access. In testing, word detection was performed under four conditions to determine the level of accuracy: near silent, near noisy, far silent, and far noisy. From the testing and analysis conducted, it can be concluded that GAIA can be built as a speech recognition application on Android for an Indonesian geography dictionary. Word recognition accuracy reaches an average of 52.87% in the near silent condition, 14.5% in the near noisy condition, 23.2% in the far silent condition, and 2.8% in the far noisy condition. Index Terms: speech recognition, Indonesian geography dictionary, Hidden Markov Model, Pocketsphinx, Android.

