Comparison of Hidden Markov Model and Recurrent Neural Network in Automatic Speech Recognition

Getting a machine to understand human speech precisely has been a major challenge for many years. Automatic Speech Recognition (ASR) is decades old, and although the technology has not reached the point where machines understand all speech, it is used regularly in many applications and services. To advance research, it is therefore important to identify significant research directions, specifically those that have not been pursued or funded in the past. The performance of ASR systems, traditionally built upon a Hidden Markov Model (HMM), has improved through the application of Deep Neural Networks (DNNs). Despite this progress, building an ASR system remains a challenging task that requires multiple resources and training stages. The use of DNNs for ASR has since moved from being a single component in a pipeline to forming the basis of the whole system. This paper provides a literature survey of state-of-the-art research on two major models, namely the Deep Neural Network - Hidden Markov Model (DNN-HMM) and Recurrent Neural Networks trained with Connectionist Temporal Classification (RNN-CTC). It also describes the differences between these two models at the architectural level.
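
One architectural difference between the two model families can be illustrated with a short sketch. A CTC-trained network emits one label per frame, including a special blank, and the transcript is obtained by merging consecutive repeats and then dropping blanks, so no frame-level alignment to HMM states is needed. The code below is a minimal illustration of that general CTC collapsing rule with greedy (best-path) decoding; it is not taken from the surveyed paper, and the blank symbol `_`, the function names, and the toy alphabet are assumptions made for the example.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol; real systems usually reserve an index for it


def ctc_collapse(frame_labels):
    """Apply the CTC collapsing rule: merge consecutive repeats, then drop blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != BLANK)


def ctc_greedy_decode(frame_probs, alphabet):
    """Pick the most probable symbol in each frame, then collapse the sequence.

    frame_probs: one list of per-symbol probabilities per frame.
    alphabet:    symbols in the same order as the probability columns.
    """
    best = [alphabet[max(range(len(row)), key=row.__getitem__)] for row in frame_probs]
    return ctc_collapse(best)
```

For instance, `ctc_collapse("hh_e_ll_ll_oo")` gives `"hello"`: the blank between the two runs of `l` is what lets a doubled letter survive the merge step.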

Hybrid Hidden Markov Model and Artificial Neural Network for Automatic Speech Recognition

2009 Pacific-Asia Conference on Circuits, Communications and Systems ◽

10.1109/paccs.2009.138 ◽

2009 ◽

Cited By ~ 4

Author(s):

Xian Tang

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Automatic Speech Recognition ◽

Hidden Markov ◽

Artificial Neural

HYBRID NEURAL NETWORK/HIDDEN MARKOV MODEL SYSTEMS FOR CONTINUOUS SPEECH RECOGNITION

Series in Machine Perception and Artificial Intelligence - Advances in Pattern Recognition Systems Using Neural Network Technologies ◽

10.1142/9789812797926_0015 ◽

1994 ◽

pp. 255-272

Author(s):

NELSON MORGAN ◽

HERVÉ BOURLARD ◽

STEVE RENALS ◽

MICHAEL COHEN ◽

HORACIO FRANCO

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Model Systems ◽

Continuous Speech ◽

Continuous Speech Recognition ◽

Hybrid Neural Network

Speech Recognition Implementation

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i1116.0789s419 ◽

2019 ◽

Vol 8 (9S4) ◽

pp. 111-116

Keyword(s):

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Automatic Speech Recognition ◽

Hidden Markov ◽

Recognition System ◽

Speech Recognition System ◽

Automatic Speech Recognition System ◽

Main Challenge ◽

Short Time

This paper presents a brief review of Automatic Speech Recognition (ASR) and provides a technical understanding of an ASR system. The objective of this review is to elaborate on one of the best techniques in the field of speech recognition, the hidden Markov model (HMM). The HMM is a very popular technique for speech recognition because a speech signal can be treated as a piecewise stationary, or short-time stationary, signal, and because these models are easy to train and computationally feasible. The paper therefore presents an implementation of the hidden Markov model. After many years of research, the main challenge in the speech recognition field remains accuracy. The speech recognition system includes feature extraction, building word templates, comparing words, and selecting the best match by maximum likelihood. The paper thus contributes to understanding the concepts of an ASR system and the hidden Markov model.
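
The final step described above, selecting the word whose model gives the maximum likelihood, can be sketched with the standard HMM forward algorithm. This is a minimal illustration, not the paper's implementation: it assumes discrete observation symbols (real systems typically score continuous acoustic feature vectors), and the function names and the two toy word models are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM using the scaled forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the previous (normalised) alphas one step forward.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like


def recognize(obs, word_models):
    """Pick the word whose HMM assigns the observation sequence the highest likelihood."""
    return max(word_models, key=lambda w: forward_log_likelihood(obs, *word_models[w]))


# Hypothetical two-state left-to-right models over the symbols {0, 1}.
WORD_MODELS = {
    "yes": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "no": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}
```

With these toy models, `recognize([0, 0, 1, 1], WORD_MODELS)` selects `"yes"` and `recognize([1, 1, 0, 0], WORD_MODELS)` selects `"no"`. Working in log space with per-step rescaling keeps the computation stable for long observation sequences.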

A comparative review of dynamic neural networks and hidden Markov model methods for mobile on-device speech recognition

Neural Computing and Applications ◽

10.1007/s00521-017-3028-2 ◽

2017 ◽

Vol 31 (S2) ◽

pp. 891-899 ◽

Cited By ~ 23

Author(s):

Mohammed Kyari Mustafa ◽

Tony Allen ◽

Kofi Appiah

Keyword(s):

Neural Networks ◽

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Dynamic Neural Networks ◽

Comparative Review

A HYBRID SPEECH RECOGNITION SYSTEM WITH HIDDEN MARKOV MODEL AND RADIAL BASIS FUNCTION NEURAL NETWORK

American Journal of Applied Sciences ◽

10.3844/ajassp.2013.1148.1153 ◽

2013 ◽

Vol 10 (10) ◽

pp. 1148-1153

Author(s):

Justin

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Radial Basis Function ◽

Basis Function ◽

Hidden Markov ◽

Recognition System ◽

Speech Recognition System ◽

Radial Basis

From artificial neural network inversion to hidden Markov model inversion: application to robust speech recognition

Proceedings of 1995 IEEE Workshop on Neural Networks for Signal Processing ◽

10.1109/nnsp.1995.514899 ◽

2002 ◽

Author(s):

Seokyong Moon ◽

Jenq-Neng Hwang

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Robust Speech Recognition ◽

Model Inversion ◽

Neural Network Inversion ◽

Artificial Neural

Time-inhomogeneous hidden Bernoulli model: An alternative to hidden Markov model for automatic speech recognition

2008 IEEE International Conference on Acoustics, Speech and Signal Processing ◽

10.1109/icassp.2008.4518556 ◽

2008 ◽

Cited By ~ 2

Author(s):

Jahanshah Kabudian ◽

M. Mehdi Homayounpour ◽

S. Mohammad Ahadi

Keyword(s):

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Automatic Speech Recognition ◽

Hidden Markov ◽

Bernoulli Model

Hidden Markov model/neural network training techniques for connected alphadigit speech recognition

[Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing ◽

10.1109/icassp.1991.150290 ◽

1991 ◽

Cited By ~ 5

Author(s):

M.M. Hochberg ◽

L.T. Niles ◽

J.T. Foote ◽

H.F. Silverman

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Neural Network Training ◽

Training Techniques ◽

Network Training

Speech recognition via Hidden Markov Model and neural network trained by genetic algorithm

2010 International Conference on Machine Learning and Cybernetics ◽

10.1109/icmlc.2010.5580758 ◽

2010 ◽

Cited By ~ 3

Author(s):

Shing-Tai Pan ◽

Ching-Fa Chen ◽

Jian-Hong Zeng

Keyword(s):

Neural Network ◽

Genetic Algorithm ◽

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov

Implementation of Android Based Speech Recognition for Indonesian Geography Dictionary

Jurnal ULTIMA Computing ◽

10.31937/sk.v7i2.296 ◽

2016 ◽

Vol 7 (2) ◽

pp. 76-82

Author(s):

Hugeng Hugeng ◽

Edbert Hansel

Keyword(s):

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Spoken Word ◽

Silent Condition ◽

Word Detection ◽

Index Terms ◽

To Receive ◽

Noisy Condition

We have built a speech recognition application for an Indonesian geography dictionary on the Android operating system, named GAIA. The application uses a smartphone to receive input in the form of a word spoken by the user. Recognition uses the Hidden Markov Model approach provided by the Pocketsphinx library. The phonemes used follow Indonesian phoneme rules. An advantage of the application is that it can be used without internet access. In testing, word detection was carried out under four conditions to determine the level of accuracy: near silent, near noisy, far silent, and far noisy. From the testing and analysis, we conclude that GAIA can be built as a speech recognition application on Android for an Indonesian geography dictionary. Average word recognition accuracy reached 52.87% in the near silent condition, 14.5% in the near noisy condition, 23.2% in the far silent condition, and 2.8% in the far noisy condition. Index Terms: speech recognition, Indonesian geography dictionary, Hidden Markov Model, Pocketsphinx, Android.
