Visual Speech Recognition based on Improved type of Hidden Markov Model

Visual speech recognition can supplement acoustic speech information to improve the accuracy of speech recognition. A viseme, which describes the facial and oral movements that accompany the voicing of a particular phoneme, is regarded as a basic unit of speech in the visual domain. As with phonemes, the same viseme varies across speakers and even across utterances by the same speaker, so a classifier must be robust to this kind of variation. In this chapter, the authors describe the Adaptively Boosted (AdaBoost) Hidden Markov Model (HMM) technique (Foo, 2004; Foo, 2003; Dong, 2002). Applying the AdaBoost technique to HMM modeling yields a multi-HMM classifier that is more robust than a single HMM. The method is applied to identify context-independent and context-dependent visual speech units. Experimental results indicate that the AdaBoost HMM attains higher recognition accuracy than a conventional HMM.
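The boosting idea behind a multi-classifier ensemble can be illustrated with a minimal sketch. It shows one generic binary-style AdaBoost round (reweighting the training samples after a component classifier is evaluated) and the weighted vote that combines components. The abstract does not give the exact update rule used in the chapter, so the formula below is the textbook AdaBoost update and is an assumption; the function names `adaboost_round` and `weighted_vote` are hypothetical, and the component classifiers, which would be HMMs in the chapter's setting, are represented only by their per-sample correctness.

```python
import math


def adaboost_round(weights, correct):
    """One textbook AdaBoost round (assumed variant, not taken from the chapter).

    weights: current sample weights, summing to 1.
    correct: booleans, True where the component classifier was right.
    Returns (alpha, new_weights), where alpha is the component's vote weight.
    """
    # Weighted error is the total weight of the misclassified samples.
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Shrink the weights of correct samples and grow those of errors.
    raw = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(raw)
    return alpha, [r / total for r in raw]


def weighted_vote(predictions, alphas):
    """Combine component predictions by summing vote weights per label."""
    scores = {}
    for label, alpha in zip(predictions, alphas):
        scores[label] = scores.get(label, 0.0) + alpha
    return max(scores, key=scores.get)


if __name__ == "__main__":
    alpha, new_w = adaboost_round([0.25, 0.25, 0.25, 0.25], [True, True, True, False])
    print(alpha, new_w)
    print(weighted_vote(["a", "b", "b"], [1.0, 0.4, 0.4]))
```

After a round, the misclassified samples carry half of the total weight, so the next component is pushed toward the samples the previous one got wrong. This is one common way a boosted ensemble gains robustness to within-class variation.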


Lip feature extraction for visual speech recognition using Hidden Markov Model

2012 International Conference on Computing, Communication and Applications, 2012
DOI: 10.1109/iccca.2012.6179154
Cited by ~2

Author(s): P. Sujatha, M. Radha Krishnan

Keyword(s): Feature Extraction, Speech Recognition, Markov Model, Hidden Markov Model, Hidden Markov, Visual Speech, Visual Speech Recognition


A probabilistic principal component analysis based hidden Markov model for audio-visual speech recognition

2008 42nd Asilomar Conference on Signals, Systems and Computers, 2008
DOI: 10.1109/acssc.2008.5074819
Cited by ~2

Author(s): Zhanyu Ma, Arne Leijon

Keyword(s): Principal Component Analysis, Speech Recognition, Markov Model, Hidden Markov Model, Hidden Markov, Principal Component, Component Analysis, Visual Speech, Visual Speech Recognition, Probabilistic Principal Component Analysis


Visual Speech Recognition Using Optical Flow and Hidden Markov Model

Wireless Personal Communications, 2018, Vol 106 (4), pp. 2129-2147
DOI: 10.1007/s11277-018-5930-z

Author(s): Usha Sharma, Sushila Maheshkar, A. N. Mishra, Rahul Kaushik

Keyword(s): Speech Recognition, Markov Model, Hidden Markov Model, Optical Flow, Hidden Markov, Visual Speech, Visual Speech Recognition


Hidden Markov Model Based Visemes Recognition, Part II

Visual Speech Recognition, 2009, pp. 356-387
DOI: 10.4018/978-1-60566-186-5.ch012

Author(s): Say Wei Foo, Liang Dong

Keyword(s): Markov Model, Hidden Markov Model, Hidden Markov, Building Blocks, Visual Speech, Training Strategy, Model Based, Visual Speech Recognition, Training Approach, Channel Training

The basic building blocks of visual speech are the visemes. Unlike phonemes, however, visemes are confusable and easily distorted by the contexts in which they appear. Classifiers capable of distinguishing the minute differences among the categories are therefore desirable. In this chapter, we describe two Hidden Markov Model based techniques that use a discriminative approach to increase the accuracy of visual speech recognition: the Maximum Separable Distance (MSD) training strategy (Dong, 2005) and the Two-channel training approach (Dong, 2005; Foo, 2003; Foo, 2002). Both adopt a proposed criterion function, called separable distance, to improve the discriminative power of an HMM. The methods are applied to identify confusable visemes. Experimental results indicate that these approaches attain higher recognition accuracy than a conventional HMM.


Implementation of Android Based Speech Recognition for Indonesian Geography Dictionary

Jurnal ULTIMA Computing, 2016, Vol 7 (2), pp. 76-82
DOI: 10.31937/sk.v7i2.296

Author(s): Hugeng Hugeng, Edbert Hansel

Keyword(s): Speech Recognition, Markov Model, Hidden Markov Model, Hidden Markov, Spoken Word, Silent Condition, Word Detection, Index Terms, To Receive, Noisy Condition

We have built GAIA, a speech recognition application for an Indonesian geography dictionary that runs on the Android operating system. The application uses a smartphone to receive input in the form of a word spoken by the user. Recognition is based on the Hidden Markov Model implementation in the Pocketsphinx library, and the phonemes follow Indonesian phoneme rules. An advantage of the application is that it can be used without internet access. In testing, word detection was evaluated under four conditions to determine the level of accuracy: near silent, near noisy, far silent, and far noisy. From the testing and analysis, we conclude that GAIA can be built as a speech recognition application on Android for an Indonesian geography dictionary. Average word recognition accuracy reaches 52.87% in the near silent condition, 14.5% in the near noisy condition, 23.2% in the far silent condition, and 2.8% in the far noisy condition. Index Terms: speech recognition, Indonesian geography dictionary, Hidden Markov Model, Pocketsphinx, Android.
