Lip reading using neural networks

Author(s):  
Dhananjay Kalbande ◽  
Akassh A Mishra ◽  
Sanjivani Patil ◽  
Sneha Nirgudkar ◽  
Prashant Patel
Keyword(s):  
2019 ◽  
Vol 88 ◽  
pp. 76-83
Author(s):  
Abderrahim Mesbah ◽  
Aissam Berrahou ◽  
Hicham Hammouchi ◽  
Hassan Berbia ◽  
Hassan Qjidaa ◽  
...  

Author(s):  
Rishabh Nevatia

Abstract: Lip reading is the visual task of interpreting phrases from lip movements. Although speech is one of the most common ways people communicate, understanding what a person wants to convey from their lip movements alone is, to date, a task for which no standard paradigm has emerged. Automated lip reading involves several stages, ranging from feature extraction to the application of neural networks. This paper covers various deep learning approaches used for lip reading.
Keywords: Automatic Speech Recognition, Lip Reading, Neural Networks, Feature Extraction, Deep Learning


2021 ◽  
Vol 11 (2) ◽  
pp. 6986-6992
Author(s):  
L. Poomhiran ◽  
P. Meesad ◽  
S. Nuanmeesri

This paper proposes a lip reading method based on convolutional neural networks applied to the Concatenated Three Sequence Keyframe Image (C3-SKI), which consists of (a) the Start-Lip Image (SLI), (b) the Middle-Lip Image (MLI), and (c) the End-Lip Image (ELI), which marks the end of the pronunciation of the syllable. The lip area was reduced to 32×32 pixels per image frame, and three keyframes concatenated together, with a dimension of 96×32 pixels, were used to represent one syllable for visual speech recognition. The three concatenated keyframes representing a syllable are selected based on the relative maximum and relative minimum of the open lip’s width and height. The evaluation showed accuracy, validation accuracy, loss, and validation loss values of 95.06%, 86.03%, 4.61%, and 9.04%, respectively, on the THDigits dataset. The C3-SKI technique was also applied to the AVDigits dataset, achieving 85.62% accuracy. In conclusion, the C3-SKI technique can be applied to lip reading recognition.
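
The concatenation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `concat_keyframes` is hypothetical, the frames are plain nested lists of grayscale values to keep the example dependency-free, and horizontal (side-by-side) concatenation is an assumption inferred from the stated 96×32 dimension. Keyframe selection from the lip width and height extrema is not shown, since the abstract does not give the exact rule.

```python
FRAME_SIZE = 32  # each lip-area keyframe is 32x32 pixels, per the abstract


def concat_keyframes(start, middle, end):
    """Join three 32x32 keyframes side by side into one image.

    Each frame is a list of 32 rows, each row a list of 32 pixel values.
    Assumption: the frames are placed left to right (SLI, MLI, ELI),
    giving a result 96 pixels wide and 32 pixels high.
    """
    for frame in (start, middle, end):
        if len(frame) != FRAME_SIZE or any(len(row) != FRAME_SIZE for row in frame):
            raise ValueError("each keyframe must be 32x32 pixels")
    # Concatenate matching rows of the three frames.
    return [s + m + e for s, m, e in zip(start, middle, end)]


def solid_frame(value):
    """Build a placeholder 32x32 frame filled with one grayscale value."""
    return [[value] * FRAME_SIZE for _ in range(FRAME_SIZE)]


# Placeholder frames standing in for the start, middle, and end lip images.
sli = solid_frame(0)
mli = solid_frame(128)
eli = solid_frame(255)

c3ski = concat_keyframes(sli, mli, eli)
print(len(c3ski[0]), "x", len(c3ski))  # width x height of the combined image
```

In a real pipeline the three inputs would be cropped, resized lip images rather than solid placeholders, and the combined image would be passed to the convolutional network as the representation of one syllable.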

