Visual Speech Recognition using Convolutional Neural Network

2021 ◽  
Vol 1084 (1) ◽  
pp. 012020
Author(s):  
B Soundarya ◽  
R Krishnaraj ◽  
S Mythili
2021 ◽  
Author(s):  
Shashidhar R ◽  
S Patilkulkarni ◽  
Nishanth S Murthy

Abstract Communication is about expressing one's thoughts to another person through speech and facial expressions. For people with hearing impairment, however, it is difficult to communicate without assistance. In most of these cases, visual speech recognition (VSR) systems simplify the task by using machine learning algorithms to help them understand speech and socialize without depending on auditory perception. A VSR system can thus be seen as a lifeline for people with hearing impairment, giving them a way to understand the words being conveyed to them through speech. In this work we used the VGG16 convolutional neural network architecture on Kannada and English datasets. Using a custom dataset, we obtained an accuracy of 90.10% on the English database and 91.90% on the Kannada database.


2021 ◽  
Author(s):  
Shashidhar R ◽  
Sudarshan Patil Kulkarni

Abstract Audio-visual speech recognition (AVSR) is an emerging field of research, but appropriate visual features for recognizing visual speech are still lacking. Human lip-readers are increasingly presented as useful in gathering forensic evidence but, like all humans, they are unreliable in analyzing lip movement. Here we use a custom dataset and design a system that predicts the output for lip reading. Speaker-independent lip reading is very demanding because of unpredictable variations between people. Owing to recent advances in signal processing and computer vision, automating lip reading is also becoming a field of great interest. We use MFCC techniques for audio processing and an LSTM method for visual speech recognition, and finally integrate the audio and video using a feed-forward neural network (FFNN). This is why AVSR attracts great attention as a reliable solution to the speech detection problem. The final model was capable of making more appropriate decisions when predicting the spoken word, and it reached an accuracy of about 92.38%.
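
The fusion step described in this abstract (audio features and visual features combined by a feed-forward network) can be illustrated with a minimal sketch. It is not the authors' implementation: the feature sizes, the layer widths, the random untrained weights, and every function name below are assumptions made for illustration, and the MFCC and LSTM stages are replaced by placeholder vectors.

```python
import math
import random


def dense(x, weights, bias):
    """One fully connected layer: computes W @ x + b on plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(audio_feat, visual_feat, params):
    """Concatenate both modalities, then apply a two-layer FFNN."""
    x = list(audio_feat) + list(visual_feat)
    (w1, b1), (w2, b2) = params
    hidden = relu(dense(x, w1, b1))
    return softmax(dense(hidden, w2, b2))


# Hypothetical sizes: 13 audio coefficients, 8 visual features, 5 words.
AUDIO_DIM, VISUAL_DIM, HIDDEN, N_WORDS = 13, 8, 16, 5
rng = random.Random(0)
params = (init_layer(HIDDEN, AUDIO_DIM + VISUAL_DIM, rng),
          init_layer(N_WORDS, HIDDEN, rng))

# Placeholder vectors standing in for MFCC output and LSTM output.
audio_feat = [rng.gauss(0, 1) for _ in range(AUDIO_DIM)]
visual_feat = [rng.gauss(0, 1) for _ in range(VISUAL_DIM)]

probs = fuse_and_classify(audio_feat, visual_feat, params)
predicted_word = max(range(N_WORDS), key=probs.__getitem__)
```

Concatenating the two vectors before the first layer is one common way to fuse modalities; the abstract does not say whether the authors fuse features or decisions, so this choice is an assumption.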


Author(s):  
Kingston Pal Thamburaj ◽  
Kartheges Ponniah ◽  
Ilangkumaran Sivanathan ◽  
Muniisvaran Kumar

Human-computer interaction has become part of day-to-day life. Speech is one of the essential and comfortable ways of interacting with devices as well as with other human beings. Devices, particularly smartphones, have multiple sensors such as a camera and a microphone. Speech recognition is the process of converting the acoustic signal received by a smartphone into a set of words. An efficient speech recognition system greatly enhances the interaction between humans and machines by making the latter more receptive to user needs. The recognized words can be used in many applications such as command and control, data entry, and document preparation. This research paper highlights speech recognition through ANN (Artificial Neural Network). A hybrid model is also proposed for audio-visual speech recognition of the Tamil and Malay languages through SOM (Self-Organizing Map) and MLP (Multilayer Perceptron). The effectiveness of the different NN (Neural Network) models used in speech recognition will be examined.
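
As a rough illustration of how a SOM could feed an MLP in a hybrid model of this kind, the sketch below trains a small one-dimensional self-organizing map and turns its winning unit into a one-hot code that a downstream MLP could take as input. The abstract does not describe the actual coupling, so the 1-D layout, the decay schedules, the toy data, and all names here are assumptions.

```python
import math
import random


def bmu_index(weights, x):
    """Index of the best-matching unit (smallest squared Euclidean distance)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - a) ** 2 for w, a in zip(weights[i], x)))


def train_som(data, n_units, epochs=20, lr0=0.5, radius0=2.0, seed=0):
    """Train a 1-D SOM with a Gaussian neighbourhood and linear decay."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate shrinks
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks
        for x in data:
            b = bmu_index(weights, x)
            for i in range(n_units):
                # Units near the winner on the line move more than distant ones.
                h = math.exp(-((i - b) ** 2) / (2 * radius ** 2))
                weights[i] = [w + lr * h * (a - w)
                              for w, a in zip(weights[i], x)]
    return weights


def som_features(weights, x):
    """One-hot code of the winning unit, usable as MLP input."""
    b = bmu_index(weights, x)
    return [1.0 if i == b else 0.0 for i in range(len(weights))]


# Toy 2-D feature vectors standing in for real speech features.
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
weights = train_som(data, n_units=4)
code = som_features(weights, data[0])
```

A one-hot winner code is only one possible interface between the two networks; passing the distances to all units would be an equally plausible alternative.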


2019 ◽  
Author(s):  
Nilay Shrivastava ◽  
Astitwa Saxena ◽  
Yaman Kumar ◽  
Rajiv Ratn Shah ◽  
Amanda Stent ◽  
...  
