Survey on automatic lip-reading in the era of deep learning

2018 ◽  
Vol 78 ◽  
pp. 53-72 ◽  
Author(s):  
Adriana Fernandez-Lopez ◽  
Federico M. Sukno
2020 ◽  
Vol 34 (04) ◽  
pp. 6917-6924 ◽  
Author(s):  
Ya Zhao ◽  
Rui Xu ◽  
Xinchao Wang ◽  
Peng Hou ◽  
Haihong Tang ◽  
...  

Lip reading has witnessed unparalleled development in recent years thanks to deep learning and the availability of large-scale datasets. Despite the encouraging results achieved, the performance of lip reading unfortunately remains inferior to that of its counterpart, speech recognition, because the ambiguous nature of lip actuations makes it challenging to extract discriminant features from lip movement videos. In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers. The rationale behind our approach is that the features extracted from speech recognizers may provide complementary and discriminant clues that are difficult to obtain from the subtle movements of the lips, and consequently facilitate the training of lip readers. This is achieved, specifically, by distilling multi-granularity knowledge from speech recognizers to lip readers. To conduct this cross-modal knowledge distillation, we utilize an efficacious alignment scheme to handle the inconsistent lengths of the audio and video sequences, as well as an innovative filtering strategy to refine the speech recognizer's predictions. The proposed method achieves new state-of-the-art performance on the CMLR and LRS2 datasets, outperforming the baseline by margins of 7.66% and 2.75% in character error rate, respectively.
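The character error rate (CER) quoted above is commonly defined as the character-level edit distance between the predicted and reference transcripts, divided by the length of the reference. The sketch below illustrates that common definition only; the function names `edit_distance` and `character_error_rate` are hypothetical, and the abstract does not state which scoring script or text normalization the authors used.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / reference length (assumed definition, see lead-in)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of five reference characters.
    print(character_error_rate("hello", "hallo"))
```

Under this definition a lower value is better, so the margins reported in the abstract correspond to reductions in CER relative to the baseline.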


Author(s):  
Kartik Datar ◽  
Meet N. Gandhi ◽  
Priyanshu Aggarwal ◽  
Mayank Sohani

Deep learning has had a significant impact on tasks that seemed impossible only a few years ago, and it has solved problems that are too complex for classical machine learning algorithms. Lip reading, the task of converting lip movements to text, has been approached with various methods; one of the most successful is LipNet, which provides end-to-end conversion from lip movements to text. This end-to-end conversion is possible because of the availability of large amounts of data and the development of deep learning methods such as convolutional neural networks and recurrent neural networks. The use of deep learning in lip reading is a recent development and addresses emerging real-world challenges such as virtual reality systems, assisted driving systems, sign language recognition, movement recognition, and improving hearing aids via Google Lens. Various other approaches, along with different datasets, are explained in the paper.


Author(s):  
Doaa Sami Khafaga ◽  
Hanan A. Hosni Mahmoud ◽  
Norah S. Alghamdi ◽  
Amani A. Albraikan

2021 ◽  
Author(s):  
Hengxin Ruan ◽  
Hanting Zhao ◽  
Menglin Wei ◽  
Zhuo Wang ◽  
Hongrui Zhang ◽  
...  

Author(s):  
Mohammed Abid Abrar ◽  
A. N. M. Nafiul Islam ◽  
Mohammad Muntasir Hassan ◽  
Mohammad Tariqul Islam ◽  
Celia Shahnaz ◽  
...  

IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Souheil Fenghour ◽  
Daqing Chen ◽  
Kun Guo ◽  
Bo Li ◽  
Perry Xiao
