Survey on automatic lip-reading in the era of deep learning

Lip reading has advanced rapidly in recent years thanks to deep learning and the availability of large-scale datasets. Despite these encouraging results, lip reading still performs worse than its counterpart, speech recognition, because lip actuations are inherently ambiguous, which makes it difficult to extract discriminative features from videos of lip movement. In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers. The rationale is that features extracted from speech recognizers may provide complementary and discriminative clues that are hard to obtain from the subtle movements of the lips, and so facilitate the training of lip readers. Specifically, this is achieved by distilling multi-granularity knowledge from speech recognizers to lip readers. To conduct this cross-modal knowledge distillation, we use an effective alignment scheme to handle the inconsistent lengths of the audio and video sequences, together with a novel filtering strategy to refine the speech recognizer's predictions. The proposed method achieves new state-of-the-art performance on the CMLR and LRS2 datasets, outperforming the baseline by margins of 7.66% and 2.75% in character error rate, respectively.
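
The results above are reported as character error rate (CER). As a minimal sketch of how that metric is commonly computed (the abstract does not give the authors' evaluation code, so the function names `edit_distance` and `character_error_rate` are illustrative assumptions), CER can be taken as the character-level Levenshtein distance between the predicted and reference transcripts, divided by the reference length:

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn hypothesis into reference."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, a lower CER is better, and a predicted transcript that differs from a four-character reference by one substituted character has a CER of 0.25.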

Text Extraction through Video Lip Reading Using Deep Learning

2019 8th International Conference System Modeling and Advancement in Research Trends (SMART), 2019
DOI: 10.1109/smart46866.2019.9117224

Author(s): S.M. Mazharul Hoque Chowdhury, Mushfiqur Rahman, Marzan Tasnim Oyshi, Md. Arid Hasan

Keyword(s): Deep Learning, Text Extraction, Lip Reading

Lip-Reading Driven Deep Learning Approach for Speech Enhancement

IEEE Transactions on Emerging Topics in Computational Intelligence, 2019, pp. 1-10
DOI: 10.1109/tetci.2019.2917039
Cited By ~ 5

Author(s): Ahsan Adeel, Mandar Gogate, Amir Hussain, William M. Whitmer

Keyword(s): Deep Learning, Speech Enhancement, Learning Approach, Lip Reading

A Review on Deep Learning Based Lip-Reading

International Journal of Scientific Research in Computer Science Engineering and Information Technology, 2020, pp. 182-188
DOI: 10.32628/cseit206140

Author(s): Kartik Datar, Meet N. Gandhi, Priyanshu Aggarwal, Mayank Sohani

Keyword(s): Deep Learning, Machine Learning Algorithms, Language Recognition, Virtual Reality System, Sign Language Recognition, Huge Data, Movement Recognition, Lip Reading, The World, End To End

Deep learning has had a significant impact on tasks that seemed impossible only a few years ago, solving problems that are too complex for conventional machine learning algorithms. Lip reading, the task of converting lip movements to text, has been approached with various methods; one of the most successful is LipNet, which provides end-to-end conversion from lip movements to text. End-to-end conversion of lip movements to words is possible because of the availability of large amounts of data and the development of deep learning methods such as Convolutional Neural Networks and Recurrent Neural Networks. The use of deep learning in lip reading is recent and addresses emerging real-world challenges such as virtual reality systems, assisted driving systems, sign language recognition, movement recognition, and improving hearing aids via Google Lens. The paper also explains various other approaches and the datasets they use.
