An Audio-Aided Face and Text Recognition System for Visually Impaired

Author(s):  
Manjusha Sreedharan ◽  
Shalini Mohanraj ◽  
Lakshmi Sutha Kumar
2019 ◽  
Vol 9 (2) ◽  
pp. 236 ◽  
Author(s):  
Saad Ahmed ◽  
Saeeda Naz ◽  
Muhammad Razzak ◽  
Rubiyah Yusof

This paper presents a comprehensive survey of Arabic cursive scene text recognition. Publications in recent years show that document image analysis researchers have shifted their interest from the recognition of optical characters to the recognition of characters appearing in natural images. Scene text recognition is a challenging problem because the text varies in font style, size, alignment, orientation, reflection, illumination, and blurriness, and appears against complex backgrounds. Among cursive scripts, Arabic scene text recognition is considered even more challenging due to joined writing, variant forms of the same character, a large number of ligatures, multiple baselines, etc. Surveys exist for Latin and Chinese script-based scene text recognition systems, but the Arabic-like scene text recognition problem has yet to be addressed in detail. This manuscript highlights some of the latest techniques presented for text classification; those following a deep learning architecture are equally suitable for the development of Arabic cursive scene text recognition systems. Issues pertaining to text localization and feature extraction are also presented. Moreover, this article emphasizes the importance of having a benchmark cursive scene text dataset. Based on the discussion, future directions are outlined, some of which may give researchers insight into cursive scene text.


2021 ◽  
Vol 31 (3) ◽  
pp. 222-229
Author(s):  
Kyu-Ree Kim ◽  
Su-Jeong Choi ◽  
Tae-Won Kang ◽  
Jin-Woo Jung

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Diandian Zhang ◽  
Yan Liu ◽  
Zhuowei Wang ◽  
Depei Wang

Manchu is a low-resource language that is rarely addressed by text recognition technology. Because of the combined form of its characters, ordinary text recognition practice requires segmentation before recognition, which reduces recognition accuracy. In this paper, we propose a Manchu text recognition system divided into two parts: text recognition and text retrieval. First, a deep CNN model is used for text recognition, with a sliding window replacing manual segmentation. Second, text retrieval finds similarities within the image and locates the position of the recognized text in the database; this process is described in detail. We conducted comparative experiments on the FAST-NU dataset using different quantities of sample data, as well as comparisons with the latest models. The experiments show that the proposed deep CNN model achieves an optimal recognition accuracy of 98.84%.
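The sliding-window idea in this abstract can be sketched in a few lines: rather than segmenting characters, fixed-width windows are extracted along the text-line image, each window is classified, and repeated predictions are collapsed. The collapse step below follows a CTC-style greedy decoding convention, which is an assumption on our part (the paper does not specify its decoding rule), and the window width, stride, and `blank` label are illustrative, not taken from the paper.

```python
import numpy as np

def sliding_windows(line_img, win_w=32, stride=8):
    """Yield fixed-width windows across a text-line image of shape
    (H, W), avoiding explicit character segmentation."""
    h, w = line_img.shape
    for x in range(0, max(w - win_w, 0) + 1, stride):
        yield line_img[:, x:x + win_w]

def collapse(labels, blank=0):
    """CTC-style greedy collapse: merge consecutive repeats and drop
    the blank label, turning per-window predictions into a string."""
    out, prev = [], None
    for lbl in labels:
        if lbl != prev and lbl != blank:
            out.append(lbl)
        prev = lbl
    return out

# Per-window CNN predictions such as [blank, A, A, blank, B, B, B, blank]
# collapse to the character sequence [A, B].
```

In a full system, each window would be fed to the trained CNN; here the classifier is left out so the decoding logic stands alone.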


2020 ◽  
Vol 9 (3) ◽  
pp. 1208-1219
Author(s):  
Hendra Kusuma ◽  
Muhammad Attamimi ◽  
Hasby Fahrudin

In general, good interaction, including communication, is achieved when verbal and non-verbal information such as body movements, gestures, and facial expressions can be processed in both directions between speaker and listener. The facial expression in particular is an indicator of the inner state of the speaker and/or the listener during communication. Recognizing facial expressions is therefore a necessary and important ability in communication, and one that poses a challenge for visually impaired persons. This fact motivated us to develop a facial expression recognition system based on a deep learning algorithm. We implemented the proposed system on a wearable device that enables visually impaired persons to recognize facial expressions during communication. We conducted several experiments involving visually impaired persons to validate the proposed system, and promising results were achieved.
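The wearable pipeline described above (classify the interlocutor's expression, then report it audibly) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the linear-softmax head stands in for the final layer of an unspecified deep network, and the expression labels, confidence threshold, and `audio_feedback` helper are hypothetical names introduced here.

```python
import numpy as np

# Illustrative label set; the paper does not list its expression classes.
EXPRESSIONS = ["neutral", "happy", "sad", "surprised", "angry"]

def classify_expression(features, weights, bias):
    """Linear-softmax head standing in for a CNN's final layer:
    maps a face-feature vector to an expression label and confidence."""
    logits = features @ weights + bias
    probs = np.exp(logits - logits.max())   # stable softmax
    probs /= probs.sum()
    return EXPRESSIONS[int(np.argmax(probs))], float(probs.max())

def audio_feedback(label, confidence, threshold=0.5):
    """Compose the message a wearable's TTS engine would speak;
    stay silent when the classifier is unsure."""
    if confidence < threshold:
        return None
    return f"The person looks {label}."
```

On the device, `features` would come from a face detector and feature extractor running on camera frames, and the returned string would be passed to a text-to-speech engine.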

