Improvement of the end-to-end scene text recognition method for “text-to-speech” conversion

Author(s):  
Fazliddin Makhmudov ◽  
Mukhriddin Mukhiddinov ◽  
Akmalbek Abdusalomov ◽  
Kuldoshbay Avazov ◽  
Utkir Khamdamov ◽  
...  

Methods for text detection and recognition in images of natural scenes have become an active research topic in computer vision and have achieved encouraging results on several benchmarks. In this paper, we introduce a robust yet simple pipeline that performs accurate and fast text detection and recognition for the Uzbek language in natural scene images using a fully convolutional network and the Tesseract OCR engine. First, the text detection step quickly predicts text in arbitrary orientations in full-color images with a single fully convolutional neural network, discarding redundant intermediate stages. Then, the text recognition step recognizes Uzbek text, in both the Latin and Cyrillic alphabets, using a trained Tesseract OCR engine. Finally, the recognized text can be pronounced using an Uzbek text-to-speech synthesizer. The proposed method was tested on the ICDAR 2013, ICDAR 2015 and MSRA-TD500 datasets, and it showed an advantage in efficiently detecting and recognizing text in natural scene images for assisting the visually impaired.
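The three-stage flow described in this abstract (detect, recognize, speak) can be sketched as a small composition of stages. This is only an illustration, not the authors' implementation: `read_scene`, `ReadResult`, the axis-aligned `Box` type and the top-to-bottom reading order are all hypothetical, and the three stage functions are placeholders that a real system would presumably back with a fully convolutional detector, a trained Tesseract model and a speech synthesizer.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple

# Hypothetical simplification: axis-aligned boxes as (x, y, width, height).
# The abstract describes text in arbitrary orientations, which would need
# a richer geometry than this.
Box = Tuple[int, int, int, int]


@dataclass
class ReadResult:
    box: Box
    text: str


def read_scene(
    image: Any,
    detect: Callable[[Any], Iterable[Box]],
    recognize: Callable[[Any, Box], str],
    speak: Callable[[str], None],
) -> List[ReadResult]:
    """Run detection, then recognition per region, then speech output."""
    results: List[ReadResult] = []
    for box in detect(image):
        text = recognize(image, box).strip()
        if text:  # skip regions where recognition produced nothing
            results.append(ReadResult(box, text))

    # Assumed reading order: top-to-bottom, then left-to-right.
    results.sort(key=lambda r: (r.box[1], r.box[0]))

    for result in results:
        speak(result.text)
    return results
```

Keeping the stages as plain callables means each one can be replaced or tested in isolation, for example by passing stub functions in place of the detector and the OCR engine.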

Author(s):  
Saeed Mian Qaisa

This paper proposes an original approach to achieving a Cymatics-based visual perception of image-extracted text. In this context, an effective approach for automated text detection and recognition in natural scene images is proposed. The incoming image is first enhanced using CLAHE and DWT. The text regions of the enhanced image are then detected with the MSER feature detector. Non-text MSERs are removed by geometrical and contour-based filters. The remaining MSERs are grouped into words or phrases according to the similarities between them. Text recognition is performed with an OCR function. The extracted text is then analysed sequentially, character by character. Each character is converted into a methodical acoustic excitation. Finally, these excitations are converted into systematic visual perceptions using the phenomenon of Cymatics. The system functionality is tested with an experimental setup. For the natural scenes studied, the suggested approach achieves 80% precision in text localization and 53% precision in end-to-end text recognition. The devised system principle is novel and can be employed in various applications such as visual art, encryption, education, and the integration of impaired people.
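The geometrical filtering and grouping steps mentioned above can be illustrated on plain bounding boxes. The sketch below is an assumption-laden example rather than the paper's method: the function names, the `(x, y, w, h)` box format and every threshold (minimum area, aspect-ratio range, gap and height ratios) are hypothetical, and the contour-based filter is not modelled at all.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical format


def looks_like_character(
    box: Box,
    min_area: int = 30,        # assumed threshold
    min_aspect: float = 0.1,   # assumed threshold (width / height)
    max_aspect: float = 3.0,   # assumed threshold (width / height)
) -> bool:
    """Geometric filter: reject regions that are too small or too elongated."""
    _, _, w, h = box
    if w <= 0 or h <= 0:
        return False
    return w * h >= min_area and min_aspect <= w / h <= max_aspect


def group_into_words(
    boxes: Sequence[Box],
    max_gap_ratio: float = 0.6,     # assumed: allowed gap relative to height
    max_height_ratio: float = 1.5,  # assumed: allowed height mismatch
) -> List[List[Box]]:
    """Group boxes that have similar height, share a line, and sit close together."""
    words: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[0]):  # scan left to right
        x, y, w, h = box
        for word in words:
            px, py, pw, ph = word[-1]  # compare with the word's last box
            gap = x - (px + pw)
            similar_height = max(h, ph) / min(h, ph) <= max_height_ratio
            overlap = min(y + h, py + ph) - max(y, py)  # vertical overlap
            same_line = overlap > 0.5 * min(h, ph)
            if similar_height and same_line and gap <= max_gap_ratio * max(h, ph):
                word.append(box)
                break
        else:
            words.append([box])  # no compatible word found, start a new one
    return words


candidates = [(0, 0, 10, 20), (12, 1, 10, 20), (24, 0, 10, 19),
              (60, 0, 10, 20), (0, 50, 200, 4), (5, 5, 2, 2)]
characters = [b for b in candidates if looks_like_character(b)]
words = group_into_words(characters)
```

With these sample boxes, the long thin strip and the tiny speck are filtered out, the three adjacent boxes form one group, and the distant box forms a second group.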


Author(s):  
Ahlam Alnefaie ◽  
Deepak Gupta ◽  
Monowar H. Bhuyan ◽  
Imran Razzak ◽  
Prashant Gupta ◽  
...  

Author(s):  
Sankirti Sandeep Shiravale ◽  
R. Jayadevan ◽  
Sanjeev S. Sannakki

Text present in camera-captured scene images is semantically rich and can be used for image understanding. Automatic detection, extraction, and recognition of text are crucial in image understanding applications. Text detection in natural scene images is a tedious task due to complex backgrounds, uneven lighting conditions, and multi-coloured, multi-sized fonts. Two techniques, namely ‘edge detection’ and ‘colour-based clustering’, are combined in this paper to detect text in scene images. Region properties are used to eliminate falsely generated annotations. A dataset of 1250 images is created and used for experimentation. Experimental results show that the combined approach performs better than the individual approaches.

