Scene Text Recognition in the Wild with Motion Deblurring Using Deep Networks

Author(s):  
Sukhad Anand ◽  
Seba Susan ◽  
Shreshtha Aggarwal ◽  
Shubham Aggarwal ◽  
Rajat Singla
2021 ◽  
Vol 54 (2) ◽  
pp. 1-35
Author(s):  
Xiaoxue Chen ◽  
Lianwen Jin ◽  
Yuanzhi Zhu ◽  
Canjie Luo ◽  
Tianwei Wang

The history of text can be traced back over thousands of years. Rich and precise semantic information carried by text is important in a wide range of vision-based application scenarios. Therefore, text recognition in natural scenes has been an active research topic in computer vision and pattern recognition. In recent years, with the rise and development of deep learning, numerous methods have shown promising results in terms of innovation, practicality, and efficiency. This article aims to (1) summarize the fundamental problems and the state-of-the-art associated with scene text recognition, (2) introduce new insights and ideas, (3) provide a comprehensive review of publicly available resources, and (4) point out directions for future work. In summary, this literature review attempts to present an entire picture of the field of scene text recognition. It provides a comprehensive reference for people entering this field and could be helpful in inspiring future research. Related resources are available at our GitHub repository: https://github.com/HCIILAB/Scene-Text-Recognition.


2021 ◽  
Vol 6 (1) ◽  
pp. 1-4
Author(s):  
Zobeir Raisi ◽  
Mohamed A. Naiel ◽  
Paul Fieguth ◽  
Steven Wardell ◽  
John Zelek

Recent state-of-the-art scene text recognition methods are primarily based on Recurrent Neural Networks (RNNs), however, these methods require one-dimensional (1D) features and are not designed for recognizing irregular-text instances due to the loss of spatial information present in the original two-dimensional (2D) images.  In this paper, we leverage a Transformer-based architecture for recognizing both regular and irregular text-in-the-wild images. The proposed method takes advantage of using a 2D positional encoder with the Transformer architecture to better preserve the spatial information of 2D image features than previous methods. The experiments on popular benchmarks, including the challenging COCO-Text dataset, demonstrate that the proposed scene text recognition method outperformed the state-of-the-art in most cases, especially on irregular-text recognition.


2021 ◽  
Vol 2 (2) ◽  
pp. 1-18
Author(s):  
Hongchao Gao ◽  
Yujia Li ◽  
Jiao Dai ◽  
Xi Wang ◽  
Jizhong Han ◽  
...  

Recognizing irregular text from natural scene images is challenging due to the unconstrained appearance of text, such as curvature, orientation, and distortion. Recent recognition networks regard this task as a text sequence labeling problem and most networks capture the sequence only from a single-granularity visual representation, which to some extent limits the performance of recognition. In this article, we propose a hierarchical attention network to capture multi-granularity deep local representations for recognizing irregular scene text. It consists of several hierarchical attention blocks, and each block contains a Local Visual Representation Module (LVRM) and a Decoder Module (DM). Based on the hierarchical attention network, we propose a scene text recognition network. The extensive experiments show that our proposed network achieves the state-of-the-art performance on several benchmark datasets including IIIT-5K, SVT, CUTE, SVT-Perspective, and ICDAR datasets under shorter training time.


2019 ◽  
Vol 9 (2) ◽  
pp. 236 ◽  
Author(s):  
Saad Ahmed ◽  
Saeeda Naz ◽  
Muhammad Razzak ◽  
Rubiyah Yusof

This paper presents a comprehensive survey on Arabic cursive scene text recognition. The recent years’ publications in this field have witnessed the interest shift of document image analysis researchers from recognition of optical characters to recognition of characters appearing in natural images. Scene text recognition is a challenging problem due to the text having variations in font styles, size, alignment, orientation, reflection, illumination change, blurriness and complex background. Among cursive scripts, Arabic scene text recognition is contemplated as a more challenging problem due to joined writing, same character variations, a large number of ligatures, the number of baselines, etc. Surveys on the Latin and Chinese script-based scene text recognition system can be found, but the Arabic like scene text recognition problem is yet to be addressed in detail. In this manuscript, a description is provided to highlight some of the latest techniques presented for text classification. The presented techniques following a deep learning architecture are equally suitable for the development of Arabic cursive scene text recognition systems. The issues pertaining to text localization and feature extraction are also presented. Moreover, this article emphasizes the importance of having benchmark cursive scene text dataset. Based on the discussion, future directions are outlined, some of which may provide insight about cursive scene text to researchers.


Sign in / Sign up

Export Citation Format

Share Document