Analysis of Text Identification Techniques Using Scene Text and Optical Character Recognition

Author(s):  
Monica Gupta ◽  
Alka Choudhary ◽  
Jyotsna Parmar

Data in digital form is needed today for fast processing across almost all tasks. The most effective way to digitize documents is to extract the text from them. Text extraction can be performed by various text identification tasks, such as scene text recognition, optical character recognition, and handwriting recognition. This paper reviews and analyses recent research progress in optical character recognition and scene text recognition based on existing models such as convolutional neural networks, long short-term memory, cognitive reading for image processing, maximally stable extremal regions, and the stroke width transform, which have achieved F-scores of up to 90.34% on benchmark datasets such as ICDAR 2013, ICDAR 2019, and IIIT5k. Researchers have done outstanding work in the text recognition field. Yet text detection performance on low-quality images still needs improvement, because text identification should not be limited by the quality of the input image.
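For readers unfamiliar with the metric quoted above, the following minimal sketch shows how an F-score is commonly computed from detection counts. It assumes the balanced F1 form (harmonic mean of precision and recall); the abstract does not say which variant the surveyed papers report, and the function name and the counts in the usage note are hypothetical.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the balanced F-score (F1) for a set of detection counts.

    Assumption: F1 is used; the surveyed papers may report a different variant.
    """
    detected = true_positives + false_positives  # everything the model reported
    actual = true_positives + false_negatives    # everything in the ground truth
    if detected == 0 or actual == 0:
        return 0.0
    precision = true_positives / detected
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 8 correct detections, 2 false alarms, and 2 misses, precision and recall are both 0.8, so `f_score(8, 2, 2)` is also 0.8.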

2021 ◽  
Vol 2 (2) ◽  
pp. 1-18
Author(s):  
Hongchao Gao ◽  
Yujia Li ◽  
Jiao Dai ◽  
Xi Wang ◽  
Jizhong Han ◽  
...  

Recognizing irregular text in natural scene images is challenging because of the unconstrained appearance of text, such as curvature, orientation, and distortion. Recent recognition networks treat this task as a text sequence labeling problem, and most of them capture the sequence from only a single-granularity visual representation, which to some extent limits recognition performance. In this article, we propose a hierarchical attention network that captures multi-granularity deep local representations for recognizing irregular scene text. It consists of several hierarchical attention blocks, each containing a Local Visual Representation Module (LVRM) and a Decoder Module (DM). Based on the hierarchical attention network, we propose a scene text recognition network. Extensive experiments show that the proposed network achieves state-of-the-art performance on several benchmark datasets, including IIIT-5K, SVT, CUTE, SVT-Perspective, and ICDAR datasets, with shorter training time.


2021 ◽  
pp. 894-911
Author(s):  
Bhavesh Kataria, Dr. Harikrishna B. Jethva

India's constitution recognizes 22 languages written in 17 different scripts. Manuscripts in these languages have a limited lifespan; as generations pass, they deteriorate and the vital knowledge they hold is lost. This work uses digital texts to convey information to future generations. Optical Character Recognition (OCR) helps extract information from scanned manuscripts (printed text). This paper proposes a simple and effective OCR solution for recognizing Sanskrit characters in text document images using long short-term memory (LSTM) neural networks. Existing methods focus only on single touching characters. Our main focus is to design a robust method using a Bidirectional Long Short-Term Memory (BLSTM) architecture that handles overlapping lines, touching characters in the middle and upper zones, and half characters, which would increase the accuracy of present OCR systems on poorly maintained Sanskrit literature.


2021 ◽  
Vol 3 (2) ◽  
pp. 103-116
Author(s):  
Supriadi Supriadi

The calculator is a calculation tool widely used in many fields, particularly business and commerce. A calculator makes arithmetic operations easier, but entering numbers is cumbersome when the values to be calculated are written on media such as paper or whiteboards. The user must first look at the text on the written medium, read and remember it, and then type it into a calculator device or application. The drawback is that when users forget what was written, they must look at the text and memorize it again, so calculations take longer. The method used in this study is Optical Character Recognition, which can recognize text in images, including handwritten images of arithmetic operations. The recognized text is then evaluated arithmetically to obtain the result. In trials on 20 handwritten images of arithmetic operations, extraction accuracy was 85%, and 85% of the handwritten images could be calculated correctly.
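The second stage described above, turning recognized text into a calculation result, can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function name, the symbol normalization (`x`, `×`, `÷`), and the use of Python's `ast` module in place of `eval()` are all choices made here, and the OCR step itself is assumed to have already produced the input string.

```python
import ast
import operator

# Binary operators the sketch accepts (assumption: the four basic operations).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_recognized_expression(text: str) -> float:
    """Evaluate an OCR-recognized arithmetic string such as '12 + 7 x 3'."""
    # Map handwritten multiplication and division symbols to Python operators.
    normalized = text.strip().replace("×", "*").replace("x", "*").replace("÷", "/")
    tree = ast.parse(normalized, mode="eval")

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Anything else (names, calls, ...) is rejected rather than executed.
        raise ValueError(f"unsupported expression: {text!r}")

    return _eval(tree)
```

Walking the syntax tree rather than calling `eval()` means a misrecognized string can only fail, never run arbitrary code. For example, `evaluate_recognized_expression("12 + 7 x 3")` evaluates to 33.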


Abstract: This paper presents camera-based assistive text reading of product labels on objects to help visually challenged people. A camera serves as the main source of input. To identify an object, the user moves it in front of the camera, and the moving object is detected by the Background Subtraction (BGS) method. The text region is automatically localized as a Region of Interest (ROI). Text is extracted from the ROI by combining rule-based and learning-based methods. A novel rule-based text localization algorithm detects geometric features such as pixel value, color intensity, and character size, while features such as gradient magnitude, gradient width, and stroke width are learned with an SVM classifier, and a model is built to distinguish text from non-text regions. The system is integrated with OCR (Optical Character Recognition) to extract the text, which is then given to the user as voice output. The system is evaluated on the ICDAR-2011 dataset, which consists of 509 natural scene images with ground truth.
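The detection and localization steps can be illustrated with a deliberately simple sketch. It assumes a static background frame and plain per-pixel differencing, which is the most basic form of background subtraction; the abstract does not say which BGS variant the system uses, and the function names, the threshold value, and the list-of-lists image format are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def foreground_mask(background: Frame, frame: Frame, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(pixel - reference) > threshold for pixel, reference in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the foreground pixels, inclusive, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None  # nothing moved in front of the camera
    return (min(rows), min(cols), max(rows), max(cols))


# Hypothetical 4x5 scene: uniform background, bright object at rows 1-2, columns 2-3.
background = [[10] * 5 for _ in range(4)]
frame = [row[:] for row in background]
for r in (1, 2):
    for c in (2, 3):
        frame[r][c] = 200

roi = bounding_box(foreground_mask(background, frame))  # (1, 2, 2, 3)
```

The resulting box is only a candidate region; in the system described above, the rule-based and SVM-based stages would still have to decide which parts of it contain text.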


Author(s):  
Janarthanan A ◽  
Pandiyarajan C ◽  
Sabarinathan M ◽  
Sudhan M ◽  
Kala R

Optical character recognition (OCR) is the process of recognizing text in images (here, a single word). The input images are taken from a dataset and then pre-processed. Pre-processing consists of image resizing, which is necessary to increase or decrease the total number of pixels, whereas remapping can occur when zooming, which increases the number of pixels so that a zoomed image shows clear content. Next comes segmentation, which separates the individual characters of the word. Feature values are then extracted from the image to form the test features. In the classification step, the text in the image is classified: a classifier identifies which images contain text. The experimental results show the accuracy of the approach.
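The resizing and segmentation steps can be sketched as below. The abstract names neither the interpolation method nor the segmentation technique, so nearest-neighbor resizing and a vertical projection profile (splitting wherever a column contains no ink) are assumptions made for illustration, as are the function names and the list-of-lists image format.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values; for segmentation, 1 = ink, 0 = background


def resize_nearest(image: Image, new_height: int, new_width: int) -> Image:
    """Resize by nearest-neighbor sampling (assumed method, not stated in the abstract)."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[r * old_height // new_height][c * old_width // new_width] for c in range(new_width)]
        for r in range(new_height)
    ]


def segment_columns(binary: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    width = len(binary[0])
    has_ink = [any(row[c] for row in binary) for c in range(width)]
    segments = []
    start = None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c  # a character begins
        elif not ink and start is not None:
            segments.append((start, c))  # a blank column ends the character
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


# Hypothetical two-row word image with three ink runs separated by blank columns.
word = [
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
]
characters = segment_columns(word)  # [(0, 2), (3, 4), (6, 7)]
```

A projection profile only works when characters are separated by at least one blank column; touching or overlapping characters would need a different technique.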


2021 ◽  
Vol 23 (06) ◽  
pp. 301-305
Author(s):  
Roshan Suvaris ◽  
Dr. S Sathyanarayana

Optical Character Recognition (OCR) has become an inseparable part of everyday transactions. OCR has extended its application areas to almost all fields, viz. healthcare, finance, banking, entertainment, trading systems, digital storage, and so on. In the recent past, handwriting recognition has been one of the hardest research areas in image processing. This paper discusses various techniques for converting textual content from number plates and from printed and handwritten paper documents into machine code. The transformation method used in all these techniques is known as OCR. An English OCR system is necessary for converting published books and other documents in English into human-editable computer text files. The latest research in this area includes methodologies that identify different fonts and styles of English handwritten scripts. To date, even though a number of algorithms are available, each has its own pros and cons. Since recognizing different styles and fonts in machine-printed and handwritten English script is the biggest challenge, this field is open for researchers to implement new algorithms that overcome the deficiencies of their predecessors.

