YOLOv3-Tesseract model for improved intelligent form recognition

Author(s):  
Zhang Yun-An ◽  
Pan Ziheng ◽  
Dui Hongyan ◽  
Bai Guanghan

Background: YOLOv3-Tesseract is widely used for intelligent form recognition because it exhibits several attractive properties. Improving the accuracy and efficiency of optical character recognition (OCR) is important. Methods: YOLOv3 offers classification advantages in object detection, and Tesseract effectively recognizes regular characters in OCR. In this study, a model for improved intelligent form recognition based on YOLOv3 and Tesseract is proposed. Results: First, YOLOv3 is trained to detect the position of text in the table and then to segment the text into blocks. Second, Tesseract recognizes each separated text block individually; combining YOLOv3 and Tesseract in this way achieves table character recognition. Conclusion: The proposed method is demonstrated through experimental simulation on Tianchi big data. The YOLOv3-Tesseract model is trained and tested, and it effectively accomplishes the recognition task.
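The two-stage flow described in this abstract (detect text positions, cut out the text blocks, recognize each block separately) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `recognize_form`, `crop`, and `RecognizedBlock`, the `(x, y, width, height)` box layout, and the list-of-rows image representation are all assumptions made for illustration. The detector and recognizer are passed in as plain callables, standing in for a trained YOLOv3 model and a Tesseract wrapper respectively.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumption: a grayscale image is a list of pixel rows.
Image = List[List[int]]
# Assumption: a box is (x, y, width, height) in pixel coordinates.
Box = Tuple[int, int, int, int]


@dataclass
class RecognizedBlock:
    """One detected text block and the text read from it (hypothetical type)."""
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def recognize_form(
    image: Image,
    detect_text_boxes: Callable[[Image], Sequence[Box]],
    recognize_text: Callable[[Image], str],
) -> List[RecognizedBlock]:
    """Stage 1: locate text blocks. Stage 2: read each block on its own."""
    # Sort top-to-bottom, then left-to-right, so output follows reading order.
    boxes = sorted(detect_text_boxes(image), key=lambda b: (b[1], b[0]))
    results = []
    for box in boxes:
        block = crop(image, box)               # segment the text block
        text = recognize_text(block).strip()   # recognize it in isolation
        results.append(RecognizedBlock(box, text))
    return results
```

In a real system, `detect_text_boxes` would wrap inference with the trained detector and `recognize_text` would wrap an OCR engine call. Keeping both behind callables lets the pipeline logic be tested with simple stubs.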

Author(s):  
Andrew Brock ◽  
Theodore Lim ◽  
J. M. Ritchie ◽  
Nick Weston

End-to-end machine analysis of engineering document drawings requires a reliable and precise vision frontend capable of localizing and classifying various characters in context. We develop an object detection framework, based on convolutional networks, designed specifically for optical character recognition in engineering drawings. Our approach enables classification and localization on a 10-fold cross-validation of an internal dataset for which other techniques prove unsuitable.


Author(s):  
Christian Wibisono ◽  
Setia Budi

Industry 4.0 is changing the way manufacturing businesses think. Speed and accuracy have become the main focus for survival and growth. This study aims to build a blueprint of a system that increases both speed and accuracy in form input. The research uses several computer vision technologies: a CNN performs form classification and image segmentation, and OCR extracts specific information from a document that the CNN has classified and transforms it into JSON, a more generic format that can be used on most common platforms.
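The classify-then-extract flow described in this abstract can be sketched roughly as follows. The abstract gives no implementation details, so everything here is an assumption for illustration: the function `form_to_json`, the `FIELD_TEMPLATES` table with its form types, field names, and regions, and the idea of mapping each form class to fixed field regions. The classifier and the OCR reader are passed in as callables, standing in for the CNN and the OCR engine.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Assumption: a field region is (x, y, width, height) in pixel coordinates.
Region = Tuple[int, int, int, int]

# Hypothetical templates: form class -> field name -> region to read.
FIELD_TEMPLATES: Dict[str, Dict[str, Region]] = {
    "invoice": {"invoice_no": (10, 10, 100, 20), "total": (10, 200, 100, 20)},
    "delivery_note": {"order_no": (10, 10, 100, 20)},
}


def form_to_json(
    image: Any,
    classify_form: Callable[[Any], str],
    read_region: Callable[[Any, Region], str],
    templates: Dict[str, Dict[str, Region]] = FIELD_TEMPLATES,
) -> str:
    """Classify the form, read its known fields with OCR, and emit JSON."""
    form_type = classify_form(image)      # stand-in for the CNN classifier
    fields = templates.get(form_type)
    if fields is None:
        raise ValueError(f"no field template for form type {form_type!r}")
    # Read each field region separately and trim stray whitespace.
    values = {
        name: read_region(image, region).strip()
        for name, region in fields.items()
    }
    return json.dumps({"form_type": form_type, "fields": values}, sort_keys=True)
```

Emitting a flat JSON object keyed by field name keeps the output independent of the form layout, which fits the abstract's aim of a generic format usable on most platforms.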


1997 ◽  
Vol 9 (1-3) ◽  
pp. 58-77
Author(s):  
Vitaly Kliatskine ◽  
Eugene Shchepin ◽  
Gunnar Thorvaldsen ◽  
Konstantin Zingerman ◽  
Valery Lazarev

In principle, printed source material should be made machine-readable with systems for Optical Character Recognition, rather than being typed once more. Off-the-shelf commercial OCR programs tend, however, to be inadequate for lists with a complex layout. The tax assessment lists that assess most nineteenth-century farms in Norway constitute one example among a series of valuable sources which can only be interpreted successfully with specially designed OCR software. This paper considers the problems involved in the recognition of material with a complex table structure, outlining a new algorithmic model based on ‘linked hierarchies’. Within the scope of this model, a variety of tables and layouts can be described and recognized. The ‘linked hierarchies’ model has been implemented in the ‘CRIPT’ OCR software system, which successfully reads tables with a complex structure from several different historical sources.


2020 ◽  
Vol 2020 (1) ◽  
pp. 78-81
Author(s):  
Simone Zini ◽  
Simone Bianco ◽  
Raimondo Schettini

Rain removal from pictures taken under bad weather conditions is a challenging task that aims to improve the overall quality and visibility of a scene. The enhanced images usually constitute the input for subsequent Computer Vision tasks such as detection and classification. In this paper, we present a Convolutional Neural Network, based on the Pix2Pix model, for rain streak removal from images, with specific interest in evaluating the results of the processing operation with respect to the Optical Character Recognition (OCR) task. In particular, we present a way to generate a rainy version of the Street View Text Dataset (R-SVTD) for "text detection and recognition" evaluation in bad weather conditions. Experimental results on this dataset show that our model outperforms the state of the art in terms of two commonly used image quality metrics, and that it can improve the performance of an OCR model in detecting and recognising text in the wild.

