ANALYSIS OF FORM IMAGES

Automatic analysis of form images is a problem of both practical and theoretical interest: practical because of its importance in office automation, and theoretical because of the conceptual challenges it poses for document image analysis. We describe an approach to extracting text, both typed and handwritten, from scanned and digitized images of filled-out forms. The method decomposes a filled-out form into three basic components (boxes, line segments, and the remainder: handwritten and typed characters, words, and logos) without using a priori knowledge of form structure. The input binary image is first segmented into small and large connected components. Complex boxes are decomposed into elementary regions using an approach based on key-point analysis. Handwritten and machine-printed text that touches or overlaps guide lines and boxes is separated by removing the lines. Characters broken by line removal are rejoined using a character patching method. Experimental results are given for filled-out forms from several domains (insurance, banking, tax, retail and postal).
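
The first step named in the abstract, splitting a binary image into small and large connected components, can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the 8-connectivity, the bounding-box size criterion, and the `max_small_side` threshold are all assumptions, since the abstract does not specify them.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground (value 1) components as pixel lists."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box_size(pixels):
    """Return (height, width) of the bounding box of a component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return max(ys) - min(ys) + 1, max(xs) - min(xs) + 1


def split_by_size(components, max_small_side=3):
    """Split components into (small, large) by longest bounding-box side.

    max_small_side is a hypothetical threshold chosen for illustration.
    """
    small, large = [], []
    for pixels in components:
        height, width = bounding_box_size(pixels)
        if max(height, width) <= max_small_side:
            small.append(pixels)
        else:
            large.append(pixels)
    return small, large
```

Under these assumptions, box outlines and long guide lines would tend to fall into the large group, while isolated characters would fall into the small group. Text that touches a line ends up merged with it in one large component, which is the case the abstract's line-removal step addresses.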

Proceedings First International Workshop on Document Image Analysis for Libraries

First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings. ◽

10.1109/dial.2004.1324424 ◽

2004 ◽

Keyword(s):

Image Analysis ◽

International Workshop ◽

Document Image ◽

Document Image Analysis

Document Image Analysis in Compressed Domain-Limitations, Applications & Challenges

2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) ◽

10.1109/iceca49313.2020.9297593 ◽

2020 ◽

Author(s):

Kavita V. Horadi

Keyword(s):

Image Analysis ◽

Document Image ◽

Compressed Domain ◽

Document Image Analysis

MULTI-LEVEL DOCUMENT IMAGE SEGMENTATION USING MULTI-LAYER PERCEPTRON AND SUPPORT VECTOR MACHINE

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001412530023 ◽

2012 ◽

Vol 26 (06) ◽

pp. 1253002 ◽

Cited By ~ 1

Author(s):

YAN ZHANG ◽

BIN YU ◽

HAI-MING GU

Keyword(s):

Support Vector Machine ◽

Image Segmentation ◽

Research Area ◽

Document Image ◽

Support Vector ◽

Multi Layer Perceptron ◽

Document Image Analysis ◽

Important Research Area ◽

Multi Level ◽

Document Image Segmentation

Document image segmentation is an important research area of document image analysis that classifies the contents of a document image into a set of text and non-text classes. Existing methods are often designed to classify only text and halftones, so they perform poorly on graphics, tables, circuits, etc. In this paper, we present a robust multi-level classification method using a multi-layer perceptron (MLP) and a support vector machine (SVM) to segment text from non-text and then classify the non-text as tables, graphics or halftones. This method outperforms existing methods by overcoming various issues associated with the complexity of document images. Experimental results demonstrate the effectiveness of the proposed method. With this multi-level classification approach, text, halftone, graphic and table components are each classified accurately, which would substantially improve OCR accuracy by reducing garbage symbols and would also increase the compression ratio.

Artificial Intelligence for Document Image Analysis

Document Processing Using Machine Learning ◽

10.1201/9780429277573-1 ◽

2019 ◽

pp. 1-14

Author(s):

Himadri Mukherjee ◽

Payel Rakshit ◽

Ankita Dhar ◽

Sk Md Obaidullah ◽

KC Santosh ◽

...

Keyword(s):

Artificial Intelligence ◽

Image Analysis ◽

Document Image ◽

Document Image Analysis

Reduction of the non-uniform illumination using nonlocal variational models for document image analysis

Journal of the Franklin Institute ◽

10.1016/j.jfranklin.2018.08.012 ◽

2018 ◽

Vol 355 (16) ◽

pp. 8225-8244 ◽

Cited By ~ 2

Author(s):

Fatim Zahra Ait Bella ◽

Mohammed El Rhabi ◽

Abdelilah Hakim ◽

Amine Laghrib

Keyword(s):

Image Analysis ◽

Document Image ◽

Document Image Analysis ◽

Uniform Illumination ◽

Variational Models

Preattentive reading and selective attention for document image analysis

Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318) ◽

10.1109/icdar.1999.791853 ◽

1999 ◽

Cited By ~ 2

Author(s):

C. Faure

Keyword(s):

Image Analysis ◽

Selective Attention ◽

Document Image ◽

Document Image Analysis

LATEST DEVELOPMENTS OF LSTM NEURAL NETWORKS WITH APPLICATIONS OF DOCUMENT IMAGE ANALYSIS

Handbook of Pattern Recognition and Computer Vision ◽

10.1142/9789814656535_0016 ◽

2015 ◽

pp. 293-311

Author(s):

Marcus Liwicki ◽

Volkmar Frinken ◽

Muhammad Zeshan Afzal

Keyword(s):

Neural Networks ◽

Image Analysis ◽

Document Image ◽

Document Image Analysis

A Duplicate Chinese Document Image Retrieval System Based on Line Segment Feature in Character Image Block

Multimedia Systems and Content-Based Image Retrieval ◽

10.4018/978-1-59140-156-8.ch002 ◽

2011 ◽

pp. 14-23

Author(s):

Yung-Kuan Chan ◽

Tung-Shou Chen ◽

Yu-An Ho

Keyword(s):

Image Retrieval ◽

Line Segment ◽

Document Image ◽

Experimental Results ◽

Document Images ◽

Rapid Progress ◽

Image Block ◽

Line Segments ◽

Image Retrieval System ◽

On Line

With the rapid progress of digital image technology, the management of duplicate document images has received wide attention. This paper proposes a duplicate Chinese document image retrieval (DCDIR) system that uses, as the feature of a character image block, the ratio of the number of black pixels to the number of white pixels on the scanned line segments in that block. Experimental results indicate that the system can effectively and quickly retrieve the desired duplicate Chinese document image from a database.
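
The feature described in the abstract, a black-to-white pixel ratio measured along scanned line segments of a character image block, might be computed roughly as below. This is a hedged sketch, not the DCDIR system itself: treating the scan segments as whole horizontal rows, the `scan_rows` parameter, and the handling of an all-black segment are assumptions made for illustration.

```python
def black_white_ratio(segment):
    """Ratio of black (1) to white (0) pixels along one scanned segment."""
    black = sum(1 for pixel in segment if pixel == 1)
    white = len(segment) - black
    if white == 0:
        # Assumption: an all-black segment is reported as infinity.
        return float("inf")
    return black / white


def block_features(block, scan_rows):
    """Feature vector of a character block: one ratio per scanned row.

    scan_rows is a hypothetical list of row indices to scan; the abstract
    does not say where or in which direction the segments are placed.
    """
    return [black_white_ratio(block[row]) for row in scan_rows]
```

Two character blocks could then be compared by the distance between their feature vectors, although the abstract does not state which matching rule the system uses.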
