SKEW CORRECTION OF DOCUMENT IMAGES BY RANK ANALYSIS IN FAREY SEQUENCE

Skew correction of a scanned document page is an important preprocessing step in document image analysis. We propose here a fast and robust skew estimation algorithm based on rank analysis in Farey sequence. Our target document class comprises two major Indian scripts with headlines, namely Devnagari and Bangla. At the beginning, straight edge segments from the edge map of the document page are detected by our algorithm using properties of digital straightness. Straight edges derived in this manner are binned by Farey ranks in correspondence with their slopes. The principal bin, identified from these bins using the strength of accumulated edge points, represents the principal direction along the direction of headlines, from which the gross skew angle is estimated. A fast refinement algorithm is then applied with a finer tuning of Farey ranks, to detect the skew up to the desired level of precision. The algorithm has been tested on a diverse set of document images, containing Bangla and Devnagari scripts. Experimental results are quite encouraging in terms of accuracy, sensitivity to non-textual objects, effectiveness in dealing with unrestricted layouts, and computational efficiency.

Download Full-text

A Unified Algorithm for Identification of Various Tabular Structures from Document Images

Modern Library Technologies for Data Storage, Retrieval, and Use ◽

10.4018/978-1-4666-2928-8.ch001 ◽

2012 ◽

pp. 1-28

Author(s):

Sekhar Mandal ◽

Amit K. Das ◽

Partha Bhowmick ◽

Bhabatosh Chanda

Keyword(s):

Image Segmentation ◽

Document Image ◽

Experimental Results ◽

Document Images ◽

Segmentation Process ◽

Considerable Simplification ◽

Unified Algorithm ◽

Distinguishing Features ◽

Document Page ◽

Document Image Segmentation

This paper presents a unified algorithm for segmentation and identification of various tabular structures from document page images. Such tabular structures include conventional tables and displayed math-zones, as well as Table of Contents (TOC) and Index pages. After analyzing the page composition, the algorithm initially classifies the input set of document pages into tabular and non-tabular pages. A tabular page contains at least one of the tabular structures, whereas a non-tabular page does not contain any. The approach is unified in the sense that it is able to identify all tabular structures from a tabular page, which leads to a considerable simplification of document image segmentation in a novel manner. Such unification also results in speeding up the segmentation process, because the existing methodologies produce time-consuming solutions for treating different tabular structures as separate physical entities. Distinguishing features of different kinds of tabular structures have been used in stages in order to ensure the simplicity and efficiency of the algorithm and demonstrated by exhaustive experimental results.

Download Full-text

On the Farey sequence and its augmentation for applications to image analysis

International Journal of Applied Mathematics and Computer Science ◽

10.1515/amcs-2017-0045 ◽

2017 ◽

Vol 27 (3) ◽

pp. 637-658 ◽

Cited By ~ 1

Author(s):

Sanjoy Pratihar ◽

Partha Bhowmick

Keyword(s):

Document Images ◽

Polygonal Approximation ◽

Efficient Manner ◽

Farey Sequence ◽

Integer Points ◽

Skew Correction ◽

Digital Plane ◽

Engineering Drawings ◽

Novel Concept ◽

Straight Lines

AbstractWe introduce a novel concept of theaugmented Farey table(AFT). Its purpose is to store the ranks of fractions of aFarey sequencein an efficient manner so as to return therankof any query fraction in constant time. As a result, computations on the digital plane can be crafted down to simple integer operations; for example, the tasks like determining the extent of collinearity of integer points or of parallelism of straight lines—often required to solve many image-analytic problems—can be made fast and efficient through an appropriate AFT-based tool. We derive certain interesting characterizations of an AFT for its efficient generation. We also show how, for a fraction not present in a Farey sequence, the rank of thenearest fractionin that sequence can efficiently be obtained by theregula falsimethod from the AFT concerned. To assert its merit, we show its use in two applications—one in polygonal approximation of digital curves and the other in skew correction of engineering drawings in document images. Experimental results indicate the potential of the AFT in such image-analytic applications.

Download Full-text

LOCAL SKEW CORRECTION IN DOCUMENTS

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001408006417 ◽

2008 ◽

Vol 22 (04) ◽

pp. 691-710 ◽

Cited By ~ 14

Author(s):

P. SARAGIOTIS ◽

N. PAPAMARKOS

Keyword(s):

Nearest Neighbor ◽

Connected Components ◽

Document Images ◽

Skew Angle ◽

Vertical Orientation ◽

Text Localization ◽

Skew Correction ◽

Components Analysis ◽

Skew Angles ◽

Connected Components Analysis

In this paper we propose a technique for detecting and correcting the skew of text areas in a document. The documents we work with may contain several areas of text with different skew angles. First, a text localization procedure is applied based on connected components analysis. Specifically, the connected components of the document are extracted and filtered according to their size and geometric characteristics. Next, the candidate characters are grouped using a nearest neighbor approach to form words and then based on these words text lines of any skew are constructed. Then, the top-line and baseline for each text line are estimated using linear regression. Text lines in near locations, having similar skew angles, are grown to form text areas. For each text area a local skew angle is estimated and then these text areas are skew corrected independently to horizontal or vertical orientation. The technique has been extensively tested on a variety of document images and its accuracy and robustness is compared with other existing techniques.

Download Full-text

PAGE SEGMENTATION AND CLASSIFICATION UTILIZING BOTTOM-UP APPROACH

International Journal of Image and Graphics ◽

10.1142/s0219467801000219 ◽

2001 ◽

Vol 01 (02) ◽

pp. 345-361 ◽

Cited By ~ 6

Author(s):

ADNAN AMIN ◽

RICKY SHIU

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Document Image ◽

Connected Components ◽

Page Segmentation ◽

Skew Correction ◽

Recognition Systems ◽

Document Page ◽

Flat Bed Scanner

Document image processing has become an increasingly important technology in the automation of office documentation tasks. Automatic document scanners such as text readers and OCR (Optical Character Recognition) systems are an essential component of systems capable of those tasks. One of the problems in this field is that the document to be read is not always placed correctly on a flat-bed scanner. This means that the document may be skewed on the scanner bed, resulting in a skewed image. This skew has a detrimental effect on document analysis, document understanding, and character segmentation and recognition. Consequently, detecting the skew of a document image and correcting it are important issues in realizing a practical document reader. This paper presents the use of analyzing the connected components extracted from the binary image of a document page. Such an analysis provides a lot of useful information, and will be used to perform skew correction, segmentation and classification of the document. Moreover, we describe two new algorithms — one for skew detection and one for skew correction. The new skew correction algorithm we propose has been shown to be fast and accurate, with run times averaging under 1.5 CPU seconds and 30 seconds real time to calculate the angle on a 5000/20 DEC workstation. Experiments on over 100 pages show that the method works well on a wide variety of layouts, including sparse textual regions, mixed fonts, multiple columns, and even for documents with a high graphical content.

Download Full-text

Simple and Efficient Document Image Binarization Technique For Degraded Document Images

International Journal of Scientific Research ◽

10.15373/22778179/may2014/65 ◽

2012 ◽

Vol 3 (5) ◽

pp. 217-220

Author(s):

Manju Joseph ◽

◽

Jijina K.P Jijina K.P

Keyword(s):

Document Image ◽

Document Images ◽

Image Binarization ◽

Document Image Binarization ◽

Degraded Document

Download Full-text

An enhanced binarization framework for degraded historical document images

EURASIP Journal on Image and Video Processing ◽

10.1186/s13640-021-00556-4 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Wei Xiong ◽

Lei Zhou ◽

Ling Yue ◽

Lirong Li ◽

Song Wang

Keyword(s):

Document Image ◽

Morphological Operations ◽

Document Images ◽

Minimum Entropy ◽

Stroke Width ◽

Background Estimation ◽

Structuring Element ◽

Document Image Binarization ◽

Benchmark Datasets ◽

Stroke Width Transform

AbstractBinarization plays an important role in document analysis and recognition (DAR) systems. In this paper, we present our winning algorithm in ICFHR 2018 competition on handwritten document image binarization (H-DIBCO 2018), which is based on background estimation and energy minimization. First, we adopt mathematical morphological operations to estimate and compensate the document background. It uses a disk-shaped structuring element, whose radius is computed by the minimum entropy-based stroke width transform (SWT). Second, we perform Laplacian energy-based segmentation on the compensated document images. Finally, we implement post-processing to preserve text stroke connectivity and eliminate isolated noise. Experimental results indicate that the proposed method outperforms other state-of-the-art techniques on several public available benchmark datasets.

Download Full-text

Document Image Retrieval in a Question Answering System for Document Images

Document Analysis Systems VI - Lecture Notes in Computer Science ◽

10.1007/978-3-540-28640-0_49 ◽

2004 ◽

pp. 521-532

Author(s):

Koichi Kise ◽

Shota Fukushima ◽

Keinosuke Matsumoto

Keyword(s):

Image Retrieval ◽

Question Answering ◽

Document Image ◽

Document Images ◽

Question Answering System

Download Full-text

A Duplicate Chinese Document Image Retrieval System Based on Line Segment Feature in Character Image Block

Multimedia Systems and Content-Based Image Retrieval ◽

10.4018/978-1-59140-156-8.ch002 ◽

2011 ◽

pp. 14-23

Author(s):

Yung-Kuan Chan ◽

Tung-Shou Chen ◽

Yu-An Ho

Keyword(s):

Image Retrieval ◽

Line Segment ◽

Document Image ◽

Experimental Results ◽

Document Images ◽

Rapid Progress ◽

Image Block ◽

Line Segments ◽

Image Retrieval System ◽

On Line

With the rapid progress of digital image technology, the management of duplicate document images is also emphasized widely. As a result, this paper suggests a duplicate Chinese document image retrieval (DCDIR) system, which uses the ratio of the number of black pixels to that of white pixels on the scanned line segments in a character image block as the feature of the character image block. Experimental results indicate that the system can indeed effectively and quickly retrieve the desired duplicate Chinese document image from a database.

Download Full-text

From Image to XML

Human-Computer Interaction ◽

10.4018/978-1-4666-8789-9.ch063 ◽

2015 ◽

pp. 1295-1318

Author(s):

Robert Keefer ◽

Nikolaos Bourbakis

Keyword(s):

Body Text ◽

Visually Impaired ◽

Document Image ◽

The Body ◽

Image Process ◽

Document Images ◽

Xml Document ◽

Print Materials ◽

The Creation ◽

Multiple Document

Page layout analysis and the creation of an XML document from a document image are useful for many applications including the preservation of archived documents, robust electronic access to printed documents, and access to print materials by the visually impaired. In this paper, the authors describe a document image process pipeline comprised of techniques for the identification of article headings and the related body text, the aggregation of the body text with the headings, and the creation of an XML document. The pipeline was developed to support multiple document images captured by the head-mounted cameras of a reading device for the visually impaired. Both automatic and manual adaptations of the pipeline processed a sample of 25 newspaper document images. By comparing the automatic and manual processes, we show that overall our approach generates high-quality XML encoded documents for use in further processing, such as a text-to-speech for the visually impaired.

Download Full-text

Script Identification of Camera Based Bilingual Document Images Using SFTA Features

10.4018/978-1-6684-3690-5.ch040 ◽

2022 ◽

pp. 811-822

Author(s):

B.V. Dhandra ◽

Satishkumar Mallappa ◽

Gururaj Mukarambi

Keyword(s):

Block Size ◽

Texture Features ◽

Input Image ◽

Document Image ◽

Optimal Size ◽

Document Images ◽

Binary Images ◽

Script Identification ◽

Unified Algorithm ◽

Block Sizes

In this article, the exhaustive experiment is carried out to test the performance of the Segmentation based Fractal Texture Analysis (SFTA) features with nt = 4 pairs, and nt = 8 pairs, geometric features and their combinations. A unified algorithm is designed to identify the scripts of the camera captured bi-lingual document image containing International language English with each one of Hindi, Kannada, Telugu, Malayalam, Bengali, Oriya, Punjabi, and Urdu scripts. The SFTA algorithm decomposes the input image into a set of binary images from which the fractal dimension of the resulting regions are computed in order to describe the segmented texture patterns. This motivates use of the SFTA features as the texture features to identify the scripts of the camera-based document image, which has an effect of non-homogeneous illumination (Resolution). An experiment is carried on eleven scripts each with 1000 sample images of block sizes 128 × 128, 256 × 256, 512 × 512 and 1024 × 1024. It is observed that the block size 512 × 512 gives the maximum accuracy of 86.45% for Gujarathi and English script combination and is the optimal size. The novelty of this article is that unified algorithm is developed for the script identification of bilingual document images.

Download Full-text