Automatic optimized document skew pre-processor for character segmentation algorithm

This paper presents an automatic, optimized document skew correction approach based on the Hough transform, developed as part of a character segmentation algorithm. Skew correction matters in document image analysis because further processing is impossible if the document image is skewed. The proposed approach combines a fast implementation of the standard Hough transform with a highly optimized, low-level machine code implementation of the image rotation. To achieve high computational performance, a linear image representation is used. The results of the proposed approach are analyzed in terms of time complexity and skew estimation accuracy and compared with existing skew correction approaches. The proposed approach outperforms the analogous approach used in related work, but it performs worse than an optimized version that exploits a BAG algorithm. The results show a significant improvement over the standard Hough transform implementation.
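The abstract gives no implementation details, so the following is only a minimal sketch of Hough-style skew estimation over a linear (flat, row-major) image buffer, not the authors' method. The function names `estimate_skew` and `make_skewed_lines`, the angle range, the step size, and the sum-of-squares scoring of the accumulator are all assumptions made for illustration. Each foreground pixel votes for a distance `rho` at every candidate angle, and the angle whose votes concentrate in the fewest bins is taken as the skew.

```python
import math


def make_skewed_lines(width, height, angle_deg, baselines):
    """Build a flat, row-major binary image with straight lines at a given skew (hypothetical helper)."""
    pixels = [0] * (width * height)
    slope = math.tan(math.radians(angle_deg))
    for y0 in baselines:
        for x in range(width):
            y = int(round(y0 + x * slope))
            if 0 <= y < height:
                pixels[y * width + x] = 1
    return pixels


def estimate_skew(pixels, width, height, max_angle=10.0, step=0.5):
    """Return the candidate angle (degrees) that best aligns foreground pixels into lines."""
    if len(pixels) != width * height:
        raise ValueError("buffer length does not match width * height")
    # Collect foreground coordinates once from the flat buffer.
    points = [(i % width, i // width) for i, v in enumerate(pixels) if v]
    best_angle, best_score = 0.0, -1
    n_steps = int(round(2 * max_angle / step))
    for k in range(n_steps + 1):
        angle = -max_angle + k * step
        theta = math.radians(angle)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        votes = {}  # accumulator row for this angle: rho bin -> vote count
        for x, y in points:
            # Distance of the pixel from the origin along the normal of a line
            # tilted by `angle`; it is constant for pixels on such a line.
            rho = int(round(y * cos_t - x * sin_t))
            votes[rho] = votes.get(rho, 0) + 1
        # Assumed scoring: sum of squared votes rewards sharply peaked rows.
        score = sum(c * c for c in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Usage: estimate the skew of three synthetic text baselines tilted by 5 degrees.
image = make_skewed_lines(200, 100, 5.0, [20, 40, 60])
print(estimate_skew(image, 200, 100))
```

Restricting the search to a narrow angle range and extracting the foreground coordinates once keeps the cost proportional to the number of foreground pixels times the number of candidate angles. The subsequent rotation by the negated estimate is omitted here.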


Document Image Skew Correction Method Based on Characteristic Sample Point Detection and Hough Transform

Lecture Notes in Electrical Engineering - Proceedings of the 9th International Symposium on Linear Drives for Industry Applications, Volume 3 ◽

10.1007/978-3-642-40633-1_94 ◽

2013 ◽

pp. 759-767

Author(s):

Lijing Tong ◽

Quanyao Peng ◽

Yang Li ◽

Guoliang Zhan ◽

Yifan Li

Keyword(s):

Hough Transform ◽

Correction Method ◽

Document Image ◽

Sample Point ◽

Skew Correction ◽

Characteristic Sample ◽

Point Detection


Document Image Skew Correction Method based on Characteristic Sample Point Detection and Hough Transform

Journal of Convergence Information Technology ◽

10.4156/jcit.vol7.issue22.68 ◽

2012 ◽

Vol 7 (22) ◽

pp. 576-584

Author(s):

Lijing Tong ◽

Huiqun Zhao ◽

Quanyao Peng ◽

Guoliang Zhan ◽

Yifan Li

Keyword(s):

Hough Transform ◽

Correction Method ◽

Document Image ◽

Sample Point ◽

Skew Correction ◽

Characteristic Sample ◽

Point Detection


Proceedings First International Workshop on Document Image Analysis for Libraries

First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings. ◽

10.1109/dial.2004.1324424 ◽

2004 ◽

Keyword(s):

Image Analysis ◽

International Workshop ◽

Document Image ◽

Document Image Analysis


A character segmentation algorithm for mixed-mode communication

Systems and Computers in Japan ◽

10.1002/scj.4690160408 ◽

1985 ◽

Vol 16 (4) ◽

pp. 66-75 ◽

Cited By ~ 2

Author(s):

Osamu Nakamura ◽

Makoto Ujiie ◽

Noriyoshi Okamoto ◽

Toshi Minami

Keyword(s):

Mixed Mode ◽

Segmentation Algorithm ◽

Character Segmentation


Document Image Analysis in Compressed Domain-Limitations, Applications & Challenges

2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) ◽

10.1109/iceca49313.2020.9297593 ◽

2020 ◽

Author(s):

Kavita V. Horadi

Keyword(s):

Image Analysis ◽

Document Image ◽

Compressed Domain ◽

Document Image Analysis


MULTI-LEVEL DOCUMENT IMAGE SEGMENTATION USING MULTI-LAYER PERCEPTRON AND SUPPORT VECTOR MACHINE

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001412530023 ◽

2012 ◽

Vol 26 (06) ◽

pp. 1253002 ◽

Cited By ~ 1

Author(s):

YAN ZHANG ◽

BIN YU ◽

HAI-MING GU

Keyword(s):

Support Vector Machine ◽

Image Segmentation ◽

Research Area ◽

Document Image ◽

Support Vector ◽

Multi Layer Perceptron ◽

Document Image Analysis ◽

Important Research Area ◽

Multi Level ◽

Document Image Segmentation

Document image segmentation is an important research area of document image analysis that classifies the contents of a document image into a set of text and non-text classes. Existing methods are often designed to classify only text and halftones, so they perform poorly on graphics, tables, circuits, and similar content. In this paper, we present a robust multi-level classification method using a multi-layer perceptron (MLP) and a support vector machine (SVM) to segment text from non-text and then classify the non-text as tables, graphics, or halftones. This method outperforms existing methods by overcoming various issues associated with the complexity of document images. Experimental results demonstrate the effectiveness of the proposed method. With the multi-level classification approach, text, halftone, graphic, and table components are each classified accurately, which would greatly improve OCR accuracy by reducing garbage symbols and would also increase the compression ratio.
