A robust skew detection algorithm for grayscale document image

Author(s):  
Ming Chen ◽  
Xiaoqing Ding
2005 ◽  
Vol 05 (02) ◽  
pp. 247-265 ◽  
Author(s):  
Adnan Amin ◽  
Sue Wu

This article presents an automatic system that takes grayscale scanned images, which may be mixed text/graphic documents, and performs thresholding and skew detection on them. The system consists of two major components: multistage thresholding and skew detection. The proposed skew detection algorithm places no restriction on the detectable angle range and does not rely on large blocks of text. It works well on textual document images, graphical images, and mixed text and graphic images. The performance of the system was evaluated on more than 60 images consisting of real-life documents, such as envelopes, and artificial mixed text/graphic icons. In this evaluation, the thresholding clearly outperforms other techniques, and the skew detection algorithm remains robust compared with other methods when very few text lines are present in the document image.


2018 ◽  
Vol 55 (1) ◽  
pp. 011007
Author(s):  
张新红 Zhang Xinhong ◽  
张一凡 Zhang Yifan ◽  
张帆 Zhang Fan

1996 ◽  
Vol 29 (10) ◽  
pp. 1599-1629 ◽  
Author(s):  
Bin Yu ◽  
Anil K. Jain

Electronics ◽  
2019 ◽  
Vol 9 (1) ◽  
pp. 55
Author(s):  
Kai Huang ◽  
Zixuan Chen ◽  
Min Yu ◽  
Xiaolang Yan ◽  
Aiguo Yin

Document skew detection is one of the key technologies in most document analysis systems. However, existing skew detection methods either have low accuracy or require a large amount of computation. To achieve a good tradeoff between efficiency and performance, we propose a novel skew detection approach based on bounding boxes, a probability model, and Dixon’s Q test. First, bounding boxes are used to pick out the eligible connected components (ECC). Then, we calculate the slopes of the skewed document with the probability model. Finally, we find the optimal result with Dixon’s Q test and the projection profile method. Moreover, the proposed method can detect the skew angle over a wider range. The experimental results show that, compared with existing algorithms, our skew detection algorithm achieves high speed and high accuracy simultaneously.
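The abstract does not say how Dixon’s Q test is applied to the slope estimates, so the following is only a minimal sketch of the standard test itself, assuming a small sample of candidate slopes and the commonly tabulated two-sided critical values at 95% confidence. The function name `dixon_q_filter` and the sample size limits are hypothetical choices for illustration, not taken from the paper.

```python
# Commonly tabulated two-sided critical values for Dixon's Q at 95% confidence,
# indexed by sample size n = 3..10 (an assumption of this sketch).
Q95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
       7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def dixon_q_filter(values):
    """Return the sorted values, dropping at most one extreme outlier.

    Q is the gap between a suspect extreme value and its nearest
    neighbour, divided by the full range of the sample.
    """
    data = sorted(values)
    n = len(data)
    if n not in Q95:
        raise ValueError("this sketch only tabulates n = 3..10")
    spread = data[-1] - data[0]
    if spread == 0:
        return data  # all values identical, nothing to reject
    q_low = (data[1] - data[0]) / spread
    q_high = (data[-1] - data[-2]) / spread
    if max(q_low, q_high) <= Q95[n]:
        return data  # neither extreme is flagged
    # Reject whichever extreme has the larger Q statistic.
    return data[1:] if q_low > q_high else data[:-1]
```

In a skew pipeline of this kind, the surviving slopes could then be averaged or passed to a finer search; that step is left out here because the abstract gives no detail on it.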


2019 ◽  
Vol 8 (S1) ◽  
pp. 50-53
Author(s):  
N. P. Revathy ◽  
S. Janarthanam ◽  
S. Sukumaran

Document images are increasingly popular and are being made available over the internet for information retrieval. Retrieval from document images is more difficult than retrieval from digital text, and edge detection is an important task in document image retrieval. Edge detection refers to the process of finding sharp discontinuities of characters in document images. Because single edge detection methods suffer from weak-gradient and missing-edge problems, the proposed method combines global and local edge detection to extract edges. The global edge detection obtains the whole edges using an improved adaptive smoothing filter algorithm based on the Canny operator. This combination increases detection efficiency and reduces computation time. In addition, the proposed algorithm has been tested in a real-time document retrieval system to detect edges in an unstructured environment and generate 2D maps. These maps contain the starting and destination points in addition to the current positions of the objects. The proposed work enhances the ability of the document search to move towards the optimal solution, and its capability is verified in terms of detection efficiency.


Author(s):  
Neha. N

Document image processing is an increasingly important technology, essential in all optical character recognition (OCR) systems and for the automation of various office documents. A document originally has zero skew (tilt), but when a page is scanned or photocopied, skew may be introduced by various factors and is practically unavoidable. The presence of even a small amount of skew (0.5°) has a detrimental effect on document analysis, because it directly affects the reliability and efficiency of the segmentation, recognition and feature extraction stages. Removal of skew is therefore of paramount importance in document analysis and OCR, and it is the first step to be accomplished. This paper presents a novel technique for skew detection and correction that is both language and content independent. The proposed technique is based on the maximum density of the foreground pixels and their orientation in the document image. Unlike conventional algorithms that work only for machine-printed textual documents in English, this technique works well for all kinds of document images (machine printed, handwritten, complex, noisy and simple). The technique was tested with 150 different document image samples and was found to provide results with an accuracy of 0.1°.
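The abstract gives no algorithmic detail beyond "maximum density of the foreground pixels and their orientation", so the sketch below is not the paper's method. It illustrates one common density-based idea, a projection-profile search, assuming the foreground pixels are already available as `(x, y)` coordinates. The function name `estimate_skew` and the parameters `max_angle` and `step` are hypothetical.

```python
import math


def estimate_skew(points, max_angle=15.0, step=0.5):
    """Return the candidate angle in degrees whose row profile is densest.

    For each candidate angle, every foreground point is projected onto
    the axis perpendicular to lines at that angle. When the candidate
    matches the text orientation, points from one text line fall into
    the same row bin, which maximises the sum of squared bin counts.
    """
    best_angle, best_score = 0.0, -1.0
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        rows = {}
        for x, y in points:
            # Row index of the point after undoing a rotation by `angle`.
            r = round(y * cos_t - x * sin_t)
            rows[r] = rows.get(r, 0) + 1
        score = sum(count * count for count in rows.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

The search resolution is bounded by `step`, so reaching an accuracy on the order of 0.1° with this kind of search would need a step at least that fine, at a proportionally higher cost.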

