Improved document skew detection based on text line connected-component clustering

Author(s):  
N. Liolios ◽  
N. Fakotakis ◽  
G. Kokkinakis

Segmentation, the division of something into smaller parts, is one of the components of a character recognition system. It separates the characters, words and lines of a text document. Character recognition is a process that allows computers to recognize written or printed characters, such as numbers or letters, and to convert them into a form that the computer can use. The accuracy of an OCR system is measured by taking the output of an OCR run on an image and comparing it with the original version of the same text. The main aim of this paper is to examine text line segmentation methods, namely projection profiles and the weighted bucket method. The proposed methods are the horizontal projection profile and the connected component method, applied to handwritten Kannada text. These methods are used for experimentation, and their accuracy and results are then compared.
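The horizontal projection profile mentioned in this abstract can be illustrated with a minimal sketch. It assumes a binarized page given as a list of rows in which nonzero values are ink; the function names and the `min_ink` threshold are hypothetical and are not taken from the paper.

```python
def horizontal_profile(image):
    """Count the ink (nonzero) pixels in every row of a binary image."""
    return [sum(1 for px in row if px) for row in image]


def segment_lines(image, min_ink=1):
    """Return (start_row, end_row) pairs, end exclusive, one per text line.

    A row belongs to a text line when its ink count is at least min_ink
    (an assumed threshold); consecutive such rows form one line.
    """
    lines = []
    start = None
    for row_index, ink in enumerate(horizontal_profile(image)):
        if ink >= min_ink and start is None:
            start = row_index                 # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, row_index))  # a gap closes the line
            start = None
    if start is not None:                     # line touching the bottom edge
        lines.append((start, len(image)))
    return lines


# Synthetic 10x8 page with two blocks of ink (rows 1-2 and rows 5-7).
page = [[0] * 8 for _ in range(10)]
for r in (1, 2):
    for c in range(2, 6):
        page[r][c] = 1
for r in (5, 6, 7):
    for c in range(1, 7):
        page[r][c] = 1

print(segment_lines(page))  # [(1, 3), (5, 8)]
```

A plain profile like this only separates lines when blank rows lie between them, so it tends to break down on skewed or touching handwritten lines, which is one reason connected component methods are considered alongside it.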


Author(s):  
Joost van Beusekom ◽  
Faisal Shafait ◽  
Thomas M. Breuel

1997 ◽  
Vol 30 (9) ◽  
pp. 1505-1519 ◽  
Author(s):  
B. Gatos ◽  
N. Papamarkos ◽  
C. Chamzas

Author(s):  
M. Ramanan

Skew detection and correction of a scanned document is a very important step in Optical Character Recognition, because skew in a scanned document reduces the accuracy of subsequent text line processing. This paper presents an approach for skew detection and correction that calculates the skew angle of multi-script scanned documents using the Radon transform, Hough transform, Harris corner detection, Wiener filter and smearing algorithm. The proposed approach is compared with existing skew detection and correction techniques for printed documents in different scripts: English, Tamil, Sinhala and mixed-script. The proposed hybrid method is tested on 160 documents. The overall test result is 90.62% for skew detection and correction.
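To make the idea of estimating a skew angle concrete, here is a minimal sketch of a projection-based angle search, which is closely related to Hough transform voting. It is not the hybrid method of this paper. The function name, the candidate angle grid and the sharpness score (sum of squared bin counts) are all assumptions chosen for illustration.

```python
import math


def estimate_skew(points, candidate_degrees):
    """Return the candidate angle whose projection profile is sharpest.

    points: (x, y) coordinates of ink pixels, y growing downwards.
    For each candidate angle, every point votes for the offset
    rho = y*cos(t) - x*sin(t). When t equals the true skew, all points
    of one text line share the same rho, so the votes pile up in few bins.
    """
    best_angle, best_score = None, -1
    for deg in candidate_degrees:
        t = math.radians(deg)
        c, s = math.cos(t), math.sin(t)
        bins = {}
        for x, y in points:
            rho = round(y * c - x * s)
            bins[rho] = bins.get(rho, 0) + 1
        score = sum(n * n for n in bins.values())  # peaked profile => high score
        if score > best_score:
            best_angle, best_score = deg, score
    return best_angle


# Synthetic input: two one-pixel-thick "text lines" skewed by 5 degrees.
slope = math.tan(math.radians(5))
points = [(x, round(y0 + x * slope)) for y0 in (20, 60) for x in range(200)]

angle = estimate_skew(points, range(-10, 11))
print(angle)
```

Once an angle is found, the page would be rotated by the opposite amount to deskew it. The one-degree grid used here is coarse; a finer grid or a coarse-to-fine search would be needed for sub-degree accuracy.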

