A skew detection and correction technique for Arabic script text-line based on subwords bounding

Author(s):  
Atallah M. Al-Shatnawi
Author(s):  
Joost van Beusekom ◽  
Faisal Shafait ◽  
Thomas M. Breuel

1997 ◽  
Vol 30 (9) ◽  
pp. 1505-1519 ◽  
Author(s):  
B. Gatos ◽  
N. Papamarkos ◽  
C. Chamzas

Author(s):  
Neha. N

Document image processing is an increasingly important technology, essential to all optical character recognition (OCR) systems and to the automation of various office documents. A document originally has zero skew (tilt), but when a page is scanned or photocopied, skew may be introduced by various factors and is practically unavoidable. The presence of even a small amount of skew (0.5°) has detrimental effects on document analysis, because it directly affects the reliability and efficiency of the segmentation, recognition and feature extraction stages. Removal of skew is therefore of paramount importance in document analysis and OCR, and is the first step to be accomplished. This paper presents a novel technique for skew detection and correction which is both language and content independent. The proposed technique is based on the maximum density of the foreground pixels and their orientation in the document image. Unlike conventional algorithms, which work only for machine-printed textual documents in English, this technique works well for all kinds of document images (machine printed, handwritten, complex, noisy and simple). The technique was tested with 150 different document image samples and was found to provide results with an accuracy of 0.1°.
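The abstract does not spell out how foreground-pixel density and orientation are combined, so the following is only a minimal sketch of a related, well-known idea: the classic projection-profile search, which is not necessarily the authors' method. It assumes the foreground pixels are already available as (x, y) coordinates; the function name `estimate_skew`, the search range and the step size are all hypothetical choices made for illustration.

```python
import math

def estimate_skew(points, max_angle=5.0, step=0.1):
    """Return the candidate angle (degrees) whose row profile is most concentrated.

    Illustrative projection-profile search, not the method from the abstract.
    points: iterable of (x, y) foreground pixel coordinates.
    """
    best_angle, best_score = 0.0, -1.0
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        t = math.radians(angle)
        sin_t, cos_t = math.sin(t), math.cos(t)
        bins = {}
        for x, y in points:
            # Row index of the pixel after rotating the page by -angle.
            row = math.floor(y * cos_t - x * sin_t)
            bins[row] = bins.get(row, 0) + 1
        # Sum of squared counts is largest when pixels pile up in few rows,
        # i.e. when the text lines are horizontal after the rotation.
        score = sum(c * c for c in bins.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# Synthetic page: three one-pixel-thin "text lines" tilted by 2 degrees.
slope = math.tan(math.radians(2.0))
page = [(x, x * slope + offset)
        for offset in (20.5, 40.5, 60.5)
        for x in range(600)]
print(estimate_skew(page))
```

The sign of the returned angle depends on the coordinate convention (image rows usually grow downward), so a real pipeline would need to fix that convention before rotating the page back.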


Author(s):  
M. Ramanan

Skew detection and correction of a scanned document is a very important step in Optical Character Recognition, because skew in a scanned document reduces accuracy. A text-line approach to skew detection and correction is used to calculate the skew angle of multi-script scanned documents using the Radon transform, Hough transform, Harris corner detection, Wiener filter and smearing algorithm. In this paper, the proposed approach is compared with existing skew detection and correction techniques for printed documents in different scripts: English, Tamil, Sinhala and mixed-script. The proposed hybrid method was tested on 160 documents. The overall test result is 90.62% for skew detection and correction.


2008 ◽  
Vol 08 (01) ◽  
pp. 47-59
Author(s):  
A. V. N. MANJUNATH ◽  
K. G. HEMANTHA ◽  
S. NOUSHATH

In this paper, we propose a novel skew estimation technique for binary document images based on the Boundary Growing Method (BGM), thinning and moments. BGM helps in extracting the text line blocks from the document. Thinning is performed to fit the best line for the extracted text line blocks. The skew is then computed for the thinned line using second-order moments. Several experiments were conducted on various types of documents, such as documents containing South Indian languages, English documents, journals, text with pictures, noisy images, and documents with different fonts and resolutions, to reveal the robustness of the proposed method. Based on the experimental results, we found that the proposed method outperforms existing methods in terms of both mean and standard deviation.
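The last step, computing skew from second-order moments, can be illustrated with the standard orientation formula θ = ½·atan2(2μ₁₁, μ₂₀ − μ₀₂) over central moments. The sketch below assumes the thinned text line is given as a list of (x, y) points; the BGM and thinning stages are not reproduced, and the function name `skew_from_moments` is hypothetical rather than taken from the paper.

```python
import math

def skew_from_moments(points):
    """Orientation (degrees) of a point set from its second-order central moments.

    points: list of (x, y) coordinates, e.g. the pixels of a thinned text line.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Second-order central moments of the point set.
    mu20 = sum((x - mean_x) ** 2 for x, _ in points)
    mu02 = sum((y - mean_y) ** 2 for _, y in points)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Angle of the principal axis relative to the x axis.
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return math.degrees(theta)

# Synthetic thinned line tilted by 5 degrees.
line = [(x, x * math.tan(math.radians(5.0))) for x in range(200)]
print(skew_from_moments(line))
```

For points lying exactly on a line of slope tan(a), the ratio 2μ₁₁ / (μ₂₀ − μ₀₂) equals tan(2a), which is why halving the arctangent recovers the tilt for angles below 45 degrees in magnitude. As with any image-based estimate, the sign depends on whether the y axis points up or down.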


2017 ◽  
Vol 13 (4) ◽  
pp. 13-21
Author(s):  
Sh M Khapizov ◽  
M G Shekhmagomedov

The article is devoted to the study of inscriptions on the gravestones of Haji Ibrahim al-Uradi, his father, brothers and other relatives. The information revealed by translating these inscriptions allows one to date important events in the history of Highland Dagestan and to reconsider some important events from the past of Hidatl. The epitaphs are interesting in and of themselves, as historical and cultural monuments that needed to be studied and attributed. Research on epigraphic monuments clarifies the periodization of the medieval epitaphs of mountain Dagestan, using record templates and features of the Arabic script. We see the study of medieval epigraphy as one of the important tasks of contemporary Caucasian studies facing Dagestani researchers. Given the relatively sparse coverage of the events of that period in historical sources, comprehensive work in this direction can fill gaps in our knowledge of the medieval history of Dagestan. In addition, these epigraphs are of great importance for researchers of onomastics, linguistics, and the history of culture and religion of Dagestan. The authors managed to clarify the dates of death of Ibrahim-Haji al-Uradi and of his two sons. These data, together with written sources and legends, allowed the reconstruction of events of the second half of the 18th century. For example, because of a plague epidemic and the death of most of the population of Hidatl, this society was noticeably weakened and could no longer maintain its influence on Akhvakh. Drawing on commemorative records allowed us to specify the dates of Ibrahim-Haji's pilgrimage to Mecca and Medina, as well as the route by which he traveled to these cities.


2019 ◽  
Vol 16 (2-3) ◽  
pp. 281-300
Author(s):  
Amanda Lanzillo

Focusing on the lithographic print revolution in North India, this article analyses the role played by scribes working in Perso-Arabic script in the consolidation of late nineteenth-century vernacular literary cultures. In South Asia, the rise of lithographic printing for Perso-Arabic script languages and the slow shift from classical Persian to vernacular Urdu as a literary register took place roughly contemporaneously. This article interrogates the positionality of scribes within these transitions. Because print in North India relied on lithography, not movable type, scribes remained an important part of book production on the Indian subcontinent through the early twentieth century. It analyses the education and models of employment of late nineteenth-century scribes. New scribal classes emerged during the transition to print and vernacular literary culture, in part due to the intervention of lithographic publishers into scribal education. The patronage of Urdu-language scribal manuals by lithographic printers reveals that scribal education in Urdu was directly informed by the demands of the print economy. Ultimately, using an analysis of scribal manuals, the article contributes to our knowledge of the social positioning of book producers in South Asia and demonstrates the vitality of certain practices associated with manuscript culture in the era of print.

