Word-Based Correction for Retrieval of Arabic OCR Degraded Documents

Author(s):  
Walid Magdy ◽  
Kareem Darwish
2008 ◽  
Vol 18 (05) ◽  
pp. 405-418 ◽  
Author(s):  
ADNAN KHASHMAN ◽  
BORAN SEKEROGLU

Advances in digital technologies have allowed us to generate more images than ever. Images of scanned documents are one example, and they form a vital part of digital libraries and archives. Scanned degraded documents contain background noise and varying contrast and illumination; therefore, document image binarisation must be performed to separate the foreground from the background layers. Image binarisation is performed using either local adaptive thresholding or global thresholding, with local thresholding generally considered more successful. This paper presents a novel approach to global thresholding, in which a neural network is trained using local threshold values of an image to determine an optimum global threshold value, which is then used to binarise the whole image. The proposed method is compared with five local thresholding methods, and the experimental results indicate that our method is computationally cost-effective and capable of binarising scanned degraded documents with superior results.
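To make the distinction between global and local thresholding concrete, the following is a minimal sketch in plain Python. It is not the neural-network method of the abstract above; the function names, the mean-of-neighbourhood rule, and the `radius` and `offset` parameters are illustrative assumptions.

```python
from statistics import mean

def global_binarise(img, t):
    """Apply one threshold t to every pixel: 1 = foreground (dark), 0 = background."""
    return [[1 if p < t else 0 for p in row] for row in img]

def local_mean_binarise(img, radius=1, offset=0):
    """Compare each pixel with the mean of its neighbourhood (hypothetical helper).

    A pixel is foreground when it is darker than the local mean by more
    than `offset`; the window is clipped at the image borders.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(1 if img[y][x] < mean(window) - offset else 0)
        out.append(row)
    return out
```

On an unevenly lit page, where text in the bright region is lighter than background in the shadowed region, no single value of `t` separates the two layers, whereas the local rule adapts to each neighbourhood at the cost of computing one window per pixel.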


Author(s):  
Ahmed Hussain Aliwy ◽  
Basheer Al-Sadawi

Optical character recognition (OCR) is the process of converting images of text documents into editable and searchable text. OCR poses several challenges, particularly for the Arabic language, where it produces a high percentage of errors. This paper proposes a method for improving the output of Arabic optical character recognition (AOCR) systems, based on a statistical language model built from the large corpora available. The method detects and corrects non-word and real-word errors according to the context of the word in the sentence. The results show that the method improves the accuracy of AOCR output to as much as 98%.
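As a rough illustration of context-based correction with a statistical language model, the sketch below detects non-word errors and picks the in-vocabulary candidate that best fits the preceding word. It is a toy under stated assumptions, not the method of the abstract above: the bigram counts, the edit-distance-1 candidate set, and the names `train`, `edits1` and `correct` are all hypothetical, and real-word errors are not handled.

```python
from collections import Counter

def train(corpus):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def edits1(word, alphabet):
    """All strings one deletion, substitution or insertion away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    replaces = [a + c + b[1:] for a, b in splits if b for c in alphabet]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + replaces + inserts)

def correct(sentence, unigrams, bigrams):
    """Replace each out-of-vocabulary word with its best in-context candidate."""
    alphabet = {ch for w in unigrams if w != "<s>" for ch in w}
    out, prev = [], "<s>"
    for word in sentence.split():
        if word not in unigrams:  # non-word error: not in the vocabulary
            candidates = [c for c in edits1(word, alphabet) if c in unigrams]
            if candidates:
                # Prefer the candidate seen most often after the previous word,
                # then the most frequent one; the string itself breaks ties.
                word = max(candidates,
                           key=lambda c: (bigrams[(prev, c)], unigrams[c], c))
        out.append(word)
        prev = word
    return " ".join(out)
```

The same misrecognised token can be corrected differently depending on the word before it, which is the point of using context rather than word frequency alone. Extending the sketch to real-word errors would require comparing the likelihood of an in-vocabulary word against its neighbours as well.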


2016 ◽  
Vol 3 (3) ◽  
pp. 27
Author(s):  
HARSHMANI ◽  
GUPTA NANCY ◽  
KAUR GURPREET ◽  
...  
