Computational modelling of an optical character recognition system for Yorùbá printed text images

2020 ◽  
Vol 9 ◽  
pp. e00415
Author(s):  
Olalekan Joseph ONI ◽  
Franklin Oladiipo ASAHIAH
2021 ◽  
Vol 11 (6) ◽  
pp. 7968-7973
Author(s):  
M. Kazmi ◽  
F. Yasir ◽  
S. Habib ◽  
M. S. Hayat ◽  
S. A. Qazi

Urdu Optical Character Recognition (OCR) based on character-level recognition (the analytical approach) is less popular than ligature-level recognition (the holistic approach) because of its added complexity and the overlapping of characters and strokes. This paper presents a holistic technique for Urdu ligature extraction. The proposed Photometric Ligature Extraction (PLE) technique is independent of font size and column layout and can handle non-overlapping ligatures as well as all inter- and intra-overlapping ligatures. It uses a customized photometric filter, together with X-shearing, padding, and connected component analysis, to extract complete ligatures instead of extracting primary and secondary ligatures separately. A total of approximately 267,800 ligatures were extracted from scanned Urdu Nastaliq printed text images with an accuracy of 99.4%. Thus, the proposed framework outperforms the existing Urdu Nastaliq text extraction and segmentation algorithms. The proposed PLE framework can also be applied to other languages that use the Nastaliq script style, such as Arabic, Persian, Pashto, and Sindhi.
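The connected component analysis step named in this abstract can be illustrated with a minimal sketch. This is not the PLE technique itself: the photometric filter, X-shearing, and padding are omitted, and the function name `connected_components`, the 8-connectivity choice, and the list-of-lists image format are assumptions made only for illustration.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    foreground regions, where foreground pixels have value 1.

    Illustrative only: a generic labeling pass, not the PLE algorithm.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# A tiny hypothetical binary page with two separate ink regions.
page = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(connected_components(page))
```

In a ligature extractor, each bounding box would be a candidate region; grouping a main body with its detached dots and diacritics is the part the abstract attributes to the photometric filter and shearing, which this sketch does not attempt.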


2017 ◽  
Vol 11 (1) ◽  
pp. 193-200
Author(s):  
Brahim Sabir ◽  
Yassine Khazri ◽  
Mohamed Moussetad ◽  
Bouzekri Touri

Background: Optical Character Recognition (OCR) is a technique that converts scanned or printed text images into editable text. Many OCR solutions have been proposed and used for Latin and Chinese alphabets. However, little can be found about OCR for handwritten Arabic alphabets, especially for use by blind and visually impaired persons. This paper is an attempt towards the development of an OCR for Arabic alphabets dedicated to blind and visually impaired persons. Method: The proposed Optical Arabic Alphabets Recognition algorithm includes binarization of the input image, segmentation, feature extraction, and classification based on neural networks to match the Arabic letters read against trained patterns. The proposed algorithm was developed using Matlab, and the solution was designed to be implemented on a hardware platform and can be customized for mobile phones. Conclusion: The presented method has the benefit that its recognition accuracy is comparable to that of other OCR algorithms.
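The first two stages of the pipeline described in this abstract, binarization and segmentation, can be sketched as follows. The paper's implementation was in Matlab and is not reproduced here; the fixed threshold of 128, the column-projection segmentation, and the function names `binarize` and `segment_columns` are assumptions for illustration, and feature extraction and the neural network classifier are left out.

```python
def binarize(gray, threshold=128):
    """Map grayscale pixels (0-255) to 1 for dark ink, 0 for background.

    The fixed threshold is an assumption; the abstract does not say how
    the threshold was chosen.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary):
    """Split a binary image into (start, end) column ranges that contain
    ink, using empty columns as separators (a vertical projection)."""
    cols = len(binary[0])
    ink = [any(row[c] for row in binary) for c in range(cols)]
    segments, start = [], None
    for c, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = c  # a new run of ink columns begins
        elif not has_ink and start is not None:
            segments.append((start, c - 1))  # the run ended at the previous column
            start = None
    if start is not None:
        segments.append((start, cols - 1))  # run reaches the right edge
    return segments


# A tiny hypothetical grayscale image with two dark regions.
gray = [
    [255, 10, 20, 255, 255, 30],
    [255, 255, 40, 255, 255, 50],
]
print(segment_columns(binarize(gray)))
```

Column projection only separates glyphs that have blank columns between them, so it suits isolated letters; connected or cursive Arabic writing would need a more careful segmentation step.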

