Handwritten Indic Script Identification from Document Images—A Statistical Comparison of Different Attribute Selection Techniques in Multi-classifier Environment

Script identification is crucial for automating optical character recognition (OCR) in multi-script documents, since OCR engines are script-dependent. In this paper, we present a comprehensive survey of the techniques developed for handwritten Indic script identification. The pre-processing techniques, feature extraction techniques, and classifiers used for script identification are categorized, and their merits and demerits are discussed. We also provide information about some handwritten Indic script datasets. Finally, we highlight possible extensions and the scope for future work, together with the open challenges.

An Approach for Automatic Indic Script Identification from Handwritten Document Images

Advances in Intelligent Systems and Computing - Advanced Computing and Systems for Security ◽

10.1007/978-81-322-2653-6_3 ◽

2015 ◽

pp. 37-51 ◽

Cited By ~ 1

Author(s):

Sk. Md. Obaidullah ◽

Chayan Halder ◽

Nibaran Das ◽

Kaushik Roy

Keyword(s):

Document Images ◽

Script Identification ◽

Handwritten Document ◽

Indic Script

Transform based approach for Indic script identification from handwritten document images

2015 3rd International Conference on Signal Processing, Communication and Networking (ICSCN) ◽

10.1109/icscn.2015.7219852 ◽

2015 ◽

Cited By ~ 7

Author(s):

Sk Md Obaidullah ◽

Rownaqul Karim ◽

Sujal Shaikh ◽

Chayan Halder ◽

Nibaran Das ◽

...

Keyword(s):

Document Images ◽

Script Identification ◽

Handwritten Document ◽

Indic Script

Convolution Based Technique for Indic Script Identification from Handwritten Document Images

International Journal of Image Graphics and Signal Processing ◽

10.5815/ijigsp.2015.05.06 ◽

2015 ◽

Vol 7 (5) ◽

pp. 49-57 ◽

Cited By ~ 2

Author(s):

Sk Md Obaidullah ◽

Nibaran Das ◽

Kaushik Roy

Keyword(s):

Document Images ◽

Script Identification ◽

Handwritten Document ◽

Indic Script

Indic script identification from handwritten document images — An unconstrained block-level approach

2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS) ◽

10.1109/retis.2015.7232880 ◽

2015 ◽

Cited By ~ 8

Author(s):

Sk Md Obaidullah ◽

Nibaran Das ◽

Chayan Halder ◽

Kaushik Roy

Keyword(s):

Document Images ◽

Script Identification ◽

Block Level ◽

Handwritten Document ◽

Indic Script

Gabor Filter Based Technique for Offline Indic Script Identification from Handwritten Document Images

2014 International Conference on Devices, Circuits and Communications (ICDCCom) ◽

10.1109/icdccom.2014.7024723 ◽

2014 ◽

Cited By ~ 5

Author(s):

Sk Md Obaidullah ◽

Nibaran Das ◽

Kaushik Roy

Keyword(s):

Gabor Filter ◽

Document Images ◽

Script Identification ◽

Handwritten Document ◽

Indic Script

Visual Analytic-Based Technique for Handwritten Indic Script Identification—A Greedy Heuristic Feature Fusion Framework

Advances in Intelligent Systems and Computing - Proceedings of the 4th International Conference on Frontiers in Intelligent Computing: Theory and Applications (FICTA) 2015 ◽

10.1007/978-81-322-2695-6_19 ◽

2015 ◽

pp. 211-219

Author(s):

Sk. Md. Obaidullah ◽

Chayan Halder ◽

Nibaran Das ◽

Kaushik Roy

Keyword(s):

Feature Fusion ◽

Greedy Heuristic ◽

Script Identification ◽

Indic Script ◽

Fusion Framework

Script Identification of Camera Based Bilingual Document Images Using SFTA Features

10.4018/978-1-6684-3690-5.ch040 ◽

2022 ◽

pp. 811-822

Author(s):

B.V. Dhandra ◽

Satishkumar Mallappa ◽

Gururaj Mukarambi

Keyword(s):

Block Size ◽

Texture Features ◽

Input Image ◽

Document Image ◽

Optimal Size ◽

Document Images ◽

Binary Images ◽

Script Identification ◽

Unified Algorithm ◽

Block Sizes

In this article, an exhaustive experiment is carried out to test the performance of Segmentation-based Fractal Texture Analysis (SFTA) features with nt = 4 pairs and nt = 8 pairs, geometric features, and their combinations. A unified algorithm is designed to identify the scripts of camera-captured bilingual document images containing English, as the international language, together with one of the Hindi, Kannada, Telugu, Malayalam, Bengali, Oriya, Punjabi, and Urdu scripts. The SFTA algorithm decomposes the input image into a set of binary images, from which the fractal dimensions of the resulting regions are computed in order to describe the segmented texture patterns. This motivates the use of SFTA features as texture features to identify the scripts of camera-based document images, which are affected by non-homogeneous illumination (Resolution). An experiment is carried out on eleven scripts, each with 1000 sample images, at block sizes of 128 × 128, 256 × 256, 512 × 512 and 1024 × 1024. It is observed that the block size 512 × 512 gives the maximum accuracy of 86.45% for the Gujarati and English script combination and is the optimal size. The novelty of this article is that a unified algorithm is developed for the script identification of bilingual document images.
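The decomposition described in this abstract (binary images derived from the input, then a fractal dimension per image) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `box_counting_dimension` and `sfta_like_features` are hypothetical, quantile thresholds stand in for whatever threshold selection the paper uses, and the fractal dimension is measured on each region itself rather than on its border.

```python
import numpy as np


def box_counting_dimension(mask):
    """Estimate the box-counting (fractal) dimension of a 2-D boolean mask."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any() or max(mask.shape) < 2:
        return 0.0
    # Pad to a square whose side is a power of two so boxes tile it exactly.
    side = 1 << int(np.ceil(np.log2(max(mask.shape))))
    padded = np.zeros((side, side), dtype=bool)
    padded[: mask.shape[0], : mask.shape[1]] = mask
    sizes, counts = [], []
    size = side
    while size >= 1:
        n = side // size
        # Count the boxes of this size that contain any foreground pixel.
        boxes = padded.reshape(n, size, n, size).any(axis=(1, 3))
        sizes.append(size)
        counts.append(boxes.sum())
        size //= 2
    # The slope of log(count) against log(1/size) is the dimension estimate.
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return float(slope)


def sfta_like_features(gray, n_thresholds=4):
    """Return 3 features per threshold band: dimension, mean gray level, area."""
    gray = np.asarray(gray, dtype=float)
    # Assumption: evenly spaced quantiles replace the paper's threshold choice.
    quantiles = np.linspace(0.0, 1.0, n_thresholds + 2)[1:-1]
    lower = np.quantile(gray, quantiles)
    upper = np.append(lower[1:], np.inf)
    features = []
    for lo, hi in zip(lower, upper):
        # One binary image per gray-level band (lo, hi].
        binary = (gray > lo) & (gray <= hi)
        features.append(box_counting_dimension(binary))
        features.append(float(gray[binary].mean()) if binary.any() else 0.0)
        features.append(float(binary.sum()))
    return np.array(features)
```

For a 512 × 512 grayscale block, `sfta_like_features(block, n_thresholds=4)` returns a vector of 12 values that could be passed to any classifier; the block size and threshold count here simply mirror the settings mentioned in the abstract.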
