A clustering‐based feature selection framework for handwritten Indic script classification

Abstract: In any multi-script environment, handwritten script classification is an unavoidable prerequisite before document images are fed to their respective Optical Character Recognition (OCR) engines. Over the years, researchers have tackled this complex pattern classification problem by proposing various feature vectors, most of them high-dimensional, which increases the computational complexity of the whole classification model. Feature Selection (FS) can serve as an intermediate step that reduces the size of the feature vectors by restricting them to the essential and relevant features. In the present work, we address this issue by introducing a new FS algorithm, called Hybrid Swarm and Gravitation-based FS (HSGFS). The algorithm is applied to three feature vectors recently introduced in the literature: Distance-Hough Transform (DHT), Histogram of Oriented Gradients (HOG), and Modified log-Gabor (MLG) filter Transform. Three state-of-the-art classifiers, namely Multi-Layer Perceptron (MLP), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), are used to evaluate the optimal subset of features generated by the proposed FS model. Handwritten datasets at block, text-line, and word level, covering the 12 officially recognized Indic scripts, were prepared for experimentation. An average improvement of 2–5% in classification accuracy is achieved while using only about 75–80% of the original feature vectors on all three datasets. The proposed method also performs better than some popularly used FS models. The code used to implement HSGFS is available on GitHub: https://github.com/Ritam-Guha/HSGFS.
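As a rough illustration of how a classifier can be used to evaluate a candidate feature subset, the sketch below scores a binary feature mask with a small k-nearest-neighbour classifier. It is not the HSGFS algorithm and uses none of the DHT, HOG, or MLG features; the toy data, the function names (`knn_predict`, `subset_accuracy`, `make_toy_data`), and the choice of k = 3 are all assumptions made for this example.

```python
import random

def knn_predict(train_X, train_y, x, mask, k=3):
    """Majority vote among the k nearest training rows,
    measuring distance only on features where mask[i] is 1."""
    dists = []
    for row, label in zip(train_X, train_y):
        d = sum((a - b) ** 2 for a, b, m in zip(row, x, mask) if m)
        dists.append((d, label))
    dists.sort(key=lambda t: t[0])
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def subset_accuracy(train_X, train_y, test_X, test_y, mask):
    """Fraction of test rows classified correctly with the masked features."""
    correct = sum(
        knn_predict(train_X, train_y, x, mask) == y
        for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)

def make_toy_data(n, rng):
    """Hypothetical data: feature 0 separates the classes, features 1-3 are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = label * 4.0 + rng.gauss(0, 0.3)
        noise = [rng.gauss(0, 5.0) for _ in range(3)]
        X.append([informative] + noise)
        y.append(label)
    return X, y

rng = random.Random(0)
train_X, train_y = make_toy_data(100, rng)
test_X, test_y = make_toy_data(50, rng)

full_mask = [1, 1, 1, 1]      # keep every feature
reduced_mask = [1, 0, 0, 0]   # keep only the informative feature

acc_full = subset_accuracy(train_X, train_y, test_X, test_y, full_mask)
acc_reduced = subset_accuracy(train_X, train_y, test_X, test_y, reduced_mask)
print("all features:", acc_full, "| reduced subset:", acc_reduced)
```

A search-based FS method would call a scoring function like `subset_accuracy` repeatedly on different masks; how HSGFS generates and updates those masks is described in the paper and its repository, not here.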

Relationship Aware Context Adaptive Feature Selection Framework for Image Parsing

10.1109/ijcnn52387.2021.9534310 ◽

2021 ◽

Author(s):

Basim Azam ◽

Ranju Mandal ◽

Brijesh Verma

Keyword(s):

Feature Selection ◽

Image Parsing ◽

Adaptive Feature Selection ◽

Selection Framework

Deployment of a Regularized Feature Selection Framework on an Overlay Desktop Grid

2009 International Workshop on High Performance Computational Systems Biology ◽

10.1109/hibi.2009.15 ◽

2009 ◽

Cited By ~ 1

Author(s):

Annalisa Barla ◽

Marco Ferrante

Keyword(s):

Feature Selection ◽

Desktop Grid ◽

Selection Framework

Hybrid Feature Selection Framework for Identification of Alzheimer's Biomarkers

Indian Journal of Science and Technology ◽

10.17485/ijst/2018/v11i20/123310 ◽

2018 ◽

Vol 11 (22) ◽

pp. 1-10

Author(s):

V. Thavavel ◽

M. Karthiyayini

Keyword(s):

Feature Selection ◽

Selection Framework

An ensemble feature selection framework for early detection of Parkinson's disease based on feature correlation analysis

10.22541/au.161526195.51192977/v1 ◽

2021 ◽

Author(s):

Sarfaraz Masood ◽

Khwaja Wisal ◽

Om Pal ◽

Chanchal Kumar

Keyword(s):

Parkinson's Disease ◽

Feature Selection ◽

Large Population ◽

Feature Selection Method ◽

Sampling Strategy ◽

Chi Square ◽

Accurate Identification ◽

Initial Symptoms ◽

Selection Framework

Parkinson’s disease (PD) is a highly common neurological disease affecting a large population worldwide. Several studies have revealed that degradation of the voice, also known as dysarthria, is one of its initial symptoms. In this work, we explore and harness the correlation between various features of the voice samples observed in PD subjects. To do so, we propose a novel two-level ensemble-based feature selection method, whose results are combined with an MLP-based classifier, using K-fold cross-validation as the re-sampling strategy. Three separate benchmark datasets of voice samples were used for the experiments. The results strongly suggest that the proposed feature selection framework helps identify an optimal set of features, which in turn enables highly accurate identification of PD patients from their voice samples using a Multi-Layer Perceptron. The proposed model achieves an overall accuracy of 98.3%, 95.1%, and 100% on the three selected datasets, respectively. These results are significantly better than those achieved without feature selection, and better even than those of the recently proposed chi-square-based feature selection approach.
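For readers unfamiliar with K-fold cross-validation as a re-sampling strategy, the following is a minimal sketch of how sample indices can be partitioned into folds. The function name `k_fold_indices`, the use of contiguous unshuffled folds, and the sizes 10 and 3 are assumptions for illustration; the abstract does not state the value of K or how the authors split their data.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Fold sizes differ by at most one; every sample is in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In practice the sample order is usually shuffled (and often stratified by class) before splitting; a model is then trained on each `train` set and scored on the matching `test` set, and the scores are averaged.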
