A New Combined Feature Extraction Method for Persian Handwritten Digit Recognition
Feature extraction is one of the most important steps in Optical Character Recognition (OCR) systems, that is effective in recognition accuracy. In this paper, a suitable combination of different features such as zoning, hole size, crossing counts, etc. for Persian handwritten digits recognition is proposed. Due to high number of features, feature vector dimensions will be high that increases training time exponentially. In this paper, to solve this problem, Principal Component Analysis (PCA) method is employed for reducing the feature vector dimensions. Finally, data are classified by Support Vector Machine (SVM) classification method. The proposed method has been executed on HODA dataset which is one of the largest standard datasets of Persian handwritten digits that includes 60[Formula: see text]000 training and 20[Formula: see text]000 test samples. The proposed method reaches to 99.07% of accuracy in this dataset, and the experimental results show significant improvement in accuracy of Persian handwritten OCR compared to the previous methods.