Offline handwritten text recognition using support vector machines

Author(s):  
Martin Rajnoha ◽  
Radim Burget ◽  
Malay Kishore Dutta
2020 ◽  
Vol 10 (19) ◽  
pp. 6904
Author(s):  
Chang-Min Kim ◽  
Ellen J. Hong ◽  
Kyungyong Chung ◽  
Roy C. Park

Demand for handwriting recognition in applications such as automated mail sorting, license plate recognition, and electronic memo pads has recently increased sharply across industrial fields. In the image recognition field, convolutional neural networks, which show outstanding performance, have also been applied to handwriting recognition. However, as application fields diversify, the number of dimensions handled in the learning and inference processes keeps increasing. Principal component analysis (PCA) is commonly used to reduce dimensionality, but the data compression it performs is likely to cost accuracy. This paper therefore proposes a line-segment feature analysis (LFA) algorithm for input dimensionality reduction in handwritten text recognition. The algorithm extracts the line-segment information that makes up the input image and assigns a unique value to each segment using 3 × 3 and 5 × 5 filters. The unique values are used to identify the number of line segments, and these counts are added up to create a 1-D vector of size 512, which serves as the input to the machine-learning model. The method was evaluated on the Extended Modified National Institute of Standards and Technology (EMNIST) database. In the evaluation, PCA reached 96.6% and 93.86% accuracy with k-nearest neighbors (KNN) and support vector machine (SVM) classifiers, respectively, while LFA reached 97.5% and 98.9% accuracy with KNN and SVM, respectively.
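The abstract above does not give the filter weights or explain how the 3 × 3 and 5 × 5 filters are combined, so the following is only a minimal sketch of the general idea, not the authors' LFA. It assumes a binarized image and uses a hypothetical encoding in which each 3 × 3 neighborhood is read as a 9-bit integer, which yields 2^9 = 512 possible pattern values. Counting how often each value occurs gives a fixed-length 512-element vector regardless of image size. The function names `window_code` and `pattern_histogram` are illustrative.

```python
from typing import List

Image = List[List[int]]  # binarized image: 1 = ink, 0 = background


def window_code(image: Image, row: int, col: int) -> int:
    """Read the 3x3 neighborhood centred at (row, col) as a 9-bit integer.

    Assumption for illustration: bits are taken row by row, top-left first.
    The published LFA filter weights are not stated in the abstract.
    """
    code = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            code = (code << 1) | image[row + dr][col + dc]
    return code


def pattern_histogram(image: Image) -> List[int]:
    """Count each 3x3 pattern value over all interior pixels (length 512)."""
    hist = [0] * 512
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[window_code(image, r, c)] += 1
    return hist


# A 5x5 image containing a single vertical stroke in the middle column.
stroke = [[0, 0, 1, 0, 0] for _ in range(5)]
features = pattern_histogram(stroke)
print(len(features), sum(features))
```

A vector built this way could then be passed to a KNN or SVM classifier in place of the raw pixels; whether it matches the accuracy reported above would depend on details the abstract leaves out.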


2020 ◽  
Vol 13 (2) ◽  
pp. 200-214
Author(s):  
Rajib Ghosh ◽  
Prabhat Kumar

Background: The growing use of smart hand-held devices in daily life has created a need for online handwritten text recognition. Online handwritten text recognition refers to identifying handwritten text at the moment it is written on a digitizing tablet with a pen-like stylus. Several techniques are available for online handwritten text recognition in English, Arabic, Latin, Chinese, Japanese, and Korean scripts. However, limited research is available for Indic scripts. Objective: This article presents a novel approach to online recognition of handwritten numerals and characters (simple and compound) in three popular Indic scripts: Devanagari, Bengali, and Tamil. Methods: The proposed work employs the Zone-wise Slopes of Dominant Points (ZSDP) method to extract features from individual characters. Support Vector Machine (SVM) and Hidden Markov Model (HMM) classifiers are used for recognition. Recognition efficiency is improved by combining the probabilistic outputs of the SVM and HMM classifiers using Dempster-Shafer theory. The system is trained on separate as well as combined datasets of numerals, simple characters, and compound characters. Results: The system is evaluated on large self-generated datasets as well as public datasets. The results show that the proposed system outperforms existing work in this area. Conclusion: This work will support further research on online recognition of handwritten characters in other Indic scripts, as well as recognition of isolated words in various Indic scripts, including those used in the present work.
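The fusion step described in the abstract above can be illustrated with Dempster's rule of combination. The abstract does not say how the masses are built from the SVM and HMM outputs, so this sketch assumes the simplest case: each classifier assigns mass only to single class labels. Under that assumption the rule reduces to a normalized product of the two distributions. The function name `combine_singletons`, the class labels, and the scores are hypothetical.

```python
from typing import Dict


def combine_singletons(m1: Dict[str, float], m2: Dict[str, float]) -> Dict[str, float]:
    """Dempster's rule when every focal element is a single class label.

    Two singleton hypotheses intersect only if they are the same label,
    so the combined mass is m1(A) * m2(A) divided by (1 - K), where K is
    the total mass the two sources assign to conflicting label pairs.
    """
    labels = set(m1) | set(m2)
    agreement = {a: m1.get(a, 0.0) * m2.get(a, 0.0) for a in labels}
    total = sum(agreement.values())  # equals 1 - K
    if total == 0.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {a: mass / total for a, mass in agreement.items()}


# Hypothetical per-class scores for one character sample.
svm_scores = {"ka": 0.6, "kha": 0.3, "ga": 0.1}
hmm_scores = {"ka": 0.5, "kha": 0.4, "ga": 0.1}

fused = combine_singletons(svm_scores, hmm_scores)
best = max(fused, key=fused.get)
print(best, round(fused[best], 4))
```

When both classifiers favor the same label, the fused belief in that label is higher than either input score. A fuller treatment would also assign mass to sets of labels or discount a less reliable classifier, neither of which is shown here.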


In the Sindhi language, feature extraction from handwritten text is a challenging task: different people write in different styles, so analyzing each text is a complex problem. The work covers feature extraction through text segmentation, classification and labelling of each character to build training data for recognizing different handwritings, and testing by analyzing the features of the provided handwritten text data. In this research, a support vector machine (SVM) is used to analyze and tokenize each character or word of Sindhi text and transform it into suitable information with efficiency and accuracy. The research is useful not only for advancing Sindhi handwritten text recognition but may also benefit other recognition systems.


2018 ◽  
Author(s):  
Nelson Marcelo Romero Aquino ◽  
Matheus Gutoski ◽  
Leandro Takeshi Hattori ◽  
Heitor Silvério Lopes
