Sindhi Handwritten Text Recognition Using SVM

Feature extraction from handwritten Sindhi text is a challenging task because people write in different styles, which makes analyzing each text a complex problem. The work involves segmenting the text, extracting features, classifying and labelling each character to build training data that covers different handwritings, and testing on the features of the supplied handwritten text. In this research, a support vector machine (SVM) is used to analyze and tokenize each character or word of Sindhi text and transform it into suitable information efficiently and accurately. The research is useful not only for improving knowledge of Sindhi handwritten text recognition but also for other recognition systems.

2020 ◽  
Vol 10 (19) ◽  
pp. 6904
Author(s):  
Chang-Min Kim ◽  
Ellen J. Hong ◽  
Kyungyong Chung ◽  
Roy C. Park

Recently, demand for handwriting recognition, such as automation of mail sorting, license plate recognition, and electronic memo pads, has exponentially increased in various industrial fields. In addition, in the image recognition field, methods using artificial convolutional neural networks, which show outstanding performance, have been applied to handwriting recognition. However, owing to the diversity of recognition application fields, the number of dimensions in the learning and reasoning processes is increasing. To solve this problem, a principal component analysis (PCA) technique is used for dimensionality reduction. However, PCA is likely to increase the accuracy loss due to data compression. Therefore, in this paper, we propose a line-segment feature analysis (LFA) algorithm for input dimensionality reduction in handwritten text recognition. This proposed algorithm extracts the line segment information, constituting the image of input data, and assigns a unique value to each segment using 3 × 3 and 5 × 5 filters. Using the unique values to identify the number of line segments and adding them up, a 1-D vector with a size of 512 is created. This vector is used as input to machine-learning models. For the performance evaluation of the method, the Extended Modified National Institute of Standards and Technology (EMNIST) database was used. In the evaluation, PCA showed 96.6% and 93.86% accuracy with k-nearest neighbors (KNN) and support vector machine (SVM), respectively, while LFA showed 97.5% and 98.9% accuracy with KNN and SVM, respectively.


2021 ◽  
Vol 3 (8) ◽  
Author(s):  
Fetulhak Abdurahman ◽  
Eyob Sisay ◽  
Kinde Anlay Fante

Amharic is the official language of the Federal Government of Ethiopia, with more than 27 million speakers. It uses an Ethiopic script, which has 238 core and 27 labialized characters. It is a low-resourced language, and only a few attempts have been made so far at its handwritten text recognition. Amharic handwritten text recognition is also challenging due to the very high similarity between characters. This paper presents a convolutional recurrent neural network-based offline handwritten Amharic word recognition system. The proposed framework comprises convolutional neural networks (CNNs) for feature extraction from input word images, recurrent neural networks (RNNs) for sequence encoding, and connectionist temporal classification as a loss function. We designed a custom CNN model and compared its performance with three different state-of-the-art CNN models, DenseNet-121, ResNet-50 and VGG-19, after modifying their architectures to fit our problem domain, for robust feature extraction from handwritten Amharic word images. We have conducted detailed experiments with different CNN and RNN architectures and input word image sizes, and applied data augmentation techniques to enhance the performance of the proposed models. We have prepared a handwritten Amharic word dataset, HARD-I, which is publicly available for researchers. From the experiments on various recognition models using our dataset, a WER of 5.24% and a CER of 1.15% were achieved using our best-performing recognition model. The proposed models achieve competitive performance compared to existing models for offline handwritten Amharic word recognition.
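
The WER and CER figures quoted in this abstract are edit-distance-based error rates. As a rough illustration only, the sketch below computes both from reference and predicted transcriptions. The function names `edit_distance` and `error_rate` are hypothetical, and the abstract does not state how the authors computed their scores, so this shows the common definition (total edit distance divided by total reference length) rather than their evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="char"):
    """Total edit distance divided by total reference length.

    unit="char" gives a character error rate (CER);
    unit="word" splits on whitespace and gives a word error rate (WER).
    """
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        if unit == "word":
            ref, hyp = ref.split(), hyp.split()
        errors += edit_distance(ref, hyp)
        total += len(ref)
    if total == 0:
        raise ValueError("references contain no units to score")
    return errors / total
```

For single-word images, as in a word recognition dataset, the word-level rate reduces to the fraction of words that were not recognized exactly.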


2020 ◽  
Vol 13 (2) ◽  
pp. 200-214
Author(s):  
Rajib Ghosh ◽  
Prabhat Kumar

Background: The growing use of smart hand-held devices in people's daily lives has created a need for online handwritten text recognition. Online handwritten text recognition refers to the identification of handwritten text at the very moment it is written on a digitizing tablet using a pen-like stylus. Several techniques are available for online handwritten text recognition in English, Arabic, Latin, Chinese, Japanese, and Korean scripts. However, limited research is available for Indic scripts. Objective: This article presents a novel approach for online handwritten numeral and character (simple and compound) recognition of three popular Indic scripts: Devanagari, Bengali and Tamil. Methods: The proposed work employs the Zone-wise Slopes of Dominant Points (ZSDP) method for feature extraction from the individual characters. Support Vector Machine (SVM) and Hidden Markov Model (HMM) classifiers are used for the recognition process. Recognition efficiency is improved by combining the probabilistic outcomes of the SVM and HMM classifiers using Dempster-Shafer theory. The system is trained using separate as well as combined datasets of numerals, simple characters and compound characters. Results: The performance of the present system is evaluated using large self-generated datasets as well as public datasets. Results obtained from the present work demonstrate that the proposed system outperforms the existing works in this regard. Conclusion: This work will be helpful for carrying out research on online recognition of handwritten characters in other Indic scripts, as well as recognition of isolated words in various Indic scripts, including the scripts used in the present work.
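
Combining two classifiers' outputs with Dempster-Shafer theory can be sketched with Dempster's rule of combination. The example below is a minimal illustration under assumptions: the function name `dempster_combine`, the class labels, and the mass values are hypothetical, and the abstract does not say how the SVM and HMM outputs were turned into mass functions.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal
    element) to its mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical masses for two candidate characters; the two-element
# set represents "undecided between ka and kha".
KA, KHA = frozenset({"ka"}), frozenset({"kha"})
EITHER = KA | KHA
svm_mass = {KA: 0.6, KHA: 0.3, EITHER: 0.1}
hmm_mass = {KA: 0.5, KHA: 0.2, EITHER: 0.3}
fused = dempster_combine(svm_mass, hmm_mass)
best = max((k for k in fused if len(k) == 1), key=fused.get)
```

When both classifiers lean toward the same class, the fused mass for that class ends up higher than either input, which is the effect the combination step relies on.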


2021 ◽  
pp. 340-350
Author(s):  
Silvia Cascianelli ◽  
Marcella Cornia ◽  
Lorenzo Baraldi ◽  
Maria Ludovica Piazzi ◽  
Rosiana Schiuma ◽  
...  

Author(s):  
Novie Theresia Br Pasaribu ◽  
M. Jimmy Hasugian

Offline handwriting recognition is one of the most prominent research topics because of its wide range of applications and the high variability of handwriting. This paper covers offline Batak Toba handwritten text recognition, from noise removal and feature extraction through recognition using several classifiers. Experiments show that the elliptic Fourier descriptor (EFD) is the most discriminative feature and that the Mahalanobis distance (MD) classifier outperforms the other two classifiers.
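
A Mahalanobis distance classifier assigns a feature vector to the class whose mean is nearest once distances are scaled by that class's covariance. The sketch below is a small pure-Python illustration, not the paper's implementation: the helper names, the two-dimensional feature vectors, and the class statistics are all hypothetical, and in the paper's setting the features would presumably be EFD coefficients.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)) without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, diff)  # y = cov^{-1} diff
    return math.sqrt(sum(d * yi for d, yi in zip(diff, y)))


def classify(x, class_stats):
    """Return the label whose (mean, cov) gives the smallest distance."""
    return min(class_stats, key=lambda label: mahalanobis(x, *class_stats[label]))


# Hypothetical per-class statistics estimated from training features.
stats = {
    "a": ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    "ba": ([5.0, 5.0], [[4.0, 0.0], [0.0, 1.0]]),
}
label = classify([1.0, 1.0], stats)
```

Unlike plain Euclidean distance, this measure discounts differences along directions in which a class naturally varies a lot.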

