Camera-Captured Handwritten Kannada Character Recognition

Optical Character Recognition (OCR) is the automatic reading of optically sensed text components to translate human-readable characters into machine-readable codes. In handwritten text, the style of writing varies from person to person, so segmenting and recognizing the characters is a very challenging task. In this paper we propose segmentation and feature extraction techniques to recognise camera-captured, handwritten Kannada documents. Segmentation is done using the projection profile technique and Connected Component Analysis (CCA). As a pre-processing step to detect the edges of Kannada characters, we propose our own technique, which combines Sobel and Canny edge detection. Feature selection and extraction is done at two levels, global and local. Global features are extracted from the entire image. For local feature extraction, we divide an input character image into four quadrants based on the centroid of the character and extract local features from each quadrant rather than from the whole image. We use a Support Vector Machine (SVM) to classify the handwritten Kannada characters. To evaluate the efficiency of the proposed system we use the KHDD dataset along with our own document and character datasets. The experimental results show that the proposed feature selection and extraction achieves 96.31% accuracy; the results are encouraging.
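The projection-profile segmentation step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the zero-valley rule for splitting lines are assumptions, and the CCA step is omitted.

```python
# Sketch of horizontal projection-profile segmentation, one of the two
# segmentation steps named in the abstract (the other being CCA).
# Assumes a binarised page given as a list of rows, 1 = ink, 0 = background.

def horizontal_projection(image):
    """Sum of ink pixels per row."""
    return [sum(row) for row in image]

def segment_lines(image):
    """Split a page into text-line bands wherever the projection
    profile drops to zero (a blank row between lines)."""
    profile = horizontal_projection(image)
    lines, start = [], None
    for y, value in enumerate(profile):
        if value > 0 and start is None:
            start = y                      # a text line begins
        elif value == 0 and start is not None:
            lines.append((start, y - 1))   # the text line ends
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines

# Toy page: two "text lines" separated by a blank row.
page = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
print(segment_lines(page))  # [(0, 1), (3, 3)]
```

The same profile taken column-wise, within each detected band, would separate characters; camera-captured pages would additionally need skew correction before the profile is meaningful.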

Author(s):  
Htwe Pa Pa Win ◽  
Phyo Thu Thu Khine ◽  
Khin Nwe Ni Tun

This paper proposes a new feature extraction method for off-line recognition of Myanmar printed documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the selection of the feature extraction method. Existing OCR systems use various feature extraction methods because of the diversity of the scripts' natures. One major contribution of the work in this paper is the design of logically rigorous coding-based features. To show the effectiveness of the proposed method, this paper assumes the documents are successfully segmented into characters and extracts features from these isolated Myanmar characters. These features are extracted using structural analysis of the Myanmar scripts. Experiments have been carried out using the Support Vector Machine (SVM) classifier, and the results are compared with the previously proposed feature extraction method.


2010 ◽  
Vol 20-23 ◽  
pp. 1253-1259
Author(s):  
Chang Jun Zhou ◽  
Xiao Peng Wei ◽  
Qiang Zhang

In this paper, we propose a novel algorithm for facial expression recognition based on feature fusion in a support vector machine (SVM). First, local features and global features are obtained from pre-processed face images. The global features are obtained using singular value decomposition (SVD), while the local features are obtained by applying principal component analysis (PCA) to extract the principal Gabor features. Finally, feature vectors fusing the global and local features are used to train the SVM for facial expression recognition, and computer simulation illustrates the effectiveness of this method on the JAFFE database.


2017 ◽  
Vol 17 (02) ◽  
pp. 1750012 ◽  
Author(s):  
Mohammad Javad Parseh ◽  
Mojtaba Meftahi

Feature extraction is one of the most important steps in Optical Character Recognition (OCR) systems and has a direct effect on recognition accuracy. In this paper, a suitable combination of different features, such as zoning, hole size, and crossing counts, is proposed for Persian handwritten digit recognition. Because of the large number of features, the feature vector dimension becomes high, which sharply increases training time. To solve this problem, the Principal Component Analysis (PCA) method is employed to reduce the feature vector dimension. Finally, the data are classified with a Support Vector Machine (SVM). The proposed method has been evaluated on the HODA dataset, one of the largest standard datasets of Persian handwritten digits, which includes 60,000 training and 20,000 test samples. The proposed method reaches 99.07% accuracy on this dataset, and the experimental results show a significant improvement in the accuracy of Persian handwritten OCR compared to previous methods.
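The zoning feature named in this abstract can be sketched as follows: the digit image is split into a grid of zones and the ink density of each zone becomes one feature. The grid size and the density measure are assumed here; the paper combines zoning with hole-size and crossing-count features before PCA.

```python
# Sketch of a zoning feature extractor for a binary digit image (1 = ink).

def zoning_features(image, zones=2):
    """Return per-zone ink densities over a `zones` x `zones` grid."""
    h, w = len(image), len(image[0])
    zh, zw = h // zones, w // zones        # zone height and width
    feats = []
    for zy in range(zones):
        for zx in range(zones):
            ink = sum(
                image[y][x]
                for y in range(zy * zh, (zy + 1) * zh)
                for x in range(zx * zw, (zx + 1) * zw)
            )
            feats.append(ink / (zh * zw))  # density in [0, 1]
    return feats

digit = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(zoning_features(digit))  # [1.0, 0.0, 0.0, 0.25]
```

With a realistic grid (say 5x5) the vector already has 25 components, which is why the paper follows feature extraction with PCA before the SVM.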


Author(s):  
Fardilla Zardi Putri ◽  
Budhi Irawan ◽  
Umar Ali Ahmad

In this global era, mastering a language other than Indonesian is an important skill that everyone should have. Many people visit other countries for activities such as working, studying, or even taking holidays. One frequently visited country is Japan. Japanese uses character sets that differ from the Latin alphabet, so learning the language requires an understanding of its characters. Along with advances in technology, character recognition, commonly known as Optical Character Recognition (OCR), is an application of pattern recognition and artificial intelligence that serves as a reading machine. In this study, an Android-based Japanese word-translation application is designed using the basic principles of OCR with the Directional Feature Extraction method and a Support Vector Machine. Testing showed that the best accuracy achieved using Directional Feature Extraction and the Support Vector Machine was 85.71%. This study used 104 training samples. Beta testing on four points, namely the application interface, system response time, translation accuracy, and the usefulness of the application, showed that the application can be classified as good.
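A directional feature extractor of the kind used in this study can be sketched as below. The exact variant is not specified in the abstract, so this assumed formulation quantises finite-difference gradients of a binary character image into four direction bins.

```python
# Sketch of directional feature extraction: a histogram over four
# quantised gradient directions (0, 45, 90, 135 degrees). The binning
# scheme is an assumption, not the paper's exact method.

import math

def directional_histogram(image):
    """Normalised 4-bin histogram of gradient directions."""
    h, w = len(image), len(image[0])
    hist = [0, 0, 0, 0]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = image[y][x + 1] - image[y][x]   # horizontal difference
            gy = image[y + 1][x] - image[y][x]   # vertical difference
            if gx == 0 and gy == 0:
                continue                         # flat region, no edge
            angle = math.degrees(math.atan2(gy, gx)) % 180
            hist[int(angle // 45) % 4] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]

stripe = [[1, 0], [1, 0]]                # vertical ink boundary
print(directional_histogram(stripe))     # [1.0, 0.0, 0.0, 0.0]
```

The resulting fixed-length histogram, possibly computed per zone, is what gets fed to the Support Vector Machine.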


Author(s):  
Ritam Guha ◽  
Manosij Ghosh ◽  
Pawan Kumar Singh ◽  
Ram Sarkar ◽  
Mita Nasipuri

In any multi-script environment, handwritten script classification is an unavoidable pre-requisite before the document images are fed to their respective Optical Character Recognition (OCR) engines. Over the years, this complex pattern classification problem has been tackled by researchers proposing various feature vectors, mostly of large dimension, thereby increasing the computational complexity of the whole classification model. Feature Selection (FS) can serve as an intermediate step that reduces the size of the feature vectors by restricting them to only the essential and relevant features. In the present work, we address this issue by introducing a new FS algorithm, called Hybrid Swarm and Gravitation-based FS (HSGFS). This algorithm is applied over three feature vectors introduced in the recent literature: Distance-Hough Transform (DHT), Histogram of Oriented Gradients (HOG), and Modified log-Gabor (MLG) filter transform. Three state-of-the-art classifiers, namely Multi-Layer Perceptron (MLP), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), are used to evaluate the optimal subset of features generated by the proposed FS model. Handwritten datasets at block, text-line, and word level, consisting of 12 officially recognized Indic scripts, are prepared for experimentation. An average improvement in the range of 2–5% in classification accuracy is achieved by utilizing only about 75–80% of the original feature vectors on all three datasets. The proposed method also shows better performance when compared to some popularly used FS models. The code used for implementing HSGFS can be found at the following GitHub link: https://github.com/Ritam-Guha/HSGFS.
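The wrapper-style evaluation at the heart of a FS model like HSGFS can be sketched as below: a candidate subset is encoded as a bit mask and scored by the accuracy of a classifier trained on the masked features. A 1-nearest-neighbour classifier stands in here for the MLP/KNN/SVM used in the paper, and all names are illustrative; the swarm/gravitation search that generates the masks is not shown.

```python
# Sketch of subset fitness for wrapper feature selection.

def mask_features(sample, mask):
    """Keep only the features whose mask bit is 1."""
    return [value for value, keep in zip(sample, mask) if keep]

def predict_1nn(train_x, train_y, query):
    """Label of the closest training sample (squared Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]

def subset_fitness(mask, train_x, train_y, test_x, test_y):
    """Accuracy of 1-NN on the feature subset selected by `mask`."""
    tx = [mask_features(s, mask) for s in train_x]
    qx = [mask_features(s, mask) for s in test_x]
    correct = sum(
        predict_1nn(tx, train_y, q) == label for q, label in zip(qx, test_y)
    )
    return correct / len(test_y)

# Toy data: feature 0 is informative, feature 1 is pure noise.
train_x, train_y = [[0.0, 9.0], [1.0, 9.0]], ["a", "b"]
test_x, test_y = [[0.1, 0.0], [0.9, 0.0]], ["a", "b"]
print(subset_fitness([1, 0], train_x, train_y, test_x, test_y))  # 1.0
```

A search algorithm such as HSGFS explores the space of masks, keeping the ones with the best fitness, which is how a 20–25% reduction in features can coexist with a gain in accuracy.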


2021 ◽  
Vol 13 (22) ◽  
pp. 4518
Author(s):  
Xin Zhao ◽  
Jiayi Guo ◽  
Yueting Zhang ◽  
Yirong Wu

The semantic segmentation of remote sensing images requires distinguishing local regions of different classes and exploiting a uniform global representation of the same-class instances. Such requirements make it necessary for the segmentation methods to extract discriminative local features between different classes and to explore representative features for all instances of a given class. While common deep convolutional neural networks (DCNNs) can effectively focus on local features, they are limited by their receptive field in obtaining consistent global information. In this paper, we propose a memory-augmented transformer (MAT) to effectively model both the local and global information. The feature extraction pipeline of the MAT is split into a memory-based global relationship guidance module and a local feature extraction module. The local feature extraction module mainly consists of a transformer, which is used to extract features from the input images. The global relationship guidance module maintains a memory bank for the consistent encoding of the global information. Global guidance is performed by memory interaction. Bidirectional information flow between the global and local branches is conducted by a memory-query module and a memory-update module, respectively. Experimental results on the ISPRS Potsdam and ISPRS Vaihingen datasets demonstrate that our method performs competitively with state-of-the-art methods.
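The memory interaction described for the MAT can be sketched, very loosely, as below: local features read from the memory bank by dot-product attention (the memory-query direction), and the bank is refreshed from new features (the memory-update direction). Only the abstract is available here, so the attention read and the exponential-moving-average update are assumptions, not the paper's exact formulation.

```python
# Sketch of bidirectional memory interaction for a global/local pipeline.

import math

def softmax(scores):
    m = max(scores)                        # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_query(feature, memory):
    """Attention-weighted read of the memory bank for one feature vector."""
    weights = softmax(
        [sum(f * m for f, m in zip(feature, slot)) for slot in memory]
    )
    dim = len(feature)
    return [sum(w * slot[d] for w, slot in zip(weights, memory)) for d in range(dim)]

def memory_update(memory, features, momentum=0.9):
    """Move every slot toward the mean of the new features (assumed EMA rule)."""
    dim = len(memory[0])
    mean = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    return [
        [momentum * slot[d] + (1 - momentum) * mean[d] for d in range(dim)]
        for slot in memory
    ]

memory = [[1.0, 0.0], [0.0, 1.0]]          # two memory slots
read = memory_query([2.0, 0.0], memory)    # query aligned with slot 0
```

Because the read is a convex combination of slots, the query aligned with slot 0 pulls mostly slot-0 content, which is the mechanism by which the memory bank injects consistent global context into the local branch.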


2019 ◽  
Vol 9 (15) ◽  
pp. 3130 ◽  
Author(s):  
Navarro ◽  
Perez

Many applications in image analysis require the accurate classification of complex patterns including both color and texture, e.g., in content-based image retrieval, biometrics, and the inspection of fabrics, wood, steel, ceramics, and fruits, among others. A new method for pattern classification using both color and texture information is proposed in this paper. The proposed method includes the following steps: division of each image into global and local samples, texture and color feature extraction from the samples using Haralick statistics and the binary quaternion-moment-preserving method, a classification stage using a support vector machine, and a final stage of post-processing employing a bagging ensemble. One of the main contributions of this method is the image partition, allowing image representation as global and local features. This partition captures most of the information present in the image for colored-texture classification, allowing improved results. The proposed method was tested on four databases extensively used in color–texture classification: the Brodatz, VisTex, Outex, and KTH-TIPS2b databases, yielding correct classification rates of 97.63%, 97.13%, 90.78%, and 92.90%, respectively. The use of the post-processing stage improved those results to 99.88%, 100%, 98.97%, and 95.75%, respectively. We compared our results to the best previously published results on the same databases, finding significant improvements in all cases.
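One of the Haralick statistics used in this pipeline can be sketched as below: contrast computed from a grey-level co-occurrence matrix (GLCM) for a horizontal offset of one pixel. The number of grey levels and the offset are assumptions; the paper also uses other Haralick statistics together with the quaternion-moment color features.

```python
# Sketch of GLCM contrast, a classic Haralick texture statistic.

def glcm(image, levels):
    """Normalised co-occurrence counts for horizontal pixel pairs (x, x+1)."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            total += 1
    return [[c / total for c in row] for row in counts]

def haralick_contrast(image, levels=4):
    """Contrast: sum of p(i, j) * (i - j)^2 over the GLCM."""
    p = glcm(image, levels)
    return sum(
        p[i][j] * (i - j) ** 2 for i in range(levels) for j in range(levels)
    )

flat = [[1, 1, 1], [1, 1, 1]]        # uniform texture -> zero contrast
stripy = [[0, 3, 0], [3, 0, 3]]      # alternating levels -> high contrast
print(haralick_contrast(flat))       # 0.0
print(haralick_contrast(stripy))     # 9.0
```

Computing such statistics separately for the global image and for each local sample is what produces the global-plus-local feature set the abstract describes.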


2018 ◽  
Vol 5 (4) ◽  
pp. 1-31 ◽  
Author(s):  
Shalini Puri ◽  
Satya Prakash Singh

In recent years, many information retrieval, character recognition, and feature extraction methodologies have been proposed for Devanagari, and especially for Hindi, in different domain areas. Given the enormous availability of scanned data and the need to advance existing Hindi automated systems beyond optical character recognition, a new idea of a Hindi printed and handwritten document classification system using a support vector machine and fuzzy logic is introduced. The system first pre-processes and then classifies textual imaged documents into predefined categories. With this concept, this article presents a feasibility study of such systems with respect to Hindi, a survey report of statistical measurements of Hindi keywords obtained from different sources, and the inherent challenges found in printed and handwritten documents. Technical reviews are provided and graphically represented to compare many parameters and to estimate the contents, forms, and classifiers used in various existing techniques.

