Enhanced Handwritten Document Recognition Using Confusion Matrix Analysis

2021
Author(s):
Umadevi T P
Murugan A

The handwritten multilanguage phase is the preprocessing stage, which improves image quality for more reliable identification by the system. The main goals of preprocessing are noise suppression and line cancellation. After preprocessing, various feature extraction techniques are applied to derive the attributes used in the identification process. Smoothing plays an important role in character recognition. The segmentation process in the word-distribution strategy can be divided into global and local approaches. Writers do not follow a header line when writing the text, which creates problems for skew correction, classification and recognition. The datasets used are HWSC and TST1. A TensorFlow implementation is used to estimate the consistency of the confusion matrix for enhancing text recognition. The accuracy of the proposed method is 98%.
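The confusion-matrix accuracy measure the abstract relies on can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' code, and the label values are invented for the example.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true labels, columns are predicted labels."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def accuracy(cm):
    """Overall accuracy is the trace (correct counts) over the total count."""
    return np.trace(cm) / cm.sum()

# Hypothetical predictions for three character classes.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 0, 1, 2, 2, 2, 2, 1]
cm = confusion_matrix(y_true, y_pred, 3)
print(accuracy(cm))  # 7 of 8 predictions are correct -> 0.875
```

Off-diagonal cells of the matrix show which character pairs the recognizer confuses, which is what makes it useful for targeted enhancement.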

Author(s):
Ioannis Markoulidakis
George Kopsiaftis
Ioannis Rallis
Ioannis Georgoulas
Anastasios Doulamis
...

Author(s):
Shivali Parkhedkar
Shaveri Vairagade
Vishakha Sakharkar
Bharti Khurpe
Arpita Pikalmunde
...

In the proposed work we take up the challenge of recognizing handwritten words. The handwritten document is scanned using a scanner, and the scanned image is processed by the program. Each character in a word is isolated, and each isolated character is subjected to feature extraction using Gabor features. The extracted features are passed through a KNN classifier, and finally the recognized word is obtained. Character recognition is the process by which a computer recognizes handwritten characters and turns them into a format a user can understand. Computer-based pattern recognition is a process that involves many sub-processes. In today's environment, character recognition has gained a lot of attention in the field of pattern recognition. Handwritten character recognition is useful in cheque processing in banks, form processing systems and many other applications, and remains one of the popular and challenging areas of research. In future, character recognition may help create a paperless environment. The novelty of this approach lies in achieving better accuracy and reduced computational time for recognition of handwritten characters. The proposed method extracts geometric features of the character contour, based on the basic line types that form the character skeleton, and outputs a feature vector. The feature vectors generated from a training set were then used to train a pattern recognition engine based on neural networks so that the system could be benchmarked. The proposed algorithm extracts the different line types that form a particular character and also considers its point features. The feature extraction technique was tested using a neural network trained with the feature vectors obtained from the proposed method.
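The Gabor-feature-plus-KNN pipeline described above can be sketched as follows. The kernel parameters, toy stroke images and the simple 1-nearest-neighbour rule are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

def gabor_kernel(freq, theta, sigma=2.0, size=9):
    """Real Gabor kernel: a cosine grating under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * xr)

def conv2(img, kernel):
    """2-D linear convolution via FFT (full output)."""
    shape = (img.shape[0] + kernel.shape[0] - 1,
             img.shape[1] + kernel.shape[1] - 1)
    out = np.fft.ifft2(np.fft.fft2(img, shape) * np.fft.fft2(kernel, shape))
    return np.real(out)

def gabor_features(img, freq=0.25,
                   thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Mean and std of the absolute filter response per orientation."""
    feats = []
    for theta in thetas:
        resp = np.abs(conv2(img, gabor_kernel(freq, theta)))
        feats.extend([resp.mean(), resp.std()])
    return np.array(feats)

def knn_predict(train_feats, train_labels, feat):
    """1-nearest-neighbour classification by Euclidean distance."""
    dists = np.linalg.norm(train_feats - feat, axis=1)
    return train_labels[int(np.argmin(dists))]

# Toy example: classify a noisy vertical stroke against two training strokes.
horiz = np.zeros((16, 16)); horiz[8, 2:14] = 1.0
vert = np.zeros((16, 16)); vert[2:14, 8] = 1.0
train = np.stack([gabor_features(horiz), gabor_features(vert)])
labels = ["horizontal", "vertical"]
rng = np.random.default_rng(0)
noisy = vert + 0.05 * rng.standard_normal(vert.shape)
print(knn_predict(train, labels, gabor_features(noisy)))  # vertical
```

Because Gabor responses are orientation-selective, the horizontal and vertical strokes produce clearly separated feature vectors, which is what makes the nearest-neighbour rule work here.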


Author(s):  
Anne Permaloff ◽  
Carl Grafton

Almost any governmental task employing a computer can be accomplished more efficiently with a variety of tools rather than any single tool. Basic tools for inclusion in the software toolkit are word processing, spreadsheets, statistics, and database management programs. Beyond these, presentation graphics, optical character recognition (OCR), and scheduling software can be helpful depending upon the job at hand. This chapter concerns computer applications and information technology in government. It could have been organized by public administration task such as human resource management or budgeting, but each governmental function uses several software tools that are not unique to that function. Thus a human resource manager uses word processing software and probably a spreadsheet and a database management program. The same could be said of someone involved in budgeting. This example suggests that a toolkit approach that concentrates on software type is a more useful way to organize this subject matter. Topics covered in this chapter include: word processing and desktop publishing, spreadsheets, statistics packages, database management, presentation software, project planning software, graphics for illustrations, optical character recognition, network applications, and geographic information systems. Since most readers are likely to have substantial word processing experience, it would be unproductive to devote much space to word processing per se. The same applies to searching the Web. At the opposite extreme, Web page creation programs are too complex to discuss here.


Sensors
2020
Vol 20 (22)
pp. 6666
Author(s):
Kamil Książek
Michał Romaszewski
Przemysław Głomb
Bartosz Grabowski
Michał Cholewa

In recent years, growing interest in deep learning neural networks has raised the question of how they can be used for effective processing of the high-dimensional datasets produced by hyperspectral imaging (HSI). HSI, traditionally viewed as being within the scope of remote sensing, is used in non-invasive substance classification. One area of potential application is forensic science, where substance classification at the scene is important. An example problem from that area, blood stain classification, is a case study for the evaluation of methods that process hyperspectral data. To investigate deep learning classification performance for this problem, we have performed experiments on a dataset which has not been previously tested using this kind of model. This dataset consists of several images with blood and blood-like substances such as ketchup, tomato concentrate, artificial blood, etc. To test both the classic approach to hyperspectral classification and a more realistic, application-oriented scenario, we have prepared two different sets of experiments. In the first one, Hyperspectral Transductive Classification (HTC), both the training and the test set come from the same image. In the second one, Hyperspectral Inductive Classification (HIC), the test set is derived from a different image, which is more challenging for classifiers but more useful from the point of view of forensic investigators. We conducted the study using several architectures: 1D, 2D and 3D convolutional neural networks (CNN), a recurrent neural network (RNN) and a multilayer perceptron (MLP). The performance of the models was compared with baseline results of a Support Vector Machine (SVM). We have also presented a model evaluation method based on t-SNE and confusion matrix analysis that allows us to detect and eliminate some cases of model undertraining.
Our results show that in the transductive case, all models, including the MLP and the SVM, have comparable performance, with no clear advantage for the deep learning models. The Overall Accuracy range across all models is 98–100% for the easier image set, and 74–94% for the more difficult one. However, in the more challenging inductive case, selected deep learning architectures offer a significant advantage; their best Overall Accuracy is in the range of 57–71%, improving on the baseline set by the non-deep models by up to 9 percentage points. We have presented a detailed analysis of the results and a discussion, including a summary of conclusions for each tested architecture. An analysis of per-class errors shows that the score for each class is highly model-dependent. Considering this, and the fact that the best performing models come from two different architecture families (3D CNN and RNN), our results suggest that tailoring a deep neural network architecture to hyperspectral data is still an open problem.
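The per-class error analysis mentioned above can be sketched from a confusion matrix alone (the t-SNE part is omitted here). The matrix values and the 0.5 recall threshold for flagging a possibly undertrained class are illustrative assumptions, not figures from the paper.

```python
import numpy as np

def per_class_recall(cm):
    """Row-normalised diagonal: fraction of each true class predicted correctly."""
    totals = cm.sum(axis=1)
    return np.where(totals > 0, np.diag(cm) / np.maximum(totals, 1), 0.0)

def flag_undertrained(cm, threshold=0.5):
    """Classes whose recall falls below the threshold are suspect."""
    return [i for i, r in enumerate(per_class_recall(cm)) if r < threshold]

# Hypothetical 3-class confusion matrix: class 2 is mostly confused with class 0.
cm = np.array([[90,  5,  5],
               [ 4, 92,  4],
               [60, 10, 30]])
print(per_class_recall(cm))   # [0.9, 0.92, 0.3]
print(flag_undertrained(cm))  # [2]
```

A class with low recall but plenty of training pixels is a hint of undertraining rather than class imbalance, which is the kind of case the evaluation method is meant to catch.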


2006
Vol 49 (6)
pp. 798-802
Author(s):
Shoichiro Fukuda
Naomi Toida
Kunihiro Fukushima
Yuko Kataoka
Kazunori Nishizaki

2019
Vol 8 (1)
pp. 50-54
Author(s):
Ashok Kumar Bathla
Sunil Kumar Gupta

Optical Character Recognition (OCR) technology allows a computer to “read” text, both typed and handwritten, the way a human brain does. Significant research effort has gone into OCR of typewritten text in various languages, but very little into the segmentation and skew correction of handwritten text written in Devanagari, the script of Hindi. This paper presents a novel technique for segmentation and skew correction of handwritten Devanagari text. It achieves an accuracy of 91% and takes less than one second to segment a handwritten word.
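One common way to estimate skew, consistent with the paper's goal though not necessarily its exact method, is a projection-profile search: shear the image by candidate angles and keep the angle that makes the horizontal projection sharpest. The angle grid and the synthetic page below are illustrative assumptions.

```python
import numpy as np

def shear_columns(img, angle):
    """Shift each column vertically by tan(angle) * x to undo a skew."""
    sheared = np.empty_like(img)
    for x in range(img.shape[1]):
        sheared[:, x] = np.roll(img[:, x], -int(round(np.tan(angle) * x)))
    return sheared

def projection_score(img, angle):
    """Variance of row sums: highest when text lines align horizontally."""
    return np.var(shear_columns(img, angle).sum(axis=1))

def estimate_skew(img, candidates):
    """Return the candidate angle (radians) with the sharpest projection."""
    return max(candidates, key=lambda a: projection_score(img, a))

# Synthetic page: four text baselines skewed by a known angle.
true_angle = 0.05  # radians, about 2.9 degrees
img = np.zeros((100, 100))
for y0 in (20, 40, 60, 80):
    for x in range(100):
        y = y0 + int(round(np.tan(true_angle) * x))
        if 0 <= y < 100:
            img[y, x] = 1.0
candidates = np.linspace(-0.1, 0.1, 41)
print(estimate_skew(img, candidates))  # close to 0.05
```

The sub-second timing reported in the abstract is plausible for this kind of search, since each candidate angle costs only one pass over the image.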


The need for offline handwritten character recognition is intense, yet the task is difficult because writing varies from person to person and also depends on factors such as the writer's attitude and mood. It can nevertheless be addressed by converting the handwritten document into digital form. The field has advanced with the introduction of convolutional neural networks, and becomes more productive with pre-trained models, which can decrease training time and increase the accuracy of character recognition. Research on recognition of handwritten characters for Indian languages is sparse compared to languages such as English, Latin and Chinese, mainly because India is a multilingual country. Recognition of Telugu and Hindi characters is more difficult because the scripts of these languages are mostly cursive and carry more diacritics, so research in this line leans towards improving recognition accuracy. Some research has already been carried out and achieves up to eighty percent accuracy in offline handwritten character recognition of Telugu and Hindi. The proposed work focuses on increasing accuracy, in less time, in the recognition of these selected languages, and is able to reach the expected values.
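The pre-trained-model idea the paragraph relies on can be illustrated with a minimal sketch: a frozen feature extractor (standing in for the convolutional base of a pre-trained CNN) plus a small classification head that is the only part trained. Everything here, the random features, the toy data and the learning rate, is an illustrative assumption, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Frozen "pre-trained" feature extractor: a fixed random projection with a
# nonlinearity. Its weights are never updated, which is what cuts training time.
W_frozen = rng.standard_normal((2, 16))
def extract(x):
    return np.tanh(x @ W_frozen)

# Toy two-class data: two separated Gaussian clusters.
x0 = rng.standard_normal((50, 2)) + np.array([2.0, 2.0])
x1 = rng.standard_normal((50, 2)) + np.array([-2.0, -2.0])
X = np.vstack([x0, x1])
y = np.concatenate([np.ones(50), np.zeros(50)])

# Only the head is trained: logistic regression by gradient descent.
feats = extract(X)
w = np.zeros(16)
b = 0.0
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    w -= 0.5 * (feats.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

pred = (1.0 / (1.0 + np.exp(-(feats @ w + b))) > 0.5).astype(float)
print(np.mean(pred == y))  # training accuracy, near 1.0
```

In a real Telugu or Hindi recognizer the frozen part would be a network pre-trained on a large character corpus, and only the script-specific head would be fine-tuned.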

