USING A STATISTICAL LANGUAGE MODEL TO IMPROVE THE PERFORMANCE OF AN HMM-BASED CURSIVE HANDWRITING RECOGNITION SYSTEM

Author(s):  
U.-V. MARTI ◽  
H. BUNKE

In this paper, a system for the reading of totally unconstrained handwritten text is presented. The kernel of the system is a hidden Markov model (HMM) for handwriting recognition. This HMM is enhanced by a statistical language model. Thus linguistic knowledge beyond the lexicon level is incorporated in the recognition process. Another novel feature of the system is that the HMM is applied in such a way that the difficult problem of segmenting a line of text into individual words is avoided. A number of experiments with various language models and large vocabularies have been conducted. The language models used in the system were also analytically compared based on their perplexity.
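
The perplexity comparison mentioned above can be illustrated with a small sketch. It is not the language model used in the paper: it assumes a word bigram model with add-one smoothing, and the names `train_bigram`, `bigram_prob`, and `perplexity` are hypothetical helpers chosen for this example. Perplexity is computed as 2 raised to the negative average log2 probability per predicted token, so a lower value means the model finds the text less surprising.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts over tokenized sentences.

    Assumption for this sketch: each sentence is a list of word strings,
    and "<s>" / "</s>" mark sentence boundaries.
    """
    context_counts, bigram_counts = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, len(vocab)


def bigram_prob(prev, cur, context_counts, bigram_counts, vocab_size):
    """P(cur | prev) with add-one (Laplace) smoothing."""
    return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)


def perplexity(sentences, context_counts, bigram_counts, vocab_size):
    """Perplexity = 2 ** (-average log2 probability per predicted token)."""
    log_sum, n = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            p = bigram_prob(prev, cur, context_counts, bigram_counts, vocab_size)
            log_sum += math.log2(p)
            n += 1
    return 2 ** (-log_sum / n)


if __name__ == "__main__":
    train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    ctx, big, v = train_bigram(train)
    # A word order seen in training should score a lower perplexity
    # than a scrambled one.
    print(perplexity([["the", "cat", "sat"]], ctx, big, v))
    print(perplexity([["sat", "cat", "the"]], ctx, big, v))
```

With no training counts at all, the smoothed model falls back to a uniform distribution, and the perplexity equals the vocabulary size. That gives a convenient upper reference point when comparing language models.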


Cursive handwriting recognition is a very challenging area because writing styles differ from one person to the next. In this paper, an offline cursive handwriting character recognition system is described that uses an Artificial Neural Network. The features of each character written in the input are extracted and then passed to the neural network. Data sets containing the handwriting of different people are used to build the system. The proposed recognition system yields high levels of accuracy compared with the conventional approaches in this field. The system can efficiently recognize cursive text and convert it into structured form.


Author(s):  
T. VARGA ◽  
H. BUNKE

A perturbation model for the generation of synthetic textlines from existing cursively handwritten lines of text produced by human writers is presented. The goal of synthetic textline generation is to improve the performance of an offline cursive handwriting recognition system by providing it with additional training data. It can be expected that by adding synthetic training data the variability of the training set improves, which leads to a higher recognition rate. On the other hand, synthetic training data may bias a recognizer towards unnatural handwriting styles, which could lead to a deterioration of the recognition rate. In this paper the proposed perturbation model is evaluated under several experimental conditions, and it is shown that significant improvement of the recognition performance is possible even when the original training set is large and the textlines are provided by a large number of different writers.
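
The idea of generating extra training lines by perturbing existing ones can be sketched as follows. This is not the perturbation model evaluated in the paper: it assumes a single geometric distortion (a horizontal shear that changes the slant), a text line stored as a list of pixel rows with nonzero values for ink, and the hypothetical helper names `shear_line` and `synthesize`.

```python
import math
import random


def shear_line(image, slant):
    """Shift each row sideways in proportion to its height above the bottom row.

    Assumption for this sketch: `image` is a non-empty list of equal-length
    rows, with nonzero pixels meaning ink. Positive `slant` leans the
    writing to the right, negative to the left.
    """
    height, width = len(image), len(image[0])
    max_shift = math.ceil(abs(slant) * (height - 1))
    # Widen the canvas so that no ink is pushed outside the image.
    out = [[0] * (width + max_shift) for _ in range(height)]
    for y, row in enumerate(image):
        shift = round(slant * (height - 1 - y))
        if slant < 0:
            shift += max_shift  # keep all column indices non-negative
        for x, pixel in enumerate(row):
            if pixel:
                out[y][x + shift] = pixel
    return out


def synthesize(image, n_variants, max_slant=0.5, seed=0):
    """Return n_variants sheared copies with slants drawn from [-max_slant, max_slant]."""
    rng = random.Random(seed)
    return [shear_line(image, rng.uniform(-max_slant, max_slant))
            for _ in range(n_variants)]


if __name__ == "__main__":
    stroke = [[1, 0], [1, 0], [1, 0]]  # a vertical stroke, three pixels tall
    for row in shear_line(stroke, 1.0):
        print(row)
```

Keeping `max_slant` small reflects the trade-off described above: mild distortions add variability, while strong ones risk producing handwriting that no human writer would produce.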

