On-Line Handwritten Chinese Character Recognition Directed by Components with Dynamic Templates

Author(s):  
Xuhong Xiao ◽  
Ruwei Dai

Structural matching has been recognized as a promising approach for on-line Chinese character recognition. To reduce its high computational complexity and improve its performance, researchers have sought ways to direct the matching of a whole character by the result of partial matching. In this paper, the authors propose 45 basic components for 3,755 categories of commonly used Chinese characters to direct the stroke segment matching of whole characters. Since these components are always located at either the beginning or the end of a character's stroke segment string, they are easy to extract and to separate from the other parts of the character. In addition, the reference templates of these components are extracted dynamically from the corresponding segment string of a character when a specific matching is carried out. This strategy avoids building multiple templates for components of the same kind that occur at different places in characters. The experiments show that the segment matching computation is greatly reduced without reducing the correctness of matching.
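The idea of slicing a component template out of a character's own reference string, and using it to filter candidates before full matching, can be illustrated with a minimal sketch. It is not the authors' implementation: the integer segment codes, the `Reference` record, the `mismatch` cost, and the `tolerance` parameter are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence


class Reference(NamedTuple):
    """Hypothetical reference entry for one character category."""
    label: str
    segments: Sequence[int]  # assumed integer codes for stroke segments
    component_len: int       # length of the component, assumed >= 1
    at_start: bool           # True: component begins the string; False: ends it


def component_part(segments: Sequence[int], n: int, at_start: bool) -> Sequence[int]:
    """Take the first or last n segments of a segment string."""
    return segments[:n] if at_start else segments[-n:]


def mismatch(a: Sequence[int], b: Sequence[int]) -> int:
    """Toy cost: number of differing positions (length mismatch counts fully)."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def candidates(input_segments: Sequence[int],
               refs: Sequence[Reference],
               tolerance: int = 0) -> List[str]:
    """Keep only references whose component matches the input's prefix/suffix."""
    kept = []
    for ref in refs:
        # The template is not stored separately: it is sliced from the
        # reference string at matching time.
        template = component_part(ref.segments, ref.component_len, ref.at_start)
        part = component_part(input_segments, ref.component_len, ref.at_start)
        if mismatch(part, template) <= tolerance:
            kept.append(ref.label)
    return kept


refs = [
    Reference("A", (0, 2, 1, 3), 2, True),
    Reference("B", (1, 1, 0, 2), 2, False),
    Reference("C", (3, 3, 0, 2), 2, True),
]
survivors = candidates((0, 2, 0, 2), refs)
```

Only the surviving candidates would then go on to whole-character segment matching, which is where the saving in computation would come from under these assumptions.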


Author(s):  
Ju-Wei Chen ◽  
Suh-Yin Lee

Chinese characters are constructed from basic strokes according to structural rules. In handwritten characters, the shapes of the strokes may vary to some extent, but the spatial relations and geometric configurations of the strokes are usually maintained. These spatial relations and configurations can therefore be regarded as invariant features and used in the recognition of handwritten Chinese characters. In this paper, we investigate the structural knowledge in Chinese characters and propose the stroke spatial relationship representation (SSRR) to describe Chinese characters. An On-Line Chinese Character Recognition (OLCCR) method using the SSRR is also presented. With SSRR, each character is processed and represented by an attribute graph, so character recognition is transformed into a graph matching problem. After careful analysis, the basic spatial relationships between strokes are classified into five classes. A bitwise representation is adopted in the design of the data structure to reduce storage requirements and to speed up character matching. A hierarchical search strategy in the preclassification improves the recognition speed. The attribute graph model is a generalized character representation that provides a useful and convenient representation for newly added characters in an OLCCR system with automatic learning capability. The significance of the structural approach to character recognition using spatial relationships is analyzed and supported by experiments. Realistic testing is provided to show the effectiveness of the proposed method.
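A bitwise encoding of stroke relations in an attribute graph might look roughly like the sketch below. The abstract does not name the five relation classes, so the flag names here are placeholders, and the dictionary-based graph and the XOR-based distance are assumptions rather than the SSRR data structure itself.

```python
from enum import IntFlag
from typing import Dict, Tuple


class Rel(IntFlag):
    # Placeholder names only: the abstract states there are five classes
    # of spatial relationship but does not list them.
    LEFT_OF = 1
    ABOVE = 2
    CROSS = 4
    TOUCH = 8
    INSIDE = 16


# Hypothetical attribute graph: (stroke i, stroke j) -> bitmask of relations.
Graph = Dict[Tuple[int, int], int]


def distance(g1: Graph, g2: Graph) -> int:
    """Count relation bits that differ between two graphs."""
    total = 0
    for edge in set(g1) | set(g2):
        diff = int(g1.get(edge, 0)) ^ int(g2.get(edge, 0))  # differing bits
        total += bin(diff).count("1")
    return total


reference: Graph = {(0, 1): Rel.ABOVE, (0, 2): Rel.CROSS | Rel.ABOVE}
observed: Graph = {(0, 1): Rel.ABOVE, (0, 2): Rel.CROSS}
d = distance(reference, observed)
```

Packing the relations of a stroke pair into one small integer lets a comparison reduce to an XOR and a bit count, which is one plausible reading of how a bitwise representation saves storage and speeds up matching.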


Author(s):  
Hahn-Ming Lee ◽  
Chin-Chou Lin ◽  
Jyh-Ming Chen

In this paper, a method of character preclassification for handwritten Chinese character recognition is proposed. Since the number of Chinese characters is very large (at least 5,401 for daily use), we employ two stages to reduce the number of candidates for an input character. In stage I, we extract the first set of primitive features from handwritten Chinese characters and use fuzzy rules to create four preclassification groups. The purpose of stage I is to reduce the candidates roughly. In stage II, we extract the second set of primitive features from handwritten Chinese characters and then use the Supervised Extended ART (SEART) as the classifier to generate preclassification classes for each preclassification group created in stage I. Since the number of characters in each preclassification class is smaller than that in the whole character set, the problem becomes simpler. To evaluate the proposed preclassification system, we use 605 Chinese character categories from elementary school textbooks as our training and testing data. The database used is HCCRBASE (provided by CCL, ITRI, Taiwan). From samples 1–100, we select the even-numbered samples as the training set and the odd-numbered samples as the testing set. The characters of the testing set are distributed into the correct preclassification classes at a rate of 98.11%.
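The evaluation protocol described above can be sketched in a few lines. This is an illustration only: it covers the even/odd split and one plausible definition of the preclassification rate (a test sample counts as correct when its true character belongs to the class it was assigned to). The function names and the toy data are assumptions, and the fuzzy rules and SEART classifier are not reproduced.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple


def split_even_odd(sample_ids: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Even-numbered samples form the training set, odd-numbered the testing set."""
    ids = list(sample_ids)
    train = [s for s in ids if s % 2 == 0]
    test = [s for s in ids if s % 2 == 1]
    return train, test


def preclassification_rate(assigned: Sequence[int],
                           truth: Sequence[str],
                           members: Dict[int, Set[str]]) -> float:
    """Fraction of test samples whose true label lies in the assigned class."""
    hits = sum(label in members[cls] for cls, label in zip(assigned, truth))
    return hits / len(truth)


train_ids, test_ids = split_even_odd(range(1, 101))

# Toy data, not from the paper: two preclassification classes.
members = {0: {"a", "b"}, 1: {"c"}}
rate = preclassification_rate([0, 1, 0, 1], ["a", "c", "c", "c"], members)
```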


