Conceptions of an On-line Chinese Character Knowledge Base

2014 ◽  
Vol 33 (0) ◽  
pp. 25-45
Author(s):  
Suen Lun
Author(s):  
Ju-Wei Chen ◽  
Suh-Yin Lee

Chinese characters are constructed from basic strokes according to structural rules. In handwritten characters, the shapes of the strokes may vary to some extent, but the spatial relations and geometric configurations of the strokes are usually maintained. These spatial relations and configurations can therefore be regarded as invariant features and used in the recognition of handwritten Chinese characters. In this paper, we investigate the structural knowledge in Chinese characters and propose the stroke spatial relationship representation (SSRR) to describe them. An on-line Chinese character recognition (OLCCR) method using the SSRR is also presented. With SSRR, each character is represented by an attribute graph, so character recognition is transformed into a graph matching problem. Our analysis shows that the basic spatial relationships between strokes fall into five classes. A bitwise representation is adopted in the data structure design to reduce storage requirements and to speed up character matching, and a hierarchical search strategy in the preclassification stage further improves recognition speed. The attribute graph model is a generalized character representation that provides a convenient way to represent newly added characters in an OLCCR system with automatic learning capability. The significance of the structural approach to character recognition using spatial relationships is analyzed and supported by experiments, and realistic testing shows the effectiveness of the proposed method.
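The bitwise idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' data structure: the abstract states that there are five relationship classes but does not name them, so the five flag names below are hypothetical placeholders, as are the helper names `pack` and `mismatch` and the choice of five bits per stroke pair.

```python
from enum import IntFlag
from itertools import combinations


class Relation(IntFlag):
    # Hypothetical class names: the abstract only says there are five classes.
    LEFT_OF = 1 << 0
    ABOVE = 1 << 1
    CROSSES = 1 << 2
    TOUCHES = 1 << 3
    ENCLOSES = 1 << 4


BITS_PER_PAIR = 5  # one bit per assumed relation class


def pack(n_strokes, relations):
    """Pack the relation mask of every stroke pair (i < j) into one integer.

    `relations` maps a pair (i, j) to a Relation mask; missing pairs are
    treated as having no relation.
    """
    packed = 0
    for slot, pair in enumerate(combinations(range(n_strokes), 2)):
        mask = int(relations.get(pair, Relation(0)))
        packed |= mask << (slot * BITS_PER_PAIR)
    return packed


def mismatch(a, b):
    """Count relation bits that differ between two packed characters."""
    return bin(a ^ b).count("1")


# Two toy two-stroke templates (illustrative only).
cross = pack(2, {(0, 1): Relation.CROSSES})   # e.g. a horizontal crossing a vertical
stacked = pack(2, {(0, 1): Relation.ABOVE})   # e.g. one horizontal above another
```

Comparing an input against a template then costs one XOR and a bit count, for example `mismatch(cross, stacked)`, which suggests why a bitwise layout can save both storage and matching time. A real system would also have to handle stroke-order and stroke-count variation, which this sketch ignores.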


Author(s):  
Chen Hong ◽  
Gareth Loudon ◽  
Yimin Wu ◽  
Ruslana Zitserman

This article introduces the basic segmentation problems in Chinese handwriting and reviews several prior approaches to solving them. A new segmentation method is proposed that is applicable to both on-line and off-line systems for free-format handwritten Chinese sentences. The method performs basic segmentation and fine segmentation based on varying spacing thresholds and minimum variance criteria. The five most probable segmentations are derived from this stage, and all possible segments are extracted and recognized. A lattice is created from all the segments and searched using a Viterbi-based algorithm to find the most likely character sequence. The algorithm provides considerable flexibility and robustness in handling free-format continuous Chinese handwriting and is a promising solution for a natural and fast Chinese pen input system. Character accuracy is 85.0% on the on-line test data and 77.4% on the off-line test data.
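The lattice search described in the abstract above can be sketched as a Viterbi-style dynamic program. This is an illustration under assumptions, not the authors' algorithm: the edge format `(start, end, char, log_score)`, the function name `best_sequence`, and the optional `bigram` transition score are all hypothetical, and recognizer outputs are assumed to be log-probabilities.

```python
import math
from collections import defaultdict


def best_sequence(edges, n_boundaries, bigram=None):
    """Find the highest-scoring character path through a segment lattice.

    edges: (start, end, char, log_score) tuples with start < end, where
           start and end index candidate cut points 0 .. n_boundaries - 1.
    bigram: optional function (prev_char, char) -> log score (assumed).
    Returns (text, score), or None if no path reaches the last boundary.
    """
    by_start = defaultdict(list)
    for start, end, char, score in edges:
        by_start[start].append((end, char, score))

    # State = (boundary, last character); value = (score, previous state).
    best = {(0, None): (0.0, None)}
    for pos in range(n_boundaries):
        # All states at `pos` are final here because edges only go forward.
        for state in [k for k in best if k[0] == pos]:
            score_so_far = best[state][0]
            for end, char, score in by_start[pos]:
                trans = bigram(state[1], char) if bigram else 0.0
                cand = score_so_far + score + trans
                key = (end, char)
                if key not in best or cand > best[key][0]:
                    best[key] = (cand, state)

    finals = [k for k in best if k[0] == n_boundaries - 1]
    if not finals:
        return None
    state = max(finals, key=lambda k: best[k][0])
    total = best[state][0]

    # Follow back-pointers to recover the character sequence.
    chars = []
    while state[1] is not None:
        chars.append(state[1])
        state = best[state][1]
    return "".join(reversed(chars)), total


# Toy lattice: one wide segment competes with two narrow ones.
toy_edges = [
    (0, 2, "明", math.log(0.6)),
    (0, 1, "日", math.log(0.9)),
    (1, 2, "月", math.log(0.8)),
]
```

Calling `best_sequence(toy_edges, 3)` compares the single-segment reading against the two-segment reading and keeps whichever has the higher total log score. Supplying a `bigram` function lets language context shift that choice, which is one plausible way a lattice search can correct segmentation errors.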

