Information technology. Coded graphic character set for text communication. Latin alphabet

2015
Author(s): Weilan Wang, Zhengjiang Li, Zhengqi Cai, Xiaobao Lv, Caike Zhaxi, ...

Online handwriting recognition of Tibetan characters is still in its infancy. To support further research, we developed an online handwriting database covering a large Tibetan character set and ran recognition experiments on it to establish a baseline result. The Northwest Minzu University Online Tibetan Handwriting Database (NMU-OLTHWDB) contains 7240 character classes, with 5000 samples per class. The total number of samples is [Formula: see text]. The database covers the Tibetan Character Collection, the Information Technology Tibetan Coded Character set (Extension Set A), and the Information Technology Tibetan Coded Character set (Extension Set B). The characters in the database are composed of 170 distinct components. We also studied online handwritten Tibetan recognition software, and this paper mainly describes character feature extraction, classifier training, and the statistics and analysis of the recognition results on the test set. The character features comprise direction attribute coefficients and spatial combination, and the feature matrix is compressed by Linear Discriminant Analysis (LDA). A quick classifier based on a modified quadratic discriminant function (QMQDF) was designed and trained with 4500 sets of samples. On the large character set, the top-1, top-3, top-5, and top-10 recognition rates were 75.2%, 89.56%, 93.02%, and 95.96%, respectively. Moreover, an online handwriting recognition system for the large Tibetan character set was designed, with good performance.
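The top-1, top-3, top-5, and top-10 figures above are top-k recognition rates: a test sample counts as correct at rank k if its true class appears among the classifier's k best candidates. The following is a minimal Python sketch of that metric only, not the authors' evaluation code. The function name `top_k_rate` and the toy labels are hypothetical and are not taken from NMU-OLTHWDB.

```python
from typing import Sequence


def top_k_rate(
    ranked_candidates: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the first k candidates.

    ranked_candidates[i] holds the classifier's candidates for sample i,
    ordered from most to least likely (hypothetical input format).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_candidates) != len(true_labels):
        raise ValueError("one candidate list is required per true label")
    if not true_labels:
        raise ValueError("at least one sample is required")

    hits = sum(
        1
        for candidates, truth in zip(ranked_candidates, true_labels)
        if truth in candidates[:k]
    )
    return hits / len(true_labels)


# Toy data (made up for illustration): four samples, all of true class "ka".
ranked = [
    ["ka", "kha", "ga"],   # correct at rank 1
    ["kha", "ka", "nga"],  # correct at rank 2
    ["ga", "nga", "ka"],   # correct at rank 3
    ["nga", "ga", "kha"],  # true class never proposed
]
truth = ["ka", "ka", "ka", "ka"]

for k in (1, 2, 3):
    print(f"top-{k} rate: {top_k_rate(ranked, truth, k):.2f}")
```

Because the first k candidates always include the first k-1, the rate can only stay level or rise as k grows, which matches the increasing sequence of rates reported in the abstract.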

