KEYWORD SPOTTING FROM ONLINE CHINESE HANDWRITTEN DOCUMENTS USING ONE-VERSUS-ALL CHARACTER CLASSIFICATION MODEL
In this paper, we propose a method for text-query-based keyword spotting from online Chinese handwritten documents using a character classification model. The similarity between the query word and the handwriting is obtained by combining character classification scores. The classifier is trained with a one-versus-all strategy so that it gives a high similarity to the target class and low scores to the others. Using character classification-based word similarity also helps overcome the out-of-vocabulary (OOV) problem. We use a character-synchronous dynamic search algorithm to efficiently spot the query word in a large database. The retrieval performance is further improved by using competing character confusion and writer-adaptive thresholds. Our experimental results on the large handwriting database CASIA-OLHWDB demonstrate the superiority of one-versus-all trained classifiers and the benefits of confidence transformation, character confusion, and adaptive thresholds. In particular, a one-versus-all trained prototype classifier performs as well as a linear support vector machine (SVM) classifier but requires much less storage for the index file. An experimental comparison with keyword spotting based on handwritten text recognition also demonstrates the effectiveness of the proposed method.
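
To make the idea of combining character scores into a word similarity concrete, the following is a minimal Python sketch, not the implementation evaluated in this paper. It assumes a text line has already been over-segmented into `num_segments` primitive segments, that a candidate character is formed by merging up to `max_merge` consecutive segments, and that a hypothetical callback `score(start, end, ch)` returns the classifier similarity of segments `[start, end)` to character `ch`. Averaging the character scores and using a single fixed `threshold` are illustrative assumptions; the names `spot_keyword`, `score`, `max_merge`, and `threshold` are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple


def spot_keyword(
    query: str,
    num_segments: int,
    score: Callable[[int, int, str], float],
    max_merge: int = 3,
    threshold: float = 0.5,
) -> List[Tuple[int, int, float]]:
    """Return (start, end, similarity) for segment spans matching the query.

    The search advances one query character at a time. For each end
    boundary it keeps only the best accumulated score, so the cost grows
    linearly with the query length rather than with the number of
    possible segmentations.
    """
    if not query:
        return []

    hits: List[Tuple[int, int, float]] = []
    for start in range(num_segments):
        # best[j]: highest summed score for the characters matched so far,
        # with the last matched character ending at segment boundary j.
        best: Dict[int, float] = {start: 0.0}
        for ch in query:
            nxt: Dict[int, float] = {}
            for j, total in best.items():
                for k in range(1, max_merge + 1):
                    end = j + k
                    if end > num_segments:
                        break
                    cand = total + score(j, end, ch)
                    if cand > nxt.get(end, float("-inf")):
                        nxt[end] = cand
            best = nxt
            if not best:
                break  # the query does not fit in the remaining segments

        for end, total in best.items():
            similarity = total / len(query)  # assumed combination: average
            if similarity >= threshold:
                hits.append((start, end, similarity))
    return hits
```

Because the word similarity depends only on per-character scores, a query word never seen as a whole during training can still be matched, which is the property that addresses the OOV problem. A writer-adaptive variant would replace the fixed `threshold` with a value estimated per writer.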