Bag of Visual Words for Word Spotting in Handwritten Documents Based on Curvature Features

In this article, the authors propose a segmentation-free word spotting method for handwritten document images using a Bag of Visual Words (BoVW) framework based on the co-occurrence histogram of oriented gradients (Co-HOG) descriptor. First, the handwritten document is represented by visual word vectors obtained from the frequency of occurrence of Co-HOG descriptors within local patches of the document. This visual word representation does not record where the visual words occur, yet spatial information helps to tell apart locations that look the same when judged by visual content alone. Hence, to add the spatial distribution of visual words to the unstructured BoVW framework, the authors adopt the spatial pyramid matching (SPM) technique. The proposed method is evaluated on popular datasets, and the results confirm that it outperforms existing segmentation-free word spotting techniques.
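
To make the role of the spatial pyramid concrete, the following is a minimal sketch, not the authors' implementation. It assumes a codebook of visual words is already available and that each local descriptor comes with an (x, y) patch position; the function names, the unweighted concatenation of levels, and the toy data are illustrative assumptions. Full SPM also weights the levels and compares histograms with an intersection kernel, which is omitted here.

```python
from math import dist


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (Euclidean)."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def spatial_pyramid_histogram(points, descriptors, codebook, width, height, levels=2):
    """Concatenate per-cell visual word histograms over a pyramid of grids.

    Level l splits the image into 2**l x 2**l cells. Level 0 is the plain,
    orderless BoVW histogram; finer levels record where the words occur.
    (Hypothetical helper; level weights of full SPM are left out.)
    """
    words = [nearest_word(d, codebook) for d in descriptors]
    feature = []
    for level in range(levels):
        cells = 2 ** level
        hists = [[0] * len(codebook) for _ in range(cells * cells)]
        for (x, y), w in zip(points, words):
            # Clamp so that points on the right/bottom border stay in range.
            cx = min(int(x * cells / width), cells - 1)
            cy = min(int(y * cells / height), cells - 1)
            hists[cy * cells + cx][w] += 1
        for h in hists:
            feature.extend(h)
    return feature


# Toy data: two visual words, two patches, a 10 x 10 image.
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.0)]
a = spatial_pyramid_histogram([(1, 1), (9, 9)], descriptors, codebook, 10, 10)
b = spatial_pyramid_histogram([(9, 9), (1, 1)], descriptors, codebook, 10, 10)
```

In this toy case the two images contain the same visual words, so the level-0 parts of `a` and `b` are identical, while the level-1 parts differ because the words sit in different cells.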

Bag of Visual Words Based on Co-HOG Features for Word Spotting in Handwritten Documents

Advancements in Computer Vision and Image Processing - Advances in Computer and Electrical Engineering ◽

10.4018/978-1-5225-5628-2.ch007 ◽

2018 ◽

pp. 162-189

Author(s):

Thontadari C. ◽

Prabhakar C. J.

Keyword(s):

Spatial Information ◽

Bag Of Visual Words ◽

Shape Information ◽

Word Spotting ◽

Handwritten Documents ◽

Visual Words ◽

Gradient Orientation ◽

Handwritten Document ◽

Image Shape ◽

Pyramid Matching

In this chapter, the authors present a segmentation-based word spotting method for handwritten documents using a bag of visual words (BoVW) framework based on co-occurrence histograms of oriented gradients (Co-HOG) features. The Co-HOG descriptor captures the shape of the word image and encodes local spatial information by counting the co-occurrences of gradient orientations over pairs of neighboring pixels. The handwritten document images are segmented into words, and each word image is represented by a vector containing the frequencies of the visual words that appear in the image. To incorporate spatial information into the BoVW framework, the authors adopt the spatial pyramid matching (SPM) method. The proposed method is evaluated using precision and recall in experiments on popular datasets such as GW and IAM. The performance analysis confirms that the method outperforms existing word spotting techniques.
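
The co-occurrence counting described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the image is a plain list of rows of grey values, gradients are central differences, orientations are quantized into 8 bins, and only three pixel offsets are used. The function names and parameter defaults are hypothetical, and published Co-HOG variants typically use more offsets and split the image into sub-regions.

```python
from math import atan2, pi


def orientation_bins(image, bins=8):
    """Quantize the gradient orientation of each interior pixel into `bins` labels."""
    h, w = len(image), len(image[0])
    labels = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            if gx == 0 and gy == 0:
                continue  # flat region: no orientation, pixel is skipped
            angle = atan2(gy, gx) % (2 * pi)
            labels[y][x] = int(angle / (2 * pi) * bins) % bins
    return labels


def cooccurrence(labels, offset, bins=8):
    """Count orientation pairs (a, b) for pixels separated by `offset` = (dx, dy)."""
    dx, dy = offset
    h, w = len(labels), len(labels[0])
    matrix = [[0] * bins for _ in range(bins)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                a, b = labels[y][x], labels[ny][nx]
                if a is not None and b is not None:
                    matrix[a][b] += 1
    return matrix


def cohog(image, offsets=((1, 0), (0, 1), (1, 1)), bins=8):
    """Concatenate one flattened bins x bins co-occurrence matrix per offset."""
    labels = orientation_bins(image, bins)
    feature = []
    for off in offsets:
        for row in cooccurrence(labels, off, bins):
            feature.extend(row)
    return feature


# A 5 x 5 image with a vertical edge between the second and third columns.
edge = [[0, 0, 1, 1, 1] for _ in range(5)]
descriptor = cohog(edge)  # length 3 * 8 * 8 = 192
```

Because each matrix entry counts a pair of orientations at a fixed offset, the descriptor keeps some local spatial structure that a single orientation histogram would discard.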

On the Influence of Word Representations for Handwritten Word Spotting in Historical Documents

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001412630025 ◽

2012 ◽

Vol 26 (05) ◽

pp. 1263002 ◽

Cited By ~ 32

Author(s):

Josep Lladós ◽

Marçal Rusiñol ◽

Alicia Fornés ◽

David Fernández ◽

Anjan Dutta

Keyword(s):

Structural Model ◽

Historical Data ◽

Historical Documents ◽

Bag Of Visual Words ◽

George Washington ◽

Word Spotting ◽

Visual Words ◽

Advantages And Disadvantages ◽

Word Representation ◽

Word Images

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper, we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using DTW as a baseline reference, a bag of visual words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach where words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally perform better. However, large descriptors are difficult to deploy in a retrieval scenario where word spotting requires indexing millions of word images.
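
For readers unfamiliar with the DTW baseline, the following is a minimal sketch of classic dynamic time warping between two sequences of feature vectors. It assumes each word image has already been turned into a left-to-right sequence of per-column feature vectors; the feature extraction, the function name, and the toy sequences are illustrative assumptions and not taken from the paper.

```python
from math import dist, inf


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors.

    cost[i][j] is the cheapest alignment of the first i items of `a`
    with the first j items of `b`.
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy one-dimensional column features for a query and two candidates.
query = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (1.0,), (1.0,), (2.0,)]  # same shape, written wider
shortened = [(0.0,), (2.0,)]

ranking = sorted([stretched, shortened], key=lambda c: dtw_distance(query, c))
```

Here the stretched candidate aligns with the query at zero cost because DTW absorbs the repeated column, which is why sequence alignment tolerates variations in writing width. The quadratic cost per comparison is one reason fixed-length descriptors are attractive for large collections.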
