ON THE INFLUENCE OF WORD REPRESENTATIONS FOR HANDWRITTEN WORD SPOTTING IN HISTORICAL DOCUMENTS

Author(s):  
JOSEP LLADÓS ◽  
MARÇAL RUSIÑOL ◽  
ALICIA FORNÉS ◽  
DAVID FERNÁNDEZ ◽  
ANJAN DUTTA

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using dynamic time warping (DTW) as a baseline reference, a bag-of-visual-words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach in which words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally give better performance. However, large descriptors are difficult to deploy in a retrieval scenario where word spotting requires indexing millions of word images.
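As an illustration of the DTW baseline mentioned above, the following is a minimal sketch of query-by-example ranking with dynamic time warping. It assumes each word image has already been converted into a sequence of per-column feature vectors. The function names `dtw_distance` and `rank_by_dtw`, the Euclidean local cost, and the absence of any path-length normalization or warping-band constraint are assumptions made for this example, not details taken from the paper.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a, b: lists of equal-dimension numeric tuples (one tuple per image column).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def rank_by_dtw(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    return sorted(range(len(candidates)),
                  key=lambda k: dtw_distance(query, candidates[k]))
```

The quadratic cost of each comparison is one reason such an alignment-based baseline scales poorly when a collection contains millions of word images.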

2019 ◽  
Vol 9 (2) ◽  
pp. 49-65
Author(s):  
Thontadari C. ◽  
Prabhakar C. J.

In this article, the authors propose a segmentation-free word spotting method for handwritten document images using a Bag of Visual Words (BoVW) framework based on the co-occurrence histogram of oriented gradients (Co-HOG) descriptor. Initially, the handwritten document is represented using visual word vectors obtained from the frequency of occurrence of Co-HOG descriptors within local patches of the document. The visual word representation vector does not consider the spatial location of the visual words, yet spatial information helps to tell apart different locations that would otherwise be perceived as the same from visual information alone. Hence, to add spatial distribution information of visual words to the unstructured BoVW framework, the authors adopt the spatial pyramid matching (SPM) technique. The proposed method is evaluated on popular datasets, and the results confirm that it outperforms existing segmentation-free word spotting techniques.
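The idea of adding spatial layout to an orderless BoVW histogram can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes local descriptors have already been quantized to visual word indices, it splits the region only horizontally (2**level vertical strips per level), and it omits the per-level weights used in full spatial pyramid matching. The function name `spatial_pyramid_histogram` and its parameters are hypothetical.

```python
def spatial_pyramid_histogram(words, xs, width, vocab_size, levels=2):
    """Concatenate visual-word histograms computed over a horizontal pyramid.

    words: visual word index of each local patch (0 <= index < vocab_size)
    xs:    x coordinate of each patch (0 <= x < width)
    Level l divides the region into 2**l vertical strips.
    """
    descriptor = []
    for level in range(levels):
        cells = 2 ** level
        hists = [[0] * vocab_size for _ in range(cells)]
        for word, x in zip(words, xs):
            # Map the patch position to its strip, clamping the right edge
            cell = min(int(x * cells / width), cells - 1)
            hists[cell][word] += 1
        for hist in hists:
            descriptor.extend(hist)
    return descriptor


# Three patches in a region 10 pixels wide, vocabulary of 2 visual words
desc = spatial_pyramid_histogram([0, 1, 1], [1, 6, 9], width=10, vocab_size=2)
```

Level 0 reproduces the plain BoVW histogram, while the finer levels record in which part of the region each visual word occurs.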


Author(s):  
Thontadari C. ◽  
Prabhakar C. J.

In this chapter, the authors present a segmentation-based word spotting method for handwritten documents using a bag of visual words (BoVW) framework based on co-occurrence histograms of oriented gradients (Co-HOG) features. The Co-HOG descriptor captures the shape information of the word image and encodes local spatial information by counting the co-occurrences of gradient orientations in neighboring pixel pairs. The handwritten document images are segmented into words, and each word image is represented by a vector that contains the frequencies of the visual words appearing in the image. To incorporate spatial information into the BoVW framework, the authors adopt the spatial pyramid matching (SPM) method. The proposed method is evaluated using precision and recall metrics in experiments conducted on popular datasets such as GW and IAM. The performance analysis confirms that the method outperforms existing word spotting techniques.
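The co-occurrence counting described above can be sketched for a single pixel offset. This is a reduced illustration under stated assumptions: gradients come from central differences, orientations are quantized into 8 bins, flat pixels are skipped, and only one offset is used, whereas a full Co-HOG descriptor combines many offsets over image sub-regions. The function names `quantized_orientations` and `cooccurrence` are hypothetical.

```python
import math


def quantized_orientations(img, bins=8):
    """Quantize the gradient orientation of each interior pixel into `bins` bins.

    img: 2D list of grey levels. Border pixels and zero-gradient pixels get None.
    """
    h, w = len(img), len(img[0])
    ori = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # central difference in x
            gy = img[y + 1][x] - img[y - 1][x]  # central difference in y
            if gx == 0 and gy == 0:
                continue  # no defined orientation on flat areas
            angle = math.atan2(gy, gx) % (2 * math.pi)
            ori[y][x] = int(angle / (2 * math.pi) * bins) % bins
    return ori


def cooccurrence(ori, offset=(1, 0), bins=8):
    """Count orientation pairs (pixel, pixel + offset) in a bins x bins matrix."""
    dx, dy = offset
    h, w = len(ori), len(ori[0])
    mat = [[0] * bins for _ in range(bins)]
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if not (0 <= y2 < h and 0 <= x2 < w):
                continue
            a, b = ori[y][x], ori[y2][x2]
            if a is not None and b is not None:
                mat[a][b] += 1
    return mat


# 5x5 horizontal ramp: every interior pixel has the same gradient direction
ramp = [[x for x in range(5)] for _ in range(5)]
matrix = cooccurrence(quantized_orientations(ramp))
```

Flattening such matrices into vectors gives local descriptors that could then be quantized into visual words for a BoVW representation.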


2013 ◽  
Vol 2 (1) ◽  
pp. 22-26
Author(s):  
Joanna Czekaj ◽  
Kamil Trepka

The Goczałkowice reservoir is one of the main sources of drinking water for the Upper Silesia region. In connection with the Water Framework Directive, the strategic research project "Integrated system supporting management and protection of dammed reservoir (ZiZoZap)" has been conducted on the Goczałkowice reservoir since 2010. Within the framework of this project, comprehensive groundwater monitoring is carried out. One aspect is vadose zone research, conducted to obtain information about changes in the chemical composition of infiltrating water and about mass transport within this zone. The location of the vadose zone research site was selected based on historical data and the structural model of the direct catchment of the Goczałkowice reservoir. At the end of November 2012, a specially designed lysimeter was installed with 10 MacroRhizon samplers at each lithological variation in the unsaturated zone. This lysimeter, together with nested observation wells located in its direct proximity, forms the vadose zone research site, whose main aim is to determine the amount of nitrate transport in the vertical profile.

