Word Spotting in Handwritten Document Images based on Multiple Features

This paper presents a word spotting method for handwritten documents based on multiple features. The features are derived using Gabor filters, histograms of oriented gradients (HOG), local binary patterns (LBP), texture filters, and morphological filters. Real-world documents are heterogeneous: application forms, postal cards, railway reservation forms, and similar documents contain both handwritten and printed text in different scripts. Spotting a word in such documents and retrieving it from a huge digitized repository is a challenging task. To address this, word spotting based on multiple features is carried out with both learning-based and learning-free methods. In both cases, texture filters give the best performance in terms of precision, recall, and F-measure. To confirm the capability of the proposed method, extensive experiments were carried out on the publicly available GW20 dataset, with encouraging results compared to other contemporary works.
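The evaluation measures named in the abstract can be illustrated for a single query. The following is a minimal sketch, not the authors' evaluation code; the function name `retrieval_scores` and the example word identifiers are assumptions made for illustration.

```python
def retrieval_scores(retrieved, relevant):
    """Return (precision, recall, F-measure) for one word spotting query.

    retrieved: identifiers of the word images returned by the system
    relevant:  identifiers of the word images that truly match the query
    Both arguments are hypothetical inputs chosen for this sketch.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # correctly spotted words
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Four words returned, three truly relevant, two of them found.
    print(retrieval_scores([1, 2, 3, 4], [2, 4, 5]))
```

Reported figures in word spotting papers are usually averaged over many queries; this sketch covers only the per-query computation.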

Author(s):  
Shamik Majumder ◽  
Subhrangshu Ghosh ◽  
Samir Malakar ◽  
Ram Sarkar ◽  
Mita Nasipuri

Author(s):  
C. Thontadari ◽  
C. J. Prabhakar

In this paper, the authors propose a Scale Space Co-occurrence Histograms of Oriented Gradients (SS Co-HOG) method for retrieving words from digitized handwritten documents. HOG-based word spotting performs poorly on handwritten documents because HOG ignores the spatial information of neighboring pixels, whereas Co-HOG captures it by counting the co-occurrences of the gradient orientations of two or more neighboring pixels. The authors represent each image at three scales; at each scale, they divide the word image into blocks, extract Co-HOG features from each block, and concatenate them to form a feature descriptor. The method is evaluated using precision and recall on the popular IAM and GW datasets, and the authors report that it outperforms the compared methods on both datasets.
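The co-occurrence counting described above can be sketched as follows. This is a simplified, single-scale illustration under stated assumptions, not the authors' SS Co-HOG implementation: it omits the three-scale representation and the block division, and the function names, the two pixel offsets, the 8 orientation bins, and the choice to skip zero-gradient pixels are all assumptions.

```python
import math


def orientation_bins(img, n_bins=8):
    """Quantize the gradient orientation of every pixel into n_bins.

    img is a 2D list of grey values. Gradients use central differences
    with clamped borders. Pixels with zero gradient get None (an
    assumption of this sketch: they carry no orientation).
    """
    h, w = len(img), len(img[0])
    bins = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            if gx == 0 and gy == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bins[y][x] = int(angle / (2 * math.pi) * n_bins) % n_bins
    return bins


def cohog(img, offsets=((1, 0), (0, 1)), n_bins=8):
    """Concatenate one n_bins x n_bins co-occurrence histogram per offset.

    For each offset (dx, dy), count how often orientation bin a at pixel
    (x, y) occurs together with bin b at pixel (x + dx, y + dy).
    """
    h, w = len(img), len(img[0])
    bins = orientation_bins(img, n_bins)
    feature = []
    for dx, dy in offsets:
        hist = [0] * (n_bins * n_bins)
        for y in range(h):
            for x in range(w):
                x2, y2 = x + dx, y + dy
                if 0 <= x2 < w and 0 <= y2 < h:
                    a, b = bins[y][x], bins[y2][x2]
                    if a is not None and b is not None:
                        hist[a * n_bins + b] += 1
        feature.extend(hist)
    return feature


if __name__ == "__main__":
    ramp = [[0, 1, 2, 3] for _ in range(3)]  # brightness increases left to right
    descriptor = cohog(ramp)
    print(len(descriptor), descriptor[0], descriptor[64])
```

A plain HOG would record only how often each orientation occurs; the paired histogram additionally records which orientations occur next to each other, which is the spatial information the abstract refers to.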


2019 ◽  
Vol 9 (2) ◽  
pp. 49-65
Author(s):  
Thontadari C. ◽  
Prabhakar C. J.

In this article, the authors propose a segmentation-free word spotting method for handwritten document images using a Bag of Visual Words (BoVW) framework based on the co-occurrence histogram of oriented gradients (Co-HOG) descriptor. Initially, the handwritten document is represented by visual word vectors, which are obtained from the frequency of occurrence of Co-HOG descriptors within local patches of the document. This visual word representation does not take the spatial location of the visual words into account, although spatial information helps to distinguish locations that would otherwise be perceived as the same from visual information alone. Hence, to add the spatial distribution of visual words to the unstructured BoVW framework, the authors adopt the spatial pyramid matching (SPM) technique. The performance of the proposed method is evaluated on popular datasets, and the authors report that it outperforms existing segmentation-free word spotting techniques.
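The idea of adding spatial layout to a bag of visual words can be sketched as a pyramid of per-cell histograms. This is a minimal illustration, not the authors' method: it assumes each local patch has already been assigned a visual word index, it leaves out the level weighting used in standard spatial pyramid matching, and the function name and parameters are hypothetical.

```python
def spatial_pyramid_histogram(points, width, height, vocab_size, levels=2):
    """Concatenate visual word histograms over increasingly fine grids.

    points: iterable of (x, y, word), where word is the index of the
            visual word assigned to the patch at (x, y)
    Level 0 is the plain BoVW histogram of the whole image; level l
    splits the image into 2**l x 2**l cells with one histogram each.
    """
    points = list(points)
    feature = []
    for level in range(levels):
        cells = 2 ** level
        hists = [[0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in points:
            # Map the coordinate to its grid cell, clamping the far edge.
            cx = min(int(x * cells / width), cells - 1)
            cy = min(int(y * cells / height), cells - 1)
            hists[cy * cells + cx][word] += 1
        for hist in hists:
            feature.extend(hist)
    return feature


if __name__ == "__main__":
    patches = [(1, 1, 0), (9, 1, 1), (9, 9, 1)]
    print(spatial_pyramid_histogram(patches, 10, 10, vocab_size=2))
```

Two images with the same visual words in different places share the level 0 histogram but differ at level 1, which is how the pyramid restores spatial information to the otherwise unordered representation.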


Author(s):  
Ryma Benabdelaziz ◽  
Djamel Gaceb ◽  
Mohammed Haddad

Retrieving information from a huge collection of ancient handwritten documents is important for indexing, interpreting, browsing, and searching documents in various domains. Word spotting approaches are widely used in this context but have several limitations related to the complex properties of handwriting. These limitations can appear at several steps: interest point detection, description, and matching. This article proposes a new word spotting approach for word retrieval in handwritten documents, which mainly leverages the properties of image gradients for visual feature detection and description. The proposed approach combines spatial relationships with textural information to achieve more accurate matching. Experimental results show that the approach achieves higher performance on the Jeremy Bentham dataset, evaluated following the benchmarks of the ICDAR 2015 Competition on Keyword Spotting for Handwritten Documents.


Author(s):  
Thontadari C. ◽  
Prabhakar C. J.

In this chapter, the authors present a segmentation-based word spotting method for handwritten documents using a bag of visual words (BoVW) framework based on co-occurrence histograms of oriented gradients (Co-HOG) features. The Co-HOG descriptor captures the shape information of the word image and encodes local spatial information by counting the co-occurrences of the gradient orientations of neighboring pixel pairs. The handwritten document images are segmented into words, and each word image is represented by a vector that contains the frequencies of the visual words that appear in the image. To incorporate spatial information into the BoVW framework, the authors adopt the spatial pyramid matching (SPM) method. The proposed method is evaluated using precision and recall on popular datasets such as GW and IAM. The performance analysis indicates that the method outperforms existing word spotting techniques.


2014 ◽  
Author(s):  
Irving Biederman ◽  
Ori Amir