Segmentation Free Word Spotting for Handwritten Documents Using Bag of Visual Words Based on Co-HOG Descriptor

In this article, the authors propose a segmentation-free word spotting method for handwritten document images using a Bag of Visual Words (BoVW) framework based on the co-occurrence histogram of oriented gradients (Co-HOG) descriptor. Initially, the handwritten document is represented by visual word vectors, which are obtained from the frequency of occurrence of Co-HOG descriptors within local patches of the document. This visual word representation does not consider the spatial location of the visual words, yet spatial information helps to discriminate between locations that would otherwise be perceived as the same from visual information alone. Hence, to add the spatial distribution of visual words to the unstructured BoVW framework, the authors adopt the spatial pyramid matching (SPM) technique. The performance of the proposed method is evaluated on popular datasets, and the results confirm that it outperforms existing segmentation-free word spotting techniques.
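As a rough illustration of how a spatial pyramid adds location information to a BoVW histogram, the sketch below assigns each local descriptor to its nearest visual word and concatenates per-cell histograms over a 1x1 and a 2x2 grid. It is not the article's implementation: the tiny hand-written `codebook`, the 2-D descriptors, the function names, and the omission of per-level weights are all assumptions made for brevity (a codebook would normally be learned by clustering, and the descriptors would be Co-HOG vectors).

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    best_index, best_dist = 0, math.inf
    for index, center in enumerate(codebook):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, center))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def spatial_pyramid_histogram(patches, codebook, width, height, levels=2):
    """Concatenate visual-word histograms over a 1x1, 2x2, ... grid pyramid.

    patches: list of (x, y, descriptor) with 0 <= x < width and 0 <= y < height.
    Level weighting used in full SPM is deliberately left out of this sketch.
    """
    k = len(codebook)
    words = [(x, y, nearest_word(desc, codebook)) for x, y, desc in patches]
    feature = []
    for level in range(levels):
        cells = 2 ** level  # number of cells per axis at this level
        hist = [[0] * k for _ in range(cells * cells)]
        for x, y, word in words:
            cx = min(int(x * cells / width), cells - 1)
            cy = min(int(y * cells / height), cells - 1)
            hist[cy * cells + cx][word] += 1
        for cell_hist in hist:
            feature.extend(cell_hist)
    return feature


# Hypothetical toy data: two visual words, three patches in an 8x8 region.
codebook = [(0.0, 0.0), (1.0, 1.0)]
patches = [(1, 1, (0.1, 0.0)), (7, 1, (0.9, 1.1)), (7, 7, (0.2, 0.1))]
feature = spatial_pyramid_histogram(patches, codebook, width=8, height=8)
print(feature)
```

The first two entries are the plain, location-free BoVW histogram; the remaining entries record which quadrant each visual word fell into, which is the information the unstructured representation lacks.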


Bag of Visual Words Based on Co-HOG Features for Word Spotting in Handwritten Documents

Advancements in Computer Vision and Image Processing - Advances in Computer and Electrical Engineering ◽

10.4018/978-1-5225-5628-2.ch007 ◽

2018 ◽

pp. 162-189

Author(s):

Thontadari C. ◽

Prabhakar C. J.

Keyword(s):

Spatial Information ◽

Bag Of Visual Words ◽

Shape Information ◽

Word Spotting ◽

Handwritten Documents ◽

Visual Words ◽

Gradient Orientation ◽

Handwritten Document ◽

Image Shape ◽

Pyramid Matching

In this chapter, the authors present a segmentation-based word spotting method for handwritten documents using a bag of visual words (BoVW) framework based on co-occurrence histograms of oriented gradients (Co-HOG) features. The Co-HOG descriptor captures the shape information of the word image and encodes local spatial information by counting the co-occurrences of gradient orientations of neighboring pixel pairs. The handwritten document images are segmented into words, and each word image is represented by a vector containing the frequencies of the visual words that appear in the image. To include spatial information in the BoVW framework, the authors adopt the spatial pyramid matching (SPM) method. The proposed method is evaluated using precision and recall metrics in experiments on popular datasets such as GW and IAM. The performance analysis confirms that the method outperforms existing word spotting techniques.
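The idea of counting co-occurring gradient orientations of neighboring pixel pairs can be sketched as follows. This is a simplified illustration, not the chapter's implementation: the choice of 8 orientation bins, the two pixel offsets, the central-difference gradient, and the function names are assumptions.

```python
import math


def quantized_orientations(image, bins=8):
    """Quantize the gradient orientation at each pixel into `bins` labels.

    Pixels with zero gradient get the label None and are skipped later.
    """
    height, width = len(image), len(image[0])
    labels = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Central differences with clamped borders.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            if gx == 0 and gy == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            labels[y][x] = int(angle / (2 * math.pi) * bins) % bins
    return labels


def cohog(image, offsets=((1, 0), (0, 1)), bins=8):
    """One bins x bins co-occurrence matrix per offset, flattened and concatenated."""
    labels = quantized_orientations(image, bins)
    height, width = len(labels), len(labels[0])
    feature = []
    for dx, dy in offsets:
        matrix = [0] * (bins * bins)
        for y in range(height):
            for x in range(width):
                nx, ny = x + dx, y + dy
                if not (0 <= nx < width and 0 <= ny < height):
                    continue
                a, b = labels[y][x], labels[ny][nx]
                if a is not None and b is not None:
                    matrix[a * bins + b] += 1  # pair (a, b) co-occurs at this offset
        feature.extend(matrix)
    return feature


# Toy image with a single vertical edge between columns 1 and 2.
image = [[0, 0, 9, 9] for _ in range(4)]
feature = cohog(image)
print(len(feature), feature[0], feature[64])
```

Unlike a plain orientation histogram, each entry here counts a pair of orientations at a fixed relative position, so the descriptor retains some local shape structure.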


Scale Space Co-Occurrence HOG Features for Word Spotting in Handwritten Document Images

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2016070105 ◽

2016 ◽

Vol 6 (2) ◽

pp. 71-86 ◽

Cited By ~ 4

Author(s):

C. Thontadari ◽

C. J. Prabhakar

Keyword(s):

Spatial Information ◽

Scale Parameter ◽

Poor Performance ◽

Scale Space ◽

Feature Descriptor ◽

Word Spotting ◽

Handwritten Documents ◽

Histograms Of Oriented Gradients ◽

The Poor ◽

Handwritten Document

In this paper, the authors propose a Scale Space Co-occurrence Histograms of Oriented Gradients (SS Co-HOG) method for retrieving words from digitized handwritten documents. The poor performance of HOG-based word spotting in handwritten documents stems from the fact that HOG ignores the spatial information of neighboring pixels, whereas Co-HOG captures it by counting the co-occurrences of the gradient orientations of two or more neighboring pixels. The authors represent an image at three scale parameters; at each scale, the word image is divided into blocks, Co-HOG features are extracted from each block, and the block features are concatenated to form a feature descriptor. The proposed method is evaluated using precision and recall metrics in experiments on popular datasets such as IAM and GW, and the results confirm that it outperforms the compared methods on both datasets.
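The multi-scale, block-wise structure described above might be organized as in the sketch below. It only illustrates the control flow (smooth at several scales, split into blocks, extract a feature per block, concatenate). The Gaussian smoothing, the sigma values, the 2x2 block grid, and the stand-in `mean_intensity` block feature are assumptions; in the paper's setting a Co-HOG extractor would take the place of `block_feature`.

```python
import math


def gaussian_blur(image, sigma):
    """Separable Gaussian blur with clamped borders on a 2D list of numbers."""
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]
    height, width = len(image), len(image[0])

    def clamp(value, upper):
        return max(0, min(value, upper))

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[i + radius] * row[clamp(x + i, width - 1)]
                 for i in range(-radius, radius + 1))
             for x in range(width)] for row in image]
    return [[sum(kernel[i + radius] * rows[clamp(y + i, height - 1)][x]
                 for i in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]


def split_blocks(image, grid_rows, grid_cols):
    """Yield sub-images from a grid_rows x grid_cols partition of the image."""
    height, width = len(image), len(image[0])
    for r in range(grid_rows):
        for c in range(grid_cols):
            y0, y1 = r * height // grid_rows, (r + 1) * height // grid_rows
            x0, x1 = c * width // grid_cols, (c + 1) * width // grid_cols
            yield [row[x0:x1] for row in image[y0:y1]]


def scale_space_descriptor(image, block_feature, sigmas=(1.0, 2.0, 4.0), grid=(2, 2)):
    """Concatenate block features computed at each smoothing scale."""
    descriptor = []
    for sigma in sigmas:
        smoothed = gaussian_blur(image, sigma)
        for block in split_blocks(smoothed, *grid):
            descriptor.extend(block_feature(block))
    return descriptor


def mean_intensity(block):
    """Stand-in block feature; a Co-HOG extractor would be plugged in here."""
    values = [v for row in block for v in row]
    return [sum(values) / len(values)]


image = [[5.0] * 8 for _ in range(8)]
descriptor = scale_space_descriptor(image, mean_intensity)
print(len(descriptor))
```

With three scales, four blocks, and a one-value block feature, the descriptor has 12 entries; the length grows in proportion to the per-block feature size.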


Word Spotting Based on Bispace Similarity for Visual Information Retrieval in Handwritten Document Images

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2019070103 ◽

2019 ◽

Vol 9 (3) ◽

pp. 38-58 ◽

Cited By ~ 1

Author(s):

Ryma Benabdelaziz ◽

Djamel Gaceb ◽

Mohammed Haddad

Keyword(s):

Visual Information ◽

Visual Features ◽

Keyword Spotting ◽

Visual Information Retrieval ◽

Word Spotting ◽

Handwritten Documents ◽

Handwritten Document ◽

Image Gradients ◽

Point Detection ◽

Accurate Matching

Retrieving information from a huge collection of ancient handwritten documents is important for indexing, interpreting, browsing, and searching documents in various domains. Word spotting approaches are widely used in this context but have several limitations related to the complex properties of handwriting. These limitations can appear at several steps: interest point detection, description, and matching. This article proposes a new word spotting approach for word retrieval in handwritten documents, which mainly leverages the properties of image gradients for visual feature detection and description. The proposed approach combines spatial relationships with textural information to achieve more accurate matching. The experimental results demonstrate higher performance on the Jeremy Bentham dataset, evaluated following the benchmarks of the ICDAR 2015 Competition on Keyword Spotting for Handwritten Documents.


Bag of Visual Words for Word Spotting in Handwritten Documents Based on Curvature Features

International Journal of Computer Science and Information Technology ◽

10.5121/ijcsit.2017.9406 ◽

2017 ◽

Vol 9 (4) ◽

pp. 77-92

Author(s):

Thontadari C ◽

Prabhakar C.J

Keyword(s):

Bag Of Visual Words ◽

Word Spotting ◽

Handwritten Documents ◽

Visual Words


KLASIFIKASI EKSPRESI WAJAH MENGGUNAKAN BAG OF VISUAL WORDS (Facial Expression Classification Using Bag of Visual Words)

JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING ◽

10.31289/jite.v1i2.1426 ◽

2018 ◽

Vol 1 (2) ◽

pp. 73 ◽

Cited By ~ 4

Author(s):

Muhathir Muhathir

Keyword(s):

Visual Word ◽

Interest Point ◽

Bag Of Visual Words ◽

Visual Words ◽

Speed Up ◽

Speed Up Robust Feature

In essence, humans can distinguish patterns in an object based on a visual form that conveys an emotional state, such as telling apart a person's facial expressions in an image. Humans can distinguish the expression in such an image with the naked eye, whereas a computer cannot recognize it. Bag of visual words is a scheme for classifying images based on the pixel values of the image. Using interest point detection and interest point extraction, bag of visual words captures the distinctive features of an image so that the patterns present in it can be distinguished. Bag of visual words with K = 500 classifies facial expression patterns with an accuracy of 69%.

Keywords: Face, Classification, Speed-up Robust Feature, Bag of visual words, Expression


A segmentation free Word Spotting for handwritten documents

2015 13th International Conference on Document Analysis and Recognition (ICDAR) ◽

10.1109/icdar.2015.7333781 ◽

2015 ◽

Cited By ~ 3

Author(s):

Adam Ghorbel ◽

Jean-Marc Ogier ◽

Nicole Vincent

Keyword(s):

Word Spotting ◽

Handwritten Documents ◽

Free Word


ON THE INFLUENCE OF WORD REPRESENTATIONS FOR HANDWRITTEN WORD SPOTTING IN HISTORICAL DOCUMENTS

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001412630025 ◽

2012 ◽

Vol 26 (05) ◽

pp. 1263002 ◽

Cited By ~ 32

Author(s):

JOSEP LLADÓS ◽

MARÇAL RUSIÑOL ◽

ALICIA FORNÉS ◽

DAVID FERNÁNDEZ ◽

ANJAN DUTTA

Keyword(s):

Structural Model ◽

Historical Data ◽

Historical Documents ◽

Bag Of Visual Words ◽

George Washington ◽

Word Spotting ◽

Visual Words ◽

Advantages And Disadvantages ◽

Word Representation ◽

Word Images

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using DTW as a baseline reference, a bag of visual words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach where words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally give better performance; however, it cannot be neglected that large descriptors are difficult to implement in a retrieval scenario where word spotting requires indexing millions of word images.
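For readers unfamiliar with the DTW baseline mentioned above, the following is a minimal sketch of dynamic time warping between two sequences of feature vectors (for word images, typically one vector per pixel column). The Euclidean local cost, the absence of a warping-window constraint or length normalization, and the toy one-dimensional sequences are assumptions, not details taken from the paper.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best alignment cost of the first i items of seq_a
    # with the first j items of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((a - b) ** 2 for a, b in zip(seq_a[i - 1], seq_b[j - 1])) ** 0.5
            cost[i][j] = local + min(cost[i - 1][j],      # skip an item of seq_a
                                     cost[i][j - 1],      # skip an item of seq_b
                                     cost[i - 1][j - 1])  # match both items
    return cost[n][m]


# Hypothetical column features: the second word is a stretched copy of the query.
query = [(0,), (1,), (2,)]
stretched = [(0,), (1,), (1,), (2,)]
other = [(2,), (2,), (2,)]
same_cost = dtw_distance(query, stretched)
other_cost = dtw_distance(query, other)
print(same_cost, other_cost)
```

The stretched copy aligns at zero cost, which is why DTW tolerates variation in writing width, while the quadratic cost per comparison hints at why fixed-length descriptors are attractive for large-scale indexing.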


Spatial orientations of visual word pairs to improve Bag-of-Visual-Words model

Proceedings of the British Machine Vision Conference 2012 ◽

10.5244/c.26.89 ◽

2012 ◽

Cited By ~ 23

Author(s):

Rahat Khan ◽

Cecile Barat ◽

Damien Muselet ◽

Christophe Ducottet

Keyword(s):

Visual Word ◽

Bag Of Visual Words ◽

Visual Words


Segmentation-Free Word Spotting in Handwritten Documents Using Scale Space Co-HoG Feature Descriptors

Advances in Computational Intelligence and Robotics - Applications of Advanced Machine Intelligence in Computer Vision and Object Recognition ◽

10.4018/978-1-7998-2736-8.ch009 ◽

2020 ◽

pp. 219-247

Author(s):

Prabhakar C. J.

Keyword(s):

Scale Space ◽

Word Segmentation ◽

Literature Survey ◽

Feature Descriptor ◽

Word Spotting ◽

Handwritten Documents ◽

Histograms Of Oriented Gradients ◽

Feature Descriptors ◽

Increase In Accuracy ◽

Free Word

In this chapter, the author presents a segmentation-free word spotting method for handwritten documents using the Scale Space co-occurrence histograms of oriented gradients (Co-HOG) feature descriptor. The chapter begins with an introduction to word spotting, its challenges, and its applications, followed by a review of existing techniques for word spotting in handwritten documents. The literature survey reveals that segmentation-based word spotting methods usually need a layout analysis step for word segmentation, and any segmentation errors can affect the subsequent word representation and matching steps. Hence, to overcome the drawbacks of segmentation-based methods, the author proposes segmentation-free word spotting using the Scale Space Co-HOG feature descriptor. The proposed method is evaluated using mean Average Precision (mAP) in experiments on popular datasets such as GW and IAM. Its performance is compared with existing state-of-the-art segmentation-based and segmentation-free methods, and it shows a considerable increase in accuracy.
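The mean Average Precision metric named above can be computed as in this short sketch. It assumes the common definition in which each query yields a ranked list of binary relevance flags and every relevant item appears somewhere in that list; the function names and toy rankings are illustrative, and the chapter's exact evaluation protocol may differ.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags, best-ranked retrieval result first.
    Assumes all relevant items for the query appear in the list.
    """
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / total_relevant


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)


# Two hypothetical queries: relevant words retrieved at ranks 1 and 3, and at rank 2.
queries = [[1, 0, 1, 0], [0, 1]]
map_score = mean_average_precision(queries)
print(round(map_score, 4))
```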


Bag-of-Features HMMs for Segmentation-Free Word Spotting in Handwritten Documents

2013 12th International Conference on Document Analysis and Recognition ◽

10.1109/icdar.2013.264 ◽

2013 ◽

Cited By ~ 44

Author(s):

Leonard Rothacker ◽

Marcal Rusinol ◽

Gernot A. Fink

Keyword(s):

Word Spotting ◽

Handwritten Documents ◽

Bag Of Features ◽

Free Word
