Enhanced Bags of Visual Words Representation Using Spatial Information

Large-scale image retrieval has been shown remarkable potential in real-life applications. The standard approach is based on Inverted Indexing, given images are represented using Bag-of-Words model. However, one major limitation of both Inverted Index and Bag-of-Words presentation is that they ignore spatial information of visual words in image presentation and comparison. As a result, retrieval accuracy is decreased. In this paper, the authors investigate an approach to integrate spatial information into Inverted Index to improve accuracy while maintaining short retrieval time. Experiments conducted on several benchmark datasets (Oxford Building 5K, Oxford Building 5K+100K and Paris 6K) demonstrate the effectiveness of our proposed approach.

Download Full-text

A Combination of Spatial Pyramid and Inverted Index for Large-Scale Image Retrieval

International Journal of Multimedia Data Engineering and Management ◽

10.4018/ijmdem.2015040103 ◽

2015 ◽

Vol 6 (2) ◽

pp. 37-51 ◽

Cited By ~ 2

Author(s):

Vinh-Tiep Nguyen ◽

Thanh Duc Ngo ◽

Minh-Triet Tran ◽

Duy-Dinh Le ◽

Duc Anh Duong

Keyword(s):

Image Retrieval ◽

Large Scale ◽

Spatial Information ◽

Real Life ◽

Inverted Index ◽

Bag Of Words ◽

Visual Words ◽

Benchmark Datasets ◽

Large Scale Image Retrieval ◽

Inverted Indexing

Large-scale image retrieval has been shown remarkable potential in real-life applications. The standard approach is based on Inverted Indexing, given images are represented using Bag-of-Words model. However, one major limitation of both Inverted Index and Bag-of-Words presentation is that they ignore spatial information of visual words in image presentation and comparison. As a result, retrieval accuracy is decreased. In this paper, the authors investigate an approach to integrate spatial information into Inverted Index to improve accuracy while maintaining short retrieval time. Experiments conducted on several benchmark datasets (Oxford Building 5K, Oxford Building 5K+100K and Paris 6K) demonstrate the effectiveness of our proposed approach.

Download Full-text

A Novel Discriminating and Relative Global Spatial Image Representation with Applications in CBIR

Applied Sciences ◽

10.3390/app8112242 ◽

2018 ◽

Vol 8 (11) ◽

pp. 2242 ◽

Cited By ~ 16

Author(s):

Bushra Zafar ◽

Rehan Ashraf ◽

Nouman Ali ◽

Muhammad Iqbal ◽

Muhammad Sajid ◽

...

Keyword(s):

Image Classification ◽

Spatial Information ◽

Image Representation ◽

Feature Space ◽

Research Problem ◽

Visual Words ◽

Spatial Image ◽

User Query ◽

Novel Approach ◽

Image Representations

The requirement for effective image search, which motivates the use of Content-Based Image Retrieval (CBIR) and the search of similar multimedia contents on the basis of user query, remains an open research problem for computer vision applications. The application domains for Bag of Visual Words (BoVW) based image representations are object recognition, image classification and content-based image analysis. Interest point detectors are quantized in the feature space and the final histogram or image signature do not retain any detail about co-occurrences of features in the 2D image space. This spatial information is crucial, as it adversely affects the performance of an image classification-based model. The most notable contribution in this context is Spatial Pyramid Matching (SPM), which captures the absolute spatial distribution of visual words. However, SPM is sensitive to image transformations such as rotation, flipping and translation. When images are not well-aligned, SPM may lose its discriminative power. This paper introduces a novel approach to encoding the relative spatial information for histogram-based representation of the BoVW model. This is established by computing the global geometric relationship between pairs of identical visual words with respect to the centroid of an image. The proposed research is evaluated by using five different datasets. Comprehensive experiments demonstrate the robustness of the proposed image representation as compared to the state-of-the-art methods in terms of precision and recall values.

Download Full-text

Spatially Enhanced Bags of Visual Words Representation to Improve Traffic Signs Recognition

Journal of Signal Processing Systems ◽

10.1007/s11265-017-1324-9 ◽

2017 ◽

Vol 90 (12) ◽

pp. 1729-1741 ◽

Cited By ~ 1

Author(s):

Lotfi Abdi ◽

Aref Meddeb

Keyword(s):

Traffic Signs ◽

Visual Words ◽

Bags Of Visual Words

Download Full-text

Comparing Local Descriptors and Bags of Visual Words to Deep Convolutional Neural Networks for Plant Recognition

Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods ◽

10.5220/0006196204790486 ◽

2017 ◽

Cited By ~ 30

Author(s):

Pornntiwa Pawara ◽

Emmanuel Okafor ◽

Olarik Surinta ◽

Lambert Schomaker ◽

Marco Wiering

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Deep Convolutional Neural Networks ◽

Local Descriptors ◽

Visual Words ◽

Plant Recognition ◽

Bags Of Visual Words

Download Full-text

Segmentation Free Word Spotting for Handwritten Documents Using Bag of Visual Words Based on Co-HOG Descriptor

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2019040105 ◽

2019 ◽

Vol 9 (2) ◽

pp. 49-65

Author(s):

Thontadari C. ◽

Prabhakar C. J.

Keyword(s):

Visual Information ◽

Spatial Information ◽

Spatial Location ◽

Visual Word ◽

Bag Of Visual Words ◽

Word Spotting ◽

Handwritten Documents ◽

Visual Words ◽

Handwritten Document ◽

Free Word

In this article, the authors propose a segmentation-free word spotting in handwritten document images using a Bag of Visual Words (BoVW) framework based on the co-occurrence histogram of oriented gradient (Co-HOG) descriptor. Initially, the handwritten document is represented using visual word vectors which are obtained based on the frequency of occurrence of Co-HOG descriptor within local patches of the document. The visual word representation vector does not consider their spatial location and spatial information helps to determine a location exclusively with visual information when the different location can be perceived as the same. Hence, to add spatial distribution information of visual words into the unstructured BoVW framework, the authors adopted spatial pyramid matching (SPM) technique. The performance of the proposed method evaluated using popular datasets and it is confirmed that the authors' method outperforms existing segmentation free word spotting techniques.

Download Full-text

Mining Regional Co-Occurrence Patterns for Image Classification

Mathematical Problems in Engineering ◽

10.1155/2018/4945304 ◽

2018 ◽

Vol 2018 ◽

pp. 1-14 ◽

Cited By ~ 1

Author(s):

Zhihang Ji ◽

Sining Wu ◽

Fan Wang ◽

Lijuan Xu ◽

Yan Yang ◽

...

Keyword(s):

Image Classification ◽

Large Scale ◽

Spatial Information ◽

Multiple Kernel Learning ◽

The Novel ◽

Color Information ◽

Local Color ◽

Structure Information ◽

Visual Words ◽

Occurrence Patterns

In the context of image classification, bag-of-visual-words mode is widely used for image representation. In recent years several works have aimed at exploiting color or spatial information to improve the representation. In this paper two kinds of representation vectors, namely, Global Color Co-occurrence Vector (GCCV) and Local Color Co-occurrence Vector (LCCV), are proposed. Both of them make use of the color and co-occurrence information of the superpixels in an image. GCCV describes the global statistical distribution of the colorful superpixels with embedding the spatial information between them. By this way, it is capable of capturing the color and structure information in large scale. Unlike the GCCV, LCCV, which is embedded in the Riemannian manifold space, reflects the color information within the superpixels in detail. It records the higher-order distribution of the color between the superpixels within a neighborhood by aggregating the co-occurrence information in the second-order pooling way. In the experiment, we incorporate the two proposed representation vectors with feature vector like LLC or CNN by Multiple Kernel Learning (MKL) technology. Several challenging datasets for visual classification are tested on the novel framework, and experimental results demonstrate the effectiveness of the proposed method.

Download Full-text