Improvement of the Bag of Words Image Representation Using Spatial Information

Author(s):  
Mohammad Mehdi Farhangi ◽  
Mohsen Soryani ◽  
Mahmood Fathy
2019 ◽  
Vol 7 (4) ◽  
Author(s):  
Noha Elfiky

The Bag-of-Words (BoW) approach has been successfully applied in the context of category-level image classification. To incorporate spatial image information into the BoW model, Spatial Pyramids (SPs) are used. However, spatial pyramids are rigid in nature and are based on pre-defined grid configurations. As a consequence, they often fail to coincide with the underlying spatial structure of images from different categories, which may negatively affect the classification accuracy. The aim of the paper is to use the 3D scene geometry to steer the layout of spatial pyramids for category-level image classification (object recognition). The proposed approach provides an image representation by inferring the constituent geometrical parts of a scene. As a result, the image representation retains the descriptive spatial information needed to yield a structural description of the image. From large-scale experiments on Pascal VOC2007 and Caltech101, it can be concluded that the proposed Generic SPs outperform the standard SPs.
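Below is a minimal sketch of the fixed-grid spatial pyramid pooling that the abstract takes as its baseline, using only NumPy. The geometry-steered partition proposed as Generic SPs would replace the fixed grid, but its exact construction is not reproduced here; the function name, parameters and toy data are illustrative assumptions.

```python
# Baseline spatial-pyramid BoW pooling (fixed grids), NOT the proposed Generic SPs.
import numpy as np

def spatial_pyramid_histogram(keypoints_xy, visual_words, image_size,
                              vocab_size, levels=(1, 2, 4)):
    """Concatenate per-cell BoW histograms for each pyramid level.

    keypoints_xy : (N, 2) array of (x, y) feature locations
    visual_words : (N,) array of visual-word indices in [0, vocab_size)
    image_size   : (width, height) of the image
    levels       : grid sizes, e.g. 1x1, 2x2 and 4x4 cells
    """
    w, h = image_size
    parts = []
    for g in levels:
        # Assign every keypoint to a grid cell at this pyramid level.
        cell_x = np.minimum((keypoints_xy[:, 0] / w * g).astype(int), g - 1)
        cell_y = np.minimum((keypoints_xy[:, 1] / h * g).astype(int), g - 1)
        cell_id = cell_y * g + cell_x
        for c in range(g * g):
            hist = np.bincount(visual_words[cell_id == c], minlength=vocab_size)
            parts.append(hist)
    full = np.concatenate(parts).astype(float)
    return full / max(full.sum(), 1.0)   # L1-normalise the final signature

# Toy usage with random features (illustration only).
rng = np.random.default_rng(0)
xy = rng.uniform(0, 1, (500, 2)) * [640, 480]
words = rng.integers(0, 200, 500)
print(spatial_pyramid_histogram(xy, words, (640, 480), 200).shape)  # (4200,)
```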


2018 ◽  
pp. 1307-1321
Author(s):  
Vinh-Tiep Nguyen ◽  
Thanh Duc Ngo ◽  
Minh-Triet Tran ◽  
Duy-Dinh Le ◽  
Duc Anh Duong

Large-scale image retrieval has shown remarkable potential in real-life applications. The standard approach is based on Inverted Indexing, where images are represented using the Bag-of-Words model. However, one major limitation of both the Inverted Index and the Bag-of-Words representation is that they ignore the spatial information of visual words during image representation and comparison. As a result, retrieval accuracy is reduced. In this paper, the authors investigate an approach to integrate spatial information into the Inverted Index to improve accuracy while maintaining a short retrieval time. Experiments conducted on several benchmark datasets (Oxford Building 5K, Oxford Building 5K+100K and Paris 6K) demonstrate the effectiveness of the proposed approach.
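As a rough illustration of the idea in this abstract, the sketch below augments each inverted-index posting with keypoint coordinates so that a cheap geometric consistency check can re-rank BoW-scored candidates. The class name, the 32-pixel translation bins and the toy data are assumptions, not the authors' actual spatial encoding.

```python
# An inverted index whose postings carry keypoint positions, enabling a weak
# spatial-consistency re-ranking on top of plain visual-word voting.
from collections import defaultdict, Counter

class SpatialInvertedIndex:
    def __init__(self):
        # visual word id -> list of (image_id, x, y) postings
        self.postings = defaultdict(list)

    def add_image(self, image_id, features):
        """features: iterable of (visual_word, x, y)."""
        for word, x, y in features:
            self.postings[word].append((image_id, x, y))

    def query(self, features, top_k=5):
        """Rank images by shared visual words, breaking ties by how consistent
        the displacements of matched words are (a crude geometric check)."""
        votes = Counter()
        matches = defaultdict(list)        # image_id -> [(dx, dy), ...]
        for word, qx, qy in features:
            for image_id, x, y in self.postings.get(word, []):
                votes[image_id] += 1
                matches[image_id].append((x - qx, y - qy))

        def spatial_score(image_id):
            # Size of the most popular coarse translation bin (32-pixel cells).
            bins = Counter((round(dx / 32), round(dy / 32))
                           for dx, dy in matches[image_id])
            return bins.most_common(1)[0][1]

        ranked = sorted(votes, key=lambda i: (votes[i], spatial_score(i)),
                        reverse=True)
        return ranked[:top_k]

# Toy usage (illustration only).
index = SpatialInvertedIndex()
index.add_image("db_img_1", [(3, 10, 10), (7, 50, 40), (9, 90, 80)])
index.add_image("db_img_2", [(3, 200, 10), (5, 20, 30)])
print(index.query([(3, 12, 11), (7, 52, 44), (9, 91, 78)]))
```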


2018 ◽  
Vol 8 (11) ◽  
pp. 2242 ◽  
Author(s):  
Bushra Zafar ◽  
Rehan Ashraf ◽  
Nouman Ali ◽  
Muhammad Iqbal ◽  
Muhammad Sajid ◽  
...  

The requirement for effective image search, which motivates the use of Content-Based Image Retrieval (CBIR) and the search of similar multimedia contents on the basis of a user query, remains an open research problem for computer vision applications. The application domains for Bag of Visual Words (BoVW) based image representations are object recognition, image classification and content-based image analysis. Features returned by interest point detectors are quantized in the feature space, so the final histogram or image signature does not retain any detail about the co-occurrences of features in the 2D image space. This spatial information is crucial, as its loss adversely affects the performance of image classification models. The most notable contribution in this context is Spatial Pyramid Matching (SPM), which captures the absolute spatial distribution of visual words. However, SPM is sensitive to image transformations such as rotation, flipping and translation. When images are not well aligned, SPM may lose its discriminative power. This paper introduces a novel approach to encoding the relative spatial information for the histogram-based representation of the BoVW model. This is established by computing the global geometric relationship between pairs of identical visual words with respect to the centroid of an image. The proposed research is evaluated using five different datasets. Comprehensive experiments demonstrate the robustness of the proposed image representation compared to state-of-the-art methods in terms of precision and recall values.
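The following sketch is one plausible reading of the pairwise encoding described above: for every pair of keypoints assigned to the same visual word, the angle of the joining vector is measured in a frame centred on the image centroid and accumulated into a per-word angle histogram. The binning, normalisation and all names are assumptions rather than the authors' exact formulation.

```python
# One interpretation of relative spatial encoding over pairs of identical
# visual words, referenced to the image centroid.
import numpy as np

def relative_spatial_histogram(keypoints_xy, visual_words, vocab_size,
                               angle_bins=8):
    centroid = keypoints_xy.mean(axis=0)
    hist = np.zeros((vocab_size, angle_bins))
    for w in np.unique(visual_words):
        pts = keypoints_xy[visual_words == w]
        if len(pts) < 2:
            continue
        # Vectors from the centroid to each occurrence of visual word w.
        rel = pts - centroid
        for i in range(len(rel)):
            for j in range(i + 1, len(rel)):
                dx, dy = rel[j] - rel[i]
                angle = np.arctan2(dy, dx) % np.pi          # undirected pair
                b = min(int(angle / np.pi * angle_bins), angle_bins - 1)
                hist[w, b] += 1
    flat = hist.ravel()
    return flat / max(flat.sum(), 1.0)

# Toy usage (illustration only).
rng = np.random.default_rng(1)
xy = rng.uniform(0, 1, (300, 2)) * [640, 480]
words = rng.integers(0, 50, 300)
print(relative_spatial_histogram(xy, words, 50).shape)      # (400,)
```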


2016 ◽  
pp. 81-87
Author(s):  
Xiang Xu ◽  
Xingkun Wu ◽  
Feng Lin

Author(s):  
Wangbin Chu ◽  
Yepeng Guan

Face-based identity verification is one of the fundamental topics in image processing and video analysis, and it still poses many challenges. A novel approach has been developed for facial identity verification based on a facial pose pool, which is constructed by incremental clustering to capture both facial spatial information and orientation diversity. Bag of Words is used to extract image features from the facial pose pool using the affine SIFT descriptor. The visual codebook is generated with k-means and a Gaussian mixture model. Posterior pseudo probabilities are used to compute the similarities between each visual word and the corresponding local features for image representation. Comparisons with some state-of-the-art methods highlight the superior performance of the proposed method.
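As a hedged illustration of the soft-assignment step in this abstract, the sketch below learns a Gaussian mixture codebook and represents an image by the averaged posterior (pseudo-)probabilities of its local descriptors under each visual word. Scikit-learn's GaussianMixture and random vectors stand in for the authors' codebook training and affine SIFT descriptors.

```python
# GMM codebook + posterior soft assignment as an image representation (sketch).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Train the codebook on descriptors pooled from training images
# (random 128-D vectors stand in for affine SIFT descriptors).
train_descriptors = rng.normal(size=(5000, 128))
codebook = GaussianMixture(n_components=64, covariance_type="diag",
                           random_state=0).fit(train_descriptors)

def image_representation(descriptors, codebook):
    """Average posterior responsibilities over an image's local descriptors."""
    posteriors = codebook.predict_proba(descriptors)   # (N, n_components)
    rep = posteriors.mean(axis=0)
    return rep / max(rep.sum(), 1e-12)

# Toy usage (illustration only).
query_descriptors = rng.normal(size=(200, 128))
print(image_representation(query_descriptors, codebook).shape)   # (64,)
```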

