Bag-of-words image representation based on classified vector quantization

Author(s):  
Xu Yang ◽  
De Xu ◽  
Ying-Jian Qi
2019 ◽  
Vol 7 (4) ◽  
Author(s):  
Noha Elfiky

The Bag-of-Words (BoW) approach has been successfully applied to category-level image classification. To incorporate spatial information into the BoW model, Spatial Pyramids (SPs) are used. However, spatial pyramids are rigid in nature and based on pre-defined grid configurations. As a consequence, they often fail to coincide with the underlying spatial structure of images from different categories, which may negatively affect classification accuracy. The aim of this paper is to use the 3D scene geometry to steer the layout of spatial pyramids for category-level image classification (object recognition). The proposed approach provides an image representation by inferring the constituent geometrical parts of a scene. As a result, the representation retains descriptive spatial information, yielding a structural description of the image. Large-scale experiments on Pascal VOC2007 and Caltech101 show that the proposed Generic SPs outperform the standard SPs.
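For reference, the standard (rigid, grid-based) spatial pyramid that the paper improves upon can be sketched as follows. This is a minimal illustration of conventional SP pooling over BoW assignments, not the paper's geometry-steered variant; the function name and the uniform-grid layout are assumptions for the sketch.

```python
import numpy as np

def spatial_pyramid_histogram(points, words, vocab_size, levels=2):
    """Concatenate per-cell BoW histograms over a 2^l x 2^l grid at each level.

    points: (N, 2) array of keypoint (x, y) coords normalized to [0, 1).
    words:  (N,) array of visual-word indices in [0, vocab_size).
    """
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Map each keypoint to its grid cell at this pyramid level.
        cx = np.minimum((points[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((points[:, 1] * cells).astype(int), cells - 1)
        for i in range(cells):
            for j in range(cells):
                mask = (cx == i) & (cy == j)
                hist = np.bincount(words[mask], minlength=vocab_size)
                feats.append(hist)
    return np.concatenate(feats)
```

Because the grid is pre-defined and identical for every image, a cell boundary can split a coherent scene region — exactly the rigidity the geometry-steered layout is designed to avoid.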


2016 ◽  
pp. 81-87
Author(s):  
Xiang Xu ◽  
Xingkun Wu ◽  
Feng Lin

2014 ◽  
Vol 8 (5) ◽  
pp. 310-318 ◽  
Author(s):  
Mohammad Mehdi Farhangi ◽  
Mohsen Soryani ◽  
Mahmood Fathy

2013 ◽  
Vol 830 ◽  
pp. 485-489
Author(s):  
Shu Fang Wu ◽  
Jie Zhu ◽  
Zhao Feng Zhang

Combining multiple visual cues, such as shape and color, is a challenging task in object recognition. Intuitively, the more complementary cues are considered, the better the recognition result should be. The bag-of-words image representation is one of the most relevant approaches, and many feature fusion methods build on this model. Sparse coding has attracted considerable attention in many domains. A novel sparse feature fusion algorithm is proposed to fuse multiple cues into a single image representation. Experimental results show good performance of the proposed algorithm.
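The abstract does not specify the fusion rule, but one common way to fuse descriptors with sparse coding is to concatenate the modality descriptors and encode the result against a joint dictionary. The sketch below is a hypothetical illustration of that idea using a plain ISTA solver; the dictionary, regularization weight, and function names are all assumptions, not the paper's method.

```python
import numpy as np

def ista_sparse_code(x, D, lam=0.1, n_iter=100):
    """Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 by iterative soft thresholding."""
    L = np.linalg.norm(D, ord=2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)               # gradient of the quadratic term
        a = a - grad / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft threshold
    return a

def fuse_features(shape_feat, color_feat, D, lam=0.1):
    """Concatenate modality descriptors, then sparse-code against a joint dictionary."""
    x = np.concatenate([shape_feat, color_feat])
    return ista_sparse_code(x, D, lam)
```

The sparse code then serves as the fused image representation, with the L1 penalty keeping only the dictionary atoms that jointly explain both cues.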


2019 ◽  
Author(s):  
Takuma Sugimoto ◽  
Kousuke Yamaguchi ◽  
Kanji Tanaka

In this paper, we present a new fault diagnosis (FD)-based approach to image change detection that detects significant changes as inconsistencies between different sub-modules (e.g., self-localization) of visual SLAM. Unlike classical change detection approaches such as pairwise image comparison (PC) and anomaly detection (AD), this FD approach requires neither the memorization of each map image nor the maintenance of up-to-date place-specific anomaly detectors. A significant challenge when incorporating different SLAM sub-modules into FD is dealing with the varying scales of changed objects (e.g., the appearance of small dangerous obstacles on the floor). To address this issue, we reconsider the bag-of-words (BoW) image representation, exploiting its recent advances in self-localization and change detection. As a key advantage, the BoW image representation can be reorganized at any scale by simply cropping the original BoW image. Furthermore, we propose to combine self-localization modules built on strong and weak BoW features of differing discriminativity, and to treat inconsistency between strong and weak self-localization as an indicator of change. The efficacy of the proposed FD approach, with and without combining AD and/or PC, was validated experimentally.
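The core inconsistency test can be sketched abstractly: run two BoW-based self-localizers (one on strong, one on weak features) and flag a change when they disagree about which map image matches the query. This is a minimal sketch under strong simplifying assumptions; BoW histograms arrive pre-computed, cosine similarity stands in for the actual retrieval engine, and the function names are invented for illustration.

```python
import numpy as np

def nearest_map_image(query_hist, map_hists):
    """Return the index of the most similar map image by cosine similarity."""
    q = query_hist / (np.linalg.norm(query_hist) + 1e-12)
    M = map_hists / (np.linalg.norm(map_hists, axis=1, keepdims=True) + 1e-12)
    return int(np.argmax(M @ q))

def change_detected(strong_query, weak_query, strong_map, weak_map):
    """Treat disagreement between the strong and weak localizers as a change cue."""
    return nearest_map_image(strong_query, strong_map) != \
           nearest_map_image(weak_query, weak_map)
```

The intuition is that a genuine scene change perturbs the weakly discriminative features more than the strong ones, so the two localizers stop agreeing exactly when something has changed.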


2014 ◽  
pp. 29-52 ◽  
Author(s):  
Marc T. Law ◽  
Nicolas Thome ◽  
Matthieu Cord

2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Jian Hou ◽  
Wei-Xue Liu ◽  
Xu E ◽  
Hamid Reza Karimi

Bag-of-visual-words has been shown to be a powerful image representation and has attained great success in many computer vision and pattern recognition applications. Usually, for a given dataset, researchers build a dataset-specific visual vocabulary, and the problem of deriving a universal visual vocabulary is rarely addressed. Based on previous work on classification performance as a function of vocabulary size, we arrive at the hypothesis that a universal visual vocabulary can be obtained by taking into account the extent of similarity among the keypoints represented by one visual word. We then propose a similarity threshold-based clustering method to calculate the optimal vocabulary size, where the universal similarity threshold is obtained empirically. With the optimal vocabulary size, the optimal visual vocabularies of limited size from three datasets are shown to be exchangeable and therefore universal. This result indicates that a universal and compact visual vocabulary can be built from a dataset that is not too small. Our work narrows the gap between bag-of-visual-words and bag-of-words, where a relatively fixed vocabulary can be used across different text datasets.
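A similarity threshold-based clustering of the kind described can be sketched with a simple leader-clustering pass: a descriptor joins an existing visual word only if it is close enough to that word's center, and otherwise founds a new word, so the number of clusters (the vocabulary size) emerges from the threshold rather than being fixed in advance. The distance threshold below is illustrative, not the empirically derived universal value from the paper.

```python
import numpy as np

def threshold_cluster(descriptors, max_dist):
    """One-pass leader clustering: vocabulary size = number of centers returned."""
    centers = []
    counts = []
    for d in descriptors:
        if centers:
            dists = np.linalg.norm(np.array(centers) - d, axis=1)
            j = int(np.argmin(dists))
            if dists[j] <= max_dist:
                # Fold the descriptor into the matched word's running mean.
                counts[j] += 1
                centers[j] = centers[j] + (d - centers[j]) / counts[j]
                continue
        centers.append(d.astype(float))   # descriptor founds a new visual word
        counts.append(1)
    return np.array(centers)
```

With a fixed, dataset-independent threshold, the same procedure run on different datasets yields vocabularies of comparable granularity, which is what makes the resulting vocabularies candidates for exchange.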

