FASTER SIMILARITY SEARCH FOR MULTIMEDIA DATA VIA QUERY TRANSFORMATIONS

The performance of nearest neighbor (NN) queries degrades noticeably with increasing dimensionality of the data due to reduced selectivity of high-dimensional data and an increased number of seek operations during NN-query execution. If the NN-radii were known in advance, the disk accesses could be reordered such that seek operations are minimized. We therefore propose a new way of estimating the NN-radius based on the fractal dimensionality and sampling. It is applicable to any page-based index structure. We show that the estimation error is considerably lower than for previous approaches. In the second part of the paper, we present two applications of this technique. We show how the radius estimations can be used to transform k-NN queries into at most two range queries, and how they can be used to reduce the number of page reads during all-NN queries. In both cases, we observe significant speedups over traditional techniques for synthetic and real-world data.
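
The idea of answering a k-NN query with at most two range queries can be illustrated with a minimal sketch. It is not the paper's method: the radius estimate is simply passed in as a parameter (the fractal-dimension and sampling estimator is not reproduced here), a linear scan stands in for a page-based index, and the fallback to an unbounded second range query is an assumption made to keep the example short. The names `range_query` and `knn_via_range_queries` are hypothetical.

```python
import math


def range_query(points, query, radius):
    """Return (distance, point) pairs within `radius` of `query`.

    A linear scan stands in for a page-based index (assumption).
    """
    hits = []
    for point in points:
        distance = math.dist(point, query)
        if distance <= radius:
            hits.append((distance, point))
    return hits


def knn_via_range_queries(points, query, k, estimated_radius):
    """Answer a k-NN query with at most two range queries.

    Returns (neighbors, number_of_range_queries_issued).
    """
    # First range query uses the estimated k-NN radius.
    hits = range_query(points, query, estimated_radius)
    queries_issued = 1
    if len(hits) < k:
        # The estimate was too small. As a simplifying assumption, the
        # second query uses an unbounded radius so it cannot miss a neighbor.
        hits = range_query(points, query, math.inf)
        queries_issued = 2
    hits.sort(key=lambda pair: pair[0])
    return [point for _, point in hits[:k]], queries_issued


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (5.0, 5.0)]
    print(knn_via_range_queries(data, (0.0, 0.0), 2, 1.5))
    print(knn_via_range_queries(data, (0.0, 0.0), 2, 0.5))
```

A good estimate lets the first range query suffice, while an underestimate costs exactly one additional query. An overestimate is still correct but returns more candidates than needed, which is why the accuracy of the radius estimate matters.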

Parallel High-Dimensional Index Structure Using Cell-Based Filtering for Multimedia Data

Frontiers of High Performance Computing and Networking – ISPA 2006 Workshops - Lecture Notes in Computer Science ◽

10.1007/11942634_80 ◽

2006 ◽

pp. 781-790 ◽

Cited By ~ 1

Author(s):

Jae-Woo Chang ◽

Yong-Ki Kim ◽

Young-Jin Kim

Keyword(s):

Multimedia Data ◽

Index Structure ◽

High Dimensional

The GC-tree: a high-dimensional index structure for similarity search in image databases

IEEE Transactions on Multimedia ◽

10.1109/tmm.2002.1017736 ◽

2002 ◽

Vol 4 (2) ◽

pp. 235-247 ◽

Cited By ~ 32

Author(s):

Guang-Ho Cha ◽

Chin-Wan Chung

Keyword(s):

Similarity Search ◽

Image Databases ◽

Index Structure ◽

High Dimensional

A cell-based index structure for similarity search in high-dimensional feature spaces

Proceedings of the 2001 ACM symposium on Applied computing - SAC '01 ◽

10.1145/372202.372338 ◽

2001 ◽

Cited By ~ 1

Author(s):

Kwang-Taek Song ◽

Hwa-Jin Nam ◽

Jae-Woo Chang

Keyword(s):

Similarity Search ◽

Index Structure ◽

High Dimensional ◽

Feature Spaces ◽

A Cell

Effectiveness of NAQ-tree as index structure for similarity search in high-dimensional metric space

Knowledge and Information Systems ◽

10.1007/s10115-008-0190-y ◽

2009 ◽

Vol 22 (1) ◽

pp. 1-26 ◽

Cited By ~ 17

Author(s):

Ming Zhang ◽

Reda Alhajj

Keyword(s):

Metric Space ◽

Similarity Search ◽

Index Structure ◽

High Dimensional

SPY-TEC+ : AN INTEGRATED INDEX STRUCTURE FOR k-NEAREST NEIGHBOR QUERIES WITH SEMANTIC PREDICATES IN MULTIMEDIA DATABASE

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194011005529 ◽

2011 ◽

Vol 21 (07) ◽

pp. 989-1011

Author(s):

DONG-JOO PARK ◽

DONG-HO LEE

Keyword(s):

Visual Information ◽

Nearest Neighbor ◽

High Dimensional Data ◽

Multimedia Retrieval ◽

Index Structure ◽

High Dimensional ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Integrated Index ◽

Nearest Neighbor Queries

Recently, advanced multimedia applications, such as geographic information systems and content-based multimedia retrieval systems, require the efficient processing of k-nearest neighbor queries over large collections of multimedia objects. These queries usually include semantic information that is represented by text, as well as visual information that is represented by a high-dimensional feature vector. Among the available techniques for processing such queries, the incremental nearest neighbor algorithm proposed by Hjaltason and Samet is known as the best choice. However, the R-tree used in their algorithm has no facility for pruning, even partially, the candidate tuples that will turn out not to satisfy the semantic predicate. Also, the R-tree does not perform sufficiently well on high-dimensional data even though it provides good results on low- or medium-dimensional data. These drawbacks may lead to poor performance when processing the query. In this paper, we propose an integrated index structure, called SPY-TEC+, that provides an efficient method for indexing the visual and semantic features at the same time by combining the SPY-TEC, which was proposed for indexing high-dimensional data, with a signature file. We also propose an efficient incremental nearest neighbor algorithm for processing k-nearest neighbor queries with visual and semantic predicates on the SPY-TEC+. Finally, we show through various experiments that the SPY-TEC+ enhances the performance of the SPY-TEC for processing k-nearest neighbor queries with visual and semantic predicates.
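
The combination of signature-based pruning with distance-ordered retrieval can be sketched roughly as follows. This is an illustration under stated assumptions, not the SPY-TEC+ structure or the Hjaltason and Samet algorithm: a flat list and a priority queue stand in for the hierarchical index, signatures are 64-bit superimposed codes with one CRC32-derived bit per keyword, and the names `signature` and `knn_with_predicate` are hypothetical.

```python
import heapq
import math
import zlib

SIGNATURE_BITS = 64  # assumed signature width


def signature(keywords):
    """Superimposed-coding signature: OR together one bit per keyword."""
    sig = 0
    for word in keywords:
        sig |= 1 << (zlib.crc32(word.encode("utf-8")) % SIGNATURE_BITS)
    return sig


def knn_with_predicate(objects, query_vector, required_keywords, k):
    """Return indices of the k nearest objects containing all required keywords.

    `objects` is a list of (feature_vector, keyword_set) pairs. A real
    system would store precomputed signatures; here they are computed on
    the fly to keep the sketch short.
    """
    required = set(required_keywords)
    query_sig = signature(required)
    heap = []
    for index, (vector, keywords) in enumerate(objects):
        # Signature test: cheap pruning that never drops a true match
        # but may let false positives through.
        if (signature(keywords) & query_sig) != query_sig:
            continue
        heapq.heappush(heap, (math.dist(vector, query_vector), index))
    results = []
    # Pop candidates in increasing distance and stop as soon as k verified
    # matches are found, so distant candidates are never verified.
    while heap and len(results) < k:
        _, index = heapq.heappop(heap)
        if required <= set(objects[index][1]):  # exact check removes false positives
            results.append(index)
    return results


if __name__ == "__main__":
    catalog = [
        ((0.0, 0.0), {"park"}),
        ((1.0, 0.0), {"lake", "park"}),
        ((0.5, 0.0), {"lake"}),
        ((3.0, 0.0), {"lake"}),
    ]
    print(knn_with_predicate(catalog, (0.0, 0.0), {"lake"}, 2))
```

The signature test only narrows the candidate set. Because different keywords can map to the same bit, the exact keyword check after each pop is still needed for correctness.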
