MLR-Index: An Index Structure for Fast and Scalable Similarity Search in High Dimensions

In high-dimensional spaces, supporting both exact-match and similarity search at low computing and storage cost remains a difficult research problem, and there is a trade-off between efficiency and accuracy. In this paper, we propose a new structure, Similar-PBF-PHT, to represent the items of a high-dimensional set and to retrieve exact and similar items. The Similar-PBF-PHT contains three parts: parallel Bloom filters (PBFs), parallel hash tables (PHTs), and a bit matrix. Experiments show that the Similar-PBF-PHT is effective for membership queries and K-nearest-neighbor (K-NN) search. For exact-match queries, the Similar-PBF-PHT has a low false positive probability (FPP) and acceptable memory cost. For K-NN queries, the average overall ratio and rank-i ratio are accurate under the Hamming distance and acceptable under the Euclidean distance. Retrieving exact and similar items costs CPU time rather than I/O time, and the structure can handle data formats other than numerical values.
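As a rough illustration of the membership-query building block mentioned in this abstract, the following is a minimal sketch of a single, plain Bloom filter. It is not the Similar-PBF-PHT itself: the abstract does not say how the filters are arranged in parallel or how they interact with the hash tables and bit matrix. The class name `BloomFilter`, its parameters `num_bits` and `num_hashes`, and the use of SHA-256 with a seed byte are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter (illustrative only, not the paper's structure)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # num_bits and num_hashes are hypothetical defaults for this sketch.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the code simple; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by hashing the item with different seeds.
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present",
        # which is where the false positive probability comes from.
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter()
for vector in [(1, 0, 1, 1), (0, 0, 1, 0), (1, 1, 1, 1)]:
    bloom.add(vector)
```

An inserted item is always reported as possibly present, since a Bloom filter has no false negatives. An item that was never inserted is usually rejected, but may occasionally be accepted.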

Effectiveness of NAQ-tree as index structure for similarity search in high-dimensional metric space

Knowledge and Information Systems ◽

10.1007/s10115-008-0190-y ◽

2009 ◽

Vol 22 (1) ◽

pp. 1-26 ◽

Cited By ~ 17

Author(s):

Ming Zhang ◽

Reda Alhajj

Keyword(s):

Metric Space ◽

Similarity Search ◽

Index Structure ◽

High Dimensional

An Adaptive Index Structure for High-Dimensional Similarity Search

Advances in Multimedia Information Processing — PCM 2001 - Lecture Notes in Computer Science ◽

10.1007/3-540-45453-5_10 ◽

2001 ◽

pp. 71-78 ◽

Cited By ~ 3

Author(s):

P. Wu ◽

B. S. Manjunath ◽

S. Chandrasekaran

Keyword(s):

Similarity Search ◽

Index Structure ◽

High Dimensional ◽

Adaptive Index

AdaIndex: An Adaptive Index Structure for Fast Similarity Search in Metric Spaces

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-642-10684-2_81 ◽

2009 ◽

pp. 729-737

Author(s):

Tao Ban ◽

Shanqing Guo ◽

Qiuliang Xu ◽

Youki Kadobayashi

Keyword(s):

Similarity Search ◽

Metric Spaces ◽

Index Structure ◽

Adaptive Index

Faster Similarity Search for Multimedia Data via Query Transformations

International Journal of Image and Graphics ◽

10.1142/s0219467803000890 ◽

2003 ◽

Vol 03 (01) ◽

pp. 3-29

Author(s):

Christian A. Lang ◽

Ambuj K. Singh

Keyword(s):

Similarity Search ◽

Nearest Neighbor ◽

Estimation Error ◽

Fractal Dimensionality ◽

Multimedia Data ◽

Index Structure ◽

High Dimensional ◽

Range Queries ◽

Query Execution ◽

Real World Data

The performance of nearest neighbor (NN) queries degrades noticeably with increasing dimensionality of the data due to reduced selectivity of high-dimensional data and an increased number of seek operations during NN-query execution. If the NN-radii were known in advance, the disk accesses could be reordered so that seek operations are minimized. We therefore propose a new way of estimating the NN-radius based on the fractal dimensionality and sampling. It is applicable to any page-based index structure. We show that the estimation error is considerably lower than for previous approaches. In the second part of the paper, we present two applications of this technique. We show how the radius estimates can be used to transform k-NN queries into at most two range queries, and how they can be used to reduce the number of page reads during all-NN queries. In both cases, we observe significant speedups over traditional techniques for synthetic and real-world data.
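The idea of answering a k-NN query with at most two range queries can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the radius estimate is passed in as a plain number rather than derived from fractal dimensionality and sampling, the range query is a brute-force scan instead of a page-based index lookup, and the fallback radius is simply the distance to the farthest point. The function names `range_query` and `knn_via_range_queries` are hypothetical.

```python
import heapq
import math


def range_query(points, query, radius):
    """Brute-force stand-in for an index range query (assumption)."""
    return [p for p in points if math.dist(p, query) <= radius]


def knn_via_range_queries(points, query, k, estimated_radius):
    """Return (k nearest points, number of range queries issued).

    Assumes `points` is non-empty. The first range query uses the
    estimated k-NN radius; a second one runs only if the estimate
    was too small to capture k points.
    """
    candidates = range_query(points, query, estimated_radius)
    queries_used = 1
    if len(candidates) < k:
        # Fallback radius chosen for this sketch only: large enough to
        # cover every point, so the second query cannot miss a neighbor.
        fallback_radius = max(math.dist(p, query) for p in points)
        candidates = range_query(points, query, fallback_radius)
        queries_used = 2
    nearest = heapq.nsmallest(k, candidates, key=lambda p: math.dist(p, query))
    return nearest, queries_used


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
good_estimate = knn_via_range_queries(points, (0.0, 0.0), k=2, estimated_radius=1.5)
low_estimate = knn_via_range_queries(points, (0.0, 0.0), k=2, estimated_radius=0.5)
```

With a sufficiently large estimate, one range query is enough. With an underestimate, the second query recovers the same neighbors at the cost of a larger scan, which is why a low estimation error matters.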

An Efficient High-Dimensional Index Structure Using Cell Signatures for Similarity Search

Advances in Web-Age Information Management - Lecture Notes in Computer Science ◽

10.1007/3-540-47714-4_3 ◽

2001 ◽

pp. 26-33

Author(s):

Jae-Woo Chang ◽

Kwang-Taek Song

Keyword(s):

Similarity Search ◽

Index Structure ◽

High Dimensional
