Fair near neighbor search via sampling

Similarity search is a fundamental algorithmic primitive, widely used in many computer science disciplines. Given a set of points S and a radius parameter r > 0, the r-near neighbor (r-NN) problem asks for a data structure that, given any query point q, returns a point p within distance at most r from q. In this paper, we study the r-NN problem in the light of individual fairness and providing equal opportunities: all points that are within distance r from the query should have the same probability to be returned. In the low-dimensional case, this problem was first studied by Hu, Qiao, and Tao (PODS 2014). Locality sensitive hashing (LSH), the theoretically strongest approach to similarity search in high dimensions, does not provide such a fairness guarantee.
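
The fairness requirement stated in the abstract can be illustrated with a naive linear-scan baseline. The sketch below is not the data structure studied in the paper; it assumes Euclidean distance and points stored as coordinate tuples, and the function name `fair_near_neighbor` is hypothetical. It uses reservoir sampling of size one, so each of the k points within distance r of the query is returned with probability 1/k.

```python
import math
import random
from typing import Iterable, Optional, Sequence

Point = Sequence[float]


def fair_near_neighbor(
    points: Iterable[Point],
    query: Point,
    r: float,
    rng: Optional[random.Random] = None,
) -> Optional[Point]:
    """Return a uniformly random point within distance r of query, or None.

    Illustrative baseline only: it scans every point, so a query costs
    time linear in the size of the data set.
    """
    rng = rng or random.Random()
    chosen = None
    seen = 0  # number of near points encountered so far
    for p in points:
        if math.dist(p, query) <= r:
            seen += 1
            # Replace the current choice with probability 1/seen, which
            # leaves every near point equally likely after the scan.
            if rng.randrange(seen) == 0:
                chosen = p
    return chosen
```

The scan touches every point on each query, which is the cost a sublinear-time structure for high dimensions would aim to avoid while keeping the same uniform output distribution.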

Fast Locality-Sensitive Hashing Frameworks for Approximate Near Neighbor Search

10.1007/978-3-030-32047-8_1 ◽

2019 ◽

pp. 3-17

Author(s):

Tobias Christiani

Keyword(s):

Near Neighbor ◽

Locality Sensitive Hashing ◽

Neighbor Search

Fair Near Neighbor Search: Independent Range Sampling in High Dimensions

Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems ◽

10.1145/3375395.3387648 ◽

2020 ◽

Author(s):

Martin Aumüller ◽

Rasmus Pagh ◽

Francesco Silvestri

Keyword(s):

Near Neighbor ◽

High Dimensions ◽

Neighbor Search

Distributed similarity search in high dimensions using locality sensitive hashing

Proceedings of the 12th International Conference on Extending Database Technology Advances in Database Technology - EDBT '09 ◽

10.1145/1516360.1516446 ◽

2009 ◽

Cited By ~ 49

Author(s):

Parisa Haghani ◽

Sebastian Michel ◽

Karl Aberer

Keyword(s):

Similarity Search ◽

Locality Sensitive Hashing ◽

High Dimensions

Kernel Density Estimation through Density Constrained Near Neighbor Search

2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS) ◽

10.1109/focs46700.2020.00025 ◽

2020 ◽

Author(s):

Moses Charikar ◽

Michael Kapralov ◽

Navid Nouri ◽

Paris Siminelakis

Keyword(s):

Density Estimation ◽

Kernel Density Estimation ◽

Kernel Density ◽

Near Neighbor ◽

Neighbor Search

Privacy-Preserving near Neighbor Search via Sparse Coding with Ambiguation

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9414115 ◽

2021 ◽

Author(s):

Behrooz Razeghi ◽

Sohrab Ferdowsi ◽

Dimche Kostadinov ◽

Flavio P. Calmon ◽

Slava Voloshynovskiy

Keyword(s):

Sparse Coding ◽

Privacy Preserving ◽

Near Neighbor ◽

Neighbor Search

ClusterTree: Integration of cluster representation and nearest-neighbor search for large data sets with high dimensions

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2003.1232281 ◽

2003 ◽

Vol 15 (5) ◽

pp. 1316-1337 ◽

Cited By ~ 24

Author(s):

Dantong Yu ◽

Aidong Zhang

Keyword(s):

Nearest Neighbor ◽

Large Data ◽

Nearest Neighbor Search ◽

Large Data Sets ◽

Data Sets ◽

High Dimensions ◽

Neighbor Search ◽

Cluster Representation

Sampling conformations in high dimensions using low-dimensional distribution functions

The Journal of Chemical Physics ◽

10.1063/1.3088434 ◽

2009 ◽

Vol 130 (13) ◽

pp. 134102 ◽

Cited By ~ 14

Author(s):

Sandeep Somani ◽

Benjamin J. Killian ◽

Michael K. Gilson

Keyword(s):

Distribution Functions ◽

High Dimensions ◽

Dimensional Distribution ◽

Low Dimensional

Approximate Nearest Neighbor Search in High Dimensions

Proceedings of the International Congress of Mathematicians (ICM 2018) ◽

10.1142/9789813272880_0182 ◽

2019 ◽

Cited By ~ 3

Author(s):

Alexandr Andoni ◽

Piotr Indyk ◽

Ilya Razenshteyn

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbor Search ◽

High Dimensions ◽

Approximate Nearest Neighbor Search ◽

Approximate Nearest Neighbor ◽

Neighbor Search

A Supremum Norm Based Near Neighbor Search in High Dimensional Spaces

Computer Vision and Graphics - Lecture Notes in Computer Science ◽

10.1007/978-3-642-33564-8_72 ◽

2012 ◽

pp. 600-609

Author(s):

Nikolai Sergeev

Keyword(s):

Near Neighbor ◽

High Dimensional ◽

Supremum Norm ◽

Neighbor Search

Bootstrapping spectral statistics in high dimensions

Biometrika ◽

10.1093/biomet/asz040 ◽

2019 ◽

Vol 106 (4) ◽

pp. 781-801 ◽

Cited By ~ 1

Author(s):

Miles E Lopes ◽

Andrew Blandino ◽

Alexander Aue

Keyword(s):

Parametric Bootstrap ◽

Covariance Matrices ◽

Asymptotic Formulas ◽

High Dimensional ◽

High Dimensions ◽

Full Data ◽

Sample Covariance ◽

Practical Standpoint ◽

Spectral Statistics ◽

Low Dimensional

Summary: Statistics derived from the eigenvalues of sample covariance matrices are called spectral statistics, and they play a central role in multivariate testing. Although bootstrap methods are an established approach to approximating the laws of spectral statistics in low-dimensional problems, such methods are relatively unexplored in the high-dimensional setting. The aim of this article is to focus on linear spectral statistics as a class of prototypes for developing a new bootstrap in high dimensions, a method we refer to as the spectral bootstrap. In essence, the proposed method originates from the parametric bootstrap and is motivated by the fact that in high dimensions it is difficult to obtain a nonparametric approximation to the full data-generating distribution. From a practical standpoint, the method is easy to use and allows the user to circumvent the difficulties of complex asymptotic formulas for linear spectral statistics. In addition to proving the consistency of the proposed method, we present encouraging empirical results in a variety of settings. Lastly, and perhaps most interestingly, we show through simulations that the method can be applied successfully to statistics outside the class of linear spectral statistics, such as the largest sample eigenvalue and others.
