An Efficient Exact Nearest Neighbor Search by Compounded Embedding

Author(s):  
Mingjie Li ◽  
Ying Zhang ◽  
Yifang Sun ◽  
Wei Wang ◽  
Ivor W. Tsang ◽  
...  
Author(s):  
Bilegsaikhan Naidan ◽  
Magnus Lie Hetland

This article presents a new approximate index structure, the Bregman hyperplane tree, for indexing data under Bregman divergences, aiming to decrease the number of distance computations required at query time by sacrificing some accuracy in the result. Experimental results on various high-dimensional data sets demonstrate that the proposed index structure performs comparably to the state-of-the-art Bregman ball tree in terms of search performance and result quality, while being well over an order of magnitude faster to build. The authors also apply their space-partitioning principle to the Bregman ball tree and obtain a new index structure for exact nearest neighbor search that is faster to build and only slightly slower at query processing than the original.
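For readers unfamiliar with the distance family being indexed: a Bregman divergence is generated by a strictly convex function F, and both the squared Euclidean distance and the KL divergence arise as special cases. A minimal sketch (assuming NumPy; the function names are illustrative, not from the paper):

```python
import numpy as np

def bregman_divergence(F, grad_F, x, y):
    """Generic Bregman divergence D_F(x, y) = F(x) - F(y) - <grad F(y), x - y>."""
    return F(x) - F(y) - np.dot(grad_F(y), x - y)

# Squared Euclidean distance arises from F(x) = ||x||^2.
sq_norm = lambda v: np.dot(v, v)
grad_sq = lambda v: 2.0 * v

# KL divergence (on probability vectors) arises from negative entropy
# F(x) = sum_i x_i log x_i.
neg_entropy = lambda v: np.sum(v * np.log(v))
grad_neg_entropy = lambda v: np.log(v) + 1.0

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.1, 0.4, 0.5])

d_sq = bregman_divergence(sq_norm, grad_sq, x, y)                 # equals ||x - y||^2
d_kl = bregman_divergence(neg_entropy, grad_neg_entropy, x, y)    # equals KL(x || y)
```

Note that Bregman divergences are generally asymmetric, D_F(x, y) ≠ D_F(y, x), which is what makes index structures such as the Bregman ball tree and the hyperplane tree nontrivial adaptations of their metric-space counterparts.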


2013 ◽  
Vol 13 (1) ◽  
pp. 195-206 ◽  
Author(s):  
Travis Mackoy ◽  
Robert C. Harris ◽  
Jesse Johnson ◽  
Michael Mascagni ◽  
Marcia O. Fenley

Abstract
Stochastic walk-on-spheres (WOS) algorithms for solving the linearized Poisson-Boltzmann equation (LPBE) provide several attractive features not available in traditional deterministic solvers: Gaussian error bars can be computed easily; the algorithm is readily parallelized and requires minimal memory; and multiple solvent environments can be accounted for by reweighting trajectories. However, previously reported computational times of these Monte Carlo methods were not competitive with existing deterministic numerical methods. The present paper demonstrates a series of numerical optimizations that collectively make the computational time of these Monte Carlo LPBE solvers competitive with deterministic methods. The optimization techniques are: ensuring that each atom's contribution to the variance of the electrostatic solvation free energy is the same; optimizing the bias-generating parameters in the algorithm; and using an epsilon-approximate rather than exact nearest-neighbor search when determining the size of the next step of the Brownian motion outside the molecule.
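The walk-on-spheres idea can be illustrated on the plain Laplace equation on the unit disk, a simplified stand-in for the LPBE setting described above (function names are illustrative, not from the paper). At each step the walker jumps to a random point on the largest sphere that fits inside the domain; finding that sphere's radius is exactly the nearest-boundary query that the paper accelerates with an epsilon-approximate nearest-neighbor search over the atoms:

```python
import math, random

def walk_on_spheres(x, y, boundary_value, eps=1e-3, rng=random):
    """One WOS trajectory for the Laplace equation on the unit disk.

    The walker repeatedly jumps to a uniformly random point on the largest
    circle centred at its position that stays inside the domain; the radius
    is the distance to the boundary (in a molecular setting, a nearest-
    neighbor query over the atoms). The walk stops within eps of the
    boundary and reads off the boundary condition there.
    """
    while True:
        d = 1.0 - math.hypot(x, y)        # distance to the unit-circle boundary
        if d < eps:                        # close enough: project and terminate
            r = math.hypot(x, y)
            return boundary_value(x / r, y / r)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += d * math.cos(theta)
        y += d * math.sin(theta)

def estimate(x, y, boundary_value, n_walks=2000, seed=0):
    """Monte Carlo average over independent trajectories; error ~ 1/sqrt(n)."""
    rng = random.Random(seed)
    total = sum(walk_on_spheres(x, y, boundary_value, rng=rng) for _ in range(n_walks))
    return total / n_walks

# Test case: u(x, y) = x is harmonic, so the estimate at (0.5, 0) should be
# close to 0.5, with Gaussian error bars shrinking as 1/sqrt(n_walks).
u_hat = estimate(0.5, 0.0, lambda bx, by: bx)
```

Using a slightly smaller (epsilon-approximate) radius at each step, as the paper does, keeps the walker inside the domain and so preserves correctness while making each step cheaper.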


2020 ◽  
Author(s):  
Cameron Hargreaves ◽  
Matthew Dyer ◽  
Michael Gaultois ◽  
Vitaliy Kurlin ◽  
Matthew J Rosseinsky

It is a core problem in any field to reliably tell how close two objects are to being the same, and once this relation has been established we can use this information to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented by assigning the ratio of each element in the material to a vector. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately, the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover's Distance (EMD) for inorganic compositions, a well-defined metric which enables the measure of chemical similarity in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances have greater alignment with chemical understanding than the Euclidean distance, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that with no supervision the use of this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical properties, with future applications for nearest neighbor search queries in chemical database retrieval systems and supervised ML techniques.
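On a one-dimensional scale such as the modified Pettifor scale, the EMD between two compositions reduces to the L1 distance between their cumulative distributions integrated along the scale. A sketch of that computation (the element ranks below are illustrative placeholders, not the published scale values):

```python
import numpy as np

# Illustrative ranks only; the real modified Pettifor scale assigns every
# element a position reflecting chemical similarity.
PETTIFOR = {"Na": 11, "K": 10, "Cl": 99, "Br": 98}

def composition_emd(comp_a, comp_b, scale=PETTIFOR):
    """1-D Earth Mover's Distance between two compositions.

    Each composition is a dict mapping element symbols to ratios; the EMD
    is the area between the two cumulative distributions along the scale.
    """
    positions = sorted({scale[e] for e in comp_a} | {scale[e] for e in comp_b})
    wa = np.array([sum(f for e, f in comp_a.items() if scale[e] == p) for p in positions], float)
    wb = np.array([sum(f for e, f in comp_b.items() if scale[e] == p) for p in positions], float)
    wa, wb = wa / wa.sum(), wb / wb.sum()      # normalize the element ratios
    cdf_diff = np.cumsum(wa - wb)              # difference of cumulative distributions
    gaps = np.diff(positions)                  # spacing between occupied ranks
    return float(np.sum(np.abs(cdf_diff[:-1]) * gaps))

# NaCl vs. KBr: half the mass moves Na->K (1 rank) and half Cl->Br (1 rank),
# so the distance is 0.5 + 0.5 = 1.0 under the placeholder ranks above.
d = composition_emd({"Na": 1, "Cl": 1}, {"K": 1, "Br": 1})
```

Because the scale is one-dimensional, no general-purpose optimal-transport solver is needed, which is why the metric is cheap enough for large-scale database queries.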


2021 ◽  
Vol 7 (2) ◽  
pp. 187-199
Author(s):  
Meng-Hao Guo ◽  
Jun-Xiong Cai ◽  
Zheng-Ning Liu ◽  
Tai-Jiang Mu ◽  
Ralph R. Martin ◽  
...  

Abstract
The irregular domain and lack of ordering make it challenging to design deep neural networks for point cloud processing. This paper presents a novel framework named Point Cloud Transformer (PCT) for point cloud learning. PCT is based on the Transformer, which has achieved huge success in natural language processing and shown great potential in image processing. It is inherently permutation invariant when processing a sequence of points, making it well-suited to point cloud learning. To better capture local context within the point cloud, we enhance input embedding with the support of farthest point sampling and nearest neighbor search. Extensive experiments demonstrate that PCT achieves state-of-the-art performance on shape classification, part segmentation, semantic segmentation, and normal estimation tasks.
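The two sampling operations mentioned for the input embedding can be sketched as follows (a brute-force NumPy sketch with illustrative names, not the paper's implementation): farthest point sampling greedily picks well-spread anchor points, and a k-nearest-neighbor search then gathers a local neighborhood around each anchor.

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Greedy FPS: repeatedly pick the point farthest from all chosen so far."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [int(rng.integers(n))]                     # random starting point
    dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(n_samples - 1):
        idx = int(np.argmax(dist))                      # farthest from chosen set
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return np.array(chosen)

def knn_indices(points, centers, k):
    """Brute-force k-nearest-neighbor search: indices of the k closest points
    to each center (a center drawn from `points` is its own nearest neighbor)."""
    d = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=-1)
    return np.argsort(d, axis=1)[:, :k]

# Usage: pick 16 anchors from a 128-point cloud, then group 8 neighbors each.
pts = np.random.default_rng(1).random((128, 3))
anchors = farthest_point_sampling(pts, 16)
groups = knn_indices(pts, pts[anchors], 8)              # shape (16, 8)
```

In practice point cloud pipelines replace the brute-force search with a spatial index for large clouds; the grouped neighborhoods are what the embedding layer aggregates to capture local context.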

