Efficient molecular surface rendering by linear-time pseudo-Gaussian approximation to Lee–Richards surfaces (PGALRS)

Fast document summarization using locality sensitive hashing and memory access efficient node ranking

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v6i3.pp945-954 ◽

2016 ◽

Vol 6 (3) ◽

pp. 945

Author(s):

Ercan Canhasi

Keyword(s):

Time Complexity ◽

Nearest Neighbor ◽

Linear Time ◽

Nearest Neighbor Search ◽

Memory Access ◽

Locality Sensitive Hashing ◽

Document Summarization ◽

Neighbor Search ◽

Node Ranking ◽

Similarity Graph

Text modeling and sentence selection are the fundamental steps of a typical extractive document summarization algorithm. The common text modeling method connects pairs of sentences based on their similarity. Even though it can effectively represent the sentence similarity graph of the given document(s), its main drawback is a time complexity of $O(n^2)$, where $n$ is the number of sentences. This quadratic time complexity makes it impractical for large documents. In this paper we propose fast approximation algorithms for text modeling and sentence selection. Our text modeling algorithm reduces the time complexity to near-linear by rapidly finding the most similar sentences to form the sentence similarity graph. To do so, we use Locality-Sensitive Hashing, a fast algorithm for approximate nearest neighbor search. For the sentence selection step we propose a simple memory-access-efficient node ranking method based on sequentially scanning only the neighborhood arrays. Experimentally, we show that sacrificing a rather small percentage of recall and precision in the quality of the produced summary can reduce the time complexity from quadratic to sub-linear. We see great potential for the proposed method in text summarization for mobile devices and in big text data summarization for the Internet of Things in the cloud. In our experiments, besides evaluating the presented method on the standard general and query multi-document summarization tasks, we also tested it on a few alternative summarization tasks, including general and query, timeline, and comparative summarization.
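As a rough illustration of how hashing can avoid comparing every pair of sentences, the sketch below builds candidate edges for a sentence similarity graph with MinHash signatures and banding. MinHash is only one LSH family and is an assumption here; the abstract does not say which hash family the paper uses. The function names, the whitespace tokenizer, and the `num_hashes` and `bands` defaults are likewise hypothetical.

```python
import zlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(tokens, num_hashes):
    """Return one min-hash value per salted hash function (salting is an assumption)."""
    return [
        min(zlib.crc32(f"{salt}:{tok}".encode()) for tok in tokens)
        for salt in range(num_hashes)
    ]


def lsh_candidate_pairs(sentences, num_hashes=32, bands=16):
    """Return index pairs (i, j), i < j, of sentences that share at least one band bucket."""
    rows = num_hashes // bands
    buckets = defaultdict(list)
    for idx, sentence in enumerate(sentences):
        tokens = set(sentence.lower().split())  # naive tokenizer, for illustration only
        if not tokens:
            continue
        sig = minhash_signature(tokens, num_hashes)
        for b in range(bands):
            # Sentences whose signatures agree on a whole band land in the same bucket.
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].append(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(members, 2))
    return pairs
```

Only pairs that share a bucket need an exact similarity computation afterwards. The cost stays well below quadratic only while buckets remain small, so the band and row counts would have to be tuned to the data.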

The Earth Mover’s Distance as a Metric for the Space of Inorganic Compositions

10.26434/chemrxiv.12777566.v1 ◽

2020 ◽

Author(s):

Cameron Hargreaves ◽

Matthew Dyer ◽

Michael Gaultois ◽

Vitaliy Kurlin ◽

Matthew J Rosseinsky

Keyword(s):

Euclidean Distance ◽

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Inorganic Crystal Structure Database ◽

Chemical Similarity ◽

Earth Mover's Distance ◽

Neighbor Search ◽

The Earth ◽

Binary Compounds

It is a core problem in any field to reliably tell how close two objects are to being the same. Once this relation has been established, we can use it to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented as a vector holding the ratio of each element in the material. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover’s Distance (EMD) for inorganic compositions, a well-defined metric that enables chemical similarity to be measured in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances align better with chemical understanding than the Euclidean distance does, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that, with no supervision, this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical property, with future applications in nearest neighbor search queries for chemical database retrieval systems and in supervised ML techniques.
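Because the ground distance described above is an absolute difference along a single scale, the EMD between two normalized compositions can be sketched with the standard one-dimensional closed form: the accumulated difference in mass, integrated along the scale. The code below is an illustrative sketch and not the authors' implementation. The function name is hypothetical, and the `scale` positions in the usage lines are made-up placeholders, not real modified Pettifor numbers.

```python
from collections import defaultdict


def composition_emd(comp_a, comp_b, scale):
    """1D Earth Mover's Distance between two compositions.

    comp_a, comp_b: dicts mapping element symbol -> amount (any positive units).
    scale: dict mapping element symbol -> position on a 1D chemical scale.
    """
    def mass_by_position(comp):
        total = sum(comp.values())
        mass = defaultdict(float)
        for element, amount in comp.items():
            mass[scale[element]] += amount / total  # normalize to ratios
        return mass

    mass_a, mass_b = mass_by_position(comp_a), mass_by_position(comp_b)
    points = sorted(set(mass_a) | set(mass_b))

    emd, carried = 0.0, 0.0
    for left, right in zip(points, points[1:]):
        # Surplus mass that still has to move to the right of `left`.
        carried += mass_a.get(left, 0.0) - mass_b.get(left, 0.0)
        emd += abs(carried) * (right - left)
    return emd


# Placeholder positions for illustration only (NOT real modified Pettifor values).
toy_scale = {"Na": 1, "K": 2, "Cl": 9}
distance = composition_emd({"Na": 1, "Cl": 1}, {"K": 1, "Cl": 1}, toy_scale)
```

In this toy case half of the total mass moves one step along the scale, so the distance is 0.5 in the units of the placeholder scale.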

Adaptive bit allocation hashing for approximate nearest neighbor search

Neurocomputing ◽

10.1016/j.neucom.2014.10.042 ◽

2015 ◽

Vol 151 ◽

pp. 719-728 ◽

Cited By ~ 4

Author(s):

Qin-Zhen Guo ◽

Zhi Zeng ◽

Shuwu Zhang

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Bit Allocation ◽

Approximate Nearest Neighbor Search ◽

Approximate Nearest Neighbor ◽

Neighbor Search

PCT: Point cloud transformer

Computational Visual Media ◽

10.1007/s41095-021-0229-5 ◽

2021 ◽

Vol 7 (2) ◽

pp. 187-199

Author(s):

Meng-Hao Guo ◽

Jun-Xiong Cai ◽

Zheng-Ning Liu ◽

Tai-Jiang Mu ◽

Ralph R. Martin ◽

...

Keyword(s):

Language Processing ◽

Point Cloud ◽

Nearest Neighbor ◽

Semantic Segmentation ◽

Nearest Neighbor Search ◽

Local Context ◽

Irregular Domain ◽

Cloud Processing ◽

Neighbor Search ◽

Farthest Point

The irregular domain and lack of ordering make it challenging to design deep neural networks for point cloud processing. This paper presents a novel framework named Point Cloud Transformer (PCT) for point cloud learning. PCT is based on Transformer, which achieves huge success in natural language processing and displays great potential in image processing. It is inherently permutation invariant for processing a sequence of points, making it well-suited for point cloud learning. To better capture local context within the point cloud, we enhance input embedding with the support of farthest point sampling and nearest neighbor search. Extensive experiments demonstrate that the PCT achieves the state-of-the-art performance on shape classification, part segmentation, semantic segmentation, and normal estimation tasks.
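Farthest point sampling, mentioned above as part of the input embedding, is commonly implemented as a greedy loop that repeatedly picks the point farthest from everything chosen so far. The sketch below shows that generic procedure in plain Python. It is not taken from the PCT code, and the function name and the `start` parameter are hypothetical.

```python
import math


def farthest_point_sampling(points, num_samples, start=0):
    """Greedily pick indices of points that are far apart from each other.

    points: sequence of equal-length coordinate tuples.
    Returns the chosen indices, beginning with `start`.
    """
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < min(num_samples, len(points)):
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen
```

Each round costs one pass over the points, so selecting m of n points takes O(mn) distance evaluations in this naive form.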

Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination

Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data ◽

10.1145/3318464.3380600 ◽

2020 ◽

Author(s):

Conglong Li ◽

Minjia Zhang ◽

David G. Andersen ◽

Yuxiong He

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Early Termination ◽

Approximate Nearest Neighbor Search ◽

Approximate Nearest Neighbor ◽

Neighbor Search

A Fast k-Nearest Neighbor Search Using Query-Specific Signature Selection

Proceedings of the 24th ACM International on Conference on Information and Knowledge Management - CIKM '15 ◽

10.1145/2806416.2806632 ◽

2015 ◽

Cited By ~ 1

Author(s):

Youngki Park ◽

Heasoo Hwang ◽

Sang-goo Lee

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbor Search ◽

K Nearest Neighbor ◽

Neighbor Search ◽

K Nearest Neighbor Search

The role of local dimensionality measures in benchmarking nearest neighbor search

Information Systems ◽

10.1016/j.is.2021.101807 ◽

2021 ◽

pp. 101807

Author(s):

Martin Aumüller ◽

Matteo Ceccarello

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Neighbor Search

Authenticated Multistep Nearest Neighbor Search

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2010.157 ◽

2011 ◽

Vol 23 (5) ◽

pp. 641-654 ◽

Cited By ~ 7

Author(s):

Stavros Papadopoulos ◽

Lixing Wang ◽

Yin Yang ◽

Dimitris Papadias ◽

Panagiotis Karras

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Neighbor Search

Road Short-Term Travel Time Prediction Method Based on Flow Spatial Distribution and the Relations

Mathematical Problems in Engineering ◽

10.1155/2016/7626875 ◽

2016 ◽

Vol 2016 ◽

pp. 1-14 ◽

Cited By ~ 1

Author(s):

Mingjun Deng ◽

Shiru Qu

Keyword(s):

Time Series ◽

Spatial Distribution ◽

Travel Time ◽

Nonparametric Regression ◽

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Short Term ◽

Combination Model ◽

The Road ◽

Neighbor Search

Many short-term road travel time forecasting studies are based on time series. However, road travel time depends not only on the historical travel time series but also on the historical flow of the road and its adjacent sections, and few studies have taken this into account. This paper builds a forecasting model on the correlation between the flow spatial distribution and the road travel time series, using nearest neighbor and nonparametric regression methods. For the spatial nearest neighbor search, three different spatial distances are defined. In addition, two forecasting functions are introduced: one combines the forecast values with equal (mean) weights, and the other uses the reciprocal of the nearest neighbors' distances as the combination weights. Each of the three distances is used in the nearest neighbor search together with each of the two forecasting functions. Nearest neighbor and nonparametric regression are also applied to the travel time series. Minimizing the forecast error variance is then used as the objective to establish the combination model. The empirical results show that the combination model clearly improves forecast performance. In addition, the evaluation of computational complexity shows that the proposed method can satisfy real-time requirements.
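The two forecasting functions described above, a plain mean over the nearest neighbors and a mean weighted by reciprocal distance, can be sketched as follows. This is a minimal illustration under stated assumptions. Euclidean distance stands in for the paper's three spatial distances, which the abstract does not define. The function name, the `history` layout, and the small `eps` guard against division by zero are hypothetical.

```python
import math


def knn_forecast(history, query, k=3, weighting="mean", eps=1e-9):
    """Forecast a travel time from the k historical states nearest to `query`.

    history: list of (feature_vector, observed_travel_time) pairs.
    weighting: "mean" for equal weights, "inverse" for reciprocal-distance weights.
    """
    # Euclidean distance is an assumption; the source defines its own distances.
    neighbours = sorted(
        (math.dist(features, query), travel_time)
        for features, travel_time in history
    )[:k]
    if weighting == "mean":
        return sum(t for _, t in neighbours) / len(neighbours)
    weights = [1.0 / (d + eps) for d, _ in neighbours]
    return sum(w * t for w, (_, t) in zip(weights, neighbours)) / sum(weights)
```

With reciprocal-distance weights, historical states that closely match the current one dominate the forecast, whereas the mean treats all k neighbors equally.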
