Efficient molecular surface rendering by linear-time pseudo-Gaussian approximation to Lee–Richards surfaces (PGALRS)

2010 ◽  
Vol 43 (2) ◽  
pp. 356-361 ◽  
Author(s):  
Herbert J. Bernstein ◽  
Paul A. Craig

The PGALRS (pseudo-Gaussian approximation to Lee–Richards surfaces) algorithm is discussed. By modeling electron density with unphysical pseudo-Gaussian atoms, the Lee–Richards surface can be approximated by a contour level of that density in time approximately linear in the number of atoms. Having that contour level, the atoms and residues closest to that surface can be identified in average time O[n^(2/3) log(n)] using a NearTree-based nearest neighbor search. If a high-quality Lee–Richards surface is required, then, as a final stage, one of the standard Lee–Richards algorithms can be used but considering only the previously identified surface residues; the typical cost is thereby reduced to O[n^(2/3) log(n)], making the overall average time for all the steps O(n). For very large macromolecules, such a reduction in computational burden may be essential to being able to render a meaningful molecular surface. This approach extends the feasible range of application for existing molecular surface software, such as MSMS, to larger macromolecules, especially to macromolecules with more than 50 000 atoms, and can be used as a starting point for surface-based (as opposed to backbone-based) motif identification, e.g. using ProMol.
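The near-linear cost of the density step can be illustrated with a minimal sketch. It assumes a generic truncated Gaussian-like kernel exp(-d^2/r^2) and a spatial hash of cubic cells; the abstract does not give the actual pseudo-Gaussian form, the cutoff, or the data structure used by PGALRS, so the names `build_cells` and `density` and the choice of cutoff are hypothetical.

```python
import math
from collections import defaultdict

def build_cells(atoms, cell):
    """Bucket atoms (x, y, z, radius) into cubic cells of edge `cell`."""
    cells = defaultdict(list)
    for (x, y, z, r) in atoms:
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        cells[key].append((x, y, z, r))
    return cells

def density(point, cells, cell):
    """Sum kernel contributions from atoms within distance `cell` of `point`.

    Assumption: the kernel is truncated at a cutoff equal to the cell edge,
    so only the 27 surrounding cells ever need to be visited.
    """
    px, py, pz = point
    cx, cy, cz = (math.floor(px / cell), math.floor(py / cell), math.floor(pz / cell))
    total = 0.0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for (x, y, z, r) in cells.get((cx + dx, cy + dy, cz + dz), ()):
                    d2 = (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2
                    if d2 <= cell * cell:
                        # Illustrative kernel only, not the published one.
                        total += math.exp(-d2 / (r * r))
    return total
```

Because each evaluation touches a bounded number of cells, and atom packing in a molecule bounds the number of atoms per cell, building the hash and sampling the density near the atoms both scale roughly with the number of atoms.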

Author(s):  
Ercan Canhasi

Text modeling and sentence selection are the fundamental steps of a typical extractive document summarization algorithm. The common text modeling method connects pairs of sentences based on their similarity. Even though it can effectively represent the sentence similarity graph of the given document(s), its major drawback is a large time complexity of $O(n^2)$, where $n$ is the number of sentences. The quadratic time complexity makes it impractical for large documents. In this paper we propose fast approximation algorithms for text modeling and sentence selection. Our text modeling algorithm reduces the time complexity to near-linear by rapidly finding the most similar sentences to form the sentence similarity graph. To do so, we use Locality-Sensitive Hashing, a fast algorithm for approximate nearest neighbor search. For the sentence selection step we propose a simple memory-access-efficient node ranking method based on the idea of sequentially scanning only the neighborhood arrays. Experimentally, we show that sacrificing a rather small percentage of recall and precision in the quality of the produced summary can reduce the time complexity from quadratic to sub-linear. We see great potential for the proposed method in text summarization on mobile devices and in big text data summarization for the Internet of Things in the cloud. In our experiments, besides evaluating the presented method on the standard general and query multi-document summarization tasks, we also tested it on a few alternative summarization tasks, including general and query, timeline, and comparative summarization.
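The idea of building the similarity graph from hash collisions rather than from all sentence pairs can be sketched as follows. The abstract does not say which LSH family the paper uses; this sketch assumes random-hyperplane hashing (SimHash) over bag-of-words vectors, and the function name `lsh_candidate_pairs` and its parameters are hypothetical.

```python
import random
from collections import defaultdict

def lsh_candidate_pairs(sentences, n_bits=8, n_tables=4, seed=0):
    """Return index pairs (i, j), i < j, that share a bucket in any hash table.

    Assumption: sentences are whitespace-tokenized bags of words and each
    hash bit is the sign of a projection onto a random Gaussian hyperplane.
    """
    rng = random.Random(seed)
    docs = [s.lower().split() for s in sentences]
    vocab = sorted({w for d in docs for w in d})
    pairs = set()
    for _ in range(n_tables):
        # One random hyperplane per signature bit, indexed by word.
        planes = [{w: rng.gauss(0.0, 1.0) for w in vocab} for _ in range(n_bits)]
        buckets = defaultdict(list)
        for i, d in enumerate(docs):
            sig = tuple(sum(p[w] for w in d) >= 0 for p in planes)
            buckets[sig].append(i)
        # Only sentences that collide become candidate graph edges.
        for ids in buckets.values():
            for a in range(len(ids)):
                for b in range(a + 1, len(ids)):
                    pairs.add((ids[a], ids[b]))
    return pairs
```

Exact similarities then need to be computed only for the returned candidates. Hashing itself is linear in the number of sentences, although a very crowded bucket would still cost quadratic time in its own size.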




2020 ◽  
Author(s):  
Cameron Hargreaves ◽  
Matthew Dyer ◽  
Michael Gaultois ◽  
Vitaliy Kurlin ◽  
Matthew J Rosseinsky

It is a core problem in any field to reliably tell how close two objects are to being the same, and once this relation has been established we can use this information to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented by assigning the ratio of each element in the material to a vector. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately, the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover’s Distance (EMD) for inorganic compositions, a well-defined metric which enables the measure of chemical similarity in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances have greater alignment with chemical understanding than the Euclidean distance, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that with no supervision the use of this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical property, with future applications for nearest neighbor search queries in chemical database retrieval systems and supervised ML techniques.
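For distributions on a one-dimensional scale, the Earth Mover's Distance reduces to the accumulated difference between the two cumulative distributions, which is what makes it cheap to compute. The sketch below assumes that form. The positions in `TOY_SCALE` are invented placeholders, not real modified Pettifor numbers, and the function name `emd_1d` is hypothetical.

```python
# Placeholder positions for illustration only; these are NOT the
# modified Pettifor scale values used in the paper.
TOY_SCALE = {"Na": 0, "K": 1, "Cl": 10, "Br": 11}

def emd_1d(comp_a, comp_b, scale):
    """Earth Mover's Distance between two compositions on a 1D element scale.

    comp_a, comp_b: dicts mapping element symbol -> amount (any units).
    scale: dict mapping element symbol -> unique position on the scale.
    """
    def normalise(comp):
        total = sum(comp.values())
        return {scale[el]: amount / total for el, amount in comp.items()}

    a, b = normalise(comp_a), normalise(comp_b)
    positions = sorted(set(a) | set(b))
    work = 0.0
    carried = 0.0  # surplus mass that must move to the right (or left if negative)
    for left, right in zip(positions, positions[1:]):
        carried += a.get(left, 0.0) - b.get(left, 0.0)
        work += abs(carried) * (right - left)
    return work
```

With these toy positions, `emd_1d({"Na": 1, "Cl": 1}, {"K": 1, "Cl": 1}, TOY_SCALE)` moves half of the mass by one step, so substituting a neighboring element gives a small distance.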


2021 ◽  
Vol 7 (2) ◽  
pp. 187-199
Author(s):  
Meng-Hao Guo ◽  
Jun-Xiong Cai ◽  
Zheng-Ning Liu ◽  
Tai-Jiang Mu ◽  
Ralph R. Martin ◽  
...  

The irregular domain and lack of ordering make it challenging to design deep neural networks for point cloud processing. This paper presents a novel framework named Point Cloud Transformer (PCT) for point cloud learning. PCT is based on the Transformer, which has achieved great success in natural language processing and displays great potential in image processing. It is inherently permutation invariant when processing a sequence of points, making it well suited to point cloud learning. To better capture local context within the point cloud, we enhance the input embedding with the support of farthest point sampling and nearest neighbor search. Extensive experiments demonstrate that PCT achieves state-of-the-art performance on shape classification, part segmentation, semantic segmentation, and normal estimation tasks.
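The two geometric building blocks named in the abstract, farthest point sampling and nearest neighbor search, can be sketched in plain Python. This is only an illustration of those generic techniques with brute-force search and hypothetical function names; it says nothing about how PCT implements or combines them.

```python
import math

def farthest_point_sampling(points, m, start=0):
    """Pick m point indices, each maximally far from those already chosen."""
    chosen = [start]
    # Distance from every point to its nearest chosen point so far.
    dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            dist[i] = min(dist[i], math.dist(p, points[nxt]))
    return chosen

def knn(points, centre, k):
    """Indices of the k points nearest to points[centre], itself included."""
    order = sorted(range(len(points)),
                   key=lambda i: math.dist(points[i], points[centre]))
    return order[:k]
```

Sampling spreads the chosen centres over the cloud, and grouping each centre with its nearest neighbors gives a local patch from which a local feature could be computed.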


2011 ◽  
Vol 23 (5) ◽  
pp. 641-654 ◽  
Author(s):  
Stavros Papadopoulos ◽  
Lixing Wang ◽  
Yin Yang ◽  
Dimitris Papadias ◽  
Panagiotis Karras

2016 ◽  
Vol 2016 ◽  
pp. 1-14 ◽  
Author(s):  
Mingjun Deng ◽  
Shiru Qu

Many short-term road travel time forecasting studies are based on time series alone, but road travel time depends not only on the historical travel time series but also on the historical flow of the road and its adjacent sections. Few studies have considered this. Building on the correlation between the spatial distribution of flow and the road travel time series, this paper applies nearest neighbor and nonparametric regression methods to build a forecasting model. For the spatial nearest neighbor search, three different spatial distances are defined. In addition, two forecasting functions are introduced: one combines the neighbors' values with equal (mean) weights, and the other weights them by the reciprocal of the nearest neighbor distance. Each of the three distances is used in the nearest neighbor search together with each of the two forecasting functions. Nearest neighbor and nonparametric regression are also applied to the travel time series. A combination model is then established by minimizing the forecast error variance. The empirical results show that the combination model clearly improves forecast performance. In addition, an evaluation of computational complexity shows that the proposed method can satisfy the real-time requirement.
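The two forecasting functions described above, equal weighting and reciprocal-distance weighting of the nearest neighbors, can be sketched as follows. The abstract does not define its three spatial distances, so this sketch assumes plain Euclidean distance between feature vectors; the function name `knn_forecast`, the `weighting` parameter, and the small `eps` guard are hypothetical.

```python
import math

def knn_forecast(history, query, k, weighting="mean", eps=1e-9):
    """Forecast travel time from the k historical records nearest to `query`.

    history: list of (feature_vector, observed_travel_time) pairs.
    weighting: "mean" for equal weights, "inverse" for 1/distance weights.
    Assumption: Euclidean distance stands in for the paper's distances.
    """
    ranked = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    if weighting == "mean":
        return sum(t for _, t in ranked) / len(ranked)
    # eps avoids division by zero when a record matches the query exactly.
    weights = [1.0 / (math.dist(f, query) + eps) for f, _ in ranked]
    return sum(w * t for w, (_, t) in zip(weights, ranked)) / sum(weights)
```

With reciprocal weighting, a record that closely matches the current flow pattern dominates the forecast, whereas equal weighting treats all k neighbors alike.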

