Fast Approximate k-Nearest Neighbours Search Using GPGPU

This paper focuses on forecasting the price of Bitcoin, motivated by its market growth and the recent interest of market participants and academics. We deploy six machine learning algorithms (Artificial Neural Network, Support Vector Machine, Random Forest, k-Nearest Neighbours, AdaBoost, and Ridge regression), without deciding a priori which one is the ‘best’ model. The main contribution is to use these data analytics techniques, with great caution in the parameterization, instead of classical parametric models (AR), to disentangle the non-stationary behavior of the data. Because Bitcoin is also used for diversification in portfolios, we need to investigate its interactions with stocks, bonds, foreign exchange, and commodities. We find that other cryptocurrencies convey enough information to explain the daily variation of Bitcoin’s spot and futures prices. Forecasting results point to the segmentation of Bitcoin from alternative assets. Finally, trading strategies are implemented.


Computing Expectiles Using k-Nearest Neighbours Approach

Symmetry ◽

10.3390/sym13040645 ◽

2021 ◽

Vol 13 (4) ◽

pp. 645

Author(s):

Muhammad Farooq ◽

Sehrish Sarfraz ◽

Christophe Chesneau ◽

Mahmood Ul Hassan ◽

Muhammad Ali Raza ◽

...

Keyword(s):

Computational Cost ◽

Real Life ◽

Distance Measures ◽

Computational Time ◽

High Dimensional ◽

Test Error ◽

Nearest Neighbours ◽

Comparable Performance ◽

Asymmetric Least Squares ◽

Low Computational Cost

Expectiles have gained considerable attention in recent years due to their wide applications in many areas. In this study, the k-nearest neighbours approach, together with the asymmetric least squares loss function, called ex-kNN, is proposed for computing expectiles. Firstly, the effect of various distance measures on ex-kNN in terms of test error and computational time is evaluated. It is found that the Canberra, Lorentzian, and Soergel distance measures lead to minimum test error, whereas Euclidean, Canberra, and Average of (L1,L∞) lead to a low computational cost. Secondly, the performance of ex-kNN is compared, on nine real-life examples, with the existing packages er-boost and ex-svm for computing expectiles. Depending on the nature of the data, ex-kNN showed two to ten times better performance than er-boost and comparable performance to ex-svm regarding test error. Computationally, ex-kNN is found to be two to five times faster than ex-svm and much faster than er-boost, particularly in the case of high-dimensional data.
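The abstract does not spell out the ex-kNN procedure, so the following is only a minimal sketch of the general idea under stated assumptions: take the k training points nearest to a query (here under the Canberra distance, one of the measures the abstract names) and compute the expectile of their responses by minimising the asymmetric least squares loss with a simple fixed-point iteration. The function names, the iteration scheme, and the toy data are illustrative assumptions, not the authors' implementation.

```python
def canberra(a, b):
    """Canberra distance; coordinates where both values are zero contribute 0."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def expectile(values, tau, tol=1e-10, max_iter=1000):
    """Minimise sum_i w_i * (y_i - e)^2 with w_i = tau if y_i > e else 1 - tau.

    Assumes 0 < tau < 1. Each step is a weighted mean under the current weights.
    """
    e = sum(values) / len(values)  # the tau = 0.5 expectile is the mean
    for _ in range(max_iter):
        weights = [tau if v > e else 1.0 - tau for v in values]
        new_e = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e


def knn_expectile(train_x, train_y, query, k, tau, dist=canberra):
    """Hypothetical helper: tau-expectile of the responses of the k nearest points."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    neighbours = [train_y[i] for i in order[:k]]
    return expectile(neighbours, tau)
```

With `tau = 0.5` the estimate reduces to the ordinary kNN regression mean; values of `tau` above 0.5 put more weight on responses above the estimate and pull it upward.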


Optimal Relabeling of Water Molecules and Single-Molecule Entropy Estimation

Biophysica ◽

10.3390/biophysica1030021 ◽

2021 ◽

Vol 1 (3) ◽

pp. 279-296

Author(s):

Federico Fogolari ◽

Gennaro Esposito

Keyword(s):

Single Molecule ◽

Degrees Of Freedom ◽

Water Molecules ◽

Maximum Information ◽

Nearest Neighbours ◽

Biomolecular Simulations ◽

Solvent Molecules ◽

Solvation Entropy ◽

Dynamics Simulations ◽

Internal Degrees Of Freedom

Estimation of solvent entropy from equilibrium molecular dynamics simulations is a long-standing problem in statistical mechanics. In recent years, methods that estimate entropy using k-th nearest neighbours (kNN) have been applied to internal degrees of freedom in biomolecular simulations, and for the rigorous computation of positional-orientational entropy of one and two molecules. The mutual information expansion (MIE) and the maximum information spanning tree (MIST) methods were proposed and used to deal with a large number of non-independent degrees of freedom, providing estimates or bounds on the global entropy, thus complementing the kNN method. The application of the combination of such methods to solvent molecules appears problematic because of the indistinguishability of molecules and of their symmetric parts. All indistinguishable molecules span the same global conformational volume, making application of MIE and MIST methods difficult. Here, we address the problem of indistinguishability by relabeling water molecules in such a way that each water molecule spans only a local region throughout the simulation. Then, we work out approximations and show how to compute the single-molecule entropy for the system of relabeled molecules. The results suggest that relabeling water molecules is promising for computation of solvation entropy.


Baetis majus sp. nov., new species of mayfly (Ephemeroptera: Baetidae) from Far East of Russia

Zootaxa ◽

10.11646/zootaxa.4965.3.8 ◽

2021 ◽

Vol 4965 (3) ◽

pp. 541-557

Author(s):

TATIANA M. TIUNOVA ◽

ALEXANDER A. SEMENCHENKO ◽

XIAOLI TONG

Keyword(s):

New Species ◽

Russian Far East ◽

Far East ◽

Morphological Characters ◽

Species Group ◽

Morphological Studies ◽

Nearest Neighbours ◽

Differential Identification ◽

Coi Sequences ◽

Western Palaearctic

A new species, Baetis majus Tiunova sp. nov., is described and illustrated based on larvae and reared adults discovered in the Russian Far East. The differential identification of this species was determined by comparison with the characteristics of other representatives of the genus Baetis Leach, including subgenera Baetis Leach and Tenuibaetis Kang & Yang from Eastern and Western Palaearctic, Nearctic and Oriental regions. In addition to morphological studies, DNA barcoding of the described species with average intraspecific K2P distances to nearest neighbours is documented. We reconstructed the phylogenetic relationships of all available cytochrome c oxidase subunit I (COI) sequences of the subgenera of Baetis and Tenuibaetis from four regions. Bayesian analysis using 47 morphological characters in addition to partial COI sequences did not allow us to determine the species-group of the Baetis genus to which the described species belongs.
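For readers unfamiliar with the K2P (Kimura two-parameter) distance mentioned above, here is a small illustrative sketch of the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name and the choice to skip non-ACGT sites are assumptions made for this example; the abstract does not say which software or settings the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G, T
    (for example a gap) are skipped; that handling is an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

A nearest-neighbour distance in the barcoding sense would then be the smallest such distance from a sequence of the focal species to any sequence of another species.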


On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation

Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops ◽

10.1145/3387940.3391488 ◽

2020 ◽

Author(s):

Khashayar Etemadi ◽

Martin Monperrus

Keyword(s):

Project Learning ◽

Nearest Neighbours ◽

Cross Project


Incorporating ranking rules into k nearest neighbours

2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) ◽

10.1109/fuzz-ieee.2019.8858892 ◽

2019 ◽

Author(s):

Noelia Rico ◽

Raul Perez-Fernandez ◽

Irene Diaz

Keyword(s):

Nearest Neighbours


Radial generation of n-dimensional poisson processes

Journal of Applied Probability ◽

10.1017/s0021900200028746 ◽

1984 ◽

Vol 21 (03) ◽

pp. 548-557

Author(s):

M. P. Quine ◽

D. F. Watson

Keyword(s):

Poisson Process ◽

Poisson Processes ◽

Three Dimensions ◽

Simulation Studies ◽

Simple Method ◽

Efficient Simulation ◽

Nearest Neighbours

A simple method is proposed for the generation of successive ‘nearest neighbours’ to a given origin in an n-dimensional Poisson process. It is shown that the method provides efficient simulation of random Voronoi polytopes. Results are given of simulation studies in two and three dimensions.
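The abstract gives no algorithmic detail, so the sketch below shows one standard radial construction rather than the authors' exact procedure. It relies on a known property of a homogeneous Poisson process of intensity λ in n dimensions: the volumes of the balls centred at the origin that reach the successive nearest points form a one-dimensional Poisson process of rate λ, so the k-th radius can be obtained from a running sum of exponential variates, and each direction is uniform on the unit sphere. All function and parameter names are illustrative.

```python
import math
import random


def unit_ball_volume(n):
    """Volume of the unit ball in n dimensions: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def radial_poisson_points(n, intensity, count, rng=None):
    """Return the `count` points nearest the origin, in order of increasing radius."""
    rng = rng or random.Random()
    c_n = unit_ball_volume(n)
    points = []
    volume = 0.0
    for _ in range(count):
        # Ball volumes at successive neighbours increase by Exp(intensity) steps.
        volume += rng.expovariate(intensity)
        radius = (volume / c_n) ** (1.0 / n)
        # A normalised Gaussian vector gives a uniformly random direction.
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in g))
        points.append([radius * x / norm for x in g])
    return points
```

Because points are produced in order of distance, generation can stop as soon as enough neighbours are available, which is what makes a radial scheme attractive for building the Voronoi cell around the origin.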


Improvement of the classification quality in detection of Hashimoto’s disease with a combined classifier approach

Proceedings of the Institution of Mechanical Engineers Part H Journal of Engineering in Medicine ◽

10.1177/0954411917702682 ◽

2017 ◽

Vol 231 (8) ◽

pp. 774-782 ◽

Cited By ~ 8

Author(s):

Zbigniew Omiotek

Keyword(s):

Computer System ◽

Majority Vote ◽

Classification Error ◽

Automatic Identification ◽

Detection Accuracy ◽

Ultrasound Images ◽

Linear Discriminant ◽

Nearest Neighbours ◽

Combined Classifier ◽

Classification Quality

The purpose of the study was to construct an efficient classifier that, along with a given reduced set of discriminant features, could be used as a part of a computer system for automatic identification and classification of ultrasound images of the thyroid gland, with the aim of detecting cases affected by Hashimoto’s thyroiditis. A total of 10 supervised learning techniques and a majority vote for the combined classifier were used. Two models were proposed as a result of the classifier’s construction. The first one is based on the K-nearest neighbours method (for K = 7). It uses three discriminant features and affords sensitivity equal to 88.1%, specificity of 66.7% and classification error at a level of 21.8%. The second model is a combined classifier, which was constructed using three component classifiers. They are based on the K-nearest neighbours method (for K = 7), linear discriminant analysis and a boosting algorithm. The combined classifier is based on 48 discriminant features. It achieves classification sensitivity equal to 88.1%, specificity of 69.4% and classification error at a level of 20.5%. The combined classifier improves the classification quality compared to the single model. The models, built as a part of the automatic computer system, may support the physician, especially in first-contact hospitals, in diagnosis of cases that are difficult to recognise based on ultrasound images. The high sensitivity of the constructed classification models indicates high detection accuracy for the sick cases, and this is beneficial to the patients from a medical point of view.
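As a rough illustration of the two ideas in this abstract, the sketch below combines the per-sample predictions of several component classifiers by majority vote and then computes sensitivity, specificity and classification error. It assumes binary labels with 1 meaning "affected"; the function names and any data passed in are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(votes):
    """Most common label among the votes (no ties occur with 3 binary voters)."""
    return Counter(votes).most_common(1)[0][0]


def combine(predictions_per_classifier):
    """Combine per-classifier prediction lists into one list, sample by sample."""
    return [majority_vote(votes) for votes in zip(*predictions_per_classifier)]


def evaluate(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, classification error).

    Assumes y_true contains at least one positive and one negative case.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    sensitivity = tp / (tp + fn)   # share of affected cases that are detected
    specificity = tn / (tn + fp)   # share of healthy cases correctly cleared
    error = (fp + fn) / len(y_true)
    return sensitivity, specificity, error
```

Using an odd number of component classifiers, as in the three-classifier model described above, guarantees that a binary vote always has a strict majority.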


Röntgenbeugungsuntersuchung an geschmolzenem Selen und Tellur sowie an Legierungen des Systems Selen—Tellur/ X-ray Diffraction Study of Molten Selenium, and Tellurium and of Molten Selenium — Tellurium-Alloys

Zeitschrift für Naturforschung A ◽

10.1515/zna-1975-1220 ◽

1975 ◽

Vol 30 (12) ◽

pp. 1633-1639 ◽

Cited By ~ 4

Author(s):

W. Hoyer ◽

E. Thomas ◽

M. Wobst

Keyword(s):

Chain Length ◽

Short Range ◽

Short Range Order ◽

Coordination Shell ◽

Range Order ◽

Average Chain Length ◽

X Ray Diffraction ◽

Coordination Numbers ◽

Nearest Neighbours ◽

Increasing Temperature

At temperatures just above the melting point, molten selenium seems to be a mixture of long chains and eight-membered rings. With increasing temperature, the number of rings and the average chain length decrease. At 460 °C the average chain length lies in the range of 10 atoms. In a slightly supercooled tellurium melt the number of first neighbours is two. The atoms are arranged in chains. Selenium-rich Se-Te alloy melts are built up of mixed chains. It seems possible that a smaller fraction of the atoms forms Se6Te2 rings. At tellurium concentrations higher than approximately 50 at.-% the chainlike structure with two next nearest neighbours changes to a disturbed arsenic-like short range order. The number of electrons in the first coordination shell, the short range order parameter introduced by Cowley and the partial coordination numbers show that Se-Te alloys are of the "solution system" type, but in the whole concentration range there is a tendency for each atom to have "strange coordination".
