sunny-as2: Enhancing SUNNY for Algorithm Selection

SUNNY is an Algorithm Selection (AS) technique originally tailored for Constraint Programming (CP). SUNNY is based on the k-nearest neighbors algorithm and enables one to schedule, from a portfolio of solvers, a subset of solvers to be run on a given CP problem. This approach has proved to be effective for CP problems. In 2015, the ASlib benchmarks were released for comparing AS systems coming from disparate fields (e.g., ASP, QBF, and SAT) and SUNNY was extended to deal with generic AS problems. This led to the development of sunny-as, a prototypical algorithm selector based on SUNNY for ASlib scenarios. A major improvement of sunny-as, called sunny-as2, was then submitted to the Open Algorithm Selection Challenge (OASC) in 2017, where it turned out to be the best approach for the runtime minimization of decision problems. In this work we present the technical advancements of sunny-as2, detailing them through several empirical evaluations and providing new insights. Its current version, built on top of the preliminary version submitted to OASC, is able to outperform sunny-as and other state-of-the-art AS methods, including those that did not take part in the challenge.
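The idea of scheduling a subset of solvers from the k nearest neighbors can be illustrated with a minimal sketch. It is not the sunny-as2 implementation: the function name `sunny_like_schedule`, the data layout, the Euclidean distance, and the proportional time split are all assumptions made for illustration, and refinements such as a backup solver or feature selection are left out.

```python
import math
from itertools import combinations


def sunny_like_schedule(features, runtimes, query, k, timeout):
    """Return a list of (solver, time_slot) pairs for the query instance.

    features: list of feature vectors of the training instances
    runtimes: list of dicts mapping solver name -> runtime (None if unsolved)
    All names and the data layout are hypothetical.
    """
    # Pick the k training instances closest to the query (Euclidean distance).
    order = sorted(range(len(features)),
                   key=lambda i: math.dist(features[i], query))
    neighbors = order[:k]

    # For each solver, the set of neighbors it solves within the timeout.
    solvers = sorted(runtimes[0])
    solved = {
        s: {i for i in neighbors
            if runtimes[i][s] is not None and runtimes[i][s] < timeout}
        for s in solvers
    }
    coverable = set().union(*solved.values())
    if not coverable:
        return []

    # Smallest subset of solvers that solves every solvable neighbor.
    for size in range(1, len(solvers) + 1):
        for subset in combinations(solvers, size):
            if set().union(*(solved[s] for s in subset)) == coverable:
                total = sum(len(solved[s]) for s in subset)
                ranked = sorted(subset, key=lambda s: -len(solved[s]))
                # Split the timeout in proportion to neighbors solved.
                return [(s, timeout * len(solved[s]) / total) for s in ranked]
    return []


features = [(0, 0), (0, 1), (1, 0), (10, 10)]
runtimes = [
    {"A": 1.0, "B": None},
    {"A": None, "B": 2.0},
    {"A": 3.0, "B": None},
    {"A": None, "B": None},
]
schedule = sunny_like_schedule(features, runtimes, query=(0, 0), k=3, timeout=100)
print(schedule)
```

In this toy portfolio, solver A solves two of the three neighbors and solver B solves the remaining one, so both are scheduled and A receives the larger share of the time budget.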

Machine Learning Approach to Dysphonia Detection

Applied Sciences ◽

10.3390/app8101927 ◽

2018 ◽

Vol 8 (10) ◽

pp. 1927 ◽

Cited By ~ 1

Author(s):

Zuzana Dankovičová ◽

Dávid Sovák ◽

Peter Drotár ◽

Liberios Vokorokos

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Nearest Neighbors ◽

Classification Model ◽

Support Vector ◽

Learning Approach ◽

K Nearest Neighbors ◽

Machine Learning Methods ◽

Machine Learning Approach ◽

Speech Features

This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used these to train the classification model. As classifiers, three state-of-the-art methods were used: K-nearest neighbors, random forests, and support vector machines. We analyzed the performance of the classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with a classification accuracy as high as 91.3%.

A decomposition of multi-dimensional point-sets with applications to k-nearest-neighbors and n-body potential fields (preliminary version)

Proceedings of the twenty-fourth annual ACM symposium on Theory of computing - STOC '92 ◽

10.1145/129712.129766 ◽

1992 ◽

Cited By ~ 24

Author(s):

Paul B. Callahan ◽

S. Rao Kosaraju

Keyword(s):

Nearest Neighbors ◽

Potential Fields ◽

Point Sets ◽

K Nearest Neighbors ◽

Preliminary Version ◽

Body Potential

Localized Incomplete Multiple Kernel k-means

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/454 ◽

2018 ◽

Cited By ~ 4

Author(s):

Xinzhong Zhu ◽

Xinwang Liu ◽

Miaomiao Li ◽

En Zhu ◽

Li Liu ◽

...

Keyword(s):

Iterative Algorithm ◽

Local Structure ◽

Optimization Problem ◽

Clustering Algorithm ◽

Recent Literature ◽

State Of The Art ◽

Nearest Neighbors ◽

K Nearest Neighbors ◽

Multiple Kernel ◽

Benchmark Datasets

The recently proposed multiple kernel k-means with incomplete kernels (MKKM-IK) optimally integrates a group of pre-specified incomplete kernel matrices to improve clustering performance. Though it demonstrates promising performance in various applications, we observe that it does not sufficiently consider the local structure among data and indiscriminately forces all pairwise sample similarity to equally align with their ideal similarity values. This could make the incomplete kernels less effectively imputed, and in turn adversely affect the clustering performance. In this paper, we propose a novel localized incomplete multiple kernel k-means (LI-MKKM) algorithm to address this issue. Unlike the existing MKKM-IK, LI-MKKM only requires the similarity of a sample to its k-nearest neighbors to align with their ideal similarity values. This helps the clustering algorithm to focus on closer sample pairs that should stay together and avoids involving unreliable similarity evaluation for farther sample pairs. We carefully design a three-step iterative algorithm to solve the resultant optimization problem and theoretically prove its convergence. Comprehensive experiments on eight benchmark datasets demonstrate that our algorithm significantly outperforms the state-of-the-art comparable algorithms proposed in the recent literature, verifying the advantage of considering local structure.

k-Nearest Neighbors by Means of Sequence to Sequence Deep Neural Networks and Memory Networks

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/442 ◽

2021 ◽

Author(s):

Yiming Xu ◽

Diego Klabjan

Keyword(s):

Neural Network ◽

Deep Neural Networks ◽

State Of The Art ◽

Sampling Technique ◽

Nearest Neighbors ◽

Classification Models ◽

Feed Forward Neural Network ◽

K Nearest Neighbors ◽

Out Of Sample ◽

Memory Network

k-Nearest Neighbors is one of the most fundamental but effective classification models. In this paper, we propose two families of models built on a sequence to sequence model and a memory network model to mimic the k-Nearest Neighbors model. These models generate a sequence of labels, a sequence of out-of-sample feature vectors and a final label for classification, and thus they can also function as oversamplers. We also propose 'out-of-core' versions of our models which assume that only a small portion of data can be loaded into memory. Computational experiments show that our models on structured datasets outperform k-Nearest Neighbors, a feed-forward neural network, XGBoost, LightGBM, random forest and a memory network, because our models must produce additional output and not just the label. On image and text datasets, the performance of our model is close to many state-of-the-art deep models. As an oversampler on imbalanced datasets, the sequence to sequence kNN model often outperforms Synthetic Minority Over-sampling Technique and Adaptive Synthetic Sampling.

Efficiently Producing the K Nearest Neighbors in the Skyline on Vertically Partitioned Tables

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2013040104 ◽

2013 ◽

Vol 3 (2) ◽

pp. 58-77

Author(s):

Marlene Goncalves ◽

Maria-Esther Vidal

Keyword(s):

State Of The Art ◽

Property Values ◽

Hybrid Approach ◽

Nearest Neighbors ◽

Query Point ◽

K Nearest Neighbors ◽

Speed Up ◽

Set Of Points ◽

Nearest Points

Criteria that induce a Skyline naturally represent a user's preference conditions, which are useful for discarding irrelevant data in large datasets. However, in the presence of high-dimensional Skyline spaces, the size of the Skyline can still be very large, making it infeasible for users to process this set of points. To identify the best points among the Skyline, the Top-k Skyline approach has been proposed. Top-k Skyline uses discriminatory criteria to induce a total order of the points that comprise the Skyline, and recognizes the best or top-k points based on these criteria. In this article the authors model queries as multi-dimensional points that represent bounds of VPT (Vertically Partitioned Table) property values, and datasets as sets of multi-dimensional points; the problem is to locate the k best tuples in the dataset whose distance to the query is minimized. A tuple is among the k best tuples whenever there is no other tuple that is better in all dimensions and closer to the query point, i.e., the k best tuples correspond to the k nearest points to the query that are incomparable or belong to the skyline. The authors name these tuples the k nearest neighbors in the skyline. The authors propose a hybrid approach that combines Skyline and Top-k solutions and develop two algorithms: TKSI and k-NNSkyline. The proposed algorithms identify, among the skyline tuples, the k ones with the lowest values of the distance metric, i.e., the k nearest neighbors to the multi-dimensional query that are incomparable. The authors empirically study the performance and quality of TKSI and k-NNSkyline. The authors’ experimental results show that TKSI is able to speed up the computation of the Top-k Skyline by at least 50% with respect to the state-of-the-art solutions, whenever k is smaller than the size of the Skyline. Additionally, the authors’ results suggest that k-NNSkyline outperforms existing solutions by up to three orders of magnitude.
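The notion of the k nearest neighbors in the skyline can be made concrete with a naive baseline. The sketch below is neither TKSI nor k-NNSkyline; it simply computes the skyline by pairwise comparison and then sorts by distance. It assumes that smaller values are better in every dimension and that distance is Euclidean, and all function names are hypothetical.

```python
import math


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumption: smaller values are better in every dimension.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Points that no other point dominates (quadratic-time baseline)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def k_nearest_in_skyline(points, query, k):
    """The k skyline points closest to the query point."""
    return sorted(skyline(points), key=lambda p: math.dist(p, query))[:k]


points = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
print(skyline(points))
print(k_nearest_in_skyline(points, query=(2, 3), k=2))
```

Here (3, 3) and (6, 6) are dominated by (2, 2), so only the remaining three points compete on distance to the query.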

Stochastic Constraint Programming with And-Or Branch-and-Bound

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/76 ◽

2017 ◽

Cited By ~ 3

Author(s):

Behrouz Babaki ◽

Tias Guns ◽

Luc de Raedt

Keyword(s):

Constraint Programming ◽

Possible Worlds ◽

Probabilistic Models ◽

Graphical Model ◽

State Of The Art ◽

Search Tree ◽

Decision Problems ◽

Processing Times ◽

Multi Stage ◽

Model Community

Complex multi-stage decision making problems often involve uncertainty, for example, regarding demand or processing times. Stochastic constraint programming was proposed as a way to formulate and solve such decision problems, involving arbitrary constraints over both decision and random variables. What stochastic constraint programming still lacks is support for the use of factorized probabilistic models that are popular in the graphical model community. We show how a state-of-the-art probabilistic inference engine can be integrated into standard constraint solvers. The resulting approach searches over the And-Or search tree directly, and we investigate tight bounds on the expected utility objective. This significantly improves search efficiency and outperforms scenario-based methods that ground out the possible worlds.

Food Detection Using Histogram of Oriented Gradient (HOG) as Feature Extraction and K-Nearest Neighbors (K-NN) as Classifier

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2020/3191.52020 ◽

2020 ◽

Vol 9 (1.5) ◽

pp. 219-225

Author(s):

Diah Rahmadani

Keyword(s):

Feature Extraction ◽

Nearest Neighbors ◽

K Nearest Neighbors ◽

Histogram Of Oriented Gradient ◽

Food Detection

The Implementation of Subspace Outlier Detection in K-Nearest Neighbors to Improve Accuracy in Bank Marketing Data

International Journal of Emerging Trends in Engineering Research ◽

10.30534/ijeter/2020/44822020 ◽

2020 ◽

Vol 8 (2) ◽

pp. 545-550

Author(s):

Dimas Aryo Anggoro

Keyword(s):

Outlier Detection ◽

Nearest Neighbors ◽

K Nearest Neighbors ◽

Improve Accuracy ◽

Marketing Data ◽

Bank Marketing

Evolutionary Feature Scaling in K-Nearest Neighbors Based on Label Dispersion Minimization

2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC) ◽

10.1109/smc42975.2020.9282834 ◽

2020 ◽

Author(s):

Suryoday Basak ◽

Manfred Huber

Keyword(s):

Nearest Neighbors ◽

K Nearest Neighbors ◽

Feature Scaling

Tropical Balls and Its Applications to K Nearest Neighbor over the Space of Phylogenetic Trees

Mathematics ◽

10.3390/math9070779 ◽

2021 ◽

Vol 9 (7) ◽

pp. 779

Author(s):

Ruriko Yoshida

Keyword(s):

Supervised Learning ◽

Phylogenetic Trees ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

High Dimensional ◽

Learning Method ◽

Dimensional Vector ◽

K Nearest Neighbor ◽

K Nearest Neighbors

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. Then we discuss their application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
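A small sketch may help show how KNN looks under the tropical metric. It uses the standard tropical distance on the tropical projective torus, d(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), with a plain majority vote; the function names and the toy data are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def tropical_distance(u, v):
    """Tropical metric: the spread of the coordinate-wise differences.

    Adding the same constant to every coordinate of u or v leaves it unchanged.
    """
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def knn_classify(train, labels, x, k):
    """Majority label among the k training vectors closest to x."""
    nearest = sorted(range(len(train)),
                     key=lambda i: tropical_distance(train[i], x))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


train = [(0, 0, 0), (1, 1, 1.5), (0, 5, 9), (0, 6, 9)]
labels = ["a", "a", "b", "b"]
print(tropical_distance((0, 0, 0), (1, 1, 1)))
print(knn_classify(train, labels, x=(2, 2, 2), k=3))
```

The K nearest vectors are exactly those inside the smallest tropical ball centered at `x` that contains K training vectors, which is the view of KNN described in the abstract.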
