scholarly journals sunny-as2: Enhancing SUNNY for Algorithm Selection

2021 ◽  
Vol 72 ◽  
pp. 329-376
Author(s):  
Tong Liu ◽  
Roberto Amadini ◽  
Maurizio Gabbrielli ◽  
Jacopo Mauro

SUNNY is an Algorithm Selection (AS) technique originally tailored for Constraint Programming (CP). SUNNY is based on the k-nearest neighbors algorithm and enables one to schedule, from a portfolio of solvers, a subset of solvers to be run on a given CP problem. This approach has proved to be effective for CP problems. In 2015, the ASlib benchmarks were released for comparing AS systems coming from disparate fields (e.g., ASP, QBF, and SAT) and SUNNY was extended to deal with generic AS problems. This led to the development of sunny-as, a prototypical algorithm selector based on SUNNY for ASlib scenarios. A major improvement of sunny-as, called sunny-as2, was then submitted to the Open Algorithm Selection Challenge (OASC) in 2017, where it turned out to be the best approach for the runtime minimization of decision problems. In this work we present the technical advancements of sunny-as2, by detailing through several empirical evaluations and by providing new insights. Its current version, built on the top of the preliminary version submitted to OASC, is able to outperform sunny-as and other state-of-the-art AS methods, including those who did not attend the challenge.

2018 ◽  
Vol 8 (10) ◽  
pp. 1927 ◽  
Author(s):  
Zuzana Dankovičová ◽  
Dávid Sovák ◽  
Peter Drotár ◽  
Liberios Vokorokos

This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used these to train the classification model. As classifiers, three state-of-the-art methods were used: K-nearest neighbors, random forests, and support vector machine. We analyzed the performance of classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with as high as a 91.3% classification accuracy.


Author(s):  
Xinzhong Zhu ◽  
Xinwang Liu ◽  
Miaomiao Li ◽  
En Zhu ◽  
Li Liu ◽  
...  

The recently proposed multiple kernel k-means with incomplete kernels (MKKM-IK) optimally integrates a group of pre-specified incomplete kernel matrices to improve clustering performance. Though it demonstrates promising performance in various applications, we observe that it does not \emph{sufficiently  consider the local structure among data and indiscriminately forces all pairwise sample similarity to equally align with their ideal similarity values}. This could make the incomplete kernels less effectively imputed, and in turn adversely affect the clustering performance. In this paper, we propose a novel localized incomplete multiple kernel k-means (LI-MKKM) algorithm to address this issue. Different from existing MKKM-IK, LI-MKKM only requires the similarity of a sample to its k-nearest neighbors to align with their ideal similarity values. This helps the clustering algorithm to focus on closer sample pairs that shall stay together and avoids involving unreliable similarity evaluation for farther sample pairs. We carefully design a three-step iterative algorithm to solve the resultant optimization problem and theoretically prove its convergence. Comprehensive experiments on eight benchmark datasets demonstrate that our algorithm significantly outperforms the state-of-the-art comparable algorithms proposed in the recent literature, verifying the advantage of considering local structure.


Author(s):  
Yiming Xu ◽  
Diego Klabjan

k-Nearest Neighbors is one of the most fundamental but effective classification models. In this paper, we propose two families of models built on a sequence to sequence model and a memory network model to mimic the k-Nearest Neighbors model, which generate a sequence of labels, a sequence of out-of-sample feature vectors and a final label for classification, and thus they could also function as oversamplers. We also propose 'out-of-core' versions of our models which assume that only a small portion of data can be loaded into memory. Computational experiments show that our models on structured datasets outperform k-Nearest Neighbors, a feed-forward neural network, XGBoost, lightGBM, random forest and a memory network, due to the fact that our models must produce additional output and not just the label. On image and text datasets, the performance of our model is close to many state-of-the-art deep models. As an oversampler on imbalanced datasets, the sequence to sequence kNN model often outperforms Synthetic Minority Over-sampling Technique and Adaptive Synthetic Sampling.


2013 ◽  
Vol 3 (2) ◽  
pp. 58-77
Author(s):  
Marlene Goncalves ◽  
Maria-Esther Vidal

Criteria that induce a Skyline naturally represent user's preference conditions useful to discard irrelevant data in large datasets. However, in the presence of high-dimensional Skyline spaces, the size of the Skyline can still be very large, making unfeasible for users to process this set of points. To identify the best points among the Skyline, the Top-k Skyline approach has been proposed. Top-k Skyline uses discriminatory criteria to induce a total order of the points that comprise the Skyline, and recognizes the best or top-k points based on these criteria. In this article the authors model queries as multi-dimensional points that represent bounds of VPT (Vertically Partitioned Table) property values, and datasets as sets of multi-dimensional points; the problem is to locate the k best tuples in the dataset whose distance to the query is minimized. A tuple is among the k best tuples whenever there is not another tuple that is better in all dimensions, and that is closer to the query point, i.e., the k best tuples correspond to the k nearest points to the query that are incomparable or belong to the skyline. The authors name these tuples the k nearest neighbors in the skyline. The authors propose a hybrid approach that combines Skyline and Top-k solutions and develop two algorithms: TKSI and k-NNSkyline. The proposed algorithms identify among the skyline tuples, the k ones with the lowest values of the distance metric, i.e., the k nearest neighbors to the multi-dimensional query that are incomparable. Empirically, we study the performance and quality of TKSI and k-NNSkyline. The authors’ experimental results show the TKSI is able to speed up the computation of the Top-k Skyline in at least 50% percent with respect to the state-of-the-art solutions, whenever k is smaller than the size of the Skyline. Additionally, the authors’ results suggest that k-NNSkyline outperforms existing solutions by up to three orders of magnitude.


Author(s):  
Behrouz Babaki ◽  
Tias Guns ◽  
Luc de Raedt

Complex multi-stage decision making problems often involve uncertainty, for example, regarding demand or processing times. Stochastic constraint programming was proposed as a way to formulate and solve such decision problems, involving arbitrary constraints over both decision and random variables. What stochastic constraint programming still lacks is support for the use of factorized probabilistic models that are popular in the graphical model community. We show how a state-of-the-art probabilistic inference engine can be integrated into standard constraint solvers. The resulting approach searches over the And-Or search tree directly, and we investigate tight bounds on the expected utility objective. This significantly improves search efficiency and outperforms scenario-based methods that ground out the possible worlds.


Mathematics ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 779
Author(s):  
Ruriko Yoshida

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. Then we discuss its application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.


Sign in / Sign up

Export Citation Format

Share Document