Neural Networks for Predicting Algorithm Runtime Distributions

Many state-of-the-art algorithms for solving hard combinatorial problems in artificial intelligence (AI) include elements of stochasticity that lead to high variations in runtime, even for a fixed problem instance. Knowledge about the resulting runtime distributions (RTDs) of algorithms on given problem instances can be exploited in various meta-algorithmic procedures, such as algorithm selection, portfolios, and randomized restarts. Previous work has shown that machine learning can be used to individually predict mean, median and variance of RTDs. To establish a new state-of-the-art in predicting RTDs, we demonstrate that the parameters of an RTD should be learned jointly and that neural networks can do this well by directly optimizing the likelihood of an RTD given runtime observations. In an empirical study involving five algorithms for SAT solving and AI planning, we show that neural networks predict the true RTDs of unseen instances better than previous methods, and can even do so when only few runtime observations are available per training instance.

Download Full-text

Automated Algorithm Selection: Survey and Perspectives

Evolutionary Computation ◽

10.1162/evco_a_00242 ◽

2019 ◽

Vol 27 (1) ◽

pp. 3-45 ◽

Cited By ~ 45

Author(s):

Pascal Kerschke ◽

Holger H. Hoos ◽

Frank Neumann ◽

Heike Trautmann

Keyword(s):

State Of The Art ◽

Combinatorial Problems ◽

The State ◽

Problem Instance ◽

Ai Planning ◽

Algorithm Selection ◽

Automated Algorithm ◽

Continuous Problems ◽

Algorithm Configuration ◽

The One

It has long been observed that for practically any computational problem that has been intensely studied, different instances are best solved using different algorithms. This is particularly pronounced for computationally hard problems, where in most cases, no single algorithm defines the state of the art; instead, there is a set of algorithms with complementary strengths. This performance complementarity can be exploited in various ways, one of which is based on the idea of selecting, from a set of given algorithms, for each problem instance to be solved the one expected to perform best. The task of automatically selecting an algorithm from a given set is known as the per-instance algorithm selection problem and has been intensely studied over the past 15 years, leading to major improvements in the state of the art in solving a growing number of discrete combinatorial problems, including propositional satisfiability and AI planning. Per-instance algorithm selection also shows much promise for boosting performance in solving continuous and mixed discrete/continuous optimisation problems. This survey provides an overview of research in automated algorithm selection, ranging from early and seminal works to recent and promising application areas. Different from earlier work, it covers applications to discrete and continuous problems, and discusses algorithm selection in context with conceptually related approaches, such as algorithm configuration, scheduling, or portfolio selection. Since informative and cheaply computable problem instance features provide the basis for effective per-instance algorithm selection systems, we also provide an overview of such features for discrete and continuous problems. Finally, we provide perspectives on future work in the area and discuss a number of open research challenges.

Download Full-text

PPDIST, global 0.1° daily and 3-hourly precipitation probability distribution climatologies for 1979–2018

Scientific Data ◽

10.1038/s41597-020-00631-x ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Hylke E. Beck ◽

Seth Westra ◽

Jackson Tan ◽

Florian Pappenberger ◽

George J. Huffman ◽

...

Keyword(s):

Neural Networks ◽

Probability Distribution ◽

State Of The Art ◽

Intertropical Convergence Zone ◽

Coefficient Of Determination ◽

Current State ◽

Peak Intensity ◽

Global Land ◽

The Neural Networks ◽

Better Than

Abstract We introduce the Precipitation Probability DISTribution (PPDIST) dataset, a collection of global high-resolution (0.1°) observation-based climatologies (1979–2018) of the occurrence and peak intensity of precipitation (P) at daily and 3-hourly time-scales. The climatologies were produced using neural networks trained with daily P observations from 93,138 gauges and hourly P observations (resampled to 3-hourly) from 11,881 gauges worldwide. Mean validation coefficient of determination (R2) values ranged from 0.76 to 0.80 for the daily P occurrence indices, and from 0.44 to 0.84 for the daily peak P intensity indices. The neural networks performed significantly better than current state-of-the-art reanalysis (ERA5) and satellite (IMERG) products for all P indices. Using a 0.1 mm 3 h−1 threshold, P was estimated to occur 12.2%, 7.4%, and 14.3% of the time, on average, over the global, land, and ocean domains, respectively. The highest P intensities were found over parts of Central America, India, and Southeast Asia, along the western equatorial coast of Africa, and in the intertropical convergence zone. The PPDIST dataset is available via www.gloh2o.org/ppdist.

Download Full-text

Sentiment Classification Using Convolutional Neural Networks

Applied Sciences ◽

10.3390/app9112347 ◽

2019 ◽

Vol 9 (11) ◽

pp. 2347 ◽

Cited By ~ 18

Author(s):

Hannah Kim ◽

Young-Seob Jeong

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Networks ◽

Text Classification ◽

State Of The Art ◽

Sentiment Classification ◽

Learning Models ◽

Text Data ◽

Textual Data ◽

Better Than

As the number of textual data is exponentially increasing, it becomes more important to develop models to analyze the text data automatically. The texts may contain various labels such as gender, age, country, sentiment, and so forth. Using such labels may bring benefits to some industrial fields, so many studies of text classification have appeared. Recently, the Convolutional Neural Network (CNN) has been adopted for the task of text classification and has shown quite successful results. In this paper, we propose convolutional neural networks for the task of sentiment classification. Through experiments with three well-known datasets, we show that employing consecutive convolutional layers is effective for relatively longer texts, and our networks are better than other state-of-the-art deep learning models.

Download Full-text

Better Prediction of Mutation Score

10.36227/techrxiv.14905032 ◽

2021 ◽

Author(s):

Yossi Gil ◽

Dor Ma’ayan

Keyword(s):

Neural Networks ◽

Real World ◽

State Of The Art ◽

Future Research ◽

World Systems ◽

Reliable Measurement ◽

Mutation Score ◽

Class Level ◽

Effectiveness Prediction ◽

Better Than

<div><div><div><p>Mutation score is widely accepted to be a reliable measurement for the effectiveness of software tests. Recent studies, however, show that mutation analysis is extremely costly and hard to use in practice. We present a novel direct prediction model of mutation score using neural networks. Relying solely on static code features that do not require generation of mutants or execution of the tests, we predict mutation score with an accuracy better than a quintile. When we include statement coverage as a feature, our accuracy rises to about a decile. Using a similar approach, we also improve the state-of-the-art results for binary test effectiveness prediction and introduce an intuitive, easy-to-calculate set of features superior to previously studied sets. We also publish the largest dataset of test-class level mutation score and static code features data to date, for future research. Finally, we discuss how our approach could be integrated into real-world systems, IDEs, CI tools, and testing frameworks.</p></div></div></div>

Download Full-text

When algorithm selection meets Bi-linear Learning to Rank: accuracy and inference time trade off with candidates expansion

International Journal of Data Science and Analytics ◽

10.1007/s41060-020-00229-x ◽

2020 ◽

Author(s):

Jing Yuan ◽

Christian Geissler ◽

Weijia Shao ◽

Andreas Lommatzsch ◽

Brijnesh Jain

Keyword(s):

Optimal Algorithm ◽

Learning To Rank ◽

Mapping Function ◽

Evaluation Process ◽

Objective Evaluation ◽

Problem Instance ◽

Cost Calculation ◽

Algorithm Selection ◽

Trade Off ◽

Problem Instances

Abstract Algorithm selection (AS) tasks are dedicated to find the optimal algorithm for an unseen problem instance. With the knowledge of problem instances’ meta-features and algorithms’ landmark performances, Machine Learning (ML) approaches are applied to solve AS problems. However, the standard training process of benchmark ML approaches in AS either needs to train the models specifically for every algorithm or relies on the sparse one-hot encoding as the algorithms’ representation. To escape these intermediate steps and form the mapping function directly, we borrow the learning to rank framework from Recommender System (RS) and embed the bi-linear factorization to model the algorithms’ performances in AS. This Bi-linear Learning to Rank (BLR) has proven to work with competence in some AS scenarios and thus is also proposed as a benchmark approach. Thinking from the evaluation perspective in the modern AS challenges, precisely predicting the performance is usually the measuring goal. Though approaches’ inference time also needs to be counted for the running time cost calculation, it’s always overlooked in the evaluation process. The multi-objective evaluation metric Adjusted Ratio of Root Ratios (A3R) is therefore advocated in this paper to balance the trade-off between the accuracy and inference time in AS. Concerning A3R, BLR outperforms other benchmarks when expanding the candidates range to TOP3. The better effect of this candidates expansion results from the cumulative optimum performance during the AS process. We take the further step in the experimentation to represent the advantage of such TOPK expansion, and illustrate that such expansion can be considered as the supplement for the convention of TOP1 selection during the evaluation process.

Download Full-text

AutoFolio: An Automatically Configured Algorithm Selector

Journal of Artificial Intelligence Research ◽

10.1613/jair.4726 ◽

2015 ◽

Vol 53 ◽

pp. 745-778 ◽

Cited By ~ 32

Author(s):

Marius Lindauer ◽

Holger H. Hoos ◽

Frank Hutter ◽

Torsten Schaub

Keyword(s):

State Of The Art ◽

Machine Learning Techniques ◽

Problem Instance ◽

Algorithm Selection ◽

The Core ◽

Automated Algorithm ◽

Learning Techniques ◽

Algorithm Configuration ◽

Optimal Values ◽

The One

Algorithm selection (AS) techniques -- which involve choosing from a set of algorithms the one expected to solve a given problem instance most efficiently -- have substantially improved the state of the art in solving many prominent AI problems, such as SAT, CSP, ASP, MAXSAT and QBF. Although several AS procedures have been introduced, not too surprisingly, none of them dominates all others across all AS scenarios. Furthermore, these procedures have parameters whose optimal values vary across AS scenarios. This holds specifically for the machine learning techniques that form the core of current AS procedures, and for their hyperparameters. Therefore, to successfully apply AS to new problems, algorithms and benchmark sets, two questions need to be answered: (i) how to select an AS approach and (ii) how to set its parameters effectively. We address both of these problems simultaneously by using automated algorithm configuration. Specifically, we demonstrate that we can automatically configure claspfolio 2, which implements a large variety of different AS approaches and their respective parameters in a single, highly-parameterized algorithm framework. Our approach, dubbed AutoFolio, allows researchers and practitioners across a broad range of applications to exploit the combined power of many different AS methods. We demonstrate AutoFolio can significantly improve the performance of claspfolio 2 on 8 out of the 13 scenarios from the Algorithm Selection Library, leads to new state-of-the-art algorithm selectors for 7 of these scenarios, and matches state-of-the-art performance (statistically) on all other scenarios. Compared to the best single algorithm for each AS scenario, AutoFolio achieves average speedup factors between 1.3 and 15.4.

Download Full-text

AutoFolio: An Automatically Configured Algorithm Selector (Extended Abstract)

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/715 ◽

2017 ◽

Cited By ~ 4

Author(s):

Marius Lindauer ◽

Frank Hutter ◽

Holger H. Hoos ◽

Torsten Schaub

Keyword(s):

High Performance ◽

State Of The Art ◽

The State ◽

Problem Instance ◽

Algorithm Selection ◽

Algorithm Configuration ◽

Optimal Values ◽

Art Performance ◽

The One

Algorithm selection (AS) techniques -- which involve choosing from a set of algorithms the one expected to solve a given problem instance most efficiently -- have substantially improved the state of the art in solving many prominent AI problems, such as SAT, CSP, ASP, MAXSAT and QBF. Although several AS procedures have been introduced, not too surprisingly, none of them dominates all others across all AS scenarios. Furthermore, these procedures have parameters whose optimal values vary across AS scenarios. In this extended abstract of our 2015 JAIR article of the same title, we summarize AutoFolio, which uses an algorithm configuration procedure to automatically select an AS approach and optimize its parameters for a given AS scenario. AutoFolio allows researchers and practitioners across a broad range of applications to exploit the combined power of many different AS methods and to automatically construct high-performance algorithm selectors. We demonstrate that AutoFolio was able to produce new state-of-the-art algorithm selectors for 7 well-studied AS scenarios and matches state-of-the-art performance statistically on all other scenarios. Compared to the best single algorithm for each AS scenario, AutoFolio achieved average speedup factors between 1.3 and 15.4.

Download Full-text

Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically Based Features

Molecules ◽

10.3390/molecules26051285 ◽

2021 ◽

Vol 26 (5) ◽

pp. 1285

Author(s):

Alfonso T. García-Sosa

Keyword(s):

Neural Networks ◽

Androgen Receptor ◽

Logistic Model ◽

Deep Neural Networks ◽

State Of The Art ◽

Protein Structures ◽

Training Set ◽

Multivariate Logistic Model ◽

And Training ◽

Better Than

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems and leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data of human, chimp, and rat effects by chemicals have been used to build machine-learning classifiers and regressors and to evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics and by their interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available on github.

Download Full-text

Convolutional Neural Networks for Classifying Melanoma Images

10.1101/2020.05.22.110973 ◽

2020 ◽

Author(s):

Abhinav Sagar ◽

J Dheeba

Keyword(s):

Neural Networks ◽

Skin Cancer ◽

Transfer Learning ◽

Convolutional Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Cancer Classification ◽

Previous State ◽

Better Than

AbstractIn this work, we address the problem of skin cancer classification using convolutional neural networks. A lot of cancer cases early on are misdiagnosed as something else leading to severe consequences including the death of a patient. Also there are cases in which patients have some other problems and doctors think they might have skin cancer. This leads to unnecessary time and money spent for further diagnosis. In this work, we address both of the above problems using deep neural networks and transfer learning architecture. We have used publicly available ISIC databases for both training and testing our model. Our work achieves an accuracy of 0.935, precision of 0.94, recall of 0.77, F1 score of 0.85 and ROC-AUC of 0.861 which is better than the previous state of the art approaches.

Download Full-text

Integrating Pseudo-Boolean Constraint Reasoning in Multi-Objective Evolutionary Algorithms

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/165 ◽

2019 ◽

Cited By ~ 1

Author(s):

Miguel Terra-Neves ◽

Inês Lynce ◽

Vasco Manquinho

Keyword(s):

Evolutionary Algorithms ◽

Optimization Problems ◽

State Of The Art ◽

Solution Space ◽

Problem Instance ◽

Multi Objective ◽

Constraint Reasoning ◽

Problem Instances ◽

Quality Of The Population

Constraint-based reasoning methods thrive in solving problem instances with a tight solution space. On the other hand, evolutionary algorithms are usually effective when it is not hard to satisfy the problem constraints. This dichotomy has been observed in many optimization problems. In the particular case of Multi-Objective Combinatorial Optimization (MOCO), new recently proposed constraint-based algorithms have been shown to outperform more established evolutionary approaches when a given problem instance is hard to satisfy. In this paper, we propose the integration of constraint-based procedures in evolutionary algorithms for solving MOCO. First, a new core-based smart mutation operator is applied to individuals that do not satisfy all problem constraints. Additionally, a new smart improvement operator based on Minimal Correction Subsets is used to improve the quality of the population. Experimental results clearly show that the integration of these operators greatly improves multi-objective evolutionary algorithms MOEA/D and NSGAII. Moreover, even on problem instances with a tight solution space, the newly proposed algorithms outperform the state-of-the-art constraint-based approaches for MOCO.

Download Full-text