On the Time Complexity of Algorithm Selection Hyper-Heuristics for Multimodal Optimisation

Author(s):  
Andrei Lissovoi ◽  
Pietro S. Oliveto ◽  
John Alasdair Warwicker

Selection hyper-heuristics are automated algorithm selection methodologies that choose between different heuristics during the optimisation process. Recently, selection hyper-heuristics choosing between a collection of elitist randomised local search heuristics with different neighbourhood sizes have been shown to optimise a standard unimodal benchmark function from evolutionary computation in the optimal expected runtime achievable with the available low-level heuristics. In this paper we extend our understanding to the domain of multimodal optimisation by considering a hyper-heuristic from the literature that can switch between elitist and non-elitist heuristics during the run. We first identify the range of parameters that allow the hyper-heuristic to hillclimb efficiently and prove that it can optimise a standard hillclimbing benchmark function in the best expected asymptotic time achievable by unbiased mutation-based randomised search heuristics. Afterwards, we use standard multimodal benchmark functions to highlight function characteristics where the hyper-heuristic is efficient by swiftly escaping local optima and ones where it is not. For a function class called CLIFF_d, where a new gradient of increasing fitness can be identified after escaping local optima, the hyper-heuristic is extremely efficient, while a wide range of established elitist and non-elitist algorithms are not, including the well-studied Metropolis algorithm. We complete the picture with an analysis of another standard benchmark function called JUMP_d as an example to highlight problem characteristics where the hyper-heuristic is inefficient. Yet, it still outperforms the well-established non-elitist Metropolis algorithm.
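The elitist/non-elitist switching idea can be illustrated with a minimal sketch on a CLIFF_d-style function. This is not the hyper-heuristic analysed in the paper; the per-step operator-selection rule, the parameter p_non_elitist and the exact function definition are illustrative assumptions.

```python
import random

def cliff(x, d):
    """CLIFF_d-style fitness (illustrative): OneMax up to n - d ones, then the
    fitness drops and rises again toward the all-ones global optimum, so a brief
    non-elitist phase is needed to cross the 'cliff'."""
    ones, n = sum(x), len(x)
    return ones if ones <= n - d else ones - d + 0.5

def switching_hyper_heuristic(n, d, p_non_elitist=0.1, max_iters=200_000, seed=0):
    """Toy selection hyper-heuristic: in each step it picks either an elitist
    acceptance operator (keep only non-worsening moves) or a non-elitist one
    (accept any move), applied to a single-bit-flip RLS neighbourhood."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = cliff(x, d)
    for _ in range(max_iters):
        y = x[:]
        y[rng.randrange(n)] ^= 1                 # RLS-style single-bit flip
        fy = cliff(y, d)
        non_elitist = rng.random() < p_non_elitist
        if fy >= fx or non_elitist:              # elitist vs. "accept all moves"
            x, fx = y, fy
        if sum(x) == n:                          # global optimum of CLIFF_d
            return x
    return x
```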

2017 ◽  
Vol 25 (4) ◽  
pp. 587-606 ◽  
Author(s):  
Carola Doerr ◽  
Johannes Lengler

Black-box complexity theory provides lower bounds for the runtime of black-box optimizers like evolutionary algorithms and other search heuristics and serves as an inspiration for the design of new genetic algorithms. Several black-box models covering different classes of algorithms exist, each highlighting a different aspect of the algorithms under consideration. In this work we add to the existing black-box notions a new elitist black-box model, in which algorithms are required to base all decisions solely on (the relative performance of) a fixed number of the best search points sampled so far. Our elitist model thus combines features of the ranking-based and the memory-restricted black-box models with an enforced usage of truncation selection. We provide several examples for which the elitist black-box complexity is exponentially larger than the respective complexities in all previous black-box models, thus showing that the elitist black-box complexity can be much closer to the runtime of typical evolutionary algorithms. We also introduce the concept of p-Monte Carlo black-box complexity, which measures the time it takes to optimize a problem with failure probability at most p. Even for small p, the p-Monte Carlo black-box complexity of a function class can be smaller by an exponential factor than its typically regarded Las Vegas complexity (which measures the expected time it takes to optimize the class).
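The two complexity notions contrasted above can be written compactly. The notation below (an algorithm A, the query-time random variable T(A,f), and a function class \mathcal{F}) is assumed for illustration rather than quoted from the paper.

```latex
% Sketch of the two notions (notation assumed): T(A,f) is the number of
% function evaluations until algorithm A first queries an optimum of f.
\[
  T_p^{\mathrm{MC}}(\mathcal{F})
    = \inf_{A}\;\sup_{f\in\mathcal{F}}\;
      \min\bigl\{\, t : \Pr[\,T(A,f)\le t\,] \ge 1-p \bigr\}
  \qquad\text{($p$-Monte Carlo)},
\]
\[
  T^{\mathrm{LV}}(\mathcal{F})
    = \inf_{A}\;\sup_{f\in\mathcal{F}}\; \mathbb{E}\bigl[\,T(A,f)\,\bigr]
  \qquad\text{(Las Vegas)}.
\]
```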


2013 ◽  
Vol 411-414 ◽  
pp. 1125-1128 ◽  
Author(s):  
Hong Yi Li ◽  
Meng Ye ◽  
Di Zhao

Independent Component Analysis (ICA) is a classical algorithm for extracting statistically independent, non-Gaussian signals from multi-dimensional data, and it has a wide range of applications in engineering, for instance blind source separation. The classical ICA measures non-Gaussianity by kurtosis, which has two disadvantages. Firstly, kurtosis depends strongly on the sample values and is not robust to outliers. Secondly, the algorithm often falls into local optima. To address these drawbacks, we replace kurtosis with negentropy (negative entropy), use the simulated annealing algorithm for optimization, and thus propose an improved ICA algorithm. Experimental results demonstrate that the proposed algorithm outperforms the classical ICA in robustness to outliers and convergence rate.
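A minimal sketch of the idea, combining a negentropy contrast with a simulated-annealing search for one unmixing vector, is shown below. The contrast approximation, annealing schedule and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def negentropy(y):
    """Approximate negentropy J(y) ~ (E[G(y)] - E[G(v)])**2 with G(u) = log cosh(u),
    where v is a standard Gaussian reference (E[G(v)] is roughly 0.3746)."""
    return (np.mean(np.log(np.cosh(y))) - 0.3746) ** 2

def anneal_unmixing_vector(X_whitened, iters=5000, t0=1.0, cooling=0.999, seed=0):
    """Toy simulated-annealing search for a single unmixing vector w that
    maximises the negentropy of w @ X_whitened (shape: dims x samples)."""
    rng = np.random.default_rng(seed)
    dims = X_whitened.shape[0]
    w = rng.standard_normal(dims)
    w /= np.linalg.norm(w)
    best_w, best_j = w, negentropy(w @ X_whitened)
    t = t0
    for _ in range(iters):
        cand = w + t * rng.standard_normal(dims)   # random perturbation on the sphere
        cand /= np.linalg.norm(cand)
        j_cur, j_cand = negentropy(w @ X_whitened), negentropy(cand @ X_whitened)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if j_cand > j_cur or rng.random() < np.exp(-(j_cur - j_cand) / t):
            w = cand
        if j_cand > best_j:
            best_w, best_j = cand, j_cand
        t *= cooling                               # geometric cooling schedule
    return best_w
```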


Author(s):  
Aparna Pradeep Laturkar ◽  
Sridharan Bhavani ◽  
Deepali Parag Adhyapak

Wireless Sensor Network (WSN) is an emerging technology with a wide range of applications, such as environment monitoring, industrial automation and numerous military applications; hence, WSNs are popular among researchers. A WSN has several constraints, such as restricted sensing range, limited communication range and limited battery capacity. These limitations raise issues such as coverage, connectivity, network lifetime, and scheduling and data aggregation. There are mainly three strategies for solving coverage problems, namely force-based, grid-based and computational-geometry-based approaches. PSO is a multidimensional optimization method inspired by the social flocking behaviour of birds. The basic version of PSO has the drawback of sometimes getting trapped in local optima because particles learn from each other and from past solutions. This issue is addressed by a discrete version of PSO known as Modified Discrete Binary PSO (MDBPSO), which uses a probabilistic approach. This paper discusses a performance analysis of random, grid-based MDBPSO (Modified Discrete Binary Particle Swarm Optimization), force-based VFCPSO, and combined grid- and force-based sensor deployment algorithms with respect to interval and packet size. From the results, it can be concluded that the combined grid- and force-based sensor deployment algorithm performs best on all parameters compared to the other three methods when the interval and packet size are varied.
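The probabilistic mechanism that distinguishes binary PSO from the basic version can be sketched in a few lines. This is a generic binary PSO update, not the specific MDBPSO variant evaluated in the paper; parameter values are illustrative.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def binary_pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One iteration of a generic binary PSO update: velocities follow the
    standard PSO rule, and each bit of the new position is re-sampled as
    Bernoulli(sigmoid(velocity)), i.e. the probabilistic approach the abstract
    refers to."""
    rng = rng or np.random.default_rng()
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = (rng.random(pos.shape) < sigmoid(vel)).astype(int)
    return pos, vel
```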


Author(s):  
Husein Elkeshreu ◽  
Otman Basir

Many medical applications benefit from the diversity inherent in imaging technologies to obtain more reliable diagnoses and assessments. Typically, the images obtained from multiple sources are acquired at distinct times and from different viewpoints, presenting a multitude of challenges for the registration process. Furthermore, different areas of the human body require disparate registration capabilities and degrees of accuracy. Thus, the benefit attained from the image multiplicity hinges heavily on the imaging modalities employed as well as the accuracy of the alignment process. It is no surprise, then, that a wide range of registration techniques has emerged in the last two decades. Nevertheless, it is widely acknowledged that, despite the many attempts, no registration technique has been able to deliver the required accuracy consistently under diverse operating conditions. This paper introduces a novel method for multimodal medical image registration based on exploiting the complementary and competitive nature of the algorithmic approaches behind a wide range of registration techniques. First, a thorough investigation of a wide range of registration algorithms is conducted for the purpose of understanding and quantifying their registration capabilities as well as the influence of their control parameters. Subsequently, a supervised randomized machine learning strategy is proposed for selecting the best registration algorithm for a given registration instance, and for determining the optimal control parameters for that algorithm. Several experiments have been conducted to verify the capabilities of the proposed selection strategy with respect to registration reliability, accuracy, and robustness.
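The selection step can be pictured with a small sketch in the spirit of the paper: a supervised, randomised model maps features of an image pair to the registration algorithm that performed best on similar pairs in an offline benchmark. The feature set, model choice and function names are assumptions, not the authors' exact strategy.

```python
from sklearn.ensemble import RandomForestClassifier

def train_registration_selector(pair_features, best_algorithm_labels):
    """Fit a random forest that maps features describing an image pair
    (modalities, noise level, estimated overlap, ...) to the label of the
    registration algorithm that performed best on that pair offline."""
    selector = RandomForestClassifier(n_estimators=200, random_state=0)
    selector.fit(pair_features, best_algorithm_labels)
    return selector

def select_registration_algorithm(selector, pair_feature_vector):
    """Return the recommended algorithm label for a new image pair."""
    return selector.predict([pair_feature_vector])[0]
```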


2020 ◽  
Author(s):  
AK Wills

Abstract: This paper presents a novel multi-step automated algorithm to screen for errors in longitudinal height and weight data and describes the frequency and characteristics of errors in three datasets. It also offers a taxonomy of published cleaning routines from a scoping review. Illustrative data are from three Norwegian retrospective cohorts containing 87,792 assessments (birth to 14y) from 8,428 children, each with different data pipelines, quality control and data structure. The algorithm contains 43 steps split into 3 sections: (a) dates, (b) identifiable data entry errors, and (c) biologically impossible/implausible change, and uses logic checks and cross-sectional and longitudinal routines. The WHO cross-sectional approach was also applied as a comparison. Published cleaning routines were taxonomized by their design, the marker used to screen errors, the reference threshold and how the threshold was selected. Fully automated error detection was not possible without false positives or reduced sensitivity. Error frequencies in the cohorts were 0.4%, 2.1% and 2.4% of all assessments, and the percentage of children with ≥1 error was 4.1%, 13.4% and 15.3%. In two of the datasets, more than two thirds of errors could be classified as inliers (within ±3 SD scores). Children with errors had a similar distribution of height and weight to those without errors. The WHO cross-sectional approach lacked sensitivity (range 0-55%), flagged many false positives (range: 7-100%) and biased estimates of overweight and thinness. Elements of this algorithm may have utility for built-in data entry rules, data harmonisation and sensitivity analyses. The reported error frequencies and structure may also help design more realistic simulation studies to test routines. Multi-step, distribution-wide algorithmic approaches are recommended to systematically screen and document the wide range of ways in which errors can occur and to maximise sensitivity for detecting errors; naive cross-sectional trimming as a stand-alone method may do more harm than good.
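One longitudinal screening step of the kind described can be sketched as follows. The column names and thresholds are illustrative assumptions, not those used in the 43-step algorithm.

```python
import pandas as pd

def flag_longitudinal_errors(df, max_sds_change=3.0):
    """Toy version of a single longitudinal check: within each child, flag
    height decreases between consecutive visits and implausibly large jumps
    in the height SD-score."""
    df = df.sort_values(["child_id", "age_days"]).copy()
    grouped = df.groupby("child_id")
    df["ht_drop"] = grouped["height_cm"].diff() < -1.0        # allow ~1 cm measurement noise
    df["sds_jump"] = grouped["height_sds"].diff().abs() > max_sds_change
    df["flagged"] = df["ht_drop"] | df["sds_jump"]
    return df
```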


2019 ◽  
Author(s):  
Christina B. Azodi ◽  
Andrew McCarren ◽  
Mark Roantree ◽  
Gustavo de los Campos ◽  
Shin-Han Shiu

Abstract: The usefulness of Genomic Prediction (GP) in crop and livestock breeding programs has led to efforts to develop new and improved GP approaches, including non-linear algorithms such as artificial neural networks (ANNs) (i.e. deep learning) and gradient tree boosting. However, the performance of these algorithms has not been compared in a systematic manner using a wide range of GP datasets and models. Using data for 18 traits across six plant species with different marker densities and training population sizes, we compared the performance of six linear and five non-linear algorithms, including ANNs. First, we found that hyperparameter selection was critical for all non-linear algorithms and that feature selection prior to model training was necessary for ANNs when the markers greatly outnumbered the training lines. Across all species and trait combinations, no one algorithm performed best; however, predictions based on a combination of results from multiple GP algorithms (i.e. ensemble predictions) performed consistently well. While linear and non-linear algorithms performed best for a similar number of traits, the performance of non-linear algorithms varied more between traits than that of linear algorithms. Although ANNs did not perform best for any trait, we identified strategies (i.e. feature selection, seeded starting weights) that boosted their performance to near the level of the other algorithms. These results, together with the fact that even small improvements in GP performance could accumulate into large genetic gains over the course of a breeding program, highlight the importance of algorithm selection for the prediction of trait values.
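A minimal sketch of the ensemble idea, averaging the outputs of one linear and one non-linear genomic prediction model, is given below. The model choices and hyperparameters are assumptions, not the eleven algorithms benchmarked in the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import GradientBoostingRegressor

def ensemble_gp_predict(X_train, y_train, X_test):
    """Average the predictions of a linear model (ridge regression, standing in
    for rrBLUP-type approaches) and a non-linear model (gradient tree boosting)
    to obtain an ensemble genomic prediction for the test lines."""
    models = [Ridge(alpha=1.0), GradientBoostingRegressor(random_state=0)]
    predictions = [model.fit(X_train, y_train).predict(X_test) for model in models]
    return np.mean(predictions, axis=0)
```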


Author(s):  
Christophe Giraud-Carrier ◽  
Pavel Brazdil ◽  
Carlos Soares ◽  
Ricardo Vilalta

The application of Machine Learning (ML) and Data Mining (DM) tools to classification and regression tasks has become a standard, not only in research but also in administrative agencies, commerce and industry (e.g., finance, medicine, engineering). Unfortunately, due in part to the number of available techniques and the overall complexity of the process, users facing a new data mining task must generally resort either to trial-and-error or to consultation with experts. Clearly, neither solution is completely satisfactory for the non-expert end-users who wish to access the technology more directly and cost-effectively. What is needed is an informed search process to reduce the amount of experimentation with different techniques while avoiding the pitfalls of local optima that may result from low-quality models. Informed search requires meta-knowledge, that is, knowledge about the performance of those techniques. Meta-learning provides a robust, automatic mechanism for building such meta-knowledge. One of the underlying goals of meta-learning is to understand the interaction between the mechanism of learning and the concrete contexts in which that mechanism is applicable. Meta-learning differs from base-level learning in the scope of adaptation. Whereas learning at the base level focuses on accumulating experience on a specific learning task (e.g., credit rating, medical diagnosis, mine-rock discrimination, fraud detection, etc.), learning at the meta-level is concerned with accumulating experience on the performance of multiple applications of a learning system. The meta-knowledge induced by meta-learning provides the means to inform decisions about the precise conditions under which a given algorithm, or sequence of algorithms, is better than others for a given task. While Data Mining software packages (e.g., SAS Enterprise Miner, SPSS Clementine, Insightful Miner, PolyAnalyst, KnowledgeStudio, Weka, Yale, Xelopes) provide user-friendly access to rich collections of algorithms, they generally offer no real decision support to non-expert end-users. Similarly, tools with an emphasis on advanced visualization help users understand the data (e.g., to select adequate transformations) and the models (e.g., to tweak parameters, compare results, and focus on specific parts of the model), but treat algorithm selection as a post-processing activity driven by the users rather than the system. Data mining practitioners need systems that guide them by producing explicit advice automatically. This chapter shows how meta-learning can be leveraged to provide such advice in the context of algorithm selection.
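The core mechanism can be illustrated with a minimal sketch: a meta-dataset pairs meta-features of past datasets with the algorithm that performed best on each, and a meta-model learned from it recommends an algorithm for a new task. Names and the choice of a k-NN meta-model are illustrative.

```python
from sklearn.neighbors import KNeighborsClassifier

def build_meta_learner(dataset_meta_features, best_algorithm_per_dataset, k=3):
    """Fit a k-NN meta-model: each row of dataset_meta_features describes a past
    dataset (e.g. number of instances, number of attributes, class entropy),
    and best_algorithm_per_dataset records which learner performed best on it."""
    meta_model = KNeighborsClassifier(n_neighbors=k)
    meta_model.fit(dataset_meta_features, best_algorithm_per_dataset)
    return meta_model

def recommend_algorithm(meta_model, new_dataset_meta_features):
    """Return the algorithm recommended for a new dataset's meta-features."""
    return meta_model.predict([new_dataset_meta_features])[0]
```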


2021 ◽  
Author(s):  
Jonathan Heins ◽  
Jakob Bossek ◽  
Janina Pohl ◽  
Moritz Seiler ◽  
Heike Trautmann ◽  
...  

2019 ◽  
Vol 58 (1) ◽  
Author(s):  
Yuan Cao ◽  
Heta Parmar ◽  
Ann Marie Simmons ◽  
Devika Kale ◽  
Kristy Tong ◽  
...  

ABSTRACT Molecular surveillance of rifampin-resistant Mycobacterium tuberculosis can help to monitor the transmission of the disease. The Xpert MTB/RIF Ultra assay detects mutations in the rifampin resistance-determining region (RRDR) of the rpoB gene by using melting temperature (Tm) information from 4 rpoB probes, each of which can fall in one of 9 different assay-specified Tm windows. The large amount of Tm data generated by the assay offers the possibility of an RRDR genotyping approach more accessible than whole-genome sequencing. In this study, we developed an automated algorithm to specifically identify a wide range of mutations in the rpoB RRDR by utilizing the pattern of the Tm of the 4 probes within the 9 windows generated by the Ultra assay. The algorithm builds an RRDR mutation-specific “Tm signature” reference library from a set of known mutations and then identifies the RRDR genotype of an unknown sample by measuring the Tm distances between the test sample and the reference Tm values. Validated using a set of clinical isolates, the algorithm correctly identified the RRDR genotypes of 93% of samples with a wide range of rpoB single and double mutations. Our analytical approach shows great potential for fast RRDR mutation identification and may also be used as a stand-alone method for ruling out relapse or transmission between patients. The algorithm can be further modified and optimized for higher accuracy as more Ultra data become available.
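The matching step of such a Tm-signature approach can be sketched as a nearest-reference search over 4-probe Tm vectors. The Euclidean distance, the data layout and the example values are assumptions for illustration, not the published algorithm.

```python
import numpy as np

def classify_rrdr_genotype(sample_tms, reference_library):
    """Return the genotype whose reference 4-probe Tm signature lies closest
    (Euclidean distance) to the Tm vector of the test sample."""
    sample = np.asarray(sample_tms, dtype=float)
    best_genotype, best_distance = None, np.inf
    for genotype, reference_tms in reference_library.items():
        distance = np.linalg.norm(sample - np.asarray(reference_tms, dtype=float))
        if distance < best_distance:
            best_genotype, best_distance = genotype, distance
    return best_genotype

# Hypothetical usage: library maps mutation names to 4-probe Tm signatures (values made up).
# library = {"S450L": [62.1, 71.4, 68.9, 75.2], "wild-type": [64.0, 72.5, 69.8, 76.0]}
# classify_rrdr_genotype([62.3, 71.1, 69.0, 75.0], library)  # -> "S450L"
```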

