Removing Redundancy and Reducing Fitness Evaluation Costs in Genetic Programming

2021 ◽  
Author(s):  
◽  
Phillip Lee-Ming Wong

One of the major issues in Genetic Programming (GP) is the computational effort required to run the evolution and discover a good solution. Phenomena such as program bloat (where genetic programs rapidly grow in size) can quickly exhaust available memory and slow down the evolutionary process, while the heavy cost of fitness evaluation can make problems with large amounts of available data very slow to solve. These issues may limit the tasks to which GP can appropriately be applied, as well as inhibit its use in time- or space-sensitive environments. In this thesis, we develop solutions to some of these computational cost issues in GP. First, we develop an algebraic program simplification method based on simple rules and hashing techniques, and use this method in conjunction with standard GP on a variety of tasks. Our results suggest that program simplification can lead to a significant reduction in program size without significantly changing the effectiveness of the systems in finding solution programs. Secondly, we analyse the effects of program simplification on the internal GP "building blocks" to investigate whether simplification is a destructive or constructive force. Using two models for building blocks (numerical nodes and the more complex fixed-depth subtrees), we track building blocks through GP runs on a symbolic regression problem, both with and without simplification. We find that the program simplification process can both disrupt and construct building blocks in GP populations. However, GP systems using simplification appear to retain important building blocks, and the simplification process appears to increase genetic diversity. These findings may help explain why simplification does not reduce the effectiveness of GP systems in solving tasks. Lastly, we develop two methods of reducing the cost of fitness evaluation by reducing the number of node evaluations performed. The first method is elitism avoidance, which avoids re-evaluating programs that have been placed in the population by elitism reproduction. This method reduces the CPU time for evolving solutions on six different GP tasks. The second method is a subtree caching mechanism, which stores fitness evaluations of subtrees in a cache so that they can be fetched when these subtrees are encountered in future fitness evaluations. Results suggest that this mechanism can significantly reduce both the number of node evaluations and the overall CPU time used in evolving solutions, without reducing the fitness of the solutions produced.
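To make the subtree caching idea concrete, here is a minimal Python sketch. The nested-tuple program representation, the operator set, and all function names are assumptions for illustration; the thesis' own cache design and eviction policy are not specified here. The key point it demonstrates is that each distinct subtree is evaluated over the fitness cases only once.

```python
# A minimal sketch of subtree caching for GP fitness evaluation.
# Hypothetical representation: internal nodes are tuples like
# ('+', left, right); terminals are variable names or constants.

import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def eval_on_cases(tree, cases, cache):
    """Evaluate a subtree over all fitness cases at once; the result
    vector for each distinct subtree is cached, so a subtree that
    reappears (in this or a later program) costs no node evaluations."""
    if tree in cache:                      # tuples are hashable, so the
        return cache[tree]                 # subtree itself is the key
    if not isinstance(tree, tuple):        # terminal: variable or constant
        out = tuple(c[tree] if isinstance(tree, str) else tree
                    for c in cases)
    else:
        op, left, right = tree
        l = eval_on_cases(left, cases, cache)
        r = eval_on_cases(right, cases, cache)
        out = tuple(OPS[op](a, b) for a, b in zip(l, r))
    cache[tree] = out
    return out

cases = [{'x': 1.0}, {'x': 2.0}, {'x': 3.0}]
cache = {}
prog = ('+', 'x', ('*', 'x', 'x'))         # x + x*x
print(eval_on_cases(prog, cases, cache))    # (2.0, 6.0, 12.0)
```

A shared cache across a generation is what yields the reported savings in node evaluations; in practice a real implementation would also bound the cache size.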


Author(s):  
L. M. Almutairi ◽  
S. Shetty ◽  
H. G. Momm

Evolutionary computation, in the form of genetic programming, is used to aid the information extraction process from high-resolution satellite imagery in a semi-automatic fashion. Distributing and parallelizing the task of evaluating all candidate solutions during the evolutionary process can significantly reduce the inherent computational cost of evolving solutions over large multichannel images. In this study, we present the design and implementation of a system that leverages cloud-computing technology to expedite supervised solution development in a centralized evolutionary framework. The system uses the MapReduce programming model to implement a distributed version of the existing framework on a cloud-computing platform. The proposed system has two major subsystems: (i) data preparation, the generation of random spectral indices; and (ii) distributed processing, the distributed implementation of genetic programming, which is used to spectrally distinguish the features of interest from the remaining image background in the cloud-computing environment in order to improve scalability. The proposed system reduces response time by leveraging the vast computational and storage resources of a cloud-computing environment. The results demonstrate that distributing the candidate solutions reduces the execution time by 91.58%. These findings indicate that such technology could be applied to more complex problems involving larger population sizes and numbers of generations.
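The structure being described is a classic map/reduce split over the population. The sketch below illustrates it with Python's multiprocessing pool standing in for cloud map workers; the toy data, the linear spectral-index candidates, and all names are illustrative assumptions, not the authors' system.

```python
# Sketch of distributed fitness evaluation: map = score candidates in
# parallel, reduce = gather (candidate, score) pairs for selection.

from multiprocessing import Pool

# Toy training data: (band values, label) pairs; a real run would read
# labelled pixels of a large multichannel satellite image from storage.
TRAINING_PIXELS = [((0.2, 0.7), 1.0), ((0.6, 0.3), 0.0)]

def fitness(weights):
    """Score one candidate spectral index (here a linear band combination)
    against the labelled pixels -- the expensive, embarrassingly
    parallel step that the map phase distributes."""
    error = 0.0
    for bands, label in TRAINING_PIXELS:
        prediction = sum(w * b for w, b in zip(weights, bands))
        error += (prediction - label) ** 2
    return error

def evaluate_generation(population, workers=8):
    # Map phase: workers score slices of the population in parallel.
    with Pool(workers) as pool:
        scores = pool.map(fitness, population)
    # Reduce phase: pair candidates with scores for the central
    # evolutionary loop to use in selection.
    return list(zip(population, scores))

if __name__ == "__main__":
    print(evaluate_generation([(1.0, -1.0), (0.5, 0.5)], workers=2))
```

Because fitness evaluation dominates the cost and candidates are independent, near-linear speedup is possible, consistent with the large execution-time reduction reported.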


Proceedings ◽  
2019 ◽  
Vol 33 (1) ◽  
pp. 24 ◽  
Author(s):  
Sascha Ranftl ◽  
Gian Marco Melito ◽  
Vahid Badeli ◽  
Alice Reinbacher-Köstinger ◽  
Katrin Ellermann ◽  
...  

Aortic dissection is a cardiovascular disease with a disconcertingly high mortality. When it comes to diagnosis, medical imaging techniques such as Computed Tomography, Magnetic Resonance Tomography or Ultrasound certainly do the job, but also have their shortcomings. Impedance cardiography is a standard method to monitor a patient's heart function and circulatory system by injecting electric currents and measuring voltage drops between electrode pairs attached to the human body. If such measurements could distinguish healthy from dissected aortas, one could improve clinical procedures. Experiments are quite difficult, and thus we investigate the feasibility with finite element simulations beforehand. In these simulations, we find uncertain input parameters, e.g., the electrical conductivity of blood. Inference on the state of the aorta from impedance measurements defines an inverse problem in which forward uncertainty propagation through the simulation with vanilla Monte Carlo demands a prohibitively large computational effort. To overcome this limitation, we combine two simulations: one with high fidelity and one with low fidelity, and correspondingly high and low computational cost. We use the inexpensive low-fidelity simulation to learn about the expensive high-fidelity simulation. It all boils down to a regression problem, which reduces the total computational cost after all.
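The following sketch shows one simple instance of this idea: fit a regression from low-fidelity outputs to high-fidelity outputs using a handful of paired expensive runs, then run Monte Carlo with the cheap model only. The two toy simulators stand in for the actual finite element models, and the linear regression is an assumption; the paper's surrogate may be more elaborate.

```python
# Multi-fidelity sketch: learn a lofi -> hifi correction from a few
# paired runs, then propagate input uncertainty with cheap samples.

import numpy as np

rng = np.random.default_rng(0)

def lofi(sigma_blood):      # cheap, biased stand-in simulation
    return 2.0 * sigma_blood + 0.1

def hifi(sigma_blood):      # expensive, accurate stand-in simulation
    return 2.3 * sigma_blood + 0.05

# A handful of expensive paired runs to learn the correction.
train_inputs = rng.uniform(0.4, 0.9, size=8)   # uncertain conductivity
X = np.array([lofi(s) for s in train_inputs])
Y = np.array([hifi(s) for s in train_inputs])
slope, intercept = np.polyfit(X, Y, 1)         # the regression problem

# Cheap Monte Carlo: many lofi runs, mapped through the regression.
samples = rng.uniform(0.4, 0.9, size=100_000)
predicted = slope * np.array([lofi(s) for s in samples]) + intercept
print(predicted.mean(), predicted.std())
```

The expensive model is called only eight times here, while the uncertainty statistics come from 100,000 corrected cheap evaluations, which is the source of the cost reduction.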


2020 ◽  
Author(s):  
Fangfang Zhang ◽  
Yi Mei ◽  
S Nguyen ◽  
Mengjie Zhang

Dynamic flexible job shop scheduling (DFJSS) is a valuable practical problem with applications in many fields, such as cloud computing and manufacturing. In DFJSS, machine assignment and operation sequencing decisions must be made simultaneously in dynamic environments with unpredicted events such as new job arrivals. Scheduling heuristics are ideal candidates for solving the DFJSS problem due to their efficiency and simplicity. Genetic programming (GP) has been successfully applied to automatically evolve scheduling heuristics for job shop scheduling. However, GP has a huge search space, and traditional search algorithms do not effectively utilise the information obtained during the evolutionary process. This paper proposes a new method to make better use of this information to further enhance the ability of GP. Specifically, it proposes two adaptive search strategies based on the frequency of features in promising individuals to guide GP to evolve effective rules. The proposed algorithm is examined on six different DFJSS scenarios. The results show that GP with adaptive search converges faster and achieves significantly better performance than GP without adaptive search in most scenarios, and no worse in the others, without increasing the computational cost.
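One plausible reading of a frequency-based adaptive strategy is sketched below: count how often each terminal (feature) appears in the best individuals of a generation, then bias the terminal sampling used by mutation toward frequent ones. The nested-tuple representation, the feature names, and the smoothing scheme are illustrative assumptions, not the paper's exact strategies.

```python
# Sketch: bias GP's terminal sampling by feature frequency in elites.

import random
from collections import Counter

TERMINALS = ['PT', 'WIQ', 'NPT', 'W', 'TIS']   # e.g. job-shop features

def terminal_counts(individual):
    """Count terminal occurrences in a nested-tuple program tree."""
    if not isinstance(individual, tuple):
        return Counter([individual])
    counts = Counter()
    for child in individual[1:]:               # skip the operator
        counts += terminal_counts(child)
    return counts

def adaptive_weights(elite_individuals, smoothing=1.0):
    """Turn feature frequencies in promising individuals into sampling
    weights; smoothing keeps rare features reachable."""
    freq = Counter()
    for ind in elite_individuals:
        freq += terminal_counts(ind)
    return [freq[t] + smoothing for t in TERMINALS]

def sample_terminal(weights):
    return random.choices(TERMINALS, weights=weights, k=1)[0]

elite = [('+', 'PT', ('*', 'PT', 'WIQ')), ('-', 'PT', 'W')]
weights = adaptive_weights(elite)              # PT weighted most heavily
print(sample_terminal(weights))
```

Recomputing the weights each generation is what makes the strategy adaptive: the search distribution tracks whichever features currently appear in promising rules.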


1995 ◽  
Vol 3 (4) ◽  
pp. 417-452 ◽  
Author(s):  
Hitoshi Iba ◽  
Hugo de Garis ◽  
Taisuke Sato

This paper introduces a new approach to genetic programming (GP), based on a numerical technique, which integrates a GP-based adaptive search of tree structures with a local parameter tuning mechanism employing statistical search (a system identification technique). In traditional GP, recombination can frequently disrupt building blocks, and mutation can cause abrupt changes in semantics. To overcome these difficulties, we supplement traditional GP with a local hill-climbing search using a parameter tuning procedure. More precisely, we integrate the structural search of traditional GP with a multiple regression analysis method and establish our adaptive program, called STROGANOFF (STructured Representation On Genetic Algorithms for NOn-linear Function Fitting). The fitness evaluation is based on a minimum description length (MDL) criterion, which effectively controls tree growth in GP. We demonstrate its effectiveness by solving several system identification (numerical) problems and compare the performance of STROGANOFF with traditional GP and another standard technique (radial basis functions). We then extend STROGANOFF to symbolic (non-numerical) reasoning by introducing multiple types of nodes, using a modified MDL-based selection criterion and pruning of the resultant trees. The effectiveness of this numerical approach to GP is demonstrated by its successful application to symbolic regression problems.
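An MDL-style fitness trades prediction error against tree size, which is how it controls growth. The sketch below shows a common form of such a criterion; the exact coding lengths used by STROGANOFF differ, so treat this only as an illustration of the error-versus-complexity trade-off.

```python
# Sketch of an MDL-style fitness (smaller is better): description
# length of the residuals plus a complexity term that grows with the
# number of tree nodes, penalising bloated programs.

import math

def mdl_fitness(squared_errors, num_nodes, num_samples):
    mse = sum(squared_errors) / num_samples
    # Cost of encoding the residual errors given the model.
    error_cost = 0.5 * num_samples * math.log(max(mse, 1e-12))
    # Cost of encoding the model itself; larger trees pay more.
    model_cost = 0.5 * num_nodes * math.log(num_samples)
    return error_cost + model_cost

# A 7-node tree on 3 samples: adding nodes only pays off if the
# reduction in error_cost outweighs the increase in model_cost.
print(mdl_fitness([0.1, 0.05, 0.2], num_nodes=7, num_samples=3))
```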


2021 ◽  
Author(s):  
◽  
Will Smart

Schemata and building blocks have been used in Genetic Programming (GP) in several contexts, including subroutines, theoretical analysis, and empirical analysis. Of these three, the least explored is empirical analysis. This thesis presents a powerful GP empirical analysis technique for analysing all schemata of a given form occurring in any program of a given population, at scales not previously possible for this kind of global analysis. There are many competing GP forms of schema and, rather than choosing one for analysis, the thesis defines the match-tree meta-form of schema as a general language for expressing forms of schema for use by the analysis system. This language can express most forms of schema previously used in tree-based GP. The new method can perform wide-ranging analyses on the prohibitively large set of all schemata in the programs by introducing the concepts of maximal schema, maximal program subset, representative set of schemata, and representative program subset. These structures are used to optimize the analysis, shrinking its complexity to a manageable size without sacrificing the result. Characterization experiments analyse GP populations of up to 501 60-node programs, using 11 forms of schema including rooted hyperschemata and non-rooted fragments. The new method has close to quadratic complexity in population size and quartic complexity in program size. Efficacy experiments present example analyses using the new method. The experiments offer interesting insights into the dynamics of GP runs, including fine-grained analysis of convergence and the visualization of schemata during a GP evolution. Future work will apply the many possible extensions of this new method to understanding how GP operates, including studies of convergence, building blocks, and schema fitness. This method provides a much finer-resolution microscope into the inner workings of GP and will be used to provide accessible visualizations of the evolutionary process.
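For readers unfamiliar with tree schemata, the sketch below shows the basic notion of a schema instance that such an analysis counts: a rooted template with a wildcard that matches any subtree. The thesis' match-tree meta-form is far more general than this, and the '#' wildcard and tuple representation are illustrative assumptions only.

```python
# Sketch of matching one rooted schema against program trees, with '#'
# as a "match any subtree" wildcard.

def matches(schema, tree):
    """True if the program tree is an instance of the rooted schema."""
    if schema == '#':                        # wildcard: any subtree fits
        return True
    if not isinstance(schema, tuple):        # terminal must match exactly
        return schema == tree
    if not isinstance(tree, tuple) or len(schema) != len(tree):
        return False
    return all(matches(s, t) for s, t in zip(schema, tree))

def count_instances(schema, population):
    """Occurrences of the schema at the root across a population."""
    return sum(matches(schema, p) for p in population)

pop = [('+', 'x', ('*', 'x', 'x')), ('+', 'x', 'x'), ('-', 'x', 'x')]
print(count_instances(('+', 'x', '#'), pop))   # 2
```

Enumerating and matching every schema of a given form this naively is exactly the combinatorial explosion the thesis' maximal and representative structures exist to avoid.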


Data Mining ◽  
2011 ◽  
pp. 174-190 ◽  
Author(s):  
Andries P. Engelbrecht ◽  
L. Schoeman ◽  
Sonja Rouwhorst

Genetic programming has recently been used successfully to extract knowledge in the form of IF-THEN rules. In these genetic programming approaches to knowledge extraction from data, individuals represent decision trees. The main objective of the evolutionary process is therefore to evolve the best decision tree, or classifier, to describe the data. Rules are then extracted, after convergence, from the best individual. Current genetic programming approaches to evolving decision trees are computationally complex, since individuals are initialized to complete decision trees. This chapter discusses a new approach to genetic programming for rule extraction, namely the building block approach. This approach starts with individuals consisting of only one building block, and adds new building blocks during the evolutionary process when the simplicity of the individuals cannot account for the complexity of the underlying data. Experimental results are presented and compared with those of C4.5 and CN2. The chapter shows that the building block approach achieves very good accuracies compared to those of C4.5 and CN2. It is also shown that the building block approach extracts substantially fewer rules.
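A rough sketch of the growth rule described above follows: individuals begin as a single attribute test and gain one building block only when accuracy stalls, rather than being initialized as full decision trees. The attribute names, the stall test, and the flat list-of-tests representation are illustrative assumptions, not the chapter's implementation.

```python
# Sketch of the building block approach: grow individuals one block at
# a time, only when the simple model stops improving.

import random

ATTRIBUTES = ['age', 'income', 'balance']

def new_block():
    """One building block: a single (attribute, threshold) test that
    can later be read off as part of an IF-THEN rule."""
    return (random.choice(ATTRIBUTES), random.uniform(0.0, 1.0))

def grow_if_stalled(individual, best_acc, prev_best_acc, eps=1e-3):
    """Add a building block only when the population's best accuracy
    has stopped improving, i.e. current simplicity can no longer
    account for the complexity in the data."""
    if best_acc - prev_best_acc < eps:
        return individual + [new_block()]
    return individual

tree = [new_block()]                       # evolution starts minimal
tree = grow_if_stalled(tree, best_acc=0.71, prev_best_acc=0.7095)
print(tree)                                # now two tests
```

Starting minimal and growing on demand is also why the approach tends to extract fewer rules: model complexity is only added when the data demands it.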


Author(s):  
Na Geng ◽  
Zhiting Chen ◽  
Quang A. Nguyen ◽  
Dunwei Gong

This paper focuses on the problem of robot rescue task allocation, in which multiple robots and a globally optimal algorithm are employed to plan the rescue task allocation. Accordingly, a modified particle swarm optimization (PSO) algorithm, referred to as task allocation PSO (TAPSO), is proposed. Candidate assignment solutions are represented as particles and evolved through an evolutionary process. The proposed TAPSO method features a flexible assignment decoding scheme that avoids generating infeasible assignments. The maximum number of successful tasks (survivors) is used as the fitness evaluation criterion under a scenario where the survivors' survival time is uncertain. To improve the solution, a global best solution update strategy is proposed that updates the global best solution differently in different phases, so as to balance exploration and exploitation. TAPSO is tested on different scenarios and compared with counterpart algorithms to verify its efficiency.
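One standard way to obtain a decoding scheme that never produces infeasible assignments is sketched below: each particle is a real-valued vector with one entry per task, and each entry is mapped onto a valid robot index. The decoder, the travel-time fitness, and all names are illustrative assumptions, not the exact TAPSO design.

```python
# Sketch of feasible-by-construction decoding for PSO task allocation.

import numpy as np

def decode(position, num_robots):
    """Map a continuous particle position in [0, 1)^num_tasks onto one
    robot index per task, so every decoded particle is feasible."""
    return np.minimum((position * num_robots).astype(int),
                      num_robots - 1)

def fitness(assignment, survival_time, travel_time):
    """Count tasks (survivors) whose assigned robot arrives in time --
    the maximisation criterion under uncertain survival times."""
    reached = travel_time[assignment, np.arange(len(assignment))]
    return int(np.sum(reached <= survival_time))

rng = np.random.default_rng(1)
num_robots, num_tasks = 3, 5
particle = rng.random(num_tasks)               # continuous PSO position
travel = rng.uniform(1.0, 9.0, (num_robots, num_tasks))
survive = rng.uniform(2.0, 8.0, num_tasks)     # uncertain survival times
plan = decode(particle, num_robots)
print(plan, fitness(plan, survive, travel))
```

Because the decoder clamps every entry into the valid robot range, standard continuous PSO velocity and position updates can be used unchanged, with no repair step for broken assignments.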

