Model Complexity Control in Straight Line Program Genetic Programming

Tree encodings of programs are well known for their representative power and are widely used in Genetic Programming (GP). In this paper we experiment with a new data structure, the straight line program (SLP), to represent computer programs. We describe the main features of this structure, introduce new recombination operators for GP based on SLPs, and study the Vapnik-Chervonenkis dimension of families of SLPs. Experiments were performed on symbolic regression problems. The results are encouraging and suggest that the GP approach based on SLPs consistently outperforms conventional GP based on tree-structured representations.
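
As a rough illustration of the data structure, a straight line program can be sketched as a finite sequence of instructions, each applying an operator to inputs, constants, or the results of earlier instructions. The abstract does not give the paper's function set, encoding, or output convention, so the names `OPS`, `eval_slp`, and `PROGRAM`, the tuple encoding, and the choice of the last instruction as output are all assumptions made for this sketch.

```python
import operator
from typing import List, Sequence, Tuple

# Hypothetical encoding: an argument is ("x", k) for input k, ("u", j) for the
# result of earlier instruction j, or ("c", value) for a constant.
Arg = Tuple[str, float]
Instruction = Tuple[str, Arg, Arg]

# Assumed function set; the abstract does not list the one used in the paper.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def _value(arg: Arg, inputs: Sequence[float], results: List[float]) -> float:
    kind, v = arg
    if kind == "x":
        return inputs[int(v)]
    if kind == "c":
        return float(v)
    if kind == "u":
        j = int(v)
        if j >= len(results):
            raise ValueError("an instruction may only use earlier results")
        return results[j]
    raise ValueError(f"unknown argument kind: {kind!r}")


def eval_slp(program: Sequence[Instruction], inputs: Sequence[float]) -> float:
    """Evaluate the instructions in order; the last result is taken as output."""
    if not program:
        raise ValueError("empty program")
    results: List[float] = []
    for op, a, b in program:
        results.append(OPS[op](_value(a, inputs, results), _value(b, inputs, results)))
    return results[-1]


# x^4 + x^2 in three instructions; u0 = x*x is computed once and reused,
# whereas a tree encoding would have to repeat that subexpression.
PROGRAM: List[Instruction] = [
    ("*", ("x", 0), ("x", 0)),  # u0 = x * x
    ("*", ("u", 0), ("u", 0)),  # u1 = u0 * u0
    ("+", ("u", 1), ("u", 0)),  # u2 = u1 + u0
]
```

Calling `eval_slp(PROGRAM, [2.0])` evaluates the three instructions in sequence for x = 2.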

On Model Complexity Control in Identification of Hammerstein Systems

Proceedings of the 44th IEEE Conference on Decision and Control ◽

10.1109/cdc.2005.1582322 ◽

2006 ◽

Author(s):

K. Pelckmans ◽

J.A.K. Suykens ◽

I. Goethals ◽

B. De Moor

Keyword(s):

Model Complexity ◽

Hammerstein Systems ◽

Complexity Control

Classification as Clustering: A Pareto Cooperative-Competitive GP Approach

Evolutionary Computation ◽

10.1162/evco_a_00016 ◽

2011 ◽

Vol 19 (1) ◽

pp. 137-166 ◽

Cited By ~ 8

Author(s):

Andrew R. McIntyre ◽

Malcolm I. Heywood

Keyword(s):

Multiobjective Optimization ◽

Genetic Programming ◽

Natural Environment ◽

Team Member ◽

Population Based ◽

Classification Performance ◽

Model Complexity ◽

Good Balance ◽

Competitive Coevolution ◽

Do So

Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.
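
The Gaussian local membership function mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which clustering method is used or how the parameters are estimated, so the sketch assumes that a cluster of one-dimensional program outputs has already been selected and that its mean and standard deviation parameterize the Gaussian. The name `gaussian_membership` and the `min_sigma` guard are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Callable, Sequence


def gaussian_membership(
    cluster_outputs: Sequence[float], min_sigma: float = 1e-6
) -> Callable[[float], float]:
    """Build a membership function from the outputs assigned to one cluster.

    Assumption: the cluster mean and (population) standard deviation are the
    Gaussian parameters; min_sigma avoids division by zero for tight clusters.
    """
    mu = mean(cluster_outputs)
    sigma = max(pstdev(cluster_outputs), min_sigma)

    def membership(y: float) -> float:
        # Value in (0, 1]; equals 1 when y sits exactly at the cluster mean.
        return math.exp(-((y - mu) ** 2) / (2.0 * sigma ** 2))

    return membership
```

An individual built this way responds strongly only to exemplars whose mapped output falls near its cluster, which is one way to read the claim that individuals represent subsets of the task rather than whole classes.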

Model complexity control for regression using VC generalization bounds

IEEE Transactions on Neural Networks ◽

10.1109/72.788648 ◽

1999 ◽

Vol 10 (5) ◽

pp. 1075-1089 ◽

Cited By ~ 121

Author(s):

V. Cherkassky ◽

Xuhui Shao ◽

F.M. Mulier ◽

V.N. Vapnik

Keyword(s):

Model Complexity ◽

Generalization Bounds ◽

Complexity Control

Automatic model complexity control using marginalized discriminative growth functions

2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721) ◽

10.1109/asru.2003.1318400 ◽

2004 ◽

Cited By ~ 7

Author(s):

X. Liu ◽

M.J.F. Gales

Keyword(s):

Model Complexity ◽

Growth Functions ◽

Complexity Control

CHAIN-WISE GENERALIZATION OF ROAD NETWORKS USING MODEL SELECTION

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-1-w1-59-2017 ◽

2017 ◽

Vol IV-1/W1 ◽

pp. 59-66 ◽

Cited By ~ 1

Author(s):

D. Bulatov ◽

S. Wenzel ◽

G. Häufel ◽

J. Meidow

Keyword(s):

Model Selection ◽

Similarity Criteria ◽

Sensor Data ◽

Model Complexity ◽

Complex Data ◽

Data Set ◽

The Road ◽

Straight Line ◽

Geometric Entity ◽

Urban Terrain

Streets are essential entities of urban terrain, and their automated extraction from airborne sensor data is cumbersome because of a complex interplay of geometric, topological and semantic aspects. Given a binary image representing the road class, centerlines of road segments are extracted by means of skeletonization. The focus of this paper lies in a well-reasoned representation of these segments by means of geometric primitives, such as straight line segments as well as circle and ellipse arcs. We propose the fusion of raw segments based on similarity criteria; the output of this process is a set of so-called chains, which better match the intuitive perception of what a street is. Further, we propose a two-step approach for chain-wise generalization. First, the chain is pre-segmented using circlePeucker; then, model selection is used to decide whether two neighboring segments should be fused into a new geometric entity. Thereby, we consider both variance-covariance analysis of residuals and model complexity. The results on a complex dataset with many traffic roundabouts indicate the benefits of the proposed procedure.
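
The fuse-or-keep decision described above can be illustrated with a generic model-selection test. This is a minimal sketch, not the paper's procedure: it handles straight lines only (no circle or ellipse arcs), uses the Bayesian information criterion as a stand-in for the paper's unspecified criterion, and the names `line_rss`, `bic`, `should_fuse`, and the `noise_floor` guard are hypothetical.

```python
import numpy as np


def line_rss(points) -> float:
    """Sum of squared orthogonal distances to the total-least-squares line."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The smallest singular value squared is the orthogonal residual sum of squares.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return float(singular_values[-1] ** 2)


def bic(rss: float, n: int, k: int, noise_floor: float = 1e-12) -> float:
    """BIC for Gaussian residuals: a fit term plus a penalty on k parameters.

    noise_floor is an assumed guard against log(0) for perfectly fitted data.
    """
    return n * np.log(max(rss, noise_floor) / n) + k * np.log(n)


def should_fuse(seg_a, seg_b, params_per_line: int = 2) -> bool:
    """Fuse when one line explains both segments at least as well as two lines,
    once the extra parameters of the two-line model are penalized."""
    n = len(seg_a) + len(seg_b)
    rss_split = line_rss(seg_a) + line_rss(seg_b)
    rss_fused = line_rss(np.vstack([seg_a, seg_b]))
    return bool(bic(rss_fused, n, params_per_line) <= bic(rss_split, n, 2 * params_per_line))
```

Two collinear segments are fused because the single-line model fits equally well with fewer parameters, while two segments meeting at a right angle are kept apart because the fused residuals grow far more than the complexity penalty saves.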

Genetic Programming with Image-Related Operators and A Flexible Program Structure for Feature Learning in Image Classification

10.26686/wgtn.13158323.v1 ◽

2020 ◽

Author(s):

Ying Bi ◽

Bing Xue ◽

Mengjie Zhang

Keyword(s):

Feature Extraction ◽

Genetic Programming ◽

Image Classification ◽

Domain Knowledge ◽

Extraction Methods ◽

Classification Performance ◽

Model Complexity ◽

Program Structure ◽

New Approach ◽

Classification Tasks

Feature extraction is essential for solving image classification by transforming low-level pixel values into high-level features. However, extracting effective features from images is challenging due to high variations across images in scale, rotation, illumination, and background. Existing methods often have a fixed model complexity and require domain expertise. Genetic programming with a flexible representation can find the best solution without the use of domain knowledge. This paper proposes a new genetic programming-based approach to automatically learning informative features for different image classification tasks. In the new approach, a number of image-related operators, including filters, pooling operators and feature extraction methods, are employed as functions. A flexible program structure is developed to integrate different functions and terminals into a single tree/solution. The new approach can evolve solutions of variable depths to extract various numbers and types of features from the images. The new approach is examined on 12 different image classification tasks of varying difficulty and compared with a large number of effective algorithms. The results show that the new approach achieves better classification performance than most benchmark methods. The analysis of the evolved programs/solutions and the visualisation of the learned features provide deep insights into the proposed approach.

Model-driven regularization approach to straight line program genetic programming

Expert Systems with Applications ◽

10.1016/j.eswa.2016.03.003 ◽

2016 ◽

Vol 57 ◽

pp. 76-90 ◽

Cited By ~ 2

Author(s):

José L. Montaña ◽

César L. Alonso ◽

Cruz E. Borges ◽

Cristina Tîrnăucă

Keyword(s):

Genetic Programming ◽

Model Driven ◽

Straight Line

Balancing Accuracy and Parsimony in Genetic Programming

Evolutionary Computation ◽

10.1162/evco.1995.3.1.17 ◽

1995 ◽

Vol 3 (1) ◽

pp. 17-38 ◽

Cited By ~ 126

Author(s):

Byoung-Tak Zhang ◽

Heinz Mühlenbein

Keyword(s):

Genetic Programming ◽

Adaptive Learning ◽

Model Comparison ◽

Model Complexity ◽

Underlying Structure ◽

Network Synthesis ◽

Inference Problem ◽

Bayesian Model Comparison ◽

Fundamental Relationship ◽

Representation Scheme

Genetic programming is distinguished from other evolutionary algorithms in that it uses tree representations of variable size instead of linear strings of fixed length. The flexible representation scheme is very important because it allows the underlying structure of the data to be discovered automatically. One primary difficulty, however, is that the solutions may grow too big without any improvement of their generalization ability. In this article we investigate the fundamental relationship between the performance and complexity of the evolved structures. The essence of the parsimony problem is demonstrated empirically by analyzing error landscapes of programs evolved for neural network synthesis. We consider genetic programming as a statistical inference problem and apply the Bayesian model-comparison framework to introduce a class of fitness functions with error and complexity terms. An adaptive learning method is then presented that automatically balances the model-complexity factor to evolve parsimonious programs without losing the diversity of the population needed for achieving the desired training accuracy. The effectiveness of this approach is empirically shown on the induction of sigma-pi neural networks for solving a real-world medical diagnosis problem as well as benchmark tasks.
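
A fitness function with separate error and complexity terms, as described above, can be sketched in a few lines. The abstract does not state the paper's actual fitness formula or adaptation rule, so the update in `adapt_alpha` is a hypothetical rule chosen only to show the idea: keep the complexity penalty small while the best program is still inaccurate, and raise it once a target error is reached. All names and constants here are assumptions.

```python
def fitness(error: float, complexity: float, alpha: float) -> float:
    """Lower is better: training error plus a weighted complexity penalty."""
    return error + alpha * complexity


def adapt_alpha(
    best_error: float,
    best_complexity: float,
    target_error: float,
    weak_ratio: float = 0.01,
) -> float:
    """Hypothetical adaptation rule (not taken from the paper).

    Above the target error, the penalty on the best program is only a small
    fraction (weak_ratio) of its error, so accuracy dominates selection.
    At or below the target, the penalty on the best program is set to the
    target error itself, so smaller programs are favored.
    """
    if best_complexity <= 0:
        raise ValueError("complexity must be positive")
    if best_error > target_error:
        return weak_ratio * best_error / best_complexity
    return target_error / best_complexity
```

In a GP loop, `adapt_alpha` would be called once per generation with the previous generation's best error and size, and the returned weight would be passed to `fitness` for every individual.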

Model complexity control and compression using discriminative growth functions

2004 IEEE International Conference on Acoustics, Speech, and Signal Processing ◽

10.1109/icassp.2004.1326106 ◽

2004 ◽

Cited By ~ 5

Author(s):

X. Liu ◽

M.J.F. Gales

Keyword(s):

Model Complexity ◽

Growth Functions ◽

Complexity Control
