Genetic Programming, Ensemble Methods and the Bias/Variance Tradeoff – Introductory Investigations

Ensemble modeling approach for rainfall/groundwater balancing

Journal of Hydroinformatics ◽

10.2166/hydro.2007.102 ◽

2007 ◽

Vol 9 (2) ◽

pp. 95-106 ◽

Cited By ~ 6

Author(s):

D. Laucelli ◽

O. Giustolisi ◽

V. Babovic ◽

M. Keijzer

Keyword(s):

Machine Learning ◽

Genetic Programming ◽

Empirical Evidence ◽

Averaging Method ◽

Total Error ◽

Ensemble Methods ◽

Real Data ◽

Symbolic Regression ◽

Ensemble Modeling ◽

Physical Phenomena

This paper presents an application of machine learning to real data. It deals with ensemble modeling, a simple averaging method for obtaining more reliable approximations using symbolic regression. The contributions of bias and variance to the total error, and ensemble methods that reduce the error due to variance, are discussed together with a specific application of ensemble modeling to hydrological forecasts. The work provides empirical evidence that genetic programming can benefit greatly from this approach when forecasting and simulating physical phenomena. The influence of genetic programming parameter settings on model performance is also considered.
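The averaging idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' method: polynomial fits on bootstrap resamples stand in for individual symbolic regression runs, and the names `fit_member` and `ensemble_demo`, the sine target, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def fit_member(x, y, degree, rng):
    """Fit one polynomial on a bootstrap resample (assumed stand-in for one GP run)."""
    idx = rng.integers(0, len(x), size=len(x))
    return np.polyfit(x[idx], y[idx], degree)

def ensemble_demo(n_members=25, degree=6, seed=0):
    """Return (mean MSE of the individual members, MSE of their averaged prediction)."""
    rng = np.random.default_rng(seed)
    # Hypothetical noisy training data and a noise-free test target.
    x_train = np.linspace(-1.0, 1.0, 40)
    y_train = np.sin(3.0 * x_train) + rng.normal(0.0, 0.3, size=x_train.size)
    x_test = np.linspace(-1.0, 1.0, 200)
    y_test = np.sin(3.0 * x_test)

    # One row of test predictions per ensemble member.
    preds = np.array([
        np.polyval(fit_member(x_train, y_train, degree, rng), x_test)
        for _ in range(n_members)
    ])
    member_mse = np.mean((preds - y_test) ** 2, axis=1)
    # Simple averaging: the ensemble prediction is the mean over members.
    ensemble_mse = np.mean((preds.mean(axis=0) - y_test) ** 2)
    return member_mse.mean(), ensemble_mse

if __name__ == "__main__":
    avg_member, ensemble = ensemble_demo()
    print(f"mean member MSE: {avg_member:.4f}, ensemble MSE: {ensemble:.4f}")
```

For squared error, the error of the averaged prediction can never exceed the average error of the members, because the squared loss is convex. How large the gain is depends on how much the members disagree, which is the variance component the abstract refers to.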

Genetic Programming With a New Representation to Automatically Learn Features and Evolve Ensembles for Image Classification

10.26686/wgtn.13158311.v1 ◽

2020 ◽

Author(s):

Ying Bi ◽

Bing Xue ◽

Mengjie Zhang

Keyword(s):

Genetic Programming ◽

Image Classification ◽

Domain Knowledge ◽

Ensemble Methods ◽

Classification Performance ◽

Classification Algorithms ◽

New Approach ◽

Terminal Set ◽

Class Labels ◽

High Level

Image classification is a popular task in machine learning and computer vision, but it is very challenging due to high variation across images. Using ensemble methods for solving image classification can achieve higher classification performance than using a single classification algorithm. However, to obtain a good ensemble, the component (base) classifiers in an ensemble should be accurate and diverse. To solve image classification effectively, feature extraction is necessary to transform raw pixels into high-level informative features. However, this process often requires domain knowledge. This article proposes an evolutionary approach based on genetic programming to automatically and simultaneously learn informative features and evolve effective ensembles for image classification. The new approach takes raw images as inputs and returns predictions of class labels based on the evolved classifiers. To achieve this, a new individual representation, a new function set, and a new terminal set are developed to allow the new approach to effectively find the best solution. More importantly, the solutions of the new approach can extract informative features from raw images and can automatically address the diversity issue of the ensembles. In addition, the new approach can automatically select and optimize the parameters for the classification algorithms in the ensemble. The performance of the new approach is examined on 13 different image classification datasets of varying difficulty and compared with a large number of effective methods. The results show that the new approach achieves better classification accuracy on most datasets than the competitive methods. Further analysis demonstrates that the new approach can evolve solutions with high accuracy and diversity.
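The requirement that base classifiers be both accurate and diverse can be made concrete with a small sketch. It is not the representation proposed in the article: plurality voting and mean pairwise disagreement are generic choices, and the function names and the toy label arrays are assumptions for illustration.

```python
import numpy as np

def majority_vote(votes):
    """Combine label predictions of shape (n_classifiers, n_samples) by plurality.

    Labels are assumed to be non-negative integers; ties go to the smallest label.
    """
    votes = np.asarray(votes)
    n_classes = votes.max() + 1
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)

def mean_disagreement(votes):
    """Average fraction of samples on which a pair of classifiers disagrees."""
    votes = np.asarray(votes)
    k = len(votes)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return float(np.mean([(votes[i] != votes[j]).mean() for i, j in pairs]))

if __name__ == "__main__":
    # Hypothetical predictions: each classifier makes one error, on a different sample.
    truth = np.array([0, 1, 1, 0])
    votes = np.array([
        [0, 1, 1, 1],
        [0, 1, 0, 0],
        [1, 1, 1, 0],
    ])
    combined = majority_vote(votes)
    print("member accuracies:", (votes == truth).mean(axis=1))
    print("ensemble accuracy:", (combined == truth).mean())
    print("mean disagreement:", mean_disagreement(votes))
```

In this toy case each member is 75% accurate, yet the vote is correct on every sample because the errors fall on different samples. If all three members made the same mistake, voting would give no improvement.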

Bias-variance decomposition in Genetic Programming

Open Mathematics ◽

10.1515/math-2016-0005 ◽

2016 ◽

Vol 14 (1) ◽

pp. 62-80 ◽

Cited By ~ 2

Author(s):

Taras Kowaliw ◽

René Doursat

Keyword(s):

Genetic Programming ◽

Variance Decomposition ◽

Initial Population ◽

Linear Genetic Programming ◽

Improve Performance ◽

Training Samples ◽

Bias Variance ◽

Selection Of

We study properties of Linear Genetic Programming (LGP) through several regression and classification benchmarks. In each problem, we decompose the results into bias and variance components, and explore the effect of varying certain key parameters on the overall error and its decomposed contributions. These parameters are the maximum program size, the initial population, and the function set used. We confirm and quantify several insights into the practical usage of GP, most notably that (a) the variance between runs is primarily due to initialization rather than the selection of training samples, (b) parameters can be reasonably optimized to obtain gains in efficacy, and (c) functions detrimental to evolvability are easily eliminated, while functions well-suited to the problem can greatly improve performance; therefore, larger and more diverse function sets are always preferable.
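As a rough illustration of what decomposing results into bias and variance components means, the sketch below applies the standard squared-loss decomposition to predictions collected from repeated runs. It is not necessarily the exact procedure used in the paper (which also covers classification); the function name `bias_variance` and the synthetic predictions are assumptions.

```python
import numpy as np

def bias_variance(preds, target):
    """Squared-loss decomposition over repeated runs.

    preds:  array of shape (n_runs, n_points), one row of predictions per run.
    target: array of shape (n_points,), assumed noise-free.
    Returns (bias squared, variance, total mean squared error).
    """
    preds = np.asarray(preds, dtype=float)
    target = np.asarray(target, dtype=float)
    mean_pred = preds.mean(axis=0)
    # Bias: how far the average prediction over runs is from the target.
    bias_sq = np.mean((mean_pred - target) ** 2)
    # Variance: spread of individual runs around their own average.
    variance = np.mean(preds.var(axis=0))
    total = np.mean((preds - target) ** 2)
    return bias_sq, variance, total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.sin(np.linspace(0.0, 3.0, 50))
    # Hypothetical runs: a constant offset (bias) plus run-to-run noise (variance).
    preds = target + 0.2 + rng.normal(0.0, 0.1, size=(30, 50))
    b, v, t = bias_variance(preds, target)
    print(f"bias^2={b:.4f} variance={v:.4f} total={t:.4f}")
```

With a noise-free target and the population variance (NumPy's default `ddof=0`), the total error equals bias squared plus variance exactly, so the two terms account for all of the measured error.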

An automated ensemble learning framework using genetic programming for image classification

10.26686/wgtn.13884980 ◽

2021 ◽

Author(s):

Ying Bi ◽

Bing Xue ◽

Mengjie Zhang

Keyword(s):

Genetic Programming ◽

Image Classification ◽

Ensemble Learning ◽

Feature Learning ◽

Ensemble Methods ◽

Data Sets ◽

Learning Framework ◽

Computing Machinery ◽

Terminal Set ◽

Function Selection

© 2019 Association for Computing Machinery. An ensemble consists of multiple learners and can achieve a better generalisation performance than a single learner. Genetic programming (GP) has been applied to construct ensembles using different strategies such as bagging and boosting. However, no GP-based ensemble methods focus on dealing with image classification, which is a challenging task in computer vision and machine learning. This paper proposes an automated ensemble learning framework using GP (EGP) for image classification. The new method integrates feature learning, classification function selection, classifier training, and combination into a single program tree. To achieve this, a novel program structure, a new function set and a new terminal set are developed in EGP. The performance of EGP is examined on nine different image classification data sets of varying difficulty and compared with a large number of commonly used methods including recently published methods. The results demonstrate that EGP achieves better performance than most competitive methods. Further analysis reveals that EGP evolves good ensembles simultaneously balancing diversity and accuracy. To the best of our knowledge, this study is the first work using GP to automatically generate ensembles for image classification.
