Can Genetic Programming Perform Explainable Machine Learning for Bioinformatics?

Author(s):  
Ting Hu
2021 ◽  
Vol 11 (12) ◽  
pp. 5468
Author(s):  
Elizaveta Shmalko ◽  
Askhat Diveev

The problem of control synthesis is considered as machine learning control. The paper proposes a mathematical formulation of machine learning control and discusses approaches to supervised and unsupervised learning by symbolic regression methods. The principle of small variation of the basic solution is presented to set up the neighbourhood of the search and to increase the search efficiency of symbolic regression methods. Different symbolic regression methods, such as genetic programming, the network operator method, and Cartesian and binary genetic programming, are presented in detail. A computational example demonstrates the capabilities of symbolic regression methods as an unsupervised machine learning control technique for solving the MLC problem of control synthesis, obtaining a stabilization system for a mobile robot.
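To make the idea of searching a neighbourhood of a basic solution concrete, the following is a minimal Python sketch, not the authors' implementation: a candidate control law is encoded as an expression tree, and neighbours are generated by a few random symbol substitutions that preserve node arity. The function set, terminal set and the basic solution used here are illustrative assumptions.

```python
import random
import copy

# Illustrative function and terminal sets (assumptions, not from the paper).
FUNCTIONS = {"+": 2, "*": 2, "tanh": 1, "neg": 1}
TERMINALS = ["x1", "x2", "q1", "q2"]  # state variables and tunable parameters


class Node:
    def __init__(self, symbol, children=None):
        self.symbol = symbol
        self.children = children or []

    def __repr__(self):
        if not self.children:
            return self.symbol
        return f"{self.symbol}({', '.join(map(repr, self.children))})"


def all_nodes(tree):
    """Collect every node of the expression tree."""
    stack, out = [tree], []
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(node.children)
    return out


def small_variation(basic, n_variations=2):
    """Return a neighbour of the basic solution obtained by replacing the
    symbols of a few randomly chosen nodes with symbols of the same arity."""
    neighbour = copy.deepcopy(basic)
    for _ in range(n_variations):
        node = random.choice(all_nodes(neighbour))
        arity = len(node.children)
        if arity == 0:
            node.symbol = random.choice(TERMINALS)
        else:
            node.symbol = random.choice([f for f, a in FUNCTIONS.items() if a == arity])
    return neighbour


# Basic solution u = tanh(q1*x1 + q2*x2); the search explores its neighbourhood.
basic = Node("tanh", [Node("+", [Node("*", [Node("q1"), Node("x1")]),
                                 Node("*", [Node("q2"), Node("x2")])])])
print(basic)
print(small_variation(basic))
```

Restricting variation to a small number of substitutions around a known basic solution is what narrows the search space relative to unconstrained genetic programming.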


2020 ◽  
Author(s):  
Harith Al-Sahaf ◽  
Mengjie Zhang ◽  
M Johnston

In machine learning, it is common to require a large number of instances to train a model for classification. In many cases, it is hard or expensive to acquire a large number of instances. In this paper, we propose a novel genetic programming (GP) based method for automatic image classification that adopts a one-shot learning approach. The proposed method combines GP with Local Binary Patterns (LBP) to detect a predefined number of informative regions that aim at maximising the between-class scatter and minimising the within-class scatter. Moreover, the proposed method uses only two instances of each class to evolve a classifier. To test the effectiveness of the proposed method, four different texture data sets are used and the performance is compared against two other GP-based methods, namely conventional GP and Two-tier GP. The experiments reveal that the proposed method outperforms these two methods on all the data sets. Moreover, Naïve Bayes, Support Vector Machine, and Decision Tree (J48) classifiers achieve better performance when using features extracted by the proposed method than when using domain-specific or Two-tier GP extracted features. © Springer International Publishing 2013.
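The kind of fitness signal a candidate region receives in such a setup can be sketched as follows: compute an LBP histogram for the region in each image and score the region by the ratio of between-class to within-class scatter. This is a minimal sketch assuming scikit-image's `local_binary_pattern`; the region coordinates, LBP parameters and the exact scatter ratio are illustrative assumptions rather than the paper's precise fitness function.

```python
import numpy as np
from skimage.feature import local_binary_pattern

# LBP parameters chosen for illustration: uniform LBP, 8 neighbours, radius 1.
P, R, N_BINS = 8, 1, 10  # uniform LBP with P=8 yields P+2 = 10 histogram bins


def lbp_histogram(image, region):
    """Normalised LBP histogram of one rectangular region (y, x, h, w)."""
    y, x, h, w = region
    codes = local_binary_pattern(image[y:y + h, x:x + w], P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=N_BINS, range=(0, N_BINS), density=True)
    return hist


def scatter_score(features, labels):
    """Between-class scatter divided by within-class scatter; higher means
    classes are compact and well separated in the feature space."""
    features, labels = np.asarray(features), np.asarray(labels)
    overall_mean = features.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        members = features[labels == c]
        class_mean = members.mean(axis=0)
        between += len(members) * np.sum((class_mean - overall_mean) ** 2)
        within += np.sum((members - class_mean) ** 2)
    return between / (within + 1e-12)


# One-shot setting: two instances per class (random images as stand-ins).
rng = np.random.default_rng(0)
images = [rng.random((64, 64)) for _ in range(4)]
labels = [0, 0, 1, 1]
region = (16, 16, 32, 32)  # a candidate region a GP individual might propose
features = [lbp_histogram(img, region) for img in images]
print("scatter score:", scatter_score(features, labels))
```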


2021 ◽  
Author(s):  
Cao Truong Tran

Classification is a major task in machine learning and data mining. Many real-world datasets suffer from the unavoidable issue of missing values. Classification with incomplete data has to be handled carefully because inadequate treatment of missing values causes large classification errors. Most existing research on classification with incomplete data has focused on improving effectiveness, but has not adequately addressed the efficiency of applying classifiers to classify unseen instances, which is much more important than the act of creating classifiers.

A common approach to classification with incomplete data is to use imputation methods to replace missing values with plausible values before building classifiers and classifying unseen instances. This approach provides complete data which can then be used by any classification algorithm, but sophisticated imputation methods are usually computationally intensive, especially in the application phase of classification. Another approach is to build a classifier that can work directly with missing values. This approach does not require time for estimating missing values, but it often generates inaccurate and complex classifiers when faced with numerous missing values. A recent approach, which also avoids estimating missing values, is to build a set of classifiers from which applicable classifiers are selected for classifying unseen instances. However, this approach is also often inaccurate and takes a long time to find applicable classifiers when faced with numerous missing values.

The overall goal of the thesis is to simultaneously improve the effectiveness and efficiency of classification with incomplete data by using evolutionary machine learning techniques for feature selection, clustering, ensemble learning, feature construction and constructing classifiers.

The thesis develops approaches for improving imputation for classification with incomplete data by integrating clustering and feature selection with imputation. The approaches improve both the effectiveness and the efficiency of using imputation for classification with incomplete data.

The thesis develops wrapper-based feature selection methods to improve the input space for classification algorithms that can work directly with incomplete data. The methods not only improve classification accuracy, but also reduce the complexity of such classifiers.

The thesis develops a feature construction method to improve the input space for classification algorithms with incomplete data by proposing interval genetic programming: genetic programming with a set of interval functions. The method improves classification accuracy and reduces the complexity of classifiers.

The thesis develops an ensemble approach to classification with incomplete data by integrating imputation, feature selection and ensemble learning. The results show that the approach is more accurate and faster than previous common methods for classification with incomplete data.

The thesis develops interval genetic programming to directly evolve classifiers for incomplete data. The results show that classifiers generated by interval genetic programming can be more effective and efficient than classifiers generated by the combination of imputation and traditional genetic programming. Interval genetic programming is also more effective than common classification algorithms able to work directly with incomplete data.

In summary, the thesis develops a range of approaches for simultaneously improving the effectiveness and efficiency of classification with incomplete data by using a range of evolutionary machine learning techniques.
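The common imputation-based approach described above (estimate missing values, then apply any standard classifier) can be sketched with scikit-learn; the choice of k-nearest-neighbour imputation and a decision tree is an illustrative assumption, not the thesis's specific method.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import KNNImputer
from sklearn.tree import DecisionTreeClassifier

# Toy incomplete dataset: np.nan marks missing feature values.
X_train = np.array([[1.0, np.nan, 3.0],
                    [2.0, 4.0, np.nan],
                    [np.nan, 5.0, 6.0],
                    [3.0, 6.0, 9.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[2.5, np.nan, 7.0]])

# Imputation first, then an ordinary classifier on the completed data.
pipeline = Pipeline([
    ("impute", KNNImputer(n_neighbors=2)),      # estimate missing values from neighbours
    ("classify", DecisionTreeClassifier(random_state=0)),
])
pipeline.fit(X_train, y_train)
print(pipeline.predict(X_test))
```

Note that the imputer must also run on every unseen instance at prediction time, which illustrates the efficiency concern the thesis raises about sophisticated imputation in the application phase.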


2007 ◽  
Vol 9 (2) ◽  
pp. 95-106 ◽  
Author(s):  
D. Laucelli ◽  
O. Giustolisi ◽  
V. Babovic ◽  
M. Keijzer

This paper introduces an application of machine learning to real data. It deals with Ensemble Modeling, a simple averaging method for obtaining more reliable approximations with symbolic regression. Considerations on the contribution of bias and variance to the total error, and on ensemble methods that reduce errors due to variance, are addressed together with a specific application of ensemble modeling to hydrological forecasts. This work provides empirical evidence that genetic programming can greatly benefit from this approach in forecasting and simulating physical phenomena. Further considerations are taken into account, such as the influence of genetic programming parameter settings on the model's performance.
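The variance-reduction argument behind simple averaging can be illustrated numerically. In this minimal sketch the synthetic target function and noise level are assumptions, and the "models" are stand-ins for independently evolved symbolic regression models, each approximating the same signal with its own error.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0, 1, 200)
true_y = np.sin(2 * np.pi * x)          # underlying signal to be approximated

# Stand-ins for M independently evolved symbolic regression models: each is the
# true signal plus its own zero-mean error component.
M = 10
models = [true_y + rng.normal(scale=0.3, size=x.size) for _ in range(M)]

# Simple averaging ensemble.
ensemble = np.mean(models, axis=0)

mse_single = np.mean([(m - true_y) ** 2 for m in models])
mse_ensemble = np.mean((ensemble - true_y) ** 2)
print(f"average single-model MSE: {mse_single:.4f}")
print(f"ensemble MSE:             {mse_ensemble:.4f}")  # ~ single MSE / M for independent errors
```

With independent, zero-mean errors the ensemble error shrinks roughly in proportion to the number of averaged models, which is the variance-reduction effect exploited by ensemble modeling; bias, by contrast, is not reduced by averaging.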

