A novel binary classification approach based on geometric semantic genetic programming

Improving Land Cover Classification Using Genetic Programming for Feature Construction

Remote Sensing ◽

10.3390/rs13091623 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1623

Author(s):

João E. Batista ◽

Ana I. R. Cabral ◽

Maria J. P. Vasconcelos ◽

Leonardo Vanneschi ◽

Sara Silva

Keyword(s):

Land Cover ◽

Genetic Programming ◽

Satellite Images ◽

State Of The Art ◽

Binary Classification ◽

Feature Construction ◽

Classification Problems ◽

Construction Methods ◽

Box Models

Genetic programming (GP) is a powerful machine learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in the field of remote sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs feature construction by evolving hyperfeatures from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyperfeatures from satellite bands to improve the classification of land cover types. We add the evolved hyperfeatures to the reference datasets and observe a significant improvement of the performance of three state-of-the-art ML algorithms (decision trees, random forests, and XGBoost) on multiclass classifications and no significant effect on the binary classifications. We show that adding the M3GP hyperfeatures to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI, and NBR. We also compare the performance of the M3GP hyperfeatures in the binary classification problems with those created by other feature construction methods such as FFX and EFS.
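The spectral indices named in this abstract (NDVI, NDWI, NBR) are all normalized band differences, so adding them to a reference dataset can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the band keys (`green`, `red`, `nir`, `swir2`), the pixel values, and the choice of the green/NIR variant of NDWI are assumptions, since the abstract does not say which definitions were used.

```python
from typing import Dict, List, Sequence


def normalized_difference(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Per-pixel (a - b) / (a + b); returns 0.0 where a + b is zero."""
    if len(a) != len(b):
        raise ValueError("both bands must have the same number of pixels")
    result = []
    for x, y in zip(a, b):
        denom = x + y
        result.append((x - y) / denom if denom != 0 else 0.0)
    return result


def add_spectral_indices(bands: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """Return a copy of the band table with NDVI, NDWI, and NBR columns appended.

    Band names are hypothetical; NDWI here is the green/NIR variant (an assumption).
    """
    augmented = {name: list(values) for name, values in bands.items()}
    augmented["NDVI"] = normalized_difference(bands["nir"], bands["red"])
    augmented["NDWI"] = normalized_difference(bands["green"], bands["nir"])
    augmented["NBR"] = normalized_difference(bands["nir"], bands["swir2"])
    return augmented


# Two made-up pixels: one vegetation-like, one water-like.
pixels = {
    "green": [0.08, 0.10],
    "red": [0.05, 0.08],
    "nir": [0.45, 0.03],
    "swir2": [0.15, 0.01],
}
features = add_spectral_indices(pixels)
print(sorted(features))
```

Evolved hyperfeatures would be appended to the dataset in the same way, as extra columns next to the original bands, before a classifier such as a decision tree is trained.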


Binary Image Classification: A Genetic Programming Approach to the Problem of Limited Training Instances

Evolutionary Computation ◽

10.1162/evco_a_00146 ◽

2016 ◽

Vol 24 (1) ◽

pp. 143-182 ◽

Cited By ~ 10

Author(s):

Harith Al-Sahaf ◽

Mengjie Zhang ◽

Mark Johnston

Keyword(s):

Computer Vision ◽

Pattern Recognition ◽

Genetic Programming ◽

Visual System ◽

Image Classification ◽

Human Visual System ◽

Binary Classification ◽

Programming Approach ◽

Data Sets ◽

New Class

In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate these two methods. We also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers with handcrafted features and those extracted by a recently developed GP-based method in most cases.


Econometric Genetic Programming in Binary Classification: Evolving Logistic Regressions Through Genetic Programming

Progress in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-319-65340-2_32 ◽

2017 ◽

pp. 382-394 ◽

Cited By ~ 1

Author(s):

André Luiz Farias Novaes ◽

Ricardo Tanscheit ◽

Douglas Mota Dias

Keyword(s):

Genetic Programming ◽

Binary Classification ◽

Logistic Regressions


Binary image classification: A genetic programming approach to the problem of limited training instances

10.26686/wgtn.13150958 ◽

2020 ◽

Author(s):

Harith Al-Sahaf ◽

Mengjie Zhang ◽

M Johnston

Keyword(s):

Pattern Recognition ◽

Genetic Programming ◽

Image Classification ◽

Binary Classification ◽

Programming Approach ◽

Data Sets ◽

Massachusetts Institute ◽

Massachusetts Institute Of Technology ◽

New Class ◽

Institute Of Technology

© 2016 by the Massachusetts Institute of Technology. In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate these two methods. We also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers with handcrafted features and those extracted by a recently developed GP-based method in most cases.


High-dimensional Unbalanced Binary Classification by Genetic Programming with Multi-criterion Fitness Evaluation and Selection

Evolutionary Computation ◽

10.1162/evco_a_00304 ◽

2021 ◽

pp. 1-26

Author(s):

Wenbin Pei ◽

Bing Xue ◽

Lin Shang ◽

Mengjie Zhang

Keyword(s):

Genetic Programming ◽

Fitness Function ◽

Binary Classification ◽

Class Imbalance ◽

Area Under The Curve ◽

High Dimensional ◽

Genetic Operators ◽

Minority Class ◽

Fitness Evaluation ◽

Unbalanced Classification

High-dimensional unbalanced classification is challenging because of the joint effects of high dimensionality and class imbalance. Genetic programming (GP) has potential benefits for high-dimensional classification due to its built-in capability to select informative features. However, when data are not evenly distributed, GP tends to develop biased classifiers that achieve high accuracy on the majority class but low accuracy on the minority class. Unfortunately, the minority class is often at least as important as the majority class. It is therefore important to investigate how GP can be effectively utilized for high-dimensional unbalanced classification. In this paper, to address the performance bias issue of GP, a new two-criterion fitness function is developed, which considers two criteria: an approximation of the area under the curve (AUC) and the classification clarity (i.e., how well a program can separate two classes). The obtained values on the two criteria are combined in pairs, instead of summing them together. Furthermore, this paper designs a three-criterion tournament selection to effectively identify and select good programs to be used by genetic operators for generating better offspring during the evolutionary learning process. The experimental results show that the proposed method achieves better classification performance than other compared methods.
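The bias described in this abstract can be illustrated with a small sketch that contrasts plain accuracy with an AUC estimate on unbalanced data. The abstract does not give the paper's exact AUC approximation or its clarity criterion, so the code below assumes the standard Wilcoxon-Mann-Whitney statistic as a stand-in; the function names and the sample outputs are hypothetical.

```python
from typing import Sequence


def wmw_auc(minority: Sequence[float], majority: Sequence[float]) -> float:
    """Wilcoxon-Mann-Whitney AUC estimate: the fraction of (minority, majority)
    pairs in which the minority instance gets the higher program output.
    Ties count as half a correct pair."""
    if not minority or not majority:
        raise ValueError("both classes need at least one instance")
    wins = 0.0
    for p in minority:
        for n in majority:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(minority) * len(majority))


def accuracy(minority: Sequence[float], majority: Sequence[float],
             threshold: float = 0.0) -> float:
    """Overall accuracy when outputs above the threshold are labeled 'minority'."""
    correct = sum(1 for p in minority if p > threshold)
    correct += sum(1 for n in majority if n <= threshold)
    return correct / (len(minority) + len(majority))


# A degenerate program that outputs 0.0 for every instance
# (5 minority instances, 95 majority instances).
constant_minority = [0.0] * 5
constant_majority = [0.0] * 95

print(accuracy(constant_minority, constant_majority))  # high, despite learning nothing
print(wmw_auc(constant_minority, constant_majority))   # chance level
```

Here the constant program scores 95% accuracy while its AUC estimate stays at 0.5, which shows why an AUC-based criterion is less easily fooled by the majority class than accuracy.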


A binary classification approach for automatic preference modeling of virtual agents in Civilization IV

2012 IEEE Conference on Computational Intelligence and Games (CIG) ◽

10.1109/cig.2012.6374151 ◽

2012 ◽

Cited By ~ 1

Author(s):

Marlos C. Machado ◽

Gisele L. Pappa ◽

Luiz Chaimowicz

Keyword(s):

Binary Classification ◽

Virtual Agents ◽

Preference Modeling ◽

Classification Approach


Membrane positioning for high- and low-resolution protein structures through a binary classification approach

Protein Engineering Design and Selection ◽

10.1093/protein/gzv063 ◽

2015 ◽

Vol 29 (3) ◽

pp. 87-92 ◽

Cited By ~ 3

Author(s):

Guillaume Postic ◽

Yassine Ghouzam ◽

Vincent Guiraud ◽

Jean-Christophe Gelly

Keyword(s):

Binary Classification ◽

Protein Structures ◽

Low Resolution ◽

Classification Approach


Improving Land Cover Classification Using Genetic Programming for Feature Construction

10.20944/preprints202010.0168.v2 ◽

2020 ◽

Author(s):

João Batista ◽

Ana Cabral ◽

Maria Vasconcelos ◽

Leonardo Vanneschi ◽

Sara Silva

Keyword(s):

Land Cover ◽

Genetic Programming ◽

Satellite Images ◽

State Of The Art ◽

Binary Classification ◽

Feature Construction ◽

Classification Problems ◽

Construction Methods ◽

Box Models

Genetic Programming (GP) is a powerful Machine Learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in Remote Sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs Feature Construction by evolving hyper-features from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyper-features from satellite bands to improve the classification of land cover types. We add the evolved hyper-features to the reference datasets and observe a significant improvement of the performance of three state-of-the-art ML algorithms (Decision Trees, Random Forests and XGBoost) on multiclass classifications and no significant effect on the binary classifications. We show that adding the M3GP hyper-features to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI and NBR. We also compare the performance of the M3GP hyper-features in the binary classification problems with those created by other Feature Construction methods like FFX and EFS.


Improving Land Cover Classification Using Genetic Programming

10.20944/preprints202010.0168.v1 ◽

2020 ◽

Author(s):

João Batista ◽

Ana Cabral ◽

Maria Vasconcelos ◽

Leonardo Vanneschi ◽

Sara Silva

Keyword(s):

Land Cover ◽

Genetic Programming ◽

Binary Classification ◽

Multiclass Classification ◽

Feature Construction ◽

Classification Problems ◽

Construction Methods ◽

Box Models ◽

Burnt Areas

Genetic Programming (GP) is a powerful Machine Learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in Remote Sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs Feature Construction by evolving hyper-features from the original ones. In this work, we use the M3GP algorithm on several satellite images over different countries to perform binary classification of burnt areas and multiclass classification of land cover types. We add the evolved hyper-features to the reference datasets and observe a significant improvement of the performance of three state-of-the-art ML algorithms (Decision Trees, Random Forests and XGBoost) on the multiclass classification datasets, with no significant effect on the binary classification ones. We show that adding the M3GP hyper-features to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI and NBR. We also compare the performance of the M3GP hyper-features in the binary classification problems with those created by other Feature Construction methods like FFX and EFS.


Al-Quran recitation verification for memorization test using Siamese LSTM network

Communications in Science and Technology ◽

10.21924/cst.6.1.2021.344 ◽

2021 ◽

Vol 6 (1) ◽

pp. 35-40

Author(s):

Rian Adam Rajagede ◽

Rochana Prih Hastuti

Keyword(s):

Binary Classification ◽

Model Performance ◽

Classification Approach ◽

The Public ◽

Spectral Coefficient ◽

Public Dataset ◽

Lstm Network ◽

Mel Frequency Cepstral Coefficient ◽

Existing Data

In the process of verifying Al-Quran memorization, a person is usually asked to recite a verse without looking at the text. This process is generally done together with a partner to verify the reading. This paper proposes a model using a Siamese LSTM network to help users check their Al-Quran memorization alone. The Siamese LSTM network verifies the recitation by matching the input with existing data for a read verse. This study evaluates two Siamese LSTM architectures, the Manhattan LSTM and the Siamese-Classifier. The Manhattan LSTM outputs a single numerical value that represents the similarity, while the Siamese-Classifier uses a binary classification approach. In this study, we compare the effect of Mel-Frequency Cepstral Coefficient (MFCC), Mel-Frequency Spectral Coefficient (MFSC), and delta features on model performance. We use the public dataset from the Every Ayah website and provide the usage information for future comparison. Our best model, using MFCC with delta and Manhattan LSTM, produces an F1-score of 77.35%.
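The single similarity value produced by a Manhattan LSTM can be sketched as follows. The abstract does not state the formula, so this assumes the commonly used form exp(-L1 distance) between the two final encoder states; the vectors below are made-up stand-ins for LSTM outputs, and the function names and the 0.5 decision threshold are hypothetical.

```python
import math
from typing import Sequence


def manhattan_similarity(h_a: Sequence[float], h_b: Sequence[float]) -> float:
    """Similarity in (0, 1]: exp of the negative L1 distance between two encodings.
    Identical encodings give 1.0; distant encodings approach 0."""
    if len(h_a) != len(h_b):
        raise ValueError("encodings must have the same dimension")
    l1_distance = sum(abs(x - y) for x, y in zip(h_a, h_b))
    return math.exp(-l1_distance)


def is_same_verse(h_a: Sequence[float], h_b: Sequence[float],
                  threshold: float = 0.5) -> bool:
    """Accept the recitation when the similarity reaches the (assumed) threshold."""
    return manhattan_similarity(h_a, h_b) >= threshold


# Stand-in encodings for a reference recitation and two user attempts.
reference = [0.10, 0.20, -0.30]
close_attempt = [0.10, 0.25, -0.30]
far_attempt = [1.10, -0.80, 0.70]

print(is_same_verse(reference, close_attempt))
print(is_same_verse(reference, far_attempt))
```

A Siamese-Classifier, by contrast, would pass the pair of encodings to a learned binary output layer instead of applying a fixed distance function.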
