A Comparative Analysis of Dimensionality Reduction Methods for Genetic Programming to Solve High-Dimensional Symbolic Regression Problems

Author(s):  
Lianjie Zhong ◽  
Jinghui Zhong ◽  
Chengyu Lu
2009 ◽  
Vol 18 (05) ◽  
pp. 757-781 ◽  
Author(s):  
CÉSAR L. ALONSO ◽  
JOSÉ LUIS MONTAÑA ◽  
JORGE PUENTE ◽  
CRUZ ENRIQUE BORGES

Tree encodings of programs are well known for their representational power and are widely used in Genetic Programming. In this paper we experiment with a new data structure, the straight line program (slp), to represent computer programs. We describe the main features of this structure, introduce new GP recombination operators for slp's, and study the Vapnik-Chervonenkis dimension of families of slp's. Experiments have been performed on symbolic regression problems. The results are encouraging and suggest that the GP approach based on slp's consistently outperforms conventional GP based on tree-structured representations.
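The abstract's key idea can be sketched as follows: unlike a tree, an slp is a sequence of instructions whose intermediate results can be reused by later instructions, so shared subexpressions need not be duplicated. This is an illustrative Python sketch, not the authors' implementation; the instruction format, reference naming, and operator set are assumptions.

```python
# A minimal sketch of a straight line program (slp) for symbolic regression.
# Each instruction is (op, arg1, arg2); an argument names either an input
# variable ("x0", "x1", ...) or the result of an earlier instruction ("u0", ...).
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_slp(instructions, inputs):
    """Evaluate an slp on a dict of input values; the last result is the output."""
    results = []
    def value(ref):
        if ref.startswith("u"):
            return results[int(ref[1:])]  # reuse an earlier intermediate result
        return inputs[ref]
    for op, a, b in instructions:
        results.append(OPS[op](value(a), value(b)))
    return results[-1]

# Encodes f(x0, x1) = (x0 + x1) * (x0 + x1): instruction u0 is reused,
# whereas a tree encoding would have to duplicate the whole subtree.
slp = [("+", "x0", "x1"), ("*", "u0", "u0")]
print(eval_slp(slp, {"x0": 2.0, "x1": 3.0}))  # 25.0
```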


2020 ◽  
Vol 49 (3) ◽  
pp. 421-437
Author(s):  
Genggeng Liu ◽  
Lin Xie ◽  
Chi-Hua Chen

Dimensionality reduction plays an important role in data processing for machine learning and data mining, making the handling of high-dimensional data more efficient. Dimensionality reduction extracts a low-dimensional feature representation of high-dimensional data; an effective method not only preserves most of the useful information in the original data but also removes useless noise. Dimensionality reduction methods can be applied to all types of data, especially image data. Although supervised methods have achieved good results in dimensionality reduction, their performance depends on the number of labeled training samples, and with the growth of information on the internet, labeling data requires more resources and becomes more difficult. Using unsupervised learning to learn data features therefore has significant research value. In this paper, an unsupervised multilayered variational auto-encoder model is studied on text data, so that mapping high-dimensional features to low-dimensional features becomes efficient while the low-dimensional features retain as much of the essential information as possible. Low-dimensional features obtained by different dimensionality reduction methods are compared with the results of the variational auto-encoder (VAE), and the proposed method improves significantly over the comparison methods.
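As an illustration of the VAE's core mechanism (not the paper's model), the reparameterization step that makes sampling the low-dimensional latent code differentiable can be sketched in NumPy; the shapes and the seeded generator are assumptions for the example:

```python
# Reparameterization trick: instead of sampling z ~ N(mu, sigma^2) directly,
# sample eps ~ N(0, I) and compute z = mu + sigma * eps, so gradients can
# flow through mu and log_var during training.
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Draw a latent sample z from N(mu, exp(log_var)) via eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# The encoder maps each high-dimensional input to a low-dimensional mu and
# log_var; here 4 inputs are encoded into a 2-dimensional latent space.
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))   # log variance 0 -> unit variance
z = reparameterize(mu, log_var)
print(z.shape)  # (4, 2)
```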


2010 ◽  
Vol 09 (01) ◽  
pp. 81-92 ◽  
Author(s):  
Ch. Aswani Kumar ◽  
Ramaraj Palanisamy

Matrix decomposition methods such as Singular Value Decomposition (SVD) and Semi Discrete Decomposition (SDD) have proven successful for dimensionality reduction. However, to the best of our knowledge, no empirical results have been presented and no comparison between these methods has been made for uncovering latent structures in data. In this paper, we show how these methods can be used to identify and visualise latent structures in time series data. Results on a high-dimensional dataset demonstrate that SVD is more successful in uncovering the latent structures.
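A minimal NumPy sketch of SVD-based dimensionality reduction may make the technique concrete; the mean-centering step and the choice of k are assumptions for the example, not details from the paper:

```python
# Project high-dimensional rows onto the top-k left singular directions.
import numpy as np

def svd_reduce(X, k):
    """Return the k-dimensional coordinates of each row of X."""
    U, s, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    return U[:, :k] * s[:k]  # scale by singular values to keep variance info

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))  # 100 observations, 20 dimensions
Z = svd_reduce(X, 2)                # reduced to 2 dimensions for visualisation
print(Z.shape)  # (100, 2)
```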


2021 ◽  
Author(s):  
Mahsa Mostowfi

This work proposes a hybrid algorithm called Probabilistic Incremental Cartesian Genetic Programming (PI-CGP), which integrates an Estimation of Distribution Algorithm (EDA) with Cartesian Genetic Programming (CGP). PI-CGP uses a fixed-length problem representation, and the algorithm constructs a probabilistic model of promising solutions. PI-CGP was evaluated on symbolic regression problems and next-trading-day stock price forecasting. On the symbolic regression problems PI-CGP did not outperform other approaches, likely owing to premature convergence and entrapment in local minima. However, PI-CGP was competitive at stock market forecasting: it was comparable to a fusion model employing a Hidden Markov Model (HMM), a technique extensively used for time-series forecasting. This result is promising considering the volatile nature of the stock market and that PI-CGP was not customized for forecasting.
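The EDA component can be illustrated with a PBIL-style incremental update over a fixed-length binary representation. This is a generic sketch, not PI-CGP's actual model (CGP genotypes are integer-valued), and the learning rate and elite count are assumptions:

```python
# PBIL-style EDA: maintain an independent probability per gene and shift it
# toward the frequency observed in the elite (most fit) solutions.
import numpy as np

def update_model(probs, elites, rate=0.1):
    """Move each gene probability toward the elite gene frequencies."""
    freq = elites.mean(axis=0)
    return (1 - rate) * probs + rate * freq

rng = np.random.default_rng(1)
probs = np.full(8, 0.5)            # uniform initial model over 8 binary genes
pop = rng.random((20, 8)) < probs  # sample a population of 20 candidates
fitness = pop.sum(axis=1)          # toy fitness: number of ones
elites = pop[np.argsort(fitness)[-5:]]      # keep the 5 best candidates
probs = update_model(probs, elites.astype(float))
print(probs.shape)  # (8,)
```

Iterating sample-select-update drives the model toward high-fitness regions; without a diversity mechanism the probabilities can saturate, which matches the premature-convergence behaviour reported above.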

