A Comparative Analysis of Dimensionality Reduction Methods for Genetic Programming to Solve High-Dimensional Symbolic Regression Problems

Author(s):  
Lianjie Zhong ◽  
Jinghui Zhong ◽  
Chengyu Lu
2009 ◽  
Vol 18 (05) ◽  
pp. 757-781 ◽  
Author(s):  
CÉSAR L. ALONSO ◽  
JOSÉ LUIS MONTAÑA ◽  
JORGE PUENTE ◽  
CRUZ ENRIQUE BORGES

Tree encodings of programs are well known for their representational power and are widely used in Genetic Programming. In this paper we experiment with a new data structure, the straight line program (slp), to represent computer programs. We describe the main features of this structure, introduce new GP recombination operators for slp's, and study the Vapnik-Chervonenkis dimension of families of slp's. Experiments have been performed on symbolic regression problems. The results are encouraging and suggest that the GP approach based on slp's consistently outperforms conventional GP based on tree-structured representations.
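The abstract's key idea can be sketched as follows: unlike a tree, an slp is a sequence of instructions whose intermediate results can be reused by later instructions, so shared subexpressions need not be duplicated. This is an illustrative Python sketch, not the authors' implementation; the instruction format, reference naming, and operator set are assumptions.

```python
# A minimal sketch of a straight line program (slp) for symbolic regression.
# Each instruction is (op, arg1, arg2); an argument names either an input
# variable ("x0", "x1", ...) or the result of an earlier instruction ("u0", ...).
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_slp(instructions, inputs):
    """Evaluate an slp on a dict of input values; the last result is the output."""
    results = []
    def value(ref):
        if ref.startswith("u"):
            return results[int(ref[1:])]  # reuse an earlier intermediate result
        return inputs[ref]
    for op, a, b in instructions:
        results.append(OPS[op](value(a), value(b)))
    return results[-1]

# Encodes f(x0, x1) = (x0 + x1) * (x0 + x1): instruction u0 is reused,
# whereas a tree encoding would have to duplicate the whole subtree.
slp = [("+", "x0", "x1"), ("*", "u0", "u0")]
print(eval_slp(slp, {"x0": 2.0, "x1": 3.0}))  # 25.0
```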


2020 ◽  
Vol 49 (3) ◽  
pp. 421-437
Author(s):  
Genggeng Liu ◽  
Lin Xie ◽  
Chi-Hua Chen

Dimensionality reduction plays an important role in data processing for machine learning and data mining, making the handling of high-dimensional data more efficient. Dimensionality reduction extracts a low-dimensional feature representation of high-dimensional data; an effective method not only preserves most of the useful information in the original data but also removes useless noise. Dimensionality reduction methods can be applied to all types of data, especially image data. Although supervised methods have achieved good results in dimensionality reduction, their performance depends on the number of labeled training samples, and with the growth of information on the internet, labeling data requires more resources and becomes more difficult. Using unsupervised learning to learn data features therefore has significant research value. In this paper, an unsupervised multilayered variational auto-encoder model is studied on text data, so that mapping high-dimensional features to low-dimensional features becomes efficient while the low-dimensional features retain as much of the essential information as possible. Low-dimensional features obtained by different dimensionality reduction methods are compared with the results of the variational auto-encoder (VAE), and the proposed method improves significantly over the comparison methods.
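As an illustration of the VAE's core mechanism (not the paper's model), the reparameterization step that makes sampling the low-dimensional latent code differentiable can be sketched in NumPy; the shapes and the seeded generator are assumptions for the example:

```python
# Reparameterization trick: instead of sampling z ~ N(mu, sigma^2) directly,
# sample eps ~ N(0, I) and compute z = mu + sigma * eps, so gradients can
# flow through mu and log_var during training.
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Draw a latent sample z from N(mu, exp(log_var)) via eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# The encoder maps each high-dimensional input to a low-dimensional mu and
# log_var; here 4 inputs are encoded into a 2-dimensional latent space.
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))   # log variance 0 -> unit variance
z = reparameterize(mu, log_var)
print(z.shape)  # (4, 2)
```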


2010 ◽  
Vol 09 (01) ◽  
pp. 81-92 ◽  
Author(s):  
Ch. Aswani Kumar ◽  
Ramaraj Palanisamy

Matrix decomposition methods such as Singular Value Decomposition (SVD) and Semi Discrete Decomposition (SDD) have proven successful for dimensionality reduction. However, to the best of our knowledge, no empirical results have been presented and no comparison between these methods has been made for uncovering latent structures in data. In this paper, we show how these methods can be used to identify and visualise latent structures in time series data. Results on a high-dimensional dataset demonstrate that SVD is more successful in uncovering the latent structures.
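A minimal NumPy sketch of SVD-based dimensionality reduction may make the technique concrete; the mean-centering step and the choice of k are assumptions for the example, not details from the paper:

```python
# Project high-dimensional rows onto the top-k left singular directions.
import numpy as np

def svd_reduce(X, k):
    """Return the k-dimensional coordinates of each row of X."""
    U, s, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    return U[:, :k] * s[:k]  # scale by singular values to keep variance info

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))  # 100 observations, 20 dimensions
Z = svd_reduce(X, 2)                # reduced to 2 dimensions for visualisation
print(Z.shape)  # (100, 2)
```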


2021 ◽  
Author(s):  
Mahsa Mostowfi

This work proposes a hybrid algorithm called Probabilistic Incremental Cartesian Genetic Programming (PI-CGP), which integrates an Estimation of Distribution Algorithm (EDA) with Cartesian Genetic Programming (CGP). PI-CGP uses a fixed-length problem representation, and the algorithm constructs a probabilistic model of promising solutions. PI-CGP was evaluated on symbolic regression problems and next-trading-day stock price forecasting. On the symbolic regression problems PI-CGP did not outperform other approaches, likely owing to premature convergence and entrapment in local minima. However, PI-CGP was competitive at stock market forecasting: it was comparable to a fusion model employing a Hidden Markov Model (HMM), a technique extensively used for time-series forecasting. This result is promising considering the volatile nature of the stock market and that PI-CGP was not customized for forecasting.
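The EDA component can be illustrated with a PBIL-style incremental update over a fixed-length binary representation. This is a generic sketch, not PI-CGP's actual model (CGP genotypes are integer-valued), and the learning rate and elite count are assumptions:

```python
# PBIL-style EDA: maintain an independent probability per gene and shift it
# toward the frequency observed in the elite (most fit) solutions.
import numpy as np

def update_model(probs, elites, rate=0.1):
    """Move each gene probability toward the elite gene frequencies."""
    freq = elites.mean(axis=0)
    return (1 - rate) * probs + rate * freq

rng = np.random.default_rng(1)
probs = np.full(8, 0.5)            # uniform initial model over 8 binary genes
pop = rng.random((20, 8)) < probs  # sample a population of 20 candidates
fitness = pop.sum(axis=1)          # toy fitness: number of ones
elites = pop[np.argsort(fitness)[-5:]]      # keep the 5 best candidates
probs = update_model(probs, elites.astype(float))
print(probs.shape)  # (8,)
```

Iterating sample-select-update drives the model toward high-fitness regions; without a diversity mechanism the probabilities can saturate, which matches the premature-convergence behaviour reported above.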

