Divide and Conquer: A Quick Scheme for Symbolic Regression

Author(s):  
Changtong Luo ◽  
Chen Chen ◽  
Zonglin Jiang

Symbolic regression (SR) is a machine learning method that can produce mathematical models with explicit expressions, and it has received increasing attention in recent years. However, finding a concise, accurate expression is still challenging because of the huge search space. In this work, a divide and conquer (D&C) scheme is proposed. It divides the search space into a number of orthogonal sub-spaces based on the separability features inferred from the sample data (dividing process). For each sub-space, a sub-function is learned (conquering process). The target model function is then reconstructed from the sub-functions according to their separability patterns. To this end, a separability pattern detection technique, the bi-correlation test (Bi-CT), is also proposed. Note that the sub-functions can be determined by any existing SR method, which makes D&C easy to use. The D&C-powered SR has been tested on many symbolic regression problems, and the study shows that D&C helps SR find the target function more quickly and reliably.
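The dividing step hinges on detecting separability from samples alone. As a minimal illustrative sketch (not the paper's Bi-CT; the function name, sampling box, and tolerance are assumptions for illustration), a bivariate function can be tested for additive separability f(x, y) = g(x) + h(y) by checking the identity f(x, y) = f(x, b) + f(a, y) - f(a, b) at random sample points:

```python
import numpy as np

def is_additively_separable(f, lo, hi, n_samples=200, tol=1e-8, seed=0):
    """Heuristic sample-based test for additive separability
    f(x, y) = g(x) + h(y).

    If f is additively separable, then for any anchor point (a, b):
        f(x, y) = f(x, b) + f(a, y) - f(a, b)
    holds identically. We check this identity at random sample points.
    """
    rng = np.random.default_rng(seed)
    a, b = rng.uniform(lo, hi, 2)
    xs = rng.uniform(lo, hi, n_samples)
    ys = rng.uniform(lo, hi, n_samples)
    lhs = f(xs, ys)
    rhs = f(xs, b) + f(a, ys) - f(a, b)
    return bool(np.max(np.abs(lhs - rhs)) < tol)
```

When the test passes, each variable can be conquered in its own sub-space; an analogous identity with ratios would detect multiplicative separability.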

2009 ◽  
Vol 18 (05) ◽  
pp. 757-781 ◽  
Author(s):  
CÉSAR L. ALONSO ◽  
JOSÉ LUIS MONTAÑA ◽  
JORGE PUENTE ◽  
CRUZ ENRIQUE BORGES

Tree encodings of programs are well known for their representative power and are used very often in Genetic Programming. In this paper we experiment with a new data structure, named straight line program (slp), to represent computer programs. The main features of this structure are described, new recombination operators for GP related to slp's are introduced, and a study of the Vapnik-Chervonenkis dimension of families of slp's is presented. Experiments have been performed on symbolic regression problems. The results are encouraging and suggest that the GP approach based on slp's consistently outperforms conventional GP based on tree-structured representations.
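Unlike a tree, a straight line program is a flat sequence of instructions in which each instruction may reuse the result of any earlier one, so common subexpressions are shared rather than duplicated. A minimal sketch of such a structure (the class and naming scheme below are illustrative assumptions, not the paper's implementation):

```python
import math

class SLP:
    """A straight line program: each instruction applies an operator to
    operands that are either input variables ('x0', 'x1', ...) or results
    of earlier instructions ('u0', 'u1', ...). The last instruction's
    result is the program's output."""

    OPS = {'+': lambda a, b: a + b,
           '-': lambda a, b: a - b,
           '*': lambda a, b: a * b,
           'sin': lambda a: math.sin(a)}

    def __init__(self, instructions):
        self.instructions = instructions  # list of (op, operand_names)

    def evaluate(self, inputs):
        env = dict(inputs)  # e.g. {'x0': 2.0, 'x1': 3.0}
        for i, (op, args) in enumerate(self.instructions):
            env[f'u{i}'] = self.OPS[op](*(env[a] for a in args))
        return env[f'u{len(self.instructions) - 1}']

# (x0 + x1) * x0 encoded as an slp: u0 = x0 + x1; u1 = u0 * x0
prog = SLP([('+', ('x0', 'x1')), ('*', ('u0', 'x0'))])
```

Because the representation is a linear list, recombination operators can exchange instruction subsequences directly instead of swapping subtrees.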


2008 ◽  
Vol 04 (02) ◽  
pp. 123-141 ◽  
Author(s):  
AREEG ABDALLA ◽  
JAMES BUCKLEY

We apply our new fuzzy Monte Carlo method to certain fuzzy non-linear regression problems to estimate the best solution. The best solution is a vector of triangular fuzzy numbers, for the fuzzy coefficients in the model, which minimizes an error measure. We use a quasi-random number generator to produce random sequences of these fuzzy vectors which uniformly fill the search space. We consider example problems to show that this Monte Carlo method obtains solutions comparable to those obtained by an evolutionary algorithm.
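The core idea, sketched below under stated assumptions (a one-coefficient linear model, a van der Corput sequence as the quasi-random generator, a toy error measure combining core fit and spread; all names and bounds are illustrative, not the paper's formulation), is to let a low-discrepancy sequence uniformly fill the space of triangular fuzzy coefficients and keep the best candidate:

```python
def van_der_corput(n, base=2):
    """n-th term of the van der Corput low-discrepancy sequence in [0, 1)."""
    q, bk = 0.0, 1.0 / base
    while n > 0:
        n, rem = divmod(n, base)
        q += rem * bk
        bk /= base
    return q

def fit_fuzzy_coefficient(xs, ys, n_trials=2000):
    """Quasi-random search for a triangular fuzzy coefficient (l, m, r),
    l <= m <= r, minimizing a simple error measure: squared error of the
    core value m, plus the spread (wider fuzzy numbers are penalized)."""
    best, best_err = None, float('inf')
    for i in range(1, n_trials + 1):
        # map three quasi-random coordinates into the search box [-5, 5]^3
        a = -5 + 10 * van_der_corput(i, 2)
        b = -5 + 10 * van_der_corput(i, 3)
        c = -5 + 10 * van_der_corput(i, 5)
        l, m, r = sorted((a, b, c))  # sort into a valid triangular number
        err = sum((y - m * x) ** 2 for x, y in zip(xs, ys)) + (r - l)
        if err < best_err:
            best, best_err = (l, m, r), err
    return best, best_err
```

Sorting each sampled triple guarantees a well-formed triangular fuzzy number, so every quasi-random draw is a feasible candidate.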


Technologies ◽  
2018 ◽  
Vol 7 (1) ◽  
pp. 3
Author(s):  
Panagiotis Oikonomou ◽  
Antonios Dadaliaris ◽  
Kostas Kolomvatsos ◽  
Thanasis Loukopoulos ◽  
Athanasios Kakarountas ◽  
...  

In standard cell placement, a circuit is given consisting of cells of a standard height (but different widths), and the problem is to place the cells in the standard rows of a chip area so that no overlaps occur and some target function is optimized. The process is usually split into at least two phases. In the first pass, a global placement algorithm distributes the cells across the circuit area, while in the second step, a legalization algorithm aligns the cells to the standard rows of the power grid and resolves any overlaps. Although a few legalization schemes have been proposed in the past for the basic problem formulation, few obstacle-aware extensions exist, and they usually offer extreme trade-offs between time performance and optimization efficiency. In this paper, we focus on the legalization step in the presence of pre-allocated modules acting as obstacles. We extend two known algorithmic approaches, namely Tetris and Abacus, so that they become obstacle-aware. Furthermore, we propose a parallelization scheme to tackle the computational complexity. The experiments illustrate that the proposed parallelization method achieves good scalability, while it also efficiently prunes the search space, resulting in a superlinear speedup. Moreover, this time performance comes at only a small cost (sometimes even an improvement) in the typical optimization metrics.
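To make the Tetris-style greedy idea concrete, here is a simplified single-row sketch (an assumption-laden illustration, not the paper's algorithm: one row, cells processed in order of desired position, each pushed right past already-placed cells and obstacle intervals):

```python
def legalize_row(cells, row_width, obstacles):
    """Greedy Tetris-style legalization sketch for one standard row.

    cells: list of (name, desired_x, width); obstacles: list of (start, end)
    intervals blocked by pre-placed modules. Cells are processed in order of
    desired position; each is placed at the leftmost legal position at or
    after its desired x, skipping obstacle intervals.
    """
    placed, cursor = [], 0.0
    for name, desired_x, width in sorted(cells, key=lambda c: c[1]):
        x = max(cursor, desired_x)
        moved = True
        while moved:  # push the cell past any obstacle it overlaps
            moved = False
            for start, end in obstacles:
                if x < end and x + width > start:
                    x = end
                    moved = True
        if x + width > row_width:
            raise ValueError(f'cell {name} does not fit in the row')
        placed.append((name, x))
        cursor = x + width
    return placed
```

The full two-dimensional algorithms additionally choose among rows by displacement cost, which is where the obstacle-aware extensions and the parallelization scheme come in.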


Author(s):  
Gabriel Kronberger ◽  
Lukas Kammerer ◽  
Bogdan Burlacu ◽  
Stephan M. Winkler ◽  
Michael Kommenda ◽  
...  

2019 ◽  
Vol 18 (01) ◽  
pp. 109-127
Author(s):  
Ting Hu ◽  
Jun Fan ◽  
Dao-Hong Xiang

In this paper, we establish the error analysis for distributed pairwise learning with multi-penalty regularization, based on a divide-and-conquer strategy. We demonstrate, via a [Formula: see text]-error bound, that the learning performance of this distributed learning scheme is as good as that of a single machine processing the whole data set. With semi-supervised data, we can relax the restriction on the number of local machines and enlarge the range of the target function while still guaranteeing the optimal learning rate. As a concrete example, we show that the results of this paper apply to the distributed pairwise learning algorithm with manifold regularization.
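The divide-and-conquer strategy itself is simple to state: partition the sample across local machines, solve a regularized problem on each subset, and combine the local estimators by averaging. A minimal sketch with plain ridge regression (the paper's setting is pairwise learning with multi-penalty regularization; this simpler stand-in only illustrates the split-solve-average pattern):

```python
import numpy as np

def distributed_ridge(X, y, n_machines=4, lam=0.1):
    """Divide-and-conquer sketch: split the sample across local machines,
    solve a regularized least-squares problem on each subset, then average
    the local estimators into a global one."""
    ws = []
    for Xi, yi in zip(np.array_split(X, n_machines),
                      np.array_split(y, n_machines)):
        d = Xi.shape[1]
        # local ridge solution: (Xi^T Xi + lam I)^{-1} Xi^T yi
        ws.append(np.linalg.solve(Xi.T @ Xi + lam * np.eye(d), Xi.T @ yi))
    return np.mean(ws, axis=0)
```

The error analysis then asks how large `n_machines` may grow before the averaged estimator loses the single-machine learning rate, which is exactly the restriction the semi-supervised setting relaxes.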


Author(s):  
CHENG JIN ◽  
YANGJING LONG

We present a distance metric learning algorithm for regression problems, which incorporates label information to form a biased distance metric during learning. We use Newton's optimization method to solve the optimization problem that yields this biased distance metric. Experiments show that this method can find the intrinsic variation trend of the data in a regression model from a relatively small number of samples, without any prior assumption about the structure or distribution of the data. In addition, test samples can be projected into this metric by a simple linear transformation, and the method can easily be combined with manifold learning algorithms to improve performance. Experiments are conducted on the FG-NET aging database, the UIUC-IFP-Y aging database, and the CHIL head pose database using Gaussian process regression based on the learned metric, showing that our method is competitive with the state of the art.
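To illustrate what a label-biased metric means (this is a deliberately simple stand-in, not the paper's Newton-based formulation: it builds a Mahalanobis-style matrix from an ordinary least-squares direction so that distances stretch along the direction most predictive of the label):

```python
import numpy as np

def label_biased_metric(X, y, eps=1e-3):
    """Illustrative sketch of a label-biased Mahalanobis-style metric:
    M = w w^T + eps * I, where w is the least-squares direction relating
    features to labels. Distances under M are amplified along w, so
    points with similar labels tend to be close under the biased metric."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.outer(w, w) + eps * np.eye(X.shape[1])

def biased_distance(M, a, b):
    """Distance between a and b under the metric matrix M."""
    d = np.asarray(a) - np.asarray(b)
    return float(np.sqrt(d @ M @ d))
```

Because M factors as A^T A (up to the eps regularizer), samples can indeed be mapped into the metric by a single linear transformation, matching the projection property described in the abstract.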

