Interaction-Transformation Evolutionary Algorithm for Symbolic Regression

2020 ◽  
pp. 1-25
Author(s):  
F. O. de Franca ◽  
G. S. I. Aldeia

Interaction-Transformation (IT) is a new representation for Symbolic Regression that reduces the space of solutions to a set of expressions that follow a specific structure. The potential of this representation was illustrated in prior work with the algorithm called SymTree. This algorithm starts with a simple linear model and incrementally introduces new transformed features until a stop criterion is met. While the results obtained by this algorithm were competitive with the literature, it had the drawback of not scaling well with the problem dimension. This paper introduces a mutation-only Evolutionary Algorithm, called ITEA, capable of evolving a population of IT expressions. One advantage of this algorithm is that it enables the user to specify the maximum number of terms in an expression. In order to verify the competitiveness of this approach, ITEA is compared to linear, nonlinear and Symbolic Regression models from the literature. The results indicate that ITEA is capable of finding equal or better approximations than other Symbolic Regression models while being competitive with state-of-the-art nonlinear models. Additionally, since this representation follows a specific structure, it is possible to extract the importance of each original feature of a data set as an analytical function, enabling us to automate the explanation of any prediction. In conclusion, ITEA is competitive with other regression models, with the added benefit of automating the extraction of further information from the generated models.
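The abstract does not spell out the structure that IT expressions follow. As a rough illustration only, the sketch below assumes one plausible form: a bias plus a weighted sum of terms, where each term applies a one-argument transformation to a product of the input variables raised to integer exponents. That form, and the names `ITTerm`, `interaction` and `evaluate_it`, are assumptions made for this example and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ITTerm:
    """One assumed term: weight * transform(prod_j x_j ** exponents[j])."""
    weight: float
    transform: Callable[[float], float]
    exponents: Sequence[int]


def interaction(x: Sequence[float], exponents: Sequence[int]) -> float:
    """Product of the inputs, each raised to the matching exponent."""
    result = 1.0
    for value, power in zip(x, exponents):
        result *= value ** power
    return result


def evaluate_it(terms: Sequence[ITTerm], x: Sequence[float], bias: float = 0.0) -> float:
    """Evaluate bias + sum of weighted, transformed interactions."""
    return bias + sum(
        t.weight * t.transform(interaction(x, t.exponents)) for t in terms
    )


# Hypothetical expression: f(x) = 1.0 + 2.0 * (x0 * x1) + 0.5 * sqrt(x0 ** 2)
terms = [
    ITTerm(weight=2.0, transform=lambda z: z, exponents=(1, 1)),
    ITTerm(weight=0.5, transform=math.sqrt, exponents=(2, 0)),
]
value = evaluate_it(terms, x=(3.0, 4.0), bias=1.0)
```

Under this assumed form, capping the number of terms is simply a cap on the length of `terms`, and because the weights enter linearly they could be fitted with an ordinary linear regression once the transformed features are computed.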

2010 ◽  
Vol 62 (4) ◽  
pp. 875-882 ◽  
Author(s):  
A. Dembélé ◽  
J.-L. Bertrand-Krajewski ◽  
B. Barillon

Regression models are among the most frequently used models to estimate pollutant event mean concentrations (EMC) in wet weather discharges in urban catchments. Two main questions dealing with the calibration of EMC regression models are investigated: (i) the sensitivity of models to the size and the content of the data sets used for their calibration, and (ii) how modelling results change when models are re-calibrated as data sets grow and change over time with newly collected experimental data. Based on an experimental data set of 64 rain events monitored in a densely urbanised catchment, four TSS EMC regression models (two log-linear and two linear models) with two or three explanatory variables have been derived and analysed. Model calibration with the iterative re-weighted least squares method is less sensitive and leads to more robust results than the ordinary least squares method. Three calibration options have been investigated: two options accounting for the chronological order of the observations, and one option using random samples of events from the whole available data set. Results obtained with the best performing non-linear model clearly indicate that the model is highly sensitive to the size and the content of the data set used for its calibration.
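To make the contrast between the two calibration methods concrete, here is a minimal sketch of iteratively re-weighted least squares for a one-variable line fit. The abstract does not say which weighting scheme the authors used; Huber weights are assumed here purely for illustration, and the data, the function names and the `delta` threshold are all made up.

```python
from typing import List, Sequence, Tuple


def weighted_line_fit(
    x: Sequence[float], y: Sequence[float], w: Sequence[float]
) -> Tuple[float, float]:
    """Closed-form weighted least squares for y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def irls_line_fit(
    x: Sequence[float], y: Sequence[float], delta: float = 1.0, iterations: int = 50
) -> Tuple[float, float]:
    """IRLS with (assumed) Huber weights: large residuals are down-weighted."""
    weights: List[float] = [1.0] * len(x)  # first pass equals ordinary least squares
    a, b = weighted_line_fit(x, y, weights)
    for _ in range(iterations):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        weights = [1.0 if abs(r) <= delta else delta / abs(r) for r in residuals]
        a, b = weighted_line_fit(x, y, weights)
    return a, b


# Made-up data: y = 2x exactly, except for one outlying event.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 42.0]
ols_a, ols_b = weighted_line_fit(x, y, [1.0] * len(x))
irls_a, irls_b = irls_line_fit(x, y)
```

On this toy data the single outlier drags the ordinary least squares slope far from 2, while the re-weighted fit stays much closer to it, which is the kind of robustness the abstract reports.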


2021 ◽  
Author(s):  
Mzwakhe Magagula ◽  
Shaun Ramroop ◽  
Faustin Habyarimana

Background: Child malnutrition is perhaps one of the main medical conditions influencing general human wellbeing, mainly in non-industrial nations. Developing reliable assessments of malnutrition is one of the difficulties encountered by policymakers in numerous countries worldwide. The current study was therefore undertaken with the primary goal of evaluating and determining the potential determinants of childhood malnutrition in Malawi, using the 2015/16 Demographic and Health Survey (DHS) data. The study seeks to reveal some of the significant factors that are perpetuating the incidence of malnutrition in children in Malawi. It is also designed to offer deeper insights into how the probability of being diagnosed with this medical condition (malnutrition) evolves across the different levels of the significant factors found. Methods: The proportional odds (PO) model was chosen as the most suitable model, motivated by the design of the current study's data set. The PO model offers a way to conceptualize how ordinal data can be sequentially partitioned into dichotomous groups without losing the ordinal nature of the response variable. The model is an extension of logistic regression models with two outcomes, and it is one of the best models for dealing with an ordinal response variable comprising more than two categories. The PO model and the logistic regression models are common classes of generalised linear models (GLMs), mostly used to model the association between a dependent variable and independent variables.
Results: Fitting the PO model to the Malawi DHS data to investigate risk factors associated with malnutrition (stunting) suggested that the age of the child, birth type (singleton/multiple births), parents' level of education, the household's type of residence, the mother's age at the time of birth, the mother's BMI, and the incidence of diarrhoea in the two weeks before the survey are the most significant independent risk factors for malnutrition (stunting). Conclusions: All the aforementioned risk factors are controllable, and they can be improved through intervention strategies. National policies are required to counteract this condition, as the majority of the risk factors need the coherent actions of several governing authorities.
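The way a proportional odds model turns one linear predictor into probabilities for several ordered categories can be sketched as follows. The sign convention P(Y <= j) = logistic(theta_j - eta) is one common parameterisation and is assumed here; the cut-points, predictor values and function names are hypothetical and not taken from the study.

```python
import math
from typing import List, Sequence


def logistic(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(
    thresholds: Sequence[float], linear_predictor: float
) -> List[float]:
    """P(Y = j) for ordered categories, assuming P(Y <= j) = logistic(theta_j - eta).

    `thresholds` must be increasing; k thresholds give k + 1 categories.
    """
    cumulative = [logistic(theta - linear_predictor) for theta in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probabilities


# Hypothetical values: three ordered severity levels, hence two cut-points.
thresholds = [-1.0, 1.0]
baseline = category_probabilities(thresholds, linear_predictor=0.0)
higher_risk = category_probabilities(thresholds, linear_predictor=1.5)
```

Each cut-point defines one dichotomy of the ordinal outcome, and the same linear predictor shifts all of them by the same amount on the log-odds scale, which is the "proportional odds" assumption.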


2018 ◽  
Vol 61 (2) ◽  
pp. 195-209 ◽  
Author(s):  
Kaifeng Zhao ◽  
Seyed Hanif Mahboobi ◽  
Saeed R Bagheri

This article examines and proposes several attribution models that quantify how revenue should be attributed to online advertising inputs. We adopt and further develop relative importance methods, which are based on regression models that have been extensively studied and utilized to investigate the relationship between advertising efforts and market reaction (revenue). The relative importance methods aim at decomposing and allocating marginal contributions to the coefficient of determination (R²) of the regression models as attribution values. In particular, we adopt two alternative submethods to perform this decomposition: dominance analysis and relative weight analysis. Moreover, we demonstrate an extension of the decomposition methods from standard linear models to additive models. We claim that our new approaches are more flexible and accurate in modeling the underlying relationship and quantifying the attribution values. We use simulation examples to demonstrate the superior performance of our new approaches over traditional methods. We further illustrate the value of our proposed approaches using a real advertising campaign data set.
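As a small illustration of decomposing a coefficient of determination into per-input attribution values, the sketch below computes a general-dominance-style score: for each predictor, the gain in R² from adding it is averaged within each subset size and then across sizes. The R² values, channel names and function name are invented for the example, and this is not claimed to be the authors' exact procedure.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def general_dominance(
    r2: Dict[FrozenSet[str], float], predictors: List[str]
) -> Dict[str, float]:
    """Average incremental R^2 of each predictor over all subsets of the others."""
    p = len(predictors)
    result: Dict[str, float] = {}
    for target in predictors:
        others = [v for v in predictors if v != target]
        size_means = []
        for k in range(p):  # subset sizes 0 .. p - 1
            gains = [
                r2[frozenset(subset) | {target}] - r2[frozenset(subset)]
                for subset in combinations(others, k)
            ]
            size_means.append(sum(gains) / len(gains))
        result[target] = sum(size_means) / p
    return result


# Made-up R^2 values for every subset of two hypothetical advertising channels.
r2 = {
    frozenset(): 0.0,
    frozenset({"search"}): 0.30,
    frozenset({"display"}): 0.20,
    frozenset({"search", "display"}): 0.40,
}
attribution = general_dominance(r2, ["search", "display"])
```

With these invented numbers the two scores sum to the full-model R² of 0.40, so the whole explained variance is allocated across the inputs.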


2021 ◽  
pp. 095679762097165
Author(s):  
Matthew T. McBee ◽  
Rebecca J. Brand ◽  
Wallace E. Dixon

In 2004, Christakis and colleagues published an article in which they claimed that early childhood television exposure causes later attention problems, a claim that continues to be frequently promoted by the popular media. Using the same National Longitudinal Survey of Youth 1979 data set (N = 2,108), we conducted two multiverse analyses to examine whether the finding reported by Christakis and colleagues was robust to different analytic choices. We evaluated 848 models, including logistic regression models, linear regression models, and two forms of propensity-score analysis. If the claim were true, we would expect most of the justifiable analyses to produce significant results in the predicted direction. However, only 166 models (19.6%) yielded a statistically significant relationship, and most of these employed questionable analytic choices. We concluded that these data do not provide compelling evidence of a harmful effect of TV exposure on attention.


Author(s):  
Manfred Ehresmann ◽  
Georg Herdrich ◽  
Stefanos Fasoulas

In this paper, a generic full-system estimation software tool is introduced and applied to a data set of actual flight missions to derive a heuristic for system composition for mass and power ratios of considered sub-systems. The capability of evolutionary algorithms to analyse and effectively design spacecraft (sub-)systems is shown. After deriving top-level estimates for each spacecraft sub-system based on heuristic heritage data, a detailed component-based system analysis follows. Various degrees of freedom exist for a hardware-based sub-system design; these are to be resolved via an evolutionary algorithm to determine an optimal system configuration. A propulsion system implementation for a small satellite test case will serve as a reference example of the implemented algorithm application. The propulsion system includes thruster, power processing unit, tank, propellant and general power supply system masses and power consumptions. Relevant performance parameters such as desired thrust, effective exhaust velocity, utilised propellant, and the propulsion type are considered as degrees of freedom. An evolutionary algorithm is applied to the propulsion system scaling model to demonstrate that such evolutionary algorithms are capable of bypassing complex multidimensional design optimisation problems. An evolutionary algorithm is an algorithm that uses a heuristic to change input parameters and a defined selection criterion (e.g., mass fraction of the system) on an optimisation function to refine solutions successively. With sufficient generations and, thereby, iterations of design points, local optima are determined. Using mitigation methods and a sufficient number of seed points, a globally optimal system configuration can be found.
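The loop described above (perturb the input parameters, apply a selection criterion, repeat over generations, start from several seed points) can be sketched on a toy problem. Everything here is an assumption for illustration: the one-dimensional cost function with one local and one global minimum stands in for a real system model, Gaussian mutation stands in for the heuristic, and the function names and default settings are hypothetical.

```python
import random
from typing import Callable, Tuple


def evolve(
    cost: Callable[[float], float],
    bounds: Tuple[float, float],
    seeds: int = 20,
    generations: int = 200,
    sigma: float = 0.3,
    rng_seed: int = 0,
) -> Tuple[float, float]:
    """Mutation-only evolutionary search; returns (best_x, best_cost)."""
    rng = random.Random(rng_seed)
    low, high = bounds
    # Several random seed points reduce the risk of settling in a local optimum.
    population = [rng.uniform(low, high) for _ in range(seeds)]
    for _ in range(generations):
        # Heuristic change of the input parameter: Gaussian mutation, clamped to bounds.
        offspring = [min(high, max(low, x + rng.gauss(0.0, sigma))) for x in population]
        # Selection criterion: keep the lowest-cost candidates among parents and offspring.
        population = sorted(population + offspring, key=cost)[:seeds]
    best = min(population, key=cost)
    return best, cost(best)


def toy_cost(x: float) -> float:
    """Stand-in objective: local minimum near x = 1.13, global minimum near x = -1.30."""
    return x ** 4 - 3.0 * x ** 2 + x


best_x, best_cost = evolve(toy_cost, bounds=(-2.0, 2.0))
```

Because parents compete with their offspring, the best candidate never gets worse from one generation to the next; a real design problem would replace `toy_cost` with the system scaling model and the scalar `x` with a vector of design parameters.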


Geophysics ◽  
2011 ◽  
Vol 76 (5) ◽  
pp. WB175-WB182 ◽  
Author(s):  
Yan Huang ◽  
Bing Bai ◽  
Haiyong Quan ◽  
Tony Huang ◽  
Sheng Xu ◽  
...  

The availability of wide-azimuth data and the use of reverse time migration (RTM) have dramatically increased the capabilities of imaging complex subsalt geology. With these improvements, the current obstacle for creating accurate subsalt images now lies in the velocity model. One of the challenges is to generate common image gathers that take full advantage of the additional information provided by wide-azimuth data and the additional accuracy provided by RTM for velocity model updating. A solution is to generate 3D angle domain common image gathers from RTM, which are indexed by subsurface reflection angle and subsurface azimuth angle. We apply these 3D angle gathers to subsalt tomography and obtain improvements in velocity updating with a wide-azimuth data set in the Gulf of Mexico.


2021 ◽  
pp. 107110072110581
Author(s):  
Wenye Song ◽  
Naohiro Shibuya ◽  
Daniel C. Jupiter

Background: Ankle fractures in patients with diabetes mellitus have long been recognized as a challenge to practicing clinicians. Ankle fracture patients with diabetes may experience prolonged healing, higher risk of hardware failure, an increased risk of wound dehiscence and infection, and higher pain scores pre- and postoperatively, compared to patients without diabetes. However, the duration of opioid use among this patient cohort has not been previously evaluated. The purpose of this study is to retrospectively compare the time span of opioid utilization between ankle fracture patients with and without diabetes mellitus. Methods: We conducted a retrospective cohort study using our institution’s TriNetX database. A total of 640 ankle fracture patients were included in the analysis, of whom 73 had diabetes. All dates of opioid use for each patient were extracted from the data set, including the first and last date of opioid prescription. Descriptive analysis and logistic regression models were employed to explore the differences in opioid use between patients with and without diabetes after ankle fracture repair. A 2-tailed P value of .05 was set as the threshold for statistical significance. Results: Logistic regression models revealed that patients with diabetes are less likely to stop using opioids within 90 days, or within 180 days, after repair compared to patients without diabetes. Female sex, neuropathy, and prefracture opioid use are also associated with prolonged opioid use after ankle fracture repair. Conclusion: In our study cohort, ankle fracture patients with diabetes were more likely to require prolonged opioid use after fracture repair. Level of Evidence: Level III, prognostic.

