Measuring feature importance of symbolic regression models using partial effects

Author(s):  
Guilherme Seidyo Imai Aldeia ◽  
Fabrício Olivetti de França
2020 ◽  
Author(s):  
Giovanni Scabbia ◽  
Antonio Sanfilippo ◽  
Annamaria Mazzoni ◽  
Dunia Bachour ◽  
Daniel Perez-Astudillo ◽  
...  

Abstract: A growing number of studies have suggested potential impacts of meteorological variables on the spread of the COVID-19 pandemic. These impacts are supported by data from similar viral contagions, such as SARS and the 1918 Flu Pandemic, and corroborated by US influenza data from the last decade. However, there is still limited understanding of the extent to which meteorology affects COVID-19 transmission rates and of how meteorology can help anticipate social and political measures. This study demonstrates that such an understanding is attainable through the development of regression models that verify the contribution of meteorology to the modeling of COVID-19 transmission, and through the use of feature importance techniques that assess the relative weight of meteorological variables compared to epidemiological, socioeconomic, environmental, and global health indicator factors. The study results show that meteorological factors play an important role in regression models of COVID-19 transmission that have low error rates (R² = 0.964). These results are corroborated by a panel data fixed-effect model showing that meteorological coefficients are often significantly correlated with COVID-19 transmission rates (R² = 0.691-0.746, p < 0.01).


2021 ◽  
Vol 104 ◽  
pp. 107198
Author(s):  
Aliyu Sani Sambo ◽  
R. Muhammad Atif Azad ◽  
Yevgeniya Kovalchuk ◽  
Vivek Padmanaabhan Indramohan ◽  
Hanifa Shah

2020 ◽  
pp. 1-25
Author(s):  
F. O. de Franca ◽  
G. S. I. Aldeia

Interaction-Transformation (IT) is a new representation for Symbolic Regression that reduces the space of solutions to a set of expressions that follow a specific structure. The potential of this representation was illustrated in prior work with an algorithm called SymTree. This algorithm starts with a simple linear model and incrementally introduces new transformed features until a stopping criterion is met. While the results obtained by this algorithm were competitive with the literature, it had the drawback of not scaling well with the problem dimension. This paper introduces a mutation-only Evolutionary Algorithm, called ITEA, capable of evolving a population of IT expressions. One advantage of this algorithm is that it enables the user to specify the maximum number of terms in an expression. To verify the competitiveness of this approach, ITEA is compared to linear, nonlinear, and Symbolic Regression models from the literature. The results indicate that ITEA is capable of finding equal or better approximations than other Symbolic Regression models while being competitive with state-of-the-art nonlinear models. Additionally, since this representation follows a specific structure, it is possible to extract the importance of each original feature of a data set as an analytical function, enabling the explanation of any prediction to be automated. In conclusion, ITEA is competitive when compared with other regression models, with the additional benefit of automating the extraction of additional information from the generated models.
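To illustrate how a fixed expression structure allows feature importance to be written as an analytical function, the following is a minimal sketch and not the authors' implementation. It assumes an IT expression takes the form f(x) = w_0 + sum_i w_i * t_i(prod_j x_j^k_ij), which the abstract above does not spell out. The function names `it_eval` and `it_partial`, the argument layout, and the example expression are all hypothetical.

```python
import numpy as np

def it_eval(X, weights, funcs, exponents, intercept=0.0):
    """Evaluate an assumed IT-style expression on each row of X.

    funcs is a list of (t, dt) pairs: a transformation and its derivative.
    exponents[i, j] is the power of feature j in the interaction of term i.
    """
    X = np.asarray(X, dtype=float)
    out = np.full(X.shape[0], intercept, dtype=float)
    for w, (t, _), k in zip(weights, funcs, exponents):
        p = np.prod(X ** k, axis=1)   # interaction: product of powers
        out += w * t(p)               # weighted transformation
    return out

def it_partial(X, j, weights, funcs, exponents):
    """Analytical partial derivative of the expression w.r.t. feature j."""
    X = np.asarray(X, dtype=float)
    grad = np.zeros(X.shape[0])
    for w, (_, dt), k in zip(weights, funcs, exponents):
        if k[j] == 0:
            continue                  # this term does not depend on x_j
        p = np.prod(X ** k, axis=1)
        k_minus = k.copy()
        k_minus[j] -= 1               # power rule on the interaction
        dp = k[j] * np.prod(X ** k_minus, axis=1)
        grad += w * dt(p) * dp        # chain rule through the transformation
    return grad

# Hypothetical expression: f(x) = 2 * x0 * x1 + 3 * sin(x0 ** 2)
weights = [2.0, 3.0]
funcs = [(lambda p: p, lambda p: np.ones_like(p)), (np.sin, np.cos)]
exponents = np.array([[1, 1], [2, 0]])

X = np.array([[1.0, 2.0], [0.5, -1.5]])
print(it_eval(X, weights, funcs, exponents))
print(it_partial(X, 0, weights, funcs, exponents))
print(it_partial(X, 1, weights, funcs, exponents))
```

Because every term is a transformation applied to a product of powers, the chain rule and power rule give the derivative in closed form, so no numerical differentiation is needed. How such derivatives are aggregated into an importance score is a separate design choice that this sketch does not cover.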


Author(s):  
Michael Kommenda ◽  
Gabriel Kronberger ◽  
Michael Affenzeller ◽  
Stephan M. Winkler ◽  
Bogdan Burlacu
