A general framework for functional regression modelling

2017 ◽  
Vol 17 (1-2) ◽  
pp. 1-35 ◽  
Author(s):  
Sonja Greven ◽  
Fabian Scheipl

Researchers are increasingly interested in regression models for functional data. This article discusses a comprehensive framework for additive (mixed) models for functional responses and/or functional covariates based on the guiding principle of reframing functional regression in terms of corresponding models for scalar data, allowing the adaptation of a large body of existing methods for these novel tasks. The framework encompasses many existing as well as new models. It includes regression for ‘generalized’ functional data, mean regression, quantile regression as well as generalized additive models for location, shape and scale (GAMLSS) for functional data. It admits many flexible linear, smooth or interaction terms of scalar and functional covariates as well as (functional) random effects and allows flexible choices of bases (particularly splines and functional principal components) and corresponding penalties for each term. It covers functional data observed on common (dense) or curve-specific (sparse) grids. Penalized-likelihood-based and gradient-boosting-based inference for these models is implemented in the R packages refund and FDboost, respectively. We also discuss identifiability and computational complexity for the functional regression models covered. A running example on a longitudinal multiple sclerosis imaging study serves to illustrate the flexibility and utility of the proposed model class. Reproducible code for this case study is made available online.
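The idea of reframing functional regression as a scalar regression problem can be illustrated with a minimal sketch. It assumes a scalar response, one functional covariate observed on a common equally spaced grid, and a small polynomial basis for the coefficient function. The function names `basis` and `fit_scalar_on_function` are hypothetical, the fit is unpenalized least squares, and none of this is the refund or FDboost implementation.

```python
import numpy as np

def basis(t, k=3):
    """Hypothetical polynomial basis: columns 1, t, ..., t^(k-1)."""
    return np.vander(t, N=k, increasing=True)

def fit_scalar_on_function(X, y, t, k=3):
    """Estimate beta(t) in y_i = integral of x_i(t) * beta(t) dt + error.

    X: (n, m) array of curves observed on the common grid t (length m).
    Returns coefficients c of length k, with beta(t) approximated by basis(t) @ c.
    """
    dt = t[1] - t[0]              # equally spaced grid is assumed
    B = basis(t, k)               # (m, k) basis evaluations
    Z = (X @ B) * dt              # (n, k) scalar design matrix via a Riemann sum
    # Once Z is built, this is an ordinary scalar regression problem.
    c, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return c
```

After the basis expansion, the functional term is just a few extra columns in a standard design matrix, so penalties or other scalar-data methods could be applied to `Z` in the usual way.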

Author(s):  
François Freddy Ateba ◽  
Manuel Febrero-Bande ◽  
Issaka Sagara ◽  
Nafomon Sogoba ◽  
Mahamoudou Touré ◽  
...  

Mali aims to reach the pre-elimination stage of malaria by the next decade. This study used functional regression models to predict the incidence of malaria as a function of past meteorological patterns to better prevent and to act proactively against impending malaria outbreaks. All data were collected over a five-year period (2012–2017) from 1400 persons who sought treatment at Dangassa’s community health center. Rainfall, temperature, humidity, and wind speed variables were collected. Functional Generalized Spectral Additive Model (FGSAM), Functional Generalized Linear Model (FGLM), and Functional Generalized Kernel Additive Model (FGKAM) were used to predict malaria incidence as a function of the pattern of meteorological indicators over a continuum of the 18 weeks preceding the week of interest. Their respective outcomes were compared in terms of predictive abilities. The results showed that (1) the highest malaria incidence rate occurred in the village 10 to 12 weeks after we observed a pattern of air humidity levels >65%, combined with two or more consecutive rain episodes and a mean wind speed <1.8 m/s; (2) among the three models, the FGLM obtained the best results in terms of prediction; and (3) FGSAM was shown to be a good compromise between FGLM and FGKAM in terms of flexibility and simplicity. The models showed that some meteorological conditions may provide a basis for detection of future outbreaks of malaria. The models developed in this paper are useful for implementing preventive strategies using past meteorological data and past malaria incidence.


Mathematics ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 299
Author(s):  
Jaime Pinilla ◽  
Miguel Negrín

The interrupted time series analysis is a quasi-experimental design used to evaluate the effectiveness of an intervention. Segmented linear regression models have been the most used models to carry out this analysis. However, they assume a linear trend that may not be appropriate in many situations. In this paper, we show how generalized additive models (GAMs), a non-parametric regression-based method, can be useful to accommodate nonlinear trends. An analysis with simulated data is carried out to assess the performance of both models. Data were simulated from linear and non-linear (quadratic and cubic) functions. The results of this analysis show how GAMs improve on segmented linear regression models when the trend is non-linear, but they also perform well when the trend is linear. A real-life application, the impact of the 2012 Spanish cost-sharing reforms on pharmaceutical prescription, is also analyzed. Seasonality and an indicator variable for the stockpiling effect are included as explanatory variables. The segmented linear regression model shows a good fit to the data. However, the GAM rejects the hypothesis of a linear trend. The estimated level shift is similar for both models but the cumulative absolute effect on the number of prescriptions is lower in the GAM.
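For orientation, the segmented linear regression baseline can be sketched as an ordinary least squares fit with a level-shift term and a slope-change term. This uses a common textbook parameterization, not necessarily the one in the paper. The names `segmented_design`, `fit_segmented`, and the intervention time `t0` are hypothetical, and the seasonality and stockpiling terms are left out.

```python
import numpy as np

def segmented_design(time, t0):
    """Columns: intercept, time, level-shift indicator, time since intervention."""
    time = np.asarray(time, dtype=float)
    post = (time >= t0).astype(float)          # 1 from the intervention onwards
    return np.column_stack([np.ones_like(time), time, post, post * (time - t0)])

def fit_segmented(time, y, t0):
    """Least squares fit; returns (intercept, slope, level_shift, slope_change)."""
    Z = segmented_design(time, t0)
    coef, *_ = np.linalg.lstsq(Z, np.asarray(y, dtype=float), rcond=None)
    return coef
```

A GAM-based analysis would replace the linear `time` column with a smooth function of time while keeping the level-shift indicator, which is what allows it to accommodate nonlinear trends.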


2006 ◽  
Vol 199 (2) ◽  
pp. 176-187 ◽  
Author(s):  
Gretchen G. Moisen ◽  
Elizabeth A. Freeman ◽  
Jock A. Blackard ◽  
Tracey S. Frescino ◽  
Niklaus E. Zimmermann ◽  
...  

Author(s):  
Scott M. Storm ◽  
Raymond R. Hill ◽  
Joseph J. Pignatiello ◽  
G. Geoffrey Vining ◽  
Edward D. White

As we continue to model more complex systems, the validation of dynamical responses has come to the forefront of modeling and simulation. One form of dynamic response is when the output is a function of time. The proper evaluation of functional data over an array of desired input parameters is critical to achieving a robust validation assessment of a simulation model. We extend the correlation analysis (CORA) objective rating system to validate functional data across experimental regions. Functional regression analysis is used to generate surrogate estimations of the system response functions at points within the region where experimental observations are absent. These CORA scores provide a measure of disagreement at each desired parameter configuration. An overall score for model validity is achieved using a weighted linear combination of the individual CORA scores. Finally, an improved CORA size scoring metric is introduced.
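The aggregation step, an overall score formed as a weighted linear combination of individual scores, might look like the sketch below. The function name `overall_score` is hypothetical, and normalizing the weights to sum to 1 is an assumption made here, not something stated in the abstract.

```python
def overall_score(scores, weights):
    """Weighted linear combination of per-configuration scores.

    Weights are normalized to sum to 1 (an assumption), so the result
    stays on the same scale as the individual scores.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total
```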


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 469
Author(s):  
Thiago G. Ramires ◽  
Luiz R. Nakamura ◽  
Ana J. Righetto ◽  
Renan J. Carvalho ◽  
Lucas A. Vieira ◽  
...  

This paper presents a discussion regarding regression models, especially those belonging to the location class. Our main motivation is that, with simple distributions having simple interpretations, in some cases, one gets better results than the ones obtained with overly complex distributions. For instance, with the reverse Gumbel (RG) distribution, it is possible to explain response variables by making use of the generalized additive models for location, scale, and shape (GAMLSS) framework, which allows the fitting of several parameters (characteristics) of the probabilistic distributions, like mean, mode, variance, and others. Three real data applications are used to compare several location models against the RG under the GAMLSS framework. The intention is to show that the use of a simple distribution (e.g., RG) based on a more sophisticated regression structure may be preferable to using a more complex location model.


2021 ◽  
pp. 1471082X2110073
Author(s):  
Stanislaus Stadlmann ◽  
Thomas Kneib

A newly emerging field in statistics is distributional regression, where not only the mean but each parameter of a parametric response distribution can be modelled using a set of predictors. As an extension of generalized additive models, distributional regression utilizes the known link functions (log, logit, etc.), model terms (fixed, random, spatial, smooth, etc.) and available types of distributions but allows us to go well beyond the exponential family and to model potentially all distributional parameters. Due to this increase in model flexibility, the interpretation of covariate effects on the shape of the conditional response distribution, its moments and other features derived from this distribution is more challenging than with traditional mean-based methods. In particular, such quantities of interest often do not directly equate to the modelled parameters but are rather a (potentially complex) combination of them. To ease the post-estimation model analysis, we propose a framework and subsequently feature an implementation in R for the visualization of Bayesian and frequentist distributional regression models fitted using the bamlss, gamlss and betareg R packages.
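A small example may help show why quantities of interest are combinations of the modelled parameters. The sketch assumes a log-normal response whose parameters mu and sigma each depend on a single covariate x. The function name `lognormal_summary` and all coefficient values are made up for illustration and are unrelated to the bamlss, gamlss or betareg packages.

```python
import math

def lognormal_summary(x, b_mu=(0.0, 0.5), b_sigma=(-1.0, 0.3)):
    """Mean and variance of a log-normal response at covariate value x.

    mu(x)    = b_mu[0] + b_mu[1] * x              (identity link)
    sigma(x) = exp(b_sigma[0] + b_sigma[1] * x)   (log link)
    Coefficient defaults are illustrative values, not estimates.
    """
    mu = b_mu[0] + b_mu[1] * x
    sigma = math.exp(b_sigma[0] + b_sigma[1] * x)
    # Both moments depend on mu AND sigma, not on one parameter alone.
    mean = math.exp(mu + sigma**2 / 2)
    var = (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)
    return mean, var
```

Even if x had no effect on mu, a nonzero coefficient in the sigma predictor would still change the expected response, which is why plotting derived moments against covariates is more informative than reading the coefficients alone.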


2021 ◽  
Author(s):  
Drew Thomas

Media commentary has suggested that recent Black Lives Matter (BLM) protests, particularly riots, drove voters, particularly Hispanic voters, away from Democratic candidate Joe Biden in the 2020 US presidential election. I test these hypotheses with county-level regression models of 2016-to-2020 swing towards the Democratic presidential candidate, using the presence and intensity of BLM non-riot protests and riots as regressors, controlling for state and many background demographic factors (population density, household size, racial composition, etc.). The models (generalized additive models) that control most aggressively for background factors find small and positive associations between BLM protests and Democratic swing: counties with non-riot BLM protests swung more towards Joe Biden by 0.2 percentage points, and counties with BLM-associated riots swung more towards Joe Biden by (a statistically insignificant) 0.1 percentage points. The extra BLM-protest swing was not statistically significantly different in counties with relatively many Hispanic voting-age citizens, although it was weaker in counties with relatively many Asian voting-age citizens. Inasmuch as these results reflect causal impacts of BLM protests, the protests enhanced the Democratic swing but were probably not electorally decisive. My most elaborate model suggests that a lack of BLM protests in 2020 would have flipped only one state: Biden might have narrowly lost Arizona.

