Generalized Cross-Validation for Bandwidth Selection of Backfitting Estimates in Generalized Additive Models

2004 ◽  
Vol 13 (1) ◽  
pp. 66-89 ◽  
Author(s):  
Göran Kauermann ◽  
J. D Opsomer


2011 ◽  
Vol 68 (10) ◽  
pp. 2252-2263 ◽  
Author(s):  
Stéphanie Mahévas ◽  
Youen Vermard ◽  
Trevor Hutton ◽  
Ane Iriondo ◽  
Angélique Jadaud ◽  
...  

Abstract Mahévas, S., Vermard, Y., Hutton, T., Iriondo, A., Jadaud, A., Maravelias, C. D., Punzón, A., Sacchi, J., Tidd, A., Tsitsika, E., Marchal, P., Goascoz, N., Mortreux, S., and Roos, D. 2011. An investigation of human vs. technology-induced variation in catchability for a selection of European fishing fleets. – ICES Journal of Marine Science, 68: 2252–2263. The impact of the fishing effort exerted by a vessel on a population depends on catchability, which depends on population accessibility and fishing power. The work investigated whether the variation in fishing power could be the result of the technical characteristics of a vessel and/or its gear or whether it is a reflection of inter-vessel differences not accounted for by the technical attributes. These inter-vessel differences could be indicative of a skipper/crew experience effect. To improve understanding of the relationships, landings per unit effort (lpue) from logbooks and technical information on vessels and gears (collected during interviews) were used to identify variables that explained variations in fishing power. The analysis was undertaken by applying a combination of generalized additive models and generalized linear models to data from several European fleets. The study highlights the fact that taking into account information that is not routinely collected, e.g. length of headline, weight of otter boards, or type of groundrope, will significantly improve the modelled relationships between lpue and the variables that measure relative fishing power. The magnitude of the skipper/crew experience effect was weaker than the technical effect of the vessel and/or its gear.


1988 ◽  
Vol 110 (1) ◽  
pp. 37-41 ◽  
Author(s):  
C. R. Dohrmann ◽  
H. R. Busby ◽  
D. M. Trujillo

Smoothing and differentiation of noisy data using spline functions requires the selection of an unknown smoothing parameter. The method of generalized cross-validation provides an excellent estimate of the smoothing parameter from the data itself even when the amount of noise associated with the data is unknown. In the present model only a single smoothing parameter must be obtained, but in a more general context the number may be larger. In an earlier work, smoothing of the data was accomplished by solving a minimization problem using the technique of dynamic programming. This paper shows how the computations required by generalized cross-validation can be performed as a simple extension of the dynamic programming formulas. The results of numerical experiments are also included.
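As a minimal sketch (not the authors' dynamic-programming formulation), the generalized cross-validation criterion for a discrete penalized-smoothing problem can be computed directly: with smoother matrix S(λ) = (I + λDᵀD)⁻¹ for a second-difference penalty D, GCV(λ) = n·RSS(λ)/(n − tr S(λ))². The data, penalty, and λ grid below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)

# Second-difference penalty matrix D, shape (n-2, n): D @ f gives second differences
D = np.diff(np.eye(n), n=2, axis=0)

def gcv(lam):
    # Smoother matrix S = (I + lam * D'D)^{-1}; fitted values are f = S y
    S = np.linalg.inv(np.eye(n) + lam * D.T @ D)
    rss = np.sum((y - S @ y) ** 2)
    return n * rss / (n - np.trace(S)) ** 2

lams = np.logspace(-2, 3, 40)
scores = [gcv(lam) for lam in lams]
best = lams[int(np.argmin(scores))]  # smoothing parameter chosen from the data alone
```

The inversion is O(n³) per λ here; the point of the dynamic-programming extension in the paper is precisely to avoid forming S(λ) explicitly.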


Econometrics ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 22
Author(s):  
Alex Lenkoski ◽  
Fredrik L. Aanes

In economic applications, model averaging has found principal use in examining the validity of various theories related to observed heterogeneity in outcomes such as growth, development, and trade. Though often easy to articulate, these theories are imperfectly captured quantitatively. A number of different proxies are often collected for a given theory, and the uneven nature of this collection requires care when employing model averaging. Furthermore, if valid, these theories ought to be relevant outside of any single narrowly focused outcome equation. We propose a methodology which treats theories as represented by latent indices, with these latent processes governed by model averaging at the proxy level. To achieve generalizability of the theory index, our framework assumes a collection of outcome equations. We accommodate a flexible set of generalized additive models, enabling non-Gaussian outcomes to be included. Furthermore, selection of relevant theories also occurs on the outcome level, allowing for theories to be differentially valid. Our focus is on creating a set of theory-based indices directed at understanding a country’s potential risk of macroeconomic collapse. These Sovereign Risk Indices are calibrated across a set of different “collapse” criteria, including default on sovereign debt, heightened potential for high unemployment or inflation, and dramatic swings in foreign exchange values. The goal of this exercise is to render a portable set of country/year theory indices which can find more general use in the research community.
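The proxy-level selection step can be illustrated with a toy BIC-weighted model average — far simpler than the authors' latent-index machinery, and with synthetic data and names that are purely assumptions — showing how averaging over proxy subsets yields inclusion probabilities for each proxy:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n = 150
proxies = rng.normal(size=(n, 3))                 # three candidate proxies for one theory
theory = proxies @ np.array([0.8, 0.0, 0.5])      # only proxies 0 and 2 actually matter
y = 2.0 * theory + rng.normal(size=n)             # single outcome equation

def bic(X, y):
    # Least-squares fit and Bayesian information criterion for one proxy subset
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + X.shape[1] * np.log(n)

# Enumerate all non-empty proxy subsets and weight them by exp(-BIC/2)
models = [m for r in (1, 2, 3) for m in combinations(range(3), r)]
bics = np.array([bic(proxies[:, list(m)], y) for m in models])
w = np.exp(-(bics - bics.min()) / 2)
w /= w.sum()

# Posterior inclusion probability of each proxy = total weight of subsets containing it
incl = np.array([sum(w[i] for i, m in enumerate(models) if j in m) for j in range(3)])
```

With a strong signal on proxies 0 and 2, their inclusion probabilities dominate that of the irrelevant proxy 1.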


2004 ◽  
Vol 3 (1) ◽  
pp. 1-23 ◽  
Author(s):  
Mark J. van der Laan ◽  
Sandrine Dudoit ◽  
Sunduz Keles

Likelihood-based cross-validation is a statistical tool for selecting a density estimate based on n i.i.d. observations from the true density among a collection of candidate density estimators. General examples are the selection of a model indexing a maximum likelihood estimator, and the selection of a bandwidth indexing a nonparametric (e.g. kernel) density estimator. In this article, we establish a finite sample result for a general class of likelihood-based cross-validation procedures (as indexed by the type of sample splitting used, e.g. V-fold cross-validation). This result implies that the cross-validation selector performs asymptotically as well (with respect to the Kullback-Leibler distance to the true density) as a benchmark model selector which is optimal for each given dataset and depends on the true density. Crucial conditions of our theorem are that the size of the validation sample converges to infinity, which excludes leave-one-out cross-validation, and that the candidate density estimates are bounded away from zero and infinity. We illustrate these asymptotic results and the practical performance of likelihood-based cross-validation for the purpose of bandwidth selection with a simulation study. Moreover, we use likelihood-based cross-validation in the context of regulatory motif detection in DNA sequences.
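A minimal sketch of the bandwidth-selection use case: V-fold likelihood-based cross-validation chooses the bandwidth maximizing the held-out log-likelihood of a kernel density estimate. The hand-rolled Gaussian KDE, the bandwidth grid, and the synthetic data are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=200)

def log_density(x, train, h):
    # Gaussian kernel density estimate at points x, fit on `train` with bandwidth h
    z = (x[:, None] - train[None, :]) / h
    dens = np.exp(-0.5 * z ** 2).sum(axis=1) / (len(train) * h * np.sqrt(2 * np.pi))
    return np.log(dens + 1e-300)  # guard against log(0) for far-out points

def cv_loglik(data, h, V=5):
    # V-fold cross-validated log-likelihood: each fold is held out once
    folds = np.array_split(np.arange(len(data)), V)
    total = 0.0
    for fold in folds:
        train = np.delete(data, fold)
        total += log_density(data[fold], train, h).sum()
    return total

hs = np.linspace(0.05, 1.5, 30)
best_h = hs[int(np.argmax([cv_loglik(data, h) for h in hs]))]
```

Note that the theorem's condition — validation-sample size tending to infinity — is respected by fixed-V splitting (each validation fold has n/V observations), whereas leave-one-out is excluded.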


2012 ◽  
Vol 2012 ◽  
pp. 1-9 ◽  
Author(s):  
Masaaki Tsujitani ◽  
Yusuke Tanaka ◽  
Masato Sakon

We discuss a flexible method for modeling survival data using penalized smoothing splines when the values of covariates change for the duration of the study. The Cox proportional hazards model has been widely used for the analysis of treatment and prognostic effects with censored survival data. However, a number of theoretical problems with respect to the baseline survival function remain unsolved. We use generalized additive models (GAMs) with B-splines to estimate the survival function and select the optimum smoothing parameters based on a variant of the multifold cross-validation (CV) method. The methods are compared with the generalized cross-validation (GCV) method using data from a long-term study of patients with primary biliary cirrhosis (PBC).


Author(s):  
Syafruddin Side ◽  
Wahidah Sanusi ◽  
Mustati'atul Waidah Maksum

Abstract. Semiparametric regression is a regression model that contains both parametric and nonparametric components. In this study, a spline semiparametric regression model for longitudinal data is applied to a case study of patients with Dengue Hemorrhagic Fever (DHF) at Hasanuddin University Hospital, Makassar, over the period January to March 2018. The best estimate of the regression model is obtained by selecting the optimal knot points, namely those that minimize the Generalized Cross Validation (GCV) and Mean Square Error (MSE) values. The parametric components in this study are hemoglobin (g/dL) and age (years), with body temperature and platelet count as the nonparametric components. The minimum GCV value of 221.67745153 is achieved at the knot points 14.552, 14.987, and 15.096; the MSE is 199.1032; and the coefficient of determination is 75.3%, obtained from a linear spline semiparametric regression model with three knots. Keywords: semiparametric regression, spline, knot, Generalized Cross Validation, Dengue Hemorrhagic Fever.
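The knot-selection step can be sketched with a toy truncated-power-basis linear spline: for each candidate knot set, fit by least squares and score with GCV = n·RSS/(n − p)², where p (the number of basis columns) equals the trace of the hat matrix. The data, knot grid, and single-knot search below are synthetic assumptions, not the study's data:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
n = 80
x = np.sort(rng.uniform(0, 10, n))
# Piecewise-linear signal with a slope change at x = 4, plus noise
y = np.where(x < 4, 1.0 * x, 4 + 0.2 * (x - 4)) + rng.normal(scale=0.3, size=n)

def gcv_linear_spline(x, y, knots):
    # Truncated-power basis: 1, x, and (x - k)_+ for each knot k
    X = np.column_stack([np.ones_like(x), x] +
                        [np.clip(x - k, 0, None) for k in knots])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    p = X.shape[1]  # trace of the least-squares hat matrix
    return len(y) * rss / (len(y) - p) ** 2

# Search single-knot models over sample percentiles; pick the GCV minimizer
candidates = np.percentile(x, [20, 40, 60, 80])
best = min(combinations(candidates, 1),
           key=lambda ks: gcv_linear_spline(x, y, ks))
```

The same search extends to pairs or triples of knots by changing the `combinations` size, which is how a three-knot model like the one in the abstract would be selected.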

