Trading Variance Reduction with Unbiasedness: The Regularized Subspace Information Criterion for Robust Model Selection in Kernel Regression

2004
Vol 16 (5)
pp. 1077-1104
Author(s):
Masashi Sugiyama
Motoaki Kawanabe
Klaus-Robert Müller

A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit: we stabilize unbiased generalization error estimates by regularization and thereby obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and earlier experiments showed that a small regularization of SIC has a stabilizing effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this letter, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error, and we propose determining the degree of regularization of SIC such that this estimator is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method effectively improves the precision of SIC, especially in high-noise cases. We furthermore compare the proposed method with the original SIC, cross-validation, and an empirical Bayesian method for ridge parameter selection, with good results.
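
One of the baselines the letter compares against is cross-validation for ridge parameter selection. As context only, the following minimal sketch selects the ridge parameter of a kernel ridge regressor by the exact leave-one-out shortcut for linear smoothers; the Gaussian kernel, the synthetic data, and the candidate grid are illustrative assumptions, not taken from the letter.

```python
# Minimal sketch: ridge-parameter selection for kernel ridge regression by
# leave-one-out cross-validation, one of the baselines mentioned in the letter.
# The Gaussian kernel, synthetic data, and candidate grid are assumptions.
import numpy as np

def gaussian_kernel(X, width=1.0):
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def loo_error(K, y, lam):
    """Exact leave-one-out squared error for the linear smoother H = K (K + lam I)^{-1}."""
    n = len(y)
    H = K @ np.linalg.solve(K + lam * np.eye(n), np.eye(n))
    residuals = y - H @ y
    return np.mean((residuals / (1.0 - np.diag(H))) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sinc(X[:, 0]) + 0.3 * rng.standard_normal(50)   # noisy target

K = gaussian_kernel(X)
candidates = np.logspace(-4, 1, 20)
best_lam = min(candidates, key=lambda lam: loo_error(K, y, lam))
print("ridge parameter chosen by LOO-CV:", best_lam)
```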

1993
Vol 9 (3)
pp. 478-493
Author(s):
José A.F. Machado

This paper studies the qualitative robustness properties of the Schwarz information criterion (SIC) based on objective functions defining M-estimators. A definition of qualitative robustness appropriate for model selection is provided, and it is shown that the crucial restriction needed to achieve robustness in model selection is the uniform boundedness of the objective function. In the process, the asymptotic performance of the SIC for general M-estimators is also studied. The paper concludes with a Monte Carlo study of the finite-sample behavior of the SIC for different specifications of the sample objective function.
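
As a rough illustration of a Schwarz-type criterion built on an M-estimation objective, the sketch below compares two nested regression models under a Huber loss; the loss, the synthetic heavy-tailed data, and the penalty scaling 2*Q + p*log(n) are assumptions made here for illustration, not the paper's exact definitions.

```python
# Minimal sketch: a Schwarz-type criterion built on an M-estimation objective.
# The Huber loss, the synthetic data, and the penalty scaling 2*Q + p*log(n)
# are illustrative assumptions, not the definitions used in the paper.
import numpy as np
from scipy.optimize import minimize

def huber(r, c=1.345):
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r ** 2, c * a - 0.5 * c ** 2)

def m_fit(X, y):
    """M-estimate of regression coefficients under the Huber objective."""
    obj = lambda b: np.sum(huber(y - X @ b))
    res = minimize(obj, np.zeros(X.shape[1]))
    return res.x, res.fun

def schwarz_ic(X, y):
    _, q = m_fit(X, y)
    n, p = X.shape
    return 2.0 * q + p * np.log(n)     # assumed scaling of the Schwarz penalty

rng = np.random.default_rng(1)
n = 200
x = rng.standard_normal(n)
y = 1.0 + 2.0 * x + rng.standard_t(df=3, size=n)      # heavy-tailed noise

X_small = np.column_stack([np.ones(n), x])             # correctly specified model
X_big = np.column_stack([np.ones(n), x, x ** 2])       # over-parameterized model
print("SIC, small model:", schwarz_ic(X_small, y))
print("SIC, large model:", schwarz_ic(X_big, y))
```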


2014
Vol 2014
pp. 1-13
Author(s):
Qichang Xie
Meng Du

The essential task of risk investment is to select an optimal tracking portfolio among various portfolios. Statistically, this process can be achieved by choosing an optimal restricted linear model. This paper develops a statistical procedure to do so, based on selecting appropriate weights for averaging approximately restricted models. The method of weighted average least squares is adopted to estimate the approximately restricted models under a dependent-error setting. The optimal weights are selected by minimizing a k-class generalized information criterion (k-GIC), which is an estimate of the average squared error from the model-average fit. This model selection procedure is shown to be asymptotically optimal in the sense of obtaining the lowest possible average squared error. Monte Carlo simulations illustrate that the suggested method has efficiency comparable to some alternative model selection techniques.
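
The paper's k-GIC under dependent errors is not reproduced here, but the flavor of weight selection by minimizing an estimate of the average squared error can be sketched with a Mallows-type criterion over a convex combination of two nested least-squares fits; the criterion, the grid search, and the known noise variance are illustrative assumptions.

```python
# Minimal sketch of model averaging with weights chosen by minimizing a
# Mallows-type unbiased risk estimate; this illustrates the general idea,
# not the paper's k-GIC, and assumes a known noise variance.
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 100, 1.0
x = rng.uniform(-2, 2, n)
y = 0.5 + x + 0.3 * x ** 2 + sigma * rng.standard_normal(n)

def hat_matrix(X):
    return X @ np.linalg.solve(X.T @ X, X.T)

# Two candidate (restricted) models: linear and quadratic.
P1 = hat_matrix(np.column_stack([np.ones(n), x]))
P2 = hat_matrix(np.column_stack([np.ones(n), x, x ** 2]))

def mallows(w):
    """Mallows criterion for the averaged smoother P(w) = w*P1 + (1-w)*P2."""
    P = w * P1 + (1 - w) * P2
    resid = y - P @ y
    return resid @ resid + 2.0 * sigma ** 2 * np.trace(P)

weights = np.linspace(0, 1, 101)
w_star = min(weights, key=mallows)
print("weight on the restricted (linear) model:", w_star)
```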


2012
Author(s):
J. E. García
V. A. González-López
M. L. L. Viola
Julio Michael Stern
Marcelo De Souza Lauretto
...  

2010
Vol 54 (12)
pp. 3300-3312
Author(s):
Marco Riani
Anthony C. Atkinson

2001
Vol 13 (8)
pp. 1863-1889
Author(s):
Masashi Sugiyama
Hidemitsu Ogawa

The problem of model selection is of considerable importance for acquiring higher levels of generalization capability in supervised learning. In this article, we propose a new criterion for model selection, the subspace information criterion (SIC), which is a generalization of Mallows's C_L. It is assumed that the learning target function belongs to a specified functional Hilbert space, and the generalization error is defined as the Hilbert space squared norm of the difference between the learning result function and the target function. SIC gives an unbiased estimate of the generalization error so defined. SIC assumes the availability of an unbiased estimate of the target function and the noise covariance matrix, which are generally unknown. A practical calculation method of SIC for least-mean-squares learning is provided under the assumption that the dimension of the Hilbert space is less than the number of training examples. Finally, computer simulations in two examples show that SIC works well even when the number of training examples is small.
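
Since SIC generalizes Mallows's C_L, the classical C_L gives a feel for the kind of unbiased risk estimate involved. The sketch below evaluates C_L for a few ridge-regularized polynomial smoothers with a known noise variance; it illustrates the criterion SIC generalizes, not the RKHS-norm SIC itself, and the data and candidate smoothers are illustrative assumptions.

```python
# Minimal sketch of Mallows's C_L, the criterion that SIC generalizes:
# an unbiased estimate of E||S y - f||^2 for a linear smoother S,
# assuming the noise variance sigma^2 is known.
import numpy as np

def c_l(y, S, sigma2):
    n = len(y)
    resid = y - S @ y
    return resid @ resid - n * sigma2 + 2.0 * sigma2 * np.trace(S)

rng = np.random.default_rng(3)
n, sigma2 = 60, 0.25
x = np.linspace(0, 1, n)
f = np.sin(2 * np.pi * x)
y = f + np.sqrt(sigma2) * rng.standard_normal(n)

# Candidate smoothers: ridge-regularized degree-5 polynomial fits.
X = np.vander(x, 6, increasing=True)
for lam in (1e-3, 1e-1, 1e1):
    S = X @ np.linalg.solve(X.T @ X + lam * np.eye(6), X.T)
    print(f"lambda={lam:g}  C_L={c_l(y, S, sigma2):.3f}")
```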

