Analysis of regularized Nyström subsampling for regression functions of low smoothness

2019, Vol. 17 (06), pp. 931-946
Author(s): Shuai Lu, Peter Mathé, Sergiy Pereverzyev

This paper studies a Nyström-type subsampling approach to large-scale kernel learning methods in the misspecified case, where the target function is not assumed to belong to the reproducing kernel Hilbert space generated by the underlying kernel. This case is less well understood despite its practical importance. To model it, the smoothness of target functions is described in terms of general source conditions. Surprisingly, for almost the whole range of source conditions describing the misspecified case, the corresponding learning rate bounds can be achieved with a single value of the regularization parameter. This observation allows a formulation of mild conditions under which plain Nyström subsampling can be realized at subquadratic cost while maintaining the guaranteed learning rates.
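
For reference, a minimal sketch of plain Nyström subsampling for kernel ridge regression follows; this is not the authors' code, and the Gaussian kernel, subsample size m, and regularization value are illustrative assumptions.

```python
import numpy as np

def gauss_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystroem_krr(X, y, m=50, lam=1e-3, gamma=1.0, seed=None):
    """Plain Nyström kernel ridge regression: restrict the estimator to the
    span of m randomly subsampled landmark points."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=min(m, n), replace=False)
    Xm = X[idx]
    Knm = gauss_kernel(X, Xm, gamma)    # n x m cross-kernel
    Kmm = gauss_kernel(Xm, Xm, gamma)   # m x m landmark kernel
    # Normal equations of the restricted regularized least squares problem
    A = Knm.T @ Knm + n * lam * Kmm + 1e-10 * np.eye(len(idx))
    alpha = np.linalg.solve(A, Knm.T @ y)
    return lambda Xnew: gauss_kernel(Xnew, Xm, gamma) @ alpha

# toy usage
X = np.random.randn(500, 3)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(500)
f = nystroem_krr(X, y, m=60, lam=1e-2)
print(f(X[:5]))
```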

Author(s): Fabio Sigrist

We introduce a novel boosting algorithm called 'KTBoost', which combines kernel boosting and tree boosting. In each boosting iteration, the algorithm adds either a regression tree or a reproducing kernel Hilbert space (RKHS) regression function to the ensemble of base learners. Intuitively, the idea is that discontinuous trees and continuous RKHS regression functions complement each other, and that this combination allows for better learning of functions that have parts with varying degrees of regularity, such as discontinuities and smooth parts. We empirically show that KTBoost significantly outperforms both tree and kernel boosting in terms of predictive accuracy in a comparison on a wide array of data sets.
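
A minimal sketch of the selection step for squared error loss, using scikit-learn base learners; the shrinkage, tree depth, and kernel settings are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.kernel_ridge import KernelRidge

def ktboost_sketch(X, y, n_iter=50, lr=0.1):
    """At each boosting iteration, fit both a regression tree and an RKHS
    (kernel ridge) regressor to the current residuals and add whichever
    reduces the squared training error more."""
    base = y.mean()
    learners = []
    pred = np.full(len(y), base)
    for _ in range(n_iter):
        resid = y - pred
        tree = DecisionTreeRegressor(max_depth=3).fit(X, resid)
        krr = KernelRidge(kernel="rbf", alpha=1.0).fit(X, resid)
        best = min((tree, krr), key=lambda m: np.mean((resid - m.predict(X)) ** 2))
        pred += lr * best.predict(X)
        learners.append(best)

    def predict(Xnew):
        out = np.full(Xnew.shape[0], base)
        for m in learners:
            out += lr * m.predict(Xnew)
        return out

    return predict
```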


2017, Vol. 15 (06), pp. 815-836
Author(s): Yulong Zhao, Jun Fan, Lei Shi

The ranking problem aims at learning real-valued functions to order instances, and it has attracted great interest in statistical learning theory. In this paper, we consider the regularized least squares ranking algorithm within the framework of reproducing kernel Hilbert spaces. In particular, we focus on the analysis of the generalization error for this ranking algorithm, and we improve the existing learning rates by virtue of an error decomposition technique from regression and Hoeffding's decomposition for U-statistics.
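
For concreteness, the regularized least squares ranking estimator can be written as the minimizer of a pairwise least squares functional over the RKHS; a sketch of the standard form, where the normalization convention may differ from the one used in the paper:

```latex
f_{\mathbf{z},\lambda}
  \;=\; \arg\min_{f \in \mathcal{H}_K}\;
  \frac{1}{n(n-1)} \sum_{i \neq j}
  \bigl( y_i - y_j - (f(x_i) - f(x_j)) \bigr)^2
  \;+\; \lambda \, \| f \|_K^2 .
```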


2016, Vol. 14 (03), pp. 449-477
Author(s): Andreas Christmann, Ding-Xuan Zhou

Additive models play an important role in semiparametric statistics. This paper gives learning rates for regularized kernel-based methods for additive models. These learning rates compare favorably, in particular in high dimensions, to recent results on optimal learning rates for purely nonparametric regularized kernel-based quantile regression using the Gaussian radial basis function kernel, provided the assumption of an additive model is valid. Additionally, a concrete example is presented to show that a Gaussian function depending only on one variable lies in a reproducing kernel Hilbert space generated by an additive Gaussian kernel, but does not belong to the reproducing kernel Hilbert space generated by the multivariate Gaussian kernel of the same variance.
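
To make the two kernels concrete, here is a minimal sketch contrasting an additive Gaussian kernel (a sum of univariate Gaussian kernels, one per coordinate) with the multivariate Gaussian kernel of the same bandwidth; the bandwidth value is an illustrative assumption.

```python
import numpy as np

def additive_gauss_kernel(x, xp, gamma=1.0):
    """Additive Gaussian kernel: sum of one-dimensional Gaussian kernels
    over the coordinates."""
    return np.sum(np.exp(-gamma * (x - xp) ** 2))

def multivariate_gauss_kernel(x, xp, gamma=1.0):
    """Standard multivariate Gaussian (RBF) kernel with the same bandwidth."""
    return np.exp(-gamma * np.sum((x - xp) ** 2))

# toy evaluation on a pair of 3-dimensional points
x, xp = np.array([0.3, 1.2, -0.5]), np.array([0.1, 0.9, -0.4])
print(additive_gauss_kernel(x, xp), multivariate_gauss_kernel(x, xp))
```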


Author(s): Mengjuan Pang, Hongwei Sun

We study distributed learning with a partial coefficients regularization scheme in a reproducing kernel Hilbert space (RKHS). The algorithm randomly partitions the sample set [Formula: see text] into [Formula: see text] disjoint sample subsets of equal size. In order to reduce the complexity of the algorithm, we apply a partial coefficients regularization scheme to each sample subset to produce an output function, and we average the individual output functions to get the final global estimator. The error bound in the [Formula: see text]-metric is deduced, and the asymptotic convergence of this distributed learning with partial coefficients regularization is proved by the integral operator technique. Satisfactory learning rates are then derived under a standard regularity condition on the regression function, which reveals an interesting phenomenon: when [Formula: see text] and [Formula: see text] is small enough, this distributed learning achieves the same convergence rate as the algorithm that processes the whole data set on a single machine.
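
A minimal divide-and-conquer sketch of the overall pipeline (partition, local coefficient-based kernel estimator, average); the ℓ2 penalty on the coefficient vector and the restriction of coefficients to a random subset of each block's samples are one plausible reading of "partial coefficients regularization", not the paper's exact scheme.

```python
import numpy as np

def gauss_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def distributed_partial_coeff(X, y, m=4, frac=0.5, lam=1e-2, gamma=1.0, seed=0):
    """Split the sample into m blocks; on each block, fit a coefficient-based
    kernel least squares estimator using only a random fraction of the block's
    points as centers; average the local estimators."""
    rng = np.random.default_rng(seed)
    n = len(y)
    blocks = np.array_split(rng.permutation(n), m)
    local_fits = []
    for idx in blocks:
        Xb, yb = X[idx], y[idx]
        centers = rng.choice(len(idx), size=max(1, int(frac * len(idx))), replace=False)
        Xc = Xb[centers]
        K = gauss_kernel(Xb, Xc, gamma)   # |block| x |centers|
        # l2-regularized least squares on the coefficient vector
        alpha = np.linalg.solve(K.T @ K + lam * np.eye(len(centers)), K.T @ yb)
        local_fits.append((Xc, alpha))

    def predict(Xnew):
        preds = [gauss_kernel(Xnew, Xc, gamma) @ a for Xc, a in local_fits]
        return np.mean(preds, axis=0)

    return predict
```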


Author(s): Yong-Li Xu, Di-Rong Chen

The study of regularized learning algorithms is an important issue, and functional data analysis extends classical methods to input data that are functions. We establish learning rates for the least squares regularized regression algorithm in a reproducing kernel Hilbert space for functional data. Using an iteration method, we obtain fast learning rates for functional data. Our result is a natural extension of results for the least squares regularized regression algorithm with finite-dimensional input data.
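
A minimal sketch of least squares regularized regression when each input is a curve observed on a grid: approximate the L2 distance between curves by a Riemann sum and plug it into a Gaussian kernel; the kernel choice, grid, and regularization value are illustrative assumptions.

```python
import numpy as np

def functional_gauss_kernel(C1, C2, t, gamma=1.0):
    """Gaussian kernel on curves: C1, C2 are (n_curves, n_grid) arrays of
    function values on the common grid t."""
    diff2 = (C1[:, None, :] - C2[None, :, :]) ** 2
    d2 = diff2.sum(axis=-1) * (t[1] - t[0])   # Riemann-sum L2 distance
    return np.exp(-gamma * d2)

def functional_krr(curves, y, t, lam=1e-2, gamma=1.0):
    """Least squares regularized regression in the RKHS induced by the
    curve-level kernel."""
    K = functional_gauss_kernel(curves, curves, t, gamma)
    n = len(y)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return lambda new: functional_gauss_kernel(new, curves, t, gamma) @ alpha

# toy usage: 100 noisy sine curves, response = amplitude
t = np.linspace(0, 1, 50)
amps = np.random.rand(100)
curves = amps[:, None] * np.sin(2 * np.pi * t)[None, :] + 0.05 * np.random.randn(100, 50)
f = functional_krr(curves, amps, t)
print(f(curves[:3]))
```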


2021, Vol. 2021 (12), pp. 124009
Author(s): Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari

For a certain scaling of the initialization of stochastic gradient descent (SGD), wide neural networks (NNs) have been shown to be well approximated by reproducing kernel Hilbert space (RKHS) methods. Recent empirical work showed that, for some classification tasks, RKHS methods can replace NNs without a large loss in performance. On the other hand, two-layer NNs are known to encode richer smoothness classes than RKHS, and we know of special examples for which SGD-trained NNs provably outperform RKHS. This is true even in the wide network limit, for a different scaling of the initialization. How can we reconcile the above claims? For which tasks do NNs outperform RKHS? If covariates are nearly isotropic, RKHS methods suffer from the curse of dimensionality, while NNs can overcome it by learning the best low-dimensional representation. Here we show that this curse of dimensionality becomes milder if the covariates display the same low-dimensional structure as the target function, and we precisely characterize this tradeoff. Building on these results, we present the spiked covariates model, which can capture in a unified framework both behaviors observed in earlier work. We hypothesize that such a latent low-dimensional structure is present in image classification. We test this hypothesis numerically by showing that specific perturbations of the training distribution degrade the performance of RKHS methods much more significantly than that of NNs.


2021
Author(s): Hongzhi Tong

To cope with the challenges of memory bottlenecks and algorithmic scalability when massive data sets are involved, we propose a distributed least squares procedure in the framework of functional linear models and reproducing kernel Hilbert spaces. This approach divides the big data set into multiple subsets, applies regularized least squares regression on each of them, and then averages the individual outputs as a final prediction. We establish non-asymptotic prediction error bounds for the proposed learning strategy under some regularity conditions. When the target function has only weak regularity, we also introduce some unlabelled data to construct a semi-supervised approach that enlarges the number of partitioned subsets. The results in the present paper provide a theoretical guarantee that the distributed algorithm can achieve the optimal rate of convergence while allowing the whole data set to be partitioned into a large number of subsets for parallel processing.
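
A minimal divide-and-conquer sketch in a discretized functional linear model: each block estimates the slope function by ridge regression on the gridded covariate curves, and the block estimates are averaged. The ridge penalty stands in for the RKHS regularization; the grid and penalty value are illustrative assumptions.

```python
import numpy as np

def distributed_flm(curves, y, m=4, lam=1e-2):
    """curves: (n, p) array of functional covariates on a common grid; the model
    is y_i ~ (1/p) * sum_t beta(t) X_i(t). Each block returns a ridge estimate of
    the discretized slope function; the final estimate is the average."""
    n, p = curves.shape
    blocks = np.array_split(np.arange(n), m)
    betas = []
    for idx in blocks:
        Xb, yb = curves[idx] / p, y[idx]   # 1/p approximates the integral
        betas.append(np.linalg.solve(Xb.T @ Xb + lam * np.eye(p), Xb.T @ yb))
    beta_bar = np.mean(betas, axis=0)
    return beta_bar, lambda new: (new / p) @ beta_bar

# toy usage
t = np.linspace(0, 1, 40)
X = np.random.randn(200, 40).cumsum(axis=1) * 0.1   # rough random curves
beta_true = np.sin(2 * np.pi * t)
y = (X / 40) @ beta_true + 0.05 * np.random.randn(200)
beta_hat, f = distributed_flm(X, y, m=5)
print(np.round(f(X[:3]), 3))
```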


2016, Vol. 14 (06), pp. 809-827
Author(s): Ting Hu, Yuan Yao

This paper studies some robust regression problems associated with the [Formula: see text]-norm loss ([Formula: see text]) and the [Formula: see text]-insensitive [Formula: see text]-norm loss in a reproducing kernel Hilbert space. We establish a variance-expectation bound under an a priori noise condition on the conditional distribution, which is the key technique for measuring the error bound. Explicit learning rates are given under approximation ability assumptions on the reproducing kernel Hilbert space.
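
The inline formulas are not reproduced in this listing. As an illustration only, the two loss families commonly written this way are sketched below; the parameter values p and eps are hypothetical and may not match the ranges studied in the paper.

```python
import numpy as np

def lp_loss(y, fx, p=1.5):
    """l_p-norm loss |y - f(x)|^p (p is an illustrative value)."""
    return np.abs(y - fx) ** p

def eps_insensitive_lp_loss(y, fx, p=1.5, eps=0.1):
    """epsilon-insensitive l_p loss: no penalty inside a tube of half-width eps,
    |.|^p growth outside it (eps is an illustrative value)."""
    return np.maximum(np.abs(y - fx) - eps, 0.0) ** p

y, fx = np.array([1.0, 2.0, 3.0]), np.array([1.05, 2.6, 2.0])
print(lp_loss(y, fx), eps_insensitive_lp_loss(y, fx))
```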


Author(s): Michael T. Jury, Robert T. W. Martin

We extend the Lebesgue decomposition of positive measures with respect to Lebesgue measure on the complex unit circle to the non-commutative (NC) multi-variable setting of (positive) NC measures. These are positive linear functionals on a certain self-adjoint subspace of the Cuntz–Toeplitz $C^*$-algebra, the $C^*$-algebra of the left creation operators on the full Fock space. This theory is fundamentally connected to the representation theory of the Cuntz and Cuntz–Toeplitz $C^*$-algebras; any $*$-representation of the Cuntz–Toeplitz $C^*$-algebra is obtained (up to unitary equivalence) by applying a Gelfand–Naimark–Segal construction to a positive NC measure. Our approach combines the theory of Lebesgue decomposition of sesquilinear forms in Hilbert space, Lebesgue decomposition of row isometries, free semigroup algebra theory, NC reproducing kernel Hilbert space theory, and NC Hardy space theory.

