Learning Rates of Regularized Regression With Multiple Gaussian Kernels for Multi-Task Learning

2018, Vol 29 (11), pp. 5408-5418
Author(s): Yong-Li Xu, Xiao-Xing Li, Di-Rong Chen, Han-Xiong Li

2013, Vol 2013, pp. 1-7
Author(s): Yong-Li Xu, Di-Rong Chen, Han-Xiong Li

The study of multi-task learning algorithms is an important issue. This paper proposes a least-squares regularized regression algorithm for multi-task learning whose hypothesis space is the union of a sequence of Hilbert spaces. The algorithm consists of two steps: selecting the optimal Hilbert space and searching for the optimal function. We assume that the distributions of the different tasks are related by a set of transformations under which every Hilbert space in the hypothesis space is norm invariant. We prove that under this assumption the optimal prediction function of every task lies in the same Hilbert space. Based on this result, a pivotal error decomposition is established, which uses samples of related tasks to bound the excess error of the target task. We obtain an upper bound for the sample error of related tasks and, based on this bound, potentially faster learning rates than those of single-task learning algorithms.
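A minimal sketch of the two-step procedure described above, under several simplifying assumptions of my own: each candidate Hilbert space is taken to be the RKHS of a Gaussian kernel with a different width, the space is selected by held-out error averaged over the related tasks, and the target task is then refitted in the selected space. The function names, the validation split, and the regularization value are illustrative, not the authors' construction.

```python
# Illustrative sketch only (not the paper's algorithm): two-step regularized
# least-squares regression for multi-task learning over a sequence of
# candidate RKHSs, parameterized here by Gaussian kernel widths.
import numpy as np

def gaussian_kernel(X, Z, width):
    """Gram matrix of a Gaussian kernel with the given width."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def krr_fit(X, y, width, lam):
    """Least-squares regularized regression (kernel ridge) in one RKHS."""
    K = gaussian_kernel(X, X, width)
    alpha = np.linalg.solve(K + lam * len(y) * np.eye(len(y)), y)
    return lambda Xnew: gaussian_kernel(Xnew, X, width) @ alpha

def two_step_multitask(tasks, target, widths, lam=1e-2):
    """Step 1: choose the width (i.e., the Hilbert space) with the smallest
    average held-out error over the related tasks.
    Step 2: refit on the target task in the selected space."""
    def heldout_error(width):
        errs = []
        for X, y in tasks:
            n = len(y) // 2
            f = krr_fit(X[:n], y[:n], width, lam)
            errs.append(np.mean((f(X[n:]) - y[n:]) ** 2))
        return np.mean(errs)
    best_width = min(widths, key=heldout_error)
    X_t, y_t = target
    return krr_fit(X_t, y_t, best_width, lam), best_width
```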


2016, Vol 28 (12), pp. 2853-2889
Author(s): Hanyuan Hang, Yunlong Feng, Ingo Steinwart, Johan A. K. Suykens

This letter investigates the supervised learning problem with observations drawn from certain general stationary stochastic processes. Here by general, we mean that many stationary stochastic processes can be included. We show that when the stochastic processes satisfy a generalized Bernstein-type inequality, a unified treatment of learning schemes with various mixing processes can be conducted and a sharp oracle inequality for generic regularized empirical risk minimization schemes can be established. The obtained oracle inequality is then applied to derive convergence rates for several learning schemes, such as empirical risk minimization (ERM), least-squares support vector machines (LS-SVMs) using given generic kernels, and SVMs using Gaussian kernels for both least-squares and quantile regression. It turns out that for independent and identically distributed (i.i.d.) processes, our learning rates for ERM recover the optimal rates. For non-i.i.d. processes, including geometrically α-mixing Markov processes, geometrically α-mixing processes with restricted decay, ϕ-mixing processes, and (time-reversed) geometrically C-mixing processes, our learning rates for SVMs with Gaussian kernels match, up to some arbitrarily small extra term in the exponent, the optimal rates. For the remaining cases, our rates are at least close to the optimal rates. As a by-product, the assumed generalized Bernstein-type inequality also provides an interpretation of the so-called effective number of observations for various mixing processes.
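As a rough illustration of one of the learning schemes analyzed above, the sketch below fits an LS-SVM with a Gaussian kernel to one-step-ahead observations of an AR(1) time series, a standard example of a geometrically mixing stationary process. It is a minimal assumed implementation of the classical LS-SVM linear system, not code from the letter; the regularization scaling and all parameter values are illustrative.

```python
# Minimal sketch, assuming an AR(1) chain as the dependent sampling process:
# LS-SVM regression with a Gaussian kernel on non-i.i.d. observations.
import numpy as np

def gaussian_gram(X, Z, gamma):
    """Gram matrix of a Gaussian kernel with bandwidth parameter gamma."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def lssvm_fit(X, y, gamma, lam):
    """LS-SVM: regularized least squares in the Gaussian RKHS plus a bias term."""
    n = len(y)
    K = gaussian_gram(X, X, gamma)
    # Solve the standard LS-SVM bordered linear system for (bias b, coefficients alpha).
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + lam * n * np.eye(n)
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    b, alpha = sol[0], sol[1:]
    return lambda Xnew: gaussian_gram(Xnew, X, gamma) @ alpha + b

# One-step-ahead regression on an AR(1) chain (a geometrically mixing process).
rng = np.random.default_rng(0)
z = np.zeros(501)
for t in range(500):
    z[t + 1] = 0.7 * z[t] + rng.normal(scale=0.3)
X, y = z[:-1, None], z[1:]
predict = lssvm_fit(X, y, gamma=1.0, lam=1e-3)
```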


2012, Vol 42 (12), pp. 1251-1262
Author(s): HongZhi TONG, FengHong YANG, DiRong CHEN

2012, Vol 35 (2), pp. 174-181
Author(s): Feilong Cao, Joonwhoan Lee, Yongquan Zhang

Author(s): YONG-LI XU, DI-RONG CHEN

The study of regularized learning algorithms is an important issue, and functional data analysis extends classical methods to functional inputs. We establish learning rates of the least-squares regularized regression algorithm in a reproducing kernel Hilbert space for functional data. Using the iteration method, we obtain a fast learning rate for functional data. Our result is a natural extension of results for the least-squares regularized regression algorithm when the dimension of the input data is finite.
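To make the functional-data setting concrete, here is a minimal sketch, assuming each functional input is a curve observed on a common grid: regularized least-squares regression with a Gaussian-type kernel built from the discretized L2 distance between curves. The kernel choice, the toy data, and all names are illustrative assumptions, not the construction used in the paper.

```python
# Sketch under assumptions: kernel ridge regression for functional inputs,
# using a Gaussian kernel on the (Riemann-sum) L2 distance between curves.
import numpy as np

def l2_dist2(F, G, dt):
    """Riemann-sum approximation of squared L2 distances between discretized curves."""
    diff = F[:, None, :] - G[None, :, :]
    return (diff ** 2).sum(-1) * dt

def functional_krr(F, y, width, lam, dt):
    """Least-squares regularized regression in the RKHS of the functional kernel."""
    K = np.exp(-l2_dist2(F, F, dt) / (2.0 * width ** 2))
    alpha = np.linalg.solve(K + lam * len(y) * np.eye(len(y)), y)
    return lambda Fnew: np.exp(-l2_dist2(Fnew, F, dt) / (2.0 * width ** 2)) @ alpha

# Toy usage: 100 random curves on [0, 1]; response = mean value of each curve.
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 50)
F = np.cumsum(rng.normal(size=(100, 50)), axis=1) / 50.0
y = F.mean(axis=1)
predict = functional_krr(F, y, width=0.5, lam=1e-3, dt=grid[1] - grid[0])
```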


2005, Vol 6 (2), pp. 171-192
Author(s): Qiang Wu, Yiming Ying, Ding-Xuan Zhou
