Higher-Order Regularized Kernel Canonical Correlation Analysis

Author(s):  
Md. Ashad Alam ◽  
Kenji Fukumizu

It is well known that the performance of kernel methods depends on the choice of appropriate kernels and associated parameters. While cross-validation (CV) is a useful method of kernel and parameter selection for supervised learning methods such as support vector machines, there are no general, well-founded methods for unsupervised kernel methods. This paper discusses CV for kernel canonical correlation analysis (KCCA) and proposes a new regularization approach for KCCA. As we demonstrate with Gaussian kernels, the CV errors for KCCA tend to decrease as the bandwidth parameter of the kernel decreases, which yields inappropriate features in which all the data are concentrated at a few points. This is caused by the ill-posedness of KCCA under CV. To solve this problem, we propose constraining the fourth-order moments of the canonical variables in addition to their variances. Experiments on synthesized and real-world data demonstrate that the proposed higher-order regularized KCCA can be used effectively with CV to find appropriate kernel and regularization parameters.
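
The bandwidth problem described in this abstract can be illustrated with a minimal sketch. It assumes the standard second-order regularized KCCA, in which the canonical correlations are the singular values of R_x R_y with R = K (K + n*reg*I)^{-1} for a centered Gram matrix K. It does not reproduce the paper's CV experiments or the proposed fourth-order constraint, and the function names, sample size, bandwidths, and regularization value are all illustrative assumptions. With a very small Gaussian bandwidth the Gram matrices approach the identity, so even statistically independent samples can yield a large first canonical correlation.

```python
import numpy as np

def gaussian_gram(x, bandwidth):
    """Gaussian-kernel Gram matrix exp(-||xi - xj||^2 / (2 * bandwidth^2))."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))

def center(K):
    """Center a Gram matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def first_kernel_canonical_correlation(x, y, bandwidth, reg):
    """Largest regularized kernel canonical correlation (second-order KCCA).

    Illustrative helper, not the authors' higher-order method.
    """
    n = len(x)
    Kx = center(gaussian_gram(x, bandwidth))
    Ky = center(gaussian_gram(y, bandwidth))
    ridge = n * reg * np.eye(n)
    # (K + ridge)^{-1} K equals K (K + ridge)^{-1} because the factors commute.
    Rx = np.linalg.solve(Kx + ridge, Kx)
    Ry = np.linalg.solve(Ky + ridge, Ky)
    return np.linalg.svd(Rx @ Ry, compute_uv=False)[0]

# Independent samples: there is no true dependence between x and y.
rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y = rng.standard_normal(100)

for bw in (1.0, 0.1, 0.001):
    rho = first_kernel_canonical_correlation(x, y, bandwidth=bw, reg=1e-3)
    print(f"bandwidth={bw:g}  first canonical correlation={rho:.3f}")
```

Because x and y are drawn independently here, a high value at a small bandwidth reflects overfitting rather than real dependence, which is why a criterion that simply rewards correlation cannot select the bandwidth on its own.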

2017 ◽  
Vol 5 (325) ◽  
Author(s):  
Mirosław Krzyśko ◽  
Łukasz Waszak

Canonical correlation methods for data representing functions or curves have received much attention in recent years. Such data are known in the literature as functional data (Ramsay and Silverman, 2005). Examples of functional data can be found in several application domains, such as medicine, economics, and meteorology. Unfortunately, canonical correlation methods for multivariate data cannot be applied directly to functional data because of the problem of dimensionality and the difficulty of taking into account the correlation and order of functional data. The problem of constructing canonical correlations and canonical variables for functional data was addressed by Leurgans et al. (1993), and further developments were made by Ramsay and Silverman (2005). In this paper we propose a new method of constructing canonical correlations and canonical variables for functional data.


2011 ◽  
Vol 5 (3) ◽  
pp. 2169-2196 ◽  
Author(s):  
Daniel Samarov ◽  
J. S. Marron ◽  
Yufeng Liu ◽  
Christopher Grulke ◽  
Alexander Tropsha

2019 ◽  
Vol 17 (04) ◽  
pp. 1950028 ◽  
Author(s):  
Md. Ashad Alam ◽  
Osamu Komori ◽  
Hong-Wen Deng ◽  
Vince D. Calhoun ◽  
Yu-Ping Wang

The kernel canonical correlation analysis-based U-statistic (KCCU) is used to detect nonlinear gene–gene co-associations. Estimating the variance of the KCCU is, however, computationally intensive. In addition, kernel canonical correlation analysis (kernel CCA) is not robust to contaminated data. Using a robust kernel mean element and a robust kernel (cross-)covariance operator potentially enables a robust kernel CCA, which is studied in this paper. We first propose an influence function-based estimator for the variance of the KCCU. We then present a non-parametric robust KCCU, which is designed for dealing with contaminated data. The robust KCCU is less sensitive to noise than the KCCU. We investigate the proposed method using both synthesized data and real data from the Mind Clinical Imaging Consortium (MCIC). We show through simulation studies that the power of the proposed methods is a monotonically increasing function of sample size, and that the robust test statistics bring incremental gains in power. To demonstrate the advantage of the robust kernel CCA, we study gene–gene co-associations among 22,442 candidate schizophrenia genes in the MCIC data. We select 768 genes with strong evidence for shedding light on gene–gene interaction networks for schizophrenia. Through gene ontology enrichment analysis, pathway analysis, gene–gene network analysis, and other studies, we show that the proposed robust methods can find undiscovered genes in addition to significant gene pairs, and that they outperform several current approaches.


Author(s):  
Blaž Fortuna ◽  
Nello Cristianini ◽  
John Shawe-Taylor

We present a general method using kernel canonical correlation analysis (KCCA) to learn a semantic representation of text from an aligned multilingual collection of text documents. The semantic space provides a language-independent representation of text and enables comparison between text documents in different languages. In experiments, we apply KCCA to cross-lingual retrieval of text documents, where the text query is written in only one language, and to cross-lingual text categorization, where we train a cross-lingual classifier.

