Multiple Kernel Learning with Gaussianity Measures

2012 ◽  
Vol 24 (7) ◽  
pp. 1853-1881 ◽  
Author(s):  
Hideitsu Hino ◽  
Nima Reyhani ◽  
Noboru Murata

Kernel methods are known to be effective for nonlinear multivariate analysis. One of the main issues in their practical use is the selection of the kernel, and there have been many studies on kernel selection and kernel learning. Multiple kernel learning (MKL) is one of the more promising kernel optimization approaches. Kernel methods have been applied to various classifiers, including Fisher discriminant analysis (FDA). FDA gives the Bayes-optimal classification axis when the data distribution of each class in the feature space is Gaussian with a shared covariance structure. Based on this fact, an MKL framework built on the notion of gaussianity is proposed. As a concrete implementation, an empirical characteristic function is adopted to measure gaussianity in the feature space associated with a convex combination of kernel functions, and two MKL algorithms are derived. Experimental results on several data sets show that the proposed kernel learning followed by FDA offers strong classification power.
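The core construction here, a convex combination of base kernels followed by kernel FDA on the combined Gram matrix, can be sketched as follows. This is a minimal NumPy illustration with made-up RBF bandwidths and fixed uniform weights; the paper's gaussianity-driven weight optimization via the empirical characteristic function is not reproduced.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gram matrix of the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def combined_gram(X, gammas, weights):
    """Convex combination of Gram matrices: K = sum_m w_m K_m, w_m >= 0, sum_m w_m = 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wm * rbf_kernel(X, X, g) for wm, g in zip(w, gammas))

# Toy two-class data (a stand-in for the paper's experiments).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(3.0, 1.0, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

K = combined_gram(X, gammas=[0.1, 1.0, 10.0], weights=[1.0, 1.0, 1.0])

# Kernel FDA on the combined kernel: solve (N + eps*I) alpha = m1 - m0,
# where m_c are the class-mean kernel vectors and N is the within-class scatter.
m0 = K[:, y == 0].mean(axis=1)
m1 = K[:, y == 1].mean(axis=1)
N = np.zeros_like(K)
for c in (0, 1):
    Kc = K[:, y == c]
    l = Kc.shape[1]
    N += Kc @ (np.eye(l) - np.ones((l, l)) / l) @ Kc.T
alpha = np.linalg.solve(N + 1e-3 * np.eye(len(y)), m1 - m0)
scores = K @ alpha  # 1-D projection onto the discriminant axis
```

With separated classes, the projected scores of the two classes concentrate on opposite sides of the axis; the MKL step of the paper would additionally adjust `weights` so that each projected class looks as Gaussian as possible.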

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Ling Wang ◽  
Hongqiao Wang ◽  
Guangyuan Fu

Extensions of kernel methods to class-imbalance problems have been extensively studied. Although they cope well with nonlinear problems, their high computation and memory costs severely limit their application to real-world imbalanced tasks. The Nyström method is an effective technique for scaling kernel methods; however, the standard Nyström method must sample a sufficiently large number of landmark points to ensure an accurate approximation, which seriously hurts its efficiency. In this study, we propose a multi-Nyström method based on mixtures of Nyström approximations that avoids the explosion of the subkernel matrix, while the optimization of the mixture weights is embedded into the model training process by multiple kernel learning (MKL) algorithms to yield a more accurate low-rank approximation. Moreover, we select the subsets of landmark points according to the imbalance of the distribution to reduce the model’s sensitivity to skewness. We also provide a kernel-stability analysis of our method and show that the model solution error is bounded by the weighted approximation errors, which can help improve the learning process. Extensive experiments on several large-scale datasets show that our method achieves higher classification accuracy and a dramatic speedup of MKL algorithms.
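The standard Nyström building block that the multi-Nyström mixture is composed of can be sketched as follows. The landmark count, RBF bandwidth, and uniform sampling below are illustrative assumptions; the paper's MKL-learned mixture weights are only indicated in a comment.

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    """RBF Gram matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def nystrom(X, landmarks, gamma=0.5):
    """Low-rank Nystrom approximation K ~= C W^+ C^T from m landmark points."""
    C = rbf(X, landmarks, gamma)           # n x m cross-kernel
    W = rbf(landmarks, landmarks, gamma)   # m x m landmark kernel
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
idx = rng.choice(len(X), size=20, replace=False)  # uniform landmark sampling

K_full = rbf(X, X)
K_approx = nystrom(X, X[idx])
rel_err = np.linalg.norm(K_full - K_approx) / np.linalg.norm(K_full)

# A multi-Nystrom mixture in the spirit of the paper would combine several
# such approximations over different landmark subsets S_m, chosen according
# to the class distribution, with MKL-learned weights:
#     K ~= sum_m w_m * nystrom(X, S_m)
```

Because the Nyström approximation never materializes the full n x n kernel during training (only the n x m and m x m blocks), the cost drops from O(n²) memory to O(nm) with m ≪ n.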


Entropy ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. 794
Author(s):  
Alessio Martino ◽  
Enrico De Santis ◽  
Alessandro Giuliani ◽  
Antonello Rizzi

Multiple kernel learning is a paradigm that employs a properly constructed chain of kernel functions able to simultaneously analyse different data or different representations of the same data. In this paper, we propose a hybrid classification system based on a linear combination of multiple kernels defined over multiple dissimilarity spaces. The core of the training procedure is the joint optimisation of the kernel weights and of the selection of representatives in the dissimilarity spaces. This equips the system with a two-fold knowledge discovery phase: by analysing the weights, it is possible to check which representations are more suitable for solving the classification problem, whereas the pivotal patterns selected as representatives can give further insight into the modelled system, possibly with the help of field experts. The proposed classification system is tested on real proteomic data in order to predict proteins’ functional role starting from their folded structure: specifically, a set of eight representations is drawn from the graph-based description of the folded protein. The proposed multiple-kernel system has also been benchmarked against a clustering-based classification system that can likewise exploit multiple dissimilarities simultaneously. Computational results show remarkable classification capabilities, and the knowledge discovery analysis is in line with current biological knowledge, suggesting the reliability of the proposed system.
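A minimal sketch of the dissimilarity-space construction and of a fixed linear combination of per-representation kernels. The prototypes, metrics, and weights below are illustrative assumptions; the paper's joint optimisation of weights and representatives is not reproduced.

```python
import numpy as np

def dissimilarity_space(X, prototypes, metric):
    """Embed each sample as its vector of dissimilarities to the prototypes."""
    return np.array([[metric(x, p) for p in prototypes] for x in X])

def combined_kernel(reps, weights, gamma=0.1):
    """Linear combination of RBF kernels, one per dissimilarity representation."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    n = len(reps[0])
    K = np.zeros((n, n))
    for wm, D in zip(w, reps):
        sq = ((D[:, None, :] - D[None, :, :]) ** 2).sum(axis=-1)
        K += wm * np.exp(-gamma * sq)
    return K

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 4))
protos = X[rng.choice(30, 5, replace=False)]        # representative selection
euclid = lambda a, b: np.linalg.norm(a - b)         # two example dissimilarities
cityblock = lambda a, b: np.abs(a - b).sum()
reps = [dissimilarity_space(X, protos, m) for m in (euclid, cityblock)]
K = combined_kernel(reps, weights=[0.7, 0.3])
```

In the paper's setting each dissimilarity space would come from a different graph-based protein representation, and the learned weights would expose which representations matter for the classification task.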


2018 ◽  
Vol 30 (3) ◽  
pp. 820-855 ◽  
Author(s):  
Wei Wang ◽  
Hao Wang ◽  
Chen Zhang ◽  
Yang Gao

Learning an appropriate distance metric plays a substantial role in the success of many learning machines. Conventional metric learning algorithms have limited utility when the training and test samples are drawn from related but different domains (i.e., a source domain and a target domain). In this letter, we propose two novel metric learning algorithms for domain adaptation in an information-theoretic setting, allowing discriminating power to be transferred, and standard learning machines to be propagated, across the two domains. In the first, a cross-domain Mahalanobis distance is learned by combining three goals: reducing the distribution difference between the domains, preserving the geometry of the target-domain data, and aligning the geometry of the source-domain data with the label information. Furthermore, to tackle more complex domain adaptation problems and go beyond linear cross-domain metric learning, we extend the first method to a multiple kernel learning framework. A convex combination of multiple kernels and a linear transformation are adaptively learned in a single optimization, which greatly benefits the exploration of prior knowledge and the description of the data characteristics. Comprehensive experiments on three real-world applications (face recognition, text classification, and object categorization) verify that the proposed methods outperform state-of-the-art metric learning and domain adaptation methods.
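A toy illustration of two of the ingredients: a Mahalanobis distance parameterized by M = LᵀL, and a linear-kernel MMD term measuring the distribution difference between domains. The projection that removes the mean shift is a hand-crafted stand-in for the learned transformation, not the authors' optimization.

```python
import numpy as np

def mahalanobis(x, y, M):
    """Mahalanobis distance d_M(x, y) = sqrt((x - y)^T M (x - y))."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

def mmd2(A, B):
    """Squared linear-kernel MMD between two samples: ||mean(A) - mean(B)||^2."""
    diff = A.mean(axis=0) - B.mean(axis=0)
    return float(diff @ diff)

rng = np.random.default_rng(3)
Xs = rng.normal(0.0, 1.0, size=(100, 3))   # source domain
Xt = rng.normal(0.5, 1.0, size=(100, 3))   # mean-shifted target domain

L = np.eye(3)                  # linear transform; M = L^T L is the metric
M = L.T @ L
d01 = mahalanobis(Xs[0], Xt[0], M)
gap_before = mmd2(Xs @ L.T, Xt @ L.T)

# Hand-crafted "learned" transform: project out the empirical mean-shift
# direction, which drives this (linear-kernel) MMD term to zero. A real
# objective would trade this off against geometry- and label-preserving terms.
shift = Xt.mean(axis=0) - Xs.mean(axis=0)
P = np.eye(3) - np.outer(shift, shift) / (shift @ shift)
gap_after = mmd2(Xs @ P.T, Xt @ P.T)
```

The kernelized extension in the paper replaces the fixed feature space with a convex combination of kernels, so that the transformation and the kernel weights are learned together.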


2019 ◽  
Vol 29 (02) ◽  
pp. 1850042 ◽  
Author(s):  
D. Collazos-Huertas ◽  
D. Cárdenas-Peña ◽  
G. Castellanos-Dominguez

The early detection of Alzheimer’s disease and the quantification of its progression pose multiple difficulties for machine learning algorithms. Two of the most relevant issues are missing data and the interpretability of results. To deal with both, we introduce a methodology for predicting the conversion of mild cognitive impairment patients to Alzheimer’s disease from structural brain MRI volumes. First, we use morphological measures of each brain structure to build an instance-based feature mapping that copes with missed follow-up visits. Then, the extracted feature mappings are combined into a single representation through a convex combination of reproducing kernels, with the weighting parameter of each structure tuned by maximizing the centered-kernel alignment criterion. We evaluate the proposed methodology with a couple of well-known classification machines on the ADNI database, which is devoted to assessing the combined prognostic value of several AD biomarkers. The experimental results show that our instance-based representation with multiple kernel learning enables detecting mild cognitive impairment as well as predicting conversion to Alzheimer’s disease within three years of the initial screening. Moreover, the brain structures with the largest combination weights are directly related to memory and cognitive functions.
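The centered-kernel alignment criterion used to tune the per-structure weights can be computed as below. The alignment-proportional weighting at the end is a simple illustrative heuristic, not necessarily the paper's exact optimization, and the data and bandwidths are made up.

```python
import numpy as np

def center(K):
    """Center a Gram matrix: H K H with H = I - (1/n) 11^T."""
    n = len(K)
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def cka(K, L):
    """Centered-kernel alignment between Gram matrices K and L (in [0, 1])."""
    Kc, Lc = center(K), center(L)
    return float((Kc * Lc).sum() / (np.linalg.norm(Kc) * np.linalg.norm(Lc)))

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0.0, 1.0, (15, 3)), rng.normal(4.0, 1.0, (15, 3))])
y = np.array([-1] * 15 + [1] * 15)
Ky = np.outer(y, y).astype(float)          # ideal (label) kernel

# One RBF Gram matrix per candidate bandwidth (stand-ins for per-structure kernels).
grams = []
for g in (0.01, 0.1, 1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    grams.append(np.exp(-g * sq))

scores = [cka(K, Ky) for K in grams]
weights = np.array(scores) / sum(scores)   # alignment-proportional weighting
```

Kernels whose induced similarity structure matches the labels get high alignment and therefore large weights, which is what makes the learned weights interpretable per brain structure.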


2010 ◽  
Vol 22 (11) ◽  
pp. 2887-2923 ◽  
Author(s):  
Hideitsu Hino ◽  
Noboru Murata

Reducing the dimensionality of high-dimensional data without losing its essential information is an important task in information processing. When class labels of the training data are available, Fisher discriminant analysis (FDA) has been widely used. However, the optimality of FDA is guaranteed only under very restrictive ideal conditions, and it is often observed that FDA does not provide a good classification surface for many real problems. This letter treats the problem of supervised dimensionality reduction from the viewpoint of information theory and proposes a framework for dimensionality reduction based on class-conditional entropy minimization. The proposed linear dimensionality-reduction technique is validated both theoretically and experimentally. Then, through kernel Fisher discriminant analysis (KFDA), the multiple kernel learning problem is treated within the proposed framework, and a novel algorithm that iteratively optimizes the parameters of the classification function and the kernel combination coefficients is proposed. The algorithm is shown experimentally to be comparable to or better than KFDA on large-scale benchmark data sets, and comparable to other multiple kernel learning techniques on the yeast protein function annotation task.
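A rough sketch of choosing a projection by minimizing class-conditional entropy. As a stand-in for a proper entropy estimator, a Gaussian proxy (0.5·log 2πeσ²) is used per class, which reduces the criterion to within-class variance; the data and the grid search over directions are illustrative, not the paper's algorithm.

```python
import numpy as np

def gaussian_entropy_1d(z):
    """Gaussian proxy for differential entropy: 0.5 * log(2*pi*e*var(z))."""
    return 0.5 * np.log(2.0 * np.pi * np.e * z.var() + 1e-12)

def class_conditional_entropy(X, y, w):
    """Sum of per-class entropies of the data projected onto direction w."""
    w = w / np.linalg.norm(w)
    return sum(gaussian_entropy_1d(X[y == c] @ w) for c in np.unique(y))

rng = np.random.default_rng(5)
# Two classes tightly concentrated along the x-axis; the y-axis is pure noise.
X = np.vstack([rng.normal([0.0, 0.0], [0.3, 2.0], (50, 2)),
               rng.normal([3.0, 0.0], [0.3, 2.0], (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Grid search over 1-D projection directions in the plane.
angles = np.linspace(0.0, np.pi, 180, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
ents = [class_conditional_entropy(X, y, w) for w in dirs]
best = dirs[int(np.argmin(ents))]   # direction minimizing class-conditional entropy
```

On this toy example the minimizer is (close to) the x-axis, along which each class is most concentrated; the letter's framework uses a genuine entropy estimate and, in the kernelized version, alternates this step with updates of the kernel combination coefficients.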


Author(s):  
Guo ◽  
Xiaoqian Zhang ◽  
Zhigui Liu ◽  
Xuqian Xue ◽  
Qian Wang ◽  
...  
