Clustering and variable selection evaluation of 13 unsupervised methods for multi-omics data integration

Briefings in Bioinformatics ◽

10.1093/bib/bbz138 ◽

2019 ◽

Vol 21 (6) ◽

pp. 2011-2030 ◽

Author(s):

Morgane Pierre-Jean ◽

Jean-François Deleuze ◽

Edith Le Floch ◽

Florence Mauger

Keyword(s):

Correlation Analysis ◽

Variable Selection ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Molecular Data ◽

Heterogeneous Data ◽

Data Availability ◽

Generalized Canonical Correlation Analysis ◽

Unsupervised Methods

Abstract: Recent advances in NGS sequencing, microarrays and mass spectrometry for omics data production have enabled the generation and collection of different modalities of high-dimensional molecular data. The integration of multiple omics datasets is a statistical challenge, due to the limited number of individuals, the high number of variables and the heterogeneity of the datasets to integrate. Recently, a lot of tools have been developed to solve the problem of integrating omics data including canonical correlation analysis, matrix factorization and SM. These commonly used techniques aim to analyze simultaneously two or more types of omics. In this article, we compare a panel of 13 unsupervised methods based on these different approaches to integrate various types of multi-omics datasets: iClusterPlus, regularized generalized canonical correlation analysis, sparse generalized canonical correlation analysis, multiple co-inertia analysis (MCIA), integrative-NMF (intNMF), SNF, MoCluster, mixKernel, CIMLR, LRAcluster, ConsensusClustering, PINSPlus and multi-omics factor analysis (MOFA). We evaluate the ability of the methods to recover the subgroups and the variables that drive the clustering on eight benchmarks of simulation. MOFA does not provide any results on these benchmarks. For clustering, SNF, MoCluster, CIMLR, LRAcluster, ConsensusClustering and intNMF provide the best results. For variable selection, MoCluster outperforms the others. However, the performance of the methods seems to depend on the heterogeneity of the datasets (especially for MCIA, intNMF and iClusterPlus). Finally, we apply the methods on three real studies with heterogeneous data and various phenotypes. We conclude that MoCluster is the best method to analyze these omics data. Availability: An R package named CrIMMix is available on GitHub at https://github.com/CNRGH/crimmix to reproduce all the results of this article.


Variable selection for generalized canonical correlation analysis

Biostatistics ◽

10.1093/biostatistics/kxu001 ◽

2014 ◽

Vol 15 (3) ◽

pp. 569-583 ◽

Author(s):

A. Tenenhaus ◽

C. Philippe ◽

V. Guillemot ◽

K.-A. Le Cao ◽

J. Grill ◽

...

Keyword(s):

Correlation Analysis ◽

Variable Selection ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Generalized Canonical Correlation Analysis ◽


Structured Variable Selection for Regularized Generalized Canonical Correlation Analysis

Springer Proceedings in Mathematics & Statistics - The Multiple Facets of Partial Least Squares and Related Methods ◽

10.1007/978-3-319-40643-5_10 ◽

2016 ◽

pp. 129-139

Author(s):

Tommy Löfstedt ◽

Fouad Hadj-Selem ◽

Vincent Guillemot ◽

Cathy Philippe ◽

Edouard Duchesnay ◽

...

Keyword(s):

Correlation Analysis ◽

Variable Selection ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Generalized Canonical Correlation Analysis ◽


Efficient and Distributed Generalized Canonical Correlation Analysis for Big Multiview Data

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2018.2875908 ◽

2019 ◽

Vol 31 (12) ◽

pp. 2304-2318 ◽

Author(s):

Xiao Fu ◽

Kejun Huang ◽

Evangelos E. Papalexakis ◽

Hyun Ah Song ◽

Partha Talukdar ◽

...

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Generalized Canonical Correlation Analysis


Cross-domain recommender system using generalized canonical correlation analysis

Knowledge and Information Systems ◽

10.1007/s10115-020-01499-4 ◽

2020 ◽

Vol 62 (12) ◽

pp. 4625-4651

Author(s):

Seyed Mohammad Hashemi ◽

Mohammad Rahmati

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Recommender System ◽

Canonical Correlation ◽

Cross Domain ◽

Generalized Canonical Correlation Analysis


Gabor-Feature Hallucination based on Generalized Canonical Correlation Analysis for face recognition

2011 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS) ◽

10.1109/ispacs.2011.6146163 ◽

2011 ◽

Author(s):

Kuong-Hon Pong ◽

Kin-Man Lam

Keyword(s):

Face Recognition ◽

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Generalized Canonical Correlation Analysis ◽


Nonlinear Generalized Canonical Correlation Analysis by Neural Network Models

Measurement and Multivariate Analysis ◽

10.1007/978-4-431-65955-6_19 ◽

2002 ◽

pp. 183-190 ◽

Author(s):

Yoshio Takane ◽

Yuriko Oshima-Takane

Keyword(s):

Neural Network ◽

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Network Models ◽

Neural Network Models ◽

Generalized Canonical Correlation Analysis


Unsupervised multi-view representation learning with proximity guided representation and generalized canonical correlation analysis

Applied Intelligence ◽

10.1007/s10489-020-01821-1 ◽

2020 ◽

Vol 51 (1) ◽

pp. 248-264

Author(s):

Tingyi Zheng ◽

Huibin Ge ◽

Jiayi Li ◽

Li Wang

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Representation Learning ◽

Generalized Canonical Correlation Analysis


A Variable Selection Criterion for Two Sets of Principal Component Scores in Principal Canonical Correlation Analysis

Communications in Statistics - Theory and Methods ◽

10.1080/03610926.2011.605235 ◽

2013 ◽

Vol 42 (12) ◽

pp. 2118-2135 ◽

Author(s):

Toru Ogura ◽

Yasunori Fujikoshi ◽

Takakazu Sugiyama

Keyword(s):

Correlation Analysis ◽

Variable Selection ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Selection Criterion ◽

Principal Component ◽

Component Scores


A whitening approach to probabilistic canonical correlation analysis for omics data integration

BMC Bioinformatics ◽

10.1186/s12859-018-2572-9 ◽

2019 ◽

Vol 20 (1) ◽

Author(s):

Takoua Jendoubi ◽

Korbinian Strimmer

Keyword(s):

Correlation Analysis ◽

Data Integration ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Omics Data Integration


Sparse generalized canonical correlation analysis for biological model integration: A genetic study of psychiatric disorders

2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) ◽

10.1109/embc.2013.6609794 ◽

2013 ◽

Author(s):

Mingon Kang ◽

Baoju Zhang ◽

Xiaoyong Wu ◽

Chunyu Liu ◽

Jean Gao

Keyword(s):

Correlation Analysis ◽

Psychiatric Disorders ◽

Canonical Correlation Analysis ◽

Genetic Study ◽

Canonical Correlation ◽

Biological Model ◽

Model Integration ◽

Generalized Canonical Correlation Analysis
