CancerSubtypes: an R/Bioconductor package for molecular cancer subtype identification, validation and visualization

Abstract. Background: Recent high-throughput technologies have been used to collect heterogeneous biomedical omics datasets. Computational analysis of multi-omics datasets could reveal deep insights into a given disease. Most existing clustering methods for multi-omics data assume strong consistency among the different data sources, and thus may lose efficacy when the consistency is relatively weak. Furthermore, they cannot identify the conflicting parts of each view, which might be important in applications such as cancer subtype identification. Methods: In this work, we propose an integrative subspace clustering method (ISC) based on common and specific decomposition to identify clustering structures in multi-omics datasets. The main idea of ISC is that the original representation of the samples in each view can be reconstructed by concatenating a common part and a view-specific part that lie in orthogonal subspaces. The problem can be formulated as a matrix decomposition problem and solved efficiently by our proposed algorithm. Results: Experiments on simulated and text datasets show that our method outperforms other state-of-the-art methods. Our method is further evaluated by identifying cancer types using a colorectal dataset. Finally, we apply our method to cancer subtype identification for five cancers using TCGA datasets, and survival analysis shows that the subtypes we found are significantly better than those found by the compared methods. Conclusion: Our ISC model can not only discover the weak common information across views but also identify the view-specific information.
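The decomposition idea in the Methods part of this abstract can be written down schematically. The notation below (the symbols X^{(v)}, C, S^{(v)}, W^{(v)} and the least-squares objective) is an illustrative assumption, not the formulation used in the paper, which the abstract does not give.

```latex
% Illustrative notation only; not taken from the paper.
% X^{(v)} \in \mathbb{R}^{n \times d_v}: n samples in view v, v = 1, \dots, V
% C       \in \mathbb{R}^{n \times k}:   common part shared by all views
% S^{(v)} \in \mathbb{R}^{n \times k_v}: part specific to view v
\begin{aligned}
X^{(v)} &\approx \begin{bmatrix} C & S^{(v)} \end{bmatrix} W^{(v)},
  \qquad W^{(v)} \in \mathbb{R}^{(k + k_v) \times d_v}, \\
C^{\top} S^{(v)} &= 0
  \quad \text{(common and specific parts span orthogonal subspaces)}, \\
\min_{C,\, \{S^{(v)},\, W^{(v)}\}} \;& \sum_{v=1}^{V}
  \bigl\lVert X^{(v)} - \begin{bmatrix} C & S^{(v)} \end{bmatrix} W^{(v)} \bigr\rVert_F^2
  \quad \text{subject to } C^{\top} S^{(v)} = 0 \ \text{for all } v .
\end{aligned}
```

Under this reading, clustering on C would use only the information shared across views, while each S^{(v)} keeps what is particular to one omics source.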

Cancer subtype identification pipeline: A classifusion approach

2016 IEEE Congress on Evolutionary Computation (CEC)
DOI: 10.1109/cec.2016.7744150
Year: 2016
Cited By ~ 3
Author(s): Utkarsh Agrawal, Daniele Soria, Christian Wagner
Keyword(s): Cancer Subtype, Subtype Identification

Breast cancer subtype identification using machine learning techniques

2014 IEEE 4th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)
DOI: 10.1109/iccabs.2014.6863912
Year: 2014
Cited By ~ 1
Author(s): Forough Firoozbakht, Iman Rezaeian, Lisa Porter, Luis Rueda
Keyword(s): Breast Cancer, Machine Learning, Breast Cancer Subtype, Machine Learning Techniques, Learning Techniques, Cancer Subtype, Subtype Identification

ParticleMDI: particle Monte Carlo methods for the cluster analysis of multiple datasets with applications to cancer subtype identification

Advances in Data Analysis and Classification
DOI: 10.1007/s11634-020-00401-y
Year: 2020
Vol 14 (2), pp. 463-484
Author(s): Nathan Cunningham, Jim E. Griffin, David L. Wild
Keyword(s): Cluster Analysis, Monte Carlo, Monte Carlo Methods, Multiple Datasets, Cancer Subtype, Subtype Identification

Cancer Subtype Identification based on Multi-view Subspace Clustering with Adaptive Local Structure Learning

DOI: 10.1109/bibm52615.2021.9669659
Year: 2021
Author(s): Haoran Liu, Mingchao Shang, Huaxiang Zhang, Cheng Liang
Keyword(s): Local Structure, Structure Learning, Subspace Clustering, Cancer Subtype, Subtype Identification, Local Structure Learning

Developing a label propagation approach for cancer subtype identification problem

Turkish Journal of Biology
DOI: 10.3906/biy-2108-83
Year: 2021
Keyword(s): Identification Problem, Label Propagation, Cancer Subtype, Subtype Identification

Cancer subtype identification using somatic mutation data

British Journal of Cancer
DOI: 10.1038/s41416-018-0109-7
Year: 2018
Vol 118 (11), pp. 1492-1501
Cited By ~ 20
Author(s): Marieke Lydia Kuijjer, Joseph Nathaniel Paulson, Peter Salzman, Wei Ding, John Quackenbush
Keyword(s): Somatic Mutation, Cancer Subtype, Subtype Identification, Mutation Data

GenSo-FDSS: a neural-fuzzy decision support system for pediatric ALL cancer subtype identification using gene expression data

Artificial Intelligence in Medicine
DOI: 10.1016/j.artmed.2004.03.009
Year: 2005
Vol 33 (1), pp. 61-88
Cited By ~ 40
Author(s): W.L. Tung, C. Quek
Keyword(s): Gene Expression, Decision Support, Gene Expression Data, Expression Data, Fuzzy Decision, Cancer Subtype, Subtype Identification, Pediatric ALL, Fuzzy Decision Support System, Neural Fuzzy

Performance Comparison of Deep Learning Autoencoders for Cancer Subtype Detection Using Multi-Omics Data

Cancers
DOI: 10.3390/cancers13092013
Year: 2021
Vol 13 (9), pp. 2013
Author(s): Edian F. Franco, Pratip Rana, Aline Cruz, Víctor V. Calderón, Vasco Azevedo, ...
Keyword(s): Deep Learning, Data Fusion, Similarity Measures, Research Problem, Optimal Number, Performance Comparison, The Cancer Genome Atlas, Cancer Type, Omics Data, Cancer Subtype

A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending on the activated pathway(s), patient survival varies significantly, and patients respond differently to various drugs. Therefore, cancer subtype detection using genomics-level data is a significant research problem. Subtype detection is often complex and, in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in a cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures on subtype detection. For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes. The results obtained are consistent with other genomic studies and can be corroborated with the involved pathways and biological functions. This shows that the results from the autoencoders, obtained through the interaction of different data types of cancer, can be used for the prediction and characterization of patient subgroups and survival profiles.
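The abstract above mentions choosing the number of subtypes with the silhouette score. A minimal sketch of that selection step follows, in plain Python. It is not the authors' pipeline: the toy 2-D points stand in for autoencoder latent features, k-means is an assumed clustering step (the abstract does not name one), and all function names (`make_blobs`, `kmeans`, `silhouette_score`, `best_k`) are hypothetical.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_blobs(per_blob=20, seed=42):
    """Toy stand-in for latent features: three well-separated 2-D groups."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
    return [[cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)]
            for cx, cy in blob_centers for _ in range(per_blob)]


def farthest_point_init(points, k):
    """Deterministic seeding: repeatedly add the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(euclidean(p, c) for c in centers)))
    return [list(c) for c in centers]


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means (an assumed clustering step); returns one label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: euclidean(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette_score(points, labels):
    """Mean over samples of (b - a) / max(a, b), with a = mean distance to the
    sample's own cluster and b = mean distance to the nearest other cluster."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(euclidean(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(euclidean(p, points[j]) for j in idx) / len(idx)
                for lab, idx in clusters.items() if lab != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, candidates):
    """Return the candidate k with the highest silhouette score, plus all scores."""
    scores = {k: silhouette_score(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    k, scores = best_k(make_blobs(), [2, 3, 4, 5])
    print("chosen k:", k)
    for cand, s in sorted(scores.items()):
        print(cand, round(s, 3))
```

On real multi-omics data the candidate range for k and the distance measure would need to be chosen with care; the abstract itself notes that similarity measures and feature selection affect the detected subtypes.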
