ROKET: Associating Somatic Mutation with Clinical Outcomes through Kernel Regression and Optimal Transport

Somatic mutations in cancer patients are inherently sparse and potentially high dimensional. Cancer patients may share the same set of deregulated biological processes perturbed by different sets of somatically mutated genes. Therefore, when assessing the associations between somatic mutations and clinical outcomes, gene-by-gene analysis is often underpowered because it does not capture the complex disease mechanisms shared across cancer patients. Rather than testing genes one by one, an intuitive approach is to aggregate somatic mutation data of multiple genes to assess the joint association. The challenge is how to aggregate such information. Building on the optimal transport method, we propose a principled approach to estimate the similarity of somatic mutation profiles of multiple genes between tumor samples, while accounting for gene-gene similarity defined by gene annotations or empirical mutational patterns. Using such similarities, we can assess the associations between somatic mutations and clinical outcomes by kernel regression. We have applied our method to analyze somatic mutation data of 17 cancer types and identified at least three cancer types harboring associations between somatic mutations and overall survival, progression-free interval or cytolytic activity.
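The idea of comparing mutation profiles through optimal transport can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a balanced transport problem in which each sample's mutations are normalized to unit mass (so every sample needs at least one mutation), a hypothetical 3-gene cost matrix `gene_cost`, and an exponential kernel `exp(-D)`, which is not guaranteed to be positive semidefinite. The published method may differ, for example in how it handles samples with different mutation burdens.

```python
import numpy as np
from scipy.optimize import linprog


def ot_distance(a, b, cost):
    """Balanced optimal transport cost between two non-negative mass vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a, b = a / a.sum(), b / b.sum()  # assumption: normalize each profile to unit mass
    n, m = len(a), len(b)
    # The transport plan P (n x m) is flattened row-major into n*m variables.
    row_sums = np.kron(np.eye(n), np.ones((1, m)))  # sum_j P[i, j] = a[i]
    col_sums = np.kron(np.ones((1, n)), np.eye(m))  # sum_i P[i, j] = b[j]
    res = linprog(
        np.asarray(cost, dtype=float).ravel(),
        A_eq=np.vstack([row_sums, col_sums]),
        b_eq=np.concatenate([a, b]),
        bounds=(0, None),
        method="highs",
    )
    return res.fun


# Hypothetical gene-gene costs: genes 0 and 1 are functionally similar, gene 2 is not.
gene_cost = np.array([[0.0, 0.1, 1.0],
                      [0.1, 0.0, 1.0],
                      [1.0, 1.0, 0.0]])

# Toy binary mutation profiles (rows = samples, columns = genes).
samples = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]])

n_samples = len(samples)
D = np.array([[ot_distance(samples[i], samples[j], gene_cost)
               for j in range(n_samples)] for i in range(n_samples)])
K = np.exp(-D)  # illustrative similarity matrix that a kernel regression could use
```

In this toy setup, samples 0 and 1 share no mutated gene, yet their transport distance is small because the mutated genes are close under `gene_cost`; a gene-by-gene comparison would treat them as entirely different.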

Analysis of multi-omics differences in left-side and right-side colon cancer

PeerJ ◽

10.7717/peerj.11433 ◽

2021 ◽

Vol 9 ◽

pp. e11433

Author(s):

Yanyi Huang ◽

Jinzhong Duanmu ◽

Yushu Liu ◽

Mengyun Yan ◽

Taiyuan Li ◽

...

Keyword(s):

Colon Cancer ◽

Cancer Patients ◽

Gene Mutation ◽

Somatic Mutation ◽

Transcriptome Data ◽

Characteristic Analysis ◽

Immune Microenvironment ◽

Mutation Data ◽

Upregulated Genes ◽

Testing Set

Background: Colon cancer is one of the most common tumors in the digestive tract. Studies of left-side colon cancer (LCC) and right-side colon cancer (RCC) show that these two subtypes have different prognoses, outcomes, and clinical responses to chemotherapy. Therefore, a better understanding of the importance of the clinical classifications of the anatomic subtypes of colon cancer is needed. Methods: We collected colon cancer patients' transcriptome data, clinical information, and somatic mutation data from The Cancer Genome Atlas (TCGA) database portal. The transcriptome data were taken from 390 colon cancer patients (172 LCC samples and 218 RCC samples); the somatic mutation data included 142 LCC samples and 187 RCC samples. We compared the expression and prognostic differences of LCC and RCC by conducting a multi-omics analysis of each using the clinical characteristics, immune microenvironment, transcriptomic differences, and mutation differences. The prognostic signatures were validated using the internal testing set, complete set, and external testing set (GSE39582). We also verified the independent prognostic value of the signatures. Results: The results of our clinical characteristic analysis showed that RCC had a significantly worse prognosis than LCC. The analysis of the immune microenvironment showed that immune infiltration was more common in RCC than LCC. The results of differential gene analysis showed that there were 360 differentially expressed genes, with 142 upregulated genes in LCC and 218 upregulated genes in RCC. The mutation frequency of RCC was generally higher than that of LCC. BRAF and KRAS gene mutations were the dominant gene mutations in RCC, and they had a strong mutual exclusion with APC, while APC gene mutation was the dominant gene mutation in LCC. This suggests that the molecular mechanisms of RCC and LCC differed. The 4-mRNA and 6-mRNA prognostic signatures of LCC and RCC, respectively, were highly predictive and may be used as independent prognostic factors. Conclusion: The clinical classification of the anatomic subtypes of colon cancer is of great significance for early diagnosis and prognostic risk assessment. Our study provides directions for individualized treatment of left and right colon cancer.

The lncRNA Signatures of Genome Instability to Predict Survival in Patients with Renal Cancer

Journal of Healthcare Engineering ◽

10.1155/2021/1090698 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Liang Huang ◽

Yu Xie ◽

Shusuan Jiang ◽

Weiqing Han ◽

Fanchang Zeng ◽

...

Keyword(s):

Cancer Patients ◽

Somatic Mutation ◽

Renal Cancer ◽

Somatic Mutations ◽

Genome Instability ◽

Noncoding RNAs ◽

Cancer Biomarkers ◽

Computational Framework ◽

Mutation Profile ◽

Prognosis Model

Long noncoding RNAs (lncRNAs) exert an increasingly important effect on genome instability and the prognosis of cancer patients. The present research established a computational framework, originating from the mutation assumption, that combines the lncRNA expression profile and the somatic mutation profile in the genome of renal cancer to assess the effect of lncRNAs on the gene instability of renal cancer. A total of 45 differentially expressed lncRNAs between the high and low cumulative somatic mutation groups were evaluated to be genome-instability-associated. We then established a prognosis model, GlncScore, based on three genome-instability-associated lncRNAs (AC156455.1, AC016405.3, and LINC01234). The GlncScore was then verified in the testing cohort and the total TCGA renal cancer cohort, and was found to predict patient survival accurately. Furthermore, GlncScore was associated with somatic mutation patterns, indicating its capacity to reflect genome instability in renal cancer. In conclusion, this study evaluated the effect of lncRNAs on genome instability of renal cancer and provided new hidden cancer biomarkers related to genome instability in renal cancer.

Analysis of multi-omics differences in left-side and right-side colon cancer

10.21203/rs.3.rs-108560/v1 ◽

2020 ◽

Author(s):

Yanyi Huang ◽

Jinzhong Duanmu ◽

Yushu Liu ◽

Mengyun Yan ◽

Taiyuan Li ◽

...

Keyword(s):

Colon Cancer ◽

Cancer Patients ◽

Somatic Mutation ◽

Clinical Characteristics ◽

Transcriptome Data ◽

Clinical Classification ◽

Immune Microenvironment ◽

Mutation Data ◽

Testing Set

Background: Colon cancer is one of the common tumors of the digestive tract. Studies of left-side colon cancer (LCC) and right-side colon cancer (RCC) show that these two subtypes have different prognoses, outcomes, and clinical responses to chemotherapy. Therefore, it is necessary to explore the need for clinical classification of the anatomic subtypes of colon cancer. Methods: We selected the transcriptome data, clinical information and somatic mutation data of colon cancer patients from The Cancer Genome Atlas (TCGA) database portal. The transcriptome data included 390 colon cancer patients (172 LCC samples and 218 RCC samples), and the somatic mutation data included 142 LCC samples and 187 RCC samples. We conducted a multi-omics analysis of LCC and RCC from four aspects (clinical characteristics, immune microenvironment, transcriptomic differences and mutation differences) to compare the expression and prognosis differences of LCC and RCC. We are the first to construct separate prognostic signatures for LCC and RCC. The prognostic signatures are validated by the internal testing set, complete set and external testing set (GSE39582). Additionally, we verified the independent prognostic value of the signatures. Results: Clinical characteristics analysis shows that RCC had a significantly worse prognosis than LCC. Analysis of the immune microenvironment shows that RCC had more immune infiltration than LCC. The results of differential gene analysis showed that there were 360 differentially expressed genes, with 142 upregulated genes in LCC and 218 upregulated genes in RCC. Correlation analysis of mutated genes showed that the expression of mutated genes in RCC was negatively correlated, while the expression of mutated genes in LCC was positively correlated, and the mutation frequency of RCC was generally higher than that of LCC. Meanwhile, our 4-mRNA LCC and 6-mRNA RCC prognostic signatures are highly predictive and can be used as independent prognostic factors. Conclusion: The clinical classification of anatomic subtypes of colon cancer is of great significance for its early diagnosis and prognostic risk assessment. Our study provides directions for individualized treatment of left and right colon cancer.

MutEx: a multifaceted gateway for exploring integrative pan-cancer genomic data

Briefings in Bioinformatics ◽

10.1093/bib/bbz084 ◽

2019 ◽

Vol 21 (4) ◽

pp. 1479-1486 ◽

Cited By ~ 2

Author(s):

Jie Ping ◽

Olufunmilola Oyebamiji ◽

Hui Yu ◽

Scott Ness ◽

Jeremy Chien ◽

...

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Somatic Mutation ◽

Survival Data ◽

Somatic Mutations ◽

Genomic Data ◽

Mismatch Repair Gene ◽

Survival Difference ◽

Cancer Types ◽

Pan Cancer

Somatic mutation and gene expression dysregulation are considered two major tumorigenesis factors. While independent investigations of either factor pervade, studies of associations between somatic mutations and gene expression changes have been sporadic and nonsystematic. Utilizing genomic data collected from 11 315 subjects of 33 distinct cancer types, we constructed MutEx, a pan-cancer integrative genomic database. This database records the relationships among gene expression, somatic mutation and survival data for cancer patients. MutEx can be used to swiftly explore the relationship between these genomic/clinical features within and across cancer types and, more importantly, search for corroborating evidence for hypothesis inception. Our database also incorporates Gene Ontology and several pathway databases to enhance functional annotation, and elastic net and a gene expression composite score to aid in survival analysis. To demonstrate the usability of MutEx, we provide several application examples, including top somatic mutations associated with the most extensive expression dysregulation in breast cancer, differential mutational burden downstream of DNA mismatch repair gene mutations and composite gene expression score-based survival difference in breast cancer. MutEx can be accessed at http://www.innovebioinfo.com/Databases/Mutationdb_About.php.

Cancer subtype identification using somatic mutation data

10.1101/228031 ◽

2017 ◽

Cited By ~ 1

Author(s):

Marieke L. Kuijjer ◽

Joseph N. Paulson ◽

Peter Salzman ◽

Wei Ding ◽

John Quackenbush

Keyword(s):

Somatic Mutation ◽

Treatment Options ◽

Biological Pathways ◽

The Cancer Genome Atlas ◽

Phenotypic Data ◽

Primary Tumors ◽

Cancer Subtypes ◽

Cancer Types ◽

Mutation Data ◽

Pan Cancer

BACKGROUND: With the onset of next generation sequencing technologies, we have made great progress in identifying recurrent mutational drivers of cancer. As cancer tissues are now frequently screened for specific sets of mutations, a large number of samples has become available for analysis. Classification of patients with similar mutation profiles may help identify subgroups of patients who might benefit from specific types of treatment. However, classification based on somatic mutations is challenging due to the sparseness and heterogeneity of the data. METHODS: Here, we describe a new method to de-sparsify somatic mutation data using biological pathways. We applied this method to 23 cancer types from The Cancer Genome Atlas, including samples from 5,805 primary tumors. RESULTS: We show that, for most cancer types, de-sparsified mutation data associates with phenotypic data. We identify poor prognostic subtypes in three cancer types, which are associated with mutations in signal transduction pathways for which targeted treatment options are available. We identify subtype-drug associations for 14 additional subtypes. Finally, we perform a pan-cancer subtyping analysis and identify nine pan-cancer subtypes, which associate with mutations in four overarching sets of biological pathways. CONCLUSIONS: This study is an important step towards understanding mutational patterns in cancer.
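To make the notion of de-sparsifying mutation data with pathways concrete, here is a small sketch. It is an assumption-laden illustration, not the authors' method: the binary mutation matrix, the two hypothetical pathways, and the scoring rule (fraction of a pathway's genes that are mutated) are all invented for the example, and the published approach may weight or normalize mutations differently.

```python
import numpy as np

# Toy binary mutation matrix (rows = samples, columns = genes); values are made up.
mutations = np.array([[1, 0, 0, 0],
                      [0, 1, 0, 0],
                      [0, 0, 0, 1]])

# Hypothetical pathway membership (rows = genes, columns = pathways):
# pathway 0 contains genes 0 and 1, pathway 1 contains genes 2 and 3.
membership = np.array([[1, 0],
                       [1, 0],
                       [0, 1],
                       [0, 1]])

# Assumed scoring rule: fraction of each pathway's genes mutated in each sample.
pathway_scores = mutations @ membership / membership.sum(axis=0)

# Samples 0 and 1 share no mutated gene, but their pathway profiles coincide.
shared_genes = int((mutations[0] * mutations[1]).sum())
same_pathways = bool(np.allclose(pathway_scores[0], pathway_scores[1]))
```

Collapsing genes into pathways turns a sparse sample-by-gene matrix into a denser sample-by-pathway matrix, which is what allows samples with disjoint mutated genes to be grouped together.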

Transfer Learning via Optimal Transportation for Integrative Cancer Patient Stratification

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/380 ◽

2021 ◽

Author(s):

Ziyu Liu ◽

Wei Shao ◽

Jie Zhang ◽

Min Zhang ◽

Kun Huang

Keyword(s):

Cancer Patients ◽

Transfer Learning ◽

Transport Theory ◽

Optimal Transport ◽

Early Stage ◽

The Cancer Genome Atlas ◽

Sufficient Information ◽

Cancer Type ◽

Early Stage Cancer ◽

Cancer Types

The stratification of early-stage cancer patients for the prediction of clinical outcome is a challenging task since cancer is associated with various molecular aberrations. A single biomarker often cannot provide sufficient information to stratify early-stage patients effectively. Understanding the complex mechanism behind cancer development calls for exploiting biomarkers from multiple modalities of data such as histopathology images and genomic data. The integrative analysis of these biomarkers sheds light on cancer diagnosis, subtyping, and prognosis. Another difficulty is that labels for early-stage cancer patients are scarce and not reliable enough for predicting survival times. Given that different cancer types share some commonalities, we explore whether the knowledge learned from one cancer type can be utilized to improve prognosis accuracy for another cancer type. We propose a novel unsupervised multi-view transfer learning algorithm to simultaneously analyze multiple biomarkers in different cancer types. We integrate multiple views using non-negative matrix factorization and formulate the transfer learning model based on the Optimal Transport theory to align features of different cancer types. We evaluate the stratification performance on three early-stage cancers from the Cancer Genome Atlas (TCGA) project. Compared with other benchmark methods, our framework achieves superior accuracy for patient outcome prediction.

Pan-Cancer Analysis Reveals Differential Susceptibility of Bidirectional Gene Promoters to DNA Methylation, Somatic Mutations, and Copy Number Alterations

10.20944/preprints201807.0113.v1 ◽

2018 ◽

Author(s):

Jeffrey A. Thompson ◽

Brock C. Christensen ◽

Carmen J. Marsit

Keyword(s):

DNA Methylation ◽

Logistic Regression ◽

Somatic Mutation ◽

Copy Number ◽

Copy Number Alteration ◽

Somatic Mutations ◽

Gene Promoters ◽

Bidirectional Promoters ◽

Somatic Alteration ◽

Cancer Types

Bidirectional gene promoters affect the transcription of two genes, leading to the hypothesis that they should exhibit protection against genetic or epigenetic changes in cancer. Therefore, they provide an excellent opportunity to learn about promoter susceptibility to somatic alteration in tumors. We tested this hypothesis using data from genome-scale DNA methylation (14 cancer types), simple somatic mutation (10 cancer types), and copy number variation profiling (14 cancer types). For DNA methylation, the difference in rank differential methylation between tumor and tumor-adjacent normal matched samples based on promoter type was tested by the Wilcoxon rank sum test. Logistic regression was used to compare differences in simple somatic mutations. For copy number alteration, a mixed effects logistic regression model was used. The change in methylation between non-diseased tissues and their tumor counterparts was significantly greater in single compared to bidirectional promoters across all 14 cancer types examined. Similarly, the extent of copy number alteration was greater in single gene compared to bidirectional promoters for all 14 cancer types. Furthermore, among 10 cancer types with available simple somatic mutation data, bidirectional promoters were slightly more susceptible. These results suggest that selective pressures related to specific functional impacts during carcinogenesis drive the susceptibility of promoter regions to somatic alteration.

Pan-Cancer Analysis Reveals Differential Susceptibility of Bidirectional Gene Promoters to DNA Methylation, Somatic Mutations, and Copy Number Alterations

International Journal of Molecular Sciences ◽

10.3390/ijms19082296 ◽

2018 ◽

Vol 19 (8) ◽

pp. 2296 ◽

Cited By ~ 4

Author(s):

Jeffrey Thompson ◽

Brock Christensen ◽

Carmen Marsit

Keyword(s):

DNA Methylation ◽

Logistic Regression ◽

Somatic Mutation ◽

Copy Number ◽

Copy Number Alteration ◽

Somatic Mutations ◽

Gene Promoters ◽

Bidirectional Promoters ◽

Somatic Alteration ◽

Cancer Types

Bidirectional gene promoters affect the transcription of two genes, leading to the hypothesis that they should exhibit protection against genetic or epigenetic changes in cancer. Therefore, they provide an excellent opportunity to learn about promoter susceptibility to somatic alteration in tumors. We tested this hypothesis using data from genome-scale DNA methylation (14 cancer types), simple somatic mutation (10 cancer types), and copy number variation profiling (14 cancer types). For DNA methylation, the difference in rank differential methylation between tumor and tumor-adjacent normal matched samples based on promoter type was tested by the Wilcoxon rank sum test. Logistic regression was used to compare differences in simple somatic mutations. For copy number alteration, a mixed effects logistic regression model was used. The change in methylation between non-diseased tissues and their tumor counterparts was significantly greater in single compared to bidirectional promoters across all 14 cancer types examined. Similarly, the extent of copy number alteration was greater in single gene compared to bidirectional promoters for all 14 cancer types. Furthermore, among 10 cancer types with available simple somatic mutation data, bidirectional promoters were slightly more susceptible. These results suggest that selective pressures related to specific functional impacts during carcinogenesis drive the susceptibility of promoter regions to somatic alteration.

Process-specific somatic mutation distributions vary with three-dimensional genome structure

10.1101/426080 ◽

2018 ◽

Author(s):

Kadir C. Akdemir ◽

Victoria T. Le ◽

Sarah Killcoyne ◽

Devin A. King ◽

Ya-Ping Li ◽

...

Keyword(s):

Somatic Mutation ◽

X Chromosome ◽

Genome Organization ◽

Somatic Mutations ◽

Light Exposure ◽

Human Cancer ◽

Three Dimensional ◽

Cancer Evolution ◽

Cancer Types ◽

Mutational Processes

Somatic mutations arise during the life history of a cell. Mutations occurring in cancer driver genes may ultimately lead to the development of clinically detectable disease. Nascent cancer lineages continue to acquire somatic mutations throughout the neoplastic process and during cancer evolution (Martincorena and Campbell, 2015). Extrinsic and endogenous mutagenic factors contribute to the accumulation of these somatic mutations (Zhang and Pellman, 2015). Understanding the underlying factors generating somatic mutations is crucial for developing potential preventive, therapeutic and clinical decisions. Earlier studies have revealed that DNA replication timing (Stamatoyannopoulos et al., 2009) and chromatin modifications (Schuster-Böckler and Lehner, 2012) are associated with variations in mutational density. What is unclear from these early studies, however, is whether all extrinsic and exogenous factors that drive somatic mutational processes share a similar relationship with chromatin state and structure. In order to understand the interplay between spatial genome organization and specific individual mutational processes, we report here a study of 3000 tumor-normal paired whole-genome datasets from more than 40 different human cancer types. Our analyses revealed that different mutational processes lead to distinct somatic mutation distributions between chromatin folding domains. APOBEC- or MSI-related mutations are enriched in transcriptionally active domains, while mutations occurring due to tobacco smoke, ultraviolet (UV) light exposure or a signature of unknown aetiology (signature 17) are enriched predominantly in transcriptionally inactive domains. Active mutational processes dictate the mutation distributions in cancer genomes, and we show that mutational distributions shift during cancer evolution when mutational processes switch. Moreover, a dramatic instance of extreme chromatin structure in humans, that of the unique folding pattern of the inactive X chromosome, leads to a distinct somatic mutation distribution on the X chromosome in females compared to males in various cancer types. Overall, the interplay between three-dimensional genome organization and active mutational processes has a substantial influence on the large-scale mutation rate variations observed in human cancer.
