Cancer subtype identification using somatic mutation data

2017 ◽  
Author(s):  
Marieke L. Kuijjer ◽  
Joseph N. Paulson ◽  
Peter Salzman ◽  
Wei Ding ◽  
John Quackenbush

BACKGROUND: With the onset of next-generation sequencing technologies, we have made great progress in identifying recurrent mutational drivers of cancer. As cancer tissues are now frequently screened for specific sets of mutations, a large number of samples have become available for analysis. Classification of patients with similar mutation profiles may help identify subgroups of patients who might benefit from specific types of treatment. However, classification based on somatic mutations is challenging due to the sparseness and heterogeneity of the data. METHODS: Here, we describe a new method to de-sparsify somatic mutation data using biological pathways. We applied this method to 23 cancer types from The Cancer Genome Atlas, including samples from 5,805 primary tumors. RESULTS: We show that, for most cancer types, de-sparsified mutation data associate with phenotypic data. We identify poor prognostic subtypes in three cancer types, which are associated with mutations in signal transduction pathways for which targeted treatment options are available. We identify subtype-drug associations for 14 additional subtypes. Finally, we perform a pan-cancer subtyping analysis and identify nine pan-cancer subtypes, which associate with mutations in four overarching sets of biological pathways. CONCLUSIONS: This study is an important step towards understanding mutational patterns in cancer.
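
The abstract above does not spell out how the pathway-based de-sparsification is computed. As a rough illustration of the general idea only, the sketch below assumes that each sample's sparse set of mutated genes is summarized as the fraction of genes mutated in each pathway. The function name `pathway_scores`, the pathway names, and the toy gene sets are all hypothetical and are not taken from the study.

```python
from typing import Dict, Set


def pathway_scores(mutated_genes: Set[str],
                   pathways: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, for one sample, the fraction of each pathway's genes that are mutated.

    This is an assumed scoring rule chosen for illustration; the published
    method may weight or normalize mutations differently.
    """
    scores = {}
    for name, genes in pathways.items():
        if not genes:
            # An empty pathway carries no information; score it as zero.
            scores[name] = 0.0
            continue
        scores[name] = len(mutated_genes & genes) / len(genes)
    return scores


# Toy, made-up pathway definitions and per-sample mutation calls.
pathways = {
    "RTK_signaling": {"EGFR", "KRAS", "BRAF", "PIK3CA"},
    "cell_cycle": {"TP53", "RB1", "CDKN2A"},
}
samples = {
    "s1": {"KRAS", "TP53"},
    "s2": {"BRAF"},
    "s3": {"EGFR", "PIK3CA", "RB1"},
}

# Samples s1 and s2 share no mutated gene, yet both receive a nonzero
# RTK_signaling score, which is what makes them comparable after aggregation.
scores = {sample_id: pathway_scores(genes, pathways)
          for sample_id, genes in samples.items()}
```

In this toy setting the gene-level profiles of `s1` and `s2` do not overlap, while their pathway-level profiles do, which is the kind of densification that makes clustering on mutation data more tractable.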

Cells ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 45
Author(s):  
Darío Rocha ◽  
Iris A. García ◽  
Aldana González Montoro ◽  
Andrea Llera ◽  
Laura Prato ◽  
...  

Studying tissue-independent components of cancer and defining pan-cancer subtypes could be addressed using tissue-specific molecular signatures if classification errors are controlled. Since PAM50 is a well-known, United States Food and Drug Administration (FDA)-approved and commercially available breast cancer signature, we applied it with uncertainty assessment to classify tumor samples from over 33 cancer types, discarded unassigned samples, and studied the emerging tumor-agnostic molecular patterns. The percentage of unassigned samples ranged between 55.5% and 86.9% in non-breast tissues, and gene set analysis suggested that the remaining samples could be grouped into two classes (named C1 and C2) regardless of the tissue. The C2 class was more dedifferentiated, more proliferative, with higher centrosome amplification, and potentially more TP53 and RB1 mutations. We identified 28 gene sets and 95 genes mainly associated with cell-cycle progression, cell-cycle checkpoints, and DNA damage that were consistently exacerbated in the C2 class. In some cancer types, the C1/C2 classification was associated with survival and drug sensitivity, and modulated the prognostic meaning of the immune infiltrate. Our results suggest that PAM50 could be repurposed for a pan-cancer context when paired with uncertainty assessment, resulting in two classes with molecular, biological, and clinical implications.


2021 ◽  
pp. 1-10
Author(s):  
Zoe Guan ◽  
Ronglai Shen ◽  
Colin B. Begg

Background: Many cancer types show considerable heritability, and extensive research has been done to identify germline susceptibility variants. Linkage studies have discovered many rare high-risk variants, and genome-wide association studies (GWAS) have discovered many common low-risk variants. However, it is believed that a considerable proportion of the heritability of cancer remains unexplained by known susceptibility variants. The “rare variant hypothesis” proposes that much of the missing heritability lies in rare variants that cannot reliably be detected by linkage analysis or GWAS. Until recently, high sequencing costs have precluded extensive surveys of rare variants, but technological advances have now made it possible to analyze rare variants on a much greater scale. Objectives: In this study, we investigated associations between rare variants and 14 cancer types. Methods: We ran association tests using whole-exome sequencing data from The Cancer Genome Atlas (TCGA) and validated the findings using data from the Pan-Cancer Analysis of Whole Genomes Consortium (PCAWG). Results: We identified four significant associations in TCGA, only one of which was replicated in PCAWG (BRCA1 and ovarian cancer). Conclusions: Our results provide little evidence in favor of the rare variant hypothesis. Much larger sample sizes may be needed to detect undiscovered rare cancer variants.


2018 ◽  
Vol 19 (10) ◽  
pp. 3250 ◽  
Author(s):  
Anna Sorrentino ◽  
Antonio Federico ◽  
Monica Rienzo ◽  
Patrizia Gazzerro ◽  
Maurizio Bifulco ◽  
...  

The PR/SET domain gene family (PRDM) encodes 19 different transcription factors that share a subtype of the SET domain [Su(var)3-9, enhancer-of-zeste and trithorax] known as the PRDF1-RIZ (PR) homology domain. This domain, with its potential methyltransferase activity, is followed by a variable number of zinc-finger motifs, which likely mediate protein–protein, protein–RNA, or protein–DNA interactions. Intriguingly, almost all PRDM family members express different isoforms, which likely play opposite roles in oncogenesis. Remarkably, several studies have described alterations in most of the family members in malignancies. Here, to obtain a pan-cancer overview of the genomic and transcriptomic alterations of PRDM genes, we reanalyzed the Exome- and RNA-Seq public datasets available at The Cancer Genome Atlas portal. Overall, PRDM2, PRDM3/MECOM, PRDM9, PRDM16 and ZFPM2/FOG2 were the most mutated genes, with pan-cancer frequencies of protein-affecting mutations higher than 1%. Moreover, we observed heterogeneity in the mutation frequencies of these genes across tumors, with some cancer types reaching about 20% of samples mutated for a specific PRDM gene. Of note, ZFPM1/FOG1 mutations occurred in 50% of adrenocortical carcinoma patients and were localized in a hotspot region. These findings, together with OncodriveCLUST results, suggest that ZFPM1/FOG1 could be considered a putative cancer driver gene in this malignancy. Finally, transcriptome analysis from RNA-Seq data of paired samples revealed that transcription of PRDMs was significantly altered in several tumors. Specifically, PRDM12 and PRDM13 were largely overexpressed in many cancers whereas PRDM16 and ZFPM2/FOG2 were often downregulated. Some of these findings were also confirmed by real-time PCR on primary tumors.


2021 ◽  
Vol 12 ◽  
Author(s):  
Hua Zhu ◽  
Xinyao Hu ◽  
Yingze Ye ◽  
Zhihong Jian ◽  
Yi Zhong ◽  
...  

Phosphatidylinositol binding clathrin assembly protein interacting mitotic regulator (PIMREG) localizes to the nucleus and can significantly elevate the nuclear localization of clathrin assembly lymphomedullary leukocythemia gene. Although there is some evidence to support an important role for PIMREG in the occurrence and development of certain cancers, no pan-cancer analysis of PIMREG is currently available. Therefore, we aimed to estimate the prognostic predictive value of PIMREG and to explore its potential immune function in 33 cancer types. Using a series of bioinformatics approaches, we extracted and analyzed datasets from Oncomine, The Cancer Genome Atlas, the Cancer Cell Line Encyclopedia (CCLE) and the Human Protein Atlas (HPA) to explore the underlying carcinogenesis of PIMREG, including the relevance of PIMREG to prognosis, microsatellite instability (MSI), tumor mutation burden (TMB), tumor microenvironment (TME) and infiltration of immune cells in various types of cancer. Our findings indicate that PIMREG is highly expressed in at least 24 types of cancer, and is negatively correlated with prognosis in major cancer types. In addition, PIMREG expression was correlated with TMB in 24 cancers and with MSI in 10 cancers. We revealed that PIMREG is co-expressed with genes encoding major histocompatibility complex, immune activation, immune suppression, chemokine and chemokine receptor proteins. We also found that PIMREG plays different roles in the infiltration of different immune cell types in different tumors. PIMREG can potentially influence the etiology or pathogenesis of cancer by acting on immune-related pathways, including the chemokine signaling pathway, regulation of autophagy, RIG-I like receptor signaling pathway, antigen processing and presentation, FC epsilon RI pathway, complement and coagulation cascades, T cell receptor pathway and NK cell mediated cytotoxicity. Our study suggests that PIMREG can be applied as a prognostic marker in a variety of malignancies because of its role in tumorigenesis and immune infiltration.


2020 ◽  
Vol 21 (17) ◽  
pp. 6087
Author(s):  
Yunzhen Wei ◽  
Limeng Zhou ◽  
Yingzhang Huang ◽  
Dianjing Guo

Long noncoding RNA (lncRNA)/microRNA (miRNA)/mRNA triplets contribute to cancer biology. However, identifying significant triplets remains a major challenge for cancer research, and the dynamic changes among the factors of the triplets are poorly understood. Here, by integrating target information and expression datasets, we propose a novel computational framework to identify triplets that we term “lncRNA-perturbated triplets”. We applied the framework to five cancer datasets in The Cancer Genome Atlas (TCGA) project and identified 109 triplets. We showed that the paired miRNAs and mRNAs were widely perturbated by lncRNAs in different cancer types. LncRNA perturbators and lncRNA-perturbated mRNAs showed significantly higher evolutionary conservation than other lncRNAs and mRNAs. Importantly, the lncRNA-perturbated triplets exhibited high cancer specificity. The pan-cancer perturbator OIP5-AS1 had a higher expression level than the cancer-specific perturbators. These lncRNA perturbators were significantly enriched in known cancer-related pathways. Furthermore, among the 25 lncRNAs in the 109 triplets, lncRNA SNHG7 was identified as a stable potential biomarker in lung adenocarcinoma (LUAD) by combining the TCGA dataset and two independent GEO datasets. Results from cell transfection also indicated that overexpression of the lncRNAs SNHG7 and TUG1 enhanced the expression of the corresponding mRNAs PNMA2 and CDC7 in LUAD. Our study provides a systematic dissection of lncRNA-perturbated triplets and facilitates our understanding of the molecular roles of lncRNAs in cancers.


2019 ◽  
Vol 2019 ◽  
pp. 1-17 ◽  
Author(s):  
Yahui Shi ◽  
Jinfen Wei ◽  
Zixi Chen ◽  
Yuchen Yuan ◽  
Xingsong Li ◽  
...  

Background. Cancer cells undergo extensive metabolic rewiring and dysfunction of epigenetic modification to support their biosynthetic needs. Although the major features of metabolic reprogramming have been elucidated, the metabolic genes linking metabolism to epigenetics have been overlooked at the pan-cancer level. Objectives. Identifying the critical differentially expressed metabolic signatures that contribute to epigenetic alterations across cancer types is an urgent issue for providing potential targets for cancer therapy. Method. Differential gene expression and DNA methylation were analyzed using data from 5726 samples from The Cancer Genome Atlas (TCGA). Results. First, we analyzed the differential expression of metabolic genes and found that cancers underwent overall metabolic reprogramming, with an expression trend similar to that in data from the Gene Expression Omnibus (GEO) database. Second, we summarized the regulatory network of histone acetylation and DNA methylation according to the altered expression of metabolic genes. Then, survival analysis showed that high expression of DNMT3B was associated with poorer overall survival in 5 cancer types. Integrating altered methylation and expression revealed specific genes influenced by DNMT3B through DNA methylation across cancers. These genes do not overlap across cancer types and are involved in different functional annotations depending on the tissue, which indicates that DNMT3B might influence DNA methylation in a tissue-specific manner. Conclusions. Our research identifies key metabolic genes, ACLY, SLC2A1, KAT2A, and DNMT3B, which are the most dysregulated and indirectly contribute to the dysfunction of histone acetylation and DNA methylation in cancer. We also found potential genes in different cancer types influenced by DNMT3B. Our study highlights possible epigenetic disorders resulting from the deregulation of metabolic genes in pan-cancer and suggests potential therapeutic targets for the clinical treatment of human cancer.


2016 ◽  
Author(s):  
Nao Hiranuma ◽  
Jie Liu ◽  
Chaozhong Song ◽  
Jacob Goldsmith ◽  
Michael Dorschner ◽  
...  

About 16% of breast cancers fall into a clinically aggressive category designated triple negative (TNBC) due to a lack of ERBB2, estrogen receptor and progesterone receptor expression [1-3]. The mutational spectrum of TNBC has been characterized as part of The Cancer Genome Atlas (TCGA) [4]; however, snapshots of primary tumors cannot reveal the mechanisms by which TNBCs progress and spread. To address this limitation we initiated the Intensive Trial of OMics in Cancer (ITOMIC)-001, in which patients with metastatic TNBC undergo multiple biopsies over space and time [5]. Whole exome sequencing (WES) of 67 samples from 11 patients identified 426 genes containing multiple distinct single nucleotide variants (SNVs) within the same sample, instances we term Multiple SNVs affecting the Same Gene and Sample (MSSGS). We find that >90% of MSSGS result from cis-compound mutations (in which both SNVs affect the same allele), that MSSGS composed of SNVs affecting adjacent nucleotides arise from single mutational events, and that most other MSSGS result from the sequential acquisition of SNVs. Some MSSGS drive cancer progression, as exemplified by a TNBC driven by FGFR2(S252W;Y375C). MSSGS are more prevalent in TNBC than other breast cancer subtypes and occur at higher-than-expected frequencies across TNBC samples within TCGA. MSSGS may denote genes that play as yet unrecognized roles in cancer progression.
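
The MSSGS definition above (two or more distinct SNVs in the same gene within the same sample) can be illustrated with a small grouping routine. This is a minimal sketch under stated assumptions, not the authors' pipeline: the record layout `(sample, gene, chrom, pos, alt)`, the function name `find_mssgs`, and the toy calls with placeholder gene names and coordinates are all hypothetical.

```python
from collections import defaultdict


def find_mssgs(snv_calls):
    """Group SNV calls by (sample, gene) and keep groups with >= 2 distinct SNVs.

    snv_calls: iterable of (sample, gene, chrom, pos, alt) tuples (assumed layout).
    Returns a dict mapping (sample, gene) to a sorted list of distinct variants.
    """
    variants = defaultdict(set)
    for sample, gene, chrom, pos, alt in snv_calls:
        # Using a set means a variant reported twice is counted once.
        variants[(sample, gene)].add((chrom, pos, alt))
    return {key: sorted(vs) for key, vs in variants.items() if len(vs) >= 2}


# Made-up calls: placeholder gene names and coordinates, not real data.
calls = [
    ("P1_biopsy1", "GENE_A", "chr1", 100, "T"),
    ("P1_biopsy1", "GENE_A", "chr1", 101, "G"),
    ("P1_biopsy1", "GENE_B", "chr2", 500, "A"),
    ("P1_biopsy2", "GENE_A", "chr1", 100, "T"),
    ("P1_biopsy1", "GENE_A", "chr1", 100, "T"),  # duplicate record
]

mssgs = find_mssgs(calls)
```

Here only `GENE_A` in `P1_biopsy1` qualifies, because the second biopsy carries a single variant in that gene. Deciding whether the two SNVs sit on the same allele (cis) would need read-level phasing, which this sketch does not attempt.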


2021 ◽  
Vol 12 ◽  
Author(s):  
Xin Cheng ◽  
Xiaowei Wang ◽  
Kechao Nie ◽  
Lin Cheng ◽  
Zheyu Zhang ◽  
...  

Triggering receptor expressed on myeloid cells-2 (TREM2) is a transmembrane receptor of the immunoglobulin superfamily and a crucial signaling hub for multiple pathological pathways that mediate immunity. Although increasing evidence supports a vital role for TREM2 in tumorigenesis of some cancers, no systematic pan-cancer analysis of TREM2 is available. Thus, we aimed to explore the prognostic value, and investigate the potential immunological functions, of TREM2 across 33 cancer types. Based on datasets from The Cancer Genome Atlas, the Cancer Cell Line Encyclopedia, Genotype-Tissue Expression, cBioPortal, and the Human Protein Atlas, we employed an array of bioinformatics methods to explore the potential oncogenic roles of TREM2, including analyzing the relationship between TREM2 and prognosis, tumor mutational burden (TMB), microsatellite instability (MSI), DNA methylation, and immune cell infiltration of different tumors. The results show that TREM2 is highly expressed in most cancers, but present at low levels in lung cancer. Further, TREM2 is positively or negatively associated with prognosis in different cancers. Additionally, TREM2 expression was associated with TMB and MSI in 12 cancer types, while in 20 types of cancer, there was a correlation between TREM2 expression and DNA methylation. Six tumors, including breast invasive carcinoma, cervical squamous cell carcinoma and endocervical adenocarcinoma, kidney renal clear cell carcinoma, lung squamous cell carcinoma, skin cutaneous melanoma, and stomach adenocarcinoma, were selected for further study, which demonstrated that TREM2 gene expression was negatively correlated with infiltration levels of most immune cells, but positively correlated with infiltration levels of M1 and M2 macrophages. Moreover, correlation with TREM2 expression differed according to T cell subtype. Our study reveals that TREM2 can function as a prognostic marker in various malignant tumors because of its role in tumorigenesis and tumor immunity.


2017 ◽  
Author(s):  
Donald Eric Freeman ◽  
Gillian Lee Hsieh ◽  
Jonathan Michael Howard ◽  
Erik Lehnert ◽  
Julia Salzman

Short Abstract: The extent to which gene fusions function as drivers of cancer remains a critical open question in cancer biology. In principle, transcriptome sequencing provided by The Cancer Genome Atlas (TCGA) enables unbiased discovery of gene fusions and post-analysis that informs the answer to this question. To date, such an analysis has been impossible because of performance limitations in fusion detection algorithms. By engineering a new, more precise, algorithm and statistical approaches to post-analysis of fusions called in TCGA data, we report new recurrent gene fusions, including those that could be druggable; new candidate pan-cancer oncogenes based on their profiles in fusions; and prevalent, previously overlooked, candidate oncogenic gene fusions in ovarian cancer, a disease with minimal treatment advances in recent decades. The novel and reproducible statistical algorithms and, more importantly, the biological conclusions open the door for increased attention to gene fusions as drivers of cancer and for future research into using fusions for targeted therapy.


PeerJ ◽  
2016 ◽  
Vol 3 ◽  
pp. e1499 ◽  
Author(s):  
Jordan Anaya ◽  
Brian Reon ◽  
Wei-Min Chen ◽  
Stefan Bekiranov ◽  
Anindya Dutta

Numerous studies have identified prognostic genes in individual cancers, but a thorough pan-cancer analysis has not been performed. In addition, previous studies have mostly used microarray data instead of RNA-SEQ, and have not published comprehensive lists of associations with survival. Using recently available RNA-SEQ and clinical data from The Cancer Genome Atlas for 6,495 patients, we have investigated every annotated and expressed gene’s association with survival across 16 cancer types. The most statistically significant harmful and protective genes were not shared across cancers, but were enriched in distinct gene sets which were shared across certain groups of cancers. These groups of cancers were independently recapitulated by both unsupervised clustering of Cox coefficients (a measure of association with survival) for individual genes, and for gene programs. This analysis has revealed unappreciated commonalities among cancers which may provide insights into cancer pathogenesis and rationales for co-opting treatments between cancers.

