Pan-cancer analysis reveals complex tumor-specific alternative polyadenylation

Abstract Alternative polyadenylation (APA) of 3’ untranslated regions (3’ UTRs) has been implicated in cancer development. Earlier reports on APA in cancer primarily focused on 3’ UTR length modifications, and the conventional wisdom is that tumor cells preferentially express transcripts with shorter 3’ UTRs. Here, we analyzed the APA patterns of 114 genes, a select list of oncogenes and tumor suppressors, in 9,939 tumor and 729 normal tissue samples across 33 cancer types using RNA-seq data from The Cancer Genome Atlas, and we found that the APA regulation machinery is much more complicated than what was previously thought. We report 77 cases (gene-cancer type pairs) of differential 3’ UTR cleavage patterns between normal and tumor tissues, involving 33 genes in 13 cancer types. For 15 genes, the tumor-specific cleavage patterns are recurrent across multiple cancer types. While the cleavage patterns in certain genes indicate apparent trends of 3’ UTR shortening in tumor samples, over half of the 77 cases imply 3’ UTR length change trends in cancer that are more complex than simple shortening or lengthening. This work extends the current understanding of APA regulation in cancer, and demonstrates how large volumes of RNA-seq data generated for characterizing cancer cohorts can be mined to investigate this process.

Integrative Analysis of Cancer Omics Data for Prognosis Modeling

Genes ◽

10.3390/genes10080604 ◽

2019 ◽

Vol 10 (8) ◽

pp. 604 ◽

Cited By ~ 2

Author(s):

Wang ◽

Wu ◽

Keyword(s):

Integrative Analysis ◽

The Cancer Genome Atlas ◽

Cancer Prognosis ◽

Sufficient Information ◽

Cancer Type ◽

Multiple Cancer ◽

Cancer Genome Atlas ◽

Cancer Types ◽

Information Borrowing ◽

Penalization Technique

Prognosis modeling plays an important role in cancer studies. With the development of omics profiling, extensive research has been conducted to search for prognostic markers for various cancer types. However, many existing studies share a common limitation: they focus on a single cancer type and therefore suffer from a lack of sufficient information. Given the potential molecular similarity across cancer types, one cancer type may contain information useful for the analysis of others. Integrating multiple cancer types may facilitate information borrowing and so describe prognosis more comprehensively and more accurately. In this study, we conduct marginal and joint integrative analysis of multiple cancer types, effectively introducing integration into the discovery process. To accommodate high dimensionality and identify relevant markers, we adopt an advanced penalization technique with a solid statistical grounding. Gene expression data on nine cancer types from The Cancer Genome Atlas (TCGA) are analyzed, leading to biologically sensible findings that differ from those of the alternatives. Overall, this study provides a novel avenue for cancer prognosis modeling by integrating multiple cancer types.

Putatively cancer-specific exon–exon junctions are shared across patients and present in developmental and other non-cancer cells

NAR Cancer ◽

10.1093/narcan/zcaa001 ◽

2020 ◽

Vol 2 (1) ◽

Cited By ~ 1

Author(s):

Julianne K David ◽

Sean K Maden ◽

Benjamin R Weeder ◽

Reid F Thompson ◽

Abhinav Nellore

Keyword(s):

Rna Splicing ◽

Tissue Expression ◽

The Body ◽

The Cancer Genome Atlas ◽

Cancer Type ◽

Rna Seq ◽

Normal Tissues ◽

Cancer Types ◽

Exon Junctions ◽

Cancer Tissues

Abstract This study probes the distribution of putatively cancer-specific junctions across a broad set of publicly available non-cancer human RNA sequencing (RNA-seq) datasets. We compared cancer and non-cancer RNA-seq data from The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression (GTEx) Project and the Sequence Read Archive. We found that (i) averaging across cancer types, 80.6% of exon–exon junctions thought to be cancer-specific based on comparison with tissue-matched samples (σ = 13.0%) are in fact present in other adult non-cancer tissues throughout the body; (ii) 30.8% of junctions not present in any GTEx or TCGA normal tissues are shared by multiple samples within at least one cancer type cohort, and 87.4% of these distinguish between different cancer types; and (iii) many of these junctions not found in GTEx or TCGA normal tissues (15.4% on average, σ = 2.4%) are also found in embryological and other developmentally associated cells. These findings refine the meaning of RNA splicing event novelty, particularly with respect to the human neoepitope repertoire. Ultimately, cancer-specific exon–exon junctions may have a substantial causal relationship with the biology of disease.
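The comparison described in this abstract can be pictured as set arithmetic over junction coordinates. The following is only a minimal sketch under stated assumptions: the function names, the tuple layout for a junction, and the toy data are all hypothetical and are not taken from the study.

```python
def putatively_cancer_specific(cancer_junctions, matched_normal_junctions):
    """Junctions seen in a cancer cohort but not in its tissue-matched normals."""
    return set(cancer_junctions) - set(matched_normal_junctions)


def fraction_found_elsewhere(candidates, other_normal_junctions):
    """Share of candidate junctions that also occur in other non-cancer samples."""
    if not candidates:
        return 0.0
    return len(candidates & set(other_normal_junctions)) / len(candidates)


# Toy junctions written as (chromosome, donor position, acceptor position, strand).
# These coordinates are made up for illustration.
cancer = {
    ("chr1", 100, 200, "+"),
    ("chr1", 300, 400, "+"),
    ("chr2", 50, 90, "-"),
    ("chr3", 10, 70, "+"),
}
matched_normal = {("chr1", 100, 200, "+")}
other_normal = {("chr1", 300, 400, "+"), ("chr2", 50, 90, "-")}

candidates = putatively_cancer_specific(cancer, matched_normal)
share = fraction_found_elsewhere(candidates, other_normal)
```

In this toy example, three junctions survive the tissue-matched filter and two of them reappear in other non-cancer samples, which mirrors the kind of percentage reported in point (i).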

Cancer Type-Dependent Correlations between TP53 Mutations and Antitumor Immunity

10.1101/692715 ◽

2019 ◽

Author(s):

Lin Li ◽

Mengyuan Li ◽

Xiaosheng Wang

Keyword(s):

Antitumor Immunity ◽

Colon Adenocarcinoma ◽

Joint Effect ◽

The Cancer Genome Atlas ◽

Cancer Type ◽

Mutation Burden ◽

Cancer Genome Atlas ◽

Negative Role ◽

Stomach Adenocarcinoma ◽

Cancer Types

Abstract Many studies have shown that TP53 mutations play a negative role in antitumor immunity. However, a few studies reported that TP53 mutations could promote antitumor immunity. To explain these contradictory findings, we analyzed five cancer cohorts from The Cancer Genome Atlas (TCGA) project. We found that TP53-mutated cancers had significantly higher levels of antitumor immune signatures than TP53-wildtype cancers in breast invasive carcinoma (BRCA) and lung adenocarcinoma (LUAD). In contrast, TP53-mutated cancers had significantly lower antitumor immune signature levels than TP53-wildtype cancers in stomach adenocarcinoma (STAD), colon adenocarcinoma (COAD), and head and neck squamous cell carcinoma (HNSC). Moreover, TP53-mutated cancers likely had higher tumor mutation burden (TMB) and tumor aneuploidy level (TAL) than TP53-wildtype cancers. However, the TMB differences were more marked between TP53-mutated and TP53-wildtype cancers than the TAL differences in BRCA and LUAD, and the TAL differences were more significant in STAD and COAD. Furthermore, we showed that TMB and TAL had a positive and a negative correlation with antitumor immunity, respectively, and that TMB affected antitumor immunity more strongly than TAL did in BRCA and LUAD, while TAL affected antitumor immunity more strongly than TMB in STAD and HNSC. These findings indicate that the distinct correlations between TP53 mutations and antitumor immunity in different cancer types are a consequence of the joint effect of the altered TMB and TAL caused by TP53 mutations on tumor immunity. Our data suggest that the TP53 mutation status could be a useful biomarker for cancer immunotherapy response depending on cancer types.

A pan-cancer analysis of prognostic genes

PeerJ ◽

10.7717/peerj.1499 ◽

2016 ◽

Vol 3 ◽

pp. e1499 ◽

Cited By ~ 13

Author(s):

Jordan Anaya ◽

Brian Reon ◽

Wei-Min Chen ◽

Stefan Bekiranov ◽

Anindya Dutta

Keyword(s):

Microarray Data ◽

The Cancer Genome Atlas ◽

Rna Seq ◽

Cancer Pathogenesis ◽

Gene Sets ◽

Protective Genes ◽

Cancer Genome Atlas ◽

Cancer Types ◽

Measure Of Association ◽

Pan Cancer

Numerous studies have identified prognostic genes in individual cancers, but a thorough pan-cancer analysis has not been performed. In addition, previous studies have mostly used microarray data instead of RNA-seq, and have not published comprehensive lists of associations with survival. Using recently available RNA-seq and clinical data from The Cancer Genome Atlas for 6,495 patients, we have investigated every annotated and expressed gene’s association with survival across 16 cancer types. The most statistically significant harmful and protective genes were not shared across cancers, but were enriched in distinct gene sets which were shared across certain groups of cancers. These groups of cancers were independently recapitulated by unsupervised clustering of Cox coefficients (a measure of association with survival), both for individual genes and for gene programs. This analysis has revealed unappreciated commonalities among cancers which may provide insights into cancer pathogenesis and rationales for co-opting treatments between cancers.
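To make "Cox coefficient" concrete, here is a rough sketch of how a single-gene coefficient could be estimated. It assumes one covariate, no tied event times, and a plain bisection search on the partial-likelihood score; the authors' actual software and settings are not stated in the abstract, and the function names and toy data below are hypothetical.

```python
import math


def cox_score(beta, times, events, x):
    """Derivative of the Cox partial log-likelihood for one covariate (no tied times)."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored patients contribute only through the risk sets
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(beta * x[j]) for j in risk]
        weighted_mean = sum(w * x[j] for w, j in zip(weights, risk)) / sum(weights)
        total += x[i] - weighted_mean
    return total


def cox_coefficient(times, events, x, lo=-10.0, hi=10.0, iterations=60):
    """Locate the root of the score by bisection; the score decreases as beta grows."""
    if cox_score(lo, times, events, x) <= 0 or cox_score(hi, times, events, x) >= 0:
        raise ValueError("no finite estimate inside the search interval")
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if cox_score(mid, times, events, x) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Made-up data: follow-up time, event flag (1 = death, 0 = censored), expression level.
times = [1, 2, 3, 4, 5, 6, 7]
events = [1, 1, 1, 1, 1, 1, 0]
expression = [3.0, 1.0, 2.5, 2.0, 0.5, 1.5, 1.2]

beta = cox_coefficient(times, events, expression)
```

A positive coefficient means higher expression goes with earlier events (a "harmful" gene in the abstract's terms), and a negative one suggests a "protective" gene. Collecting one coefficient per gene and cancer type would give the matrix that a clustering step could then group.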

Pan-cancer driver copy number alterations identified by joint expression/CNA data analysis

Scientific Reports ◽

10.1038/s41598-020-74276-6 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Gaojianyong Wang ◽

Dimitris Anastassiou

Keyword(s):

Gene Expression ◽

Copy Number ◽

The Cancer Genome Atlas ◽

Copy Number Alterations ◽

Multiple Cancer ◽

Cancer Driver ◽

Large Gene ◽

Cancer Genome Atlas ◽

Cancer Types ◽

Pan Cancer

Abstract Analysis of large gene expression datasets from biopsies of cancer patients can identify co-expression signatures representing particular biomolecular events in cancer. Some of these signatures involve genomically co-localized genes resulting from the presence of copy number alterations (CNAs), for which analysis of the expression of the underlying genes provides valuable information about their combined role as oncogenes or tumor suppressor genes. Here we focus on the discovery and interpretation of such signatures that are present in multiple cancer types due to driver amplifications and deletions in particular regions of the genome after doing a comprehensive analysis combining both gene expression and CNA data from The Cancer Genome Atlas.

RNA-seq Reveals the Overexpression of IGSF9 in Endometrial Cancer

Journal of Oncology ◽

10.1155/2018/2439527 ◽

2018 ◽

Vol 2018 ◽

pp. 1-13 ◽

Cited By ~ 3

Author(s):

Zonggao Shi ◽

Chunyan Li ◽

Laura Tarwater ◽

Jun Li ◽

Yang Li ◽

...

Keyword(s):

Endometrial Cancer ◽

The Cancer Genome Atlas ◽

Rna Seq ◽

Illumina Platform ◽

Tissue Samples ◽

Significance Level ◽

Glandular Cells ◽

Cancer Genome Atlas ◽

Classification Tool ◽

Gene Functional Classification

We performed RNA-seq on an Illumina platform for 7 patients with endometrioid endometrial carcinoma for which both tumor tissue and adjacent noncancer tissue were available. A total of 66 genes were differentially expressed with significance level at adjusted p value < 0.01. Using the gene functional classification tool in the NIH DAVID bioinformatics resource, 5 genes were found to be the only enriched group out of that list of genes. The gene IGSF9 was chosen for further characterization with immunohistochemical staining of a larger cohort of human endometrioid carcinoma tissues. The expression level of IGSF9 in cancer cells was significantly higher than that in control glandular cells in paired tissue samples from the same patients (p=0.008) or in overall comparison between cancer and the control (p=0.003). IGSF9 expression is higher in patients with myometrium invasion relative to those without invasion (p=0.015). Reanalysis of RNA-seq dataset from The Cancer Genome Atlas shows higher expression of IGSF9 in endometrial cancer versus normal control and expression was associated with poor prognosis. These results suggest IGSF9 as a new biomarker in endometrial cancer and warrant further studies on its function, mechanism of action, and potential clinical utility.
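The abstract reports genes passing an adjusted p value threshold of 0.01 but does not say which adjustment was used. As an illustration only, the sketch below assumes the Benjamini-Hochberg procedure, one common choice; the function names and gene labels are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down so each adjusted value is monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(p_by_gene, alpha=0.01):
    """Genes whose adjusted p value falls below alpha."""
    genes = list(p_by_gene)
    adjusted = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q < alpha]


# Hypothetical raw p values from a tumor versus adjacent-tissue comparison.
raw = {"GENE_A": 0.0001, "GENE_B": 0.5, "GENE_C": 0.004}
hits = significant_genes(raw, alpha=0.01)
```

With only seven paired samples, the choice of adjustment method can noticeably change how many genes pass the threshold, which is one reason to state it explicitly.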

Identifying cancer type specific oncogenes and tumor suppressors using limited size data

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720016500311 ◽

2016 ◽

Vol 14 (06) ◽

pp. 1650031 ◽

Cited By ~ 4

Author(s):

Ana B. Pavel ◽

Cristian I. Vasile

Keyword(s):

Tumor Suppressors ◽

Molecular Mechanisms ◽

Lung Squamous Cell Carcinoma ◽

The Cancer Genome Atlas ◽

Driver Mutations ◽

Cancer Type ◽

Multiple Cancer ◽

Driver Genes ◽

Cancer Subtypes ◽

Cancer Types

Cancer is a complex and heterogeneous genetic disease. Different mutations and dysregulated molecular mechanisms alter the pathways that lead to cell proliferation. In this paper, we explore a method which classifies genes into oncogenes (ONGs) and tumor suppressors. We optimize this method to identify specific ONGs and tumor suppressors for breast cancer, lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC) and colon adenocarcinoma (COAD), using data from The Cancer Genome Atlas (TCGA). A set of genes was previously classified as ONGs and tumor suppressors across multiple cancer types (Science 2013). Each gene was assigned an ONG score and a tumor suppressor score based on the frequency of its driver mutations across all variants from the Catalogue of Somatic Mutations in Cancer (COSMIC). We evaluate and optimize this approach within different cancer types from TCGA. We are able to determine known driver genes for each of the four cancer types. After establishing the baseline parameters for each cancer type, we identify new driver genes for each cancer type, and the molecular pathways that are highly affected by them. Our methodology is general and can be applied to different cancer subtypes to identify specific driver genes and improve personalized therapy.
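One simple way to turn mutation frequencies into an ONG score and a tumor suppressor score is sketched below. The exact score definitions and thresholds used in the paper are not given in the abstract, so this is an assumption-laden illustration: the ONG score is taken as the share of mutations that are missense at a recurrently hit position, the tumor suppressor score as the share of inactivating mutations, and the 0.2 threshold is an illustrative default. All names and data are hypothetical.

```python
from collections import Counter

# Mutation kinds treated as inactivating in this sketch (an assumption).
INACTIVATING = {"nonsense", "frameshift", "splice_site"}


def mutation_scores(mutations):
    """Return (ong_score, tsg_score) for a list of (position, kind) mutations."""
    total = len(mutations)
    if total == 0:
        return 0.0, 0.0
    missense_sites = Counter(pos for pos, kind in mutations if kind == "missense")
    # Count missense mutations that fall on a position hit at least twice.
    recurrent = sum(count for count in missense_sites.values() if count >= 2)
    inactivating = sum(1 for _, kind in mutations if kind in INACTIVATING)
    return recurrent / total, inactivating / total


def classify(mutations, threshold=0.2):
    """Label a gene from its two scores; the threshold is a tunable parameter."""
    ong, tsg = mutation_scores(mutations)
    if ong > threshold and ong >= tsg:
        return "oncogene"
    if tsg > threshold:
        return "tumor suppressor"
    return "unclassified"


# Toy genes: one with a mutation hotspot, one with scattered truncating mutations.
hotspot_gene = [(600, "missense")] * 8 + [(12, "silent"), (45, "nonsense")]
truncated_gene = [
    (10, "nonsense"),
    (55, "frameshift"),
    (90, "frameshift"),
    (130, "missense"),
    (200, "silent"),
]
```

Here `classify(hotspot_gene)` labels the first gene an oncogene and `classify(truncated_gene)` labels the second a tumor suppressor. Optimizing per cancer type, as the abstract describes, would amount to tuning `threshold` (and possibly the score definitions) on each cohort separately.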

Identification and Characterization of MicroRNAs Associated with Somatic Copy Number Alterations in Cancer

Cancers ◽

10.3390/cancers10120475 ◽

2018 ◽

Vol 10 (12) ◽

pp. 475 ◽

Cited By ~ 1

Author(s):

Jihee Soh ◽

Hyejin Cho ◽

Chan-Hun Choi ◽

Hyunju Lee

Keyword(s):

Copy Number ◽

The Cancer Genome Atlas ◽

Cancer Development ◽

Cancer Type ◽

Biological Processes ◽

Copy Number Alterations ◽

Coding Regions ◽

Cancer Genome Atlas ◽

Cancer Types ◽

Somatic Copy Number Alterations

MicroRNAs (miRNAs) are key molecules that regulate biological processes such as cell proliferation, differentiation, and apoptosis in cancer. Somatic copy number alterations (SCNAs) are common genetic mutations that play essential roles in cancer development. Here, we investigated the association between miRNAs and SCNAs in cancer. We collected 2538 tumor samples for seven cancer types from The Cancer Genome Atlas. We found that 32−84% of miRNAs are in SCNA regions, with the rate depending on the cancer type. In these regions, we identified 80 SCNA-miRNAs whose expression was mainly associated with SCNAs in at least one cancer type and showed that these SCNA-miRNAs are related to cancer by survival analysis and literature searching. We also identified 58 SCNA-miRNAs common in the seven cancer types (CC-SCNA-miRNAs) and showed that these CC-SCNA-miRNAs are more likely to be related with protein and gene expression than other miRNAs. Furthermore, we experimentally validated the oncogenic role of miR-589. In conclusion, our results suggest that SCNA-miRNAs significantly alter biological processes related to cancer development, confirming the importance of SCNAs in non-coding regions in cancer.
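The "share of miRNAs in SCNA regions" figure comes down to an interval-overlap check between miRNA loci and altered segments. The sketch below assumes closed genomic intervals and uses hypothetical names and coordinates; it is not the authors' pipeline.

```python
def in_scna_region(locus, segments):
    """True if a (chrom, start, end) locus overlaps any altered segment."""
    chrom, start, end = locus
    return any(
        chrom == seg_chrom and start <= seg_end and seg_start <= end
        for seg_chrom, seg_start, seg_end in segments
    )


def fraction_in_scna(mirna_loci, segments):
    """Share of miRNAs whose locus overlaps at least one SCNA segment."""
    if not mirna_loci:
        return 0.0
    hits = sum(1 for locus in mirna_loci.values() if in_scna_region(locus, segments))
    return hits / len(mirna_loci)


# Made-up miRNA loci and SCNA segments for one cancer type.
mirnas = {
    "miR-a": ("chr1", 100, 180),
    "miR-b": ("chr1", 5000, 5080),
    "miR-c": ("chr7", 900, 980),
    "miR-d": ("chr2", 10, 90),
}
scna_segments = [("chr1", 50, 1000), ("chr7", 950, 2000)]

share_in_scna = fraction_in_scna(mirnas, scna_segments)
```

Running the same calculation per cancer type, each with its own segment list, would produce a range of percentages like the one quoted in the abstract.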

GIANT: an online resource for comprehensive survival analysis in pan-cancer from The Cancer Genome Atlas (Preprint)

10.2196/preprints.10339 ◽

2018 ◽

Author(s):

Myoung-Eun Han ◽

Tae Sik Goh ◽

Dae Cheon Jeong ◽

Chi-Seung Lee ◽

Ji-Young Kim ◽

...

Keyword(s):

Survival Analysis ◽

Single Gene ◽

The Cancer Genome Atlas ◽

Cancer Type ◽

Rna Seq ◽

Web Based ◽

Online Resource ◽

Cancer Genome Atlas ◽

Or Gene ◽

Pan Cancer

BACKGROUND Prognostic genes or gene signatures have been widely used to predict patients’ survival and to guide the choice of therapeutic options. Although a few web-based survival analysis tools for identifying them have been developed, they provide only limited information. OBJECTIVE To overcome the limitations of previous web-based tools and provide comprehensive survival analysis, we developed GIANT, an online resource for identifying prognostic biomarkers in pan-cancer from The Cancer Genome Atlas (TCGA). METHODS We used R to code survival analysis based on RNA-seq data from TCGA (n=10,320). To perform survival analyses, we excluded patients and genes with insufficient information (survival status, tumor stage, age, gender, cancer type, blast count, and histologic grade). GIANT applies appropriate cross-validation and survival analysis methods to provide three analysis services (survival analysis by single gene, cancer type, and variable signature). RESULTS GIANT can perform comprehensive survival analysis to identify prognostic genes or gene signatures while reflecting tumor heterogeneity. Using RNA-seq, clinical data, and pathway databases in combination, it provides gene/variable signatures by grouped variable selection methods (least absolute shrinkage and selection operator, Elastic Net regularization, Network-Regularized high-dimensional Cox-regression) that have better discriminatory power than a single gene. Users can also find the prognostic value of a gene and the statistically significant genes in a specific cancer. All results are presented as Kaplan-Meier curves with median/optimal cutoff value, C-index, and area under the curve (AUC) value at t-years. Moreover, users can easily obtain results in the form of graphs and tables. CONCLUSIONS GIANT makes it possible to easily perform integrated survival analysis while overcoming the limitations of previous online tools. It will help scientists who are less familiar with computational tools to easily perform comprehensive survival analysis.
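For readers unfamiliar with the Kaplan-Meier curves and median cutoff mentioned above, the following is a minimal sketch of both ideas. GIANT itself is described as being written in R; this Python version is only an illustration, and the function names and toy data are hypothetical.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event."""
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if u == t and d)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


def split_by_median(values):
    """Label each patient 'high' or 'low' relative to the median expression."""
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]


# Toy cohort: follow-up time and event flag (1 = death, 0 = censored).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
groups = split_by_median([5.0, 1.0, 3.0, 9.0])
```

A single-gene analysis would typically compute one curve per expression group and then compare the two, for example with a log-rank test, which is not shown here.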

Chromoanagenesis Landscape in 10,000 TCGA Patients

Cancers ◽

10.3390/cancers13164197 ◽

2021 ◽

Vol 13 (16) ◽

pp. 4197

Author(s):

Roni Rasnic ◽

Michal Linial

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Learning Algorithm ◽

The Cancer Genome Atlas ◽

Cancer Type ◽

Whole Genome ◽

Cancer Genome Atlas ◽

Cancer Types ◽

Tumor Biopsies ◽

Pan Cancer

During the past decade, whole-genome sequencing of tumor biopsies and individuals with congenital disorders highlighted the phenomenon of chromoanagenesis, a single chaotic event of chromosomal rearrangement. Chromoanagenesis was shown to be frequent in many types of cancers, to occur in early stages of cancer development, and to significantly impact the tumor’s nature. However, an in-depth, cancer-type dependent analysis has been somewhat incomplete due to the shortage of whole-genome sequencing of cancerous samples. In this study, we extracted data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) and The Cancer Genome Atlas (TCGA) to construct and test a machine learning algorithm that can detect chromoanagenesis with high accuracy (86%). The algorithm was applied to ~10,000 unlabeled TCGA cancer patients. We utilize the chromoanagenesis assignment results to analyze cancer-type specific chromoanagenesis characteristics in 20 TCGA cancer types. Our results unveil prominent genes affected in either chromoanagenesis or non-chromoanagenesis tumorigenesis. The analysis reveals a mutual exclusivity relationship between the genes impaired in chromoanagenesis versus non-chromoanagenesis cases. We offer the discovered characteristics as possible targets for cancer diagnostic and therapeutic purposes.
