GIANT: an online resource for comprehensive survival analysis in pan-cancer from The Cancer Genome Atlas (Preprint)

BACKGROUND Prognostic genes and gene signatures are widely used to predict patient survival and to guide the choice of therapy. Although a few web-based survival analysis tools have been developed to identify them, they provide only limited information. OBJECTIVE To overcome the limitations of previous web-based tools and provide comprehensive survival analysis, we developed GIANT, an online resource for identifying prognostic biomarkers in pan-cancer data from The Cancer Genome Atlas (TCGA). METHODS We implemented the survival analyses in R, based on RNA-seq data from TCGA (n=10,320). Before analysis, we excluded patients and genes with insufficient information (survival status, tumor stage, age, gender, cancer type, blast count, and histologic grade). GIANT applies cross-validation and survival analysis methods to provide three analysis services (survival analysis by single gene, by cancer type, and by variable signature). RESULTS GIANT performs comprehensive survival analysis to identify prognostic genes or gene signatures while reflecting tumor heterogeneity. Combining RNA-seq data, clinical data, and pathway databases, it derives gene/variable signatures using grouped variable selection methods (least absolute shrinkage and selection operator [LASSO], elastic net regularization, and network-regularized high-dimensional Cox regression); these signatures have better discriminatory power than a single gene. Users can also find the prognostic value of a gene and the statistically significant genes in a specific cancer. All results are presented as Kaplan-Meier curves with a median or optimal cutoff value, the C-index, and the area under the curve (AUC) at t years. Moreover, users can easily obtain results as graphs and tables. CONCLUSIONS GIANT makes it possible to easily perform integrated survival analysis while overcoming the limitations of previous online tools. It will help scientists who are less comfortable with computational tools to perform comprehensive survival analysis on database data.
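The Kaplan-Meier curve with a median cutoff mentioned in this abstract can be illustrated with a short sketch. This is not GIANT's code (the abstract says the tool is written in R); it is a minimal pure-Python illustration, and the function names and the toy cohort are hypothetical.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this time point.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def split_by_median(expression):
    """Flag each patient as high (True) or low (False) expression."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


# Hypothetical toy cohort: follow-up in months, event flag, gene expression.
times = [5, 8, 12, 12, 20, 25, 30, 34]
events = [1, 1, 0, 1, 1, 0, 1, 0]
expression = [9.1, 8.7, 2.0, 7.9, 3.3, 2.5, 8.0, 1.2]

high = split_by_median(expression)
high_curve = kaplan_meier(
    [t for t, h in zip(times, high) if h],
    [e for e, h in zip(events, high) if h],
)
low_curve = kaplan_meier(
    [t for t, h in zip(times, high) if not h],
    [e for e, h in zip(events, high) if not h],
)
print("high expression:", high_curve)
print("low expression:", low_curve)
```

In practice the two curves would be compared with a log-rank test, and an "optimal" cutoff would be chosen by scanning candidate thresholds; both steps are omitted here.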


PR/SET Domain Family and Cancer: Novel Insights from the Cancer Genome Atlas

International Journal of Molecular Sciences, 2018, Vol 19 (10), pp. 3250. DOI: 10.3390/ijms19103250. Cited By ~ 9

Author(s): Anna Sorrentino, Antonio Federico, Monica Rienzo, Patrizia Gazzerro, Maurizio Bifulco, ...

Keyword(s): Family Members, Cancer Genome, The Cancer Genome Atlas, Driver Gene, RNA-Seq, SET Domain, Primary Tumors, Cancer Genome Atlas, Pan Cancer, Genome Atlas

The PR/SET domain gene family (PRDM) encodes 19 different transcription factors that share a subtype of the SET domain [Su(var)3-9, enhancer-of-zeste and trithorax] known as the PRDF1-RIZ (PR) homology domain. This domain, with its potential methyltransferase activity, is followed by a variable number of zinc-finger motifs, which likely mediate protein–protein, protein–RNA, or protein–DNA interactions. Intriguingly, almost all PRDM family members express different isoforms, which likely play opposite roles in oncogenesis. Remarkably, several studies have described alterations in most of the family members in malignancies. Here, to obtain a pan-cancer overview of the genomic and transcriptomic alterations of PRDM genes, we reanalyzed the public exome and RNA-Seq datasets available at The Cancer Genome Atlas portal. Overall, PRDM2, PRDM3/MECOM, PRDM9, PRDM16 and ZFPM2/FOG2 were the most mutated genes, with pan-cancer frequencies of protein-affecting mutations higher than 1%. Moreover, we observed heterogeneity in the mutation frequencies of these genes across tumors, with some cancer types reaching about 20% of mutated samples for a specific PRDM gene. Of note, ZFPM1/FOG1 mutations occurred in 50% of adrenocortical carcinoma patients and were localized in a hotspot region. These findings, together with OncodriveCLUST results, suggest that ZFPM1/FOG1 could putatively be considered a cancer driver gene in this malignancy. Finally, transcriptome analysis of RNA-Seq data from paired samples revealed that transcription of PRDMs was significantly altered in several tumors. Specifically, PRDM12 and PRDM13 were largely overexpressed in many cancers, whereas PRDM16 and ZFPM2/FOG2 were often downregulated. Some of these findings were also confirmed by real-time PCR on primary tumors.


Pan-cancer analysis reveals complex tumor-specific alternative polyadenylation

2017. DOI: 10.1101/160960

Author(s): Zhuyi Xue, René L Warren, Ewan A Gibb, Daniel MacMillan, Johnathan Wong, ...

Keyword(s): Alternative Polyadenylation, Length Change, The Cancer Genome Atlas, Cancer Type, RNA-Seq, Multiple Cancer, Tissue Samples, Cancer Genome Atlas, Specific Alternative, Cancer Types

Alternative polyadenylation (APA) of 3’ untranslated regions (3’ UTRs) has been implicated in cancer development. Earlier reports on APA in cancer primarily focused on 3’ UTR length modifications, and the conventional wisdom is that tumor cells preferentially express transcripts with shorter 3’ UTRs. Here, we analyzed the APA patterns of 114 genes, a select list of oncogenes and tumor suppressors, in 9,939 tumor and 729 normal tissue samples across 33 cancer types using RNA-Seq data from The Cancer Genome Atlas, and we found that the APA regulation machinery is much more complicated than previously thought. We report 77 cases (gene-cancer type pairs) of differential 3’ UTR cleavage patterns between normal and tumor tissues, involving 33 genes in 13 cancer types. For 15 genes, the tumor-specific cleavage patterns are recurrent across multiple cancer types. While the cleavage patterns in certain genes indicate apparent trends of 3’ UTR shortening in tumor samples, over half of the 77 cases imply 3’ UTR length changes in cancer that are more complex than simple shortening or lengthening. This work extends the current understanding of APA regulation in cancer and demonstrates how the large volumes of RNA-Seq data generated for characterizing cancer cohorts can be mined to investigate this process.


A pan-cancer analysis of prognostic genes

PeerJ, 2016, Vol 3, pp. e1499. DOI: 10.7717/peerj.1499. Cited By ~ 13

Author(s): Jordan Anaya, Brian Reon, Wei-Min Chen, Stefan Bekiranov, Anindya Dutta

Keyword(s): Microarray Data, The Cancer Genome Atlas, RNA-Seq, Cancer Pathogenesis, Gene Sets, Protective Genes, Cancer Genome Atlas, Cancer Types, Measure Of Association, Pan Cancer

Numerous studies have identified prognostic genes in individual cancers, but a thorough pan-cancer analysis has not been performed. In addition, previous studies have mostly used microarray data instead of RNA-Seq, and have not published comprehensive lists of associations with survival. Using recently available RNA-Seq and clinical data from The Cancer Genome Atlas for 6,495 patients, we have investigated the association of every annotated and expressed gene with survival across 16 cancer types. The most statistically significant harmful and protective genes were not shared across cancers, but were enriched in distinct gene sets that were shared across certain groups of cancers. These groups of cancers were independently recapitulated by unsupervised clustering of Cox coefficients (a measure of association with survival), both for individual genes and for gene programs. This analysis has revealed previously unappreciated commonalities among cancers, which may provide insights into cancer pathogenesis and rationales for co-opting treatments between cancers.
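To make the term "Cox coefficient" concrete, the sketch below estimates the coefficient for a single gene by Newton-Raphson on the Cox partial likelihood. It is an illustration only, not the authors' pipeline; the function name and the toy data are hypothetical, and it assumes one covariate and uses the Breslow form of the risk set. A positive coefficient marks a "harmful" gene (higher expression, higher hazard) and a negative one a "protective" gene.

```python
import math


def cox_coefficient(times, events, x, iterations=25, tolerance=1e-9):
    """Single-covariate Cox regression coefficient via Newton-Raphson.

    times: follow-up times; events: 1 = death observed, 0 = censored;
    x: covariate value per patient (for example, expression of one gene).
    """
    beta = 0.0
    for _ in range(iterations):
        gradient = 0.0
        information = 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored patients contribute only to risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [(math.exp(beta * x_j), x_j)
                    for t_j, x_j in zip(times, x) if t_j >= t_i]
            s0 = sum(w for w, _ in risk)
            s1 = sum(w * x_j for w, x_j in risk)
            s2 = sum(w * x_j * x_j for w, x_j in risk)
            mean = s1 / s0
            gradient += x_i - mean                # score contribution
            information += s2 / s0 - mean * mean  # weighted variance of x
        if information == 0.0:
            break  # covariate is constant within every risk set
        step = gradient / information
        beta += step
        if abs(step) < tolerance:
            break
    return beta


# Hypothetical toy data: patients with x = 1 tend to die earlier.
times = [2, 4, 6, 8, 10, 12]
events = [1, 1, 1, 1, 1, 1]
harmful = [1, 0, 1, 0, 1, 0]
protective = [1 - v for v in harmful]

beta_harmful = cox_coefficient(times, events, harmful)
beta_protective = cox_coefficient(times, events, protective)
print(beta_harmful, beta_protective)
```

Repeating such a fit for every gene in every cancer type yields a gene-by-cancer matrix of coefficients, which is the kind of object that could then be clustered.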


Comprehensive Analysis of Immune Correlation of KIF20A in Pan-cancer

2020. DOI: 10.21203/rs.3.rs-127486/v1

Author(s): Xiao-Han Cui, Qiu-Ju Peng, Peng Gao, Xu-Dong Zhang, Ren-Zhi Li, ...

Keyword(s): Enrichment Analysis, Gene Set Enrichment Analysis, The Cancer Genome Atlas, Immune Checkpoints, RNA-Seq, Gene Set Enrichment, Related Information, Common Causes, Cancer Genome Atlas, Pan Cancer

Background: Cancer is one of the most common causes of death, and its morbidity and mortality are gradually increasing worldwide. KIF20A plays an important role in tumors, but its immune relevance in pan-cancer needs further study. Methods: KIF20A-related information was downloaded from The Cancer Genome Atlas (TCGA). The RNA-seq data were collected in fragments per kilobase of transcript per million mapped reads (FPKM) format. The ESTIMATE algorithm was used to estimate the stromal and immune scores for 33 tumor types. We then analyzed the correlation between KIF20A expression and immune checkpoints in pan-cancer and performed gene set enrichment analysis (GSEA) on the genes co-expressed with KIF20A. Results: We confirmed that KIF20A expression was strongly correlated with prognosis in 33 tumor types. KIF20A expression was also related to a variety of immune cells and immune checkpoints. Based on the GSEA results, KIF20A is related to immune-related pathways in multiple tumors. Conclusion: We have shown that KIF20A plays an important role in pan-cancer and may affect the occurrence or development of a variety of tumors. Moreover, KIF20A is related to immunity, and KIF20A-related immune research in pan-cancer needs further investigation.
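For readers unfamiliar with the FPKM unit named in the Methods, the following sketch shows the standard normalization formula. The function name and the example numbers are hypothetical; the authors downloaded precomputed values from TCGA rather than computing them this way.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = count * 1e9 / (gene length in bp * total mapped fragments),
    which normalizes for both gene length and sequencing depth.
    """
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp gene
# in a library of 25 million mapped fragments.
value = fpkm(500, 2000, 25_000_000)
print(value)  # 10.0
```

Because of the length term, a gene twice as long with the same fragment count gets half the FPKM value.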


Chromoanagenesis Landscape in 10,000 TCGA Patients

Cancers, 2021, Vol 13 (16), pp. 4197. DOI: 10.3390/cancers13164197

Author(s): Roni Rasnic, Michal Linial

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Learning Algorithm, The Cancer Genome Atlas, Cancer Type, Whole Genome, Cancer Genome Atlas, Cancer Types, Tumor Biopsies, Pan Cancer

During the past decade, whole-genome sequencing of tumor biopsies and of individuals with congenital disorders highlighted the phenomenon of chromoanagenesis, a single chaotic event of chromosomal rearrangement. Chromoanagenesis was shown to be frequent in many types of cancer, to occur in the early stages of cancer development, and to significantly impact the tumor’s nature. However, in-depth, cancer-type-dependent analysis has remained incomplete because of the shortage of whole-genome sequencing data from cancer samples. In this study, we extracted data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) and The Cancer Genome Atlas (TCGA) to construct and test a machine learning algorithm that can detect chromoanagenesis with high accuracy (86%). The algorithm was applied to ~10,000 unlabeled TCGA cancer patients. We use the resulting chromoanagenesis assignments to analyze cancer-type-specific chromoanagenesis characteristics in 20 TCGA cancer types. Our results unveil prominent genes affected in either chromoanagenesis or non-chromoanagenesis tumorigenesis. The analysis reveals a mutually exclusive relationship between the genes impaired in chromoanagenesis and those impaired in non-chromoanagenesis cases. We offer the discovered characteristics as possible targets for cancer diagnostic and therapeutic purposes.


A pan-cancer analysis of prognostic genes

2015. DOI: 10.1101/030924

Author(s): Jordan Anaya, Brian J. Reon, Wei-Min Chen, Stefan Bekiranov, Anindya Dutta

Keyword(s): Microarray Data, The Cancer Genome Atlas, RNA-Seq, Cancer Pathogenesis, Gene Sets, Protective Genes, Cancer Genome Atlas, Cancer Types, Measure Of Association, Pan Cancer

Numerous studies have identified prognostic genes in individual cancers, but a thorough pan-cancer analysis has not been performed. In addition, previous studies have mostly used microarray data instead of RNA-Seq, and have not published comprehensive lists of associations with survival. Using recently available RNA-Seq and clinical data from The Cancer Genome Atlas for 6,495 patients, we have investigated the association of every annotated and expressed gene with survival across 16 cancer types. The most statistically significant harmful and protective genes were not shared across cancers, but were enriched in distinct gene sets that were shared across certain groups of cancers. These groups of cancers were independently reconstructed by unsupervised clustering of Cox coefficients (a measure of association with survival) for individual genes or for gene programs. This analysis has revealed previously unappreciated commonalities among cancers, which may provide insights into cancer pathogenesis and rationales for co-opting treatments between cancers.


Scaling concepts in 'omics: nuclear lamin-B scales with tumor growth and predicts poor prognosis, whereas fibrosis can be pro-survival

2021. DOI: 10.1101/2021.02.25.432860

Author(s): Manasvita Vashisth, Dennis Discher, Sangkyun Cho, Jerome Irianto, Yuntao Xia, ...

Keyword(s): Single Cell, Power Laws, The Cancer Genome Atlas, Structural Factors, RNA-Seq, Lamin B, Nuclear Lamin, Cancer Genome Atlas, Tumor Types, Pan Cancer

Spatiotemporal relationships between genes expressed in tissues likely reflect physicochemical principles that range from stoichiometric interactions to co-organized fractals with characteristic scaling. For key structural factors within the nucleus and extracellular matrix (ECM), gene-gene power laws are found to be characteristic across several tumor types in The Cancer Genome Atlas (TCGA) and across single-cell RNA-seq data. The nuclear filament LMNB1 scales with many tumor-elevated proliferation genes that predict poor survival in liver cancer, and cell line experiments show LMNB1 regulates cancer cell cycle. Also high in the liver, lung, and breast tumors studied here are the main fibrosis-associated collagens, COL1A1 and COL1A2, that scale stoichiometrically with each other and superstoichiometrically with a pan-cancer fibrosis gene set. However, high fibrosis predicts prolonged survival of patients undergoing therapy and does not correlate with LMNB1. Single-cell RNA-seq data also reveal scaling consistent with the pan-cancer power laws obtained from bulk tissue, allowing new power law relations to be predicted. Lastly, although noisy data frustrate weak scaling, concepts such as stoichiometric scaling highlight a simple, internal consistency check to qualify expression data.
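A gene-gene power law of the form y = a * x^b can be estimated by ordinary least squares on log-transformed expression values, where an exponent b near 1 corresponds to stoichiometric scaling and b above 1 to superstoichiometric scaling. The sketch below illustrates that idea only; the function name and the synthetic data are hypothetical, and the abstract does not specify the authors' fitting procedure.

```python
import math


def fit_power_law(x, y):
    """Fit y = prefactor * x**exponent by least squares in log-log space.

    All values in x and y must be strictly positive.
    """
    log_x = [math.log10(v) for v in x]
    log_y = [math.log10(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    exponent = sxy / sxx                     # slope of the log-log line
    prefactor = 10 ** (mean_y - exponent * mean_x)
    return exponent, prefactor


# Synthetic expression values (not real data) for two hypothetical gene pairs.
gene_a = [1.0, 10.0, 100.0, 1000.0]
stoichiometric_partner = [2.0 * v for v in gene_a]      # exponent 1
superstoichiometric_partner = [v ** 1.5 for v in gene_a]  # exponent 1.5

print(fit_power_law(gene_a, stoichiometric_partner))
print(fit_power_law(gene_a, superstoichiometric_partner))
```

With noisy real data, the scatter around the fitted line would also need to be reported, since the abstract notes that noise can obscure weak scaling.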


Chromoanagenesis landscape in 10,000 TCGA patients

2021. DOI: 10.1101/2021.04.29.441937

Author(s): Roni Rasnic, Michal Linial

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Learning Algorithm, The Cancer Genome Atlas, Cancer Type, Whole Genome, Cancer Genome Atlas, Cancer Types, Tumor Biopsies, Pan Cancer

During the past decade, whole-genome sequencing of tumor biopsies and of individuals with congenital disorders highlighted the phenomenon of chromoanagenesis, a single chaotic event of chromosomal rearrangement. Chromoanagenesis was shown to be frequent in many types of cancer, to occur in the early stages of cancer development, and to significantly impact the tumor’s nature. However, in-depth, cancer-type-dependent analysis has remained incomplete because of the shortage of whole-genome sequencing data from cancer samples. In this study, we extracted data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) and The Cancer Genome Atlas (TCGA) to construct a machine learning algorithm that can detect chromoanagenesis with high accuracy (86%). The algorithm was applied to ~10,000 TCGA cancer patients. We use the resulting chromoanagenesis assignments to analyze cancer-type-specific chromoanagenesis characteristics in 20 TCGA cancer types. Our results unveil prominent genes affected in either chromoanagenesis or non-chromoanagenesis tumorigenesis. The analysis reveals a mutually exclusive relationship between the genes impaired in chromoanagenesis and those impaired in non-chromoanagenesis cases. We offer the discovered characteristics as possible targets for cancer diagnostic and therapeutic purposes.


WebMeV: a Cloud Platform for Analyzing and Visualizing Cancer Genomic Data

2017. DOI: 10.1101/147884. Cited By ~ 2

Author(s): Yaoyu E. Wang, Lev Kuznetsov, Antony Partensky, Jalil Farid, John Quackenbush

Keyword(s): Genomic Data, Data Access, The Cancer Genome Atlas, Data Sets, RNA-Seq, Cloud Platform, Web Based, Public Data, Cancer Genome Atlas, Effective Use

Although large, complex genomic data sets are increasingly easy to generate, and the number of publicly available data sets in cancer and other diseases is rapidly growing, the lack of intuitive, easy-to-use analysis tools has remained a barrier to the effective use of such data. WebMeV (https://mev.tm4.org) is an open-source, web-based tool that gives users access to sophisticated tools for analysis of RNA-Seq and other data in an interface designed to democratize data access. WebMeV combines cloud-based technologies with a simple user interface to allow users to access large public data sets, such as those from The Cancer Genome Atlas (TCGA), or to upload their own. The interface allows users to visualize data and to apply advanced data mining methods to explore the data and draw biologically meaningful conclusions. We provide an overview of WebMeV and demonstrate two simple use cases that illustrate the value of putting data analysis in the hands of those looking to explore the underlying biology of the systems being studied.
