Comprehensive Analysis of Large-Scale Transcriptomes from Multiple Cancer Types

Genes ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1865
Author(s):  
Baoting Nong ◽  
Mengbiao Guo ◽  
Weiwen Wang ◽  
Songyang Zhou ◽  
Yuanyan Xiong

Various abnormalities of transcriptional regulation revealed by RNA sequencing (RNA-seq) have been reported in cancers. However, strategies to integrate multi-modal information from RNA-seq, which would help uncover more disease mechanisms, are still limited. Here, we present PipeOne, a cross-platform one-stop analysis workflow for large-scale transcriptome data. It was developed based on Nextflow, a reproducible workflow management system. PipeOne is composed of three modules: data processing and feature matrix construction, disease feature prioritization, and disease subtyping. It first integrates eight different tools to extract different information from RNA-seq data, and then uses a random forest algorithm to study and stratify patients according to evidence from multi-modal information. Its application in five cancers (colon, liver, kidney, stomach, and thyroid; total samples n = 2024) identified various dysregulated key features (such as PVT1 expression and ABI3BP alternative splicing) and pathways (especially liver and kidney dysfunction) shared by multiple cancers. Furthermore, we demonstrated clinically relevant patient subtypes in four of five cancers, with most subtypes characterized by distinct driver somatic mutations, such as TP53, TTN, BRAF, HRAS, MET, KMT2D, and KMT2C mutations. Importantly, these subtyping results were frequently contributed by dysregulated biological processes, such as ribosome biogenesis, RNA binding, and mitochondrial functions. PipeOne is efficient and accurate in studying different cancer types to reveal the specificity and cross-cancer contributing factors of each cancer. It could be easily applied to other diseases and is available at GitHub.

2020 ◽  
Author(s):  
Ramon Viñas ◽  
Tiago Azevedo ◽  
Eric R. Gamazon ◽  
Pietro Liò

Abstract A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we present GAIN-GTEx, a method for gene expression imputation based on Generative Adversarial Imputation Networks. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We compare our model to several standard and state-of-the-art imputation methods and show that GAIN-GTEx is significantly superior in terms of predictive performance and runtime. Furthermore, our results indicate strong generalisation on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.
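An evaluation "across varying levels of missingness" can be pictured with a small sketch: hide a random fraction of expression values, impute them, and score the error only on the hidden entries. The code below is an illustration under stated assumptions, not GAIN-GTEx itself. It uses a per-gene mean imputer as a stand-in baseline, and the function names, the toy matrix, and the choice of RMSE as the metric are all hypothetical.

```python
import math
import random


def mask_entries(matrix, missing_rate, rng):
    """Return a boolean mask (True = hidden) hiding roughly `missing_rate` of entries."""
    return [[rng.random() < missing_rate for _ in row] for row in matrix]


def mean_impute(matrix, mask):
    """Fill hidden entries with the per-gene (column) mean of the observed entries."""
    n_genes = len(matrix[0])
    col_means = []
    for j in range(n_genes):
        observed = [row[j] for row, m in zip(matrix, mask) if not m[j]]
        # Fall back to 0.0 when a gene is hidden in every sample.
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [col_means[j] if m[j] else row[j] for j in range(n_genes)]
        for row, m in zip(matrix, mask)
    ]


def masked_rmse(truth, imputed, mask):
    """Root mean squared error computed only over hidden entries."""
    errors = [
        (t - p) ** 2
        for t_row, p_row, m_row in zip(truth, imputed, mask)
        for t, p, m in zip(t_row, p_row, m_row)
        if m
    ]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


def evaluate(matrix, missing_rates, seed=0):
    """Score the baseline imputer at each missingness level."""
    rng = random.Random(seed)
    results = {}
    for rate in missing_rates:
        mask = mask_entries(matrix, rate, rng)
        imputed = mean_impute(matrix, mask)
        results[rate] = masked_rmse(matrix, imputed, mask)
    return results


if __name__ == "__main__":
    # Toy "samples x genes" matrix of synthetic values (not real expression data).
    toy_rng = random.Random(42)
    toy = [[toy_rng.gauss(5.0, 1.0) for _ in range(20)] for _ in range(50)]
    for rate, rmse in evaluate(toy, [0.1, 0.5, 0.9]).items():
        print(f"missing rate {rate:.1f}: masked RMSE {rmse:.3f}")
```

Any imputation model could be swapped in for `mean_impute` as long as it accepts the same matrix and mask, which keeps the comparison across methods on identical hidden entries.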


2020 ◽  
Vol 49 (D1) ◽  
pp. D1420-D1430
Author(s):  
Dongqing Sun ◽  
Jin Wang ◽  
Ya Han ◽  
Xin Dong ◽  
Jun Ge ◽  
...  

Abstract Cancer immunotherapy targeting co-inhibitory pathways by checkpoint blockade shows remarkable efficacy in a variety of cancer types. However, only a minority of patients respond to treatment due to the stochastic heterogeneity of the tumor microenvironment (TME). Recent advances in single-cell RNA-seq technologies enabled comprehensive characterization of the immune system heterogeneity in tumors but posed computational challenges in integrating and utilizing the massive published datasets to inform immunotherapy. Here, we present Tumor Immune Single Cell Hub (TISCH, http://tisch.comp-genomics.org), a large-scale curated database that integrates single-cell transcriptomic profiles of nearly 2 million cells from 76 high-quality tumor datasets across 27 cancer types. All the data were uniformly processed with a standardized workflow, including quality control, batch effect removal, clustering, cell-type annotation, malignant cell classification, differential expression analysis and functional enrichment analysis. TISCH provides interactive gene expression visualization across multiple datasets at the single-cell level or cluster level, allowing systematic comparison between different cell types, patients, tissue origins, treatment and response groups, and even different cancer types. In summary, TISCH provides a user-friendly interface for systematically visualizing, searching and downloading a gene expression atlas of the TME from multiple cancer types, enabling fast, flexible and comprehensive exploration of the TME.


2020 ◽  
Vol 21 (20) ◽  
pp. 7803
Author(s):  
Julie Miro ◽  
Anne-Laure Bougé ◽  
Eva Murauer ◽  
Emmanuelle Beyne ◽  
Dylan Da Cunha ◽  
...  

The Duchenne muscular dystrophy (DMD) gene has a complex expression pattern regulated by multiple tissue-specific promoters and by alternative splicing (AS) of the resulting transcripts. Here, we used an RNAi-based approach coupled with DMD-targeted RNA-seq to identify RNA-binding proteins (RBPs) that regulate splicing of its skeletal muscle isoform (Dp427m) in a human muscular cell line. A total of 16 RBPs comprising the major regulators of muscle-specific splicing events were tested. We show that distinct combinations of RBPs maintain the correct inclusion in the Dp427m of exons that undergo spatio-temporal AS in other dystrophin isoforms. In particular, our findings revealed the complex networks of RBPs contributing to the splicing of the two short DMD exons 71 and 78, the inclusion of exon 78 in the adult Dp427m isoform being crucial for muscle function. Among the RBPs tested, QKI and DDX5/DDX17 proteins are important determinants of DMD exon inclusion. This is the first large-scale study to determine which RBPs act on the physiological splicing of the DMD gene. Our data shed light on molecular mechanisms contributing to the expression of the different dystrophin isoforms, which could be influenced by a change in the function or expression level of the identified RBPs.


2019 ◽  
Author(s):  
Xiaokang Zhang ◽  
Inge Jonassen

Abstract Background: With the cost of DNA sequencing decreasing, increasing amounts of RNA-Seq data are being generated, giving novel insight into gene expression and regulation. Prior to analysis of gene expression, the RNA-Seq data has to be processed through a number of steps resulting in a quantification of expression of each gene / transcript in each of the analyzed samples. A number of workflows are available to help researchers perform these steps on their own data, or on public data to take advantage of novel software or reference data in data re-analysis. However, many of the existing workflows are limited to specific types of studies. We therefore aimed to develop a maximally general workflow, applicable to a wide range of data and analysis approaches and at the same time supporting research on both model and non-model organisms. Furthermore, we aimed to make the workflow usable also for users with limited programming skills. Results: Utilizing the workflow management system Snakemake and the package management system Conda, we have developed a modular, flexible and user-friendly RNA-Seq analysis pipeline: RNA-Seq Analysis Snakemake Workflow (RASflow). Utilizing Snakemake and Conda alleviates challenges with library dependencies and version conflicts and also supports reproducibility. To be applicable for a wide variety of applications, RASflow supports mapping of reads to both genomic and transcriptomic assemblies. RASflow has a broad range of potential users: it can be applied by researchers interested in any organism and, since it requires no programming skills, it can be used by researchers with different backgrounds. RASflow is an open source tool, and source code as well as documentation, tutorials and example data sets can be found on GitHub: https://github.com/zhxiaokang/RASflow Conclusions: RASflow is a simple and reliable RNA-Seq analysis workflow that provides a complete package for RNA-Seq analysis.
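The "quantification of expression of each gene / transcript" step can be made concrete with a minimal sketch. The abstract does not say which expression unit RASflow reports, so the example below assumes transcripts per million (TPM), one commonly used unit. The function name, gene labels, counts, and lengths are all hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to TPM.

    counts:     dict mapping gene -> raw read count
    lengths_bp: dict mapping gene -> feature length in base pairs
    """
    # Step 1: reads per kilobase, which removes the bias toward long features.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    # Step 2: rescale so that the values in the sample sum to one million.
    scale = sum(rpk.values())
    if scale == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: value / scale * 1e6 for gene, value in rpk.items()}


if __name__ == "__main__":
    # Made-up counts and lengths, for illustration only.
    sample_counts = {"geneA": 10, "geneB": 20, "geneC": 0}
    gene_lengths = {"geneA": 1000, "geneB": 2000, "geneC": 500}
    for gene, tpm in counts_to_tpm(sample_counts, gene_lengths).items():
        print(f"{gene}: {tpm:.1f} TPM")
```

Because length normalization happens before the per-sample rescaling, TPM values sum to the same total in every sample, which makes within-sample proportions easier to compare than raw counts.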


2017 ◽  
Author(s):  
Zhuyi Xue ◽  
René L Warren ◽  
Ewan A Gibb ◽  
Daniel MacMillan ◽  
Johnathan Wong ◽  
...  

Abstract Alternative polyadenylation (APA) of 3’ untranslated regions (3’ UTRs) has been implicated in cancer development. Earlier reports on APA in cancer primarily focused on 3’ UTR length modifications, and the conventional wisdom is that tumor cells preferentially express transcripts with shorter 3’ UTRs. Here, we analyzed the APA patterns of 114 genes, a select list of oncogenes and tumor suppressors, in 9,939 tumor and 729 normal tissue samples across 33 cancer types using RNA-Seq data from The Cancer Genome Atlas, and we found that the APA regulation machinery is much more complicated than what was previously thought. We report 77 cases (gene-cancer type pairs) of differential 3’ UTR cleavage patterns between normal and tumor tissues, involving 33 genes in 13 cancer types. For 15 genes, the tumor-specific cleavage patterns are recurrent across multiple cancer types. While the cleavage patterns in certain genes indicate apparent trends of 3’ UTR shortening in tumor samples, over half of the 77 cases imply 3’ UTR length change trends in cancer that are more complex than simple shortening or lengthening. This work extends the current understanding of APA regulation in cancer, and demonstrates how large volumes of RNA-seq data generated for characterizing cancer cohorts can be mined to investigate this process.


2021 ◽  
Author(s):  
Yan Luan ◽  
Yingfei Liu ◽  
Jingwen Xue ◽  
Ke Wang ◽  
Kaige Ma ◽  
...  

The histone H3K27 demethylase UTX participates in regulating multiple cancer types. However, less is known about the function of UTX in glioblastoma (GBM). This study aims to define the effect of UTX on GBM. GEPIA2 database analysis showed that UTX expression was significantly increased in GBM and inversely correlated with survival. Knockdown of UTX inhibited GBM cell proliferation, migration, and invasion while promoting apoptosis. Moreover, UTX knockdown also hampered tumor growth in the heterotopic xenograft model. RNA-seq combined with qRT-PCR and ChIP-qPCR was used to identify the target genes. The results showed that the UTX-mediated genes were strongly associated with tumor progression and the extracellular environment. Protein-protein interaction analysis suggested that periostin (POSTN) interacted with most of the other UTX-mediated genes. POSTN supplementation abolished the effect of UTX knockdown in GBM cells. Furthermore, silencing UTX exhibited a similar antitumor effect in patient-derived glioblastoma stem-like cells, while UTX functions were partially restored after exposure to POSTN. Our findings may provide new insight into the onset and progression of gliomagenesis, suggesting a promising therapeutic strategy for GBM treatment.


2017 ◽  
Author(s):  
Vijay K. Ulaganathan ◽  
Axel Ullrich

Abstract Genetic heterogeneity in tumours is the bona fide hallmark applicable to all cancer types (Burrell et al, 2013). Furthermore, deregulated ribosome biogenesis and elevated protein biosynthesis have been consistently associated with multiple cancer types (Ruggero, 2012; Ruggero & Pandolfi, 2003). We observed that under cultivation conditions almost all cancer cell types actively shed significant amounts of particulates as compared to non-malignant cell lines, requiring frequent changing of cultivation media. We therefore asked whether particulates shed by cancer cells might still retain biological activity associated with protein biosynthesis. Here, we communicate our observations of DNA-dependent protein biosynthetic activity exhibited by the cell-free particulates shed by the cancer cell lines. Using a pulsed isotope labelling approach, we confirmed the cell-free protein translation activity exhibited by particulates shed by various cancer cell lines. Interestingly, the bioactivity was largely dependent on temperature, pH and on 3’-DNA elements. Our results demonstrate that particulates shed by cancer cells are biologically active and may potentially drive expression of tissue non-specific promoters in distant organs.


2015 ◽  
Vol 5 (1) ◽  
Author(s):  
Li Peng ◽  
Xiu Wu Bian ◽  
Di Kang Li ◽  
Chuan Xu ◽  
Guang Ming Wang ◽  
...  

2020 ◽  
Vol 48 (W1) ◽  
pp. W509-W514 ◽  
Author(s):  
Taiwen Li ◽  
Jingxin Fu ◽  
Zexian Zeng ◽  
David Cohen ◽  
Jing Li ◽  
...  

Abstract Tumor progression and the efficacy of immunotherapy are strongly influenced by the composition and abundance of immune cells in the tumor microenvironment. Due to the limitations of direct measurement methods, computational algorithms are often used to infer immune cell composition from bulk tumor transcriptome profiles. These estimated tumor immune infiltrate populations have been associated with genomic and transcriptomic changes in the tumors, providing insight into tumor–immune interactions. However, such investigations on large-scale public data remain challenging. To lower the barriers for the analysis of complex tumor–immune interactions, we significantly improved our previous web platform TIMER. Instead of just using one algorithm, TIMER2.0 (http://timer.cistrome.org/) provides more robust estimation of immune infiltration levels for The Cancer Genome Atlas (TCGA) or user-provided tumor profiles using six state-of-the-art algorithms. TIMER2.0 provides four modules for investigating the associations between immune infiltrates and genetic or clinical features, and four modules for exploring cancer-related associations in the TCGA cohorts. Each module can generate a functional heatmap table, enabling the user to easily identify significant associations in multiple cancer types simultaneously. Overall, the TIMER2.0 web server provides comprehensive analysis and visualization functions of tumor infiltrating immune cells.
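An "association between immune infiltrates and genetic or clinical features" can be illustrated with a rank correlation between an estimated infiltration level and a gene's expression across samples. The abstract does not name the statistic TIMER2.0 uses, so the Spearman correlation below is an assumption chosen for illustration, and the function names and numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up per-sample values, for illustration only.
    infiltration = [0.12, 0.30, 0.25, 0.41, 0.08]
    expression = [2.1, 5.4, 4.0, 6.3, 1.7]
    print(f"Spearman rho = {spearman(infiltration, expression):.3f}")
```

A rank-based statistic is a common choice for this kind of comparison because it does not assume a linear relationship between the two quantities.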

