Efficient Representations of Tumor Diversity with Paired DNA-RNA Aberrations

2020 ◽  
Author(s):  
Qian Ke ◽  
Wikum Dinalankara ◽  
Laurent Younes ◽  
Donald Geman ◽  
Luigi Marchionni

Abstract: Cancer cells display massive dysregulation of key regulatory pathways due to now well-catalogued mutations and other DNA-related aberrations. Moreover, enormous heterogeneity has been commonly observed in the identity, frequency, and location of these aberrations across individuals with the same cancer type or subtype, and this variation naturally propagates to the transcriptome, resulting in myriad types of dysregulated gene expression programs. Many have argued that a more integrative and quantitative analysis of heterogeneity of DNA and RNA molecular profiles may be necessary for designing more systematic explorations of alternative therapies and improving predictive accuracy. We introduce a representation of multi-omics profiles that is sufficiently rich to account for observed heterogeneity and to support the construction of quantitative, integrated metrics of variation. Starting from the network of interactions existing in Reactome, we build a library of “paired DNA-RNA aberrations” that represent prototypical and recurrent patterns of dysregulation in cancer; each two-gene “Source-Target Pair” (STP) consists of a “source” regulatory gene and a “target” gene whose expression is plausibly “controlled” by the source gene. The STP is then “aberrant” in a joint DNA-RNA profile if the source gene is DNA-aberrant (e.g., mutated, deleted, or duplicated) and the downstream target gene is “RNA-aberrant”, meaning its expression level is outside the normal, baseline range. With M STPs, each sample profile has exactly one of the 2^M possible configurations. We concentrate on subsets of STPs, and the corresponding reduced configurations, by selecting tissue-dependent minimal coverings, defined as the smallest family of STPs with the property that every sample in the considered population displays at least one aberrant STP within that family. These minimal coverings can be computed with integer programming. 
Given such a covering, a natural measure of cross-sample diversity is the extent to which the particular aberrant STPs composing a covering vary from sample to sample; this variability is captured by the entropy of the distribution over configurations. We apply this program to data from TCGA for six distinct tumor types (breast, prostate, lung, colon, liver, and kidney cancer). This enables an efficient simplification of the complex landscape observed in cancer populations, resulting in the identification of novel signatures of molecular alterations that are not detected with frequency-based criteria. Estimates of cancer heterogeneity across tumor phenotypes reveal a stable pattern: entropy increases with disease severity. This framework is thus well suited to accommodate the expanding complexity of cancer genomes and epigenomes emerging from large consortia projects. Author Summary: A large variety of genomic and transcriptomic aberrations are observed in cancer cells, and their identity, location, and frequency can be highly indicative of the particular subtype or molecular phenotype, and thereby inform treatment options. However, elucidating this association between sets of aberrations and subtypes of cancer is severely impeded by considerable diversity in the set of aberrations across samples from the same population. Most attempts at analyzing tumor heterogeneity have dealt with either the genome or transcriptome in isolation. Here we present a novel, multi-omics approach for quantifying heterogeneity by determining a small set of paired DNA-RNA aberrations that incorporates potential downstream effects on gene expression. We apply integer programming to identify a small set of paired aberrations such that at least one among them is present in every sample of a given cancer population. The resulting “coverings” are analyzed for six cancer cohorts from The Cancer Genome Atlas, and they make it possible to introduce an information-theoretic measure of heterogeneity. 
Our results identify many known facets of tumorigenesis and suggest potential novel genes and interactions of interest. Data Availability Statement: RNA-Seq data, somatic mutation data, and copy number data for The Cancer Genome Atlas were obtained through the Xena Cancer Genome Browser database (https://xenabrowser.net) from individual cancer type cohorts. Processed data in the form of tab-delimited files and selected tissue-level coverings (in Excel format) are provided as additional supplementary material and are also available from the Marchionni laboratory website (www.marchionnilab.org/signatures.html).
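
The covering and entropy ideas described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (aberrant_stps, minimal_covering, configuration_entropy), the gene labels S1..S3 and T1..T3, and the four toy samples are all hypothetical, and an exhaustive search over STP subsets is swapped in for the integer programming the abstract mentions. That substitution is only workable for a handful of STPs.

```python
from collections import Counter
from itertools import combinations
from math import log2


def aberrant_stps(dna_aberrant, rna_aberrant, stps):
    """Return the STPs (source, target) that are aberrant in one sample.

    An STP is aberrant when its source gene is DNA-aberrant and its
    target gene is RNA-aberrant, as defined in the abstract.
    """
    return {(s, t) for (s, t) in stps if s in dna_aberrant and t in rna_aberrant}


def minimal_covering(sample_stps):
    """Smallest family of STPs with at least one aberrant member per sample.

    Exhaustive search by increasing family size (an assumption made for
    brevity; the abstract states that integer programming is used).
    Returns None when some sample has no aberrant STP at all.
    """
    candidates = sorted(set().union(*sample_stps))
    for size in range(1, len(candidates) + 1):
        for family in combinations(candidates, size):
            if all(any(p in s for p in family) for s in sample_stps):
                return list(family)
    return None


def configuration_entropy(sample_stps, covering):
    """Shannon entropy (bits) of the empirical distribution of configurations.

    A configuration is the tuple of aberrant / not-aberrant flags of the
    covering STPs in one sample.
    """
    counts = Counter(tuple(p in s for p in covering) for s in sample_stps)
    n = len(sample_stps)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Hypothetical STP library and toy cohort (illustration only, not TCGA data).
STPS = [("S1", "T1"), ("S2", "T2"), ("S3", "T3")]
COHORT = [
    aberrant_stps({"S1"}, {"T1"}, STPS),
    aberrant_stps({"S1", "S2"}, {"T1", "T2"}, STPS),
    aberrant_stps({"S2"}, {"T2"}, STPS),
    aberrant_stps({"S2", "S3"}, {"T2", "T3"}, STPS),
]
COVERING = minimal_covering(COHORT)
ENTROPY = configuration_entropy(COHORT, COVERING)
```

In this toy cohort no single STP is aberrant in every sample, so the smallest covering has two STPs, and the entropy is computed over the 2^2 possible configurations of that pair rather than over all 2^3 configurations of the full library.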

2018 ◽  
Author(s):  
SR Rosario ◽  
MD Long ◽  
HC Affronti ◽  
AM Rowsam ◽  
KH Eng ◽  
...  

Abstract: Understanding the levels of metabolic dysregulation in different disease settings is vital for the safe and effective incorporation of metabolism-targeted therapeutics in the clinic. Using transcriptomic data from 10,704 tumor and normal samples from The Cancer Genome Atlas, across 26 disease sites, we developed a novel bioinformatics pipeline that distinguishes tumor from normal tissues based on differential gene expression for 114 metabolic pathways. This pathway dysregulation was confirmed in separate patient populations, further demonstrating the robustness of this approach. A bootstrapping simulation was then applied to assess whether these alterations were biologically meaningful rather than expected by chance. We provide distinct examples of the types of analysis that can be accomplished with this tool to understand cancer-specific metabolic dysregulation, highlighting novel pathways of interest in both common and rare disease sites. A pathway-mapping approach to understanding patterns of metabolic flux allows differential drug sensitivity to be predicted accurately. Further, the identification of Master Metabolic Transcriptional Regulators, whose expression was highly correlated with pathway gene expression, explains why metabolic differences exist in different disease sites. We demonstrate that these regulators can also segregate patient populations and predict responders to different metabolism-targeted therapeutics.


2018 ◽  
Vol 33 (3) ◽  
pp. 293-300 ◽  
Author(s):  
Min-hang Zhou ◽  
Hong-wei Zhou ◽  
Mo Liu ◽  
Jun-zhong Sun

Purpose: The role of microRNA (miRNA) in cholangiocarcinoma is not clear. The aim of this study was to find potential diagnostic and prognostic miRNAs in cholangiocarcinoma patients. Methods: The miRNA expression profiles of cholangiocarcinoma patients from The Cancer Genome Atlas and Gene Expression Omnibus (GSE53870) were analyzed. Overall survival was compared using the Kaplan–Meier method. The target genes of the prognostic miRNA were identified in miRanda, PicTar, or TargetScan, and their cell signaling pathways were analyzed with the Database for Annotation, Visualization and Integrated Discovery. Results: In The Cancer Genome Atlas and the Gene Expression Omnibus miRNA datasets, miR-92b and miR-99a showed concordant directionality, up-regulated and down-regulated, respectively. In The Cancer Genome Atlas survival data, patients with a high level of miR-92b had significantly shorter overall survival (P = 0.038). However, the level of miR-99a was not found to be significant. Seventeen shared target genes of miR-92b were identified, including DAB2IP, BCL2L11, SPHK2, PER2, and TSC1. The related pathways included positive regulation of transcription, positive regulation of cellular biosynthetic process, regulation of programmed cell death, etc. Conclusion: miR-92b was up-regulated in cholangiocarcinoma compared with normal controls. A high level of miR-92b was associated with adverse outcomes in cholangiocarcinoma patients, which might be partly explained by the target genes of miR-92b and their signaling pathways.


2018 ◽  
pp. 1-19 ◽  
Author(s):  
Lawrence N. Kwong ◽  
Mariana Petaccia De Macedo ◽  
Lauren Haydu ◽  
Aron Y. Joon ◽  
Michael T. Tetzlaff ◽  
...  

Purpose: Initiatives such as The Cancer Genome Atlas and International Cancer Genome Consortium have generated high-quality, multiplatform molecular data from thousands of frozen tumor samples. Although these initiatives have provided invaluable insight into cancer biology, a tremendous potential resource remains largely untapped in formalin-fixed, paraffin-embedded (FFPE) samples that are more readily available but which can present technical challenges because of crosslinking of fragile molecules such as RNA. Materials and Methods: We extracted RNA from FFPE primary melanomas and assessed two gene expression platforms (genome-wide RNA sequencing and targeted NanoString) for their ability to generate coherent biologic signals. To do so, we generated an improved approach to quantifying gene expression pathways. We refined pathway scores through correlation-guided gene subsetting. We also make comparisons to The Cancer Genome Atlas and other publicly available melanoma datasets. Results: The comparison of the gene expression patterns to each other, to established biologic modules, and to clinical and immunohistochemical data confirmed the fidelity of biologic signals from both platforms using FFPE samples to known biology. Moreover, correlations with patient outcome data were consistent with previous frozen-tissue–based studies. Conclusion: FFPE samples from previously difficult-to-access cancer types, such as small primary melanomas, represent a valuable and previously unexploited source of analyte for RNA sequencing and NanoString platforms. This work provides an important step toward the use of such platforms to unlock novel molecular underpinnings and inform future biologically driven clinical decisions.


2021 ◽  
Vol 15 (1) ◽  
pp. 29-41
Author(s):  
Peng Qiao ◽  
Di Zhang ◽  
Song Zeng ◽  
Yicun Wang ◽  
Biao Wang ◽  
...  

Aim: This study aims to identify a novel marker to predict biochemical recurrence (BCR) in prostate cancer patients after radical prostatectomy with negative surgical margins. Materials & methods: The Cancer Genome Atlas database, Gene Expression Omnibus database, and Cancer Cell Line Encyclopedia database were employed. The ensemble support vector machine-recursive feature elimination method was used to select crucial genes for BCR. Results: We identified MYLK as a novel and independent biomarker for BCR in The Cancer Genome Atlas training cohort and confirmed it in four independent Gene Expression Omnibus validation cohorts. Multi-omic analysis suggested that MYLK was a DNA methylation-driven gene. Additionally, MYLK had significant positive correlations with immune infiltration. Conclusion: MYLK was identified and validated as a novel, robust, and independent biomarker for BCR in prostate cancer.

