Predicting master transcription factors from pan-cancer expression data

2019 ◽  
Author(s):  
Jessica Reddy ◽  
Marcos A. S. Fonseca ◽  
Rosario I Corona ◽  
Robbin Nameki ◽  
Felipe Segato Dezem ◽  
...  

The function of critical developmental regulators can be subverted by cancer cells to control expression of oncogenic transcriptional programs. These "master transcription factors" (MTFs) are often essential for cancer cell survival and represent vulnerabilities that can be exploited therapeutically. The current approaches to identify candidate MTFs examine super-enhancer associated transcription factor-encoding genes with high connectivity in network models. This relies on chromatin immunoprecipitation-sequencing (ChIP-seq) data, which is technically challenging to obtain from primary tumors, and is currently unavailable for many cancer types and clinically relevant subtypes. In contrast, gene expression data are more widely available, especially for rare tumors and subtypes where MTFs have yet to be discovered. We have developed a predictive algorithm called CaCTS (Cancer Core Transcription factor Specificity) to identify candidate MTFs using pan-cancer RNA-sequencing data from The Cancer Genome Atlas. The algorithm identified 273 candidate MTFs across 34 tumor types and recovered known tumor MTFs. We also made novel predictions, including for cancer types and subtypes for which MTFs have not yet been characterized. Clustering based on MTF predictions reproduced anatomic groupings of tumors that share 1-2 lineage-specific candidates, but also dictated functional groupings, such as a squamous group that comprised five tumor subtypes sharing 3 common MTFs. PAX8, SOX17, and MECOM were candidate factors in high-grade serous ovarian cancer (HGSOC), an aggressive tumor type where the core regulatory circuit is currently uncharacterized. PAX8, SOX17, and MECOM are required for cell viability and lie proximal to super-enhancers in HGSOC cells. ChIP-seq revealed that these factors co-occupy HGSOC regulatory elements globally and co-bind at critical gene loci including MUC16 (CA-125). 
Addiction to these factors was confirmed in studies using THZ1 to inhibit transcription in HGSOC cells, suggesting early down-regulation of these genes may be responsible for cytotoxic effects of THZ1 on HGSOC models. Identification of MTFs across 34 tumor types and 140 subtypes, especially for those with limited understanding of transcriptional drivers, paves the way to therapeutic targeting of MTFs in a broad spectrum of cancers.
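
As a rough illustration of expression-based specificity scoring, the sketch below ranks transcription factors by how concentrated their expression is in a single tumor type. The abstract does not describe the scoring function, so the Jensen-Shannon-divergence score, the function names and the input layout are all assumptions made for illustration, not the CaCTS implementation.

```python
import math

def _kl(p, q):
    # Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    # Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint support).
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)

def specificity_scores(expression, tumor_types):
    """Rank genes per tumor type by closeness to an 'expressed only here' profile.

    expression: dict gene -> list of mean expression values, one per tumor type
                (hypothetical input layout).
    Returns dict tumor_type -> list of (gene, score), most specific first.
    """
    ranked = {}
    for idx, tumor in enumerate(tumor_types):
        # Idealized profile: all expression concentrated in this tumor type.
        ideal = [1.0 if i == idx else 0.0 for i in range(len(tumor_types))]
        scores = []
        for gene, values in expression.items():
            total = sum(values)
            if total == 0:
                continue  # gene not expressed anywhere, no profile to score
            profile = [v / total for v in values]
            scores.append((gene, 1.0 - math.sqrt(jsd(profile, ideal))))
        ranked[tumor] = sorted(scores, key=lambda item: item[1], reverse=True)
    return ranked
```

A score of 1.0 means the gene is expressed only in that tumor type; uniformly expressed genes score much lower. Candidate lists would then be the top-ranked transcription factors per tumor type, with the cutoff left as a further assumption.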

2016 ◽  
Author(s):  
Isidro Cortes-Ciriano ◽  
Sejoon Lee ◽  
Woong-Yang Park ◽  
Tae-Min Kim ◽  
Peter J. Park

ABSTRACT Microsatellite instability (MSI) refers to the hypermutability of the cancer genome due to impaired DNA mismatch repair. Although MSI has been studied for decades, the large amount of sequencing data now available allows us to examine the molecular fingerprints of MSI in greater detail. Here, we analyze ~8000 exome and ~1000 whole-genome pairs across 23 cancer types. Our pan-cancer analysis reveals that the prevalence of MSI events is highly variable within and across tumor types including some in which MSI is not typically examined. We also identify genes in DNA repair and oncogenic pathways recurrently subject to MSI and uncover non-coding loci that frequently display MSI events. Finally, we propose an exome-based predictive model for the MSI phenotype that achieves high sensitivity and specificity. These results advance our understanding of the genomic drivers and consequences of MSI, and a comprehensive catalog of tumor-type specific MSI loci we have generated enables efficient panel-based MSI testing to identify patients who are likely to benefit from immunotherapy.
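
For readers less familiar with the two metrics quoted for the predictive model, the sketch below computes sensitivity and specificity from true and predicted MSI labels. It shows only the standard definitions; the function name and label encoding are assumptions, and nothing here reflects the authors' model or data.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels (truthy = MSI-positive)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Sensitivity: fraction of true MSI cases that are detected.
    # Specificity: fraction of microsatellite-stable cases correctly called stable.
    return tp / (tp + fn), tn / (tn + fp)
```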


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Yuanyuan Li ◽  
David M. Umbach ◽  
Adrienna Bingham ◽  
Qi-Jing Li ◽  
Yuan Zhuang ◽  
...  

Abstract Background: Tumor purity is the percent of cancer cells present in a sample of tumor tissue. The non-cancerous cells (immune cells, fibroblasts, etc.) have an important role in tumor biology. The ability to determine tumor purity is important to understand the roles of cancerous and non-cancerous cells in a tumor. Methods: We applied a supervised machine learning method, XGBoost, to data from 33 TCGA tumor types to predict tumor purity using RNA-seq gene expression data. Results: Across the 33 tumor types, the median correlation between observed and predicted tumor-purity ranged from 0.75 to 0.87 with small root mean square errors, suggesting that tumor purity can be accurately predicted using expression data. We further confirmed that expression levels of a ten-gene set (CSF2RB, RHOH, C1S, CCDC69, CCL22, CYTIP, POU2AF1, FGR, CCL21, and IL7R) were predictive of tumor purity regardless of tumor type. We tested whether our set of ten genes could accurately predict tumor purity of a TCGA-independent data set. We showed that expression levels from our set of ten genes were highly correlated (ρ = 0.88) with the actual observed tumor purity. Conclusions: Our analyses suggested that the ten-gene set may serve as a biomarker for tumor purity prediction using gene expression data.
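
The two evaluation measures mentioned above, correlation between observed and predicted purity and root mean square error, can be sketched as follows. The abstract does not say which correlation coefficient was used, so Pearson's r here is an assumption, and the function names are illustrative.

```python
import math

def rmse(observed, predicted):
    # Root mean square error between paired purity values (fractions in [0, 1]).
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def pearson(x, y):
    # Pearson correlation coefficient, computed from centered sums.
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)
```

In practice both would be computed on held-out samples, for example within cross-validation folds, so that the reported values reflect predictions for tumors the model has not seen.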


Cancers ◽  
2019 ◽  
Vol 11 (11) ◽  
pp. 1810 ◽  
Author(s):  
Joe Ibrahim ◽  
Ken Op de Beeck ◽  
Erik Fransen ◽  
Marc Peeters ◽  
Guy Van Camp

Due to the elevated rates of incidence and mortality of cancer, early and accurate detection is crucial for achieving optimal treatment. Molecular biomarkers remain important screening and detection tools, especially in light of novel blood-based assays. DNA methylation in cancer has been linked to tumorigenesis, but its value as a biomarker has not been fully explored. In this study, we have investigated the methylation patterns of the Gasdermin E gene across 14 different tumor types using The Cancer Genome Atlas (TCGA) methylation data (N = 6502). We were able to identify six CpG sites that could effectively distinguish tumors from normal samples in a pan-cancer setting (AUC = 0.86). This combination of pan-cancer biomarkers was validated in six independent datasets (AUC = 0.84–0.97). Moreover, we tested 74,613 different combinations of six CpG probes, where we identified tumor-specific signatures that could differentiate one tumor type versus all the others (AUC = 0.79–0.98). In all, methylation patterns exhibited great variation between cancer and normal tissues, but were also tumor specific. Our analyses highlight that a Gasdermin E methylation biomarker assay not only has the potential to be a methylation-specific pan-cancer detection marker, but also possesses the capacity to discriminate between different types of tumors.
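
Two quantities in this abstract lend themselves to a quick check. The figure 74,613 equals the number of ways to choose six items from 22, which would be consistent with a pool of 22 candidate probes, although the abstract does not state the pool size, so that reading is an inference. The AUC values can be understood as the probability that a randomly chosen tumor sample scores higher than a randomly chosen comparison sample; the rank-based function below is a generic sketch of that definition, not the authors' pipeline.

```python
import math
from itertools import combinations

# Number of distinct six-probe combinations from a hypothetical pool of 22 probes.
n_combinations = math.comb(22, 6)  # 74613

def auc(positive_scores, negative_scores):
    """Rank-based AUC: share of (positive, negative) pairs ordered correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

def probe_sets(probe_ids, size=6):
    # Lazily enumerate every candidate probe combination of the given size.
    return combinations(probe_ids, size)
```

Scoring each combination would require a classifier or summary score per sample, which the abstract does not specify and which is therefore left out.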


2020 ◽  
Vol 223 (14) ◽  
pp. jeb221622
Author(s):  
Sarah M. Ryan ◽  
Kaitie Wildman ◽  
Briseida Oceguera-Perez ◽  
Scott Barbee ◽  
Nathan T. Mortimer ◽  
...  

ABSTRACT As organisms are constantly exposed to the damaging effects of oxidative stress through both environmental exposure and internal metabolic processes, they have evolved a variety of mechanisms to cope with this stress. One such mechanism is the highly conserved p38 MAPK (p38K) pathway, which is known to be post-translationally activated in response to oxidative stress, resulting in the activation of downstream antioxidant targets. However, little is known about the role of p38K transcriptional regulation in response to oxidative stress. Therefore, we analyzed the p38K gene family across the genus Drosophila to identify conserved regulatory elements. We found that oxidative stress exposure results in increased p38K protein levels in multiple Drosophila species and is associated with increased oxidative stress resistance. We also found that the p38Kb genomic locus includes conserved AP-1 and lola-PT transcription factor consensus binding sites. Accordingly, over-expression of these transcription factors in D. melanogaster is sufficient to induce transcription of p38Kb and enhances resistance to oxidative stress. We further found that the presence of a putative lola-PT binding site in the p38Kb locus of a given species is predictive of the species' survival in response to oxidative stress. Through our comparative genomics approach, we have identified biologically relevant putative transcription factor binding sites that regulate the expression of p38Kb and are associated with resistance to oxidative stress. These findings reveal a novel mode of regulation for p38K genes and suggest that transcription may play as important a role in p38K-mediated stress responses as post-translational modifications.
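
Scanning a locus for consensus binding sites, as described above, can be sketched with a simple pattern search. The abstract does not give the motifs that were used; the AP-1 pattern below is the commonly cited TRE consensus TGA(C/G)TCA and is an assumption here, as are the function names and the input layout. No lola-PT pattern is included because none is stated in the text.

```python
import re

# Commonly cited AP-1 (TRE) consensus written as a regular expression.
# Assumption for illustration; not taken from the abstract.
AP1_PATTERN = "TGA[CG]TCA"

def find_sites(sequence, pattern):
    """Return (0-based start, matched text) for every forward-strand match.

    A lookahead wrapper is used so that overlapping matches are also reported.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in re.finditer(f"(?=({pattern}))", seq)]

def species_with_site(loci, pattern):
    """loci: dict species -> locus sequence (hypothetical input).

    Returns the sorted species names whose locus has at least one match.
    """
    return sorted(sp for sp, seq in loci.items() if find_sites(seq, pattern))
```

A comparative analysis would additionally require aligned loci and reverse-strand scanning for non-palindromic motifs; both are omitted to keep the sketch short.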


Cancers ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 1546 ◽  
Author(s):  
Alena Kopkova ◽  
Jiri Sana ◽  
Tana Machackova ◽  
Marek Vecera ◽  
Lenka Radova ◽  
...  

Central nervous system (CNS) malignancies include primary tumors that originate within the CNS as well as secondary tumors that develop as a result of metastatic spread. Circulating microRNAs (miRNAs) have been found in almost all human body fluids, including cerebrospinal fluid (CSF), and they seem to be highly stable and resistant to even extreme conditions. The overall aim of our study was to identify specific CSF miRNA patterns that could differentiate among brain tumors. These new biomarkers could potentially help resolve borderline or uncertain imaging results in the diagnosis of CNS malignancies, avoiding the most invasive procedures such as stereotactic biopsy or biopsy. In total, 175 brain tumor patients (glioblastomas, low-grade gliomas, meningiomas and brain metastases) and 40 non-tumor patients with hydrocephalus as controls were included in this prospective monocentric study. Firstly, we performed high-throughput miRNA profiling (Illumina small RNA sequencing) on a discovery cohort of 70 patients and 19 controls and identified specific miRNA signatures of all brain tumor types tested. Secondly, validation of 9 candidate miRNAs was carried out on an independent cohort of 105 brain tumor patients and 21 controls using qRT-PCR. Based on the successful results of validation and various combination patterns of only 5 miRNA levels (miR-30e, miR-140, let-7b, miR-10a and miR-21-3p), we proposed CSF-diagnostic scores for each tumor type, which made it possible to distinguish them from healthy donors and other tumor types tested. In addition to this primary diagnostic tool, we described the prognostic potential of the combination of miR-10b and miR-196b levels in CSF of glioblastoma patients. In conclusion, we performed the largest study so far focused on CSF miRNA profiling in patients with brain tumors, and we believe that this new class of biomarkers has strong potential as a diagnostic and prognostic tool in these patients.


2017 ◽  
Vol 35 (15_suppl) ◽  
pp. e23141-e23141
Author(s):  
Juan Carlos Malpartida ◽  
Eric Vick ◽  
Noah Hunter Richardson ◽  
Kruti Patel ◽  
Matthew K Stein ◽  
...  

e23141 Background: Discovered as a novel aberration in congenital fibrosarcoma (CF), the ETV6-NTRK3 translocation (EN) confers oncogenic potential and is inhibited by crizotinib. The present study aims to survey the scope of neoplasms that harbor EN across tumor types. Methods: Utilizing the National Cancer Institute’s Mitelman Database (MD) of Chromosome Aberrations and Gene Fusions, patients (pts) with EN were identified and categorized based on tumor type, subtype and incidence. Cancer pts who received tumor profiling with Caris were also surveyed for EN. Results: 47 pts with EN across 12 cancer types were extracted from the MD and had a median age of 0.17 years (7 unreported); 38% male; 51% acquired malignancies, 49% congenital; 62% of cases were pediatric, 23% adult and 15% unknown. 0/204 pts with Caris tumor profiling were found to have an EN. Cancers with the highest number of EN were: 15 (31.9% of the EN data set) congenital mesoblastic nephromas (CMN), 10 (21.3%) CF, 7 (14.9%) breast carcinoma (BC; 6 secretory ductal carcinoma (SD) and 1 invasive adenocarcinoma (IA)) and 3 (6.4%) colorectal carcinoma (CRC). EN were found in 8 other malignancies (Table 1). Cancer types with the highest incidence of EN+ cases in the MD were gastrointestinal stromal tumor (GIST; 100%), CMN (75%) and CF (23.3%). Conclusions: These results further our understanding of the distribution of ETV6-NTRK3 translocations in multiple tumor types across the age spectrum and suggest that pts with CMN, CF, BC and CRC requiring high order therapy should be considered for NTRK3-based treatment. [Table: see text]
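
The percentages quoted for the four most frequent tumor types follow directly from the case counts and the total of 47 patients. The short check below reproduces them; the variable names are illustrative, and only numbers stated in the abstract are used.

```python
def percent(part, whole, digits=1):
    # Share of `part` in `whole`, as a percentage rounded to `digits` decimals.
    return round(100 * part / whole, digits)

total_en_cases = 47
reported = {"CMN": 15, "CF": 10, "BC": 7, "CRC": 3}

# Percentage of the EN data set contributed by each listed tumor type.
shares = {name: percent(count, total_en_cases) for name, count in reported.items()}

# Cases not covered by the four listed tumor types.
remaining_cases = total_en_cases - sum(reported.values())
```

The four listed tumor types account for 35 of the 47 cases, which leaves 12 cases spread over the 8 other malignancies and matches the stated total of 12 cancer types.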


2007 ◽  
Vol 4 (2) ◽  
pp. 1-23
Author(s):  
Amitava Karmaker ◽  
Kihoon Yoon ◽  
Mark Doderer ◽  
Russell Kruzelock ◽  
Stephen Kwek

Summary: Revealing the complex interactions between trans- and cis-regulatory elements and identifying their potential binding sites are fundamental problems in understanding gene expression. Progress in ChIP-chip technology facilitates identifying DNA sequences that are recognized by a specific transcription factor. However, protein-DNA binding is a necessary, but not sufficient, condition for transcriptional regulation. To further confirm a regulatory relationship, we need to demonstrate that the expression levels of the transcription factor and its target gene are correlated. Here, instead of using a linear correlation coefficient, we used a non-linear function that seems to better capture possible regulatory relationships. By analyzing tissue-specific gene expression profiles of human and mouse, we delineate a list of transcription factor-gene pairs with highly correlated expression levels, which may have regulatory relationships. Using two closely related species (human and mouse), we perform comparative genome analysis to cross-validate the quality of our predictions. Our findings are confirmed by matching against publicly available TFBS databases (such as TRANSFAC and ConSite) and by reviewing the biological literature. For example, according to our analysis, 80% and 85.71% of the target genes associated with the E2F5 and RELB transcription factors, respectively, have the corresponding known binding sites. We also substantiated our results on some oncogenes with the biomedical literature. Moreover, we performed further analysis on them and found that BCR and DEK may be regulated by some common transcription factors. Similar results for BTG1, FCGR2B and LCK genes were also reported.
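
To illustrate why a non-linear measure can capture a regulatory relationship that a linear coefficient understates, the sketch below compares Pearson's r with Spearman's rank correlation on paired expression values. The abstract does not name the non-linear function the authors used, so Spearman's coefficient is a stand-in chosen for illustration, and the function names are assumptions.

```python
import math

def pearson(x, y):
    # Linear (Pearson) correlation coefficient.
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def average_ranks(values):
    # 1-based ranks; tied values share the mean of the positions they occupy.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    # Rank correlation: Pearson's r applied to the ranks of each variable.
    return pearson(average_ranks(x), average_ranks(y))
```

For a transcription factor whose target responds exponentially to its expression across tissues, the rank correlation is 1 while the linear coefficient is noticeably lower, which is the kind of relationship a linear measure would rank too low.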


2021 ◽  
Author(s):  
Jonathan P. Karr ◽  
John J. Ferrie ◽  
Robert Tjian ◽  
Xavier Darzacq

How distal cis-regulatory elements (e.g., enhancers) communicate with promoters remains an unresolved question of fundamental importance. Although transcription factors and cofactors are known to mediate this communication, the mechanism by which diffusible molecules relay regulatory information from one position to another along the chromosome is a biophysical puzzle—one that needs to be revisited in light of recent data that cannot easily fit into previous solutions. Here we propose a new model that diverges from the textbook enhancer–promoter looping paradigm and offer a synthesis of the literature to make a case for its plausibility, focusing on the coactivator p300.


2018 ◽  
Author(s):  
Boyu Lyu ◽  
Anamul Haque

ABSTRACT Differential analysis occupies the most significant portion of the standard practices of RNA-Seq analysis. However, the conventional method matches tumor samples to normal samples from the same tumor type. The output of such a method fails to differentiate tumor types because it lacks knowledge from other tumor types. The Pan-Cancer Atlas provides abundant information on 33 prevalent tumor types, which could be used as prior knowledge to generate tumor-specific biomarkers. In this paper, we embedded the high-dimensional RNA-Seq data into 2-D images and used a convolutional neural network to classify the 33 tumor types. The final accuracy was 95.59%, higher than that of another paper applying the GA/KNN method to the same dataset. Based on the idea of Guided Grad-CAM, we generated a significance heat-map over all genes for each class. By performing functional analysis on the genes with high intensities in the heat-maps, we validated that these top genes are related to tumor-specific pathways, and some of them have already been used as biomarkers, which proved the effectiveness of our method. As far as we know, we are the first to apply a convolutional neural network to the Pan-Cancer Atlas for classification, and we are also the first to match the significance of classification with the importance of genes. Our experimental results show that our method performs well and could also be applied to other genomics data.
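
The step of embedding an expression vector into a 2-D image can be sketched as a pad-and-reshape operation. The abstract does not describe how genes were ordered or normalized before embedding, so the row-major layout, min-max scaling and zero padding below are assumptions, and the function name is illustrative.

```python
import math

def vector_to_image(values, fill=0.0):
    """Arrange a 1-D expression vector as a square 2-D grid (list of rows).

    Values are min-max scaled to [0, 1], laid out row by row, and the unused
    cells at the end of the last rows are filled with `fill`.
    """
    if not values:
        raise ValueError("expression vector is empty")
    side = math.isqrt(len(values) - 1) + 1  # smallest side with side*side >= len
    lo, hi = min(values), max(values)
    span = hi - lo
    scaled = [(v - lo) / span if span else 0.0 for v in values]
    padded = scaled + [fill] * (side * side - len(scaled))
    return [padded[r * side:(r + 1) * side] for r in range(side)]
```

Each resulting grid can be treated as a single-channel image for a convolutional network. Because convolutions exploit local neighborhoods, the choice of gene ordering affects what the network can learn, and that choice is left open here.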

