MAGIC: A tool for predicting transcription factors and cofactors binding sites in gene sets using ENCODE data

2018 ◽  
Author(s):  
A Roopra

ABSTRACT Transcriptomic profiling is an immensely powerful hypothesis-generating tool. Whether one is comparing an experimental versus control condition or collecting transcriptomes from cohorts of disease tissue, it is often necessary to determine which transcription factors (TFs) and cofactors drive programs of gene expression in the datasets. Most available tools rely on searching for TF binding motifs near promoters of genes in a gene set. This approach can work well for TFs with extended recognition elements but is less useful for shorter elements and does not work at all for cofactors. The Encyclopedia of DNA Elements (ENCODE) archives ChIPseq tracks of 169 TFs and cofactors assayed in 91 cell lines. The algorithm presented herein, Multiple Aligned Genomic Integration of ChIP (MAGIC), uses ENCODE ChIPseq data to look for statistical enrichment of TFs and cofactors in gene bodies and flanking regions in gene sets. When compared to a commonly used web resource, o-Possum, MAGIC more accurately predicted the TFs and cofactors that drive gene changes in 3 settings: 1) a cell line expressing or lacking REST, 2) breast tumors divided along PAM50 designations, and 3) whole brain samples from WT mice or mice lacking CTCF in a particular neuronal subtype. In summary, MAGIC is a standalone application that runs on OSX machines and has a simple interface that produces meaningful predictions of which TFs and cofactors are enriched in a gene set.
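The abstract does not state which statistic MAGIC uses, so the following is only a minimal sketch of the general idea of testing whether a factor is enriched in a gene set. It assumes each gene has already been called "bound" or "unbound" from ChIP-seq data and substitutes a one-sided hypergeometric over-representation test. The function name, gene identifiers, and counts are hypothetical.

```python
from math import comb


def hypergeom_tail(n_universe, n_bound, n_set, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    n_universe: number of genes considered in total
    n_bound:    number of those genes bound by the factor (assumed known)
    n_set:      size of the query gene set
    n_overlap:  number of query genes that are bound
    """
    upper = min(n_bound, n_set)
    total = comb(n_universe, n_set)
    tail = sum(
        comb(n_bound, k) * comb(n_universe - n_bound, n_set - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy data (hypothetical): 20 genes, 5 of them bound by the factor.
universe = {f"gene{i}" for i in range(20)}
bound = {f"gene{i}" for i in range(5)}
gene_set = {"gene0", "gene1", "gene2", "gene10"}

overlap = len(gene_set & bound)
p_value = hypergeom_tail(len(universe), len(bound), len(gene_set), overlap)
print(f"overlap={overlap}, p={p_value:.4f}")
```

A small p-value suggests that the factor binds more genes in the set than expected by chance. A real analysis would also correct for testing many factors at once.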

2016 ◽  
Vol 473 (13) ◽  
pp. 1967-1976 ◽  
Author(s):  
Katia Peñuelas-Urquides ◽  
Carolina Becerril-Esquivel ◽  
Laura C. Mendoza-de-León ◽  
Beatriz Silva-Ramírez ◽  
José Dávila-Velderrain ◽  
...  

Dystrophin Dp71, the smallest product encoded by the Duchenne muscular dystrophy gene, is ubiquitously expressed in all non-muscle cells. Although Dp71 is involved in various cellular processes, the mechanisms underlying its expression have been little studied. In hepatic cells, Dp71 expression is down-regulated by the xenobiotic β-naphthoflavone. However, the effectors of this regulation remain unknown. In the present study we aimed at identifying DNA elements and transcription factors involved in Dp71 expression in hepatic cells. Relevant DNA elements on the Dp71 promoter were identified by comparing Dp71 5′-end flanking regions between species. The functionality of these elements was demonstrated by site-directed mutagenesis. Using EMSAs and ChIP, we showed that the Sp1 (specificity protein 1), Sp3 (specificity protein 3) and YY1 (Yin and Yang 1) transcription factors bind to the Dp71 promoter region. Knockdown of Sp1, Sp3 and YY1 in hepatic cells increased endogenous Dp71 expression, but reduced Dp71 promoter activity. In summary, Dp71 expression in hepatic cells is carried out, in part, by YY1-, Sp1- and Sp3-mediated transcription from the Dp71 promoter.


2018 ◽  
Author(s):  
Zhenjia Wang ◽  
Mete Civelek ◽  
Clint L. Miller ◽  
Nathan C. Sheffield ◽  
Michael J. Guertin ◽  
...  

Abstract Summary: Identification of functional transcription factors that regulate a given gene set is an important problem in gene regulation studies. Conventional approaches for identifying transcription factors, such as DNA sequence motif analysis, are unable to predict functional binding of specific factors and are not sensitive enough to detect factors binding at distal enhancers. Here we present Binding Analysis for Regulation of Transcription (BART), a novel computational method and software package for predicting functional transcription factors that regulate a query gene set or associate with a query genomic profile, based on more than 6,000 existing ChIP-seq datasets for over 400 factors in human or mouse. This method demonstrates the advantage of utilizing publicly available data for functional genomics research. Availability: BART is implemented in Python and available at http://faculty.virginia.edu/zanglab/bart Contact: [email protected]


2018 ◽  
Vol 21 (2) ◽  
pp. 74-83
Author(s):  
Tzu-Hung Hsiao ◽  
Yu-Chiao Chiu ◽  
Yu-Heng Chen ◽  
Yu-Ching Hsu ◽  
Hung-I Harry Chen ◽  
...  

Aim and Objective: The number of anticancer drugs available currently is limited, and some of them have low treatment response rates. Moreover, developing a new drug for cancer therapy is labor intensive and sometimes cost prohibitive. Therefore, “repositioning” of known cancer treatment compounds can speed up the development time and potentially increase the response rate of cancer therapy. This study proposes a systems biology method for identifying new compound candidates for cancer treatment in two separate procedures. Materials and Methods: First, a “gene set–compound” network was constructed by conducting gene set enrichment analysis on the expression profile of responses to a compound. Second, survival analyses were applied to gene expression profiles derived from four breast cancer patient cohorts to identify gene sets that are associated with cancer survival. A “cancer–functional gene set–compound” network was constructed, and candidate anticancer compounds were identified. Through the use of breast cancer as an example, 162 breast cancer survival-associated gene sets and 172 putative compounds were obtained. Results: We demonstrated how to utilize the clinical relevance of previous studies through gene sets and then connect it to candidate compounds by using gene expression data from the Connectivity Map. Specifically, we chose a gene set derived from a stem cell study to demonstrate its association with breast cancer prognosis and discussed six new compounds that can increase the expression of the gene set after the treatment. Conclusion: Our method can effectively identify compounds with a potential to be “repositioned” for cancer treatment according to their active mechanisms and their association with patients’ survival time.


2020 ◽  
Vol 15 ◽  
Author(s):  
Chen-An Tsai ◽  
James J. Chen

Background: Gene set enrichment analyses (GSEA) provide a useful and powerful approach to identify differentially expressed gene sets with prior biological knowledge. Several GSEA algorithms have been proposed to perform enrichment analyses on groups of genes. However, many of these algorithms have focused on identification of differentially expressed gene sets in a given phenotype. Objective: In this paper, we propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify sets of genes enriched for differential expression and highly co-related pathways. Methods: We apply co-inertia analysis to the comparisons of cross-gene sets in gene expression data to measure the co-structure of expression profiles in pairs of gene sets. Co-inertia analysis (CIA) is a multivariate method for identifying trends or co-relationships in multiple datasets that contain the same samples. The objective of CIA is to seek ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. Simulation studies illustrate that CIA offers superior performance in identifying co-relationships between gene sets in all simulation settings when compared to correlation-based gene set methods. Result and Conclusion: We also combine between-gene set CIA and GSEA to discover the relationships between gene sets significantly associated with phenotypes. In addition, we provide a graphical technique for visualizing and simultaneously exploring the associations between and within gene sets and their interaction and network. We then demonstrate integration of within- and between-gene-set variation using CIA and GSEA, applied to the p53 gene expression data using the c2 curated gene sets. Ultimately, the GSCoA approach provides an attractive tool for identification and visualization of novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
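The co-inertia objective described above can be written compactly under simplifying assumptions that the abstract does not spell out: uniform sample weights, identity metrics, and column-centered expression matrices X (n samples by p genes) and Y (n samples by q genes) for the two gene sets. The symbols u, v, and σ are introduced here only for illustration.

```latex
\max_{\|u\| = \|v\| = 1} \operatorname{cov}^{2}(Xu,\, Yv)
  = \max_{\|u\| = \|v\| = 1} \left( \frac{1}{n}\, u^{\top} X^{\top} Y\, v \right)^{2}
  = \frac{\sigma_{1}^{2}}{n^{2}},
\qquad
X^{\top} Y = \sum_{k} \sigma_{k}\, u_{k} v_{k}^{\top}
```

Under these assumptions the first pair of axes is given by the leading singular vectors of the cross-product matrix, and successive axes by the subsequent singular vector pairs.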


2019 ◽  
Vol 8 (10) ◽  
pp. 1580 ◽  
Author(s):  
Kyoung Min Moon ◽  
Kyueng-Whan Min ◽  
Mi-Hye Kim ◽  
Dong-Hoon Kim ◽  
Byoung Kwan Son ◽  
...  

Ninety percent of patients with scrub typhus (SC) with vasculitis-like syndrome recover after mild symptoms; however, 10% can suffer serious complications, such as acute respiratory failure (ARF) and admission to the intensive care unit (ICU). Predictors for the progression of SC have not yet been established, and conventional scoring systems for ICU patients are insufficient to predict severity. We aimed to identify simple and robust indicators to predict aggressive behaviors of SC. We evaluated 91 patients with SC and 81 non-SC patients who were admitted to the ICU, and 32 cases from the public functional genomics data repository for gene expression analysis. We analyzed the relationships between several predictors and clinicopathological characteristics in patients with SC. We performed gene set enrichment analysis (GSEA) to identify SC-specific gene sets. The acid-base imbalance (ABI), measured 24 h before serious complications, was higher in patients with SC than in non-SC patients. A high ABI was associated with an increased incidence of ARF, leading to mechanical ventilation and worse survival. GSEA revealed that SC correlated to gene sets reflecting inflammation/apoptotic response and airway inflammation. ABI can be used to indicate ARF in patients with SC and assist with early detection.


2018 ◽  
Vol 314 (4) ◽  
pp. L617-L625 ◽  
Author(s):  
Arjun Mohan ◽  
Anagha Malur ◽  
Matthew McPeek ◽  
Barbara P. Barna ◽  
Lynn M. Schnapp ◽  
...  

To advance our understanding of the pathobiology of sarcoidosis, we developed a multiwall carbon nanotube (MWCNT)-based murine model that shows marked histological and inflammatory signal similarities to this disease. In this study, we compared the alveolar macrophage transcriptional signatures of our animal model with human sarcoidosis to identify overlapping molecular programs. Whole genome microarrays were used to assess gene expression of alveolar macrophages in six MWCNT-exposed and six control animals. The results were compared with the transcriptional profiles of alveolar immune cells in 15 sarcoidosis patients and 12 healthy humans. Rigorous statistical methods were used to identify differentially expressed genes. To better elucidate activated pathways, integrated network and gene set enrichment analysis (GSEA) was performed. We identified over 1,000 differentially expressed genes between control and MWCNT mice. Gene ontology functional analysis showed overrepresentation of processes primarily involved in immunity and inflammation in MWCNT mice. Applying GSEA to both mouse and human samples revealed upregulation of 92 gene sets in MWCNT mice and 142 gene sets in sarcoidosis patients. Commonly activated pathways in both MWCNT mice and sarcoidosis included adaptive immunity, T-cell signaling, IL-12/IL-17 signaling, and oxidative phosphorylation. Differences in gene set enrichment between MWCNT mice and sarcoidosis patients were also observed. We applied network analysis to differentially expressed genes common between the MWCNT model and sarcoidosis to identify key drivers of disease. In conclusion, an integrated network and transcriptomics approach revealed substantial functional similarities between a murine model and human sarcoidosis, particularly with respect to activation of immune-specific pathways.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Mike Fang ◽  
Brian Richardson ◽  
Cheryl M. Cameron ◽  
Jean-Eudes Dazard ◽  
Mark J. Cameron

Abstract Background: In this study, we demonstrate that our modified Gene Set Enrichment Analysis (GSEA) method, drug perturbation GSEA (dpGSEA), can detect phenotypically relevant drug targets through a unique transcriptomic enrichment that emphasizes biological directionality of drug-derived gene sets. Results: We detail our dpGSEA method and show its effectiveness in detecting specific perturbation of drugs in independent public datasets by confirming fluvastatin, paclitaxel, and rosiglitazone perturbation in gastroenteropancreatic neuroendocrine tumor cells. In drug discovery experiments, we found that dpGSEA was able to detect phenotypically relevant drug targets in previously published differentially expressed genes of CD4+ T regulatory cells from immune responders and non-responders to antiviral therapy in HIV-infected individuals, such as those involved with virion replication, cell cycle dysfunction, and mitochondrial dysfunction. dpGSEA is publicly available at https://github.com/sxf296/drug_targeting. Conclusions: dpGSEA is an approach that uniquely enriches on drug-defined gene sets while considering directionality of gene modulation. We recommend dpGSEA as an exploratory tool to screen for possible drug targeting molecules.
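The abstract does not give the dpGSEA formula, so the sketch below is not that method. It shows only the classical unweighted running-sum enrichment score that GSEA-style methods build on, as an illustration of how the sign of a score can carry direction. The function name and gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov-like).

    ranked_genes: genes ordered from most up-regulated to most down-regulated
    gene_set:     the set of genes being tested
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up on a set member, step down on a non-member.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranked = ["A", "B", "C", "D", "E", "F"]  # toy ranking, hypothetical genes
es_top = enrichment_score(ranked, {"A", "B"})
es_bottom = enrichment_score(ranked, {"E", "F"})
print(es_top, es_bottom)
```

A positive score means the set members cluster near the top of the ranking and a negative score means they cluster near the bottom. Significance would normally be assessed by permutation, which is omitted here.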


2020 ◽  
Vol 86 (9) ◽  
Author(s):  
Gaili Fan ◽  
Huawei Zheng ◽  
Kai Zhang ◽  
Veena Devi Ganeshan ◽  
Stephen Obol Opiyo ◽  
...  

ABSTRACT The homeobox gene family of transcription factors (HTF) controls many developmental pathways and physiological processes in eukaryotes. We previously showed that a conserved HTF in the plant-pathogenic fungus Fusarium graminearum, Htf1 (FgHtf1), regulates conidium morphology in that organism. This study investigated the mechanism of FgHtf1-mediated regulation and identified putative FgHtf1 target genes by a chromatin immunoprecipitation assay combined with parallel DNA sequencing (ChIP-seq) and RNA sequencing. A total of 186 potential binding peaks, including 142 genes directly regulated by FgHtf1, were identified. Subsequent motif prediction analysis identified two DNA-binding motifs, TAAT and CTTGT. Among the FgHtf1 target genes were FgHTF1 itself and several important conidiation-related genes (e.g., FgCON7), the chitin synthase pathway genes, and the aurofusarin biosynthetic pathway genes. In addition, FgHtf1 may regulate the cAMP-protein kinase A (PKA)-Msn2/4 and Ca2+-calcineurin-Crz1 pathways. Taken together, these results suggest that, in addition to autoregulation, FgHtf1 also controls global gene expression and promotes a shift to aerial growth and conidiation in F. graminearum by activation of FgCON7 or other conidiation-related genes. IMPORTANCE The homeobox gene family of transcription factors is known to be involved in the development and conidiation of filamentous fungi. However, the regulatory mechanisms and downstream targets of homeobox genes remain unclear. FgHtf1 is a homeobox transcription factor that is required for phialide development and conidiogenesis in the plant pathogen F. graminearum. In this study, we identified FgHtf1-controlled target genes and binding motifs. We found that, besides autoregulation, FgHtf1 also controls global gene expression and promotes conidiation in F. graminearum by activation of genes necessary for aerial growth, FgCON7, and other conidiation-related genes.
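As a rough illustration of what scanning for the two reported motifs (TAAT and CTTGT) involves, the sketch below searches both strands of a DNA string for exact matches. It is not the motif prediction pipeline used in the study, and the example sequence is made up.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_sites(sequence, motifs=("TAAT", "CTTGT")):
    """List exact motif matches as (start, strand, motif), 0-based on the + strand."""
    sequence = sequence.upper()
    sites = []
    for motif in motifs:
        # A match to the reverse complement is a minus-strand occurrence.
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = sequence.find(pattern)
            while start != -1:
                sites.append((start, strand, motif))
                start = sequence.find(pattern, start + 1)  # allow overlaps
    return sorted(sites)


example = "GGTAATCCACAAGTT"  # hypothetical sequence, not from the paper
sites = find_motif_sites(example)
for start, strand, motif in sites:
    print(start, strand, motif)
```

Because a 4-bp motif such as TAAT is expected roughly once every 256 bp per strand in random sequence, exact matches alone are weak evidence of binding, which is why such scans are usually combined with ChIP-seq peaks.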

