CRUP: a comprehensive framework to predict condition-specific regulatory units

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Anna Ramisch ◽  
Verena Heinrich ◽  
Laura V. Glaser ◽  
Alisa Fuchs ◽  
Xinyi Yang ◽  
...  

Abstract We present the software Condition-specific Regulatory Units Prediction (CRUP) to infer from epigenetic marks a list of regulatory units consisting of dynamically changing enhancers with their target genes. The workflow consists of a novel pre-trained enhancer predictor that can be reliably applied across cell types and species, solely based on histone modification ChIP-seq data. Enhancers are subsequently assigned to different conditions and correlated with gene expression to derive regulatory units. We thoroughly test and then apply CRUP to a rheumatoid arthritis model, identifying enhancer-gene pairs comprising known disease genes as well as new candidate genes.

2018 ◽  
Author(s):  
Anna Ramisch ◽  
Verena Heinrich ◽  
Laura V. Glaser ◽  
Alisa Fuchs ◽  
Xinyi Yang ◽  
...  

Abstract We present the software CRUP (Condition-specific Regulatory Units Prediction) to infer from epigenetic marks a list of regulatory units consisting of dynamically changing enhancers with their target genes. The workflow consists of a novel pre-trained enhancer predictor that can be reliably applied across cell lines and species, solely based on histone modification ChIP-seq data. Enhancers are subsequently assigned to different conditions and correlated with gene expression to derive regulatory units. We thoroughly test and then apply CRUP to a rheumatoid arthritis model, identifying enhancer-gene pairs comprising known disease genes as well as new candidate genes. Availability: https://github.com/VerenaHeinrich/CRUP


Development ◽  
2000 ◽  
Vol 127 (15) ◽  
pp. 3305-3312 ◽  
Author(s):  
H.L. Ashe ◽  
M. Mannervik ◽  
M. Levine

The dorsal ectoderm of the Drosophila embryo is subdivided into different cell types by an activity gradient of two TGF-β signaling molecules, Decapentaplegic (Dpp) and Screw (Scw). Patterning responses to this gradient depend on a secreted inhibitor, Short gastrulation (Sog), and a newly identified transcriptional repressor, Brinker (Brk), which are expressed in neurogenic regions that abut the dorsal ectoderm. Here we examine the expression of a number of Dpp target genes in transgenic embryos that contain ectopic stripes of Dpp, Sog and Brk expression. These studies suggest that the Dpp/Scw activity gradient directly specifies at least three distinct thresholds of gene expression in the dorsal ectoderm of gastrulating embryos. Brk was found to repress two target genes, tailup and pannier, that exhibit different limits of expression within the dorsal ectoderm. These results suggest that the Sog inhibitor and Brk repressor work in concert to establish sharp dorsolateral limits of gene expression. We also present evidence that the activation of Dpp/Scw target genes depends on the Drosophila homolog of the CBP histone acetyltransferase.


2019 ◽  
Author(s):  
Jing Yang ◽  
Amanda McGovern ◽  
Paul Martin ◽  
Kate Duffus ◽  
Xiangyu Ge ◽  
...  

Abstract Genome-wide association studies have identified genetic variation contributing to complex disease risk. However, assigning causal genes and mechanisms has been more challenging because disease-associated variants are often found in distal regulatory regions with cell-type specific behaviours. Here, we collect ATAC-seq, Hi-C, Capture Hi-C and nuclear RNA-seq data in stimulated CD4+ T-cells over 24 hours, to identify functional enhancers regulating gene expression. We characterise changes in DNA interaction and activity dynamics that correlate with changes in gene expression, and find that the strongest correlations are observed within 200 kb of promoters. Using rheumatoid arthritis as an example of T-cell mediated disease, we demonstrate interactions of expression quantitative trait loci with target genes, and confirm assigned genes or show complex interactions for 20% of disease associated loci, including FOXO1, which we confirm using CRISPR/Cas9.


2020 ◽  
Author(s):  
SK Reilly ◽  
SJ Gosai ◽  
A Gutierrez ◽  
JC Ulirsch ◽  
M Kanai ◽  
...  

Abstract CRISPR screens for cis-regulatory elements (CREs) have shown unprecedented power to endogenously characterize the non-coding genome. To characterize CREs we developed HCR-FlowFISH (Hybridization Chain Reaction Fluorescent In-Situ Hybridization coupled with Flow Cytometry), which directly quantifies native transcripts within their endogenous loci following CRISPR perturbations of regulatory elements, eliminating the need for restrictive phenotypic assays such as growth or transcript-tagging. HCR-FlowFISH accurately quantifies gene expression across a wide range of transcript levels and cell types. We also developed CASA (CRISPR Activity Screen Analysis), a hierarchical Bayesian model to identify and quantify CRE activity. Using >270,000 perturbations, we identified CREs for GATA1, HDAC6, ERP29, LMO2, MEF2C, CD164, NMU, FEN1 and the FADS gene cluster. Our methods detect subtle gene expression changes and identify CREs regulating multiple genes, sometimes at different magnitudes and directions. We demonstrate the power of HCR-FlowFISH to parse genome-wide association signals by nominating causal variants and target genes.


Author(s):  
Nurlan Kerimov ◽  
James D Hayhurst ◽  
Kateryna Peikova ◽  
Jonathan R Manning ◽  
Peter Walter ◽  
...  

An increasing number of gene expression quantitative trait locus (eQTL) studies have made summary statistics publicly available, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and colocalisation. However, differences between these datasets, in their variants tested, allele codings, and in the transcriptional features quantified, are a barrier to their widespread use. Consequently, target genes for most GWAS signals have still not been identified. Here, we present the eQTL Catalogue (https://www.ebi.ac.uk/eqtl/), a resource which contains quality controlled, uniformly re-computed QTLs from 21 eQTL studies. We find that for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies, enabling the integrative analysis of these data. Although most cis-eQTLs were shared between most bulk tissues, the analysis of purified cell types identified a greater diversity of cell-type-specific eQTLs, a subset of which also manifested as novel disease colocalisations. Our summary statistics can be downloaded by FTP, accessed via a REST API, and visualised on the Ensembl genome browser. New datasets will continuously be added to the eQTL Catalogue, enabling the systematic interpretation of human GWAS associations across many cell types and tissues.


2021 ◽  
Vol 53 (9) ◽  
pp. 1290-1299
Author(s):  
Nurlan Kerimov ◽  
James D. Hayhurst ◽  
Kateryna Peikova ◽  
Jonathan R. Manning ◽  
Peter Walter ◽  
...  

Abstract Many gene expression quantitative trait locus (eQTL) studies have published their summary statistics, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and co-localization. However, technical differences between these datasets are a barrier to their widespread use. Consequently, target genes for most genome-wide association study (GWAS) signals have still not been identified. In the present study, we present the eQTL Catalogue (https://www.ebi.ac.uk/eqtl), a resource of quality-controlled, uniformly re-computed gene expression and splicing QTLs from 21 studies. We find that, for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies. Although most QTLs were shared between most bulk tissues, we identified a greater diversity of cell-type-specific QTLs from purified cell types, a subset of which also manifested as new disease co-localizations. Our summary statistics are freely available to enable the systematic interpretation of human GWAS associations across many cell types and tissues.


2020 ◽  
Vol 21 (23) ◽  
pp. 9052
Author(s):  
Indrek Teino ◽  
Antti Matvere ◽  
Martin Pook ◽  
Inge Varik ◽  
Laura Pajusaar ◽  
...  

Aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor, which mediates the effects of a variety of environmental stimuli in multiple tissues. Recent advances in AHR biology have underlined its importance in cells with high developmental potency, including pluripotent stem cells. Nonetheless, there is little data on AHR expression and its role during the initial stages of stem cell differentiation. The purpose of this study was to investigate the temporal pattern of AHR expression during directed differentiation of human embryonic stem cells (hESC) into neural progenitor, early mesoderm and definitive endoderm cells. Additionally, we investigated the effect of the AHR agonist 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) on the gene expression profile in hESCs and differentiated cells by RNA-seq, accompanied by identification of AHR binding sites by ChIP-seq and epigenetic landscape analysis by ATAC-seq. We showed that AHR is differentially regulated in distinct lineages. We provided evidence that TCDD alters gene expression patterns in hESCs and during early differentiation. Additionally, we identified novel potential AHR target genes, which expand our understanding on the role of this protein in different cell types.


2019 ◽  
Vol 35 (24) ◽  
pp. 5067-5077 ◽  
Author(s):  
Jiyun Zhou ◽  
Qin Lu ◽  
Lin Gui ◽  
Ruifeng Xu ◽  
Yunfei Long ◽  
...  

Abstract Motivation: The prediction of transcription factor binding sites (TFBSs) is crucial for gene expression analysis. Supervised learning approaches for TFBS predictions require large amounts of labeled data. However, many TFs of certain cell types either do not have sufficient labeled data or do not have any labeled data. Results: In this paper, a multi-task learning framework (called MTTFsite) is proposed to address the lack of labeled data by leveraging labeled data available across cell types. The proposed MTTFsite contains a shared CNN to learn common features for all cell types and a private CNN for each cell type to learn private features. The common features are intended to help predict TFBSs for all cell types, especially those cell types that lack labeled data. MTTFsite is evaluated on 241 cell type TF pairs and compared with a baseline method without any multi-task learning model and a fully shared multi-task model that uses only a shared CNN and does not use private CNNs. For cell types with insufficient labeled data, results show that MTTFsite performs better than the baseline method and the fully shared model on more than 89% of pairs. For cell types without any labeled data, MTTFsite outperforms the baseline method and the fully shared model on more than 80% and 93% of pairs, respectively. A novel gene expression prediction method (called TFChrome) using both MTTFsite and histone modification features is also presented. Results show that TFBSs predicted by MTTFsite alone can achieve good performance. When MTTFsite is combined with histone modification features, a significant 5.7% performance improvement is obtained. Availability and implementation: The resource and executable code are freely available at http://hlt.hitsz.edu.cn/MTTFsite/ and http://www.hitsz-hlt.com:8080/MTTFsite/. Supplementary information: Supplementary data are available at Bioinformatics online.


2005 ◽  
Vol 25 (23) ◽  
pp. 10235-10250 ◽  
Author(s):  
Anna H. Schuh ◽  
Alex J. Tipping ◽  
Allison J. Clark ◽  
Isla Hamlett ◽  
Boris Guyot ◽  
...  

ABSTRACT Lineage specification and cellular maturation require coordinated regulation of gene expression programs. In large part, this is dependent on the activator and repressor functions of protein complexes associated with tissue-specific transcriptional regulators. In this study, we have used a proteomic approach to characterize multiprotein complexes containing the key hematopoietic regulator SCL in erythroid and megakaryocytic cell lines. One of the novel SCL-interacting proteins identified in both cell types is the transcriptional corepressor ETO-2. Interaction between endogenous proteins was confirmed in primary cells. We then showed that SCL complexes are shared but also significantly differ in the two cell types. Importantly, SCL/ETO-2 interacts with another corepressor, Gfi-1b, in red cells but not megakaryocytes. The SCL/ETO-2/Gfi-1b association is lost during erythroid differentiation of primary fetal liver cells. Genetic studies of erythroid cells show that ETO-2 exerts a repressor effect on SCL target genes. We suggest that, through its association with SCL, ETO-2 represses gene expression in the early stages of erythroid differentiation and that alleviation/modulation of the repressive state is then required for expression of genes necessary for terminal erythroid maturation to proceed.


2021 ◽  
Author(s):  
Manuel Tavares ◽  
Garima Khandelwal ◽  
Joanne Mutter ◽  
Keijo Viiri ◽  
Manuel Beltran ◽  
...  

Polycomb repressive complex 2 (PRC2) methylates histone H3 lysine 27 (H3K27me3) to maintain repression of genes specific for other cell types and is essential for cell differentiation. In endometrial stromal sarcoma, the PRC2 subunit SUZ12 is often fused with the NuA4/TIP60 subunit JAZF1. Here, we show that JAZF1-SUZ12 dysregulates PRC2 composition, recruitment, histone modification, gene expression and cell differentiation. The loss of the SUZ12 N-terminus in the fusion protein disrupted interaction with the PRC2 accessory factors JARID2, EPOP and PALI1 and prevented recruitment of PRC2 from RNA to chromatin. In undifferentiated cells, JAZF1-SUZ12 occupied PRC2 target genes but gained a JAZF1-like binding profile during cell differentiation. JAZF1-SUZ12 reduced H3K27me3 and increased H4Kac at PRC2 target genes, and this was associated with disruption in gene expression and cell differentiation programs. These results reveal the defects in chromatin regulation caused by JAZF1-SUZ12, which may underlie its role in oncogenesis.

