DNA sequence models of genome-wide Drosophila melanogaster Polycomb binding sites improve generalization to independent Polycomb Response Elements

2019 ◽  
Vol 47 (15) ◽  
pp. 7781-7797 ◽  
Author(s):  
Bjørn André Bredesen ◽  
Marc Rehmsmeier

Abstract Polycomb Response Elements (PREs) are cis-regulatory DNA elements that maintain gene transcription states through DNA replication and mitosis. PREs have little sequence similarity, but are enriched in a number of sequence motifs. Previous methods for modelling Drosophila melanogaster PRE sequences (PREdictor and EpiPredictor) have used a set of 7 motifs and a training set of 12 PREs and 16-23 non-PREs. Advances in experimental methods for mapping chromatin binding factors and modifications have led to the publication of several genome-wide sets of Polycomb targets. In addition to the seven motifs previously used, PREs are enriched in the GTGT motif, recently associated with the sequence-specific DNA binding protein Combgap. We investigated whether models trained on genome-wide Polycomb sites generalize to independent PREs when trained with control sequences generated by naive PRE models and including the GTGT motif. We also developed a new PRE predictor: SVM-MOCCA. Training PRE predictors with genome-wide experimental data improves generalization to independent data, and SVM-MOCCA predicts the majority of PREs in three independent experimental sets. We present 2908 candidate PREs enriched in sequence and chromatin signatures. 2412 of these are also enriched in H3K4me1, a mark of Trithorax activated chromatin, suggesting that PREs/TREs have a common sequence code.

2009 ◽  
Vol 30 (3) ◽  
pp. 820-828 ◽  
Author(s):  
Melissa D. Cunningham ◽  
J. Lesley Brown ◽  
Judith A. Kassis

ABSTRACT The Polycomb group proteins (PcGs) play a vital role throughout development by maintaining precise gene expression patterns. In Drosophila melanogaster, PcG-mediated gene silencing is achieved through DNA elements called Polycomb response elements (PREs); however, the mechanism for establishing silencing and the requirements and composition of a working PRE are not fully understood. We have used the computer program jPREdictor to uncover PREs located within the invected (inv) locus. The functionalities of these predicted PREs were tested in two different assays: one analyzing their abilities to maintain expression of a β-galactosidase reporter gene and the other evaluating their abilities to establish pairing-sensitive silencing of the mini-white reporter in the vector pCaSpeR. We have identified two previously uncharacterized PREs at the inv gene and demonstrate that they produce similar results in the two assays. Our results indicate that clusters of protein binding sites do not accurately predict PREs and provide new insight into the DNA sequence requirements for the binding of the PcG protein Pho. Finally, our data show that PREs and regulatory DNA from different genes can function together to establish PcG-mediated silencing, highlighting the versatility of PREs despite discrepancies in the number and location of DNA binding sites.


2006 ◽  
Vol 38 (6) ◽  
pp. 694-699 ◽  
Author(s):  
Bas Tolhuis ◽  
Inhua Muijrers ◽  
Elzo de Wit ◽  
Hans Teunissen ◽  
Wendy Talhout ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nikolay Postika ◽  
Paul Schedl ◽  
Pavel Georgiev ◽  
Olga Kyrchanova

Abstract The autonomy of segment-specific regulatory domains in the Bithorax complex is conferred by boundary elements and associated Polycomb response elements (PREs). The Fab-6 boundary is located at the junction of the iab-5 and iab-6 domains. Previous studies mapped it to nuclease hypersensitive region 1 (HS1), while the iab-6 PRE was mapped to a second hypersensitive region, HS2, nearly 3 kb away. To analyze the role of HS1 and HS2 in boundary function, we generated deletions of HS1 or HS1 + HS2 that carry an attP site for boundary replacement experiments. The 1389 bp HS1 deletion can be rescued by a 529 bp core Fab-6 sequence that includes two CTCF sites. However, Fab-6 HS1 cannot rescue the HS1 + HS2 deletion or substitute for another BX-C boundary, Fab-7. For this it must be combined with a PRE, either Fab-7 HS3 or Fab-6 HS2. These findings suggest that the boundary function of Fab-6 HS1 must be bolstered by a second element that has PRE activity.


2021 ◽  
Vol 7 (20) ◽  
pp. eabf2229
Author(s):  
Bastian Stielow ◽  
Yuqiao Zhou ◽  
Yinghua Cao ◽  
Clara Simon ◽  
Hans-Martin Pogoda ◽  
...  

CpG islands (CGIs) are key regulatory DNA elements at most promoters, but how they influence the chromatin status and transcription remains elusive. Here, we identify and characterize SAMD1 (SAM domain-containing protein 1) as an unmethylated CGI-binding protein. SAMD1 has an atypical winged-helix domain that directly recognizes unmethylated CpG-containing DNA via simultaneous interactions with both the major and the minor groove. The SAM domain interacts with L3MBTL3, but it can also homopolymerize into a closed pentameric ring. At a genome-wide level, SAMD1 localizes to H3K4me3-decorated CGIs, where it acts as a repressor. SAMD1 tethers L3MBTL3 to chromatin and interacts with the KDM1A histone demethylase complex to modulate H3K4me2 and H3K4me3 levels at CGIs, thereby providing a mechanism for SAMD1-mediated transcriptional repression. The absence of SAMD1 impairs ES cell differentiation processes, leading to misregulation of key biological pathways. Together, our work establishes SAMD1 as a newly identified chromatin regulator acting at unmethylated CGIs.


2021 ◽  
Author(s):  
Michael Tun Yin Lam ◽  
Sascha H Duttke ◽  
Mazen Faris Odish ◽  
Hiep D Le ◽  
Emily A Hansen ◽  
...  

The contribution of transcription factors (TFs) and gene regulatory programs in the immune response to COVID-19 and their relationship to disease outcome are not fully understood. Analysis of genome-wide changes in transcription at both promoter-proximal and distal cis-regulatory DNA elements, collectively termed the 'active cistrome,' offers an unbiased assessment of TF activity identifying key pathways regulated in homeostasis or disease. Here, we profiled the active cistrome from peripheral leukocytes of critically ill COVID-19 patients to identify major regulatory programs and their dynamics during SARS-CoV-2 associated acute respiratory distress syndrome (ARDS). We identified TF motifs that track the severity of COVID-19 lung injury, disease resolution, and outcome. We used unbiased clustering to reveal distinct cistrome subsets delineating the regulation of pathways, cell types, and the combinatorial activity of TFs. We found critical roles for regulatory networks driven by stimulus and lineage determining TFs, showing that STAT and E2F/MYB regulatory programs targeting myeloid cells are activated in patients with poor disease outcomes and associated with single nucleotide genetic variants implicated in COVID-19 susceptibility. Integration with single-cell RNA-seq found that STAT and E2F/MYB activation converged in a specific neutrophil subset found in patients with severe disease. Collectively, we demonstrate that cistrome analysis facilitates insight into disease mechanisms and provides an unbiased approach to evaluate global changes in transcription factor activity and stratify patient disease severity.


2019 ◽  
Vol 30 (3) ◽  
pp. 421-441 ◽  
Author(s):  
Karsten B. Sieber ◽  
Anna Batorsky ◽  
Kyle Siebenthall ◽  
Kelly L. Hudkins ◽  
Jeff D. Vierstra ◽  
...  

Background: Linking genetic risk loci identified by genome-wide association studies (GWAS) to their causal genes remains a major challenge. Disease-associated genetic variants are concentrated in regions containing regulatory DNA elements, such as promoters and enhancers. Although researchers have previously published DNA maps of these regulatory regions for kidney tubule cells and glomerular endothelial cells, maps for podocytes and mesangial cells have not been available. Methods: We generated regulatory DNA maps (DNase-seq) and paired gene expression profiles (RNA-seq) from primary outgrowth cultures of human glomeruli that were composed mainly of podocytes and mesangial cells. We generated similar datasets from renal cortex cultures, to compare with those of the glomerular cultures. Because regulatory DNA elements can act on target genes across large genomic distances, we also generated a chromatin conformation map from freshly isolated human glomeruli. Results: We identified thousands of unique regulatory DNA elements, many located close to transcription factor genes, which the glomerular and cortex samples expressed at different levels. We found that genetic variants associated with kidney diseases (GWAS) and kidney expression quantitative trait loci were enriched in regulatory DNA regions. By combining GWAS, epigenomic, and chromatin conformation data, we functionally annotated 46 kidney disease genes. Conclusions: We demonstrate a powerful approach to functionally connect kidney disease-/trait–associated loci to their target genes by leveraging unique regulatory DNA maps and integrated epigenomic and genetic analysis. This process can be applied to other kidney cell types and will enhance our understanding of genome regulation and its effects on gene expression in kidney disease.


Blood ◽  
2019 ◽  
Vol 133 (7) ◽  
pp. 724-729 ◽  
Author(s):  
Maoxiang Qian ◽  
Heng Xu ◽  
Virginia Perez-Andreu ◽  
Kathryn G. Roberts ◽  
Hui Zhang ◽  
...  

Abstract Acute lymphoblastic leukemia (ALL) is the most common malignancy in children. Characterized by high levels of Native American ancestry, Hispanics are disproportionally affected by this cancer with high incidence and inferior survival. However, the genetic basis for this disparity remains poorly understood because of a paucity of genome-wide investigation of ALL in Hispanics. Performing a genome-wide association study (GWAS) in 940 Hispanic children with ALL and 681 ancestry-matched non-ALL controls, we identified a novel susceptibility locus in the ERG gene (rs2836365; P = 3.76 × 10⁻⁸; odds ratio [OR] = 1.56), with independent validation (P = .01; OR = 1.43). Imputation analyses pointed to a single causal variant driving the association signal at this locus overlapping with putative regulatory DNA elements. The effect size of the ERG risk variant rose with increasing Native American genetic ancestry. The ERG risk genotype was underrepresented in ALL with the ETV6-RUNX1 fusion (P < .0005) but enriched in the TCF3-PBX1 subtype (P < .05). Interestingly, ALL cases with germline ERG risk alleles were significantly less likely to have somatic ERG deletion (P < .05). Our results provide novel insights into genetic predisposition to ALL and its contribution to racial disparity in this cancer.

