Single-molecule regulatory architectures captured by chromatin fiber sequencing

Science ◽  
2020 ◽  
Vol 368 (6498) ◽  
pp. 1449-1454 ◽  
Author(s):  
Andrew B. Stergachis ◽  
Brian M. Debo ◽  
Eric Haugen ◽  
L. Stirling Churchman ◽  
John A. Stamatoyannopoulos

Gene regulation is chiefly determined at the level of individual linear chromatin molecules, yet our current understanding of cis-regulatory architectures derives from fragmented sampling of large numbers of disparate molecules. We developed an approach for precisely stenciling the structure of individual chromatin fibers onto their composite DNA templates using nonspecific DNA N6-adenine methyltransferases. Single-molecule long-read sequencing of chromatin stencils enabled nucleotide-resolution readout of the primary architecture of multikilobase chromatin fibers (Fiber-seq). Fiber-seq exposed widespread plasticity in the linear organization of individual chromatin fibers and illuminated principles guiding regulatory DNA actuation, the coordinated actuation of neighboring regulatory elements, single-molecule nucleosome positioning, and single-molecule transcription factor occupancy. Our approach and results open new vistas on the primary architecture of gene regulation.
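
As a rough illustration of how a methyltransferase "stencil" can be read back from a single molecule, the sketch below scans one read for long stretches without any m6A call, which would correspond to protected DNA such as a nucleosome footprint. The function name `protected_stretches`, the input format (0-based m6A positions on the read), and the 100-bp threshold are assumptions made for this example and do not come from the paper. It also ignores that only adenines can carry the mark.

```python
def protected_stretches(m6a_positions, read_length, min_len=100):
    """Return half-open (start, end) intervals on a read that contain no
    m6A call and span at least min_len bases.

    Hypothetical helper: the input format and the min_len threshold are
    illustrative assumptions, not the published Fiber-seq pipeline.
    """
    stretches = []
    prev = -1  # sentinel position just before the read start
    # Append read_length as a sentinel so the final stretch is also checked.
    for pos in sorted(set(m6a_positions)) + [read_length]:
        start, end = prev + 1, pos
        if end - start >= min_len:
            stretches.append((start, end))
        prev = pos
    return stretches


if __name__ == "__main__":
    calls = [10, 20, 200, 210, 400]
    print(protected_stretches(calls, read_length=500))
```

Stretches shorter than the threshold, such as the 99-base tail of the example read, are discarded.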

Author(s):  
Jeff Vierstra ◽  
John Lazar ◽  
Richard Sandstrom ◽  
Jessica Halow ◽  
Kristen Lee ◽  
...  

Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits [1], yet it remains challenging to distinguish variants that impact regulatory function [2]. Genomic DNase I footprinting enables quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin [3–5]. However, to date only a small fraction of such sites have been precisely resolved on the human genome sequence [5]. To enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate at nucleotide resolution ~4.5 million compact genomic elements encoding transcription factor occupancy. We map the fine-scale structure of ~1.6 million DNase I hypersensitive sites (DHSs) and show that the overwhelming majority are populated by well-spaced sites of single transcription factor:DNA interaction. Cell context-dependent cis-regulation is chiefly executed by wholesale actuation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We show further that the well-described enrichment of disease- and phenotypic trait-associated genetic variants in regulatory regions [1,6] is almost entirely attributable to variants localizing within footprints, and that functional variants impacting transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find that the global density of human genetic variation is markedly increased within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a new framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.


2018 ◽  
Author(s):  
Zohar Shipony ◽  
Georgi K. Marinov ◽  
Matthew P. Swaffer ◽  
Nasa A. Sinott-Armstrong ◽  
Jan M. Skotheim ◽  
...  

Active regulatory elements in eukaryotes are typically characterized by an open, nucleosome-depleted chromatin structure; mapping areas of open chromatin has accordingly emerged as a widely used tool in the arsenal of modern functional genomics. However, existing approaches for profiling chromatin accessibility are limited by their reliance on DNA fragmentation and short read sequencing, which leaves them unable to provide information about the state of chromatin on larger scales or reveal coordination between the chromatin state of individual distal regulatory elements. To address these limitations, we have developed a method for profiling accessibility of individual chromatin fibers at multi-kilobase length scale (SMAC-seq, or Single-Molecule long-read Accessible Chromatin mapping sequencing assay), enabling the simultaneous, high-resolution, single-molecule assessment of the chromatin state of distal genomic elements. Our strategy is based on combining the preferential methylation of open chromatin regions by DNA methyltransferases (CpG and GpC 5-methylcytosine (5mC) and N6-methyladenosine (m6A) enzymes) and the ability of long-read single-molecule nanopore sequencing to directly read out the methylation state of individual DNA bases. Applying SMAC-seq to the budding yeast Saccharomyces cerevisiae, we demonstrate that aggregate SMAC-seq signals match bulk-level accessibility measurements, observe single-molecule protection footprints of nucleosomes and transcription factors, and quantify the correlation between the chromatin states of distal genomic elements.
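
One simple way to picture "correlation between the chromatin states of distal genomic elements" is a Pearson correlation over per-molecule binary accessibility calls. The sketch below assumes each long read has already been reduced to a 0/1 call (closed/open) at two elements. The function name `co_accessibility` and this input format are illustrative assumptions, not the statistic used in the SMAC-seq paper.

```python
from math import sqrt


def co_accessibility(states_a, states_b):
    """Pearson correlation between two equal-length lists of 0/1 calls,
    one call per sequenced molecule (1 = element open on that molecule).

    Hypothetical helper for illustration only.
    """
    n = len(states_a)
    if n == 0 or n != len(states_b):
        raise ValueError("need two non-empty lists of equal length")
    mean_a = sum(states_a) / n
    mean_b = sum(states_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(states_a, states_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in states_a) / n
    var_b = sum((b - mean_b) ** 2 for b in states_b) / n
    if var_a == 0 or var_b == 0:
        return float("nan")  # correlation is undefined if one element never varies
    return cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    enhancer = [1, 1, 0, 0, 1, 0]
    promoter = [1, 1, 0, 0, 1, 0]
    print(co_accessibility(enhancer, promoter))
```

Because both calls come from the same molecule, a value near 1 means the two elements tend to be open together on individual fibers. Bulk short-read assays cannot measure this directly.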


2020 ◽  
Author(s):  
Charles E. Breeze ◽  
John Lazar ◽  
Tim Mercer ◽  
Jessica Halow ◽  
Ida Washington ◽  
...  

Early mammalian development is orchestrated by genome-encoded regulatory elements populated by a changing complement of regulatory factors, creating a dynamic chromatin landscape. To define the spatiotemporal organization of regulatory DNA landscapes during mouse development and maturation, we generated nucleotide-resolution DNA accessibility maps from 15 tissues sampled at 9 intervals spanning post-conception day 9.5 through early adult, and integrated these with 41 adult-stage DNase-seq profiles to create a global atlas of mouse regulatory DNA. Collectively, we delineated >1.8 million DNase I hypersensitive sites (DHSs), with the vast majority displaying temporal and tissue-selective patterning. Here we show that tissue regulatory DNA compartments show sharp embryonic-to-fetal transitions characterized by wholesale turnover of DHSs and progressive domination by a diminishing number of transcription factors. We show further that aligning mouse and human fetal development on a regulatory axis exposes disease-associated variation enriched in early intervals lacking human samples. Our results provide an expansive new resource for decoding mammalian developmental regulatory programs.


Author(s):  
Zhiyi Sun ◽  
Romualdas Vaisvila ◽  
Bo Yan ◽  
Chloe Baum ◽  
Lana Saleh ◽  
...  

The predominant methodology for DNA methylation analysis relies on the chemical deamination by sodium bisulfite of unmodified cytosine to uracil to permit the differential readout of methylated cytosines. Bisulfite treatment damages the DNA, leading to fragmentation and loss of long-range methylation information. To overcome this limitation of bisulfite-treated DNA, we applied a new enzymatic deamination approach, termed EM-seq (Enzymatic Methyl-seq), to long-range sequencing technologies. Our methodology, named LR-EM-seq (Long Range Enzymatic Methyl-seq), preserves the integrity of DNA, allowing long-range methylation profiling of 5-mC and 5-hmC over several kilobases of genomic DNA. When applied to known differentially methylated regions (DMRs), LR-EM-seq achieves phasing of over 5 kb, resulting in broader and better-defined DMRs than previously reported. This result demonstrates the importance of phasing methylation for biologically relevant questions and the applicability of LR-EM-seq for long-range epigenetic analysis at single-molecule and single-nucleotide resolution.


2018 ◽  
Vol 217 (4) ◽  
pp. 1181-1191 ◽  
Author(s):  
Zhe Liu ◽  
Robert Tjian

The assembly of sequence-specific enhancer-binding transcription factors (TFs) at cis-regulatory elements in the genome has long been regarded as the fundamental mechanism driving cell type–specific gene expression. However, despite extensive biochemical, genetic, and genomic studies in the past three decades, our understanding of molecular mechanisms underlying enhancer-mediated gene regulation remains incomplete. Recent advances in imaging technologies now enable direct visualization of TF-driven regulatory events and transcriptional activities at the single-cell, single-molecule level. The ability to observe the remarkably dynamic behavior of individual TFs in live cells at high spatiotemporal resolution has begun to provide novel mechanistic insights and promises new advances in deciphering causal–functional relationships of TF targeting, genome organization, and gene activation. In this review, we survey current transcription imaging techniques and summarize converging results from various lines of research that may instigate a revision of models to describe key features of eukaryotic gene regulation.


2019 ◽  
Author(s):  
Maika Malig ◽  
Stella R. Hartono ◽  
Jenna M. Giafaglione ◽  
Lionel A. Sanz ◽  
Frederic Chedin

R-loops are a prevalent class of non-B DNA structures that form during transcription upon reannealing of the nascent RNA to the template DNA strand. R-loops have been profiled using the S9.6 antibody to immunoprecipitate DNA:RNA hybrids. S9.6-based DNA:RNA immunoprecipitation (DRIP) techniques revealed that R-loops form dynamically over conserved genic hotspots. We developed an orthogonal profiling methodology that queries R-loops via the presence of long stretches of single-stranded DNA on the looped-out strand. Non-denaturing sodium bisulfite treatment catalyzes the conversion of unpaired cytosines to uracils, creating permanent genetic tags for the position of an R-loop. Long-read, single-molecule PacBio sequencing allows the identification of R-loop ‘footprints’ at near-nucleotide resolution in a strand-specific manner on single DNA molecules and at ultra-deep coverage. Single-molecule R-loop footprinting (SMRF-seq) revealed a strong agreement between S9.6- and bisulfite-based R-loop mapping and confirmed that R-loops form from unspliced transcripts over genic hotspots. Using the largest single-molecule R-loop dataset to date, we show that individual R-loops generate overlapping sets of molecular clusters that pile up through larger R-loop-prone zones. SMRF-seq further established that R-loop distribution patterns are driven by both intrinsic DNA sequence features and DNA topological constraints, revealing the principles of R-loop formation.
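
The bisulfite "genetic tag" idea can be sketched as follows: on the looped-out strand, unpaired cytosines are converted and read as T after sequencing, so a run of consecutive converted cytosines marks a candidate single-stranded footprint. The code assumes a read aligned to the reference without indels, and the name `conversion_footprints` and the `min_converted` threshold are hypothetical choices for this example rather than the SMRF-seq calling procedure.

```python
def conversion_footprints(reference, read, min_converted=3):
    """Return (first, last) reference coordinates of runs in which at least
    min_converted consecutive reference cytosines were read as T.

    Assumes `read` is aligned to `reference` base-for-base (no indels).
    Hypothetical helper; the threshold is an illustrative assumption.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    footprints = []
    run = []  # positions of consecutive converted cytosines
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue  # only cytosines are informative for conversion
        if read_base == "T":
            run.append(i)
        else:
            # An unconverted cytosine ends the current run.
            if len(run) >= min_converted:
                footprints.append((run[0], run[-1]))
            run = []
    if len(run) >= min_converted:
        footprints.append((run[0], run[-1]))
    return footprints


if __name__ == "__main__":
    ref = "ACGCCTCACGC"
    obs = "ATGTTTTACGC"
    print(conversion_footprints(ref, obs))
```

In the example read, the cytosines at positions 1, 3, 4, and 6 are converted and the one at position 8 is not, so a single footprint spanning positions 1 to 6 is reported.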


Author(s):  
Diego Calderon ◽  
Andria Ellis ◽  
Riza M. Daza ◽  
Beth Martin ◽  
Jacob M. Tome ◽  
...  

Gene regulation occurs through trans-acting factors (e.g. transcription factors) acting on cis-regulatory elements (e.g. enhancers). Massively parallel reporter assays (MPRAs) functionally survey large numbers of cis-regulatory elements for regulatory potential, but do not identify the trans-acting factors that mediate any observed effects. Here we describe transMPRA, a reporter assay that efficiently combines multiplex CRISPR-mediated perturbation and MPRAs to identify trans-acting factors that modulate the regulatory activity of specific enhancers.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Ayako Nishizawa ◽  
Kazuki Kumada ◽  
Keiko Tateno ◽  
Maiko Wagata ◽  
Sakae Saito ◽  
...  

Preeclampsia is a pregnancy-induced disorder that is characterized by hypertension and is a leading cause of perinatal and maternal–fetal morbidity and mortality. HLA-G is thought to play important roles in maternal–fetal immune tolerance, and the associations between HLA-G gene polymorphisms and the onset of pregnancy-related diseases have been explored extensively. Because contiguous genomic sequencing is difficult, the association between the HLA-G genotype and preeclampsia onset is controversial. In this study, genomic sequences of the HLA-G region (5.2 kb) from 31 pairs of mother–offspring genomic DNA samples (18 pairs from normal pregnancies/births and 13 from preeclampsia births) were obtained by single-molecule real-time sequencing using the PacBio RS II platform. The HLA-G alleles identified in our cohort matched seven known HLA-G alleles, but we also identified two new HLA-G alleles at the fourth-field resolution and compared them with nucleotide sequences from a public database that consisted of coding sequences that cover the 3.1-kb HLA-G gene span. Intriguingly, a potential association between preeclampsia onset and the poly T stretch within the downstream region of the HLA-G*01:01:01:01 allele was found. Our study suggests that long-read sequencing of HLA-G will provide clues for characterizing HLA-G variants that are involved in the pathophysiology of preeclampsia.

