Graph-based data integration predicts long-range regulatory interactions across the human genome

2014 ◽  
Author(s):  
Sofie Demeyer ◽  
Tom Michoel

Transcriptional regulation of gene expression is one of the main processes that affect cell diversification from a single set of genes. Regulatory proteins often interact with DNA regions located distally from the transcription start sites (TSS) of the genes. We developed a computational method that combines open chromatin and gene expression information for a large number of cell types to identify these distal regulatory elements. Our method builds correlation graphs for publicly available DNase-seq and exon array datasets with matching samples and uses graph-based methods to filter findings supported by multiple datasets and remove indirect interactions. The resulting set of interactions was validated with both anecdotal information of known long-range interactions and unbiased experimental data deduced from Hi-C and CAGE experiments. Our results provide a novel set of high-confidence candidate open chromatin regions involved in gene regulation, often located several Mb away from the TSS of their target gene.
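
As a rough illustration of the correlation-graph idea in this abstract, the sketch below links an open chromatin region to a gene when their signals correlate across matched cell types, then keeps only links found in more than one dataset. The function names, the 0.8 threshold, the two-dataset support rule, and the toy values are all assumptions made for illustration and do not reproduce the authors' implementation. The removal of indirect interactions is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_edges(dnase, expression, threshold=0.8):
    """Return (region, gene) pairs whose signals correlate across matched samples.

    dnase maps region -> open chromatin signal per cell type;
    expression maps gene -> expression per cell type, in the same sample order.
    """
    return {
        (region, gene)
        for region, signal in dnase.items()
        for gene, levels in expression.items()
        if pearson(signal, levels) >= threshold
    }


def supported_edges(edge_sets, min_support=2):
    """Keep edges that appear in at least min_support independent datasets."""
    counts = {}
    for edges in edge_sets:
        for edge in edges:
            counts[edge] = counts.get(edge, 0) + 1
    return {edge for edge, count in counts.items() if count >= min_support}


# Hypothetical toy data: two datasets, each with four matched cell types.
dataset_a = correlation_edges(
    {"region1": [1, 2, 3, 4], "region2": [4, 1, 3, 2]},
    {"geneX": [2, 4, 6, 8.5]},
)
dataset_b = correlation_edges(
    {"region1": [0.5, 1.0, 2.0, 2.5], "region2": [1, 1, 1, 1]},
    {"geneX": [1, 2, 3, 5]},
)
supported = supported_edges([dataset_a, dataset_b])
```

Requiring support from several datasets is a simple way to reduce dataset-specific noise; a real analysis would also need multiple-testing control and a principled threshold.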

2019 ◽  
Author(s):  
Priyanka Nandakumar ◽  
Dongwon Lee ◽  
Thomas J. Hoffmann ◽  
Georg B. Ehret ◽  
Dan Arking ◽  
...  

Abstract Hundreds of loci have been associated with blood pressure traits from many genome-wide association studies. We identified an enrichment of these loci in aorta and tibial artery expression quantitative trait loci in our previous work in ∼100,000 Genetic Epidemiology Research on Aging (GERA) study participants. In the present study, we subsequently focused on determining putative regulatory regions for these and other tissues of relevance to blood pressure, to both fine-map these loci by pinpointing genes and variants of functional interest within them, and to identify any novel genes. We constructed maps of putative cis-regulatory elements using publicly available open chromatin data for the heart, aorta and tibial arteries, and multiple kidney cell types. Sequence variants within these regions may be evaluated quantitatively for their tissue- or cell-type-specific regulatory impact using deltaSVM functional scores, as described in our previous work. In order to identify genes of interest, we aggregate these variants in these putative cis-regulatory elements within 50 kb of the start or end of genes considered as “expressed” in these tissues or cell types using publicly available gene expression data, and use the deltaSVM scores as weights in the well-known group-wise sequence kernel association test (SKAT). We test for association with both blood pressure traits and expression within these tissues or cell types of interest, and identify several genes, including MTHFR, C10orf32, CSK, NOV, ULK4, SDCCAG8, SCAMP5, RPP25, HDGFRP3, VPS37B, and PPCDC. Although our study centers on blood pressure traits, we additionally examined two known genes, SCN5A and NOS1AP, involved in the cardiac trait QT interval, in the Atherosclerosis Risk in Communities Study (ARIC), as a positive control, and observed an expected heart-specific effect. 
Thus, our method may be used to identify variants and genes for further functional testing using tissue- or cell-type-specific putative regulatory information. Author Summary Sequence changes in genes (“variants”) are linked to the presence and severity of different traits or diseases. However, as genes may be expressed in different tissues and at different times and degrees, using this information is expected to more accurately identify genes of interest. Variants within the genes are essential, but so are those in the sequences (“regulatory elements”) that control the genes’ expression in different tissues or cell types. In this study, we aim to use this information about expression and variants potentially involved in gene expression regulation to better pinpoint genes and variants in regulatory elements of interest for blood pressure regulation. We do so by taking advantage of such data that are publicly available, and use methods to combine information about variants in aggregate within a gene’s putative regulatory elements in tissues thought to be relevant for blood pressure, and identify several genes, meant to enable experimental follow-up.


Author(s):  
Zhen Miao ◽  
Michael S. Balzer ◽  
Ziyuan Ma ◽  
Hongbo Liu ◽  
Junnan Wu ◽  
...  

Abstract Determining the epigenetic program that generates unique cell types in the kidney is critical for understanding cell-type heterogeneity during tissue homeostasis and injury response. Here, we profiled open chromatin and gene expression in developing and adult mouse kidneys at single cell resolution. We show critical reliance of gene expression on distal regulatory elements (enhancers). We define key cell type-specific transcription factors and major gene-regulatory circuits for kidney cells. Dynamic chromatin and expression changes during nephron progenitor differentiation demonstrated that podocyte commitment occurs early and is associated with sustained Foxl1 expression. Renal tubule cells followed a more complex differentiation, where Hnf4a was associated with proximal and Tfap2b with distal fate. Mapping single nucleotide variants associated with human kidney disease identified critical cell types, developmental stages, genes, and regulatory mechanisms. We provide a global single cell resolution view of chromatin accessibility of kidney development. The dataset is available via interactive public websites.


2020 ◽  
Author(s):  
Stephane Deschamps ◽  
John A Crow ◽  
Nadia Chaidir ◽  
Brooke Peterson-Burch ◽  
Sunil Kumar ◽  
...  

Abstract Background Three-dimensional chromatin loop structures connect regulatory elements to their target genes in regions known as anchors. In complex plant genomes, such as maize, it has been proposed that loops span heterochromatic regions marked by higher repeat content, but little is known about their spatial organization and genome-wide occurrence in relation to transcriptional activity. Results Here, ultra-deep Hi-C sequencing of maize B73 leaf tissue was combined with gene expression and open chromatin sequencing for chromatin loop discovery and correlation with transcriptional activity. Chromatin loops, made of two “anchors” flanking a loop “interior”, overlap with up to 90% of high-resolution interaction domains from a previous public maize interactome dataset. A majority of all anchors are shared between multiple loops, suggesting a highly dynamic environment, with a conserved set of anchors involved in multiple interaction networks. Chromatin loop interiors are marked by higher repeat contents than the anchors flanking them. A small fraction of high-resolution interaction anchors, fully embedded in larger chromatin loops, co-locate with active genes and putative protein-binding sites. Combinatorial analyses indicate that all anchors studied here co-locate with at least 81.5% of expressed genes and 74% of open chromatin regions. Up to 63% of all unique variants derived from a prior public maize eQTL dataset overlap with Hi-C loop anchors. Anchor annotation suggests that <7% of all loops detected from one Hi-C library are potentially devoid of any genes or regulatory elements. The overall conservation and organization of chromatin loop anchors in the maize genome suggest a loop modeling system hypothesized to resemble phase separation of repeat-rich regions. Conclusions A majority of expressed genes and open chromatin regions co-locate with a conserved set of chromatin loop anchors. 
The results presented here will be a useful reference to further investigate the function of chromatin loop anchors and of the formation of interaction regions in the regulation of gene expression in maize.


2019 ◽  
Author(s):  
Eirene Markenscoff-Papadimitriou ◽  
Sean Whalen ◽  
Pawel Przytycki ◽  
Reuben Thomas ◽  
Fadya Binyameen ◽  
...  

Abstract Gene expression differs between cell types and regions within complex tissues such as the developing brain. To discover regulatory elements underlying this specificity, we generated genome-wide maps of chromatin accessibility in eleven anatomically-defined regions of the developing human telencephalon, including upper and deep layers of the prefrontal cortex. We predicted a subset of open chromatin regions (18%) that are most likely to be active enhancers, many of which are dynamic with 26% differing between early and late mid-gestation and 28% present in only one brain region. These region-specific predicted regulatory elements (pREs) are enriched proximal to genes with expression differences across regions and developmental stages and harbor distinct sequence motifs that suggest potential upstream regulators of regional and temporal transcription. We leverage this atlas to identify regulators of genes associated with autism spectrum disorder (ASD) including an enhancer of BCL11A, validated in mouse, and two functional de novo mutations in individuals with ASD in an enhancer of SLC6A1, validated in neuroblastoma cells. These applications demonstrate the utility of this atlas for decoding neurodevelopmental gene regulation in health and disease. Summary To discover regulatory elements driving the specificity of gene expression in different cell types and regions of the developing human brain, we generated an atlas of open chromatin from eleven dissected regions of the mid-gestation human telencephalon, including upper and deep layers of the prefrontal cortex. We identified a subset of open chromatin regions (OCRs), termed predicted regulatory elements (pREs), that are likely to function as developmental brain enhancers. pREs showed regional differences in chromatin accessibility, including many specific to one brain region, and were correlated with gene expression differences across the same regions and gestational ages. 
pREs allowed us to map neurodevelopmental disorder risk genes to developing telencephalic regions, and we identified three functional de novo noncoding variants in pREs that alter enhancer function. In addition, transgenic experiments in mouse validated enhancer activity for a pRE proximal to BCL11A, showing how this atlas serves as a resource for decoding neurodevelopmental gene regulation in health and disease.


2021 ◽  
Vol 22 (9) ◽  
pp. 4686
Author(s):  
Marta Irla ◽  
Sigrid Hakvåg ◽  
Trygve Brautaset

Genome-wide transcriptomic data obtained in RNA-seq experiments can serve as a reliable source for identification of novel regulatory elements such as riboswitches and promoters. Riboswitches are parts of the 5′ untranslated region of mRNA molecules that can specifically bind various metabolites and control gene expression. For that reason, they have become an attractive tool for engineering biological systems, especially for the regulation of metabolic fluxes in industrial microorganisms. Promoters in the genomes of prokaryotes are located upstream of transcription start sites and their sequences are easily identifiable based on the primary transcriptome data. Bacillus methanolicus MGA3 is a candidate for use as an industrial workhorse in methanol-based bioprocesses and its metabolism has been studied in systems biology approaches in recent years, including transcriptome characterization through RNA-seq. Here, we identify a putative lysine riboswitch in B. methanolicus, and test and characterize it. We also select and experimentally verify 10 putative B. methanolicus-derived promoters differing in their predicted strength and present their functionality in combination with the lysine riboswitch. We further explore the potential of a B. subtilis-derived purine riboswitch for regulation of gene expression in the thermophilic B. methanolicus, establishing a novel tool for inducible gene expression in this bacterium.


2016 ◽  
Author(s):  
Abdullah M. Khamis ◽  
Anna V. Lioznova ◽  
Artem V. Artemov ◽  
Vasily Ramensky ◽  
Vladimir B. Bajic ◽  
...  

Abstract DNA methylation is involved in regulation of gene expression. Although modern methods profile DNA methylation at single CpG sites, methylation levels are usually averaged over genomic regions in the downstream analyses. In this study we demonstrate that single CpG methylation can serve as a more accurate predictor of gene expression than average promoter or gene body methylation. CpG positions with significant correlation between methylation and expression of a nearby gene (named CpG traffic lights) are evolutionarily conserved and enriched for exact TSS positions and active enhancers. Among all promoter types, CpG traffic lights are especially enriched in poised promoters. Genes that harbor CpG traffic lights are associated with development and signal transduction. Methylation levels of individual CpG traffic lights vary dramatically between cell types, with an increased frequency of intermediate methylation levels, indicating cell population heterogeneity in CpG methylation levels. In line with the concept of inherited stochastic epigenetic variation, methylation of such CpG positions might contribute to transcriptional regulation. Alternatively, one can hypothesize that traffic lights are markers of absent gene expression resulting from inactivation of their regulatory elements. The CpG traffic lights provide a promising insight into mechanisms of enhancer activity and gene regulation, linking methylation of single CpGs to expression.
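
The comparison this abstract draws, single-CpG methylation versus region-averaged methylation as a predictor of expression, can be illustrated with a minimal sketch. The CpG names, methylation values, and expression values below are invented toy data, and the plain Pearson correlation used here is an assumption; the abstract does not state which correlation measure or significance test the authors applied.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical methylation levels (0 to 1) of three promoter CpGs in five samples.
cpg_methylation = {
    "cpg1": [0.9, 0.7, 0.5, 0.2, 0.1],
    "cpg2": [0.5, 0.6, 0.4, 0.5, 0.6],
    "cpg3": [0.2, 0.5, 0.3, 0.6, 0.4],
}
# Hypothetical expression of the nearby gene in the same five samples.
expression = [1, 2, 4, 7, 9]

# Correlation of each individual CpG with expression.
per_cpg = {name: pearson(levels, expression) for name, levels in cpg_methylation.items()}

# Correlation of the region average (mean over CpGs within each sample) with expression.
region_average = [sum(sample) / len(sample) for sample in zip(*cpg_methylation.values())]
r_average = pearson(region_average, expression)

# The CpG whose methylation tracks expression most strongly, in either direction.
best_cpg = max(per_cpg, key=lambda name: abs(per_cpg[name]))
```

In this toy example the averaging step dilutes the signal carried by one informative CpG with that of two uninformative neighbors, which is the effect the abstract describes.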


Endocrinology ◽  
2021 ◽  
Author(s):  
Tal Refael ◽  
Philippa Melamed

Abstract The world of long non-coding RNAs (lncRNAs) has opened up massive new prospects in understanding the regulation of gene expression. Not only are there seemingly almost infinite numbers of lncRNAs in the mammalian cell, but they have highly diverse mechanisms of action. In the nucleus, some are chromatin-associated, transcribed from transcriptional enhancers (eRNAs) and/or direct changes in the epigenetic landscape with profound effects on gene expression. The pituitary gonadotrope is responsible for activation of reproduction through production and secretion of appropriate levels of the gonadotropic hormones. As such, it exemplifies a cell whose function is defined through changes in developmental and temporal patterns of gene expression, including those that are hormonally-induced. Roles for diverse distal regulatory elements and eRNAs in gonadotrope biology have only just begun to emerge. Here, we will present an overview of the different kinds of lncRNAs that alter gene expression, and what is known about their roles in regulating some of the key gonadotrope genes. We will also review various screens that have detected differentially expressed pituitary lncRNAs associated with changes in reproductive state, and those whose expression is found to play a role in gonadotrope-derived non-functioning pituitary adenomas. We hope to shed light on this exciting new field, emphasize the open questions, and encourage research to illuminate the roles of lncRNAs in various endocrine systems.


2017 ◽  
Vol 121 (suppl_1) ◽  
Author(s):  
Tal Golan Lagziel ◽  
Lilac Caspi ◽  
Yair Lewis ◽  
Izhak Kehat

The mammalian body contains several hundred cell types that share the same genome, but can express distinct gene signatures. This specification of gene expression is achieved through the activity of cis-regulatory genomic elements (CREs), such as enhancers, promoters, and silencers. The Assay for Transposase-Accessible Chromatin followed by sequencing (ATAC-seq) can identify nucleosome-evicted open chromatin, an established marker of regulatory regions. Using a differential ATAC-seq approach, coupled with RNA-seq, H3K27ac ChIP-seq, and computational transcription factor (TF) binding analysis, we comprehensively mapped cell-type- and condition-specific cis-regulatory elements for cardiac fibroblasts and cardiomyocytes, and outlined the TFs that control them. We show that in cardiomyocytes six main transcription factor groups, which control their own and each other’s expression, cooperatively bind discrete distal enhancers that are located at a variable distance from the transcription start site of their target genes. None of these factors is entirely tissue specific in expression, yet various combinations of binding sites for these factors, densely clustered within a nucleosome-length genomic stretch, make these CREs tissue specific. Multiple tissue-specific CREs, in turn, are clustered around highly tissue-specific genes, and multiple factors, acting from the same and from different CREs, can converge on these genes to control their tissue-specific expression. Together, our data put forward a mechanistic multi-level combinatorial model for cardiac-specific gene expression.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Zhen Miao ◽  
Michael S. Balzer ◽  
Ziyuan Ma ◽  
Hongbo Liu ◽  
Junnan Wu ◽  
...  

Abstract Determining the epigenetic program that generates unique cell types in the kidney is critical for understanding cell-type heterogeneity during tissue homeostasis and injury response. Here, we profile open chromatin and gene expression in developing and adult mouse kidneys at single cell resolution. We show critical reliance of gene expression on distal regulatory elements (enhancers). We reveal key cell type-specific transcription factors and major gene-regulatory circuits for kidney cells. Dynamic chromatin and expression changes during nephron progenitor differentiation demonstrate that podocyte commitment occurs early and is associated with sustained Foxl1 expression. Renal tubule cells follow a more complex differentiation, where Hnf4a is associated with proximal and Tfap2b with distal fate. Mapping single nucleotide variants associated with human kidney disease implicates critical cell types, developmental stages, genes, and regulatory mechanisms. The single cell multi-omics atlas reveals key chromatin remodeling events and gene expression dynamics associated with kidney development.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Stéphane Deschamps ◽  
John A. Crow ◽  
Nadia Chaidir ◽  
Brooke Peterson-Burch ◽  
Sunil Kumar ◽  
...  

Abstract Background Three-dimensional chromatin loop structures connect regulatory elements to their target genes in regions known as anchors. In complex plant genomes, such as maize, it has been proposed that loops span heterochromatic regions marked by higher repeat content, but little is known about their spatial organization and genome-wide occurrence in relation to transcriptional activity. Results Here, ultra-deep Hi-C sequencing of maize B73 leaf tissue was combined with gene expression and open chromatin sequencing for chromatin loop discovery and correlation with hierarchical topologically-associating domains (TADs) and transcriptional activity. A majority of all anchors are shared between multiple loops from previous public maize high-resolution interactome datasets, suggesting a highly dynamic environment, with a conserved set of anchors involved in multiple interaction networks. Chromatin loop interiors are marked by higher repeat contents than the anchors flanking them. A small fraction of high-resolution interaction anchors, fully embedded in larger chromatin loops, co-locate with active genes and putative protein-binding sites. Combinatorial analyses indicate that all anchors studied here co-locate with at least 81.5% of expressed genes and 74% of open chromatin regions. Approximately 38% of all Hi-C chromatin loops are fully embedded within hierarchical TAD-like domains, while the remaining ones share anchors with domain boundaries or with distinct domains. Those various loop types exhibit specific patterns of overlap for open chromatin regions and expressed genes, but no apparent pattern of gene expression. In addition, up to 63% of all unique variants derived from a prior public maize eQTL dataset overlap with Hi-C loop anchors. Anchor annotation suggests that <7% of all loops detected here are potentially devoid of any genes or regulatory elements. 
The overall organization of chromatin loop anchors in the maize genome suggests a loop modeling system hypothesized to resemble phase separation of repeat-rich regions. Conclusions Sets of conserved chromatin loop anchors mapping to hierarchical domains contain core structural components of the gene expression machinery in maize. The data presented here will be a useful reference to further investigate their function in regard to the formation of transcriptional complexes and the regulation of transcriptional activity in the maize genome.

