Disease-associated genetic variants in the regulatory regions of human genes: mechanisms of action on transcription and genomic resources for dissecting these mechanisms

2021 ◽  
Vol 25 (1) ◽  
pp. 18-29
Author(s):  
E. V. Ignatieva ◽  
E. A. Matrosova

Whole genome and whole exome sequencing technologies play a very important role in studies of the genetic aspects of the pathogenesis of various diseases. The wide use of genome-wide and exome-wide association study methodology (GWAS and EWAS) has made it possible to identify a large number of genetic variants associated with diseases. This information is accumulated in databases such as GWAS Central, GWAS Catalog, OMIM, ClinVar, etc. Most of the variants identified by GWAS are located in the noncoding regions of the human genome. According to the ENCODE project, the fraction of regions in the human genome potentially involved in transcriptional control is many times greater than the fraction of coding regions. Thus, genetic variation in noncoding regions of the genome can increase susceptibility to diseases by disrupting various regulatory elements (promoters, enhancers, silencers, insulator regions, etc.). However, identifying the mechanisms by which pathogenic genetic variants influence disease risk is difficult because of the wide variety of regulatory elements. The present review focuses on the molecular genetic mechanisms by which pathogenic genetic variants affect gene expression. Attention is concentrated on the transcriptional level of regulation as the initial step in the expression of any gene. A triggering event mediating the effect of a pathogenic genetic variant on the level of gene expression can be, for example, a change in the functional activity of transcription factor binding sites (TFBSs) or a change in DNA methylation, which, in turn, affects the functional activity of promoters or enhancers. Dissecting the regulatory roles of polymorphic loci would be impossible without close integration of modern experimental approaches with computer analysis of the growing wealth of genetic and biological data obtained using omics technologies.
The review provides a brief description of a number of the most well-known public genomic information resources containing data obtained using omics technologies, including (1) resources that accumulate data on the chromatin states and the regions of transcription factor binding derived from ChIP-seq experiments; (2) resources containing data on genomic loci, for which allele-specific transcription factor binding was revealed based on ChIP-seq technology; (3) resources containing in silico predicted data on the potential impact of genetic variants on the transcription factor binding sites.

2021 ◽  
Vol 22 (12) ◽  
pp. 6454
Author(s):  
Arina O. Degtyareva ◽  
Elena V. Antontseva ◽  
Tatiana I. Merkulova

The vast majority of the genetic variants (mainly SNPs) associated with various human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is alteration of transcription factor binding, either through the creation or disruption of transcription factor binding sites (TFBSs) or through a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and give a systematized description of the existing methodological approaches to their study. We then briefly review recent comprehensive examples of rSNPs that were followed from the discovery of a change in the TFBS sequence caused by a nucleotide substitution, to the identification of its effect on target gene expression and, eventually, on phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both those that make molecular sense of genome-wide association studies (GWAS) and alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait loci (eQTL) analysis and to the search for allele-specific events in RNA-seq data (ASE events) as well as in ChIP-seq, DNase-seq, and ATAC-seq data (ASB events).
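The idea that a single nucleotide substitution can create or disrupt a TFBS, or shift binding affinity, can be illustrated with a position weight matrix score. The following is a minimal sketch only: the frequency matrix `PFM` (a toy GATA-like motif), the uniform background, the example sequences, and the helper names are all hypothetical and are not taken from the review. It scans one strand only, which is a simplification.

```python
import math

# Hypothetical position frequency matrix for a toy 4-bp motif (consensus GATA).
# Each entry gives the assumed probability of a base at that motif position.
PFM = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds_score(site, pfm=PFM, background=BACKGROUND):
    """Sum of per-position log2(motif probability / background probability)."""
    if len(site) != len(pfm):
        raise ValueError("site length must equal motif width")
    return sum(math.log2(col[base] / background) for col, base in zip(pfm, site))


def best_window_score(seq, pfm=PFM):
    """Highest motif score over all windows on the given strand."""
    width = len(pfm)
    return max(
        log_odds_score(seq[i:i + width], pfm)
        for i in range(len(seq) - width + 1)
    )


def allele_effect(ref_seq, alt_seq, pfm=PFM):
    """Alt minus ref best score; negative values suggest a weakened site."""
    return best_window_score(alt_seq, pfm) - best_window_score(ref_seq, pfm)


# Hypothetical sequences differing by one substitution (A > C) inside the motif.
ref = "CCGATACC"
alt = "CCGCTACC"
delta = allele_effect(ref, alt)
print(f"score change (alt - ref): {delta:.3f} bits")
```

A negative change is only a computational hint that the alternative allele may weaken the site; as the abstract stresses, such predictions still need support from experimental data such as allele-specific binding or expression.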


2019 ◽  
Author(s):  
Lianggang Huang ◽  
Xuejie Li ◽  
Liangbo Dong ◽  
Bin Wang ◽  
Li Pan

Abstract Identifying cis-regulatory elements (CREs) and TF binding motifs is an important step in understanding the regulatory functions of TF binding and gene expression. Because experimentally determined and computationally inferred data are lacking, the genome-wide CREs and TF binding sites (TFBs) in filamentous fungi remain unknown. ATAC-seq is a technique that provides a high-resolution measurement of chromatin accessibility to Tn5 transposase integration. In filamentous fungi, the existence of cell walls and the difficulty in purifying nuclei have prevented the routine application of this technique. Herein, we modified the ATAC-seq protocol for filamentous fungi to identify and map open chromatin and TF-binding sites on a genome scale. We applied the ATAC-seq assay across different Aspergillus species, under different culture conditions, and in TF-deficient strains to delineate open chromatin regions and TFBs across each genome. The syntenic orthologous regions and the differentially changed regions of chromatin accessibility were responsible for functionally conserved regulatory elements and differential gene expression in the Aspergillus genome, respectively. Importantly, 17 and 15 novel transcription factor binding motifs that were enriched in the genomic footprints identified from ATAC-seq data of A. niger were verified in vivo by our artificial synthetic minimal promoter system, respectively. Furthermore, we confirmed for the first time the strand-specific patterns of Tn5 transposase around the binding sites of known TFs by comparing ATAC-seq data of TF-deficient strains with the data from a wild-type strain.


Genes ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 446 ◽  
Author(s):  
Shijie Xin ◽  
Xiaohui Wang ◽  
Guojun Dai ◽  
Jingjing Zhang ◽  
Tingting An ◽  
...  

The proinflammatory cytokine interleukin-6 (IL-6) plays a critical role in many chronic inflammatory diseases, particularly inflammatory bowel disease. To investigate the regulation of IL-6 gene expression at the molecular level, genomic DNA of Jinghai yellow chickens (Gallus gallus) was sequenced to detect single-nucleotide polymorphisms (SNPs) in the region from 2200 base pairs (bp) upstream to 500 bp downstream of IL-6. Transcription factor binding sites and CpG islands in the IL-6 promoter region were predicted using bioinformatics software. Twenty-eight SNP sites were identified in IL-6. Four of these 28 SNPs, three in the 5′ regulatory region [−357 (G > A), −447 (C > G), and −663 (A > G)] and one in the 3′ non-coding region [3177 (C > T)], are not labelled in GenBank. Bioinformatics analysis revealed 11 SNPs within the promoter region that altered putative transcription factor binding sites. Furthermore, the C-939G mutation in the promoter region may change the number of CpG islands, and SNPs in the 5′ regulatory region may influence IL-6 gene expression by altering transcription factor binding or CpG methylation status. Genetic diversity analysis revealed that the newly discovered A-663G site significantly deviated from Hardy-Weinberg equilibrium. These results provide a basis for further exploration of the promoter function of the IL-6 gene and of the relationships of these SNPs to intestinal inflammation resistance in chickens.
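A deviation from Hardy-Weinberg equilibrium, as reported above for the A-663G site, is commonly assessed with a chi-square goodness-of-fit test on genotype counts. The sketch below shows one such test under stated assumptions: the genotype counts are invented for illustration and are not the study's data, and the function name `hwe_chi_square` is hypothetical. The abstract does not say which test the authors used.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) of genotype counts against HWE.

    Returns (chi2, p_value). For 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Skip classes with zero expectation (monomorphic sites).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented genotype counts (AA, AG, GG), for illustration only.
chi2_balanced, p_balanced = hwe_chi_square(25, 50, 25)
chi2_skewed, p_skewed = hwe_chi_square(50, 0, 50)
print(chi2_balanced, p_balanced)
print(chi2_skewed, p_skewed)
```

With small samples or rare alleles, an exact test is usually preferred over the chi-square approximation.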


2019 ◽  
Author(s):  
Ning Qing Liu ◽  
Michela Maresca ◽  
Teun van den Brand ◽  
Luca Braccioli ◽  
Marijne M.G.A. Schijns ◽  
...  

SUMMARY The cohesin complex plays essential roles in sister chromatid cohesion, chromosome organization and gene expression. The role of cohesin in gene regulation is incompletely understood. Here, we report that the cohesin release factor WAPL is crucial for maintaining a pool of dynamic cohesin bound to regions that are associated with lineage-specific genes in mouse embryonic stem cells. These regulatory regions are enriched for active enhancer marks and transcription factor binding sites, but largely devoid of CTCF binding sites. Stabilization of cohesin, which leads to a loss of dynamic cohesin from these regions, does not affect transcription factor binding or active enhancer marks, but does result in changes in promoter-enhancer interactions and downregulation of genes. Acute cohesin depletion can phenocopy the effect of WAPL depletion, showing that cohesin plays a crucial role in maintaining expression of lineage-specific genes. The binding of dynamic cohesin to chromatin is dependent on the pluripotency transcription factor OCT4, but not NANOG. Finally, dynamic cohesin binding sites are also found in differentiated cells, suggesting that they represent a general regulatory principle. We propose that cohesin dynamically binding to regulatory sites creates a favorable spatial environment in which promoters and enhancers can communicate to ensure proper gene expression.
HIGHLIGHTS
The cohesin release factor WAPL is crucial for maintaining a pluripotency-specific phenotype.
Dynamic cohesin is enriched at lineage-specific loci and overlaps with binding sites of pluripotency transcription factors.
Expression of lineage-specific genes is maintained by dynamic cohesin binding through the formation of promoter-enhancer associated self-interaction domains.
CTCF-independent cohesin binding to chromatin is controlled by the pioneer factor OCT4.


BMC Biology ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Lianggang Huang ◽  
Xuejie Li ◽  
Liangbo Dong ◽  
Bin Wang ◽  
Li Pan

Abstract Background The identification of open chromatin regions and transcription factor binding sites (TFBs) is an important step in understanding the regulation of gene expression in diverse species. ATAC-seq is a technique used for this purpose that provides high-resolution measurements of chromatin accessibility, revealed through integration of Tn5 transposase. However, the existence of cell walls in filamentous fungi and the associated difficulty in purifying nuclei have precluded the routine application of this technique, leading to a lack of experimentally determined and computationally inferred data on the identity of genome-wide cis-regulatory elements (CREs) and TFBs. In this study, we constructed an ATAC-seq platform suitable for filamentous fungi and generated ATAC-seq libraries of Aspergillus niger and Aspergillus oryzae grown under a variety of conditions. Results We applied the ATAC-seq assay for filamentous fungi to delineate the syntenic orthologous and differentially changed chromatin accessibility regions among different Aspergillus species, under different culture conditions, and among specific TF-deleted strains. The syntenic orthologues of accessible regions were responsible for conserved functions across Aspergillus species, while regions that changed differentially between culture conditions and in TF mutants drove differential gene expression programs. Importantly, we suggest criteria to determine TFBs through the analysis of unbalanced cleavage of distinct TF-bound DNA strands by Tn5 transposase. Based on these criteria, we constructed data libraries of the in vivo genomic footprint of A. niger under distinct conditions, and generated a database of novel transcription factor binding motifs through comparison of footprints in TF-deleted strains. Furthermore, we validated the novel TFBs in vivo through an artificial synthetic minimal promoter system.
Conclusions We characterized the chromatin accessibility regions of filamentous fungal species and identified a complete TFBs map by ATAC-seq, which provides valuable data for future analyses of transcriptional regulation in filamentous fungi.

