Some microsatellites may act as novel polymorphic cis-regulatory elements through transcription factor binding

Gene ◽  
2004 ◽  
Vol 341 ◽  
pp. 149-165 ◽  
Author(s):  
Alvaro Rada Iglesias ◽  
Ellen Kindlund ◽  
Martti Tammi ◽  
Claes Wadelius
2019 ◽  
Author(s):  
Arif Harmanci ◽  
Akdes Serin Harmanci ◽  
Jyothishmathi Swaminathan ◽  
Vidya Gopalakrishnan

Motivation: Functional genomics experiments generate genome-wide signal profiles that are dense information sources for annotating regulatory elements. These profiles measure epigenetic activity at nucleotide resolution and exhibit distinctive patterns as they fluctuate along the genome. The most notable of these are valley patterns, which are prevalent in assays such as ChIP-seq and bisulfite sequencing. The genomic positions of valleys pinpoint locations of cis-regulatory elements such as enhancers and insulators. Systematic identification of the valleys provides novel information for delineating the annotation of regulatory elements. Nevertheless, the valleys are not reported by the majority of analysis pipelines. Results: We describe EpiSAFARI, a computational method for sensitive detection of valleys from diverse types of epigenetic profiles. EpiSAFARI employs a novel smoothing method for decreasing noise in signal profiles and accounts for technical factors such as sparse signals, mappability, and nucleotide content. In performance comparisons, EpiSAFARI performs favorably in terms of accuracy. The histone modification valleys detected by EpiSAFARI exhibit high conservation and transcription factor binding, and they are enriched in nascent transcription. In addition, the large clusters of histone valleys are found to be enriched at the promoters of developmentally associated genes. Differential histone valleys exhibit concordance with differential DNase signal at cell-line-specific valleys. DNA methylation valleys exhibit elevated conservation and high transcription factor binding. Specifically, we observed enriched binding of transcription factors associated with chromatin structure around methyl-valleys. Availability: EpiSAFARI is publicly available at https://github.com/harmancilab/EpiSAFARI Supplementary information: Supplementary data are available at Bioinformatics online.
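
To make the idea of a "valley" in a signal profile concrete, the following is a minimal sketch, not the EpiSAFARI algorithm. It assumes a plain moving average in place of the smoothing method described in the abstract, and it ignores mappability, sparsity, and nucleotide content. The function names and the `min_depth` threshold are hypothetical, introduced only for illustration.

```python
from typing import List, Tuple


def moving_average(signal: List[float], window: int) -> List[float]:
    """Centered moving average; positions near the edges use the neighbors available."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def find_valleys(
    signal: List[float], window: int = 3, min_depth: float = 1.0
) -> List[Tuple[int, int, int]]:
    """Return (left_peak, valley, right_peak) index triples.

    A valley is a local minimum of the smoothed signal flanked by local
    maxima on both sides, where the lower flanking peak exceeds the
    minimum by at least `min_depth` (a hypothetical threshold).
    """
    s = moving_average(signal, window)
    inner = range(1, len(s) - 1)
    minima = [i for i in inner if s[i] < s[i - 1] and s[i] <= s[i + 1]]
    maxima = [i for i in inner if s[i] > s[i - 1] and s[i] >= s[i + 1]]

    valleys = []
    for m in minima:
        left = [p for p in maxima if p < m]
        right = [p for p in maxima if p > m]
        if not left or not right:
            continue  # a valley needs a peak on each side
        l, r = left[-1], right[0]
        if min(s[l], s[r]) - s[m] >= min_depth:
            valleys.append((l, m, r))
    return valleys


if __name__ == "__main__":
    profile = [0, 5, 9, 5, 1, 5, 9, 5, 0]
    print(find_valleys(profile, window=1))
```

Requiring a minimum depth relative to the lower flanking peak is one simple way to suppress shallow dips caused by noise; a real pipeline would replace it with a statistical test.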


2014 ◽  
Author(s):  
Anil Raj ◽  
Heejung Shim ◽  
Yoav Gilad ◽  
Jonathan K Pritchard ◽  
Matthew Stephens

Motivation: Understanding global gene regulation depends critically on accurate annotation of regulatory elements that are functional in a given cell type. CENTIPEDE, a powerful, probabilistic framework for identifying transcription factor binding sites from tissue-specific DNase I cleavage patterns and genomic sequence content, leverages the hypersensitivity of factor-bound chromatin and the information in the DNase I spatial cleavage profile characteristic of each DNA binding protein to accurately infer functional factor binding sites. However, the model for the spatial profile in this framework underestimates the substantial variation in the DNase I cleavage profiles across factor-bound genomic locations and across replicate measurements of chromatin accessibility. Results: In this work, we adapt a multi-scale modeling framework for inhomogeneous Poisson processes to better model the underlying variation in DNase I cleavage patterns across genomic locations bound by a transcription factor. In addition to modeling variation, we also model spatial structure in the heterogeneity in DNase I cleavage patterns for each factor. Using DNase-seq measurements assayed in a lymphoblastoid cell line, we demonstrate the improved performance of this model for several transcription factors by comparing against the ChIP-seq peaks for those factors. Finally, we propose an extension to this framework that allows for a more flexible background model and evaluate the additional gain in accuracy achieved when the background model parameters are estimated using DNase-seq data from naked DNA. The proposed model can also be applied to paired-end ATAC-seq and DNase-seq data in a straightforward manner. Availability: msCentipede, a Python implementation of an algorithm to infer transcription factor binding using this model, is made available at https://github.com/rajanil/msCentipede
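
As an illustration of what a multi-scale view of count data can look like, the sketch below shows one common representation for inhomogeneous Poisson counts: the total count in a window, plus, at each successively finer scale, the fraction of each block's counts that falls in its left half. Under a Poisson model, the left-half count given the block total is binomial, so these fractions are natural per-scale parameters. This is a sketch under that assumption and is not taken from the msCentipede code; the function name and the 0.5 fallback for empty blocks are hypothetical choices.

```python
from typing import List, Sequence, Tuple


def multiscale_decompose(counts: Sequence[int]) -> Tuple[int, List[List[float]]]:
    """Decompose a count vector into its total and per-scale left-half fractions.

    `counts` must have a length that is a power of two. The returned list has
    one entry per scale, coarsest first; entry j holds 2**j fractions.
    """
    n = len(counts)
    if n == 0 or n & (n - 1):
        raise ValueError("length of counts must be a positive power of two")

    total = sum(counts)
    scales: List[List[float]] = []
    blocks = [list(counts)]
    while len(blocks[0]) > 1:
        ratios = []
        next_blocks = []
        for block in blocks:
            half = len(block) // 2
            left, right = block[:half], block[half:]
            block_total = sum(block)
            # Empty blocks carry no information; 0.5 is an arbitrary neutral value.
            ratios.append(sum(left) / block_total if block_total > 0 else 0.5)
            next_blocks.extend([left, right])
        scales.append(ratios)
        blocks = next_blocks
    return total, scales


if __name__ == "__main__":
    total, scales = multiscale_decompose([4, 0, 2, 2])
    print(total)   # 8
    print(scales)  # [[0.5], [1.0, 0.5]]
```

The coarse scales capture broad asymmetry in the cleavage profile and the fine scales capture local structure, which is why such a decomposition is a convenient starting point for modeling variation separately at each resolution.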


2021 ◽  
Author(s):  
Tyler Hansen ◽  
Emily Hodges

Transcriptional enhancers control cell-type-specific gene expression in humans, and their dysfunction can lead to debilitating diseases, including cancer. Identifying bona fide enhancers is difficult due to a lack of spatial or sequence constraints. In addition, only a small percentage of the genome is accessible in mature cell types; therefore, most enhancers are inactive due to their chromatin context rather than intrinsic properties of the DNA sequence itself. For this reason, we decided to assay regulatory activity exclusively within accessible chromatin. To do this, we combined the assay for transposase-accessible chromatin using sequencing (ATAC-seq) with self-transcribing active regulatory region sequencing (STARR-seq); we call this method ATAC-STARR-seq. With ATAC-STARR-seq, we identify both active and silent regulatory elements in GM12878 B cells; these active and silent elements are enriched for transcription factor motifs and histone modifications associated with activating and repressing regulation, respectively. We also show that ATAC-STARR-seq quantifies chromatin accessibility and transcription factor binding. We integrate this information and subset active regions based on transcription factor binding profiles. Depending on the transcription factors bound, subsets are enriched for distinct Reactome pathways. Altogether, this highlights the power of ATAC-STARR-seq to investigate the transcriptional regulatory landscape of the human genome.


Blood ◽  
2007 ◽  
Vol 110 (13) ◽  
pp. 4503-4510 ◽  
Author(s):  
Marco De Gobbi ◽  
Eduardo Anguita ◽  
Jim Hughes ◽  
Jacqueline A. Sloane-Stanley ◽  
Jacqueline A. Sharpe ◽  
...  

To address the mechanism by which the human globin genes are activated during erythropoiesis, we have used a tiled microarray to analyze the pattern of transcription factor binding and associated histone modifications across the telomeric region of human chromosome 16 in primary erythroid and nonerythroid cells. This 220-kb region includes the α globin genes and 9 widely expressed genes flanking the α globin locus. The unbiased, comprehensive analysis of transcription factor binding and histone modifications (acetylation and methylation) described here not only identified all known cis-acting regulatory elements in the human α globin cluster but also demonstrated that there are no additional erythroid-specific regulatory elements in the 220-kb region tested. In addition, the pattern of histone modification distinguished promoter elements from potential enhancer elements across this region. Finally, comparison of the human and mouse orthologous regions in a unique mouse model, with both regions coexpressed in the same animal, showed significant differences that may explain how these 2 clusters are regulated differently in vivo.


2021 ◽  
Vol 25 (1) ◽  
pp. 18-29
Author(s):  
E. V. Ignatieva ◽  
E. A. Matrosova

Whole-genome and whole-exome sequencing technologies play a very important role in studies of the genetic aspects of the pathogenesis of various diseases. The extensive use of genome-wide and exome-wide association study methodology (GWAS and EWAS) has made it possible to identify a large number of genetic variants associated with diseases. This information is accumulated in databases such as GWAS Central, the GWAS Catalog, OMIM, ClinVar, etc. Most of the variants identified by GWAS are located in the noncoding regions of the human genome. According to the ENCODE project, the fraction of regions in the human genome potentially involved in transcriptional control is many times greater than the fraction of coding regions. Thus, genetic variation in noncoding regions of the genome can increase susceptibility to diseases by disrupting various regulatory elements (promoters, enhancers, silencers, insulator regions, etc.). However, identifying the mechanisms by which pathogenic genetic variants influence disease risk is difficult due to the wide variety of regulatory elements. The present review focuses on the molecular genetic mechanisms by which pathogenic genetic variants affect gene expression. Attention is concentrated on the transcriptional level of regulation as the initial step in the expression of any gene. A triggering event mediating the effect of a pathogenic genetic variant on the level of gene expression can be, for example, a change in the functional activity of transcription factor binding sites (TFBSs) or a change in DNA methylation, which, in turn, affects the functional activity of promoters or enhancers. Dissecting the regulatory roles of polymorphic loci would be impossible without close integration of modern experimental approaches with computer analysis of the growing wealth of genetic and biological data obtained using omics technologies.
The review provides a brief description of a number of the most well-known public genomic information resources containing data obtained using omics technologies, including (1) resources that accumulate data on the chromatin states and the regions of transcription factor binding derived from ChIP-seq experiments; (2) resources containing data on genomic loci, for which allele-specific transcription factor binding was revealed based on ChIP-seq technology; (3) resources containing in silico predicted data on the potential impact of genetic variants on the transcription factor binding sites.

