Computational annotation of miRNA transcription start sites

Abstract Motivation: MicroRNAs (miRNAs) are small noncoding RNAs that play important roles in gene regulation and phenotype development. Identifying miRNA transcription start sites (TSSs) is critical to understanding the functional roles of miRNA genes and their transcriptional regulation. Unlike the TSSs of protein-coding genes, miRNA TSSs are not directly detectable from conventional RNA-Seq experiments because of the miRNA-specific biogenesis process. In the past decade, large-scale genome-wide TSS-Seq and transcription activation marker profiling data have become available, and many computational methods have been developed on the basis of these data. These methods have greatly advanced genome-wide miRNA TSS annotation. Results: In this study, we summarized recent computational methods and their results on miRNA TSS annotation. We collected and performed a comparative analysis of miRNA TSS annotations from 14 representative studies. We further compiled a robust set of miRNA TSSs (RSmirT) that are supported by multiple studies. Integrative genomic and epigenomic data analysis on RSmirT revealed the genomic and epigenomic features of miRNA TSSs as well as their relations to protein-coding and long non-coding genes.
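The idea of keeping only TSSs "supported by multiple studies" can be illustrated with a small sketch. The abstract does not state the actual criteria used for RSmirT, so everything here is an assumption made for illustration: the function name `robust_tss`, the input layout, the 500 bp grouping window, and the two-study threshold are all hypothetical.

```python
from collections import defaultdict


def _summarize(key, cluster, min_studies):
    """Return one consensus record if enough distinct studies support the cluster."""
    studies = {study for _, study in cluster}
    if len(studies) < min_studies:
        return []
    positions = sorted(pos for pos, _ in cluster)
    median = positions[len(positions) // 2]  # upper median for even-sized clusters
    return [(*key, median, len(studies))]


def robust_tss(annotations, window=500, min_studies=2):
    """Group nearby TSS calls and keep groups backed by several studies.

    annotations: {study_name: [(mirna, chrom, strand, position), ...]}
    window and min_studies are illustrative values, not taken from the paper.
    """
    by_key = defaultdict(list)
    for study, records in annotations.items():
        for mirna, chrom, strand, pos in records:
            by_key[(mirna, chrom, strand)].append((pos, study))

    result = []
    for key, hits in by_key.items():
        hits.sort()
        cluster = [hits[0]]
        for hit in hits[1:]:
            # Single-linkage grouping: extend the cluster while consecutive
            # calls are no more than `window` bp apart.
            if hit[0] - cluster[-1][0] <= window:
                cluster.append(hit)
            else:
                result.extend(_summarize(key, cluster, min_studies))
                cluster = [hit]
        result.extend(_summarize(key, cluster, min_studies))
    return sorted(result)


example = {
    "study_A": [("mir-1", "chr1", "+", 1000)],
    "study_B": [("mir-1", "chr1", "+", 1200)],
    "study_C": [("mir-1", "chr1", "+", 9000)],
}
consensus = robust_tss(example)
```

In this toy input, the calls from `study_A` and `study_B` fall within one window and are merged into a single consensus site, while the isolated call from `study_C` is dropped because only one study supports it.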


Roles for small noncoding RNAs in silencing of retrotransposons in the mammalian brain

Proceedings of the National Academy of Sciences, doi:10.1073/pnas.1609287113, 2016, Vol 113 (45), pp. 12697-12702

Cited By ~ 33

Author(s): Sayan Nandi, Dhruva Chandramohan, Luana Fioriti, Ari M. Melnick, Jean M. Hébert, ...

Keyword(s): Small RNAs, Large Scale, Regulation Of Gene Expression, Mammalian Brain, Long Term Memory, Protein Coding, Functional Roles, Small Noncoding RNAs, Single Base Pair

Piwi-interacting RNAs (piRNAs), long thought to be restricted to the germline, have recently been discovered in neurons of Aplysia, with a role in the epigenetic regulation of gene expression underlying long-term memory. We here ask whether piwi/piRNAs are also expressed and have functional roles in the mammalian brain. Large-scale RNA sequencing and subsequent analysis of protein expression revealed the presence in brain of several piRNA biogenesis factors including a mouse piwi (Mili), as well as small RNAs, albeit at low levels, resembling conserved piRNAs in mouse testes [primarily LINE1 (long interspersed nuclear element 1) retrotransposon-derived]. Despite the seemingly low expression of these putative piRNAs, single-base pair CpG methylation analyses across the genome of Mili/piRNA-deficient (Mili−/−) mice demonstrate that brain genomic DNA is preferentially hypomethylated within intergenic areas and LINE1 promoter areas of the genome. Furthermore, Mili mutant mice exhibit behavioral deficits such as hyperactivity and reduced anxiety. These results suggest that putative piRNAs exist in mammalian brain and that, similar to the role of piRNAs in testes, they may be involved in the silencing of retrotransposons, which in brain have critical roles in contributing to genomic heterogeneity underlying adaptation, stress response, and brain pathology. We also describe the presence of another class of small RNAs in the brain, with features of endogenous siRNAs, which may have taken over the role of invertebrate piRNAs in their capacity to target both transposons and protein-coding genes. Thus, RNA interference through gene and retrotransposon silencing previously encountered in Aplysia may also have potential roles in the mammalian brain.


Genome-Wide Identification of Transcription Start Sites, Promoters and Transcription Factor Binding Sites in E. coli

PLoS ONE, doi:10.1371/journal.pone.0007526, 2009, Vol 4 (10), pp. e7526

Cited By ~ 184

Author(s): Alfredo Mendoza-Vargas, Leticia Olvera, Maricela Olvera, Ricardo Grande, Leticia Vega-Alvarado, ...

Keyword(s): Transcription Factor, Binding Sites, Transcription Factor Binding Sites, Transcription Factor Binding, Transcription Start, Transcription Start Sites, E. coli, Factor Binding, Genome Wide


Transcription start sites at the end of protein-coding genes

Human Genomics, doi:10.1186/s40246-018-0146-6, 2018, Vol 12 (1)

Cited By ~ 1

Author(s): Ming-Yu Huang, Ji-Long Liu

Keyword(s): Transcription Start, Protein Coding, Transcription Start Sites, Protein Coding Genes


Genome-wide analysis reveals strong correlation between CpG islands with nearby transcription start sites of genes and their tissue specificity

Gene, doi:10.1016/j.gene.2005.01.012, 2005, Vol 350 (2), pp. 129-136

Cited By ~ 63

Author(s): Riu Yamashita, Yutaka Suzuki, Sumio Sugano, Kenta Nakai

Keyword(s): Strong Correlation, Tissue Specificity, CpG Islands, Transcription Start, Transcription Start Sites, Genome Wide Analysis, Genome Wide


Genome-Wide Identification of Estrogen Receptor α-Binding Sites in Mouse Liver

Molecular Endocrinology, doi:10.1210/me.2007-0121, 2008, Vol 22 (1), pp. 10-22

Cited By ~ 83

Author(s): Hui Gao, Susann Fält, Albin Sandelin, Jan-Åke Gustafsson, Karin Dahlman-Wright

Keyword(s): Estrogen Receptor, Binding Sites, Mouse Liver, Estrogen Receptor α, Estrogen Signaling, Transcription Start, Expression Levels, Transcription Start Sites, Genome Wide, Binding Regions

Abstract We report the genome-wide identification of estrogen receptor α (ERα)-binding regions in mouse liver using a combination of chromatin immunoprecipitation and tiled microarrays that cover all nonrepetitive sequences in the mouse genome. This analysis identified 5568 ERα-binding regions. In agreement with what has previously been reported for human cell lines, many ERα-binding regions are located far away from transcription start sites; approximately 40% of ERα-binding regions are located within 10 kb of annotated transcription start sites. Almost 50% of ERα-binding regions overlap genes. The majority of ERα-binding regions lie in regions that are evolutionarily conserved between human and mouse. Motif-finding algorithms identified the estrogen response element, and variants thereof, together with binding sites for activator protein 1, basic-helix-loop-helix proteins, ETS proteins, and Forkhead proteins as the most common motifs present in identified ERα-binding regions. To correlate ERα binding to the promoter of specific genes, with changes in expression levels of the corresponding mRNAs, expression levels of selected mRNAs were assayed in livers 2, 4, and 6 h after treatment with ERα-selective agonist propyl pyrazole triol. Five of these eight selected genes, Shp, Stat3, Pdgds, Pck1, and Pdk4, all responded to propyl pyrazole triol after 4 h treatment. These results extend our previous studies using gene expression profiling to characterize estrogen signaling in mouse liver, by characterizing the first step in this signaling cascade, the binding of ERα to DNA in intact chromatin.
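The statistic quoted above (the share of binding regions within 10 kb of an annotated transcription start site) is a nearest-feature distance calculation. The following is a minimal sketch of one way to compute such a fraction. The function name `fraction_near_tss`, the use of region midpoints, and the input layout are assumptions for illustration and are not taken from the study.

```python
from bisect import bisect_left


def fraction_near_tss(regions, tss_by_chrom, max_dist=10_000):
    """Fraction of regions whose midpoint lies within max_dist bp of a TSS.

    regions: [(chrom, start, end), ...]
    tss_by_chrom: {chrom: sorted list of TSS positions}
    Regions on chromosomes without any TSS count as "not near".
    """
    if not regions:
        return 0.0
    near = 0
    for chrom, start, end in regions:
        tss = tss_by_chrom.get(chrom, [])
        if not tss:
            continue
        mid = (start + end) // 2
        i = bisect_left(tss, mid)
        # The nearest TSS is either just left or just right of the midpoint.
        candidates = tss[max(i - 1, 0): i + 1]
        if min(abs(mid - t) for t in candidates) <= max_dist:
            near += 1
    return near / len(regions)


regions = [("chr1", 1000, 1200), ("chr1", 50000, 50200), ("chr2", 10, 20)]
tss = {"chr1": [5000, 100000]}
fraction = fraction_near_tss(regions, tss)
```

Binary search on a sorted TSS list keeps each lookup logarithmic in the number of annotated sites, which matters when thousands of regions are compared against a whole-genome annotation.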


Deep Annotation of Protein Function across Diverse Bacteria from Mutant Phenotypes

doi:10.1101/072470, 2016

Cited By ~ 21

Author(s): Morgan N. Price, Kelly M. Wetmore, R. Jordan Waters, Mark Callaghan, Jayashree Ray, ...

Keyword(s): Protein Function, Large Scale, Hypothetical Proteins, Data Set, Protein Coding, Bacterial Proteins, Genome Wide, Protein Functions, Mutant Phenotypes, Related Proteins

Summary The function of nearly half of all protein-coding genes identified in bacterial genomes remains unknown. To systematically explore the functions of these proteins, we generated saturated transposon mutant libraries from 25 diverse bacteria and we assayed mutant phenotypes across hundreds of distinct conditions. From 3,903 genome-wide mutant fitness assays, we obtained 14.9 million gene phenotype measurements and we identified a mutant phenotype for 8,487 proteins with previously unknown functions. The majority of these hypothetical proteins (57%) had phenotypes that were either specific to a few conditions or were similar to that of another gene, thus enabling us to make informed predictions of protein function. For 1,914 of these hypothetical proteins, the functional associations are conserved across related proteins from different bacteria, which confirms that these associations are genuine. This comprehensive catalogue of experimentally-annotated protein functions also enables the targeted exploration of specific biological processes. For example, sensitivity to a DNA-damaging agent revealed 28 known families of DNA repair proteins and 11 putative novel families. Across all sequenced bacteria, 14% of proteins that lack detailed annotations have an ortholog with a functional association in our data set. Our study demonstrates the utility and scalability of high-throughput genetics for large-scale annotation of bacterial proteins and provides a vast compendium of experimentally-determined protein functions across diverse bacteria.


Dynamics of transcription-dependent H3K36me3 marking by the SETD2:IWS1:SPT6 ternary complex

doi:10.1101/636084, 2019

Cited By ~ 2

Author(s): Katerina Cermakova, Eric A. Smith, Vaclav Veverka, H. Courtney Hodges

Keyword(s): Ternary Complex, Spatial Distributions, Epigenetic Memory, Dynamic Features, Transcription Start, Accurate Representation, Transcription Start Sites, Genome Wide, Genome Wide Data, Tight Correlation

Abstract SETD2 contributes to gene expression by marking gene bodies with H3K36me3, which is thought to assist in the concentration of transcription machinery at the small portion of the coding genome. Despite extensive genome-wide data revealing the precise localization of H3K36me3 over gene bodies, the physical basis for the accumulation, maintenance, and sharp borders of H3K36me3 over these sites remains rudimentary. Here we propose a model of H3K36me3 marking based on stochastic transcription-dependent placement and transcription-independent spreading. Our analysis of the spatial distributions and dynamic features of these marks indicates that transcription-dependent placement dominates the establishment of H3K36me3 domains compared to transcription-independent spreading processes, and that turnover of H3K36me3 limits its capacity for epigenetic memory. By adding additional terms for asymmetric histone turnover occurring at transcription start sites, our model provides a remarkably accurate representation of H3K36me3 levels and dynamics over gene bodies. Furthermore, we validate our findings by revealing that loss of SPT6 impairs the transcription-coupled activity of the SETD2:IWS1:SPT6 ternary complex, thereby reducing the tight correlation between transcription and H3K36me3 levels at gene bodies.
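The three ingredients named in the abstract (stochastic transcription-dependent placement, transcription-independent spreading, and turnover) can be made concrete with a toy lattice simulation. This is not the authors' model: the function name `simulate_marks`, the one-dimensional site array, the synchronous update rule, and every probability value are assumptions chosen for illustration, and the asymmetric turnover at transcription start sites is deliberately left out.

```python
import random


def simulate_marks(n_sites=200, gene=(50, 150), steps=500,
                   p_place=0.05, p_spread=0.01, p_turnover=0.02, seed=0):
    """Toy simulation of a histone mark on a line of nucleosome sites.

    gene: half-open interval [start, end) treated as the transcribed region.
    All probabilities are per site per step and are hypothetical values.
    """
    rng = random.Random(seed)
    marked = [False] * n_sites
    for _ in range(steps):
        prev = marked[:]  # update every site from the previous state
        for i in range(n_sites):
            if prev[i]:
                # Turnover: a marked site can lose its mark.
                if rng.random() < p_turnover:
                    marked[i] = False
                continue
            in_gene = gene[0] <= i < gene[1]
            has_marked_neighbor = (
                (i > 0 and prev[i - 1]) or (i + 1 < n_sites and prev[i + 1])
            )
            if in_gene and rng.random() < p_place:
                # Transcription-dependent placement, only inside the gene body.
                marked[i] = True
            elif has_marked_neighbor and rng.random() < p_spread:
                # Transcription-independent spreading from an adjacent mark.
                marked[i] = True
    return marked


no_spread = simulate_marks(p_spread=0.0)
```

Setting `p_spread` to zero confines marks to the gene interval, which gives a simple baseline for seeing how much a nonzero spreading probability blurs the domain borders in this toy setting.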


Transcription initiation mapping in 31 bovine tissues reveals complex promoter activity, pervasive transcription, and tissue-specific promoter usage

doi:10.1101/2020.09.05.284547, 2020

Author(s): D.E. Goszczynski, M.M. Halstead, A.D. Islas-Trejo, H. Zhou, P.J. Ross

Keyword(s): Transcription Initiation, Promoter Activity, Bovine Genome, Transcription Start, Protein Coding, Tissue Specific, Transcription Start Sites, Expression Control, Tissue Specific Promoter, Genome Annotations

Abstract Characterizing transcription start sites is essential for understanding the regulatory mechanisms that control gene expression. Recently, a new bovine genome assembly (ARS-UCD1.2) with high continuity, accuracy, and completeness was released; however, the functional annotation of the bovine genome lacks precise transcription start sites and includes a low number of transcripts in comparison to human and mouse. Using the RAMPAGE approach, this study identified transcription start sites at high resolution in a large collection of bovine tissues. We found several known and novel transcription start sites attributed to promoters of protein-coding and lncRNA genes that were validated through experimental and in silico evidence. With these findings, the annotation of transcription start sites in cattle reached a level comparable to the mouse and human genome annotations. In addition, we identified and characterized transcription start sites for antisense transcripts derived from bidirectional promoters, potential lncRNAs, mRNAs, and pre-miRNAs. We also analyzed the quantitative aspects of RAMPAGE data for producing a promoter activity atlas, reaching highly reproducible results comparable to traditional RNA-Seq. Lastly, gene co-expression networks revealed an impressive use of tissue-specific promoters, especially between brain and testicle, which expressed several genes in common from alternate transcription start sites. Regions surrounding co-expressed modules were enriched in binding factor motifs representative of their tissues. This annotation will be highly useful for future studies on expression control in cattle and other species. Furthermore, these data provide significant insight into transcriptional activity for a comprehensive set of tissues.


Genome-wide profiling of transcribed enhancers during macrophage activation

doi:10.1101/163519, 2017

Author(s): Elena Denisenko, Reto Guler, Musa Mhlanga, Harukazu Suzuki, Frank Brombacher, ...

Keyword(s): Gene Expression, Transcriptional Activation, Large Scale, Transcriptional Control, Macrophage Activation, Transcriptional Responses, Protein Coding, Genome Wide, IFN-γ, Cap Analysis

Abstract Macrophages are sentinel cells essential for tissue homeostasis and host defence. Owing to their plasticity, macrophages acquire a range of functional phenotypes in response to microenvironmental stimuli, of which M(IFN-γ) and M(IL-4/IL-13) are well known for their opposing pro- and anti-inflammatory roles. Enhancers have emerged as regulatory DNA elements crucial for transcriptional activation of gene expression. Using cap analysis of gene expression and epigenetic data, we identify transcribed enhancers in mouse macrophages on a large scale, together with their time kinetics and target protein-coding genes. We observe an increase in target gene expression, concomitant with increasing numbers of associated enhancers, and find that genes associated with many enhancers show a shift towards stronger enrichment for macrophage-specific biological processes. We infer enhancers that drive transcriptional responses of genes upon M(IFN-γ) and M(IL-4/IL-13) macrophage activation and demonstrate stimuli-specificity of regulatory associations. Finally, we show that enhancer regions are enriched for binding sites of inflammation-related transcription factors, suggesting a link between stimuli response and enhancer transcriptional control. Our study provides new insights into genome-wide enhancer-mediated transcriptional control of macrophage genes, including those implicated in macrophage activation, and offers a detailed genome-wide catalogue to further elucidate enhancer regulation in macrophages.


DChIPRep, an R/Bioconductor package for differential enrichment analysis in chromatin studies

doi:10.7287/peerj.preprints.1723v2, 2016

Author(s): Christophe D Chabbert, Lars M Steinmetz, Bernd Klaus

Keyword(s): Input Data, Enrichment Analysis, Bioconductor Package, Analytic Framework, Transcription Start, Transcription Start Sites, Genome Wide, Bioconductor Project, User Friendly, Genome Wide Study

The genome-wide study of epigenetic states requires the integrative analysis of histone modification ChIP-seq data. Here, we introduce an easy-to-use analytic framework to compare profiles of enrichment in histone modifications around classes of genomic elements, e.g. transcription start sites (TSS). Our framework is available via the user-friendly R/Bioconductor package DChIPRep. DChIPRep uses biological replicate information as well as chromatin Input data to allow for a rigorous assessment of differential enrichment. DChIPRep is available for download through the Bioconductor project at http://bioconductor.org/packages/DChIPRep.
