TSSr: an R package for comprehensive analyses of TSS sequencing data

2021 ◽  
Vol 3 (4) ◽  
Author(s):  
Zhaolian Lu ◽  
Keenan Berry ◽  
Zhenbin Hu ◽  
Yu Zhan ◽  
Tae-Hyuk Ahn ◽  
...  

Abstract Transcription initiation is regulated in a highly organized fashion to ensure proper cellular functions. Accurate identification of transcription start sites (TSSs) and quantitative characterization of transcription initiation activities are fundamental steps for studies of regulated transcription and core promoter structures. Several high-throughput techniques have been developed to sequence the very 5′ end of RNA transcripts (TSS sequencing) on the genome scale. Bioinformatics tools are essential for processing, analysis, and visualization of TSS sequencing data. Here, we present TSSr, an R package that provides rich functions for mapping TSSs and characterizing the structures and activities of core promoters based on all types of TSS sequencing data. Specifically, TSSr implements several newly developed algorithms for accurately identifying TSSs from mapped sequencing reads and inferring core promoters, which are prerequisites for subsequent functional analyses of TSS data. Furthermore, TSSr enables users to export various types of TSS data that can be visualized in a genome browser for inspection of promoter activities in association with other genomic features, and to generate publication-ready TSS graphs. These user-friendly features could greatly facilitate studies of transcription initiation based on TSS sequencing data. The source code and detailed documentation of TSSr can be freely accessed at https://github.com/Linlab-slu/TSSr.

2021 ◽  
Vol 22 (11) ◽  
pp. 5704
Author(s):  
Xiaofan Feng ◽  
Mario Andrea Marchisio

Promoters are fundamental components of synthetic gene circuits. They are DNA segments where transcription initiation takes place. New constitutive and regulated promoters are constantly engineered to meet the requirements for protein and RNA expression in different genetic networks. In this work, we constructed and optimized new synthetic constitutive promoters for the yeast Saccharomyces cerevisiae. We started from foreign (e.g., viral) core promoters as templates. These are usually nonfunctional in yeast but can be activated by extending them with a short sequence from the CYC1 promoter that contains various transcription start sites (TSSs). Transcription was modulated by mutating the TATA box composition and varying its distance from the TSS. We found that gene expression is maximized when the TATA box has the form TATAAAA or TATATAA and lies between 30 and 70 nucleotides upstream of the TSS. Core promoters were turned into stronger promoters via the addition of a short UAS. In particular, the 40 nt bipartite UAS from the GPD promoter can enhance protein synthesis considerably when placed 150 nt upstream of the TATA box. Overall, we extended the pool of S. cerevisiae promoters with 59 new samples, the strongest of which outperforms the native TEF2 promoter.
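
The TATA-box finding in this abstract can be illustrated with a small check. This is only a sketch under stated assumptions, not the authors' code: the function name `tata_boxes_in_window`, the 0-based TSS index, and the choice to measure distance from the first base of the motif to the TSS are all hypothetical conventions picked for illustration.

```python
import re

# The two TATA-box forms the abstract reports as maximizing expression.
OPTIMAL_TATA = ("TATAAAA", "TATATAA")


def tata_boxes_in_window(seq, tss, min_dist=30, max_dist=70):
    """Return (motif, distance) pairs for optimal TATA boxes upstream of the TSS.

    seq is a DNA string read 5'->3'; tss is the 0-based index of the TSS.
    Distance is counted from the first base of the motif to the TSS
    (an assumed convention; the abstract does not define the reference point).
    """
    seq = seq.upper()
    hits = []
    for motif in OPTIMAL_TATA:
        # A lookahead pattern lets overlapping occurrences be reported too.
        for match in re.finditer("(?=" + motif + ")", seq):
            distance = tss - match.start()
            if min_dist <= distance <= max_dist:
                hits.append((motif, distance))
    # Closest motif first.
    return sorted(hits, key=lambda hit: hit[1])
```

Calling `tata_boxes_in_window(promoter_seq, tss_index)` on a candidate promoter returns an empty list when no optimal TATA box lies in the 30 to 70 nt window, which makes it easy to screen designs before synthesis.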


2017 ◽  
Author(s):  
Sarah Rennie ◽  
Maria Dalby ◽  
Marta Lloret-Llinares ◽  
Stylianos Bakoulis ◽  
Christian Dalager Vaagensø ◽  
...  

ABSTRACT Mammalian gene promoters and enhancers share many properties. They are composed of a unified promoter architecture of divergent transcription initiation, and gene promoters may exhibit enhancer function. However, it is currently unclear how the expression strength of a regulatory element relates to its enhancer strength and whether the unifying architecture is conserved across Metazoa. Here we investigate the transcription initiation landscape and its associated RNA decay in D. melanogaster. Surprisingly, we find that the majority of active gene-distal enhancers and a considerable fraction of gene promoters are divergently transcribed. We observe quantitative relationships between enhancer potential, expression level and core promoter strength, providing an explanation for indirectly related histone modifications that reflect expression levels. Lowly abundant unstable RNAs initiated from weak core promoters are key characteristics of gene-distal developmental enhancers, while the housekeeping enhancer strengths of gene promoters reflect their expression strengths. The different layers of regulation mediated by gene-distal enhancers and gene promoters are also reflected in chromatin interaction data. Our results suggest a unified promoter architecture of many D. melanogaster regulatory elements that is universal across Metazoa and whose regulatory functions seem to be related to their core promoter elements.


PLoS ONE ◽  
2019 ◽  
Vol 14 (5) ◽  
pp. e0216471 ◽  
Author(s):  
Davide Bolognini ◽  
Niccolò Bartalucci ◽  
Alessandra Mingrino ◽  
Alessandro Maria Vannucchi ◽  
Alberto Magi

Author(s):  
Zhaolian Lu ◽  
Zhenguo Lin

ABSTRACT The molecular process of transcription by RNA Polymerase II is highly conserved among eukaryotes (“classic model”). Intriguingly, a distinct way of locating transcription start sites (TSSs) was found in the budding yeast Saccharomyces cerevisiae (“scanning model”). The origin of the “scanning model” and its underlying genetic mechanisms remain unsolved. Herein, we applied genomic approaches to address these questions. We first identified TSSs at single-nucleotide resolution for 12 yeast species using the nAnT-iCAGE technique, which significantly improved the annotations of these genomes by providing accurate 5’ boundaries of protein-coding genes. We then inferred the initiation mechanism of each species based on its TSS maps and genome sequences. We found that the “scanning model” originated after the split of Yarrowia lipolytica from the rest of the budding yeasts. An adenine-rich region immediately upstream of the TSS appeared during the evolution of the “scanning model” species, which might facilitate TSS selection in these species. Both initiation mechanisms share a strong preference for pyrimidine-purine dinucleotides surrounding the TSS. Our results suggest that the purine is required for accurately recruiting the first nucleotide, increasing the chance of being capped during mRNA maturation, which is critical for efficient translation initiation. Based on our findings, we propose a model of TSS selection for the “scanning model” species. In addition, our study demonstrates that intrinsic sequence features primarily determine the distribution of initiation activities within a core promoter (core promoter shape).
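
The pyrimidine-purine preference mentioned in this abstract can be quantified with a simple tally. The sketch below is illustrative only and is not taken from the study: it assumes the dinucleotide is formed by the base at position -1 and the TSS base at +1, that TSSs are given as 0-based indices on the transcribed 5'->3' sequence, and the function names are hypothetical.

```python
from collections import Counter

PYRIMIDINES = set("CT")
PURINES = set("AG")


def tss_dinucleotide_counts(seq, tss_positions):
    """Count the dinucleotide formed by positions -1 and +1 of each TSS.

    seq is read 5'->3'; each TSS is the 0-based index of the +1 base.
    A TSS at index 0 has no -1 base and is skipped, as are out-of-range indices.
    """
    seq = seq.upper()
    counts = Counter()
    for tss in tss_positions:
        if 0 < tss < len(seq):
            counts[seq[tss - 1:tss + 1]] += 1
    return counts


def pyrimidine_purine_fraction(counts):
    """Fraction of counted TSSs whose -1 base is C/T and whose +1 base is A/G."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    py_pu = sum(n for dinuc, n in counts.items()
                if dinuc[0] in PYRIMIDINES and dinuc[1] in PURINES)
    return py_pu / total
```

In practice each TSS would usually be weighted by its tag count rather than counted once; that weighting is left out here to keep the sketch short.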


2019 ◽  
Author(s):  
Davide Bolognini ◽  
Niccolò Bartalucci ◽  
Alessandra Mingrino ◽  
Alessandro Maria Vannucchi ◽  
Alberto Magi

Abstract MinION and GridION X5 from Oxford Nanopore Technologies are devices for real-time DNA and RNA sequencing. On the one hand, MinION is the only real-time, low-cost and portable sequencing device and, thanks to its unique properties, is becoming more and more popular among biologists; on the other, GridION X5, mainly because of its cost, is less widespread but highly suitable for researchers with large sequencing projects. Although Oxford Nanopore Technologies’ devices have been increasingly used in the last few years, there is a lack of high-performing and user-friendly tools to handle the data output by both the MinION and GridION X5 platforms. Here we present NanoR, a cross-platform R package designed to simplify and improve nanopore data visualization. NanoR is built on a few functions but exceeds the capabilities of existing tools in extracting meaningful information from MinION sequencing data; in addition, as exclusive features, NanoR can deal with GridION X5 sequencing outputs and allows comparison of MinION and GridION X5 sequencing data in one command. NanoR is released as a free package for R at https://github.com/davidebolo1993/NanoR.


2018 ◽  
Author(s):  
Zhaolian Lu ◽  
Zhenguo Lin

Abstract Transcription initiation is finely regulated to ensure the proper expression and function of genes. Regulated transcription initiation in response to various environmental cues in the model organism Saccharomyces cerevisiae has not been systematically investigated. In this study, we generated quantitative maps of transcription start sites (TSSs) at single-nucleotide resolution for S. cerevisiae grown in nine different conditions using no-amplification non-tagging Cap analysis of gene expression (nAnT-iCAGE) sequencing. Based on 337 million uniquely mapped CAGE tags, we mapped ~1 million well-supported TSSs, suggesting highly pervasive transcription initiation in the compact genome of yeast. The comprehensive TSS maps allowed us to identify core promoters for ~96% of verified protein-coding genes and to revise the predicted translation start codon for 183 genes. We found that 56% of yeast genes have at least two core promoters and that alternative usage of different core promoters in a gene is widespread in response to changing environments. More importantly, most core promoter shifts are coupled with differential gene expression, indicating that core promoter shift might play an important role in controlling the transcriptional activity of yeast genes. Based on their dynamic activities, we divided yeast core promoters into constitutive core promoters (55%) and inducible core promoters (45%). The two classes of core promoters exhibit distinctive patterns in transcriptional abundance, chromatin structure, promoter shape, and sequence context. In summary, the quantitative TSS maps generated by this study improve the annotation of the yeast genome and reveal the highly pervasive and dynamic nature of transcription initiation in yeast.


2016 ◽  
Author(s):  
Yun Chen ◽  
Athma A. Pai ◽  
Jan Herudek ◽  
Michal Lubas ◽  
Nicola Meola ◽  
...  

Abstract Mammalian transcriptomes are complex and formed by extensive promoter activity. In addition, gene promoters are largely divergent and initiate transcription of reverse-oriented promoter upstream transcripts (PROMPTs). Although PROMPTs are commonly terminated early, influenced by polyadenylation sites, promoters often cluster so that the divergent activity of one might impact another. Here, we find that the distance between promoters strongly correlates with the expression, stability and length of their associated PROMPTs. Adjacent promoters driving divergent mRNA transcription support PROMPT formation, but due to polyadenylation site constraints, these transcripts tend to spread into the neighboring mRNA on the same strand. This mechanism to derive new alternative mRNA transcription start sites (TSSs) is also evident at closely spaced promoters supporting convergent mRNA transcription. We suggest that basic building blocks of divergently transcribed core promoter pairs, in combination with the wealth of TSSs in mammalian genomes, provide a framework with which evolution shapes transcriptomes.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Hemant Gupta ◽  
Khyati Chandratre ◽  
Siddharth Sinha ◽  
Teng Huang ◽  
Xiaobing Wu ◽  
...  

Abstract Background The core promoter controls transcription initiation. However, little is known about core promoter diversity in the human genome and its relationship with diseases. We hypothesized that, as a functionally important component of the genome, the core promoter in the human genome could be under evolutionary selection, as reflected by its high diversification to adjust gene expression for better adaptation to different environments. Results Applying the “Exome-based Variant Detection in Core-promoters” method, we analyzed human core-promoter diversity using the 2682 exome data sets of 25 worldwide human populations sequenced by the 1000 Genome Project. Collectively, we identified 31,996 variants in the core promoter region (−100 to +100) of 12,509 human genes (https://dbhcpd.fhs.um.edu.mo). Analysis of the rich variation data identified highly ethnic-specific patterns of core promoter variation between different ethnic populations, the genes with highly variable core promoters, the motifs affected by the variants, and their involved functional pathways. An eQTL test revealed that 12% of core promoter variants can significantly alter gene expression level. By comparison with GWAS data, we located 163 variants as GWAS-identified traits associated with multiple diseases; half of these variants can alter gene expression. Conclusion Data from our study reveal the highly diversified nature of the core promoter in the human genome and highlight that core promoter variation could play important roles not only in gene expression regulation but also in disease predisposition.
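
The −100 to +100 core promoter window used in this abstract can be made concrete with a strand-aware interval calculation. This is a minimal sketch, not the study's pipeline: it assumes 1-based genomic coordinates, treats the TSS as position +1 with no position 0 (so the window spans 200 bases), and uses hypothetical function names.

```python
def core_promoter_window(tss, strand, upstream=100, downstream=100):
    """Return the inclusive (start, end) genomic interval around a TSS.

    Coordinates are 1-based. The TSS itself is +1 and there is no position 0
    (an assumed convention), so -100..+100 covers 200 bases. On the minus
    strand, upstream positions lie at higher genomic coordinates.
    """
    if strand == "+":
        return tss - upstream, tss + downstream - 1
    if strand == "-":
        return tss - downstream + 1, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def variant_in_core_promoter(variant_pos, tss, strand):
    """True if a variant position falls inside the core promoter window."""
    start, end = core_promoter_window(tss, strand)
    return start <= variant_pos <= end
```

For a plus-strand gene with its TSS at 1000, the window is 900 to 1099; for a minus-strand gene at the same TSS it is 901 to 1100. If a different convention is used (for example, including a position 0), the offsets need adjusting accordingly.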


2018 ◽  
Author(s):  
Luca Alessandrì ◽  
Marco Beccuti ◽  
Maddalena Arigoni ◽  
Martina Olivero ◽  
Greta Romano ◽  
...  

Abstract Summary Single-cell RNA sequencing has emerged as an essential tool to investigate cellular heterogeneity and highlight cell sub-population specific signatures. Dedicated and user-friendly bioinformatics workflows are now required to exploit the deconvolution of single-cell transcriptomes. Furthermore, there is a growing need for bioinformatics workflows granting both functional reproducibility, i.e. saving information about data and analysis parameters, and computational reproducibility, i.e. storing the real image of the computation environment. Here, we present rCASC, a modular RNAseq analysis workflow allowing data analysis from count generation to identification of cell sub-population signatures, granting both functional and computational reproducibility. Availability and Implementation rCASC is part of the reproducible bioinformatics project. rCASC is a docker-based application controlled by an R package available at https://github.com/kendomaniac/rCASC. Supplementary information Supplementary data are available at the rCASC GitHub repository.


2019 ◽  
Author(s):  
Jonathan McMillan ◽  
Zhaolian Lu ◽  
Judith S. Rodriguez ◽  
Tae-Hyuk Ahn ◽  
Zhenguo Lin

Abstract The transcription initiation landscape of eukaryotic genes is complex and highly dynamic. In eukaryotes, genes can generate multiple transcript variants that differ in their 5’ boundaries due to the use of alternative transcription start sites (TSSs), and the abundance of transcript isoforms is highly variable. Because of the large number and complexity of TSSs, it is not feasible to depict the details of the transcription initiation landscape of all genes using text-format genome annotation files. Therefore, it is necessary to provide data visualization of TSSs to represent quantitative TSS maps and core promoters. In addition, the selection and activity of TSSs are influenced by various factors, such as transcription factors, chromatin remodeling, and histone modifications. Thus, integration and visualization of functional genomic data related to these features could provide a better understanding of gene promoter architecture and the regulatory mechanisms of transcription initiation. Yeast species play important roles in research and human society, yet no database provides visualization and integration of functional genomic data in yeast. Here, we generated quantitative TSS maps for twelve important yeast species, inferred their core promoters, and built a public database, YeasTSS (www.yeastss.org). YeasTSS was designed as a central portal for visualization and integration of the TSS maps, core promoters and functional genomic data related to transcription initiation in yeast. YeasTSS is expected to benefit the research community and public education by improving genome annotation, studies of promoter structure and regulated control of transcription initiation, and inference of gene regulatory networks.

