Genomic determinants for initiation and length of natural antisense transcripts in Entamoeba histolytica

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Damien Mornico ◽  
Chung-Chau Hon ◽  
Mikael Koutero ◽  
Christian Weber ◽  
Jean-Yves Coppee ◽  
...  

Abstract Natural antisense transcripts (NATs) have been reported in prokaryotes and eukaryotes. While the functions of most reported NATs remain unknown, it has been speculated that they regulate the transcription of their sense counterparts. Entamoeba histolytica, a unicellular eukaryotic parasite, has a compact protein-coding genome with very short intronic and intergenic regions. The regulatory mechanisms of gene expression in this compact genome are poorly described. In this study, by genome-wide mapping of RNA-Seq data to the genome of E. histolytica, we show that a substantial fraction of its protein-coding genes (28%) have significant transcription on their opposite strand (i.e. NATs). Intriguingly, we found that the locations of transcription start sites or polyadenylation sites of NATs are determined by specific motifs encoded on the opposite strand of the gene coding sequences, thereby providing a compact regulatory system for gene transcription. Moreover, we demonstrated that NATs are globally up-regulated under various environmental conditions including temperature stress and pathogenicity. While NATs do not appear to be consequences of spurious transcription, they may play a role in regulating gene expression in E. histolytica, a hypothesis that remains to be tested.

2010 ◽  
Vol 38 (4) ◽  
pp. 1144-1149 ◽  
Author(s):  
Andreas Werner ◽  
Daniel Swan

NATs (natural antisense transcripts) are important regulators of eukaryotic gene expression. Interference between the expression of protein-coding sense transcripts and the corresponding NAT is well documented. In the present review, we focus on an additional, higher-order role of NATs that is currently emerging. The recent discovery of endogenous siRNAs (short interfering RNAs), as well as NAT-induced transcriptional gene silencing, are key to the proposed novel function of NATs.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 93
Author(s):  
Peng Qin ◽  
Ann E. Loraine ◽  
Sheila McCormick

Background: cis-NATs (cis-natural antisense transcripts) are transcribed from opposite strands of adjacent genes and have been shown to regulate gene expression by generating small RNAs from the overlapping region. cis-NATs are important for plant development and resistance to pathogens and stress. Several genome-wide investigations identified a number of cis-NAT pairs, but these investigations predicted cis-NATs using expression data from bulk samples that included many cell types. Some cis-NAT pairs identified in those investigations might not be functional, because both transcripts of a cis-NAT pair need to be co-expressed in the same cell. Pollen contains only two cell types, two sperm and one vegetative cell, which makes cell-specific investigation of cis-NATs possible. Methods: We investigated potential protein-coding cis-NATs in pollen and sperm using pollen RNA-seq data and TAIR10 gene models in the Integrated Genome Browser. We then used sperm microarray data and sRNAs in sperm and pollen to determine possibly functional cis-NATs in the sperm or vegetative cell, respectively. Results: We identified 1471 potential protein-coding cis-NAT pairs, including 131 novel pairs that were not present in TAIR10 gene models. In pollen, 872 possibly functional pairs were identified. 72 and 56 pairs were potentially functional in sperm and vegetative cells, respectively. sRNAs were detected at 794 genes, belonging to 739 pairs. Conclusion: These potential candidates in sperm and the vegetative cell are tools for understanding gene expression mechanisms in pollen.
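The core idea of a cis-NAT pair (two genes on opposite strands of the same chromosome whose annotated coordinates overlap) can be illustrated with a minimal Python sketch. This is not the authors' workflow, which relied on the Integrated Genome Browser, TAIR10 gene models, and expression data; the `Gene` record, the function name `find_cis_nat_pairs`, and the 1-based inclusive coordinate convention are assumptions made for illustration only.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    gene_id: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def find_cis_nat_pairs(genes):
    """Return (plus_id, minus_id, overlap_bp) for every pair of genes that lie
    on opposite strands of the same chromosome and overlap by at least 1 bp."""
    # Group genes by chromosome so only same-chromosome genes are compared.
    by_chrom = {}
    for g in genes:
        by_chrom.setdefault(g.chrom, []).append(g)

    pairs = []
    for chrom_genes in by_chrom.values():
        plus = [g for g in chrom_genes if g.strand == "+"]
        minus = [g for g in chrom_genes if g.strand == "-"]
        # Simple all-vs-all comparison; fine for a sketch, slow for whole genomes.
        for p in plus:
            for m in minus:
                overlap = min(p.end, m.end) - max(p.start, m.start) + 1
                if overlap > 0:
                    pairs.append((p.gene_id, m.gene_id, overlap))
    return pairs
```

A candidate list produced this way is based on coordinates alone. As the abstract points out, a pair can only be functional if both transcripts are co-expressed in the same cell, so expression and sRNA evidence would still be needed to filter it.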


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 390-390
Author(s):  
Paul F. Bray ◽  
Steven E. McKenzie ◽  
Leonard C. Edelstein ◽  
Srikanth Nagalla ◽  
Kathleen Delgrosso ◽  
...  

Abstract 390 A conspicuous lesson that has emerged from the 1000 Genomes Project is the greater genetic variation in the population than previously appreciated. Transcriptomics is rapidly assuming a prominent role in the understanding of basic molecular mechanisms accounting for variation within the normal population and disease states. Besides protein-coding RNAs, the importance of non-coding RNAs (ncRNAs) – primarily as regulators of gene expression – is well recognized but largely unexplored. The platelet transcriptome reflects megakaryocyte RNA content at the time of proplatelet release, subsequent splicing events, selective packaging and platelet RNA stability. An accurate understanding of the platelet transcriptome has both biological (improved understanding of platelet protein translation and the mechanisms of megakaryocyte/platelet gene expression) and clinical (novel biomarkers of disease) relevance. We carried out transcriptome sequencing of total RNA isolated from leukocyte-depleted platelet preparations from four healthy adults using an AB/LT SOLiD™ system. For each individual, we constructed 3 libraries: a) long (≥ 40 nucleotides) total RNA, b) long RNA depleted of rRNA, and c) short (< 40 nucleotides) RNA. ∼1 billion reads from the 12 datasets were mapped on each chromosome and strand of the human genome. About one-third mapped uniquely, similar to other unbiased methods like SAGE. Normalizing for transcript length and scale using β-actin expression level provided the ability to appropriately scale expression within a read-set and to compare expression levels across read-sets. Of the known protein-coding loci, ∼9,500 were present in human platelets. Plotting the number of protein-coding genes as a function of the level of normalized expression underscored different gene estimates between total and rRNA-depleted RNA preparations, and substantial inter-individual variation in the less abundant genes.
RT-PCR validated the RNA-seq estimates of transcript levels exhibiting a range of >3 orders of magnitude of normalized read counts (r=0.7757; p=0.0001). A strong correlation was measured between mRNAs identified by RNA-seq and 3 published microarray datasets for well-expressed mRNAs, although RNA-seq identified many more transcripts of lower abundance. Unexpectedly, ribosomal RNA depletion significantly and adversely affected estimates of the relative abundance of transcripts including members of the RNA interference pathway DGCR8, DROSHA, XPO5, DICER1, EIF2C1-4, which exhibited large differences (up to 32-fold) between the total and rRNA-depleted preparations. A rigorous and highly stringent approach identified bona fide intronic regions that gave rise to 6,992 and 1,236 currently uncharacterized long and short RNA transcripts, respectively. We discovered numerous previously unreported antisense transcripts: 1) to known protein-coding regions of the genome, 2) to 10 miRNA precursors where each locus generated 1–2 distinct antisense transcripts, presumably mature and “star” miRNAs, and 3) long and short RNAs antisense to several known repeat families. We did not observe enrichment of long-intergenic ncRNAs. We considered various possible explanations for the ∼60% of sequence reads that could not be mapped on the genome. Much more lenient parameter settings accounted for only ∼6.5% of sequenced reads. An even smaller fraction of reads was observed when considering all possible combinations of exon-exon junctions in the genome (12,382,819 junctions) and the highly polymorphic HLA region of chr 6, indicating these did not contribute in any substantive manner to the platelet transcriptome. Lastly, RNA-seq was highly reproducible (>97 for 1 subject studied on 4 occasions). In summary, our work reveals a richness and diversity of platelet RNA molecules, suggesting a context where platelet biology transcends protein- and mRNA-centric descriptions.
We will provide a publicly available web tool of these data embedded in a local mirror of the UCSC genome browser, facilitating the elucidation of previously unappreciated molecular species and molecular interactions. This will eventually permit an improved understanding of the molecular mechanisms that regulate platelet physiology and that contribute to disorders of thrombosis, hemostasis and inflammation. Disclosures: No relevant conflicts of interest to declare.
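The abstract states that read counts were normalized for transcript length and scaled using the β-actin expression level, but it does not give the formula. One plausible reading is sketched below in Python, purely as an assumption: counts are converted to reads per kilobase and then divided by the reads-per-kilobase value of the reference gene. The function name `normalize_to_reference`, the gene symbol `ACTB` as dictionary key, and the example gene `GENE_X` are hypothetical.

```python
def normalize_to_reference(read_counts, lengths_bp, reference="ACTB"):
    """Scale length-normalized read counts so the reference gene equals 1.0.

    read_counts: dict of gene -> mapped read count within one read-set
    lengths_bp:  dict of gene -> transcript length in base pairs
    reference:   gene used for scaling (assumed here to be beta-actin, ACTB)
    """
    # Step 1: correct for transcript length (reads per kilobase).
    per_kb = {
        gene: count / (lengths_bp[gene] / 1000.0)
        for gene, count in read_counts.items()
    }

    # Step 2: express every gene relative to the reference gene.
    ref_value = per_kb[reference]
    if ref_value == 0:
        raise ValueError("reference gene has zero reads; cannot scale")
    return {gene: value / ref_value for gene, value in per_kb.items()}
```

Because every read-set is expressed relative to the same reference gene, values computed this way can be compared across read-sets of different sequencing depth, which is the property the abstract attributes to its normalization.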


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Shuai Wang ◽  
Hairong Xie ◽  
Fei Mao ◽  
Haiyan Wang ◽  
Shu Wang ◽  
...  

Abstract Background: Direct analogs of chemically modified bases that carry important epigenetic information, such as 5-methylcytosine (m5C)/5-methyldeoxycytosine (5mC), 5-hydroxymethylcytosine (hm5C)/5-hydroxymethyldeoxycytosine (5hmC), and N6-methyladenosine (m6A)/N6-methyldeoxyadenosine (6mA), are detected in both RNA and DNA, respectively. The modified base N4-acetylcytosine (ac4C) is well studied in RNAs, but its presence and epigenetic roles in cellular DNA have not been explored. Results: Here, we demonstrate the existence of N4-acetyldeoxycytosine (4acC) in genomic DNA of Arabidopsis with multiple detection methods. Genome-wide profiling of 4acC modification reveals that 4acC peaks are mostly distributed in euchromatin regions and present in nearly half of the expressed protein-coding genes in Arabidopsis. 4acC is mainly located around transcription start sites and positively correlates with gene expression levels. Imbalance of 5mC does not directly affect 4acC modification. We also characterize the associations of 4acC with 5mC and histone modifications that cooperatively regulate gene expression. Moreover, 4acC is also detected in genomic DNA of rice, maize, mouse, and human by mass spectrometry. Conclusions: Our findings reveal 4acC as a hitherto unknown DNA modification in higher eukaryotes. We identify potential interactions of this mark with other epigenetic marks in gene expression regulation.


2020 ◽  
Vol 48 (15) ◽  
pp. 8509-8528
Author(s):  
Mengjun Wu ◽  
Evdoxia Karadoulama ◽  
Marta Lloret-Llinares ◽  
Jerome Olivier Rouviere ◽  
Christian Skov Vaagensø ◽  
...  

Abstract The ribonucleolytic exosome complex is central for nuclear RNA degradation, primarily targeting non-coding RNAs. Still, the nuclear exosome could have protein-coding (pc) gene-specific regulatory activities. By depleting an exosome core component, or components of exosome adaptor complexes, we identify ∼2900 transcription start sites (TSSs) from within pc genes that produce exosome-sensitive transcripts. At least 1000 of these overlap with annotated mRNA TSSs and a considerable portion of their transcripts share the annotated mRNA 3′ end. We identify two types of pc-genes, both employing a single, annotated TSS across cells, but the first type primarily produces full-length, exosome-sensitive transcripts, whereas the second primarily produces prematurely terminated transcripts. Genes within the former type often belong to immediate early response transcription factors, while genes within the latter are likely transcribed as a consequence of their proximity to upstream TSSs on the opposite strand. Conversely, when genes have multiple active TSSs, alternative TSSs that produce exosome-sensitive transcripts typically do not contribute substantially to overall gene expression, and most such transcripts are prematurely terminated. Our results display a complex landscape of sense transcription within pc-genes and imply a direct role for nuclear RNA turnover in the regulation of a subset of pc-genes.


Viruses ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 795
Author(s):  
Rui Li ◽  
Rachel Sklutuis ◽  
Jennifer L. Groebner ◽  
Fabio Romerio

Natural antisense transcripts (NATs) represent a class of RNA molecules that are transcribed from the opposite strand of a protein-coding gene, and that have the ability to regulate the expression of their cognate protein-coding gene via multiple mechanisms. NATs have been described in many prokaryotic and eukaryotic systems, as well as in the viruses that infect them. The human immunodeficiency virus (HIV-1) is no exception, and produces one or more NAT from a promoter within the 3’ long terminal repeat. HIV-1 antisense transcripts have been the focus of several studies spanning over 30 years. However, a complete appreciation of the role that these transcripts play in the virus lifecycle is still lacking. In this review, we cover the current knowledge about HIV-1 NATs, discuss some of the questions that are still open and identify possible areas of future research.


2021 ◽  
Vol 22 (9) ◽  
pp. 4686
Author(s):  
Marta Irla ◽  
Sigrid Hakvåg ◽  
Trygve Brautaset

Genome-wide transcriptomic data obtained in RNA-seq experiments can serve as a reliable source for identification of novel regulatory elements such as riboswitches and promoters. Riboswitches are parts of the 5′ untranslated region of mRNA molecules that can specifically bind various metabolites and control gene expression. For that reason, they have become an attractive tool for engineering biological systems, especially for the regulation of metabolic fluxes in industrial microorganisms. Promoters in the genomes of prokaryotes are located upstream of transcription start sites and their sequences are easily identifiable based on the primary transcriptome data. Bacillus methanolicus MGA3 is a candidate for use as an industrial workhorse in methanol-based bioprocesses and its metabolism has been studied in systems biology approaches in recent years, including transcriptome characterization through RNA-seq. Here, we identify a putative lysine riboswitch in B. methanolicus, and test and characterize it. We also select and experimentally verify 10 putative B. methanolicus-derived promoters differing in their predicted strength and present their functionality in combination with the lysine riboswitch. We further explore the potential of a B. subtilis-derived purine riboswitch for regulation of gene expression in the thermophilic B. methanolicus, establishing a novel tool for inducible gene expression in this bacterium.

