Are Antisense Proteins in Prokaryotes Functional?

Author(s):  
Zachary Ardern ◽  
Klaus Neuhaus ◽  
Siegfried Scherer

Abstract: Many prokaryotic RNAs are transcribed from loci outside of annotated protein coding genes. Across bacterial species, hundreds of short open reading frames antisense to annotated genes show evidence of both transcription and translation, for instance in ribosome profiling data. Determining the functional fraction of these protein products awaits further research, including insights from studies of molecular interactions and detailed evolutionary analysis. There are, however, multiple lines of evidence that many of these newly discovered proteins are of use to the organism. Condition-specific phenotypes have been characterised for a few. These proteins should be added to genome annotations, and the methods for predicting them standardised. Evolutionary analysis of these typically young sequences may also provide important insights into gene evolution. This research should be prioritised for its exciting potential to uncover large numbers of novel proteins with extremely diverse potential practical uses, including applications in synthetic biology and responding to pathogens.
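As a rough illustration of what "open reading frames antisense to annotated genes" means computationally, the sketch below scans the reverse complement of a sense-strand sequence for ATG-to-stop ORFs. It is a minimal sketch under stated assumptions, not the authors' prediction method: the function names and the `min_codons` threshold are hypothetical, only ATG is treated as a start codon (bacteria also use alternative starts such as GTG and TTG), and no transcription or ribosome profiling evidence is considered.

```python
# Hypothetical helper names; a toy ORF scan, not a published annotation pipeline.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10):
    """Find ATG..stop ORFs in the three forward frames of seq.

    Returns (start, end, frame) tuples with 0-based start and exclusive end
    (the end includes the stop codon). min_codons counts the codons from the
    ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


def antisense_orfs(sense_seq, min_codons=10):
    """ORFs on the strand opposite to sense_seq.

    Coordinates refer to the reverse-complemented sequence.
    """
    return find_orfs(reverse_complement(sense_seq), min_codons)
```

Candidates found this way are only sequence-level hypotheses; the abstract's point is that evidence of transcription and translation is what distinguishes plausible antisense proteins from chance ORFs.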

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
David S. M. Lee ◽  
Joseph Park ◽  
Andrew Kromer ◽  
Aris Baras ◽  
Daniel J. Rader ◽  
...  

Abstract: Ribosome profiling has uncovered pervasive translation in non-canonical open reading frames; however, the biological significance of this phenomenon remains unclear. Using genetic variation from 71,702 human genomes, we assess patterns of selection in translated upstream open reading frames (uORFs) in 5’UTRs. We show that uORF variants introducing new stop codons, or strengthening existing stop codons, are under strong negative selection comparable to protein-coding missense variants. Using these variants, we map and validate gene-disease associations in two independent biobanks containing exome sequencing from 10,900 and 32,268 individuals, respectively, and elucidate their impact on protein expression in human cells. Our results suggest translation-disrupting mechanisms relating uORF variation to reduced protein expression, and demonstrate that translation at uORFs is genetically constrained in 50% of human genes.


2019 ◽  
Vol 1 (1) ◽  
pp. e2-e2 ◽  
Author(s):  
Jorge Ruiz-Orera ◽  
M Mar Albà

Abstract: The mammalian transcriptome includes thousands of transcripts that do not correspond to annotated protein-coding genes and that are known as long non-coding RNAs (lncRNAs). A handful of lncRNAs have well-characterized regulatory functions but the biological significance of the majority of them is not well understood. LncRNAs that are conserved between mice and humans are likely to be enriched in functional sequences. Here, we investigate the presence of different types of ribosome profiling signatures in lncRNAs and how they relate to sequence conservation. We find that lncRNA-conserved regions contain three times more ORFs with translation evidence than non-conserved ones, and identify nine cases that display significant sequence constraints at the amino acid sequence level. The study also reveals that conserved regions in intergenic lncRNAs are significantly enriched in protein–RNA interaction signatures when compared to non-conserved ones; this includes sites in well-characterized lncRNAs, such as Cyrano, Malat1, Neat1 and Meg3, as well as in tens of lncRNAs of unknown function. This work illustrates how the analysis of ribosome profiling data coupled with evolutionary analysis provides new opportunities to explore the lncRNA functional landscape.


2021 ◽  
Author(s):  
Yuta Hiragori ◽  
Hiro Takahashi ◽  
Noriya Hayashi ◽  
Shun Sasaki ◽  
Kodai Nakao ◽  
...  

Upstream open reading frames (uORFs) are short ORFs found in the 5′-UTRs of many eukaryotic transcripts and can influence the translation of protein-coding main ORFs (mORFs). Recent genome-wide ribosome profiling studies have revealed that thousands of uORFs initiate translation at non-AUG start codons. However, the physiological significance of these non-AUG uORFs has so far been demonstrated for only a few of them. It is conceivable that physiologically important non-AUG uORFs are evolutionarily conserved across species. In this study, using a combination of bioinformatics and experimental approaches, we searched the Arabidopsis genome for non-AUG-initiated uORFs with conserved sequences that control the expression of the mORF-encoded proteins. As a result, we identified four novel regulatory non-AUG uORFs. Among these, two exerted repressive effects on mORF expression in an amino acid sequence-dependent manner. These two non-AUG uORFs are likely to encode regulatory peptides that cause ribosome stalling, thereby enhancing their repressive effects. In contrast, one of the identified regulatory non-AUG uORFs promoted mORF expression by alleviating the inhibitory effect of a downstream AUG-initiated uORF. These findings provide insights into the mechanisms that enable non-AUG uORFs to play regulatory roles despite their low translation initiation efficiencies.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 464 ◽  
Author(s):  
Leos G. Kral ◽  
Sara Watson

Background: Mitochondrial DNA of vertebrates contains genes for 13 proteins involved in oxidative phosphorylation. Some of these genes have been shown to undergo adaptive evolution in a variety of species. This study examines all mitochondrial protein coding genes in 11 darter species to determine if any of these genes show evidence of positive selection. Methods: The mitogenomes of four darter species were sequenced and annotated. Mitogenome sequences for another seven species were obtained from GenBank. Alignments of each of the protein coding genes were subjected to codon-based identification of positive selection by Selecton, MEME and FEL. Results: Evidence of positive selection was obtained for six of the genes by at least one of the methods. CYTB was identified as having evolved under positive selection by all three methods at the same codon location. Conclusions: Given the evidence for positive selection of mitochondrial protein coding genes in darters, a more extensive analysis of mitochondrial gene evolution in all the extant darter species is warranted.


2016 ◽  
Author(s):  
Jorge Ruiz-Orera ◽  
Pol Verdaguer-Grau ◽  
José Luis Villanueva-Cañas ◽  
Xavier Messeguer ◽  
M Mar Albà

Abstract: There is accumulating evidence that some genes have originated de novo from previously non-coding genomic sequences. However, the processes underlying de novo gene birth are still enigmatic. In particular, the appearance of a new functional protein seems highly improbable unless there is already a pool of neutrally evolving peptides that can at some point acquire new functions. Here we show for the first time that such peptides not only exist but are prevalent among the translation products of mouse genes that lack homologues in rat and human. The data suggest that the translation of these peptides is due to the chance occurrence of open reading frames with a favorable codon composition. Our approach combines ribosome profiling experiments, proteomics data and non-synonymous and synonymous nucleotide polymorphism analysis. We propose that effectively neutral processes involving the expression of thousands of transcripts all the way down to proteins provide a basis for de novo gene evolution.


2018 ◽  
Author(s):  
Anica Scholz ◽  
Florian Eggenhofer ◽  
Rick Gelhausen ◽  
Björn Grüning ◽  
Kathi Zarnack ◽  
...  

Abstract: Ribosome profiling (ribo-seq) provides a means to analyze active translation by determining ribosome occupancy in a transcriptome-wide manner. The vast majority of ribosome protected fragments (RPFs) reside within the protein-coding sequence of mRNAs. However, reads are also commonly found within the transcript leader sequence (TLS) (also known as the 5’ untranslated region) preceding the main open reading frame (ORF), indicating the translation of regulatory upstream ORFs (uORFs). Here, we present a workflow for the identification of translation-regulatory uORFs. Specifically, uORF-Tools identifies uORFs within a given dataset and generates a uORF annotation file. In addition, a comprehensive human uORF annotation file, based on 35 ribo-seq files, is provided, which can serve as an alternative input file for the workflow. To assess the translation-regulatory activity of the uORFs, stimulus-induced changes in the ratio of the RPFs residing in the main ORFs relative to those found in the associated uORFs are determined. The resulting output file allows for the easy identification of candidate uORFs that have translation-inhibitory effects on their associated main ORFs. uORF-Tools is available as a free and open Snakemake workflow at https://github.com/Biochemistry1-FFM/uORF-Tools. It is easily installed and all necessary tools are provided in a version-controlled manner, which also ensures lasting usability. uORF-Tools is designed for intuitive use and requires only limited computing time and resources.
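The ratio-based measure described above can be sketched in a few lines. This is an illustration of the idea only and is not taken from uORF-Tools: the function names are hypothetical, the pseudocount (used to avoid division by zero) is an assumption, and the library-size normalisation and replicate statistics a real analysis would need are omitted.

```python
from math import log2


def orf_ratio(main_orf_reads, uorf_reads, pseudocount=1.0):
    """Ratio of RPF counts in the main ORF to RPF counts in its uORF.

    The pseudocount is an assumption added to keep the ratio defined
    when a uORF has zero reads.
    """
    return (main_orf_reads + pseudocount) / (uorf_reads + pseudocount)


def log2_ratio_change(control, treated, pseudocount=1.0):
    """log2 change of the main-ORF/uORF ratio between two conditions.

    control and treated are (main_orf_reads, uorf_reads) tuples.
    A negative value means that main-ORF occupancy dropped relative to
    uORF occupancy after the stimulus.
    """
    ratio_control = orf_ratio(*control, pseudocount=pseudocount)
    ratio_treated = orf_ratio(*treated, pseudocount=pseudocount)
    return log2(ratio_treated / ratio_control)


# Toy counts: main-ORF reads halve (after the pseudocount) while uORF reads stay constant.
change = log2_ratio_change(control=(99, 9), treated=(49, 9))
```

With these toy counts the ratio falls from 10 to 5, so `change` is -1. A shift of this kind would flag the uORF as a candidate for further inspection rather than prove a regulatory effect.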


2017 ◽  
Author(s):  
Pierre Murat ◽  
Giovanni Marsico ◽  
Barbara Herdy ◽  
Avazeh Ghanbarian ◽  
Guillem Portella ◽  
...  

Abstract: RNA secondary structures in the 5’ untranslated regions (UTRs) of mRNAs have been characterised as key determinants of translation initiation. However, the role of non-canonical secondary structures, such as RNA G-quadruplexes (rG4s), in modulating translation of human mRNAs and the associated mechanisms remain largely unappreciated. Here we use a ribosome profiling strategy to investigate the translational landscape of human mRNAs with structured 5’ untranslated regions (5’-UTR). We found that inefficiently translated mRNAs, containing rG4-forming sequences in their 5’-UTRs, have an accumulation of ribosome footprints in their 5’-UTRs. We show that rG4-forming sequences are determinants of 5’-UTR translation, suggesting that the folding of rG4 structures thwarts the translation of protein coding sequences (CDS) by stimulating the translation of repressive upstream open reading frames (uORFs). To support our model, we demonstrate that depletion of two rG4s-specialised DEAH-box helicases, DHX36 and DHX9, shifts translation towards rG4-containing uORFs, reducing the translation of selected transcripts comprising proto-oncogenes, transcription factors and epigenetic regulators. Transcriptome-wide identification of DHX9 binding sites using individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) demonstrates that translation regulation is mediated through direct physical interaction between the helicase and its rG4 substrate. Our findings unveil a previously unknown role for non-canonical structures in governing 5’-UTR translation and suggest that the interaction of helicases with rG4s could be considered as a target for future therapeutic intervention.


Author(s):  
Katarzyna Pancer ◽  
Aleksandra Milewska ◽  
Katarzyna Owczarek ◽  
Agnieszka Dabrowska ◽  
Wojciech Branicki ◽  
...  

Abstract: SARS-CoV-2 genome annotation revealed the presence of 10 open reading frames (ORFs), of which the last one (ORF10) is positioned downstream of the N gene. It is a hypothetical gene, which was speculated to encode a 38 aa protein. This hypothetical protein does not share sequence similarity with any other known protein and cannot be associated with a function. While a role for ORF10 has been proposed, there is growing evidence that ORF10 is not a coding region. Here, we identified SARS-CoV-2 variants in which the ORF10 gene was prematurely terminated. The disease was not attenuated, and the transmissibility between humans was not hampered. In vitro, too, the strains replicated similarly to related viruses with an intact ORF10. Altogether, based on clinical observation and laboratory analyses, it appears that the ORF10 protein is not essential in humans. This observation further supports the view that ORF10 should not be treated as a protein-coding gene, and that the genome annotations should be amended.


Agronomy ◽  
2020 ◽  
Vol 10 (9) ◽  
pp. 1405
Author(s):  
Gurusamy Raman ◽  
SeonJoo Park

The plant “False Lily of the Valley”, Speirantha gardenii, is restricted to south-east China and is considered endemic there. Because of its limited availability, the plant has been little studied. This study therefore focuses on its molecular characterisation: we sequenced the complete chloroplast genome of S. gardenii, which is the first report of a chloroplast genome sequence for Speirantha. The complete S. gardenii chloroplast genome is 156,869 bp in length with 37.6% GC content, and includes a pair of inverted repeats (IRs) of 26,437 bp each that separate a large single-copy (LSC) region of 85,368 bp from a small single-copy (SSC) region of 18,627 bp. The chloroplast genome comprises 81 protein-coding, 30 tRNA and four rRNA unique genes. Furthermore, a total of 699 repeats and 805 simple-sequence repeat (SSR) markers were identified in the genome. Additionally, KA/KS nucleotide substitution analysis showed that seven protein-coding genes are highly diverged and identified nine amino acid sites under potentially positive selection in these genes. Phylogenetic analyses suggest that S. gardenii has a close genetic relationship to the Reineckea, Rohdea and Convallaria genera. The present study will provide insights into developing a lineage-specific marker for genetic diversity and gene evolution studies in the Nolinoideae taxa.
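The region lengths reported above can be checked against the total genome length, since a quadripartite chloroplast genome consists of the LSC, the SSC and two copies of the IR. The short sketch below does that arithmetic; the function and constant names are illustrative only, and the values are those given in the abstract.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 85_368    # large single-copy region
SSC = 18_627    # small single-copy region
IR = 26_437     # one inverted repeat; the genome carries two copies
TOTAL = 156_869  # reported complete genome length


def quadripartite_length(lsc, ssc, ir):
    """Total length of a chloroplast genome with LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# 85,368 + 18,627 + 2 * 26,437 = 156,869, matching the reported total.
assert quadripartite_length(LSC, SSC, IR) == TOTAL
```

The reported figures are internally consistent: the four regions account for every base of the 156,869 bp genome.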


eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Sondos Samandi ◽  
Annie V Roy ◽  
Vivian Delcourt ◽  
Jean-François Lucier ◽  
Jules Gagnon ◽  
...  

Recent functional, proteomic and ribosome profiling studies in eukaryotes have concurrently demonstrated the translation of alternative open-reading frames (altORFs) in addition to annotated protein coding sequences (CDSs). We show that a large number of small proteins could in fact be coded by these altORFs. The putative alternative proteins translated from altORFs have orthologs in many species and contain functional domains. Evolutionary analyses indicate that altORFs often show more extreme conservation patterns than their CDSs. Thousands of alternative proteins are detected in proteomic datasets by reanalysis using a database containing predicted alternative proteins. This is illustrated with specific examples, including altMiD51, a 70 amino acid mitochondrial fission-promoting protein encoded in MiD51/Mief1/SMCR7L, a gene encoding an annotated protein promoting mitochondrial fission. Our results suggest that many genes are multicoding genes and code for a large protein and one or several small proteins.

