Readthrough Errors Purge Deleterious Cryptic Sequences, Facilitating the Birth of Coding Sequences

2020 ◽  
Vol 37 (6) ◽  
pp. 1761-1774 ◽  
Author(s):  
Luke J Kosinski ◽  
Joanna Masel

Abstract De novo protein-coding innovations sometimes emerge from ancestrally noncoding DNA, despite the expectation that translating random sequences is overwhelmingly likely to be deleterious. The “preadapting selection” hypothesis claims that emergence is facilitated by prior, low-level translation of noncoding sequences via molecular errors. It predicts that selection on polypeptides translated only in error is strong enough to matter and is strongest when erroneous expression is high. To test this hypothesis, we examined noncoding sequences located downstream of stop codons (i.e., those potentially translated by readthrough errors) in Saccharomyces cerevisiae genes. We identified a class of “fragile” proteins under strong selection to reduce readthrough, which are unlikely substrates for co-option. Among the remainder, sequences showing evidence of readthrough translation, as assessed by ribosome profiling, encoded C-terminal extensions with higher intrinsic structural disorder, supporting the preadapting selection hypothesis. The cryptic sequences beyond the stop codon, rather than spillover effects from the regular C-termini, are primarily responsible for the higher disorder. Results are robust to controlling for the fact that stronger selection also reduces the length of C-terminal extensions. These findings indicate that selection acts on 3′ UTRs in Saccharomyces cerevisiae to purge potentially deleterious variants of cryptic polypeptides, acting more strongly in genes that experience more readthrough errors.
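To make the notion of a readthrough-derived C-terminal extension concrete, the following is a minimal sketch and not the authors' pipeline. It assumes the input is the 3′ UTR sequence beginning immediately after the annotated stop codon, uses the standard genetic code, and ignores the amino acid inserted at the misread stop codon itself. The function name `readthrough_extension` is hypothetical.

```python
# Standard genetic code; codons are ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def readthrough_extension(utr3):
    """Translate a 3' UTR in frame until the next stop codon.

    Assumption: utr3 starts at the first nucleotide after the annotated
    stop codon, so codon 0 of utr3 is in frame with the upstream ORF.
    Returns the hypothetical C-terminal extension as a one-letter string.
    """
    seq = utr3.upper().replace("U", "T")  # accept RNA or DNA input
    peptide = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = unknown codon
        if amino_acid == "*":  # next in-frame stop ends the extension
            break
        peptide.append(amino_acid)
    return "".join(peptide)
```

For example, `readthrough_extension("GCTGGTAAATAAGGG")` yields `"AGK"`: three codons are translated before the in-frame TAA. The resulting peptide could then be passed to a disorder predictor of the reader's choice; that step is not shown here.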

2019 ◽  
Author(s):  
Luke Kosinski ◽  
Joanna Masel

Abstract De novo protein-coding innovations sometimes emerge from ancestrally non-coding DNA, despite the expectation that translating random sequences is overwhelmingly likely to be deleterious. The “pre-adapting selection” hypothesis claims that emergence is facilitated by prior, low-level translation of non-coding sequences via molecular errors. It predicts that selection on polypeptides translated only in error is strong enough to matter, and is strongest when erroneous expression is high. To test this hypothesis, we examined non-coding sequences located downstream of stop codons (i.e. those potentially translated by readthrough errors) in Saccharomyces cerevisiae genes. We identified a class of “fragile” proteins under strong selection to reduce readthrough, which are unlikely substrates for co-option. Among the remainder, sequences showing evidence of readthrough translation, as assessed by ribosome profiling, encoded C-terminal extensions with higher intrinsic structural disorder, supporting the pre-adapting selection hypothesis. The cryptic sequences beyond the stop codon, rather than spillover effects from the regular C-termini, are primarily responsible for the higher disorder. Results are robust to controlling for the fact that stronger selection also reduces the length of C-terminal extensions. These findings indicate that selection acts on 3′ UTRs in S. cerevisiae to purge potentially deleterious variants of cryptic polypeptides, acting more strongly in genes that experience more readthrough errors.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Chen Xie ◽  
Cemalettin Bekpen ◽  
Sven Künzel ◽  
Maryam Keshavarz ◽  
Rebecca Krebs-Wheaton ◽  
...  

The de novo emergence of new genes has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here, we identify a set of house mouse-specific protein-coding genes and assess their translation by ribosome profiling and mass spectrometry data. We functionally analyze one of them, Gm13030, which is specifically expressed in females in the oviduct. The interruption of the reading frame affects the transcriptional network in the oviducts at a specific stage of the estrous cycle. This includes the upregulation of Dcpp genes, which are known to stimulate the growth of preimplantation embryos. As a consequence, knockout females have their second litters after shorter times and have a higher infanticide rate. Given that Gm13030 shows no signs of positive selection, our findings support the hypothesis that a de novo evolved gene can directly adopt a function without much sequence adaptation.


2019 ◽  
Vol 47 (20) ◽  
pp. 10543-10552 ◽  
Author(s):  
Alexander Donath ◽  
Frank Jühling ◽  
Marwa Al-Arab ◽  
Stephan H Bernhart ◽  
Franziska Reinhardt ◽  
...  

Abstract With the rapid increase of sequenced metazoan mitochondrial genomes, a detailed manual annotation is becoming more and more infeasible. While it is easy to identify the approximate location of protein-coding genes within mitogenomes, the peculiar processing of mitochondrial transcripts makes the determination of precise gene boundaries a surprisingly difficult problem. We have analyzed the properties of annotated start and stop codon positions in detail, and use the inferred patterns to devise a new method for predicting gene boundaries in de novo annotations. Our method benefits from empirically observed prevalences of start/stop codons and gene lengths, and considers the dependence of these features on variations of genetic codes. Although not perfect, our new approach yields a drastic improvement in the accuracy of gene boundaries and upgrades the mitochondrial genome annotation server MITOS to an even more sophisticated tool for fully automatic annotation of metazoan mitochondrial genomes.


Genetics ◽  
2008 ◽  
Vol 179 (1) ◽  
pp. 487-496 ◽  
Author(s):  
Jing Cai ◽  
Ruoping Zhao ◽  
Huifeng Jiang ◽  
Wen Wang

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Andreas Lange ◽  
Prajal H. Patel ◽  
Brennen Heames ◽  
Adam M. Damry ◽  
Thorsten Saenger ◽  
...  

Abstract Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard’s orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard’s structure appears to have been maintained with only minor changes over millions of years.


2020 ◽  
Vol 12 (11) ◽  
pp. 2183-2195
Author(s):  
Daniel Dowling ◽  
Jonathan F Schmitz ◽  
Erich Bornberg-Bauer

Abstract In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity—which have been proposed to play a role in survival of de novo genes—remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.


2015 ◽  
Author(s):  
Katarzyna B Hooks ◽  
Samina Naseeb ◽  
Sam Griffiths-Jones ◽  
Daniela Delneri

The Saccharomyces cerevisiae genome has undergone extensive intron loss during its evolutionary history. It has been suggested that the few remaining introns (in only 5% of protein-coding genes) are retained because of their impact on function under stress conditions. Here, we explore the possibility that novel non-coding RNA structures (ncRNAs) are embedded within intronic sequences and are contributing to phenotype and intron retention in yeast. We employed de novo RNA structure prediction tools to screen intronic sequences in S. cerevisiae and 36 other fungi. We identified and validated 19 new intronic RNAs via RNAseq and RT-PCR. Contrary to the common belief that excised introns are rapidly degraded, we found that, in six cases, the excised introns were maintained intact in the cells. In two other cases we showed that the ncRNAs were further processed from their introns. RNAseq analysis confirmed higher expression of introns in the ribosomal protein genes containing predicted RNA structures. We deleted the novel intronic RNA structure within the GLC7 intron and showed that this predicted ncRNA, rather than the intron itself, is responsible for the cell’s ability to respond to salt stress. We also showed a direct association between the presence of the intronic ncRNA and GLC7 expression. Overall, these data support the notion that some introns may have been maintained in the genome because they harbour functional ncRNAs.


2015 ◽  
Author(s):  
Lorenzo Calviello ◽  
Neelanjan Mukherjee ◽  
Emanuel Wyler ◽  
Henrik Zauber ◽  
Antje Hirsekorn ◽  
...  

RNA sequencing protocols allow for quantifying gene expression regulation at each individual step, from transcription to protein synthesis. Ribosome Profiling (Ribo-seq) maps the positions of translating ribosomes over the entire transcriptome. Despite its great potential, a rigorous statistical approach to identify translated regions by means of the characteristic three-nucleotide periodicity of Ribo-seq data is not yet available. To fill this gap, we developed RiboTaper, which quantifies the significance of periodic Ribo-seq reads via spectral analysis methods. We applied RiboTaper to newly generated, deep Ribo-seq data in HEK293 cells, to derive an extensive map of translation that covers Open Reading Frame (ORF) annotations for more than 11,000 protein-coding genes. We also find distinct ribosomal signatures for several hundred detected upstream ORFs and ORFs in annotated non-coding genes (ncORFs). Mass spectrometry data confirm that RiboTaper achieves excellent coverage of the cellular proteome and validate dozens of novel peptide products. Collectively, RiboTaper (available at https://ohlerlab.mdc-berlin.de/software/ ) is a powerful method for comprehensive de novo identification of actively used ORFs in the human genome.
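The three-nucleotide periodicity mentioned above can be illustrated with a plain discrete Fourier transform. This is only a sketch under stated assumptions: it is not RiboTaper's method and includes no significance test. It assumes the input is a list of per-nucleotide ribosome footprint counts (for example, inferred P-site positions) along a candidate ORF; the function name `periodicity_fraction` is hypothetical.

```python
import cmath
import math


def periodicity_fraction(counts):
    """Fraction of non-constant signal power at a period of 3 nucleotides.

    Assumption: counts[t] is the number of footprints assigned to
    nucleotide t of a candidate ORF. Returns a value between 0 and 1,
    where 1 means the signal varies only with a period of exactly 3.
    """
    n = len(counts) - len(counts) % 3  # truncate to whole codons
    x = counts[:n]
    if n == 0:
        return 0.0
    total = sum(x)
    # Parseval: the sum of |X_k|^2 over all k equals n * sum(x^2);
    # subtracting |X_0|^2 = total^2 leaves the non-constant power.
    non_dc_power = n * sum(v * v for v in x) - total * total
    if non_dc_power <= 0:
        return 0.0  # flat signal, no periodicity to measure
    # DFT coefficient at frequency 1/3 cycles per nucleotide.
    coeff = sum(v * cmath.exp(-2j * math.pi * t / 3) for t, v in enumerate(x))
    # A real signal has a mirror bin of equal power at frequency 2/3,
    # hence the factor of 2.
    return 2 * abs(coeff) ** 2 / non_dc_power
```

A profile with all reads in one frame, such as `[10, 0, 0] * 5`, scores 1 up to floating-point rounding, while a flat profile scores 0. Real data would additionally need a statistical test to distinguish periodicity from noise, which this sketch omits.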


2018 ◽  
Author(s):  
Villu Kasari ◽  
Tõnu Margus ◽  
Gemma C. Atkinson ◽  
Marcus J.O. Johansson ◽  
Vasili Hauryliuk

Abstract In addition to the standard set of translation factors common in eukaryotic organisms, protein synthesis in the yeast Saccharomyces cerevisiae requires an ABCF ATPase factor eEF3, eukaryotic Elongation Factor 3. eEF3 is an E-site binder that was originally identified as an essential factor involved in the elongation stage of protein synthesis. Recent biochemical experiments suggest an additional function of eEF3 in ribosome recycling. We have characterised the global effects of eEF3 depletion on translation using ribosome profiling. Depletion of eEF3 results in decreased ribosome density at the stop codon, indicating that ribosome recycling does not become rate limiting when eEF3 levels are low. Consistent with a defect in translation elongation, eEF3 depletion causes a moderate redistribution of ribosomes towards the 5’ part of the open reading frames. We observed no E-site codon- or amino acid-specific ribosome stalling upon eEF3 depletion, supporting its role as a general elongation factor. Surprisingly, depletion of eEF3 leads to a relative decrease in P-site proline stalling, which we hypothesise is a secondary effect of generally decreased translation and/or decreased competition for the E-site with eIF5A.


2020 ◽  
Author(s):  
G Loughran ◽  
AV Zhdanov ◽  
MS Mikhaylova ◽  
FN Rozov ◽  
PN Datskevich ◽  
...  

Abstract While near cognate codons are frequently used for translation initiation in eukaryotes, their efficiencies are usually low (<10% compared to an AUG in optimal context). Here we describe a rare case of highly efficient near cognate initiation. A CUG triplet located in the 5’ leader of POLG mRNA initiates almost as efficiently (~60-70%) as an AUG in optimal context. This CUG directs translation of a conserved 260 triplet-long overlapping ORF, which we call POLGARF (POLG Alternative Reading Frame). Translation of a short upstream ORF 5’ of this CUG governs the ratio between DNA polymerase and POLGARF produced from a single POLG mRNA. Functional investigation of POLGARF points to extracellular signalling. While unprocessed POLGARF resides in the nucleoli together with its interacting partner C1QBP, serum stimulation results in rapid secretion of the POLGARF C-terminal fragment. Phylogenetic analysis shows that POLGARF evolved ~160 million years ago due to an MIR transposition into the 5’ leader sequence of the mammalian POLG gene which became fixed in placental mammals. The discovery of POLGARF unveils a previously undescribed mechanism of de novo protein-coding gene evolution. Significance Statement: In this study, we describe a previously unknown mechanism of de novo protein-coding gene evolution. We show that the POLG gene, which encodes the catalytic subunit of mitochondrial DNA polymerase, is in fact a dual coding gene. Ribosome profiling, phylogenetic conservation, and reporter construct analyses all demonstrate that POLG mRNA possesses a conserved CUG codon which serves as a start of translation for an exceptionally long overlapping open reading frame (260 codons in human) present in all placental mammals. We called the protein encoded in this alternative reading frame POLGARF. We provide evidence that the evolution of POLGARF was incepted upon insertion of an MIR transposable element of the SINE family.

