Evolutionary divergence of novel open reading frames in cichlids speciation

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Shraddha Puntambekar ◽  
Rachel Newhouse ◽  
Jaime San Miguel Navas ◽  
Ruchi Chauhan ◽  
Grégoire Vernaz ◽  
...  

Abstract Novel open reading frames (nORFs) with coding potential may arise from noncoding DNA. Not much is known about their emergence, functional role, fixation in a population or contribution to adaptive radiation. Cichlid fishes exhibit extensive phenotypic diversification and speciation. Encounters with new environments alone are not sufficient to explain this striking diversity of cichlid radiation because other taxa coexistent with the Cichlidae demonstrate lower species richness. Wagner et al. analyzed cichlid diversification in 46 African lakes and reported that both extrinsic environmental factors and intrinsic lineage-specific traits related to sexual selection have strongly influenced the cichlid radiation, which indicates the existence of unknown molecular mechanisms responsible for rapid phenotypic diversification, such as emergence of novel open reading frames (nORFs). In this study, we integrated transcriptomic and proteomic signatures from two tissues of two cichlid species, identified nORFs and performed evolutionary analysis on these nORF regions. Our results suggest that the time scale of speciation of the two species and evolutionary divergence of these nORF genomic regions are similar and indicate a potential role for these nORFs in speciation of the cichlid fishes.


2020 ◽  
Vol 12 (11) ◽  
pp. 2183-2195
Author(s):  
Daniel Dowling ◽  
Jonathan F Schmitz ◽  
Erich Bornberg-Bauer

Abstract In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity—which have been proposed to play a role in survival of de novo genes—remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.


1987 ◽  
Vol 7 (12) ◽  
pp. 4266-4272 ◽  
Author(s):  
L W Stanton ◽  
J M Bishop

NMYC is a gene whose amplification and overexpression have been implicated in the generation of certain human malignancies. Little is known of how the expression of NMYC is normally controlled. We have therefore characterized transcription from the gene and the structure and stability of the resulting mRNAs. Transcription from NMYC is exceptionally complex: it initiates at numerous sites that may be grouped under the control of two promoters, and the multiplicity of initiation sites combines with alternative splicing to engender two forms of mRNA. The mRNAs have different 5' leader sequences (alternative first exons of the gene) but identical bodies (the second and third exons of the gene). Both forms of mRNA are unstable, with half-lives of ca. 15 min. Both encode the previously identified 65,000 and 67,000-dalton products of NMYC. However, the alternative first exons contain distinctive open reading frames that may diversify the coding potential of NMYC. The complexities in transcription of NMYC expand the means by which expression of the gene might be controlled.


2006 ◽  
Vol 80 (8) ◽  
pp. 4179-4182 ◽  
Author(s):  
Pierre Rivailler ◽  
Amitinder Kaur ◽  
R. Paul Johnson ◽  
Fred Wang

ABSTRACT A pathogenic isolate of rhesus cytomegalovirus (rhCMV 180.92) was cloned, sequenced, and annotated. Comparisons with the published rhCMV 68.1 genome revealed 8 open reading frames (ORFs) in isolate 180.92 that are absent in 68.1, 10 ORFs in 68.1 that are absent in 180.92, and 34 additional ORFs that were not previously annotated. Most of the differences appear to be due to genetic rearrangements in both isolates from a region that is frequently altered in human CMV (hCMV) during in vitro passage. These results indicate that the rhCMV ORF repertoire is larger than previously recognized. As with hCMV, understanding the complete coding capacity of rhCMV is complicated by genomic instability and may require comparisons with additional isolates in vitro and in vivo.


2021 ◽  
Author(s):  
Yanyi Jiang ◽  
Xiaofan Chen ◽  
Wei Zhang

Abstract In the RNA field, the demarcation between coding and non-coding has been blurred by the recent discovery of occasionally translated circular RNAs (circRNAs). Although they lack a 5′ cap structure, circRNAs can be translated cap-independently. Complementary intron-mediated overexpression is one of the most widely used methods in circRNA research, but it has drawn persistent skepticism because of its poorly defined mechanism and potential coexisting side products. In this study, leveraging such a circRNA overexpression system, we interrogated the protein-coding potential of 30 human circRNAs containing infinite open reading frames in HEK293T cells. Surprisingly, pervasive translation signals are detected by immunoblotting. However, intensive mutagenesis reveals that numerous translation signals are generated independently of circRNA synthesis. We have developed a dual tag strategy to isolate translation noise and directly demonstrate that the fallacious translation signals originate from cryptically spliced linear transcripts. The concomitant linear RNA byproducts, presumably concatemers, can be translated to produce pseudo rolling circle translation signals, and can contain the backsplicing junction (BSJ), which undermines BSJ-based evidence for circRNA translation. We also find that non-AUG start codons may engage in the translation initiation of circRNAs. Taken together, our systematic evaluation sheds light on the heterogeneous translational outputs of circRNA overexpression vectors and comes with a caveat that ectopic overexpression requires extremely rigorous controls in circRNA translation and functional investigation.


2020 ◽  
Vol 40 (6) ◽  
Author(s):  
Corrine Corrina R. Hartford ◽  
Ashish Lal

ABSTRACT Recent advancements in genetic and proteomic technologies have revealed that more of the genome encodes proteins than originally thought possible. Specifically, some putative long noncoding RNAs (lncRNAs) have been misannotated as noncoding. Numerous lncRNAs have been found to contain short open reading frames (sORFs) which have been overlooked because of their small size. Many of these sORFs encode small proteins or micropeptides with fundamental biological importance. These micropeptides can aid in diverse processes, including cell division, transcription regulation, and cell signaling. Here we discuss strategies for establishing the coding potential of putative lncRNAs and describe various functions of known micropeptides.
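As a rough illustration of why sORFs are easy to overlook, the sketch below scans the forward strand of a DNA sequence for ATG-to-stop ORFs and applies a minimum-length cutoff. It is not a method from the review: the function name `find_orfs`, the `min_codons` parameter, and the cutoff values are all hypothetical, and a real annotation pipeline would also consider the reverse strand, non-AUG starts, and experimental evidence.

```python
def find_orfs(seq, min_codons=100):
    """Return (start, end, frame) for forward-strand ATG...stop ORFs.

    Hypothetical helper for illustration only. `min_codons` counts the
    codons from the start codon up to, but not including, the stop codon.
    Coordinates are 0-based, with `end` exclusive and covering the stop.
    """
    seq = seq.upper()
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        start = None
        # Walk codon by codon through this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in stops:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next start
    return orfs


# A five-codon ORF is dropped by a strict cutoff and kept by a lenient one.
toy = "ATG" + "GCT" * 4 + "TAA"
print(find_orfs(toy, min_codons=100))
print(find_orfs(toy, min_codons=5))
```

With a length cutoff tuned for conventional genes, the short ORF in the toy sequence is silently discarded, which is the kind of filtering that could leave a micropeptide-encoding transcript labeled as noncoding.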


2019 ◽  
Vol 36 (7) ◽  
pp. 2053-2059 ◽  
Author(s):  
Saket Choudhary ◽  
Wenzheng Li ◽  
Andrew D. Smith

Abstract Motivation: Ribo-seq, a technique for deep-sequencing ribosome-protected mRNA fragments, has enabled transcriptome-wide monitoring of translation in vivo. It has opened avenues for re-evaluating the coding potential of open reading frames (ORFs), including many short ORFs that were previously presumed to be non-translating. However, the detection of translating ORFs, specifically short ORFs, from Ribo-seq data, remains challenging due to its high heterogeneity and noise. Results: We present ribotricer, a method for detecting actively translating ORFs by directly leveraging the three-nucleotide periodicity of Ribo-seq data. Ribotricer demonstrates higher accuracy and robustness compared with other methods at detecting actively translating ORFs including short ORFs on multiple published datasets across species inclusive of Arabidopsis, Caenorhabditis elegans, Drosophila, human, mouse, rat, yeast and zebrafish. Availability and implementation: Ribotricer is available at https://github.com/smithlabcode/ribotricer. All analysis scripts and results are available at https://github.com/smithlabcode/ribotricer-results. Supplementary information: Supplementary data are available at Bioinformatics online.
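To make the idea of three-nucleotide periodicity concrete, here is a minimal sketch of a frame-occupancy check. It is not ribotricer's scoring method, which the abstract does not describe. The function names, the input format (read positions relative to the ORF start), and the 0.6 threshold are assumptions chosen only for illustration.

```python
from collections import Counter


def frame_fractions(positions):
    """Return the fraction of reads in frames 0, 1 and 2 of an ORF.

    `positions` is a hypothetical input: non-negative integer read
    positions (for example, inferred P-sites) counted from the first
    nucleotide of the ORF's start codon.
    """
    counts = Counter(pos % 3 for pos in positions)
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts.get(frame, 0) / total for frame in range(3))


def looks_periodic(positions, min_frame0_fraction=0.6):
    """Crude test: do most reads sit in the ORF's own frame (frame 0)?

    The threshold is an arbitrary illustrative value, not a published one.
    """
    return frame_fractions(positions)[0] >= min_frame0_fraction


# Reads concentrated on codon starts versus reads spread over all frames.
print(frame_fractions([0, 3, 6, 9, 1]))
print(looks_periodic([0, 1, 2, 3, 4, 5]))
```

Actively translated ORFs tend to show reads enriched in one frame, whereas background signal is spread roughly evenly across the three frames. Short ORFs contribute few reads, so a simple fraction like this becomes noisy for them, which is consistent with the difficulty the abstract describes.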


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Jonathan Bohlen ◽  
Liza Harbrecht ◽  
Saioa Blanco ◽  
Katharina Clemm von Hohenberg ◽  
Kai Fenzl ◽  
...  

Abstract Translation efficiency varies considerably between different mRNAs, thereby impacting protein expression. Translation of the stress response master-regulator ATF4 increases upon stress, but the molecular mechanisms are not well understood. We discover here that translation factors DENR, MCTS1 and eIF2D are required to induce ATF4 translation upon stress by promoting translation reinitiation in the ATF4 5′UTR. We find DENR and MCTS1 are only needed for reinitiation after upstream Open Reading Frames (uORFs) containing certain penultimate codons, perhaps because DENR•MCTS1 are needed to evict only certain tRNAs from post-termination 40S ribosomes. This provides a model for how DENR and MCTS1 promote translation reinitiation. Cancer cells, which are exposed to many stresses, require ATF4 for survival and proliferation. We find a strong correlation between DENR•MCTS1 expression and ATF4 activity across cancers. Furthermore, additional oncogenes including a-Raf, c-Raf and Cdk4 have long uORFs and are translated in a DENR•MCTS1 dependent manner.


2004 ◽  
Vol 78 (20) ◽  
pp. 11187-11197 ◽  
Author(s):  
Lisa M. Kattenhorn ◽  
Ryan Mills ◽  
Markus Wagner ◽  
Alexandre Lomsadze ◽  
Vsevolod Makeev ◽  
...  

ABSTRACT Proteins associated with the murine cytomegalovirus (MCMV) viral particle were identified by a combined approach of proteomic and genomic methods. Purified MCMV virions were dissociated by complete denaturation and subjected to either separation by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and in-gel digestion or treated directly by in-solution tryptic digestion. Peptides were separated by nanoflow liquid chromatography and analyzed by tandem mass spectrometry (LC-MS/MS). The MS/MS spectra obtained were searched against a database of MCMV open reading frames (ORFs) predicted to be protein coding by an MCMV-specific version of the gene prediction algorithm GeneMarkS. We identified 38 proteins from the capsid, tegument, glycoprotein, replication, and immunomodulatory protein families, as well as 20 genes of unknown function. Observed irregularities in coding potential suggested possible sequence errors in the 3′-proximal ends of m20 and M31. These errors were experimentally confirmed by sequencing analysis. The MS data further indicated the presence of peptides derived from the unannotated ORFs ORFc225441-226898 (m166.5) and ORF105932-106072. Immunoblot experiments confirmed expression of m166.5 during viral infection.


2006 ◽  
Vol 188 (5) ◽  
pp. 1999-2013 ◽  
Author(s):  
Muriel Gaillard ◽  
Tatiana Vallaeys ◽  
Frank Jörg Vorhölter ◽  
Marco Minoia ◽  
Christoph Werlen ◽  
...  

ABSTRACT Pseudomonas sp. strain B13 is a bacterium known to degrade chloroaromatic compounds. The ability to use 3- and 4-chlorocatechol is determined by a self-transferable DNA element, the clc element, which normally resides at two locations in the cell's chromosome. Here we report the complete nucleotide sequence of the clc element, demonstrating its unique catabolic properties while showing its relatedness to genomic islands and integrative and conjugative elements rather than to other known catabolic plasmids. With regard to catabolic functions, the clc element harbored, in addition to the genes for chlorocatechol degradation, a complete functional operon for 2-aminophenol degradation and genes for a putative aromatic compound transport protein and for a multicomponent aromatic ring dioxygenase similar to anthranilate hydroxylase. The genes for catabolic functions were inducible under various conditions, suggesting a network of catabolic pathway induction. For about half of the open reading frames (ORFs) on the clc element, no clear functional prediction could be given, although some indications were found for functions that were similar to plasmid conjugation. The region in which these ORFs were situated displayed a high overall conservation of nucleotide sequence and gene order to genomic regions in other recently completed bacterial genomes or to other genomic islands. Most notably, except for two discrete regions, the clc element was almost 100% identical over the whole length to a chromosomal region in Burkholderia xenovorans LB400. This indicates the dynamic evolution of this type of element and the continued transition between elements with a more pathogenic character and those with catabolic properties.

