Overexpression-based detection of translatable circular RNAs is vulnerable to coexistent linear RNA byproducts

2021 ◽  
Author(s):  
Yanyi Jiang ◽  
Xiaofan Chen ◽  
Wei Zhang

Abstract In the RNA field, the demarcation between coding and non-coding has been challenged by the recent discovery of occasionally translated circular RNAs (circRNAs). Although they lack a 5′ cap structure, circRNAs can be translated cap-independently. Complementary intron-mediated overexpression is one of the most widely used methodologies in circRNA research, but it has drawn persistent skepticism for its poorly defined mechanism and latent coexistent side products. In this study, leveraging such a circRNA overexpression system, we have interrogated the protein-coding potential of 30 human circRNAs containing infinite open reading frames in HEK293T cells. Surprisingly, pervasive translation signals are detected by immunoblotting. However, intensive mutagenesis reveals that numerous translation signals are generated independently of circRNA synthesis. We have developed a dual tag strategy to isolate translation noise and directly demonstrate that the fallacious translation signals originate from cryptically spliced linear transcripts. The concomitant linear RNA byproducts, presumably concatemers, can be translated to produce pseudo rolling circle translation signals, and can include the backsplicing junction (BSJ), which disqualifies BSJ-based evidence for circRNA translation. We also find that non-AUG start codons may engage in translation initiation of circRNAs. Taken together, our systematic evaluation sheds light on the heterogeneous translational outputs of circRNA overexpression vectors and comes with a caveat: the ectopic overexpression technique requires extremely rigorous controls in studies of circRNA translation and function.
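The term "infinite open reading frame" in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: the function names `circular_orf` and `find_infinite_orfs` are hypothetical, the input is assumed to be the circRNA sequence written as DNA or RNA letters, and only the three standard stop codons are considered. The idea is that reading codons around a circle returns to the same position and frame after at most three laps, so an ORF with no in-frame stop within that span never terminates.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def circular_orf(seq, start):
    """Read codons from `start` around a circular sequence.

    Returns (codons_read_before_stop, is_infinite). After len(seq) codons
    (three full laps) reading is back at the same position and frame, so
    no stop within that span means the ORF has no end.
    """
    n = len(seq)
    for i in range(n):
        pos = start + 3 * i
        codon = "".join(seq[(pos + k) % n] for k in range(3))
        if codon in STOP_CODONS:
            return i, False
    return n, True


def find_infinite_orfs(seq, start_codon="ATG"):
    """Return start positions whose circular ORF never meets a stop codon.

    `start_codon` is a parameter because the abstract reports that non-AUG
    start codons may also initiate translation (an illustrative choice).
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    hits = []
    for s in range(n):
        codon = "".join(seq[(s + k) % n] for k in range(3))
        if codon == start_codon and circular_orf(seq, s)[1]:
            hits.append(s)
    return hits
```

For a toy 8-nt circle such as `ATGGCTGA`, the reading frame shifts on each lap and a stop codon is only reached on the third lap, which is why the scan covers three laps rather than one.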

2020 ◽  
Vol 40 (6) ◽  
Author(s):  
Corrine Corrina R. Hartford ◽  
Ashish Lal

ABSTRACT Recent advancements in genetic and proteomic technologies have revealed that more of the genome encodes proteins than originally thought possible. Specifically, some putative long noncoding RNAs (lncRNAs) have been misannotated as noncoding. Numerous lncRNAs have been found to contain short open reading frames (sORFs) which have been overlooked because of their small size. Many of these sORFs encode small proteins or micropeptides with fundamental biological importance. These micropeptides can aid in diverse processes, including cell division, transcription regulation, and cell signaling. Here we discuss strategies for establishing the coding potential of putative lncRNAs and describe various functions of known micropeptides.


2004 ◽  
Vol 78 (20) ◽  
pp. 11187-11197 ◽  
Author(s):  
Lisa M. Kattenhorn ◽  
Ryan Mills ◽  
Markus Wagner ◽  
Alexandre Lomsadze ◽  
Vsevolod Makeev ◽  
...  

ABSTRACT Proteins associated with the murine cytomegalovirus (MCMV) viral particle were identified by a combined approach of proteomic and genomic methods. Purified MCMV virions were dissociated by complete denaturation and subjected to either separation by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and in-gel digestion or treated directly by in-solution tryptic digestion. Peptides were separated by nanoflow liquid chromatography and analyzed by tandem mass spectrometry (LC-MS/MS). The MS/MS spectra obtained were searched against a database of MCMV open reading frames (ORFs) predicted to be protein coding by an MCMV-specific version of the gene prediction algorithm GeneMarkS. We identified 38 proteins from the capsid, tegument, glycoprotein, replication, and immunomodulatory protein families, as well as 20 genes of unknown function. Observed irregularities in coding potential suggested possible sequence errors in the 3′-proximal ends of m20 and M31. These errors were experimentally confirmed by sequencing analysis. The MS data further indicated the presence of peptides derived from the unannotated ORFs ORFc225441-226898 (m166.5) and ORF105932-106072. Immunoblot experiments confirmed expression of m166.5 during viral infection.


2019 ◽  
Author(s):  
Yaara Finkel ◽  
Dominik Schmiedel ◽  
Julie Tai-Schmiedel ◽  
Aharon Nachshon ◽  
Michal Schwartz ◽  
...  

Abstract Human herpesvirus 6 (HHV-6) A and B are highly ubiquitous betaherpesviruses, infecting the majority of the human population. Like other herpesviruses, they encompass large genomes and our understanding of their protein coding potential is far from complete. Here we employ ribosome profiling and systematic transcript analysis to experimentally define the HHV-6 translation products and to follow their temporal expression. We identify hundreds of new open reading frames (ORFs), including many upstream ORFs (uORFs) and internal ORFs (iORFs), generating a complete unbiased atlas of HHV-6 proteome. Furthermore, by integrating systematic data from the prototypic betaherpesvirus, human cytomegalovirus, we uncover numerous uORFs and iORFs that are conserved across betaherpesviruses and we show that uORFs are specifically enriched in late viral genes. Using our transcriptome measurements, we identified three highly abundant HHV-6 encoded long non-coding RNAs (lncRNAs), one of which generates a non-polyadenylated stable intron that appears to be a conserved feature of betaherpesviruses. Overall, our work reveals the complexity of HHV-6 genomes and highlights novel features that are conserved between betaherpesviruses, providing a rich resource for future functional studies.


NAR Cancer ◽  
2020 ◽  
Vol 2 (3) ◽  
Author(s):  
Ghofran Othoum ◽  
Emily Coonrod ◽  
Sidi Zhao ◽  
Ha X Dang ◽  
Christopher A Maher

Abstract Recent studies show that annotated long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) encode stable, functional peptides that contribute to human development and disease. To systematically discover lncRNAs and circRNAs encoding peptides, we performed a comprehensive integrative analysis of mass spectrometry-based proteomic and transcriptomic sequencing data from >900 patients across nine cancer types. This enabled us to identify 19,871 novel peptides derived from 8,903 lncRNAs. Further, we exploited open reading frames overlapping the backspliced region of circRNAs to identify 3,238 peptides that are uniquely derived from 2,834 circRNAs and not their corresponding linear RNAs. Collectively, our pan-cancer proteogenomic analysis will serve as a resource for evaluating the coding potential of lncRNAs and circRNAs that could aid future mechanistic studies exploring their function in cancer.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Yaara Finkel ◽  
Dominik Schmiedel ◽  
Julie Tai-Schmiedel ◽  
Aharon Nachshon ◽  
Roni Winkler ◽  
...  

Human herpesvirus-6 (HHV-6) A and B are ubiquitous betaherpesviruses, infecting the majority of the human population. They encompass large genomes and our understanding of their protein coding potential is far from complete. Here, we employ ribosome-profiling and systematic transcript-analysis to experimentally define HHV-6 translation products. We identify hundreds of new open reading frames (ORFs), including upstream ORFs (uORFs) and internal ORFs (iORFs), generating a complete unbiased atlas of HHV-6 proteome. By integrating systematic data from the prototypic betaherpesvirus, human cytomegalovirus, we uncover numerous uORFs and iORFs conserved across betaherpesviruses and we show uORFs are enriched in late viral genes. We identified three highly abundant HHV-6 encoded long non-coding RNAs, one of which generates a non-polyadenylated stable intron appearing to be a conserved feature of betaherpesviruses. Overall, our work reveals the complexity of HHV-6 genomes and highlights novel features conserved between betaherpesviruses, providing a rich resource for future functional studies.


eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Jorge Ruiz-Orera ◽  
Xavier Messeguer ◽  
Juan Antonio Subirana ◽  
M Mar Alba

Deep transcriptome sequencing has revealed the existence of many transcripts that lack long or conserved open reading frames (ORFs) and which have been termed long non-coding RNAs (lncRNAs). The vast majority of lncRNAs are lineage-specific and do not yet have a known function. In this study, we test the hypothesis that they may act as a repository for the synthesis of new peptides. We find that a large fraction of the lncRNAs expressed in cells from six different species is associated with ribosomes. The patterns of ribosome protection are consistent with the translation of short peptides. lncRNAs show coding potential and sequence constraints similar to those of evolutionarily young protein coding sequences, indicating that they play an important role in de novo protein evolution.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Robin-Lee Troskie ◽  
Yohaann Jafrani ◽  
Tim R. Mercer ◽  
Adam D. Ewing ◽  
Geoffrey J. Faulkner ◽  
...  

Abstract Pseudogenes are gene copies presumed to mainly be functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes expressed in tissue-specific patterns. Some pseudogene transcripts have intact open reading frames and are translated in cultured cells, representing unannotated protein-coding genes. To assess the biological impact of noncoding pseudogenes, we CRISPR-Cas9 delete the nucleus-enriched pseudogene PDCL3P4 and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the human transcriptional landscape.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
David S. M. Lee ◽  
Joseph Park ◽  
Andrew Kromer ◽  
Aris Baras ◽  
Daniel J. Rader ◽  
...  

Abstract Ribosome-profiling has uncovered pervasive translation in non-canonical open reading frames; however, the biological significance of this phenomenon remains unclear. Using genetic variation from 71,702 human genomes, we assess patterns of selection in translated upstream open reading frames (uORFs) in 5′UTRs. We show that uORF variants introducing new stop codons, or strengthening existing stop codons, are under strong negative selection comparable to protein-coding missense variants. Using these variants, we map and validate gene-disease associations in two independent biobanks containing exome sequencing from 10,900 and 32,268 individuals, respectively, and elucidate their impact on protein expression in human cells. Our results suggest translation disrupting mechanisms relating uORF variation to reduced protein expression, and demonstrate that translation at uORFs is genetically constrained in 50% of human genes.


1987 ◽  
Vol 7 (12) ◽  
pp. 4266-4272 ◽  
Author(s):  
L W Stanton ◽  
J M Bishop

NMYC is a gene whose amplification and overexpression have been implicated in the generation of certain human malignancies. Little is known of how the expression of NMYC is normally controlled. We have therefore characterized transcription from the gene and the structure and stability of the resulting mRNAs. Transcription from NMYC is exceptionally complex: it initiates at numerous sites that may be grouped under the control of two promoters, and the multiplicity of initiation sites combines with alternative splicing to engender two forms of mRNA. The mRNAs have different 5' leader sequences (alternative first exons of the gene) but identical bodies (the second and third exons of the gene). Both forms of mRNA are unstable, with half-lives of ca. 15 min. Both encode the previously identified 65,000 and 67,000-dalton products of NMYC. However, the alternative first exons contain distinctive open reading frames that may diversify the coding potential of NMYC. The complexities in transcription of NMYC expand the means by which expression of the gene might be controlled.

