Identification of Proteins Associated with Murine Cytomegalovirus Virions

2004 ◽  
Vol 78 (20) ◽  
pp. 11187-11197 ◽  
Author(s):  
Lisa M. Kattenhorn ◽  
Ryan Mills ◽  
Markus Wagner ◽  
Alexandre Lomsadze ◽  
Vsevolod Makeev ◽  
...  

ABSTRACT Proteins associated with the murine cytomegalovirus (MCMV) viral particle were identified by a combined approach of proteomic and genomic methods. Purified MCMV virions were dissociated by complete denaturation and then either separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and digested in-gel or subjected directly to in-solution tryptic digestion. Peptides were separated by nanoflow liquid chromatography and analyzed by tandem mass spectrometry (LC-MS/MS). The MS/MS spectra obtained were searched against a database of MCMV open reading frames (ORFs) predicted to be protein coding by an MCMV-specific version of the gene prediction algorithm GeneMarkS. We identified 38 proteins from the capsid, tegument, glycoprotein, replication, and immunomodulatory protein families, as well as 20 genes of unknown function. Observed irregularities in coding potential suggested possible sequence errors in the 3′-proximal ends of m20 and M31. These errors were experimentally confirmed by sequencing analysis. The MS data further indicated the presence of peptides derived from the unannotated ORFs ORFc225441-226898 (m166.5) and ORF105932-106072. Immunoblot experiments confirmed expression of m166.5 during viral infection.
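
The database-search step in this abstract depends on knowing which peptides each predicted ORF could yield after tryptic digestion. The sketch below illustrates only that in silico digestion idea and is not the authors' pipeline. It assumes the common rule that trypsin cleaves after K or R unless the next residue is P. The function names, the `missed_cleavages` parameter, and the ORF names in the usage note are hypothetical.

```python
def tryptic_digest(protein, missed_cleavages=0, min_length=1):
    """Return peptides from an in silico tryptic digest of a protein string.

    Assumed rule: cleave after K or R, except when the next residue is P.
    """
    # Split the sequence into fully cleaved pieces.
    pieces = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])

    # Join adjacent pieces to model up to `missed_cleavages` skipped sites.
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


def build_peptide_index(orfs, missed_cleavages=0, min_length=1):
    """Map each peptide to the set of ORF names that could produce it.

    `orfs` is a dict of {orf_name: translated_sequence}.
    """
    index = {}
    for name, sequence in orfs.items():
        for peptide in tryptic_digest(sequence, missed_cleavages, min_length):
            index.setdefault(peptide, set()).add(name)
    return index
```

For example, `build_peptide_index({"orf_a": "MKRPAKG", "orf_b": "AAKG"})` would list `"G"` under both hypothetical ORFs, which shows why short peptides are usually too ambiguous to support an identification on their own.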

2021 ◽  
Author(s):  
Yanyi Jiang ◽  
Xiaofan Chen ◽  
Wei Zhang

Abstract In the RNA field, the demarcation between coding and non-coding has been negotiated by the recent discovery of occasionally translated circular RNAs (circRNAs). Although they lack a 5′ cap structure, circRNAs can be translated cap-independently. Complementary intron-mediated overexpression is one of the most widely used methodologies for circRNA research, but it has drawn persistent skepticism for its poorly defined mechanism and latent coexistent side products. In this study, leveraging such a circRNA overexpression system, we have interrogated the protein-coding potential of 30 human circRNAs containing infinite open reading frames in HEK293T cells. Surprisingly, pervasive translation signals are detected by immunoblotting. However, intensive mutagenesis reveals that numerous translation signals are generated independently of circRNA synthesis. We have developed a dual tag strategy to isolate translation noise and directly demonstrate that the fallacious translation signals originate from cryptically spliced linear transcripts. The concomitant linear RNA byproducts, presumably concatemers, can be translated to allow pseudo rolling circle translation signals, and can involve the backsplicing junction (BSJ), which disqualifies BSJ-based evidence for circRNA translation. We also find that non-AUG start codons may engage in the translation initiation of circRNAs. Taken together, our systematic evaluation sheds light on heterogeneous translational outputs from circRNA overexpression vectors and comes with a caveat that the ectopic overexpression technique necessitates extremely rigorous controls in circRNA translation and functional investigation.


2020 ◽  
Vol 40 (6) ◽  
Author(s):  
Corrine Corrina R. Hartford ◽  
Ashish Lal

ABSTRACT Recent advancements in genetic and proteomic technologies have revealed that more of the genome encodes proteins than originally thought possible. Specifically, some putative long noncoding RNAs (lncRNAs) have been misannotated as noncoding. Numerous lncRNAs have been found to contain short open reading frames (sORFs) which have been overlooked because of their small size. Many of these sORFs encode small proteins or micropeptides with fundamental biological importance. These micropeptides can aid in diverse processes, including cell division, transcription regulation, and cell signaling. Here we discuss strategies for establishing the coding potential of putative lncRNAs and describe various functions of known micropeptides.


2021 ◽  
Vol 9 (1) ◽  
pp. 129
Author(s):  
Katelyn McNair ◽  
Carol L. Ecale Zhou ◽  
Brian Souza ◽  
Stephanie Malfatti ◽  
Robert A. Edwards

One of the main steps in gene-finding in prokaryotes is determining which open reading frames encode a protein, and which occur by chance alone. There are many different methods to differentiate the two; the most prevalent approach is using shared homology with a database of known genes. This method presents many pitfalls, most notably the catch that you only find genes that you have seen before. The four most popular prokaryotic gene-prediction programs (GeneMark, Glimmer, Prodigal, PHANOTATE) all use a protein-coding training model to predict protein-coding genes, with the latter three allowing for the training model to be created ab initio from the input genome. Different methods are available for creating the training model, and to increase the accuracy of such tools, we present here GOODORFS, a method for identifying protein-coding genes within a set of all possible open reading frames (ORFs). Our workflow begins with taking the amino acid frequencies of each ORF, calculating an entropy density profile (EDP), using KMeans to cluster the EDPs, and then selecting the cluster with the lowest variation as the coding ORFs. To test the efficacy of our method, we ran GOODORFS on 14,179 annotated phage genomes, and compared our results to the initial training-set creation step of four other similar methods (Glimmer, MED2, PHANOTATE, Prodigal). We found that GOODORFS was the most accurate (0.94) and had the best F1-score (0.85), while Glimmer had the highest precision (0.92) and PHANOTATE had the highest recall (0.96).
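
The workflow in this abstract (amino acid frequencies, entropy density profile, clustering, lowest-variation cluster) can be sketched roughly as follows. This is a minimal illustration, not the GOODORFS implementation. It assumes the EDP of a sequence is the vector of terms -p_i log(p_i) divided by the total Shannon entropy, uses a small pure-Python k-means with deterministic farthest-first seeding in place of a library KMeans, and takes "variation" to mean the mean squared distance to the cluster centroid. All function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def entropy_density_profile(protein):
    """Return a 20-dimensional EDP for an amino acid sequence (assumed definition)."""
    counts = Counter(aa for aa in protein if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no standard amino acids")
    # Per-residue entropy terms -p*log(p); absent residues contribute 0.
    terms = []
    for aa in AMINO_ACIDS:
        p = counts[aa] / total
        terms.append(-p * math.log(p) if p > 0 else 0.0)
    entropy = sum(terms)
    if entropy == 0:
        # Only one residue type present: no entropy to distribute.
        return [0.0] * len(AMINO_ACIDS)
    return [t / entropy for t in terms]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(vectors, k, iterations=20):
    """Simple k-means; seeds are chosen farthest-first so runs are deterministic."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each non-empty cluster's centroid to its mean.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


def lowest_variation_cluster(vectors, labels, centroids):
    """Return the index of the non-empty cluster with the smallest mean squared spread."""
    best, best_variation = None, math.inf
    for c, centroid in enumerate(centroids):
        members = [v for v, lab in zip(vectors, labels) if lab == c]
        if not members:
            continue
        variation = sum(squared_distance(v, centroid) for v in members) / len(members)
        if variation < best_variation:
            best, best_variation = c, variation
    return best
```

In use, one would compute `entropy_density_profile` for every candidate ORF translation, pass the resulting vectors to `kmeans`, and keep the ORFs whose label equals the value returned by `lowest_variation_cluster`. Note that a single-member cluster has zero variation under this definition, so a real tool would need some guard against that case.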


2019 ◽  
Author(s):  
Yaara Finkel ◽  
Dominik Schmiedel ◽  
Julie Tai-Schmiedel ◽  
Aharon Nachshon ◽  
Michal Schwartz ◽  
...  

Abstract Human herpesvirus 6 (HHV-6) A and B are highly ubiquitous betaherpesviruses, infecting the majority of the human population. Like other herpesviruses, they encompass large genomes and our understanding of their protein coding potential is far from complete. Here we employ ribosome profiling and systematic transcript analysis to experimentally define the HHV-6 translation products and to follow their temporal expression. We identify hundreds of new open reading frames (ORFs), including many upstream ORFs (uORFs) and internal ORFs (iORFs), generating a complete unbiased atlas of HHV-6 proteome. Furthermore, by integrating systematic data from the prototypic betaherpesvirus, human cytomegalovirus, we uncover numerous uORFs and iORFs that are conserved across betaherpesviruses and we show that uORFs are specifically enriched in late viral genes. Using our transcriptome measurements, we identified three highly abundant HHV-6 encoded long non-coding RNAs (lncRNAs), one of which generates a non-polyadenylated stable intron that appears to be a conserved feature of betaherpesviruses. Overall, our work reveals the complexity of HHV-6 genomes and highlights novel features that are conserved between betaherpesviruses, providing a rich resource for future functional studies.


2006 ◽  
Vol 72 (11) ◽  
pp. 6980-6985 ◽  
Author(s):  
Shelley A. Haveman ◽  
Dawn E. Holmes ◽  
Yan-Huai R. Ding ◽  
Joy E. Ward ◽  
Raymond J. DiDonato ◽  
...  

ABSTRACT Previous studies failed to detect c-type cytochromes in Pelobacter species despite the fact that other close relatives in the Geobacteraceae, such as Geobacter and Desulfuromonas species, have abundant c-type cytochromes. Analysis of the recently completed genome sequence of Pelobacter carbinolicus revealed 14 open reading frames that could encode c-type cytochromes. Transcripts for all but one of these open reading frames were detected in acetoin-fermenting and/or Fe(III)-reducing cells. Three putative c-type cytochrome genes were expressed specifically during Fe(III) reduction, suggesting that the encoded proteins may participate in electron transfer to Fe(III). One of these proteins was a periplasmic triheme cytochrome with a high level of similarity to PpcA, which has a role in Fe(III) reduction in Geobacter sulfurreducens. Genes for heme biosynthesis and system II cytochrome c biogenesis were identified in the genome and shown to be expressed. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis gels of protein extracted from acetoin-fermenting P. carbinolicus cells contained three heme-staining bands which were confirmed by mass spectrometry to be among the 14 predicted c-type cytochromes. The number of cytochrome genes, the predicted amount of heme c per protein, and the ratio of heme-stained protein to total protein were much smaller in P. carbinolicus than in G. sulfurreducens. Furthermore, many of the c-type cytochromes that genetic studies have indicated are required for optimal Fe(III) reduction in G. sulfurreducens were not present in the P. carbinolicus genome. These results suggest that further evaluation of the functions of c-type cytochromes in the Geobacteraceae is warranted.


1998 ◽  
Vol 66 (6) ◽  
pp. 2743-2749 ◽  
Author(s):  
J. D. Hillman ◽  
Jan Novák ◽  
Edy Sagura ◽  
Juan A. Gutierrez ◽  
T. A. Brooks ◽  
...  

ABSTRACT Streptococcus mutans JH1000 and its derivatives were previously shown (J. D. Hillman, K. P. Johnson, and B. I. Yaphe, Infect. Immun. 44:141–144, 1984) to produce a low-molecular-weight, broad-spectrum bacteriocin-like inhibitory substance (BLIS). The thermosensitive vector pTV1-OK harboring Tn917 was used to isolate a BLIS-deficient mutant, DM25, and the mutated gene was recovered by shotgun cloning in Escherichia coli. Sequence analysis of insert DNA adjacent to Tn917 led to the identification of four open reading frames including two (lanA and lanB) which have substantial homology to the Staphylococcus epidermidis structural gene (epiA) and a modifying enzyme gene (epiB) for biosynthesis of the lantibiotic epidermin, respectively. Although the BLIS activity could not be recovered from broth cultures, high yields were obtained from a solid medium consisting of Todd-Hewitt broth containing 0.5% agarose that was stab inoculated with JH1140 (a spontaneous mutant of JH1000 that produces threefold-elevated amounts of activity). Agar could not substitute for agarose. Chloroform extraction of the spent medium produced a fraction which yielded two major bands on sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The faster-migrating band was absent in chloroform extracts of the mutant, DM25. The amino acid sequence of this band was determined by Edman sequencing and mass spectroscopy. The results showed that it is a lantibiotic, which we have named mutacin 1140, and that the sequence corresponded to that deduced from the lanA sequence. We observed a number of similarities of mutacin 1140 to epidermin and an S. mutans lantibiotic, B-Ny266, but it appears to have significant differences in the positions of its thioether bridges. It also has other unique features with regard to its leader sequence and posttranslational modification. A proposed structure for mutacin 1140 is presented.


2017 ◽  
Author(s):  
Ulrich Omasits ◽  
Adithi R. Varadarajan ◽  
Michael Schmid ◽  
Sandra Goetze ◽  
Damianos Melidis ◽  
...  

Abstract Accurate annotation of all protein-coding sequences (CDSs) is an essential prerequisite to fully exploit the rapidly growing repertoire of completely sequenced prokaryotic genomes. However, large discrepancies among the number of CDSs annotated by different resources, missed functional short open reading frames (sORFs), and overprediction of spurious ORFs represent serious limitations. Our strategy towards accurate and complete genome annotation consolidates CDSs from multiple reference annotation resources, ab initio gene prediction algorithms and in silico ORFs in an integrated proteogenomics database (iPtgxDB) that covers the entire protein-coding potential of a prokaryotic genome. By extending the PeptideClassifier concept of unambiguous peptides for prokaryotes, close to 95% of the identifiable peptides imply one distinct protein, largely simplifying downstream analysis. Searching a comprehensive Bartonella henselae proteomics dataset against such an iPtgxDB allowed us to unambiguously identify novel ORFs uniquely predicted by each resource, including lipoproteins, differentially expressed and membrane-localized proteins, novel start sites and wrongly annotated pseudogenes. Most novelties were confirmed by targeted, parallel reaction monitoring mass spectrometry, including unique ORFs and variants identified in a re-sequenced laboratory strain that are not present in its reference genome. We demonstrate the general applicability of our strategy for genomes with varying GC content and distinct taxonomic origin, and release iPtgxDBs for B. henselae, Bradyrhizobium diazoefficiens and Escherichia coli as well as the software to generate such proteogenomics search databases for any prokaryote.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Yaara Finkel ◽  
Dominik Schmiedel ◽  
Julie Tai-Schmiedel ◽  
Aharon Nachshon ◽  
Roni Winkler ◽  
...  

Human herpesvirus-6 (HHV-6) A and B are ubiquitous betaherpesviruses, infecting the majority of the human population. They encompass large genomes and our understanding of their protein coding potential is far from complete. Here, we employ ribosome-profiling and systematic transcript-analysis to experimentally define HHV-6 translation products. We identify hundreds of new open reading frames (ORFs), including upstream ORFs (uORFs) and internal ORFs (iORFs), generating a complete unbiased atlas of HHV-6 proteome. By integrating systematic data from the prototypic betaherpesvirus, human cytomegalovirus, we uncover numerous uORFs and iORFs conserved across betaherpesviruses and we show uORFs are enriched in late viral genes. We identified three highly abundant HHV-6 encoded long non-coding RNAs, one of which generates a non-polyadenylated stable intron appearing to be a conserved feature of betaherpesviruses. Overall, our work reveals the complexity of HHV-6 genomes and highlights novel features conserved between betaherpesviruses, providing a rich resource for future functional studies.


eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Jorge Ruiz-Orera ◽  
Xavier Messeguer ◽  
Juan Antonio Subirana ◽  
M Mar Alba

Deep transcriptome sequencing has revealed the existence of many transcripts that lack long or conserved open reading frames (ORFs) and which have been termed long non-coding RNAs (lncRNAs). The vast majority of lncRNAs are lineage-specific and do not yet have a known function. In this study, we test the hypothesis that they may act as a repository for the synthesis of new peptides. We find that a large fraction of the lncRNAs expressed in cells from six different species is associated with ribosomes. The patterns of ribosome protection are consistent with the translation of short peptides. lncRNAs show coding potential and sequence constraints similar to those of evolutionarily young protein-coding sequences, indicating that they play an important role in de novo protein evolution.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Robin-Lee Troskie ◽  
Yohaann Jafrani ◽  
Tim R. Mercer ◽  
Adam D. Ewing ◽  
Geoffrey J. Faulkner ◽  
...  

Abstract Pseudogenes are gene copies presumed to mainly be functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes expressed in tissue-specific patterns. Some pseudogene transcripts have intact open reading frames and are translated in cultured cells, representing unannotated protein-coding genes. To assess the biological impact of noncoding pseudogenes, we CRISPR-Cas9 delete the nucleus-enriched pseudogene PDCL3P4 and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the human transcriptional landscape.

