Assembly of Viral Metagenomes from Yellowstone Hot Springs

ABSTRACT Thermophilic viruses were reported decades ago; however, knowledge of their diversity, biology, and ecological impact is limited. Previous research on thermophilic viruses focused on cultivated strains. This study examined metagenomic profiles of viruses directly isolated from two mildly alkaline hot springs, Bear Paw (74°C) and Octopus (93°C). Using a new method for constructing libraries from picograms of DNA, nearly 30 Mb of viral DNA sequence was determined. In contrast to previous studies, sequences were assembled at 50% and 95% identity, creating composite contigs up to 35 kb and facilitating analysis of the inherent heterogeneity in the populations. Lowering the assembly identity reduced the estimated number of viral types from 1,440 and 1,310 to 548 and 283, respectively. Surprisingly, the diversity of viral species in these springs approaches that in moderate-temperature environments. While most known thermophilic viruses have a chronic, nonlytic infection lifestyle, analysis of coding sequences suggests lytic viruses are more common in geothermal environments than previously thought. The 50% assembly included one contig with high similarity and perfect synteny to nine genes from Pyrobaculum spherical virus (PSV). In fact, nearly all the genes of the 28-kb genome of PSV have apparent homologs in the metagenomes. Similarities to thermoacidophilic viruses isolated on other continents were limited to specific open reading frames but were equally strong. Nearly 25% of the reads showed significant similarity between the hot springs, suggesting a common subterranean source. To our knowledge, this is the first application of metagenomics to viruses of geothermal origin.

Adding to Yersinia enterocolitica Gene Pool Diversity: Two Cryptic Plasmids from a Biotype 1A Isolate

Journal of Biomedicine and Biotechnology ◽

10.1155/2009/398434 ◽

2009 ◽

Vol 2009 ◽

pp. 1-10 ◽

Cited By ~ 5

Author(s):

Daniela Lepka ◽

Tobias Kerrinnes ◽

Evelyn Skiebe ◽

Birgitt Hahn ◽

Angelika Fruth ◽

...

Keyword(s):

Hypothetical Protein ◽

Replication Initiation ◽

Amino Acid Sequences ◽

Open Reading Frames ◽

Eukaryotic Cells ◽

Base Pairs ◽

Significant Similarity ◽

Replication Initiation Protein ◽

Cryptic Plasmids ◽

Reading Frames

We report the nucleotide sequence of two novel cryptic plasmids (4357 and 14 662 base pairs) carried by a Yersinia enterocolitica biotype 1A strain isolated from pork. As distinguished from most biotype 1A strains, this isolate, designated 07-04449, exhibited adherence to eukaryotic cells. The smaller plasmid pYe4449-1 carries five attributable open reading frames (ORFs) encoding the first CcdA/CcdB-like antitoxin/toxin system described for a Yersinia plasmid, a RepA-like replication initiation protein, and mobilizing factors MobA and MobC. The deduced amino acid sequences showed highest similarity to proteins described in Salmonella (CcdA/B), Klebsiella (RepA), and Plesiomonas (MobA/C), indicating genomic fluidity among members of the Enterobacteriaceae. One additional ORF with unknown function, termed ORF5, was identified with an ancestry distinct from the rest of the plasmid. While the C+G content of ORF5 is 38.3%, the rest of pYe4449-1 shows a C+G content of 55.7%. The C+G content of the larger plasmid pYe4449-2 (54.9%) was similar to that of pYe4449-1 (53.7%) and differed from that of the Y. enterocolitica genome (47.3%). Of the 14 ORFs identified on pYe4449-2, only six ORFs showed significant similarity to database entries. For three of these ORFs likely functions could be ascribed: a TnpR-like resolvase and a phage replication protein, each localized on a low C+G island, and DNA primase TraC. Two ORFs of pYe4449-2, ORF3 and ORF7, seem to encode secretable proteins. Epitope-tagging of ORF3 revealed protein expression at 4°C but not at or above 27°C, suggesting adaptation to a habitat outside swine. The hypothetical protein encoded by ORF7 is a member of a novel repeat protein family sharing the DxxGN(x)nDxxGN motif. Our findings illustrate the exceptional gene pool diversity within the species Y. enterocolitica driven by horizontal gene transfer events.
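
The abstract above compares C+G percentages between an ORF, the plasmids, and the host genome, and mentions low C+G islands. As a rough illustration of how such figures can be computed, here is a minimal Python sketch. The function names, the window and step sizes, the 40% cutoff, and the toy sequence are all assumptions made for this example; none of them come from the paper, and the toy sequence is not plasmid data.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G+C percentage of each full-length window, as (start, percent) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: a GC-rich backbone carrying an AT-rich insert.
toy = "GCGCGGCCGC" * 3 + "ATTATAATTA" * 2 + "GGCCGCGCGC" * 3

# Slide a 20-base window in 10-base steps and flag windows below 40% G+C.
profile = windowed_gc(toy, window=20, step=10)
low_gc = [start for start, pct in profile if pct < 40.0]
```

On real sequence data the window would normally be hundreds of bases wide; a region whose windowed G+C falls well below the plasmid average is the kind of signal that could point to a separately acquired segment.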

The DNA sequence of a 7941 bp fragment of the left arm of chromosome VII of Saccharomyces cerevisiae contains four open reading frames including the multicopy suppressor gene of the pop2 mutation and a putative serine/threonine protein kinase gene

Yeast ◽

10.1002/yea.320110808 ◽

1995 ◽

Vol 11 (8) ◽

pp. 767-774 ◽

Cited By ~ 7

Author(s):

Maristella Coglievina ◽

Iris Bertani ◽

Raffaella Klima ◽

Paolo Zaccaria ◽

Carlo Vito Bruschi

Keyword(s):

Protein Kinase ◽

DNA Sequence ◽

Suppressor Gene ◽

Open Reading Frames ◽

Kinase Gene ◽

Multicopy Suppressor ◽

Protein Kinase Gene ◽

Reading Frames

Application of Serial Analysis of Gene Expression to the Study of the Gene Expression Profile of Leishmania infantum chagasi Promastigote

Journal of Biomedicine and Biotechnology ◽

10.1155/2012/673458 ◽

2012 ◽

Vol 2012 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Adelino Soares Lima Neto ◽

Osvaldo Pompílio de Melo Neto ◽

Carlos Henrique Nery Costa

Keyword(s):

Gene Expression ◽

Life Cycle ◽

Gene Expression Profile ◽

Expression Profile ◽

Leishmania Infantum ◽

Open Reading Frames ◽

Genomic Sequences ◽

Coding Sequences ◽

Key Genes ◽

Reading Frames

This study describes the application of the LongSAGE methodology to study the gene expression profile in promastigotes of Leishmania infantum chagasi. A tag library was created using the LongSAGE method and consisted of 14,208 tags of 17 bases. Of these, 8,427 (59.3%) were distinct. BLAST searches with the 1,645 most abundant tags showed that 12.8% of them identified the coding sequences of genes, while 82% (1,349/1,645) identified one or more genomic sequences that did not correspond to open reading frames. Only 5.2% (84/1,645) of the tags did not align to any position in the L. infantum genome. The UTR size of Leishmania and the lack of CATG sites in some transcripts were decisive for the generation of tags in these regions. Additional analysis will allow a better understanding of the expression profile and help identify the key genes in this life cycle.
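
The abstract above refers to 17-base tags and to transcripts that lack CATG sites. As a hedged illustration of why a missing CATG site matters, the Python sketch below follows the LongSAGE scheme as it is commonly described: the tag is taken as the 17 bases immediately downstream of the 3'-most CATG in a transcript. The function name, the constants, and the toy transcripts are assumptions made for this example and are not taken from the study.

```python
from collections import Counter
from typing import Optional

ANCHOR = "CATG"  # anchoring site assumed here, as in the usual SAGE description
TAG_LEN = 17     # tag length reported in the abstract


def longsage_tag(transcript: str) -> Optional[str]:
    """Return the 17 bases after the 3'-most CATG, or None if no full tag exists."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)
    if pos == -1:
        return None  # no CATG site, so this transcript yields no tag
    begin = pos + len(ANCHOR)
    tag = seq[begin:begin + TAG_LEN]
    # A site too close to the 3' end cannot supply a full-length tag.
    return tag if len(tag) == TAG_LEN else None


# Hypothetical toy transcripts, invented for illustration only.
with_site = "AAACATGTTTTCATG" + "ACGTACGTACGTACGTA" + "GGG"
without_site = "AAAAGGGGTTTTCCCC"

tags = [longsage_tag(t) for t in (with_site, with_site, without_site)]
tag_counts = Counter(t for t in tags if t is not None)
distinct_tags = len(tag_counts)
```

Counting tags in this way gives both the total and the number of distinct tags, the two quantities the abstract reports; a transcript without any CATG is simply invisible to the method.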

The first and fourth upstream open reading frames in GCN4 mRNA have similar initiation efficiencies but respond differently in translational control to change in length and sequence

Molecular and Cellular Biology ◽

10.1128/mcb.8.12.5439-5447.1988 ◽

1988 ◽

Vol 8 (12) ◽

pp. 5439-5447

Author(s):

P P Mueller ◽

B M Jackson ◽

P F Miller ◽

A G Hinnebusch

Keyword(s):

Translational Control ◽

Normal Growth ◽

Open Reading Frames ◽

Regulatory Function ◽

Coding Sequences ◽

lacZ Fusions ◽

Upstream Open Reading Frames ◽

Reading Frames ◽

Starvation Conditions

The third and fourth AUG codons in GCN4 mRNA efficiently repress translation of the GCN4-coding sequences under normal growth conditions. The first AUG codon is approximately 30-fold less inhibitory and is required under amino acid starvation conditions to override the repressing effects of AUG codons 3 and 4. lacZ fusions constructed to functional, elongated versions of the first and fourth upstream open reading frames (URFs) were used to show that AUG codons 1 and 4 function similarly as efficient translational start sites in vivo, raising the possibility that steps following initiation distinguish the regulatory properties of URFs 1 and 4. In accord with this idea, we observed different consequences of changing the length and termination site of URF1 versus changing those of URFs 3 and 4. The latter were lengthened considerably, with little or no effect on regulation. In fact, the function of URFs 3 and 4 was partially reconstituted with a completely heterologous URF. By contrast, certain mutations that lengthen URF1 impaired its positive regulatory function nearly as much as removing its AUG codon did. The same mutations also made URF1 a much more inhibitory element when it was present alone in the mRNA leader. These results strongly suggest that URFs 1 and 4 both function in regulation as translated coding sequences. To account for the phenotypes of the URF1 mutations, we suggest that most ribosomes normally translate URF1 and that the mutations reduce the number of ribosomes that are able to complete URF1 translation and resume scanning downstream. This effect would impair URF1 positive regulatory function if ribosomes must first translate URF1 in order to overcome the strong translational block at the 3'-proximal URFs. Because URF1-lacZ fusions were translated at the same rate under repressing and derepressing conditions, it appears that modulating initiation at URF1 is not the means that is used to restrict the regulatory consequences of URF1 translation to starvation conditions.

XV. Yeast sequencing reports. DNA sequence analysis of a 13 kbp fragment of the left arm of yeast chromosome XV containing seven new open reading frames

Yeast ◽

10.1002/yea.320111308 ◽

1995 ◽

Vol 11 (13) ◽

pp. 1281-1288 ◽

Cited By ~ 9

Author(s):

Antonio Casamayor ◽

Martí Aldea ◽

Celia Casas ◽

Enrique Herrero ◽

Francisco-Javier Gamo ◽

...

Keyword(s):

Sequence Analysis ◽

DNA Sequence ◽

DNA Sequence Analysis ◽

Open Reading Frames ◽

Yeast Chromosome ◽

Reading Frames

Sequence and Organization of pXO1, the Large Bacillus anthracis Plasmid Harboring the Anthrax Toxin Genes

Journal of Bacteriology ◽

10.1128/jb.181.20.6509-6515.1999 ◽

1999 ◽

Vol 181 (20) ◽

pp. 6509-6515 ◽

Cited By ~ 250

Author(s):

R. T. Okinaka ◽

K. Cloud ◽

O. Hampton ◽

A. R. Hoffmaster ◽

K. K. Hill ◽

...

Keyword(s):

Bacillus Anthracis ◽

Regulatory Elements ◽

Open Reading Frames ◽

Significant Similarity ◽

Toxin Genes ◽

Theta Replication ◽

Large Plasmids ◽

High Degree ◽

Degree Of Similarity ◽

Reading Frames

ABSTRACT The Bacillus anthracis Sterne plasmid pXO1 was sequenced by random, “shotgun” cloning. A circular sequence of 181,654 bp was generated. One hundred forty-three open reading frames (ORFs) were predicted using GeneMark and GeneMark.hmm, comprising only 61% (110,817 bp) of the pXO1 DNA sequence. The overall guanine-plus-cytosine content of the plasmid is 32.5%. The most recognizable feature of the plasmid is a “pathogenicity island,” defined by a 44.8-kb region that is bordered by inverted IS1627 elements at each end. This region contains the three toxin genes (cya, lef, and pagA), regulatory elements controlling the toxin genes, three germination response genes, and 19 additional ORFs. Nearly 70% of the ORFs on pXO1 do not have significant similarity to sequences available in open databases. Absent from the pXO1 sequence are homologs to genes that are typically required to drive theta replication and to maintain stability of large plasmids in Bacillus spp. Among the ORFs with a high degree of similarity to known sequences are a collection of putative transposases, resolvases, and integrases, suggesting an evolution involving lateral movement of DNA among species. Among the remaining ORFs, there are three sequences that may encode enzymes responsible for the synthesis of a polysaccharide capsule usually associated with serotype-specific virulent streptococci.

Evaluation of the Lytic Origins of Replication of Kaposi's Sarcoma-Associated Virus/Human Herpesvirus 8 in the Context of the Viral Genome

Journal of Virology ◽

10.1128/jvi.01004-06 ◽

2006 ◽

Vol 80 (19) ◽

pp. 9905-9909 ◽

Cited By ~ 20

Author(s):

Yiyang Xu ◽

Alicia Rodriguez-Huete ◽

Gregory S. Pari

Keyword(s):

Viral Genome ◽

Human Herpesvirus ◽

Artificial Chromosome ◽

Open Reading Frames ◽

Human Herpesvirus 8 ◽

Viral DNA ◽

Herpesvirus 8 ◽

Replication Assay ◽

Transient Replication Assay ◽

Reading Frames

ABSTRACT The lytic origins of DNA replication for human herpesvirus 8 (HHV8), oriLyt-L and oriLyt-R, are located between open reading frames K4.2 and K5 and ORF69 and vFLIP, respectively. These lytic origins were elucidated using a transient replication assay. Although this assay is a powerful tool for identifying many herpesvirus lytic origins, it is limited in its ability to evaluate the activity of replication origins in the context of the viral genome. To this end, we investigated the ability of a recombinant HHV8 bacterial artificial chromosome (BAC) to replicate in the absence of oriLyt-R, oriLyt-L, or both oriLyt regions. We generated the HHV8 BAC recombinants (BAC36-ΔOri-R, BAC36-ΔOri-L, and BAC36-ΔOri-RL), which removed one or all of the identified lytic origins. An evaluation of these recombinant BACs revealed that oriLyt-L was sufficient to propagate the viral genome, whereas oriLyt-R alone failed to direct the amplification of viral DNA.

Identification of novel Arabidopsis thaliana upstream open reading frames that control expression of the main coding sequences in a peptide sequence-dependent manner

Nucleic Acids Research ◽

10.1093/nar/gkv018 ◽

2015 ◽

Vol 43 (3) ◽

pp. 1562-1576 ◽

Cited By ~ 36

Author(s):

Isao Ebina ◽

Mariko Takemoto-Tsutsumi ◽

Shun Watanabe ◽

Hiroaki Koyama ◽

Yayoi Endo ◽

...

Keyword(s):

Arabidopsis Thaliana ◽

Open Reading Frames ◽

Peptide Sequence ◽

Dependent Manner ◽

Coding Sequences ◽

Sequence Dependent ◽

Upstream Open Reading Frames ◽

Reading Frames

SOME HINTS ON OPEN READING FRAME STATISTICS — HOW ORF LENGTH DEPENDS ON SELECTION

International Journal of Modern Physics C ◽

10.1142/s0129183199000474 ◽

1999 ◽

Vol 10 (04) ◽

pp. 635-643 ◽

Cited By ~ 4

Author(s):

AGNIESZKA GIERLIK ◽

PAWEŁ MACKIEWICZ ◽

MARIA KOWALCZUK ◽

STANISŁAW CEBRAT ◽

MIROSŁAW R. DUDEK

Keyword(s):

Genetic Code ◽

DNA Sequences ◽

Nucleotide Composition ◽

Open Reading Frames ◽

Open Reading Frame ◽

Antisense Strand ◽

Coding Sequences ◽

Reading Frame ◽

Intergenic Sequences ◽

Reading Frames

Coding sequences of DNA generate Open Reading Frames (ORFs) inside them with much higher frequency than random DNA sequences do, especially in the antisense strand. This is a specific feature of the genetic code. Since coding sequences are selected for their length, the generated ORFs are indirect results of this selection and their length is also influenced by selection. That is why ORFs found in any genome, even ones much longer than those spontaneously generated in random DNA sequences, should be considered as two different sets of ORFs: the first one coding for proteins, the second one generated by the coding ORFs. Even intergenic sequences possess greater capacity for generating ORFs than random DNA sequences of the same nucleotide composition, which suggests that intergenic sequences were generated from coding sequences by recombinational mechanisms.
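
The comparison described above, ORFs in a real sequence versus ORFs in random DNA of the same nucleotide composition, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: an ORF is defined here as an ATG followed in frame by the first stop codon, lengths are counted in codons, and the random control is a simple shuffle. All function names are hypothetical.

```python
import random

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orf_lengths(seq: str) -> list[int]:
    """Codon counts (start included, stop excluded) of ATG..stop ORFs in 3 frames."""
    seq = seq.upper()
    lengths = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens an ORF in this frame
            elif start is not None and codon in STOPS:
                lengths.append((i - start) // 3)
                start = None  # ORF closed; look for the next ATG
    return lengths


def six_frame_orf_lengths(seq: str) -> list[int]:
    """ORF lengths over the sense and antisense strands together."""
    return orf_lengths(seq) + orf_lengths(reverse_complement(seq))


def shuffled(seq: str, seed: int = 0) -> str:
    """Random sequence with exactly the same nucleotide composition as seq."""
    bases = list(seq)
    random.Random(seed).shuffle(bases)
    return "".join(bases)
```

Running `six_frame_orf_lengths` on a sequence and on several `shuffled` copies of it gives two length distributions that can be compared directly; the shuffle keeps base composition fixed, so any difference reflects sequence order rather than composition.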

Identification of Eukaryotic Open Reading Frames in Metagenomic cDNA Libraries Made from Environmental Samples

Applied and Environmental Microbiology ◽

10.1128/aem.72.1.135-143.2006 ◽

2006 ◽

Vol 72 (1) ◽

pp. 135-143 ◽

Cited By ~ 48

Author(s):

Susan Grant ◽

William D. Grant ◽

Don A. Cowan ◽

Brian E. Jones ◽

Yanhe Ma ◽

...

Keyword(s):

Activated Sludge ◽

Ribosomal Protein ◽

Environmental Samples ◽

Hot Springs ◽

Sewage Treatment Plant ◽

Open Reading Frames ◽

cDNA Libraries ◽

Microbial Composition ◽

Total RNA ◽

Reading Frames

ABSTRACT Here we describe the application of metagenomic technologies to construct cDNA libraries from RNA isolated from environmental samples. RNAlater (Ambion) was shown to stabilize RNA in environmental samples for periods of at least 3 months at −20°C. Protocols for library construction were established on total RNA extracted from Acanthamoeba polyphaga trophozoites. The methodology was then used on algal mats from geothermal hot springs in Tengchong county, Yunnan Province, People's Republic of China, and activated sludge from a sewage treatment plant in Leicestershire, United Kingdom. The Tengchong libraries were dominated by RNA from prokaryotes, reflecting the mainly prokaryote microbial composition. The majority of these clones resulted from rRNA; only a few appeared to be derived from mRNA. In contrast, many clones from the activated sludge library had significant similarity to eukaryote mRNA-encoded protein sequences. A library was also made using polyadenylated RNA isolated from total RNA from activated sludge; many more clones in this library were related to eukaryotic mRNA sequences and proteins. Open reading frames (ORFs) up to 378 amino acids in size could be identified. Some resembled known proteins over their full length, e.g., 36% match to cystatin, 49% match to ribosomal protein L32, 63% match to ribosomal protein S16, and 70% match to CPC2 protein. The methodology described here permits the polyadenylated transcriptome to be isolated from environmental samples with no knowledge of the identity of the microorganisms in the sample or the necessity to culture them. It has many uses, including the identification of novel eukaryotic ORFs encoding proteins and enzymes.
