Evolution and Horizontal Transfer of dUTPase-Encoding Genes in Viruses and Their Hosts

1999 ◽  
Vol 73 (9) ◽  
pp. 7710-7721 ◽  
Author(s):  
Angela M. Baldo ◽  
Marcella A. McClure

ABSTRACT dUTPase is a ubiquitous and essential enzyme responsible for regulating cellular levels of dUTP. The dut gene exists as single, tandemly duplicated, and tandemly triplicated copies. Crystallized single-copy dUTPases have been shown to assemble as homotrimers. dUTPase is encoded as an auxiliary gene in a number of virus genomes. The origin of viral dut genes has remained unresolved since their initial discovery. A comprehensive analysis of dUTPase amino acid sequence relationships was performed to explore the evolutionary dynamics of dut in viruses and their hosts. Our data set, comprised of 24 host and 51 viral sequences, includes representative sequences from available eukaryotes, archaea, eubacteria cells, and viruses, including herpesviruses. These amino acid sequences were aligned by using a hidden Markov model approach developed to align divergent data. Known secondary structures from single-copy crystals were mapped onto the aligned duplicate and triplicate sequences. We show how duplicated dUTPases might fold into a monomer, and we hypothesize that triplicated dUTPases also assemble as monomers. Phylogenetic analysis revealed at least five viral dUTPase sequence lineages in well-supported monophyletic clusters with eukaryotic, eubacterial, and archaeal hosts. We have identified all five as strong examples of horizontal transfer as well as additional potential transfer of dut genes among eubacteria, between eubacteria and viruses, and between retroviruses. The evidence for horizontal transfers is particularly interesting since eukaryotic dut genes have introns, while DNA virus dut genes do not. This implies that an intermediary retroid agent facilitated the horizontal transfer process between host mRNA and DNA viruses.

mSphere ◽  
2019 ◽  
Vol 4 (2) ◽  
Author(s):  
Marli Vlok ◽  
Andrew S. Lang ◽  
Curtis A. Suttle

ABSTRACT RNA viruses, particularly genetically diverse members of the Picornavirales, are widespread and abundant in the ocean. Gene surveys suggest that there are spatial and temporal patterns in the composition of RNA virus assemblages, but data on their diversity and genetic variability in different oceanographic settings are limited. Here, we show that specific RNA virus genomes have widespread geographic distributions and that the dominant genotypes are under purifying selection. Genomes from three previously unknown picorna-like viruses (BC-1, -2, and -3) assembled from a coastal site in British Columbia, Canada, as well as marine RNA viruses JP-A, JP-B, and Heterosigma akashiwo RNA virus exhibited different biogeographical patterns. Thus, biotic factors such as host specificity and viral life cycle, and not just abiotic processes such as dispersal, affect marine RNA virus distribution. Sequence differences relative to reference genomes imply that virus quasispecies are under purifying selection, with synonymous single-nucleotide variations dominating in genomes from geographically distinct regions resulting in conservation of amino acid sequences. Conversely, sequences from coastal South Africa that mapped to marine RNA virus JP-A exhibited more nonsynonymous mutations, probably representing amino acid changes that accumulated over a longer separation. This biogeographical analysis of marine RNA viruses demonstrates that purifying selection is occurring across oceanographic provinces. These data add to the spectrum of known marine RNA virus genomes, show the importance of dispersal and purifying selection for these viruses, and indicate that closely related RNA viruses are pathogens of eukaryotic microbes across oceans. IMPORTANCE Very little is known about aquatic RNA virus populations and genome evolution. This is the first study that analyzes marine environmental RNA viral assemblages in an evolutionary and broad geographical context.
This study contributes the largest marine RNA virus metagenomic data set to date, substantially increasing the sequencing space for RNA viruses and also providing a baseline for comparisons of marine RNA virus diversity. The new viruses discovered in this study are representative of the most abundant family of marine RNA viruses, the Marnaviridae, and expand our view of the diversity of this important group. Overall, our data and analyses provide a foundation for interpreting marine RNA virus diversity and evolution.


2008 ◽  
Vol 191 (1) ◽  
pp. 65-73 ◽  
Author(s):  
Pavel S. Novichkov ◽  
Yuri I. Wolf ◽  
Inna Dubchak ◽  
Eugene V. Koonin

ABSTRACT In order to explore microevolutionary trends in bacteria and archaea, we constructed a data set of 41 alignable tight genome clusters (ATGCs). We show that the ratio of the medians of nonsynonymous to synonymous substitution rates (dN/dS) that is used as a measure of the purifying selection pressure on protein sequences is a stable characteristic of the ATGCs. In agreement with previous findings, parasitic bacteria, notwithstanding the sometimes dramatic genome shrinkage caused by gene loss, are typically subjected to relatively weak purifying selection, presumably owing to relatively small effective population sizes and frequent bottlenecks. However, no evidence of genome streamlining caused by strong selective pressure was found in any of the ATGCs. On the contrary, a significant positive correlation between the genome size, as well as gene size, and selective pressure was observed, although a variety of free-living prokaryotes with very close selective pressures span nearly the entire range of genome sizes. In addition, we examined the connections between the sequence evolution rate and other genomic features. Although gene order changes much faster than protein sequences during the evolution of prokaryotes, a strong positive correlation was observed between the “rearrangement distance” and the amino acid distance, suggesting that at least some of the events leading to genome rearrangement are subjected to the same type of selective constraints as the evolution of amino acid sequences.
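The abstract above summarizes selection pressure in each cluster as the ratio of the median nonsynonymous rate to the median synonymous rate. A minimal sketch of that summary statistic follows, assuming per-gene dN and dS estimates are already available; the function name and the toy values are hypothetical and are not taken from the study.

```python
from statistics import median


def ratio_of_medians(dn_values, ds_values):
    """Return median(dN) / median(dS) for one genome cluster.

    dn_values, ds_values: per-gene nonsynonymous and synonymous
    substitution rate estimates (hypothetical inputs).
    """
    ds_median = median(ds_values)
    if ds_median == 0:
        raise ValueError("median dS is zero; the ratio is undefined")
    return median(dn_values) / ds_median


# Toy per-gene estimates for a single cluster (illustrative only).
dn = [0.01, 0.02, 0.03]
ds = [0.10, 0.20, 0.40]
ratio = ratio_of_medians(dn, ds)  # median dN 0.02 over median dS 0.20
```

A ratio well below 1 is conventionally read as purifying selection; taking medians before dividing keeps a few genes with near-zero dS from dominating the result.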


2008 ◽  
Vol 74 (17) ◽  
pp. 5524-5532 ◽  
Author(s):  
Dong Xu ◽  
Jean-Charles Côté

ABSTRACT In Bacillus thuringiensis, the hag gene encodes flagellin, the protein responsible for eliciting the immunological reaction in H serotyping. Specific flagellin amino acid sequences have been correlated to specific B. thuringiensis H serotypes, H1 to H67. Ten H serotypes, however, contain three or more antigenic subfactors, labeled a, b, c, d, or e, and have been subdivided into 23 serovars. In the present study, we set out to analyze the sequence diversity of flagellins among serovars from the same H serotypes. We studied the hag genes in 39 B. thuringiensis strains representing the 23 serovars from the 10 H serotypes mentioned above. A serovar and a biovar from an 11th H serotype were also included. The hag genes were amplified and cloned and their nucleotide sequences were determined and translated into amino acid sequences, or the sequences were retrieved directly from GenBank when available. Strains of the H3 serotype contained two or three copies of the fla gene, an ortholog of the hag gene. Strains of the H6 serotype contained three copies. Strains of all other H serotypes each contained a single copy of the hag gene. Alignments of amino acid sequences from all copies in all strains of the H3 serotype revealed short signature sequences, GGAG and SGG, GPDPDDAVKNLT, and DITTTK, that appeared to be specific to the H3c, H3d, and H3e antigenic subfactors, respectively. Similar short signature sequences, GDIT, AFIK, TSAGKA, and SAPSKG, were revealed for H8b, H8c, H20b, and H20c, respectively. Amino acid sequences in the flagellin central variable region were highly conserved among serovars of the H3, H5, H11, and H20 serotypes and much more divergent among serovars of the H4, H10, H18, H24, and H28 serotypes. Two bootstrapped neighbor-joining trees were respectively generated from the alignments of the amino acid sequences translated from all copies of the hag genes in the B. thuringiensis strains of the H3 and H6 serotypes. 
Sequence identities and relationships were revealed. A third bootstrapped neighbor-joining tree was generated, this one from the alignment of the flagellin amino acid sequences from all the B. thuringiensis strains in the study. Eight clusters, I to VIII, were revealed. Although most clusters contained strains and serovars from the same H serotype, clusters VII and VIII contained serovars from different H serotypes.


Genome ◽  
1992 ◽  
Vol 35 (2) ◽  
pp. 360-371 ◽  
Author(s):  
Hugh Tyson

Optimum alignment in all pairwise combinations among a group of amino acid sequences generated a distance matrix. These distances were clustered to evaluate relationships among the sequences. The degree of relationship among sequences was also evaluated by calculating specific distances from the distance matrix and examining correlations between patterns of specific distances for pairs of sequences. The sequences examined were a group of 20 amino acid sequences of scorpion toxins originally published and analyzed by M.J. Dufton and H. Rochat in 1984. Alignment gap penalties were constant for all 190 pairwise sequence alignments and were chosen after assessing the impact of changing penalties on resultant distances. The total distances generated by the 190 pairwise sequence alignments were clustered using complete (farthest neighbour) linkage. The square, symmetrical input distance matrix is analogous to diallel cross data where reciprocal and parental values are absent. Diallel analysis methods provided analogues for the distance matrix to genetical specific combining abilities, namely specific distances between all sequence pairs that are independent of the average distances shown by individual sequences. Correlation of specific distance patterns, with transformation to modified z values and a stringent probability level, were used to delineate subgroups of related sequences. These were compared with complete linkage clustering results. Excellent agreement between the two approaches was found. Three originally outlying sequences were placed within the four new subgroups. Key words: sequence alignment, specific distances, sequence relationships.
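Complete (farthest neighbour) linkage, as named in the abstract above, can be sketched in a few lines. This is an illustrative pure-Python version under stated assumptions, not the author's program: the function names, the stopping rule (a fixed number of clusters `k`), and the toy distances are all hypothetical. The pair count n(n-1)/2 gives the 190 alignments quoted for 20 sequences.

```python
from itertools import combinations


def n_pairs(n):
    """Number of unordered pairs among n sequences."""
    return n * (n - 1) // 2


def complete_linkage(dist, labels, k):
    """Merge clusters until k remain, using farthest-neighbour linkage.

    dist maps frozenset({a, b}) to the pairwise distance between a and b.
    """
    clusters = [frozenset([label]) for label in labels]

    def linkage(c1, c2):
        # Complete linkage: cluster distance is the largest member distance.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > k:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Toy distance matrix for four hypothetical sequences.
toy = {
    frozenset("AB"): 1.0, frozenset("CD"): 1.5,
    frozenset("AC"): 6.0, frozenset("AD"): 7.0,
    frozenset("BC"): 5.0, frozenset("BD"): 8.0,
}
groups = complete_linkage(toy, "ABCD", k=2)
```

With these toy values the two tight pairs merge first, so `groups` holds the clusters {A, B} and {C, D}.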


2004 ◽  
Vol 70 (9) ◽  
pp. 5357-5365 ◽  
Author(s):  
Kathleen M. Schleinitz ◽  
Sabine Kleinsteuber ◽  
Tatiana Vallaeys ◽  
Wolfgang Babel

ABSTRACT Two novel genes, rdpA and sdpA, encoding the enantiospecific α-ketoglutarate-dependent dioxygenases catalyzing R,S-dichlorprop cleavage in Delftia acidovorans MC1 were identified. Significant similarities to other known genes were not detected, but their deduced amino acid sequences were similar to those of other α-ketoglutarate dioxygenases. RdpA showed 35% identity with TauD of Pseudomonas aeruginosa, and SdpA showed 37% identity with TfdA of Ralstonia eutropha JMP134. The functionally important amino acid sequence motif HX(D/E)X₂₃₋₂₆(T/S)X₁₁₄₋₁₈₃HX₁₀₋₁₃R/K, which is highly conserved in group II α-ketoglutarate-dependent dioxygenases, was present in both dichlorprop-cleaving enzymes. Transposon mutagenesis of rdpA inactivated R-dichlorprop cleavage, indicating that it was a single-copy gene. Both rdpA and sdpA were located on the plasmid pMC1 that also carries the lower pathway genes. Sequencing of a 25.8-kb fragment showed that the dioxygenase genes were separated by a 13.6-kb region mainly comprising a Tn501-like transposon. Furthermore, two copies of a sequence similar to IS91-like elements were identified. Hybridization studies comparing the wild-type plasmid and that of the mutant unable to cleave dichlorprop showed that rdpA and sdpA were deleted, whereas the lower pathway genes were unaffected, and that deletion may be caused by genetic rearrangements of the IS91-like elements. Two other dichlorprop-degrading bacterial strains, Rhodoferax sp. strain P230 and Sphingobium herbicidovorans MH, were shown to carry rdpA genes of high similarity to rdpA from strain MC1, but sdpA was not detected. This suggested that rdpA gene products are involved in the degradation of R-dichlorprop in these strains.
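A spaced motif of the kind quoted in the abstract above translates naturally into a regular expression. The sketch below assumes that X stands for any residue, that the numbers are spacer-length ranges, and that the trailing R/K means "R or K"; the function name and the synthetic test sequence are hypothetical and do not come from RdpA or SdpA.

```python
import re

# H, any residue, D or E, 23-26 residues, T or S, 114-183 residues,
# H, 10-13 residues, then R or K (assumed reading of the motif).
MOTIF = re.compile(r"H.[DE].{23,26}[TS].{114,183}H.{10,13}[RK]")


def find_motif(protein):
    """Return (start, end) of the first motif match, or None."""
    match = MOTIF.search(protein.upper())
    return (match.start(), match.end()) if match else None


# Synthetic sequences built from alanine filler (illustrative only).
hit = "MA" + "HAD" + "A" * 24 + "T" + "A" * 120 + "H" + "A" * 11 + "R" + "AA"
miss = "MA" + "HAD" + "A" * 10 + "T" + "A" * 120 + "H" + "A" * 11 + "R" + "AA"
hit_span = find_motif(hit)
miss_span = find_motif(miss)
```

The first sequence satisfies every spacer range and matches; the second has a spacer of only 10 residues after the D, so it does not.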


Author(s):  
Zhiqiang Han ◽  
Chenyan Shou ◽  
Manhong Liu ◽  
Tianxiang Gao

Our understanding of phylogenetic relationships among Gadiformes fish is obtained through the analysis of a small number of genes, but uncertainty remains around critical nodes. A series of phylogenetic controversies exists at the suborder, family, subfamily, and species levels. A total of 1105 orthologous exon sequences and translated amino acid sequences from 36 genomes and 12 transcriptomes covering 33 species were applied to investigate the phylogenetic relationships within Gadiformes and address these problems. Phylogenetic trees reconstructed with the amino acid data set using different tree-building methods (RAxML and MrBayes) showed consistent topology. The monophyly of Gadiformes was confirmed in our study. However, the three suborders Muraenolepidoidei, Macrouroidei, and Gadoidei were not well recovered by our phylogenomic study, rejecting the validity of suborder Muraenolepidoidei. Four major lineages were revealed in this study. The family Bregmacerotidae forming clade I was the basal lineage of Gadiformes. The family Merlucciidae formed clade II. Clade III contained families Melanonidae, Muraenolepididae, Macrouridae (with subfamilies Trachyrincinae, Macrourinae, and Bathygadinae), and Moridae. Clade IV contained at least three families of suborder Gadoidei, i.e., Gadidae, Phycidae, and Ranicipitidae. The subspecies of Lota lota from Amur River were confirmed, indicating that exon markers were a valid high-resolution method for delimiting subspecies or distinct lineages within species level. The PSMC analysis of different populations of L. lota suggests a continuous decline since 2 Myr ago.


2020 ◽  
Vol 165 (10) ◽  
pp. 2291-2299
Author(s):  
Sébastien Calvignac-Spencer ◽  
Léonce Kouadio ◽  
Emmanuel Couacy-Hymann ◽  
Nafomon Sogoba ◽  
Kyle Rosenke ◽  
...  

Abstract The multimammate mouse (Mastomys natalensis; M. natalensis) serves as the main reservoir for the zoonotic arenavirus Lassa virus (LASV), and this has led to considerable investigation into the distribution of LASV and other related arenaviruses in this host species. In contrast to the situation with arenaviruses, the presence of other viruses in M. natalensis remains largely unexplored. In this study, herpesviruses and polyomaviruses were identified and partially characterized by PCR methods, sequencing, and phylogenetic analysis. In tissues sampled from M. natalensis populations in Côte d'Ivoire and Mali, six new DNA viruses (four betaherpesviruses, one gammaherpesvirus and one polyomavirus) were identified. Phylogenetic analysis based on glycoprotein B amino acid sequences showed that the herpesviruses clustered with cytomegaloviruses and rhadinoviruses of multiple rodent species. The complete circular genome of the newly identified polyomavirus was amplified by PCR. Amino acid sequence analysis of the large T antigen or VP1 showed that this virus clustered with a known polyomavirus from a house mouse (species Mus musculus polyomavirus 1). These two polyomaviruses form a clade with other rodent polyomaviruses, and the newly identified virus represents the third known polyomavirus of M. natalensis. This study represents the first identification of herpesviruses and the discovery of a novel polyomavirus in M. natalensis. In contrast to arenaviruses, we anticipate that these newly identified viruses represent a low zoonotic risk due to the normally highly restricted specificity of members of these two DNA virus families to their individual mammalian host species.


2001 ◽  
Vol 183 (2) ◽  
pp. 500-511 ◽  
Author(s):  
Tsutomu Sekizaki ◽  
Yoshiko Otani ◽  
Makoto Osaki ◽  
Daisuke Takamatsu ◽  
Yoshihiro Shimoji

ABSTRACT Different strains of Streptococcus suis serotypes 1 and 2 isolated from pigs either contained a restriction-modification (R-M) system or lacked it. The R-M system was an isoschizomer of Streptococcus pneumoniae DpnII, which recognizes nucleotide sequence 5′-GATC-3′. The nucleotide sequencing of the genes encoding the R-M system in S. suis DAT1, designated SsuDAT1I, showed that the SsuDAT1I gene region contained two methyltransferase genes, designated ssuMA and ssuMB, as does the DpnII system. The deduced amino acid sequences of M.SsuMA and M.SsuMB showed 70 and 90% identity to M.DpnII and M.DpnA, respectively. However, the SsuDAT1I system contained two isoschizomeric restriction endonuclease genes, designated ssuRA and ssuRB. The deduced amino acid sequence of R.SsuRA was 49% identical to that of R.DpnII, and R.SsuRB was 72% identical to R.LlaDCHI of Lactococcus lactis subsp. cremoris DCH-4. The four SsuDAT1I genes overlapped and were bounded by purine biosynthetic gene clusters in the following gene order: purF-purM-purN-purH-ssuMA-ssuMB-ssuRA-ssuRB-purD-purE. The G+C content of the SsuDAT1I gene region (34.1%) was lower than that of the pur region (48.9%), suggesting horizontal transfer of the SsuDAT1I system. No transposable element or long-repeat sequence was found in the flanking regions. The SsuDAT1I genes were functional by themselves, as they were individually expressed in Escherichia coli. Comparison of the sequences between strains with and without the R-M system showed that only the region from 53 bp upstream of ssuMA to 5 bp downstream of ssuRB was inserted in the intergenic sequence between purH and purD and that the insertion target site was not the recognition site of SsuDAT1I. No notable substitutions or insertions could be found, and the structures were conserved among all the strains. These results suggest that the SsuDAT1I system could have been integrated into the S. suis chromosome by an illegitimate recombination mechanism.
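The G+C comparison in the abstract above rests on a simple base count, and the recognition site 5′-GATC-3′ can be located by a plain string scan. The following is a minimal sketch with hypothetical function names and a made-up toy sequence; it does not reproduce the 34.1% or 48.9% figures, which require the actual gene regions.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_sites(seq, site="GATC"):
    """Return 0-based start positions of every occurrence of site."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq.startswith(site, i)]


# Toy sequence (illustrative only).
toy = "AAGATCTTGATC"
toy_gc = gc_content(toy)
toy_sites = find_sites(toy)
```

A region whose G+C percentage differs markedly from that of its flanking genes, as reported above, is a common first hint of horizontally acquired DNA.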


2000 ◽  
Vol 13 (4) ◽  
pp. 359-365 ◽  
Author(s):  
F. I. García-Maceira ◽  
Antonio Di Pietro ◽  
M. Isabel G. Roncero

Fusarium oxysporum f. sp. lycopersici, the causal agent of tomato vascular wilt, produces an array of pectinolytic enzymes, including at least two exo-α1,4-polygalacturonases (exoPGs). A gene encoding an exoPG, pgx4, was isolated with degenerate polymerase chain reaction primers derived from amino acid sequences conserved in two fungal exoPGs. pgx4 encodes a 454 amino acid polypeptide with nine potential N-glycosylation sites and a putative 21 amino acid N-terminal signal peptide. The deduced mature protein has a calculated molecular mass of 47.9 kDa, a pI of 8.0, and 51 and 49% identity with the exoPGs of Cochliobolus carbonum and Aspergillus tubingensis, respectively. The gene is present in a single copy in different formae speciales of F. oxysporum. Expression of pgx4 was detected during in vitro growth on pectin, polygalacturonic acid, and tomato vascular tissue and in roots and stems of tomato plants infected by F. oxysporum f. sp. lycopersici. Two mutants of F. oxysporum f. sp. lycopersici with a copy of pgx4 inactivated by gene replacement were as virulent on tomato plants as the wild-type strain.


2016 ◽  
Author(s):  
Sankar Basu ◽  
Fredrik Söderquist ◽  
Björn Wallner

Abstract The focus of the computational structural biology community has taken a dramatic shift over the past one-and-a-half decades from the classical protein structure prediction problem to the possible understanding of intrinsically disordered proteins (IDP) or proteins containing regions of disorder (IDPR). The current interest lies in the unraveling of a disorder-to-order transitioning code embedded in the amino acid sequences of IDPs / IDPRs. Disordered proteins are characterized by an enormous amount of structural plasticity which makes them promiscuous in binding to different partners, multi-functional in cellular activity and atypical in folding energy landscapes resembling partially folded molten globules. Also, their involvement in several deadly human diseases (e.g. cancer, cardiovascular and neurodegenerative diseases) makes them attractive drug targets, and important for a biochemical understanding of the disease(s). The study of the structural ensemble of IDPs is rather difficult, in particular for transient interactions. When bound to a structured partner, an IDPR adopts an ordered conformation in the complex. The residues that undergo this disorder-to-order transition are called protean residues, generally found in short contiguous stretches, and the first step in understanding the modus operandi of an IDP / IDPR would be to predict these residues. There are a few available methods which predict these protean segments from their amino acid sequences; however, their performance reported in the literature leaves clear room for improvement. With this background, the current study presents 'Proteus', a random forest classifier that predicts the likelihood of a residue undergoing a disorder-to-order transition upon binding to a potential partner protein. The prediction is based on features that can be calculated using the amino acid sequence alone.
Proteus compares favorably with existing methods, predicting twice as many true positives as the second best method (55% vs. 27%) with a much higher precision on an independent data set. The current study also sheds some light on a possible 'disorder-to-order' transitioning consensus, untangled, yet embedded in the amino acid sequence of IDPs. Some guidelines have also been suggested for proceeding with a real-life structural modeling involving an IDPR using Proteus. Software Availability: https://github.com/bjornwallner/proteus

