Machine-learning classification suggests that many alphaproteobacterial prophages may instead be gene transfer agents

2019 ◽  
Author(s):  
Roman Kogay ◽  
Taylor B. Neely ◽  
Daniel P. Birnbaum ◽  
Camille R. Hankel ◽  
Migun Shakya ◽  
...  

Abstract Many of the sequenced bacterial and archaeal genomes encode regions of viral provenance. Yet, not all of these regions encode bona fide viruses. Gene transfer agents (GTAs) are thought to be former viruses that are now maintained in genomes of some bacteria and archaea and are hypothesized to enable exchange of DNA within bacterial populations. In Alphaproteobacteria, genes homologous to the ‘head-tail’ gene cluster that encodes structural components of the Rhodobacter capsulatus GTA (RcGTA) are found in many taxa, even if they are only distantly related to Rhodobacter capsulatus. Yet, in most genomes available in GenBank RcGTA-like genes have annotations of typical viral proteins, and therefore are not easily distinguished from their viral homologs without additional analyses. Here, we report a ‘support vector machine’ classifier that quickly and accurately distinguishes RcGTA-like genes from their viral homologs by capturing the differences in the amino acid composition of the encoded proteins. Our open-source classifier is implemented in Python and can be used to scan homologs of the RcGTA genes in newly sequenced genomes. The classifier can also be trained to identify other types of GTAs, or even to detect other elements of viral ancestry. Using the classifier trained on a manually curated set of homologous viruses and GTAs, we detected RcGTA-like ‘head-tail’ gene clusters in 57.5% of the 1,423 examined alphaproteobacterial genomes. We also demonstrated that more than half of the in silico prophage predictions are instead likely to be GTAs, suggesting that in many alphaproteobacterial genomes the RcGTA-like elements remain unrecognized. Data deposition: Sequence alignments and phylogenetic trees are available in a FigShare repository at DOI 10.6084/m9.figshare.8796419. The Python source code of the described classifier and additional scripts used in the analyses are available via a GitHub repository at https://github.com/ecg-lab/GTA-Hunter-v1
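The abstract above describes the classifier's input only as differences in amino acid composition. A minimal sketch of such a feature vector follows, assuming plain relative frequencies of the 20 standard residues. The function name and the choice to ignore non-standard characters are illustrative assumptions, not taken from GTA-Hunter.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every protein
# maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(protein: str) -> list[float]:
    """Return the relative frequency of each standard amino acid.

    Characters outside the 20 standard residues (e.g. 'X', '*') are
    ignored; this is an illustrative choice, not the published method.
    """
    counts = Counter(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No usable residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Toy peptide, for illustration only.
    vec = aa_composition("MKKLA")
    print(dict(zip(AMINO_ACIDS, vec)))
```

Vectors of this kind could be passed to any support vector machine implementation; the training step is not shown here.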

2019 ◽  
Vol 11 (10) ◽  
pp. 2941-2953 ◽  
Author(s):  
Roman Kogay ◽  
Taylor B Neely ◽  
Daniel P Birnbaum ◽  
Camille R Hankel ◽  
Migun Shakya ◽  
...  

Abstract Many of the sequenced bacterial and archaeal genomes encode regions of viral provenance. Yet, not all of these regions encode bona fide viruses. Gene transfer agents (GTAs) are thought to be former viruses that are now maintained in genomes of some bacteria and archaea and are hypothesized to enable exchange of DNA within bacterial populations. In Alphaproteobacteria, genes homologous to the “head–tail” gene cluster that encodes structural components of the Rhodobacter capsulatus GTA (RcGTA) are found in many taxa, even if they are only distantly related to Rhodobacter capsulatus. Yet, in most genomes available in GenBank RcGTA-like genes have annotations of typical viral proteins, and therefore are not easily distinguished from their viral homologs without additional analyses. Here, we report a “support vector machine” classifier that quickly and accurately distinguishes RcGTA-like genes from their viral homologs by capturing the differences in the amino acid composition of the encoded proteins. Our open-source classifier is implemented in Python and can be used to scan homologs of the RcGTA genes in newly sequenced genomes. The classifier can also be trained to identify other types of GTAs, or even to detect other elements of viral ancestry. Using the classifier trained on a manually curated set of homologous viruses and GTAs, we detected RcGTA-like “head–tail” gene clusters in 57.5% of the 1,423 examined alphaproteobacterial genomes. We also demonstrated that more than half of the in silico prophage predictions are instead likely to be GTAs, suggesting that in many alphaproteobacterial genomes the RcGTA-like elements remain unrecognized.


Author(s):  
Emma Esterman ◽  
Yuri I. Wolf ◽  
Roman Kogay ◽  
Eugene V. Koonin ◽  
Olga Zhaxybayeva

Abstract Gene transfer agents (GTAs) are virus-like particles encoded and produced by many bacteria and archaea. Unlike viruses, GTAs package fragments of the host genome instead of the genes that encode the components of the GTA itself. As a result of this non-specific DNA packaging, GTAs can transfer genes within bacterial and archaeal communities. GTAs clearly evolved from viruses and are thought to have been maintained in prokaryotic genomes due to the advantages associated with their DNA transfer capacity. The most-studied GTA is produced by the alphaproteobacterium Rhodobacter capsulatus (RcGTA), which packages random portions of the host genome at a lower DNA density than usually observed in tailed bacterial viruses. How the DNA packaging properties of RcGTA evolved from those of the ancestral virus remains unknown. To address this question, we reconstructed the evolutionary history of the large subunit of the terminase (TerL), a highly conserved enzyme used by viruses and GTAs to package DNA. We found that RcGTA-like TerLs grouped within viruses that employ the headful packaging strategy. Because distinct mechanisms of viral DNA packaging correspond to differences in the TerL amino acid sequence, our finding suggests that RcGTA evolved from a headful packaging virus. Headful packaging is the least sequence-specific mode of DNA packaging, which would facilitate the switch from packaging of the viral genome to packaging random pieces of the host genome during GTA evolution.


2019 ◽  
Vol 202 (2) ◽  
Author(s):  
Purvikalyan Pallegar ◽  
Lourdes Peña-Castillo ◽  
Evan Langille ◽  
Mark Gomelsky ◽  
Andrew S. Lang

ABSTRACT Gene transfer agents (GTAs) are bacteriophage-like particles produced by several bacterial and archaeal lineages that contain small pieces of the producing cells’ genomes that can be transferred to other cells in a process similar to transduction. One well-studied GTA is RcGTA, produced by the alphaproteobacterium Rhodobacter capsulatus. RcGTA gene expression is regulated by several cellular regulatory systems, including the CckA-ChpT-CtrA phosphorelay. The transcription of multiple other regulator-encoding genes is affected by the response regulator CtrA, including genes encoding putative enzymes involved in the synthesis and hydrolysis of the second messenger bis-(3′-5′)-cyclic dimeric GMP (c-di-GMP). To investigate whether c-di-GMP signaling plays a role in RcGTA production, we disrupted the CtrA-affected genes potentially involved in this process. We found that disruption of four of these genes affected RcGTA gene expression and production. We performed site-directed mutagenesis of key catalytic residues in the GGDEF and EAL domains responsible for diguanylate cyclase (DGC) and c-di-GMP phosphodiesterase (PDE) activities and analyzed the functions of the wild-type and mutant proteins. We also measured RcGTA production in R. capsulatus strains where intracellular levels of c-di-GMP were altered by the expression of either a heterologous DGC or a heterologous PDE. This adds c-di-GMP signaling to the collection of cellular regulatory systems controlling gene transfer in this bacterium. Furthermore, the heterologous gene expression and the four gene disruptions had similar effects on R. capsulatus flagellar motility as found for gene transfer, and we conclude that c-di-GMP inhibits both RcGTA production and flagellar motility in R. capsulatus.
IMPORTANCE Gene transfer agents (GTAs) are virus-like particles that move cellular DNA between cells. In the alphaproteobacterium Rhodobacter capsulatus, GTA production is affected by the activities of multiple cellular regulatory systems, to which we have now added signaling via the second messenger dinucleotide molecule bis-(3′-5′)-cyclic dimeric GMP (c-di-GMP). Similar to the CtrA phosphorelay, c-di-GMP also affects R. capsulatus flagellar motility in addition to GTA production, with lower levels of intracellular c-di-GMP favoring increased flagellar motility and gene transfer. These findings further illustrate the interconnection of GTA production with global systems of regulation in R. capsulatus, providing additional support for the notion that the production of GTAs has been maintained in this and related bacteria because it provides a benefit to the producing organisms.


mBio ◽  
2020 ◽  
Vol 11 (4) ◽  
Author(s):  
Roman Kogay ◽  
Yuri I. Wolf ◽  
Eugene V. Koonin ◽  
Olga Zhaxybayeva

ABSTRACT Gene transfer agents (GTAs) are virus-like elements integrated into bacterial genomes, particularly, those of Alphaproteobacteria. The GTAs can be induced under conditions of nutritional stress, incorporate random fragments of bacterial DNA into miniphage particles, lyse the host cells, and infect neighboring bacteria, thus enhancing horizontal gene transfer. We show that GTA genes evolve under conditions of pronounced positive selection for the reduction of the energy cost of protein production as shown by comparison of the amino acid compositions with those of both homologous viral genes and host genes. The energy saving in GTA genes is comparable to or even more pronounced than that in the genes encoding the most abundant, essential bacterial proteins. In cases in which viruses acquire genes from GTAs, the bias in amino acid composition disappears in the course of evolution, showing that reduction of the energy cost of protein production is an important factor of evolution of GTAs but not bacterial viruses. These findings strongly suggest that GTAs represent bacterial adaptations rather than selfish, virus-like elements. Because GTA production kills the host cell and does not propagate the GTA genome, it appears likely that the GTAs are retained in the course of evolution via kin or group selection. Therefore, we hypothesize that GTAs facilitate the survival of bacterial populations under energy-limiting conditions through the spread of metabolic and transport capabilities via horizontal gene transfer and increases in nutrient availability resulting from the altruistic suicide of GTA-producing cells.
IMPORTANCE Kin selection and group selection remain controversial topics in evolutionary biology. We argue that these types of selection are likely to operate in bacterial populations by showing that bacterial gene transfer agents (GTAs), but not related viruses, evolve under conditions of positive selection for the reduction of the energy cost of GTA particle production. We hypothesize that GTAs are dedicated devices mediating the survival of bacteria under conditions of nutrient limitation. The benefits conferred by GTAs under nutritional stress conditions appear to include horizontal dissemination of genes that could provide bacteria with enhanced capabilities for nutrient utilization and increases of nutrient availability occurring through the lysis of GTA-producing bacteria.


2021 ◽  
Vol 9 (6) ◽  
pp. 1115
Author(s):  
Kathryn Forcone ◽  
Felipe H. Coutinho ◽  
Giselle S. Cavalcanti ◽  
Cynthia B. Silveira

Roseobacters are globally abundant bacteria with critical roles in carbon and sulfur biogeochemical cycling. Here, we identified 173 new putative prophages in 79 genomes of Rhodobacteraceae. These prophages represented 1.3 ± 0.15% of the bacterial genomes and had no to low homology with reference and metagenome-assembled viral genomes from aquatic and terrestrial ecosystems. Among the newly identified putative prophages, 35% encoded auxiliary metabolic genes (AMGs), mostly involved in secondary metabolism, amino acid metabolism, and cofactor and vitamin production. The analysis of integration sites and gene homology showed that 22 of the putative prophages were actually gene transfer agents (GTAs) similar to a GTA of Rhodobacter capsulatus. Twenty-three percent of the predicted prophages were observed in the TARA Oceans viromes generated from free viral particles, suggesting that they represent active prophages capable of induction. The distribution of these prophages was significantly associated with latitude and temperature. The prophages most abundant at high latitudes encoded acpP, an auxiliary metabolic gene involved in lipid synthesis and membrane fluidity at low temperatures. Our results show that prophages and gene transfer agents are significant sources of genomic diversity in roseobacter, with potential roles in the ecology of this globally distributed bacterial group.


2019 ◽  
Author(s):  
Mustafa O. Jibrin ◽  
Gerald V. Minsavage ◽  
Erica M. Goss ◽  
Pamela D. Roberts ◽  
Jeffrey B. Jones

Abstract Background Gene transfer agents (GTAs) are phage-like mediators of gene transfer in bacterial species. Typically, strains of a bacterial species that have GTAs show more recombination than strains without GTAs. GTA-mediated gene transfer activity has been shown for few bacteria, with that of Rhodobacter capsulatus being the prototypical GTA. GTAs have not been previously studied in plant pathogenic bacteria. A recent study inferring recombination in strains of the bacterial spot xanthomonads identified a Nigerian lineage which showed an unusual recombination background. We initially set out to understand genomic drivers of recombination in this genome by focusing on mobile genetic elements. Results We identified a unique cluster which was present in the Nigerian strain but absent in other sequenced strains of bacterial spot xanthomonads. The protein sequence of a gene within this cluster contained the GTA_TIM domain that is present in bacteria with GTA. We identified GTA clusters in other Xanthomonas species as well as species of Agrobacterium and Pantoea. Recombination analyses showed that generally, strains of Xanthomonas with GTA have more inferred recombination events than strains without GTA, which could lead to genome divergence. Conclusion This study identified GTA clusters in species of the plant pathogen genera Xanthomonas, Agrobacterium and Pantoea which we have named XpGTA, AgGTA and PaGTA respectively. Our recombination analyses suggest that Xanthomonas strains with GTA generally have more inferred recombination events than strains without GTA. The study is important in understanding the drivers of evolution of bacterial plant pathogens.


2017 ◽  
Author(s):  
Migun Shakya ◽  
Shannon M. Soucy ◽  
Olga Zhaxybayeva

Abstract Several bacterial and archaeal lineages produce nanostructures that morphologically resemble small tailed viruses, but, unlike most viruses, contain apparently random pieces of the host genome. Since these elements can deliver the packaged DNA to other cells, they were dubbed Gene Transfer Agents (GTAs). Because many genes involved in GTA production have viral homologs, it has been hypothesized that the GTA ancestor was a virus. Whether GTAs represent an atypical virus, a defective virus, or a virus co-opted by the prokaryotes for some function, remains to be elucidated. To evaluate these possibilities, we examined the distribution and evolutionary histories of genes that encode a GTA in the α-proteobacterium Rhodobacter capsulatus (RcGTA). We report that although homologs of many individual RcGTA genes are abundant across bacteria and their viruses, RcGTA-like genomes are mainly found in one subclade of α-proteobacteria. When compared to the viral homologs, genes of the RcGTA-like genomes evolve significantly slower, and do not have higher %A+T nucleotides than their host chromosomes. Moreover, they appear to reside in stable regions of the bacterial chromosomes that are generally conserved across taxonomic orders. These findings argue against RcGTA being an atypical or a defective virus. Our phylogenetic analyses suggest that the RcGTA ancestor likely originated in the lineage that gave rise to contemporary α-proteobacterial orders Rhizobiales, Rhodobacterales, Caulobacterales, Parvularculales, and Sphingomonadales, and since that time the RcGTA-like element has co-evolved with its host chromosomes. Such evolutionary history is compatible with maintenance of these elements by bacteria due to some selective advantage. As for many other prokaryotic traits, horizontal gene transfer played a substantial role in the evolution of RcGTA-like elements, not only in shaping its genome components within the orders, but also in occasional dissemination of RcGTA-like regions across the orders and even to different bacterial phyla.
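One comparison mentioned in this abstract is the %A+T content of RcGTA-like regions versus their host chromosomes. A minimal sketch of that calculation follows, assuming sequences are plain strings and that ambiguous bases such as N are excluded from the denominator. Both assumptions are ours rather than the authors', and the sequences are toy data.

```python
def at_percent(seq: str) -> float:
    """Percentage of A and T among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * at / total


def at_difference(region: str, chromosome: str) -> float:
    """%A+T of a region minus %A+T of its host chromosome.

    A positive value means the region is more A+T-rich than the host.
    """
    return at_percent(region) - at_percent(chromosome)


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(at_difference("ATGCGC", "ATATGCGC"))
```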


2020 ◽  
Vol 2 (7A) ◽  
Author(s):  
David Sherlock ◽  
Millie Benn ◽  
Paul Fogg

Gene transfer agents (GTAs) are small viruses that package and transfer random pieces of the producing cell’s genome but are unable to transfer all the genes required for their own production. GTAs are able to spread any DNA in the host cell and so their potential impact upon bacterial evolution and antimicrobial resistance is immense. Our discovery that the product of gene rcc01865 is a specific GTA activation factor (GafA) for the model Rhodobacter capsulatus GTA (RcGTA) and that GafA is essential for RcGTA production, has provided the link between GTA production and host regulatory pathways. However, while GafA has significantly improved our understanding of GTA regulation the complete mechanism is unclear. Our goal was to investigate the GafA mechanism of action in more detail. We demonstrate direct protein-protein interaction between GafA and the RNA polymerase omega subunit (RNAP Ω) using bacterial-two-hybrid and pull down assays. Further evidence for the interaction has come from random and site directed mutagenesis of gafA and targeted truncations. GafA mutants were also tested to assess their impact on RcGTA production. RNAP Ω is thought to recruit alternative sigma factors to the RNAP holoenzyme. Regions of GafA also share sequence homology with known sigma factor proteins, and we propose that GafA acts as an alternative sigma factor to co-ordinate expression of disparate RcGTA genes. Our results advance our understanding of this fascinating mode of horizontal gene transfer, not only in the model species but also in other potential GTA producing species that contain gafA homologues.


2021 ◽  
Author(s):  
Jason W. Shapiro ◽  
Catherine Putonti

Abstract Background A pangenome is the collection of all genes found in a set of related genomes. For microbes, these genomes are often different strains of the same species, and the pangenome offers a means to compare gene content variation with differences in phenotypes, ecology, and phylogenetic relatedness. Though most frequently applied to bacteria, there is growing interest in adapting pangenome analysis to bacteriophages. However, working with phage genomes presents new challenges. First, most phage families are under-sampled, and homologous genes in related viruses can be difficult to identify. Second, homing endonucleases and intron-like sequences may be present, resulting in fragmented gene calls. Each of these issues can reduce the accuracy of standard pangenome analysis tools. Methods We developed an R pipeline called Rephine.r that takes as input the gene clusters produced by an initial pangenomics workflow. Rephine.r then proceeds in two primary steps. First, it identifies three common causes of fragmented gene calls: 1) indels creating early stop codons and new start codons; 2) interruption by a selfish genetic element; and 3) splitting at the ends of the reported genome. Fragmented genes are then fused to create new sequence alignments. In tandem, Rephine.r searches for distant homologs separated into different gene families using Hidden Markov Models. Significant hits are used to merge families into larger clusters. A final round of fragment identification is then run, and results may be used to infer single-copy core genomes and phylogenetic trees. Results We applied Rephine.r to three well-studied phage groups: the Tevenvirinae (e.g. T4), the Studiervirinae (e.g. T7), and the Pbunaviruses (e.g. PB1). In each case, Rephine.r recovered additional members of the single-copy core genome and increased the overall bootstrap support of the phylogeny. The Rephine.r pipeline is provided through GitHub (https://www.github.com/coevoeco/Rephine.r) as a single script for automated analysis and with utility functions and a walkthrough for researchers with specific use cases for each type of correction.
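The merging step described in this abstract, in which significant hits between gene families are used to join them into larger clusters, can be illustrated with a union-find structure. This is a sketch in Python rather than R and is not code from Rephine.r. The family identifiers and the input format (pairs of families linked by a significant hit) are hypothetical.

```python
def merge_families(families, hits):
    """Group gene families that are connected by significant hits.

    families: iterable of family identifiers.
    hits: iterable of (family_a, family_b) pairs judged significant;
          both members must appear in `families`.
    Returns a sorted list of clusters, each a sorted list of identifiers.
    """
    parent = {f: f for f in families}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for f in parent:
        clusters.setdefault(find(f), []).append(f)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Hypothetical family identifiers and hits.
    print(merge_families(["F1", "F2", "F3", "F4"], [("F1", "F2"), ("F2", "F3")]))
```

Because merging is transitive, two families with no direct hit between them still end up in one cluster when a third family links them.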

