VIBRANT: Automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequence

Author(s):  
Kristopher Kieft ◽  
Zhichao Zhou ◽  
Karthik Anantharaman

Abstract Background: Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes. Design: Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of viral community function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a newly developed v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating viral community function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data. Results: VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter, VirFinder and MARVEL. When applied to 120,834 metagenomically derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94% of the viruses, whereas VirFinder, VirSorter and MARVEL achieved less powerful performance, averaging 48%, 87% and 71%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes. 
When compared to PHASTER, Prophage Hunter and VirSorter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn’s Disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states. Conclusions: The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions and ecosystem dynamics.

2019 ◽  
Author(s):  
Kristopher Kieft ◽  
Zhichao Zhou ◽  
Karthik Anantharaman

Abstract Background: Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes. Design: Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of virome function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a novel v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating virome function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data. Results: VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter and VirFinder. When applied to 120,834 metagenomically derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94.5% of the viruses, whereas VirFinder and VirSorter achieved less powerful performance, averaging 48.1% and 56.0%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes.
When compared to PHASTER and Prophage Hunter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn’s Disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states. Conclusions: The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions and ecosystem dynamics.


Pathogens ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 13
Author(s):  
Markus Hoffmann ◽  
Enrika Schütze ◽  
Andreas Bernhard ◽  
Lennart Schlaphoff ◽  
Artur Kaul ◽  
...  

Pan paniscus Papillomavirus 1 (PpPV1) causes focal epithelial hyperplasia (FEH) in infected animals. Here, we analyzed the present disease manifestation and PpPV1 genomic sequence of an animal that was afflicted by an FEH epizootic outbreak in 1987 for which the sequence of the responsible PpPV1 was determined. The animal displayed FEH more than 30 years after the initial diagnosis, indicating persistence or recurrence of the disease, and evidence for active PpPV1 infection was obtained. Moreover, the sequences of the viral genomes present in the late 1980s and in 2018 differed at 23 nucleotide positions, resulting in 11 amino acid exchanges within coding regions. These findings suggest that PpPV1-induced FEH might not undergo complete and/or permanent remission in a subset of afflicted animals.
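
A count of 23 nucleotide differences but only 11 amino acid exchanges is the expected kind of gap, since a substitution can be synonymous or can fall outside a coding region. The following Python sketch illustrates such a comparison under stated assumptions: the two sequences are toy strings (not PpPV1 data), they are already aligned, gap-free, in frame and of equal length, and the standard genetic code applies. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nucleotide_differences(seq_a, seq_b):
    """Return the 0-based positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]


def amino_acid_exchanges(cds_a, cds_b):
    """Return (codon index, old residue, new residue) for each changed codon.

    Assumes both inputs are in-frame coding sequences of equal length;
    a trailing partial codon, if any, is ignored.
    """
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for i in range(0, len(cds_a) - len(cds_a) % 3, 3):
        aa_a = CODON_TABLE[cds_a[i:i + 3]]
        aa_b = CODON_TABLE[cds_b[i:i + 3]]
        if aa_a != aa_b:
            changes.append((i // 3, aa_a, aa_b))
    return changes


# Toy example: two substitutions, only one of which changes the protein.
old = "ATGGCTAAA"  # Met-Ala-Lys
new = "ATGGCCGAA"  # Met-Ala-Glu (GCT -> GCC is synonymous)
print(nucleotide_differences(old, new))
print(amino_acid_exchanges(old, new))
```

In the toy pair, the change in the second codon leaves the residue unchanged while the change in the third replaces lysine with glutamate, so two nucleotide differences yield one amino acid exchange.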


Author(s):  
Manish C Choudhary ◽  
Charles R Crain ◽  
Xueting Qiu ◽  
William Hanage ◽  
Jonathan Z Li

Abstract Background Both SARS-CoV-2 reinfection and persistent infection have been reported, but sequence characteristics in these scenarios have not been described. We assessed published cases of SARS-CoV-2 reinfection and persistence, characterizing the hallmarks of reinfecting sequences and the rate of viral evolution in persistent infection. Methods A systematic review of PubMed was conducted to identify cases of SARS-CoV-2 reinfection and persistence with available sequences. Nucleotide and amino acid changes in the reinfecting sequence were compared to both the initial and contemporaneous community variants. Time-measured phylogenetic reconstruction was performed to compare intra-host viral evolution in persistent SARS-CoV-2 to community-driven evolution. Results Twenty reinfection and nine persistent infection cases were identified. Reports of reinfection cases spanned a broad distribution of ages, baseline health status, reinfection severity, and occurred as early as 1.5 months or >8 months after the initial infection. The reinfecting viral sequences had a median of 17.5 nucleotide changes with enrichment in the ORF8 and N genes. The number of changes did not differ by the severity of reinfection and reinfecting variants were similar to the contemporaneous sequences circulating in the community. Patients with persistent COVID-19 demonstrated more rapid accumulation of sequence changes than seen with community-driven evolution with continued evolution during convalescent plasma or monoclonal antibody treatment. Conclusions Reinfecting SARS-CoV-2 viral genomes largely mirror contemporaneous circulating sequences in that geographic region, while persistent COVID-19 has been largely described in immunosuppressed individuals and is associated with accelerated viral evolution.


1980 ◽  
Vol 210 (1180) ◽  
pp. 423-435 ◽  

We have cloned and propagated in prokaryotic vectors the viral DNA sequences that are integrated in a variety of cells transformed by adenovirus 2 or SV40. Analysis of the clones reveals that the viral DNA sequences sometimes are arranged in a simple fashion, collinear with the viral genome; in other cell lines there are complex arrangements of viral sequences in which tracts of the viral genome are inverted with respect to each other. In several cases the nucleotide sequences at the joints between cell and viral sequences have been determined: usually there is a sharp transition between cellular and viral DNAs. The viral sequences are integrated at different locations within the genomes of different cell lines; likewise there is no specific site on the viral genomes at which integration occurs. Sometimes the viral sequences are integrated within repetitive cellular DNA, and sometimes within unique sequences. In some cases there is evidence that the viral sequences along with the flanking cell DNA have been amplified after integration. The sequences that flank the viral insertion in the line of SV40-transformed rat cells known as 14B have been used as probes to isolate, from untransformed rat cells, clones that carry the region of the chromosome in which integration occurred. Analysis of the structure of these clones by restriction endonuclease digestion and heteroduplex formation shows that a rearrangement of cellular sequences has occurred, presumably as a consequence of integration.


2021 ◽  
pp. 117744
Author(s):  
Aijie Wang ◽  
Ke Shi ◽  
Daliang Ning ◽  
Haoyi Cheng ◽  
Hongcheng Wang ◽  
...  

2017 ◽  
Vol 7 (19) ◽  
pp. 7965-7974 ◽  
Author(s):  
Rim Khlifa ◽  
Alain Paquette ◽  
Christian Messier ◽  
Peter B. Reich ◽  
Alison D. Munson

2006 ◽  
Vol 87 (10) ◽  
pp. 3045-3051 ◽  
Author(s):  
Mazen S. Habayeb ◽  
Sophia K. Ekengren ◽  
Dan Hultmark

Several viruses, including picornaviruses, are known to establish persistent infections, but the mechanisms involved are poorly understood. Here, a novel picorna-like virus, Nora virus, which causes a persistent infection in Drosophila melanogaster, is described. It has a single-stranded, positive-sense genomic RNA of 11879 nt, followed by a poly(A) tail. Unlike other picorna-like viruses, the genome has four open reading frames (ORFs). One ORF encodes a picornavirus-like cassette of proteins for virus replication, including an iflavirus-like RNA-dependent RNA polymerase and a helicase that is related to those of mammalian picornaviruses. The three other ORFs are not closely related to any previously described viral sequences. The unusual sequence and genome organization in Nora virus suggest that it belongs to a new family of picorna-like viruses. Surprisingly, Nora virus could be detected in all tested D. melanogaster laboratory stocks, as well as in wild-caught material. The viral titres varied enormously, between 10^4 and 10^10 viral genomes per fly in different stocks, without causing obvious pathological effects. The virus was also found in Drosophila simulans, a close relative of D. melanogaster, but not in more distantly related Drosophila species. It will now be possible to use Drosophila genetics to study the factors that control this persistent infection.


2019 ◽  
Author(s):  
Xiaochu Li ◽  
Floricel Gonzalez ◽  
Nathaniel Esteves ◽  
Birgit E. Scharf ◽  
Jing Chen

Abstract Coexistence of bacteriophages, or phages, and their host bacteria plays an important role in maintaining the microbial communities. In natural environments with limited nutrients, motile bacteria can actively migrate towards locations of richer resources. Although phages are not motile themselves, they can infect motile bacterial hosts and spread in space via the hosts. Therefore, in a migrating microbial community coexistence of bacteria and phages implies their co-propagation in space. Here, we combine an experimental approach and mathematical modeling to explore how phages and their motile host bacteria coexist and co-propagate. When lytic phages encountered motile host bacteria in our experimental setup, a sector-shaped lysis zone formed. Our mathematical model indicates that local nutrient depletion and the resulting inhibition of proliferation and motility of bacteria and phages are the key to formation of the observed lysis pattern. The model further reveals the straight radial boundaries in the lysis pattern as a tell-tale sign for coexistence and co-propagation of bacteria and phages. Emergence of such a pattern, albeit insensitive to extrinsic factors, requires a balance between intrinsic biological properties of phages and bacteria, which likely results from co-evolution of phages and bacteria. Author summary: Coexistence of phages and their bacterial hosts is important for maintaining the microbial communities. In a migrating microbial community, coexistence between phages and host bacteria implies that they co-propagate in space. Here we report a novel phage lysis pattern that is indicative of this co-propagation. The corresponding mathematical model we developed highlights a crucial dependence of the lysis pattern and implied phage-bacteria co-propagation on intrinsic properties allowing proliferation and spreading of the microbes in space.
Remarkably, extrinsic factors, such as overall nutrient level, do not influence phage-bacteria coexistence and co-propagation. Findings from this work have strong implications for dispersal of phages mediated by motile bacterial communities, which will provide scientific basis for the fast-growing applications of phages.


2017 ◽  
Author(s):  
Joshua E. Goldford ◽  
Nanxi Lu ◽  
Djordje Bajic ◽  
Sylvie Estrela ◽  
Mikhail Tikhonov ◽  
...  

Abstract Microbes assemble into complex, dynamic, and species-rich communities that play critical roles in human health and in the environment. The complexity of natural environments and the large number of niches present in most habitats are often invoked to explain the maintenance of microbial diversity in the presence of competitive exclusion. Here we show that soil and plant-associated microbiota, cultivated ex situ in minimal synthetic environments with a single supplied source of carbon, universally re-assemble into large and dynamically stable communities with strikingly predictable coarse-grained taxonomic and functional compositions. We find that generic, non-specific metabolic cross-feeding leads to the assembly of dense facilitation networks that enable the coexistence of multiple competitors for the supplied carbon source. The inclusion of universal and non-specific cross-feeding in ecological consumer-resource models is sufficient to explain our observations, and predicts a simple determinism in community structure, a property reflected in our experiments.
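
The idea that cross-feeding can let competitors for one supplied carbon source coexist can be illustrated with a deliberately small toy model. The Python sketch below is not the authors' model: the two-species, two-resource chemostat setup, the functional forms, the `leak` parameter and every numerical value are assumptions made for illustration. Species 0 is the better consumer of the single supplied resource; species 1 is a weaker consumer of it but can also use a byproduct that is secreted when the supplied resource is consumed.

```python
def simulate(leak, t_end=500.0, dt=0.01):
    """Euler-integrate a toy consumer-resource model with optional cross-feeding.

    leak is the fraction of consumed resource 0 that is secreted as
    resource 1 (the byproduct). All parameter values are illustrative.
    """
    uptake = [[1.0, 0.0],   # species 0: uses only the supplied resource 0
              [0.5, 1.0]]   # species 1: weaker on resource 0, also uses byproduct 1
    dilution = 0.1          # chemostat washout rate for cells and resources
    supply = [1.0, 0.0]     # only resource 0 is supplied from outside
    n = [0.01, 0.01]        # initial species abundances
    r = [1.0, 0.0]          # initial resource concentrations
    for _ in range(int(t_end / dt)):
        # flux[i][a]: rate at which species i takes up resource a
        flux = [[uptake[i][a] * n[i] * r[a] for a in range(2)] for i in range(2)]
        # only the non-leaked fraction of uptake is converted into growth
        dn = [(1.0 - leak) * sum(flux[i]) - dilution * n[i] for i in range(2)]
        dr = [dilution * (supply[a] - r[a]) - sum(flux[i][a] for i in range(2))
              for a in range(2)]
        # the leaked share of resource-0 uptake reappears as resource 1
        dr[1] += leak * sum(flux[i][0] for i in range(2))
        n = [n[i] + dt * dn[i] for i in range(2)]
        r = [r[a] + dt * dr[a] for a in range(2)]
    return n, r


no_cross_feeding, _ = simulate(leak=0.0)
with_cross_feeding, _ = simulate(leak=0.5)
print("without cross-feeding:", no_cross_feeding)
print("with cross-feeding:   ", with_cross_feeding)
```

With `leak=0.0` the weaker competitor is washed out, which is the competitive-exclusion outcome; with `leak=0.5` both species persist under these parameter choices, because the byproduct gives species 1 a resource on which it is not outcompeted.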

