VIRIDIC—A Novel Tool to Calculate the Intergenomic Similarities of Prokaryote-Infecting Viruses

Nucleotide-based intergenomic similarities are useful to understand how viruses are related to each other and to classify them. Here we have developed VIRIDIC, which implements the traditional algorithm used by the International Committee on Taxonomy of Viruses (ICTV), Bacterial and Archaeal Viruses Subcommittee, to calculate virus intergenomic similarities. When compared with other software, VIRIDIC gave the best agreement with the traditional algorithm, which is based on the percent identity between two genomes determined by BLASTN. Furthermore, VIRIDIC proved best at estimating the relatedness between more distantly related phages, relatedness that other tools can significantly overestimate. In addition to the intergenomic similarities, VIRIDIC also calculates three indicators of the alignment's ability to capture the relatedness between viruses: the aligned fractions for each genome in a pair and the length ratio between the two genomes. The main output of VIRIDIC is a heatmap integrating the intergenomic similarity values with information regarding the genome lengths and the aligned genome fraction. Additionally, VIRIDIC can group viruses into clusters, based on user-defined intergenomic similarity thresholds. The sensitivity of VIRIDIC is that of BLASTN: it can capture relationships between viruses that share even short genomic regions, with as little as 65% similarity. Below this similarity level, protein-based analyses should be used, as they are best suited to capture distant relationships. VIRIDIC is available at viridic.icbm.de, both as a web service and as a stand-alone tool. It allows fast analysis of large phage genome datasets, especially in the stand-alone version, which can be run on the user’s own servers and can be integrated in bioinformatics pipelines. VIRIDIC was developed having viruses of Bacteria and Archaea in mind; however, it could potentially be used for eukaryotic viruses as well, as long as they are monopartite.
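
The quantities named in the abstract above can be illustrated with a minimal Python sketch. It is not VIRIDIC's code. The abstract does not give formulas, so the similarity formula used here (identical bases summed over both BLASTN directions, divided by the summed genome lengths) is an assumption, as are the `PairStats` record, the function names, the single-linkage grouping rule, and the example threshold of 95.

```python
from dataclasses import dataclass


@dataclass
class PairStats:
    """Hypothetical per-pair summary derived from BLASTN hits (illustrative only)."""
    len_a: int      # length of genome A, in bases
    len_b: int      # length of genome B, in bases
    ident_ab: int   # identical bases with A as query and B as subject
    ident_ba: int   # identical bases with B as query and A as subject
    aligned_a: int  # bases of genome A covered by alignments
    aligned_b: int  # bases of genome B covered by alignments


def similarity(p: PairStats) -> float:
    """Assumed formula: identical bases in both directions over summed lengths, in percent."""
    return 100.0 * (p.ident_ab + p.ident_ba) / (p.len_a + p.len_b)


def indicators(p: PairStats) -> tuple[float, float, float]:
    """Return (aligned fraction of A, aligned fraction of B, length ratio shorter/longer)."""
    ratio = min(p.len_a, p.len_b) / max(p.len_a, p.len_b)
    return p.aligned_a / p.len_a, p.aligned_b / p.len_b, ratio


def cluster(names: list[str], sims: dict[tuple[str, str], float],
            threshold: float) -> list[list[str]]:
    """Group genomes by single linkage: a pair at or above the threshold joins two groups.

    Single linkage is an assumption made for this sketch; the abstract only says
    that clusters are based on user-defined similarity thresholds.
    """
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), s in sims.items():
        if s >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    pair = PairStats(len_a=1000, len_b=500, ident_ab=300, ident_ba=300,
                     aligned_a=400, aligned_b=350)
    print(similarity(pair), indicators(pair))
    sims = {("A", "B"): 96.0, ("A", "C"): 40.0, ("B", "C"): 38.0}
    print(cluster(["A", "B", "C"], sims, threshold=95.0))
```

Reporting the aligned fractions and the length ratio next to the similarity value shows whether a given similarity comes from a small shared region or from an alignment spanning most of both genomes.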

VIRIDIC – a novel tool to calculate the intergenomic similarities of prokaryote-infecting viruses

10.1101/2020.07.05.188268 ◽ 2020 ◽ Cited By ~ 7

Author(s): Cristina Moraru ◽ Arvind Varsani ◽ Andrew M. Kropinski

Keyword(s): Web Service ◽ Phage Genome ◽ International Committee ◽ Length Ratio ◽ Fast Analysis ◽ Archaeal Viruses ◽ Traditional Algorithm

Nucleotide-based intergenomic similarities are useful to understand how viruses are related to each other and to classify them. Here we have developed VIRIDIC, which implements the traditional algorithm used by the International Committee on Taxonomy of Viruses (ICTV), Bacterial and Archaeal Viruses Subcommittee, to calculate virus intergenomic similarities. When compared with other software, VIRIDIC gave the best agreement with the traditional algorithm. Furthermore, it proved best at estimating the relatedness between more distantly related phages, relatedness that other tools can significantly overestimate. In addition to the intergenomic similarities, VIRIDIC also calculates three indicators of the alignment's ability to capture the relatedness between viruses: the aligned fractions for each genome in a pair and the length ratio between the two genomes. The main output of VIRIDIC is a heatmap integrating the intergenomic similarity values with information regarding the genome lengths and the aligned genome fraction. VIRIDIC is available at viridic.icbm.de, both as a web service and as a stand-alone tool. It allows fast analysis of large phage genome datasets, especially in the stand-alone version, which can be run on the user’s own servers and can be integrated in bioinformatics pipelines. VIRIDIC was developed having viruses of Bacteria and Archaea in mind; however, it could potentially be used for eukaryotic viruses as well, as long as they are monopartite.

From Orphan Phage to a Proposed New Family–The Diversity of N4-Like Viruses

Antibiotics ◽ 10.3390/antibiotics9100663 ◽ 2020 ◽ Vol 9 (10) ◽ pp. 663 ◽ Cited By ~ 1

Author(s): Johannes Wittmann ◽ Dann Turner ◽ Andrew D. Millard ◽ Padmanabhan Mahadevan ◽ Andrew M. Kropinski ◽ ...

Keyword(s): New Technologies ◽ Phage Genome ◽ International Committee ◽ Genome Sequences ◽ New Members ◽ Phage Group ◽ New Family ◽ Archaeal Viruses ◽ Long Time ◽ Bacterial Viruses

Escherichia phage N4 was isolated in 1966 in Italy and remained a genomic orphan for a long time. It encodes an extremely large virion-associated RNA polymerase, unique among bacterial viruses, that became characteristic of this group. In recent years, thanks to new and relatively inexpensive sequencing techniques, the number of publicly available phage genome sequences has expanded rapidly. This revealed new members of the N4-like phage group, which grew from 33 members in 2015 to 115 N4-like viruses in 2020. Using new technologies and methods for classification, the Bacterial and Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV) has moved the classification and taxonomy of bacterial viruses from mere morphological approaches to genomic and proteomic methods. The analysis of 115 N4-like genomes resulted in a major reassessment of this group and the proposal of a new family “Schitoviridae”, including eight subfamilies and numerous new genera.

How to name and classify your phage: an informal guide

10.1101/111526 ◽ 2017 ◽ Cited By ~ 2

Author(s): Evelien M. Adriaenssens ◽ J. Rodney Brister

Keyword(s): Holistic Approach ◽ International Committee ◽ Bottom Up ◽ Archaeal Viruses ◽ Phage Isolate

With this informal guide, we try to assist both new and experienced phage researchers through two important stages that follow phage discovery, i.e. naming and classification. Providing an appropriate name for a bacteriophage is not as trivial as it sounds and the effects might be long-lasting in databases and in official taxon names. Phage classification is the responsibility of the Bacterial and Archaeal Viruses Subcommittee (BAVS) of the International Committee on Taxonomy of Viruses (ICTV). While the BAVS aims at providing a holistic approach to phage taxonomy, for individual researchers who have isolated and sequenced a new phage, this can be a little overwhelming. We are now providing these researchers with an informal guide to phage naming and classification, taking a “bottom-up” approach from the phage isolate level.

Fast analysis of scATAC-seq data using a predefined set of genomic regions

F1000Research ◽ 10.12688/f1000research.22731.2 ◽ 2020 ◽ Vol 9 ◽ pp. 199 ◽ Cited By ~ 2

Author(s): Valentina Giansanti ◽ Ming Tang ◽ Davide Cittaro

Keyword(s): De Novo ◽ Marker Genes ◽ Fast Analysis ◽ Public Data ◽ Cell Groups ◽ Hypersensitive Sites ◽ Reliable Quantification ◽ Genomic Regions ◽ Computational Resources ◽ Cell Data

Background: Analysis of scATAC-seq data has recently been scaled to thousands of cells. While processing of other types of single-cell data was boosted by the implementation of alignment-free techniques, pipelines available to process scATAC-seq data still require large computational resources. We propose here an approach based on pseudoalignment, which reduces the execution times and hardware needs at little cost in precision. Methods: Public data for 10k PBMC were downloaded from the 10x Genomics website. Reads were aligned to various references derived from DNase I Hypersensitive Sites (DHS) using kallisto and quantified with bustools. We compared our results with the publicly available ones derived by cellranger-atac. We subsequently tested our approach on scATAC-seq data for the K562 cell line. Results: We found that kallisto does not introduce biases in the quantification of known peaks; the cell groups identified are consistent with those identified by the standard method. We also found that cell identification is robust when analysis is performed using a DHS-derived reference in place of de novo identification of ATAC peaks. Lastly, we found that our approach is suitable for reliable quantification of gene activity based on scATAC-seq signal, and thus allows for efficient labelling of cell groups based on marker genes. Conclusions: Analysis of scATAC-seq data by means of kallisto produces results in line with standard pipelines while being considerably faster; using a set of known DHS sites as reference does not affect the ability to characterize the cell populations.

Analysis of Spounaviruses as a Case Study for the Overdue Reclassification of Tailed Phages

Systematic Biology ◽ 10.1093/sysbio/syz036 ◽ 2019 ◽ Vol 69 (1) ◽ pp. 110-123 ◽ Cited By ~ 25

Author(s): Jakub Barylski ◽ François Enault ◽ Bas E Dutilh ◽ Margo BP Schuller ◽ Robert A Edwards ◽ ...

Keyword(s): Marker Gene ◽ International Committee ◽ New Taxon ◽ New Family ◽ Archaeal Viruses ◽ Wide Range ◽ The Family ◽ Comprehensive Picture ◽ Bacillus Phage

Tailed bacteriophages are the most abundant and diverse viruses in the world, with genome sizes ranging from 10 kbp to over 500 kbp. Yet, due to historical reasons, all this diversity is confined to a single virus order, Caudovirales, composed of just four families: Myoviridae, Siphoviridae, Podoviridae, and the newly created Ackermannviridae family. In recent years, this morphology-based classification scheme has started to crumble under the constant flood of phage sequences, revealing that tailed phages are even more genetically diverse than once thought. This prompted us, the Bacterial and Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV), to consider overall reorganization of phage taxonomy. In this study, we used a wide range of complementary methods (including comparative genomics, core genome analysis, and marker gene phylogenetics) to show that the group of Bacillus phage SPO1-related viruses, previously classified into the Spounavirinae subfamily, is clearly distinct from other members of the family Myoviridae and that its diversity deserves the rank of an autonomous family. Thus, we removed this group from the Myoviridae family and created the family Herelleviridae, a new taxon of the same rank. In the process of the taxon evaluation, we explored the feasibility of different demarcation criteria and critically evaluated the usefulness of our methods for phage classification. The convergence of results, drawing a consistent and comprehensive picture of a new family with associated subfamilies, regardless of method, demonstrates that the tools applied here are particularly useful in phage taxonomy. We are convinced that creation of this novel family is a crucial milestone toward much-needed reclassification in the Caudovirales order.

Bacterial Viruses Subcommittee and Archaeal Viruses Subcommittee of the ICTV: update of taxonomy changes in 2021

Archives of Virology ◽ 10.1007/s00705-021-05205-9 ◽ 2021

Author(s): Mart Krupovic ◽ Dann Turner ◽ Vera Morozova ◽ Mike Dyall-Smith ◽ Hanna M. Oksanen ◽ ...

Keyword(s): New Taxa ◽ Executive Committee ◽ International Committee ◽ New Members ◽ Vice Chair ◽ Archaeal Viruses ◽ Bacterial Viruses

In this article, we – the Bacterial Viruses Subcommittee and the Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV) – summarise the results of our activities for the period March 2020 – March 2021. We report the division of the former Bacterial and Archaeal Viruses Subcommittee into two separate Subcommittees, welcome new members, a new Subcommittee Chair and Vice Chair, and give an overview of the new taxa that were proposed in 2020, approved by the Executive Committee and ratified by vote in 2021. In particular, a new realm, three orders, 15 families, 31 subfamilies, 734 genera and 1845 species were newly created or redefined (moved/promoted).

Fast analysis of scATAC-seq data using a predefined set of genomic regions

F1000Research ◽ 10.12688/f1000research.22731.1 ◽ 2020 ◽ Vol 9 ◽ pp. 199

Author(s): Valentina Giansanti ◽ Ming Tang ◽ Davide Cittaro

Keyword(s): De Novo ◽ Marker Genes ◽ Fast Analysis ◽ Public Data ◽ Cell Groups ◽ Hypersensitive Sites ◽ Reliable Quantification ◽ Genomic Regions ◽ Computational Resources ◽ Cell Data

Background: Analysis of scATAC-seq data has recently been scaled to thousands of cells. While processing of other types of single-cell data was boosted by the implementation of alignment-free techniques, pipelines available to process scATAC-seq data still require large computational resources. We propose here an approach based on pseudoalignment, which reduces the execution times and hardware needs at little cost in precision. Methods: Public data for 10k PBMC were downloaded from the 10x Genomics website. Reads were aligned to various references derived from DNase I Hypersensitive Sites (DHS) using kallisto and quantified with bustools. We compared our results with the publicly available ones derived by cellranger-atac. Results: We found that kallisto does not introduce biases in the quantification of known peaks and that cell groups are identified in a consistent way. We also found that cell identification is robust when analysis is performed using a DHS-derived reference in place of de novo identification of ATAC peaks. Lastly, we found that our approach is suitable for reliable quantification of gene activity based on scATAC-seq signal, and thus allows for efficient labelling of cell groups based on marker genes. Conclusions: Analysis of scATAC-seq data by means of kallisto produces results in line with standard pipelines while being considerably faster; using a set of known DHS sites as reference does not affect the ability to characterize the cell populations.
