Analysis of 1,000 Type-Strain Genomes Improves Taxonomic Classification of Bacteroidetes

Abstract Candida albicans is the most commonly reported species causing candidiasis. The taxonomic classification of C. albicans and related lineages is controversial, with Candida africana (syn. C. albicans var. africana) and Candida stellatoidea (syn. C. albicans var. stellatoidea) being considered different species or C. albicans varieties depending on the authors. Moreover, recent genomic analyses have suggested a shared hybrid origin of C. albicans and C. africana, but the potential parental lineages remain unidentified. Although the genomes of C. albicans and C. africana have been extensively studied, the genome of C. stellatoidea has not been sequenced so far. To better understand the evolution of the C. albicans clade, and to assess whether C. stellatoidea could represent one of the unknown C. albicans parental lineages, we sequenced the C. stellatoidea type strain (CBS 1905). This genome was compared to those of C. albicans and of the closely related lineage C. africana. Our results show that, like C. africana, C. stellatoidea descends from the same hybrid ancestor as other C. albicans strains and that it has undergone a parallel massive loss of heterozygosity.


Analysis of 1,000+ Type-Strain Genomes Substantially Improves Taxonomic Classification of Alphaproteobacteria

Frontiers in Microbiology ◽

10.3389/fmicb.2020.00468 ◽

2020 ◽

Vol 11 ◽

Cited By ~ 1269

Author(s):

Anton Hördt ◽

Marina García López ◽

Jan P. Meier-Kolthoff ◽

Marcel Schleuning ◽

Lisa-Maria Weinhold ◽

...

Keyword(s):

Type Strain ◽

Taxonomic Classification


Genetic characteristics and taxonomic classification of Fimic Anthrosols in China

Geoderma ◽

10.1016/s0016-7061(03)00073-9 ◽

2003 ◽

Vol 115 (1-2) ◽

pp. 31-44 ◽

Cited By ~ 6

Author(s):

Min Zhang ◽

Li Ma ◽

Wenqing Li ◽

Baocheng Chen ◽

Jiwen Jia

Keyword(s):

Taxonomic Classification ◽

Genetic Characteristics


Taxonomic classification of metagenomic sequences from Relative Abundance Index profiles using deep learning

Biomedical Signal Processing and Control ◽

10.1016/j.bspc.2021.102539 ◽

2021 ◽

Vol 67 ◽

pp. 102539

Author(s):

Meryem Altın Karagöz ◽

O. Ufuk Nalbantoglu

Keyword(s):

Deep Learning ◽

Relative Abundance ◽

Taxonomic Classification ◽

Abundance Index


A singular value decomposition approach for improved taxonomic classification of biological sequences

BMC Genomics ◽

10.1186/1471-2164-12-s4-s11 ◽

2011 ◽

Vol 12 (Suppl 4) ◽

pp. S11 ◽

Cited By ~ 2

Author(s):

Anderson R Santos ◽

Marcos A Santos ◽

Jan Baumbach ◽

John A McCulloch ◽

Guilherme C Oliveira ◽

...

Keyword(s):

Singular Value Decomposition ◽

Singular Value ◽

Taxonomic Classification ◽

Biological Sequences ◽

Decomposition Approach ◽

Value Decomposition


VirusTaxo: Taxonomic classification of virus genome using multi-class hierarchical classification by k-mer enrichment

10.1101/2021.04.29.442004 ◽

2021 ◽

Author(s):

Rajan Saha Raju ◽

Abdullah Al Nahid ◽

Preonath Shuvo ◽

Rashedul Islam

Keyword(s):

Genome Sequence ◽

Rna Viruses ◽

Hierarchical Classification ◽

Classification Problem ◽

Virus Genome ◽

Taxonomic Classification ◽

Dna Viruses ◽

Dna And Rna ◽

Full Length Genome

Abstract Taxonomic classification of viruses is a multi-class hierarchical classification problem, as taxonomic ranks (e.g., order, family and genus) of viruses are hierarchically structured and have multiple classes in each rank. Classification of biological sequences which are hierarchically structured with multiple classes is challenging. Here we developed a machine learning architecture, VirusTaxo, using a multi-class hierarchical classification by k-mer enrichment. VirusTaxo classifies DNA and RNA viruses to their taxonomic ranks using genome sequence. To assign taxonomic ranks, VirusTaxo extracts k-mers from genome sequence and creates bag-of-k-mers for each class in a rank. VirusTaxo uses a top-down hierarchical classification approach and accurately assigns the order, family and genus of a virus from the genome sequence. The average accuracies of VirusTaxo for DNA viruses are 99% (order), 98% (family) and 95% (genus) and for RNA viruses 97% (order), 96% (family) and 82% (genus). VirusTaxo can be used to detect taxonomy of novel viruses using full length genome or contig sequences. Availability: Online version of VirusTaxo is available at https://omics-lab.com/virustaxo/.
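
The top-down, bag-of-k-mers idea described in this abstract can be illustrated with a minimal sketch. This is not the VirusTaxo implementation: the scoring rule (count of shared k-mers), the value k=4, the function names, and the toy sequences and labels are all assumptions chosen for illustration.

```python
from collections import defaultdict

def kmers(seq, k=4):
    """Return the set of overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_bags(training, k=4):
    """Build one bag (set) of k-mers per class from (label, sequence) pairs."""
    bags = defaultdict(set)
    for label, seq in training:
        bags[label] |= kmers(seq, k)
    return dict(bags)

def best_class(seq, bags, k=4):
    """Pick the class whose bag shares the most k-mers with the query.

    Ties are broken alphabetically (an arbitrary, assumed rule).
    """
    query = kmers(seq, k)
    return max(sorted(bags), key=lambda label: len(query & bags[label]))

def classify_top_down(seq, genomes, ranks=("order", "family", "genus"), k=4):
    """Assign ranks from highest to lowest, narrowing candidates each step."""
    candidates = genomes
    assigned = {}
    for rank in ranks:
        bags = build_bags([(g[rank], g["seq"]) for g in candidates], k)
        label = best_class(seq, bags, k)
        assigned[rank] = label
        # Only genomes under the chosen class compete at the next rank.
        candidates = [g for g in candidates if g[rank] == label]
    return assigned

# Hypothetical reference set: labels and sequences are invented toy data.
toy_genomes = [
    {"order": "OrderA", "family": "FamA1", "genus": "GenA1a", "seq": "ACGTACGTACGTTTGACA"},
    {"order": "OrderA", "family": "FamA1", "genus": "GenA1b", "seq": "ACGTACGAACGTTTGGCA"},
    {"order": "OrderA", "family": "FamA2", "genus": "GenA2a", "seq": "ACGGACCTACGATTGACC"},
    {"order": "OrderB", "family": "FamB1", "genus": "GenB1a", "seq": "GGGCCCTTTAAAGGGCCC"},
]

toy_result = classify_top_down("ACGGACCTACGATTGACC", toy_genomes)
```

Restricting the candidates after each rank is what makes the scheme top-down: a family is only considered if it belongs to the order already chosen.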


Functional and taxonomic classification of a greenhouse water drain metagenome

Standards in Genomic Sciences ◽

10.1186/s40793-018-0326-y ◽

2018 ◽

Vol 13 (1) ◽

Cited By ~ 1

Author(s):

Gamaliel López-Leal ◽

Fernanda Cornejo-Granados ◽

Juan Manuel Hurtado-Ramírez ◽

Alfredo Mendoza-Vargas ◽

Adrian Ochoa-Leyva

Keyword(s):

Taxonomic Classification ◽

Water Drain


Psychroserpens Luteus Sp. Nov., Isolated From Red Algae

10.21203/rs.3.rs-836932/v1 ◽

2021 ◽

Author(s):

Xiu-Ya Ping ◽

Kai Wang ◽

Jin-Yu Zhang ◽

Shu-Xin Wang ◽

Zong-Jun Du ◽

...

Keyword(s):

Red Algae ◽

Genomic Dna ◽

Type Strain ◽

Novel Species ◽

Sequence Similarity ◽

Rrna Gene ◽

Gram Stain ◽

Optimum Ph ◽

The 16S Rrna Gene

Abstract A Gram-stain-negative, gliding-motile, catalase-positive, facultatively anaerobic bacterium, designated strain XSD401T, was isolated from red algae collected at Xiaoshi Island, Shandong Province, China. Growth occurred at 20–37 °C (optimum, 33 °C), pH 5.5–9.5 (optimum, pH 6.5–7.5), and with 0.5–5% (w/v) NaCl (optimum, 3%). The main fatty acids were iso-C15:0, iso-C15:1 G, iso-C17:0 3-OH, iso-C15:0 3-OH and C16:0. Phosphatidylethanolamine (PE), three unidentified aminolipids (AL1, AL2, AL3) and one unidentified lipid (L) were the major polar lipids. The G+C content of the genomic DNA was 33.9 mol%. The 16S rRNA gene of strain XSD401T had the highest sequence similarity (96.88%) to that of Psychroserpens damuponensis KCTC 23539T. The similarity to Psychroserpens burtonensis DSM 12212T was 96.31%. The dDDH values between strain XSD401T and P. damuponensis KCTC 23539T and P. burtonensis DSM 12212T were 20.40% and 20.30%, respectively. The average nucleotide identity (ANI) values between strain XSD401T and P. damuponensis KCTC 23539T and P. burtonensis DSM 12212T were 76.91% and 76.88%, respectively. The differences in morphology, physiology and genotype from the previously described taxa support the classification of strain XSD401T as a representative of a novel species of the genus Psychroserpens, for which the name Psychroserpens luteus sp. nov. is proposed. The type strain is XSD401T (= MCCC 1H00396T = KCTC 72684T = JCM 33931T).
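
The genomic part of this argument can be checked with a few lines of arithmetic. The sketch below compares the reported dDDH and ANI values with commonly cited species boundaries (about 70% dDDH and about 95–96% ANI). Those cutoffs are not stated in the abstract and are included here as an assumption; the variable and function names are illustrative.

```python
# Commonly cited species boundaries (assumed; not given in the abstract).
DDDH_CUTOFF = 70.0  # percent
ANI_CUTOFF = 95.0   # percent, lower end of the usual 95-96% range

# Values reported in the abstract for strain XSD401T against each type strain.
comparisons = {
    "P. damuponensis KCTC 23539T": {"dddh": 20.40, "ani": 76.91},
    "P. burtonensis DSM 12212T": {"dddh": 20.30, "ani": 76.88},
}

def below_species_boundary(values):
    """True if both dDDH and ANI fall below the assumed species cutoffs."""
    return values["dddh"] < DDDH_CUTOFF and values["ani"] < ANI_CUTOFF

# The strain is genomically distinct only if it is below the boundary
# against every compared type strain.
distinct_from_all = all(below_species_boundary(v) for v in comparisons.values())
```

Both comparisons fall far below the assumed cutoffs, which is consistent with the abstract's proposal of a novel species.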


Taxonomic Classification of Medically Important Microorganisms

Pocket Guide to Clinical Microbiology, Third Edition ◽

10.1128/9781555817725.ch1 ◽

2014 ◽

pp. 1-19

Keyword(s):

Taxonomic Classification


Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT

Genome Biology ◽

10.1186/s13059-019-1817-x ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 26

Author(s):

F. A. Bastiaan von Meijenfeldt ◽

Ksenia Arkhipova ◽

Diego D. Cambuy ◽

Felipe H. Coutinho ◽

Bas E. Dutilh

Keyword(s):

Dna Sequences ◽

De Novo ◽

Taxonomic Classification ◽

Classification Method ◽

Reference Database ◽

Annotation Tool ◽

Multiple Signals

Abstract Current-day metagenomics analyses increasingly involve de novo taxonomic classification of long DNA sequences and metagenome-assembled genomes. Here, we show that the conventional best-hit approach often leads to classifications that are too specific, especially when the sequences represent novel deep lineages. We present a classification method that integrates multiple signals to classify sequences (Contig Annotation Tool, CAT) and metagenome-assembled genomes (Bin Annotation Tool, BAT). Classifications are automatically made at low taxonomic ranks if closely related organisms are present in the reference database and at higher ranks otherwise. The result is a high classification precision even for sequences from considerably unknown organisms.
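
The behaviour described here, a low-rank assignment when close relatives are in the database and a higher-rank assignment otherwise, can be sketched with a simple score-weighted vote. This is not the CAT/BAT code: the rank list, the 0.5 support threshold, the function name, and the placeholder taxa (P1, C1, ...) are assumptions made for illustration.

```python
from collections import defaultdict

RANKS = ("phylum", "class", "order", "family", "genus")

def classify_by_support(hits, min_fraction=0.5):
    """Walk down the ranks while one lineage holds a majority of the score.

    hits: list of (lineage, score), where lineage is a tuple aligned with
    RANKS. Returns a list of (rank, taxon) pairs, stopping at the first
    rank where no lineage exceeds min_fraction of the total score.
    """
    total = sum(score for _, score in hits)
    assigned = []
    if total <= 0:
        return assigned
    for depth, rank in enumerate(RANKS):
        support = defaultdict(float)
        for lineage, score in hits:
            if depth < len(lineage):
                # Key on the full prefix so equal names in different
                # parent taxa are not merged.
                support[lineage[:depth + 1]] += score
        if not support:
            break
        prefix, best = max(support.items(), key=lambda item: item[1])
        if best / total <= min_fraction:
            break  # signals disagree here, so stay at the higher rank
        assigned.append((rank, prefix[-1]))
    return assigned

# Hypothetical hits for a sequence from a poorly represented lineage:
# the signals agree down to order, then split between two families.
novel_hits = [
    (("P1", "C1", "O1", "F1", "G1"), 100.0),
    (("P1", "C1", "O1", "F2", "G2"), 90.0),
    (("P1", "C1", "O2", "F3", "G3"), 40.0),
    (("P2", "C2", "O3", "F4", "G4"), 20.0),
]
# Hypothetical hits for a sequence with close relatives in the database.
known_hits = [
    (("P1", "C1", "O1", "F1", "G1"), 100.0),
    (("P1", "C1", "O1", "F1", "G1"), 80.0),
    (("P2", "C2", "O3", "F4", "G4"), 10.0),
]

novel_result = classify_by_support(novel_hits)
known_result = classify_by_support(known_hits)
```

In contrast to a best-hit rule, which would assign the first toy sequence to genus G1 on the strength of a single top hit, the vote stops at order O1, where the evidence still agrees.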
