taxonomic tree
Recently Published Documents


PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0258693
Author(s):  
Yuval Bussi ◽  
Ruti Kapon ◽  
Ziv Reich

Information-theoretic approaches are ubiquitous and effective in a wide variety of bioinformatics applications. In comparative genomics, alignment-free methods based on short DNA words, or k-mers, are particularly powerful. We evaluated the utility of varying k-mer lengths for genome comparisons by analyzing their sequence space coverage of 5805 genomes in the KEGG GENOME database. In subsequent analyses on four k-mer lengths spanning the relevant range (11, 21, 31, 41), hierarchical clustering of 1634 genus-level representative genomes using pairwise 21- and 31-mer Jaccard similarities best recapitulated a phylogenetic/taxonomic tree of life, with clear boundaries for superkingdom domains and high subtree similarity for named taxa at lower levels (family through phylum). By analyzing ~14.2M prokaryotic genome comparisons by their lowest-common-ancestor taxon levels, we detected many potential misclassification errors in a curated database, further demonstrating the need for wide-scale adoption of quantitative taxonomic classifications based on whole-genome similarity.
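
The pairwise k-mer Jaccard similarity mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, the sequences are toy strings, and it ignores reverse complements and ambiguous bases, which a real genome comparison would likely need to handle.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all overlapping k-mers in seq (empty if len(seq) < k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def kmer_jaccard(seq1: str, seq2: str, k: int) -> float:
    """Jaccard similarity between the k-mer sets of two sequences."""
    return jaccard(kmers(seq1, k), kmers(seq2, k))


if __name__ == "__main__":
    # Toy sequences; the study itself compares whole genomes with k = 11, 21, 31, 41.
    print(kmer_jaccard("ACGT", "ACGA", 2))
```

Under these assumptions, a matrix of such pairwise similarities (or the distances 1 - similarity) is the kind of input a hierarchical clustering step would consume.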


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tetsu Sakamoto ◽  
J. Miguel Ortega

Abstract

Background: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases, since all organisms with sequence accessions deposited on INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, an alternative representation of the data as a table would make the information easier to use when processing bioinformatics data. To do so, since some taxonomic ranks are missing in some lineages, an algorithm might propose provisional names for all taxonomic ranks.

Results: To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic table, maintaining its compatibility with the original tree. The algorithm attempts to assign a taxonomic rank to an existing clade or “no rank” node when possible, using its name as part of the created taxonomic-rank name (e.g. Ord_Ornithischia), or interpolates parent nodes when needed (e.g. Cla_of_Ornithischia); both examples are taken from the lineage of the dinosaur Brachylophosaurus. The new hierarchical structure was named Taxallnomy because it contains names for all taxonomic ranks; it has 41 hierarchical levels corresponding to the 41 taxonomic ranks currently found in the NCBI Taxonomy database. From Taxallnomy, users can obtain the complete taxonomic lineage, with 41 nodes, of all taxa available in the NCBI Taxonomy database, without altering the original tree information. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles.

Conclusion: Taxallnomy applies to any bioinformatics analysis that depends on information from NCBI Taxonomy. Taxallnomy is updated periodically, but users can also generate it locally from NCBI Taxonomy data using a distributed PERL script. All Taxallnomy resources are available at http://bioinfo.icb.ufmg.br/taxallnomy.
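
The idea of filling missing ranks with provisional names can be sketched in Python as follows. This is an illustration only, not the Taxallnomy algorithm (which is distributed as a PERL script and covers all ranks in NCBI Taxonomy): the seven-rank list, the three-letter abbreviations, the `_in_` fallback naming, and the toy input lineage are all assumptions made for this example. Only the `Cla_of_Ornithischia` pattern comes from the abstract.

```python
# Assumption: a reduced list of seven ranks, ordered from root to leaf.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def abbreviation(rank: str) -> str:
    """Hypothetical three-letter rank prefix, e.g. 'class' -> 'Cla'."""
    return rank[:3].capitalize()


def complete_lineage(known: dict[str, str]) -> dict[str, str]:
    """Return a name for every rank in RANKS, inventing provisional names for gaps.

    A missing rank borrows the name of the nearest ranked node below it
    ('<Abbr>_of_<name>'); if nothing is ranked below, it borrows from the
    nearest ranked node above it ('<Abbr>_in_<name>', an assumed convention).
    """
    completed = {}
    for i, rank in enumerate(RANKS):
        if rank in known:
            completed[rank] = known[rank]
            continue
        below = next((known[r] for r in RANKS[i + 1:] if r in known), None)
        if below is not None:
            completed[rank] = f"{abbreviation(rank)}_of_{below}"
            continue
        above = next((known[r] for r in reversed(RANKS[:i]) if r in known), None)
        suffix = f"_in_{above}" if above is not None else "_unknown"
        completed[rank] = abbreviation(rank) + suffix
    return completed


if __name__ == "__main__":
    # Toy, deliberately incomplete lineage (not the actual NCBI record).
    toy = {"phylum": "Chordata", "order": "Ornithischia", "genus": "Brachylophosaurus"}
    for rank, name in complete_lineage(toy).items():
        print(f"{rank}\t{name}")
```

Because every lineage ends up with the same set of ranks, the results can be stored as one row per taxon in a table, which is the representation the abstract argues for.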


Author(s):  
Lindsay Triplett ◽  
Ravikumar Patel

Abstract: Xanthomonas vasicola pv. vasculorum (Xvv) is a bacterial pathogen that causes both bacterial leaf streak of maize and sugarcane gumming disease. After decades limited to South Africa, bacterial leaf streak of maize has spread rapidly through maize-growing areas of Argentina, Brazil and the USA since 2014. The origin, method and biological underpinnings of this sudden spread are not well understood but are the subject of active research. Effective control methods remain elusive, but sanitation and crop debris management may limit the disease. Yield impact data are not yet available, but lesions may become severe enough to limit plant productivity in some varieties. The pathogen is not currently considered a quarantine threat by the USDA, EPPO or IPPC.

Taxonomic Tree
Domain: Bacteria
Phylum: Proteobacteria
Class: Gammaproteobacteria
Order: Xanthomonadales
Family: Xanthomonadaceae
Genus: Xanthomonas
Species: Xanthomonas vasicola pv. vasculorum

Notes on Taxonomy and Nomenclature
The taxonomic nomenclature of the pathogen has undergone several changes and is still being resolved at the time of this report. The Xanthomonas clade causing.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Francisco Gomez-Donoso ◽  
Félix Escalona ◽  
Ferran Pérez-Esteve ◽  
Miguel Cazorla

The most common approaches to classification rely on the inference of a single specific class. However, every category can be naturally organized within a taxonomic tree, from the most general concept to the most specific element, which mirrors how human knowledge is organized. This representation avoids the need to learn roughly the same features for a range of very similar categories, is easier to understand and work with, and provides a classification at each level of abstraction. In this paper, we carry out an exhaustive study of different methods for multilevel classification, applied to the task of classifying wild animal and plant species. Different convolutional backbones, data setups, and ensembling techniques are explored to find the model that provides the best performance. Our experiments show that, to achieve the best performance on datasets arranged in a tree-like structure, the classifier should use an EfficientNetB5 backbone with an input size of 300 × 300 px, followed by a multilevel classifier, together with a Multiscale Crop data augmentation process. This setup achieves 62% top-1 accuracy and 88% top-5 accuracy. The architecture could gain further accuracy as part of an ensemble of cascade classifiers, but the computational demand is prohibitive for any real application.
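
For readers unfamiliar with the top-1 and top-5 figures quoted above, the metric can be sketched in plain Python. This is a generic illustration of top-k accuracy, not the authors' evaluation code; the function name and the toy scores are assumptions.

```python
def top_k_accuracy(scores: list[list[float]], labels: list[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Toy per-class scores for three samples and three classes.
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print(top_k_accuracy(scores, labels, k=1))
    print(top_k_accuracy(scores, labels, k=2))
```

Top-1 accuracy counts only exact matches of the highest-scoring class, so top-5 accuracy is always at least as high on the same predictions.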


2021 ◽  
Author(s):  
Heejung Yang ◽  
Beomjun Park ◽  
Jinyoung Park ◽  
Jiho Lee ◽  
Hyeon Seok Jang ◽  
...  

Abstract: Biomedical databases grow by more than a thousand new publications every day. The large volume of biomedical literature, published at an unprecedented rate, makes it hard to discover knowledge relevant to keywords of interest, gather new insights, and form hypotheses. A text-mining tool, PubTator, helps to automatically annotate bioentities, such as species, chemicals, genes, and diseases, in PubMed abstracts and full-text articles. However, the manual re-organization and analysis of bioentities is a non-trivial and highly time-consuming task. ChexMix was designed to extract the unique identifiers of bioentities from query results. Herein, ChexMix was used to construct a taxonomic tree with allied species among Korean native plants and to extract the medical subject headings unique identifiers of the bioentities that co-occurred with the keywords in the same literature. ChexMix discovered the allied species related to a keyword of interest, and its usefulness for multi-species analysis was demonstrated experimentally.


Zootaxa ◽  
2021 ◽  
Vol 4908 (3) ◽  
pp. 447-450
Author(s):  
ADRIANO B. KURY ◽  
AMANDA C. MENDES ◽  
LILIAN CARDOSO ◽  
MILENA S. KURY ◽  
ALEXIA A. GRANADO ◽  
...  

The “World Catalogue of Opiliones” (WCO) is a collaborative effort to comprehensively index the Earth’s species of harvestmen. This paper announces one component of the WCO, “WCO-Lite”, a website available at https://wcolite.com/. WCO-Lite provides a graphical user interface for a second component of the WCO, “Opiliones of the World”, a database on the taxonomy of the harvestmen curated in TaxonWorks (TW). WCO-Lite interfaces include: (1) a checklist of all valid taxa of the arachnid order Opiliones, exhaustive up to December 2018; (2) a taxonomic tree; (3) a search engine comprising two modules; and (4) a counter of species diversity for each taxon. An e-Book companion was launched simultaneously with WCO-Lite version 1.1 on September 12, 2020 to account for the formal publication of mandatory nomenclatural changes and availability of taxonomic names. The collective components of the WCO are also being summarized in a forthcoming conventional paper-form catalogue, currently in manuscript stage.


Author(s):  
Tetsu Sakamoto ◽  
J. Miguel Ortega

Abstract: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases, since all organisms with sequence accessions deposited on INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, taking advantage of its taxonomic tree can be challenging because (1) some taxonomic ranks are missing in some lineages and (2) some nodes in the tree do not have a taxonomic rank assigned (referred to as “no rank”). To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic tree. The algorithm attempts to assign a taxonomic rank to “no rank” nodes and creates or deletes nodes throughout the tree. It also names each new node by borrowing the name of its ranked child or, if there is no child, of its ranked parent node. The new hierarchical structure was named taxallnomy, and it contains 33 hierarchical levels corresponding to the 33 taxonomic ranks currently used in the NCBI Taxonomy database. From taxallnomy, users can obtain the complete taxonomic lineage, with 33 nodes, of all taxa available in the NCBI Taxonomy database. Taxallnomy is applicable to several bioinformatics analyses that depend on NCBI Taxonomy data. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by making metagenomics profiles. The taxallnomy algorithm was written in PERL, and all its resources are available at bioinfo.icb.ufmg.br/taxallnomy.

Database URL: http://bioinfo.icb.ufmg.br/taxallnomy


2020 ◽  
Author(s):  
Fábio M. Miranda ◽  
Vasco A. C. Azevedo ◽  
Bernhard Y. Renard ◽  
Vitor C. Piro ◽  
Rommel T. J. Ramos

Abstract

Motivation: Fungi are key elements in several important ecological functions, ranging from organic matter decomposition to symbiotic associations with plants. Moreover, fungi naturally inhabit the human microbiome and can be causative agents of human infections. An accurate and robust method for fungal ITS classification is not only desirable for better diversity estimation, but can also help us gain deeper insight into the dynamics of environmental communities and ultimately understand whether the abundance of certain species correlates with health and disease. Although many methods have been proposed for taxonomic classification, to the best of our knowledge, none of them consider the taxonomic tree hierarchy when building their models. This, in turn, leads to lower generalization power and a higher risk of classification errors.

Results: In this work, we developed a robust, hierarchical machine learning model for accurate ITS classification, which requires a small amount of data for training and is able to handle imbalanced datasets. We show that our hierarchical model, HiTaC, outperforms state-of-the-art methods when trained over noisy data, consistently achieving higher accuracy and sensitivity across different taxonomic ranks.

Availability: HiTaC is open-source software, with documentation and source code available at https://gitlab.com/dacs-hpi/[email protected]

Supplementary information: Supplementary data are available at bioRxiv online.


2020 ◽  
Vol 2 (1) ◽  
Author(s):  
Qiaoxing Liang ◽  
Paul W Bible ◽  
Yu Liu ◽  
Bin Zou ◽  
Lai Wei

Abstract: Large-scale metagenomic assemblies have uncovered thousands of new species, greatly expanding the known diversity of microbiomes in specific habitats. To investigate the roles of these uncultured species in human health or the environment, researchers need to incorporate their genome assemblies into a reference database for taxonomic classification. However, this procedure is hindered by the lack of a well-curated taxonomic tree for newly discovered species, which is required by current metagenomics tools. Here we report DeepMicrobes, a deep learning-based computational framework for taxonomic classification that allows researchers to bypass this limitation. We show the advantage of DeepMicrobes over state-of-the-art tools in species and genus identification, along with comparable accuracy in abundance estimation. We trained DeepMicrobes on genomes reconstructed from gut microbiomes and discovered potential novel signatures in inflammatory bowel diseases. DeepMicrobes facilitates effective investigations into the uncharacterized roles of metagenomic species.

