Taxallnomy: an extension of NCBI Taxonomy that produces a hierarchically complete taxonomic tree

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tetsu Sakamoto ◽  
J. Miguel Ortega

Abstract

Background: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases, since all organisms with sequence accessions deposited in INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, an alternative representation of the data as a table would make the information easier to use when processing bioinformatics data. Because some taxonomic ranks are missing in some lineages, building such a table requires an algorithm that proposes provisional names for all taxonomic ranks.

Results: To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic table, maintaining its compatibility with the original tree. The algorithm attempts to assign a taxonomic rank to an existing clade or “no rank” node when possible, using its name as part of the created taxonomic-rank name (e.g. Ord_Ornithischia), or interpolates parent nodes when needed (e.g. Cla_of_Ornithischia); both examples come from the lineage of the dinosaur Brachylophosaurus. The new hierarchical structure was named Taxallnomy because it contains names for all taxonomic ranks, and it contains 41 hierarchical levels corresponding to the 41 taxonomic ranks currently found in the NCBI Taxonomy database. From Taxallnomy, users can obtain the complete taxonomic lineage, with 41 nodes, of all taxa available in the NCBI Taxonomy database, without compromising the original tree information. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles.

Conclusion: Taxallnomy applies to any bioinformatics analysis that depends on information from NCBI Taxonomy. Taxallnomy is updated periodically, but users can also generate it locally with the distributed PERL script, using NCBI Taxonomy as input. All Taxallnomy resources are available at http://bioinfo.icb.ufmg.br/taxallnomy.
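
To illustrate why a rank-complete table helps with the metagenomics use case mentioned in the abstract, here is a minimal Python sketch, not the authors' PERL implementation. It assumes a lineage table is already available as a dictionary; the taxon keys, the name `Ord_Example`, and the function name `rank_profile` are hypothetical and used only for illustration.

```python
from collections import Counter

def rank_profile(read_taxa, lineage_table, rank):
    """Count reads per name at one taxonomic rank.

    read_taxa     -- iterable of taxon keys, one per classified read
    lineage_table -- dict: taxon key -> dict mapping rank -> name
    rank          -- the rank (column) to summarise
    """
    counts = Counter()
    for taxon in read_taxa:
        # Every taxon has an entry for every rank in a complete table,
        # so no read is dropped for lack of a name at the chosen rank.
        counts[lineage_table[taxon][rank]] += 1
    return dict(counts)

# Hypothetical toy table with two ranks per taxon.
toy_table = {
    "taxon_a": {"class": "Cla_of_Ornithischia", "order": "Ord_Ornithischia"},
    "taxon_b": {"class": "Cla_of_Ornithischia", "order": "Ord_Ornithischia"},
    "taxon_c": {"class": "Cla_of_Example", "order": "Ord_Example"},
}
toy_reads = ["taxon_a", "taxon_a", "taxon_b", "taxon_c"]
order_profile = rank_profile(toy_reads, toy_table, "order")
```

Because each row has a value in every rank column, the same call works for any rank without special handling of missing levels.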

Author(s):  
Tetsu Sakamoto ◽  
J. Miguel Ortega

ABSTRACT: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases, since all organisms with sequence accessions deposited in INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, taking advantage of its taxonomic tree can be challenging because (1) some taxonomic ranks are missing in some lineages and (2) some nodes in the tree do not have a taxonomic rank assigned (referred to as “no rank”). To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic tree. The algorithm attempts to assign a taxonomic rank to “no rank” nodes and creates or deletes nodes throughout the tree. It also creates a name for each new node by borrowing the name of its ranked child or, if there is no child, of its ranked parent node. The new hierarchical structure was named taxallnomy, and it contains 33 hierarchical levels corresponding to the 33 taxonomic ranks currently used in the NCBI Taxonomy database. From taxallnomy, users can obtain the complete taxonomic lineage, with 33 nodes, of all taxa available in the NCBI Taxonomy database. Taxallnomy is applicable to several bioinformatics analyses that depend on NCBI Taxonomy data. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles. The Taxallnomy algorithm was written in PERL and all its resources are available at bioinfo.icb.ufmg.br/taxallnomy.

Database URL: http://bioinfo.icb.ufmg.br/taxallnomy
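
The naming rule for created nodes (borrow from the ranked child, otherwise from the ranked parent) can be sketched as follows. This is a simplified Python illustration, not the authors' PERL code: it uses a reduced five-rank list instead of the full set, assumes the step that assigns ranks to “no rank” nodes has already been done, and does not model node deletion. The `_of_` prefix follows the `Cla_of_Ornithischia` style quoted in the first abstract; the `_in_` prefix for names borrowed from a parent is a hypothetical placeholder.

```python
# Reduced rank list and abbreviations: assumptions for illustration only.
RANKS = ["class", "order", "family", "genus", "species"]
ABBR = {"class": "Cla", "order": "Ord", "family": "Fam",
        "genus": "Gen", "species": "Spe"}

def complete_lineage(lineage):
    """Fill every rank in RANKS for one lineage.

    lineage -- dict mapping the ranks that exist to their names.
    Missing ranks get a provisional name borrowed from the nearest
    ranked node below them, or from the nearest one above if none exists.
    """
    if not any(rank in lineage for rank in RANKS):
        raise ValueError("lineage needs at least one ranked node")
    row = {}
    for i, rank in enumerate(RANKS):
        if rank in lineage:
            row[rank] = lineage[rank]
            continue
        # Nearest ranked descendant (lower rank), if any.
        below = next((lineage[r] for r in RANKS[i + 1:] if r in lineage), None)
        if below is not None:
            row[rank] = f"{ABBR[rank]}_of_{below}"
        else:
            # No ranked descendant: fall back to the nearest ranked ancestor.
            above = next(lineage[r] for r in reversed(RANKS[:i]) if r in lineage)
            row[rank] = f"{ABBR[rank]}_in_{above}"
    return row

example_row = complete_lineage(
    {"order": "Ornithischia", "genus": "Brachylophosaurus"}
)
```

Here `example_row` has a value for all five ranks, with the class and family names derived from the ranked nodes beneath them and the species name derived from the genus above it.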


Zootaxa ◽  
2019 ◽  
Vol 4706 (3) ◽  
pp. 401-407 ◽  
Author(s):  
AKHIL GARG ◽  
DETLEF LEIPE ◽  
PETER UETZ

We compared the species names in the Reptile Database, a dedicated taxonomy database, with those in the NCBI taxonomy database, which provides the taxonomic backbone for the GenBank sequence database. About 67% of the known ~11,000 reptile species are represented with at least one DNA sequence and a binary species name in GenBank. However, a common problem arises through the submission of preliminary species names (such as “Pelomedusa sp. A CK-2014”) to GenBank and thus the NCBI taxonomy. These names cannot be assigned to any accepted species names and thus create a disconnect between DNA sequences and species. While these names of unknown taxonomic meaning sometimes get updated, often they remain in GenBank which now contains sequences from ~1,300 such “putative” reptile species tagged by informal names (~15% of its reptile names). We estimate that NCBI/GenBank probably contain tens of thousands of such “disconnected” entries. We encourage sequence submitters to update informal species names after they have been published, otherwise the disconnect will cause increasing confusion and possibly misleading taxonomic conclusions.


Asy-Syari ah ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 69-90
Author(s):  
Maulana Ni'ma Alhizbi

Abstract: Humans as khalifatullah are endowed by Allah with reason and qalbu, so they can differentiate mashlahah (good) from mafsadat (bad). Each person's intellectual capacity, situation, condition, and environment is different, so mashlahah differs from person to person. This study aims to determine the epistemological process by which individual mashlahah becomes communal mashlahah. The research begins with the view that the door of ijtihad is always open to anyone who has the credibility to undertake it, and with the notion that not every issue has been answered by previous fuqaha, and it reaffirms Islam as the religion of rahmatan lil 'aalamiin. The research applied the descriptive philosophical method and used as its data source various books by scholars and experts related to ushul fiqh, qaidah fiqh, and the objective of Islamic law. The results show that, epistemologically, individual welfare can be explored through understanding and analysis of the two sources of Islamic law (the Qur'an and the Sunnah) regarding the ease in carrying out the legal provisions they contain, as stipulated in surah al-Baqarah verses 185 and 286, by using ushul fiqh and fiqh rules. The hadith of the Prophet لا ضرر ولا ضرار is a legal basis that can provide information about individual mashlahah that can be used as a standard for public welfare. By referring to the illat of the law for each person, a law can be applied only to people who share the same illat. If all humans have the same illat, then the law can be applied to all humans.

Keywords: mashlahah, the objective of Islamic law, ushul fiqh, fiqh

Abstrak (translated from Indonesian): Humans as khalifatullah are endowed by Allah with reason and qalbu; reason serves to distinguish what is mashlahah (good) from what is mafsadat (bad). Because each person's intellectual capacity, situation, condition, and environment differ, the mashlahah of one person will differ from that of another. This study aims to describe the epistemological process by which individual welfare becomes general welfare, together with the legal basis underlying it. The study starts from the view that the door of ijtihad is always open to anyone with the credibility to undertake it, and from the assumption that not every present-day issue has been answered by earlier fuqaha; it also holds that Islam must be reconsidered and affirmed as the religion of rahmatan lil 'aalamiin. The research method is descriptive-philosophical, and the data sources are books by scholars and experts in ushul fiqh concerning the Objectives of Islamic Law, Mashlahah, Qaidah Fiqh, and Qaidah Ushul Fiqh. The results show that, epistemologically, individual welfare can be explored through understanding and analysis of the two sources of Islamic law (the Qur'an and the Sunnah) regarding the ease provided in carrying out their legal provisions, as in surah al-Baqarah verses 185 and 286, using the rules of ushul fiqh and of fiqh to explain both. The hadith of the Prophet لا ضرر ولا ضرار is a legal basis that can explain individual welfare, which can serve as a standard for general welfare. By looking at the illat of the law for each individual, a law can be applied only to those who share the same illat. If all humans have the same illat, then the law can be applied to all humans.

Kata kunci (keywords): mashlahah, the objective of Islamic law, ushul fiqh, fiqh


2014 ◽  
Vol 43 (D1) ◽  
pp. D1086-D1098 ◽  
Author(s):  
Scott Federhen

2017 ◽  
Author(s):  
Eneida Hatcher ◽  
Yiming Bao ◽  
Paolo Amedeo ◽  
Olga Blinkova ◽  
Guy Cochrane ◽  
...  

Currently the National Center for Biotechnology Information (NCBI) assigns individual taxonomy identifiers to each distinct influenza virus isolate submitted to GenBank. To support this practice, individual flu isolates must be manually added to the NCBI taxonomy database and unique taxonomy identifiers generated. This added layer of manual processing is unique to influenza virus and prevents automation of the flu sequence submission process. Here we outline a new NCBI policy that normalizes influenza virus taxonomy processing but maintains features supported by the previous approach. This change will reduce the amount of manual handling necessary for flu submissions and pave the way for increased automation of the submissions process. While this automation may disrupt some historic practices, it will better align influenza virus data processing with other viruses and ultimately lower the submission burden on data providers.


Author(s):  
VICENÇ TORRA ◽  
SADAAKI MIYAMOTO

This work introduces an alternative representation for high-dimensional data sets. Instead of using 2D or 3D representations, data are located on the surface of a sphere. Together with this representation, a hierarchical clustering algorithm is defined to analyse and extract the structure of the data. The algorithm builds a hierarchical structure (a dendrogram) in such a way that different cuts of the structure lead to different partitions of the surface of the sphere. This can be seen as a set of concentric spheres, each one of a different granularity. Also, to obtain an initial assignment of the data on the surface of the sphere, a method based on Sammon's mapping has been developed.
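
The idea that different cuts of a dendrogram give partitions of different granularity can be illustrated with a small Python sketch. The abstract does not specify the clustering algorithm, so this uses plain single-linkage clustering with great-circle distance as a stand-in; it assumes the points are already unit vectors on the sphere and omits the Sammon-based placement step. All function names are hypothetical.

```python
import math

def great_circle(p, q):
    """Angle in radians between two unit vectors on the sphere."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))

def single_linkage(points):
    """Return the merge history as (distance, cluster_a, cluster_b) tuples."""
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(great_circle(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

def cut(n_points, merges, threshold):
    """Partition obtained by applying only merges at or below the threshold."""
    label = list(range(n_points))
    for d, a, b in merges:
        if d <= threshold:
            members = a | b
            target = min(label[i] for i in members)
            for i in members:
                label[i] = target
    groups = {}
    for i, lab in enumerate(label):
        groups.setdefault(lab, []).append(i)
    return sorted(groups.values())
```

Raising the threshold passed to `cut` yields coarser partitions of the same points, which corresponds to moving between the concentric levels the abstract describes.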


2011 ◽  
Vol 40 (D1) ◽  
pp. D136-D143 ◽  
Author(s):  
S. Federhen

2021 ◽  
Author(s):  
Adam S Chan ◽  
Wei Jiang ◽  
Emily Blyth ◽  
Jean Yee Hwa Yang ◽  
Ellis Patrick

High-throughput single cell technologies hold the promise of discovering novel cellular relationships with disease and necessitate the use of effective analytical workflows. When manual gating is used to define cell types, the gating hierarchy can be used to identify cell types whose abundances change relative to a parent population. This strategy allows subtle changes to be observed that could be missed if small subsets were compared to all measured cells. However, typical analyses that employ unsupervised clustering overlook the valuable hierarchical structure present in cell type definitions by exclusively quantifying the proportions of cell type clusters relative to all cells. We present treekoR, a framework that facilitates multiple quantifications and comparisons of cell type proportions. Our results from twelve case studies reinforce the importance of quantifying proportions relative to parent populations in the analyses of cytometry data - as failing to do so can lead to missing important biological insights.
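
The difference between quantifying a cell type relative to all cells and relative to its parent population can be shown with a short Python sketch. This is not treekoR's API (treekoR is a separate framework); the hierarchy, the cell-type names, the counts, and the function name `proportions` are hypothetical and serve only to illustrate the two quantities.

```python
from collections import defaultdict

def proportions(parent, leaf_counts):
    """Compute each node's proportion of all cells and of its parent.

    parent      -- dict: node -> parent node (the root has no entry)
    leaf_counts -- dict: leaf cluster -> number of cells
    """
    total_cells = sum(leaf_counts.values())
    if total_cells == 0:
        raise ValueError("no cells to quantify")
    # Propagate each leaf count to all of its ancestors.
    totals = defaultdict(int)
    for leaf, n in leaf_counts.items():
        node = leaf
        while node is not None:
            totals[node] += n
            node = parent.get(node)
    result = {}
    for node, n in totals.items():
        p = parent.get(node)
        if p is None:
            of_parent = 1.0  # the root is its own reference population
        elif totals[p] == 0:
            of_parent = 0.0  # empty parent population
        else:
            of_parent = n / totals[p]
        result[node] = {"of_all": n / total_cells, "of_parent": of_parent}
    return result

# Hypothetical gating hierarchy and counts.
toy_parent = {"CD4 T": "T cells", "CD8 T": "T cells",
              "T cells": "all", "B cells": "all"}
toy_counts = {"CD4 T": 30, "CD8 T": 10, "B cells": 60}
toy_props = proportions(toy_parent, toy_counts)
```

In this toy example the CD8 T cluster is a small share of all cells but a larger share of its parent T-cell population, which is the kind of contrast the abstract argues should be examined.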


2018 ◽  
Vol 41 ◽  
Author(s):  
Duane T. Wegener ◽  
Leandre R. Fabrigar

Abstract: Replications can make theoretical contributions, but are unlikely to do so if their findings are open to multiple interpretations (especially violations of psychometric invariance). Thus, just as studies demonstrating novel effects are often expected to empirically evaluate competing explanations, replications should be held to similar standards. Unfortunately, this is rarely done, thereby undermining the value of replication research.

