annotation error
Recently Published Documents


2021 ◽  
Author(s):  
Megan E. Smithmyer ◽  
Alice E. Wiedeman ◽  
David A. G. Skibinski ◽  
Adam K. Savage ◽  
Carolina Acosta‐Vega ◽  
...  


Author(s):  
Xinyu Gong ◽  
Jialiang Lu ◽  
Yuefu Zhou ◽  
Han Qiu ◽  
Ruan He


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5030 ◽  
Author(s):  
Robert Edgar

Sequencing of the 16S ribosomal RNA (rRNA) gene is widely used to survey microbial communities. Specialized 16S rRNA databases have been developed to support this approach, including Greengenes, RDP and SILVA. Most taxonomy annotations in these databases are predictions from sequence rather than authoritative assignments based on studies of type strains or isolates. In this work, I investigated the taxonomy annotations and guide trees provided by these databases. Using a blinded test, I estimated that the annotation error rate of the RDP database is ∼10%. The branching orders of the Greengenes and SILVA guide trees were found to disagree at comparable rates with each other and with taxonomy annotations according to the training set (authoritative reference) provided by RDP, indicating that the trees have comparable quality. Pervasive conflicts between tree branching order and type strain taxonomies strongly suggest that the guide trees are unreliable guides to phylogeny. I found 249,490 identical sequences with conflicting annotations in SILVA v128 and Greengenes v13.5 at ranks up to phylum (7,804 conflicts), indicating that the annotation error rate in these databases is ∼17%.
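
The idea of identical sequences carrying conflicting annotations in two databases can be pictured with a small sketch. This is an illustration only, not the procedure used in the paper: the rank list, the function name `find_conflicts`, the dictionary layout, and the toy sequences are all assumptions made for the example.

```python
# Hypothetical rank list; real databases may use different or additional ranks.
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]


def find_conflicts(db_a, db_b):
    """Group sequences present in both databases by the ranks at which
    their annotations disagree.

    Each database is assumed to be a dict mapping a sequence string to a
    dict of {rank: taxon name}. Ranks missing from either lineage are
    skipped rather than counted as conflicts.
    """
    conflicts = {rank: [] for rank in RANKS}
    # Only sequences that are identical in both databases can be compared.
    for seq in sorted(db_a.keys() & db_b.keys()):
        lineage_a, lineage_b = db_a[seq], db_b[seq]
        for rank in RANKS:
            name_a = lineage_a.get(rank)
            name_b = lineage_b.get(rank)
            if name_a is not None and name_b is not None and name_a != name_b:
                conflicts[rank].append(seq)
    return conflicts


# Toy data (made up for illustration, not taken from SILVA or Greengenes).
db_a = {
    "ACGT": {"phylum": "Firmicutes", "genus": "Bacillus"},
    "GGCC": {"phylum": "Proteobacteria", "genus": "Escherichia"},
    "TTAA": {"phylum": "Actinobacteria", "genus": "Streptomyces"},
}
db_b = {
    "ACGT": {"phylum": "Firmicutes", "genus": "Paenibacillus"},
    "GGCC": {"phylum": "Bacteroidetes", "genus": "Bacteroides"},
}

conflicts = find_conflicts(db_a, db_b)
for rank in RANKS:
    if conflicts[rank]:
        print(rank, conflicts[rank])
```

A disagreement on an identical sequence means at least one of the two annotations is wrong, which is why such conflicts can serve as evidence of annotation error. The sketch does not say which database is at fault.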



2018 ◽  
Author(s):  
Robert C. Edgar

Sequencing of the 16S ribosomal RNA (rRNA) gene and the fungal Internal Transcribed Spacer (ITS) region is widely used to survey microbial communities. Specialized ribosomal sequence databases have been developed to support this approach, including Greengenes, SILVA and RDP. Most taxonomy annotations in these databases are predictions from sequence rather than authoritative assignments based on studies of type strains or isolates. Here, I investigate the error rates of taxonomy annotations in these databases. I found 253,485 sequences with conflicting annotations in SILVA v128 and Greengenes v13.5 at ranks up to phylum (9,644 conflicts), indicating that the annotation error rate in these databases is ~15%. I found that 34% of non-singleton genera have overlapping subtrees in the Greengenes tree from 2001 according to the RDP taxonomy, most of which are probably due to branching order errors in the Greengenes tree, which is therefore an unreliable guide to phylogeny. Using a blinded test, I estimated that the annotation error rate of the RDP database is ~10%.



2017 ◽  
Vol 46 ◽  
pp. 1-35 ◽  
Author(s):  
Jindřich Matoušek ◽  
Daniel Tihelka


PLoS ONE ◽  
2017 ◽  
Vol 12 (10) ◽  
pp. e0185270 ◽  
Author(s):  
J. Francis Borgio


2017 ◽  
Vol 7 (1) ◽  
Author(s):  
M. E. Engkvist ◽  
E. W. Stratford ◽  
S. Lorenz ◽  
L. A. Meza-Zepeda ◽  
O. Myklebost ◽  
...  




2014 ◽  
Author(s):  
Marilena Di Bari ◽  
Serge Sharoff ◽  
Martin Thomas

