decoy database
Recently Published Documents


Total documents: 13 (five years: 3)
H-index: 6 (five years: 0)

2021, Vol 19 (1)
Author(s): Sangjeong Lee, Heejin Park, Hyunwoo Kim

Abstract. Background: The target-decoy strategy effectively estimates the false discovery rate (FDR) by creating a decoy database of the same size as the target database. Decoy databases are created by various methods, such as the reverse, pseudo-reverse, shuffle, pseudo-shuffle, and de Bruijn methods. The FDR is sometimes over- or underestimated depending on which decoy database is used, because the ratio of redundant peptides differs between target databases; that is, the numbers of unique (non-redundant) peptides in the target and decoy databases differ. Results: We used two protein databases (the UniProt Saccharomyces cerevisiae protein database and the UniProt human protein database) to compare the FDRs of various decoy databases. When the ratio of redundant peptides in the target database is low, the FDR is not overestimated by any decoy construction method. However, when the ratio of redundant peptides in the target database is high, the FDR is overestimated if the (pseudo-)shuffle decoy database is used. Human and S. cerevisiae six-frame translation databases, which are large, showed outcomes similar to those from the UniProt human protein database. Conclusion: The FDR must be estimated using the correction factor proposed by Elias and Gygi or that proposed by Kim et al. when (pseudo-)shuffle decoy databases are used.
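The decoy construction methods named in this abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the digestion rule is a simplified trypsin rule (cleave after K or R, ignoring the proline exception), and the pseudo-shuffle and de Bruijn methods are omitted.

```python
import random


def tryptic_peptides(protein):
    """Split a protein after every K or R (simplified trypsin rule, an assumption)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # trailing peptide without K/R
    return peptides


def reverse_decoy(protein):
    """Reverse method: reverse the whole protein sequence."""
    return protein[::-1]


def pseudo_reverse_decoy(protein):
    """Pseudo-reverse method: reverse each peptide but keep its C-terminal K/R in place."""
    parts = []
    for peptide in tryptic_peptides(protein):
        if peptide[-1] in "KR":
            parts.append(peptide[:-1][::-1] + peptide[-1])
        else:
            parts.append(peptide[::-1])
    return "".join(parts)


def shuffle_decoy(protein, rng):
    """Shuffle method: randomly permute the residues of the protein."""
    residues = list(protein)
    rng.shuffle(residues)
    return "".join(residues)


def unique_peptides(proteins):
    """Set of distinct peptides obtained by digesting every protein in the list."""
    return {pep for protein in proteins for pep in tryptic_peptides(protein)}


if __name__ == "__main__":
    # Toy target database with a fully redundant pair of proteins.
    targets = ["PEPTIDEKAAGR", "PEPTIDEKAAGR"]
    rng = random.Random(0)
    pseudo_reversed = [pseudo_reverse_decoy(p) for p in targets]
    shuffled = [shuffle_decoy(p, rng) for p in targets]
    print("unique target peptides:        ", len(unique_peptides(targets)))
    print("unique pseudo-reverse peptides:", len(unique_peptides(pseudo_reversed)))
    print("unique shuffled peptides:      ", len(unique_peptides(shuffled)))
```

In this toy case the deterministic pseudo-reverse method maps identical target proteins to identical decoys, so the number of unique peptides is preserved, whereas independent shuffles of identical proteins will usually produce different decoy peptides. That mismatch in unique peptide counts is the effect the abstract links to FDR overestimation.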


2018
Author(s): Uri Keich, Kaipo Tamura, William Stafford Noble

Abstract. Decoy database search with target-decoy competition (TDC) provides an intuitive, easy-to-implement method for estimating the false discovery rate (FDR) associated with spectrum identifications from shotgun proteomics data. However, the procedure can yield different results for a fixed dataset analyzed with different decoy databases, and this decoy-induced variability is particularly problematic for smaller FDR thresholds, datasets, or databases. In such cases, the nominal FDR might be 1% but the true proportion of false discoveries might be 10%. The averaged TDC (aTDC) protocol combats this problem by exploiting multiple independently shuffled decoy databases to provide an FDR estimate with reduced variability. We provide a tutorial introduction to aTDC, describe an improved variant of the protocol that offers increased statistical power, and discuss how to deploy aTDC in practice using the Crux software toolkit.
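The decoy-induced variability described in this abstract can be seen in a small Python sketch. It is a deliberately simplified illustration, not the published aTDC protocol and not the Crux implementation: the function names and scores are hypothetical, ties are assumed to go to the target, no +1 correction is applied, and the "averaged" estimate is a plain mean of per-decoy estimates.

```python
def tdc_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR at a score threshold with target-decoy competition.

    target_scores[i] and decoy_scores[i] are the best target and best decoy
    match scores for spectrum i (higher is better). Each spectrum keeps only
    the better of the two; the estimate is decoy wins / target wins among
    the winners that reach the threshold.
    """
    target_wins = 0
    decoy_wins = 0
    for t, d in zip(target_scores, decoy_scores):
        if max(t, d) < threshold:
            continue  # winning match does not pass the threshold
        if t >= d:  # assumption: ties are awarded to the target
            target_wins += 1
        else:
            decoy_wins += 1
    return decoy_wins / max(target_wins, 1)


def mean_fdr_over_decoys(target_scores, decoy_score_sets, threshold):
    """Plain mean of the TDC estimates from several decoy databases (simplification)."""
    estimates = [tdc_fdr(target_scores, d, threshold) for d in decoy_score_sets]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Hypothetical scores for four spectra and two shuffled decoy databases.
    targets = [10, 8, 6, 4]
    decoys_a = [5, 9, 3, 2]
    decoys_b = [5, 2, 3, 2]
    print(tdc_fdr(targets, decoys_a, threshold=5))
    print(tdc_fdr(targets, decoys_b, threshold=5))
    print(mean_fdr_over_decoys(targets, [decoys_a, decoys_b], threshold=5))
```

With so few spectra, a single decoy match flips the estimate, which is the kind of variability that using multiple decoy databases is meant to reduce.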


2016, Vol 22 (1), pp. 56-60
Author(s): Honglan Li, Duanhui Liu, Kiwook Lee, Kyu-Baek Hwang

2015, Vol 22 (9), pp. 823-836
Author(s): Hsin-Yi (Cindy) Yeh, Aaron Lindsey, Chih-Peng Wu, Shawna Thomas, Nancy M. Amato
