Mass spectrometrists should search for all peptides, but assess only the ones they care about

2017 ◽  
Author(s):  
Adriaan Sticker ◽  
Lennart Martens ◽  
Lieven Clement

Abstract In shotgun proteomics, identified mass spectra that are deemed irrelevant to the scientific hypothesis are often discarded. Noble (2015) therefore urged researchers to remove irrelevant peptides from the database prior to searching in order to improve statistical power. Here, however, we argue that both the classical method and Noble’s revised method produce suboptimal peptide identifications and have problems controlling the false discovery rate (FDR). Instead, we show that searching for all expected peptides and removing irrelevant peptides prior to FDR calculation yields more reliable identifications at a controlled FDR level than either the classical strategy, which discards irrelevant peptides after FDR calculation, or Noble’s strategy, which discards irrelevant peptides prior to searching.
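To make the contrast concrete, the sketch below (plain Python, with a hypothetical PSM record and a simple decoy-based FDR estimate standing in for a real search engine's output) compares the classical strategy of filtering irrelevant peptides after FDR control with the strategy advocated here of restricting to relevant peptides before the FDR calculation.

```python
# Minimal sketch, not the authors' code: search against the full database,
# then drop irrelevant peptides *before* estimating the FDR by target-decoy
# competition. PSM fields and the FDR estimator are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class PSM:
    score: float       # search-engine score, higher is better
    is_decoy: bool     # True if the best match was a decoy peptide
    is_relevant: bool  # True if the peptide belongs to the hypothesis of interest

def fdr_filter(psms, fdr_threshold=0.01):
    """Accept target PSMs at the given decoy-estimated FDR threshold."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to accept
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if decoys / max(targets, 1) <= fdr_threshold:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]

def recommended_strategy(all_psms, fdr_threshold=0.01):
    # Restrict to relevant peptides first, then control the FDR on that subset.
    relevant = [p for p in all_psms if p.is_relevant]
    return fdr_filter(relevant, fdr_threshold)

def classical_strategy(all_psms, fdr_threshold=0.01):
    # Control the FDR on everything, then discard irrelevant identifications.
    return [p for p in fdr_filter(all_psms, fdr_threshold) if p.is_relevant]
```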

2018 ◽  
Author(s):  
Uri Keich ◽  
Kaipo Tamura ◽  
William Stafford Noble

Abstract Decoy database search with target-decoy competition (TDC) provides an intuitive, easy-to-implement method for estimating the false discovery rate (FDR) associated with spectrum identifications from shotgun proteomics data. However, the procedure can yield different results for a fixed dataset analyzed with different decoy databases, and this decoy-induced variability is particularly problematic for smaller FDR thresholds, datasets, or databases. In such cases, the nominal FDR might be 1% but the true proportion of false discoveries might be 10%. The averaged TDC (aTDC) protocol combats this problem by exploiting multiple independently shuffled decoy databases to provide an FDR estimate with reduced variability. We provide a tutorial introduction to aTDC, describe an improved variant of the protocol that offers increased statistical power, and discuss how to deploy aTDC in practice using the Crux software toolkit.
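As a rough illustration of the idea (not the Crux implementation, and a simplification of the actual aTDC protocol), the following Python sketch estimates an FDR by target-decoy competition and then averages that estimate over several independently shuffled decoy score sets; the input arrays are assumed to hold the best target and decoy scores per spectrum.

```python
# Illustrative sketch of target-decoy competition (TDC) and of averaging over
# several shuffled decoy databases. The real aTDC protocol is more involved;
# the input layout here is an assumption for the example.

import numpy as np

def tdc_counts(target_scores, decoy_scores, threshold):
    """For each spectrum, keep the better of its target and decoy match,
    then count accepted targets and decoys above a score threshold."""
    target_scores = np.asarray(target_scores, dtype=float)
    decoy_scores = np.asarray(decoy_scores, dtype=float)
    target_wins = target_scores >= decoy_scores
    winning = np.where(target_wins, target_scores, decoy_scores)
    accepted = winning >= threshold
    n_targets = int(np.sum(accepted & target_wins))
    n_decoys = int(np.sum(accepted & ~target_wins))
    return n_targets, n_decoys

def tdc_fdr(target_scores, decoy_scores, threshold):
    n_targets, n_decoys = tdc_counts(target_scores, decoy_scores, threshold)
    return (n_decoys + 1) / max(n_targets, 1)  # +1 gives a conservative estimate

def averaged_tdc_fdr(target_scores, decoy_score_sets, threshold):
    """Average the decoy-based estimate over several independently shuffled
    decoy databases to reduce decoy-induced variability."""
    estimates = [tdc_fdr(target_scores, d, threshold) for d in decoy_score_sets]
    return float(np.mean(estimates))
```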


2006 ◽  
Vol 45 (9) ◽  
pp. 1181-1189 ◽  
Author(s):  
D. S. Wilks

Abstract The conventional approach to evaluating the joint statistical significance of multiple hypothesis tests (i.e., “field,” or “global,” significance) in meteorology and climatology is to count the number of individual (or “local”) tests yielding nominally significant results and then to judge the unusualness of this integer value in the context of the distribution of such counts that would occur if all local null hypotheses were true. The sensitivity (i.e., statistical power) of this approach is potentially compromised both by the discrete nature of the test statistic and by the fact that the approach ignores the confidence with which locally significant tests reject their null hypotheses. An alternative global test statistic that has neither of these problems is the minimum p value among all of the local tests. Evaluation of field significance using the minimum local p value as the global test statistic, which is also known as the Walker test, has strong connections to the joint evaluation of multiple tests in a way that controls the “false discovery rate” (FDR, or the expected fraction of local null hypothesis rejections that are incorrect). In particular, using the minimum local p value to evaluate field significance at a level α_global is nearly equivalent to the slightly more powerful global test based on the FDR criterion. An additional advantage shared by Walker’s test and the FDR approach is that both are robust to spatial dependence within the field of tests. The FDR method not only provides a more broadly applicable and generally more powerful field significance test than the conventional counting procedure but also allows better identification of locations with significant differences, because fewer than α_global × 100% (on average) of apparently significant local tests will have resulted from local null hypotheses that are true.
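The two global tests can be written down in a few lines. The Python sketch below implements Walker's minimum-p-value criterion and the Benjamini-Hochberg step-up procedure; the example p values at the end are illustrative only.

```python
# Hedged sketch contrasting Walker's minimum-p-value test with the
# Benjamini-Hochberg FDR criterion for field (global) significance.

import numpy as np

def walker_test(p_values, alpha_global=0.05):
    """Reject the global null if the smallest local p value satisfies
    p_min <= 1 - (1 - alpha_global)**(1/K)."""
    p = np.asarray(p_values, dtype=float)
    K = p.size
    p_walker = 1.0 - (1.0 - alpha_global) ** (1.0 / K)
    return p.min() <= p_walker

def bh_fdr_rejections(p_values, alpha_global=0.05):
    """Benjamini-Hochberg step-up: boolean mask of local tests rejected
    while controlling the false discovery rate at alpha_global."""
    p = np.asarray(p_values, dtype=float)
    K = p.size
    order = np.argsort(p)
    thresholds = alpha_global * (np.arange(1, K + 1) / K)
    below = p[order] <= thresholds
    reject = np.zeros(K, dtype=bool)
    if below.any():
        k_max = np.nonzero(below)[0].max()   # largest rank meeting the criterion
        reject[order[: k_max + 1]] = True
    return reject

# Field significance under the FDR criterion: the field is significant
# if at least one local test is rejected.
p_local = np.array([0.001, 0.012, 0.04, 0.2, 0.6])
print(walker_test(p_local), bh_fdr_rejections(p_local).any())
```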


2016 ◽  
Vol 16 (2) ◽  
pp. 393-397 ◽  
Author(s):  
Lev I. Levitsky ◽  
Mark V. Ivanov ◽  
Anna A. Lobas ◽  
Mikhail V. Gorshkov

2010 ◽  
Vol 24 (4) ◽  
pp. 454-462 ◽  
Author(s):  
Anton A. Goloborodko ◽  
Corina Mayerhofer ◽  
Alexander R. Zubarev ◽  
Irina A. Tarasova ◽  
Alexander V. Gorshkov ◽  
...  

mSystems ◽  
2017 ◽  
Vol 2 (6) ◽  
Author(s):  
Lingjing Jiang ◽  
Amnon Amir ◽  
James T. Morton ◽  
Ruth Heller ◽  
Ery Arias-Castro ◽  
...  

ABSTRACT Differential abundance testing is a critical task in microbiome studies that is complicated by the sparsity of data matrices. Here we adapt for microbiome studies a solution from the field of gene expression analysis to produce a new method, discrete false-discovery rate (DS-FDR), that greatly improves the power to detect differential taxa by exploiting the discreteness of the data. Additionally, DS-FDR is relatively robust to the number of noninformative features, and thus removes the problem of filtering taxonomy tables by an arbitrary abundance threshold. We show by using a combination of simulations and reanalysis of nine real-world microbiome data sets that this new method outperforms existing methods at the differential abundance testing task, producing a false-discovery rate that is up to threefold more accurate and halving the number of samples required to find a given difference (thus increasing the efficiency of microbiome experiments considerably). We therefore expect DS-FDR to be widely applied in microbiome studies.

IMPORTANCE DS-FDR can achieve higher statistical power to detect significant findings in sparse and noisy microbiome data compared to the commonly used Benjamini-Hochberg procedure and other FDR-controlling procedures.
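The core idea of a discrete, permutation-based FDR estimate can be sketched as follows; the test statistic (absolute difference of group means) and the data layout are assumptions for the illustration rather than the authors' exact implementation.

```python
# Simplified sketch of a permutation-based discrete FDR estimate in the
# spirit of DS-FDR. Not the published implementation.

import numpy as np

def ds_fdr(data, labels, alpha=0.1, n_perm=1000, rng=None):
    """data: taxa x samples abundance matrix; labels: 0/1 group per sample.
    Returns a boolean mask of taxa called differentially abundant."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)

    def stat(lab):
        # Absolute difference of group means per taxon (illustrative statistic).
        return np.abs(data[:, lab == 1].mean(axis=1) - data[:, lab == 0].mean(axis=1))

    observed = stat(labels)
    permuted = np.stack([stat(rng.permutation(labels)) for _ in range(n_perm)])

    # Scan candidate thresholds (the observed statistics themselves) and keep
    # the smallest threshold whose estimated FDR stays below alpha.
    best_t = np.inf
    for t in np.sort(observed):
        n_obs = np.sum(observed >= t)
        n_null = permuted[permuted >= t].size / n_perm  # average over permutations
        if n_obs > 0 and n_null / n_obs <= alpha:
            best_t = min(best_t, t)
    return observed >= best_t
```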


2019 ◽  
Vol 29 (3) ◽  
pp. 866-878 ◽  
Author(s):  
Piero Quatto ◽  
Nicolò Margaritella ◽  
Isa Costantini ◽  
Francesca Baglio ◽  
Massimo Garegnani ◽  
...  

Brain functional connectivity is a widely investigated topic in neuroscience. In recent years, the study of brain connectivity has been largely aided by graph theory. The link between time series recorded at multiple locations in the brain and the construction of a graph is usually an adjacency matrix. The latter converts a measure of the connectivity between two time series, typically a correlation coefficient, into a binary choice on whether the two brain locations are functionally connected or not. As a result, the choice of a threshold τ over the correlation coefficient is key. In the present work, we propose a multiple testing approach to the choice of τ that uses the Bayes false discovery rate and a new estimator of statistical power called the average power function to balance the two types of statistical error. We show that the proposed average power function estimator behaves well both under independence and under weak dependence of the tests, and that it is reliable under several simulated dependence conditions. Moreover, we propose a robust method for the choice of τ that uses the 5% and 95% percentiles of the average power function and false discovery rate bootstrap distributions, respectively, to improve stability. We applied our approach to functional magnetic resonance imaging and high-density electroencephalogram data.
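For orientation, the sketch below shows where a threshold on pairwise correlations enters the construction of a binary adjacency matrix; it uses a plain Benjamini-Hochberg cut on correlation p values rather than the Bayes-FDR and average-power-function criterion proposed above, so it should be read as background illustration only.

```python
# Rough sketch: turn multi-region time series into a 0/1 adjacency matrix by
# thresholding pairwise correlations under a Benjamini-Hochberg FDR cut.
# Not the authors' Bayes-FDR / average-power-function method.

import numpy as np
from scipy import stats

def adjacency_from_timeseries(ts, alpha=0.05):
    """ts: samples x regions array; returns a symmetric 0/1 adjacency matrix
    with edges whose correlations survive BH at level alpha."""
    n, p = ts.shape
    r = np.corrcoef(ts, rowvar=False)
    iu = np.triu_indices(p, k=1)
    # t statistic and two-sided p value for each pairwise correlation
    t = r[iu] * np.sqrt((n - 2) / (1 - r[iu] ** 2))
    pvals = 2 * stats.t.sf(np.abs(t), df=n - 2)
    # Benjamini-Hochberg step-up over all pairs
    order = np.argsort(pvals)
    thresh = alpha * (np.arange(1, pvals.size + 1) / pvals.size)
    passed = pvals[order] <= thresh
    reject = np.zeros(pvals.size, dtype=bool)
    if passed.any():
        reject[order[: np.nonzero(passed)[0].max() + 1]] = True
    A = np.zeros((p, p), dtype=int)
    A[iu] = reject.astype(int)
    return A + A.T
```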


Genetics ◽  
1998 ◽  
Vol 150 (4) ◽  
pp. 1699-1706 ◽  
Author(s):  
Joel Ira Weller ◽  
Jiu Zhou Song ◽  
David W Heyen ◽  
Harris A Lewin ◽  
Micha Ron

Abstract Saturated genetic marker maps are being used to map individual genes affecting quantitative traits. Controlling the “experimentwise” type-I error severely lowers power to detect segregating loci. For preliminary genome scans, we propose controlling the “false discovery rate,” that is, the expected proportion of true null hypotheses within the class of rejected null hypotheses. Examples are given based on a granddaughter design analysis of dairy cattle and simulated backcross populations. When the false discovery rate is controlled, the power to detect true effects does not depend on the number of tests performed. If no detectable genes are segregating, controlling the false discovery rate is equivalent to controlling the experimentwise error rate. If quantitative loci are segregating in the population, statistical power is increased compared to control of the experimentwise type-I error. The difference between the two criteria increases with the number of false null hypotheses. The false discovery rate can be controlled at the same level whether the complete genome or only part of it has been analyzed. Additional levels of contrasts, such as multiple traits or pedigrees, can be handled without the necessity of a proportional decrease in the critical test probability.
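A toy comparison (simulated p values, not the paper's granddaughter-design data) illustrates why controlling the FDR rejects more markers than controlling the experimentwise error when several loci truly segregate.

```python
# Toy illustration of experimentwise (Bonferroni) versus FDR (Benjamini-Hochberg)
# control on a simulated genome scan; all numbers here are made up for the example.

import numpy as np

rng = np.random.default_rng(0)
p_null = rng.uniform(size=180)             # markers with no linked QTL
p_true = rng.uniform(high=0.002, size=20)  # markers near segregating QTL
p = np.concatenate([p_null, p_true])
m = p.size

bonferroni = p <= 0.05 / m                 # experimentwise type-I error control

order = np.argsort(p)                      # Benjamini-Hochberg FDR control
passed = p[order] <= 0.05 * np.arange(1, m + 1) / m
bh = np.zeros(m, dtype=bool)
if passed.any():
    bh[order[: np.nonzero(passed)[0].max() + 1]] = True

print(f"Bonferroni rejections: {bonferroni.sum()}, BH rejections: {bh.sum()}")
```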

