Mass spectrometrists should search for all peptides, but assess only the ones they care about

2017 ◽  
Author(s):  
Adriaan Sticker ◽  
Lennart Martens ◽  
Lieven Clement

Abstract In shotgun proteomics, identified mass spectra that are deemed irrelevant to the scientific hypothesis are often discarded. Noble (2015) therefore urged researchers to remove irrelevant peptides from the database prior to searching in order to improve statistical power. Here, however, we argue that both the classical method and Noble’s revised method produce suboptimal peptide identifications and have problems controlling the false discovery rate (FDR). Instead, we show that searching for all expected peptides and removing irrelevant peptides prior to FDR calculation yields more reliable identifications at a controlled FDR level than either the classical strategy, which discards irrelevant peptides after FDR calculation, or Noble’s strategy, which discards irrelevant peptides prior to searching.
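To make the contrast concrete, the sketch below (plain Python, with a hypothetical PSM record and a simple decoy-based FDR estimate standing in for a real search engine's output) compares the classical strategy of filtering irrelevant peptides after FDR control with the strategy advocated here of restricting to relevant peptides before the FDR calculation.

```python
# Minimal sketch, not the authors' code: search against the full database,
# then drop irrelevant peptides *before* estimating the FDR by target-decoy
# competition. PSM fields and the FDR estimator are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class PSM:
    score: float       # search-engine score, higher is better
    is_decoy: bool     # True if the best match was a decoy peptide
    is_relevant: bool  # True if the peptide belongs to the hypothesis of interest

def fdr_filter(psms, fdr_threshold=0.01):
    """Accept target PSMs at the given decoy-estimated FDR threshold."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to accept
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if decoys / max(targets, 1) <= fdr_threshold:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]

def recommended_strategy(all_psms, fdr_threshold=0.01):
    # Restrict to relevant peptides first, then control the FDR on that subset.
    relevant = [p for p in all_psms if p.is_relevant]
    return fdr_filter(relevant, fdr_threshold)

def classical_strategy(all_psms, fdr_threshold=0.01):
    # Control the FDR on everything, then discard irrelevant identifications.
    return [p for p in fdr_filter(all_psms, fdr_threshold) if p.is_relevant]
```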

2018 ◽  
Author(s):  
Uri Keich ◽  
Kaipo Tamura ◽  
William Stafford Noble

Abstract Decoy database search with target-decoy competition (TDC) provides an intuitive, easy-to-implement method for estimating the false discovery rate (FDR) associated with spectrum identifications from shotgun proteomics data. However, the procedure can yield different results for a fixed dataset analyzed with different decoy databases, and this decoy-induced variability is particularly problematic for smaller FDR thresholds, datasets, or databases. In such cases, the nominal FDR might be 1% but the true proportion of false discoveries might be 10%. The averaged TDC (aTDC) protocol combats this problem by exploiting multiple independently shuffled decoy databases to provide an FDR estimate with reduced variability. We provide a tutorial introduction to aTDC, describe an improved variant of the protocol that offers increased statistical power, and discuss how to deploy aTDC in practice using the Crux software toolkit.
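As a rough illustration of the idea (not the Crux implementation, and a simplification of the actual aTDC protocol), the following Python sketch estimates an FDR by target-decoy competition and then averages that estimate over several independently shuffled decoy score sets; the input arrays are assumed to hold the best target and decoy scores per spectrum.

```python
# Illustrative sketch of target-decoy competition (TDC) and of averaging over
# several shuffled decoy databases. The real aTDC protocol is more involved;
# the input layout here is an assumption for the example.

import numpy as np

def tdc_counts(target_scores, decoy_scores, threshold):
    """For each spectrum, keep the better of its target and decoy match,
    then count accepted targets and decoys above a score threshold."""
    target_scores = np.asarray(target_scores, dtype=float)
    decoy_scores = np.asarray(decoy_scores, dtype=float)
    target_wins = target_scores >= decoy_scores
    winning = np.where(target_wins, target_scores, decoy_scores)
    accepted = winning >= threshold
    n_targets = int(np.sum(accepted & target_wins))
    n_decoys = int(np.sum(accepted & ~target_wins))
    return n_targets, n_decoys

def tdc_fdr(target_scores, decoy_scores, threshold):
    n_targets, n_decoys = tdc_counts(target_scores, decoy_scores, threshold)
    return (n_decoys + 1) / max(n_targets, 1)  # +1 gives a conservative estimate

def averaged_tdc_fdr(target_scores, decoy_score_sets, threshold):
    """Average the decoy-based estimate over several independently shuffled
    decoy databases to reduce decoy-induced variability."""
    estimates = [tdc_fdr(target_scores, d, threshold) for d in decoy_score_sets]
    return float(np.mean(estimates))
```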


2006 ◽  
Vol 45 (9) ◽  
pp. 1181-1189 ◽  
Author(s):  
D. S. Wilks

Abstract The conventional approach to evaluating the joint statistical significance of multiple hypothesis tests (i.e., “field,” or “global,” significance) in meteorology and climatology is to count the number of individual (or “local”) tests yielding nominally significant results and then to judge the unusualness of this integer value in the context of the distribution of such counts that would occur if all local null hypotheses were true. The sensitivity (i.e., statistical power) of this approach is potentially compromised both by the discrete nature of the test statistic and by the fact that the approach ignores the confidence with which locally significant tests reject their null hypotheses. An alternative global test statistic that has neither of these problems is the minimum p value among all of the local tests. Evaluation of field significance using the minimum local p value as the global test statistic, which is also known as the Walker test, has strong connections to the joint evaluation of multiple tests in a way that controls the “false discovery rate” (FDR, or the expected fraction of local null hypothesis rejections that are incorrect). In particular, using the minimum local p value to evaluate field significance at a level α_global is nearly equivalent to the slightly more powerful global test based on the FDR criterion. An additional advantage shared by Walker’s test and the FDR approach is that both are robust to spatial dependence within the field of tests. The FDR method not only provides a more broadly applicable and generally more powerful field significance test than the conventional counting procedure but also allows better identification of locations with significant differences, because fewer than α_global × 100% (on average) of apparently significant local tests will have resulted from local null hypotheses that are true.
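The two global tests can be written down in a few lines. The Python sketch below implements Walker's minimum-p-value criterion and the Benjamini-Hochberg step-up procedure; the example p values at the end are illustrative only.

```python
# Hedged sketch contrasting Walker's minimum-p-value test with the
# Benjamini-Hochberg FDR criterion for field (global) significance.

import numpy as np

def walker_test(p_values, alpha_global=0.05):
    """Reject the global null if the smallest local p value satisfies
    p_min <= 1 - (1 - alpha_global)**(1/K)."""
    p = np.asarray(p_values, dtype=float)
    K = p.size
    p_walker = 1.0 - (1.0 - alpha_global) ** (1.0 / K)
    return p.min() <= p_walker

def bh_fdr_rejections(p_values, alpha_global=0.05):
    """Benjamini-Hochberg step-up: boolean mask of local tests rejected
    while controlling the false discovery rate at alpha_global."""
    p = np.asarray(p_values, dtype=float)
    K = p.size
    order = np.argsort(p)
    thresholds = alpha_global * (np.arange(1, K + 1) / K)
    below = p[order] <= thresholds
    reject = np.zeros(K, dtype=bool)
    if below.any():
        k_max = np.nonzero(below)[0].max()   # largest rank meeting the criterion
        reject[order[: k_max + 1]] = True
    return reject

# Field significance under the FDR criterion: the field is significant
# if at least one local test is rejected.
p_local = np.array([0.001, 0.012, 0.04, 0.2, 0.6])
print(walker_test(p_local), bh_fdr_rejections(p_local).any())
```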


2016 ◽  
Vol 16 (2) ◽  
pp. 393-397 ◽  
Author(s):  
Lev I. Levitsky ◽  
Mark V. Ivanov ◽  
Anna A. Lobas ◽  
Mikhail V. Gorshkov

2010 ◽  
Vol 24 (4) ◽  
pp. 454-462 ◽  
Author(s):  
Anton A. Goloborodko ◽  
Corina Mayerhofer ◽  
Alexander R. Zubarev ◽  
Irina A. Tarasova ◽  
Alexander V. Gorshkov ◽  
...  

mSystems ◽  
2017 ◽  
Vol 2 (6) ◽  
Author(s):  
Lingjing Jiang ◽  
Amnon Amir ◽  
James T. Morton ◽  
Ruth Heller ◽  
Ery Arias-Castro ◽  
...  

ABSTRACT Differential abundance testing is a critical task in microbiome studies that is complicated by the sparsity of data matrices. Here we adapt for microbiome studies a solution from the field of gene expression analysis to produce a new method, discrete false-discovery rate (DS-FDR), that greatly improves the power to detect differential taxa by exploiting the discreteness of the data. Additionally, DS-FDR is relatively robust to the number of noninformative features, and thus removes the problem of filtering taxonomy tables by an arbitrary abundance threshold. We show by using a combination of simulations and reanalysis of nine real-world microbiome data sets that this new method outperforms existing methods at the differential abundance testing task, producing a false-discovery rate that is up to threefold more accurate and halving the number of samples required to find a given difference (thus increasing the efficiency of microbiome experiments considerably). We therefore expect DS-FDR to be widely applied in microbiome studies.

IMPORTANCE DS-FDR can achieve higher statistical power to detect significant findings in sparse and noisy microbiome data compared to the commonly used Benjamini-Hochberg procedure and other FDR-controlling procedures.
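The core idea of a discrete, permutation-based FDR estimate can be sketched as follows; the test statistic (absolute difference of group means) and the data layout are assumptions for the illustration rather than the authors' exact implementation.

```python
# Simplified sketch of a permutation-based discrete FDR estimate in the
# spirit of DS-FDR. Not the published implementation.

import numpy as np

def ds_fdr(data, labels, alpha=0.1, n_perm=1000, rng=None):
    """data: taxa x samples abundance matrix; labels: 0/1 group per sample.
    Returns a boolean mask of taxa called differentially abundant."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)

    def stat(lab):
        # Absolute difference of group means per taxon (illustrative statistic).
        return np.abs(data[:, lab == 1].mean(axis=1) - data[:, lab == 0].mean(axis=1))

    observed = stat(labels)
    permuted = np.stack([stat(rng.permutation(labels)) for _ in range(n_perm)])

    # Scan candidate thresholds (the observed statistics themselves) and keep
    # the smallest threshold whose estimated FDR stays below alpha.
    best_t = np.inf
    for t in np.sort(observed):
        n_obs = np.sum(observed >= t)
        n_null = permuted[permuted >= t].size / n_perm  # average over permutations
        if n_obs > 0 and n_null / n_obs <= alpha:
            best_t = min(best_t, t)
    return observed >= best_t
```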


2019 ◽  
Vol 29 (3) ◽  
pp. 866-878 ◽  
Author(s):  
Piero Quatto ◽  
Nicolò Margaritella ◽  
Isa Costantini ◽  
Francesca Baglio ◽  
Massimo Garegnani ◽  
...  

Brain functional connectivity is a widely investigated topic in neuroscience. In recent years, the study of brain connectivity has been largely aided by graph theory. The link between time series recorded at multiple locations in the brain and the construction of a graph is usually an adjacency matrix. The latter converts a measure of the connectivity between two time series, typically a correlation coefficient, into a binary choice on whether the two brain locations are functionally connected or not. As a result, the choice of a threshold τ over the correlation coefficient is key. In the present work, we propose a multiple testing approach to the choice of τ that uses the Bayes false discovery rate and a new estimator of statistical power called the average power function to balance the two types of statistical error. We show that the proposed average power function estimator behaves well both under independence and under weak dependence of the tests, and that it is reliable under several simulated dependence conditions. Moreover, we propose a robust method for the choice of τ that uses the 5% and 95% percentiles of the average power function and false discovery rate bootstrap distributions, respectively, to improve stability. We applied our approach to functional magnetic resonance imaging and high-density electroencephalogram data.
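For orientation, the sketch below shows where a threshold on pairwise correlations enters the construction of a binary adjacency matrix; it uses a plain Benjamini-Hochberg cut on correlation p values rather than the Bayes-FDR and average-power-function criterion proposed above, so it should be read as background illustration only.

```python
# Rough sketch: turn multi-region time series into a 0/1 adjacency matrix by
# thresholding pairwise correlations under a Benjamini-Hochberg FDR cut.
# Not the authors' Bayes-FDR / average-power-function method.

import numpy as np
from scipy import stats

def adjacency_from_timeseries(ts, alpha=0.05):
    """ts: samples x regions array; returns a symmetric 0/1 adjacency matrix
    with edges whose correlations survive BH at level alpha."""
    n, p = ts.shape
    r = np.corrcoef(ts, rowvar=False)
    iu = np.triu_indices(p, k=1)
    # t statistic and two-sided p value for each pairwise correlation
    t = r[iu] * np.sqrt((n - 2) / (1 - r[iu] ** 2))
    pvals = 2 * stats.t.sf(np.abs(t), df=n - 2)
    # Benjamini-Hochberg step-up over all pairs
    order = np.argsort(pvals)
    thresh = alpha * (np.arange(1, pvals.size + 1) / pvals.size)
    passed = pvals[order] <= thresh
    reject = np.zeros(pvals.size, dtype=bool)
    if passed.any():
        reject[order[: np.nonzero(passed)[0].max() + 1]] = True
    A = np.zeros((p, p), dtype=int)
    A[iu] = reject.astype(int)
    return A + A.T
```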


Genetics ◽  
1998 ◽  
Vol 150 (4) ◽  
pp. 1699-1706 ◽  
Author(s):  
Joel Ira Weller ◽  
Jiu Zhou Song ◽  
David W Heyen ◽  
Harris A Lewin ◽  
Micha Ron

Abstract Saturated genetic marker maps are being used to map individual genes affecting quantitative traits. Controlling the “experimentwise” type-I error severely lowers power to detect segregating loci. For preliminary genome scans, we propose controlling the “false discovery rate,” that is, the expected proportion of true null hypotheses within the class of rejected null hypotheses. Examples are given based on a granddaughter design analysis of dairy cattle and simulated backcross populations. When the false discovery rate is controlled, the power to detect true effects does not depend on the number of tests performed. If no detectable genes are segregating, controlling the false discovery rate is equivalent to controlling the experimentwise error rate. If quantitative loci are segregating in the population, statistical power is increased compared to control of the experimentwise type-I error. The difference between the two criteria increases with the number of false null hypotheses. The false discovery rate can be controlled at the same level whether the complete genome or only part of it has been analyzed. Additional levels of contrasts, such as multiple traits or pedigrees, can be handled without the necessity of a proportional decrease in the critical test probability.
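A toy comparison (simulated p values, not the paper's granddaughter-design data) illustrates why controlling the FDR rejects more markers than controlling the experimentwise error when several loci truly segregate.

```python
# Toy illustration of experimentwise (Bonferroni) versus FDR (Benjamini-Hochberg)
# control on a simulated genome scan; all numbers here are made up for the example.

import numpy as np

rng = np.random.default_rng(0)
p_null = rng.uniform(size=180)             # markers with no linked QTL
p_true = rng.uniform(high=0.002, size=20)  # markers near segregating QTL
p = np.concatenate([p_null, p_true])
m = p.size

bonferroni = p <= 0.05 / m                 # experimentwise type-I error control

order = np.argsort(p)                      # Benjamini-Hochberg FDR control
passed = p[order] <= 0.05 * np.arange(1, m + 1) / m
bh = np.zeros(m, dtype=bool)
if passed.any():
    bh[order[: np.nonzero(passed)[0].max() + 1]] = True

print(f"Bonferroni rejections: {bonferroni.sum()}, BH rejections: {bh.sum()}")
```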

