Control of the False Discovery Proportion for Independently Tested Null Hypotheses

2012 ◽  
Vol 2012 ◽  
pp. 1-19 ◽  
Author(s):  
Yongchao Ge ◽  
Xiaochun Li

Consider the multiple testing problem of testing m null hypotheses H1, …, Hm, among which m0 hypotheses are truly null. Given the P-values for each hypothesis, the question of interest is how to combine the P-values to find out which hypotheses are false nulls and possibly to make a statistical inference on m0. Benjamini and Hochberg proposed a classical procedure that can control the false discovery rate (FDR). FDR control is somewhat unsatisfactory in that it concerns only the expectation of the false discovery proportion (FDP). The control of the actual random variable FDP has recently drawn much attention. For any level 1 − α, this paper proposes a procedure to construct an upper prediction bound (UPB) for the FDP for a fixed rejection region. When 1 − α = 50%, our procedure is very close to the classical Benjamini and Hochberg procedure. Simultaneous UPBs for the FDPs of all rejection regions and an upper confidence bound for the unknown m0 are then presented. The proposed procedure works for finite samples and hence avoids the slow convergence of asymptotic theory.
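The UPB construction itself is not spelled out in the abstract, but it is anchored to the classical Benjamini-Hochberg step-up rule that it approximates at 1 − α = 50%. A minimal R sketch of that baseline rule, for orientation (the function name and simulated example are illustrative, not from the paper):

```r
# Classical Benjamini-Hochberg step-up procedure at FDR level q:
# reject the k smallest p-values, where k is the largest index with
# p_(k) <= k * q / m. Baseline only; not the paper's UPB construction.
bh_reject <- function(p, q = 0.05) {
  m <- length(p)
  o <- order(p)
  k <- max(c(0, which(p[o] <= seq_len(m) * q / m)))
  rejected <- rep(FALSE, m)
  if (k > 0) rejected[o[seq_len(k)]] <- TRUE
  rejected
}

# Example: 90 uniform null p-values and 10 signals near zero
set.seed(1)
p <- c(runif(90), rbeta(10, 1, 50))
sum(bh_reject(p, q = 0.05))
```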

2019 ◽  
Author(s):  
David C. Handler ◽  
Paul A. Haynes

The multiple testing problem is a well-known statistical stumbling block in high-throughput data analysis, where large-scale repetition of statistical tests introduces unwanted noise into the results. While approaches exist to overcome the multiple testing problem, these methods focus on theoretical statistical clarification rather than incorporating experimentally derived measures to ensure appropriately tailored analysis parameters. Here, we introduce a method for estimating inter-replicate variability in reference samples for a quantitative proteomics experiment using permutation analysis. This can function as a modulator to multiple testing corrections such as the Benjamini-Hochberg ordered Q value test. We refer to this as a ‘same-same’ analysis, since the method uses six biological replicates of the reference sample and determines, through non-redundant triplet pairwise comparisons, the level of quantitative noise inherent in the system. The method can be used to produce an experiment-specific Q value cut-off that achieves a specified false discovery rate at the quantitation level, such as 1%. The same-same method is applicable to any experimental set that incorporates six replicates of a reference sample. To facilitate access to this approach, we have developed a same-same analysis R module that is freely available and ready to use via the internet.
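The ‘non-redundant triplet pairwise comparisons’ can be pictured as follows: six reference replicates admit exactly ten distinct 3-versus-3 splits, and per-feature test results across those splits characterize the system's quantitative noise. A hedged R sketch, assuming a plain two-sample t-test per feature (the module's actual statistic and Q value machinery are not reproduced here):

```r
# All 10 non-redundant triplet-vs-triplet splits of six replicates;
# pooled per-feature t-test p-values approximate the noise distribution.
# The plain t-test is an illustrative stand-in, not the authors' pipeline.
same_same_pvals <- function(mat) {        # mat: features x 6 replicates
  splits <- combn(6, 3)                   # 20 triplets, lexicographic order
  splits <- splits[, splits[1, ] == 1]    # fixing replicate 1 leaves the 10 unique splits
  unlist(lapply(seq_len(ncol(splits)), function(j) {
    g1 <- splits[, j]
    g2 <- setdiff(1:6, g1)
    apply(mat, 1, function(x) t.test(x[g1], x[g2])$p.value)
  }))
}
```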


2016 ◽  
Vol 2016 ◽  
pp. 1-7
Author(s):  
Oluyemi Oyeniran ◽  
Hanfeng Chen

The problem of estimating the proportion, π0, of true null hypotheses in a multiple testing problem is important when large-scale parallel hypothesis tests are performed independently. While π0 is a quantity of interest in its own right in applications, its estimate can also be used for assessing or controlling an overall false discovery rate. In this article, we develop an innovative nonparametric maximum likelihood approach to estimating π0. The nonparametric likelihood is restricted to multinomial models, and an EM algorithm is developed to approximate the estimate of π0. Simulation studies show that the proposed method outperforms existing methods. Using experimental microarray datasets, we demonstrate that the new method provides a satisfactory estimate in practice.
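The multinomial NPMLE and its EM updates are not given in the abstract; for a point of reference, the simplest widely used estimator of π0 is the Storey-type tail estimator sketched below (a standard baseline such proposals are typically compared against, not the authors' method):

```r
# Storey-type tail estimator of pi0: p-values above lambda are mostly
# null and null p-values are uniform, so the tail frequency scaled by
# the tail width estimates pi0. Baseline only, not the authors' NPMLE/EM.
pi0_storey <- function(p, lambda = 0.5) {
  min(1, mean(p > lambda) / (1 - lambda))
}
```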


2011 ◽  
Vol 311-313 ◽  
pp. 1661-1666
Author(s):  
Pei Jin ◽  
Jian Zhang

Several biomaterials have been widely used in the treatment of cancer. However, how these biomaterials alter gene expression is poorly understood. Identifying genes that are differentially expressed across varying biological conditions, or in response to different biomaterials, based on microarray data is a typical multiple testing problem. In this paper we focus on FDR control for large-scale multiple testing problems and, using our proposed statistics and a resampling method, provide a powerful FDR-controlling procedure. Simulations show that our fiducial estimator is more accurate and stable than five other traditional methods, with satisfactory FDR control. In particular, we propose a generally applicable version of the procedure for identifying differentially expressed genes in microarray experiments. This microarray method consistently shows favorable performance over existing methods. For example, in testing for differential expression between two breast cancer tumor types, the proposed procedure yields increases from 37% to 127% in the number of genes called significant at a false discovery rate of 3%.
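The fiducial statistic and resampling scheme are not specified in the abstract; the generic shape of a permutation-based FDR estimate for two-group differential expression looks like the following hedged sketch (plain absolute t-statistics as a stand-in):

```r
# Generic permutation FDR estimate at a fixed statistic cutoff:
# estimated FDR = average permuted exceedances / observed exceedances.
# Illustrative stand-in, not the paper's fiducial procedure.
perm_fdr <- function(mat, group, cut, B = 100) {  # mat: genes x samples
  tstats <- function(g) apply(mat, 1, function(x) abs(t.test(x ~ g)$statistic))
  obs   <- tstats(group)                          # group: two-level factor
  nullc <- replicate(B, sum(tstats(sample(group)) >= cut))
  min(1, mean(nullc) / max(1, sum(obs >= cut)))
}
```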


2012 ◽  
Vol 2012 ◽  
pp. 1-13 ◽  
Author(s):  
Shulian Shang ◽  
Qianhe Zhou ◽  
Mengling Liu ◽  
Yongzhao Shao

The false discovery proportion (FDP), the proportion of incorrect rejections among all rejections, is a direct measure of the abundance of false positive findings in multiple testing. Many methods have been proposed to control the FDP, but they are too conservative to be useful for power analysis. Study designs controlling the mean of the FDP, which is the false discovery rate, have been commonly used. However, there has been little attempt to design studies with direct FDP control to achieve a desired level of efficiency. We provide a sample size calculation method that uses a variance formula for the FDP under weak-dependence assumptions to achieve the desired overall power. The relationship between design parameters and sample size is explored. The adequacy of the procedure is assessed by simulation. We illustrate the method using estimated correlations from a prostate cancer dataset.
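The FDP variance formula is not reproduced in the abstract; as context, the mean-FDR (expected-power) sample size calculation that such FDP-based designs refine can be sketched as follows (a Jung-style calculation under assumed design parameters, not the authors' method):

```r
# Mean-FDR sample size search: choose the per-test level alpha so that,
# with r1 expected true rejections out of m1 alternatives, FDR = f;
# then find the smallest per-group n reaching the target average power.
# Context only; the paper's FDP-variance-based design is not shown here.
fdr_sample_size <- function(m, m1, delta, f = 0.05, power = 0.8) {
  m0    <- m - m1
  r1    <- power * m1                      # targeted true rejections
  alpha <- f * r1 / ((1 - f) * m0)         # per-test level implied by FDR = f
  for (n in 2:10000) {                     # two-sample t-test, n per group
    if (power.t.test(n = n, delta = delta, sig.level = alpha)$power >= power)
      return(n)
  }
  NA
}
```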


2004 ◽  
Vol 3 (1) ◽  
pp. 1-24 ◽  
Author(s):  
Katy L Simonsen ◽  
Lauren M McIntyre

The increase in the number of available markers for many experimental populations has led to QTL studies with ever-increasing marker numbers and densities. The resulting conundrum is that as marker density increases, so does the multiple testing problem. It is important to re-examine the detection of multiple QTL in light of increasing marker density. We explore through simulation whether existing methods have achieved the maximum possible power for detecting multiple QTL and whether increasing the marker density is an effective strategy for locating multiple QTL. In addition to existing methods, such as the maximum, the CET, and the Benjamini-Hochberg and Benjamini-Yekutieli procedures, we propose and evaluate the complete set of order statistics with their corresponding empirical joint distribution. We examine these statistics in conjunction with a novel application of the alpha-spending approach, providing a less conservative solution to the problem of controlling the false discovery rate (FDR) in multiple tests. We conducted a simulation study to assess the relative power of these approaches as well as their ability to control FDR. We find that several of the new approaches have a reasonable FDR and can substantially improve the experimenter's ability to detect multiple QTL compared to existing approaches in many cases; however, the Benjamini-Hochberg procedure remains a very reasonable choice. The methods are applied to a nine-trait oat vernalization dataset.
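Both FDR procedures named above are available in base R via p.adjust, which makes the comparison easy to reproduce on simulated data; BY pays a log(m) penalty for arbitrary dependence and so rejects fewer hypotheses (the simulated p-values here are for illustration only):

```r
# BH vs. BY rejections at FDR 5% on simulated p-values:
# 450 uniform nulls plus 50 signals concentrated near zero.
set.seed(2)
p <- c(runif(450), rbeta(50, 1, 40))
sum(p.adjust(p, method = "BH") <= 0.05)  # BH rejections
sum(p.adjust(p, method = "BY") <= 0.05)  # BY rejects fewer (more conservative)
```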


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Sangyoon Yi ◽  
Xianyang Zhang ◽  
Lu Yang ◽  
Jinyan Huang ◽  
Yuanhang Liu ◽  
...  

One challenge facing omics association studies is the loss of statistical power when adjusting for confounders and multiple testing. The traditional statistical procedure involves fitting a confounder-adjusted regression model for each omics feature, followed by multiple testing correction. Here we show that the traditional procedure is not optimal and present a new approach, 2dFDR, a two-dimensional false discovery rate control procedure, for powerful confounder adjustment in multiple testing. Through extensive evaluation, we demonstrate that 2dFDR is more powerful than the traditional procedure and that, in the presence of strong confounding and weak signals, the power improvement can be more than 100%.
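The traditional baseline the abstract describes, a confounder-adjusted regression per feature followed by multiple testing correction, can be sketched directly; 2dFDR itself is not reproduced here, and the variable names are illustrative:

```r
# Traditional procedure: per-feature linear model adjusting for a
# confounder z, p-value on the exposure x, then BH correction.
# Baseline only; 2dFDR replaces this two-step pipeline.
traditional_adjusted <- function(mat, x, z) {  # mat: features x samples
  p <- apply(mat, 1, function(y)
    summary(lm(y ~ x + z))$coefficients["x", "Pr(>|t|)"])
  p.adjust(p, method = "BH")
}
```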


2000 ◽  
Vol 25 (1) ◽  
pp. 60-83 ◽  
Author(s):  
Yoav Benjamini ◽  
Yosef Hochberg

A new approach to problems of multiple significance testing was presented in Benjamini and Hochberg (1995), which calls for controlling the expected ratio of the number of erroneous rejections to the number of rejections, the false discovery rate (FDR). The procedure given there was shown to control the FDR for independent test statistics. When some of the hypotheses are in fact false, that procedure is too conservative. We present here an adaptive procedure, where the number of true null hypotheses is estimated first as in Hochberg and Benjamini (1990), and this estimate is used in the procedure of Benjamini and Hochberg (1995). The result is still a simple stepwise procedure, to which we also give a graphical companion. The new procedure is used in several examples drawn from educational and behavioral studies, addressing problems in multi-center studies, subset analysis and meta-analysis. The examples vary in the number of hypotheses tested and in the implications of the new procedure for the conclusions. In a large simulation study of independent test statistics the adaptive procedure is shown to control the FDR and to have substantially better power than the previously suggested FDR controlling method, which by itself is more powerful than the traditional familywise error rate controlling methods. In cases where most of the tested hypotheses are far from being true there is hardly any penalty due to the simultaneous testing of many hypotheses.
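The two-stage idea reads directly as code: estimate the number of true nulls m0, then run the 1995 step-up rule with m0 in place of m. A hedged sketch, with a simple tail estimator standing in for the Hochberg and Benjamini (1990) graphical estimator:

```r
# Adaptive step-up: reject the k smallest p-values, where k is the
# largest index with p_(k) <= k * q / m0-hat. The tail estimator of m0
# is an illustrative stand-in for the 1990 graphical estimator.
adaptive_bh <- function(p, q = 0.05, lambda = 0.5) {
  m  <- length(p)
  m0 <- max(1, min(m, ceiling(sum(p > lambda) / (1 - lambda))))
  o  <- order(p)
  k  <- max(c(0, which(p[o] <= seq_len(m) * q / m0)))
  rejected <- rep(FALSE, m)
  if (k > 0) rejected[o[seq_len(k)]] <- TRUE
  rejected
}
```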


2012 ◽  
Vol 02 (02) ◽  
pp. 163-171 ◽  
Author(s):  
Shulian Shang ◽  
Mengling Liu ◽  
Yongzhao Shao
