Sample Size Calculation for Simulation-Based Multiple-Testing Procedures

2005 ◽ Vol 15 (6) ◽ pp. 957-967 ◽ Author(s): Heejung Bang, Sin-Ho Jung, Stephen L. George

2013 ◽ Vol 2013 ◽ pp. 1-11 ◽ Author(s): Dongmei Li, Timothy D. Dye

Resampling-based multiple testing procedures are widely used in genomic studies to identify differentially expressed genes and to conduct genome-wide association studies. However, the power and stability properties of these popular resampling-based multiple testing procedures have not been extensively evaluated. Our study investigates the power and stability of seven resampling-based multiple testing procedures frequently used in high-throughput data analysis for small-sample-size data through simulations and gene oncology examples. The bootstrap single-step minP procedure and the bootstrap step-down minP procedure perform best among all tested procedures when the sample size is as small as 3 in each group and either familywise error rate or false discovery rate control is desired. When the sample size increases to 12 and false discovery rate control is desired, the permutation maxT procedure and the permutation minP procedure perform best. Our results provide guidance for high-throughput data analysis when the sample size is small.
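To make the resampling idea concrete, here is a minimal Python sketch of the permutation single-step maxT procedure, one of the families of procedures compared above. The gene count, group sizes, and permutation count are illustrative assumptions, not the paper's simulation design; the step-down minP variants work analogously but operate on the p-value scale and successively remove the most significant genes.

```python
# Minimal sketch: permutation single-step maxT adjusted p-values
# (FWER control). Shapes and counts are illustrative assumptions.
import numpy as np
from scipy import stats

def maxT_adjusted_pvalues(x, y, n_perm=10_000, seed=0):
    """x: (n1, m) and y: (n2, m) expression matrices for m genes."""
    rng = np.random.default_rng(seed)
    t_obs = np.abs(stats.ttest_ind(x, y, axis=0).statistic)
    pooled = np.vstack([x, y])
    n1 = x.shape[0]
    max_t = np.empty(n_perm)
    for b in range(n_perm):
        perm = rng.permutation(pooled)            # shuffle group labels
        t_b = stats.ttest_ind(perm[:n1], perm[n1:], axis=0).statistic
        max_t[b] = np.max(np.abs(t_b))            # max |T| across genes
    # adjusted p-value: share of permutations whose max |T| beats each gene
    return (1 + (max_t[:, None] >= t_obs).sum(axis=0)) / (1 + n_perm)

# Tiny example with n = 3 per group, the small-sample case discussed above
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=(3, 50))
y = rng.normal(0.0, 1.0, size=(3, 50))
print(maxT_adjusted_pvalues(x, y, n_perm=2_000).min())
```

With only 3 samples per group there are just 20 distinct label permutations, so permutation p-values are necessarily coarse; that granularity is consistent with the finding above that the bootstrap procedures fare better at n = 3.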


2011 ◽ Vol 55 (1) ◽ pp. 110-122 ◽ Author(s): Jie Chen, Jianfeng Luo, Kenneth Liu, Devan V. Mehrotra

Author(s): Amelie Elsäßer, Anja Victor, Gerhard Hommel

In candidate gene association studies, several elementary hypotheses are usually tested simultaneously on one particular set of data. The data normally consist of partly correlated SNP information. Every SNP can be tested for association with the disease, e.g., using the Cochran-Armitage test for trend. To account for the multiplicity of the test situation, different types of multiple testing procedures have been proposed. The question arises whether procedures that take the discreteness of the situation into account show a benefit, especially in the case of correlated data. We empirically evaluate several different multiple testing procedures via simulation studies using simulated correlated SNP data. We analyze FDR and FWER controlling procedures, special procedures for discrete situations, and the resampling-based minP procedure. Within the simulation study, we examine a broad range of gene data scenarios. We show that the main difference in the varying performance of the procedures is due to sample size. In small-sample scenarios, the resampling-based minP procedure had more power than the classical FDR controlling procedures, even though it controls the stricter FWER. In contrast, FDR controlling procedures led to more rejections in larger-sample scenarios.
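For the non-resampling side of this comparison, the sketch below applies a per-SNP Cochran-Armitage trend test followed by the Benjamini-Hochberg step-up adjustment, one standard FDR controlling procedure. The genotype scores, counts, and function names are illustrative assumptions, not the study's simulation setup.

```python
# Minimal sketch: per-SNP Cochran-Armitage trend tests + BH FDR adjustment.
import numpy as np
from scipy import stats

def cochran_armitage_p(case_counts, ctrl_counts, scores=(0, 1, 2)):
    """Two-sided trend test p-value from genotype counts (AA, Aa, aa)."""
    t = np.asarray(scores, float)
    r = np.asarray(case_counts, float)       # cases per genotype
    n = r + np.asarray(ctrl_counts, float)   # column totals
    N, R = n.sum(), r.sum()
    p = R / N
    T = np.sum(t * (r - n * p))              # score statistic for trend
    var = p * (1 - p) * (np.sum(n * t**2) - np.sum(n * t)**2 / N)
    return 2 * stats.norm.sf(abs(T) / np.sqrt(var))

def benjamini_hochberg(pvals):
    """Step-up BH adjusted p-values (FDR control)."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    adj = p[order] * m / np.arange(1, m + 1)
    adj = np.minimum.accumulate(adj[::-1])[::-1]   # enforce monotonicity
    out = np.empty(m)
    out[order] = np.clip(adj, 0, 1)
    return out

# Example: 1,000 null SNPs, 100 cases and 100 controls each
rng = np.random.default_rng(0)
geno = [0.25, 0.50, 0.25]                     # Hardy-Weinberg-style mix
pvals = [cochran_armitage_p(rng.multinomial(100, geno),
                            rng.multinomial(100, geno))
         for _ in range(1000)]
print((benjamini_hochberg(pvals) < 0.05).sum())   # typically 0 rejections
```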


Author(s): Damian Clarke, Joseph P. Romano, Michael Wolf

When considering multiple-hypothesis tests simultaneously, standard statistical techniques will lead to overrejection of null hypotheses unless the multiplicity of the testing framework is explicitly considered. In this article, we discuss the Romano–Wolf multiple-hypothesis correction and document its implementation in Stata. The Romano–Wolf correction (asymptotically) controls the familywise error rate, that is, the probability of rejecting at least one true null hypothesis among a family of hypotheses under test. This correction is considerably more powerful than earlier multiple-testing procedures, such as the Bonferroni and Holm corrections, given that it takes into account the dependence structure of the test statistics by resampling from the original data. We describe a command, rwolf, that implements this correction and provide several examples based on a wide range of models. We document and discuss the performance gains from using rwolf over other multiple-testing procedures that control the familywise error rate.
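rwolf itself is a Stata command; as a language-neutral illustration, the Python sketch below mirrors only the stepdown logic of the Romano–Wolf correction, under the assumption that bootstrap statistics recentered to satisfy the null are already in hand. It is a simplified analogue, not the command's implementation, which works from regression models and handles the resampling itself.

```python
# Simplified sketch of the Romano-Wolf stepdown correction, assuming
# recentered (null-enforced) bootstrap statistics are already computed.
import numpy as np

def romano_wolf_stepdown(t_obs, t_boot, alpha=0.05):
    """t_obs: (m,) absolute observed statistics.
    t_boot: (B, m) bootstrap statistics recentered to satisfy the null.
    Returns a boolean rejection vector with (asymptotic) FWER control."""
    order = np.argsort(-t_obs)        # most significant hypothesis first
    reject = np.zeros(t_obs.size, dtype=bool)
    active = list(order)              # hypotheses not yet rejected
    for j in order:
        # (1 - alpha) quantile of the max over hypotheses still in play
        crit = np.quantile(np.abs(t_boot)[:, active].max(axis=1), 1 - alpha)
        if t_obs[j] > crit:
            reject[j] = True          # reject and re-tighten the max
            active.remove(j)
        else:
            break                     # stop at the first non-rejection
    return reject

# Example: 20 hypotheses, one true effect; dependence ignored for brevity
rng = np.random.default_rng(0)
t_obs = np.abs(rng.normal(0, 1, 20)); t_obs[0] += 4.0
t_boot = rng.normal(0, 1, (5_000, 20))
print(np.flatnonzero(romano_wolf_stepdown(t_obs, t_boot)))
```

Because each step recomputes the critical value over only the surviving hypotheses, the stepdown correction is less conservative than a single-step max-statistic rule while still exploiting the dependence captured by the resampled statistics, which is where the power gains over Bonferroni and Holm come from.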


2021 ◽ Vol 18 (5) ◽ pp. 521-528 ◽ Author(s): Eric S Leifer, James F Troendle, Alexis Kolecki, Dean A Follmann

Background/aims: The two-by-two factorial design randomizes participants to receive treatment A alone, treatment B alone, both treatments A and B (AB), or neither treatment (C). When the combined effect of A and B is less than the sum of the A and B effects, called a subadditive interaction, there can be low power to detect the A effect using an overall test, that is, a factorial analysis, which compares the A and AB groups to the C and B groups. Such an interaction may have occurred in the Action to Control Cardiovascular Risk in Diabetes blood pressure trial (ACCORD BP), which simultaneously randomized participants to receive intensive or standard blood pressure control and intensive or standard glycemic control. For the primary outcome of major cardiovascular events, the overall test for efficacy of intensive blood pressure control was nonsignificant. In such an instance, simple effect tests of A versus C and B versus C may be useful since they are not affected by a subadditive interaction, but they can have lower power since they use half the participants of the overall trial. We investigate multiple testing procedures which exploit the overall tests’ sample size advantage and the simple tests’ robustness to a potential interaction.

Methods: In the time-to-event setting, we use the stratified and ordinary logrank statistics’ asymptotic means to calculate the power of the overall and simple tests under various scenarios. We consider the A and B research questions to be unrelated and allocate a 0.05 significance level to each. For each question, we investigate three multiple testing procedures which allocate the type 1 error in different proportions to the overall and simple effects as well as the AB effect. The Equal Allocation 3 procedure allocates equal type 1 error to each of the three effects, the Proportional Allocation 2 procedure allocates 2/3 of the type 1 error to the overall A (respectively, B) effect and the remaining type 1 error to the AB effect, and the Equal Allocation 2 procedure allocates equal amounts to the simple A (respectively, B) and AB effects. These procedures are applied to ACCORD BP.

Results: Across various scenarios, Equal Allocation 3 had robust power for detecting a true effect. For ACCORD BP, all three procedures would have detected a benefit of intensive glycemia control.

Conclusions: When there is no interaction, Equal Allocation 3 has less power than a factorial analysis. However, Equal Allocation 3 often has greater power when there is an interaction. The R package factorial2x2 can be used to explore the power gain or loss for different scenarios.
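The power trade-off in the conclusions can be sketched with Schoenfeld's normal approximation for the logrank statistic, under which the Z-statistic has mean roughly log(HR) × sqrt(events)/2 for 1:1 allocation. The event counts and hazard ratios below are hypothetical, chosen only to illustrate a subadditive interaction; the authors' own tool for such explorations is the R package factorial2x2.

```python
# Rough power sketch for the 2x2 factorial trade-off; all numbers are
# hypothetical, not ACCORD BP values.
import numpy as np
from scipy import stats

def power_two_sided(mu, alpha):
    """Approximate power of a two-sided Z-test with mean mu."""
    z = stats.norm.ppf(1 - alpha / 2)
    return stats.norm.sf(z - abs(mu)) + stats.norm.sf(z + abs(mu))

events = 800                  # total events in the trial
hr_simple = 0.80              # A-vs-C hazard ratio
hr_overall = 0.90             # attenuated by a subadditive interaction

# The overall (factorial) test pools all events; the simple A-vs-C test
# uses only two of the four arms, roughly half the events.
mu_overall = np.log(hr_overall) * np.sqrt(events) / 2
mu_simple = np.log(hr_simple) * np.sqrt(events / 2) / 2

print("factorial analysis at 0.05:  ", round(power_two_sided(mu_overall, 0.05), 3))
print("Equal Allocation 3, overall: ", round(power_two_sided(mu_overall, 0.05 / 3), 3))
print("Equal Allocation 3, simple:  ", round(power_two_sided(mu_simple, 0.05 / 3), 3))
```

Under this subadditive scenario, the simple test at the reduced 0.05/3 level still outpowers the factorial analysis at the full 0.05 level, which is the pattern the allocation procedures above are designed to exploit.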

