Hypotheses on a tree: new error rates and testing strategies

Biometrika ◽  
2020 ◽  
Author(s):  
Marina Bogomolov ◽  
Christine B Peterson ◽  
Yoav Benjamini ◽  
Chiara Sabatti

Abstract We introduce a multiple testing procedure that controls global error rates at multiple levels of resolution. Conceptually, we frame this problem as the selection of hypotheses that are organized hierarchically in a tree structure. We describe a fast algorithm and prove that it controls relevant error rates given certain assumptions on the dependence between the p-values. Through simulations, we demonstrate that the proposed procedure provides the desired guarantees under a range of dependency structures and that it has the potential to gain power over alternative methods. Finally, we apply the method to studies on the genetic regulation of gene expression across multiple tissues and on the relation between the gut microbiome and colorectal cancer.
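To make the idea of error rates "at multiple levels of resolution" concrete, here is a minimal Python sketch that computes the false discovery proportion separately at each depth of a tree. It is an illustration, not the authors' algorithm. The dictionary-based tree layout, the function names, and the rule that a parent node is a true null exactly when all leaves beneath it are null are assumptions introduced for this example.

```python
def leaves_under(tree, node):
    """Return the leaf names below `node` (a node with no children is a leaf)."""
    children = tree.get(node, [])
    if not children:
        return [node]
    found = []
    for child in children:
        found.extend(leaves_under(tree, child))
    return found


def depth_of(tree, root):
    """Map every node to its depth, with the root at depth 0."""
    depths = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in tree.get(node, []):
            depths[child] = depths[node] + 1
            stack.append(child)
    return depths


def fdp_by_level(tree, root, rejected, non_null_leaves):
    """False discovery proportion among the rejected nodes at each depth.

    Assumption: a node is a true null when none of its leaves is non-null.
    """
    depths = depth_of(tree, root)
    counts = {}
    for node in rejected:
        level = depths[node]
        is_null = not any(leaf in non_null_leaves
                          for leaf in leaves_under(tree, node))
        total, false = counts.get(level, (0, 0))
        counts[level] = (total + 1, false + int(is_null))
    return {level: false / total
            for level, (total, false) in sorted(counts.items())}


# Hypothetical two-level tree: groups g1 and g2, each with two leaves.
tree = {"root": ["g1", "g2"], "g1": ["a", "b"], "g2": ["c", "d"]}
result = fdp_by_level(tree, "root",
                      rejected={"g1", "g2", "a", "c"},
                      non_null_leaves={"a"})
print(result)
```

In this toy example one of the two rejected groups and one of the two rejected leaves are false discoveries, so the proportion is 0.5 at both depths. A procedure of the kind described in the abstract aims to keep the expectation of such quantities below a target at every level.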

2015 ◽  
Author(s):  
Christine Peterson ◽  
Marina Bogomolov ◽  
Yoav Benjamini ◽  
Chiara Sabatti

Commonly used multiplicity adjustments fail to control the error rate for reported findings in many expression quantitative trait loci (eQTL) studies. TreeQTL implements a stage-wise multiple testing procedure which allows control of appropriate error rates defined relative to a hierarchical grouping of the eQTL hypotheses. The R package TreeQTL is available for download at http://bioinformatics.org/treeqtl.
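A stage-wise procedure over a hierarchical grouping can be sketched in Python as follows. This is a generic two-stage illustration, not the TreeQTL package's API. It assumes that each family (for example, all hypotheses involving one SNP) is summarized by a Simes p-value, that families are selected with the Benjamini-Hochberg procedure at level q, and that hypotheses within the selected families are then tested at the reduced level q times the fraction of families selected. The function names and the toy p-values are hypothetical.

```python
def simes(pvals):
    """Simes combination p-value for the intersection null of a family."""
    m = len(pvals)
    ordered = sorted(pvals)
    return min(1.0, min(m * p / (i + 1) for i, p in enumerate(ordered)))


def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up: indices of the hypotheses rejected at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its threshold
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= q * rank / m:
            k = rank
    return sorted(order[:k])


def two_stage(families, q):
    """Stage 1: select families. Stage 2: test inside the selected families."""
    names = list(families)
    family_p = [simes(families[name]) for name in names]
    selected = bh_reject(family_p, q)
    # Assumed adjustment: shrink the level by the fraction of families selected.
    q_within = q * len(selected) / len(names)
    return {names[i]: bh_reject(families[names[i]], q_within) for i in selected}


# Hypothetical p-values: three SNPs, each tested against three genes.
families = {
    "snp1": [0.0001, 0.2, 0.6],
    "snp2": [0.4, 0.7, 0.9],
    "snp3": [0.001, 0.004, 0.5],
}
findings = two_stage(families, q=0.05)
print(findings)
```

Here snp1 and snp3 are selected in the first stage, and the second stage reports one and two within-family discoveries for them respectively. Testing within families at the reduced level is what accounts for the selection made in the first stage.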


2015 ◽  
Vol 14 (1) ◽  
pp. 1-19 ◽  
Author(s):  
Rosa J. Meijer ◽  
Thijmen J.P. Krebs ◽  
Jelle J. Goeman

Abstract We present a multiple testing method for hypotheses that are ordered in space or time. Given such hypotheses, the elementary hypotheses as well as regions of consecutive hypotheses are of interest. These region hypotheses not only have intrinsic meaning, but testing them also has the advantage that (potentially small) signals across a region are combined in one test. Because the expected number and length of potentially interesting regions are usually not available beforehand, we propose a method that tests all possible region hypotheses as well as all individual hypotheses in a single multiple testing procedure that controls the familywise error rate. We start by testing the global null hypothesis and, when this hypothesis can be rejected, we continue by further specifying the exact location or locations of the effect present. The method is implemented in the


2016 ◽  
Vol 6 (2) ◽  
pp. 30-41
Author(s):  
Mark Chang ◽  
Xuan Deng ◽  
John Balser

2012 ◽  
Vol 44 (3) ◽  
pp. 635-643 ◽  
Author(s):  
David Causeur ◽  
Mei-Chen Chu ◽  
Shulan Hsieh ◽  
Ching-Fan Sheu

Author(s):  
Jelle J. Goeman ◽  
Livio Finos

Hypothesis tests in bioinformatics can often be arranged in a tree structure in a very natural way, e.g. when tests are performed at probe, gene, and chromosome level. Exploiting this graph structure in a multiple testing procedure may result in a gain in power or increased interpretability of the results. We present the inheritance procedure, a method of familywise error control for hypotheses structured in a tree. The method starts testing at the top of the tree, following up on those branches in which it finds significant results, and following up on leaf nodes in the neighborhood of those leaves. The method is a uniform improvement over a recently proposed method by Meinshausen. The inheritance procedure has been implemented in the globaltest package which is available on www.bioconductor.org.
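The top-down idea can be illustrated with a short Python sketch. It is not the inheritance procedure itself, nor the globaltest implementation, but a simplified top-down rule of the kind the abstract describes as the baseline. It assumes each node is tested at alpha times the fraction of all leaves that lie beneath it, and that children are tested only after their parent is rejected. The tree layout, function names, and p-values are hypothetical.

```python
def count_leaves(tree, node):
    """Number of leaves below `node` (a node with no children counts as one)."""
    children = tree.get(node, [])
    if not children:
        return 1
    return sum(count_leaves(tree, child) for child in children)


def top_down_fwer(tree, root, pvalue, alpha):
    """Test from the root downwards, descending only into rejected nodes.

    Assumed weighting: a node is tested at alpha * (leaves under node) / (all leaves).
    """
    total = count_leaves(tree, root)
    rejected = []
    frontier = [root]
    while frontier:
        node = frontier.pop(0)  # breadth-first: finish one level before the next
        level = alpha * count_leaves(tree, node) / total
        if pvalue[node] <= level:
            rejected.append(node)
            frontier.extend(tree.get(node, []))
    return rejected


# Hypothetical chromosome -> gene -> probe hierarchy with made-up p-values.
tree = {"chr": ["gene1", "gene2"],
        "gene1": ["p1", "p2"],
        "gene2": ["p3", "p4"]}
pvalue = {"chr": 0.001, "gene1": 0.01, "gene2": 0.2,
          "p1": 0.004, "p2": 0.3, "p3": 0.0001, "p4": 0.5}
rejections = top_down_fwer(tree, "chr", pvalue, alpha=0.05)
print(rejections)
```

Note that probe p3 has a very small p-value but is never tested, because its parent gene2 is not rejected. That is the price of following up only on significant branches.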


2020 ◽  
pp. 1-38
Author(s):  
Erwan Koch ◽  
Jonathan Koh ◽  
Anthony C. Davison ◽  
Chiara Lepore ◽  
Michael K. Tippett

Abstract Severe thunderstorms can have devastating impacts. Concurrently high values of convective available potential energy (CAPE) and storm relative helicity (SRH) are known to be conducive to severe weather, so high values of PROD have been used to indicate high risk of severe thunderstorms. We consider the extreme values of these three variables for a large area of the contiguous United States (US) over the period 1979–2015, and use extreme-value theory and a multiple testing procedure to show that there is a significant time trend in the extremes for PROD maxima in April, May and August, for CAPE maxima in April, May and June, and for maxima of SRH in April and May. These observed increases in CAPE are also relevant for rainfall extremes and are expected in a warmer climate, but have not previously been reported. Moreover, we show that the El Niño-Southern Oscillation explains variation in the extremes of PROD and SRH in February. Our results suggest that the risk from severe thunderstorms in April and May is increasing in parts of the US where it was already high, and that the risk from storms in February is increased over the main part of the region during La Niña years.

