Faculty Opinions recommendation of Data-driven hypothesis weighting increases detection power in genome-scale multiple testing.

Author(s):  
Rita Casadio
2016 ◽  
Vol 13 (7) ◽  
pp. 577-580 ◽  
Author(s):  
Nikolaos Ignatiadis ◽  
Bernd Klaus ◽  
Judith B Zaugg ◽  
Wolfgang Huber

2016 ◽  
Author(s):  
Ajith Harish ◽  
Aare Abroi ◽  
Julian Gough ◽  
Charles Kurland

Abstract The evolutionary origins of viruses according to marker gene phylogenies, as well as their relationships to the ancestors of host cells, remain unclear. In a recent article, Nasir and Caetano-Anollés reported that their genome-scale phylogenetic analyses identify an ancient origin of the “viral supergroup” (Nasir et al (2015) A phylogenomic data-driven exploration of viral origins and evolution. Science Advances, 1(8):e1500527). Their report suggests that viruses and host cells evolved independently from a universal common ancestor. Examination of their data and phylogenetic methods indicates that systematic errors likely affected the results. Reanalysis of the data with additional tests shows that small-genome attraction artifacts distort their phylogenomic analyses. These new results indicate that their suggestion of a distinct ancestry of the viral supergroup is not well supported by the evidence.


Author(s):  
Fengshi Jing ◽  
Qingpeng Zhang ◽  
Jason J. Ong ◽  
Yewei Xie ◽  
Yuxin Ni ◽  
...  

Human immunodeficiency virus self-testing (HIVST) is an innovative and effective strategy for expanding HIV testing coverage. Several innovative implementations of HIVST have been developed and piloted among populations at high risk of HIV, such as men who have sex with men (MSM), to meet the global testing target. One innovative strategy is the secondary distribution of HIVST, in which individuals (defined as indexes) are given multiple testing kits for both self-use (i.e., self-testing) and distribution to other people in their MSM social network (defined as alters). Studies of secondary HIVST distribution have mainly concentrated on developing new intervention approaches to further increase the effectiveness of this relatively new strategy from the perspective of the traditional public health discipline. There are many points in HIVST secondary distribution at which mathematical modelling can play an important role. In this study, we considered secondary HIVST kit distribution in a resource-constrained situation and proposed two data-driven integer linear programming models to maximize the overall economic benefits of secondary HIVST kit distribution based on our present implementation data from Chinese MSM. The objective function took expansion of normal alters and detection of positive and newly-tested ‘alters’ into account. Based on solutions from solvers, we developed greedy algorithms to find final solutions for our linear programming models. Results showed that our proposed data-driven approach could improve the total health economic benefit of HIVST secondary distribution. This article is part of the theme issue ‘Data science approaches to infectious disease surveillance’.


2015 ◽  
Author(s):  
Nikolaos Ignatiadis ◽  
Bernd Klaus ◽  
Judith Zaugg ◽  
Wolfgang Huber

Abstract Hypothesis weighting is a powerful approach for improving the power of data analyses that employ multiple testing. However, in general it is not evident how to choose the weights in a data-dependent manner. We describe independent hypothesis weighting (IHW), a method for making use of informative covariates that are independent of the test statistic under the null, but informative of each test’s power or prior probability of the null hypothesis. Covariates can be continuous or categorical and need not fulfill any particular assumptions. The method increases statistical power in applications while controlling the false discovery rate (FDR) and produces additional insight by revealing the covariate-weight relationship. Independent hypothesis weighting is a practical approach to discovery of associations in large datasets.
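
To make the idea of hypothesis weighting concrete, the following is a minimal sketch of the weighted Benjamini-Hochberg procedure, the generic building block on which weighting methods rest. It is not the IHW method itself: the weights here are assumed to be supplied in advance (for example, derived from a covariate), whereas the abstract describes learning them from the data. The function name `weighted_bh` and its parameters are illustrative, not taken from the paper or any package.

```python
from typing import List, Sequence


def weighted_bh(pvalues: Sequence[float],
                weights: Sequence[float],
                alpha: float = 0.1) -> List[bool]:
    """Weighted Benjamini-Hochberg: return a rejection flag per hypothesis.

    Assumption: weights are non-negative and were chosen without looking
    at the p-values. They are rescaled here so that they average to 1.
    """
    m = len(pvalues)
    if m != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    if m == 0:
        return []
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Rescale so the weights average to 1 (they then sum to m).
    norm = [w * m / total for w in weights]

    # Weighted p-values: a larger weight makes rejection easier.
    # A zero weight means the hypothesis can never be rejected.
    adjusted = [p / w if w > 0 else float("inf")
                for p, w in zip(pvalues, norm)]

    # Ordinary BH step-up procedure applied to the weighted p-values.
    order = sorted(range(m), key=lambda i: adjusted[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if adjusted[i] <= alpha * rank / m:
            k = rank  # largest rank that passes its threshold

    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected
```

With uniform weights the function reduces to the ordinary Benjamini-Hochberg procedure; shifting weight toward hypotheses that a covariate marks as promising can yield additional rejections at the same nominal FDR level.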


Author(s):  
Shirley V Wang ◽  
Judith C Maro ◽  
Joshua J Gagne ◽  
Elisabetta Patorno ◽  
Sushama Kattinakere ◽  
...  

Abstract Tree-based scan statistics (TreeScan) are a data-mining method that adjusts for multiple testing of correlated hypotheses when screening thousands of potential adverse events for signal identification. Simulation has demonstrated the promise of TreeScan with a propensity score (PS) matched cohort design. However, it is unclear which variables to include in a PS for applied signal identification studies to simultaneously adjust for confounding across potential outcomes. We selected 4 drug pairs with well understood safety profiles. For each pair, we evaluated 5 candidate PSs with different combinations of: predefined general covariates (comorbidity, frailty, utilization), empirically-selected (data driven) covariates, and covariates tailored to the drug pair. For each pair, statistical alerting patterns were similar with alternative PSs (≤11 alerts in 7,996 outcomes scanned). Including covariates tailored to exposure did not appreciably impact screening results. Including empirically-selected covariates can provide better proxy coverage for confounders but can also decrease power. Unlike tailored covariates, empirical and predefined general covariates can be applied “out of the box” for signal identification. The choice of PS depends on level of concern about residual confounding versus loss of power. Potential signals should be followed by pharmacoepidemiologic assessment where confounding control is tailored to the specific outcome(s) under investigation.


2015 ◽  
Vol 6 ◽  
Author(s):  
Saheed Imam ◽  
Sascha Schäuble ◽  
Aaron N. Brooks ◽  
Nitin S. Baliga ◽  
Nathan D. Price
