Permutation tests for hypothesis testing with animal social data: problems and potential solutions

Author(s):  
Damien R. Farine
Gerald G. Carter

Abstract Generating insights about a null hypothesis requires not only a good dataset, but also statistical tests that are reliable and actually address the null hypothesis of interest. Recent studies have found that permutation tests, which are widely used to test hypotheses when working with animal social network data, can suffer from high rates of type I error (false positives) and type II error (false negatives). Here, we first outline why pre-network and node permutation tests have elevated type I and II error rates. We then propose a new procedure, the double permutation test, that addresses some of the limitations of existing approaches by combining pre-network and node permutations. We conduct a range of simulations, allowing us to estimate error rates under different scenarios, including errors caused by confounding effects of social or non-social structure in the raw data. We show that double permutation tests avoid elevated type I errors, while remaining sufficiently sensitive to avoid elevated type II errors. By contrast, the existing solutions we tested, including node permutations, pre-network permutations, and regression models with control variables, all exhibit elevated errors under at least one set of simulated conditions. Type I error rates from double permutation remain close to 5% in the same scenarios where type I error rates from pre-network permutation tests exceed 30%. The double permutation test provides a potential solution to issues arising from elevated type I and type II error rates when testing hypotheses with social network data. We also discuss other approaches, including restricted node permutations, testing multiple null hypotheses, and splitting large datasets to generate replicated networks, that can strengthen our ability to make robust inferences. Finally, we highlight ways that uncertainty can be explicitly considered during the analysis using permutation-based or Bayesian methods.
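A minimal sketch of how such a double permutation test can be structured, assuming a simple node-level analysis (a network metric regressed on an individual trait): pre-network (data-stream) permutations first give each node's metric a null expectation, and a node permutation test is then run on the corrected values. The group-by-individual matrix, association index, swap routine, and regression below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: a group-by-individual (GBI) matrix and one node-level trait
n_groups, n_ind = 60, 15
gbi = (rng.random((n_groups, n_ind)) < 0.3).astype(int)  # 1 = individual seen in group
trait = rng.normal(size=n_ind)                            # e.g., body size

def node_strength(gbi):
    """Weighted degree from a simple-ratio association index."""
    seen = gbi.astype(bool)
    n = gbi.shape[1]
    strength = np.zeros(n)
    for i in range(n):
        for j in range(i + 1, n):
            together = np.sum(seen[:, i] & seen[:, j])
            either = np.sum(seen[:, i] | seen[:, j])
            sri = together / either if either else 0.0
            strength[i] += sri
            strength[j] += sri
    return strength

def datastream_swap(gbi, rng):
    """One pre-network swap: flip a 2x2 'checkerboard' so that group sizes
    and individual sighting counts are both preserved."""
    g = gbi.copy()
    for _ in range(1000):  # retry until a legal swap is found
        r = rng.choice(g.shape[0], 2, replace=False)
        c = rng.choice(g.shape[1], 2, replace=False)
        sub = g[np.ix_(r, c)]
        if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
            g[np.ix_(r, c)] = 1 - sub
            break
    return g

def ols_slope(x, y):
    """Slope from regressing y on x."""
    xc = x - x.mean()
    return np.sum(xc * (y - y.mean())) / np.sum(xc * xc)

# Step 1: pre-network permutations give each node's metric a null expectation
observed = node_strength(gbi)
g, perm_metrics = gbi.copy(), []
for _ in range(500):  # in practice use many more swaps, with burn-in and thinning
    g = datastream_swap(g, rng)
    perm_metrics.append(node_strength(g))
corrected = observed - np.median(perm_metrics, axis=0)

# Step 2: node permutations on the corrected metric (corrected metric ~ trait)
obs_slope = ols_slope(trait, corrected)
null_slopes = np.array([ols_slope(rng.permutation(trait), corrected)
                        for _ in range(2000)])
p_value = np.mean(np.abs(null_slopes) >= np.abs(obs_slope))
print(f"slope = {obs_slope:.3f}, double-permutation p = {p_value:.3f}")
```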

1994
Vol 19 (2)
pp. 91-101
Author(s):  
Ralph A. Alexander
Diane M. Govern

A new approximation is proposed for testing the equality of k independent means in the face of heterogeneity of variance. Monte Carlo simulations show that the new procedure has Type I error rates that are very nearly nominal and Type II error rates that are quite close to those produced by James’s (1951) second-order approximation. In addition, it is computationally the simplest approximation yet to appear, and it is easily applied to Scheffé (1959)-type multiple contrasts and to the calculation of approximate tail probabilities.
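A minimal sketch of the statistic as it is commonly implemented: each group contributes a t statistic around a variance-weighted grand mean, these are mapped to approximate standard normal deviates by a normalizing transformation, and the sum of squares is referred to a chi-square distribution with k − 1 degrees of freedom. The normalization constants below follow the form commonly reproduced in the methods literature (treat them as quoted from secondary sources rather than from the paper itself); newer SciPy releases also ship scipy.stats.alexandergovern, which can serve as a cross-check.

```python
import numpy as np
from scipy.stats import chi2

def alexander_govern(*groups):
    """Approximate test of equal means under unequal variances,
    in the style of Alexander & Govern (1994)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    n = np.array([len(g) for g in groups])
    means = np.array([g.mean() for g in groups])
    se2 = np.array([g.var(ddof=1) / len(g) for g in groups])  # squared SE of each mean
    w = (1.0 / se2) / np.sum(1.0 / se2)                       # precision weights
    grand = np.sum(w * means)                                 # variance-weighted grand mean
    t = (means - grand) / np.sqrt(se2)                        # group t statistics
    nu = n - 1.0
    a = nu - 0.5
    b = 48.0 * a**2
    c = np.sqrt(a * np.log1p(t**2 / nu))
    # normalizing transformation of t to an approximate standard normal deviate
    z = (c + (c**3 + 3 * c) / b
         - (4 * c**7 + 33 * c**5 + 240 * c**3 + 855 * c)
           / (10 * b**2 + 8 * b * c**4 + 1000 * b))
    A = np.sum(z**2)                                          # test statistic
    p = chi2.sf(A, df=len(groups) - 1)
    return A, p

# example with strongly heterogeneous variances
rng = np.random.default_rng(0)
x1, x2, x3 = rng.normal(0, 1, 20), rng.normal(0, 3, 12), rng.normal(0, 6, 8)
print(alexander_govern(x1, x2, x3))
```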


Author(s):  
Riko Kelter

Abstract Testing for differences between two groups is among the most frequently used statistical procedures in empirical research. The traditional frequentist approach is to use null hypothesis significance tests, which rely on p values to reject a null hypothesis. Recently, a substantial body of research has proposed Bayesian versions of the most common parametric and nonparametric frequentist two-sample tests, including Student’s two-sample t-test and its nonparametric counterpart, the Mann–Whitney U test. In this paper, the underlying assumptions and models of these recently proposed Bayesian two-sample tests, and their implications for practical research, are explored and contrasted with the frequentist solutions. An extensive simulation study is provided, the results of which demonstrate that the proposed Bayesian tests achieve better type I error control at slightly increased type II error rates. These results are important because balancing type I and type II errors is a crucial goal across a variety of research settings, and shifting towards the Bayesian two-sample tests while simultaneously increasing the sample size yields smaller type I error rates. Moreover, the results highlight that the differences in type II error rates between frequentist and Bayesian two-sample tests depend on the magnitude of the underlying effect.
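The type I error comparison described here is easy to reproduce in miniature. The sketch below estimates null rejection rates for a frequentist t-test at α = 0.05 and for a Bayes-factor decision rule; for simplicity it uses the BIC approximation to the Bayes factor (Wagenmakers-style) rather than the specific Bayesian two-sample tests examined in the paper, so the BF10 > 3 threshold and the resulting rates are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def bic_bayes_factor_10(x, y):
    """BIC-approximate Bayes factor for 'means differ' vs 'means equal';
    a stand-in for fully Bayesian two-sample tests."""
    data = np.concatenate([x, y])
    n = data.size
    rss0 = np.sum((data - data.mean())**2)                        # one common mean
    rss1 = np.sum((x - x.mean())**2) + np.sum((y - y.mean())**2)  # separate means
    bic0 = n * np.log(rss0 / n) + 1 * np.log(n)
    bic1 = n * np.log(rss1 / n) + 2 * np.log(n)
    return np.exp((bic0 - bic1) / 2)                              # BF10

n_sim, n_per_group = 5000, 30
reject_freq = reject_bayes = 0
for _ in range(n_sim):
    x = rng.normal(size=n_per_group)   # null is true: both groups identical
    y = rng.normal(size=n_per_group)
    if stats.ttest_ind(x, y).pvalue < 0.05:
        reject_freq += 1
    if bic_bayes_factor_10(x, y) > 3:  # 'moderate evidence' threshold
        reject_bayes += 1

print(f"type I error, t-test:        {reject_freq / n_sim:.3f}")
print(f"type I error, BF10 > 3 rule: {reject_bayes / n_sim:.3f}")
```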


Methodology
2009
Vol 5 (2)
pp. 60-70
Author(s):  
W. Holmes Finch
Teresa Davenport

Permutation testing has been suggested as an alternative to the standard approximate F tests used in multivariate analysis of variance (MANOVA). These approximate tests, such as Wilks’ Lambda and Pillai’s Trace, have been shown to perform poorly when the assumptions of normally distributed dependent variables and homogeneity of group covariance matrices are violated. Because Monte Carlo permutation tests do not rely on distributional assumptions, they may be expected to work better than their approximate counterparts when the data do not conform to these assumptions. The current simulation study compared the performance of four standard MANOVA test statistics with their Monte Carlo permutation-based counterparts under a variety of small-sample conditions, both when the assumptions were met and when they were not. Results suggest that for sample sizes of 50 subjects, power is very low for all the statistics. In addition, Type I error rates for both the approximate F and Monte Carlo tests were inflated under the condition of nonnormal data and unequal covariance matrices. In general, the performance of the Monte Carlo permutation tests was slightly better in terms of Type I error rates and power when the assumptions of normality and homogeneous covariance matrices were both violated. It should be noted that these simulations were based on the three-group case only, so the results presented here generalize only to similar situations.
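A minimal sketch of a Monte Carlo permutation counterpart to one standard MANOVA statistic (Wilks’ Lambda): the statistic is recomputed after repeatedly shuffling group labels, and the p value is the proportion of permuted values at least as extreme as the observed one. The three-group, two-variable toy data below are illustrative and do not reproduce the study's simulation design.

```python
import numpy as np

rng = np.random.default_rng(3)

def wilks_lambda(X, labels):
    """Wilks' Lambda: det(within-group SSCP) / det(total SSCP)."""
    grand = X.mean(axis=0)
    T = (X - grand).T @ (X - grand)                      # total SSCP
    W = np.zeros_like(T)
    for g in np.unique(labels):
        Xg = X[labels == g]
        W += (Xg - Xg.mean(axis=0)).T @ (Xg - Xg.mean(axis=0))
    return np.linalg.det(W) / np.linalg.det(T)

# toy data: 3 groups, 2 correlated dependent variables, about 50 subjects in total
n_per_group, n_groups = 17, 3
labels = np.repeat(np.arange(n_groups), n_per_group)
X = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n_groups * n_per_group)
X[labels == 2] += [0.6, 0.3]                             # a modest group effect

obs = wilks_lambda(X, labels)
# Monte Carlo permutation test: shuffle labels; smaller Lambda means a stronger effect
n_perm = 5000
null = np.array([wilks_lambda(X, rng.permutation(labels)) for _ in range(n_perm)])
p = (np.sum(null <= obs) + 1) / (n_perm + 1)
print(f"Wilks' Lambda = {obs:.3f}, permutation p = {p:.4f}")
```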


Genetics
1988
Vol 118 (4)
pp. 705-711
Author(s):  
J. A. Stoddart
J. F. Taylor

Abstract We show that a commonly used statistic of genotypic diversity can be used to detect one form of deviation from panmixia, viz. clonal reproduction, by comparing observed and predicted sample statistics. The characteristics of the statistic, in particular its relationship with population genotypic diversity, are formalised, and a method of predicting the genotypic diversity of a sample drawn from a panmictic population using allelic frequencies and sample size is developed. The sensitivity of some possible tests of significance of the deviation from panmictic expectations is examined using computer simulations. Goodness-of-fit tests are robust but produce an unacceptably high level of type II error. With means and variances calculated either from Monte Carlo simulations or from distributional and series approximations, t-tests perform better than goodness-of-fit tests. Under simulation, both forms of t-test exhibit acceptable rates of type I error. Rates of type II error are usually large when allele frequencies are severely skewed, although the latter form of t-test performs better under those conditions.
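A minimal sketch of the comparison of observed and predicted genotypic diversity, assuming the commonly cited form of the statistic (G = 1 / Σ p_g², with p_g the observed multilocus genotype frequencies) and a Monte Carlo null distribution generated from allele frequencies under random mating and independent loci. The t-style comparison mirrors the approach described above, but the allele frequencies and the partly "clonal" observed sample are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

def genotypic_diversity(genotypes):
    """Genotypic diversity G = 1 / sum(p_g^2), with p_g the observed
    multilocus genotype frequencies."""
    _, counts = np.unique(genotypes, axis=0, return_counts=True)
    p = counts / counts.sum()
    return 1.0 / np.sum(p**2)

def simulate_panmictic_sample(allele_freqs, n, rng):
    """Draw n multilocus genotypes assuming random mating and
    independence among loci (the panmictic expectation)."""
    loci = []
    for freqs in allele_freqs:
        a1 = rng.choice(len(freqs), size=n, p=freqs)
        a2 = rng.choice(len(freqs), size=n, p=freqs)
        loci.append(np.sort(np.stack([a1, a2], axis=1), axis=1))  # unordered genotype
    return np.concatenate(loci, axis=1)

# illustrative allele frequencies at three loci, and an illustrative sample size
allele_freqs = [np.array([0.6, 0.4]), np.array([0.5, 0.3, 0.2]), np.array([0.7, 0.3])]
n = 50

# 'observed' sample: simulated, then given repeated genotypes as a crude stand-in for clones
observed = simulate_panmictic_sample(allele_freqs, n, rng)
observed[: n // 3] = observed[0]
G_obs = genotypic_diversity(observed)

# Monte Carlo null distribution of G under panmixia, summarized by mean and variance
G_null = np.array([genotypic_diversity(simulate_panmictic_sample(allele_freqs, n, rng))
                   for _ in range(2000)])
t = (G_obs - G_null.mean()) / G_null.std(ddof=1)
print(f"G_obs = {G_obs:.2f}, E[G | panmixia] = {G_null.mean():.2f}, t = {t:.2f}")
```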


1986
Vol 16 (4)
pp. 710-712
Author(s):  
B. Côté
C. Camiré

Data from dense plantings (33 × 33 cm) of black alder (Alnus glutinosa (L.) Gaertn.) and hybrid poplar (Populus nigra L. × Populus trichocarpa Torr. and Gray) illustrate a simple statistical procedure for assessing the extent of edge effects in small plots. A reference mean free of edge effect for the response variable under study must first be determined. Relative estimates corresponding to plot means produced by incremental removal of border rows are then screened for acceptability using a type I error criterion and an approximation of type II error. The procedure is applicable to any response variable and can be applied to databases having secondary maxima in inner rows.
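A minimal sketch of the screening idea, with invented inputs: a square plot of tree heights whose growth is depressed near the edges, a reference mean taken from the innermost core, and, for each number of border rows removed, a t-test against that reference (the type I error criterion) together with a crude normal-approximation power calculation (the approximation of type II error). The acceptance thresholds and the power formula are illustrative assumptions, not the paper's exact screening rule.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# toy plot: 12 x 12 trees, growth depressed near the edges (an assumed edge effect)
side = 12
rows, cols = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
dist_to_edge = np.minimum.reduce([rows, cols, side - 1 - rows, side - 1 - cols])
height = rng.normal(loc=4.0 + 0.4 * np.minimum(dist_to_edge, 3), scale=0.5)

# reference mean assumed free of edge effect: the innermost core of the plot
ref_mean = height[4:-4, 4:-4].mean()

alpha = 0.05                    # type I error criterion
delta = 0.15 * ref_mean         # difference considered biologically meaningful
for k in range(side // 2 - 1):
    inner = height[k: side - k, k: side - k].ravel()
    t, p = stats.ttest_1samp(inner, ref_mean)
    # crude power approximation for detecting a difference of size delta (type II screen)
    se = inner.std(ddof=1) / np.sqrt(inner.size)
    power = 1 - stats.norm.cdf(stats.norm.ppf(1 - alpha / 2) - delta / se)
    ok = (p >= alpha) and (power >= 0.8)
    print(f"remove {k} border rows: mean={inner.mean():.2f}, p={p:.3f}, "
          f"approx power={power:.2f}, acceptable={ok}")
```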


1996
Vol 26 (2)
pp. 149-160
Author(s):  
J. K. Belknap
S. R. Mitchell
L. A. O'Toole
M. L. Helms
J. C. Crabbe
