Permutation Tests for Classification: Towards Statistical Significance in Image-Based Studies

Author(s):  
Polina Golland ◽  
Bruce Fischl
2019 ◽  
Author(s):  
Marshall A. Taylor

Coefficient plots are a popular tool for visualizing regression estimates. The appeal of these plots is that they visualize confidence intervals around the estimates and generally center the plot around zero, meaning that any estimate that crosses zero is statistically non-significant at least at the alpha-level around which the confidence intervals are constructed. For models with statistical significance levels determined via randomization models of inference and for which there is no standard error or confidence intervals for the estimate itself, these plots appear less useful. In this paper, I illustrate a variant of the coefficient plot for regression models with p-values constructed using permutation tests. These visualizations plot each estimate's p-value and its associated confidence interval in relation to a specified alpha-level. These plots can help the analyst interpret and report both the statistical and substantive significance of their models. Illustrations are provided using a nonprobability sample of activists and participants at a 1962 anti-Communism school.


Author(s):  
Marshall A. Taylor

Coefficient plots are a popular tool for visualizing regression estimates. The appeal of these plots is that they visualize confidence intervals around the estimates and generally center the plot around zero, meaning that any estimate that crosses zero is statistically nonsignificant at least at the alpha level around which the confidence intervals are constructed. For models with statistical significance levels determined via randomization models of inference and for which there is no standard error or confidence intervals for the estimate itself, these plots appear less useful. In this article, I illustrate a variant of the coefficient plot for regression models with p-values constructed using permutation tests. These visualizations plot each estimate’s p-value and its associated confidence interval in relation to a specified alpha level. These plots can help the analyst interpret and report the statistical and substantive significances of their models. I illustrate using a nonprobability sample of activists and participants at a 1962 anticommunism school.
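The idea of reporting a permutation p-value together with its own confidence interval can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it uses a two-group difference in means rather than a regression coefficient, and it uses a Wilson score interval for the Monte Carlo uncertainty of the p-value, which may differ from the interval used in the article. The function names `permutation_p_value` and `wilson_interval` are hypothetical.

```python
import math
import random


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means.

    Assumption: exchangeability of observations between the two groups.
    """
    rng = random.Random(seed)
    n_x = len(x)
    pooled = list(x) + list(y)
    n_y = len(pooled) - n_x
    observed = abs(sum(x) / n_x - sum(y) / n_y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the pooled observations
        stat = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        if stat >= observed - 1e-12:  # small tolerance for float ties
            hits += 1
    return hits / n_perm


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion estimated from n draws.

    Here the proportion is the permutation p-value and n is the number
    of permutations; z=1.96 gives an approximate 95% interval.
    """
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)
```

A plot in the spirit of the abstract would then draw each p-value with its interval against a vertical line at the chosen alpha level; an interval lying entirely below alpha suggests the conclusion is not an artifact of the number of permutations drawn.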


2011 ◽  
Vol 2011 ◽  
pp. 1-15 ◽  
Author(s):  
Anders Eklund ◽  
Mats Andersson ◽  
Hans Knutsson

Parametric statistical methods, such as Z-, t-, and F-values, are traditionally employed in functional magnetic resonance imaging (fMRI) for identifying areas in the brain that are active with a certain degree of statistical significance. These parametric methods, however, have two major drawbacks. First, it is assumed that the observed data are Gaussian distributed and independent, assumptions that generally are not valid for fMRI data. Second, the statistical test distribution can be derived theoretically only for very simple linear detection statistics. With nonparametric statistical methods, the two limitations described above can be overcome. The major drawback of nonparametric methods is the computational burden, with processing times ranging from hours to days, which so far has made them impractical for routine use in single-subject fMRI analysis. In this work, it is shown how the computational power of cost-efficient graphics processing units (GPUs) can be used to speed up random permutation tests. A test with 10000 permutations takes less than a minute, making statistical analysis of advanced detection methods in fMRI practically feasible. To exemplify the permutation-based approach, brain activity maps generated by the general linear model (GLM) and canonical correlation analysis (CCA) are compared at the same significance level.


Author(s):  
Graham Hepworth ◽  
Ian R Gordon ◽  
Michael J McCullough

Differentiating strains of a pathogen is often central to investigating its epidemiological aspects. The genetic similarity of a group of strains can be assessed by calculating a matrix of dissimilarities from their DNA fingerprinting profiles. The mean dissimilarity for each strain across other strains within the group is then used as an observation in a statistical analysis. These observations are not independent of each other, and so standard analysis techniques such as the t-test are inappropriate, because they underestimate the variance of the group means, and hence overstate the statistical significance of any differences. By examining the correlation between elements of the dissimilarity matrix, it is shown that the variance is underestimated by a factor of between about 2 and 4. Permutation tests are proposed as a way of addressing the problem of dependence, and are applied to a study of fluconazole resistance in Candida albicans.


2017 ◽  
Author(s):  
Moo K. Chung ◽  
Victoria Villalta-Gil ◽  
Hyekyoung Lee ◽  
Paul J. Rathouz ◽  
Benjamin B. Lahey ◽  
...  

We present a novel framework for characterizing paired brain networks using techniques in hyper-networks, sparse learning and persistent homology. The framework is general enough for dealing with any type of paired images such as twins, multimodal and longitudinal images. The exact nonparametric statistical inference procedure is derived for testing monotonic graph theory features and does not rely on time-consuming permutation tests. The proposed method computes the exact probability in quadratic time while the permutation tests require exponential time. As illustrations, we apply the method to simulated networks and a twin fMRI study. In the latter case, we determine the statistical significance of the heritability index of the large-scale reward network where every voxel is a network node.


2020 ◽  
Vol 175 (2) ◽  
pp. 156-167 ◽  
Author(s):  
Kenny Crump ◽  
Edmund Crouch ◽  
Daniel Zelterman ◽  
Casey Crump ◽  
Joseph Haseman

Glyphosate is a widely used herbicide worldwide. In 2015, the International Agency for Research on Cancer (IARC) reviewed glyphosate cancer bioassays and human studies and declared that the evidence for carcinogenicity of glyphosate is sufficient in experimental animals. We analyzed 10 glyphosate rodent bioassays, including those in which IARC found evidence of carcinogenicity, using a multiresponse permutation procedure that adjusts for the large number of tumors eligible for statistical testing and provides valid false-positive probabilities. The test statistics for these permutation tests are functions of p values from a standard test for dose-response trend applied to each specific type of tumor. We evaluated 3 permutation tests, using as test statistics the smallest p value from a standard statistical test for dose-response trend and the number of such tests for which the p value is less than or equal to .05 or .01. The false-positive probabilities obtained from 2 implementations of these 3 permutation tests are: smallest p value: .26, .17; p values ≤ .05: .08, .12; and p values ≤ .01: .06, .08. In addition, we found more evidence for negative dose-response trends than positive. Thus, we found no strong evidence that glyphosate is an animal carcinogen. The main cause for the discrepancy between IARC’s finding and ours appears to be that IARC did not account for the large number of tumor responses analyzed and the increased likelihood that several of these would show statistical significance simply by chance. This work provides a more comprehensive analysis of the animal carcinogenicity data for this important herbicide than previously available.
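The "smallest p value" statistic described above can be sketched as follows. This is a rough illustration, not the authors' procedure: the abstract does not name the "standard test" for trend, so the sketch assumes a one-sided Cochran-Armitage test with a normal approximation, and it assumes that dose labels are permuted across animals so that each animal's tumor outcomes stay together. The function names and the data layout (`tumor_table` as one 0/1 list per tumor type) are hypothetical.

```python
import math
import random


def trend_p_value(doses, outcomes):
    """One-sided Cochran-Armitage trend p-value (normal approximation).

    Assumption: this stands in for the unspecified 'standard test'.
    doses: one dose score per animal; outcomes: 0/1 tumor indicator per animal.
    """
    n = len(doses)
    d_bar = sum(doses) / n
    y_bar = sum(outcomes) / n
    t = sum((y - y_bar) * (d - d_bar) for d, y in zip(doses, outcomes))
    var = y_bar * (1 - y_bar) * sum((d - d_bar) ** 2 for d in doses)
    if var <= 0:  # no tumors, all tumors, or a single dose level
        return 1.0
    z = t / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def min_p_permutation_test(doses, tumor_table, n_perm=1000, seed=0):
    """Permutation test using the smallest per-tumor trend p-value.

    Returns (observed smallest p-value, multiplicity-adjusted p-value).
    """
    rng = random.Random(seed)

    def min_p(d):
        return min(trend_p_value(d, column) for column in tumor_table)

    observed = min_p(doses)
    shuffled = list(doses)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign dose labels to animals at random
        if min_p(shuffled) <= observed + 1e-12:
            hits += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)
```

Because every permutation recomputes the minimum over all tumor types, the adjusted p-value reflects how often some tumor type would look this extreme by chance alone, which is the multiplicity issue the abstract raises.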


2014 ◽  
Vol 12 (05) ◽  
pp. 1440001 ◽  
Author(s):  
Malik N. Akhtar ◽  
Bruce R. Southey ◽  
Per E. Andrén ◽  
Jonathan V. Sweedler ◽  
Sandra L. Rodriguez-Zas

Various indicators of observed-theoretical spectrum matches were compared and the resulting statistical significance was characterized using permutation resampling. Novel decoy databases built by resampling the terminal positions of peptide sequences were evaluated to identify the conditions for accurate computation of peptide match significance levels. The methodology was tested on real and manually curated tandem mass spectra from peptides across a wide range of sizes. Spectra match indicators from complementary database search programs were profiled and optimal indicators were identified. The combination of the optimal indicator and permuted decoy databases improved the calculation of the peptide match significance compared to the approaches currently implemented in the database search programs that rely on distributional assumptions. Permutation tests using p-values obtained from software-dependent matching scores and E-values outperformed permutation tests using all other indicators. The higher overlap in matches between the database search programs when using end permutation compared to existing approaches confirmed the superiority of the end permutation method to identify peptides. The combination of effective match indicators and the end permutation method is recommended for accurate detection of peptides.


Author(s):  
N.J. Tao ◽  
J.A. DeRose ◽  
P.I. Oden ◽  
S.M. Lindsay

Clemmer and Beebe have pointed out that surface structures on graphite substrates can be misinterpreted as biopolymer images in STM experiments. We have been using electrochemical methods to react DNA fragments onto gold electrodes for STM and AFM imaging. The adsorbates produced in this way are only homogeneous in special circumstances. Searching an inhomogeneous substrate for ‘desired’ images limits the value of the data. Here, we report on a reversible method for imaging adsorbates. The molecules can be lifted onto and off the substrate during imaging. This leaves no doubt about the validity or statistical significance of the images. Furthermore, environmental effects (such as changes in electrolyte or surface charge) can be investigated easily.


2003 ◽  
Vol 73 (6) ◽  
pp. 439-445 ◽  
Author(s):  
Navia ◽  
Ortega ◽  
Requejo ◽  
Perea ◽  
López-Sobaler ◽  
...  

A study was conducted on the influence of maternal education level on food consumption, energy and nutrient intake, and dietary adequacy in 110 pre-school children from Madrid, Spain. With increasing maternal education, children consumed more sugar (p < 0.05), fruit (p < 0.05), and fish (p < 0.05). Snacking was more frequent with decreasing maternal education (p < 0.05). Though statistical significance was not reached, the consumption of pre-cooked foods was greater among children of mothers educated to a higher level, a phenomenon probably related to the work situation of these women. With respect to dietary composition, no significant differences were found between groups for macronutrient, fiber and energy intakes, except for energy supplied by polyunsaturated fatty acids (PUFA), which was greater in the children of less educated women (p < 0.01). This is probably due to their greater consumption of sunflower seed oil. The diets of children of well-educated mothers came closer to meeting the recommended intakes for folate, vitamin C, and iodine. It would seem that maternal educational level influences the food habits of children. Mothers with less education may require special advice in this area.

