A novel algorithm to flag columns associated in any way with others or a dependent variable is computationally tractable in large data matrices and has much higher power when columns are linked like mutations in chromosomes

Mapping Intimacies ◽

10.1101/2021.09.15.460360 ◽

2021 ◽

Author(s):

Marcos A. Antezana

Keyword(s):

Model Selection ◽

Type I Error ◽

Computational Cost ◽

Large Data ◽

Data Matrix ◽

P Value ◽

Type I ◽

P Values ◽

Multiple Tests ◽

Order Of Magnitude

Abstract. When a data matrix DM has many independent variables IVs, it is not computationally tractable to assess the association of every distinct IV subset with the dependent variable DV of the DM, because the number of subsets explodes combinatorially as IVs increase. But model selection and correcting for multiple tests are complex even with few IVs. DMs in genomics will soon summarize millions of markers (mutations) and genomes. Searching exhaustively in such DMs for mutations that alone or synergistically with others are associated with a trait is computationally tractable only for 1- and 2-mutation effects. This is also why population geneticists study mainly 2-marker combinations. I present a computationally tractable, fully parallelizable Participation in Association Score (PAS) that, in a DM with markers, detects one by one every column that is strongly associated in any way with others. PAS does not examine column subsets and its computational cost grows linearly with the number of columns, remaining reasonable even when DMs have millions of columns. PAS P values are readily obtained by permutation and accurately Sidak-corrected for multiple tests, bypassing model selection. The P values of a column’s PASs and dvPASs for different orders of association are i.i.d. and easily turned into a single P value. PAS exploits how associations of markers in the rows of a DM cause associations of matches in the pairwise comparisons of the rows. For every such comparison with a match at a tested column, PAS computes the matches at other columns by modifying the comparison’s total matches (scored once per DM), yielding a distribution of conditional matches that reacts diagnostically to the associations of the tested column.
Equally computationally tractable is dvPAS, which flags DV-associated IVs by also probing the matches at the DV. Simulations show that PAS and dvPAS i) generate uniform-(0,1)-distributed type I error in null DMs and ii) detect randomly encountered binary and trinary models of significant n-column association and n-IV association to a binary DV, respectively, with power of the same order of magnitude as that of exhaustive evaluation and with false positives that are uniform-(0,1)-distributed or straightforwardly tuned to be so. Power to detect 2-way associations that extend over 100+ columns is non-parametrically ultimate, but the power to detect pure n-column associations and pure n-IV DV associations sinks exponentially with increasing n. Important for geneticists, dvPAS power increases about twofold in trinary vs. binary DMs and by orders of magnitude with markers linked like mutations in chromosomes, especially in trinary DMs, where furthermore dvPAS fine-maps with highest resolution.
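The permutation and Sidak steps named in the abstract can be sketched generically. The sketch below is not the PAS statistic itself: `match_count` is a hypothetical stand-in statistic, and the function names, the number of permutations, and the seed are assumptions made for illustration only.

```python
import random


def match_count(x, y):
    """Hypothetical stand-in statistic: number of positions where x and y agree."""
    return sum(a == b for a, b in zip(x, y))


def permutation_p_value(x, y, statistic, n_perm=999, seed=0):
    """One-sided permutation P value: share of shuffles of y whose statistic
    is at least as large as the observed one (observed sample included)."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks any association between x and y
        if statistic(x, y_perm) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement, so the P value is never 0.
    return (hits + 1) / (n_perm + 1)


def sidak_correct(p, n_tests):
    """Sidak-corrected P value for n_tests independent tests."""
    return 1.0 - (1.0 - p) ** n_tests


x = [0, 1] * 10
p = permutation_p_value(x, x, match_count)
print(p, sidak_correct(p, 100))
```

With m tested columns, each column's permutation P value would be passed through `sidak_correct(p, m)`. The smallest attainable permutation P value is 1/(n_perm + 1), so the number of permutations has to grow with m for a corrected value to be able to reach significance.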

The Conundrum of P-Values: Statistical Significance is Unavoidable but Need Medical Significance Too

Journal of Biostatistics and Epidemiology ◽

10.18502/jbe.v5i4.3862 ◽

2020 ◽

Author(s):

Abhaya Indrayan

Keyword(s):

Type I Error ◽

Dominant Role ◽

Statistical Significance ◽

Empirical Studies ◽

P Value ◽

Selective Reporting ◽

Type I ◽

Practical Application ◽

P Values ◽

Zero Effect

Background: Small P-values have conventionally been considered evidence to reject a null hypothesis in empirical studies. However, P-values are now widely criticized and the threshold used for statistical significance is questioned. Methods: This communication takes a contrarian view and explains why the P-value and its threshold are still useful for ruling out sampling fluctuation as a source of the findings. Results: The problem is not with P-values themselves but with their misuse, abuse, and over-use, including the dominant role they have assumed in empirical results. False results may arise mostly from errors in design, invalid data, inadequate analysis, inappropriate interpretation, accumulation of Type-I error, and selective reporting, and not from P-values per se. Conclusion: A threshold for P-values such as 0.05 for statistical significance is helpful in making a binary inference for practical application of the result. However, a lower threshold can be suggested to reduce the chance of false results. Also, the emphasis should be on detecting a medically significant effect and not zero effect.

Inconsistencies in Reported p-Values in Spanish Journals of Psychology

Methodology ◽

10.1027/1614-2241/a000107 ◽

2016 ◽

Vol 12 (2) ◽

pp. 44-51 ◽

Cited By ~ 1

Author(s):

José Manuel Caperos ◽

Ricardo Olmos ◽

Antonio Pardo

Keyword(s):

Type I Error ◽

Meta Analysis ◽

P Value ◽

Type I ◽

Simultaneous Inference ◽

Test Statistics ◽

Type I Errors ◽

P Values ◽

Editorial Boards ◽

Correlation Tests

Abstract. Correlation analysis is one of the most widely used methods to test hypotheses in social and health sciences; however, its use is not completely error free. We have explored the frequency of inconsistencies between reported p-values and the associated test statistics in 186 papers published in four Spanish journals of psychology (1,950 correlation tests); we have also collected information about the use of one- versus two-tailed tests in the presence of directional hypotheses, and about the use of some kind of adjustment to control Type I errors due to simultaneous inference. Of the reported correlation tests, 83.8% are incomplete and 92.5% include an inexact p-value. Gross inconsistencies, which are liable to alter the statistical conclusions, appear in 4% of the reviewed tests, and 26.9% of the inconsistencies found were large enough to bias the results of a meta-analysis. The use of one-tailed tests and of adjustments to control the Type I error rate is negligible. We therefore urge authors, reviewers, and editorial boards to pay particular attention to this in order to prevent inconsistencies in statistical reports.
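Checking a reported p-value against its test statistic, as the study describes, amounts to recomputing the p-value from r and n. The sketch below uses the Fisher z approximation rather than the exact t-based test, so that it needs only the standard library. The function names, the tolerance, and the rule for calling an inconsistency "gross" (a change of conclusion at α = 0.05) are assumptions for illustration, not the authors' procedure.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-tailed p-value for a Pearson r from n pairs (Fisher z)."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r) * math.sqrt(n - 3)  # approximately standard normal under H0
    return math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal tail area


def classify_report(r, n, reported_p, alpha=0.05, tolerance=0.005):
    """Compare a reported p-value with the recomputed one (assumed rules)."""
    recomputed = correlation_p_value(r, n)
    if (reported_p < alpha) != (recomputed < alpha):
        return "gross"  # the significance decision would change
    if abs(reported_p - recomputed) > tolerance:
        return "inconsistent"
    return "consistent"


print(classify_report(r=0.5, n=30, reported_p=0.20))
```

Because the normal approximation differs slightly from the exact t distribution, a borderline result flagged by this sketch would need to be rechecked with the exact test before being counted as an error.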

Probability waves: pattern-based p-value correction in mass univariate analysis between two event-related potential waves

10.1101/2019.12.12.873570 ◽

2019 ◽

Author(s):

Dimitri Marques Abramov

Keyword(s):

Type I Error ◽

Univariate Analysis ◽

Event Related Potential ◽

New Method ◽

Multiple Point ◽

P Value ◽

Type I ◽

P Values ◽

Two Samples ◽

Point To Point

Abstract. Background: Methods for p-value correction are criticized for either increasing Type II error or improperly reducing Type I error. This problem is worse when dealing with hundreds or thousands of paired comparisons between waves or images which are performed point-to-point. This text considers patterns in probability vectors resulting from multiple point-to-point comparisons between two ERP waves (mass univariate analysis) to correct p-values. These patterns (probability waves) mirror ERP waveshapes and might be indicators of consistency in statistical differences. New method: In order to compute and analyze these patterns, we convolved the decimal logarithm of the probability vector (p’) with a Gaussian vector of a size compatible with the ERP periods observed. To verify the consistency of this method, we also calculated mean amplitudes of late ERPs from Pz (P300 wave) and O1 electrodes in two samples, of typical and ADHD subjects respectively. Results: The present method reduces the range of p’-values that did not show covariance with neighbors (that is, that are likely random differences, type I errors), while preserving the amplitude of probability waves, in accordance with the difference between the respective mean amplitudes. Comparison with existing methods: The positive-FDR resulted in a different profile of corrected p-values, which is not consistent with expected results or differences between mean amplitudes of the analyzed ERPs. Conclusion: The present new method seems to be biologically and statistically more suitable to correct p-values in mass univariate analysis of ERP waves.
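The smoothing step described under "New method" can be sketched as a Gaussian-weighted moving average of log10(p). The abstract only says that the kernel size was chosen to be compatible with the ERP periods, so the half-width, the sigma, and the edge handling (truncating and renormalizing the kernel at the ends of the vector) are assumptions of this sketch, as are the function names.

```python
import math


def gaussian_kernel(half_width, sigma):
    """Gaussian weights over 2 * half_width + 1 points, normalized to sum to 1."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_log_p(p_values, half_width=5, sigma=2.0):
    """Convolve log10(p) with a Gaussian kernel, returning a same-length vector."""
    log_p = [math.log10(p) for p in p_values]
    kernel = gaussian_kernel(half_width, sigma)
    n = len(log_p)
    smoothed = []
    for i in range(n):
        acc = 0.0
        weight_sum = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half_width
            if 0 <= j < n:  # ignore kernel points that fall outside the vector
                acc += w * log_p[j]
                weight_sum += w
        smoothed.append(acc / weight_sum)  # renormalize near the edges
    return smoothed


# An isolated small p-value among non-significant neighbors is attenuated.
p_vector = [1.0] * 10 + [1e-6] + [1.0] * 10
print(round(smooth_log_p(p_vector)[10], 2))
```

A run of consecutive small p-values keeps most of its depth after smoothing, while a lone small p-value surrounded by large ones is pulled toward zero, which is the behavior the abstract attributes to the method.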

Chi-square and F Ratio: Which should be used when?

Journal of Methods and Measurement in the Social Sciences ◽

10.2458/v8i2.22990 ◽

2018 ◽

Vol 8 (2) ◽

pp. 58-71

Author(s):

Richard L. Gorsuch ◽

Curtis Lehmann

Keyword(s):

Count Data ◽

Type I Error ◽

Statistical Significance ◽

Error Rates ◽

P Value ◽

Type I ◽

Chi Square ◽

P Values ◽

Type I Error Rates ◽

Criterion Variables

Approximations for Chi-square and F distributions can both be computed to provide a p-value, or probability of Type I error, to evaluate statistical significance. Although Chi-square has been used traditionally for tests of count data and nominal or categorical criterion variables (such as contingency tables) and F ratios for tests of non-nominal or continuous criterion variables (such as regression and analysis of variance), we demonstrate that either statistic can be applied in both situations. We used data simulation studies to examine when one statistic may be more accurate than the other for estimating Type I error rates across different types of analysis (count data/contingencies, dichotomous, and non-nominal) and across sample sizes (Ns) ranging from 20 to 160 (using 25,000 replications for simulating p-values derived from either Chi-squares or F ratios). Our results showed that p-values derived from F ratios were generally closer to nominal Type I error rates than those derived from Chi-squares. The p-values derived from F ratios were more consistent for contingency table count data than those derived from Chi-squares. The further the N fell below 100, the more the p-values derived from Chi-squares departed from the nominal p-value. Only when the N was greater than 80 did the p-values from Chi-square tests become as accurate as those derived from F ratios in reproducing the nominal p-values. Thus, there was no evidence of any need for special treatment of dichotomous dependent variables. The most accurate and/or consistent p-values were derived from F ratios. We conclude that Chi-square should be replaced generally with the F ratio as the statistic of choice and that the Chi-square test should only be taught as history.

Probabilistic thresholding of functional connectomes: application to schizophrenia

10.1101/233510 ◽

2017 ◽

Author(s):

František Váša ◽

Edward T. Bullmore ◽

Ameera X. Patel

Keyword(s):

Type I Error ◽

Resting State Fmri ◽

Connected Components ◽

Edge Density ◽

P Value ◽

Type I ◽

Healthy Controls ◽

True Positive ◽

Sparse Graphs ◽

Cross Correlations

Abstract. Functional connectomes are commonly analysed as sparse graphs, constructed by thresholding cross-correlations between regional neurophysiological signals. Thresholding generally retains the strongest edges (correlations), either by retaining edges surpassing a given absolute weight, or by constraining the edge density. The latter (more widely used) method risks inclusion of false positive edges at high edge densities and exclusion of true positive edges at low edge densities. Here we apply new wavelet-based methods, which enable construction of probabilistically-thresholded graphs controlled for type I error, to a dataset of resting-state fMRI scans of 56 patients with schizophrenia and 71 healthy controls. By thresholding connectomes to a fixed edge-specific P value, we found that functional connectomes of patients with schizophrenia were more dysconnected than those of healthy controls, exhibiting a lower edge density and a higher number of (dis)connected components. Furthermore, many participants’ connectomes could not be built up to the fixed edge densities commonly studied in the literature (~5-30%), while controlling for type I error. Additionally, we showed that the topological randomisation previously reported in the schizophrenia literature is likely attributable to “non-significant” edges added when thresholding connectomes to fixed density based on correlation. Finally, by explicitly comparing connectomes thresholded by increasing P value and decreasing correlation, we showed that probabilistically thresholded connectomes show decreased randomness and increased consistency across participants. Our results have implications for future analysis of functional connectivity using graph theory, especially within datasets exhibiting heterogeneous distributions of edge weights (correlations), between groups or across participants.

Multiplicity Eludes Peer Review: The Case of COVID-19 Research

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18179304 ◽

2021 ◽

Vol 18 (17) ◽

pp. 9304

Author(s):

Oliver Gutiérrez-Hernández ◽

Luis Ventura García

Keyword(s):

Peer Review ◽

Observational Studies ◽

Social Impact ◽

Type I Error ◽

Peer Review Process ◽

Type I ◽

Strongly Correlated ◽

P Values ◽

False Discoveries ◽

Spurious Results

Multiplicity arises when data analysis involves multiple simultaneous inferences, increasing the chance of spurious findings. It is a widespread problem frequently ignored by researchers. In this paper, we perform an exploratory analysis of the Web of Science database for COVID-19 observational studies. We examined 100 top-cited COVID-19 peer-reviewed articles based on p-values, including up to 7100 simultaneous tests, with 50% including > 34 tests, and 20% > 100 tests. We found that the larger the number of tests performed, the larger the number of significant results (r = 0.87, p < 10⁻⁶). The number of p-values in the abstracts was not related to the number of p-values in the papers. However, the highly significant results (p < 0.001) in the abstracts were strongly correlated (r = 0.61, p < 10⁻⁶) with the number of p < 0.001 significances in the papers. Furthermore, the abstracts included a higher proportion of significant results (0.91 vs. 0.50), and 80% reported only significant results. Only one reviewed paper addressed multiplicity-induced type I error inflation, pointing to potentially spurious results bypassing the peer-review process. We conclude that special attention must be paid to the increased chance of false discoveries in observational studies, including non-replicated striking discoveries with a potentially large social impact. We propose some easy-to-implement measures to assess and limit the effects of multiplicity.
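How quickly the chance of a spurious finding grows with the number of tests can be illustrated with a short calculation. The sketch assumes that every null hypothesis is true and that the tests are independent, which real studies rarely satisfy exactly; the function names are hypothetical, and the test counts are the ones quoted in the abstract.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests independent
    tests of true null hypotheses, each run at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of false positives when all nulls are true.
    By linearity of expectation this holds with or without independence."""
    return alpha * n_tests


# Test counts taken from the abstract: the median, the 80th percentile,
# and the maximum number of simultaneous tests per article.
for m in (1, 34, 100, 7100):
    print(m, round(familywise_error_rate(0.05, m), 3),
          round(expected_false_positives(0.05, m), 1))
```

Under these assumptions, an article at the reported median of 34 tests already has a better-than-even chance of containing at least one false positive at α = 0.05.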

An Evaluation of Four Solutions to the Forking Paths Problem: Adjusted Alpha, Preregistration, Sensitivity Analyses, and Abandoning the Neyman-Pearson Approach

Review of General Psychology ◽

10.1037/gpr0000135 ◽

2017 ◽

Vol 21 (4) ◽

pp. 321-329 ◽

Cited By ~ 9

Author(s):

Mark Rubin

Keyword(s):

Hypothesis Testing ◽

Present Article ◽

Type I Error ◽

Statistical Analyses ◽

Nonlinear Transformation ◽

Sensitivity Analyses ◽

Type I ◽

Alternative Analysis ◽

Multiple Tests ◽

Alpha Level

Gelman and Loken (2013, 2014) proposed that when researchers base their statistical analyses on the idiosyncratic characteristics of a specific sample (e.g., a nonlinear transformation of a variable because it is skewed), they open up alternative analysis paths in potential replications of their study that are based on different samples (i.e., no transformation of the variable because it is not skewed). These alternative analysis paths count as additional (multiple) tests and, consequently, they increase the probability of making a Type I error during hypothesis testing. The present article considers this forking paths problem and evaluates four potential solutions that might be used in psychology and other fields: (a) adjusting the prespecified alpha level, (b) preregistration, (c) sensitivity analyses, and (d) abandoning the Neyman-Pearson approach. It is concluded that although preregistration and sensitivity analyses are effective solutions to p-hacking, they are ineffective against result-neutral forking paths, such as those caused by transforming data. Conversely, although adjusting the alpha level cannot address p-hacking, it can be effective for result-neutral forking paths. Finally, abandoning the Neyman-Pearson approach represents a further solution to the forking paths problem.

A novel gene-set association test based on variance-gamma distribution

Statistical Methods in Medical Research ◽

10.1177/0962280218791205 ◽

2018 ◽

Vol 28 (9) ◽

pp. 2868-2875

Author(s):

Zhongxue Chen ◽

Qingzhong Liu ◽

Kai Wang

Keyword(s):

Gamma Distribution ◽

Type I Error ◽

Null Distribution ◽

Real Data ◽

Association Test ◽

P Value ◽

Type I ◽

Test Statistic ◽

Data Set ◽

Variance Gamma

Several gene- or set-based association tests have been proposed recently in the literature. Powerful statistical approaches are still highly desirable in this area. In this paper we propose a novel statistical association test, which uses information of the burden component and its complement from the genotypes. This new test statistic has a simple null distribution, which is a special and simplified variance-gamma distribution, and its p-value can be easily calculated. Through a comprehensive simulation study, we show that the new test can control type I error rate and has superior detecting power compared with some popular existing methods. We also apply the new approach to a real data set; the results demonstrate that this test is promising.

When Studies are in Error: Basic Statistical Vocabulary Needed to Understand Clinical Studies

Journal of Cutaneous Medicine and Surgery ◽

10.1177/120347549600100108 ◽

1996 ◽

Vol 1 (1) ◽

pp. 25-28 ◽

Cited By ~ 1

Author(s):

Martin A. Weinstock

Keyword(s):

Null Hypothesis ◽

Statistical Power ◽

Critical Appraisal ◽

Type I Error ◽

Statistical Significance ◽

P Value ◽

Type I ◽

Type Ii ◽

Type Ii Error ◽

Error Type

Background: Accurate understanding of certain basic statistical terms and principles is key to critical appraisal of published literature. Objective: This review describes type I error, type II error, null hypothesis, p value, statistical significance, α, two-tailed and one-tailed tests, effect size, alternate hypothesis, statistical power, β, publication bias, confidence interval, standard error, and standard deviation, while including examples from reports of dermatologic studies. Conclusion: The application of the results of published studies to individual patients should be informed by an understanding of certain basic statistical concepts.

Testing equality of means in partially paired data with incompleteness in single response

Statistical Methods in Medical Research ◽

10.1177/0962280218765007 ◽

2018 ◽

Vol 28 (5) ◽

pp. 1508-1522 ◽

Cited By ~ 1

Author(s):

Qianya Qi ◽

Li Yan ◽

Lili Tian

Keyword(s):

Type I Error ◽

Real Data ◽

The Cancer Genome Atlas ◽

P Value ◽

Type I ◽

Paired Data ◽

Data Set ◽

Equality Of Means ◽

Breast Cancer Study ◽

Single Response

In testing for differentially expressed genes between tumor and healthy tissues, data are usually collected in paired form. However, incomplete paired data often occur. While extensive statistical research exists for paired data with incompleteness in both arms, hardly any recent work can be found on paired data with incompleteness in a single arm. This paper aims to fill this gap by proposing some new methods, namely, P-value pooling methods and a nonparametric combination test. Simulation studies are conducted to investigate the performance of the proposed methods in terms of type I error and power at small to moderate sample sizes. A real data set from The Cancer Genome Atlas (TCGA) breast cancer study is analyzed using the proposed methods.
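The abstract does not say which P-value pooling methods are proposed. As a generic illustration of the idea, the sketch below uses Fisher's classic combination method, which assumes independent P values; the function name is hypothetical and the choice of Fisher's method is an assumption, not the authors' procedure. For partially paired data, the inputs could be, for example, one P value from the complete pairs and one from the unpaired observations.

```python
import math


def fisher_pooled_p(p_values):
    """Pool independent P values with Fisher's method.

    Under the null, X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom. For even degrees of freedom the upper tail
    has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    term = 1.0   # (X/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


print(fisher_pooled_p([0.05, 0.05]))
```

Two P values of 0.05 pool to a value below 0.05, since agreement between independent sources of evidence strengthens the combined result, while a single P value is returned unchanged.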
