Mind your gaps: Overlooking assembly gaps confounds statistical testing in genome analysis

2018 ◽  
Author(s):  
Diana Domanska ◽  
Chakravarthi Kanduri ◽  
Boris Simovski ◽  
Geir Kjetil Sandve

Abstract: Background: The difficulties associated with sequencing and assembling some regions of the DNA sequence result in gaps in reference genomes, typically represented as stretches of Ns. Although the presence of assembly gaps causes a slight reduction in the mapping rate in many experimental settings, this does not invalidate typical statistical tests comparing read count distributions across experimental conditions. However, we hypothesize that failing to handle assembly gaps in the null model may confound statistical testing of the co-localization of genomic features. Results: First, we performed a series of explorative analyses to understand whether and how public genomic tracks intersect the assembly gaps track (hg19). The findings confirm that the genomic regions in public genomic tracks intersect assembly gaps very little, and that the intersections observed occur only at the beginning and end of the assembly gaps rather than spanning whole gaps. Further, we simulated a set of query and reference genomic tracks in a way that nullified any dependence between them, to test our hypothesis that not avoiding assembly gaps in the null model would spuriously inflate statistical significance. We then contrasted the distributions of test statistics and p-values of Monte Carlo simulation-based permutation tests that either avoided or did not avoid assembly gaps in the null model when testing for significant co-localization between a pair of query and reference tracks. We observed that tests that did not account for assembly gaps in the null model produced a distribution of the test statistic that is shifted to the right and a distribution of p-values that is shifted to the left (leading to inflated significance). Conclusion: Our results show that not accounting for assembly gaps in statistical testing of co-localization analyses may lead to false positives and over-optimistic findings.
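The contrast between a gap-aware and a gap-naive null model can be sketched as a toy Monte Carlo permutation test. This is a minimal illustration with a hypothetical one-gap "genome" and made-up track sizes, not the authors' implementation:

```python
import random

# Hypothetical 1D genome with a single assembly gap (half-open intervals).
GENOME_LEN = 10_000
GAPS = [(4_000, 6_000)]

def in_gap(start, length):
    return any(start < g_end and start + length > g_start
               for g_start, g_end in GAPS)

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def overlap_count(query, reference):
    return sum(1 for q in query for r in reference if overlaps(q, r))

def place(length, avoid_gaps):
    """Draw a random segment, optionally rejecting placements inside gaps."""
    while True:
        start = random.randrange(GENOME_LEN - length)
        if not avoid_gaps or not in_gap(start, length):
            return (start, start + length)

def permutation_p(query, reference, avoid_gaps, n_perm=1000):
    """Fraction of null placements with at least the observed overlap count."""
    observed = overlap_count(query, reference)
    hits = 0
    for _ in range(n_perm):
        null_query = [place(q[1] - q[0], avoid_gaps) for q in query]
        if overlap_count(null_query, reference) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one correction keeps p > 0

random.seed(0)
# Real tracks avoid the gap, mirroring what the abstract reports for
# public genomic tracks.
reference = [place(200, avoid_gaps=True) for _ in range(20)]
query = [place(200, avoid_gaps=True) for _ in range(20)]

p_aware = permutation_p(query, reference, avoid_gaps=True)
p_naive = permutation_p(query, reference, avoid_gaps=False)
# The gap-naive null scatters segments into the gap, where they can never
# hit the reference, deflating null overlap counts and tending to shrink p.
```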

2018 ◽  
Vol 16 (2) ◽  
pp. 37
Author(s):  
Peter Mitchell

Statistical appreciation is knowledge about statistical tests: how they are chosen, how they are carried out, and how their results are interpreted, without performing the calculation of the test statistic. It was taught as part of quantitative methods to students taking part-time degrees when there was insufficient time to include training on statistical computer packages. Details of the content, teaching methods, and assessment are given, with emphasis on the correct understanding of p-values and their interpretation as statistical significance. Given that many more people need to understand the results and interpretation of statistical tests than to perform the calculations, statistical appreciation is of general value, especially to research supervisors. It also provides a firm base for further learning and training in statistics.


2020 ◽  
Vol 4 (2) ◽  
Author(s):  
Colin B Begg

Abstract Recently, a controversy has erupted regarding the use of statistical significance tests and the associated P values. Prominent academic statisticians have recommended that the use of statistical tests be discouraged or not used at all. This has naturally led to a lot of confusion among research investigators about the support in the academic statistical community for statistical methods in general. In fact, the controversy surrounding the use of P values has a long history. Critics of P values argue that their use encourages bad scientific practice, leading to the publication of far more false-positive and false-negative findings than the methodology would imply. The thesis of this commentary is that the problem is really human nature, the natural proclivity of scientists to believe their own theories and present data in the most favorable light. This is strongly encouraged by a celebrity culture that is fueled by academic institutions, the scientific journals, and the media. The importance of the truth-seeking tradition of the scientific method needs to be reinforced, and this is being helped by current initiatives to improve transparency in science and to encourage reproducible and replicable research. Statistical testing, used correctly, has an important and valuable place in the scientific tradition.


2019 ◽  
Vol 3 (3) ◽  
pp. 827-847 ◽  
Author(s):  
Leonardo Novelli ◽  
Patricia Wollstadt ◽  
Pedro Mediano ◽  
Michael Wibral ◽  
Joseph T. Lizier

Network inference algorithms are valuable tools for the study of large-scale neuroimaging datasets. Multivariate transfer entropy is well suited for this task, being a model-free measure that captures nonlinear and lagged dependencies between time series to infer a minimal directed network model. Greedy algorithms have been proposed to efficiently deal with high-dimensional datasets while avoiding redundant inferences and capturing synergistic effects. However, multiple statistical comparisons may inflate the false positive rate and are computationally demanding, which limited the size of previous validation studies. The algorithm we present—as implemented in the IDTxl open-source software—addresses these challenges by employing hierarchical statistical tests to control the family-wise error rate and to allow for efficient parallelization. The method was validated on synthetic datasets involving random networks of increasing size (up to 100 nodes), for both linear and nonlinear dynamics. The performance increased with the length of the time series, reaching consistently high precision, recall, and specificity (>98% on average) for 10,000 time samples. Varying the statistical significance threshold showed a more favorable precision-recall trade-off for longer time series. Both the network size and the sample size are one order of magnitude larger than previously demonstrated, showing feasibility for typical EEG and magnetoencephalography experiments.
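IDTxl's hierarchical tests are more involved than can be shown briefly; as a minimal illustration of the underlying idea of family-wise error control across many comparisons, here is a standard Holm step-down adjustment (a stand-in technique, not IDTxl's procedure):

```python
def holm_adjust(p_values):
    """Holm step-down p-value adjustment, controlling the FWER."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, adj)   # enforce monotonicity
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values from several candidate-source tests.
raw = [0.001, 0.01, 0.03, 0.04, 0.20]
adj = holm_adjust(raw)
```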


Econometrics ◽  
2019 ◽  
Vol 7 (2) ◽  
pp. 18 ◽  
Author(s):  
Thomas R. Dyckman ◽  
Stephen A. Zeff

A great deal of the accounting research published in recent years has involved statistical tests. Our paper proposes improvements to both the quality and execution of such research. We address the following limitations in current research that appear to us to be ignored or applied inappropriately: (1) unaddressed situational effects resulting from model limitations and what has been referred to as “data carpentry,” (2) the limitations of, and alternatives to, winsorizing, (3) the need to weigh the economic or behavioral importance of results rather than relying solely on a study’s calculated “p-values,” and (4) the information loss incurred by under-valuing what can and cannot be learned from replications.
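For reference on point (2), a minimal winsorizing routine might look like the following; this uses one common convention (clamping to the nearest retained order statistic), and packages differ in how they pick cut points:

```python
def winsorize(values, fraction):
    """Clamp the lowest and highest `fraction` of observations to the
    nearest retained order statistic, limiting outlier influence."""
    n = len(values)
    k = int(n * fraction)                     # observations clamped per tail
    ordered = sorted(values)
    lo, hi = ordered[k], ordered[n - k - 1]   # cut points
    return [min(max(v, lo), hi) for v in values]

# The extreme value 100 is pulled in to the 90th-percentile cut point.
winsorize([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1)
```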


Author(s):  
Andreas Buja ◽  
Dianne Cook ◽  
Heike Hofmann ◽  
Michael Lawrence ◽  
Eun-Kyung Lee ◽  
...  

We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. Statistical significance of ‘discoveries’ is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled after the ‘lineup’ popular from criminal legal procedures. Another protocol modelled after the ‘Rorschach’ inkblot test, well known from (pop-)psychology, will help analysts acclimatize to random variability before being exposed to the plot of the real data. The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent. The framework is also useful for model diagnostics in which case reference datasets are simulated from the model in question. This latter point follows up on previous proposals. Adopting the protocols will mean an adjustment in working procedures for data analysts, adding more rigour, and teachers might find that incorporating these protocols into the curriculum improves their students’ statistical thinking.
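The lineup protocol can be sketched in code. The function names and the permutation null below are illustrative assumptions, not from the paper; the key property is that a viewer who singles out the real data among m panels achieves p = 1/m under the null:

```python
import random

def make_lineup(real_data, null_generator, m=20, seed=None):
    """Hide the real dataset among m-1 null datasets.
    Returns (panels, position_of_real). If an uninformed viewer picks
    the real panel, the chance under the null is 1/m (p = 0.05 for m=20)."""
    rng = random.Random(seed)
    panels = [null_generator(rng) for _ in range(m - 1)]
    pos = rng.randrange(m)
    panels.insert(pos, real_data)
    return panels, pos

def permuted(rng, x=(1, 2, 3, 4, 5), y=(2, 4, 6, 8, 10)):
    """Null generator: permute y against x, destroying any association."""
    y = list(y)
    rng.shuffle(y)
    return list(zip(x, y))

real = list(zip((1, 2, 3, 4, 5), (2, 4, 6, 8, 10)))
panels, pos = make_lineup(real, permuted, seed=1)
```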


Author(s):  
Zafar Iqbal ◽  
Lubna Waheed ◽  
Waheed Muhammad ◽  
Rajab Muhammad

Purpose: Quality Function Deployment (QFD) is a methodology that helps satisfy customer requirements through the selection of appropriate Technical Attributes (TAs). The rationale of this article is to provide a method lending statistical support to the selection of TAs. The purpose is to determine the statistical significance of TAs through the derivation of associated significance (p) values. Design/Methodology/Approach: We demonstrate our methodology with reference to an original QFD case study aimed at improving the educational system in high schools in Pakistan, and then with five further published case studies obtained from the literature. Mean weights of TAs are determined. Considering each TA mean weight to be a test statistic, a weighted matrix is generated from the Voice of Customer (VOC) importance ratings and the ratings in the relationship matrix. Finally, using R, p-values for the means of the original TAs are determined from the hypothetical population of TA means. Findings: Each TA's p-value evaluates its significance or insignificance in terms of distance from the grand mean. The p-values indirectly set the prioritization of the TAs. Implications/Originality/Value: The novel aspect of this study is the extension of the mean weights of TAs to also provide p-values for the TAs. TAs of significant importance can be addressed on a priority basis, while the others can be handled as appropriate.
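One plausible reading of the weighting step, with hypothetical ratings and a normal approximation standing in for the paper's R-based derivation of p-values from a population of TA means:

```python
from statistics import NormalDist, mean, stdev

# Hypothetical inputs, not from the case studies: VOC importance ratings
# and a VOC-by-TA relationship matrix (9/3/1/0 strengths, as is
# conventional in QFD).
importance = [5, 3, 4]
relationship = [
    [9, 3, 0, 1],
    [3, 9, 1, 0],
    [1, 3, 9, 3],
]

# Importance-weighted mean weight per TA.
ta_weights = [
    mean(imp * row[j] for imp, row in zip(importance, relationship))
    for j in range(len(relationship[0]))
]

# Treat each TA mean weight as a test statistic and gauge its distance
# from the grand mean of all TA weights via a two-sided z-approximation.
grand, spread = mean(ta_weights), stdev(ta_weights)
p_values = [2 * (1 - NormalDist().cdf(abs(w - grand) / spread))
            for w in ta_weights]
```

TAs whose weights sit far from the grand mean receive small p-values and hence higher priority.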


2019 ◽  
Vol 24 (3) ◽  
pp. 209-213
Author(s):  
Şükriye Deniz Mutluay ◽  
Memduha Gülhal Bozkır

Objectives: Estimating stature from long extremity bones, such as the femur and humerus, is commonly used during forensic examinations. The aim of this study is to estimate stature from anthropometric measurements of the second (2D) and fourth (4D) digit lengths of the right and left hands. Method: The sample group consisted of 140 young adults, 70 males and 70 females (aged 19-21 years), whose 2D and 4D lengths were measured on both hands using a digital vernier caliper. Each measurement was taken directly between landmarks, from the proximal metacarpophalangeal crease to the fingertip. SPSS (Version 17.0) was used for descriptive analysis; Student’s t-test was used to analyze the differences in height, 2D, and 4D between males and females. One-way ANOVA was used to determine potential interactions among the anthropometric measurements and stature. Pearson correlation coefficients and the related p-values were also used. Statistical significance was assigned to p-values <0.05. Linear and multiple regression models were also developed. Results: The differences between the right- and left-hand finger length values were statistically significant for both sexes (p<0.001). Overall, the measurements of males were significantly higher than those of females. The correlation coefficients between stature and the measurements of the second and fourth digits were positive and statistically significant. The highest correlation coefficient between stature and digit length was for the right second digit in males (r=0.505) and the left second digit in females (r=0.596). The regression equations were checked for accuracy by comparing estimated and actual stature. Conclusion: Both regression models can...
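The kind of simple linear regression equation such studies report (stature on digit length) can be illustrated in code. The digit lengths, statures, and resulting coefficients below are invented for illustration, not the study's data or published equations:

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit: stature = a + b * digit_length."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

digit = [68, 70, 72, 74, 76]        # hypothetical 2D lengths (mm)
stature = [160, 163, 165, 168, 171] # hypothetical statures (cm)

a, b = fit_line(digit, stature)
estimate = a + b * 73               # predicted stature for a 73 mm digit
```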


2018 ◽  
Vol 13 (7) ◽  
pp. 669-672 ◽  
Author(s):  
Mayank Goyal ◽  
Aravind Ganesh ◽  
Scott Brown ◽  
Bijoy K Menon ◽  
Michael D Hill

The modified Rankin Scale (mRS) at 90 days after stroke onset has become the preferred outcome measure in acute stroke trials, including recent trials of interventional therapies. Reporting the range of mRS scores as a paired horizontal stacked bar graph (colloquially known as “Grotta bars”) has become the conventional method of visualizing mRS results. Grotta bars readily illustrate the levels of the ordinal mRS at which benefit may have occurred. However, complementing the available graphical information with additional features that convey statistical significance may be advantageous. We propose a modification of the horizontal stacked bar graph with illustrative examples. In this suggested modification, the line joining the segments of the bar graph (e.g. mRS 1–2 in the treatment arm to mRS 1–2 in the control arm) is given a color and thickness based on the p-value of the result at that level (in this example, the p-value of mRS 0–1 vs. 2–6): a thick green line for p-values <0.01, thin green for p-values of 0.01 to <0.05, gray for 0.05 to <0.10, thin red for 0.10 to <0.90, and thick red for p-values ≥0.90 or an outcome favoring the control group. Illustrative examples from four recent trials (ESCAPE, SWIFT-PRIME, IST-3, ASTER) demonstrate the range of significant and non-significant effects that can be captured using this proposed method. By formalizing a display of outcomes that includes statistical tests of all possible dichotomizations of the Rankin scale, this approach also encourages pre-specification of such hypotheses. Prespecifying tests of all six dichotomizations of the Rankin scale provides all possible statistical information in an a priori fashion. Since the result of our proposed approach is six distinct dichotomized tests in addition to a primary test, e.g. of the ordinal Rankin shift, it may be prudent to account for multiplicity in testing by using dichotomized p-values only after adjustment, such as by the Bonferroni, Holm, or Hochberg methods. Whether p-values are nominal or adjusted may be left to the discretion of the presenter as long as the presence or absence of adjustment is clearly stated in the statistical methods. Our proposed modification results in a visually intuitive summary of both the size of the effect, represented by the matched bars and their connecting segments, and its statistical relevance.
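The color/thickness coding described above, plus a Bonferroni adjustment over the six dichotomizations, maps directly to a small function. The "thin" thickness for gray lines is an assumption; the abstract does not specify it:

```python
def line_style(p, favors_treatment=True):
    """Color/thickness for the line joining matched mRS segments,
    following the thresholds proposed in the abstract."""
    if not favors_treatment or p >= 0.90:
        return ("red", "thick")
    if p < 0.01:
        return ("green", "thick")
    if p < 0.05:
        return ("green", "thin")
    if p < 0.10:
        return ("gray", "thin")   # thickness assumed; not stated
    return ("red", "thin")

def bonferroni(p, m=6):
    """Bonferroni adjustment over the six mRS dichotomizations."""
    return min(1.0, p * m)
```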


2020 ◽  
Vol 7 (2) ◽  
pp. 150
Author(s):  
Henian Chen ◽  
Yuanyuan Lu ◽  
Nicole Slye

<p class="abstract">Reporting statistical tests for the baseline measures of clinical trials does not make sense, since statistical significance depends on sample size: a large trial can find significance in the same difference that a small trial did not find statistically significant. We use 3 published trials with the same baseline measures to illustrate the relationship between trial sample size and p-value. For trial 1, the sequential organ failure assessment (SOFA) score was 10.4±3.4 vs. 9.6±3.2 (difference=0.8), p=0.01; vasopressors were used in 83.0% vs. 72.6%, p=0.007. Trial 2 has SOFA scores of 11±3 vs. 12±3 (difference=1), p=0.42. Trial 3 has vasopressor use of 73% vs. 83%, p=0.21. Based on trial 2, the supine group has a mean SOFA score of 12 (SD 3), while the prone group has a mean of 11 (SD 3). The p values are 0.29850, 0.09877, 0.01940, 0.00094, 0.00005, and &lt;0.00001 when n (per arm) is 20, 50, 100, 200, 300 and 400, respectively. Based on trial 3, the vasopressor percentages are 73.0% in the supine group vs. 83.0% in the prone group. The p values are 0.4452, 0.2274, 0.0878, 0.0158, 0.0031, and 0.0006 when n (per arm) is 20, 50, 100, 200, 300 and 400, respectively. Small trials yield larger p values than big trials for the same baseline differences. We cannot judge imbalance in baseline measures from these p values alone. There is no statistical basis for advocating tests of baseline differences.</p>
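The trial-3 p-values quoted above are consistent with a pooled two-proportion z-test; assuming that is the test used, they can be reproduced as follows:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p(p1, p2, n_per_arm):
    """Two-sided two-proportion z-test p-value with a pooled standard
    error, for two arms of equal size."""
    pooled = (p1 + p2) / 2
    se = sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = abs(p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(z))

# Same 73% vs. 83% vasopressor difference at growing per-arm sample sizes.
pvals = {n: round(two_proportion_p(0.73, 0.83, n), 4)
         for n in (20, 50, 100, 200, 300, 400)}
# The identical 10-point difference moves from clearly non-significant
# to highly significant purely because n grows.
```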


Author(s):  
Janet L. Peacock ◽  
Philip J. Peacock

Introduction 238 Samples and populations 240 Confidence interval for a mean 242 95% confidence interval for a proportion 244 Tests of statistical significance 246 P values 248 Statistical significance and clinical significance 250 t test for two independent means 252 t test for two independent means: example ...

