Feeling the future: A meta-analysis of 90 experiments on the anomalous anticipation of random future events

F1000Research, 2016, Vol. 4, pp. 1188.
Authors: Daryl Bem, Patrizio E. Tressoldi, Thomas Rabeyron, Michael Duggan

In 2011, one of the authors (DJB) published a report of nine experiments in the Journal of Personality and Social Psychology purporting to demonstrate that an individual’s cognitive and affective responses can be influenced by randomly selected stimulus events that do not occur until after his or her responses have already been made and recorded, a generalized variant of the phenomenon traditionally denoted by the term precognition. To encourage replications, all materials needed to conduct them were made available on request. We here report a meta-analysis of 90 experiments from 33 laboratories in 14 countries which yielded an overall effect greater than 6 sigma, z = 6.40, p = 1.2 × 10⁻¹⁰, with an effect size (Hedges’ g) of 0.09. A Bayesian analysis yielded a Bayes Factor of 5.1 × 10⁹, greatly exceeding the criterion value of 100 for “decisive evidence” in support of the experimental hypothesis. When DJB’s original experiments are excluded, the combined effect size for replications by independent investigators is 0.06, z = 4.16, p = 1.1 × 10⁻⁵, and the BF value is 3,853, again exceeding the criterion for “decisive evidence.” The number of potentially unretrieved experiments required to reduce the overall effect size of the complete database to a trivial value of 0.01 is 544, and seven of eight additional statistical tests support the conclusion that the database is not significantly compromised by either selection bias or by intense “p-hacking”: the selective suppression of findings or analyses that failed to yield statistical significance. P-curve analysis, a recently introduced statistical technique, estimates the true effect size of the experiments to be 0.20 for the complete database and 0.24 for the independent replications, virtually identical to the effect size of DJB’s original experiments (0.22) and the closely related “presentiment” experiments (0.21). We discuss the controversial status of precognition and other anomalous effects collectively known as psi.
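
As a rough illustration of the pooling behind figures like z = 6.40 and g = 0.09 above, here is a minimal sketch of inverse-variance (fixed-effect) meta-analysis in Python; the per-study g and standard-error values are invented placeholders, not the Bem database, and the paper's own combination procedure may differ in detail.

```python
# Minimal sketch of inverse-variance (fixed-effect) pooling: combine
# per-study Hedges' g values into an overall effect and z statistic.
# The g and se arrays are hypothetical placeholders.
import numpy as np
from scipy import stats

g  = np.array([0.25, 0.10, -0.05, 0.15, 0.08])  # per-study Hedges' g (invented)
se = np.array([0.12, 0.09,  0.11, 0.10, 0.08])  # per-study standard errors (invented)

w      = 1.0 / se**2                 # inverse-variance weights
g_bar  = np.sum(w * g) / np.sum(w)   # pooled effect size
se_bar = np.sqrt(1.0 / np.sum(w))    # standard error of the pooled effect
z      = g_bar / se_bar
p      = 2 * stats.norm.sf(abs(z))   # two-sided p-value

print(f"pooled g = {g_bar:.3f}, z = {z:.2f}, p = {p:.2e}")
```

A random-effects version would add an estimate of between-study variance to each study's variance before weighting, widening the pooled standard error.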

2017, Vol. 4(2), pp. 160254.
Authors: Estelle Dumas-Mallet, Katherine S. Button, Thomas Boraud, Francois Gonon, Marcus R. Munafò

Studies with low statistical power increase the likelihood that a statistically significant finding represents a false positive result. We conducted a review of meta-analyses of studies investigating the association of biological, environmental or cognitive parameters with neurological, psychiatric and somatic diseases, excluding treatment studies, in order to estimate the average statistical power across these domains. Taking the effect size indicated by a meta-analysis as the best estimate of the likely true effect size, and assuming a threshold for declaring statistical significance of 5%, we found that approximately 50% of studies have statistical power in the 0–10% or 11–20% range, well below the minimum of 80% that is often considered conventional. Studies with low statistical power appear to be common in the biomedical sciences, at least in the specific subject areas captured by our search strategy. However, we also observe evidence that this depends in part on research methodology, with candidate gene studies showing very low average power and studies using cognitive/behavioural measures showing high average power. This warrants further investigation.
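
The review's power estimates follow from a standard calculation: given the meta-analytic effect size as the assumed true effect and a study's sample size, the power of that study's significance test is fully determined. A minimal sketch; d = 0.2 and n = 30 per group are illustrative assumptions, not figures from the paper.

```python
# Minimal sketch: power of a single two-sample study when the meta-analytic
# effect size is taken as the true effect. All numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower().power(effect_size=0.2,  # assumed true effect (Cohen's d)
                              nobs1=30,         # per-group sample size
                              alpha=0.05)       # two-sided significance level
print(f"power = {power:.2f}")  # ~0.12, i.e. it falls in the 11-20% bin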


2018, Vol. 15(5), pp. 4-14.
Author: V. E. Osipov

The criterion of reproducibility, and its functioning in post-non-classical science, is discussed in the Russian methodology of science. At the same time, critics avoid statistical calculations in their arguments. This raises two questions: “What is reproducibility?” and “What is the mathematical formulation of the reproducibility criterion?” A literature review identified five indicators of reproducibility proposed by colleagues abroad. These indicators are being tested and discussed. However, there is no general mathematical formulation of the reproducibility criterion (an integral criterion covering these indicators), and these indicators have not yet become a standard. In the present work, we compare two statistical tests related to one of these five indicators of reproducibility. Purpose of the study. The aim of this paper is to compare the powers of two tests of statistical significance that can be used to reveal an effect under the requirement that research results be reproducible. Here, reproducibility is estimated by the indicator “significance”. According to the first criterion, the effect is considered revealed if the effect size is significant in all studies (i.e., if the significance of the effect size is reproduced in every study). According to the second criterion, the effect is considered revealed if the weighted mean effect size obtained from a meta-analysis is significant (the effect size may be non-significant in individual studies). Materials and methods. Methods of mathematical statistics are used to achieve this goal. The powers of the two tests are compared using two estimates: the first is theoretical, and the second was obtained in a statistical experiment. The powers are calculated: 1) for different values of Cohen’s effect size (“small”, “medium” and “large”), 2) for different degrees of heterogeneity, including zero (the fixed-effect case), and 3) for different numbers of primary studies (from 2 to 8). Results. The power of the first test is lower, often much lower, than that of the second. The power of the first test decreases as the number of primary studies grows, while the power of the second increases. Against the conventional 80% power benchmark, the first criterion is unsuitable over the range of primary-study parameters considered here (that is, when a two-tailed t-test with a significance level of 0.05 and two samples of the typical size n = 25 is used to determine the significance of the effect size in individual studies), whereas the power of the second test can be increased as needed by adding primary studies to the meta-analysis. Conclusion. When the reproducibility criterion known from the philosophy of science is intended to confirm the existence of an effect (a connection), that is, to reveal the effect, under measurement conditions with a substantial random component, it is advisable to apply the second test rather than the first.
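
The comparison of the two criteria can be reproduced in miniature by simulation. The sketch below illustrates the logic and is not the author's implementation; it follows the paper's stated primary-study setup (two-tailed t-test, α = 0.05, n = 25 per sample) with an assumed medium effect (d = 0.5) and k = 5 studies.

```python
# Monte Carlo sketch of the two reproducibility criteria: (1) the effect is
# significant in every primary study vs. (2) the inverse-variance pooled
# effect is significant. Fixed-effect data; d = 0.5 is an assumed true
# effect, n = 25 per sample and the t-test follow the paper's setup.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d, n, k, alpha, reps = 0.5, 25, 5, 0.05, 5000
hits1 = hits2 = 0
for _ in range(reps):
    p_vals, effs, ses = [], [], []
    for _ in range(k):                       # simulate k primary studies
        a = rng.normal(d, 1, n)              # treatment sample
        b = rng.normal(0, 1, n)              # control sample
        p_vals.append(stats.ttest_ind(a, b).pvalue)
        effs.append(a.mean() - b.mean())
        ses.append(np.sqrt(a.var(ddof=1) / n + b.var(ddof=1) / n))
    hits1 += all(p < alpha for p in p_vals)  # criterion 1: all significant
    w = 1 / np.array(ses) ** 2               # criterion 2: pooled z-test
    z = (w @ np.array(effs) / w.sum()) * np.sqrt(w.sum())
    hits2 += 2 * stats.norm.sf(abs(z)) < alpha
print(f"power, every-study criterion:     {hits1 / reps:.2f}")
print(f"power, pooled-estimate criterion: {hits2 / reps:.2f}")
```

With these settings the every-study criterion's power collapses (roughly the single-study power raised to the k-th power), while the pooled criterion's power approaches 1, matching the paper's conclusion.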


Nutrients, 2021, Vol. 13(2), pp. 404.
Authors: Emma Altobelli, Paolo Matteo Angeletti, Ciro Marziliano, Marianna Mastrodomenico, Anna Rita Giuliani, et al.

Diabetes mellitus is an important public health issue, and its prevalence is growing worldwide. In recent years, there has been growing research interest in evidence for the efficacy of curcumin in regulating glycaemia and lipidaemia. The molecular structure of curcumin allows it to intercept reactive oxygen species (ROS), which are particularly harmful in chronic inflammation and tumorigenesis models. The aim of our study was to perform a systematic review and meta-analysis evaluating the effect of curcumin on the glycaemic and lipid profiles of subjects with uncomplicated type 2 diabetes. The papers included in the meta-analysis were sought in the MEDLINE, EMBASE, Scopus, Clinicaltrials.gov, Web of Science, and Cochrane Library databases as of October 2020. Effect sizes were pooled across studies to obtain an overall effect size. A random effects model was used to account for different sources of variation among studies. Cohen’s d, with its 95% confidence interval (CI), was used as the measure of effect size. Heterogeneity was assessed using the Q statistic. The ANOVA-Q test was used to evaluate differences among groups. Publication bias was analyzed and represented by a funnel plot. Curcumin treatment did not show a statistically significant reduction when treated and untreated patients were compared. On the other hand, glycosylated hemoglobin (HbA1c), the homeostasis model assessment (HOMA), and low-density lipoprotein (LDL) showed statistically significant reductions in subjects treated with curcumin (p = 0.008, p < 0.001, and p = 0.021, respectively). For HbA1c, meta-regression showed statistical significance only for gender (p = 0.034). Our meta-analysis seems to confirm the benefits for glucose metabolism, with results that appear more solid than those for lipid metabolism. However, further studies are needed to test the efficacy and safety of curcumin in uncomplicated type 2 diabetes.
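
For readers unfamiliar with the pipeline described above, a compact sketch of DerSimonian-Laird random-effects pooling with Cochran's Q follows; the effect sizes and variances are invented placeholders, not data from this review.

```python
# Minimal sketch of DerSimonian-Laird random-effects pooling of Cohen's d
# with Cochran's Q for heterogeneity. Study values are invented placeholders.
import numpy as np
from scipy import stats

d = np.array([-0.40, -0.15, -0.30, 0.05])  # hypothetical per-study Cohen's d
v = np.array([0.04, 0.03, 0.05, 0.04])     # hypothetical within-study variances

w_f = 1 / v                                # fixed-effect weights
d_f = np.sum(w_f * d) / np.sum(w_f)
Q   = np.sum(w_f * (d - d_f) ** 2)         # heterogeneity statistic
df  = len(d) - 1
C   = w_f.sum() - (w_f ** 2).sum() / w_f.sum()
tau2 = max(0.0, (Q - df) / C)              # between-study variance estimate

w  = 1 / (v + tau2)                        # random-effects weights
d_re  = np.sum(w * d) / np.sum(w)
se_re = np.sqrt(1 / np.sum(w))
print(f"Q = {Q:.2f} (p = {stats.chi2.sf(Q, df):.3f}), tau^2 = {tau2:.3f}")
print(f"pooled d = {d_re:.2f}, 95% CI = ({d_re - 1.96 * se_re:.2f}, "
      f"{d_re + 1.96 * se_re:.2f})")
```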


Circulation, 2007, Vol. 116(suppl_16).
Authors: George A Diamond, Sanjay Kaul

Background. A highly publicized meta-analysis of 42 clinical trials comprising 27,844 diabetics ignited a firestorm of controversy by charging that treatment with rosiglitazone was associated with a “…worrisome…” 43% greater risk of myocardial infarction (p = 0.03) and a 64% greater risk of cardiovascular death (p = 0.06). Objective. The investigators excluded 4 trials from the infarction analysis and 19 trials from the mortality analysis in which no events were observed. We sought to determine whether these exclusions biased the results. Methods. We compared the index study to a Bayesian meta-analysis of the entire 42 trials (using the odds ratio as the measure of effect size) and to fixed-effects and random-effects analyses with and without a continuity correction that adjusts for values of zero. Results. The odds ratios and confidence intervals for the analyses are summarized in the Table. Odds ratios for infarction ranged from 1.43 to 1.22, and for death from 1.64 to 1.13. Corrected models resulted in substantially smaller odds ratios and narrower confidence intervals than did uncorrected models. Although corrected risks remain elevated, none is statistically significant (p < 0.05). Conclusions. Given the fragility of the effect sizes and confidence intervals, the charge that rosiglitazone increases the risk of adverse events is not supported by these additional analyses. The exaggerated values observed in the index study are likely the result of excluding the zero-event trials from analysis. Continuity adjustments mitigate this error and provide more consistent and reliable assessments of the true effect size. Transparent sensitivity analyses should therefore be performed over a realistic range of the operative assumptions to verify the stability of such assessments, especially when outcome events are rare. Given the relatively wide confidence intervals, additional data will be required to adjudicate these inconclusive results.
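
The continuity correction at issue is simple: when a 2×2 trial table contains a zero cell, a constant (typically 0.5) is added to every cell so the trial contributes a finite odds ratio rather than being dropped. A minimal sketch with made-up counts, not data from the rosiglitazone trials:

```python
# Minimal sketch of a 0.5 continuity correction for zero-event trials:
# the constant is added to every cell of the 2x2 table so the trial yields
# a finite odds ratio instead of being excluded. Counts are illustrative.
import math

def odds_ratio(events_t, n_t, events_c, n_c, cc=0.5):
    """Odds ratio and 95% CI, with a continuity correction for zero cells."""
    a, b = events_t, n_t - events_t          # treatment: events / non-events
    c, d = events_c, n_c - events_c          # control:   events / non-events
    if 0 in (a, b, c, d):                    # zero cell: correct all four cells
        a, b, c, d = a + cc, b + cc, c + cc, d + cc
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log odds ratio
    lo = math.exp(math.log(or_) - 1.96 * se)
    hi = math.exp(math.log(or_) + 1.96 * se)
    return or_, (lo, hi)

print(odds_ratio(0, 150, 2, 145))  # zero-event arm: the correction applies
```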


2021.
Authors: Neil McLatchie, Manuela Thomae

Thomae and Viki (2013) reported that increased exposure to sexist humour can increase rape proclivity among males, specifically those who score high on measures of Hostile Sexism. Here we report two pre-registered direct replications (N = 530) of Study 2 from Thomae and Viki (2013) and assess replicability via (i) statistical significance, (ii) Bayes factors, (iii) the small-telescope approach, and (iv) an internal meta-analysis across the original and replication studies. The original results were not supported by any of the approaches. Combining the original study and the replications yielded moderate evidence in support of the null over the alternative hypothesis, with a Bayes factor of B = 0.13. In light of the combined evidence, we encourage researchers to exercise caution before claiming that brief exposure to sexist humour increases males’ proclivity towards rape until further pre-registered and open research demonstrates that the effect is reliably reproducible.
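
Of the four replication yardsticks listed, the small-telescope approach is the least familiar: it asks whether the replication estimate is significantly smaller than d_33%, the effect the original study had 33% power to detect. A minimal sketch with hypothetical sample sizes and estimates, not the paper's values:

```python
# Minimal sketch of the small-telescope approach: compute d_33%, the effect
# the original study had 33% power to detect, then test whether the
# replication estimate falls significantly below it. All numbers are
# hypothetical assumptions.
import math
from scipy import stats
from statsmodels.stats.power import TTestIndPower

n_orig = 40                      # assumed per-group n in the original study
d33 = TTestIndPower().solve_power(nobs1=n_orig, alpha=0.05, power=0.33)

d_rep, n_rep = 0.05, 265         # assumed replication estimate and per-group n
se_rep = math.sqrt(2 / n_rep)    # approximate SE of Cohen's d near d = 0
z = (d_rep - d33) / se_rep
print(f"d_33% = {d33:.2f}; one-sided p that the replication is below it: "
      f"{stats.norm.cdf(z):.4f}")
```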


1990, Vol. 24(3), pp. 405-415.
Author: Nathaniel McConaghy

Meta-analysis replaced statistical significance with effect size in the hope of resolving controversy concerning the evaluation of treatment effects. Statistical significance measured the reliability of a treatment effect, not its efficacy, and was strongly influenced by the number of subjects investigated. Effect size, as originally assessed, eliminated this influence, but by standardizing the size of the treatment effect it could distort it. Meta-analyses that combine the results of studies employing different subject types, outcome measures, treatment aims, no-treatment rather than placebo controls, or therapists with varying experience can be misleading. To ensure discussion of these variables, meta-analyses should be used as an aid to, rather than a substitute for, literature review. While meta-analyses produce contradictory findings, it seems unwise to rely on the conclusions of an individual analysis. Their consistent finding that placebo treatments obtain markedly higher effect sizes than no treatment will, it is hoped, render the use of untreated control groups obsolete.


2016, Vol. 46(11), pp. 2287-2297.
Authors: A. F. Carvalho, C. A. Köhler, B. S. Fernandes, J. Quevedo, K. W. Miskowiak, et al.

Background. To date no comprehensive evaluation has appraised the likelihood of bias or the strength of the evidence of peripheral biomarkers for bipolar disorder (BD). Here we performed an umbrella review of meta-analyses of peripheral non-genetic biomarkers for BD. Method. The Pubmed/Medline, EMBASE and PsycInfo electronic databases were searched up to May 2015. Two independent authors conducted searches, examined references for eligibility, and extracted data. Meta-analyses in any language examining peripheral non-genetic biomarkers in participants with BD (across different mood states) compared to unaffected controls were included. Results. Six references, which examined 13 biomarkers across 20 meta-analyses (5474 BD cases and 4823 healthy controls), met inclusion criteria. Evidence for an excess of significance bias (i.e., bias favoring publication of ‘positive’ nominally significant results) was observed in 11 meta-analyses. Heterogeneity was high (I² ⩾ 50%) for 16 meta-analyses. Only two biomarkers met criteria for suggestive evidence, namely the soluble IL-2 receptor and morning cortisol. The median power of included studies, using the effect size of the largest dataset as the plausible true effect size of each meta-analysis, was 15.3%. Conclusions. Our findings suggest that there is an excess of statistically significant results in the literature on peripheral biomarkers for BD. Selective publication of ‘positive’ results and selective reporting of outcomes are possible mechanisms.
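
An excess-of-significance check of the kind reported above compares the observed number of nominally significant studies in a meta-analysis with the number expected given each study's power. A minimal sketch in this spirit; the power values and counts are invented, and using the mean power in a binomial test is a simplification of the Ioannidis-Trikalinos procedure.

```python
# Sketch of an excess-of-significance check: compare the observed number of
# nominally significant studies with the number expected given study power.
# Power values and the observed count are invented placeholders.
import numpy as np
from scipy import stats

powers = np.array([0.12, 0.15, 0.20, 0.10, 0.18, 0.25, 0.14, 0.16])
observed_sig = 6                      # studies reporting p < 0.05 (invented)
expected_sig = powers.sum()           # expected count if the effect is real

res = stats.binomtest(observed_sig, n=len(powers), p=float(powers.mean()),
                      alternative='greater')
print(f"expected ~{expected_sig:.1f} significant studies, observed "
      f"{observed_sig}; excess-significance p = {res.pvalue:.4f}")
```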


2020.
Authors: Julia M. Haaf, Jeffrey N. Rouder

The most prominent goal when conducting a meta-analysis is to estimate the true effect size across a set of studies. This approach is problematic whenever the analyzed studies are inconsistent, i.e., some studies show an effect in the predicted direction while others show no effect, and still others show an effect in the opposite direction. In the case of such an inconsistency, the average effect may be the product of a mixture of mechanisms. The first question in any meta-analysis should therefore be whether all studies show an effect in the same direction. To tackle this question, a model with multiple ordinal constraints is proposed: one constraint for each study in the set. This "every study" model is compared to a set of alternative models, such as an unconstrained model that predicts effects in both directions. If the ordinal constraints hold, one underlying mechanism may suffice to explain the results from all studies. A major implication is then that average effects become interpretable. We illustrate the model-comparison approach using Carbajal et al.'s (2020) meta-analysis on the familiar-word-recognition effect, show how predictor analyses can be incorporated into the approach, and provide R code for interested researchers. As is common in meta-analysis, only surface statistics (such as effect size and sample size) are available from each study, and the modeling approach can be adapted to suit these conditions.
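
One standard way to compare an "every study positive" model against an unconstrained model is the encompassing-prior device: the Bayes factor equals the posterior probability that all study effects are positive divided by the prior probability of the same event. The sketch below uses a normal approximation to a per-study posterior as a stand-in; the estimates, standard errors, and prior scale are all assumptions, not the authors' hierarchical model.

```python
# Sketch of the encompassing-prior device for an 'every study positive'
# constraint: BF = posterior probability of the constraint / prior
# probability of the constraint. All numbers below are assumptions.
import numpy as np

rng = np.random.default_rng(1)
est = np.array([0.30, 0.12, 0.22, 0.05, 0.18])  # hypothetical study estimates
se  = np.array([0.10, 0.08, 0.12, 0.09, 0.07])  # hypothetical standard errors

n_draws = 200_000
post  = rng.normal(est, se, size=(n_draws, len(est)))   # approximate posterior
prior = rng.normal(0.0, 0.5, size=(n_draws, len(est)))  # encompassing prior

p_post  = np.mean(np.all(post > 0, axis=1))   # constraint satisfied a posteriori
p_prior = np.mean(np.all(prior > 0, axis=1))  # constraint satisfied a priori
print(f"BF (every-study positive vs. unconstrained) ~ {p_post / p_prior:.1f}")
```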


2017.
Authors: Hilde Augusteijn, Robbie Cornelis Maria van Aert, Marcel A. L. M. van Assen

One of the main goals of meta-analysis is to test and estimate the heterogeneity of effect size. We examined the effect of publication bias on the Q-test and on assessments of heterogeneity, as a function of true heterogeneity, publication bias, true effect size, number of studies, and variation of sample sizes. The expected values of the heterogeneity measures H² and I² were analytically derived, and the power and the type I error rate of the Q-test were examined in a Monte Carlo simulation study. Our results show that the effect of publication bias on the Q-test and on assessments of heterogeneity is large, complex, and non-linear. Publication bias can both dramatically decrease and dramatically increase heterogeneity. Extreme homogeneity can occur even when population heterogeneity is large. Particularly when the number of studies is large and the population effect size is small, publication bias can push both the type I error rate and the power of the Q-test close to 0 or 1. We therefore conclude that the Q-test of homogeneity and the heterogeneity measures H² and I² are generally not valid for assessing and testing heterogeneity when publication bias is present, especially when the true effect size is small and the number of studies is large. We introduce a web application, Q-sense, which can be used to assess the sensitivity of the Q-test to publication bias, and we apply it to two published meta-analyses. Meta-analytic methods should be enhanced to be able to deal with publication bias in their assessments and tests of heterogeneity.
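
For reference, the quantities under study reduce to a few lines of arithmetic: Cochran's Q is the weighted sum of squared deviations from the pooled effect, H² = Q/df, and I² = (Q − df)/Q (truncated at zero). A minimal sketch with illustrative inputs:

```python
# Minimal sketch of Cochran's Q and the heterogeneity measures H^2 and I^2,
# computed from per-study effects and sampling variances (illustrative values).
import numpy as np
from scipy import stats

y = np.array([0.10, 0.35, 0.05, 0.48, 0.22])  # hypothetical study effects
v = np.array([0.02, 0.03, 0.02, 0.04, 0.03])  # hypothetical sampling variances

w = 1 / v
y_bar = np.sum(w * y) / np.sum(w)              # fixed-effect pooled estimate
Q  = np.sum(w * (y - y_bar) ** 2)              # Cochran's Q
df = len(y) - 1
H2 = Q / df                                    # H^2 = Q / df
I2 = max(0.0, (Q - df) / Q)                    # I^2, truncated at zero
print(f"Q = {Q:.2f} (p = {stats.chi2.sf(Q, df):.3f}), "
      f"H^2 = {H2:.2f}, I^2 = {100 * I2:.0f}%")
```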

