Is the Replicability Crisis Overblown? Three Arguments Examined

2012 ◽  
Vol 7 (6) ◽  
pp. 531-536 ◽  
Author(s):  
Harold Pashler ◽  
Christine R. Harris

We discuss three arguments voiced by scientists who view the current outpouring of concern about replicability as overblown. The first idea is that the adoption of a low alpha level (e.g., 5%) puts reasonable bounds on the rate at which errors can enter the published literature, making false-positive effects rare enough to be considered a minor issue. This, we point out, rests on statistical misunderstanding: The alpha level imposes no limit on the rate at which errors may arise in the literature (Ioannidis, 2005b). Second, some argue that whereas direct replication attempts are uncommon, conceptual replication attempts are common—providing an even better test of the validity of a phenomenon. We contend that performing conceptual rather than direct replication attempts interacts insidiously with publication bias, opening the door to literatures that appear to confirm the reality of phenomena that in fact do not exist. Finally, we discuss the argument that errors will eventually be pruned out of the literature if the field would just show a bit of patience. We contend that there are no plausible concrete scenarios to back up such forecasts and that what is needed is not patience, but rather systematic reforms in scientific practice.
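
The first point above can be illustrated with a little arithmetic. The sketch below is not taken from the article; it assumes a simplified model in which every significant result is published, and the function name, the prior (the fraction of tested hypotheses that are true) and the power values are hypothetical choices made only for illustration. It shows that the share of false positives among significant findings depends on the prior and on power, not on alpha alone.

```python
def false_positive_share(alpha: float, power: float, prior: float) -> float:
    """Fraction of statistically significant results that are false positives.

    alpha: significance threshold (chance a null effect comes out significant)
    power: chance a true effect comes out significant
    prior: ASSUMED fraction of tested hypotheses that are true effects
    """
    false_pos = alpha * (1.0 - prior)  # null effects that reach significance
    true_pos = power * prior           # true effects that reach significance
    return false_pos / (false_pos + true_pos)


# Hypothetical scenarios; alpha is held at 5% in every one of them.
for prior, power in [(0.5, 0.8), (0.1, 0.8), (0.1, 0.35)]:
    share = false_positive_share(alpha=0.05, power=power, prior=prior)
    print(f"prior={prior:.2f} power={power:.2f} -> false-positive share={share:.2f}")
```

Under these assumed inputs the false-positive share ranges from roughly 6% to over half of all significant results while alpha never changes, which is the sense in which the alpha level places no bound on the error rate in a literature.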

2019 ◽  
Author(s):  
Amanda Kvarven ◽  
Eirik Strømland ◽  
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the results of 15 meta-analyses and compare the adjusted results to 15 large-scale multi-lab replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


2017 ◽  
Author(s):  
G. Lohmann ◽  
J. Stelzer ◽  
K. Müller ◽  
E. Lacosse ◽  
T. Buschmann ◽  
...  

Abstract: Reproducibility is generally regarded as a hallmark of scientific validity. It can be undermined by two very different factors, namely inflated false positive rates or inflated false negative rates. Here we investigate the role of the second factor, i.e. the degree to which true effects are not detected reliably. The availability of large public databases and of supercomputing allows us to tackle this problem quantitatively. Specifically, we estimated the reproducibility in task-based fMRI data over different samples randomly drawn from a large cohort of subjects obtained from the Human Connectome Project. We use the full cohort as a standard of reference to approximate true positive effects, and compute the fraction of those effects that was detected reliably using standard software packages at various smaller sample sizes. We found that with standard sample sizes this fraction was less than 25 percent. We conclude that inflated false negative rates are a major factor that undermines reproducibility. We introduce a new statistical inference algorithm based on a novel test statistic and show that it improves reproducibility without inflating false positive rates.


2021 ◽  
pp. 261-270
Author(s):  
R. Barker Bausell

In this chapter, educational recommendations for future scientists are suggested, followed by possible scenarios that may characterize the future of the reproducibility initiatives discussed in previous chapters. One such scenario, while quite pessimistic, is not without historical precedent: the entire movement may turn out to be little more than a publishing opportunity for methodologically oriented scientists, soon replaced by something else and forgotten by most, thereby allowing it to be reprised a few decades later under a different name by different academics. Alternatively, and more optimistically, the procedural and statistical behaviors discussed here will receive increased emphasis in scientific curricula, accompanied by a sea change in actual scientific practice and its culture, thereby producing a substantial reduction in the prevalence of avoidable false-positive scientific results. Recent evidence does appear to suggest that the reproducibility initiatives instituted by the dedicated cadre of methodologically oriented scientists chronicled in this book have indeed begun the process of making substantive improvements in the quality and veracity of scientific inquiry itself.


2010 ◽  
Vol 197 (4) ◽  
pp. 257-258 ◽  
Author(s):  
Marcus R. Munafò ◽  
Jonathan Flint

Summary: There is growing concern that a substantial proportion of scientific research may in fact be false. A number of factors have been proposed as contributing to the presence of a large number of false-positive results in the literature, one of which is publication bias. We discuss empirical evidence for these factors.


2020 ◽  
Author(s):  
Jan Walleczek ◽  
Von Stillfried

A general commentary by Walleczek and von Stillfried (2020) was recently published in Frontiers in Psychology. The present work provides an account of (i) the detailed research record and (ii) the main arguments behind the commentary for the purpose of full transparency and disclosure. For historical overview, Walleczek and von Stillfried (2019) had previously reported (i) the absence of any true-positive effects and (ii) the presence of one false-positive effect in a commissioned replication study of the Radin double-slit (DS) experiment on observer consciousness. In their subsequent misrepresentations, Radin et al. (2019, 2020) regrettably used the malpractice of undisclosed HARKing, i.e., undisclosed hypothesizing after the results are known. HARKing can greatly increase the risk of false-negative or false-positive conclusions. Specifically, Radin et al. (2019, 2020) deviated in two major ways from the pre-specified protocol for this commissioned study, which (i) was agreed to by Radin before data collection was started (Radin, 2011) and (ii) included data encryption to prevent the use of p-hacking and HARKing. First, Radin et al. (2019) violate the original research design by reporting a so-called “true-positive outcome of a secondary planned hypothesis”. Contrary to the claim by Radin et al. (2019, 2020), that hypothesis was not part of the planned test strategy; instead, the associated statistical analysis (a chi-square test) was chosen by Radin sometime after the planned statistical analysis had been completed and the data unblinded. Second, Radin et al. (2019, 2020) violate the funder-approved research design in an additional way by falsely claiming that the newly developed protocol, i.e., the advanced meta-experimental protocol (AMP), implements a non-predictive test strategy when, in fact, the AMP-based test strategy is strictly predictive. Put simply, Radin et al. (2019, 2020) are mistaken that the funder-approved hypotheses posited the random occurrence of effects for the test categories in this replication experiment; instead, a different specific prediction was tested in each of the eight planned test categories, and true-positive effects were predicted to occur for only two (12.5%) of the 16 possible measurement outcomes of the eight planned single-test categories. Therefore, in the predictive single-testing regime, a statistical correction for non-predictive, i.e., random, multiple testing would not be appropriate and would thus violate the AMP-based strategy, which was implemented in the commissioned study based upon the planned outcome predictions as pre-specified in Radin (2011). Neither of these post-hoc changes by Radin et al. (on the basis of HARKing) was disclosed in Radin et al. (2019, 2020), and both violate the funder-approved, original methodology agreed upon in Radin (2011) and pre-specified in the research contract. In summary, the present work reconfirms that, exactly as reported in Walleczek and von Stillfried (2019), “the false-positive effect, which would be indistinguishable from the predicted true-positive effect, was significant at p = 0.021 (σ = −2.02; N = 1,250 test trials)” and “no statistically significant effects could be identified” in those two groups for which true-positives were predicted to occur. These observations are also consistent with an independent statistical reanalysis of the Radin DS-experiment by Tremblay (2019) and a replication attempt by Guerrer (2019). Tremblay reported significant false-positives in control groups, and Guerrer found significant effects with post-hoc analyses only, and null results when using the planned confirmatory analysis. As a general recommendation, the authors call for the implementation of advanced control-test strategies, including novel approaches from the metascience reform movement, for empirically detecting and preventing uncontrolled false-positive effects in parapsychological research.


2021 ◽  
pp. 56-90
Author(s):  
R. Barker Bausell

The linchpin of both publication bias and irreproducibility involves an exhaustive list of more than a score of individually avoidable questionable research practices (QRPs) supplemented by 10 inane institutional research practices. While these untoward effects on the production of false-positive results are unsettling, a far more entertaining (in a masochistic sort of way) pair of now famous iconoclastic experiments conducted by Simmons, Nelson, and Simonsohn is presented in which, with the help of only a few well-chosen QRPs, research participants can actually become older after simply listening to a Beatles song. In addition, surveys designed to estimate the prevalence of these and other QRPs in the published literature are also described.


1997 ◽  
Vol 131 (4) ◽  
pp. 371-382 ◽  
Author(s):  
Douglas M. Klieger ◽  
Kimberly K. Siejak

2021 ◽  
Author(s):  
Carlos D'Apolito ◽  
Carlos Jaramillo ◽  
Guy Harrington

During the Miocene, Andean tectonism caused the development of a vast wetland across western Amazonia. Palynological studies have been the main source of chronological and paleobotanical information for this region, including several boreholes in the Solimões Formation in western Brazilian Amazonia. Here, a palynological study of well core 1-AS-105-AM drilled in Tabatinga (Amazonas, Brazil) is presented: 91 new taxa are erected (25 spores and 66 pollen, including one new genus), 16 new combinations are proposed, and a list of botanical/ecological affinities is updated. We recorded 23,880 palynomorphs distributed in 401 different types. Among pollen and spores, 62 extant families and 99 extant genera were identified, which accounts for 39% and 30% of known botanical affinities to the family and genus level, respectively. Individual samples have pollen/spore counts with approximately 25% to 95% of known affinities to the family level. Pollen associations are sourced primarily from the wetland environments and to a minor extent from nonflooded forests. Palynological diversity analyses indicate an increase from the early to the middle/early late Miocene in core 1-AS-105-AM. Probable scenarios to explain this diversity increase include a higher degree of environmental complexity from the middle Miocene onwards, that is, a more heterogeneous riverscape, including broader extensions of nonflooded forests, as opposed to the swamp-dominated early Miocene. Additionally, the positive effects of the Miocene Climatic Optimum on plant richness could explain the increase in pollen richness. We posit hypotheses of forest diversification that can be tested as more botanical affinities are established along with a longer Miocene record.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 4947-4947
Author(s):  
Woo Jae Kwoun ◽  
Jeong-Yeal Ahn ◽  
Ja Young Seo ◽  
Jae Hoon Lee ◽  
Hawk Kim ◽  
...  

Abstract
Introduction: Flow cytometry is the gold standard for diagnosing paroxysmal nocturnal hemoglobinuria (PNH), detecting the absence of glycosylphosphatidylinositol (GPI)-linked protein expression on red blood cells, granulocytes, and monocytes. The current assays are 4-color analyses of GPI-linked markers such as fluorescein-labeled proaerolysin (FLAER), CD24, CD14, CD59, and CD235a and the lineage markers for granulocytes (CD15) and monocytes (CD64) to detect PNH clones. We investigated the utility of CD14/CD64 monocyte gating by comparing it with CD45/light scatter (LS) gating in the PNH study of patients with cytopenia, and analyzed the types and cell lineages of PNH clones according to disease group.
Method: A total of 138 cases were recruited from July 2017 to February 2018 at Gachon University Gil Medical Center in Korea. Flow cytometric analysis was performed on EDTA blood with a Beckman Coulter Cytomics FC500 cytometer, using gating antibodies such as CD45, CD14, CD15, CD64, and CD235a and GPI-linked markers such as CD59, CD14, CD24, and FLAER. The proportion of monocytes was estimated by CD14/CD64 gating and compared with that obtained by CD45/LS gating. The type of PNH clone was defined according to the size of the PNH population: a PNH clone was defined as a PNH population exceeding 1% of the gated cells, a minor PNH clone as one between 0.1% and 1%, and rare cells with GPI deficiency as a PNH population below 0.1%. The types and cell lineages of the PNH clones were analyzed according to disease group. Statistical analysis was done using SPSS 17.0 and MedCalc 15.2, and P<0.05 was considered statistically significant.
Results: Of the 138 cases, a PNH clone was detected in 27, including 15 cases with a PNH clone and 12 cases with a minor PNH clone. A PNH clone was observed in all 8 PNH cases (100%). Two PNH clones and 4 minor PNH clones were identified in 6 of 16 cases (38%) of acute myeloid leukemia (AML). Six of 21 cases (29%) of aplastic anemia (AA) showed 5 PNH clones and 1 minor PNH clone. In 5 of 78 cases (6%) of cytopenia(s), only minor PNH clones were observed. CD45/LS gating of monocytes showed a sensitivity of 100%, a specificity of 40.2%, and a false positive rate of 60% (73/89) in detecting PNH clones. The McNemar test indicated a significant difference between the CD14/CD64 and CD45/LS gating methods (P = 0.00). The Bland-Altman plot of monocyte proportion for the two gating methods revealed that CD45/LS gating tended to underestimate the monocyte proportion, and that the larger the number of monocytes, the greater the difference in monocyte counts between the two methods. The trend in PNH clone size in each cell lineage was followed in three patients with a PNH clone. Two patients showed more abrupt changes in the PNH clone in monocytes than in red blood cells (RBCs) or granulocytes. In the other patient, however, a significant trend was found only in the RBC PNH clone.
Conclusion: The types of PNH clone observed in each disease group showed different characteristics. A PNH clone was identified in 5 of the 6 AA cases with a detected PNH population, whereas minor PNH clones were observed in all 5 cytopenia cases with a detected PNH population. Four minor PNH clones and two PNH clones were found in the 6 AML cases with a detected PNH population; however, all PNH clones observed in AML cases were in monocytes. Monocyte gating with CD45 and LS not only underestimated the proportion of monocytes in total WBCs but also showed a high false positive rate of 60% in detecting PNH clones. In contrast, the CD14/CD64 gating method can accurately measure the monocyte population and avoid false positive measurements of PNH clones. In addition, in monitoring PNH patients, measurement of the PNH clone in monocytes tends to be more sensitive to changes in PNH clone size than measurement in RBCs or granulocytes. In conclusion, gating with CD14 and CD64 is valuable in flow cytometric detection of PNH clones, both in diagnosing new patients and in monitoring PNH patients.
Disclosures: No relevant conflicts of interest to declare.
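
The sensitivity, specificity, and false positive rate quoted in this abstract are related by simple definitions, sketched below. The function name and the counts are hypothetical, chosen only so that the rates land near the reported values; they are not the study's data, whose full confusion matrix is not given here.

```python
def diagnostic_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic accuracy rates from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positives among diseased
        "specificity": tn / (tn + fp),          # true negatives among non-diseased
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
    }


# HYPOTHETICAL counts (not taken from the abstract), picked for illustration.
rates = diagnostic_rates(tp=20, fp=60, tn=40, fn=0)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

Because the false positive rate is by definition one minus the specificity, a specificity of 40.2% corresponds to a false positive rate of about 60%, which agrees with the percentage reported above.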


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Marieke Begemann ◽  
Mikela Leon ◽  
Harm Jan van der Horn ◽  
Joukje van der Naalt ◽  
Iris Sommer

Abstract: Outcome after traumatic brain injury (TBI) varies widely, and the degree of immune activation is an important determinant. This meta-analysis evaluates the efficacy of drugs with anti-inflammatory properties in improving neurological and functional outcome. The systematic search following PRISMA guidelines resulted in 15 randomized placebo-controlled trials (3734 patients), evaluating progesterone, erythropoietin and cyclosporine. The meta-analysis (15 studies) showed that TBI patients receiving a drug with anti-inflammatory effects had a higher chance of a favorable outcome compared to those receiving placebo (RR = 1.15; 95% CI 1.01–1.32, p = 0.041). However, publication bias was indicated together with heterogeneity (I² = 76.59%). Stratified analysis showed that positive effects were mainly observed in patients receiving this treatment within 8 h after injury. Subanalyses by drug type showed efficacy for progesterone (8 studies, RR 1.22; 95% CI 1.01–1.47, p = 0.040); again, heterogeneity was high (I² = 62.92%) and publication bias could not be ruled out. The positive effect of progesterone covaried with younger age and was mainly observed when administered intramuscularly and not intravenously. Erythropoietin (4 studies, RR 1.20; p = 0.110; I² = 76.59%) and cyclosporine (3 studies, RR 0.75; p = 0.189, I² = 0%) did not show significant favorable effects. While negative findings for erythropoietin may reflect insufficient power, cyclosporine did not show better outcome at all. Current results do not allow firm conclusions on the efficacy of drugs with anti-inflammatory properties in TBI patients. Included trials showed heterogeneity in methodological and sample parameters. At present, only progesterone showed positive results, and early intramuscular administration may be most effective, especially in young people. The anti-inflammatory component of progesterone is relatively weak, and mechanisms other than mitigating the overall immune response may be more important.
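
As a consistency check on the pooled estimate above, a p-value can be approximately recovered from a risk ratio and its 95% confidence interval. The sketch below assumes the interval is symmetric on the log scale and that a normal approximation holds; it is a generic back-calculation with a hypothetical function name, not the method the authors used.

```python
import math


def p_from_ratio_ci(rr: float, lower: float, upper: float,
                    z_crit: float = 1.959964) -> tuple:
    """Approximate z statistic and two-sided p-value from a ratio and its 95% CI.

    ASSUMES the CI is symmetric around log(rr) and based on a normal approximation.
    """
    # Standard error on the log scale, from the width of the log interval.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(rr) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


z, p = p_from_ratio_ci(rr=1.15, lower=1.01, upper=1.32)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

For RR = 1.15 with 95% CI 1.01–1.32 this gives a value close to the reported p = 0.041; small differences are expected because the published figures are rounded.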

