More Than One Replication Study Is Needed for Unambiguous Tests of Replication

2019 ◽  
Vol 44 (5) ◽  
pp. 543-570 ◽  
Author(s):  
Larry V. Hedges ◽  
Jacob M. Schauer

The problem of assessing whether experimental results can be replicated is becoming increasingly important in many areas of science. It is often assumed that assessing replication is straightforward: All one needs to do is repeat the study and see whether the results of the original and replication studies agree. This article shows that the power of the statistical test for whether two studies obtain the same effect is smaller than the power of either study to detect an effect in the first place. Thus, unless the original study and the replication study have unusually high power (e.g., power of 98%), a single replication study will not have adequate sensitivity to provide an unambiguous evaluation of replication.
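The power argument can be made concrete with a normal-approximation sketch: testing whether two effects are *equal* involves the sum of both studies' sampling variances, so its power is necessarily lower than either study's power to detect the effect itself. A minimal illustration, assuming two balanced two-arm studies estimating a standardized mean difference; the sample sizes and effects below are illustrative, not the article's worked examples.

```python
# Sketch: power of a z-test for H0: d1 == d2 versus the power of each study
# to detect the effect itself. Illustrative numbers, normal approximation.
import numpy as np
from scipy.stats import norm

def power_detect_effect(d, n_per_arm, alpha=0.05):
    """Power of one study's two-sided z-test of H0: d == 0."""
    se = np.sqrt(2 / n_per_arm)             # approximate SE of d, balanced arms
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.sf(z_crit - d / se) + norm.cdf(-z_crit - d / se)

def power_detect_difference(delta, n_per_arm, alpha=0.05):
    """Power of the z-test of H0: d1 == d2 when the true effects differ by delta."""
    se_diff = np.sqrt(2 * (2 / n_per_arm))  # the two estimates' variances add
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.sf(z_crit - delta / se_diff) + norm.cdf(-z_crit - delta / se_diff)

n = 64                                      # per arm, in each of the two studies
print(power_detect_effect(0.5, n))          # each study's power for d = 0.5: ~0.81
print(power_detect_difference(0.5, n))      # power to detect a d = 0.5 difference: ~0.52
```

With ~81% power per study, the test for equality of the two effects has only ~52% power to detect a difference as large as the original effect itself, which is the article's central point.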

2020 ◽  
Author(s):  
Hidde Jelmer Leplaa ◽  
Charlotte Rietbergen ◽  
Herbert Hoijtink

In this paper, a method is proposed to determine whether the result from an original study is corroborated in a replication study. The paper is illustrated using data from the Reproducibility Project: Psychology by the Open Science Collaboration. This method emphasizes the need to determine what one wants to replicate: the hypotheses as formulated in the introduction of the original paper, or hypotheses derived from the research results presented in the original paper. The Bayes factor will be used to determine whether the hypotheses evaluated in, or resulting from, the original study are corroborated by the replication study. Our method of assessing the success of replication will better fit the needs and desires of researchers in fields that use replication studies.
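The paper's Bayes factors evaluate informative hypotheses; the sketch below is not that method but a simpler normal-approximation replication Bayes factor in the spirit of Verhagen and Wagenmakers (2014), which conveys the core idea: use the original study's estimate as the prior under H1 and ask whether the replication data favour it over a null effect. All numbers are hypothetical.

```python
# Sketch: normal-approximation replication Bayes factor. NOT the authors'
# informative-hypothesis method; a minimal illustration of the same idea.
import numpy as np
from scipy.stats import norm

def bf_replication(d_rep, se_rep, d_orig, se_orig):
    """BF_10: H1 takes the original estimate N(d_orig, se_orig^2) as prior;
    H0 fixes the effect at zero. Values < 1 favour the null."""
    m1 = norm.pdf(d_rep, loc=d_orig, scale=np.sqrt(se_rep**2 + se_orig**2))
    m0 = norm.pdf(d_rep, loc=0.0, scale=se_rep)
    return m1 / m0

# Hypothetical numbers: original d = 0.60 (SE 0.20), replication d = 0.10 (SE 0.10)
print(bf_replication(0.10, 0.10, 0.60, 0.20))   # ~0.06: the replication favours H0
```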


2020 ◽  
Author(s):  
Ian Hussey

Recent evidence suggests the results of psychology studies replicate less than half the time. Replies are often then written by the authors of the original studies. This discourse often follows a predictable pattern: highly general claims are made in an original study, the replication produces null results, and the response by the original authors primarily focuses on reasons to disqualify the replication’s results from requiring consideration, rather than acknowledging that the original study’s finding may not in fact be replicable, or that the generality of the original claims may require revision or constraint. I illustrate these points using the example of a recently published trio of original study, failed replication, and response by original authors. I argue that our scientific goals would be better served by efforts to avoid falling into these writing tropes, and to instead move the discourse forward and reinforce the behaviours we want to see in our scientific community, such as the conduct of high-quality replication studies.


2018 ◽  
Vol 49 (1) ◽  
pp. 111-115 ◽  
Author(s):  
Faiza M. Jamil

I appreciate the opportunity to respond to the thoughtful comments made by Alan Schoenfeld (2018) and Jon Star (2018) in their commentaries on replication studies in this issue of JRME, including their comments on our study of teacher expectancy effects (Jamil, Larsen, & Hamre, 2018). I have decided to write this rejoinder in the form of a personal reflection. As academics, we carry the tremendous burden of expertise, and perhaps that is partly why, as pointed out by Schoenfeld (2018), the academic reward system focuses so heavily on novelty and innovation. With our expertise, we are supposed to have all the answers, solve all the problems, and do so in brilliant, new ways. Replication studies are undervalued because they not only, by definition, recreate past research but, perhaps, also bring into question another scholar's expertise. Star (2018) even states that one of the three criteria of an outstanding replication study is that it “convincingly shows that there is reason to believe that the results of the original study may be flawed” (p. 99). Although this rigorous examination is precisely the way to build trust in the quality of our findings and move the field forward, it is also what makes it challenging to have candid conversations about what we do not know.


2020 ◽  
Author(s):  
Robert Heirene

Several prominent researchers in the problem gambling field have recently called for high-quality replications of existing gambling studies. This call should be extended to the entire field of addiction research: there is a need to ensure that the understanding of addiction and related phenomena gained through the extant literature is robust and replicable. This article discusses two important questions addictions researchers should consider before proceeding with replication studies: (1) Which studies should we attempt to replicate? And (2) how should we interpret the findings of a replication study in relation to the original study? In answering these questions, a focus is placed on experimental research, though the discussion may still serve as a useful introduction to the topic of replications for addictions researchers using any methodology.


2019 ◽  
Vol 41 (3) ◽  
pp. 515-546
Author(s):  
Carel Jansen ◽  
Anneke de Graaf ◽  
Lettica Hustinx ◽  
Joëlle Ooms ◽  
Merel Schreinemakers ◽  
...  

Abstract: Does a more or less sympathetic protagonist influence transportation of the reader? Two new replication studies. Three previous studies into presenting a protagonist in a story as more or less sympathetic have not provided a clear picture of the effects that the portrayal of the protagonist may have on transportation, and via transportation on story-consistent beliefs. Results from a first study (N = 83) by De Graaf and Hustinx (2015) suggest that the way the protagonist is portrayed (as sympathetic, unsympathetic, or neutral) influences the extent to which readers are transported into a story. No significant effects on readers' beliefs were found, however. In a direct replication study (N = 79) and in a conceptual replication study (N = 81), Jansen, Nederhoff, and Ooms (2017) found results that supported the hypotheses from the original study only to a limited extent. In view of the relatively small numbers of participants in these three studies and the resulting limited power of the statistical tests, two new, larger-scale replication studies were conducted: a direct replication study (N = 238) with the same versions of the story as used in the original study, and a conceptual replication study (N = 248) with three versions of a new story. Again, the hypotheses from the original study were supported only to a limited extent. A meta-analysis of all five studies revealed a large indirect positive effect of story version on transportation via empathy when comparing the versions with a sympathetic protagonist to the versions with an unsympathetic protagonist. When comparing the neutral story versions with the versions with an unsympathetic protagonist, the meta-analytic indirect effect was medium-sized. Contrary to what Affective Disposition Theory (Raney, 2004; Zillmann, 1994, 2006) claims, the story versions with a neutral protagonist did not lead to an absence of emotional responses. Furthermore, the outcomes add to the Transportation-Imagery Model (Green & Brock, 2002; Van Laer, De Ruyter, Visconti, & Wetzels, 2014). While this model does not offer concrete suggestions about story characteristics that lead to transportation, our studies show that a protagonist who is portrayed as sympathetic may contribute to the level of transportation that readers experience, albeit indirectly through empathy.
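For readers unfamiliar with how five studies of this size get combined, a minimal fixed-effect pooling sketch on Fisher's z scale follows. The sample sizes match the five studies above, but the per-study correlations are placeholders, not the papers' reported indirect effects.

```python
# Sketch: inverse-variance pooling of correlations across studies via
# Fisher's z, the standard fixed-effect meta-analytic recipe.
import numpy as np

def pool_correlations(rs, ns):
    """Fixed-effect pooled correlation with a 95% CI."""
    zs = np.arctanh(rs)                # Fisher z transform
    ws = np.asarray(ns) - 3            # inverse variance: var(z) = 1/(n - 3)
    z_pooled = np.sum(ws * zs) / np.sum(ws)
    se = 1 / np.sqrt(np.sum(ws))
    lo, hi = z_pooled - 1.96 * se, z_pooled + 1.96 * se
    return np.tanh(z_pooled), (np.tanh(lo), np.tanh(hi))

rs = np.array([0.35, 0.20, 0.25, 0.30, 0.28])   # placeholder per-study effects
ns = np.array([83, 79, 81, 238, 248])           # the five studies' sample sizes
print(pool_correlations(rs, ns))                # pooled r and 95% CI
```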


2021 ◽  
Author(s):  
Neil McLatchie ◽  
Manuela Thomae

Thomae and Viki (2013) reported that increased exposure to sexist humour can increase rape proclivity among males, specifically those who score high on measures of Hostile Sexism. Here we report two pre-registered direct replications (N = 530) of Study 2 from Thomae and Viki (2013) and assess replicability via (i) statistical significance, (ii) Bayes factors, (iii) the small-telescope approach, and (iv) an internal meta-analysis across the original and replication studies. The original results were not supported by any of the approaches. Combining the original study and the replications yielded moderate evidence in support of the null over the alternative hypothesis, with a Bayes factor of B = 0.13. In light of the combined evidence, we encourage researchers to exercise caution before claiming that brief exposure to sexist humour increases males’ proclivity towards rape, until further pre-registered and open research demonstrates the effect is reliably reproducible.
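Of the four approaches listed, the small-telescope test (Simonsohn, 2015) is the least standard: compute d33, the effect the original study had 33% power to detect, then ask whether the replication estimate is significantly smaller than d33. A minimal sketch under a normal approximation follows; the inputs are purely hypothetical, not Thomae and Viki's data.

```python
# Sketch of the small-telescope test. Normal approximation, illustrative inputs.
import numpy as np
from scipy.stats import norm

def d_33(se_orig, alpha=0.05):
    """Effect size giving the original design 33% power (two-sided test)."""
    return se_orig * (norm.ppf(1 - alpha / 2) + norm.ppf(0.33))

def small_telescope_p(d_rep, se_rep, se_orig):
    """One-sided p against H0: true effect >= d_33; small p means the effect
    is too small for the original study to have detected."""
    return norm.cdf((d_rep - d_33(se_orig)) / se_rep)

se_orig, d_rep, se_rep = 0.25, 0.02, 0.09          # hypothetical numbers
print(d_33(se_orig))                                # ~0.38
print(small_telescope_p(d_rep, se_rep, se_orig))    # tiny p: undetectably small
```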


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Thomas William Aspinall ◽  
Adrian Gepp ◽  
Geoff Harris ◽  
Simone Kelly ◽  
Colette Southam ◽  
...  

Purpose The pitching research template (PRT) is designed to help pitchers identify the core elements that form the framework of any research project. This paper aims to provide a brief commentary on an application of the PRT to pitch an environmental finance research topic, with a personal reflection on the pitch exercise discussed. Design/methodology/approach This paper applies the PRT developed by Faff (2015, 2019) to a research project on estimating the strength of carbon pricing signals under the European Union Emissions Trading Scheme. Findings The PRT is found to be a valuable tool to refine broad ideas into impactful and novel research contributions. The PRT is recommended for use by all academics regardless of field, and particularly by PhD students, to structure and communicate their research ideas. The PRT is found to be particularly well suited to pitching replication studies, as it effectively summarizes both the “idea” and proposed “twist” of a replication study. Originality/value This letter is a reflection on a research team's experience with applying the PRT to pitch a replication study at the 2020 Accounting and Finance Association of Australia and New Zealand event. This event focused on replicable research and was a unique opportunity for research teams to pitch their replication research ideas.


1972 ◽  
Vol 50 (12) ◽  
pp. 1337-1345 ◽  
Author(s):  
R. J. Armstrong

The arc voltage in high power hydrogen thyratrons has been measured under a variety of pulse and DC conditions. The value is found to be higher under pulse conditions than under DC conditions. An explanation, similar to that invoked by Sugawara and Gregory in a different situation, is found to give some agreement with the experimental results. This explanation depends on the relaxation properties of the electrons and ions in an electrical discharge or plasma.


2020 ◽  
Vol 3 (3) ◽  
pp. 309-331 ◽  
Author(s):  
Charles R. Ebersole ◽  
Maya B. Mathur ◽  
Erica Baranski ◽  
Diane-Jo Bart-Plange ◽  
Nicholas R. Buttrick ◽  
...  

Replication studies in psychological science sometimes fail to reproduce prior findings. If these studies use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replication studies from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these studies had yielded a statistically significant effect (p < .05). Commenters suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these RP:P studies failed to replicate the original effects. We revised the replication protocols and received formal peer review prior to conducting new replication studies. We administered the RP:P and revised protocols in multiple laboratories (median number of laboratories per original study = 6.5, range = 3–9; median total sample = 1,279.5, range = 276–3,512) for high-powered tests of each original finding with both protocols. Overall, following the preregistered analysis plan, we found that the revised protocols produced effect sizes similar to those of the RP:P protocols (Δr = .002 or .014, depending on analytic approach). The median effect size for the revised protocols (r = .05) was similar to that of the RP:P protocols (r = .04) and the original RP:P replications (r = .11), and smaller than that of the original studies (r = .37). Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes (median r = .07, range = .00–.15) were 78% smaller, on average, than the original effect sizes (median r = .37, range = .19–.50).
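Pooling one effect across the multiple laboratories that ran a given protocol is typically done with a random-effects model, which allows for between-lab heterogeneity. A minimal DerSimonian-Laird sketch on Fisher's z scale follows; the lab-level correlations and sample sizes are illustrative placeholders, not the project's data.

```python
# Sketch: DerSimonian-Laird random-effects pooling across labs on the
# Fisher z scale. Illustrative lab-level inputs.
import numpy as np

def dersimonian_laird(zs, vs):
    """Random-effects pooled correlation and between-lab variance tau^2."""
    ws = 1 / vs
    z_fixed = np.sum(ws * zs) / np.sum(ws)
    q = np.sum(ws * (zs - z_fixed) ** 2)              # Cochran's Q
    c = np.sum(ws) - np.sum(ws ** 2) / np.sum(ws)
    tau2 = max(0.0, (q - (len(zs) - 1)) / c)          # method-of-moments tau^2
    ws_re = 1 / (vs + tau2)
    z_re = np.sum(ws_re * zs) / np.sum(ws_re)
    return np.tanh(z_re), tau2

rs = np.array([0.02, 0.08, 0.05, 0.11, -0.01, 0.06])  # six labs, illustrative
ns = np.array([180, 220, 150, 300, 210, 260])
zs, vs = np.arctanh(rs), 1 / (ns - 3)
print(dersimonian_laird(zs, vs))                      # pooled r, tau^2
```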

