Testing ANOVA Replications by Means of the Prior Predictive p-Value

2019 ◽  
Author(s):  
Marielle Zondervan-Zwijnenburg ◽  
Rens van de Schoot ◽  
Herbert Hoijtink

In the current study, we introduce the prior predictive p-value as a method to test replication of an analysis of variance (ANOVA). The prior predictive p-value is based on the prior predictive distribution. If we use the original study to compose the prior distribution, then the prior predictive distribution contains datasets that are expected given the original results. To determine whether the new data resulting from a replication study deviate from the data in the prior predictive distribution, we need to calculate a test statistic for each dataset. We propose to use F-bar, which measures to what degree the results of a dataset deviate from an inequality constrained hypothesis capturing the relevant features of the original study: H_RF. The inequality constraints in H_RF are based on the findings of the original study and can concern, for example, the ordering of means and interaction effects. The prior predictive p-value consequently tests to what degree the new data deviate from the data predicted given the original results, considering the findings of the original study. We explain the calculation of the prior predictive p-value step by step, elaborate on the topic of power, and illustrate the method with examples. The replication test and its integrated power and sample size calculator are made available in an R package and an online interactive application. As such, the current study supports researchers who want to adhere to the call for replication studies in the field of psychology.
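The logic described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R package and does not compute F-bar: the discrepancy measure `order_violation` is a simplified, hypothetical stand-in, and the model assumes a known common standard deviation and a normal prior centred on the original group means. All function and parameter names are assumptions made for illustration.

```python
import random

def order_violation(means):
    """Simplified stand-in for F-bar (an assumption, not the authors' statistic):
    summed squared size of violations of the ordering m1 < m2 < ... ."""
    return sum(max(0.0, a - b) ** 2 for a, b in zip(means, means[1:]))

def prior_predictive_p(orig_means, orig_se, new_means, new_n,
                       sd=1.0, sims=20_000, seed=1):
    """Proportion of prior predictive datasets that deviate from the ordering
    at least as much as the observed replication means do."""
    rng = random.Random(seed)
    observed = order_violation(new_means)
    new_se = sd / new_n ** 0.5  # standard error of a group mean in the replication
    exceed = 0
    for _ in range(sims):
        # 1. Draw population means from the prior built on the original study.
        mu = [rng.gauss(m, orig_se) for m in orig_means]
        # 2. Simulate the group means a replication of size new_n would produce.
        sim_means = [rng.gauss(m, new_se) for m in mu]
        # 3. Count simulated datasets at least as discrepant as the observed one.
        if order_violation(sim_means) >= observed:
            exceed += 1
    return exceed / sims

# Replication means that follow the original ordering versus means that reverse it.
p_same = prior_predictive_p([0.0, 0.5, 1.0], 0.1, [0.1, 0.4, 0.9], new_n=50)
p_reversed = prior_predictive_p([0.0, 0.5, 1.0], 0.1, [1.0, 0.5, 0.0], new_n=50)
```

A small p-value here indicates that the replication data deviate from the ordering more than almost any dataset predicted from the original results.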

2017 ◽  
Author(s):  
Michele B. Nuijten ◽  
Marcel A. L. M. van Assen ◽  
Chris Hubertus Joseph Hartgerink ◽  
Sacha Epskamp ◽  
Jelte M. Wicherts

The R package “statcheck” (Epskamp & Nuijten, 2016) is a tool to extract statistical results from articles and check whether the reported p-value matches the accompanying test statistic and degrees of freedom. A previous study showed high interrater reliabilities (between .76 and .89) between statcheck and manual coding of inconsistencies (Nuijten, Hartgerink, Van Assen, Epskamp, & Wicherts, 2016). Here we present an additional, detailed study of the validity of statcheck. In Study 1, we calculated its sensitivity and specificity. We found that statcheck’s sensitivity (true positive rate) and specificity (true negative rate) were high: between 85.3% and 100%, and between 96.0% and 100%, respectively, depending on the assumptions and settings. The overall accuracy of statcheck ranged from 96.2% to 99.9%. In Study 2, we investigated statcheck’s ability to deal with statistical corrections for multiple testing or violations of assumptions in articles. We found that the prevalence of corrections for multiple testing or violations of assumptions in psychology was higher than we initially estimated in Nuijten et al. (2016). Although we found numerous reporting inconsistencies in results corrected for violations of the sphericity assumption, we demonstrate that inconsistencies associated with statistical corrections are not what is causing the high estimates of the prevalence of statistical reporting inconsistencies in psychology.
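The kind of consistency check this abstract describes, recomputing a p-value from the reported test statistic and comparing it with the reported p-value, can be sketched in Python as follows. This is not statcheck's code or API. The function names are hypothetical, only z statistics and chi-square statistics with 1 degree of freedom are covered, and rounding of the reported test statistic itself is ignored.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def p_from_chi2_df1(x2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x2 / 2))

def is_consistent(reported_p, recomputed_p, decimals=2):
    """True if the recomputed p-value rounds to the reported p-value."""
    return round(recomputed_p, decimals) == round(reported_p, decimals)

# "z = 1.96, p = .05" is consistent; "z = 1.96, p = .03" is flagged.
ok = is_consistent(0.05, two_sided_p_from_z(1.96))
flagged = not is_consistent(0.03, two_sided_p_from_z(1.96))
```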


Author(s):  
Patrick Royston

Most randomized controlled trials with a time-to-event outcome are designed and analyzed assuming proportional hazards of the treatment effect. The sample-size calculation is based on a log-rank test or the equivalent Cox test. Nonproportional hazards are seen increasingly in trials and are recognized as a potential threat to the power of the log-rank test. To address the issue, Royston and Parmar (2016, BMC Medical Research Methodology 16: 16) devised a new “combined test” of the global null hypothesis of identical survival curves in each trial arm. The test, which combines the conventional Cox test with a new formulation, is based on the maximal standardized difference in restricted mean survival time (RMST) between the arms. The test statistic is based on evaluations of RMST over several preselected time points. The combined test involves the minimum p-value across the Cox and RMST-based tests, appropriately standardized to have the correct null distribution. In this article, I outline the combined test and introduce a command, stctest, that implements the combined test. I point the way to additional tools currently under development for power and sample-size calculation for the combined test.


2021 ◽  
Author(s):  
Qianrao Fu

Calculating the sample size before collecting data is a tradition that goes back to Jacob Cohen. The most commonly asked question is: "How many subjects do we need to obtain a significant result when we use the p-value to evaluate the hypothesis and an effect of a given size exists?" In the Bayesian framework, we may want to know how many subjects are needed to get convincing evidence if we use the Bayes factor to evaluate the hypothesis. This paper proposes a solution to the above question by reaching two goals: first, the size of the Bayes factor reaches a given threshold, and second, the probability that the Bayes factor exceeds the given threshold reaches a required value. Researchers can express their expectations through an order or sign hypothesis on the parameters in a linear regression model. For example, the researchers may expect the regression coefficients to satisfy $\beta_1>\beta_2>\beta_3$, which is an order constrained hypothesis; or the researchers may expect a regression coefficient $\beta_1>0$, which is a sign hypothesis. The greatest advantage of using a specific hypothesis is that, compared to an unconstrained hypothesis, a smaller sample size is required to achieve the same probability that the Bayes factor exceeds some threshold. This article provides sample size tables for the null hypothesis, order hypothesis, sign hypothesis, complement hypothesis, and unconstrained hypothesis. To enhance applicability, an R package based on Monte Carlo simulation is developed, which can help psychologists plan the sample size even if they do not have any statistical programming background.


Author(s):  
Yuhemy Zurizah ◽  
Rini Mayasari

ABSTRACT Low birth weight (LBW) is defined as a birth weight of less than 2,500 grams. WHO estimates that nearly all (98%) of the five million neonatal deaths occur in developing countries. According to the Palembang City Health Department, the infant mortality rate (IMR) was 3 per 1,000 live births in 2007, 4 per 1,000 live births in 2008, and approximately 2 per 1,000 live births in 2009. Causes of LBW include disease, maternal age, social circumstances, maternal habits, fetal factors, and environmental factors. The prognosis of LBW depends on the severity of problems in the perinatal period, such as gestational age (the shorter the gestation or the lower the infant's weight, the higher the mortality), asphyxia or brain ischemia, respiratory distress syndrome, and metabolic disturbances. This study aims to determine the relationship between maternal age and maternal education and the incidence of LBW at Dr. Mohammad Hoesin Central General Hospital, Palembang, in 2010. The study used an analytical cross-sectional survey. The study population was all 1,476 mothers who gave birth at the hospital in 2010; a sample of 94 mothers was taken by systematic random sampling, and the research instrument was a checklist. Data were analyzed with univariate and bivariate methods. Of the 94 mothers, 45 (47.9%) had an LBW infant; 26 LBW cases (27.7%) were reported for mothers in the high-risk age group and 52 LBW cases (55.3%) for mothers with low education. The chi-square test, comparing the p-value with a significance level of α = 0.05, showed a significant association between LBW and both maternal age (p = 0.002) and maternal education (p = 0.003) at Dr. Mohammad Hoesin Central General Hospital, Palembang, in 2010. Future researchers are encouraged to examine the topic in more depth.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Morteza Bitaraf Sani ◽  
Javad Zare Harofte ◽  
Mohammad Hossein Banabazi ◽  
Saeid Esmaeilkhanian ◽  
Ali Shafei Naderi ◽  
...  

For thousands of years, camels have produced meat, milk, and fiber in harsh desert conditions. To sustainably provide protein resources from desert areas, genetic improvement in camel breeding needs attention. Using the genotyping-by-sequencing (GBS) method, we produced over 14,500 genome-wide markers to conduct a genome-wide association study (GWAS) investigating the birth weight, daily gain, and body weight of 96 dromedaries in the Iranian central desert. A total of 99 SNPs were associated with birth weight, daily gain, and body weight (p-value < 0.002). Genomic breeding values (GEBVs) were estimated with the BGLR package using (i) all 14,522 SNPs and (ii) the 99 SNPs identified by GWAS. Twenty-eight SNPs were associated with birth weight, daily gain, and body weight (p-value < 0.001). Annotation of the genomic region(s) within ± 100 kb of the associated SNPs facilitated prediction of 36 candidate genes. The accuracy of GEBVs was more than 0.65 based on all 14,522 SNPs, but the regression coefficients for birth weight, daily gain, and body weight were 0.39, 0.20, and 0.23, respectively. Because of the small sample size, the GEBVs were predicted using the associated SNPs from GWAS. The accuracy of GEBVs based on the 99 associated SNPs was 0.62, 0.82, and 0.57 for birth weight, daily gain, and body weight, respectively. This report is the first GWAS using GBS on dromedary camels and identifies markers associated with growth traits that could help plan breeding programs for genetic improvement. Further research with a larger sample size, collaboration with camel farmers, and a more profound understanding will permit verification of the associated SNPs identified in this project. The preliminary results of this study show that genomic selection could be an appropriate route to genetic improvement of body weight in dromedary camels, which is challenging due to a long generation interval, seasonal reproduction, and a lack of records and pedigrees.


Author(s):  
Markus Ekvall ◽  
Michael Höhle ◽  
Lukas Käll

Motivation: Permutation tests offer a straightforward framework to assess the significance of differences in sample statistics. A significant advantage of permutation tests is that relatively few assumptions about the distribution of the test statistic are needed, as they rely on the assumption of exchangeability of the group labels. They have great value, as they allow a sensitivity analysis to determine the extent to which the assumed broad sample distribution of the test statistic applies. However, in this situation, permutation tests are rarely applied because the running time of naïve implementations is too slow and grows exponentially with the sample size. Nevertheless, continued development in the 1980s introduced dynamic programming algorithms that compute exact permutation tests in polynomial time. Despite this significant running time reduction, the exact test has not yet become one of the predominant statistical tests for medium sample sizes. Here, we propose a computational parallelization of one such dynamic programming-based permutation test, the Green algorithm, which makes the permutation test more attractive. Results: Parallelization of the Green algorithm was found possible by a non-trivial rearrangement of the structure of the algorithm. A speed-up by orders of magnitude is achievable by executing the parallelized algorithm on a GPU. We demonstrate that the execution time essentially becomes a non-issue, even for sample sizes as high as hundreds of samples. This improvement makes our method an attractive alternative to, e.g., the widely used asymptotic Mann-Whitney U-test. Availability and implementation: In Python 3 code from the GitHub repository https://github.com/statisticalbiotechnology/parallelPermutationTest under an Apache 2.0 license. Supplementary information: Supplementary data are available at Bioinformatics online.
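The contrast this abstract draws between naïve enumeration and dynamic programming can be shown with a small Python sketch. It is neither the Green algorithm nor the authors' GPU implementation: it is a generic subset-sum counting scheme that assumes non-negative integer observations and a one-sided test on the sum of the first group, and the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def exact_p_naive(x, y):
    """One-sided exact p-value by enumerating every relabelling (exponential time)."""
    pooled = list(x) + list(y)
    m, observed = len(x), sum(x)
    hits = sum(1 for idx in combinations(range(len(pooled)), m)
               if sum(pooled[i] for i in idx) >= observed)
    return hits / comb(len(pooled), m)

def exact_p_dp(x, y):
    """Same p-value by dynamic programming over (subset size, subset sum).
    Assumes non-negative integer observations."""
    pooled = list(x) + list(y)
    m, observed, total = len(x), sum(x), sum(pooled)
    # counts[k][s] = number of size-k subsets of the values seen so far with sum s
    counts = [[0] * (total + 1) for _ in range(m + 1)]
    counts[0][0] = 1
    for v in pooled:
        for k in range(m, 0, -1):  # descend so each value is used at most once
            row, prev = counts[k], counts[k - 1]
            for s in range(total, v - 1, -1):
                row[s] += prev[s - v]
    hits = sum(counts[m][observed:])
    return hits / comb(len(pooled), m)

p_naive = exact_p_naive([7, 9, 12, 15], [3, 4, 6, 8, 5])
p_dp = exact_p_dp([7, 9, 12, 15], [3, 4, 6, 8, 5])
```

The table has (m + 1) × (total + 1) cells and is updated once per observation, so the work grows polynomially with the sample size and the range of the data, whereas the enumeration visits every one of the C(n, m) relabellings.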


2021 ◽  
Vol 17 (1) ◽  
Author(s):  
Rutu Rathod ◽  
Hongmei Zhang ◽  
Wilfried Karmaus ◽  
Susan Ewart ◽  
Latha Kadalayil ◽  
...  

Purpose: Body mass index (BMI) is associated with asthma, but associations of BMI temporal patterns with asthma incidence are unclear. Previous studies suggest that DNA methylation (DNAm) is associated with asthma status and that variation in DNAm is a consequence of BMI changes. This study assessed the direct and indirect (via DNAm) effects of BMI trajectories in childhood on asthma incidence at young adulthood. Methods: Data from the Isle of Wight (IoW) birth cohort were included in the analyses. Group-based trajectory modelling was applied to infer latent BMI trajectories from ages 1 to 10 years. An R package, ttscreening, was applied to identify differentially methylated CpGs at age 10 years associated with BMI trajectories, stratified for sex. Logistic regressions were used to further exclude CpGs with DNAm at age 10 years not associated with asthma incidence at 18 years. CpGs discovered via path analyses that mediated the association of BMI trajectories with asthma incidence in the IoW cohort were further tested in an independent cohort, the Avon Longitudinal Study of Parents and Children (ALSPAC). Results: Two BMI trajectories (high vs. normal) were identified. Of the 442,474 CpG sites, DNAm at 159 CpGs in males and 212 in females was potentially associated with BMI trajectories. Assessment of their association with asthma incidence identified 9 CpGs in males and 6 CpGs in females. DNAm at 4 of these 15 CpGs showed statistically significant mediation effects (p-value < 0.05). At two of the 4 CpGs (cg23632109 and cg10817500), DNAm completely mediated the association (i.e., only statistically significant indirect effects were identified). In the ALSPAC cohort, at all four CpGs, the same direction of mediating effects was observed as in the IoW cohort, although the effects were not statistically significant. Conclusion: The association of BMI trajectory in childhood with asthma incidence at young adulthood is possibly mediated by DNAm.


2021 ◽  
Vol 4 (1) ◽  
pp. 251524592097262
Author(s):  
Don van Ravenzwaaij ◽  
Alexander Etz

When social scientists wish to learn about an empirical phenomenon, they perform an experiment. When they wish to learn about a complex numerical phenomenon, they can perform a simulation study. The goal of this Tutorial is twofold. First, it introduces how to set up a simulation study using the relatively simple example of simulating from the prior. Second, it demonstrates how simulation can be used to learn about the Jeffreys-Zellner-Siow (JZS) Bayes factor, a currently popular implementation of the Bayes factor employed in the BayesFactor R package and freeware program JASP. Many technical expositions on Bayes factors exist, but these may be somewhat inaccessible to researchers who are not specialized in statistics. In a step-by-step approach, this Tutorial shows how a simple simulation script can be used to approximate the calculation of the Bayes factor. We explain how a researcher can write such a sampler to approximate Bayes factors in a few lines of code, what the logic is behind the Savage-Dickey method used to visualize Bayes factors, and what the practical differences are for different choices of the prior distribution used to calculate Bayes factors.
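The idea of approximating a Bayes factor by simulating from the prior can be sketched in Python as below. This is not the Tutorial's script and not the JZS Bayes factor: it assumes a simpler conjugate model (a sample mean with known unit variance and a normal prior on the effect) so that the simulated value can be checked against a closed form and against the Savage-Dickey density ratio. All names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def bf10_by_prior_sampling(ybar, n, prior_sd=1.0, draws=100_000, seed=1):
    """Monte Carlo BF10 for H1: delta ~ N(0, prior_sd^2) versus H0: delta = 0.
    Assumed data model: ybar ~ N(delta, 1/n), i.e. known unit variance."""
    rng = random.Random(seed)
    se = 1 / math.sqrt(n)
    # Marginal likelihood under H1 = likelihood averaged over draws from the prior.
    marginal_h1 = sum(normal_pdf(ybar, rng.gauss(0, prior_sd), se)
                      for _ in range(draws)) / draws
    return marginal_h1 / normal_pdf(ybar, 0.0, se)

def bf10_exact(ybar, n, prior_sd=1.0):
    """Closed-form BF10 for the same model, used to check the sampler."""
    se2 = 1 / n
    return (normal_pdf(ybar, 0.0, math.sqrt(prior_sd ** 2 + se2))
            / normal_pdf(ybar, 0.0, math.sqrt(se2)))

def bf01_savage_dickey(ybar, n, prior_sd=1.0):
    """Savage-Dickey ratio: posterior density at delta = 0 over prior density at 0."""
    post_var = 1 / (n + 1 / prior_sd ** 2)
    post_mean = n * ybar * post_var
    return normal_pdf(0.0, post_mean, math.sqrt(post_var)) / normal_pdf(0.0, 0.0, prior_sd)

estimate = bf10_by_prior_sampling(0.4, 25)
exact = bf10_exact(0.4, 25)
```

In this model the Savage-Dickey ratio equals 1 / BF10 exactly, which gives a second check on the sampler in addition to the closed form.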


2021 ◽  
Author(s):  
Neil McLatchie ◽  
Manuela Thomae

Thomae and Viki (2013) reported that increased exposure to sexist humour can increase rape proclivity among males, specifically those who score high on measures of Hostile Sexism. Here we report two pre-registered direct replications (N = 530) of Study 2 from Thomae and Viki (2013) and assess replicability via (i) statistical significance, (ii) Bayes factors, (iii) the small-telescope approach, and (iv) an internal meta-analysis across the original and replication studies. The original results were not supported by any of the approaches. Combining the original study and the replications yielded moderate evidence in support of the null over the alternative hypothesis, with a Bayes factor of B = 0.13. In light of the combined evidence, we encourage researchers to exercise caution before claiming that brief exposure to sexist humour increases males’ proclivity towards rape, until further pre-registered and open research demonstrates that the effect is reliably reproducible.

