Effect size distribution
Recently Published Documents

Total documents: 19 (five years: 10)
H-index: 4 (five years: 1)

PLoS ONE, 2021, Vol. 16(9), e0257535
Author(s): Max M. Owens, Alexandra Potter, Courtland S. Hyatt, Matthew Albaugh, Wesley K. Thompson, ...

Effect sizes are commonly interpreted using heuristics established by Cohen (e.g., small: r = .1, medium: r = .3, large: r = .5), despite mounting evidence that these guidelines are miscalibrated to the effects typically found in psychological research. This study's aims were to 1) describe the distribution of effect sizes across multiple instruments, 2) consider factors qualifying the effect size distribution, and 3) identify examples as benchmarks for various effect sizes. For aim one, effect size distributions were illustrated in a large, diverse sample of 9- to 10-year-old children by conducting Pearson's correlations among 161 variables representing constructs from all questionnaires and tasks in the Adolescent Brain Cognitive Development (ABCD) Study® baseline data. For aim two, factors qualifying this distribution were tested by comparing the distributions of effect size across several modifications of the aim one analyses: comparisons of effect size distributions for different types of variables, for analyses using statistical thresholds, and for analyses using several covariate strategies. In aim one analyses, the median in-sample effect size was .03, and the values at the first and third quartiles were .01 and .07. In aim two analyses, effects were smaller for associations across instruments, content domains, and reporters, as well as when covarying for sociodemographic factors; effect sizes were larger when thresholding for statistical significance. In analyses intended to mimic the conditions of "real-world" analysis of ABCD data, the median in-sample effect size was .05, and the values at the first and third quartiles were .03 and .09. For aim three, examples of varying effect sizes are reported from the ABCD dataset as benchmarks for future work with the dataset. In summary, this report finds that empirically determined effect sizes from a notably large dataset are smaller than would be expected based on existing heuristics.
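As a rough illustration of the aim one computation (a sketch on synthetic data, not the authors' code; the sample size and random data are assumptions), the distribution of pairwise Pearson effect sizes and its quartiles can be computed like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 161 questionnaire/task variables
# (the real analysis used ABCD Study baseline data).
n_participants, n_vars = 5000, 161
data = rng.normal(size=(n_participants, n_vars))

# All pairwise Pearson correlations; keep the upper triangle so each
# variable pair is counted once, and take absolute values as effect sizes.
corr = np.corrcoef(data, rowvar=False)
effect_sizes = np.abs(corr[np.triu_indices(n_vars, k=1)])

# Summarize the distribution by its quartiles, mirroring the paper's
# median / Q1 / Q3 summary.
q1, median, q3 = np.percentile(effect_sizes, [25, 50, 75])
print(f"Q1 = {q1:.3f}, median = {median:.3f}, Q3 = {q3:.3f}")
```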


2021
Author(s): Kleber Neves, Pedro Batista Tan, Olavo Bohrer Amaral

Diagnostic screening models for the interpretation of null hypothesis significance test (NHST) results have been influential in highlighting the effect of selective publication on the reproducibility of the published literature, leading to John Ioannidis' much-cited claim that most published research findings are false. These models, however, typically assume that hypotheses are dichotomously true or false, without considering that effect sizes differ across hypotheses. To address this limitation, we develop a simulation model that represents effect sizes explicitly, using different continuous distributions, while retaining other features of previous models such as publication bias and the pursuit of statistical significance. Our results show that the combination of selective publication, bias, low statistical power, and unlikely hypotheses consistently leads to high proportions of false positives, irrespective of the effect size distribution assumed. Using continuous effect sizes also allows us to evaluate the degree of effect size overestimation and the prevalence of estimates with the wrong sign in the literature, showing that the same factors that drive false-positive results also lead to errors in estimating effect size direction and magnitude. Nevertheless, the relative influence of these factors on different metrics varies depending on the distribution assumed for effect sizes. The model is made available as an R Shiny app, allowing users to explore features of the literature under various scenarios.
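A minimal sketch of this kind of continuous-effect-size screening simulation (my simplification, not the authors' model; the effect-size spread, group size, and significance threshold are assumed values) illustrates how selective publication inflates published estimates and produces sign errors:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Each study estimates a true standardized effect drawn from a continuous
# distribution; only significant results are "published".
n_studies, n = 50_000, 30                    # n observations per group (assumed)
true_d = rng.normal(0.0, 0.15, n_studies)    # continuous true effect sizes

se = np.sqrt(2.0 / n)                        # SE of a two-sample mean difference
est = true_d + rng.normal(0.0, se, n_studies)
pvals = 2 * stats.norm.sf(np.abs(est) / se)

published = pvals < 0.05                     # selective publication
sign_error = np.sign(est[published]) != np.sign(true_d[published])
exaggeration = np.abs(est[published]) / np.abs(true_d[published])

print(f"published: {published.mean():.1%}")
print(f"wrong-sign rate among published: {sign_error.mean():.1%}")
print(f"median exaggeration ratio: {np.median(exaggeration):.2f}")
```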


2021
Author(s): Niccolò Tesi, Marc Hulsman, Sven J. van der Lee, Iris E. Jansen, Najada Stringa, ...

The genetics underlying human longevity is influenced by the genetic risk of developing, or escaping, age-related diseases. As Alzheimer's disease (AD) represents one of the most common conditions at old age, an interplay between genetic factors for AD and longevity is expected.

We explored this interplay by studying the prevalence of 38 AD-associated single-nucleotide polymorphisms (SNPs) identified in AD GWAS in self-reported cognitively healthy centenarians, and we replicated findings in the largest GWAS on parental longevity.

We found that 28/38 SNPs associated with increased AD risk were also associated with decreased odds of longevity. For each SNP, we express the imbalance between AD risk and longevity risk as an effect-size distribution. When grouping the SNPs based on these distributions, we found three groups. Seventeen variants increased AD risk more than they decreased the odds of longevity (AD group): these variants were functionally enriched for β-amyloid metabolism and immune signaling, and at the cell-type level they were enriched in microglia. Eleven variants had a larger effect on longevity than on AD (Longevity group): these variants were enriched for endocytosis/immune signaling, and at the cell-type level they were enriched in microglia and endothelial cells. Beyond AD, these variants were previously associated with other aging-related diseases, including cardiovascular and autoimmune diseases and cancer. Unexpectedly, ten variants were associated with both an increased risk of AD and increased odds of longevity (Unexpected group). The effects of the SNPs in the AD and Longevity groups replicated in the largest GWAS on parental longevity, while the effects on longevity of the SNPs in the Unexpected group could not be replicated, suggesting that these effects may not be robust across different studies.

Our study shows that some AD-associated variants negatively affect longevity primarily through their increased risk of AD, while other variants negatively affect longevity through an increased risk of multiple age-related diseases, including AD.
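The grouping logic can be illustrated with a small hypothetical sketch (variant IDs, effect sizes, and the dominance rule are illustrative assumptions, not the authors' pipeline):

```python
# Effects are log odds ratios for the same effect allele: beta_ad on AD
# risk, beta_longevity on the odds of reaching extreme old age.
snps = [
    {"id": "snp_a", "beta_ad": 0.30, "beta_longevity": -0.10},
    {"id": "snp_b", "beta_ad": 0.05, "beta_longevity": -0.20},
    {"id": "snp_c", "beta_ad": 0.10, "beta_longevity": 0.15},
]

def classify(beta_ad, beta_longevity):
    """Assign a variant to one of the paper's three groups."""
    if beta_ad > 0 and beta_longevity > 0:
        return "Unexpected"   # raises AD risk yet also raises odds of longevity
    if abs(beta_ad) > abs(beta_longevity):
        return "AD"           # AD effect dominates the longevity effect
    return "Longevity"        # longevity effect dominates

for snp in snps:
    print(snp["id"], classify(snp["beta_ad"], snp["beta_longevity"]))
```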


Author(s): Junji Morisawa, Takahiro Otani, Jo Nishino, Ryo Emoto, Kunihiko Takahashi, ...

Bayes factor analysis has the attractive property of accommodating the risks of both false negatives and false positives when identifying susceptibility gene variants in genome-wide association studies (GWASs). For a particular SNP, the critical aspect of this analysis is that it incorporates the probability of obtaining the observed value of the disease-association statistic under the alternative hypothesis of non-null association. An approximate Bayes factor (ABF) was proposed by Wakefield (Genetic Epidemiology 2009;33:79–86) based on a normal prior for the underlying effect-size distribution. However, misspecification of the prior can lead to failure in incorporating the probability under the alternative hypothesis. In this paper, we propose a semi-parametric, empirical Bayes factor (SP-EBF) based on a nonparametric effect-size distribution estimated from the data. Analysis of several GWAS datasets revealed the presence of substantial numbers of SNPs with small effect sizes, and the SP-EBF attributed much greater significance to such SNPs than the ABF did. Overall, the SP-EBF incorporates an effect-size distribution that is estimated from the data, and it has the potential to improve the accuracy of Bayes factor analysis in GWASs.
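For reference, Wakefield's ABF with a normal N(0, W) prior has a closed form; a minimal sketch follows (the prior standard deviation and the example estimate are assumed illustrative values):

```python
import numpy as np

def wakefield_abf(beta_hat, se, prior_sd=0.2):
    """Wakefield's approximate Bayes factor for a single SNP.

    Under H0 the estimate is N(0, V); under H1 it is N(0, V + W),
    where V = se**2 and W = prior_sd**2. Returns the Bayes factor
    in favour of the null, as in Wakefield (2009).
    """
    V, W = se**2, prior_sd**2
    z2 = (beta_hat / se) ** 2
    return np.sqrt((V + W) / V) * np.exp(-0.5 * z2 * W / (V + W))

# Example: a modest effect estimated with a small standard error.
print(wakefield_abf(beta_hat=0.05, se=0.01))
```

Small ABF values favor the alternative; a prior variance W that is badly mismatched to the true effect-size distribution is exactly the failure mode the SP-EBF is designed to avoid.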


2020, Vol. 2020, pp. 1-11
Author(s): Ryo Emoto, Atsushi Kawaguchi, Kunihiko Takahashi, Shigeyuki Matsui

In disease-association studies using neuroimaging data, evaluating the biological or clinical significance of individual associations requires not only detection of disease-associated areas of the brain but also estimation of the magnitudes of the associations, or effect sizes, for individual brain areas. In this paper, we propose a model-based framework for voxel-based inference under spatial dependency in neuroimaging data. Specifically, we employ hierarchical mixture models with a hidden Markov random field structure to incorporate the spatial dependency between voxels, and we propose a nonparametric specification to flexibly estimate the underlying effect size distribution. Simulation experiments demonstrate that, compared with a naive estimation method, the proposed methods can substantially reduce the selection bias in the effect size estimates of the selected voxels with the greatest observed associations. An application to neuroimaging data from an Alzheimer's disease study is provided.
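The selection bias the proposed methods target, a winner's curse on the voxels with the greatest observed associations, is easy to reproduce in a few lines; the following sketch uses assumed effect sizes and noise levels:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed setup: most voxels carry no association; a small fraction carry
# modest positive effects, observed with noise.
n_voxels = 10_000
is_signal = rng.random(n_voxels) < 0.05
true_effect = np.where(is_signal, rng.normal(0.3, 0.1, n_voxels), 0.0)
observed = true_effect + rng.normal(0.0, 0.2, n_voxels)

# Naive approach: report the observed effects of the 100 top-ranked voxels.
top = np.argsort(observed)[-100:]
print("mean observed effect, top 100:", round(observed[top].mean(), 3))
print("mean true effect,     top 100:", round(true_effect[top].mean(), 3))
```

The naive mean for the selected voxels overshoots the true mean because ranking on noisy estimates preferentially selects voxels whose noise happened to be large and positive.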


2020
Author(s): Luke Jen O'Connor

The genetic effect-size distribution describes the number of variants that affect disease risk and the range of their effect sizes. Accurate estimates of this distribution would provide insights into genetic architecture and set sample-size targets for future genome-wide association studies. We developed Fourier Mixture Regression (FMR) to estimate common-variant effect-size distributions from GWAS summary statistics. We validated FMR in simulations and in analyses of UK Biobank data, using interim-release summary statistics (max N=145k) to predict the results of the full release (N=460k). Analyzing summary statistics for 10 diseases (avg Neff=169k) and 22 other traits, we estimated the sample size required for genome-wide significant SNPs to explain 50% of SNP-heritability. For most diseases the requisite number of cases is 100k-1M, an attainable number; ten times more would be required to explain 90% of heritability. In well-powered GWAS, genome-wide significance is a conservative threshold, and loci at less stringent thresholds have true-positive rates that remain close to 1 if confounding is controlled. Analyzing the shape of the effect-size distribution, we estimate that heritability accumulates across many thousands of SNPs with a wide range of effect sizes: the largest effects (at the 90th percentile of heritability) are 100 times larger than the smallest (at the 10th percentile), and while the midpoint of this range varies across traits, its width is similar. These results suggest attainable sample-size targets for future GWAS, and they underscore the complexity of genetic architecture.
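The percentile-of-heritability comparison can be made concrete with a short sketch (the log-normal distribution of squared effects is an assumption for illustration; FMR itself estimates the distribution from summary statistics):

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative distribution of squared per-SNP effects (assumed log-normal).
beta2 = rng.lognormal(mean=-10.0, sigma=1.5, size=100_000)

# Sort squared effects and find the effect sizes at the 10th and 90th
# percentiles of cumulative heritability, as in the paper's comparison.
order = np.argsort(beta2)
cum_h2 = np.cumsum(beta2[order]) / beta2.sum()
b10 = beta2[order][np.searchsorted(cum_h2, 0.10)]
b90 = beta2[order][np.searchsorted(cum_h2, 0.90)]
print("squared-effect ratio, 90th vs 10th heritability percentile:",
      round(b90 / b10, 1))
```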


2019
Author(s): Alexey A. Shadrin, Oleksandr Frei, Olav B. Smeland, Francesco Bettella, Kevin S. O'Connell, ...

Determining the contribution of functional genetic categories is fundamental to understanding the genetic etiology of complex human traits and diseases. Here we present Annotation Informed MiXeR, a likelihood-based method to estimate the number of variants influencing a phenotype and their effect sizes across different functional annotation categories of the genome, using summary statistics from genome-wide association studies. Applying the model to 11 complex phenotypes suggests diverse patterns of functional category-specific genetic architecture across human diseases and traits.
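As a heavily simplified illustration of the kind of mixture likelihood such methods maximize (ignoring LD and annotation overlap, with a single category; this is not the MiXeR implementation), consider a spike-and-slab model for GWAS z-scores:

```python
import numpy as np
from scipy import stats

def neg_log_likelihood(params, z, n):
    """Spike-and-slab likelihood for z-scores in one annotation category:
    a variant is causal with probability pi, in which case its z-score
    variance is inflated by n * sigma2 (no-LD simplification)."""
    pi, sigma2 = params
    null = stats.norm.pdf(z, 0.0, 1.0)
    causal = stats.norm.pdf(z, 0.0, np.sqrt(1.0 + n * sigma2))
    return -np.sum(np.log((1 - pi) * null + pi * causal))

# Simulate z-scores and fit by grid search for simplicity
# (real methods use proper optimizers).
rng = np.random.default_rng(4)
n = 100_000
z = np.where(rng.random(5000) < 0.01,
             rng.normal(0, np.sqrt(1 + n * 1e-4), 5000),
             rng.normal(0, 1, 5000))
grid = [(pi, s2) for pi in (0.001, 0.01, 0.1) for s2 in (1e-5, 1e-4, 1e-3)]
best = min(grid, key=lambda p: neg_log_likelihood(p, z, n))
print("best (pi, sigma2):", best)
```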


2019
Author(s): Guanghao Qi, Nilanjan Chatterjee

Background: Mendelian randomization (MR) has provided major opportunities for understanding the causal relationships among complex traits. Previous studies have often evaluated MR methods using simulations that do not adequately reflect the data-generating mechanism of GWAS, and there are often discrepancies between the performance of MR methods in simulations and in real datasets.

Methods: We use a simulation framework that generates data on full GWAS for two traits under a realistic model for the effect-size distribution, coherent with the heritability, co-heritability, and polygenicity typically observed for complex traits. We further use recent data from GWAS of 38 biomarkers in the UK Biobank to investigate their causal effects on risk of type 2 diabetes using externally available GWAS summary statistics.

Results: Simulation studies show that weighted mode and MRMix are the only two methods that maintain the correct type I error rate in a diverse set of scenarios. Between the two, MRMix tends to be more powerful for larger GWAS, while the opposite is true for smaller sample sizes. Among the other methods, random-effect IVW, MR-Robust, and MR-RAPS tend to perform best in maintaining low mean squared error when the InSIDE assumption is satisfied, but can produce large bias when InSIDE is violated. In the real data analysis, some biomarkers showed major heterogeneity across methods in the estimates of their causal effects on risk of type 2 diabetes, with patterns similar to those observed in the simulation studies.

Conclusions: The relative performance of different MR methods depends heavily on the sample sizes of the underlying GWAS, the proportion of valid instruments, and the validity of the InSIDE assumption.

Key messages: Many previous simulation studies evaluating Mendelian randomization methods do not adequately reflect the data-generating mechanism of genome-wide association studies (GWAS). We use a simulation framework that generates data on full GWAS under a realistic model informed by recent studies of effect-size distribution, and we use very recent GWAS data on a large number of biomarkers to evaluate their causal effects on type 2 diabetes using alternative methods. Among the 10 methods compared, relative performance depends heavily on the sample sizes of the underlying GWAS, the proportion of valid instruments, and the validity of the InSIDE assumption; weighted mode and MRMix are the only two methods that maintain the correct type I error rate in a diverse set of scenarios.
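Several of the compared estimators build on the inverse-variance weighted (IVW) estimate; a minimal fixed-effect sketch on synthetic instruments follows (instrument count, effect sizes, and standard errors are assumed values):

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate: a weighted
    average of per-instrument ratio estimates, equivalent to a weighted
    regression of outcome effects on exposure effects through the origin."""
    w = beta_exposure**2 / se_outcome**2
    ratio = beta_outcome / beta_exposure
    est = np.sum(w * ratio) / np.sum(w)
    se = 1.0 / np.sqrt(np.sum(w))
    return est, se

# Synthetic example: 50 valid instruments, true causal effect 0.1.
rng = np.random.default_rng(5)
bx = rng.normal(0.05, 0.01, 50)
se_y = np.full(50, 0.01)
by = 0.1 * bx + rng.normal(0.0, se_y)
print(ivw_estimate(bx, by, se_y))
```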

