Factors influencing the effect size distribution of adaptive substitutions

The distribution of effect sizes of adaptive substitutions has been central to evolutionary biology since the modern synthesis. Early theory proposed that because large-effect mutations have negative pleiotropic consequences, only small-effect mutations contribute to adaptation. More recent theory suggested instead that large-effect mutations could be favoured when populations are far from their adaptive peak. Here we suggest that the distributions of effect sizes are expected to differ among study systems, reflecting the wide variation in evolutionary forces and ecological conditions experienced in nature. These include selection, mutation, genetic drift, gene flow, and other factors such as the degree of pleiotropy, the distance to the phenotypic optimum, whether the optimum is stable or moving, and whether new mutation or standing genetic variation provides the source of adaptive alleles. Our goal is to review how these factors might affect the distribution of effect sizes and to identify new research directions. Until more theory and empirical work is available, we feel that it is premature to make broad generalizations about the effect size distribution of adaptive substitutions important in nature.


The distribution of common-variant effect sizes

10.1101/2020.09.19.304097 ◽

2020 ◽

Author(s):

Luke Jen O’Connor

Keyword(s):

Sample Size ◽

Size Distribution ◽

Effect Size ◽

Genetic Architecture ◽

Common Variant ◽

Effect Sizes ◽

Summary Statistics ◽

Genome Wide ◽

Variant Effect ◽

Effect Size Distribution

The genetic effect-size distribution describes the number of variants that affect disease risk and the range of their effect sizes. Accurate estimates of this distribution would provide insights into genetic architecture and set sample-size targets for future genome-wide association studies. We developed Fourier Mixture Regression (FMR) to estimate common-variant effect-size distributions from GWAS summary statistics. We validated FMR in simulations and in analyses of UK Biobank data, using interim-release summary statistics (max N=145k) to predict the results of the full release (N=460k). Analyzing summary statistics for 10 diseases (avg Neff=169k) and 22 other traits, we estimated the sample size required for genome-wide significant SNPs to explain 50% of SNP-heritability. For most diseases the requisite number of cases is 100k-1M, an attainable number; ten times more would be required to explain 90% of heritability. In well-powered GWAS, genome-wide significance is a conservative threshold, and loci at less stringent thresholds have true positive rates that remain close to 1 if confounding is controlled. Analyzing the shape of the effect-size distribution, we estimate that heritability accumulates across many thousands of SNPs with a wide range of effect sizes: the largest effects (at the 90th percentile of heritability) are 100 times larger than the smallest (10th percentile), and while the midpoint of this range varies across traits, its size is similar. These results suggest attainable sample size targets for future GWAS, and they underscore the complexity of genetic architecture.


An effect size distribution analysis of heart rate variability studies: Recommendations for reporting the magnitude of group differences

10.1101/072660 ◽

2016 ◽

Author(s):

Daniel S. Quintana

Keyword(s):

Heart Rate ◽

Heart Rate Variability ◽

Size Distribution ◽

Effect Size ◽

Case Control ◽

The Body ◽

Effect Sizes ◽

Group Differences ◽

Distribution Analysis ◽

Effect Size Distribution

The calculation of heart rate variability (HRV) is a popular tool used to investigate differences in cardiac autonomic control between population samples. When interpreting effect sizes to quantify the magnitude of group differences, researchers typically use Cohen's guidelines of small (0.2), medium (0.5), and large (0.8) effects. However, these guidelines were only proposed for use when the effect size distribution (ESD) was unknown. Despite the availability of effect sizes from hundreds of HRV studies, researchers still largely rely on Cohen's guidelines to interpret effect sizes. This article describes an ESD analysis of 297 HRV effect sizes from case-control studies, revealing that the 25th, 50th, and 75th effect size percentiles correspond with effect sizes of 0.25, 0.5, and 0.84, respectively. The ESDs for separate clinical groups are also presented. The data suggest that Cohen's guidelines underestimate the magnitude of small and large effect sizes for the body of HRV case-control research. Therefore, to better reflect observed HRV effect sizes, the data suggest that effect sizes of 0.25, 0.5, and 0.85 should be interpreted as small, medium, and large effects. Researchers are encouraged to use the ESD dataset or their own collected datasets in tandem with the provided analysis script to perform bespoke ESD analyses relevant to their specific research area.
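
The percentile-based benchmarks described in this abstract can be illustrated with a minimal sketch. This is not the authors' analysis script: the function names `esd_benchmarks` and `label_effect` are hypothetical, the input values are made up, and the choice of the "inclusive" quantile method is an assumption.

```python
import statistics


def esd_benchmarks(effect_sizes):
    """Return the 25th, 50th, and 75th percentiles of absolute effect sizes.

    The percentiles are used as empirical "small", "medium", and "large"
    benchmarks, in the spirit of an effect size distribution (ESD) analysis.
    """
    magnitudes = sorted(abs(e) for e in effect_sizes)
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, q2, q3 = statistics.quantiles(magnitudes, n=4, method="inclusive")
    return {"small": q1, "medium": q2, "large": q3}


def label_effect(effect, benchmarks):
    """Label one effect size against the empirical benchmarks."""
    magnitude = abs(effect)
    if magnitude < benchmarks["small"]:
        return "below small"
    if magnitude < benchmarks["medium"]:
        return "small"
    if magnitude < benchmarks["large"]:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Made-up effect sizes, for illustration only.
    observed = [0.1, -0.2, 0.3, -0.4, 0.5]
    benchmarks = esd_benchmarks(observed)
    print(benchmarks)
    print(label_effect(0.45, benchmarks))
```

Taking absolute values treats the direction of a group difference as irrelevant to its magnitude; an analysis that cares about direction would keep the signs.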


Effect Sizes: Conventional Choices and Calculations

Handbook of Meta-analysis in Ecology and Evolution ◽

10.23943/princeton/9780691137285.003.0006 ◽

2013 ◽

Cited By ~ 1

Author(s):

Michael S. Rosenberg ◽

Hannah R. Rothstein ◽

Jessica Gurevitch

Keyword(s):

Critical Point ◽

Effect Size ◽

Evolutionary Biology ◽

Meta Analysis ◽

Statistical Parameter ◽

Effect Sizes ◽

Common Effect ◽

Inference Methods ◽

Conventional Effect ◽

Estimate Effect Size

One of the fundamental concepts in meta-analysis is that of the effect size. An effect size is a statistical parameter that can be used to compare, on the same scale, the results of different studies in which a common effect of interest has been measured. This chapter describes the conventional effect sizes most commonly encountered in ecology and evolutionary biology, and the types of data associated with them. While choice of a specific measure of effect size may influence the interpretation of results, it does not influence the actual inference methods of meta-analysis. One critical point to remember is that one cannot combine different measures of effect size in a single meta-analysis: once you have chosen how you are going to estimate effect size, you need to use it for all of the studies to be analyzed.
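
As an illustration of what a conventional effect size looks like in practice, the sketch below computes two measures widely used in ecological meta-analysis: the bias-corrected standardized mean difference (Hedges' g, using the common approximate correction factor) and the log response ratio. The chapter text above does not give these formulas; they are stated here as standard conventions, and the function names are hypothetical.

```python
import math


def hedges_g(mean1, mean2, sd1, sd2, n1, n2):
    """Bias-corrected standardized mean difference between two groups."""
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / pooled_sd
    # Common approximation to the small-sample correction factor.
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return d * correction


def log_response_ratio(mean_treatment, mean_control):
    """Natural log of the ratio of two (positive) group means."""
    return math.log(mean_treatment / mean_control)


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    print(hedges_g(10.0, 8.0, 2.0, 2.0, 10, 10))
    print(log_response_ratio(10.0, 8.0))
```

The two functions return values on different scales, which is why the abstract's warning applies: a single meta-analysis should use one of them throughout, not a mixture.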


Inference of genetic architecture from chromosome partitioning analyses is sensitive to genome variation, sample size, heritability and effect size distribution

Molecular Ecology Resources ◽

10.1111/1755-0998.12774 ◽

2018 ◽

Vol 18 (4) ◽

pp. 767-777 ◽

Cited By ~ 7

Author(s):

Petri Kemppainen ◽

Arild Husby

Keyword(s):

Sample Size ◽

Size Distribution ◽

Effect Size ◽

Genetic Architecture ◽

Genome Variation ◽

Chromosome Partitioning ◽

Effect Size Distribution


Are Most Published Research Findings False In A Continuous Universe?

10.31222/osf.io/jk7sa ◽

2021 ◽

Author(s):

Kleber Neves ◽

Pedro Batista Tan ◽

Olavo Bohrer Amaral

Keyword(s):

Publication Bias ◽

Effect Size ◽

Statistical Power ◽

Statistical Significance ◽

Significance Test ◽

Effect Sizes ◽

Diagnostic Screening ◽

Research Findings ◽

Published Research ◽

Effect Size Distribution

Diagnostic screening models for the interpretation of null hypothesis significance test (NHST) results have been influential in highlighting the effect of selective publication on the reproducibility of the published literature, leading to John Ioannidis’ much-cited claim that most published research findings are false. These models, however, are typically based on the assumption that hypotheses are dichotomously true or false, without considering that effect sizes for different hypotheses are not the same. To address this limitation, we develop a simulation model that represents effect sizes explicitly using different continuous distributions, while retaining other aspects of previous models such as publication bias and the pursuit of statistical significance. Our results show that the combination of selective publication, bias, low statistical power and unlikely hypotheses consistently leads to high proportions of false positives, irrespective of the effect size distribution assumed. Using continuous effect sizes also allows us to evaluate the degree of effect size overestimation and the prevalence of estimates with the wrong sign in the literature, showing that the same factors that drive false-positive results also lead to errors in estimating effect size direction and magnitude. Nevertheless, the relative influence of these factors on different metrics varies depending on the distribution assumed for effect sizes. The model is made available as an R Shiny app interface, allowing one to explore features of the literature in various scenarios.
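
The general idea of simulating a literature with continuous true effects and selective publication can be sketched as follows. This is a toy model written for illustration, not the authors' simulation: the normal distribution of true effects, the standard-error approximation, the publication probability, and every default value are assumptions, and `simulate_literature` is a hypothetical name.

```python
import math
import random


def simulate_literature(n_studies=20000, effect_sd=0.2, n_per_group=20,
                        null_publish_prob=0.1, seed=0):
    """Toy literature simulation with continuous true effects.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Approximate standard error of a standardized mean difference
    # between two groups of equal size.
    se = math.sqrt(2.0 / n_per_group)
    z_crit = 1.96  # two-sided 5% significance threshold

    significant_results = []
    n_published = 0
    for _ in range(n_studies):
        true_effect = rng.gauss(0.0, effect_sd)  # continuous true effect
        estimate = rng.gauss(true_effect, se)    # noisy study estimate
        significant = abs(estimate) / se > z_crit
        # Publication bias: non-significant results are rarely published.
        if significant or rng.random() < null_publish_prob:
            n_published += 1
            if significant:
                significant_results.append((true_effect, estimate))

    n_sig = len(significant_results)
    wrong_sign = sum(1 for t, e in significant_results if t * e < 0) / n_sig
    mean_abs_true = sum(abs(t) for t, _ in significant_results) / n_sig
    mean_abs_est = sum(abs(e) for _, e in significant_results) / n_sig
    return {
        "published": n_published,
        "significant": n_sig,
        "wrong_sign_rate": wrong_sign,
        "exaggeration_ratio": mean_abs_est / mean_abs_true,
    }


if __name__ == "__main__":
    for key, value in simulate_literature().items():
        print(key, value)
```

With small true effects and low power, the significant estimates in this toy model are on average larger in magnitude than the true effects behind them, and some have the opposite sign, which mirrors the overestimation and wrong-sign metrics the abstract describes.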


Effect-Size Estimation Using Semiparametric Hierarchical Mixture Models in Disease-Association Studies with Neuroimaging Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2020/7482403 ◽

2020 ◽

Vol 2020 ◽

pp. 1-11

Author(s):

Ryo Emoto ◽

Atsushi Kawaguchi ◽

Kunihiko Takahashi ◽

Shigeyuki Matsui

Keyword(s):

Size Distribution ◽

Mixture Models ◽

Effect Size ◽

Association Studies ◽

Disease Association ◽

Spatial Dependency ◽

Size Estimation ◽

Neuroimaging Data ◽

Hierarchical Mixture Models ◽

Effect Size Distribution

In disease-association studies using neuroimaging data, evaluating the biological or clinical significance of individual associations requires not only detection of disease-associated areas of the brain but also estimation of the magnitudes of the associations or effect sizes for individual brain areas. In this paper, we propose a model-based framework for voxel-based inferences under spatial dependency in neuroimaging data. Specifically, we employ hierarchical mixture models with a hidden Markov random field structure to incorporate the spatial dependency between voxels. A nonparametric specification is proposed for the effect size distribution to flexibly estimate the underlying effect size distribution. Simulation experiments demonstrate that compared with a naive estimation method, the proposed methods can substantially reduce the selection bias in the effect size estimates of the selected voxels with the greatest observed associations. An application to neuroimaging data from an Alzheimer’s disease study is provided.


Recalibrating expectations about effect size: A multi-method survey of effect sizes in the ABCD study

PLoS ONE ◽

10.1371/journal.pone.0257535 ◽

2021 ◽

Vol 16 (9) ◽

pp. e0257535

Author(s):

Max M. Owens ◽

Alexandra Potter ◽

Courtland S. Hyatt ◽

Matthew Albaugh ◽

Wesley K. Thompson ◽

...

Keyword(s):

Effect Size ◽

Statistical Significance ◽

Psychological Research ◽

Sociodemographic Factors ◽

Effect Sizes ◽

Size Distributions ◽

Different Types ◽

Two Factors ◽

Future Work ◽

Effect Size Distribution

Effect sizes are commonly interpreted using heuristics established by Cohen (e.g., small: r = .1, medium: r = .3, large: r = .5), despite mounting evidence that these guidelines are mis-calibrated to the effects typically found in psychological research. This study’s aims were to 1) describe the distribution of effect sizes across multiple instruments, 2) consider factors qualifying the effect size distribution, and 3) identify examples as benchmarks for various effect sizes. For aim one, effect size distributions were illustrated from a large, diverse sample of 9- and 10-year-old children. This was done by conducting Pearson’s correlations among 161 variables representing constructs from all questionnaires and tasks from the Adolescent Brain and Cognitive Development Study® baseline data. To achieve aim two, factors qualifying this distribution were tested by comparing the distributions of effect size among various modifications of the aim one analyses. These modified analytic strategies included comparisons of effect size distributions for different types of variables, for analyses using statistical thresholds, and for analyses using several covariate strategies. In aim one analyses, the median in-sample effect size was .03, and values at the first and third quartiles were .01 and .07. In aim two analyses, effects were smaller for associations across instruments, content domains, and reporters, as well as when covarying for sociodemographic factors. Effect sizes were larger when thresholding for statistical significance. In analyses intended to mimic conditions used in “real-world” analysis of ABCD data, the median in-sample effect size was .05, and values at the first and third quartiles were .03 and .09. To achieve aim three, examples for varying effect sizes are reported from the ABCD dataset as benchmarks for future work in the dataset. In summary, this report finds that empirically determined effect sizes from a notably large dataset are smaller than would be expected based on existing heuristics.


Estimation of effect size distribution from genome-wide association studies and implications for future discoveries

Nature Genetics ◽

10.1038/ng.610 ◽

2010 ◽

Vol 42 (7) ◽

pp. 570-575 ◽

Cited By ~ 427

Author(s):

Ju-Hyun Park ◽

Sholom Wacholder ◽

Mitchell H Gail ◽

Ulrike Peters ◽

Kevin B Jacobs ◽

...

Keyword(s):

Size Distribution ◽

Effect Size ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Effect Size Distribution


Quantifying posterior effect size distribution of susceptibility loci by common summary statistics

Genetic Epidemiology ◽

10.1002/gepi.22286 ◽

2020 ◽

Vol 44 (4) ◽

pp. 339-351

Author(s):

Olga A. Vsevolozhskaya ◽

Dmitri V. Zaykin

Keyword(s):

Size Distribution ◽

Effect Size ◽

Summary Statistics ◽

Susceptibility Loci ◽

Effect Size Distribution


Semi-parametric empirical Bayes factor for genome-wide association studies

European Journal of Human Genetics ◽

10.1038/s41431-020-00800-x ◽

2021 ◽

Author(s):

Junji Morisawa ◽

Takahiro Otani ◽

Jo Nishino ◽

Ryo Emoto ◽

Kunihiko Takahashi ◽

...

Keyword(s):

Factor Analysis ◽

Size Distribution ◽

Effect Size ◽

Empirical Bayes ◽

Bayes Factor ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Effect Size Distribution

Bayes factor analysis has the attractive property of accommodating the risks of both false negatives and false positives when identifying susceptibility gene variants in genome-wide association studies (GWASs). For a particular SNP, the critical aspect of this analysis is that it incorporates the probability of obtaining the observed value of a statistic on disease association under the alternative hypothesis of non-null association. An approximate Bayes factor (ABF) was proposed by Wakefield (Genetic Epidemiology 2009;33:79–86) based on a normal prior for the underlying effect-size distribution. However, misspecification of the prior can lead to failure in incorporating the probability under the alternative hypothesis. In this paper, we propose a semi-parametric, empirical Bayes factor (SP-EBF) based on a nonparametric effect-size distribution estimated from the data. Analysis of several GWAS datasets revealed the presence of substantial numbers of SNPs with small effect sizes, and the SP-EBF attributed much greater significance to such SNPs than the ABF. Overall, the SP-EBF incorporates an effect-size distribution that is estimated from the data, and it has the potential to improve the accuracy of Bayes factor analysis in GWASs.
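
For orientation, the normal-prior ABF that this abstract contrasts with the SP-EBF can be sketched in a few lines. The sketch assumes the commonly quoted closed form, ABF = sqrt((V + W) / V) * exp(-(z^2 / 2) * W / (V + W)), where V is the squared standard error of the estimated log odds ratio, W is the prior variance, and z is the Wald statistic; in this form, values below 1 favour the alternative. It does not implement the SP-EBF. The function name and the numbers in the usage example are hypothetical.

```python
import math


def approximate_bayes_factor(beta_hat, se, prior_sd):
    """Normal-prior approximate Bayes factor for the null versus the alternative.

    beta_hat: estimated effect (e.g. log odds ratio)
    se:       standard error of beta_hat
    prior_sd: standard deviation of the normal prior on the true effect
              (an assumed value chosen by the analyst)
    """
    v = se ** 2          # sampling variance of the estimate
    w = prior_sd ** 2    # prior variance of the true effect
    z = beta_hat / se    # Wald statistic
    shrinkage = w / (v + w)
    log_abf = 0.5 * math.log((v + w) / v) - 0.5 * z ** 2 * shrinkage
    return math.exp(log_abf)


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    print(approximate_bayes_factor(0.0, 0.1, 0.2))  # no signal: favours the null
    print(approximate_bayes_factor(0.5, 0.1, 0.2))  # strong signal: favours the alternative
```

Because the result depends directly on `prior_sd`, a poorly chosen prior changes the evidence attributed to each SNP, which is the sensitivity that motivates estimating the effect-size distribution from the data instead.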
