Detecting recent selective sweeps while controlling for mutation rate and background selection

2015 ◽  
Author(s):  
Christian D. Huber ◽  
Michael DeGiorgio ◽  
Ines Hellmann ◽  
Rasmus Nielsen

A composite likelihood ratio test implemented in the program SweepFinder is a commonly used method for scanning a genome for recent selective sweeps. SweepFinder uses information on the spatial pattern of the site frequency spectrum (SFS) around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson-Kreitman-Aguadé test, we suggest adding fixed differences relative to an outgroup to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection modeled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method also provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.
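To make the data layout behind this idea concrete, here is a minimal sketch of tabulating an unfolded SFS that keeps fixed differences relative to an outgroup as an extra class. It is only the bookkeeping step, not the composite likelihood ratio test itself and not SweepFinder's input format. The function name `sfs_with_fixed_differences` and the input convention (one derived-allele count per site, polarized by the outgroup) are assumptions made for illustration.

```python
from collections import Counter


def sfs_with_fixed_differences(derived_counts, n):
    """Tabulate an unfolded SFS for a sample of n chromosomes.

    derived_counts holds, for each site, the number of sampled chromosomes
    carrying the derived allele (hypothetical input, polarized by an outgroup):
      0        -> invariant site matching the outgroup (not tallied)
      1..n-1   -> polymorphic within the species
      n        -> fixed difference relative to the outgroup
    Returns a list of length n; entry k-1 is the number of sites with count k.
    """
    for k in derived_counts:
        if not 0 <= k <= n:
            raise ValueError(f"derived count {k} outside 0..{n}")
    tally = Counter(derived_counts)
    return [tally.get(k, 0) for k in range(1, n + 1)]


# Example: 8 sites in a sample of 5 chromosomes.
spectrum = sfs_with_fixed_differences([0, 1, 1, 3, 5, 5, 0, 2], n=5)
polymorphic = sum(spectrum[:-1])   # classes 1..n-1
fixed = spectrum[-1]               # class n: fixed differences
```

Comparing `polymorphic` with `fixed` within a region is the HKA-style contrast the abstract alludes to: a region with a low mutation rate should be depleted in both, whereas a sweep reduces polymorphism without reducing divergence.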


2019 ◽  
Vol 12 (1) ◽  
pp. 3550-3561 ◽  
Author(s):  
David Castellano ◽  
Adam Eyre-Walker ◽  
Kasper Munch

Abstract DNA diversity varies across the genome of many species. Variation in diversity across a genome might arise from regional variation in the mutation rate, variation in the intensity and mode of natural selection, and regional variation in the recombination rate. We show that both noncoding and nonsynonymous diversity are positively correlated to a measure of the mutation rate and the recombination rate and negatively correlated to the density of conserved sequences in 50 kb windows across the genomes of humans and nonhuman homininae. Interestingly, we find that although noncoding diversity is equally affected by these three genomic variables, nonsynonymous diversity is mostly dominated by the density of conserved sequences. The positive correlation between diversity and our measure of the mutation rate seems to be largely a direct consequence of regions with higher mutation rates having more diversity. However, the positive correlation with recombination rate and the negative correlation with the density of conserved sequences suggest that selection at linked sites also affects levels of diversity. This is supported by the observation that the ratio of the number of nonsynonymous to noncoding polymorphisms is negatively correlated to a measure of the effective population size across the genome. We show these patterns persist even when we restrict our analysis to GC-conservative mutations, demonstrating that the patterns are not driven by GC biased gene conversion. In conclusion, our comparative analyses describe how recombination rate, gene density, and mutation rate interact to produce the patterns of DNA diversity that we observe along the hominine genomes.



2015 ◽  
Vol 25 (1) ◽  
pp. 142-156 ◽  
Author(s):  
Christian D. Huber ◽  
Michael DeGiorgio ◽  
Ines Hellmann ◽  
Rasmus Nielsen


2017 ◽  
Vol 372 (1736) ◽  
pp. 20160471 ◽  
Author(s):  
Josep M. Comeron

The consequences of selection at linked sites are multiple and widespread across the genomes of most species. Here, I first review the main concepts behind models of selection and linkage in recombining genomes, present the difficulty in parametrizing these models simply as a reduction in effective population size (Ne) and discuss the predicted impact of recombination rates on levels of diversity across genomes. Arguments are then put forward in favour of using a model of selection and linkage with neutral and deleterious mutations (i.e. the background selection model, BGS) as a sensible null hypothesis for investigating the presence of other forms of selection, such as balancing or positive. I also describe and compare two studies that have generated high-resolution landscapes of the predicted consequences of selection at linked sites in Drosophila melanogaster. Both studies show that BGS can explain a very large fraction of the observed variation in diversity across the whole genome, thus supporting its use as null model. Finally, I identify and discuss a number of caveats and challenges in studies of genetic hitchhiking that have been often overlooked, with several of them sharing a potential bias towards overestimating the evidence supporting recent selective sweeps to the detriment of a BGS explanation. One potential source of bias is the analysis of non-equilibrium populations: it is precisely because models of selection and linkage predict variation in Ne across chromosomes that demographic dynamics are not expected to be equivalent chromosome- or genome-wide. Other challenges include the use of incomplete genome annotations, the assumption of temporally stable recombination landscapes, the presence of genes under balancing selection and the consequences of ignoring non-crossover (gene conversion) recombination events. This article is part of the themed issue ‘Evolutionary causes and consequences of recombination rate variation in sexual organisms’.



Genetics ◽  
1996 ◽  
Vol 143 (3) ◽  
pp. 1457-1465 ◽  
Author(s):  
Fumio Tajima

Abstract The expectations of the average number of nucleotide differences per site (π), the proportion of segregating sites (s), the minimum number of mutations per site (s*) and some other quantities were derived under the finite site models with and without rate variation among sites, where the finite site models include Jukes and Cantor's model, the equal-input model and Kimura's model. As a model of rate variation, the gamma distribution was used. The results indicate that if distribution parameter α is small, the effect of rate variation on these quantities is substantial, so that θ is substantially underestimated when the estimates are based on the infinite site model, where θ = 4Nv, N is the effective population size and v is the mutation rate per site per generation. New methods for estimating θ are also presented, which are based on the finite site models with and without rate variation. Using these methods, underestimation can be corrected.
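As a rough illustration of why infinite-site estimates run low, the sketch below computes π and the proportion of segregating sites from a small alignment, then contrasts the infinite-site reading of π with a Jukes-Cantor correction. The correction used here, θ = 3π / (3 − 4π), is not quoted from the paper: it follows from combining the Jukes-Cantor model with an exponentially distributed pairwise coalescence time in a constant-size population and no rate variation among sites, which gives E(π) = 3θ / (3 + 4θ). The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Average number of nucleotide differences per site (pi)."""
    n_sites = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * n_sites)


def segregating_fraction(seqs):
    """Proportion of alignment columns that are polymorphic (s)."""
    columns = list(zip(*seqs))
    return sum(len(set(col)) > 1 for col in columns) / len(columns)


def watterson_theta(seqs):
    """Infinite-site estimate of theta per site from segregating sites."""
    a_n = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating_fraction(seqs) / a_n


def theta_jc_from_pi(pi):
    """Jukes-Cantor corrected theta, assuming E(pi) = 3*theta / (3 + 4*theta)."""
    if not 0 <= pi < 0.75:
        raise ValueError("pi must lie in [0, 0.75) under Jukes-Cantor")
    return 3 * pi / (3 - 4 * pi)


# Toy alignment (hypothetical data): 4 sequences, 4 sites.
alignment = ["ACGT", "ACGA", "ACGT", "TCGT"]
pi = pairwise_diversity(alignment)       # infinite-site estimate of theta
theta_fs = theta_jc_from_pi(pi)          # finite-site (JC) estimate
```

Because 3π / (3 − 4π) exceeds π for any π > 0, the corrected value is always the larger one, and the gap widens as diversity grows. Rate variation among sites, which this sketch ignores, is what the abstract reports as making the underestimation substantial.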



Genetics ◽  
1994 ◽  
Vol 136 (2) ◽  
pp. 685-692 ◽  
Author(s):  
Y X Fu

Abstract A new estimator of the essential parameter θ = 4Neμ from DNA polymorphism data is developed under the neutral Wright-Fisher model without recombination and population subdivision, where Ne is the effective population size and μ is the mutation rate per locus per generation. The new estimator has a variance only slightly larger than the minimum variance of all possible unbiased estimators of the parameter and is substantially smaller than that of any existing estimator. The high efficiency of the new estimator is achieved by making full use of phylogenetic information in a sample of DNA sequences from a population. An example of estimating θ by the new method is presented using the mitochondrial sequences from an American Indian population.



Genetics ◽  
2002 ◽  
Vol 160 (1) ◽  
pp. 247-256
Author(s):  
M Kauer ◽  
B Zangerl ◽  
D Dieringer ◽  
C Schlötterer

Abstract Levels of neutral variation are influenced by background selection and hitchhiking. The relative contribution of these evolutionary forces to the distribution of neutral variation is still the subject of ongoing debates. Using 133 microsatellites, we determined levels of variability on X chromosomes and autosomes in African and non-African D. melanogaster populations. In the ancestral African populations microsatellite variability was higher on X chromosomes than on autosomes. In non-African populations X-linked polymorphism is significantly more reduced than autosomal variation. In non-African populations we observed a significant positive correlation between X chromosomal polymorphism and recombination rate. These results are consistent with the interpretation that background selection shapes levels of neutral variability in the ancestral populations, while the pattern in derived populations is determined by multiple selective sweeps during the colonization process. Further research, however, is required to investigate the influence of inversion polymorphisms and unequal sex ratios.



Genetics ◽  
1996 ◽  
Vol 144 (2) ◽  
pp. 689-703 ◽  
Author(s):  
Michael J Ford ◽  
Charles F Aquadro

Abstract We present the results of a restriction site survey of variation at five loci in Drosophila athabasca, complementing a previous study of the period locus. There is considerably greater differentiation between the three semispecies of D. athabasca at the period locus and two other X-linked genes (no-on-transient-A and E74A) than at three autosomal genes (Xdh, Adh and RC98). Using a modification of the HKA test, which uses fixed differences between the semispecies, and a test based on differences in Fst among loci, we show that the greater differentiation of the X-linked loci compared with the autosomal loci is inconsistent with a neutral model of molecular evolution. We explore several evolutionary scenarios by computer simulation, including differential migration of X and autosomal genes, very low levels of migration among the semispecies, selective sweeps, and background selection, and conclude that X-linked selective sweeps in at least two of the semispecies are the best explanation for the data. This evidence that natural selection acted on the X-chromosome suggests that another X-linked trait, mating song differences among the semispecies, may have been the target of selection.



Genetics ◽  
2000 ◽  
Vol 154 (1) ◽  
pp. 381-395
Author(s):  
Pavel Morozov ◽  
Tatyana Sitnikova ◽  
Gary Churchill ◽  
Francisco José Ayala ◽  
Andrey Rzhetsky

Abstract We propose models for describing replacement rate variation in genes and proteins, in which the profile of relative replacement rates along the length of a given sequence is defined as a function of the site number. We consider here two types of functions, one derived from the cosine Fourier series, and the other from discrete wavelet transforms. The number of parameters used for characterizing the substitution rates along the sequences can be flexibly changed and in their most parameter-rich versions, both Fourier and wavelet models become equivalent to the unrestricted-rates model, in which each site of a sequence alignment evolves at a unique rate. When applied to a few real data sets, the new models appeared to fit data better than the discrete gamma model when compared with the Akaike information criterion and the likelihood-ratio test, although the parametric bootstrap version of the Cox test performed for one of the data sets indicated that the difference in likelihoods between the two models is not significant. The new models are applicable to testing biological hypotheses such as the statistical identity of rate variation profiles among homologous protein families. These models are also useful for determining regions in genes and proteins that evolve significantly faster or slower than the sequence average. We illustrate the application of the new method by analyzing human immunoglobulin and Drosophilid alcohol dehydrogenase sequences.



2021 ◽  
Vol 52 (1) ◽  
pp. 177-197
Author(s):  
Brian Charlesworth ◽  
Jeffrey D. Jensen

Patterns of variation and evolution at a given site in a genome can be strongly influenced by the effects of selection at genetically linked sites. In particular, the recombination rates of genomic regions correlate with their amount of within-population genetic variability, the degree to which the frequency distributions of DNA sequence variants differ from their neutral expectations, and the levels of adaptation of their functional components. We review the major population genetic processes that are thought to lead to these patterns, focusing on their effects on patterns of variability: selective sweeps, background selection, associative overdominance, and Hill–Robertson interference among deleterious mutations. We emphasize the difficulties in distinguishing among the footprints of these processes and disentangling them from the effects of purely demographic factors such as population size changes. We also discuss how interactions between selective and demographic processes can significantly affect patterns of variability within genomes.



2016 ◽  
Vol 283 (1841) ◽  
pp. 20161785 ◽  
Author(s):  
Long Wang ◽  
Yanchun Zhang ◽  
Chao Qin ◽  
Dacheng Tian ◽  
Sihai Yang ◽  
...  

Mutation rates and recombination rates vary between species and between regions within a genome. What are the determinants of these forms of variation? Prior evidence has suggested that recombination might be mutagenic, with an excess of new mutations in the vicinity of recombination breakpoints. As it is conjectured that domesticated taxa have higher recombination rates than wild ones, we expect domesticated taxa to have raised mutation rates. Here, we use parent–offspring sequencing in domesticated and wild peach to ask (i) whether recombination is mutagenic, and (ii) whether domesticated peach has a higher recombination rate than wild peach. We find no evidence that domesticated peach has an increased recombination rate, nor an increased mutation rate near recombination events. If recombination is mutagenic in this taxon, the effect is too weak to be detected by our analysis. While an absence of recombination-associated mutation might explain an absence of a recombination–heterozygosity correlation in peach, we caution against such an interpretation.


