Quantifying GC-biased gene conversion in great ape genomes using polymorphism-aware models

2018 ◽  
Author(s):  
Rui Borges ◽  
Gergely Szöllősi ◽  
Carolin Kosiol

Abstract: As multi-individual population-scale data become available, more complex modeling strategies are needed to quantify the genome-wide patterns of nucleotide usage and the associated mechanisms of evolution. Recently, the multivariate neutral Moran model was proposed. However, it was shown to be insufficient to explain the distribution of alleles in great apes. Here, we propose a new model that includes allelic selection. Our theoretical results constitute the basis of a new Bayesian framework to estimate mutation rates and selection coefficients from population data. We apply the new framework to a great ape dataset and find patterns of allelic selection that match those of genome-wide GC-biased gene conversion (gBGC). In particular, we show that great apes have patterns of allelic selection that vary in intensity, a feature that we correlate with the great apes' distinct demographies. We also demonstrate that the AT/GC toggling effect decreases the probability of a substitution, promoting more polymorphisms in the base composition of great ape genomes. We further assess the impact of GC-bias in molecular analyses and find that mutation rates and genetic distances are estimated with bias when gBGC is not properly accounted for. Our results contribute to the discussion on the tempo and mode of gBGC evolution, while stressing the need for gBGC-aware models in population genetics and phylogenetics.
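As a rough illustration of how allelic selection changes substitution probabilities in a Moran-type model, the sketch below uses the textbook two-allele Moran result for the fixation probability of a single new copy with relative fitness 1 + sigma. This is not the multivariate model of the abstract; the function name, the parameter `sigma`, and the example values are assumptions chosen for illustration only.

```python
def moran_fixation_probability(n_individuals: int, sigma: float) -> float:
    """Fixation probability of one new copy in a two-allele Moran model.

    The new allele has relative fitness r = 1 + sigma (sigma > -1) in a
    population of n_individuals haploid individuals. Textbook result:
        (1 - 1/r) / (1 - r**(-N)),  which tends to 1/N as sigma -> 0.
    """
    if abs(sigma) < 1e-12:
        # Neutral limit: every copy is equally likely to be the ancestor.
        return 1.0 / n_individuals
    r = 1.0 + sigma
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n_individuals))


if __name__ == "__main__":
    n = 1000          # hypothetical population size
    sigma = 0.001     # hypothetical advantage of the GC allele
    print("neutral      :", moran_fixation_probability(n, 0.0))
    print("GC-favoured  :", moran_fixation_probability(n, sigma))
    print("AT-disfavoured:", moran_fixation_probability(n, -sigma))
```

Under these assumptions, a weak advantage for one allele raises its fixation probability above the neutral value 1/N and lowers that of the opposing allele, which is the qualitative signature that gBGC-like allelic selection leaves on substitution rates.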

2019 ◽  
Author(s):  
José María Heredia-Genestar ◽  
Tomàs Marquès-Bonet ◽  
David Juan ◽  
Arcadi Navarro

Introductory Paragraph: Mutations do not accumulate uniformly across the genome. Human germline and tumor mutation density correlate poorly, and each is associated with different genomic features. Here, we analyze the genome-wide distribution of mutation densities in human and non-human Great Ape (NHGA) germlines as well as human tumors. Strikingly, non-human Great Ape germlines present higher correlation with tumors than the human germline does. This situation is mediated by a different distribution in the human germline of mutations at non-CpG sites, but not of CpG>T transitions. We propose that the impact of ancestral and historical human demographic events on human mutation density leads to this specific disruption in its expected genome-wide distribution. Tumors partially recover this distribution by the accumulation of pre-neoplastic-like somatic mutations. Our results highlight the potential utility of using Great Ape population data, rather than human controls, to establish the expected mutational background of healthy somatic cells.


2018 ◽  
Author(s):  
Toni I. Gossmann ◽  
Mathias Bockwoldt ◽  
Lilith Diringer ◽  
Friedrich Schwarz ◽  
Vic-Fabienne Schumann

Abstract: It is well established that GC content varies across the genome in many species and that GC biased gene conversion, one form of meiotic recombination, is likely to contribute to this heterogeneity. Bird genomes provide an extraordinary system to study the impact of GC biased gene conversion owing to their specific genomic features. They are characterised by high karyotype conservation with substantial heterogeneity in chromosome sizes, with up to a dozen large macrochromosomes and many smaller microchromosomes common across all bird species. This heterogeneity in chromosome morphology is also reflected by other genomic features, such as smaller chromosomes being more gene-dense, more compact and more GC rich relative to their macrochromosomal counterparts - illustrating that the intensity of GC biased gene conversion varies across the genome. Here we study whether it is possible to infer heterogeneity in GC biased gene conversion rates across the genome using a recently published method that accounts for GC biased gene conversion when estimating branch lengths in a phylogenetic context. To infer the strength of GC biased gene conversion we contrast branch length estimates across the genome both taking and not taking non-stationary GC composition into account. Using simulations we show that this approach works well when GC fixation bias is strong and note that the number of substitutions along a branch is consistently overestimated when GC biased gene conversion is not accounted for. We use this predictable feature to infer the strength of GC dynamics across the great tit genome by applying our new test statistic to data at 4-fold degenerate sites from three bird species - great tit, zebra finch and chicken - three species that are among the best annotated bird genomes to date. We show that using a simple one-dimensional binning we fail to capture a signal of fixation bias as observed in our simulations. However, using a multidimensional binning strategy, we find evidence for heterogeneity in the strength of fixation bias, including AT fixation bias. This highlights the difficulties when combining sequence data across different regions in the genome.


2017 ◽  
Author(s):  
Aaron C. Wacholder ◽  
David D. Pollock

Abstract: We performed a genome-wide scan for recombination-mediated interlocus gene conversion and deletion events among a set of orthologous Alu loci in the Great Apes, and were surprised to discover an extreme excess of such events in the gorilla lineage versus other lineages. Gorilla events, but not events in other Great Apes, are strongly associated with a 15 bp motif commonly found in Alu sequences. This result is consistent with evolutionarily transient targeting of the motif by PRDM9, which induces double strand breaks and crossovers during meiosis at specific but rapidly changing sequence motifs. The motif is preferentially found in conversion recipients but not donors, and is substantially depleted in gorillas, consistent with loss of PRDM9 targets by meiotic drive. Recombination probability falls off exponentially with distance between loci, is reduced slightly by sequence divergence, and drops substantially with recipient divergence from the target motif. We identified 16 other high-copy motifs in human, often associated with transposable elements, with lineage-specific depletion and nearby gene conversion signatures, consistent with transient roles as PRDM9 targets. This work strengthens our understanding of recombination-mediated events in evolution and highlights the potential for interactions between PRDM9 and repetitive sequences to cause rapid change in the genome.


2018 ◽  
Author(s):  
David Castellano ◽  
Adam Eyre-Walker ◽  
Kasper Munch

Abstract: DNA diversity varies across the genome of many species. Variation in diversity across a genome might arise for one of three reasons: regional variation in the mutation rate, selection, and biased gene conversion. We show that both non-coding and non-synonymous diversity are correlated to a measure of the mutation rate, the recombination rate and the density of conserved sequences in 50 kb windows across the genomes of humans and non-human homininae. We show these patterns persist even when we restrict our analysis to GC-conservative mutations, demonstrating that the patterns are not driven by biased gene conversion. The positive correlation between diversity and our measure of the mutation rate seems to be largely a direct consequence of regions with higher mutation rates having more diversity. However, the positive correlation with recombination rate and the negative correlation with the density of conserved sequences suggest that selection at linked sites affects levels of diversity. This is supported by the observation that the ratio of the number of non-synonymous to non-coding polymorphisms is negatively correlated to a measure of the effective population size across the genome. Furthermore, we find evidence that these genomic variables are better predictors of non-coding diversity in large homininae populations than in small populations, after accounting for statistical power. This is consistent with genetic drift decreasing the impact of selection at linked sites in small populations. In conclusion, our comparative analyses describe for the first time how recombination rate, gene density, mutation rate and genetic drift interact to produce the patterns of DNA diversity that we observe along and between homininae genomes.


Genetics ◽  
2019 ◽  
Vol 212 (4) ◽  
pp. 1321-1336 ◽  
Author(s):  
Rui Borges ◽  
Gergely J. Szöllősi ◽  
Carolin Kosiol

2015 ◽  
Vol 5 (3) ◽  
pp. 441-447 ◽  
Author(s):  
Carina F Mugal ◽  
Peter F Arndt ◽  
Lena Holm ◽  
Hans Ellegren

Abstract: The genomes of many vertebrates show a characteristic variation in GC content. To explain its origin and evolution, three main mechanisms have been proposed: selection for GC content, mutation bias, and GC-biased gene conversion. At present, the mechanism of GC-biased gene conversion, i.e., short-scale, unidirectional exchanges between homologous chromosomes in the neighborhood of recombination-initiating double-strand breaks in favor of GC nucleotides, is the most widely accepted hypothesis. We here suggest that DNA methylation also plays an important role in the evolution of GC content in vertebrate genomes. To test this hypothesis, we investigated one mammalian (human) and one avian (chicken) genome. We used bisulfite sequencing to generate a whole-genome methylation map of chicken sperm and made use of a publicly available whole-genome methylation map of human sperm. Inclusion of these methylation maps into a model of GC content evolution provided significant support for the impact of DNA methylation on the local equilibrium GC content. Moreover, two different estimates of equilibrium GC content, one that neglects and one that incorporates the impact of DNA methylation and the concomitant CpG hypermutability, give estimates that differ by approximately 15% in both genomes, arguing for a strong impact of DNA methylation on the evolution of GC content. Thus, our results put forward that previous estimates of equilibrium GC content, which neglect the hypermutability of CpG dinucleotides, need to be reevaluated.
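To make the notion of equilibrium GC content concrete, here is a minimal two-state sketch (AT versus GC), not the model used in the abstract. It assumes a single AT-to-GC rate `u`, a single GC-to-AT rate `v`, and an optional population-scaled fixation bias `B` towards GC, whose only effect is that the ratio of fixation probabilities of GC-increasing to GC-decreasing changes equals exp(B). All names and numbers are illustrative assumptions.

```python
import math


def gc_equilibrium(u: float, v: float, B: float = 0.0) -> float:
    """Equilibrium GC fraction in a two-state (AT <-> GC) substitution model.

    u : AT -> GC mutation rate (assumed constant)
    v : GC -> AT mutation rate (assumed constant)
    B : population-scaled fixation bias towards GC (0 means no gBGC)

    At equilibrium the AT -> GC and GC -> AT substitution fluxes balance,
    giving GC* = u * exp(B) / (u * exp(B) + v).
    """
    boosted_u = u * math.exp(B)
    return boosted_u / (boosted_u + v)


if __name__ == "__main__":
    # Hypothetical rates: GC -> AT mutations twice as frequent as AT -> GC.
    print(gc_equilibrium(u=1.0, v=2.0))          # mutation bias only
    # A higher GC -> AT rate (for example, from CpG hypermutability) lowers GC*.
    print(gc_equilibrium(u=1.0, v=3.0))
    # A fixation bias towards GC raises GC* for the same mutation rates.
    print(gc_equilibrium(u=1.0, v=2.0, B=1.0))
```

In this simplified picture, ignoring a source of extra GC-to-AT mutations amounts to using too small a value of `v`, which overestimates the equilibrium GC fraction.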


2020 ◽  
Author(s):  
Rui Borges ◽  
Bastien Boussau ◽  
Gergely Szollosi ◽  
Carolin Kosiol

Despite the importance of natural selection in species' evolutionary history, phylogenetic methods that take into account population-level processes ignore selection. Assuming neutrality is often based on the idea that selection occurs at a minority of loci in the genome and is unlikely to significantly compromise phylogenetic inferences. However, selection might act more pervasively, as is the case for mutations evolving nearly neutrally. Genome-wide processes like GC-bias and some of the variation segregating at the coding regions are known to evolve in the nearly neutral range. As we are now using genome-wide data to estimate species trees, it is only natural to ask whether weak, but pervasive, selection is likely to blur species tree inferences. Here, we employed a polymorphism-aware phylogenetic model, specially tailored for measuring signatures of nucleotide usage biases, to test the impact of near neutrality on the substitution process. Analyses with simulated data indicate that while the inferred relationships among species are not significantly compromised, the genetic distances are systematically underestimated, with the deeper nodes suffering more than the younger ones. Such biases have implications for molecular dating. We found signatures of GC-bias considerably affecting the estimated divergence times (up to 21%) of worldwide fruit fly populations. Our findings call for the need to account for nearly neutral forces (or any other form of pervasive selection) when quantifying divergence or dating species evolution.


2014 ◽  
Author(s):  
Sylvain Glemin ◽  
Peter F Arndt ◽  
Philipp W Messer ◽  
Dmitri Petrov ◽  
Nicolas Galtier ◽  
...  

Many lines of evidence indicate GC-biased gene conversion (gBGC) has a major impact on the evolution of mammalian genomes. However, up to now, this process had not been properly quantified. In principle, the strength of gBGC can be measured from the analysis of derived allele frequency spectra. However, this approach is sensitive to a number of confounding factors. In particular, we show by simulations that the inference is pervasively affected by polymorphism polarization errors, especially at hypermutable sites, and spatial heterogeneity in gBGC strength. Here we propose a new method to quantify gBGC from DAF spectra, incorporating polarization errors and taking spatial heterogeneity into account. This method is very general in that it does not require any prior knowledge about the source of polarization errors and also provides information about mutation patterns. We apply this approach to human polymorphism data from the 1000 genomes project. We show that the strength of gBGC does not differ between hypermutable CpG sites and non-CpG sites, suggesting that in humans gBGC is not caused by the base-excision repair machinery. We further find that the impact of gBGC is concentrated primarily within recombination hotspots: genome-wide, the strength of gBGC is in the nearly neutral area, but 2% of the human genome is subject to strong gBGC, with population-scaled gBGC coefficients above 5. Given that the location of recombination hotspots evolves very rapidly, our analysis predicts that in the long term, a large fraction of the genome is affected by short episodes of strong gBGC.
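The idea of reading gBGC strength from a derived allele frequency (DAF) spectrum can be sketched with the standard diffusion result for a new mutation under genic selection, treating the population-scaled gBGC coefficient `B` like a selection coefficient. This is not the method of the abstract: it ignores polarization errors and spatial heterogeneity, and the function names, sample size, and integration grid are assumptions made for illustration.

```python
import math


def daf_density(x: float, B: float) -> float:
    """Unnormalised expected density of derived alleles at frequency x.

    Standard diffusion result for a new mutation with population-scaled
    coefficient B; it reduces to 1/x in the neutral limit B -> 0.
    """
    if abs(B) < 1e-9:
        return 1.0 / x
    return (1.0 - math.exp(-B * (1.0 - x))) / (
        x * (1.0 - x) * (1.0 - math.exp(-B))
    )


def expected_daf_spectrum(n: int, B: float, steps: int = 2000) -> list:
    """Normalised expected spectrum for derived counts 1..n-1 in a sample of n.

    Binomial sampling is integrated against daf_density with a midpoint
    rule, which avoids evaluating the density at x = 0 or x = 1.
    """
    counts = []
    for i in range(1, n):
        total = 0.0
        for k in range(steps):
            x = (k + 0.5) / steps
            total += math.comb(n, i) * x**i * (1.0 - x) ** (n - i) * daf_density(x, B)
        counts.append(total / steps)
    norm = sum(counts)
    return [c / norm for c in counts]


if __name__ == "__main__":
    for B in (0.0, 1.0, 5.0):  # hypothetical gBGC strengths for AT -> GC mutations
        spectrum = expected_daf_spectrum(10, B)
        print(B, [round(p, 3) for p in spectrum])
```

Under these assumptions, GC-increasing mutations with larger `B` are shifted towards high derived frequencies relative to the neutral 1/i expectation, and this skew is the signal that DAF-based estimators exploit.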


2021 ◽  
Author(s):  
William R. Milligan ◽  
Guy Amster ◽  
Guy Sella

Abstract: Mutation rates and spectra differ among human populations. Here, we examine whether this variation could be explained by evolution at mutation modifiers. To this end, we consider genetic modifier sites at which mutations, “mutator alleles”, increase genome-wide mutation rates and model their evolution under purifying selection due to the additional deleterious mutations that they cause, genetic drift, and demographic processes. We solve the model analytically for a constant population size and characterize how evolution at modifier sites impacts variation in mutation rates within and among populations. We then use simulations to study the effects of modifier sites under a plausible demographic model for Africans and Europeans. When comparing populations that evolve independently, weakly selected modifier sites (2Nₑs ≈ 1), which evolve slowly, contribute the most to variation in mutation rates. In contrast, when populations recently split from a common ancestral population, strongly selected modifier sites (2Nₑs ≫ 1), which evolve rapidly, contribute the most to variation between them. Moreover, a modest number of modifier sites (e.g., 10 per mutation type in the standard classification into 96 types) subject to moderate to strong selection (2Nₑs > 1) could account for the variation in mutation rates observed among human populations. If such modifier sites indeed underlie differences among populations, they should also cause variation in mutation rates within populations and their effects should be detectable in pedigree studies.


2020 ◽  
Author(s):  
Juraj Bergman ◽  
Mikkel Heide Schierup

Abstract: Background: The nucleotide composition of the genome is a balance between origin and fixation rates of different mutations. For example, it is well-known that transitions occur more frequently than transversions, particularly at CpG sites. Differences in fixation rates of mutation types are less explored. Specifically, recombination-associated GC-biased gene conversion (gBGC) may differentially impact GC-changing mutations, due to differences in their genomic distributions and efficiency of mismatch repair mechanisms. Given that recombination evolves rapidly across species, we explore gBGC of different mutation types across human populations and among great ape species. Results: We report a stronger correlation between GC frequency and recombination for transitions than for transversions. Notably, CpG transitions are most strongly affected by gBGC. We show that the strength of gBGC differs for transitions and transversions but that its overall strength is positively correlated with effective population sizes of human populations and great ape species, with some notable exceptions, such as a stronger effect of gBGC on non-CpG transitions in populations of European descent. We study the dependence of gBGC dynamics on flanking nucleotides and show that some mutation types evolve in opposition to the gBGC expectation, likely due to hypermutability of specific nucleotide contexts. Conclusions: Differences in GC-biased gene conversion are evident between different mutation types, and dependent on sex-specific recombination, population size and flanking nucleotide context. Our results therefore highlight the importance of different gBGC dynamics experienced by GC-changing mutations and their impact on nucleotide composition evolution.

