Modeling Linkage Disequilibrium and Identifying Recombination Hotspots Using Single-Nucleotide Polymorphism Data

Genetics ◽  
2003 ◽  
Vol 165 (4) ◽  
pp. 2213-2233 ◽  
Author(s):  
Na Li ◽  
Matthew Stephens

Abstract: We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a “block-like” structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that in the case where recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the very best of current available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as in the development of more efficient haplotype-based methods for LD mapping.
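
For contrast with the multilocus model described in this abstract, the pairwise view of LD that it moves beyond can be sketched as follows. This is not the authors' model: it only computes the standard pairwise statistics D and r² for two biallelic sites from phased haplotypes. The function name `pairwise_ld` and the toy haplotypes are hypothetical, chosen for illustration.

```python
def pairwise_ld(haplotypes, i, j):
    """Return (D, r2) for biallelic sites i and j.

    haplotypes: list of equal-length sequences of 0/1 alleles (phased).
    D  = P(AB) - P(A) * P(B)
    r2 = D^2 / (P(A) * (1 - P(A)) * P(B) * (1 - P(B)))
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n
    p_b = sum(h[j] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either site is monomorphic in the sample.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Toy sample (an assumption): sites 0 and 1 are perfectly associated,
# while sites 0 and 2 are independent in this sample.
sample = [(0, 0, 1), (0, 0, 0), (1, 1, 0), (1, 1, 1)]
d01, r2_01 = pairwise_ld(sample, 0, 1)
d02, r2_02 = pairwise_ld(sample, 0, 2)
```

A pairwise summary like this treats each pair of sites in isolation, which is the limitation the abstract lists under point (ii).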

2019 ◽  
Vol 10 (1) ◽  
pp. 299-309 ◽  
Author(s):  
Rami-Petteri Apuli ◽  
Carolina Bernhardsson ◽  
Bastian Schiffthaler ◽  
Kathryn M. Robinson ◽  
Stefan Jansson ◽  
...  

The rate of meiotic recombination is one of the central factors determining genome-wide levels of linkage disequilibrium, which has important consequences for the efficiency of natural selection and for the dissection of quantitative traits. Here we present a new, high-resolution linkage map for Populus tremula that we use to anchor approximately two thirds of the P. tremula draft genome assembly onto the expected 19 chromosomes, providing us with the first chromosome-scale assembly for P. tremula (Table 2). We then use this resource to estimate variation in recombination rates across the P. tremula genome and compare these results to recombination rates based on linkage disequilibrium in a large number of unrelated individuals. We also assess how variation in recombination rates is associated with a number of genomic features, such as gene density, repeat density and methylation levels. We find that recombination rates obtained from the two methods largely agree, although the LD-based method identifies a number of genomic regions with very high recombination rates that the map-based method fails to detect. Linkage map and LD-based estimates of recombination rates are positively correlated and show similar correlations with other genomic features, showing that both methods can accurately infer recombination rate variation across the genome. Recombination rates are positively correlated with gene density and negatively correlated with repeat density and methylation levels, suggesting that recombination is largely directed toward gene regions in P. tremula.


2018 ◽  
Author(s):  
Ian M.S. White ◽  
William G. Hill

Abstract: Individuals of specified pedigree relationship vary in the proportion of the genome they share identical by descent, i.e. in their realised or actual relationship. Basing predictions of the variance in realised relationship solely on the proportion of the map length shared implicitly assumes that both recombination rate and genetic information are uniformly distributed along the genome, ignoring the possible existence of recombination hotspots, and failing to distinguish between coding and non-coding sequences. In this paper we quantify the effects of heterogeneity in recombination rate at broad and fine scale levels on the variation in realised relationship. A chromosome with variable recombination rate usually shows more variance in realised relationship than does one having the same map length with constant recombination rate, especially if recombination rates are higher towards chromosome ends. Reductions in variance can also be found, and the overall pattern of change is quite complex. In general, local (fine-scale) variation in recombination rate, e.g. hotspots, has a small influence on the variance in realised relationship. Differences in rates across longer regions and between chromosome ends can increase or decrease the variance in realised relationship, depending on the genomic architecture.


2015 ◽  
Author(s):  
Hasan Alhaddad ◽  
Chi Zhang ◽  
Bruce Rannala ◽  
Leslie A Lyons

Recombination has essential roles in increasing genetic variability within a population and in ensuring successful meiotic events. The objective of this study is to (i) infer the population scaled recombination rate (ρ), and (ii) identify and characterize localities of increased recombination rate for the domestic cat, Felis silvestris catus. SNPs (n = 701) were genotyped in twenty-two cats of Eastern random bred origin. The SNPs covered ten different chromosomal regions (A1, A2, B3, C2, D1, D2, D4, E2, F2, X) with an average region size of 850 Kb and an average SNP density of 70 SNPs/region. The Bayesian method in the program inferRho was used to infer regional population recombination rates and hotspot localities. The regions exhibited variable population recombination rates and four decisive recombination hotspots were identified on cat chromosome A2, D1, and E2 regions. No correlation was detected between the GC content and the locality of recombination hotspots. The hotspots enclosed L2 LINE elements and MIR and tRNA-Lys SINE elements in agreement with hotspots found in other mammals.


2018 ◽  
Author(s):  
Enrique J. Schwarzkopf ◽  
Juan C. Motamayor ◽  
Omar E. Cornejo

Abstract: Our study investigates the possible drivers of recombination hotspots in Theobroma cacao using ten genetically differentiated populations. By comparing recombination patterns between multiple populations, we obtain a novel view of recombination at the population-divergence timescale. For each population, a fine-scale recombination map was generated using the coalescent with a standard method based on linkage disequilibrium (LD). These maps revealed higher recombination rates in a domesticated population and a population that has undergone a recent bottleneck. We inferred hotspots of recombination for each population and found that the genomic locations of hotspots correlate with genetic differentiation between populations (FST). We used randomization approaches to generate appropriate null models to understand the association between hotspots of recombination and both DNA sequence motifs and genomic features. We found that hotspot regions contained fewer known retroelement sequences than expected and were overrepresented near transcription start and termination sites. Our findings indicate that recombination hotspots are evolving in a way that is consistent with genetic differentiation but are also preferentially driven to near coding regions. We illustrate that, consistent with predictions in plant domestication, the recombination rate of the domesticated population is orders of magnitude higher than that of other populations. More importantly, we find two fixed mutations in the domesticated population’s FIGL1 protein. FIGL1 has been shown to increase recombination rates in Arabidopsis by several orders of magnitude, suggesting a possible mechanism for the observed increased recombination rate in the domesticated population.


2021 ◽  
Author(s):  
Irene Novo ◽  
Armando Caballero ◽  
Enrique Santiago

The effective population size (Ne) is a key parameter to quantify the magnitude of genetic drift and inbreeding, with important implications in human evolution. The increasing availability of high-density genetic markers allows the estimation of historical changes in Ne across time using measures of genome diversity or linkage disequilibrium between markers. Selection is expected to reduce diversity and Ne, and this reduction is modulated by the heterogeneity of the genome in terms of recombination rate. Here we investigate by computer simulations the consequences of selection (both positive and negative) and of recombination rate heterogeneity in the estimation of historical Ne. We also investigate the relationship between diversity parameters and Ne across the different regions of the genome using human marker data. We show that the estimates of historical Ne obtained from linkage disequilibrium between markers (NeLD) are virtually unaffected by selection. In contrast, those estimates obtained by coalescence mutation-recombination-based methods can be strongly affected by it, which could have important consequences for the estimation of human demography. The simulation results are supported by the analysis of human data. The estimates of NeLD obtained for particular genomic regions do not correlate with recombination rate, nucleotide diversity, polymorphism, background selection statistic, minor allele frequency of SNPs, loss of function and missense variants and gene density. This suggests that NeLD measures are merely indicative of demographic changes in population size across generations.


2019 ◽  
Author(s):  
Rami-Petteri Apuli ◽  
Carolina Bernhardsson ◽  
Bastian Schiffthaler ◽  
Kathryn M. Robinson ◽  
Stefan Jansson ◽  
...  

Abstract: The rate of meiotic recombination is one of the central factors determining levels of linkage disequilibrium and the efficiency of natural selection, and many organisms show a positive correlation between local rates of recombination and levels of nucleotide diversity, indicating that linked selection is an important factor determining genome-wide levels of nucleotide diversity. Several methods for estimating recombination rates from segregating polymorphisms in natural populations have recently been developed. These methods have been extensively used in part because they are relatively simple to implement even in many non-model organisms, but also because they potentially offer higher resolution than traditional map-based methods. However, thorough comparisons of LD and map-based estimates of recombination are not readily available in plants. Here we present a new, high-resolution linkage map for Populus tremula and use this to estimate variation in recombination rates across the P. tremula genome. We compare these results to recombination rates estimated based on linkage disequilibrium in a large number of unrelated individuals. We also assess how variation in recombination rates is associated with genomic features, such as gene density, repeat density and methylation levels. We find that recombination rates obtained from the two methods largely agree, although the LD-based method identifies a number of genomic regions with very high recombination rates that the map-based method fails to detect. Linkage map and LD-based estimates of recombination rates are positively correlated and show similar correlations with other genomic features, showing that both methods can accurately infer recombination rate variation across the genome.


Genetics ◽  
2000 ◽  
Vol 156 (3) ◽  
pp. 1393-1401 ◽  
Author(s):  
Mary K Kuhner ◽  
Jon Yamato ◽  
Joseph Felsenstein

Abstract: We describe a method for co-estimating r = C/μ (where C is the per-site recombination rate and μ is the per-site neutral mutation rate) and Θ = 4Neμ (where Ne is the effective population size) from a population sample of molecular data. The technique is Metropolis-Hastings sampling: we explore a large number of possible reconstructions of the recombinant genealogy, weighting according to their posterior probability with regard to the data and working values of the parameters. Different relative rates of recombination at different locations can be accommodated if they are known from external evidence, but the algorithm cannot itself estimate rate differences. The estimates of Θ are accurate and apparently unbiased for a wide range of parameter values. However, when both Θ and r are relatively low, very long sequences are needed to estimate r accurately, and the estimates tend to be biased upward. We apply this method to data from the human lipoprotein lipase locus.
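
To make the two parameters defined in this abstract concrete, the minimal sketch below evaluates Θ = 4Neμ and r = C/μ from assumed inputs. It does not implement the Metropolis-Hastings sampler; the function name `scaled_parameters` and all numeric values are hypothetical and are not taken from the paper.

```python
def scaled_parameters(n_e, mu, c):
    """Return (theta, r) with theta = 4 * Ne * mu and r = C / mu.

    n_e: effective population size (Ne)
    mu:  per-site neutral mutation rate
    c:   per-site recombination rate (C)
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    theta = 4 * n_e * mu
    r = c / mu
    return theta, r


# Hypothetical values chosen only for illustration.
theta, r = scaled_parameters(n_e=10_000, mu=1e-8, c=5e-9)
```

With these assumed inputs, recombination per site is half as frequent as mutation, so r is about 0.5 and Θ is about 4e-4.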


2020 ◽  
Vol 12 (4) ◽  
pp. 370-380 ◽  
Author(s):  
Ahmed R Hasan ◽  
Rob W Ness

Abstract: Recombination confers a major evolutionary advantage by breaking up linkage disequilibrium between harmful and beneficial mutations, thereby facilitating selection. However, in species that are only periodically sexual, such as many microbial eukaryotes, the realized rate of recombination is also affected by the frequency of sex, meaning that infrequent sex can increase the effects of selection at linked sites despite high recombination rates. Despite this, the rate of sex of most facultatively sexual species is unknown. Here, we use genomewide patterns of linkage disequilibrium to infer fine-scale recombination rate variation in the genome of the facultatively sexual green alga Chlamydomonas reinhardtii. We observe recombination rate variation of up to two orders of magnitude and find evidence of recombination hotspots across the genome. Recombination rate is highest flanking genes, consistent with trends observed in other nonmammalian organisms, though intergenic recombination rates vary by intergenic tract length. We also find a positive relationship between nucleotide diversity and physical recombination rate, suggesting a widespread influence of selection at linked sites in the genome. Finally, we use estimates of the effective rate of recombination to calculate the rate of sex that occurs in natural populations, estimating a sexual cycle roughly every 840 generations. We argue that the relatively infrequent rate of sex and large effective population size create a population genetic environment that increases the influence of selection on linked sites across the genome.


2019 ◽  
Author(s):  
Ziqian Hao ◽  
Haipeng Li

Abstract: Recombination is a major force that shapes genetic diversity. The inference accuracy of recombination rate is important and can be improved by increasing sample size. However, it has never been investigated whether sample size affects the distribution of inferred recombination activity along the genome, and the inference of recombination hotspots. In this study, we applied an artificial intelligence approach to estimate recombination rates in the UK10K human genomic data set with 7,562 genomes and in the OMNI CEU data set with 170 genomes. We found that the fluctuation of local recombination rate along the UK10K genomes is much smaller than that along the CEU genomes, and recombination activity in the UK10K genomes is also much less concentrated. The same phenomena were also observed when comparing UK10K with its two subsets with 200 and 400 genomes. In all cases, analyses of a larger number of genomes result in a more precise estimation of recombination rate and a less concentrated recombination activity with fewer recombination hotspots identified. Generally, UK10K recombination hotspots are about 2.93-14.25 times fewer than those identified in previous studies. By comparing the recombination hotspots of UK10K and its subsets, we found that the false inference of population-specific recombination hotspots could be as high as 75.86% if the number of sampled genomes is not super large. The results suggest that the uncertainty of estimated recombination rate is substantial when sample size is not super large, and more attention should be paid to accurate identification of recombination hotspots, especially population-specific recombination hotspots.

Author summary: We applied FastEPRR, an artificial intelligence method, to estimate recombination rates in the UK10K data set with 7,562 genomes and established the most accurate human genetic map. By comparing with other human genetic maps, we found that analyses of a larger number of genomes result in a more precise estimation of recombination rate and a less concentrated recombination activity with fewer recombination hotspots identified. The false inference of population-specific recombination hotspots could be substantial if the number of sampled genomes is not super large.

