High-resolution population-specific recombination rates and their effect on phasing and genotype imputation

Author(s):  
Shabbeer Hassan ◽  
Ida Surakka ◽  
Marja-Riitta Taskinen ◽  
Veikko Salomaa ◽  
Aarno Palotie ◽  
...  

Abstract Previous research has shown that using population-specific reference panels has a significant effect on downstream population genomic analyses like haplotype phasing, genotype imputation, and association, especially in the context of population isolates. Here, we developed a high-resolution recombination rate map at 10 and 50 kb scale using high-coverage (20–30×) whole-genome sequencing data of 55 family trios from Finland and compared it to recombination rates of non-Finnish Europeans (NFE). We tested the downstream effects of the population-specific recombination rates in statistical phasing and genotype imputation in Finns as compared to the same analyses performed by using the NFE-based recombination rates. We found that Finnish recombination rates have a moderately high correlation (Spearman’s ρ = 0.67–0.79) with NFE, although on average (across all autosomal chromosomes), Finnish rates (2.268 ± 0.4209 cM/Mb) are 12–14% lower than NFE (2.641 ± 0.5032 cM/Mb). The Finnish recombination map was found to have no significant effect on haplotype phasing accuracy (switch error rates ~2%) or on average imputation concordance rates (97–98% for common, 92–96% for low-frequency and 78–90% for rare variants). Our results suggest that haplotype phasing and genotype imputation mostly depend on population-specific contexts like appropriate reference panels and their sample size, but not on population-specific recombination maps. Even though recombination rate estimates differed somewhat between the Finnish and NFE populations, haplotyping and imputation were not noticeably affected by the recombination map used. Therefore, the currently available HapMap recombination maps seem robust for population-specific phasing and imputation pipelines, even in the context of relatively isolated populations like Finland.
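As a rough illustration of two quantities quoted in this abstract, the sketch below computes the relative reduction between two mean recombination rates and a simple switch error rate. It is not the authors' pipeline: the function names `relative_reduction` and `switch_error_rate` are hypothetical, and the switch error definition assumes haplotypes are given as the allele carried on the first haplotype at each heterozygous site.

```python
def relative_reduction(reference_rate, other_rate):
    """Fraction by which other_rate is lower than reference_rate."""
    return (reference_rate - other_rate) / reference_rate


def switch_error_rate(true_hap, inferred_hap):
    """Share of adjacent heterozygous-site pairs whose relative phase is wrong.

    Both inputs list the allele (0 or 1) on the first haplotype at each
    heterozygous site, in genomic order (an assumed input convention).
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same sites")
    # True where the inferred haplotype agrees with the truth at that site.
    agrees = [t == i for t, i in zip(true_hap, inferred_hap)]
    if len(agrees) < 2:
        return 0.0
    # A switch occurs whenever agreement flips between neighbouring sites.
    switches = sum(1 for a, b in zip(agrees, agrees[1:]) if a != b)
    return switches / (len(agrees) - 1)


# Mean rates quoted in the abstract (cM/Mb): NFE 2.641, Finnish 2.268.
print(round(relative_reduction(2.641, 2.268), 3))
# Toy haplotypes with a single phase switch after the second site.
print(switch_error_rate([0, 1, 1, 0, 1, 0], [0, 1, 0, 1, 0, 1]))
```

With the genome-wide means from the abstract, the relative reduction comes out at about 14%, the upper end of the reported 12–14% range.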

2020 ◽  
Author(s):  
Shabbeer Hassan ◽  
Ida Surakka ◽  
Marja-Riitta Taskinen ◽  
Veikko Salomaa ◽  
Aarno Palotie ◽  
...  

Abstract Founder population size and demographic changes (e.g., population bottlenecks or rapid expansion) can lead to variation in recombination rates across different populations. Previous research has shown that using population-specific reference panels has a significant effect on downstream population genomic analyses like haplotype phasing, genotype imputation and association, especially in the context of population isolates. Here, we developed a high-resolution recombination rate map at 10 kb and 50 kb scale using high-coverage (20–30×) whole-genome sequencing of 55 family trios from Finland and compared it to recombination rates of non-Finnish Europeans (NFE). We tested the downstream effects of the population-specific recombination rates in statistical phasing and genotype imputation in Finns as compared to the same analyses performed by using the NFE-based recombination rates. We found that Finnish recombination rates have a moderately high correlation (Spearman’s ρ = 0.67–0.79) with non-Finnish Europeans, although on average (across all autosomal chromosomes), Finnish rates (2.268 ± 0.4209 cM/Mb) are 12–14% lower than NFE (2.641 ± 0.5032 cM/Mb). The Finnish recombination map was found to have no significant effect on haplotype phasing accuracy (switch error rates ~2%) or on average imputation concordance rates (97–98% for common, 92–96% for low-frequency and 78–90% for rare variants). Our results suggest that downstream population genomic analyses like haplotype phasing and genotype imputation mostly depend on population-specific contexts like appropriate reference panels and their sample size, but not on population-specific recombination maps or effective population sizes. Currently available HapMap recombination maps seem robust for population-specific phasing and imputation pipelines, even in the context of relatively isolated populations like Finland.


2021 ◽  
Author(s):  
Su Wang ◽  
Miran Kim ◽  
Xiaoqian Jiang ◽  
Arif Ozgun Harmanci

The decreasing cost of DNA sequencing has led to a great increase in our knowledge about genetic variation. While population-scale projects bring important insight into genotype-phenotype relationships, the cost of performing whole-genome sequencing on large samples is still prohibitive. In-silico genotype imputation coupled with genotyping-by-arrays is a cost-effective and accurate alternative for genotyping of common and uncommon variants. Imputation methods compare the genotypes of the typed variants with large population-specific reference panels and estimate the genotypes of untyped variants by making use of linkage disequilibrium patterns. The most accurate imputation methods are based on the Li-Stephens hidden Markov model (HMM), which treats the sequence of each chromosome as a mosaic of the haplotypes from the reference panel. Here we assess the accuracy of local-HMMs, where each untyped variant is imputed using the typed variants in a small window around itself (as small as 1 centimorgan). Locality-based imputation has recently been used by machine learning-based genotype imputation approaches. We assess how the parameters of the local-HMMs impact the imputation accuracy in a comprehensive set of benchmarks and show that local-HMMs can accurately impute common and uncommon variants and can be relaxed to impute rare variants as well. The source code for the local HMM implementations is publicly available at https://github.com/harmancilab/LoHaMMer.
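The local-HMM idea described here can be sketched with a toy haploid Li-Stephens forward-backward pass restricted to a genetic-distance window. This is a minimal illustration and not the LoHaMMer implementation: the function name `impute_local_hmm`, the `switch_scale` constant that converts genetic distance into a switch probability, and the allelic error rate `err` are all assumptions made for the example. Untyped sites are encoded as -1 and positions are assumed to be sorted in ascending order.

```python
import math


def impute_local_hmm(ref, target, cm, idx, window_cm=1.0,
                     switch_scale=1.0, err=1e-3):
    """Posterior probability of allele 1 at untyped site idx.

    ref: list of K reference haplotypes (lists of 0/1 alleles).
    target: 0/1 alleles of the haplotype to impute, -1 where untyped.
    cm: genetic positions in centimorgans, ascending.
    """
    K = len(ref)
    # Keep only sites within window_cm of the site being imputed.
    keep = [m for m in range(len(cm)) if abs(cm[m] - cm[idx]) <= window_cm]
    j = keep.index(idx)
    M = len(keep)

    def emit(m, k):
        obs = target[keep[m]]
        if obs < 0:  # untyped site carries no information
            return 1.0
        return 1.0 - err if ref[k][keep[m]] == obs else err

    def normalise(v):
        s = sum(v)
        return [x / s for x in v]

    def step(vec, rho):
        # Stay on the same reference haplotype with probability 1 - rho,
        # otherwise jump to a uniformly chosen one (symmetric transition).
        total = sum(vec)
        return [(1.0 - rho) * x + rho * total / K for x in vec]

    # Assumed mapping from genetic distance to switch probability.
    rho = [1.0 - math.exp(-switch_scale * (cm[keep[m + 1]] - cm[keep[m]]) / K)
           for m in range(M - 1)]

    fwd = [normalise([emit(0, k) / K for k in range(K)])]
    for m in range(1, M):
        moved = step(fwd[-1], rho[m - 1])
        fwd.append(normalise([moved[k] * emit(m, k) for k in range(K)]))

    bwd = [[1.0] * K for _ in range(M)]
    for m in range(M - 2, -1, -1):
        weighted = [bwd[m + 1][k] * emit(m + 1, k) for k in range(K)]
        bwd[m] = normalise(step(weighted, rho[m]))

    post = normalise([fwd[j][k] * bwd[j][k] for k in range(K)])
    return sum(post[k] * ref[k][idx] for k in range(K))


# Toy panel: the target matches the second reference haplotype at typed sites.
panel = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [0, 1, 0, 1, 0]]
positions = [0.0, 0.1, 0.2, 0.3, 0.4]
print(impute_local_hmm(panel, [1, 1, -1, 1, 1], positions, idx=2))
```

Shrinking `window_cm` drops distant typed variants from the computation, which is the locality trade-off the abstract evaluates.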


Nature ◽  
2021 ◽  
Vol 590 (7845) ◽  
pp. 290-299 ◽  
Author(s):  
Daniel Taliun ◽  
Daniel N. Harris ◽  
Michael D. Kessler ◽  
Jedidiah Carlson ◽  
...  

Abstract The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes)1. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


Genetics ◽  
2000 ◽  
Vol 156 (3) ◽  
pp. 1285-1298 ◽  
Author(s):  
Bret A Payseur ◽  
Michael W Nachman

Abstract Background (purifying) selection on deleterious mutations is expected to remove linked neutral mutations from a population, resulting in a positive correlation between recombination rate and levels of neutral genetic variation, even for markers with high mutation rates. We tested this prediction of the background selection model by comparing recombination rate and levels of microsatellite polymorphism in humans. Published data for 28 unrelated Europeans were used to estimate microsatellite polymorphism (number of alleles, heterozygosity, and variance in allele size) for loci throughout the genome. Recombination rates were estimated from comparisons of genetic and physical maps. First, we analyzed 61 loci from chromosome 22, using the complete sequence of this chromosome to provide exact physical locations. These 61 microsatellites showed no correlation between levels of variation and recombination rate. We then used radiation-hybrid and cytogenetic maps to calculate recombination rates throughout the genome. Recombination rates varied by more than one order of magnitude, and most chromosomes showed significant suppression of recombination near the centromere. Genome-wide analyses provided no evidence for a strong positive correlation between recombination rate and polymorphism, although analyses of loci with at least 20 repeats suggested a weak positive correlation. Comparisons of microsatellites in lowest-recombination and highest-recombination regions also revealed no difference in levels of polymorphism. Together, these results indicate that background selection is not a major determinant of microsatellite variation in humans.


Genetics ◽  
1997 ◽  
Vol 147 (3) ◽  
pp. 1303-1316
Author(s):  
Michael W Nachman

Introns of four X-linked genes (Hprt, Plp, Glra2, and Amg) were sequenced to provide an estimate of nucleotide diversity at nuclear genes within the house mouse and to test the neutral prediction that the ratio of intraspecific polymorphism to interspecific divergence is the same for different loci. Hprt and Plp lie in a region of the X chromosome that experiences relatively low recombination rates, while Glra2 and Amg lie near the telomere of the X chromosome, a region that experiences higher recombination rates. A total of 6022 bases were sequenced in each of 10 Mus domesticus and one M. caroli. Average nucleotide diversity (π) for introns within M. domesticus was quite low (π = 0.078%). However, there was substantial variation in the level of heterozygosity among loci. The two telomeric loci, Glra2 and Amg, had higher ratios of polymorphism to divergence than the two loci experiencing lower recombination rates. These results are consistent with the hypothesis that heterozygosity is reduced in regions with lower rates of recombination, although sampling of additional genes is needed to establish whether there is a general correlation between heterozygosity and recombination rate as in Drosophila melanogaster.
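For readers unfamiliar with the statistic, nucleotide diversity (π) is the average number of pairwise differences per site across all pairs of sampled sequences. The sketch below computes it for a toy alignment; the function name `nucleotide_diversity` and the example sequences are hypothetical and are not data from the study, and the code assumes gap-free aligned sequences of equal length.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site over all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Toy alignment of three 8-bp sequences (illustrative only).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(nucleotide_diversity(toy))
```

A value such as the π = 0.078% reported above corresponds to roughly 0.78 pairwise differences per kilobase.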


Genetics ◽  
2003 ◽  
Vol 165 (4) ◽  
pp. 2213-2233 ◽  
Author(s):  
Na Li ◽  
Matthew Stephens

Abstract We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a “block-like” structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that in the case where recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the very best of currently available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as in the development of more efficient haplotype-based methods for LD mapping.


2020 ◽  
Vol 10 (1) ◽  
pp. 2 ◽  
Author(s):  
Laith N. AL-Eitan ◽  
Doaa M. Rababa’h ◽  
Nancy M. Hakooz ◽  
Mansour A. Alghamdi ◽  
Rana B. Dajani

Several genetic variants have been identified that cause variation among different populations and even among individuals of similar descent. This leads to interindividual variation in the optimal drug dose required to sustain treatment efficiency. In this study, 56 single nucleotide polymorphisms (SNPs) within several pharmacogenes were analyzed in 128 unrelated subjects from a genetically isolated group of Circassian people living in Jordan. We also compared these variant distributions to those of other ethnic groups available in two databases (1000 Genomes and ExAC). Our results revealed that the distribution of allele frequencies within genes among Circassians in Jordan showed similarities and disparities when compared to other populations. This study provides a powerful base for clinically relevant SNPs to enhance medical research and future pharmacogenomic studies. Rare variants detected in isolated populations can point to novel loci involved in the development of clinically relevant traits.


2016 ◽  
Vol 283 (1841) ◽  
pp. 20161785 ◽  
Author(s):  
Long Wang ◽  
Yanchun Zhang ◽  
Chao Qin ◽  
Dacheng Tian ◽  
Sihai Yang ◽  
...  

Mutation rates and recombination rates vary between species and between regions within a genome. What are the determinants of these forms of variation? Prior evidence has suggested that recombination might be mutagenic, with an excess of new mutations in the vicinity of recombination break points. As it is conjectured that domesticated taxa have higher recombination rates than wild ones, we expect domesticated taxa to have raised mutation rates. Here, we use parent–offspring sequencing in domesticated and wild peach to ask (i) whether recombination is mutagenic, and (ii) whether domesticated peach has a higher recombination rate than wild peach. We find no evidence that domesticated peach has an increased recombination rate, nor an increased mutation rate near recombination events. If recombination is mutagenic in this taxon, the effect is too weak to be detected by our analysis. While an absence of recombination-associated mutation might explain an absence of a recombination–heterozygosity correlation in peach, we caution against such an interpretation.


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0246365
Author(s):  
Kellie J. Carim ◽  
Scott Relyea ◽  
Craig Barfoot ◽  
Lisa A. Eby ◽  
John A. Kronenberger ◽  
...  

Human activities that fragment fish habitat have isolated inland salmonid populations. This isolation is associated with loss of migratory life histories and declines in population density and abundance. Isolated populations exhibiting only resident life histories may be more likely to persist if individuals can increase lifetime reproductive success by maturing at smaller sizes or earlier ages. Therefore, accurate estimates of age and size at maturity across resident salmonid populations would improve estimates of population viability. Commonly used methods for assessing maturity such as dissection, endoscopy and hormone analysis are invasive and may disturb vulnerable populations. Ultrasound imaging is a non-invasive method that has been used to measure reproductive status across fish taxa. However, little research has assessed the accuracy of ultrasound for determining maturation status of small-bodied fish, or reproductive potential early in a species’ reproductive cycle. To address these knowledge gaps, we tested whether ultrasound imaging could be used to identify maturing female Westslope Cutthroat Trout (Oncorhynchus clarkii lewisi). Our methods were accurate at identifying maturing females reared in a hatchery setting up to eight months prior to spawning, with error rates ≤ 4.0%; accuracy was greater for larger fish. We also imaged fish in a field setting to examine variation in the size of maturing females among six wild, resident populations of Westslope Cutthroat Trout in western Montana. The median size of maturing females varied significantly across populations. We observed oocyte development in females as small as 109 mm, which is smaller than previously documented for this species. Methods tested in this study will allow researchers and managers to collect information on reproductive status of small-bodied salmonids without disrupting fish during the breeding season. This information can help elucidate life history traits that promote persistence of isolated salmonid populations.

