Sporadic, global linkage disequilibrium between unlinked segregating sites

2015 ◽  
Author(s):  
Daniel A Skelly ◽  
Paul M Magwene ◽  
Eric A Stone

Demographic, genetic, or stochastic factors can lead to perfect linkage disequilibrium (LD) between alleles at two loci without respect to the extent of their physical distance, a phenomenon that Lawrence et al. (2005a) refer to as "genetic indistinguishability". This phenomenon can complicate genotype-phenotype association testing by hindering the ability to localize causal alleles, but has not been thoroughly explored from a theoretical perspective or using large, dense whole-genome polymorphism datasets. We derive a simple theoretical model of the prevalence of genetic indistinguishability between unlinked loci, and verify its accuracy via simulation. We show that sample size and minor allele frequency are the major determinants of the prevalence of perfect LD between unlinked loci but that demographic factors, such as deviations from random mating, can produce significant effects as well. Finally, we quantify this phenomenon in three model organisms and find thousands of pairs of moderate-frequency (>5%) genetically indistinguishable variants in relatively large datasets. These results clarify a previously underexplored population genetic phenomenon with important implications for association studies, and define conditions under which it is likely to manifest.
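The dependence on sample size and minor allele frequency can be illustrated with a small combinatorial sketch. This is not the authors' model. It assumes haploid samples, two independent (unlinked) sites that each carry the minor allele in exactly k of n samples, and carriers placed uniformly at random; the function names are hypothetical.

```python
from math import comb


def prob_identical_pattern(n: int, k: int) -> float:
    """Chance that two independent sites are in perfect LD (r^2 = 1).

    Assumes n haploid samples and exactly k minor-allele carriers at each
    site, with the carrier set at each site drawn uniformly at random.
    """
    if not 0 < k <= n // 2:
        raise ValueError("k must satisfy 0 < k <= n/2")
    # The second site must pick the same k carriers as the first.
    p = 1 / comb(n, k)
    # When k == n/2 the complementary carrier set also gives r^2 = 1.
    if 2 * k == n:
        p *= 2
    return p


def expected_indistinguishable_pairs(n: int, k: int, m: int) -> float:
    """Expected number of perfectly correlated pairs among m such sites."""
    return comb(m, 2) * prob_identical_pattern(n, k)
```

Under these assumptions the probability falls steeply as either n or k grows, which is consistent with the abstract's statement that sample size and minor allele frequency are the major determinants. Population structure or non-random mating, which the abstract also mentions, is outside this sketch.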

2018 ◽  
Author(s):  
Agustín Barría ◽  
Kris A. Christensen ◽  
Grazyella Yoshida ◽  
Ana Jedlicki ◽  
Jean P. Lhorente ◽  
...  

Abstract The estimation of linkage disequilibrium between molecular markers within a population is critical when establishing the minimum number of markers required for association studies and genomic selection, and for inferring historical events influencing different populations. This work aimed to evaluate the extent and decay of linkage disequilibrium in a coho salmon breeding population using ddRAD genomic markers. Linkage disequilibrium was estimated between a total of 7,505 SNPs found in 62 individuals (33 dams and 29 sires) from the breeding population. The markers encompass all 30 coho salmon chromosomes and comprise 1,655.19 Mb of the genome. The average density of markers per chromosome ranged from 3.45 to 6.11 per 1 Mbp. The minor allele frequency averaged 0.20 (with a range from 0.08 to 0.50). The overall average linkage disequilibrium among SNP pairs, measured as r2, was 0.054. The average r2 value decreased with increasing physical distance, with values ranging from 0.37 to 0.054 at distances lower than 1 kb and up to 10 Mb, respectively. An r2 threshold of 0.1 was reached at a distance of approximately 1.3 Mb. Chromosomes Okis05, Okis15 and Okis28 showed high levels of linkage disequilibrium (> 0.20 at distances lower than 1 Mb). Average r2 values were lower than 0.1 for all chromosomes at distances greater than 4 Mb. Linkage disequilibrium values suggest that whole genome association and selection studies could be performed using about 75,000 SNPs in aquaculture populations (depending on the trait under investigation). From the identified SNPs, an effective population size of 100 was estimated for the population 10 generations ago, and 1,000 for 139 generations ago. Based on the extent of r2 decay, we suggest that at least 75,000 SNPs would be necessary for an association mapping study. Over 100,000 SNPs would be necessary for a high power study in the current coho salmon population.


2019 ◽  
Vol 35 (19) ◽  
pp. 3855-3856 ◽  
Author(s):  
Emma A Fox ◽  
Alison E Wright ◽  
Matteo Fumagalli ◽  
Filipe G Vieira

Abstract Motivation: Linkage disequilibrium (LD) measures the correlation between genetic loci and is highly informative for association mapping and population genetics. As many studies rely on called genotypes for estimating LD, their results can be affected by data uncertainty, especially when employing a low read depth sequencing strategy. Furthermore, there is a manifest lack of tools for the analysis of large-scale, low-depth and short-read sequencing data from non-model organisms with limited sample sizes. Results: ngsLD addresses these issues by estimating LD directly from genotype likelihoods in a fast, reliable and user-friendly implementation. This method makes use of the full information available from sequencing data and provides accurate estimates of linkage disequilibrium patterns compared with approaches based on genotype calling. We conducted a case study to investigate how LD decays over physical distance in two avian species. Availability and implementation: The methods presented in this work were implemented in C/C++ and are freely available for non-commercial use from https://github.com/fgvieira/ngsLD. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 12 ◽  
Author(s):  
Shirin Rahimmadar ◽  
Mokhtar Ghaffari ◽  
Mahdi Mokhber ◽  
John L. Williams

Linkage disequilibrium (LD) across the genome provides information to identify the genes and variations related to quantitative traits in genome-wide association studies (GWAS) and for the implementation of genomic selection (GS). LD can also be used to evaluate genetic diversity and population structure and reveal genomic regions affected by selection. LD structure and effective population size (Ne) were assessed in a set of 83 water buffaloes, comprising Azeri (AZI), Khuzestani (KHU), and Mazandarani (MAZ) breeds from Iran, Kundi (KUN) and Nili-Ravi (NIL) from Pakistan, Anatolian (ANA) buffalo from Turkey, and buffalo from Egypt (EGY). The values of corrected r2 (defined as the correlation between two loci) of adjacent SNPs for the three pooled Iranian breeds (IRI), ANA, EGY, and the two pooled Pakistani breeds (PAK) were 0.24, 0.28, 0.27, and 0.22, respectively. The corrected r2 between SNPs decreased with increasing physical distance from 100 Kb to 1 Mb. The LD values for the IRI, ANA, EGY, and PAK populations were 0.16, 0.23, 0.24, and 0.21, respectively, at distances of less than 100 Kb, and fell rapidly to 0.018, 0.042, 0.059, and 0.024 at a distance of 1 Mb. In all the populations, the decay rate was low for distances greater than 2 Mb, up to the longest studied distance (15 Mb). The r2 values for adjacent SNPs in unrelated samples indicated that the Affymetrix Axiom 90 K SNP genomic array was suitable for GWAS and GS in these populations. The persistency of LD phase (PLDP) between populations was assessed, and results showed that PLDP values between the populations were more than 0.9 for distances of less than 100 Kb. The Ne in the recent generations has declined to the extent that breeding plans are urgently required to ensure that these buffalo populations are not at risk of being lost. We found that results are affected by sample size, which could be partially corrected for; however, additional data should be obtained to be confident of the results.


2018 ◽  
Vol 63 (No. 2) ◽  
pp. 61-69 ◽  
Author(s):  
M.M.I. Salem ◽  
G. Thompson ◽  
S. Chen ◽  
A. Beja-Pereira ◽  
J. Carvalheira

The objectives of this study were to estimate linkage disequilibrium (LD) and to describe and scan haplotype blocks for the presence of genes that may affect milk production traits in Portuguese Holstein cattle. A total of 526 animals were genotyped using the Illumina BovineSNP50 BeadChip, which contained a total of 52 890 single nucleotide polymorphisms (SNPs). The final set of markers remaining after quality control consisted of 37 031 SNPs located on 29 autosomes. The LD parameters D' (historical recombination through allelic association) and r² (squared correlation coefficient between allele frequencies at two loci) were estimated, and haplotype block analyses were performed using the Haploview software. The averages of D' and r² values were 0.628 and 0.122, respectively. The LD value decreased with increasing physical distance. The D' and r² values decreased respectively from 0.815 and 0.283 at the distance of 0–30 kb to 0.578 and 0.090 at the distance of 401–500 kb. The identified total number of blocks was 969 and consisted of 4259 SNPs that covered 159.06 Mb (6.24% of the total genome) on 29 autosomes. Several genes inside the haplotype blocks were detected: the CSN1S2 gene in haplotype block 51 on BTA 6, the IL6 and B4GALT1 genes in haplotype blocks 6 and 33 on BTA 8, the IL1B and ID2 genes in haplotype blocks 19 and 29 on BTA 11, and the DGAT1 gene in haplotype block 1 on BTA 14. The extension of LD using the BovineSNP50 BeadChip did not exceed 500 kb and its parameters r² and D' were less than 0.2 and 0.70, respectively, after 70–100 kb. Consequently, the 50K BeadChip would have poor power in genome wide association studies at distances between adjacent markers lower than 70 kb.
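For readers unfamiliar with the two LD parameters named in this abstract, the sketch below computes them for a single pair of biallelic loci using the standard textbook definitions. It assumes phased haplotype frequencies are already known; the function and argument names are hypothetical, and it is not a description of how Haploview estimates these quantities from unphased genotypes.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, r^2, |D'|) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    # Raw disequilibrium: observed haplotype frequency minus its
    # expectation under independence.
    d = p_ab - p_a * p_b
    # Squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales D by its maximum attainable magnitude given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, r2, abs(d) / d_max
```

With p_a = 0.5, p_b = 0.2 and p_ab = 0.2 (every B haplotype also carries A), |D'| reaches 1 while r² is only 0.25. Differences in allele frequency are one reason average D' values, such as those reported in the abstract, can sit well above average r² values.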


Genetics ◽  
2000 ◽  
Vol 156 (1) ◽  
pp. 457-467 ◽  
Author(s):  
Z W Luo ◽  
S H Tao ◽  
Z-B Zeng

Abstract Three approaches are proposed in this study for detecting or estimating linkage disequilibrium between a polymorphic marker locus and a locus affecting quantitative genetic variation using samples from random mating populations. It is shown that the disequilibrium over a wide range of circumstances may be detected with a power of 80% by using phenotypic records and marker genotypes of a few hundred individuals. Comparison of ANOVA and regression methods in this article to the transmission disequilibrium test (TDT) shows that, given the genetic variance explained by the trait locus, the power of TDT depends on the trait allele frequency, whereas the power of ANOVA and regression analyses is relatively independent of the allele frequency. The TDT method is more powerful when the trait allele frequency is low, but much less powerful when it is high. The likelihood analysis provides reliable estimation of the model parameters when the QTL variance is at least 10% of the phenotypic variance and a sample size of a few hundred is used. Potential use of these estimates in mapping the trait locus is also discussed.


BMC Genomics ◽  
2014 ◽  
Vol 15 (1) ◽  
pp. 408 ◽  
Author(s):  
Cristina Rodriguez-Fontenla ◽  
Manuel Calaza ◽  
Antonio Gonzalez

2021 ◽  
pp. 1-11
Author(s):  
Valentina Escott-Price ◽  
Karl Michael Schmidt

Background: Genome-wide association studies (GWAS) were successful in identifying SNPs showing association with disease, but their individual effect sizes are small and require large sample sizes to achieve statistical significance. Methods of post-GWAS analysis, including gene-based, gene-set and polygenic risk scores, combine the SNP effect sizes in an attempt to boost the power of the analyses. To avoid giving undue weight to SNPs in linkage disequilibrium (LD), the LD needs to be taken into account in these analyses. Objectives: We review methods that attempt to adjust the effect sizes (β-coefficients) of summary statistics, instead of simple LD pruning. Methods: We subject LD adjustment approaches to a mathematical analysis, recognising Tikhonov regularisation as a framework for comparison. Results: Observing the similarity of the processes involved with the more straightforward Tikhonov-regularised ordinary least squares estimate for multivariate regression coefficients, we note that current methods based on a Bayesian model for the effect sizes effectively provide an implicit choice of the regularisation parameter, which is convenient, but at the price of reduced transparency and, especially in smaller LD blocks, a risk of incomplete LD correction. Conclusions: There is no simple answer to the question of which method is best, but where interpretability of the LD adjustment is essential, as in research aiming at identifying the genomic aetiology of disorders, our study suggests that a more direct choice of mild regularisation in the correction of effect sizes may be preferable.
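As a rough illustration of what a Tikhonov-regularised LD adjustment looks like, the sketch below handles the smallest possible LD block, two SNPs with correlation r. It assumes standardised genotypes, so that the marginal effects equal the LD matrix R times the joint effects, and solves (R + λI)·β_joint = β_marginal in closed form. The function name and the parameter lam (λ) are hypothetical, and this is not the procedure of any specific method covered by the review.

```python
def tikhonov_adjust_pair(b1: float, b2: float, r: float, lam: float) -> tuple[float, float]:
    """LD-adjust two marginal effect sizes with Tikhonov regularisation.

    Solves (R + lam * I) @ beta_joint = (b1, b2) for the 2x2 LD matrix
    R = [[1, r], [r, 1]], assuming standardised genotypes.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    a = 1.0 + lam                # diagonal entry of R + lam * I
    det = a * a - r * r          # determinant of the 2x2 system
    if det <= 0:
        raise ValueError("system is singular; use lam > 0 when |r| = 1")
    # Closed-form inverse of the 2x2 matrix [[a, r], [r, a]].
    return (a * b1 - r * b2) / det, (a * b2 - r * b1) / det
```

With lam = 0 this is the unregularised correction: for b1 = 1.0, b2 = 0.5 and r = 0.5, the second SNP's adjusted effect is zero because its marginal signal is fully explained by LD with the first. A small positive lam keeps the system solvable even when two SNPs are perfectly correlated, at the cost of shrinking the adjusted effects and leaving part of the LD uncorrected, which is the trade-off the abstract describes.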


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 243-244
Author(s):  
Brittany N Diehl ◽  
Andres A Pech-Cervantes ◽  
Thomas H Terrill ◽  
Ibukun M Ogunade ◽  
Owen Rae ◽  
...  

Abstract Florida Native sheep is an indigenous breed from Florida and expresses superior parasite resistance. Previous candidate and genome wide association studies with Florida Native sheep have identified single nucleotide polymorphisms with additive and non-additive effects associated with parasite resistance. However, the role of other potential DNA variants, such as copy number variants (CNVs), in controlling this complex trait has not been evaluated. The objective of the present study was to investigate the importance of CNVs on resistance to natural Haemonchus contortus infections in Florida Native sheep. A total of 200 sheep were evaluated in the present study. Phenotypic records included fecal egg count (FEC, eggs/gram), FAMACHA score, and packed cell volume (PCV, %). Sheep were genotyped using the GGP Ovine 50K SNP chip. The copy number analysis was used to identify CNVs using the univariate method. A total of 170 animals with CNVs and phenotypic data were used for the association testing. Association tests were carried out using single linear regression and Principal Component Analysis (PCA) correction to identify CNVs associated with FEC, FAMACHA, and PCV. To confirm our results, a second association test using the correlation-trend test with PCA correction was performed. CNVs were considered significant when their adjusted p-value was < 0.05 after FDR correction. A deletion CNV in chromosome 21 was associated with FEC. This DNA variant was located in intron 2 of the RAB3IL gene and overlapped a QTL associated with changes in eosinophil number. Our study demonstrated for the first time that CNVs could potentially be involved in parasite resistance in this heritage sheep breed.

