Estimation of linkage disequilibrium levels and allele frequency distribution in crossbred Vrindavani cattle using 50K SNP data

PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0259572
Author(s):  
Akansha Singh ◽  
Amit Kumar ◽  
Arnav Mehrotra ◽  
Karthikeyan A. ◽  
Ashwni Kumar Pandey ◽  
...  

The objective of this study was to calculate the extent and decay of linkage disequilibrium (LD) in 96 crossbred Vrindavani cattle genotyped with the Bovine SNP50K Bead Chip. After filtering, 43,821 SNPs covering 2500.3 Mb of the autosomes were retained for the final analysis. A large proportion of SNPs had a minor allele frequency below 0.20. The extent of LD between autosomal SNPs up to 10 Mb apart across the genome was measured using the r² statistic. The mean r² was 0.43 for marker pairs less than 10 kb apart and decreased to 0.21 for pairs 25–50 kb apart. The effects of minor allele frequency and sample size on the LD estimate were also investigated. LD decreased as inter-marker distance increased, and increased with minor allele frequency. The estimated inbreeding coefficient and effective population size for the present generation were 0.04 and 46, respectively, indicating a small and unstable population of Vrindavani cattle. These findings suggest that a denser or breed-specific SNP panel would be required to cover the whole genome of Vrindavani cattle in genome-wide association studies (GWAS).
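As a rough illustration of how an r² decay summary like the one above can be produced, the sketch below computes r² as the squared Pearson correlation between two SNPs coded as 0/1/2 allele counts, then averages r² within distance bins. The genotype coding, the function names, and the bin edges are assumptions made for this example; the abstract does not state which software or estimator the authors used.

```python
def r_squared(geno_a, geno_b):
    """Squared Pearson correlation between two SNPs coded as 0/1/2 allele counts."""
    n = len(geno_a)
    if n != len(geno_b) or n < 2:
        raise ValueError("need two equal-length genotype vectors with n >= 2")
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a monomorphic SNP has no defined r2
    return cov * cov / (var_a * var_b)


def mean_r2_by_distance(pairs, bin_edges_kb):
    """Average r2 within half-open distance bins [lo, hi).

    pairs: iterable of (distance_kb, r2) tuples.
    bin_edges_kb: increasing bin edges, e.g. [0, 10, 25, 50] (hypothetical).
    Returns one mean per bin, or None for an empty bin.
    """
    n_bins = len(bin_edges_kb) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for dist, r2 in pairs:
        for i in range(n_bins):
            if bin_edges_kb[i] <= dist < bin_edges_kb[i + 1]:
                sums[i] += r2
                counts[i] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]
```

In use, `r_squared` would be called for every SNP pair within the chosen maximum distance, and the resulting (distance, r²) tuples passed to `mean_r2_by_distance`.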

2020 ◽  
Author(s):  
Palle Duun Rohde ◽  
Peter Sørensen ◽  
Mette Nyegaard

Abstract Genomics has been forecast to revolutionise human health by improving medical treatment through a better understanding of the molecular mechanisms of human diseases. Despite the great successes of the last decade’s genome-wide association studies (GWAS), the results have been translated into genomic medicine only to a limited extent. We propose that one route towards improved medical treatment is to understand the genetics of medication-use. Here we obtained entire medication profiles from 335,744 individuals from the UK Biobank and performed a GWAS to identify which common genetic variants are major drivers of medication-use. We analysed 9 million imputed genetic variants, estimated SNP heritability, partitioned the genomic variance across functional categories, and constructed genetic scores for medication-use. In total, 59 independent loci were identified for medication-use, and approximately 18% of the total variation was attributable to common genetic variants (minor allele frequency >0.01). The largest fraction of variance was captured by variants with low to medium minor allele frequency. In particular, coding and conserved regions, as well as transcription start sites, displayed significant enrichment of heritability. The average correlation between medication-use and predicted genetic scores was 0.14. These results demonstrate that medication-use per se is a highly polygenic complex trait and that individuals with higher genetic liability are on average more diseased and have a higher risk of adverse drug reactions. These results provide insight into the genetic architecture of medication use and pave the way for the development of multicomponent genetic risk models that include genetically informed medication-use.
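To make the notion of a genetic score concrete, here is a minimal sketch that assumes the common additive form: a weighted sum of allele dosages, with per-variant weights taken from GWAS effect estimates. It also shows the Pearson correlation used to compare scores with an observed trait. The function names and the additive form are assumptions for illustration; the abstract does not describe how the authors built their scores.

```python
from math import sqrt


def genetic_score(dosages, effects):
    """Additive score: sum over variants of allele dosage (0..2) times effect size."""
    if len(dosages) != len(effects):
        raise ValueError("dosages and effects must have the same length")
    return sum(d * b for d, b in zip(dosages, effects))


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation undefined for a constant sequence
    return sxy / sqrt(sxx * syy)
```

Scoring every individual with `genetic_score` and passing the scores and the observed medication-use values to `pearson` yields a correlation of the kind reported above.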


2015 ◽  
Author(s):  
Xia Shen

Motivation: Genome-wide association studies have been conducted in inbred populations where the sample size is small. The ordinary association p-values and multiple testing correction therefore become questionable, as the detected genetic effect may or may not be due to chance, depending on the minor allele frequency distribution across the genome. Instead of permutation testing, the marker-specific false positive rate can be calculated analytically in inbred populations without heterozygotes. Results: Solutions of exact p-values for genome-wide association studies in inbred populations were derived and implemented. An example is presented to illustrate that the marker-specific experiment-wise p-value varies as the genome-wide minor allele frequency distribution changes. A simulation using a real Arabidopsis thaliana genome indicates that the use of exact p-values improves detection power and reduces inflation due to population structure. An analysis of a defense-related case-control phenotype using the exact p-values revealed the causal locus, where markers with higher MAFs had smaller p-values than the top variants with lower MAFs in ordinary genome-wide association analysis. Availability and Implementation: Project URL: https://r-forge.r-project.org/projects/statomics/. The R package p.exact: https://r-forge.r-project.org/R/?group_id=2030.
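The dependence of small-sample p-values on minor allele frequency can be illustrated with a standard Fisher exact test. This is not the derivation implemented in the p.exact package; it is a swapped-in textbook test, used here only to show the effect. The sketch assumes fully inbred lines (each line carries either the minor or the major allele) and a case-control phenotype, and asks what the smallest attainable two-sided p-value is for a marker with a given minor allele count.

```python
from math import comb


def _support(n_total, n_cases, n_minor):
    """Feasible counts of minor-allele lines among the cases."""
    lo = max(0, n_cases - (n_total - n_minor))
    hi = min(n_cases, n_minor)
    return range(lo, hi + 1)


def hypergeom_pmf(x, n_total, n_cases, n_minor):
    """P(x minor-allele lines among cases) when alleles are unrelated to phenotype."""
    return (comb(n_minor, x) * comb(n_total - n_minor, n_cases - x)
            / comb(n_total, n_cases))


def fisher_two_sided(x, n_total, n_cases, n_minor):
    """Two-sided Fisher exact p-value: sum of table probabilities <= observed."""
    p_obs = hypergeom_pmf(x, n_total, n_cases, n_minor)
    probs = [hypergeom_pmf(k, n_total, n_cases, n_minor)
             for k in _support(n_total, n_cases, n_minor)]
    # Small relative tolerance so tables tied with the observed one are included.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def smallest_attainable_p(n_total, n_cases, n_minor):
    """Smallest p-value any configuration of this marker could ever produce."""
    return min(fisher_two_sided(x, n_total, n_cases, n_minor)
               for x in _support(n_total, n_cases, n_minor))
```

With 20 lines and 10 cases, a marker carried by only 2 lines can never reach a small p-value, whereas a marker carried by 10 lines can, so a single genome-wide threshold treats markers of different MAF very differently.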


2013 ◽  
Vol 3 (1) ◽  
pp. 23-29 ◽  
Author(s):  
Lei Jiang ◽  
Dana Willner ◽  
Patrick Danoy ◽  
Huji Xu ◽  
Matthew A Brown

Abstract Most genome-wide association studies to date have been performed in populations of European descent, but there is increasing interest in expanding these studies to other populations. The performance of genotyping chips in Asian populations is not well established. Therefore, we sought to test the performance of widely used fixed-marker genome-wide association study chips in the Han Chinese population. Non-HapMap Chinese samples (n = 396) were genotyped using the Illumina OmniExpress and Affymetrix 6.0 platforms, and a subset was also genotyped using the Immunochip. Genotyped markers from the Affymetrix 6.0 and Illumina OmniExpress were used for full genome imputation based on the HapMap 2 JPT+CHB (Japanese from Tokyo, Japan and Chinese from Beijing, China) reference panel. The concordance between marker genotypes for the three platforms was very high, both for directly genotyped single nucleotide polymorphisms (SNPs; >99.8%) and for genotyped and imputed SNPs (>99.5%). The OmniExpress chip data enabled more SNPs to be imputed, particularly SNPs with minor allele frequency >5%. The OmniExpress chip achieved better coverage of HapMap SNPs than the Affymetrix 6.0 chip (73.6% vs. 65.9%, respectively, for minor allele frequency >5%). The Affymetrix 6.0 and Illumina OmniExpress chips have similar genotyping accuracy and provide similar accuracy of imputed SNPs. The OmniExpress chip, however, provides better coverage of Asian HapMap SNPs, although its coverage of HapMap SNPs is moderate.
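A genotype concordance figure of this kind can be computed along the following lines. The sketch assumes genotype calls are stored as two-letter strings such as "AG", that missing calls are `None`, and that the two lists are aligned on the same SNPs for the same sample; these conventions are hypothetical and are not taken from the paper.

```python
def concordance(calls_a, calls_b):
    """Fraction of identical genotype calls among SNPs called on both platforms.

    calls_a, calls_b: aligned lists of genotype strings (e.g. "AG") or None
    for a missing call. Allele order within a genotype is ignored.
    """
    compared = 0
    matching = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # skip SNPs not called on both platforms
        compared += 1
        if sorted(a) == sorted(b):  # "AG" and "GA" are the same genotype
            matching += 1
    return matching / compared if compared else float("nan")
```

SNPs missing on either platform are excluded from the denominator, so a low call rate does not lower the concordance itself.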


2019 ◽  
Vol 62 (1) ◽  
pp. 143-151 ◽  
Author(s):  
Seyed Mohammad Ghoreishifar ◽  
Hossein Moradi-Shahrbabak ◽  
Nahid Parna ◽  
Pourya Davoudi ◽  
Majid Khansefid

Abstract. This research aimed to measure the extent of linkage disequilibrium (LD), effective population size (Ne), and runs of homozygosity (ROHs) in one of the major Iranian sheep breeds (Zandi) using 96 samples genotyped with the Illumina Ovine SNP50 BeadChip. The amount of LD (r²) for single-nucleotide polymorphism (SNP) pairs at short distances (10–20 kb) was 0.21±0.25 but rapidly decreased to 0.10±0.16 as the distance between SNP pairs increased (40–60 kb). The Ne of Zandi sheep in the past (approximately 3500 generations ago) and recent (five generations ago) populations was estimated to be 6475 and 122, respectively. The ROH-based inbreeding was 0.023. We found 558 ROH regions, of which 37% were relatively long (> 10 Mb). Compared with the rate of LD reduction in other species (e.g., cattle and pigs), LD in Zandi decreased more rapidly with increasing distance between SNP pairs. According to the LD pattern and high genetic diversity of Zandi sheep, an SNP panel with a higher density than the Illumina Ovine SNP50 BeadChip is needed for genomic selection and genome-wide association studies in this breed.
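One widely used way to turn LD at a given marker distance into an Ne estimate for a past generation relies on Sved's approximation E[r²] ≈ 1/(1 + 4·Ne·c), read as referring to roughly 1/(2c) generations ago, where c is the recombination distance in Morgans. The sketch below assumes that approach and a hypothetical map conversion of 1 cM per Mb. The abstract does not state which estimator or conversion the authors used, so this is an illustration only.

```python
def ne_from_r2(mean_r2, distance_mb, cm_per_mb=1.0):
    """Estimate Ne from mean r2 at a given physical distance (Sved's approximation).

    mean_r2: average r2 for SNP pairs at this distance (0 < mean_r2 < 1).
    distance_mb: physical distance between markers in megabases.
    cm_per_mb: assumed genetic map conversion (hypothetical default of 1 cM/Mb).
    Returns (ne, generations_ago).
    """
    if not 0 < mean_r2 < 1:
        raise ValueError("mean_r2 must lie strictly between 0 and 1")
    if distance_mb <= 0 or cm_per_mb <= 0:
        raise ValueError("distance and map conversion must be positive")
    c = distance_mb * cm_per_mb / 100.0      # recombination distance in Morgans
    ne = (1.0 / mean_r2 - 1.0) / (4.0 * c)   # invert E[r2] = 1 / (1 + 4 * Ne * c)
    generations_ago = 1.0 / (2.0 * c)        # generation the estimate refers to
    return ne, generations_ago
```

Short marker distances therefore inform on Ne in the distant past and long distances on recent Ne, which is why such studies report Ne for several time points.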


2019 ◽  
Author(s):  
Jiayi Qu ◽  
Stephen D Kachman ◽  
Dorian Garrick ◽  
Rohan L Fernando ◽  
Hao Cheng

ABSTRACT Linkage disequilibrium (LD), often expressed in terms of the squared correlation (r²) between allelic values at two loci, is an important concept in many branches of genetics and genomics. Genetic drift and recombination have opposite effects on LD, and thus r² will keep changing until the effects of these two forces are counterbalanced. Several approximations have been used to determine the expected value of r² at equilibrium in the presence or absence of mutation. In this paper, we propose a probability-based approach to compute the exact distribution of allele frequencies at two loci in a finite population at any generation t conditional on the distribution at generation t − 1. As r² is a function of this distribution of allele frequencies, this approach can be used to examine the distribution of r² over generations as it approaches equilibrium. The exact distribution of LD from our method is used to describe, quantify and compare LD at different equilibria, including equilibrium in the absence or presence of mutation, selection, and filtering by minor allele frequency. We also propose a deterministic formula for expected LD in the presence of mutation at equilibrium based on the exact distribution of LD.
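For orientation, the standard textbook quantities behind this abstract can be written out as follows. The notation (alleles A/a and B/b, haplotype frequency p_{AB}, recombination rate c, population size N) is assumed here and is not taken from the paper. The last line is Sved's classical approximation, one of the approximations the abstract alludes to, and not the exact distribution or the deterministic formula the authors propose.

```latex
% Two biallelic loci with alleles A/a and B/b; p_{AB} is the frequency of the AB haplotype.
D = p_{AB} - p_A\,p_B, \qquad
r^2 = \frac{D^2}{p_A(1-p_A)\,p_B(1-p_B)}

% Recombination alone (no drift), with recombination rate c per generation:
D_t = (1-c)\,D_{t-1}

% Sved's (1971) approximation at drift-recombination equilibrium, without mutation:
\mathrm{E}[r^2] \approx \frac{1}{1 + 4Nc}
```

Recombination shrinks D geometrically while drift in a finite population regenerates it, which is the balance the abstract describes.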


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Yan Guo ◽  
David C. Samuels ◽  
Jiang Li ◽  
Travis Clark ◽  
Chung-I Li ◽  
...  

Next-generation sequencing (NGS) technology has provided researchers with opportunities to study the genome in unprecedented detail. In particular, NGS is applied to disease association studies. Unlike genotyping chips, NGS is not limited to a fixed set of SNPs. Prices for NGS are now comparable to the SNP chip, although for large studies the cost can be substantial. Pooling techniques are often used to reduce the overall cost of large-scale studies. In this study, we designed a rigorous simulation model to test the practicability of estimating allele frequency from pooled sequencing data. We took crucial factors into consideration, including pool size, overall depth, average depth per sample, pooling variation, and sampling variation. We used real data to demonstrate and measure reference allele preference in DNAseq data and implemented this bias in our simulation model. We found that pooled sequencing data can introduce high levels of relative error rate (defined as error rate divided by targeted allele frequency) and that the error rate is more severe for low minor allele frequency SNPs than for high minor allele frequency SNPs. In order to overcome the error introduced by pooling, we recommend a large pool size and high average depth per sample.
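The core of such a simulation can be sketched in a few lines. This simplified version models only two of the factors listed above: sampling variation (which individuals end up in the pool) and read sampling at a given total depth. It assumes every individual contributes equally to the pool and ignores pooling variation and reference allele bias, both of which the authors did model. Function names and defaults are hypothetical.

```python
import random


def simulate_pooled_estimate(true_freq, pool_size, total_depth, rng):
    """One simulated allele frequency estimate from a pooled sequencing run."""
    # Sampling variation: draw two alleles for each diploid individual in the pool.
    alt_alleles = sum(rng.random() < true_freq for _ in range(2 * pool_size))
    pool_freq = alt_alleles / (2 * pool_size)
    # Read sampling: each read carries the alternate allele with probability
    # pool_freq (equal contribution per individual is assumed).
    alt_reads = sum(rng.random() < pool_freq for _ in range(total_depth))
    return alt_reads / total_depth


def mean_relative_error(true_freq, pool_size, total_depth, n_reps=500, seed=1):
    """Mean of |estimate - true_freq| / true_freq over repeated simulations."""
    rng = random.Random(seed)  # seeded for reproducibility
    errors = [
        abs(simulate_pooled_estimate(true_freq, pool_size, total_depth, rng)
            - true_freq) / true_freq
        for _ in range(n_reps)
    ]
    return sum(errors) / n_reps
```

Comparing `mean_relative_error(0.05, 50, 500)` with `mean_relative_error(0.4, 50, 500)` reproduces the qualitative finding that relative error is larger for low-frequency SNPs.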


2016 ◽  
Author(s):  
Jaleal S. Sanjak ◽  
Anthony D. Long ◽  
Kevin R. Thornton

Abstract The genetic component of complex disease risk in humans remains largely unexplained. A corollary is that the allelic spectrum of genetic variants contributing to complex disease risk is unknown. Theoretical models that relate population genetic processes to the maintenance of genetic variation for quantitative traits may suggest profitable avenues for future experimental design. Here we use forward simulation to model a genomic region evolving under a balance between recurrent deleterious mutation and Gaussian stabilizing selection. We consider multiple genetic and demographic models, and several different methods for identifying genomic regions harboring variants associated with complex disease risk. We demonstrate that the model of gene action, relating genotype to phenotype, has a qualitative effect on several relevant aspects of the population genetic architecture of a complex trait. In particular, the genetic model impacts genetic variance component partitioning across the allele frequency spectrum and the power of statistical tests. Models with partial recessivity closely match the minor allele frequency distribution of significant hits from empirical genome-wide association studies without requiring homozygous effect-sizes to be small. We highlight a particular gene-based model of incomplete recessivity that is appealing from first principles. Under that model, deleterious mutations in a genomic region partially fail to complement one another. This model of gene-based recessivity predicts the empirically observed inconsistency between twin- and SNP-based estimates of dominance heritability. Furthermore, this model predicts considerable levels of unexplained variance associated with intralocus epistasis. Our results suggest a need for improved statistical tools for region-based genetic association and heritability estimation.

Author Summary Gene action determines how mutations affect phenotype. When placed in an evolutionary context, the details of the genotype-to-phenotype model can impact the maintenance of genetic variation for complex traits. Likewise, non-equilibrium demographic history may affect patterns of genetic variation. Here, we explore the impact of genetic model and population growth on the distribution of genetic variance across the allele frequency spectrum underlying risk for a complex disease. Using forward-in-time population genetic simulations, we show that the genetic model has important impacts on the composition of variation for complex disease risk in a population. We explicitly simulate genome-wide association studies (GWAS) and perform heritability estimation on population samples. A particular model of gene-based partial recessivity, based on allelic non-complementation, aligns well with empirical results. This model is congruent with the dominance variance estimates from both SNPs and twins, and the minor allele frequency distribution of GWAS hits.


2014 ◽  
Vol 166 ◽  
pp. 121-132 ◽  
Author(s):  
Ana M. Pérez O’Brien ◽  
Gábor Mészáros ◽  
Yuri T. Utsunomiya ◽  
Tad S. Sonstegard ◽  
J. Fernando Garcia ◽  
...  
