Patterns of polymorphism, selection and linkage disequilibrium in the subgenomes of the allopolyploid Arabidopsis kamchatica

2018 ◽  
Author(s):  
Timothy Paape ◽  
Roman V. Briskine ◽  
Heidi E.L Lischer ◽  
Gwyneth Halstead-Nussloch ◽  
Rie Shimizu-Inatsugi ◽  
...  

Abstract. Although genome duplication is widespread in wild and crop plants, little is known about genome-wide selection due to the complexity of polyploid genomes. In allopolyploid species, the patterns of purifying selection and adaptive substitutions would be affected by masking owing to duplicated genes or ‘homeologs’ as well as by effective population size. We resequenced 25 distribution-wide accessions of the allotetraploid Arabidopsis kamchatica, which has a relatively small genome size (450 Mb) derived from the diploid species A. halleri and A. lyrata. The level of nucleotide polymorphism and linkage disequilibrium decay were comparable to A. thaliana, indicating the feasibility of association studies. A reduction in purifying selection compared with parental species was observed. Interestingly, the proportion of adaptive substitutions (α) was significantly positive in contrast to the majority of plant species. A recurrent pattern observed in both frequency and divergence-based neutrality tests is that the genome-wide distributions of both subgenomes were similar, but the correlation between homeologous pairs was low. This may increase the opportunity of different evolutionary trajectories such as in the HMA4 gene involved in heavy metal hyperaccumulation.
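
The abstract reports the proportion of adaptive substitutions (α) without saying how it was estimated. As a rough illustration only, the sketch below uses the classic McDonald-Kreitman style point estimate, α = 1 − (Ds·Pn)/(Dn·Ps). The function name `mk_alpha` and the counts in the usage note are hypothetical, and the study itself may have used a different, more elaborate estimator.

```python
def mk_alpha(dn, ds, pn, ps):
    """McDonald-Kreitman style estimate of the proportion of adaptive
    nonsynonymous substitutions.

    dn, ds: nonsynonymous / synonymous fixed differences (divergence)
    pn, ps: nonsynonymous / synonymous polymorphisms
    Assumption: this simple point estimate stands in for whatever
    estimator the study actually used.
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)
```

With made-up counts, `mk_alpha(40, 100, 20, 100)` gives 0.5, while equal ratios of nonsynonymous to synonymous changes in divergence and polymorphism give α = 0, the neutral expectation.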

2019 ◽  
Vol 62 (1) ◽  
pp. 143-151 ◽  
Author(s):  
Seyed Mohammad Ghoreishifar ◽  
Hossein Moradi-Shahrbabak ◽  
Nahid Parna ◽  
Pourya Davoudi ◽  
Majid Khansefid

Abstract. This research aimed to measure the extent of linkage disequilibrium (LD), effective population size (Ne), and runs of homozygosity (ROHs) in one of the major Iranian sheep breeds (Zandi) using 96 samples genotyped with Illumina Ovine SNP50 BeadChip. The amount of LD (r²) for single-nucleotide polymorphism (SNP) pairs in short distances (10–20 kb) was 0.21±0.25 but rapidly decreased to 0.10±0.16 by increasing the distance between SNP pairs (40–60 kb). The Ne of Zandi sheep in past (approximately 3500 generations ago) and recent (five generations ago) populations was estimated to be 6475 and 122, respectively. The ROH-based inbreeding was 0.023. We found 558 ROH regions, of which 37 % were relatively long (> 10 Mb). Compared with the rate of LD reduction in other species (e.g., cattle and pigs), in Zandi, it was reduced more rapidly by increasing the distance between SNP pairs. According to the LD pattern and high genetic diversity of Zandi sheep, we need to use an SNP panel with a higher density than Illumina Ovine SNP50 BeadChip for genomic selection and genome-wide association studies in this breed.
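
To make the link between pairwise LD and effective population size concrete, here is a minimal sketch. It assumes r² is computed as the squared Pearson correlation of genotype dosages (0/1/2) and that Ne follows Sved's approximation E[r²] ≈ 1/(1 + 4·Ne·c), with c the recombination distance in Morgans. The abstract does not state which estimator or corrections were used, so both function names and both modelling choices are assumptions.

```python
def r_squared(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(g1)
    if n != len(g2) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        raise ValueError("monomorphic marker: r^2 is undefined")
    return cov * cov / (v1 * v2)


def ne_sved(r2, c_morgans):
    """Invert Sved's approximation E[r^2] = 1 / (1 + 4 * Ne * c) for Ne.

    Assumption: no correction for sample size or mutation is applied.
    """
    if not 0 < r2 < 1 or c_morgans <= 0:
        raise ValueError("need 0 < r2 < 1 and c > 0")
    return (1.0 / r2 - 1.0) / (4.0 * c_morgans)
```

Under this approximation, LD between markers separated by c Morgans is often read as reflecting Ne roughly 1/(2c) generations ago, which is why short-range LD informs ancient Ne and long-range LD informs recent Ne.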


2021 ◽  
pp. 1-11
Author(s):  
Valentina Escott-Price ◽  
Karl Michael Schmidt

Background: Genome-wide association studies (GWAS) were successful in identifying SNPs showing association with disease, but their individual effect sizes are small and require large sample sizes to achieve statistical significance. Methods of post-GWAS analysis, including gene-based, gene-set and polygenic risk scores, combine the SNP effect sizes in an attempt to boost the power of the analyses. To avoid giving undue weight to SNPs in linkage disequilibrium (LD), the LD needs to be taken into account in these analyses. Objectives: We review methods that attempt to adjust the effect sizes (β-coefficients) of summary statistics, instead of simple LD pruning. Methods: We subject LD adjustment approaches to a mathematical analysis, recognising Tikhonov regularisation as a framework for comparison. Results: Observing the similarity of the processes involved with the more straightforward Tikhonov-regularised ordinary least squares estimate for multivariate regression coefficients, we note that current methods based on a Bayesian model for the effect sizes effectively provide an implicit choice of the regularisation parameter, which is convenient, but at the price of reduced transparency and, especially in smaller LD blocks, a risk of incomplete LD correction. Conclusions: There is no simple answer to the question which method is best, but where interpretability of the LD adjustment is essential, as in research aiming at identifying the genomic aetiology of disorders, our study suggests that a more direct choice of mild regularisation in the correction of effect sizes may be preferable.
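
As a generic illustration of Tikhonov-regularised LD adjustment (not the authors' exact formulation), the sketch below assumes standardised genotypes, so that marginal effects b relate to joint effects β through the LD correlation matrix R as b = Rβ. The adjusted effects are then taken as (R + λI)⁻¹ b. The function names, the plain-Python solver, and the choice of λ are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def tikhonov_adjust(ld, beta_marginal, lam):
    """Return (R + lam * I)^-1 b for LD matrix R and marginal effects b.

    Assumption: genotypes are standardised, so b = R * beta_joint.
    lam = 0 is plain LD inversion; larger lam shrinks the estimates.
    """
    n = len(beta_marginal)
    reg = [[ld[i][j] + (lam if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    return solve_linear(reg, beta_marginal)
```

For two SNPs with r = 0.5 where only the first is causal (joint effects 0.2 and 0), the marginal effects are 0.2 and 0.1; calling `tikhonov_adjust([[1, 0.5], [0.5, 1]], [0.2, 0.1], 0.0)` recovers the joint effects, and a positive `lam` trades some of that correction for numerical stability.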


Agronomy ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 2006
Author(s):  
David P. Horvath ◽  
Michael Stamm ◽  
Zahirul I. Talukder ◽  
Jason Fiedler ◽  
Aidan P. Horvath ◽  
...  

A diverse population (429 members) of canola (Brassica napus L.), consisting primarily of winter biotypes, was assembled and used in genome-wide association studies. Genotype-by-sequencing analysis of the population identified and mapped 290,972 high-quality markers, with missing markers per line ranging from 18.5 to 82.4% and averaging 36.8%. After interpolation, 251,575 high-quality markers remained. After filtering out markers with low minor allele counts (retaining those with count > 5), we were left with 190,375 markers. The average distance between these markers is 4463 bases, with a median of 69 and a range from 1 to 281,248 bases. The heterozygosity among the imputed population ranges from 0.9 to 11.0% with an average of 5.4%. The filtered and imputed dataset was used to determine population structure and kinship, which indicated that the population had minimal structure with the best K value of 2–3. These results also indicated that the majority of the population has substantial sequence from a single population with sub-clusters of, and admixtures with, a very small number of other populations. Chromosomal linkage disequilibrium decay distances ranged from ~7 kb for chromosome A01 to ~68 kb for chromosome C01. Local linkage decay rates determined for all 500 kb windows with a 10 kb sliding step showed a wide range of linkage disequilibrium decay rates, indicating numerous crossover hotspots within this population, and provide a resource for determining the likely limits of linkage disequilibrium from any given marker in which to identify candidate genes. This population and the resources provided here should serve as helpful tools for investigating genetics in winter canola.
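
The windowing scheme mentioned above (500 kb windows advanced in 10 kb steps) can be sketched as follows. The 0-based half-open coordinates, the decision to drop a trailing partial window, and the helper names are assumptions, not details taken from the study.

```python
from bisect import bisect_left


def sliding_windows(chrom_length, window=500_000, step=10_000):
    """Yield (start, end) half-open windows along a chromosome.

    Assumption: a final window that would run past the chromosome
    end is dropped rather than truncated.
    """
    start = 0
    while start + window <= chrom_length:
        yield start, start + window
        start += step


def markers_in_window(sorted_positions, start, end):
    """Return marker positions p with start <= p < end.

    sorted_positions must be sorted in ascending order.
    """
    lo = bisect_left(sorted_positions, start)
    hi = bisect_left(sorted_positions, end)
    return sorted_positions[lo:hi]
```

A local decay rate would then be estimated from the marker pairs inside each window; overlapping windows smooth the estimate at the cost of making neighbouring windows strongly correlated.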


2018 ◽  
Author(s):  
Agustín Barría ◽  
Kris A. Christensen ◽  
Grazyella Yoshida ◽  
Ana Jedlicki ◽  
Jean P. Lhorente ◽  
...  

Abstract. The estimation of linkage disequilibrium between molecular markers within a population is critical when establishing the minimum number of markers required for association studies, genomic selection and for inferring historical events influencing different populations. This work aimed to evaluate the extent and decay of linkage disequilibrium in a coho salmon breeding population using ddRAD genomic markers. Linkage disequilibrium was estimated between a total of 7,505 SNPs found in 62 individuals (33 dams and 29 sires) from the breeding population. The markers encompass all 30 coho salmon chromosomes and comprise 1,655.19 Mb of the genome. The average density of markers per chromosome ranged from 3.45 to 6.11 per 1 Mbp. The minor allele frequency averaged 0.20 (with a range from 0.08 to 0.50). The overall average linkage disequilibrium among SNP pairs, measured as r², was 0.054. The average r² value decreased with increasing physical distance, with values ranging from 0.37 to 0.054 at distances lower than 1 kb and up to 10 Mb, respectively. An r² threshold of 0.1 was reached at a distance of approximately 1.3 Mb. Chromosomes Okis05, Okis15 and Okis28 showed high levels of linkage disequilibrium (> 0.20 at distances lower than 1 Mb). Average r² values were lower than 0.1 for all chromosomes at distances greater than 4 Mb. Linkage disequilibrium values suggest that whole genome association and selection studies could be performed using about 75,000 SNPs in aquaculture populations (depending on the trait under investigation). From the identified SNPs, an effective population size of 100 was estimated for the population 10 generations ago, and 1,000 for 139 generations ago. Based on the extent of r² decay, we suggest that at least 75,000 SNPs would be necessary for an association mapping study. Over 100,000 SNPs would be necessary for a high power study in the current coho salmon population.
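
One simple way to summarise LD decay of the kind described here is to average r² within physical-distance bins and report the first bin whose mean falls below a threshold such as 0.1. The sketch below assumes the input is a list of hypothetical (distance in bp, r²) pairs; the binning scheme and function names are illustrative and not taken from the study.

```python
from collections import defaultdict


def mean_r2_by_distance(pairs, bin_size):
    """Average r^2 per distance bin; keys are bin start positions in bp."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, r2 in pairs:
        b = distance // bin_size
        sums[b] += r2
        counts[b] += 1
    return {b * bin_size: sums[b] / counts[b] for b in sorted(sums)}


def decay_distance(binned, threshold=0.1):
    """Return the start of the first bin whose mean r^2 is below threshold,
    or None if no bin falls below it."""
    for start in sorted(binned):
        if binned[start] < threshold:
            return start
    return None
```

The resulting distance depends on the bin size, and bins with few SNP pairs give noisy means, so in practice a fitted decay curve is often preferred to raw bin averages.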


2018 ◽  
Author(s):  
Corbin Quick ◽  
Christian Fuchsberger ◽  
Daniel Taliun ◽  
Gonçalo Abecasis ◽  
Michael Boehnke ◽  
...  

Abstract. Summary: Estimating linkage disequilibrium (LD) is essential for a wide range of summary statistics-based association methods for genome-wide association studies (GWAS). Large genetic data sets, e.g. the TOPMed WGS project and UK Biobank, enable more accurate and comprehensive LD estimates, but increase the computational burden of LD estimation. Here, we describe emeraLD (Efficient Methods for Estimation and Random Access of LD), a computational tool that leverages sparsity and haplotype structure to estimate LD orders of magnitude faster than existing tools. Availability and Implementation: emeraLD is implemented in C++, and is open source under GPLv3. Source code, documentation, an R interface, and utilities for analysis of summary statistics are freely available at http://github.com/statgen/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Alejandra Vergara-Lope ◽  
M. Reza Jabalameli ◽  
Clare Horscroft ◽  
Sarah Ennis ◽  
Andrew Collins ◽  
...  

Abstract. Quantification of linkage disequilibrium (LD) patterns in the human genome is essential for genome-wide association studies, selection signature mapping and studies of recombination. Whole genome sequence (WGS) data provides optimal source data for this quantification as it is free from biases introduced by the design of array genotyping platforms. The Malécot-Morton model of LD allows the creation of a cumulative map for each chromosome, analogous to an LD form of a linkage map. Here we report LD maps generated from WGS data for a large population of European ancestry, as well as populations of Baganda, Ethiopian and Zulu ancestry. We achieve high average genetic marker densities of 2.3–4.6/kb. These maps show good agreement with prior, low resolution maps and are consistent between populations. Files are provided in BED format to allow researchers to readily utilise this resource.

