Genome-wide patterns of population structure and linkage disequilibrium in farmed Nile tilapia (Oreochromis niloticus)

2019 ◽  
Author(s):  
Grazyella M. Yoshida ◽  
Agustín Barria ◽  
Katharina Correa ◽  
Giovanna Cáceres ◽  
Ana Jedlicki ◽  
...  

Abstract: Nile tilapia (Oreochromis niloticus) is one of the most widely produced farmed fish in the world and represents an important source of protein for human consumption. Farmed Nile tilapia populations are increasingly based on genetically improved stocks, which have been established from admixed populations. To date, there is scarce information about the population genomics of farmed Nile tilapia, as assessed by dense single nucleotide polymorphism (SNP) panels. The patterns of linkage disequilibrium (LD) may affect the success of genome-wide association studies (GWAS) and genomic selection and can also provide key information about the demographic history of farmed Nile tilapia populations. The objectives of this study were to provide further knowledge about the population structure and LD patterns, as well as to estimate the effective population size (Ne), for three farmed Nile tilapia populations, one from Brazil (POP A) and two from Costa Rica (POP B and POP C). A total of 55, 56 and 57 individuals from POP A, POP B and POP C, respectively, were genotyped using a 50K SNP panel selected from a whole-genome sequencing (WGS) experiment. Two principal components explained about 20% of the total variation and clearly discriminated between the three populations. Population genetic structure analysis showed evidence of admixture, especially for POP C. The contemporary Ne values, calculated based on LD values, ranged from 71 to 141. No differences were observed in the LD decay among populations, with a rapid decrease of r² with increasing inter-marker distance. Average r² between adjacent SNP pairs ranged from 0.03 to 0.18, 0.03 to 0.17 and 0.03 to 0.16 for POP A, POP B and POP C, respectively. Based on the number of independent chromosome segments in the Nile tilapia genome, at least 4.2K SNPs are required for the implementation of GWAS and genomic selection in farmed Nile tilapia populations.
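The abstract does not say which LD-based estimator produced the Ne values of 71 to 141. As a rough illustration only, the sketch below uses Sved's (1971) approximation, E[r²] ≈ 1/(1 + 4·Ne·c), solved for Ne. The function name `ne_from_r2`, the example inputs, and the omission of any sample-size correction are assumptions made here, not details taken from the study.

```python
def ne_from_r2(r2: float, c_morgans: float) -> float:
    """Estimate effective population size from mean r^2 between marker pairs.

    Rearranges Sved's approximation E[r^2] ~= 1 / (1 + 4 * Ne * c), where c is
    the recombination distance between the markers in Morgans.
    Illustrative only: no correction for finite sample size is applied.
    """
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must lie in (0, 1]")
    if c_morgans <= 0.0:
        raise ValueError("c_morgans must be positive")
    return (1.0 / r2 - 1.0) / (4.0 * c_morgans)


# Hypothetical inputs: mean r^2 of 0.2 for marker pairs about 1 cM apart.
example_ne = ne_from_r2(0.2, 0.01)
```

Because c appears in the denominator, pairs at short genetic distances reflect older population sizes and pairs at long distances reflect more recent ones, which is why LD-based Ne is usually reported per distance bin.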

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Dhriti Sengupta ◽  
Ananyo Choudhury ◽  
Cesar Fortes-Lima ◽  
Shaun Aron ◽  
...  

Abstract: South Eastern Bantu-speaking (SEB) groups constitute more than 80% of the population in South Africa. Despite clear linguistic and geographic diversity, the genetic differences between these groups have not been systematically investigated. Based on genome-wide data of over 5000 individuals, representing eight major SEB groups, we provide strong evidence for fine-scale population structure that broadly aligns with geographic distribution and is also congruent with linguistic phylogeny (separation of Nguni, Sotho-Tswana and Tsonga speakers). Although differential Khoe-San admixture plays a key role, the structure persists after Khoe-San ancestry-masking. The timing of admixture, levels of sex-biased gene flow and population size dynamics also highlight differences in the demographic histories of individual groups. The comparisons with five Iron Age farmer genomes further support genetic continuity over ~400 years in certain regions of the country. Simulated trait genome-wide association studies further show that the observed population structure could have major implications for biomedical genomics research in South Africa.


2019 ◽  
Author(s):  
Andréa Carla Bastos Andrade ◽  
José Marcelo Soriano Viana ◽  
Helcio Duarte Pereira ◽  
Vitor Batista Pinto ◽  
Fabyano Fonseca e Silva

Abstract: Linkage disequilibrium (LD) analysis provides information on evolutionary aspects of populations and allows populations and single nucleotide polymorphisms (SNPs) to be selected for association studies. Recently, haplotype blocks have been used to increase the power of quantitative trait loci detection in genome-wide association studies and the prediction accuracy of genomic selection. The objectives of this study were to compare the degree of LD, the LD decay, the LD decay extent, and the number and length of haplotype blocks in the populations, and to elaborate the first LD map for maize, to elucidate whether the maize chromosomes also have a pattern of interspaced regions of high and low rates of recombination. We used a biparental temperate population, a tropical synthetic, and a tropical breeding population, genotyped for approximately 75,000 SNPs. The level of LD expressed by the r² values is surprisingly low (0.02, 0.04, and 0.04), but comparable to some non-isolated human populations. The general evidence is that the synthetic is the population with higher LD. No significant advantage is expected from haplotype-based association studies or from genomic selection across generations, owing to the small number of SNPs in the haplotype blocks (2 to 3). The results concerning LD decay (rapid decay after 5-10 kb) and LD decay extent (up to 300 kb) are in the range observed with maize inbred line panels. Our most important result is that maize chromosomes had a pattern of regions of extensive LD interspaced with regions of low LD. However, our simple simulated LD map provides evidence that this pattern can reflect regions with differences in allele frequencies and LD level (expressed by D’) rather than regions with high and low rates of recombination.
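The abstract reports LD as both r² and D’ and notes that apparent LD patterns can track allele-frequency differences. A minimal sketch of the standard two-locus definitions follows, assuming biallelic loci and known haplotype frequencies; the function name `ld_stats` and the example frequencies are hypothetical and are not taken from the maize data.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (r2, |D'|) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly in (0, 1))
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    if d > 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d != 0 else 0.0
    return r2, d_prime


# Hypothetical case: a rare allele B found only on A-carrying haplotypes.
r2_rare, dprime_rare = ld_stats(p_ab=0.1, p_a=0.5, p_b=0.1)
```

In this made-up case |D’| reaches its maximum while r² stays low, which illustrates how the two measures can diverge when allele frequencies differ between loci.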


2019 ◽  
Vol 10 ◽  
Author(s):  
Grazyella M. Yoshida ◽  
Agustín Barria ◽  
Katharina Correa ◽  
Giovanna Cáceres ◽  
Ana Jedlicki ◽  
...  

2020 ◽  
Author(s):  
Dhriti Sengupta ◽  
Ananyo Choudhury ◽  
Cesar Fortes-Lima ◽  
Shaun Aron ◽  
Gavin Whitelaw ◽  
...  



2017 ◽  
Author(s):  
Takafumi Katsumura ◽  
Shoji Oda ◽  
Mitani Hiroshi ◽  
Hiroki Oota

Abstract: Medaka is a model organism in medicine, genetics, developmental biology and population genetics. Lab stocks composed of more than 100 local wild populations are available for research in these fields. Thus, medaka represents a potentially excellent bioresource for screening disease-risk- and adaptation-related genes in genome-wide association studies. Although the genetic population structure should be known before performing such an analysis, a comprehensive study on the genome-wide diversity of wild medaka populations has not been performed. Here, we performed genotyping-by-sequencing (GBS) for 81 and 12 medakas captured from a bioresource and the wild, respectively. Based on the GBS data, we evaluated the genetic population structure and estimated the demographic parameters using an approximate Bayesian computation (ABC) framework. The autosomal data confirmed that there were substantial differences between local populations and supported our previously proposed hypothesis on medaka dispersal based on mitochondrial genome (mtDNA) data. A new finding was that a local group that was thought to be a hybrid between the northern and the southern Japanese groups was actually a sister group of the northern Japanese group. Thus, this paper presents the first population-genomic study of medaka and reveals its population structure and history based on autosomal diversity.


2021 ◽  
pp. 1-11
Author(s):  
Valentina Escott-Price ◽  
Karl Michael Schmidt

Background: Genome-wide association studies (GWAS) were successful in identifying SNPs showing association with disease, but their individual effect sizes are small and require large sample sizes to achieve statistical significance. Methods of post-GWAS analysis, including gene-based, gene-set and polygenic risk scores, combine the SNP effect sizes in an attempt to boost the power of the analyses. To avoid giving undue weight to SNPs in linkage disequilibrium (LD), the LD needs to be taken into account in these analyses. Objectives: We review methods that attempt to adjust the effect sizes (β-coefficients) of summary statistics, instead of simple LD pruning. Methods: We subject LD adjustment approaches to a mathematical analysis, recognising Tikhonov regularisation as a framework for comparison. Results: Observing the similarity of the processes involved with the more straightforward Tikhonov-regularised ordinary least squares estimate for multivariate regression coefficients, we note that current methods based on a Bayesian model for the effect sizes effectively provide an implicit choice of the regularisation parameter, which is convenient, but at the price of reduced transparency and, especially in smaller LD blocks, a risk of incomplete LD correction. Conclusions: There is no simple answer to the question which method is best, but where interpretability of the LD adjustment is essential, as in research aiming at identifying the genomic aetiology of disorders, our study suggests that a more direct choice of mild regularisation in the correction of effect sizes may be preferable.
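To make the Tikhonov-regularised estimate concrete, here is a minimal two-SNP sketch. It assumes standardised genotypes and phenotype, so that the joint effects solve (R + λI)β = b, where R is the LD correlation matrix and b holds the marginal GWAS effect sizes. The function name `adjust_two_snps`, the closed-form 2×2 solve, and the example numbers are illustrative assumptions and do not come from the paper.

```python
def adjust_two_snps(b1: float, b2: float, r: float, lam: float) -> tuple[float, float]:
    """Solve (R + lam * I) beta = (b1, b2) for R = [[1, r], [r, 1]].

    b1, b2: marginal (single-SNP) effect sizes
    r: LD correlation between the two SNPs
    lam: Tikhonov (ridge) regularisation parameter, lam >= 0
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    d = 1.0 + lam                # diagonal entry of R + lam * I
    det = d * d - r * r          # determinant of the 2x2 matrix
    if det <= 0:
        raise ValueError("matrix is singular; increase lam or check r")
    # Explicit inverse of [[d, r], [r, d]] applied to (b1, b2).
    return (d * b1 - r * b2) / det, (d * b2 - r * b1) / det


# Hypothetical case: the second SNP's marginal effect equals r times the first,
# so with lam = 0 the LD-adjusted effect of the second SNP is essentially zero.
beta_no_reg = adjust_two_snps(0.10, 0.08, r=0.8, lam=0.0)
beta_mild = adjust_two_snps(0.10, 0.08, r=0.8, lam=0.1)
```

With λ = 0 the solve needs |r| < 1; a small positive λ keeps the system well conditioned when two SNPs are in near-perfect LD, at the cost of leaving some LD-induced signal in the adjusted effects.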


Agronomy ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 2006
Author(s):  
David P. Horvath ◽  
Michael Stamm ◽  
Zahirul I. Talukder ◽  
Jason Fiedler ◽  
Aidan P. Horvath ◽  
...  

A diverse population (429 members) of canola (Brassica napus L.) consisting primarily of winter biotypes was assembled and used in genome-wide association studies. Genotyping-by-sequencing analysis of the population identified and mapped 290,972 high-quality markers, with 18.5 to 82.4% missing markers per line and an average of 36.8%. After interpolation, 251,575 high-quality markers remained. After filtering out markers with low minor allele counts (retaining those with count > 5), we were left with 190,375 markers. The average distance between these markers is 4463 bases, with a median of 69 and a range from 1 to 281,248 bases. The heterozygosity among the imputed population ranges from 0.9 to 11.0%, with an average of 5.4%. The filtered and imputed dataset was used to determine population structure and kinship, which indicated that the population had minimal structure, with a best K value of 2–3. These results also indicated that the majority of the population has substantial sequence from a single population, with sub-clusters of, and admixtures with, a very small number of other populations. Chromosomal linkage disequilibrium decay ranged from ~7 kb for chromosome A01 to ~68 kb for chromosome C01. Local linkage decay rates determined for all 500 kb windows with a 10 kb sliding step showed a wide range of linkage disequilibrium decay rates, indicating numerous crossover hotspots within this population, and provide a resource for determining the likely limits of linkage disequilibrium around any given marker within which to identify candidate genes. This population and the resources provided here should serve as helpful tools for investigating genetics in winter canola.
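The 500 kb windows with a 10 kb sliding step described above can be enumerated as in the sketch below. The function name `sliding_windows`, the half-open base-pair coordinates, and the choice to drop a final partial window are assumptions made for illustration; the paper's own windowing code is not shown in the abstract.

```python
from typing import Iterator


def sliding_windows(chrom_length: int,
                    window: int = 500_000,
                    step: int = 10_000) -> Iterator[tuple[int, int]]:
    """Yield half-open (start, end) windows in base pairs along one chromosome.

    Only full-length windows are produced; a trailing partial window is dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, chrom_length - window + 1, step):
        yield start, start + window


# Hypothetical 520 kb chromosome: three full windows fit.
example_windows = list(sliding_windows(520_000))
```

A local LD decay rate would then be fitted to the marker pairs falling inside each window, so that consecutive estimates overlap heavily and change smoothly along the chromosome.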

