Haplotype heterogeneity and low linkage disequilibrium reduce reliable prediction of genotypes for the ‑α3.7I form of α-thalassaemia using genome-wide microarray data

2021 ◽  
Vol 5 ◽  
pp. 287
Author(s):  
Carolyne M. Ndila ◽  
Vysaul Nyirongo ◽  
Alexander W. Macharia ◽  
Anna E. Jeffreys ◽  
Kate Rowlands ◽  
...  

Background: The -α3.7I-thalassaemia deletion is very common throughout Africa because it protects against malaria. When undertaking studies to investigate human genetic adaptations to malaria or other diseases, it is important to account for any confounding effects of α-thalassaemia to rule out spurious associations. Methods: In this study, we used direct α-thalassaemia genotyping to understand why GWAS data from a large malaria association study in Kilifi, Kenya, did not identify the α-thalassaemia signal. We then explored a number of new approaches to imputing α-thalassaemia from GWAS data as an alternative to direct genotyping by PCR. Results: We found very low linkage disequilibrium between the directly typed data and the GWAS SNP markers around α-thalassaemia and across the haemoglobin-alpha (HBA) gene region, which, along with a complex haplotype structure, could explain the lack of an association signal from the GWAS SNP data. Some indirect typing methods gave results that were in broad agreement with those derived from direct genotyping and could identify an association signal, but none were sufficiently accurate to allow correct interpretation compared with direct typing, leading to confusing or erroneous results. Conclusions: We conclude that, going forwards, direct typing methods such as PCR will still be required to account for α-thalassaemia in GWAS.
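To make the linkage-disequilibrium finding concrete, here is a minimal sketch of one common way to quantify LD between a directly typed variant and a nearby array SNP: the squared Pearson correlation of genotype dosages. The function name `genotype_r2` and the 0/1/2 coding (copies of the deletion or of the SNP minor allele) are assumptions made for illustration. The abstract does not state which LD measure or software the authors used.

```python
from statistics import mean


def genotype_r2(a, b):
    """Squared Pearson correlation between two genotype vectors.

    Both vectors are assumed to be coded 0/1/2 (allele copies per
    individual), in the same individual order. Hypothetical helper,
    not taken from the study.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A monomorphic marker carries no information about the other locus.
        return float("nan")
    return (cov * cov) / (var_a * var_b)
```

A value near 0 means the SNP says little about deletion status, which is the situation the abstract describes. A value near 1 would mean the SNP tags the deletion well.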


2020 ◽  
Author(s):  
Léa Boyrie ◽  
Corentin Moreau ◽  
Florian Frugier ◽  
Christophe Jacquet ◽  
Maxime Bonhomme

Abstract: The quest for genome-wide signatures of selection in populations using SNP data has proven efficient at uncovering genes involved in conserved or adaptive molecular functions, but none of the statistical methods were designed to identify interacting genes as targets of selective processes. Here, we propose a straightforward statistical test aimed at detecting epistatic selection, based on a linkage disequilibrium (LD) measure accounting for population structure and heterogeneous relatedness between individuals. SNP-based (Trv) and window-based (TcorPC1v) statistics fit a Student distribution, making it easy and quick to test the significance of correlation coefficients in the frame of Genome-Wide Epistatic Selection Scans (GWESS) using candidate genes as baits. As a proof of concept, use of SNP data from the Medicago truncatula symbiotic legume plant uncovered a previously unknown gene coadaptation between the MtSUNN (Super Numeric Nodule) receptor and the MtCLE02 (CLAVATA3-Like) signalling peptide, and experimental evidence accordingly supported a MtSUNN-dependent negative role of MtCLE02 in symbiotic root nodulation. Using human HGDP-CEPH SNP data, our new statistical test uncovered strong LD between SLC24A5 and EDAR worldwide, which persists after correction for population structure and relatedness in Central South Asian populations. This result suggests adaptive genetic interaction or coselection between skin pigmentation and the ectodysplasin pathway involved in the development of ectodermal organs (hairs, teeth, sweat glands), in some human populations. Applying this approach to genome-wide SNP data will foster the identification of evolutionary coadapted gene networks.
Author summary: Population genomic methods have allowed the identification of many genes associated with adaptive processes in populations with complex histories. However, they are not designed to identify coadaptation between genes through epistatic selection in structured populations. To tackle this problem, we developed a straightforward LD-based statistical test accounting for population structure and heterogeneous relatedness between individuals, using SNP-based (Trv) or window-based (TcorPC1v) statistics. This allows easy and quick testing of the significance of correlation coefficients between polymorphic loci in the frame of Genome-Wide Epistatic Selection Scans (GWESS). Following detection of gene coadaptation using SNP data from human and the model plant Medicago truncatula, we report experimental evidence of genetic interaction between two receptors involved in the regulation of root nodule symbiosis in Medicago truncatula. This test opens new avenues for exploring the evolution of genes as interacting units and thus paves the way to inferring new networks based on evolutionary coadaptation between genes.
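As a rough illustration of what testing a correlation coefficient against a Student distribution involves, the sketch below implements the textbook test for a plain Pearson correlation r from n samples: t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. This is not the authors' Trv or TcorPC1v statistic, which additionally corrects for population structure and relatedness. The function name `correlation_t_statistic` is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """Return (t, degrees_of_freedom) for a Pearson correlation r from n samples.

    Textbook formula only; no correction for population structure or
    relatedness is applied here (that correction is the point of the
    method described in the abstract).
    """
    if n < 3:
        raise ValueError("need at least 3 samples")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df
```

The resulting t would then be compared against a Student distribution with the returned degrees of freedom, for example with a statistics library of your choice.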


2018 ◽  
Author(s):  
Ya-Ping Lin ◽  
Chu-Yin Liu ◽  
Kai-Yi Chen

Abstract: To mine new favorable alleles for tomato breeding, we investigated the feasibility of using Solanum pimpinellifolium as a diverse panel for genome-wide association studies through the restriction site-associated DNA sequencing technique. Previous attempts to conduct genome-wide association studies using S. pimpinellifolium were impeded by an inability to correct for population stratification and by a lack of high-density markers to address the issue of rapid linkage disequilibrium decay. In the current study, a set of 24,330 SNPs was identified using 99 S. pimpinellifolium accessions from the Tomato Genetic Resource Center. Approximately 84% of PstI site-associated DNA sequencing regions were located in the euchromatic regions, resulting in the tagging of most SNPs on or near genes. Our genotypic data suggested that the optimum number of S. pimpinellifolium ancestral subpopulations was three, and accessions were classified into seven groups. In contrast to the SolCAP SNP genotypic data of previous studies, our SNP genotypic data consistently confirmed the population differentiation, achieving a relatively uniform correction of population stratification. Moreover, as expected, rapid linkage disequilibrium decay was observed in S. pimpinellifolium, especially in euchromatic regions. Approximately two-thirds of the flanking SNP markers did not display linkage disequilibrium. Our results suggest that a higher density of molecular markers and more accessions are required to conduct genome-wide association studies using the Solanum pimpinellifolium collection.


PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0248787
Author(s):  
Jhon A. Berdugo-Cely ◽  
Carolina Martínez-Moncayo ◽  
Tulio César Lagos-Burbano

Detailed knowledge of genetic parameters such as diversity, structure, and linkage disequilibrium (LD), and identification of duplicates in a germplasm bank and/or breeding collection, are essential to conservation and breeding strategies in any crop. Therefore, the potato genetic breeding collection at the Universidad de Nariño in Colombia, which is made up of diploid and tetraploid genotypes in two of the more diverse genebanks in the world, was analyzed with 8303 single nucleotide polymorphisms (SNPs) from SolCAP version 1. In total, 144 genotypes from this collection were analyzed, identifying 57.2% of the markers as polymorphic, which allowed establishing two and three subpopulations that differentiated the diploid genotypes from the tetraploids. These subpopulations had high levels of heterozygosity and linkage disequilibrium. The diversity levels were higher in the tetraploid genotypes, while the LD levels were higher in the diploid genotypes. For the tetraploids, the genotypes from Peru had greater diversity and lower linkage disequilibrium than those from Colombia, which had slightly lower diversity and higher degrees of LD. The genetic analysis identified, adjusted and/or selected diploid and tetraploid genotypes under the following characteristics: 1) errors in classification associated with the level of ploidy; 2) presence of duplicates; and 3) genotypes with broad genetic distances and potential use in controlled hybridization processes. These analyses suggested that the potato genetic breeding collection at the Universidad de Nariño has a genetic base with potential use in breeding programs for this crop in the Department of Nariño, in southern Colombia.


2018 ◽  
Author(s):  
David T. Ashton ◽  
Peter A. Ritchie ◽  
Maren Wellenreuther

Abstract: Characterizing the genetic variation underlying phenotypic traits is a central objective in biological research. This research has been hampered in the past by the limited genomic resources available for most non-model species. However, recent advances in sequencing technology and related genotyping methods are rapidly changing this. Here we report the use of genome-wide SNP data from the ecologically and commercially important marine fish species Chrysophrys auratus (snapper) to 1) construct the first linkage map for this species, 2) scan for growth QTLs, and 3) search for candidate genes in the surrounding QTL regions. The newly constructed linkage map contained ~11K SNP markers and is the densest map to date in the fish family Sparidae. Comparisons with available genome scaffolds indicated that overall marker placement was strongly correlated between the scaffolds and linkage map (R = 0.7), but at fine scales (< 5 cM) there were some precision limitations. Of the 24 linkage groups, which reflect the 24 chromosomes of this species, three were found to contain QTLs with genome-wide significance for growth-related traits. A scan for 13 known candidate growth genes located the genes for growth hormone, parvalbumin, and myogenin within 13.2, 2.6, and 5.0 cM of these genome-wide significant QTLs, respectively. The linkage map and QTLs found in this study will advance the investigation of genome structure and selective breeding in snapper.

