Commonly used Hardy-Weinberg equilibrium filtering schemes impact population structure inferences using RADseq data

2021 ◽  
Author(s):  
William S Pearman ◽  
Lara Urban ◽  
Alana Alexander

Reduced representation sequencing (RRS) is a widely used method to assay the diversity of genetic loci across the genome of an organism. The dominant class of RRS approaches assays loci associated with restriction sites within the genome (restriction site associated DNA sequencing, or RADseq). RADseq is frequently applied to non-model organisms since it enables population genetic studies without relying on well-characterized reference genomes. However, RADseq requires the use of many bioinformatic filters to ensure the quality of genotyping calls. These filters can have direct impacts on population genetic inference, and therefore require careful consideration. One widely used filtering approach is the removal of loci that do not conform to expectations of Hardy-Weinberg equilibrium (HWE). Despite its wide use, we show that this filtering approach is rarely described in sufficient detail to enable replication. Furthermore, through analyses of in silico and empirical datasets we show that some of the most widely used HWE filtering approaches dramatically impact inference of population structure. In particular, the removal of loci exhibiting departures from HWE after pooling across samples significantly reduces the degree of inferred population structure within a dataset. Based on these results, we provide recommendations for best practice regarding the implementation of HWE filtering for RADseq datasets.
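The effect of pooling described above can be illustrated with a minimal sketch. It assumes a single biallelic locus, a plain 1-df chi-square goodness-of-fit test, and two made-up populations whose genotype counts are invented for illustration; it is not the filtering pipeline used in the study. Each population is in exact HWE on its own, yet the pooled sample shows a heterozygote deficit (the Wahlund effect), so a pooled filter would discard precisely the locus that carries the structure signal.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """1-df chi-square statistic for HWE at a biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Critical value of the chi-square distribution, 1 df, alpha = 0.05.
CRITICAL_1DF_005 = 3.841

# Hypothetical genotype counts (AA, AB, BB); not taken from the study.
pop_a = (81, 18, 1)    # allele A frequency 0.9, exact HWE proportions
pop_b = (1, 18, 81)    # allele A frequency 0.1, exact HWE proportions
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

for label, counts in (("pop A", pop_a), ("pop B", pop_b), ("pooled", pooled)):
    stat = hwe_chi_square(*counts)
    verdict = "fails" if stat > CRITICAL_1DF_005 else "passes"
    print(f"{label}: chi2 = {stat:.2f}, {verdict} the HWE filter")
```

With these counts the per-population statistics are essentially zero, while the pooled statistic is about 81.9. Real filters typically use an exact test and a multiple-testing correction, which this sketch omits.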

2017 ◽  
Author(s):  
Wei Hao ◽  
John D. Storey

Abstract Testing for Hardy-Weinberg equilibrium (HWE) is an important component in almost all analyses of population genetic data. Genetic markers that violate HWE are often treated as special cases; for example, they may be flagged as possible genotyping errors or they may be investigated more closely for evolutionary signatures of interest. The presence of population structure is one reason why genetic markers may fail a test of HWE. This is problematic because almost all natural populations studied in the modern setting show some degree of structure. Therefore, it is important to be able to detect deviations from HWE for reasons other than structure. To this end, we extend statistical tests of HWE to allow for population structure, which we call a test of “structural HWE” (sHWE). Additionally, our new test allows one to automatically choose tuning parameters and identify accurate models of structure. We demonstrate our approach on several important studies, provide theoretical justification for the test, and present empirical evidence for its utility. We anticipate the proposed test will be useful in a broad range of analyses of genome-wide population genetic data.


2018 ◽  
Vol 63 (No. 11) ◽  
pp. 462-472
Author(s):  
Anna Stachurska ◽  
Antoni Brodacki ◽  
Marta Liss

The objective of this study was to estimate the frequencies of alleles that produce coat colour in the Polish Coldblood horse population, and to verify the hypothesis that coat colour is not considered in its selection. The analysis included 35 928 horses and their parents registered in the studbook over half a century. Allele frequencies at the Agouti (A), Extension (E), Dun (D), Roan (Rn), and Grey (G) loci, in parental and offspring generations, were estimated from test matings and from the square root of the recessive phenotype frequency. The population is in Hardy–Weinberg equilibrium only at the E locus, and coat colour is taken into account by breeders. Black horses are favoured. Higher E locus homozygosity in blacks than in bays makes it easier to obtain black foals. Dun-diluted, roan and grey coat colours are undesirable, and the population has come to consist almost uniformly of basic coat colours. These results show the importance of studies on population genetic structure, which, despite the absence of formal criteria for breeding for colour, can change considerably through generations.
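The square-root estimator mentioned in this abstract can be sketched as follows. Under HWE the recessive phenotype appears at frequency q², so q is estimated as the square root of that frequency. The function name and the counts are hypothetical, chosen only for illustration. The estimator is valid only if the locus is actually in HWE, which is the very assumption the study tests.

```python
import math


def recessive_allele_frequency(n_recessive, n_total):
    """Estimate q as sqrt(recessive phenotype frequency), assuming HWE."""
    if n_total <= 0 or not 0 <= n_recessive <= n_total:
        raise ValueError("counts must satisfy 0 <= n_recessive <= n_total, n_total > 0")
    return math.sqrt(n_recessive / n_total)


# Hypothetical counts: 360 recessive-phenotype animals out of 1000.
q = recessive_allele_frequency(360, 1000)
p = 1.0 - q
carriers = 2 * p * q   # expected heterozygote frequency under HWE

print(f"q = {q:.3f}, p = {p:.3f}, expected carriers = {carriers:.3f}")
```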


2015 ◽  
Vol 63 (4) ◽  
pp. 275
Author(s):  
Andrea Bertram ◽  
P. Joana Dias ◽  
Sherralee Lukehurst ◽  
W. Jason Kennington ◽  
David Fairclough ◽  
...  

Bight redfish, Centroberyx gerrardi, is a demersal teleost endemic to continental shelf and upper slope waters of southern Australia. Throughout most of its range, C. gerrardi is targeted by a number of separately managed commercial and recreational fisheries across several jurisdictions. However, it is currently unknown whether stock assessments and management for this shared resource are being conducted at appropriate spatial scales, a question that requires knowledge of population structure and connectivity. To investigate population structure and connectivity, we developed 16 new polymorphic microsatellite markers using 454 shotgun sequencing. Two to 15 alleles per locus were detected. There was no evidence of linkage disequilibrium between pairs of loci and all loci except one were in Hardy–Weinberg equilibrium. Cross-amplification trials in the congeneric C. australis and C. lineatus revealed that 11 and 16 loci are potentially useful, respectively. However, deviations from Hardy–Weinberg equilibrium and linkage disequilibrium between pairs of loci were detected at several of the 16 markers for C. australis, and therefore the number of markers useful for population genetic analyses with C. australis is likely considerably lower than 11.


Genetics ◽  
2021 ◽  
Author(s):  
Alan M Kwong ◽  
Thomas W Blackwell ◽  
Jonathon LeFaive ◽  
Mariza de Andrade ◽  
John Barnard ◽  
...  

Abstract Traditional Hardy–Weinberg equilibrium (HWE) tests (the χ2 test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in data sets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and to evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence data sets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false-positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently among the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.


2020 ◽  
Author(s):  
Alan M. Kwong ◽  
Thomas W. Blackwell ◽  
Jonathon LeFaive ◽  
Mariza de Andrade ◽  
John Barnard ◽  
...  

Abstract Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ2 test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in datasets comprised of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence datasets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently amongst the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.


2019 ◽  
Author(s):  
Noora R. Al-Snan ◽  
Safia Messaoudi ◽  
Saranya R. Babu ◽  
Moiz Bakhiet

Abstract

Introduction: Bahrain’s population consists mainly of Arabs, Baharna and Persians, which makes Bahrain ethnically diverse. Exploring the ethnic origin and genetic structure of the Bahraini population is fundamental, mainly in the fields of population genetics and forensic science.

Aim: The purpose of the study was to conduct genetic studies in the population of Bahrain to assist in the interpretation of DNA-based forensic evidence and in the construction of appropriate databases.

Materials and Methods: Twenty-four short tandem repeats in the GlobalFiler™ PCR Amplification kit, including 21 autosomal STR loci and three gender determination loci, were amplified to characterize different genetic and forensic population parameters in a cohort of 543 unrelated healthy Bahraini men. Samples were collected during the year 2017.

Results: Genotyping of the 21 autosomal STRs showed that most loci were in Hardy-Weinberg Equilibrium (HWE), except for three markers, D3S1358, D19S433 and D5S818, which deviated from HWE. We also found no significant linkage disequilibrium (LD) between pairs of STR loci in the Bahraini population, except for D3S1358-CSF1PO, CSF1PO-SE33, D19S433-D12S391, FGA-D2S1338, FGA-SE33, FGA-D7S820 and D7S820-SE33. The SE33 locus was the most polymorphic in the studied population and the TH01 locus was the least polymorphic. Allele 8 in TPOX had the highest allele frequency, 0.496. The SE33 locus showed the highest power of discrimination (PD) in the Bahraini population, whereas TPOX showed the lowest PD value. The 21 autosomal STRs showed a combined match probability (CMP) of 4.5633E-27 and a combined power of discrimination (CPD) of 99.99999999%. Off-ladders and tri-allelic variants were observed in various samples at the D12S391, SE33 and D22S1045 loci.

Conclusion: Our study indicated that the twenty-one autosomal STRs are highly polymorphic in the Bahraini population and can be used as a powerful tool in forensics and population genetic analyses, including paternity testing and familial DNA searching.

Author Summary: The Kingdom of Bahrain is a country of 33 islands located in the Arabian Peninsula. The location of Bahrain has affected the diversity of its population, which is mainly divided into four main ethnic groups, including Arabs, Baharna and Persians. Genetic studies on the Bahraini population are very limited and little has been done to characterize population structure within the Kingdom of Bahrain. Here, we used the 21 autosomal STRs included in the GlobalFiler™ Amplification Kit to amplify DNA from 543 unrelated males from the Bahraini population. We conducted statistical analysis using two main software packages, STRAF and GenAlEx. Different forensic and population parameters were obtained to characterize the Bahraini population. Some of the significant results were the following: most of the loci were in Hardy-Weinberg Equilibrium, and the most polymorphic and informative marker was SE33. Allele 8 in TPOX had the highest allele frequency in the studied population. We also found some rare variants that were recorded in the STRBase website. The Bahraini population was also compared to the genetic structure of the region. Our study indicated the usefulness of the 21 autosomal STRs in the GlobalFiler™ kit for establishing databases, analyzing paternity and reviewing DNA-based evidence.
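The forensic parameters reported in this abstract (PD, CMP, CPD) can be sketched with their usual textbook definitions: the per-locus match probability is the sum of squared genotype frequencies, PD is one minus that value, CMP is the product of per-locus match probabilities (assuming independent loci), and CPD is one minus CMP. The locus names and genotypes below are toy values invented for illustration, and the functions are not claimed to reproduce exactly what STRAF or GenAlEx computes.

```python
from collections import Counter
from math import prod


def match_probability(genotypes):
    """Sum of squared genotype frequencies at one locus (unordered genotypes)."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = sum(counts.values())
    return sum((c / n) ** 2 for c in counts.values())


def combined_statistics(per_locus_genotypes):
    """Return (CMP, CPD), assuming the loci are statistically independent."""
    cmp_value = prod(match_probability(g) for g in per_locus_genotypes.values())
    return cmp_value, 1.0 - cmp_value


# Toy data: two hypothetical loci, four individuals each (allele pairs).
toy = {
    "locus_1": [(8, 8), (8, 11), (11, 8), (11, 11)],
    "locus_2": [(6, 7), (6, 7), (7, 9), (9, 9)],
}

for name, genotypes in toy.items():
    pm = match_probability(genotypes)
    print(f"{name}: match probability = {pm}, PD = {1.0 - pm}")

cmp_value, cpd = combined_statistics(toy)
print(f"CMP = {cmp_value}, CPD = {cpd}")
```

The independence assumption behind the product is why the LD results matter: locus pairs in significant LD would make a simple product overstate discriminating power.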


2010 ◽  
Vol 171 (8) ◽  
pp. 932-941 ◽  
Author(s):  
R. Moonesinghe ◽  
A. Yesupriya ◽  
M. h. Chang ◽  
N. F. Dowling ◽  
M. J. Khoury ◽  
...  

2018 ◽  
Author(s):  
Jonas Meisner ◽  
Anders Albrechtsen

Abstract Testing for Hardy-Weinberg Equilibrium (HWE) is a common practice for quality control in genetic studies. Variable sites violating HWE may be identified as technical errors in the sequencing or genotyping process, or they may be of special evolutionary interest. Large-scale genetic studies based on next-generation sequencing (NGS) methods have become more prevalent as cost is decreasing but these methods are still associated with statistical uncertainty. The large-scale studies usually consist of samples from diverse ancestries that make the existence of some degree of population structure almost inevitable. Precautions are therefore needed when analyzing these datasets, as population structure causes deviations from HWE. Here we propose a method that takes population structure into account in the testing for HWE, such that other factors causing deviations from HWE can be detected. We show the effectiveness of our method in NGS data, as well as in genotype data, for both simulated and real datasets, where the use of genotype likelihoods enables us to model the uncertainty for low-depth sequencing data.


HortScience ◽  
2010 ◽  
Vol 45 (1) ◽  
pp. 148-149
Author(s):  
Yuan Huang ◽  
Xue-qin Wang ◽  
Chun-yan Yang ◽  
Chun-lin Long

Primula amethystina Franchet. is a beautiful perennial herbaceous plant locally endemic to the alpine area in southwest China. We isolated and characterized 11 polymorphic microsatellite primer pairs from this species. The number of alleles ranged from two to five. The observed and expected heterozygosities (HO and HE) ranged from 0.25 to 0.875 and from 0.223 to 0.691, respectively. Six loci deviated significantly from Hardy-Weinberg equilibrium as a result of heterozygote deficiency. These markers will have great potential to reveal the genetic population structure and genetic diversity of P. amethystina.
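Observed and expected heterozygosity, and the heterozygote deficiency they can reveal, can be sketched as below. HO is the fraction of heterozygous individuals, HE is one minus the sum of squared allele frequencies (the uncorrected form, without a small-sample correction), and F = 1 - HO/HE is positive when heterozygotes are deficient. The genotypes are invented for illustration and are not data from the study.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (HO, HE) for one locus from a list of diploid genotypes."""
    n = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    return h_obs, h_exp


# Hypothetical genotypes at one microsatellite locus (alleles coded 1 and 2).
genotypes = [(1, 1), (1, 1), (2, 2), (2, 2), (1, 2), (1, 1), (2, 2), (1, 2)]

h_obs, h_exp = heterozygosity(genotypes)
f_is = 1.0 - h_obs / h_exp   # positive value indicates heterozygote deficiency

print(f"HO = {h_obs}, HE = {h_exp}, F = {f_is}")
```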


2019 ◽  
Vol 112 (5) ◽  
pp. 2362-2368
Author(s):  
Yan Liu ◽  
Lei Chen ◽  
Xing-Zhi Duan ◽  
Dian-Shu Zhao ◽  
Jing-Tao Sun ◽  
...  

Abstract Deciphering genetic structure and inferring migration routes of insects with high migratory ability have been challenging, due to weak genetic differentiation and the limited resolution offered by traditional genotyping methods. Here, we tested the ability of double digest restriction-site associated DNA sequencing (ddRADseq)-based single nucleotide polymorphisms (SNPs) to reveal population structure relative to 13 microsatellite markers, using four small brown planthopper populations as subjects. Using ddRADseq, we identified 230,000 RAD loci and 5,535 SNP sites, which were present in at least 80% of individuals across the four populations with a minimum sequencing depth of 10. Our results show that this large SNP panel is more powerful than traditional microsatellite markers in revealing fine-scale population structure among the small brown planthopper populations. In contrast to the mixed population structure suggested by microsatellites, discriminant analysis of principal components (DAPC) of the SNP dataset clearly separated the individuals into four geographic populations. Our results also suggest that DAPC is more powerful than principal component analysis (PCA) in resolving the population genetic structure of highly migratory taxa, probably due to the advantages of DAPC in using more genetic variation and the discriminant analysis function. Together, these results point to ddRADseq being a promising approach for population genetic and migration studies of the small brown planthopper.

