Characterizing Y-STRs in the Evaluation of Population Differentiation Using the Mean of Allele Frequency Difference between Populations

Genes ◽  
2020 ◽  
Vol 11 (5) ◽  
pp. 566
Author(s):  
Yuxiang Zhou ◽  
Yining Yao ◽  
Baonian Liu ◽  
Qinrui Yang ◽  
Zhihan Zhou ◽  
...  

Y-chromosomal short tandem repeats (Y-STRs) are widely used in human research for the evaluation of population substructure or population differentiation. Previous studies show that several haplotype sets can be used for the evaluation of population differentiation. However, little is known about whether each Y-STR in these sets performs well in this procedure. In this study, a total of 20,927 haplotypes of a Yfiler Plus set were collected from 41 global populations. Different configurations were observed in multidimensional scaling (MDS) plots based on pairwise genetic distances evaluated using a Yfiler set and a Yfiler Plus set, respectively. Subsequently, 23 single-copy Y-STRs were characterized in the evaluation of population differentiation using the mean of allele frequency difference (mAFD) between populations. Our results indicated that DYS392 had the largest mAFD value (0.3802) and YGATAH4 had the smallest value (0.1845). On the whole, the set with the top fifteen markers from these 23 single-copy Y-STRs yielded larger pairwise genetic distances and clearer clustering or separation of populations in the MDS plot than the set with the bottom fifteen markers. In conclusion, the mAFD value reliably characterizes the efficiency of Y-STRs in the evaluation of population differentiation.
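The per-marker mAFD ranking described in this abstract can be sketched in a few lines of Python. The abstract does not give the formula, so this sketch assumes that mAFD is the plain average, over all population pairs, of the absolute allele frequency difference (half the sum of per-allele absolute differences). The function names and the toy frequencies are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def afd(freq_a, freq_b):
    """Absolute allele frequency difference between two populations.

    Each argument maps allele label -> frequency at one marker.
    Alleles missing from a population are treated as frequency 0.
    """
    alleles = set(freq_a) | set(freq_b)
    return 0.5 * sum(abs(freq_a.get(a, 0.0) - freq_b.get(a, 0.0)) for a in alleles)


def mean_afd(pop_freqs):
    """Mean AFD over all unordered population pairs for one marker.

    Assumption: mAFD is an unweighted mean across pairs.
    """
    pairs = list(combinations(pop_freqs.values(), 2))
    if not pairs:
        raise ValueError("need at least two populations")
    return sum(afd(a, b) for a, b in pairs) / len(pairs)


# Hypothetical allele frequencies at a single Y-STR in three populations.
toy = {
    "pop_A": {"11": 0.50, "13": 0.50},
    "pop_B": {"11": 0.25, "13": 0.75},
    "pop_C": {"13": 0.25, "14": 0.75},
}

print(round(mean_afd(toy), 4))  # pairwise AFDs are 0.25, 0.75, 0.75
```

Running `mean_afd` once per marker and sorting the results would give a ranking of the kind the abstract reports, under the stated assumption.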

Genes ◽  
2019 ◽  
Vol 10 (4) ◽  
pp. 308 ◽  
Author(s):  
Berner

Measuring the magnitude of differentiation between populations based on genetic markers is commonplace in ecology, evolution, and conservation biology. The predominant differentiation metric used for this purpose is FST. Based on a qualitative survey, numerical analyses, simulations, and empirical data, I here argue that FST does not express the relationship to allele frequency differentiation between populations generally considered interpretable and desirable by researchers. In particular, FST (1) has low sensitivity when population differentiation is weak, (2) is contingent on the minor allele frequency across the populations, (3) can be strongly affected by asymmetry in sample sizes, and (4) can differ greatly among the available estimators. Together, these features can complicate pattern recognition and interpretation in population genetic and genomic analysis, as illustrated by empirical examples, and overall compromise the comparability of population differentiation among markers and study systems. I argue that a simple differentiation metric displaying intuitive properties, the absolute allele frequency difference AFD, provides a valuable alternative to FST. I provide a general definition of AFD applicable to both bi- and multi-allelic markers and conclude by making recommendations on the sample sizes needed to achieve robust differentiation estimates using AFD.
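For orientation, a multi-allelic AFD of the kind this abstract refers to is usually written as half the sum of per-allele absolute frequency differences. The symbols below (n alleles, f for frequencies, subscripts 1 and 2 for the two populations) are notation assumed here, not quoted from the paper.

```latex
\mathrm{AFD} = \frac{1}{2} \sum_{i=1}^{n} \left| f_{i,1} - f_{i,2} \right|
% Worked bi-allelic case: frequencies (0.7, 0.3) in population 1
% and (0.4, 0.6) in population 2 give
% AFD = (|0.7 - 0.4| + |0.3 - 0.6|) / 2 = 0.3,
% which equals the frequency difference of either allele.
```

Under this form, AFD is 0 when the two populations have identical allele frequencies and 1 when they share no alleles.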


Genes ◽  
2019 ◽  
Vol 10 (10) ◽  
pp. 810
Author(s):  
Berner

This note is to correct an error in my paper, concerning the Shannon differentiation metric (DShannon) (Reference [43] in the paper). The paper states that DShannon is undefined mathematically whenever one or both populations are monomorphic, that is, fixed for a single allele. Accordingly, the DShannon curve in Figure 1a, showing population differentiation in relation to allele counts for the case in which the pooled minor allele frequency (MAF) is maximal, did not extend across the full range of allele counts; the rightmost data point reflecting complete population differentiation was missing. Moreover, DShannon was completely missing in Figure 1b visualizing the continuum of allele frequency differentiation when the MAF is minimal (one population monomorphic across the entire allele count range).


2006 ◽  
Vol 51 (2) ◽  
pp. 436-437 ◽  
Author(s):  
Ronny Decorte ◽  
Elke Verhoeven ◽  
Elisabeth Vanhoutte ◽  
Katleen Knaepen ◽  
Jean-Jacques Cassiman

2019 ◽  
Author(s):  
Noora R. Al-Snan ◽  
Safia A. Messaoudi ◽  
Yahya M. Khubrani ◽  
Jon H. Wetton ◽  
Mark A. Jobling ◽  
...  

Abstract. Bahrain's location in the Arabian Gulf has contributed to the diversity of its indigenous population, descended from Christian Arabs, Persians (Zoroastrians), Jews, and Aramaic-speaking agriculturalists. The aim of this study was to examine population substructure within Bahrain using the 27 Y-STRs (short tandem repeats) in the Yfiler Plus kit and to characterize the haplotypes of 562 unrelated Bahraini nationals, sub-divided into four geographical regions: North, Capital, South and Muharraq. Yfiler Plus provided a significant improvement in discrimination capacity over the earlier 17-locus Yfiler kit, increasing it from 77% to 87.5%, but this value differed widely between regions, from 98.4% in Muharraq to 75.2% in the Northern region, an unusually low value that is possibly a consequence of the very rapid expansion in population size in the last 80 years. Clusters of closely related male lineages were seen, with only 79.4% of donors displaying unique haplotypes and 59% of instances of shared haplotypes occurring within, rather than between, regions. Haplogroup prediction indicated diverse origins of the population, with a predominance of haplogroups J2 and J1, both typical of the Arabian Peninsula, but also haplogroups such as B2 and E1b1a, originating in Africa, and H, L and R2, indicative of migration from the east. Haplogroup frequencies differed significantly between regions, with J2 significantly more common in the Northern region than in the Southern, possibly as a result of differential settlement by Baharna (descended from populations in which J2 predominates) and Arabs (both indigenous and migrant Huwala, who are expected to have a higher frequency of J1). Our study illustrates the importance of encompassing geographical and social variation when constructing population databases and the need for highly discriminating multiplexes where rapid expansions have occurred within tightly bounded populations.
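The two haplotype statistics quoted in this abstract can be illustrated with a short sketch. It assumes the conventional definitions: discrimination capacity is the number of distinct haplotypes divided by the sample size, and the share of donors with unique haplotypes is the fraction of donors whose haplotype occurs exactly once. The abstract does not state its formulas, and the function name and toy haplotypes are hypothetical.

```python
from collections import Counter


def haplotype_summary(haplotypes):
    """Summarise a list of haplotypes (each a hashable tuple of allele calls).

    Returns the discrimination capacity (distinct haplotypes / donors) and
    the fraction of donors carrying a haplotype seen only once.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    counts = Counter(haplotypes)
    distinct = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "discrimination_capacity": distinct / n,
        "unique_fraction": singletons / n,
    }


# Hypothetical two-locus haplotypes for four donors; two donors share one.
toy = [("14", "30"), ("14", "30"), ("15", "29"), ("16", "31")]
print(haplotype_summary(toy))
```

In the toy data, three distinct haplotypes among four donors give a discrimination capacity of 0.75, while only two of the four donors carry a haplotype that nobody else shares.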


2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Slim Karkar ◽  
Lauren E. Alfonse ◽  
Catherine M. Grgicak ◽  
Desmond S. Lun

Abstract. Background: In order to isolate an individual’s genotype from a sample of biological material, most laboratories use PCR and Capillary Electrophoresis (CE) to construct a genetic profile based on polymorphic loci known as Short Tandem Repeats (STRs). The resulting profile consists of a CE signal that contains information about the length and number of STR units amplified. For samples collected from the environment, interpretation of the signal can be challenging given that information regarding the quality and quantity of the DNA is often limited. The signal can be further compounded by the presence of noise and PCR artifacts such as stutter, which can mask or mimic biological alleles. Because manual interpretation methods cannot comprehensively account for such nuances, it would be valuable to develop a signal model that can effectively characterize the various components of STR signal independent of a priori knowledge of the quantity or quality of DNA. Results: First, we seek to mathematically characterize the quality of the profile by measuring changes in the signal with respect to amplicon size. Next, we examine the noise, allele, and stutter components of the signal and develop distinct models for each. Using cross-validation and model selection, we identify a model that can be effectively utilized for downstream interpretation. Finally, we show an implementation of the model in NOCIt, a software system that calculates the a posteriori probability distribution on the number of contributors. Conclusion: The model was selected using a large, diverse set of DNA samples obtained from 144 different laboratory conditions, with DNA amounts ranging from a single copy of DNA to hundreds of copies, and the quality of the profiles ranging from pristine to highly degraded. Implemented in NOCIt, the model enables a probabilistic approach to estimating the number of contributors to complex, environmental samples.


2019 ◽  
Vol 62 (1) ◽  
pp. 305-312
Author(s):  
Kairat Dossybayev ◽  
Zarina Orazymbetova ◽  
Aizhan Mussayeva ◽  
Naruya Saitou ◽  
Rakhymbek Zhapbasov ◽  
...  

Abstract. A total of 75 individuals from five sheep populations in Kazakhstan were investigated based on 12 STR (short tandem repeat, also known as microsatellite) markers in order to study their genetic structure and phylogenetic relationship based on genetic distances. These sheep had a high level of genetic diversity. In total, 163 alleles were found in all the populations using 12 microsatellite loci. The mean number of alleles, effective number of alleles, and polymorphism information content (PIC) values per locus were 13.4, 5.9, and 0.78, respectively. Comparing the allelic diversity between the populations, the highest genetic diversity was observed in the Edilbay-1 sheep breed (8.333±0.644), and the lowest in Kazakh Arkhar-Merino (7.083±0.633). In all populations, there was a deficiency of heterozygosity. The largest genetic diversity was found at loci INRA023 and CSRD247 with 16 alleles, and the smallest polymorphism was noted for the locus D5S2 with 8 alleles. The level of observed heterozygosity ranged from 0.678±0.051 for Kazakh Arkhar-Merino to 0.767±0.047 for Kazakh fat-tailed coarse wool. The expected heterozygosity ranged from 0.702±0.033 for Kazakh Arkhar-Merino to 0.777±0.023 for Edilbay-1. Among the 12 microsatellite loci compared, the OarFCB20 locus showed the highest level of genetic variability. Excess of heterozygosity was observed at three loci: MAF065, McM042, and OarFCB20. Using Nei's standard genetic distance, the highest genetic distance was observed between Kazakh Arkhar-Merino and Edilbay-1, whereas the genetic distance between Edilbay-1 and Edilbay-2 was the smallest. The Edilbay-1 sheep breed possesses the largest genetic diversity among these five populations.
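Two of the per-locus statistics named in this abstract, expected heterozygosity and PIC, can be computed directly from allele frequencies. The sketch below assumes the textbook forms: expected heterozygosity as one minus the sum of squared allele frequencies (without small-sample correction), and PIC in the form given by Botstein and colleagues. The abstract does not say which estimators the authors used, and the function names and example frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """1 - sum(p_i^2), the uncorrected expected heterozygosity at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content at one locus.

    PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2.
    """
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross


# Hypothetical locus with four equally frequent alleles.
freqs = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(freqs))  # 0.75
print(pic(freqs))                      # 0.703125
```

PIC is never larger than expected heterozygosity for the same frequencies, because it subtracts an extra term for matings that are uninformative about allele transmission.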


2020 ◽  
pp. 002580242096500
Author(s):  
Supakit Khacha-ananda ◽  
Phatcharin Mahawong

Short tandem repeats (STRs) are widely used as DNA markers in paternity testing and criminal investigations because of their high genetic polymorphism among individuals in a population. However, many factors influence genetic variation in STRs. Therefore, understanding STR information within individual populations could provide a database and scientifically reliable STR genotyping for forensic genetic purposes. We aimed to examine allele frequencies of X-STRs, including some forensic parameters, in a northern Thai population. A retrospective descriptive study was conducted by collecting X-STR data from unrelated individuals living in a northern region of Thailand. The allele frequency and forensic parameters – for example polymorphism information content (PIC), power of discrimination in females and males (PDf and PDm), mean exclusion chance (MEC) and haplotype frequency – were calculated. The Hardy–Weinberg equilibrium was analysed. A total of 132 alleles were observed, with corresponding allele frequencies ranging from 0.0064 to 0.4904. The PIC of all loci was >0.6, representing high genetic polymorphism, except for DXS8378 and DXS7423. Notably, DXS10135 was the most diverse locus, with the highest PD and MEC, while DXS7423 was the least polymorphic marker, with the lowest PD and MEC. The highest haplotype diversity in male data was in linkage group III (DXS10101-DXS10103-HPRTB), at 0.9895. The genetic distance analysis demonstrated that the northern Thai population had a close relationship with the Taiwanese population (DA = 0.023). No significant deviations from Hardy–Weinberg equilibrium were observed, except at DXS10148. This study has established a northern Thai X-STR reference database to be used as a tool for forensic genetic purposes.

