CSYseq: The first Y-chromosome sequencing tool typing a large number of Y-SNPs and Y-STRs to unravel worldwide human population genetics

PLoS Genetics ◽  
2021 ◽  
Vol 17 (9) ◽  
pp. e1009758
Author(s):  
Sofie Claerhout ◽  
Paulien Verstraete ◽  
Liesbeth Warnez ◽  
Simon Vanpaemel ◽  
Maarten Larmuseau ◽  
...  

Male-specific Y-chromosome (chrY) polymorphisms are informative components of the DNA for population genetics. While single nucleotide polymorphisms (Y-SNPs) indicate distant evolutionary ancestry, short tandem repeats (Y-STRs) are able to identify close familial kinships. Detailed chrY analysis thus provides both biogeographical background information and paternal lineage identification. The rapid advancement of high-throughput massive parallel sequencing (MPS) technology in the past decade has revolutionized genetic research. Using MPS, single-base information of both Y-SNPs and Y-STRs can be analyzed in a single assay that types multiple samples at once. In this study, we present the first extensive chrY-specific targeted resequencing panel, the ‘CSYseq’, which simultaneously identifies slow-mutating Y-SNPs as evolution markers and rapid-mutating Y-STRs as patrilineage markers. The panel was validated by paired-end sequencing of 130 males, distributed over 65 deep-rooted pedigrees covering 1,279 generations. The CSYseq successfully targets 15,611 Y-SNPs, including 9,014 phylogenetically informative Y-SNPs, to identify 1,443 human evolutionary Y-subhaplogroup lineages worldwide. In addition, the CSYseq properly targets 202 Y-STRs, including 81 slow, 68 moderate, 27 fast and 26 rapid mutating Y-STRs, to individualize close paternal relatives. The targeted chrY markers combine a high average number of reads (Y-SNP = 717, Y-STR = 150) with easy interpretation, powerful discrimination capacity and chrY specificity. The CSYseq is useful for research on different time scales: to identify evolutionary ancestry, to find distant family and to discriminate closely related males. Therefore, this panel serves as a unique tool valuable for a wide range of genetic-genealogical applications in interdisciplinary research within evolutionary, population, molecular, medical and forensic genetics.

1997 ◽  
Vol 45 (3) ◽  
pp. 265-270 ◽  
Author(s):  
Anna Pérez-Lezaun ◽  
Francesc Calafell ◽  
Mark Seielstad ◽  
Eva Mateu ◽  
David Comas ◽  
...  

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Bing Song ◽  
August E. Woerner ◽  
John Planz

Abstract
Background: Multi-locus genotype data are widely used in population genetics and disease studies. In evaluating the utility of multi-locus data, the independence of markers is commonly considered in many genomic assessments. Generally, pairwise non-random associations are tested by linkage disequilibrium; however, dependence within a panel might involve triplets, quartets, or larger sets of markers. Therefore, compatible and user-friendly software is necessary for testing and assessing the global linkage disequilibrium among mixed genetic data.
Results: This study describes a software package for testing the mutual independence of mixed genetic datasets. Mutual independence is defined as no non-random associations among all subsets of the tested panel. The new R package “mixIndependR” calculates basic genetic parameters like allele frequency, genotype frequency, heterozygosity, Hardy–Weinberg equilibrium, and linkage disequilibrium (LD) by mutual independence from population data, regardless of the type of markers, such as simple nucleotide polymorphisms, short tandem repeats, insertions and deletions, and any other genetic markers. A novel method of assessing the dependence of mixed genetic panels is developed in this study and functionally analyzed in the software package. The overall independence is tested by comparing the observed distributions of two common summary statistics (the number of heterozygous loci [K] and the number of shared alleles [X]) with their expected distributions under the assumption of mutual independence.
Conclusion: The package “mixIndependR” is compatible with all categories of genetic markers and detects the overall non-random associations. Compared to pairwise disequilibrium, the approach described herein tends to have higher power, especially when the number of markers is large. With this package, more multi-functional or stronger genetic panels can be developed, like mixed panels with different kinds of markers. In population genetics, the package “mixIndependR” makes it possible to discover more about admixture of populations, natural selection, genetic drift, and population demographics, as a more powerful method of detecting LD. Moreover, this new approach can optimize variant selection in disease studies and contribute to panel combination for treatments in multimorbidity. Application of this approach to real data is expected in the future, and this might bring a leap in the field of genetic technology.
Availability: The R package mixIndependR is available on the Comprehensive R Archive Network (CRAN) at: https://cran.r-project.org/web/packages/mixIndependR/index.html.
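
The idea of comparing the observed distribution of the number of heterozygous loci (K) with its expectation under mutual independence can be illustrated with a short Python sketch. This is not the mixIndependR API (the package itself is written in R); the function names, the input layout, and the choice of a chi-square statistic are assumptions made for illustration only. The sketch relies on one fact: if loci are mutually independent and locus i is heterozygous with probability h_i, then K follows a Poisson-binomial distribution, which can be built up one locus at a time.

```python
from collections import Counter


def expected_k_distribution(heterozygosities):
    """Return P(K = k) for k = 0..L, assuming the L loci are mutually
    independent and locus i is heterozygous with probability h_i
    (a Poisson-binomial distribution, computed by dynamic programming)."""
    pmf = [1.0]  # with zero loci, K = 0 with probability 1
    for h in heterozygosities:
        nxt = [0.0] * (len(pmf) + 1)
        for k, p in enumerate(pmf):
            nxt[k] += p * (1.0 - h)   # this locus is homozygous
            nxt[k + 1] += p * h       # this locus is heterozygous
        pmf = nxt
    return pmf


def observed_k_counts(genotypes):
    """Count individuals by their number of heterozygous loci.
    `genotypes` is a list of individuals; each individual is a list of
    (allele_a, allele_b) pairs, one pair per locus (hypothetical layout)."""
    n_loci = len(genotypes[0])
    counts = Counter(sum(a != b for a, b in ind) for ind in genotypes)
    return [counts.get(k, 0) for k in range(n_loci + 1)]


def chi_square_statistic(observed, expected_pmf):
    """Pearson chi-square statistic between observed counts of K and the
    counts expected under mutual independence (illustrative choice of test)."""
    n = sum(observed)
    stat = 0.0
    for o, p in zip(observed, expected_pmf):
        e = n * p
        if e > 0:
            stat += (o - e) ** 2 / e
    return stat


if __name__ == "__main__":
    # Toy data: four individuals typed at two loci (made-up alleles).
    sample = [
        [(1, 1), (2, 3)],
        [(1, 2), (2, 3)],
        [(1, 1), (2, 2)],
        [(1, 2), (3, 3)],
    ]
    pmf = expected_k_distribution([0.5, 0.5])
    obs = observed_k_counts(sample)
    print(pmf, obs, chi_square_statistic(obs, pmf))
```

A large statistic would suggest that heterozygosity is not distributed across loci as independence predicts. In practice the per-locus probabilities would be estimated from allele frequencies, and sparse categories of K would need pooling before any formal test.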


2021 ◽  
Vol 12 ◽  
Author(s):  
Francesco Ravasini ◽  
Eugenia D’Atanasio ◽  
Maria Bonito ◽  
Biancamaria Bonucci ◽  
Chiara Della Rocca ◽  
...  

The azoospermia factor c region (AZFc), located in the long arm of the human Y chromosome, is frequently involved in chromosome rearrangements, mainly due to non-allelic homologous recombination events that occur between the nearly identical sequences (amplicons) that comprise it. These rearrangements may have major phenotypic effects like spermatogenic failure or other pathologies linked to male infertility. Moreover, they may also be relevant in forensic genetics, since some of the Y chromosome short tandem repeats (Y-STRs) commonly used in forensic analysis are located in amplicons or in inter-amplicon sequences of the AZFc. In a previous study, we identified four phylogenetically related samples with a null allele at DYS448 and a tetrallelic pattern at DYF387S1, two Y-STRs located in the AZFc. Through NGS read depth analysis, we found that the unusual Y-STR pattern may be due to a 1.6 Mb deletion arising concurrently with or after a 3.5 Mb duplication event. The observed large genomic rearrangement results in copy number reduction for the RBMY gene family as well as duplication of other AZFc genes. Based on the diversity of 16 additional Y-STRs, we estimated that the duplication/deletion event occurred at least twenty generations ago, suggesting that it has not been affected by negative selection.


Author(s):  
Gurdeep Matharu Lall ◽  
Maarten H. D. Larmuseau ◽  
Jon H. Wetton ◽  
Chiara Batini ◽  
Pille Hallast ◽  
...  

Abstract The influence of Viking-Age migrants to the British Isles is obvious in archaeological and place-names evidence, but their demographic impact has been unclear. Autosomal genetic analyses support Norse Viking contributions to parts of Britain, but show no signal corresponding to the Danelaw, the region under Scandinavian administrative control from the ninth to eleventh centuries. Y-chromosome haplogroup R1a1 has been considered as a possible marker for Viking migrations because of its high frequency in peninsular Scandinavia (Norway and Sweden). Here we select ten Y-SNPs to discriminate informatively among hg R1a1 sub-haplogroups in Europe, analyse these in 619 hg R1a1 Y chromosomes including 163 from the British Isles, and also type 23 short-tandem repeats (Y-STRs) to assess internal diversity. We find three specifically Western-European sub-haplogroups, two of which predominate in Norway and Sweden, and are also found in Britain; star-like features in the STR networks of these lineages indicate histories of expansion. We ask whether geographical distributions of hg R1a1 overall, and of the two sub-lineages in particular, correlate with regions of Scandinavian influence within Britain. Neither shows any frequency difference between regions that have higher (≥10%) or lower autosomal contributions from Norway and Sweden, but both are significantly overrepresented in the region corresponding to the Danelaw. These differences between autosomal and Y-chromosomal histories suggest either male-specific contribution, or the influence of patrilocality. Comparison of modern DNA with recently available ancient DNA data supports the interpretation that two sub-lineages of hg R1a1 spread with the Vikings from peninsular Scandinavia.


2016 ◽  
Vol 397 (12) ◽  
pp. 1307-1313 ◽  
Author(s):  
John Lai ◽  
Jiyuan An ◽  
Srilakshmi Srinivasan ◽  
Judith A. Clements ◽  
Jyotsna Batra

Abstract The kallikrein-related peptidase gene family (KLKs) comprises 15 genes located at 19q13.3-13.4. KLKs have chymotrypsin- and/or trypsin-like activity, but the tissue/organ expression profile of each KLK varies considerably. Thus, the role of KLKs in human biology is also very diverse, and the deregulation of their function results in a wide range of diseases. Here, we have cataloged the transcript (variants and fusions) and genetic (single nucleotide polymorphisms, small insertions/deletions, copy number variations (CNVs), and short tandem repeats) diversity at the KLK locus, providing a data set for researchers to explore the mechanisms through which KLK function may be deregulated. We reveal that the KLK locus hosts 85 fusion transcripts and 80 variant transcripts. Interestingly, some fusion transcripts comprise up to 6 KLK genes. Our analysis of genetic variations of 2504 individuals from the 1000 Genomes Project indicated that the KLK locus is rich in genetic diversity, with some fusion transcripts harboring over 1000 single nucleotide variations. We also found evidence from the literature linking 2387 KLK genetic variants with many types of diseases. Finally, genotyping data for the 131 KLK genetic variants in the NCI-60 cancer cell lines are provided as a resource for the cancer and KLK field.


Author(s):  
Irina Alborova ◽  
Kharis Mustafin ◽  
Maria Mednikova ◽  
Alexandra Buzhilova ◽  
...  

Introduction. The article presents the results of paleogenetic studies of the medieval remains of three people found in a closed archaeological complex (building 32) uncovered during the 2007 excavations in the Taynitsky Garden of the Moscow Kremlin (supervisor of excavations: N.A. Makarov). Previous studies on the dating of the complex link it to the devastation of Moscow by the troops of Tokhtamysh Khan in August 1382. The archaeological layer was formed at one time as a result of a fire and contained the remains of two adults and a 3-4-year-old child who remained unburied. The aim of this work was the genetic study of ancient DNA from the remains of people who died in the 14th century: clarification of their sex, determination of kinship and presumptive origin. Material and methods. For genetic examination, teeth were selected (permanent for the adults, primary for the child). The laboratory research algorithm included a set of measures to protect archaeological DNA from contamination, sample preparation and extraction of DNA from dental remains, analysis of STR markers of the Y chromosome in males, analysis of ALU markers of autosomal chromosomes, and targeted NGS sequencing of hypervariable segments of mitochondrial DNA. Results and conclusion. Using the methods of molecular genetic research, it was possible to confirm that a man, a young woman and a child (boy) died in the fire. Based on the analysis of autosomal markers, a close biological relationship between the woman and the child (mother-son) was revealed with a high degree of probability (99.9%). The man was not a relative of either the woman or the child. The mtDNA haplogroups and STR markers of the male-specific Y chromosome identified in all three individuals are generally characteristic of the Slavic population of modern Europe. The mt haplogroup J1c, found in mother and child, is now most characteristic of the inhabitants of Europe. The man carries mitochondrial haplogroup K2, which is found mainly in Northwestern Europe.

