Characterizing nucleotide variation and expansion dynamics in human-specific variable number tandem repeats

2021 ◽  
Author(s):  
Meredith M. Course ◽  
Arvis Sulovari ◽  
Kathryn Gudsnuk ◽  
Evan E. Eichler ◽  
Paul N. Valdmanis

Abstract There are over 55,000 variable number tandem repeats (VNTRs) in the human genome, notable for both their striking polymorphism and mutability. Despite their role in human evolution and genomic variation, they have yet to be studied collectively and in detail, partially due to their large size, variability, and predominant location in non-coding regions. Here, we examine 467 VNTRs that are human-specific expansions, unique to one location in the genome, and not associated with retrotransposons. We leverage publicly available long-read genomes – including from the Human Genome Structural Variant Consortium – to ascertain the exact nucleotide composition of these VNTRs, and compare their composition of alleles. We then confirm repeat unit composition in over 3000 short-read samples from the 1000 Genomes Project. Our analysis reveals that these VNTRs contain remarkably structured repeat motif organization, modified by frequent deletion and duplication events. While overall VNTR compositions tend to remain similar between 1000 Genomes Project super-populations, we describe a notable exception with substantial differences in repeat composition (in PCBP3), as well as several VNTRs that are significantly different in length between super-populations (in ART1, PROP1, WDR60, and LOC102723906). We also observe that most of these VNTRs are expanded in archaic human genomes, yet remain stable in length between single generations. Collectively, our findings indicate that repeat motif variability, repeat composition, and repeat length are all informative modalities to consider when characterizing VNTRs and their contribution to genomic variation.

Genetics ◽  
2000 ◽  
Vol 155 (3) ◽  
pp. 1313-1320 ◽  
Author(s):  
John S Taylor ◽  
Felix Breden

Abstract The standard slipped-strand mispairing (SSM) model for the formation of variable number tandem repeats (VNTRs) proposes that a few tandem repeats, produced by chance mutations, provide the “raw material” for VNTR expansion. However, this model is unlikely to explain the formation of VNTRs with long motifs (e.g., minisatellites), because the likelihood of a tandem repeat forming by chance decreases rapidly as the length of the repeat motif increases. Phylogenetic reconstruction of the birth of a mitochondrial (mt) DNA minisatellite in guppies suggests that VNTRs with long motifs can form as a consequence of SSM at noncontiguous repeats. VNTRs formed in this manner have motifs longer than the noncontiguous repeat originally formed by chance and are flanked by one unit of the original, noncontiguous repeat. SSM at noncontiguous repeats can therefore explain the birth of VNTRs with long motifs and the “imperfect” or “short direct” repeats frequently observed adjacent to both mtDNA and nuclear VNTRs.
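The claim that chance tandem repeats become rapidly less likely as motif length grows can be illustrated with a back-of-the-envelope calculation. The sketch below is not from the paper: it assumes independent, equally frequent bases (a simplification), and the function name `chance_tandem_probability` is hypothetical. Under that assumption, the probability that a motif of length k is immediately followed by an identical copy is (1/4)^k.

```python
def chance_tandem_probability(motif_length: int, copies: int = 2) -> float:
    """Probability that a fixed position starts `copies` identical adjacent
    units of length `motif_length`, ASSUMING independent, uniformly
    distributed bases (A, C, G, T each with probability 1/4).

    The first unit is free; each further unit must match it base by base.
    """
    if motif_length < 1 or copies < 2:
        raise ValueError("motif_length must be >= 1 and copies >= 2")
    return 0.25 ** (motif_length * (copies - 1))


if __name__ == "__main__":
    # Compare a microsatellite-sized motif with minisatellite-sized motifs.
    for k in (2, 6, 10, 30):
        print(k, chance_tandem_probability(k))
```

Under this simplified model the probability falls by a factor of four with every additional base in the motif, which is the intuition behind the argument that long-motif VNTRs need a mechanism other than chance duplication.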


2021 ◽  
pp. gr.275560.121
Author(s):  
Meredith M Course ◽  
Arvis Sulovari ◽  
Kathryn Gudsnuk ◽  
Evan E Eichler ◽  
Paul N Valdmanis

2017 ◽  
Author(s):  
Mehrdad Bakhtiari ◽  
Sharona Shleizer-Burko ◽  
Melissa Gymrek ◽  
Vikas Bansal ◽  
Vineet Bafna

Abstract Whole Genome Sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. We consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6-100bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. While existing tools recognize VNTR-carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole genome sequenced reads remains challenging. We describe a method, adVNTR, that uses Hidden Markov Models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single molecule (PacBio) whole genome and exome sequencing, and show good results on multiple simulated and real data sets. adVNTR is available at https://github.com/mehrdadbakhtiari/adVNTR
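To make the genotyping task concrete, the following is a deliberately naive sketch of counting inexact repeat units in a single sequence. It is not adVNTR's Hidden Markov Model and does not use the adVNTR code base; it is a greedy mismatch-tolerant scan, and the names `count_repeat_units` and `max_mismatches` are hypothetical. It handles substitutions only, not the insertions and deletions an HMM can model.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_repeat_units(sequence: str, motif: str, max_mismatches: int = 1) -> int:
    """Return the longest run of consecutive motif-length windows that each
    differ from `motif` by at most `max_mismatches` substitutions.

    Illustrative only: indels within repeat units are not handled.
    """
    k = len(motif)
    if k == 0:
        raise ValueError("motif must be non-empty")
    best = 0
    for start in range(len(sequence) - k + 1):
        count, pos = 0, start
        # Extend the run one whole unit at a time from this start position.
        while pos + k <= len(sequence) and hamming(sequence[pos:pos + k], motif) <= max_mismatches:
            count += 1
            pos += k
        best = max(best, count)
    return best


if __name__ == "__main__":
    # Made-up toy read: flank + 3 exact units + 1 unit with a substitution + flank.
    read = "TTT" + "ACGTAC" * 3 + "ACGTAA" + "GGG"
    print(count_repeat_units(read, "ACGTAC", max_mismatches=1))
    print(count_repeat_units(read, "ACGTAC", max_mismatches=0))
```

The gap between the two calls shows why exact matching undercounts inexact duplications, which is part of the motivation for a probabilistic model.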


2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Zhi-Jun Zhao ◽  
Ji-Quan Li ◽  
Li Ma ◽  
Hong-Mei Xue ◽  
Xu-Xin Yang ◽  
...  

Abstract Background: The prevalence of human brucellosis in Qinghai Province of China has been increasing rapidly, with confirmed cases distributed across 31 counties. However, the epidemiology of brucellosis transmission has not been fully elucidated. To characterize the infecting strains isolated from humans, multiple-locus variable-number tandem repeats analysis (MLVA) and whole-genome single-nucleotide polymorphism (SNP)-based approaches were employed. Methods: Strains were isolated from the blood cultures of two males and were confirmed as Brucella melitensis positive following biotyping and MLVA. Genomic DNA was extracted from these two strains, and whole-genome sequencing was performed. Next, SNP-based phylogenetic analysis was performed to compare the two strains to 94 B. melitensis strains (complete genome and draft genome) retrieved from online databases. Results: The two Brucella isolates were identified as B. melitensis biovar 3 (QH2019001 and QH2019005) following conventional biotyping and were found to have differences in their variable number tandem repeats (VNTRs) using MLVA-16. Phylogenetic examination assigned the 96 strains to five genotype groups, with QH2019001 and QH2019005 assigned to the same group, but different subgroups. Moreover, the QH2019005 strain was assigned to a new subgenotype, IIj, within genotype II. These findings were then combined to determine the geographic origin of the two Brucella strains. Conclusions: Utilizing a whole-genome SNP-based approach enabled differences between the two B. melitensis strains to be more clearly resolved, and facilitated the elucidation of their different evolutionary histories. This approach also revealed that QH2019005 is a member of a new subgenotype (IIj) with an ancient origin in the eastern Mediterranean Sea.
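SNP-based comparison of strains usually starts from pairwise differences between aligned genomes. The sketch below shows that first step only, as a generic illustration rather than the pipeline used in the study; the function name `snp_distance` and the toy sequences are hypothetical, and it assumes the inputs are already aligned to equal length.

```python
def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two ALIGNED sequences carry different
    unambiguous bases. Positions with anything other than A, C, G, or T
    (for example N or a gap) in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


if __name__ == "__main__":
    # Toy aligned fragments, not real Brucella data.
    print(snp_distance("ACGTN", "ACGAA"))
```

A matrix of such distances across all strains is one common input to tree-building methods; the choice of method and filtering thresholds is outside what the abstract specifies.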


2010 ◽  
Vol 112 (1) ◽  
pp. 296-306 ◽  
Author(s):  
Fahad R. Ali ◽  
Sylvia A. Vasiliou ◽  
Kate Haddley ◽  
Ursula M. Paredes ◽  
Julian C. Roberts ◽  
...  

2012 ◽  
Vol 13 (11) ◽  
pp. 5557-5562 ◽  
Author(s):  
Yu-Qian Wang ◽  
Hai-Hong Zhang ◽  
Chen-Lu Liu ◽  
Qiu Xia ◽  
Hui Wu ◽  
...  
