Discovery and quality analysis of a comprehensive set of structural variants and short tandem repeats

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
David Jakubosky ◽  
Erin N. Smith ◽  
Matteo D’Antonio ◽  
Marc Jan Bonder ◽  
...  
2019 ◽  
Author(s):  
David Jakubosky ◽  
Erin N. Smith ◽  
Matteo D’Antonio ◽  
Marc Jan Bonder ◽  
William W. Young Greenwald ◽  
...  

Abstract Structural variants (SVs) and short tandem repeats (STRs) are important sources of genetic diversity but are not routinely analyzed in genetic studies because they are difficult to accurately identify and genotype. Because SVs and STRs range in size and type, it is necessary to apply multiple algorithms that incorporate different types of evidence from sequencing data and employ complex filtering strategies to discover a comprehensive set of high-quality and reproducible variants. Here we assembled a set of 719 deep whole genome sequencing (WGS) samples (mean 42x) from 477 distinct individuals, which we used to discover and genotype a wide spectrum of SV and STR variants using five algorithms. We used 177 unique pairs of genetic replicates to identify factors that affect variant call reproducibility and developed a systematic filtering strategy to create one of the most complete and well-characterized maps of SVs and STRs to date.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
David Jakubosky ◽  
Matteo D’Antonio ◽  
Marc Jan Bonder ◽  
Craig Smail ◽  
...  

2020 ◽  
Author(s):  
Milad Mortazavi ◽  
Yangsu Ren ◽  
Shubham Saini ◽  
Danny Antaki ◽  
Celine St. Pierre ◽  
...  

Abstract C57BL/6J is the most widely used inbred mouse strain and is the basis for the mouse reference genome. In addition to C57BL/6J, several other C57BL/6 and C57BL/10 substrains exist. Previous studies have documented extensive phenotypic and genetic differences among these substrains, which are presumed to be due to the accumulation of new mutations. These differences can be used for genome-wide association studies. They can also have unintended consequences for reproducibility when substrain differences are not properly accounted for. In this paper, we performed genomic sequencing and RNA-sequencing in the hippocampus of 9 C57BL/6 and 5 C57BL/10 substrains. We identified 985,329 SNPs, 150,344 short tandem repeats (STRs) and 896 structural variants (SVs), of which 330,178 SNPs and 14,367 STRs differentiated the C57BL/6 and C57BL/10 groups. We found several regions that contained dense polymorphisms. We also identified 578 differentially expressed genes for C57BL/6 substrains and 37 differentially expressed genes for C57BL/10 substrains (FDR < 0.01). We then identified nearby SNPs, STRs and SVs that matched the gene expression patterns. In so doing, we identified SVs in coding regions of Wdfy1, Ide, Fgfbp3 and Btaf1 that explain the expression patterns observed. We replicated several previously reported gene expression differences between substrains (Nnt, Gabra2) and found many novel gene expression differences (e.g. Kcnc2). Our results illustrate the impact of new mutations on gene expression among these substrains and provide a resource for future mapping studies.


2019 ◽  
Author(s):  
David Jakubosky ◽  
Matteo D’Antonio ◽  
Marc Jan Bonder ◽  
Craig Smail ◽  
Margaret K.R. Donovan ◽  
...  

Abstract Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we show that different SV classes and STRs differentially impact gene expression and complex traits. Functional differences between SV classes and STRs include their genomic locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We also identified a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and showed they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that impact gene expression and human traits.


1997 ◽  
Vol 45 (3) ◽  
pp. 265-270 ◽  
Author(s):  
Anna Pérez-Lezaun ◽  
Francesc Calafell ◽  
Mark Seielstad ◽  
Eva Mateu ◽  
David Comas ◽  
...  

Genetics ◽  
2000 ◽  
Vol 155 (4) ◽  
pp. 1973-1980
Author(s):  
Jinko Graham ◽  
James Curran ◽  
B S Weir

Abstract Modern forensic DNA profiles are constructed using microsatellites, short tandem repeats of 2–5 bases. In the absence of genetic data on a crime-specific subpopulation, one tool for evaluating profile evidence is the match probability. The match probability is the conditional probability that a random person would have the profile of interest given that the suspect has it and that these people are different members of the same subpopulation. One issue in evaluating the match probability is population differentiation, which can induce coancestry among subpopulation members. Forensic assessments that ignore coancestry typically overstate the strength of evidence against the suspect. Theory has been developed to account for coancestry; assumptions include a steady-state population and a mutation model in which the allelic state after a mutation event is independent of the prior state. Under these assumptions, the joint allelic probabilities within a subpopulation may be approximated by the moments of a Dirichlet distribution. We investigate the adequacy of this approximation for profiled loci that mutate according to a generalized stepwise model. Simulations suggest that the Dirichlet theory can still overstate the evidence against a suspect with a common microsatellite genotype. However, Dirichlet-based estimators were less biased than the product-rule estimator, which ignores coancestry.
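As an illustration of the contrast the abstract draws, the sketch below compares a single-locus match probability under the Dirichlet approximation with the product-rule value. The formulas are the commonly cited Balding-Nichols expressions and are not quoted from the abstract; the function names, the allele frequencies, and the coancestry value `theta` are assumptions chosen for illustration only.

```python
def match_probability(p_i, p_j=None, theta=0.0):
    """Single-locus match probability under the Dirichlet approximation.

    p_i, p_j: allele frequencies (p_j=None means a homozygous genotype).
    theta:    assumed coancestry coefficient for the subpopulation.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if p_j is None:
        # Homozygote A_i A_i
        return (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i) / denom
    # Heterozygote A_i A_j
    return 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j) / denom


def product_rule(p_i, p_j=None):
    """Genotype frequency ignoring coancestry (Hardy-Weinberg proportions)."""
    return p_i ** 2 if p_j is None else 2 * p_i * p_j


if __name__ == "__main__":
    # Hypothetical allele frequencies and coancestry value.
    for theta in (0.0, 0.01, 0.03):
        print(theta, match_probability(0.1, theta=theta), match_probability(0.1, 0.2, theta=theta))
    print("product rule:", product_rule(0.1), product_rule(0.1, 0.2))
```

With `theta = 0` the Dirichlet expressions reduce to the product rule. For the allele frequencies used here, a positive `theta` gives a larger match probability, which is the sense in which ignoring coancestry overstates the evidence against a suspect.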


2019 ◽  
Vol 108 (2) ◽  
pp. e115-e117
Author(s):  
Kelly Brown ◽  
Robert Homer ◽  
Marina Baine ◽  
Justin D. Blasberg
