High-depth whole genome sequencing of an Ashkenazi Jewish reference panel: enhancing sensitivity, accuracy, and imputation

2018 ◽  
Vol 137 (4) ◽  
pp. 343-355 ◽  
Author(s):  
Todd Lencz ◽  
Jin Yu ◽  
Cameron Palmer ◽  
Shai Carmi ◽  
Danny Ben-Avraham ◽  
...  
2020 ◽  
Author(s):  
Peng Zhang ◽  
Huaxia Luo ◽  
Yanyan Li ◽  
You Wang ◽  
Jiajia Wang ◽  
...  

Abstract The lack of a Chinese population-specific haplotype reference panel and of whole genome sequencing resources has greatly hindered genetics studies in the world’s largest population. Here we present the NyuWa genome resource of 71.1M SNPs and 8.2M indels based on deep (26.2X) sequencing of 2,999 Chinese individuals, and construct the NyuWa reference panel of 5,804 haplotypes and 19.3M variants, which is the first publicly available Chinese population-specific reference panel with thousands of samples. There were 25.0M novel variants in the NyuWa genome resource, and 3.2M specific variants in the NyuWa reference panel. Compared with other panels, the NyuWa reference panel reduces the Han Chinese imputation error rate by 30% to 51%. Population structure and imputation simulation tests supported the applicability of one integrated reference panel for both northern and southern Chinese. In addition, a total of 22,504 loss-of-function variants in coding and noncoding genes were identified, including 11,493 novel variants. These results highlight the value of the NyuWa genome resource to facilitate genetics research in Chinese and Asian populations.


2017 ◽  
Author(s):  
Arthur Gilly ◽  
Lorraine Southam ◽  
Daniel Suveges ◽  
Karoline Kuchenbaecker ◽  
Rachel Moore ◽  
...  

Abstract Motivation Very low depth sequencing has been proposed as a cost-effective approach to capture low-frequency and rare variation in complex trait association studies. However, a full characterisation of the genotype quality and association power for very low depth sequencing designs is still lacking. Results We perform cohort-wide whole genome sequencing (WGS) at low depth in 1,239 individuals (990 at 1x depth and 249 at 4x depth) from an isolated population, and establish a robust pipeline for calling and imputing very low depth WGS genotypes from standard bioinformatics tools. Using genotyping chip, whole-exome sequencing (WES, 75x depth) and high-depth (22x) WGS data in the same samples, we examine in detail the sensitivity of this approach, and show that imputed 1x WGS recapitulates 95.2% of variants found by imputed GWAS with an average minor allele concordance of 97% for common and low-frequency variants. In our study, 1x further allowed the discovery of 140,844 true low-frequency variants with 73% genotype concordance when compared to high-depth WGS data. Finally, using association results for 57 quantitative traits, we show that very low depth WGS is an efficient alternative to imputed GWAS chip designs, allowing the discovery of up to twice as many true association signals than the classical imputed GWAS design. Supplementary Data Supplementary Data are appended to this manuscript.


Cell Reports ◽  
2021 ◽  
Vol 37 (7) ◽  
pp. 110017
Author(s):  
Peng Zhang ◽  
Huaxia Luo ◽  
Yanyan Li ◽  
You Wang ◽  
Jiajia Wang ◽  
...  

2020 ◽  
Author(s):  
Sanghoon Lee ◽  
Li Zhao ◽  
Shannon N. Westin ◽  
Nicholas W. Bateman ◽  
Amir A. Jazaeri ◽  
...  

Genes ◽  
2017 ◽  
Vol 8 (9) ◽  
pp. 210 ◽  
Author(s):  
Kevin Gustafson ◽  
Jacque Duncan ◽  
Pooja Biswas ◽  
Angel Soto-Hermida ◽  
Hiroko Matsui ◽  
...  

2018 ◽  
Vol 35 (15) ◽  
pp. 2555-2561 ◽  
Author(s):  
Arthur Gilly ◽  
Lorraine Southam ◽  
Daniel Suveges ◽  
Karoline Kuchenbaecker ◽  
Rachel Moore ◽  
...  

Abstract Motivation Very low-depth sequencing has been proposed as a cost-effective approach to capture low-frequency and rare variation in complex trait association studies. However, a full characterization of the genotype quality and association power for very low-depth sequencing designs is still lacking. Results We perform cohort-wide whole-genome sequencing (WGS) at low depth in 1239 individuals (990 at 1× depth and 249 at 4× depth) from an isolated population, and establish a robust pipeline for calling and imputing very low-depth WGS genotypes from standard bioinformatics tools. Using genotyping chip, whole-exome sequencing (75× depth) and high-depth (22×) WGS data in the same samples, we examine in detail the sensitivity of this approach, and show that imputed 1× WGS recapitulates 95.2% of variants found by imputed GWAS with an average minor allele concordance of 97% for common and low-frequency variants. In our study, 1× further allowed the discovery of 140 844 true low-frequency variants with 73% genotype concordance when compared to high-depth WGS data. Finally, using association results for 57 quantitative traits, we show that very low-depth WGS is an efficient alternative to imputed GWAS chip designs, allowing the discovery of up to twice as many true association signals than the classical imputed GWAS design. Availability and implementation The HELIC genotype and WGS datasets have been deposited to the European Genome-phenome Archive (https://www.ebi.ac.uk/ega/home): EGAD00010000518; EGAD00010000522; EGAD00010000610; EGAD00001001636, EGAD00001001637. The peakplotter software is available at https://github.com/wtsi-team144/peakplotter, the transformPhenotype app can be downloaded at https://github.com/wtsi-team144/transformPhenotype. Supplementary information Supplementary data are available at Bioinformatics online.


BMC Genomics ◽  
2012 ◽  
Vol 13 (1) ◽  
pp. 468 ◽  
Author(s):  
Rachel Sealfon ◽  
Stephen Gire ◽  
Crystal Ellis ◽  
Stephen Calderwood ◽  
Firdausi Qadri ◽  
...  

2017 ◽  
Author(s):  
Chen Sun ◽  
Paul Medvedev

Abstract Motivation Genotyping a set of variants from a database is an important step for identifying known genetic traits and disease-related variants within an individual. The growing size of variant databases as well as the high depth of sequencing data pose an efficiency challenge. In clinical applications, where time is crucial, alignment-based methods are often not fast enough. To fill the gap, Shajii et al. (2016) propose LAVA, an alignment-free genotyping method which is able to more quickly genotype SNPs; however, there remains large room for improvement in running time and accuracy. Results We present the VarGeno method for SNP genotyping from Illumina whole genome sequencing data. VarGeno builds upon LAVA by improving the speed of k-mer querying as well as the accuracy of the genotyping strategy. We evaluate VarGeno on several read datasets using different genotyping SNP lists. VarGeno performs 7-13 times faster than LAVA with similar memory usage, while improving accuracy. Availability VarGeno is freely available at: https://github.com/medvedevgroup/vargeno.

