Effective variant filtering and expected candidate variant yield in studies of rare human disease

2021 ◽  
Vol 6 (1) ◽  
Brent S. Pedersen ◽  
Joe M. Brown ◽  
Harriet Dashnow ◽  
Amelia D. Wallace ◽  
Matt Velinder ◽  

Abstract: In studies of families with rare disease, it is common to screen for de novo mutations, as well as recessive or dominant variants that explain the phenotype. However, the filtering strategies and software used to prioritize high-confidence variants vary from study to study. In an effort to establish recommendations for rare disease research, we explore effective guidelines for variant (SNP and INDEL) filtering and report the expected number of candidates for de novo dominant, recessive, and autosomal dominant modes of inheritance. We derived these guidelines using two large family-based cohorts that underwent whole-genome sequencing, as well as two family cohorts with whole-exome sequencing. The filters are applied to common attributes, including genotype quality, sequencing depth, allele balance, and population allele frequency. The resulting guidelines yield ~10 candidate SNP and INDEL variants per exome, and 18 per genome for recessive and de novo dominant modes of inheritance, with substantially more candidates for autosomal dominant inheritance. For family-based, whole-genome sequencing studies, the 18 candidates comprise an average of three de novo, ten compound heterozygous, one autosomal recessive, and four X-linked variants; roughly 100 further candidate variants follow autosomal dominant inheritance. The slivar software we developed to establish and rapidly apply these filters to VCF files is available under an MIT license and includes documentation and recommendations for best practices for rare disease analysis.
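As a rough illustration of how the four attributes named in this abstract (genotype quality, sequencing depth, allele balance, and population allele frequency) can combine into a de novo filter for a trio, here is a minimal Python sketch. It is not slivar's implementation or expression syntax. The `Sample` class, the function names, and every threshold value are hypothetical choices made for this example; the abstract itself gives no cutoffs.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Per-sample genotype fields for one variant (hypothetical structure)."""
    alts: int        # number of alternate alleles in the genotype: 0, 1, or 2
    gq: int          # genotype quality
    depth: int       # total read depth at the site
    alt_depth: int   # reads supporting the alternate allele

    @property
    def allele_balance(self) -> float:
        # Fraction of reads supporting the alternate allele.
        return self.alt_depth / self.depth if self.depth else 0.0


# Illustrative thresholds only (assumptions, not values from the abstract).
MIN_GQ = 20
MIN_DEPTH = 10
HET_AB_RANGE = (0.2, 0.8)
MAX_POP_AF = 0.001


def is_high_quality(sample: Sample) -> bool:
    """Require a minimum genotype quality and read depth."""
    return sample.gq >= MIN_GQ and sample.depth >= MIN_DEPTH


def is_candidate_de_novo(kid: Sample, mom: Sample, dad: Sample,
                         pop_af: float) -> bool:
    """Heterozygous in the child, absent from both parents, rare in the population."""
    if not all(is_high_quality(s) for s in (kid, mom, dad)):
        return False
    if pop_af > MAX_POP_AF:
        return False
    # Parents must be homozygous reference with no alternate-supporting reads.
    parents_ref = all(p.alts == 0 and p.alt_depth == 0 for p in (mom, dad))
    low, high = HET_AB_RANGE
    kid_het = kid.alts == 1 and low <= kid.allele_balance <= high
    return parents_ref and kid_het


kid = Sample(alts=1, gq=60, depth=30, alt_depth=14)
mom = Sample(alts=0, gq=50, depth=28, alt_depth=0)
dad = Sample(alts=0, gq=55, depth=31, alt_depth=0)
print(is_candidate_de_novo(kid, mom, dad, pop_af=0.0))
```

Tightening or loosening any of these cutoffs trades sensitivity against the number of candidates left for manual review, which is the trade-off the abstract's candidate counts describe.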

2021 ◽  
Vol 9 ◽  
Lingxia Zhang ◽  
Ke Huang ◽  
Shugang Wang ◽  
Haidong Fu ◽  
Jingjing Wang ◽  

Gitelman syndrome (GS, OMIM 263800) is a genetic congenital tubulopathy associated with salt loss, which is characterized by hypokalemic metabolic alkalosis, hypocalciuria, and hypomagnesemia. GS, which is typically detected in adolescence or adulthood, has long been considered a benign tubular lesion; however, the disease is associated with a significant decrease in the quality of life. In this study, we assessed the genotype–phenotype correlations based on the medical histories, clinical symptoms, laboratory test results, and whole-exome sequencing profiles from pediatric patients with GS. All 31 patients consecutively enrolled between January 2014 and December 2020 complained of fatigue, salt craving, and muscle weakness. Sixteen patients demonstrated growth retardation, and five patients presented with nocturia and constipation. All patients presented with hypokalemic metabolic alkalosis, normal blood pressure, hyperaldosteronism, and a preserved glomerular filtration rate, and 24 of the 31 (77.4%) patients had hypomagnesemia. Homozygous, compound heterozygous, and heterozygous mutations in SLC12A3 were detected in 4, 24, and 3 patients, respectively. GS patients often present with muscle weakness and fatigue caused by hypokalemia and hypomagnesemia. Therefore, early diagnosis of GS is important in young children to reduce the possibility of growth retardation, tetany, and seizures. Next-generation sequencing such as whole-exome or whole-genome sequencing provides a practical tool for the early diagnosis and improvement of GS prognosis. Further whole-genome sequencing is expected to reveal more variants in SLC12A3 among GS patients with single heterozygous mutations.

PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0253440
Samantha Gunasekera ◽  
Sam Abraham ◽  
Marc Stegger ◽  
Stanley Pang ◽  
Penghao Wang ◽  

Whole-genome sequencing is essential to many facets of infectious disease research. However, technical limitations such as bias in coverage and tagmentation, and difficulties characterising genomic regions with extreme GC content, have created significant obstacles to its use. Illumina has claimed that the recently released DNA Prep library preparation kit, formerly known as Nextera Flex, overcomes some of these limitations. This study aimed to assess bias in coverage, tagmentation, GC content, average fragment size distribution, and de novo assembly quality using both the Nextera XT and DNA Prep kits from Illumina. When performing whole-genome sequencing on Escherichia coli and where coverage bias is the main concern, the DNA Prep kit may provide higher-quality results; however, de novo assembly quality, tagmentation bias, and GC-content-related bias are unlikely to improve. Based on these results, laboratories with existing workflows based on Nextera XT would see minor benefits in transitioning to the DNA Prep kit if they were primarily studying organisms with neutral GC content.

2020 ◽  
Vol 29 (1) ◽  
pp. 184-193 ◽  
Jonas Carlsson Almlöf ◽  
Sara Nystedt ◽  
Aikaterini Mechtidou ◽  
Dag Leonard ◽  
Maija-Leena Eloranta ◽  

Abstract: By performing whole-genome sequencing in a Swedish cohort of 71 parent-offspring trios, in which the child in each family is affected by systemic lupus erythematosus (SLE, OMIM 152700), we investigated the contribution of de novo variants to risk of SLE. We found de novo single nucleotide variants (SNVs) to be significantly enriched in gene promoters in SLE patients compared with healthy controls, at a level corresponding to 26 more de novo promoter SNVs than expected. We identified 12 de novo SNVs in promoter regions of genes that have been previously implicated in SLE, or that have functions that could be of relevance to SLE. Furthermore, we detected three missense de novo SNVs, five de novo insertion-deletions, and three de novo structural variants with potential to affect the expression of genes that are relevant for SLE. Based on enrichment analysis, disease-affecting de novo SNVs are expected to occur in one-third of SLE patients. This study shows that de novo variants in promoters commonly contribute to the genetic risk of SLE. The fact that de novo SNVs in SLE were enriched in promoter regions highlights the importance of using whole-genome sequencing for identification of de novo variants.

BMC Genomics ◽  
2011 ◽  
Vol 12 (1) ◽  
Yanliang Jiang ◽  
Jianguo Lu ◽  
Eric Peatman ◽  
Huseyin Kucuktas ◽  
Shikai Liu ◽  

2015 ◽  
Vol 25 (3) ◽  
pp. 426-434 ◽  
Brock A. Peters ◽  
Bahram G. Kermani ◽  
Oleg Alferov ◽  
Misha R. Agarwal ◽  
Mark A. McElwain ◽  
