Systematic Evaluation of Sanger Validation of Next-Generation Sequencing Variants

Abstract BACKGROUND Next-generation sequencing (NGS) data are used for both clinical care and clinical research. DNA sequence variants identified using NGS are often returned to patients/participants as part of clinical or research protocols. The current standard of care is to validate NGS variants using Sanger sequencing, which is costly and time-consuming. METHODS We performed a large-scale, systematic evaluation of Sanger-based validation of NGS variants using data from the ClinSeq® project. We first used NGS data from 19 genes in 5 participants, comparing them to high-throughput Sanger sequencing results on the same samples, and found no discrepancies among 234 NGS variants. We then compared NGS variants in 5 genes from 684 participants against data from Sanger sequencing. RESULTS Of over 5800 NGS-derived variants, 19 were not validated by Sanger data. Using newly designed sequencing primers, Sanger sequencing confirmed 17 of the NGS variants, and the remaining 2 variants had low quality scores from exome sequencing. Overall, we measured a validation rate of 99.965% for NGS variants using Sanger sequencing, which was higher than many existing medical tests that do not necessitate orthogonal validation. CONCLUSIONS A single round of Sanger sequencing is more likely to incorrectly refute a true-positive variant from NGS than to correctly identify a false-positive variant from NGS. Validation of NGS-derived variants using Sanger sequencing has limited utility, and best practice standards should not include routine orthogonal Sanger validation of NGS variants.
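The arithmetic behind the reported validation rate and the conclusion can be sketched as follows. This is only an illustration: the abstract says "over 5800" variants without giving the exact count, so the denominator of 5800 used here is an assumption and the computed rate is approximate. The function name and dictionary keys are hypothetical.

```python
def validation_summary(total_variants, first_round_unvalidated, confirmed_on_retest):
    """Summarize Sanger validation outcomes for a set of NGS variants."""
    # Variants still unconfirmed after re-sequencing with redesigned primers.
    unresolved = first_round_unvalidated - confirmed_on_retest
    # Share of NGS variants ultimately supported by Sanger data, in percent.
    rate_pct = 100.0 * (total_variants - unresolved) / total_variants
    # How many true NGS calls the first Sanger round wrongly refuted
    # for each NGS call that stayed unconfirmed.
    ratio = confirmed_on_retest / unresolved if unresolved else None
    return {
        "unresolved": unresolved,
        "validation_rate_pct": rate_pct,
        "refuted_true_per_unconfirmed": ratio,
    }


# Assumed denominator: 5800 (the abstract reports only "over 5800").
summary = validation_summary(total_variants=5800,
                             first_round_unvalidated=19,
                             confirmed_on_retest=17)
print(summary)
```

Under this assumption, the first Sanger round wrongly refuted 17 true NGS calls while only 2 calls remained unconfirmed, which is the comparison the conclusion rests on.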

Identification of Genetic Hereditary Predisposition to Hematologic Malignancies By Clinical Next-Generation Sequencing

Blood ◽

10.1182/blood.v126.23.3854.3854 ◽

2015 ◽

Vol 126 (23) ◽

pp. 3854-3854 ◽

Cited By ~ 2

Author(s):

Amy E Knight Johnson ◽

Lucia Guidugli ◽

Kelly Arndt ◽

Gorka Alkorta-Aranburu ◽

Viswateja Nelakuditi ◽

...

Keyword(s):

Next Generation Sequencing ◽

Sanger Sequencing ◽

Family Members ◽

Hematologic Malignancies ◽

Dyskeratosis Congenita ◽

Molecular Diagnostic ◽

Next Generation ◽

Hereditary Predisposition ◽

Ngs Data ◽

Generation Sequencing

Abstract Introduction: Myelodysplastic syndrome (MDS) and acute leukemia (AL) are a clinically diverse and genetically heterogeneous group of hematologic malignancies. Familial forms of MDS/AL have been increasingly recognized in recent years, and can occur as a primary event or secondary to genetic syndromes, such as inherited bone marrow failure syndromes (IBMFS). It is critical to confirm a genetic diagnosis in patients with hereditary predisposition to hematologic malignancies in order to provide prognostic information and cancer risk assessment, and to aid in identification of at-risk or affected family members. In addition, a molecular diagnosis can help tailor medical management, including informing the selection of family members as allogeneic stem cell transplantation donors. Until recently, clinical testing options for this diverse group of hematologic malignancy predisposition genes were limited to the evaluation of single genes by Sanger sequencing, which is a time-consuming and expensive process. To improve the diagnosis of hereditary predisposition to hematologic malignancies, our CLIA-licensed laboratory has recently developed Next-Generation Sequencing (NGS) panel-based testing for these genes. Methods: Thirty-six patients with personal and/or family history of aplastic anemia, MDS or AL were referred for clinical diagnostic testing. DNA from the referred patients was obtained from cultured skin fibroblasts or peripheral blood and was utilized for preparing libraries with the SureSelectXT Enrichment System. Libraries were sequenced on an Illumina MiSeq instrument and the NGS data were analyzed with a custom bioinformatic pipeline, targeting a panel of 76 genes associated with IBMFS and/or familial MDS/AL. Results: Pathogenic and highly likely pathogenic variants were identified in 7 out of 36 patients analyzed, providing a positive molecular diagnostic rate of 20%. Overall, 6 out of the 7 pathogenic changes identified were novel.
In 2 unrelated patients with MDS, heterozygous pathogenic sequence changes were identified in the GATA2 gene. Heterozygous pathogenic changes in the following autosomal dominant genes were each identified in a single patient: RPS26 (Diamond-Blackfan anemia 10), RUNX1 (familial platelet disorder with propensity to myeloid malignancy), TERT (dyskeratosis congenita 4) and TINF2 (dyskeratosis congenita 3). In addition, one novel heterozygous sequence change (c.826+5_826+9del, p.?) in the Fanconi anemia-associated gene FANCA was identified. The RNA analysis demonstrated that this variant causes skipping of exon 9 and results in a premature stop codon in exon 10. Further review of the NGS data provided evidence of an additional large heterozygous multi-exon deletion in FANCA in the same patient. This large deletion was confirmed using array-CGH (comparative genomic hybridization). Conclusions: This study demonstrates the effectiveness of using NGS technology to identify patients with a hereditary predisposition to hematologic malignancies. As many of the genes associated with hereditary predisposition to hematologic malignancies have similar or overlapping clinical presentations, analysis of a diverse panel of genes is an efficient and cost-effective approach to molecular diagnostics for these disorders. Unlike Sanger sequencing, NGS technology also has the potential to identify large exonic deletions and duplications. In addition, the RNA splicing assay has proven to be helpful in clarifying the pathogenicity of variants suspected to affect splicing. This approach will also allow for identification of a molecular defect in patients who may have an atypical presentation of disease. Disclosures No relevant conflicts of interest to declare.

Epidemiological data analysis of viral quasispecies in the next-generation sequencing era

Briefings in Bioinformatics ◽

10.1093/bib/bbaa101 ◽

2020 ◽

Cited By ~ 6

Author(s):

Sergey Knyazev ◽

Lauren Hughes ◽

Pavel Skums ◽

Alexander Zelikovsky

Keyword(s):

Next Generation Sequencing ◽

Large Scale ◽

Phylogenetic Analyses ◽

Epidemiological Surveillance ◽

Viral Population ◽

Surveillance Systems ◽

Next Generation ◽

Effective Prevention ◽

Ngs Data ◽

Generation Sequencing

Abstract The unprecedented coverage offered by next-generation sequencing (NGS) technology has facilitated the assessment of the population complexity of intra-host RNA viral populations at an unprecedented level of detail. Consequently, analysis of NGS datasets could be used to extract and infer crucial epidemiological and biomedical information on the levels of both infected individuals and susceptible populations, thus enabling the development of more effective prevention strategies and antiviral therapeutics. Such information includes drug resistance, infection stage, transmission clusters and structures of transmission networks. However, NGS data require sophisticated analysis dealing with millions of error-prone short reads per patient. Prior to the NGS era, epidemiological and phylogenetic analyses were geared toward Sanger sequencing technology; now, they must be redesigned to handle the large-scale NGS datasets and properly model the evolution of heterogeneous rapidly mutating viral populations. Additionally, dedicated epidemiological surveillance systems require big data analytics to handle millions of reads obtained from thousands of patients for rapid outbreak investigation and management. We survey bioinformatics tools analyzing NGS data for (i) characterization of intra-host viral population complexity including single nucleotide variant and haplotype calling; (ii) downstream epidemiological analysis and inference of drug-resistant mutations, age of infection and linkage between patients; and (iii) data collection and analytics in surveillance systems for fast response and control of outbreaks.

Review of Clinical Next-Generation Sequencing

Archives of Pathology & Laboratory Medicine ◽

10.5858/arpa.2016-0501-ra ◽

2017 ◽

Vol 141 (11) ◽

pp. 1544-1557 ◽

Cited By ~ 87

Author(s):

Sophia Yohe ◽

Bharat Thyagarajan

Keyword(s):

Next Generation Sequencing ◽

Clinical Care ◽

Cost Effective ◽

Next Generation ◽

Inherited Disorders ◽

Academic Center ◽

Next Generation Sequencing Ngs ◽

Ngs Data ◽

Generation Sequencing

Context.— Next-generation sequencing (NGS) is a technology being used by many laboratories to test for inherited disorders and tumor mutations. This technology is new for many practicing pathologists, who may not be familiar with the uses, methodology, and limitations of NGS. Objective.— To familiarize pathologists with several aspects of NGS, including current and expanding uses; methodology including wet bench aspects, bioinformatics, and interpretation; validation and proficiency; limitations; and issues related to the integration of NGS data into patient care. Data Sources.— The review is based on peer-reviewed literature and personal experience using NGS in a clinical setting at a major academic center. Conclusions.— The clinical applications of NGS will increase as the technology, bioinformatics, and resources evolve to address the limitations and improve quality of results. The challenge for clinical laboratories is to ensure testing is clinically relevant, cost-effective, and can be integrated into clinical care.

Practice guidelines for development and validation of software, with particular focus on bioinformatics pipelines for processing NGS data in clinical diagnostic laboratories

10.7287/peerj.preprints.2996 ◽

2017 ◽

Author(s):

Nicola Whiffin ◽

Kim Brugger ◽

Joo Wook Ahn

Keyword(s):

Next Generation Sequencing ◽

Best Practice ◽

Improve Patient Care ◽

Clinical Genetics ◽

Next Generation ◽

Clinical Bioinformatics ◽

Sequencing Technologies ◽

Development And Validation ◽

Ngs Data ◽

Generation Sequencing

Clinical bioinformatics is an emerging field in diagnostic genetics laboratories that are harnessing next-generation sequencing technologies to improve patient care. This document provides guidance for the development and validation of software used for delivery of clinical genetics, focusing on bioinformatics pipelines for next-generation sequencing (NGS) applications. It builds on the literature and guidelines that already exist around software development and validation of clinical genetics diagnostic tests, to detail best practice specific to a clinical bioinformatician’s role.

Practice guidelines for development and validation of software, with particular focus on bioinformatics pipelines for processing NGS data in clinical diagnostic laboratories

10.7287/peerj.preprints.2996v1 ◽

2017 ◽

Cited By ~ 1

Author(s):

Nicola Whiffin ◽

Kim Brugger ◽

Joo Wook Ahn

Keyword(s):

Next Generation Sequencing ◽

Best Practice ◽

Improve Patient Care ◽

Clinical Genetics ◽

Next Generation ◽

Clinical Bioinformatics ◽

Sequencing Technologies ◽

Development And Validation ◽

Ngs Data ◽

Generation Sequencing

Introduction to the analysis of next generation sequencing data and its application to venous thromboembolism

Thrombosis and Haemostasis ◽

10.1160/th15-05-0411 ◽

2015 ◽

Vol 114 (11) ◽

pp. 920-932 ◽

Cited By ~ 5

Author(s):

Joost C. M. Meijers ◽

Saskia Middeldorp ◽

Marisa L. R. Cunha

Keyword(s):

Venous Thromboembolism ◽

Next Generation Sequencing ◽

Clinical Care ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Ngs Data Analysis ◽

Next Generation Sequencing Ngs ◽

Ngs Data ◽

Generation Sequencing

Summary Despite knowledge of various inherited risk factors associated with venous thromboembolism (VTE), no definite cause can be found in about 50% of patients. The application of data-driven searches such as GWAS has not been able to identify genetic variants with implications for clinical care, and unexplained heritability remains. In recent years, the development of several so-called next generation sequencing (NGS) platforms has offered the possibility of generating fast, inexpensive and accurate genomic information. However, so far their application to VTE has been very limited. Here we review basic concepts of NGS data analysis and explore the application of NGS technology to VTE. We provide both computational and biological viewpoints to discuss the potential and challenges of NGS-based studies.

Variant discovery using next-generation sequencing and its future role in pharmacogenetics

Pharmacogenomics ◽

10.2217/pgs-2019-0190 ◽

2020 ◽

Vol 21 (7) ◽

pp. 471-486

Author(s):

Laura E Russell ◽

Ute I Schwarz

Keyword(s):

Next Generation Sequencing ◽

Protein Function ◽

Large Scale ◽

Clinical Care ◽

Next Generation ◽

Clinical Implementation ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Variant Discovery ◽

Generation Sequencing

Next-generation sequencing (NGS) has enabled the discovery of a multitude of novel and mostly rare variants in pharmacogenes that may alter a patient’s therapeutic response to drugs. In addition to single nucleotide variants, structural variation affecting the number of copies of whole genes or parts of genes can be detected. While current guidelines concerning clinical implementation mostly act upon well-documented, common single nucleotide variants to guide dosing or drug selection, in silico and large-scale functional assessment of rare variant effects on protein function are at the forefront of pharmacogenetic research to facilitate their clinical integration. Here, we discuss the role of NGS in variant discovery, paving the way for more comprehensive genotype-guided pharmacotherapy that can translate to improved clinical care.

Integrative Analysis of Next-Generation Sequencing for Next-Generation Cancer Research toward Artificial Intelligence

Cancers ◽

10.3390/cancers13133148 ◽

2021 ◽

Vol 13 (13) ◽

pp. 3148

Author(s):

Youngjun Park ◽

Dominik Heider ◽

Anne-Christin Hauschild

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Cancer Research ◽

Data Analysis ◽

Next Generation Sequencing ◽

Systems Biology ◽

Large Scale ◽

Next Generation ◽

Ngs Data ◽

Generation Sequencing

The rapid improvement of next-generation sequencing (NGS) technologies and their application in large-scale cohorts in cancer research led to common challenges of big data. It opened a new research area incorporating systems biology and machine learning. As large-scale NGS data accumulated, sophisticated data analysis methods became indispensable. In addition, NGS data have been integrated with systems biology to build better predictive models to determine the characteristics of tumors and tumor subtypes. Therefore, various machine learning algorithms were introduced to identify underlying biological mechanisms. In this work, we review novel technologies developed for NGS data analysis, and we describe how these computational methodologies integrate systems biology and omics data. Subsequently, we discuss how deep neural networks outperform other approaches, the potential of graph neural networks (GNN) in systems biology, and the limitations in NGS biomedical research. To reflect on the various challenges and corresponding computational solutions, we will discuss the following three topics: (i) molecular characteristics, (ii) tumor heterogeneity, and (iii) drug discovery. We conclude that machine learning and network-based approaches can add valuable insights and build highly accurate models. However, a well-informed choice of learning algorithm and biological network information is crucial for the success of each specific research question.

Next-Generation Sequencing: An Emerging Tool for Drug Designing

Current Pharmaceutical Design ◽

10.2174/1381612825666190911155508 ◽

2019 ◽

Vol 25 (31) ◽

pp. 3350-3357 ◽

Cited By ~ 1

Author(s):

Pooja Tripathi ◽

Jyotsna Singh ◽

Jonathan A. Lal ◽

Vijay Tripathi

Keyword(s):

Next Generation Sequencing ◽

High Throughput ◽

Large Scale ◽

Massively Parallel Sequencing ◽

Genomic Research ◽

Biological Research ◽

Next Generation ◽

Human Welfare ◽

Drug Designing ◽

Generation Sequencing

Background: With the advent of high-throughput next-generation sequencing (NGS), the biological research of drug discovery has been directed towards the oncology and infectious disease therapeutic areas, with extensive use in biopharmaceutical development and vaccine production. Method: In this review, an effort was made to address the basic background of NGS technologies and the potential applications of NGS in drug designing. Our purpose is also to provide a brief introduction to various next-generation sequencing techniques. Discussions: The high-throughput methods execute Large-scale Unbiased Sequencing (LUS), which comprises Massively Parallel Sequencing (MPS) or NGS technologies. These are related terms that describe a DNA sequencing technology which has revolutionized genomic research. Using NGS, an entire human genome can be sequenced within a single day. Conclusion: Analysis of NGS data unravels important clues in the quest for the treatment of various life-threatening diseases and other scientific problems related to human welfare.

NGSremix: A software tool for estimating pairwise relatedness between admixed individuals from next-generation sequencing data

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab174 ◽

2021 ◽

Author(s):

Anne Krogh Nøhr ◽

Kristian Hanghøj ◽

Genis Garcia Erill ◽

Zilong Li ◽

Ida Moltke ◽

...

Keyword(s):

Next Generation Sequencing ◽

Genetic Research ◽

Likelihood Estimation ◽

Software Tool ◽

Estimation Methods ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Ngs Data ◽

Generation Sequencing

Abstract Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if this is present. However, the methods that can account for admixture are all based on genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate, all of which perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub at https://github.com/KHanghoj/NGSremix.
