Human Papillomavirus Detection by Whole-Genome Next-Generation Sequencing: Importance of Validation and Quality Assurance Procedures

Next-generation sequencing (NGS) offers powerful opportunities for studying human papillomavirus (HPV) genomics for applications in epidemiology, public health, and clinical diagnostics. HPV genotypes, variants, and point mutations can be investigated in clinical materials and described in unprecedented detail. However, both the NGS laboratory analysis and the bioinformatical approach require numerous steps and checks to ensure robust interpretation of results. Here, we provide a step-by-step review of recommendations for validation and quality assurance procedures for each step in the typical NGS workflow, with a focus on whole-genome sequencing approaches. The use of directed pilots and protocols to ensure optimization of sequencing data yield, followed by curated bioinformatical procedures, is particularly emphasized. Finally, the storage and sharing of data sets are discussed. The development of international standards for quality assurance should be a goal for the HPV NGS community, similar to what has been developed for other areas of sequencing, including microbiology and molecular pathology. We thus propose that it is time for NGS to be included in the global efforts on quality assurance and improvement of HPV-based testing and diagnostics.


Extraction of Mitochondrial Genome from Whole Genome Next Generation Sequencing Data and Unveiling of Forensically Relevant Markers

Russian Journal of Genetics ◽

10.1134/s1022795420080128 ◽

2020 ◽

Vol 56 (8) ◽

pp. 982-991

Author(s):

S. Rauf ◽

N. Zahra ◽

S. S. Malik ◽

S. A. e Zahra ◽

K. Sughra ◽

...

Keyword(s):

Next Generation Sequencing ◽

Mitochondrial Genome ◽

Next Generation Sequencing Data ◽

Whole Genome ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing


Next-Generation Sequencing–Based Cancer Panel Data Conversion Using International Standards to Implement a Clinical Next-Generation Sequencing Research System: Single-Institution Study

JMIR Medical Informatics ◽

10.2196/14710 ◽

2020 ◽

Vol 8 (4) ◽

pp. e14710 ◽

Cited By ~ 1

Author(s):

Phillip Park ◽

Soo-Yong Shin ◽

Seog Yun Park ◽

Jeonghee Yun ◽

Chulmin Shin ◽

...

Keyword(s):

Clinical Practice ◽

Next Generation Sequencing ◽

Clinical Data ◽

Genomic Data ◽

International Standards ◽

Research System ◽

Next Generation ◽

Sequencing Data ◽

Clinical Sequencing ◽

Generation Sequencing

Background: The analytical capacity and speed of next-generation sequencing (NGS) technology have improved. Many genetic variants associated with various diseases have been discovered using NGS. Applying NGS to clinical practice therefore enables precision or personalized medicine. However, because clinical sequencing reports in electronic health records (EHRs) are not structured according to recommended standards, clinical decision support systems have not been fully utilized. In addition, integrating genomic data with clinical data for translational research remains a great challenge.

Objective: To apply international standards to clinical sequencing reports and to develop a clinical research information system that integrates standardized genomic data with clinical data.

Methods: We applied the recently published ISO/TS 20428 standard to 367 clinical sequencing reports generated by panel (91 genes) sequencing in EHRs and implemented a clinical NGS research system by extending the clinical data warehouse to integrate the necessary clinical data for each patient. We also developed a user interface with a clinical research portal and an NGS result viewer.

Results: A single clinical sequencing report with 28 items was restructured into four database tables and 49 entities. As a result, the clinical sequencing data of 367 patients were connected with clinical data in EHRs, such as diagnosis, surgery, and death information. This system can also support the development of cohort or case-control datasets.

Conclusions: The standardized clinical sequencing data are useful not only for clinical practice but could also be applied to translational research.


Replicate whole-genome next-generation sequencing data derived from Caucasian donor saliva samples

Data in Brief ◽

10.1016/j.dib.2021.107349 ◽

2021 ◽

pp. 107349

Author(s):

Marcus Høy Hansen ◽

Charlotte Guldborg Nyvold

Keyword(s):

Next Generation Sequencing ◽

Next Generation Sequencing Data ◽

Whole Genome ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing


Whole-genome sequencing data of Kazakh individuals

BMC Research Notes ◽

10.1186/s13104-021-05464-4 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Ulykbek Kairov ◽

Askhat Molkenov ◽

Saule Rakhimova ◽

Ulan Kozhamkulov ◽

Aigul Sharip ◽

...

Keyword(s):

Next Generation Sequencing ◽

Whole Genome Sequence ◽

Whole Genome ◽

Next Generation ◽

Sequencing Data ◽

Sequencing Platform ◽

Next Generation Sequencing Platform ◽

Central Asian ◽

Whole Genomes ◽

Generation Sequencing

Objectives: Kazakhstan is a Central Asian crossroads of European and Asian populations situated along the Great Silk Way. The territory of Kazakhstan has historically been inhabited by nomadic tribes and today is a multi-ethnic country in which the Kazakh ethnic group is dominant. We sequenced and analyzed the whole genomes of five healthy ethnic Kazakh individuals at high coverage using a next-generation sequencing platform. These whole-genome sequence data can be a valuable reference for biomedical studies investigating disease associations and for population-wide genomic studies of the ethnically diverse Central Asian region.

Data description: Blood samples were collected from five healthy ethnic Kazakh individuals living in Kazakhstan. Genomic DNA was extracted from blood and sequenced on the Illumina HiSeq2000 next-generation sequencing platform. Coverage ranged from 26 to 32X. Between 98.85% and 99.58% of base pairs were mapped and aligned to the human reference genome GRCh37 (hg19). Het/Hom and Ts/Tv ratios for each whole genome ranged from 1.35 to 1.49 and from 2.07 to 2.08, respectively. Sequencing data are available in the National Center for Biotechnology Information SRA database under the accession number PRJNA374772.
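The Het/Hom and Ts/Tv ratios quoted above are standard per-genome quality metrics. The following is a minimal sketch of how such ratios are commonly computed, not the authors' pipeline: the function names, the input shapes (pairs of reference and alternate bases, and VCF-style genotype strings), and the toy data are all assumptions made for illustration.

```python
# Illustrative sketch only; names and input formats are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """Return True for purine<->purine or pyrimidine<->pyrimidine changes.

    Assumes ref and alt are different single bases.
    """
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snvs):
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    snvs = list(snvs)
    ts = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ts
    return ts / tv


def het_hom_ratio(genotypes):
    """Ratio of heterozygous to homozygous non-reference genotype calls.

    genotypes: VCF-style strings such as '0/1', '1|1' or './.'.
    """
    het = hom = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles or all(a == "0" for a in alleles):
            continue  # skip missing calls and homozygous-reference sites
        if len(set(alleles)) > 1:
            het += 1
        else:
            hom += 1
    return het / hom


# Toy data: two transitions and one transversion; three het calls, one hom-alt.
toy_snvs = [("A", "G"), ("C", "T"), ("A", "C")]
toy_genotypes = ["0/1", "1/1", "0|1", "0/0", "./.", "1/2"]
print(ts_tv_ratio(toy_snvs), het_hom_ratio(toy_genotypes))
```

Both functions divide by a count that can be zero on very small inputs; a production implementation would need to guard against that.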


Next-Generation Sequencing–Based Cancer Panel Data Conversion Using International Standards to Implement a Clinical Next-Generation Sequencing Research System: Single-Institution Study (Preprint)

10.2196/preprints.14710 ◽

2019 ◽

Author(s):

Phillip Park ◽

Soo-Yong Shin ◽

Seog Yun Park ◽

Jeonghee Yun ◽

Chulmin Shin ◽

...

Keyword(s):

Clinical Practice ◽

Next Generation Sequencing ◽

Clinical Data ◽

Genomic Data ◽

International Standards ◽

Research System ◽

Next Generation ◽

Sequencing Data ◽

Clinical Sequencing ◽

Generation Sequencing



Whole‐Genome Sequencing Analysis Using Next‐Generation Sequencing Data

Current Protocols Essential Laboratory Techniques ◽

10.1002/cpet.2 ◽

2016 ◽

Vol 12 (1) ◽

Author(s):

Chi Kent Ho ◽

Xiaohui Cui ◽

Sharon Grubner ◽

Christopher A. Larson ◽

Ying Wei ◽

...

Keyword(s):

Next Generation Sequencing ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Next Generation Sequencing Data ◽

Whole Genome ◽

Sequencing Analysis ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing


A novel algorithm comprehensively characterizes human RH genes using whole-genome sequencing data

Blood Advances ◽

10.1182/bloodadvances.2020002148 ◽

2020 ◽

Vol 4 (18) ◽

pp. 4347-4357

Author(s):

Ti-Cheng Chang ◽

Kelly M. Haupfear ◽

Jing Yu ◽

Evadnie Rampersaud ◽

Vivien A. Sheehan ◽

...

Keyword(s):

Next Generation Sequencing ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Next Generation Sequencing Data ◽

Blood Group Antigens ◽

Whole Genome ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing ◽

Novel Algorithm

RHD and RHCE genes encode Rh blood group antigens and exhibit extensive single-nucleotide polymorphisms and chromosome structural changes in patients with sickle cell disease (SCD). RH variation can drive loss of antigen epitopes or expression of new epitopes, predisposing patients with SCD to Rh alloimmunization. Serologic antigen typing is limited to common Rh antigens, necessitating a genetic approach to detect variant antigen expression. We developed a novel algorithm termed RHtyper for RH genotyping from existing whole-genome sequencing (WGS) data. RHtyper determined RH genotypes in an average of 3.4 and 3.3 minutes per sample for RHD and RHCE, respectively. In a validation cohort consisting of 57 patients with SCD, RHtyper achieved 100% accuracy for RHD and 98.2% accuracy for RHCE, when compared with genotypes obtained by RH BeadChip and targeted molecular assays and after verification by Sanger sequencing and independent next-generation sequencing assays. RHtyper was next applied to WGS data from an additional 827 patients with SCD. In the total cohort of 884 patients, RHtyper identified 38 RHD and 28 RHCE distinct alleles, including a novel RHD DAU allele, RHD*602G, 733C, 744T, 1136T. RHtyper provides comprehensive and high-throughput RH genotyping from WGS data, facilitating deconvolution of the extensive RH genetic variation among patients with SCD. We have implemented RHtyper as a cloud-based public access application in DNAnexus (https://platform.dnanexus.com/app/RHtyper), enabling clinicians and researchers to perform RH genotyping with next-generation sequencing data.
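The cohort sizes and accuracy figures in this abstract can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable and function names are hypothetical, and the count of 56 concordant RHCE genotypes is an inference (it is the only count out of 57 that rounds to 98.2%), not a number stated in the abstract.

```python
# Cohort sizes as reported in the abstract.
VALIDATION_COHORT = 57
ADDITIONAL_COHORT = 827
total_cohort = VALIDATION_COHORT + ADDITIONAL_COHORT


def accuracy_percent(concordant, n):
    """Percentage of concordant genotypes, rounded to one decimal place."""
    return round(100 * concordant / n, 1)


# Assumption: 56 of 57 concordant RHCE calls, inferred from the reported 98.2%.
print(total_cohort)                                  # total patients
print(accuracy_percent(57, VALIDATION_COHORT))       # RHD
print(accuracy_percent(56, VALIDATION_COHORT))       # RHCE (inferred count)
```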


Computel: Computation of Mean Telomere Length from Whole-Genome Next-Generation Sequencing Data

PLoS ONE ◽

10.1371/journal.pone.0125201 ◽

2015 ◽

Vol 10 (4) ◽

pp. e0125201 ◽

Cited By ~ 24

Author(s):

Lilit Nersisyan ◽

Arsen Arakelyan

Keyword(s):

Next Generation Sequencing ◽

Telomere Length ◽

Next Generation Sequencing Data ◽

Whole Genome ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing


Genetic Diagnosis of Charcot-Marie-Tooth Disease in a Population by Next-Generation Sequencing

BioMed Research International ◽

10.1155/2014/210401 ◽

2014 ◽

Vol 2014 ◽

pp. 1-13 ◽

Cited By ~ 30

Author(s):

Helle Høyer ◽

Geir J. Braathen ◽

Øyvind L. Busk ◽

Øystein L. Holla ◽

Marit Svendsen ◽

...

Keyword(s):

Next Generation Sequencing ◽

Genetic Diagnosis ◽

Point Mutations ◽

Gene Mutations ◽

Population Based ◽

Clinical Diagnostics ◽

Next Generation ◽

Charcot Marie Tooth ◽

Inherited Neuropathy ◽

Generation Sequencing

Charcot-Marie-Tooth (CMT) disease is the most prevalent inherited neuropathy. Today more than 40 CMT genes have been identified. Diagnosing heterogeneous diseases by conventional Sanger sequencing is time consuming and expensive. Thus, more efficient and less costly methods are needed in clinical diagnostics. We included a population based sample of 81 CMT families. Gene mutations had previously been identified in 22 families; the remaining 59 families were analysed by next-generation sequencing. Thirty-two CMT genes and 19 genes causing other inherited neuropathies were included in a custom panel. Variants were classified into five pathogenicity classes by genotype-phenotype correlations and bioinformatics tools. Gene mutations, classified certainly or likely pathogenic, were identified in 37 (46%) of the 81 families. Point mutations in known CMT genes were identified in 21 families (26%), whereas four families (5%) had point mutations in other neuropathy genes, ARHGEF10, POLG, SETX, and SOD1. Eleven families (14%) carried the PMP22 duplication and one family carried a MPZ duplication (1%). Most mutations were identified not only in known CMT genes but also in other neuropathy genes, emphasising that genetic analysis should not be restricted to CMT genes only. Next-generation sequencing is a cost-effective tool in diagnosis of CMT improving diagnostic precision and time efficiency.
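The family counts reported above add up consistently, which a short calculation makes explicit. The sketch below uses only figures from the abstract; the dictionary keys and the `share` helper are hypothetical names chosen for illustration.

```python
TOTAL_FAMILIES = 81

# Families with a certainly or likely pathogenic finding, by mutation type.
findings = {
    "point mutation, known CMT gene": 21,
    "point mutation, other neuropathy gene": 4,
    "PMP22 duplication": 11,
    "MPZ duplication": 1,
}


def share(count, total=TOTAL_FAMILIES):
    """Percentage of all families, rounded to the nearest whole percent."""
    return round(100 * count / total)


solved = sum(findings.values())
print(f"solved families: {solved} ({share(solved)}%)")
for label, count in findings.items():
    print(f"{label}: {count} ({share(count)}%)")
```

The four categories sum to the 37 solved families, and each rounded share matches the percentage quoted in the abstract.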


NPSV: A simulation-driven approach to genotyping structural variants in whole-genome sequencing data

GigaScience ◽

10.1093/gigascience/giab046 ◽

2021 ◽

Vol 10 (7) ◽

Author(s):

Michael D Linderman ◽

Crystal Paudyal ◽

Musab Shakeel ◽

William Kelley ◽

Ali Bashir ◽

...

Keyword(s):

Next Generation Sequencing ◽

De Novo ◽

Training Data ◽

Next Generation Sequencing Data ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Next Generation ◽

Structural Variants ◽

Sequencing Data ◽

Generation Sequencing

Background: Structural variants (SVs) play a causal role in numerous diseases but are difficult to detect and accurately genotype (determine zygosity) in whole-genome next-generation sequencing data. SV genotypers that assume that the aligned sequencing data uniformly reflect the underlying SV or use existing SV call sets as training data can only partially account for variant and sample-specific biases.

Results: We introduce NPSV, a machine learning–based approach for genotyping previously discovered SVs that uses next-generation sequencing simulation to model the combined effects of the genomic region, sequencer, and alignment pipeline on the observed SV evidence. We evaluate NPSV alongside existing SV genotypers on multiple benchmark call sets. We show that NPSV consistently achieves or exceeds state-of-the-art genotyping accuracy across SV call sets, samples, and variant types. NPSV can specifically identify putative de novo SVs in a trio context and is robust to offset SV breakpoints.

Conclusions: Growing SV databases and the increasing availability of SV calls from long-read sequencing make stand-alone genotyping of previously identified SVs an increasingly important component of genome analyses. By treating potential biases as a “black box” that can be simulated, NPSV provides a framework for accurately genotyping a broad range of SVs in both targeted and genome-scale applications.
