Improved structural variant interpretation for hereditary cancer susceptibility using long-read sequencing

Purpose: Structural variants (SVs) may be an underestimated cause of hereditary cancer syndromes given the current limitations of short-read next-generation sequencing. Here we investigated the utility of long-read sequencing in resolving germline SVs in cancer susceptibility genes detected through short-read genome sequencing. Methods: Known or suspected deleterious germline SVs were identified using Illumina genome sequencing across a cohort of 669 advanced cancer patients with paired tumor genome and transcriptome sequencing. Candidate SVs were subsequently assessed by Oxford Nanopore long-read sequencing. Results: Nanopore sequencing confirmed eight simple pathogenic or likely pathogenic SVs, resolving three additional variants whose impact could not be fully elucidated through short-read sequencing. A recurrent sequencing artifact on chromosome 16p13 and one complex rearrangement on chromosome 5q35 were subsequently classified as likely benign, obviating the need for further clinical assessment. Variant configuration was further resolved in one case with a complex pathogenic rearrangement affecting TSC2. Conclusion: Our findings demonstrate that long-read sequencing can improve the validation, resolution, and classification of germline SVs. This has important implications for return of results, cascade carrier testing, cancer screening, and prophylactic interventions.

Whole-Genome Sequencing of a Human Clinical Isolate of the Novel Species Klebsiella quasivariicola sp. nov

Genome Announcements ◽

10.1128/genomea.01057-17 ◽

2017 ◽

Vol 5 (42) ◽

Cited By ~ 15

Author(s):

S. Wesley Long ◽

Sarah E. Linson ◽

Matthew Ojeda Saavedra ◽

Concepcion Cantu ◽

James J. Davis ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Clinical Isolate ◽

Genome Sequencing ◽

Novel Species ◽

Whole Genome ◽

The Novel ◽

Short Read ◽

Oxford Nanopore ◽

Long Read

ABSTRACT In a study of 1,777 Klebsiella strains, we discovered KPN1705, which was distinct from all recognized Klebsiella spp. We closed the genome of strain KPN1705 using a hybrid of Illumina short-read and Oxford Nanopore long-read technologies. For this novel species, we propose the name Klebsiella quasivariicola sp. nov.

Primer on Hereditary Cancer Predisposition Genes Included Within Somatic Next-Generation Sequencing Panels

JCO Precision Oncology ◽

10.1200/po.18.00258 ◽

2019 ◽

pp. 1-11

Author(s):

Zade Akras ◽

Brandon Bungo ◽

Brandie H. Leach ◽

Jessica Marquard ◽

Manmeet Ahluwalia ◽

...

Keyword(s):

Hereditary Cancer ◽

Cancer Susceptibility ◽

Cancer Predisposition ◽

Comprehensive Overview ◽

Germline Variants ◽

Cancer Susceptibility Genes ◽

Germline Genes ◽

Number Of Genes ◽

Predisposition Genes ◽

Hereditary Cancer Predisposition

PURPOSE It has been estimated that 5% to 10% of cancers are due to hereditary causes. Recent data sets indicate that the incidence of hereditary cancer may be as high as 17.5% in patients with cancer, and a notable subset is missed if screening is solely by family history and current syndrome-based testing guidelines. Identification of germline variants has implications for both patients and their families. There is currently no comprehensive overview of cancer susceptibility genes or inclusion of these genes in commercially available somatic testing. We aimed to summarize genes linked to hereditary cancer and the somatic and germline panels that include such genes. METHODS Germline predisposition genes were chosen if commercially available for testing. Penetrance was defined as low, moderate, or high according to whether the gene conferred a 0% to 20%, 20% to 50%, or 50% to 100% lifetime risk of developing the cancer or, when percentages were not available, was estimated on the basis of existing literature descriptions. RESULTS We identified a total of 89 genes linked to hereditary cancer predisposition, and we summarized these genes alphabetically and by organ system. We considered four germline and six somatic commercially available panel tests and quantified the coverage of germline genes across them. Comparison between the number of genes that had germline importance and the number of genes included in somatic testing showed that many but not all germline genes are tested by frequently used somatic panels. CONCLUSION The inclusion of cancer-predisposing genes in somatic variant testing panels makes incidental germline findings likely. Although somatic testing can be used to screen for germline variants, this strategy is inadequate for comprehensive screening. Access to genetic counseling is essential for interpretation of germline implications of somatic testing and implementation of appropriate screening and follow-up.
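The penetrance tiers described in the METHODS above can be sketched as a small lookup. This is a minimal illustration, not the authors' procedure: the function name `classify_penetrance` is hypothetical, and because the stated ranges share their endpoints (20% and 50%), the sketch assumes each boundary value belongs to the higher tier.

```python
def classify_penetrance(lifetime_risk_percent: float) -> str:
    """Map a lifetime cancer risk (in percent) to a penetrance tier.

    Tiers follow the abstract: 0-20% low, 20-50% moderate, 50-100% high.
    Assumption (not stated in the source): a value exactly on a boundary
    is assigned to the higher tier.
    """
    if not 0 <= lifetime_risk_percent <= 100:
        raise ValueError("lifetime risk must be between 0 and 100 percent")
    if lifetime_risk_percent < 20:
        return "low"
    if lifetime_risk_percent < 50:
        return "moderate"
    return "high"


# Hypothetical risk values, chosen only to exercise each tier.
for risk in (10, 20, 35, 50, 80):
    print(risk, classify_penetrance(risk))
```

Genes without a published percentage were, per the abstract, tiered from literature descriptions instead, which a numeric rule like this cannot capture.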

Cancer Risk Assessment and Screening; A Challenge for Clinical Pathology Service?

INDONESIAN JOURNAL OF CLINICAL PATHOLOGY AND MEDICAL LABORATORY ◽

10.24293/ijcpml.v27i1.1660 ◽

2020 ◽

Vol 27 (1) ◽

Author(s):

Siti Boedina Kresno

Keyword(s):

Risk Factors ◽

Cancer Risk ◽

Hereditary Cancer ◽

Cancer Susceptibility ◽

Intrinsic Factor ◽

Cancer Risk Assessment ◽

Prophylactic Surgery ◽

Cancer Etiology ◽

Cancer Burden ◽

Cancer Susceptibility Genes

There is evidence demonstrating that cancer etiology is multi-factorial and that modification of risk factors has achieved cancer prevention. There is therefore a need to advance the understanding of cancer etiology through interaction effects between risk factors when estimating the contribution of an individual to the cancer burden in a population. It is known that cancer may arise from genetic susceptibility to the disease as an intrinsic factor; however, non-intrinsic factors drive most cancer risk as well and highlight the need for cancer prevention. Are our clinical pathologists aware of these facts? Are they ready to understand and to provide an excellent test with good expertise? Hereditary cancer testing is typically performed using gene panels, which may be either cancer-specific or pan-cancer to assess risk for a defined or broader range of cancers, respectively. Given the clinical implications of hereditary cancer testing, diagnostic laboratories must develop high-quality panel tests, which serve a broad, genetically diverse patient population. The result will determine a patient's eligibility for targeted therapy, for instance, or lead a patient to prophylactic surgery, chemoprevention, and surveillance. This review will introduce the definitions of intrinsic and non-intrinsic risk factors, which have been employed in recent work, and how evidence for their effects on the cancer burden in human subjects has been obtained. Genetic testing of cancer susceptibility genes by use of liquid biopsies and Next-Generation Sequencing (NGS) is now widely applied in clinical practice to predict the risk of developing cancer, aid diagnosis, and monitor treatment.

Constructing a Reference Genome in a Single Lab: The Possibility to Use Oxford Nanopore Technology

Plants ◽

10.3390/plants8080270 ◽

2019 ◽

Vol 8 (8) ◽

pp. 270 ◽

Cited By ~ 4

Author(s):

Yun Gyeong Lee ◽

Sang Chul Choi ◽

Yuna Kang ◽

Kyeong Min Kim ◽

Chon-Sik Kang ◽

...

Keyword(s):

Plant Species ◽

Genome Sequencing ◽

Reference Genome ◽

Genome Structure ◽

Plant Genome ◽

Sequence Information ◽

Sequencing Analysis ◽

Oxford Nanopore ◽

A Genome ◽

Long Read

Whole-genome sequencing (WGS) has become a crucial tool in understanding genome structure and genetic variation. MinION sequencing from Oxford Nanopore Technologies (ONT) is an excellent approach for performing WGS and has advantages over other Next-Generation Sequencing (NGS) methods: it is relatively inexpensive and portable, has simple library preparation, can be monitored in real time, and has no theoretical limit on read length. Sorghum bicolor (L.) Moench is diploid (2n = 2x = 20) with a genome size of about 730 Mb, and its genome sequence information is released in the Phytozome database. Therefore, sorghum can be used as a good reference. However, plant species have complex and large genomes when compared to animals or microorganisms. As a result, complete genome sequencing is difficult for plant species. MinION sequencing, which produces long reads, can be an excellent tool for overcoming the weak assembly of short reads generated from NGS by minimizing the generation of gaps or covering the repetitive sequences that appear in plant genomes. Here, we conducted genome sequencing for S. bicolor cv. BTx623 using the MinION platform and obtained 895,678 reads and 17.9 gigabases (Gb) (ca. 25× coverage of the reference) of long-read sequence data. De novo assembly with two different tools produced 6124 contigs (covering 45.9%) from Canu and 2661 contigs (covering 50%) from Minimap and Miniasm with Racon; the assembled contigs were then mapped against the sorghum reference genome. Our results provide an optimal series of long-read sequencing analyses for plant species using the MinION platform and a clue for determining the total sequencing scale needed for optimal coverage based on various genome sizes.
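The "ca. 25×" figure above follows from dividing sequenced bases by genome size. The sketch below reproduces that arithmetic, assuming the 17.9 Gb figure counts sequenced bases and using the approximate 730 Mb genome size quoted in the abstract; the function names are illustrative and not taken from the paper.

```python
def fold_coverage(sequenced_bases: float, genome_size: float) -> float:
    """Average depth of coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_bases / genome_size


def bases_needed(genome_size: float, target_coverage: float) -> float:
    """Sequencing yield required to reach a target average coverage."""
    return genome_size * target_coverage


SEQUENCED_BASES = 17.9e9   # 17.9 Gb of long-read data (from the abstract)
SORGHUM_GENOME = 730e6     # about 730 Mb (from the abstract)

coverage = fold_coverage(SEQUENCED_BASES, SORGHUM_GENOME)
print(f"{coverage:.1f}x")  # 24.5x, which the abstract rounds to ca. 25x

# Yield needed for the same depth on a hypothetical 2 Gb genome.
print(f"{bases_needed(2e9, 25) / 1e9:.0f} Gb")
```

The same relation gives a rough way to scale a sequencing run to a different genome size, which is the kind of estimate the abstract's closing sentence alludes to.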

Complete Genome Sequence of Rubrobacter xylanophilus Strain AA3-22, Isolated from Arima Onsen in Japan

Microbiology Resource Announcements ◽

10.1128/mra.00818-19 ◽

2019 ◽

Vol 8 (34) ◽

Cited By ~ 1

Author(s):

Natsuki Tomariguchi ◽

Kentaro Miyazaki

Keyword(s):

Genome Sequence ◽

Complete Genome Sequence ◽

Complete Genome ◽

Hot Spring ◽

Sequencing Data ◽

Short Read ◽

Content Type ◽

Short Read Sequencing ◽

Oxford Nanopore ◽

Long Read

Rubrobacter xylanophilus strain AA3-22, belonging to the phylum Actinobacteria, was isolated from nonvolcanic Arima Onsen (hot spring) in Japan. Here, we report the complete genome sequence of this organism, which was obtained by combining Oxford Nanopore long-read and Illumina short-read sequencing data.

Management of germline findings revealed throughout the course of tumor-normal whole genome sequencing in oncology.

Journal of Clinical Oncology ◽

10.1200/jco.2017.35.15_suppl.e13113 ◽

2017 ◽

Vol 35 (15_suppl) ◽

pp. e13113-e13113

Author(s):

Howard John Lim ◽

Kasmintan A Schrader ◽

Sean Young ◽

Jessica Nelson ◽

Alexandra Fok ◽

...

Keyword(s):

Genetic Counseling ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Cancer Susceptibility ◽

Susceptibility Gene ◽

Susceptibility Genes ◽

Genomic Research ◽

Whole Genome ◽

Pathogenic Variants ◽

Cancer Susceptibility Genes

e13113 Background: The Personalized OncoGenomics (POG) project at the BC Cancer Agency utilizes tumor-normal whole genome sequencing (WGS) to understand key driver pathways and guide personalized treatment decisions. Analysis of the germline data can reveal variants; these may be presumed pathogenic, presumed benign, or of unknown significance (VUS). We have developed a process for evaluating and returning presumed pathogenic variants in known cancer susceptibility genes to patients, for counseling and validation in a clinically accredited laboratory. Methods: Patients receive germline cancer-related information as part of the consent process for participation in the POG program. A sub-committee composed of medical geneticists, bioinformaticians, pathologists, oncologists, and an ethicist reviews the germline results. Any variants suspected of being artifacts undergo a technical validation step. Presumed pathogenic findings in known cancer susceptibility genes are returned to the patient by their treating oncologist, and patients are referred to the Hereditary Cancer Program (HCP) for genetic counseling and clinical confirmation. Results: From June 2012 to January 2017, 466 patients consented to the project. To date, 39 cases (8.4%) had at least one variant that was deemed pathogenic, and 86 cases had at least one VUS in a known cancer susceptibility gene. 11 out of 23 cases (47.8%) with high penetrance mutations were already known to HCP. All VUS were reviewed by the sub-committee, taking into consideration the VUS and clinical context. 8 of the subjects with pathogenic results and 3 with VUS were known to HCP before POG data was generated. A VUS in 7 cases (1.5%) was returned after review. Conclusions: The number of pathogenic variants in known cancer susceptibility genes is consistent with published oncology results. We created a process to manage clinically relevant germline findings discovered during the course of genomic research to ensure appropriate care for patients. Genetic counseling within HCP and validation of variants in the clinically accredited Cancer Genetics Laboratory enable seamless return of research-generated, clinically relevant germline results to affected subjects. Clinical trial information: NCT02155621.
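The percentages quoted in the Results can be rechecked from the reported counts. The short sketch below does only that arithmetic; the helper name `percent` is illustrative, and the choice of denominator for each figure (all 466 consented patients, or the 23 high-penetrance cases) is inferred from the abstract's wording.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * part / whole, 1)


CONSENTED = 466  # patients consented, June 2012 to January 2017

pathogenic_rate = percent(39, CONSENTED)    # cases with a pathogenic variant
known_to_hcp_rate = percent(11, 23)         # high-penetrance cases known to HCP
vus_returned_rate = percent(7, CONSENTED)   # cases with a VUS returned

print(pathogenic_rate, known_to_hcp_rate, vus_returned_rate)
```

All three values agree with the figures stated in the abstract (8.4%, 47.8%, and 1.5%).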

Towards a Comprehensive Variation Benchmark for Challenging Medically-Relevant Autosomal Genes

10.1101/2021.06.07.444885 ◽

2021 ◽

Author(s):

Justin Wagner ◽

Nathan D Olson ◽

Lindsay Harris ◽

Jennifer McDaniel ◽

Haoyu Cheng ◽

...

Keyword(s):

Genome Sequencing ◽

Method Development ◽

Clinical Applications ◽

Accurate Analysis ◽

Short Read ◽

Short Read Sequencing ◽

Repetitive Nature ◽

Long Read ◽

Comprehensive Characterization

The repetitive nature and complexity of multiple medically important genes make them intractable to accurate analysis, despite the maturity of short-read sequencing, resulting in a gap in clinical applications of genome sequencing. The Genome in a Bottle Consortium has provided benchmark variant sets, but these excluded some medically relevant genes due to their repetitiveness or polymorphic complexity. In this study, we characterize 273 of these 395 challenging autosomal genes that have multiple implications for medical sequencing. This extended, curated benchmark reports over 17,000 SNVs, 3,600 INDELs, and 200 SVs each for GRCh37 and GRCh38. We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically important genes including CBS, CRYAA, and KCNE1. Our proposed solution improves variant recall in these genes from 8% to 100%. This benchmark will significantly improve the comprehensive characterization of these medically relevant genes and guide new method development.

Long-read sequencing across the C9orf72 ‘GGGGCC’ repeat expansion: implications for clinical use and genetic discovery efforts in human disease

10.1101/176651 ◽

2018 ◽

Cited By ~ 1

Author(s):

Mark T. W. Ebbert ◽

Stefan Farrugia ◽

Jonathon Sens ◽

Karen Jansen-West ◽

Tania F. Gendron ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Repeat Expansion ◽

Whole Genome ◽

Short Read ◽

Short Read Sequencing ◽

Sequencing Technologies ◽

Long Read ◽

Repeat Expansions ◽

Targeted Approach

Background: Many neurodegenerative diseases are caused by nucleotide repeat expansions, but most expansions, like the C9orf72 ‘GGGGCC’ (G4C2) repeat that causes approximately 5-7% of all amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) cases, are too long to sequence using short-read sequencing technologies. It is unclear whether long-read sequencing technologies can traverse these long, challenging repeat expansions. Here, we demonstrate that two long-read sequencing technologies, Pacific Biosciences’ (PacBio) and Oxford Nanopore Technologies’ (ONT), can sequence through disease-causing repeats cloned into plasmids, including the FTD/ALS-causing G4C2 repeat expansion. We also report the first long-read sequencing data characterizing the C9orf72 G4C2 repeat expansion at the nucleotide level in two symptomatic expansion carriers using PacBio whole-genome sequencing and a no-amplification (No-Amp) targeted approach based on CRISPR/Cas9. Results: Both the PacBio and ONT platforms successfully sequenced through the repeat expansions in plasmids. Throughput on the MinION was a challenge for whole-genome sequencing; we were unable to attain reads covering the human C9orf72 repeat expansion using 15 flow cells. We obtained 8x coverage across the C9orf72 locus using the PacBio Sequel, accurately reporting the unexpanded allele at eight repeats, and reading through the entire expansion with 1324 repeats (7941 nucleotides). Using the No-Amp targeted approach, we attained >800x coverage and were able to identify the unexpanded allele, closely estimate expansion size, and assess nucleotide content in a single experiment. We estimate the individual’s repeat region was >99% G4C2 content, though we cannot rule out small interruptions. Conclusions: Our findings indicate that long-read sequencing is well suited to characterizing known repeat expansions, and for discovering new disease-causing, disease-modifying, or risk-modifying repeat expansions that have gone undetected with conventional short-read sequencing. The PacBio No-Amp targeted approach may have future potential in clinical and genetic counseling environments. Larger and deeper long-read sequencing studies in C9orf72 expansion carriers will be important to determine heterogeneity and whether the repeats are interrupted by non-G4C2 content, potentially mitigating or modifying disease course or age of onset, as interruptions are known to do in other repeat-expansion disorders. These results have broad implications across all diseases where the genetic etiology remains unclear.

Long-read genome sequencing for the diagnosis of neurodevelopmental disorders

10.1101/2020.07.02.185447 ◽

2020 ◽

Author(s):

Susan M. Hiatt ◽

James M.J. Lawlor ◽

Lori H. Handley ◽

Ryne C. Ramaker ◽

Brianne B. Rogers ◽

...

Keyword(s):

Genome Sequencing ◽

Neurodevelopmental Disorders ◽

De Novo ◽

Genomic Analysis ◽

Data Sets ◽

Short Read ◽

Long Read ◽

Variant Detection ◽

Circular Consensus Sequencing

Purpose: Exome and genome sequencing have proven to be effective tools for the diagnosis of neurodevelopmental disorders (NDDs), but large fractions of NDDs cannot be attributed to currently detectable genetic variation. This is likely, at least in part, a result of the fact that many genetic variants are difficult or impossible to detect through typical short-read sequencing approaches. Methods: Here, we describe a genomic analysis using Pacific Biosciences circular consensus sequencing (CCS) reads, which are both long (>10 kb) and accurate (>99% bp accuracy). We used CCS on six proband-parent trios with NDDs that were unexplained despite extensive testing, including genome sequencing with short reads. Results: We identified variants and created de novo assemblies in each trio, with global metrics indicating these data sets are more accurate and comprehensive than those provided by short-read data. In one proband, we identified a likely pathogenic (LP), de novo L1-mediated insertion in CDKL5 that results in duplication of exon 3, leading to a frameshift. In a second proband, we identified multiple large de novo structural variants, including insertion-translocations affecting DGKB and MLLT3, which we show disrupt MLLT3 transcript levels. We consider this extensive structural variation likely pathogenic. Conclusion: The breadth and quality of variant detection, coupled to finding variants of clinical and research interest in two of six probands with unexplained NDDs, strongly support the value of long-read genome sequencing for understanding rare disease.

Genome Sequence Resource of Phytophthora colocasiae from China Using Nanopore Sequencing Technology

Plant Disease ◽

10.1094/pdis-11-20-2327-a ◽

2021 ◽

Author(s):

Zhixin Wang ◽

Jiandong Bao ◽

Lin Lv ◽

Lianyu Lin ◽

Zhiting Li ◽

...

Keyword(s):

Genome Assembly ◽

Protein Coding ◽

Short Read ◽

Rxlr Effectors ◽

Phytophthora Colocasiae ◽

Oxford Nanopore ◽

Long Read ◽

Infection Mechanisms ◽

Oomycete Pathogen ◽

Taro Leaf Blight

Phytophthora colocasiae is a destructive oomycete pathogen of taro (Colocasia esculenta), which causes taro leaf blight. To date, only one highly fragmented Illumina short-read-based genome assembly is available for this species. To address this problem, we sequenced strain Lyd2019 from China using Oxford Nanopore Technologies (ONT) long-read sequencing and Illumina short-read sequencing. We generated a 92.51-Mb genome assembly consisting of 105 contigs with an N50 of 1.70 Mb and a maximum length of 4.17 Mb. In the genome assembly, we identified 52.78% repeats and 18,322 protein-coding genes, of which 12,782 genes were annotated. We also identified 191 candidate RXLR effectors and 1 candidate CRN effector. The updated near-chromosome genome assembly and annotation resources will provide a better understanding of the infection mechanisms of P. colocasiae.
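The N50 quoted above is a standard contiguity statistic: the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` is illustrative and the contig lengths in the example are hypothetical, not those of the Lyd2019 assembly.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of
    the contig at which the running total first reaches at least half
    of the summed length of all contigs.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:  # integer form of running >= total / 2
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths in Mb: total 10, half 5, so 4 + 3 crosses it.
print(n50([4, 3, 2, 1]))  # 3
```

A higher N50 for the same total length means the assembly is concentrated in fewer, longer contigs, which is why it is reported alongside the contig count and maximum contig length.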
