Targeted long-read sequencing resolves complex structural variants and identifies missing disease-causing variants

Mapping Intimacies ◽

10.1101/2020.11.03.365395 ◽

2020 ◽

Author(s):

Danny E. Miller ◽

Arvis Sulovari ◽

Tianyun Wang ◽

Hailey Loucks ◽

Kendra Hoekzema ◽

...

Keyword(s):

Copy Number ◽

Genetic Diagnosis ◽

Clinical Testing ◽

Structural Variants ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Pathogenic Variants ◽

Long Read ◽

Repeat Expansions ◽

Complex Structural

ABSTRACT BACKGROUND Despite widespread availability of clinical genetic testing, many individuals with suspected genetic conditions do not have a precise diagnosis. This limits their opportunity to take advantage of state-of-the-art treatments. In such instances, testing sometimes reveals difficult-to-evaluate complex structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in specific genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted. METHODS Targeted long-read sequencing (T-LRS) was performed on 33 individuals using Read Until on the Oxford Nanopore platform. This method allowed us to computationally target up to 100 Mbp of sequence per experiment, resulting in an average of 20x coverage of target regions, a 500% increase over background. We analyzed patient DNA for pathogenic substitutions, structural variants, and methylation differences using a single data source. RESULTS The effectiveness of T-LRS was validated by detecting all genomic aberrations, including single-nucleotide variants, copy number changes, repeat expansions, and methylation differences, previously identified by prior clinical testing. In 6/7 individuals who had complex structural rearrangements, T-LRS enabled more precise resolution of the mutation, which led, in one case, to a change in clinical management. In nine individuals with suspected Mendelian conditions who lacked a precise genetic diagnosis, T-LRS identified pathogenic or likely pathogenic variants in five and variants of uncertain significance in two others. CONCLUSIONS T-LRS can accurately predict pathogenic copy number variants and triplet repeat expansions, resolve complex rearrangements, and identify single-nucleotide variants not detected by other technologies, including short-read sequencing. T-LRS represents an efficient and cost-effective strategy to evaluate high-priority candidate genes and regions or to further evaluate complex clinical testing results. The application of T-LRS will likely increase the diagnostic rate of rare disorders.

Re-evaluation of single nucleotide variants and identification of structural variants in a cohort of 45 sudden unexplained death cases

International Journal of Legal Medicine ◽

10.1007/s00414-021-02580-5 ◽

2021 ◽

Author(s):

Jacqueline Neubauer ◽

Shouyu Wang ◽

Giancarlo Russo ◽

Cordula Haas

Keyword(s):

Sudden Death ◽

Cardiac Diseases ◽

Structural Variants ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Sudden Unexplained Death ◽

Unexplained Death ◽

Pathogenic Variants ◽

The Impact ◽

Death Cases

Abstract Sudden unexplained death (SUD) accounts for a considerable proportion of sudden death cases, especially in adolescents and young adults. During the past decade, many channelopathy- and cardiomyopathy-associated single nucleotide variants (SNVs) have been identified in SUD studies by means of postmortem molecular autopsy, yet the number of cases that remain inconclusive is still high. Recent studies have suggested that structural variants (SVs) might play an important role in SUD, but there is no consensus on the impact of SVs on inherited cardiac diseases. In this study, we searched for potentially pathogenic SVs in 244 genes associated with cardiac diseases. Whole-exome sequencing and appropriate data analysis were performed in 45 SUD cases. Re-analysis of the exome data according to the current ACMG guidelines identified 14 pathogenic or likely pathogenic variants in 10 (22.2%) of the 45 SUD cases, of whom 2 (4.4%) individuals had variants with likely functional effects in the channelopathy-associated genes SCN5A and TRDN and 1 (2.2%) individual in the cardiomyopathy-associated gene DTNA. In addition, 18 SVs were identified in 15 of the 45 individuals. Two SVs with likely functional impairment were found in the coding regions of PDSS2 and TRPM4 in 2 SUD cases (4.4%). Both were identified as heterozygous deletions, which were confirmed by multiplex ligation-dependent probe amplification. In conclusion, our findings support that SVs could contribute to the pathology of the sudden death event in some of the cases and therefore should be investigated on a routine basis in suspected SUD cases.
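The percentages quoted in this abstract follow directly from the stated counts and the cohort size of 45. The short sketch below recomputes them as a sanity check; the helper name `percent_of_cohort` and the dictionary labels are illustrative assumptions, not anything taken from the study.

```python
# Recompute the cohort percentages quoted in the abstract (45 SUD cases).
# Helper name and labels are hypothetical, chosen only for illustration.
COHORT_SIZE = 45


def percent_of_cohort(count: int, cohort_size: int = COHORT_SIZE) -> float:
    """Return `count` as a percentage of the cohort, rounded to one decimal."""
    return round(100 * count / cohort_size, 1)


# (count, percentage as stated in the abstract)
reported = {
    "cases with pathogenic or likely pathogenic variants": (10, 22.2),
    "cases with variants in SCN5A or TRDN": (2, 4.4),
    "cases with a variant in DTNA": (1, 2.2),
    "cases with SVs in PDSS2 or TRPM4": (2, 4.4),
}

for label, (count, stated) in reported.items():
    computed = percent_of_cohort(count)
    status = "matches" if computed == stated else "differs from"
    print(f"{label}: {count}/{COHORT_SIZE} = {computed}% ({status} stated {stated}%)")
```

Rounding to one decimal place reproduces each stated figure, so the reported proportions are internally consistent with a denominator of 45.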

HGG-41. STRUCTURAL VARIANT DRIVERS IN PEDIATRIC HIGH-GRADE GLIOMA

Neuro-Oncology ◽

10.1093/neuonc/noaa222.322 ◽

2020 ◽

Vol 22 (Supplement_3) ◽

pp. iii351-iii351

Author(s):

Frank Dubois ◽

Ofer Shapira ◽

Noah Greenwald ◽

Travis Zack ◽

Jessica W Tsai ◽

...

Keyword(s):

Copy Number ◽

High Grade Glioma ◽

High Grade ◽

Structural Variants ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Effector Domains ◽

Topologically Associating Domains ◽

Genome Wide ◽

Pediatric High Grade Glioma

Abstract BACKGROUND Driver single nucleotide variants (SNVs) and somatic copy number aberrations (SCNAs) of pediatric high-grade gliomas (pHGGs), including Diffuse Midline Gliomas (DMGs), are characterized. However, structural variants (SVs) in pHGGs and the mechanisms through which they contribute to glioma formation have not been systematically analyzed genome-wide. METHODS Using SvABA for SVs as well as the latest pipelines for SCNAs and SNVs, we analyzed whole-genome sequencing from 174 patients. This includes 60 previously unpublished samples, 43 of which are DMGs. Signature analysis allowed us to define pHGG groups with shared SV characteristics. Significantly recurring SV breakpoints and juxtapositions were identified with algorithms we recently developed, and the findings were correlated with RNAseq and H3K27ac ChIPseq. RESULTS The SV characteristics in pHGG showed three groups defined by either complex, intermediate or simple signature activities. These associated with distinct combinations of known driver oncogenes. Our statistical analysis revealed recurring SVs in the topologically associating domains of MYCN, MYC, EGFR, PDGFRA & MET. These correlated with increased mRNA expression and amplification of H3K27ac peaks. Complex recurring amplifications showed characteristics of extrachromosomal amplicons and were enriched in coding SVs splitting protein regulatory from effector domains. Integrative analysis of all SCNAs, SNVs & SVs revealed patterns of characteristic combinations between potential drivers and signatures. This included two distinct groups of H3K27M DMGs with either complex or simple signatures and different combinations of associated variants. CONCLUSION Recurrent SVs associate with signatures shaped by an underlying process, which can lead to distinct mechanisms to activate the same oncogene.

Germline Sequencing Improves Tumor-Only Sequencing Interpretation in a Precision Genomic Study of Patients With Pediatric Solid Tumor

JCO Precision Oncology ◽

10.1200/po.21.00281 ◽

2021 ◽

pp. 1840-1852

Author(s):

Jaclyn Schienda ◽

Alanna J. Church ◽

Laura B. Corson ◽

Brennan Decker ◽

Catherine M. Clinton ◽

...

Keyword(s):

Copy Number ◽

Integrated Analysis ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Clinical Interpretation ◽

Germline Variants ◽

Panel Testing ◽

Pathogenic Variants ◽

Genomic Study

PURPOSE Molecular tumor profiling is becoming a routine part of clinical cancer care, typically involving tumor-only panel testing without matched germline. We hypothesized that integrated germline sequencing could improve clinical interpretation and enhance the identification of germline variants with significant hereditary risks. MATERIALS AND METHODS Tumors from pediatric patients with high-risk, extracranial solid malignancies were sequenced with a targeted panel of cancer-associated genes. Later, germline DNA was analyzed for a subset of these genes. We performed a post hoc analysis to identify how an integrated analysis of tumor and germline data would improve clinical interpretation. RESULTS One hundred sixty participants with both tumor-only and germline sequencing reports were eligible for this analysis. Germline sequencing identified 38 pathogenic or likely pathogenic variants among 35 (22%) patients. Twenty-five (66%) of these were included in the tumor sequencing report. The remaining germline pathogenic or likely pathogenic variants were single-nucleotide variants filtered out of tumor-only analysis because of population frequency or copy-number variation masked by additional copy-number changes in the tumor. In tumor-only sequencing, 308 of 434 (71%) single-nucleotide variants reported were present in the germline, including 31% with suggested clinical utility. Finally, we provide further evidence that the variant allele fraction from tumor-only sequencing is insufficient to differentiate somatic from germline events. CONCLUSION A paired approach to analyzing tumor and germline sequencing data would be expected to improve the efficiency and accuracy of distinguishing somatic mutations and germline variants, thereby facilitating the process of variant curation and therapeutic interpretation for somatic reports, as well as the identification of variants associated with germline cancer predisposition.

Genome sequencing identifies rare tandem repeat expansions and copy number variants in Lennox–Gastaut syndrome

Brain Communications ◽

10.1093/braincomms/fcab207 ◽

2021 ◽

Vol 3 (3) ◽

Author(s):

Farah Qaiser ◽

Tara Sadoway ◽

Yue Yin ◽

Quratulain Zulfiqar Ali ◽

Charlotte M Nguyen ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Tandem Repeat ◽

Copy Number ◽

Copy Number Variants ◽

Spinocerebellar Ataxia Type ◽

Whole Genome ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Repeat Expansions

Abstract Epilepsies are a group of common neurological disorders with a substantial genetic basis. Despite this, the molecular diagnosis of epilepsies remains challenging due to its heterogeneity. Studies utilizing whole-genome sequencing may provide additional insights into genetic causes of epilepsies of unknown aetiology. Whole-genome sequencing was used to evaluate a cohort of adults with unexplained developmental and epileptic encephalopathies (n = 30), for whom prior genetic tests, including whole-exome sequencing in some cases, were negative or inconclusive. Rare single nucleotide variants, insertions/deletions, copy number variants and tandem repeat expansions were analysed. Seven pathogenic or likely pathogenic single nucleotide variants, and two pathogenic deleterious copy number variants were identified in nine patients (32.1% of the cohort). One of the copy number variants, identified in a patient with Lennox–Gastaut syndrome, was too small to be detected by chromosomal microarray techniques. We also identified two tandem repeat expansions with clinical implications in two other patients with Lennox–Gastaut syndrome: a CGG repeat expansion in the 5′untranslated region of DIP2B, and a CTG expansion in ATXN8OS (previously implicated in spinocerebellar ataxia type 8). Three patients had KCNA2 pathogenic variants. One of them died of sudden unexpected death in epilepsy. The other two patients had, in addition to a KCNA2 variant, a second de novo variant impacting potential epilepsy-relevant genes (KCNIP4 and UBR5). Overall, whole-genome sequencing provided a genetic explanation in 32.1% of the total cohort. This is also the first report of coding and non-coding tandem repeat expansions identified in patients with Lennox–Gastaut syndrome. This study demonstrates that using whole-genome sequencing, the examination of multiple types of rare genetic variation, including those found in the non-coding region of the genome, can help resolve unexplained epilepsies.

VISOR: a versatile haplotype-aware structural variant simulator for short- and long-read sequencing

Bioinformatics ◽

10.1093/bioinformatics/btz719 ◽

2019 ◽

Author(s):

Davide Bolognini ◽

Ashley Sanders ◽

Jan O Korbel ◽

Alberto Magi ◽

Vladimir Benes ◽

...

Keyword(s):

Single Cell ◽

Supplementary Information ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Cancer Heterogeneity ◽

Long Reads ◽

Long Read ◽

Complex Structural ◽

Error Profiles

Abstract Summary VISOR is a tool for haplotype-specific simulations of simple and complex structural variants (SVs). The method is applicable to haploid, diploid or higher ploidy simulations for bulk or single-cell sequencing data. SVs are implanted into FASTA haplotypes at single-basepair resolution, optionally with nearby single-nucleotide variants. Short or long reads are drawn at random from these haplotypes using standard error profiles. Double- or single-stranded data can be simulated and VISOR supports the generation of haplotype-tagged BAM files. The tool further includes methods to interactively visualize simulated variants in single-stranded data. The versatility of VISOR is unmet by comparable tools and it lays the foundation to simulate haplotype-resolved cancer heterogeneity data in bulk or at single-cell resolution. Availability and implementation VISOR is implemented in python 3.6, open-source and freely available at https://github.com/davidebolo1993/VISOR. Documentation is available at https://davidebolo1993.github.io/visordoc/. Supplementary information Supplementary data are available at Bioinformatics online.

Combination of Genome-Wide Polymorphisms and Copy Number Variations of Pharmacogenes in Koreans

Journal of Personalized Medicine ◽

10.3390/jpm11010033 ◽

2021 ◽

Vol 11 (1) ◽

pp. 33

Author(s):

Nayoung Han ◽

Jung Mi Oh ◽

In-Wha Kim

Keyword(s):

Copy Number ◽

Genome Wide Association Study ◽

Copy Number Gain ◽

Copy Number Variations ◽

Gene Gain ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Haplotype Blocks ◽

Genome Wide ◽

Control And Prevention

For predicting phenotypes and executing precision medicine, combined analysis of single nucleotide variant (SNV) genotyping with copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean genome and epidemiology study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.

Unsuspected somatic mosaicism for FBN1 gene contributes to Marfan syndrome

Genetics in Medicine ◽

10.1038/s41436-020-01078-6 ◽

2021 ◽

Author(s):

Pauline Arnaud ◽

Hélène Morel ◽

Olivier Milleron ◽

Laurent Gouya ◽

Christine Francannet ◽

...

Keyword(s):

Marfan Syndrome ◽

Somatic Mosaicism ◽

Variant Calling ◽

Copy Number Variations ◽

Pathogenic Variant ◽

Single Nucleotide Variants ◽

Bioinformatics Analyses ◽

Single Nucleotide ◽

Fbn1 Gene ◽

Pathogenic Variants

Abstract Purpose Individuals with mosaic pathogenic variants in the FBN1 gene are mainly described in the course of familial screening. In the literature, almost all these mosaic individuals are asymptomatic. In this study, we report the experience of our team on more than 5,000 Marfan syndrome (MFS) probands. Methods Next-generation sequencing (NGS) capture technology allowed us to identify five cases of MFS probands who harbored a mosaic pathogenic variant in the FBN1 gene. Results These five sporadic mosaic probands displayed classical features usually seen in Marfan syndrome. Combined with the results of the literature, these rare findings concerned both single-nucleotide variants and copy-number variations. Conclusion This underestimated finding should not be overlooked in the molecular diagnosis of MFS patients and warrants an adaptation of the parameters used in bioinformatics analyses. The five present cases of symptomatic MFS probands harboring a mosaic FBN1 pathogenic variant reinforce the fact that apparently asymptomatic mosaic parents should have a complete clinical examination and a regular cardiovascular follow-up. We advise that individuals with a typical MFS for whom no single-nucleotide pathogenic variant or exon deletion/duplication was identified should be tested by NGS capture panel with an adapted variant calling analysis.

An integrative approach to investigate the respective roles of single-nucleotide variants and copy-number variants in Attention-Deficit/Hyperactivity Disorder

Scientific Reports ◽

10.1038/srep22851 ◽

2016 ◽

Vol 6 (1) ◽

Cited By ~ 9

Author(s):

Leandro de Araújo Lima ◽

Ana Cecília Feio-dos-Santos ◽

Sintia Iole Belangero ◽

Ary Gadelha ◽

Rodrigo Affonseca Bressan ◽

...

Keyword(s):

Attention Deficit Hyperactivity Disorder ◽

Attention Deficit ◽

Copy Number ◽

De Novo ◽

Copy Number Variants ◽

Integrative Approach ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Hyperactivity Disorder ◽

New Genes

Abstract Many studies have attempted to investigate the genetic susceptibility of Attention-Deficit/Hyperactivity Disorder (ADHD), but without much success. The present study aimed to analyze both single-nucleotide and copy-number variants contributing to the genetic architecture of ADHD. We generated exome data from 30 Brazilian trios with sporadic ADHD. We also analyzed a Brazilian sample of 503 child/adolescent controls from a High Risk Cohort Study for the Development of Childhood Psychiatric Disorders, as well as previously published results of five CNV studies and one GWAS meta-analysis of ADHD involving children/adolescents. The results from the Brazilian trios showed that cases with de novo SNVs tend not to have de novo CNVs and vice-versa. Although the sample size is small, we could also see that various comorbidities are more frequent in cases with only inherited variants. Moreover, using only genes expressed in brain, we constructed two “in silico” protein-protein interaction networks, one with genes from any analysis, and another with genes with hits in two analyses. Topological and functional analyses of genes in these networks uncovered genes related to synapse, cell adhesion, glutamatergic and serotoninergic pathways, both confirming findings of previous studies and capturing new genes and genetic variants in these pathways.

Faculty Opinions recommendation of Complex structural variants in Mendelian disorders: identification and breakpoint resolution using short- and long-read genome sequencing.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.734615989.793574031 ◽

2020 ◽

Author(s):

Guy Rouleau

Keyword(s):

Genome Sequencing ◽

Structural Variants ◽

Mendelian Disorders ◽

Long Read ◽

Complex Structural

Mapping and phasing of structural variation in patient genomes using nanopore sequencing

10.1101/129379 ◽

2017 ◽

Cited By ~ 4

Author(s):

Mircea Cretu Stancu ◽

Markus J. van Roosmalen ◽

Ivo Renkens ◽

Marleen Nieboer ◽

Sjors Middelkamp ◽

...

Keyword(s):

Single Molecule ◽

De Novo ◽

Structural Variants ◽

Human Genetic Disease ◽

Structural Genomic ◽

Short Read ◽

Sequencing Technologies ◽

Genome Wide ◽

Long Read ◽

Complex Structural

Abstract Structural genomic variants form a common type of genetic alteration underlying human genetic disease and phenotypic variation. Despite major improvements in genome sequencing technology and data analysis, the detection of structural variants still poses challenges, particularly when variants are of high complexity. Emerging long-read single-molecule sequencing technologies provide new opportunities for detection of structural variants. Here, we demonstrate sequencing of the genomes of two patients with congenital abnormalities using the ONT MinION at 11x and 16x mean coverage, respectively. We developed a bioinformatic pipeline, NanoSV, to efficiently map genomic structural variants (SVs) from the long-read data. We demonstrate that the nanopore data are superior to corresponding short-read data with regard to detection of de novo rearrangements originating from complex chromothripsis events in the patients. Additionally, genome-wide surveillance of SVs revealed 3,253 (33%) novel variants that were missed in short-read data of the same sample, the majority of which are duplications < 200 bp in size. Long sequencing reads enabled efficient phasing of genetic variations, allowing the construction of genome-wide maps of phased SVs and SNVs. We employed read-based phasing to show that all de novo chromothripsis breakpoints occurred on paternal chromosomes and we resolved the long-range structure of the chromothripsis. This work demonstrates the value of long-read sequencing for screening whole genomes of patients for complex structural variants.
