scholarly journals Characterization of FMR1 Repeat Expansion and Intragenic Variants by Indirect Sequence Capture

2021 ◽  
Vol 12 ◽  
Author(s):  
Valentina Grosso ◽  
Luca Marcolungo ◽  
Simone Maestri ◽  
Massimiliano Alfano ◽  
Denise Lavezzari ◽  
...  

Traditional methods for the analysis of repeat expansions, which underlie genetic disorders, such as fragile X syndrome (FXS), lack single-nucleotide resolution in repeat analysis and the ability to characterize causative variants outside the repeat array. These drawbacks can be overcome by long-read and short-read sequencing, respectively. However, the routine application of next-generation sequencing in the clinic requires target enrichment, and none of the available methods allows parallel analysis of long-DNA fragments using both sequencing technologies. In this study, we investigated the use of indirect sequence capture (Xdrop technology) coupled to Nanopore and Illumina sequencing to characterize FMR1, the gene responsible of FXS. We achieved the efficient enrichment (> 200×) of large target DNA fragments (~60–80 kbp) encompassing the entire FMR1 gene. The analysis of Xdrop-enriched samples by Nanopore long-read sequencing allowed the complete characterization of repeat lengths in samples with normal, pre-mutation, and full mutation status (> 1 kbp), and correctly identified repeat interruptions relevant for disease prognosis and transmission. Single-nucleotide variants (SNVs) and small insertions/deletions (indels) could be detected in the same samples by Illumina short-read sequencing, completing the mutational testing through the identification of pathogenic variants within the FMR1 gene, when no typical CGG repeat expansion is detected. The study successfully demonstrated the parallel analysis of repeat expansions and SNVs/indels in the FMR1 gene at single-nucleotide resolution by combining Xdrop enrichment with two next-generation sequencing approaches. With the appropriate optimization necessary for the clinical settings, the system could facilitate both the study of genotype–phenotype correlation in FXS and enable a more efficient diagnosis and genetic counseling for patients and their relatives.

2018 ◽  
Author(s):  
Mark T. W. Ebbert ◽  
Stefan Farrugia ◽  
Jonathon Sens ◽  
Karen Jansen-West ◽  
Tania F. Gendron ◽  
...  

AbstractBackground: Many neurodegenerative diseases are caused by nucleotide repeat expansions, but most expansions, like the C9orf72 ‘GGGGCC’ (G4C2) repeat that causes approximately 5-7% of all amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) cases, are too long to sequence using short-read sequencing technologies. It is unclear whether long-read sequencing technologies can traverse these long, challenging repeat expansions. Here, we demonstrate that two long-read sequencing technologies, Pacific Biosciences’ (PacBio) and Oxford Nanopore Technologies’ (ONT), can sequence through disease-causing repeats cloned into plasmids, including the FTD/ALS-causing G4C2 repeat expansion. We also report the first long-read sequencing data characterizing the C9orf72 G4C2 repeat expansion at the nucleotide level in two symptomatic expansion carriers using PacBio whole-genome sequencing and a no-amplification (No-Amp) targeted approach based on CRISPR/Cas9.Results: Both the PacBio and ONT platforms successfully sequenced through the repeat expansions in plasmids. Throughput on the MinlON was a challenge for whole-genome sequencing; we were unable to attain reads covering the human C9orf72 repeat expansion using 15 flow cells. We obtained 8x coverage across the C9orf72 locus using the PacBio Sequel, accurately reporting the unexpanded allele at eight repeats, and reading through the entire expansion with 1324 repeats (7941 nucleotides). Using the No-Amp targeted approach, we attained >800x coverage and were able to identify the unexpanded allele, closely estimate expansion size, and assess nucleotide content in a single experiment. We estimate the individual’s repeat region was >99% G4C2 content, though we cannot rule out small interruptions.Conclusions: Our findings indicate that long-read sequencing is well suited to characterizing known repeat expansions, and for discovering new disease-causing, disease-modifying, or risk-modifying repeat expansions that have gone undetected with conventional short-read sequencing. The PacBio No-Amp targeted approach may have future potential in clinical and genetic counseling environments. Larger and deeper long-read sequencing studies in C9orf72 expansion carriers will be important to determine heterogeneity and whether the repeats are interrupted by non-G4C2 content, potentially mitigating or modifying disease course or age of onset, as interruptions are known to do in other repeat-expansion disorders. These results have broad implications across all diseases where the genetic etiology remains unclear.


2019 ◽  
Author(s):  
Yao-zhong Zhang ◽  
Seiya Imoto ◽  
Satoru Miyano ◽  
Rui Yamaguchi

AbstractMotivationFor short-read sequencing, read-depth based structural variant (SV) callers are difficult to find single-nucleotide-resolution breakpoints due to the bin-size limitation.ResultsIn this paper, we present RDBKE to enhance the breakpoint resolution of read-depth SV callers using deep segmentation model UNet. We show that UNet can be trained with a small amount of data and applied for breakpoint enhancement both in-sample and cross-sample. On both simulation and real data, RDBKE significantly increases the number of SVs with more precise breakpoints.Availabilitysource code of RDBKE is available athttps://github.com/yaozhong/[email protected]


2020 ◽  
Vol 48 (18) ◽  
pp. e104-e104 ◽  
Author(s):  
Jingwen Wang ◽  
Bingnan Li ◽  
Sueli Marques ◽  
Lars M Steinmetz ◽  
Wu Wei ◽  
...  

Abstract Eukaryotic transcriptomes are complex, involving thousands of overlapping transcripts. The interleaved nature of the transcriptomes limits our ability to identify regulatory regions, and in some cases can lead to misinterpretation of gene expression. To improve the understanding of the overlapping transcriptomes, we have developed an optimized method, TIF-Seq2, able to sequence simultaneously the 5′ and 3′ ends of individual RNA molecules at single-nucleotide resolution. We investigated the transcriptome of a well characterized human cell line (K562) and identified thousands of unannotated transcript isoforms. By focusing on transcripts which are challenging to be investigated with RNA-Seq, we accurately defined boundaries of lowly expressed unannotated and read-through transcripts putatively encoding fusion genes. We validated our results by targeted long-read sequencing and standard RNA-Seq for chronic myeloid leukaemia patient samples. Taking the advantage of TIF-Seq2, we explored transcription regulation among overlapping units and investigated their crosstalk. We show that most overlapping upstream transcripts use poly(A) sites within the first 2 kb of the downstream transcription units. Our work shows that, by paring the 5′ and 3′ end of each RNA, TIF-Seq2 can improve the annotation of complex genomes, facilitate accurate assignment of promoters to genes and easily identify transcriptionally fused genes.


2019 ◽  
Author(s):  
Iñigo Prada-Luengo ◽  
Anders Krogh ◽  
Lasse Maretty ◽  
Birgitte Regenberg

AbstractCircular DNA has recently been identified across different species including human normal and cancerous tissue, but short-read mappers are unable to align many of the reads crossing circle junctions and hence limits their detection from short-read sequencing data. Here, we propose a new method, Circle-Map, that guides the realignment of partially aligned reads using information from discordantly mapped reads. We demonstrate how this approach dramatically increases sensitivity for detection of circular DNA on both simulated and real data while retaining high precision.


2019 ◽  
Author(s):  
Bo Cao ◽  
Xiaolin Wu ◽  
Jieliang Zhou ◽  
Hang Wu ◽  
Michael S. DeMott ◽  
...  

AbstractHere we present the Nick-seq platform for quantitative mapping of DNA modifications and damage at single-nucleotide resolution across genomes. Pre-existing breaks are blocked and DNA structures converted to strand-breaks for 3’-extension by nick-translation to produce nuclease-resistant oligonucleotides, and 3’-capture by terminal transferase tailing. Libraries from both products are subjected to next-generation sequencing. Nick-seq is a generally applicable method illustrated with quantitative profiling of single-strand-breaks, phosphorothioate modifications, and DNA oxidation.


Sign in / Sign up

Export Citation Format

Share Document