TSD: A Computational Tool To Study the Complex Structural Variants Using PacBio Targeted Sequencing Data

2019 ◽  
Vol 9 (5) ◽  
pp. 1371-1376 ◽  
Author(s):  
Guofeng Meng ◽  
Ying Tan ◽  
Yue Fan ◽  
Yan Wang ◽  
Guang Yang ◽  
...  
2018 ◽  
Author(s):  
Guofeng Meng ◽  
Ying Tan ◽  
Yue Fan ◽  
Yan Wang ◽  
Guang Yang ◽  
...  

Abstract: PacBio sequencing is a powerful approach for studying DNA or RNA sequences over longer ranges. It is especially useful for exploring complex structural variants generated by random integration or multiple rearrangements of internal or external sequences. However, there is still no tool designed to uncover their structural organization in the host genome. Here, we present a tool, TSD, for complex structural variant discovery using PacBio targeted sequencing data. It allows researchers to identify and visualize the genomic structures of targeted sequences by unlimited splitting, alignment and assembly of long PacBio reads. Application to sequencing data derived from an HBV-integrated human cell line (PLC/PRF/5) indicated that TSD could recover the full profile of HBV integration events, especially for regions with complex human-HBV genome integrations and multiple HBV rearrangements. Compared to other long-read analysis tools, TSD performed better at detecting complex genomic structural variants. TSD is publicly available at: https://github.com/menggf/tsd


2017 ◽  
Author(s):  
Joseph G. Arthur ◽  
Xi Chen ◽  
Bo Zhou ◽  
Alexander E. Urban ◽  
Wing Hung Wong

Abstract: Detecting structural variants (SVs) from sequencing data is key to genome analysis, but methods using standard whole-genome sequencing (WGS) data are typically incapable of resolving complex SVs with multiple co-located breakpoints. We introduce the ARC-SV method, which uses a probabilistic model to detect arbitrary local rearrangements from WGS data. Our method performs well on simple SVs while surpassing state-of-the-art methods in complex SV detection.


2018 ◽  
Vol 19 (S20) ◽  
Author(s):  
Zachary Stephens ◽  
Chen Wang ◽  
Ravishankar K. Iyer ◽  
Jean-Pierre Kocher

2021 ◽  
Vol 12 ◽  
Author(s):  
Junfu Guo ◽  
Chang Shi ◽  
Xi Chen ◽  
Ou Wang ◽  
Ping Liu ◽  
...  

Co-barcoded reads originating from long DNA fragments (mean length >30 kbp) maintain both single-base-level accuracy and long-range genomic information. We propose a pipeline, stLFRsv, to detect structural variation using co-barcoded reads. stLFRsv identifies abnormally large gaps between co-barcoded reads to detect potential breakpoints and reconstruct complex structural variants (SVs). Haplotype phasing by co-barcoded reads increases the signal-to-noise ratio, and barcode sharing profiles are used to filter out false positives. We integrate the short-read SV caller smoove for smaller variants with stLFRsv. The integrated pipeline was evaluated on the well-characterized genome HG002/NA24385, and 74.5% precision and a 22.4% recall rate were obtained for deletions. stLFRsv revealed some large variants not included in the benchmark set that were verified by long reads or assembly. For the HG001/NA12878 genome, stLFRsv also achieved the best performance for both resource usage and the detection of large variants. Our work indicates that co-barcoded read technology has the potential to improve genome completeness.
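
The gap-based breakpoint idea described in this abstract can be illustrated with a minimal Python sketch. This is not the stLFRsv implementation: the function name `find_large_gaps`, the tuple layout of the reads, and the 20 kbp `max_gap` threshold are all assumptions chosen for illustration. The sketch groups reads by barcode and chromosome, sorts them by position, and reports any unusually large distance between consecutive reads as a candidate breakpoint.

```python
from collections import defaultdict


def find_large_gaps(reads, max_gap=20_000):
    """Report large gaps between reads that share a barcode.

    reads: iterable of (barcode, chrom, start, end) tuples (hypothetical layout).
    max_gap: assumed threshold in bp; not a value taken from stLFRsv.
    Returns a list of (barcode, chrom, gap_start, gap_end) tuples.
    """
    by_key = defaultdict(list)
    for barcode, chrom, start, end in reads:
        by_key[(barcode, chrom)].append((start, end))

    gaps = []
    for (barcode, chrom), intervals in sorted(by_key.items()):
        intervals.sort()
        # Track the furthest end seen so far so overlapping reads are handled.
        furthest_end = intervals[0][1]
        for start, end in intervals[1:]:
            if start - furthest_end > max_gap:
                gaps.append((barcode, chrom, furthest_end, start))
            furthest_end = max(furthest_end, end)
    return gaps


# Toy input: barcode BC1 has one large gap, BC2 has none.
reads = [
    ("BC1", "chr1", 1_000, 1_100),
    ("BC1", "chr1", 5_000, 5_100),
    ("BC1", "chr1", 60_000, 60_100),
    ("BC2", "chr1", 2_000, 2_100),
    ("BC2", "chr1", 9_000, 9_100),
]
candidate_gaps = find_large_gaps(reads)
print(candidate_gaps)
```

In a real pipeline a single gap would be weak evidence on its own; the abstract notes that phasing and barcode sharing profiles are used to filter false positives, which this sketch does not attempt.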


Author(s):  
B Meier ◽  
NV Volkova ◽  
Y Hong ◽  
S Bertolini ◽  
V González-Huici ◽  
...  

Abstract: Genome integrity is particularly important in germ cells to faithfully preserve genetic information across generations. As yet little is known about the contribution of various DNA repair pathways to prevent mutagenesis. Using the C. elegans model we analyse mutational spectra that arise in wild-type and 61 DNA repair and DNA damage response mutants cultivated over multiple generations. Overall, 44% of lines show >2-fold increased mutagenesis with a broad spectrum of mutational outcomes including changes in single or multiple types of base substitutions induced by defects in base excision or nucleotide excision repair, or elevated levels of 50-400 bp deletions in translesion polymerase mutants rev-3(pol ζ) and polh-1(pol η). Mutational signatures associated with defective homologous recombination fall into two classes: 1) mutants lacking brc-1/BRCA1 or rad-51/RAD51 paralogs show elevated base substitutions, indels and structural variants, while 2) deficiency for MUS-81/MUS81 and SLX-1/SLX1 nucleases, and HIM-6/BLM, HELQ-1/HELQ and RTEL-1/RTEL1 helicases primarily cause structural variants. Genome-wide investigation of mutagenesis patterns identified elevated rates of tandem duplications often associated with inverted repeats in helq-1 mutants, and a unique pattern of ‘translocation’ events involving homeologous sequences in rip-1 paralog mutants. atm-1/ATM DNA damage checkpoint mutants harboured complex structural variants enriched in subtelomeric regions, and chromosome end-to-end fusions. Finally, while inactivation of the p53-like gene cep-1 did not affect mutagenesis, combined brc-1 cep-1 deficiency displayed increased, locally clustered mutagenesis. In summary, we provide a global view of how DNA repair pathways prevent germ cell mutagenesis.


Author(s):  
Huan Zhong ◽  
Zongwei Cai ◽  
Zhu Yang ◽  
Yiji Xia

Abstract: NAD tagSeq has recently been developed for the identification and characterization of NAD+-capped RNAs (NAD-RNAs). This method adopts a strategy of chemo-enzymatic reactions to label the NAD-RNAs with a synthetic RNA tag before subjecting them to Oxford Nanopore direct RNA sequencing. A computational tool designed for analyzing the sequencing data of tagged RNA will facilitate the broader application of this method. Hence, we introduce TagSeqTools as a flexible, general pipeline for the identification and quantification of tagged RNAs (i.e., NAD+-capped RNAs) using long-read transcriptome sequencing data generated by the NAD tagSeq method. TagSeqTools comprises two major modules: TagSeek, for differentiating tagged and untagged reads, and TagSeqQuant, for quantification and further characterization of genes and isoforms. The pipeline also integrates some advanced functions to identify antisense or splicing, and supports reformatting the data for visualization. TagSeqTools therefore provides a convenient and comprehensive workflow for researchers to analyze the data produced by the NAD tagSeq method or other tagging-based experiments using Oxford Nanopore direct RNA sequencing. The pipeline is available at https://github.com/dorothyzh/TagSeqTools, under Apache License 2.0.
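
The step of separating tagged from untagged reads can be sketched in Python as follows. This is an illustrative guess at the general technique, not the TagSeek code: the tag sequence `TAG` is a made-up placeholder, and the function names, the 40-base search window, and the two-mismatch tolerance are assumptions. The sketch slides the tag along the start of each read and accepts a read as tagged when some window is within the allowed Hamming distance.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def has_tag(read, tag, window=40, max_mismatches=2):
    """Return True if `tag` occurs near the start of `read`, allowing mismatches.

    window and max_mismatches are assumed defaults, not TagSeek parameters.
    """
    head = read[:window]
    for i in range(len(head) - len(tag) + 1):
        if hamming(head[i:i + len(tag)], tag) <= max_mismatches:
            return True
    return False


def split_reads(reads, tag, **kwargs):
    """Partition (name, sequence) pairs into tagged and untagged read names."""
    tagged, untagged = [], []
    for name, seq in reads:
        (tagged if has_tag(seq, tag, **kwargs) else untagged).append(name)
    return tagged, untagged


TAG = "ACGGACTTGCAA"  # hypothetical placeholder, not the real NAD tagSeq tag

reads = [
    ("r1", "GG" + "ACGGACTTGCAA" + "TTTTCCCCGGGGAAAA"),  # exact tag at offset 2
    ("r2", "ACGGACTAGCAA" + "TTTTCCCCGGGG"),             # tag with one mismatch
    ("r3", "T" * 30),                                    # no tag
]
tagged, untagged = split_reads(reads, TAG)
print(tagged, untagged)
```

Hamming distance ignores insertions and deletions, which are common in nanopore reads, so an edit-distance or alignment-based match would likely be a better fit in practice.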


2017 ◽  
Author(s):  
Mircea Cretu Stancu ◽  
Markus J. van Roosmalen ◽  
Ivo Renkens ◽  
Marleen Nieboer ◽  
Sjors Middelkamp ◽  
...  

Abstract: Structural genomic variants form a common type of genetic alteration underlying human genetic disease and phenotypic variation. Despite major improvements in genome sequencing technology and data analysis, the detection of structural variants still poses challenges, particularly when variants are of high complexity. Emerging long-read single-molecule sequencing technologies provide new opportunities for detection of structural variants. Here, we demonstrate sequencing of the genomes of two patients with congenital abnormalities using the ONT MinION at 11x and 16x mean coverage, respectively. We developed a bioinformatic pipeline - NanoSV - to efficiently map genomic structural variants (SVs) from the long-read data. We demonstrate that the nanopore data are superior to corresponding short-read data with regard to detection of de novo rearrangements originating from complex chromothripsis events in the patients. Additionally, genome-wide surveillance of SVs revealed 3,253 (33%) novel variants that were missed in short-read data of the same sample, the majority of which are duplications <200 bp in size. Long sequencing reads enabled efficient phasing of genetic variations, allowing the construction of genome-wide maps of phased SVs and SNVs. We employed read-based phasing to show that all de novo chromothripsis breakpoints occurred on paternal chromosomes and we resolved the long-range structure of the chromothripsis. This work demonstrates the value of long-read sequencing for screening whole genomes of patients for complex structural variants.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Xin Luo ◽  
Yaoxi He ◽  
Chao Zhang ◽  
Xiechao He ◽  
Lanzhen Yan ◽  
...  

Abstract: CRISPR-Cas9 is a widely used genome editing tool, but its off-target effect and on-target complex mutations remain a concern, especially in view of future clinical applications. Non-human primates (NHPs) share close genetic and physiological similarities with humans, making them an ideal preclinical model for developing Cas9-based therapies. However, to our knowledge no comprehensive in vivo off-target and on-target assessment has been conducted in NHPs. Here, we perform whole genome trio sequencing of Cas9-treated rhesus monkeys. We only find a small number of de novo mutations that can be explained by expected spontaneous mutations, and no unexpected off-target mutations (OTMs) were detected. Furthermore, the long-read sequencing data does not detect large structural variants in the target region.

