gcn.MOPS: Accelerating cn.MOPS with GPU

Mapping Intimacies ◽

10.29007/hb5r ◽

2019 ◽

Author(s):

Mohammad Alkhamis ◽

Amirali Baniasadi

Keyword(s):

Dna Sequencing ◽

Performance Improvement ◽

Copy Number ◽

R Package ◽

Copy Number Variations ◽

Alternative Mechanism ◽

Memory Usage ◽

Sequencing Data ◽

Next Generation Dna Sequencing ◽

Speedup Factor

cn.MOPS is a frequently cited model-based algorithm used to quantitatively detect copy-number variations in next-generation DNA sequencing data. Previous work implemented the algorithm as an R package and achieved a considerable yet limited performance improvement by employing multi-CPU parallelism (the maximum achievable speedup was experimentally determined to be 9.24). In this paper, we propose an alternative mechanism of process acceleration. Using one CPU core and a GPU device in the proposed solution, gcn.MOPS, we achieve a speedup factor of 159 and reduce memory usage by more than half compared to cn.MOPS running on one CPU core.

CoDEX2: full-spectrum copy number variation detection by high-throughput DNA sequencing

10.1101/211698 ◽

2017 ◽

Cited By ~ 1

Author(s):

Yuchao Jiang ◽

Rujin Wang ◽

Eugene Urrutia ◽

Ioannis N. Anastopoulos ◽

Katherine L. Nathanson ◽

...

Keyword(s):

Dna Sequencing ◽

High Throughput ◽

Copy Number ◽

Copy Number Variations ◽

Negative Control ◽

Sequencing Data ◽

Full Spectrum ◽

Number Variation ◽

High Throughput Dna Sequencing ◽

Low Sensitivity

High-throughput DNA sequencing enables detection of copy number variations (CNVs) on the genome-wide scale with finer resolution compared to array-based methods, but suffers from biases and artifacts that lead to false discoveries and low sensitivity. We describe CODEX2, a statistical framework for full-spectrum CNV profiling that is sensitive for variants with both common and rare population frequencies and that is applicable to study designs with and without negative control samples. We demonstrate and evaluate CODEX2 on whole-exome and targeted sequencing data, where biases are the most prominent. CODEX2 outperforms existing methods and, in particular, significantly improves sensitivity for common CNVs.

FocalCall: An R Package for the Annotation of Focal Copy Number Aberrations

Cancer Informatics ◽

10.4137/cin.s19519 ◽

2014 ◽

Vol 13 ◽

pp. CIN.S19519 ◽

Cited By ~ 2

Author(s):

Oscar Krijgsman ◽

Christian Benner ◽

Gerrit A. Meijer ◽

Mark A. van de Wiel ◽

Bauke Ylstra

Keyword(s):

High Resolution ◽

Copy Number ◽

Software Package ◽

Array Comparative Genomic Hybridization ◽

Germ Line ◽

R Package ◽

Copy Number Variations ◽

Comparative Genomic ◽

Sequencing Data ◽

Copy Number Aberrations

In order to identify somatic focal copy number aberrations (CNAs) in cancer specimens and to distinguish them from germ-line copy number variations (CNVs), we developed the software package FocalCall. FocalCall enables user-defined size cutoffs to recognize focal aberrations and builds on established array comparative genomic hybridization segmentation and calling algorithms. To distinguish CNAs from CNVs, the algorithm uses matched patient normal signals as references or, if this is not available, a list with known CNVs in a population. Furthermore, FocalCall differentiates between homozygous and heterozygous deletions as well as between gains and amplifications and is applicable to high-resolution array and sequencing data. AVAILABILITY AND IMPLEMENTATION: FocalCall is available as an R-package from: https://github.com/OscarKrijgsman/focalCall . The R-package will be available in Bioconductor.org as of release 3.0.

KNNCNV: A K-Nearest Neighbor Based Method for Detection of Copy Number Variations Using NGS Data

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.796249 ◽

2021 ◽

Vol 9 ◽

Author(s):

Kun Xie ◽

Kang Liu ◽

Haque A K Alvi ◽

Yuehui Chen ◽

Shuzhen Wang ◽

...

Keyword(s):

Copy Number ◽

Nearest Neighbor ◽

Human Cancer ◽

Gaussian Mixture ◽

Disease Diagnosis ◽

Copy Number Variations ◽

Sequencing Data ◽

K Nearest Neighbor ◽

Data Types ◽

Ngs Data

Copy number variation (CNV) is a well-known type of genomic mutation that is associated with the development of human cancer diseases. Detection of CNVs from the human genome is a crucial step for the pipeline of starting from mutation analysis to cancer disease diagnosis and treatment. Next-generation sequencing (NGS) data provides an unprecedented opportunity for CNVs detection at the base-level resolution, and currently, many methods have been developed for CNVs detection using NGS data. However, due to the intrinsic complexity of CNVs structures and NGS data itself, accurate detection of CNVs still faces many challenges. In this paper, we present an alternative method, called KNNCNV (K-Nearest Neighbor based CNV detection), for the detection of CNVs using NGS data. Compared to current methods, KNNCNV has several distinctive features: 1) it assigns an outlier score to each genome segment based solely on its first k nearest-neighbor distances, which is not only easy to extend to other data types but also improves the power of discovering CNVs, especially the local CNVs that are likely to be masked by their surrounding regions; 2) it employs the variational Bayesian Gaussian mixture model (VBGMM) to transform these scores into a series of binary labels without a user-defined threshold. To evaluate the performance of KNNCNV, we conduct both simulation and real sequencing data experiments and make comparisons with peer methods. The experimental results show that KNNCNV could derive better performance than others in terms of F1-score.
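
The nearest-neighbor scoring idea in the abstract above can be illustrated with a minimal sketch. It assumes the outlier score of a segment is the mean distance to its k nearest neighbors in normalised read depth; the function name `knn_outlier_scores`, the sample values, and this exact scoring rule are illustrative assumptions, not the KNNCNV implementation, and the VBGMM labelling step is not shown.

```python
def knn_outlier_scores(values, k):
    """Mean distance from each value to its k nearest neighbours (self excluded).

    Assumed scoring rule for illustration only; KNNCNV's exact score may differ.
    """
    if not 1 <= k < len(values):
        raise ValueError("k must satisfy 1 <= k < len(values)")
    scores = []
    for i, v in enumerate(values):
        # Distances to every other segment, smallest first.
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical normalised read depths for eight genome segments; the value
# 2.1 at index 5 mimics a local copy-number gain.
read_depth = [1.0, 1.1, 0.9, 1.0, 1.05, 2.1, 0.95, 1.0]
scores = knn_outlier_scores(read_depth, k=3)

# Index of the segment with the largest outlier score.
most_outlying = max(range(len(scores)), key=scores.__getitem__)
```

In this toy input the segment at index 5 receives by far the largest score, because none of its three nearest neighbors is close to it. A labelling step would then have to separate such scores from the rest.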

Splicing Express: a software suite for alternative splicing analysis using next-generation sequencing data

PeerJ ◽

10.7717/peerj.1419 ◽

2015 ◽

Vol 3 ◽

pp. e1419 ◽

Cited By ~ 6

Author(s):

Jose E. Kroll ◽

Jihoon Kim ◽

Lucila Ohno-Machado ◽

Sandro J. de Souza

Keyword(s):

Alternative Splicing ◽

Dna Sequencing ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Software Suite ◽

Next Generation Dna Sequencing ◽

Sequencing Technologies ◽

Body Map ◽

Sequencing Platforms

Motivation. Alternative splicing events (ASEs) are prevalent in the transcriptome of eukaryotic species and are known to influence many biological phenomena. The identification and quantification of these events are crucial for a better understanding of biological processes. Next-generation DNA sequencing technologies have allowed deep characterization of transcriptomes and made it possible to address these issues. ASEs analysis, however, represents a challenging task especially when many different samples need to be compared. Some popular tools for the analysis of ASEs are known to report thousands of events without annotations and/or graphical representations. A new tool for the identification and visualization of ASEs is here described, which can be used by biologists without a solid bioinformatics background. Results. A software suite named Splicing Express was created to perform ASEs analysis from transcriptome sequencing data derived from next-generation DNA sequencing platforms. Its major goal is to serve the needs of biomedical researchers who do not have bioinformatics skills. Splicing Express performs automatic annotation of transcriptome data (GTF files) using gene coordinates available from the UCSC genome browser and allows the analysis of data from all available species. The identification of ASEs is done by a known algorithm previously implemented in another tool named Splooce. As a final result, Splicing Express creates a set of HTML files composed of graphics and tables designed to describe the expression profile of ASEs among all analyzed samples. By using RNA-Seq data from the Illumina Human Body Map and the Rat Body Map, we show that Splicing Express is able to perform all tasks in a straightforward way, identifying well-known specific events. Availability and Implementation. Splicing Express is written in Perl and is suitable to run only in UNIX-like systems. More details can be found at: http://www.bioinformatics-brazil.org/splicingexpress.

CNV-P: a machine-learning framework for predicting high confident copy number variations

PeerJ ◽

10.7717/peerj.12564 ◽

2021 ◽

Vol 9 ◽

pp. e12564

Author(s):

Taifu Wang ◽

Jinghua Sun ◽

Xiuqing Zhang ◽

Wen-Jing Wang ◽

Qing Zhou

Keyword(s):

Machine Learning ◽

False Positive ◽

Copy Number ◽

Genetic Disorders ◽

Genetic Diseases ◽

Basic Research ◽

Read Depth ◽

Copy Number Variations ◽

Sequencing Data ◽

Learning Framework

Background: Copy-number variants (CNVs) have been recognized as one of the major causes of genetic disorders. Reliable detection of CNVs from genome sequencing data is in strong demand for disease research. However, current software for detecting CNVs has high false-positive rates and needs further improvement. Methods: Here, we propose a novel post-processing approach for CNV prediction (CNV-P), a machine-learning framework that can efficiently remove false-positive fragments from the results of CNV detection tools. A series of CNV signals, such as read depth (RD), split reads (SR) and read pairs (RP) around the putative CNV fragments, were defined as features to train a classifier. Results: The prediction results on several real biological datasets showed that our models could accurately classify the CNVs at over 90% precision and 85% recall, which greatly improves the performance of state-of-the-art algorithms. Furthermore, our results indicate that CNV-P is robust to different CNV sizes and sequencing platforms. Conclusions: Our framework for classifying high-confidence CNVs could improve both basic research and clinical diagnosis of genetic diseases.
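
As a rough illustration of how precision and recall for such a post-processing filter can be computed, the sketch below thresholds hypothetical classifier scores for candidate CNV calls and compares the kept calls with ground-truth labels. The scores, labels, the 0.5 cutoff, and the helper name `precision_recall` are all assumptions made for the example; none of them comes from CNV-P.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for parallel lists of boolean labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical classifier scores for six candidate CNV calls, and whether
# each candidate is a real CNV according to a validation set.
scores = [0.95, 0.80, 0.30, 0.65, 0.10, 0.55]
truth = [True, True, False, True, False, False]

# Keep a candidate when its score reaches the (assumed) 0.5 cutoff.
kept = [s >= 0.5 for s in scores]
precision, recall = precision_recall(kept, truth)
```

With these made-up values, four candidates are kept, three of which are real, so precision is 0.75 while recall is 1.0. Raising the cutoff trades recall for precision.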

Detection of copy number variations by pair analysis using next-generation sequencing data in inherited kidney diseases

Clinical and Experimental Nephrology ◽

10.1007/s10157-018-1534-x ◽

2018 ◽

Vol 22 (4) ◽

pp. 881-888 ◽

Cited By ~ 14

Author(s):

China Nagano ◽

Kandai Nozu ◽

Naoya Morisada ◽

Masahiko Yazawa ◽

Daisuke Ichikawa ◽

...

Keyword(s):

Next Generation Sequencing ◽

Copy Number ◽

Kidney Diseases ◽

Copy Number Variations ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Pair Analysis ◽

Generation Sequencing

Application of Restriction Site-Associated DNA Sequencing (RAD-Seq) for Copy Number Variation and Triploidy Detection in Human

Cytogenetic and Genome Research ◽

10.1159/000518930 ◽

2021 ◽

pp. 1-8

Author(s):

Jian-Chun He ◽

Shao-Ying Li ◽

Wen-Zhi He ◽

Jia-Jia Xian ◽

Xiao-Yan Ma ◽

...

Keyword(s):

Dna Sequencing ◽

Cell Lines ◽

Genome Sequencing ◽

Copy Number ◽

Restriction Site ◽

Chromosomal Abnormalities ◽

Copy Number Variations ◽

Str Analysis ◽

Tissue Samples ◽

Low Pass

At present, low-pass whole-genome sequencing (WGS) is frequently used in clinical research and in the screening of copy number variations (CNVs). However, there are still some challenges in the detection of triploids. Restriction site-associated DNA sequencing (RAD-Seq) technology is a reduced-representation genome sequencing technology developed based on next-generation sequencing. Here, we verified whether RAD-Seq could be employed to detect CNVs and triploids. In this study, genomic DNA of 11 samples was extracted employing a routine method and used to build libraries. Five cell lines of known karyotypes and 6 triploid abortion tissue samples were included for RAD-Seq testing. The triploid samples were confirmed by STR analysis and also tested by low-pass WGS. The accuracy and efficiency of detecting CNVs and triploids by RAD-Seq were then assessed, compared with low-pass WGS. In our results, RAD-Seq detected 11 out of 11 (100%) chromosomal abnormalities, including 4 deletions and 1 aneuploidy in the purchased cell lines and all triploid samples. By contrast, these triploids were missed by low-pass WGS. Furthermore, RAD-Seq showed a higher resolution and more accurate allele frequency in the detection of triploids than low-pass WGS. Our study shows that, compared with low-pass WGS, RAD-Seq has relatively higher accuracy in CNV detection at a similar cost and is capable of identifying triploids. Therefore, the application of this technique in medical genetics has a significant potential value.

Hadoop-CNV-RF: a clinically validated and scalable copy number variation detection tool for next-generation sequencing data

10.21203/rs.2.22176/v1 ◽

2020 ◽

Author(s):

Getiria Onsongo ◽

Ham Ching Lam ◽

Matthew Bower ◽

Bharat Thyagarajan

Keyword(s):

Copy Number ◽

Copy Number Variations ◽

Next Generation Sequencing Data ◽

Sequencing Data ◽

Large Gene ◽

Data Framework ◽

Number Variation ◽

Targeted Capture ◽

Objective Detection ◽

Gene Panels

Objective: Detection of small copy number variations (CNVs) in clinically relevant genes is routinely being used to aid diagnosis. We recently developed a tool, CNV-RF, capable of detecting small clinically relevant CNVs. CNV-RF was designed for small gene panels and did not scale well to large gene panels. On large gene panels, CNV-RF routinely failed due to memory limitations. When successful, it took about 2 days to complete a single analysis, making it impractical for routinely analyzing large gene panels. We need a reliable tool capable of detecting CNVs in the clinic that scales well to large gene panels. Results: We have developed Hadoop-CNV-RF, a scalable implementation of CNV-RF. Hadoop-CNV-RF is a freely available tool capable of rapidly analyzing large gene panels. It takes advantage of Hadoop, a big data framework developed to analyze large amounts of data. Preliminary results show it reduces analysis time from about 2 days to less than 4 hours and can seamlessly scale to large gene panels. Hadoop-CNV-RF has been clinically validated for targeted capture data and is currently being used in a CLIA molecular diagnostics laboratory. Its availability and usage instructions are publicly available at: https://github.com/getiria-onsongo/hadoop-cnvrf-public.

Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008012 ◽

2020 ◽

Vol 16 (7) ◽

pp. e1008012 ◽

Cited By ~ 2

Author(s):

Xian F. Mallory ◽

Mohammadamin Edrisi ◽

Nicholas Navin ◽

Luay Nakhleh

Keyword(s):

Dna Sequencing ◽

Single Cell ◽

Copy Number ◽

Copy Number Aberration ◽

Sequencing Data ◽

Aberration Detection

Abstract 170: Development and analytical validation of a novel next-generation DNA sequencing assay, the oncomine lymphoma panel, to detect SNV, insertion, deletion and copy number variants in 25 Lymphoma genes in FFPE samples

10.1158/1538-7445.am2020-170 ◽

2020 ◽

Author(s):

Fiona Hyland ◽

Charles Scafe ◽

Yun Zhu ◽

Chenchen Yang ◽

Yu-Ting Tseng ◽

...

Keyword(s):

Dna Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Next Generation ◽

Analytical Validation ◽

Next Generation Dna Sequencing ◽

Ffpe Samples
