scholarly journals SCSIM: Jointly simulating correlated single-cell and bulk next-generation DNA sequencing data

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Collin Giguere ◽  
Harsh Vardhan Dubey ◽  
Vishal Kumar Sarsani ◽  
Hachem Saddiki ◽  
Shai He ◽  
...  
2020 ◽  
Author(s):  
Collin Giguere ◽  
Harsh Vardhan Dubey ◽  
Vishal Kumar Sarsani ◽  
Hachem Saddiki ◽  
Shai He ◽  
...  

AbstractBackgroundRecently, it has become possible to collect next-generation DNA sequencing data sets that are composed of multiple samples from multiple biological units where each of these samples may be from a single cell or bulk tissue. Yet, there does not yet exist a tool for simulating DNA sequencing data from such a nested sampling arrangement with single-cell and bulk samples so that developers of analysis methods can assess accuracy and precision.ResultsWe have developed a tool that simulates DNA sequencing data from hierarchically grouped (correlated) samples where each sample is designated bulk or single-cell. Our tool uses a simple configuration file to define the experimental arrangement and can be integrated into software pipelines for testing of variant callers or other genomic tools.ConclusionsThe DNA sequencing data generated by our simulator is representative of real data and integrates seamlessly with standard downstream analysis tools.


PeerJ ◽  
2015 ◽  
Vol 3 ◽  
pp. e1419 ◽  
Author(s):  
Jose E. Kroll ◽  
Jihoon Kim ◽  
Lucila Ohno-Machado ◽  
Sandro J. de Souza

Motivation.Alternative splicing events (ASEs) are prevalent in the transcriptome of eukaryotic species and are known to influence many biological phenomena. The identification and quantification of these events are crucial for a better understanding of biological processes. Next-generation DNA sequencing technologies have allowed deep characterization of transcriptomes and made it possible to address these issues. ASEs analysis, however, represents a challenging task especially when many different samples need to be compared. Some popular tools for the analysis of ASEs are known to report thousands of events without annotations and/or graphical representations. A new tool for the identification and visualization of ASEs is here described, which can be used by biologists without a solid bioinformatics background.Results.A software suite namedSplicing Expresswas created to perform ASEs analysis from transcriptome sequencing data derived from next-generation DNA sequencing platforms. Its major goal is to serve the needs of biomedical researchers who do not have bioinformatics skills.Splicing Expressperforms automatic annotation of transcriptome data (GTF files) using gene coordinates available from the UCSC genome browser and allows the analysis of data from all available species. The identification of ASEs is done by a known algorithm previously implemented in another tool namedSplooce. As a final result,Splicing Expresscreates a set of HTML files composed of graphics and tables designed to describe the expression profile of ASEs among all analyzed samples. By using RNA-Seq data from the Illumina Human Body Map and the Rat Body Map, we show thatSplicing Expressis able to perform all tasks in a straightforward way, identifying well-known specific events.Availability and Implementation.Splicing Expressis written in Perl and is suitable to run only in UNIX-like systems. More details can be found at:http://www.bioinformatics-brazil.org/splicingexpress.


2010 ◽  
Vol 20 (9) ◽  
pp. 1297-1303 ◽  
Author(s):  
A. McKenna ◽  
M. Hanna ◽  
E. Banks ◽  
A. Sivachenko ◽  
K. Cibulskis ◽  
...  

2011 ◽  
Vol 43 (5) ◽  
pp. 491-498 ◽  
Author(s):  
Mark A DePristo ◽  
Eric Banks ◽  
Ryan Poplin ◽  
Kiran V Garimella ◽  
Jared R Maguire ◽  
...  

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Leah L. Weber ◽  
Mohammed El-Kebir

Abstract Background Cancer arises from an evolutionary process where somatic mutations give rise to clonal expansions. Reconstructing this evolutionary process is useful for treatment decision-making as well as understanding evolutionary patterns across patients and cancer types. In particular, classifying a tumor’s evolutionary process as either linear or branched and understanding what cancer types and which patients have each of these trajectories could provide useful insights for both clinicians and researchers. While comprehensive cancer phylogeny inference from single-cell DNA sequencing data is challenging due to limitations with current sequencing technology and the complexity of the resulting problem, current data might provide sufficient signal to accurately classify a tumor’s evolutionary history as either linear or branched. Results We introduce the Linear Perfect Phylogeny Flipping (LPPF) problem as a means of testing two alternative hypotheses for the pattern of evolution, which we prove to be NP-hard. We develop Phyolin, which uses constraint programming to solve the LPPF problem. Through both in silico experiments and real data application, we demonstrate the performance of our method, outperforming a competing machine learning approach. Conclusion Phyolin is an accurate, easy to use and fast method for classifying an evolutionary trajectory as linear or branched given a tumor’s single-cell DNA sequencing data.


2008 ◽  
Vol 26 (10) ◽  
pp. 1135-1145 ◽  
Author(s):  
Jay Shendure ◽  
Hanlee Ji

Sign in / Sign up

Export Citation Format

Share Document