PSiTE: a Phylogeny guided Simulator for Tumor Evolution

Hechuan Yang; Bingxin Lu; Lan Huong Lai; Abner Herbert Lim; Jacob Josiah Santiago Alvarez; Weiwei Zhai

doi:10.1093/bioinformatics/btz028

Bioinformatics ◽

10.1093/bioinformatics/btz028 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3148-3150 ◽

Cited By ~ 2

Author(s):

Hechuan Yang ◽

Bingxin Lu ◽

Lan Huong Lai ◽

Abner Herbert Lim ◽

Jacob Josiah Santiago Alvarez ◽

...

Keyword(s):

Cancer Genomics ◽

Clonal Evolution ◽

Cell Tumor ◽

Supplementary Information ◽

Tumor Evolution ◽

Supplementary Data ◽

Efficient Tool ◽

Wide Range ◽

Different Types ◽

Evolutionary Trajectories

Abstract Summary Simulating realistic clonal dynamics of tumors is an important topic in cancer genomics. Here, we present Phylogeny guided Simulator for Tumor Evolution, a tool that can simulate different types of tumor samples including single sector, multi-sector bulk tumor as well as single-cell tumor data under a wide range of evolutionary trajectories. Phylogeny guided Simulator for Tumor Evolution provides an efficient tool for understanding clonal evolution of cancer. Availability and implementation PSiTE is implemented in Python and is available at https://github.com/hchyang/PSiTE. Supplementary information Supplementary data are available at Bioinformatics online.

Epidemiological modeling in StochSS Live!

Bioinformatics ◽

10.1093/bioinformatics/btab061 ◽

2021 ◽

Author(s):

Richard Jiang ◽

Bruno Jacob ◽

Matthew Geiger ◽

Sean Matthew ◽

Bryan Rumsey ◽

...

Keyword(s):

Stochastic Model ◽

Epidemiological Model ◽

Supplementary Information ◽

Supplementary Data ◽

Web Based ◽

Epidemiological Modeling ◽

Modeling Simulation ◽

Wide Range ◽

Biochemical Systems

Abstract Summary We present StochSS Live!, a web-based service for modeling, simulation and analysis of a wide range of mathematical, biological and biochemical systems. Using an epidemiological model of COVID-19, we demonstrate the power of StochSS Live! to enable researchers to quickly develop a deterministic or a discrete stochastic model, infer its parameters and analyze the results. Availability and implementation StochSS Live! is freely available at https://live.stochss.org/ Supplementary information Supplementary data are available at Bioinformatics online.
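The "discrete stochastic model" mentioned above can be illustrated with a minimal sketch of the Gillespie stochastic simulation algorithm applied to an SIR epidemic. This is not the StochSS Live! API; the SIR structure, the function name gillespie_sir, and the parameters beta and gamma are assumptions chosen for illustration only.

```python
import random


def gillespie_sir(s, i, r, beta, gamma, t_max, seed=0):
    """Simulate one SIR trajectory with the Gillespie direct method.

    Hypothetical helper for illustration; not part of StochSS Live!.
    Returns a list of (time, susceptible, infected, recovered) tuples.
    """
    rng = random.Random(seed)
    n = s + i + r
    t = 0.0
    trajectory = [(t, s, i, r)]
    while i > 0:
        infection_rate = beta * s * i / n   # S + I -> 2I
        recovery_rate = gamma * i           # I -> R
        total_rate = infection_rate + recovery_rate
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Pick which event fires, proportionally to its rate.
        if rng.random() * total_rate < infection_rate:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        trajectory.append((t, s, i, r))
    return trajectory


trajectory = gillespie_sir(s=99, i=1, r=0, beta=0.3, gamma=0.1, t_max=200.0)
```

Each run with a different seed gives a different trajectory; a deterministic model would instead integrate the corresponding rate equations.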

DrawGlycan-SNFG and gpAnnotate: rendering glycans and annotating glycopeptide mass spectra

Bioinformatics ◽

10.1093/bioinformatics/btz819 ◽

2019 ◽

Cited By ~ 4

Author(s):

Kai Cheng ◽

Gabrielle Pawlowski ◽

Xinheng Yu ◽

Yusen Zhou ◽

Sriram Neelamegham

Keyword(s):

Mass Spectrometry ◽

Open Source ◽

Mass Spectra ◽

Supplementary Information ◽

Supplementary Data ◽

International Union ◽

Open Source Program ◽

Source Program ◽

Wide Range ◽

Peptide Modifications

Abstract Summary This manuscript describes an open-source program, DrawGlycan-SNFG (version 2), that accepts IUPAC (International Union of Pure and Applied Chemistry)-condensed inputs to render Symbol Nomenclature For Glycans (SNFG) drawings. A wide range of local and global options enable display of various glycan/peptide modifications including bond breakages, adducts, repeat structures, ambiguous identifications etc. These facilities make DrawGlycan-SNFG ideal for integration into various glycoinformatics software, including glycomics and glycoproteomics mass spectrometry (MS) applications. As a demonstration of such usage, we incorporated DrawGlycan-SNFG into gpAnnotate, a standalone application to score and annotate individual MS/MS glycopeptide spectra in different fragmentation modes. Availability and implementation DrawGlycan-SNFG and gpAnnotate are platform independent. While originally coded using MATLAB, compiled packages are also provided to enable DrawGlycan-SNFG implementation in Python and Java. All programs are available from https://virtualglycome.org/drawglycan; https://virtualglycome.org/gpAnnotate. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

tugHall: a simulator of cancer-cell evolution based on the hallmarks of cancer and tumor-related genes

Bioinformatics ◽

10.1093/bioinformatics/btaa182 ◽

2020 ◽

Vol 36 (11) ◽

pp. 3597-3599 ◽

Cited By ~ 1

Author(s):

Iurii S Nagornov ◽

Mamoru Kato

Keyword(s):

Cancer Cell ◽

Tumor Heterogeneity ◽

Clonal Evolution ◽

Source Code ◽

Genomic Data ◽

Supplementary Information ◽

Cell Behavior ◽

Supplementary Data ◽

Hallmarks Of Cancer ◽

Cell Evolution

Abstract Summary The flood of recent cancer genomic data requires a coherent model that can sort out the findings to systematically explain clonal evolution and the resultant intra-tumor heterogeneity (ITH). Here, we present a new mathematical model designed to computationally simulate the evolution of cancer cells. The model connects the well-known hallmarks of cancer with the specific mutational states of tumor-related genes. The cell behavior phenotypes are stochastically determined, and the hallmarks probabilistically interfere with the phenotypic probabilities. In turn, the hallmark variables depend on the mutational states of tumor-related genes. Thus, our software can deepen our understanding of cancer-cell evolution and generation of ITH. Availability and implementation The open-source code is available in the repository https://github.com/nagornovys/Cancer_cell_evolution. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

ClonArch: Visualizing the Spatial Clonal Architecture of Tumors

10.1101/2020.04.06.027912 ◽

2020 ◽

Author(s):

Jiaqi Wu ◽

Mohammed El-Kebir

Keyword(s):

Visual Analytics ◽

Phylogenetic Trees ◽

Cancer Genomics ◽

Clonal Evolution ◽

Response To Treatment ◽

Intratumor Heterogeneity ◽

Tumor Evolution ◽

Web Based ◽

Clonal Architecture ◽

Spatial Coordinates

Abstract Motivation Cancer is caused by the accumulation of somatic mutations that lead to the formation of distinct populations of cells, called clones. The resulting clonal architecture is the main cause of relapse and resistance to treatment. With decreasing costs in DNA sequencing technology, rich cancer genomics datasets with many spatial sequencing samples are becoming increasingly available, enabling the inference of high-resolution tumor clones and prevalences across different spatial coordinates. While temporal and phylogenetic aspects of tumor evolution, such as clonal evolution over time and clonal response to treatment, are commonly visualized in various clonal evolution diagrams, visual analytics methods that reveal the spatial clonal architecture are missing. Results This paper introduces ClonArch, a web-based tool to interactively visualize the phylogenetic tree and spatial distribution of clones in a single tumor mass. ClonArch uses the marching squares algorithm to draw closed boundaries representing the presence of clones in a real or simulated tumor. ClonArch enables researchers to examine the spatial clonal architecture of a subset of relevant mutations at different prevalence thresholds and across multiple phylogenetic trees. In addition to simulated tumors with varying number of biopsies, we demonstrate the use of ClonArch on a hepatocellular carcinoma tumor with ~280 sequencing biopsies. ClonArch provides an automated way to interactively examine the spatial clonal architecture of a tumor, facilitating clinical and biological interpretations of the spatial aspects of intratumor heterogeneity. Availability https://github.com/elkebir-group/ClonArch

LRez: C++ API and toolkit for analyzing and managing Linked-Reads data

Bioinformatics Advances ◽

10.1093/bioadv/vbab022 ◽

2021 ◽

Author(s):

Pierre Morisse ◽

Claire Lemaitre ◽

Fabrice Legeai

Keyword(s):

Genome Assembly ◽

Low Cost ◽

Variant Calling ◽

Supplementary Information ◽

Supplementary Data ◽

High Quality ◽

Dna Molecule ◽

Sequencing Technologies ◽

Wide Range ◽

Genomic Regions

Abstract Motivation Linked-Reads technologies combine the high quality and low cost of short-read sequencing with long-range information, through the use of barcodes tagging reads which originate from a common long DNA molecule. This technology has been employed in a broad range of applications including genome assembly, phasing and scaffolding, as well as structural variant calling. However, to date, no tool or API dedicated to the manipulation of Linked-Reads data exists. Results We introduce LRez, a C++ API and toolkit which allows easy management of Linked-Reads data. LRez includes various functionalities, for computing numbers of common barcodes between genomic regions, extracting barcodes from BAM files, as well as indexing and querying BAM, FASTQ and gzipped FASTQ files to quickly fetch all reads or alignments containing a given barcode. LRez is compatible with a wide range of Linked-Reads sequencing technologies, and can thus be used in any tool or pipeline requiring barcode processing or indexing, in order to improve their performance. Availability and implementation LRez is implemented in C++, supported on Unix-based platforms, and available under AGPL-3.0 License at https://github.com/morispi/LRez, and as a bioconda module. Supplementary information Supplementary data are available at Bioinformatics Advances

PyRanges: efficient comparison of genomic intervals in Python

10.1101/609396 ◽

2019 ◽

Cited By ~ 1

Author(s):

Endre Bakken Stovner ◽

Pål Sætrom

Keyword(s):

Supplementary Information ◽

Supplementary Data ◽

Genomic Libraries ◽

Link Type ◽

Simple Set ◽

Set Operations ◽

Wide Range ◽

Genomic Analyses ◽

Associated Data ◽

Memory Efficient

Abstract Summary Complex genomic analyses often use sequences of simple set operations like intersection, overlap, and nearest on genomic intervals. These operations, coupled with some custom programming, allow a wide range of analyses to be performed. To this end, we have written PyRanges, a data structure for representing and manipulating genomic intervals and their associated data in Python. Run single-threaded on binary set operations, PyRanges has a median speedup of 2.3-9.6 times over the popular R GenomicRanges library and is equally memory efficient; run multi-threaded on 8 cores, our library is up to 123 times faster. PyRanges is therefore ideally suited both for individual analyses and as a foundation for future genomic libraries in Python. Availability PyRanges is available open-source under the MIT license at https://github.com/biocore-NTNU/pyranges and documentation exists at https://biocore-NTNU.github.io/pyranges/ [email protected] Supplementary information Supplementary data are available.
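To make the "intersection" set operation concrete, here is a minimal plain-Python sketch. It deliberately does not use the PyRanges API; the function name intersect_intervals and the (chromosome, start, end) tuple layout with half-open coordinates are assumptions for illustration, and the quadratic loop shows only the semantics, not an efficient implementation.

```python
def intersect_intervals(a, b):
    """Return the overlapping parts of intervals in a with intervals in b.

    Hypothetical helper: intervals are (chromosome, start, end) tuples
    with half-open coordinates [start, end). Runs in O(len(a) * len(b)).
    """
    overlaps = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue  # intervals on different chromosomes never overlap
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # non-empty overlap
                overlaps.append((chrom_a, start, end))
    return sorted(overlaps)


genes = [("chr1", 1, 10), ("chr2", 5, 8)]
peaks = [("chr1", 5, 15), ("chr1", 20, 30)]
shared = intersect_intervals(genes, peaks)
```

A production library would typically sort the intervals or index them in a tree so that each query avoids scanning every interval.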

Varstation: a complete and efficient tool to support NGS data analysis

10.1101/833582 ◽

2019 ◽

Author(s):

ACO Faria ◽

MP Caraciolo ◽

RM Minillo ◽

TF Almeida ◽

SM Pereira ◽

...

Keyword(s):

Genetic Variation ◽

Data Analysis ◽

Supplementary Information ◽

Human Genetic Variation ◽

Supplementary Data ◽

Efficient Tool ◽

Link Type ◽

Data Processor ◽

Ngs Data Analysis ◽

Ngs Data

Abstract Summary Varstation is a cloud-based NGS data processor and analyzer for human genetic variation. This resource provides a customizable, centralized, safe and clinically validated environment aiming to improve and optimize the flow of NGS analyses and reports related to clinical and research genetics. Availability and implementation Varstation is freely available at http://varstation.com, for academic [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

Triplet-based similarity score for fully multilabeled trees with poly-occurring labels

Bioinformatics ◽

10.1093/bioinformatics/btaa676 ◽

2020 ◽

Author(s):

Simone Ciccolella ◽

Giulia Bernardini ◽

Luca Denti ◽

Paola Bonizzoni ◽

Marco Previtali ◽

...

Keyword(s):

Open Source ◽

Evolutionary History ◽

Similarity Measures ◽

Real Data ◽

Similarity Score ◽

Supplementary Information ◽

Supplementary Data ◽

Wide Range ◽

Golden Standard ◽

History Of

Abstract Motivation The latest advances in cancer sequencing, and the availability of a wide range of methods to infer the evolutionary history of tumors, have made it important to evaluate, reconcile and cluster different tumor phylogenies. Recently, several notions of distance or similarities have been proposed in the literature, but none of them has emerged as the golden standard. Moreover, none of the known similarity measures is able to manage mutations occurring multiple times in the tree, a circumstance often occurring in real cases. Results To overcome these limitations, in this article, we propose MP3, the first similarity measure for tumor phylogenies able to effectively manage cases where multiple mutations can occur at the same time and mutations can occur multiple times. Moreover, a comparison of MP3 with other measures shows that it is able to classify correctly similar and dissimilar trees, both on simulated and on real data. Availability and implementation An open source implementation of MP3 is publicly available at https://github.com/AlgoLab/mp3treesim. Supplementary information Supplementary data are available at Bioinformatics online.

Nubeam-dedup: a fast and RAM-efficient tool to de-duplicate sequencing reads without mapping

Bioinformatics ◽

10.1093/bioinformatics/btaa112 ◽

2020 ◽

Vol 36 (10) ◽

pp. 3254-3256 ◽

Cited By ~ 2

Author(s):

Hang Dai ◽

Yongtao Guan

Keyword(s):

Hash Function ◽

Reference Genome ◽

State Of The Art ◽

Source Code ◽

Supplementary Information ◽

Supplementary Data ◽

Efficient Tool ◽

Cpu Time ◽

Products Of Matrices

Abstract Summary We present Nubeam-dedup, a fast and RAM-efficient tool to de-duplicate sequencing reads without a reference genome. Nubeam-dedup represents nucleotides by matrices, transforms reads into products of matrices, and, based on these products, assigns a unique number to each read. Thus, duplicate reads can be efficiently removed by using a collisionless hash function. Compared with other state-of-the-art reference-free tools, Nubeam-dedup uses 50–70% of CPU time and 10–15% of RAM. Availability and implementation Source code in C++ and manual are available at https://github.com/daihang16/nubeamdedup and https://haplotype.org. Supplementary information Supplementary data are available at Bioinformatics online.
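The idea of turning a read into a product of matrices can be sketched as follows. The matrices below are an assumption chosen for illustration, not the ones Nubeam-dedup uses: the 2x2 integer matrices L and R freely generate a monoid, so every distinct sequence of them yields a distinct product, and each nucleotide is encoded here as a fixed pair of them. The names read_key and deduplicate are hypothetical.

```python
# Illustrative matrices only; Nubeam-dedup's actual encoding may differ.
L = ((1, 1), (0, 1))
R = ((1, 0), (1, 1))
IDENTITY = ((1, 0), (0, 1))


def matmul(a, b):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


# Each nucleotide maps to a distinct two-factor product of L and R.
NUCLEOTIDE = {
    "A": matmul(L, L),
    "C": matmul(L, R),
    "G": matmul(R, L),
    "T": matmul(R, R),
}


def read_key(read):
    """Return the matrix product that identifies a read (A/C/G/T only)."""
    product = IDENTITY
    for base in read:
        product = matmul(product, NUCLEOTIDE[base])
    return product


def deduplicate(reads):
    """Keep the first occurrence of each read, comparing by matrix key."""
    seen = set()
    unique = []
    for read in reads:
        key = read_key(read)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


unique_reads = deduplicate(["ACGT", "ACGT", "TGCA", "ACG"])
```

Because matrix multiplication is not commutative, reads with the same composition but a different base order get different keys. Python integers are unbounded, so the entries stay exact, though they grow with read length; this sketch does not handle ambiguous bases such as N.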

crisscrosslinkeR: identification and visualization of protein–RNA and protein–protein interactions from crosslinking mass spectrometry

Bioinformatics ◽

10.1093/bioinformatics/btaa1043 ◽

2020 ◽

Author(s):

Emma H Gail ◽

Anup D Shah ◽

Ralf B Schittenhelm ◽

Chen Davidovich

Keyword(s):

Mass Spectrometry ◽

Protein Interactions ◽

R Package ◽

Supplementary Information ◽

Supplementary Data ◽

Protein Protein Interactions ◽

Ribonucleoprotein Complexes ◽

Software Packages ◽

Different Types ◽

Publication Quality

Abstract Summary Unbiased detection of protein–protein and protein–RNA interactions within ribonucleoprotein complexes is enabled through crosslinking followed by mass spectrometry. Yet, different methods detect different types of molecular interactions and therefore require the usage of different software packages with limited compatibility. We present crisscrosslinkeR, an R package that maps both protein–protein and protein–RNA interactions detected by different types of approaches for crosslinking with mass spectrometry. crisscrosslinkeR produces output files that are compatible with visualization using popular software packages for the generation of publication-quality figures. Availability and implementation crisscrosslinkeR is a free and open-source package, available through GitHub: github.com/egmg726/crisscrosslinker. Supplementary information Supplementary data are available at Bioinformatics online.
