MiSDEED: a synthetic multi-omics engine for microbiome power analysis and study design

10.1101/2021.08.09.455682 ◽

2021 ◽

Author(s):

Philippe Chlenski ◽

Melody Hsu ◽

Itsik Pe'er

Keyword(s):

Study Design ◽

Relative Abundance ◽

Arbitrary Number ◽

Power Analysis ◽

Command Line ◽

Omics Data ◽

Command Line Tool ◽

Simulation Parameters ◽

Python Package ◽

Analysis And Study

MiSDEED is a command-line tool for generating synthetic longitudinal multi-omics data from simulated microbial environments. It generates relative-abundance timecourses under perturbations for an arbitrary number of samples and patients. All simulation parameters are exposed to the user to facilitate rapid power analysis and aid in study design. Users who want additional flexibility may also use MiSDEED as a Python package. Availability and implementation: MiSDEED is written in Python and is freely available at https://github.com/pchlenski/misdeed.
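
To make the idea of a relative-abundance timecourse under a perturbation concrete, here is a minimal sketch. It assumes a toy generalized Lotka-Volterra model integrated with Euler steps; the abstract does not state which model MiSDEED uses, and the function name, parameters, and default values below are hypothetical and not part of the MiSDEED package.

```python
import random


def simulate_relative_abundance(n_taxa=4, n_steps=50, dt=0.1,
                                perturb_start=25, perturb_strength=-0.5,
                                seed=0):
    """Toy generalized Lotka-Volterra timecourse (an assumed model).

    Returns one list of relative abundances per time step.
    """
    rng = random.Random(seed)
    growth = [rng.uniform(0.5, 1.0) for _ in range(n_taxa)]
    # Self-limiting diagonal, weak random cross-interactions.
    interactions = [
        [-1.0 if i == j else rng.uniform(-0.1, 0.1) for j in range(n_taxa)]
        for i in range(n_taxa)
    ]
    # The perturbation only affects taxon 0 in this sketch.
    response = [perturb_strength] + [0.0] * (n_taxa - 1)
    abundances = [rng.uniform(0.1, 1.0) for _ in range(n_taxa)]

    timecourse = []
    for step in range(n_steps):
        total = sum(abundances)
        timecourse.append([a / total for a in abundances])
        perturbed = 1.0 if step >= perturb_start else 0.0
        updated = []
        for i in range(n_taxa):
            rate = growth[i] + response[i] * perturbed + sum(
                interactions[i][j] * abundances[j] for j in range(n_taxa)
            )
            # Euler step, floored so abundances stay strictly positive.
            updated.append(max(abundances[i] + dt * abundances[i] * rate, 1e-9))
        abundances = updated
    return timecourse


timecourse = simulate_relative_abundance()
```

Repeating such a simulation over many seeds, sample counts, and perturbation strengths is the kind of loop a power analysis would run, although the real tool exposes its own parameters for this.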

amplimap: a versatile tool to process and analyze targeted NGS data

Bioinformatics ◽

10.1093/bioinformatics/btz582 ◽

2019 ◽

Vol 35 (24) ◽

pp. 5349-5350

Author(s):

Nils Koelling ◽

Marie Bernkopf ◽

Eduardo Calpena ◽

Geoffrey J Maher ◽

Kerry A Miller ◽

...

Keyword(s):

Command Line ◽

User Friendliness ◽

Targeted Next Generation Sequencing ◽

Base Calling ◽

Targeted NGS ◽

Command Line Tool ◽

Versatile Tool ◽

NGS Data ◽

Python Package ◽

Generation Sequencing

Summary: amplimap is a command-line tool to automate the processing and analysis of data from targeted next-generation sequencing experiments with PCR-based amplicons or capture-based enrichment systems. From raw sequencing reads, amplimap generates output such as read alignments, annotated variant calls, target coverage statistics and variant allele counts and frequencies for each target base pair. In addition to its focus on user-friendliness and reproducibility, amplimap supports advanced features such as consensus base calling for read families based on unique molecular identifiers and filtering false positive variant calls caused by amplification of off-target loci. Availability and implementation: amplimap is available as a free Python package under the open-source Apache 2.0 License. Documentation, source code and installation instructions are available at https://github.com/koelling/amplimap.
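
Consensus base calling for read families can be illustrated with a short sketch. It assumes reads arrive as (UMI, sequence) pairs, that sequences within a family have equal length, and that a per-position majority vote is used with ties reported as N. These are simplifying assumptions for illustration; the function `umi_consensus` is hypothetical and does not reproduce amplimap's implementation.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2):
    """Group reads by UMI and call a majority-vote consensus per family.

    reads: iterable of (umi, sequence) pairs.
    Families smaller than min_family_size are skipped.
    """
    families = defaultdict(list)
    for umi, sequence in reads:
        families[umi].append(sequence)

    consensus = {}
    for umi, sequences in families.items():
        if len(sequences) < min_family_size:
            continue
        bases = []
        for column in zip(*sequences):
            (top_base, top_count), *runner_up = Counter(column).most_common(2)
            # A tie between the two most frequent bases is ambiguous.
            if runner_up and runner_up[0][1] == top_count:
                bases.append("N")
            else:
                bases.append(top_base)
        consensus[umi] = "".join(bases)
    return consensus


example_reads = [
    ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"),
    ("CCG", "GGGG"),
    ("TTA", "AC"), ("TTA", "AG"),
]
families = umi_consensus(example_reads)
```

In the example, the single-read family `CCG` is dropped, and the sequencing error in one `AAT` read is outvoted by the other two reads.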

BioNetComp: a Python package for biological network development and comparison

10.1101/2021.04.14.439897 ◽

2021 ◽

Author(s):

Lucas Miguel Carvalho

Keyword(s):

Biological Networks ◽

Large Scale ◽

Biological Network ◽

Biological Data ◽

Command Line ◽

Omics Data ◽

Network Development ◽

Network Metrics ◽

Web Platform ◽

Python Package

The large-scale generation of omics data in recent years has made extracting information from biological data, and integrating or comparing it, increasingly complex. One way to represent interactions in biological data is through networks, which summarize interactions between nodes as edges. Comparing two biological networks using network metrics, biological enrichment, and visualization helps us understand differences between the interactomes of contrasting conditions. We describe BioNetComp, a Python package that compares two interactomes through different metrics and data visualization from the command line, without the need for a web platform or other software. As an example, we present a comparison between the interactomes generated from the differentially expressed genes at two different points during a typical bioethanol fermentation. BioNetComp is available at github.com/lmigueel/BioNetComp.
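
As a rough illustration of comparing two interactomes with simple network metrics, the sketch below computes edge overlap, edge Jaccard similarity, and per-node degree change using only the standard library. The function names and the choice of metrics are assumptions made for this example; the abstract does not list which metrics BioNetComp computes.

```python
def normalise(edges):
    """Treat edges as undirected; drop duplicates and self-loops."""
    return {frozenset((a, b)) for a, b in edges if a != b}


def degrees(edge_set):
    """Count how many edges touch each node."""
    degree = {}
    for edge in edge_set:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    return degree


def compare_networks(edges_a, edges_b):
    """Summarise how network B differs from network A."""
    a, b = normalise(edges_a), normalise(edges_b)
    deg_a, deg_b = degrees(a), degrees(b)
    union = a | b
    return {
        "shared_edges": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "edge_jaccard": len(a & b) / len(union) if union else 1.0,
        "degree_change": {
            node: deg_b.get(node, 0) - deg_a.get(node, 0)
            for node in sorted(set(deg_a) | set(deg_b))
        },
    }


condition_a = [("g1", "g2"), ("g2", "g3"), ("g3", "g1")]
condition_b = [("g2", "g1"), ("g2", "g4")]
report = compare_networks(condition_a, condition_b)
```

Nodes with a large absolute degree change are natural candidates for closer inspection, for example with an enrichment analysis.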

BiSulfite Bolt: A bisulfite sequencing analysis platform

GigaScience ◽

10.1093/gigascience/giab033 ◽

2021 ◽

Vol 10 (5) ◽

Author(s):

Colin Farrell ◽

Michael Thompson ◽

Anela Tosevska ◽

Adewale Oyetunde ◽

Matteo Pellegrini

Keyword(s):

Data Aggregation ◽

Bisulfite Sequencing ◽

Low Complexity ◽

Sequencing Analysis ◽

Command Line ◽

Sequencing Data ◽

Bisulfite Sequencing Data ◽

Analysis Platform ◽

Python Package ◽

Bisulfite Sequencing Analysis

Background: Bisulfite sequencing is commonly used to measure DNA methylation. Processing bisulfite sequencing data is often challenging owing to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. BSBolt performs a pre-alignment sequencing read assessment step to improve efficiency when handling asymmetrical bisulfite sequencing libraries. Findings: We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion: BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command-line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.

Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files

Cancer Informatics ◽

10.4137/cin.s26470 ◽

2015 ◽

Vol 14 ◽

pp. CIN.S26470 ◽

Cited By ~ 2

Author(s):

Richard P. Finney ◽

Qing-Rong Chen ◽

Cu V. Nguyen ◽

Chih Hao Hsu ◽

Chunhua Yan ◽

...

Keyword(s):

Graphical User Interface ◽

Reference Genome ◽

Source Code ◽

Software Tool ◽

Command Line ◽

Sequencing Data ◽

Genome Data ◽

Command Line Tool ◽

Portable Software ◽

Microsoft Windows

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command-line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview. The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview.

FAN-C: A Feature-rich Framework for the Analysis and Visualisation of C data

10.1101/2020.02.03.932517 ◽

2020 ◽

Cited By ~ 6

Author(s):

Kai Kruse ◽

Clemens B. Hug ◽

Juan M. Vaquerizas

Keyword(s):

High Throughput ◽

Matrix Analysis ◽

Set Covering ◽

Command Line ◽

Chromosome Conformation ◽

C Storage ◽

Data Formats ◽

Analysis Tools ◽

Command Line Tool ◽

Broad Feature

Chromosome conformation capture data, particularly from high-throughput approaches such as Hi-C and its derivatives, are typically very complex to analyse. Existing analysis tools are often single-purpose, or limited in compatibility to a small number of data formats, frequently making Hi-C analyses tedious and time-consuming. Here, we present FAN-C, an easy-to-use command-line tool and powerful Python API with a broad feature set covering matrix generation, analysis, and visualisation for C-like data (https://github.com/vaquerizaslab/fanc). Due to its comprehensiveness and compatibility with the most prevalent Hi-C storage formats, FAN-C can be used in combination with a large number of existing analysis tools, thus greatly simplifying Hi-C matrix analysis.

xml2jupyter: Mapping parameters between XML and Jupyter widgets

10.1101/601211 ◽

2019 ◽

Cited By ~ 1

Author(s):

Randy Heiland ◽

Daniel Mishler ◽

Tyler Zhang ◽

Eric Bower ◽

Paul Macklin

Keyword(s):

Programming Languages ◽

Markup Language ◽

Command Line ◽

Graphical Interface ◽

Web Browser ◽

Agent Based ◽

Mapping Parameters ◽

Extensible Markup ◽

Parameter Values ◽

Python Package

Jupyter Notebooks [4, 6] provide executable documents (in a variety of programming languages) that can be run in a web browser. When a notebook contains graphical widgets, it becomes an easy-to-use graphical user interface (GUI). Many scientific simulation packages use text-based configuration files to provide parameter values and run at the command line without a graphical interface. Manually editing these files to explore how different values affect a simulation can be burdensome for technical users and impossible for those with other scientific backgrounds. xml2jupyter is a Python package that addresses these scientific bottlenecks. It provides a mapping between configuration files, formatted in the Extensible Markup Language (XML), and Jupyter widgets. Widgets are automatically generated from the XML file, and these can optionally be incorporated into a larger GUI for a simulation package and hosted on cloud resources. Users modify parameter values via the widgets, and the values are written to the XML configuration file, which is input to the simulation's command-line interface. xml2jupyter has been tested using PhysiCell [1], an open source, agent-based simulator for biology, and it is being used by students for classroom and research projects. In addition, we use xml2jupyter to help create Jupyter GUIs for PhysiCell-related applications running on nanoHUB [5].
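
The mapping between XML parameters and widgets can be sketched with the standard library alone. The example below assumes a hypothetical configuration layout (a `user_parameters` element whose children carry a `type` attribute) and returns plain dictionaries naming an ipywidgets class for each parameter instead of building real widgets, so it runs outside a notebook. It illustrates the idea only and is not xml2jupyter's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration file contents, used for illustration only.
CONFIG = """<config>
  <user_parameters>
    <max_time type="double" units="min">1440.0</max_time>
    <cell_count type="int">100</cell_count>
    <enable_drug type="bool">false</enable_drug>
  </user_parameters>
</config>"""

# ipywidgets class names chosen per declared parameter type.
WIDGET_FOR_TYPE = {"double": "FloatText", "int": "IntText", "bool": "Checkbox"}


def widget_specs(root):
    """Describe one widget per parameter found in the XML tree."""
    specs = []
    for element in root.find("user_parameters"):
        kind = element.get("type", "string")
        specs.append({
            "name": element.tag,
            "widget": WIDGET_FOR_TYPE.get(kind, "Text"),
            "value": (element.text or "").strip(),
            "units": element.get("units", ""),
        })
    return specs


def write_back(root, values):
    """Store edited values in the tree and return the updated XML text."""
    parameters = root.find("user_parameters")
    for name, value in values.items():
        parameters.find(name).text = str(value)
    return ET.tostring(root, encoding="unicode")


root = ET.fromstring(CONFIG)
specs = widget_specs(root)
updated_xml = write_back(root, {"cell_count": 250})
```

In a notebook, each spec would be turned into the named widget, and `write_back` would be called with the widget values before launching the simulation from its command-line interface.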

ScaffoldGraph: an open-source library for the generation and analysis of molecular scaffold networks and scaffold trees

Bioinformatics ◽

10.1093/bioinformatics/btaa219 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3930-3931 ◽

Cited By ~ 1

Author(s):

Oliver B Scott ◽

A W Edith Chan

Keyword(s):

Open Source ◽

High Throughput Screening ◽

Chemical Space ◽

Diversity Analysis ◽

Graph Analysis ◽

Command Line ◽

Molecular Scaffold ◽

Large Sets ◽

Command Line Tool ◽

Scaffold Diversity

Summary: ScaffoldGraph (SG) is an open-source Python library and command-line tool for the generation and analysis of molecular scaffold networks and trees, with the capability of processing large sets of input molecules. With the increase in high-throughput screening data, scaffold graphs have proven useful for the navigation and analysis of chemical space, being used for visualization, clustering, scaffold-diversity analysis and active-series identification. Built on RDKit and NetworkX, SG integrates scaffold graph analysis into the growing scientific/cheminformatics Python stack, increasing the flexibility and extendibility of the tool compared to existing software. Availability and implementation: SG is freely available and released under the MIT licence at https://github.com/UCLCheminformatics/ScaffoldGraph.

Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data

Bioinformatics ◽

10.1093/bioinformatics/btaa070 ◽

2020 ◽

Vol 36 (10) ◽

pp. 3263-3265 ◽

Cited By ~ 14

Author(s):

Lucas Czech ◽

Pierre Barbera ◽

Alexandros Stamatakis

Keyword(s):

Phylogenetic Trees ◽

Supplementary Information ◽

Command Line ◽

Supplementary Data ◽

Computationally Efficient ◽

Data Types ◽

Low Level ◽

Phylogenetic Placement ◽

Command Line Tool ◽

High Level

Summary: We present genesis, a library for working with phylogenetic data, and gappa, an accompanying command-line tool for conducting typical analyses on such data. The tools target phylogenetic trees and phylogenetic placements, sequences, taxonomies and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested and field-proven. Availability and implementation: Both genesis and gappa are written in modern C++11, and are freely available under GPLv3 at http://github.com/lczech/genesis and http://github.com/lczech/gappa. Supplementary information: Supplementary data are available at Bioinformatics online.

Spliceogen: an integrative, scalable tool for the discovery of splice-altering variants

Bioinformatics ◽

10.1093/bioinformatics/btz263 ◽

2019 ◽

Vol 35 (21) ◽

pp. 4405-4407 ◽

Cited By ~ 1

Author(s):

Steven Monger ◽

Michael Troup ◽

Eddie Ip ◽

Sally L Dunwoodie ◽

Eleni Giannoulatou

Keyword(s):

Supplementary Information ◽

Command Line ◽

Supplementary Data ◽

In Silico Prediction ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Prediction Tools ◽

Motif Prediction ◽

Command Line Tool ◽

Genome Scale

Motivation: In silico prediction tools are essential for identifying variants which create or disrupt cis-splicing motifs. However, there are limited options for genome-scale discovery of splice-altering variants. Results: We have developed Spliceogen, a highly scalable pipeline integrating predictions from some of the individually best performing models for splice motif prediction: MaxEntScan, GeneSplicer, ESRseq and Branchpointer. Availability and implementation: Spliceogen is available as a command-line tool which accepts VCF/BED inputs and handles both single nucleotide variants (SNVs) and indels (https://github.com/VCCRI/Spliceogen). SNV databases with prediction scores are also available, covering all possible SNVs at all genomic positions within all Gencode-annotated multi-exon transcripts. Supplementary information: Supplementary data are available at Bioinformatics online.

aCLImatise: automated generation of tool definitions for bioinformatics workflows

Bioinformatics ◽

10.1093/bioinformatics/btaa1033 ◽

2020 ◽

Author(s):

Michael Milton ◽

Natalie Thorne

Keyword(s):

Source Code ◽

Supplementary Information ◽

Command Line ◽

Supplementary Data ◽

Automated Generation ◽

Base Camp ◽

Python Package ◽

Bioinformatics Workflow ◽

Bioinformatics Workflows

Summary: aCLImatise is a utility for automatically generating tool definitions compatible with bioinformatics workflow languages, by parsing command-line help output. aCLImatise also has an associated database called the aCLImatise Base Camp, which provides thousands of pre-computed tool definitions. Availability and implementation: The latest aCLImatise source code is available within a GitHub organisation, under the GPL-3.0 license: https://github.com/aCLImatise. In particular, documentation for the aCLImatise Python package is available at https://aclimatise.github.io/CliHelpParser/, and the aCLImatise Base Camp is available at https://aclimatise.github.io/BaseCamp/. Supplementary information: Supplementary data are available at Bioinformatics online.
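
Parsing help output into a structured tool definition can be sketched with a single regular expression. The example assumes an argparse-like layout in which each option line holds an optional short flag, a long flag, an optional upper-case value placeholder, and a description separated by at least two spaces. Real help text is far more varied, and the `parse_help` function here is a hypothetical illustration, not aCLImatise's parser.

```python
import re

# Matches lines such as "  -i, --input FILE   Input reads".
FLAG_LINE = re.compile(
    r"^\s*(?:(?P<short>-\w),\s*)?"       # optional short flag, e.g. "-i,"
    r"(?P<long>--[\w-]+)"                # long flag, e.g. "--input"
    r"(?:[ =](?P<arg>[A-Z_]+))?"         # optional value placeholder
    r"\s{2,}(?P<help>.+)$"               # description after 2+ spaces
)


def parse_help(text):
    """Extract a list of flag definitions from help text."""
    flags = []
    for line in text.splitlines():
        match = FLAG_LINE.match(line)
        if match:
            flags.append({
                "name": match["long"],
                "short": match["short"],
                "takes_value": match["arg"] is not None,
                "description": match["help"].strip(),
            })
    return flags


HELP_TEXT = """usage: toy [options]

  -i, --input FILE   Input reads in FASTQ format
  -t, --threads N    Number of worker threads
  --verbose          Print progress messages
"""

definitions = parse_help(HELP_TEXT)
```

A list like `definitions` could then be serialised into the input and flag declarations that a workflow language expects.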
