idCOV: a pipeline for quick clade identification of SARS-CoV-2 isolates

Mapping Intimacies ◽

10.1101/2020.10.08.330456 ◽

2020 ◽

Author(s):

Xun Zhu ◽

Ti-Cheng Chang ◽

Richard Webby ◽

Gang Wu

Keyword(s):

Personal Computer ◽

Source Code ◽

Command Line ◽

Sequencing Data ◽

Link Type ◽

Public Dataset ◽

Virus Isolates

Abstract: idCOV is a phylogenetic pipeline for quickly identifying the clades of SARS-CoV-2 virus isolates from raw sequencing data based on a selected clade-defining marker list. Using a public dataset, we show that idCOV, working directly from user-uploaded FASTQ files, makes clade calls equivalent to those annotated by Nextstrain.org on all three common clade systems. Web and equivalent command-line interfaces are available. It can be deployed on any Linux environment, including personal computers, HPC clusters, and the cloud. The source code is available at https://github.com/xz-stjude/idcov. Installation documentation can be found at https://github.com/xz-stjude/idcov/blob/master/README.md.
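
To make the idea of calling a clade from a clade-defining marker list concrete, here is a minimal Python sketch. It is not idCOV's implementation: the marker table, the clade names, the positions, and the function `call_clade` are all hypothetical placeholders, and the rule of preferring the clade with the most matching markers is an assumption made for illustration.

```python
from typing import Dict, Optional

# Hypothetical marker list: clade name -> {1-based genome position: expected base}.
# Clade names and positions are illustrative placeholders, not idCOV's real markers.
MARKERS: Dict[str, Dict[int, str]] = {
    "clade_A": {100: "T", 250: "G"},
    "clade_B": {100: "C", 250: "G", 400: "A"},
}


def call_clade(
    genotype: Dict[int, str],
    markers: Dict[str, Dict[int, str]] = MARKERS,
) -> Optional[str]:
    """Return the clade whose markers all match the sample genotype.

    If several clades match, the one defined by the most markers wins
    (an assumed tie-breaking rule). Returns None when no clade matches.
    """
    best: Optional[str] = None
    best_size = 0
    for clade, sites in markers.items():
        # A clade matches only if every one of its marker positions
        # carries the expected base in the sample.
        if all(genotype.get(pos) == base for pos, base in sites.items()):
            if len(sites) > best_size:
                best, best_size = clade, len(sites)
    return best
```

In a real pipeline the `genotype` dictionary would come from variant calls made on the aligned reads; here it is simply passed in as a mapping from position to base.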

NanoPack: visualizing and processing long read sequencing data

10.1101/237180 ◽

2017 ◽

Cited By ~ 2

Author(s):

Wouter De Coster ◽

Svenn D’Hert ◽

Darrin T. Schultz ◽

Marc Cruts ◽

Christine Van Broeckhoven

Keyword(s):

Web Service ◽

Graphical User Interface ◽

Source Code ◽

Supplementary Information ◽

Command Line ◽

Sequencing Data ◽

Link Type ◽

Oxford Nanopore ◽

Long Read ◽

Oxford Nanopore Technologies

Abstract Summary: Here we describe NanoPack, a set of tools developed for visualization and processing of long read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. Availability and Implementation: The NanoPack tools are written in Python3 and released under the GNU GPL3.0 Licence. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 Subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools. Contact: [email protected] Supplementary information: Supplementary tables and figures are available at Bioinformatics online.

Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files

Cancer Informatics ◽

10.4137/cin.s26470 ◽

2015 ◽

Vol 14 ◽

pp. CIN.S26470 ◽

Cited By ~ 2

Author(s):

Richard P. Finney ◽

Qing-Rong Chen ◽

Cu V. Nguyen ◽

Chih Hao Hsu ◽

Chunhua Yan ◽

...

Keyword(s):

Graphical User Interface ◽

Reference Genome ◽

Source Code ◽

Software Tool ◽

Command Line ◽

Sequencing Data ◽

Genome Data ◽

Command Line Tool ◽

Portable Software ◽

Microsoft Windows

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview. The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview.

ASaiM: a Galaxy-based framework to analyze raw shotgun data from microbiota

10.1101/183970 ◽

2017 ◽

Cited By ~ 2

Author(s):

Bérénice Batut ◽

Kévin Gravouil ◽

Clémence Defois ◽

Saskia Hiltemann ◽

Jean-François Brugère ◽

...

Keyword(s):

Technological Progress ◽

Source Code ◽

Command Line ◽

Bioinformatic Tools ◽

Link Type ◽

Data Analyses ◽

The Galaxy ◽

Sequencing Platforms ◽

User Friendly ◽

New Generation

Abstract Background: New generations of sequencing platforms coupled with numerous bioinformatics tools have led to rapid technological progress in metagenomics and metatranscriptomics to investigate complex microorganism communities. Nevertheless, a combination of different bioinformatic tools remains necessary to draw conclusions out of microbiota studies. Modular and user-friendly tools would greatly improve such studies. Findings: We therefore developed ASaiM, an Open-Source Galaxy-based framework dedicated to microbiota data analyses. ASaiM provides a curated collection of tools to explore and visualize taxonomic and functional information from raw amplicon, metagenomic or metatranscriptomic sequences. To guide different analyses, several customizable workflows are included. All workflows are supported by tutorials and Galaxy interactive tours to guide the users through the analyses step by step. ASaiM is implemented as a Galaxy Docker flavour. It is scalable to many thousands of datasets, but can also be used on a normal PC. The associated source code is available under Apache 2 license at https://github.com/ASaiM/framework and documentation can be found online (http://asaim.readthedocs.io/). Conclusions: Based on the Galaxy framework, ASaiM offers sophisticated analyses to scientists without command-line knowledge. ASaiM provides a powerful framework to easily and quickly explore microbiota data in a reproducible and transparent environment.

Phigaro: high throughput prophage sequence annotation

10.1101/598243 ◽

2019 ◽

Cited By ~ 6

Author(s):

Elizaveta V. Starikova ◽

Polina O. Tikhonova ◽

Nikita A. Prianichnikov ◽

Chris M. Rands ◽

Evgeny M. Zdobnov ◽

...

Keyword(s):

Test Data ◽

High Throughput ◽

Source Code ◽

Sequence Annotation ◽

Command Line ◽

Link Type ◽

Genome Maps ◽

Transposon Insertion ◽

Prophage Sequence

Abstract Summary: Phigaro is a standalone command-line application that is able to detect prophage regions taking raw genome and metagenome assemblies as an input. It also produces dynamic annotated “prophage genome maps” and marks possible transposon insertion spots inside prophages. It provides putative taxonomic annotations that can distinguish tailed from non-tailed phages. It is applicable for mining prophage regions from large metagenomic datasets. Availability: Source code for Phigaro is freely available for download at https://github.com/bobeobibo/phigaro along with test data. The code is written in Python.

fluff: exploratory analysis and visualization of high-throughput sequencing data

PeerJ ◽

10.7717/peerj.2209 ◽

2016 ◽

Vol 4 ◽

pp. e2209 ◽

Cited By ~ 28

Author(s):

Georgios Georgiou ◽

Simon J. van Heeringen

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Developmental Stages ◽

Command Line ◽

Clustering Methods ◽

Sequencing Data ◽

Link Type ◽

High Throughput Sequencing Data ◽

Genome Wide ◽

Genome Wide Data

Summary. In this article we describe fluff, a software package that allows for simple exploration, clustering and visualization of high-throughput sequencing data mapped to a reference genome. The package contains three command-line tools to generate publication-quality figures in an uncomplicated manner using sensible defaults. Genome-wide data can be aggregated, clustered and visualized in a heatmap, according to different clustering methods. This includes a predefined setting to identify dynamic clusters between different conditions or developmental stages. Alternatively, clustered data can be visualized in a bandplot. Finally, fluff includes a tool to generate genomic profiles. As command-line tools, the fluff programs can easily be integrated into standard analysis pipelines. The installation is straightforward and documentation is available at http://fluff.readthedocs.org. Availability. fluff is implemented in Python and runs on Linux. The source code is freely available for download at https://github.com/simonvh/fluff.

animalcules: interactive microbiome analytics and visualization in R

Microbiome ◽

10.1186/s40168-021-01013-0 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Yue Zhao ◽

Anthony Federico ◽

Tyler Faits ◽

Solaiappan Manimaran ◽

Daniel Segrè ◽

...

Keyword(s):

16S Rrna ◽

Microbial Communities ◽

R Package ◽

Command Line ◽

Data Generation ◽

Sequencing Data ◽

Shotgun Metagenomics ◽

Microbiome Analysis ◽

Link Type ◽

R Shiny

Abstract Background: Microbial communities that live in and on the human body play a vital role in health and disease. Recent advances in sequencing technologies have enabled the study of microbial communities at unprecedented resolution. However, these advances in data generation have presented novel challenges to researchers attempting to analyze and visualize these data. Results: To address some of these challenges, we have developed animalcules, an easy-to-use interactive microbiome analysis toolkit for 16S rRNA sequencing data, shotgun DNA metagenomics data, and RNA-based metatranscriptomics profiling data. This toolkit combines novel and existing analytics, visualization methods, and machine learning models. For example, the toolkit features traditional microbiome analyses such as alpha/beta diversity and differential abundance analysis, combined with new methods for biomarker identification. In addition, animalcules provides interactive and dynamic figures that enable users to understand their data and discover new insights. animalcules can be used as a standalone command-line R package or users can explore their data with the accompanying interactive R Shiny interface. Conclusions: We present animalcules, an R package for interactive microbiome analysis through either an interactive interface facilitated by R Shiny or various command-line functions. It is the first microbiome analysis toolkit that supports the analysis of all 16S rRNA, DNA-based shotgun metagenomics, and RNA-sequencing based metatranscriptomics datasets. animalcules can be freely downloaded from GitHub at https://github.com/compbiomed/animalcules or installed through Bioconductor at https://www.bioconductor.org/packages/release/bioc/html/animalcules.html.

Verification of Arabidopsis stock collections using SNPmatch - an algorithm for genotyping high-plexed samples

10.1101/109520 ◽

2017 ◽

Cited By ~ 2

Author(s):

Rahul Pisupati ◽

Ilka Reichardt ◽

Ümit Seren ◽

Pamela Korte ◽

Viktoria Nizhynska ◽

...

Keyword(s):

Arabidopsis Thaliana ◽

Genetic Variation ◽

Phenotypic Variation ◽

Large Scale ◽

Command Line ◽

Web Interface ◽

Sequencing Data ◽

Link Type ◽

Gregor Mendel ◽

Low Coverage

Abstract: Large-scale studies such as the Arabidopsis thaliana 1001 Genomes Project aim to understand genetic variation in populations and link it to phenotypic variation. Such studies require routine genotyping of stocks to avoid sample contamination and mix-ups. To genotype samples efficiently and economically, sequencing must be inexpensive and data processing simple. Here we present SNPmatch, a tool which identifies the most likely strain (inbred line, or “accession”) from a SNP database. We tested the tool by performing low-coverage sequencing of over 2000 strains. SNPmatch could readily genotype samples correctly from 1-fold coverage sequencing data, and could also identify the parents of F1 or F2 individuals. SNPmatch can be run either on the command line or through AraGeno (https://arageno.gmi.oeaw.ac.at), a web interface that permits sample genotyping from a user-uploaded VCF or BED file. Availability and implementation: https://github.com/Gregor-Mendel-Institute/SNPmatch.git
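
The idea of picking the most likely accession from a SNP database can be illustrated with a small Python sketch. It does not reproduce SNPmatch's statistical model: it scores each accession by the plain fraction of matching calls at shared positions, which is a simplification assumed here. The function name `best_matching_accession` and the dictionary-based data layout are likewise hypothetical.

```python
from typing import Dict, Optional, Tuple


def best_matching_accession(
    sample_calls: Dict[int, str],
    database: Dict[str, Dict[int, str]],
) -> Tuple[Optional[str], Dict[str, float]]:
    """Score every accession by the fraction of sample SNP calls it matches.

    sample_calls: position -> observed base (sparse, as with low coverage).
    database:     accession name -> {position -> known base}.
    Returns the top-scoring accession (or None) and all scores.
    """
    scores: Dict[str, float] = {}
    for accession, genotype in database.items():
        # Compare only positions covered in both the sample and the database.
        shared = [pos for pos in sample_calls if pos in genotype]
        if not shared:
            continue
        matches = sum(sample_calls[pos] == genotype[pos] for pos in shared)
        scores[accession] = matches / len(shared)
    if not scores:
        return None, scores
    best = max(scores, key=lambda name: scores[name])
    return best, scores
```

Because only shared positions are compared, the sketch tolerates the sparse calls expected from low-coverage data, although a handful of shared sites gives a much less reliable score than thousands would.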

myCircos: Facilitating the Creation and Use of Circos Plots Online

10.1101/052605 ◽

2016 ◽

Author(s):

Caroline Labelle ◽

Geneviève Boucher ◽

Sébastien Lemieux

Keyword(s):

User Interface ◽

Web Application ◽

Graphical Representation ◽

Source Code ◽

Command Line ◽

Genomic Information ◽

Link Type ◽

Intuitive User Interface ◽

The Creation

Abstract: Circos plots were designed to display large amounts of processed genomic information on a single graphical representation. The creation of such plots remains challenging for less technical users as the leading tool requires command-line proficiency. Here, we introduce myCircos, a web application that facilitates the generation of Circos plots by providing an intuitive user interface, adding interactive functionalities to the representation and providing persistence of previous requests. myCircos is available at: http://mycircos.iric.ca. Non-registered users can explore the application through the Guest user. Source code (for local server installation) is available upon request.

UROPA GUI: A web platform for genomic region annotation

10.1101/302091 ◽

2018 ◽

Author(s):

Hendrik Schultheis ◽

Jens Preussner ◽

Annika Fust ◽

Mette Bentsen ◽

Carsten Kuenne ◽

...

Keyword(s):

Graphical User Interface ◽

Bioinformatics Analysis ◽

Source Code ◽

Genomic Region ◽

Command Line ◽

Web Based ◽

Link Type ◽

R Shiny ◽

Considerable Impact ◽

Web Platform

Abstract: The annotation of genomic ranges such as peaks resulting from ChIP-seq/ATAC-seq or other techniques represents a fundamental task of bioinformatics analysis with considerable impact on many downstream analyses. In our previous work, we introduced the Universal Robust Peak Annotator (UROPA), a flexible command line based tool which improves upon the functionality of existing annotation software. In order to reduce the complexity for biologists and clinicians, we have implemented an intuitive web-based graphical user interface (GUI) and fully functional service platform for UROPA. This extension will empower all users to generate annotations for regions of interest interactively. Availability and Implementation: The open source UROPA GUI server was implemented in R Shiny and Python and is available from http://loosolab.mpi-bn.mpg.de. The source code of our App can be downloaded at https://github.molgen.mpg.de/loosolab/UROPA_GUI under the MIT license.

A decoupled, modular and scriptable architecture for tools to curate data platforms

10.1101/2020.09.28.282699 ◽

2020 ◽

Author(s):

Moritz Langenstein ◽

Henning Hermjakob ◽

Manuel Bernal Llinares

Keyword(s):

Web Application ◽

Production Systems ◽

Source Code ◽

Black Box ◽

Command Line ◽

Web Interface ◽

Link Type ◽

Data Platform ◽

The Web

Abstract Motivation: Curation is essential for any data platform to maintain the quality of the data it provides. Existing databases, which require maintenance, and the amount of newly published information that needs to be surveyed, are growing rapidly. More efficient curation is often vital to keep up with this growth, requiring modern curation tools. However, curation interfaces are often complex and difficult to further develop. Furthermore, opportunities for experimentation with curation workflows may be lost due to a lack of development resources, or a reluctance to change sensitive production systems. Results: We propose a decoupled, modular and scriptable architecture to build curation tools on top of existing platforms. Instead of modifying the existing infrastructure, our architecture treats the existing platform as a black box and relies only on its public APIs and web application. As a decoupled program, the tool’s architecture gives more freedom to developers and curators. This added flexibility allows for quickly prototyping new curation workflows as well as adding all kinds of analysis around the data platform. The tool can also streamline and enhance the curator’s interaction with the web interface of the platform. We have implemented this design in cmd-iaso, a command-line curation tool for the identifiers.org registry. Availability: The cmd-iaso curation tool is implemented in Python 3.7+ and supports Linux, macOS and Windows. Its source code and documentation are freely available from https://github.com/identifiers-org/cmd-iaso. It is also published as a Docker container at https://hub.docker.com/r/identifiersorg/[email protected]
