COVID-Align: Accurate online alignment of hCoV-19 genomes using a profile HMM

Author(s):  
Frédéric Lemoine ◽  
Luc Blassel ◽  
Jakub Voznica ◽  
Olivier Gascuel

Abstract Motivation The first cases of the COVID-19 pandemic emerged in December 2019. Until the end of February 2020, the number of available genomes was below 1,000, and their multiple alignment was easily achieved using standard approaches. Subsequently, the availability of genomes has grown dramatically. Moreover, some genomes are of low quality with sequencing/assembly errors, making accurate re-alignment of all genomes nearly impossible on a daily basis. A more efficient, yet accurate approach was clearly required to pursue all subsequent bioinformatics analyses of this crucial data. Results hCoV-19 genomes are highly conserved, with very few indels and no recombination. This makes the profile HMM approach particularly well suited to align new genomes, add them to an existing alignment and filter problematic ones. Using a core of ∼2,500 high quality genomes, we estimated a profile using HMMER, and implemented this profile in COVID-Align, a user-friendly interface to be used online or as standalone via Docker. The alignment of 1,000 genomes requires less than 20 min on our cluster. Moreover, COVID-Align provides summary statistics, which can be used to determine the sequencing quality and evolutionary novelty of input genomes (e.g. number of new mutations and indels). Availability https://covalign.pasteur.cloud, hub.docker.com/r/evolbioinfo/[email protected], [email protected] Supplementary information Supplementary information is available at Bioinformatics online.

Author(s):  
Frédéric Lemoine ◽  
Luc Blassel ◽  
Jakub Voznica ◽  
Olivier Gascuel

Abstract Motivation The first cases of the COVID-19 pandemic emerged in December 2019. Until the end of February 2020, the number of available genomes was below 1,000, and their multiple alignment was easily achieved using standard approaches. Subsequently, the availability of genomes has grown dramatically. Moreover, some genomes are of low quality with sequencing/assembly errors, making accurate re-alignment of all genomes nearly impossible on a daily basis. A more efficient, yet accurate approach was clearly required to pursue all subsequent bioinformatics analyses of this crucial data. Results hCoV-19 genomes are highly conserved, with very few indels and no recombination. This makes the profile HMM approach particularly well suited to align new genomes, add them to an existing alignment and filter problematic ones. Using a core of ∼2,500 high quality genomes, we estimated a profile using HMMER, and implemented this profile in COVID-Align, a user-friendly interface to be used online or as standalone via Docker. The alignment of 1,000 genomes requires ∼50 min on our cluster. Moreover, COVID-Align provides summary statistics, which can be used to determine the sequencing quality and evolutionary novelty of input genomes (e.g. number of new mutations and indels). Availability https://covalign.pasteur.cloud, hub.docker.com/r/evolbioinfo/covid-align Supplementary information Supplementary information is available at Bioinformatics online.


2016 ◽  
Author(s):  
Stephen G. Gaffney ◽  
Jeffrey P. Townsend

ABSTRACT Summary PathScore quantifies the level of enrichment of somatic mutations within curated pathways, applying a novel approach that identifies pathways enriched across patients. The application provides several user-friendly, interactive graphic interfaces for data exploration, including tools for comparing pathway effect sizes, significance, gene-set overlap and enrichment differences between projects. Availability and Implementation Web application available at pathscore.publichealth.yale.edu. Site implemented in Python and MySQL, with all major browsers supported. Source code available at github.com/sggaffney/pathscore with a GPLv3 [email protected] Supplementary Information Additional documentation can be found at http://pathscore.publichealth.yale.edu/faq.


Author(s):  
Matteo Perini ◽  
Gherard Batisti Biffignandi ◽  
Domenico Di Carlo ◽  
Ajay Ratan Pasala ◽  
Aurora Piazza ◽  
...  

Abstract Summary MeltingPlot is an open source web tool for pathogen typing and epidemiological investigations using High Resolution Melting (HRM) data. The tool implements a graph-based algorithm designed to discriminate pathogen clones on the basis of HRM data, producing portable typing results. MeltingPlot also merges typing information with isolate and patient metadata to create graphical and tabular outputs useful in epidemiological studies. The HRM technique allows pathogen typing in less than 5 hours at ~5 euros per sample. MeltingPlot is the first tool specifically designed for HRM-based epidemiological studies and it can analyse hundreds of isolates in a few seconds. Thus, the use of MeltingPlot makes HRM-based typing suitable for large surveillance programs as well as for rapid outbreak reconstructions. Availability and implementation MeltingPlot is implemented in R. The web interface is available at https://skynet.unimi.it/index.php/tools/meltingplot. The source code is also available at https://github.com/MatteoPS/[email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Jiangming Sun ◽  
Yunpeng Wang

ABSTRACT Summary Post-GWAS studies that use results from large consortium meta-analyses often need to handle the issue of overlapping samples correctly. The gold standard approach for resolving this issue is to rerun the GWAS or meta-analysis excluding the overlapping participants. However, such an approach is time-consuming and sometimes restricted by the available data. deMeta provides a user-friendly and computationally efficient command-line implementation for removing the effect of a contributing sub-study from a consortium's meta-analysis results. Only the summary statistics of the meta-analysis and of the sub-study to be removed are required. In addition, deMeta can generate contrasting Manhattan and quantile-quantile plots for users to visualize the impact of the sub-study on the meta-analysis results. Availability and Implementation The Python source code, examples and documentation of deMeta are publicly available at https://github.com/Computational-NeuroGenetics/[email protected] (J. Sun); [email protected] (Y. Wang) Supplementary information None.
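
The idea of removing a sub-study using summary statistics alone can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance-weighted meta-analysis of independent sub-studies; the abstract does not state which model deMeta uses, and the function names below are hypothetical, not deMeta's interface. Under that assumption the pooled weight is the sum of the per-study weights, so one study's contribution can be subtracted algebraically.

```python
import math


def combine(studies):
    """Fixed-effect inverse-variance meta-analysis of (beta, se) pairs.

    Hypothetical helper, used here only to build a pooled estimate to test against.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    w_total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, studies)) / w_total
    return beta, math.sqrt(1.0 / w_total)


def remove_substudy(beta_meta, se_meta, beta_sub, se_sub):
    """Subtract one sub-study from a pooled (beta, se) estimate.

    Assumes the pooled estimate came from a fixed-effect inverse-variance
    meta-analysis that included the sub-study (an assumption of this sketch).
    """
    w_meta = 1.0 / se_meta ** 2   # pooled weight = sum of all study weights
    w_sub = 1.0 / se_sub ** 2     # weight contributed by the sub-study
    w_rest = w_meta - w_sub       # weight of the remaining studies
    if w_rest <= 0:
        raise ValueError("sub-study weight is not smaller than the pooled weight")
    beta_rest = (w_meta * beta_meta - w_sub * beta_sub) / w_rest
    se_rest = math.sqrt(1.0 / w_rest)
    return beta_rest, se_rest


# Pool two illustrative studies, then remove the second one again.
beta_m, se_m = combine([(0.10, 0.05), (0.30, 0.08)])
beta_r, se_r = remove_substudy(beta_m, se_m, 0.30, 0.08)
```

Removing the second study from the pooled result recovers the first study's effect size and standard error up to floating-point error, which is a quick sanity check for any implementation of this kind.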


2019 ◽  
Author(s):  
Yu Amanda Guo ◽  
Mei Mei Chang ◽  
Anders Jacobsen Skanderup

Abstract Summary Recurrence and clustering of somatic mutations (hotspots) in cancer genomes may indicate positive selection and involvement in tumorigenesis. MutSpot performs genome-wide inference of mutation hotspots in non-coding and regulatory DNA of cancer genomes. MutSpot performs feature selection across hundreds of epigenetic and sequence features followed by estimation of position- and patient-specific background somatic mutation probabilities. MutSpot is user-friendly, works on a standard workstation, and scales to thousands of cancer genomes. Availability and implementation MutSpot is implemented as an R package and is available at https://github.com/skandlab/MutSpot/ Supplementary information Supplementary data are available at https://github.com/skandlab/MutSpot/


2020 ◽  
Author(s):  
Janel L. Davis ◽  
Brian Soetikno ◽  
Ki-Hee Song ◽  
Yang Zhang ◽  
Cheng Sun ◽  
...  

Abstract Summary Spectroscopic single-molecule localization microscopy (sSMLM) simultaneously captures the spatial locations and full spectra of stochastically emitting fluorescent single molecules. It provides an optical platform to develop new multi-molecular and functional imaging capabilities. While several open-source software suites provide sub-diffraction localization of fluorescent molecules, software suites for spectroscopic analysis of sSMLM data remain unavailable. RainbowSTORM is an open-source, user-friendly ImageJ/FIJI plugin for end-to-end spectroscopic analysis and visualization of sSMLM images. RainbowSTORM allows users to calibrate, preview, and quantitatively analyze emission spectra acquired using different reported sSMLM system designs and fluorescent labels. Availability RainbowSTORM is a Java plugin for ImageJ (https://imagej.net)/FIJI (http://fiji.sc) freely available through: https://github.com/FOIL-NU/RainbowSTORM. RainbowSTORM has been tested with Windows and Mac operating systems and ImageJ/FIJI version 1.52. Supplementary information Supplementary data are available online.


2018 ◽  
Author(s):  
Bruno Henrique Ribeiro Da Fonseca ◽  
Douglas Silva Domingues ◽  
Alexandre Rossi Paschoal

Abstract Motivation Mirtrons originate from short introns whose cleavage departs from the canonical miRNA pathway, relying instead on the splicing mechanism. Several studies describe mirtrons in chordates, invertebrates and plants, but the current literature offers no repository that centralizes and organizes these publicly available data. To fill this gap, we created the first knowledge database dedicated to mirtrons, called mirtronDB, available at http://mirtrondb.cp.utfpr.edu.br/. MirtronDB has a total of 1,407 mirtron precursors and 2,426 mirtron mature sequences in 18 species. Results Through a user-friendly interface, users can browse and search mirtrons by organism, organism group, type and name. MirtronDB is a specialized resource to explore mirtrons and their regulations, providing free, user-friendly access to knowledge on mirtron data. Availability MirtronDB is available at http://mirtrondb.cp.utfpr.edu.br/[email protected] Supplementary information Supplementary data are available.


2017 ◽  
Author(s):  
Sarah Bastkowski ◽  
Daniel Mapleson ◽  
Andreas Spillner ◽  
Taoyang Wu ◽  
Monika Balvočiūtė ◽  
...  

ABSTRACT Summary Split-networks are a generalization of phylogenetic trees that have proven to be a powerful tool in phylogenetics. Various ways have been developed for computing such networks, including split-decomposition, NeighborNet, QNet and FlatNJ. Some of these approaches are implemented in the user-friendly SplitsTree software package. However, to give the user the option to adjust and extend these approaches and to facilitate their integration into analysis pipelines, there is a need for robust, open-source implementations of associated data structures and algorithms. Here we present SPECTRE, a readily available, open-source library of data structures written in Java that comes complete with new implementations of several previously published algorithms and a basic interactive graphical interface for visualizing planar split networks. SPECTRE also supports the use of longer running algorithms by providing command line interfaces, which can be executed on servers or in High Performance Computing (HPC) environments. Availability Full source code is available under the GPLv3 license at: https://github.com/maplesond/SPECTRE SPECTRE’s core library is available from Maven Central at: https://mvnrepository.com/artifactuk.ac.uea.cmp.spectre/core Documentation is available at: http://spectre-suite-of-phylogenetic-tools-for-reticulate-evolution.readthedocs.io/en/latest/[email protected] Supplementary Information (SI) Supplementary information is available at Bioinformatics online.


2017 ◽  
Author(s):  
Saima Sultana Tithi ◽  
Jiyoung Lee ◽  
Liqing Zhang ◽  
Song Li ◽  
Na Meng

Abstract Analyzing next-generation sequencing data always requires researchers to install many tools, prepare input data compliant with the required data format, and execute the tools in specific orders. Such a tool installation and workflow execution process is tedious and error-prone, and becomes very challenging when researchers need to compare multiple alternative tool chains. To mitigate this problem, we developed a new lightweight and portable system, Biopipe, to simplify the creation and execution of bioinformatics tools and workflows, and to further enable the comparison between alternative tools or workflows. Biopipe allows users to create and edit workflows with user-friendly web interfaces, and automates tool installation as well as workflow synthesis by downloading and executing predefined Docker images. With Biopipe, biologists can easily experiment with and compare different bioinformatics tools and workflows without much computer science knowledge. Biopipe has two main parts: a web application and a standalone Java application. They are freely available at http://bench.cs.vt.edu:8282/Biopipe-Workflow-Editor-0.0.1/index.xhtml and https://code.vt.edu/saima5/[email protected] Supplementary information Supplementary data are available online.


2020 ◽  
Vol 36 (14) ◽  
pp. 4211-4213
Author(s):  
Xiao Wang ◽  
Haidong Yi ◽  
Jia Wang ◽  
Zhandong Liu ◽  
Yanbin Yin ◽  
...  

Abstract Summary We developed GDASC, a web version of our former DASC algorithm implemented with GPU. It provides a user-friendly web interface for detecting batch factors. Based on the good performance of the DASC algorithm, it is able to give the most accurate results. For the two steps of DASC, data-adaptive shrinkage and semi-non-negative matrix factorization, we designed parallelization strategies for the convex clustering solution and the decomposition process. It runs more than 50 times faster than the original version on the representative RNA sequencing quality control dataset. With its accuracy and high speed, this server will be a useful tool for batch effects analysis. Availability and implementation http://bioinfo.nankai.edu.cn/gdasc.php. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

