NCBI BLAST+ integrated into Galaxy

2015 ◽  
Author(s):  
Peter J. A. Cock ◽  
John M. Chilton ◽  
Björn Grüning ◽  
James E. Johnson ◽  
Nicola Soranzo

Background: The NCBI BLAST suite has become ubiquitous in modern molecular biology, used for small tasks like checking capillary sequencing results of single PCR products through to genome annotation or even larger scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST was a natural step for sequence comparison workflows. Findings: The command line NCBI BLAST+ tool suite was wrapped for use within Galaxy, defining appropriate datatypes as needed, with the goal of making common BLAST tasks easy, and advanced tasks possible. Conclusions: This effort has become an informal international collaborative effort, and is deployed and used on Galaxy servers worldwide. Several example use-cases are described herein.

GigaScience ◽  
2021 ◽  
Vol 10 (5) ◽  
Author(s):  
Colin Farrell ◽  
Michael Thompson ◽  
Anela Tosevska ◽  
Adewale Oyetunde ◽  
Matteo Pellegrini

Abstract Background Bisulfite sequencing is commonly used to measure DNA methylation. Processing bisulfite sequencing data is often challenging owing to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. BSBolt performs a pre-alignment sequencing read assessment step to improve efficiency when handling asymmetrical bisulfite sequencing libraries. Findings We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.


Cell Systems ◽  
2021 ◽  
Author(s):  
Samuel Katz ◽  
Jian Song ◽  
Kyle P. Webb ◽  
Nicolas W. Lounsbury ◽  
Clare E. Bryant ◽  
...  

GigaScience ◽  
2020 ◽  
Vol 9 (5) ◽  
Author(s):  
Katarzyna Murat ◽  
Björn Grüning ◽  
Paulina Wiktoria Poterlowicz ◽  
Gillian Westgate ◽  
Desmond J Tobin ◽  
...  

Abstract Background Infinium Human Methylation BeadChip is an array platform for complex evaluation of DNA methylation at an individual CpG locus in the human genome based on Illumina’s bead technology and is one of the most common techniques used in epigenome-wide association studies. Finding associations between epigenetic variation and phenotype is a significant challenge in biomedical research. The newest version, HumanMethylationEPIC, quantifies the DNA methylation level of 850,000 CpG sites, while the previous versions, HumanMethylation450 and HumanMethylation27, measured >450,000 and 27,000 loci, respectively. Although a number of bioinformatics tools have been developed to analyse this assay, they require some programming skills and experience in order to be usable. Results We have developed a pipeline for the Galaxy platform, aimed at users without programming experience, for DNA methylation analysis using the Infinium Human Methylation BeadChip. Our tool is integrated into Galaxy (http://galaxyproject.org), a web-based platform. This allows users to analyse data from the Infinium Human Methylation BeadChip in the easiest possible way. Conclusions The pipeline provides a group of integrated analytical methods wrapped into an easy-to-use interface. Our tool is available from the Galaxy ToolShed, GitHub repository, and also as a Docker image. The aim of this project is to make Infinium Human Methylation BeadChip analysis more flexible and accessible to everyone.


Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Shawna Spoor ◽  
Connor Wytko ◽  
Brian Soto ◽  
Ming Chen ◽  
Abdullah Almsaeed ◽  
...  

Abstract Online biological databases housing genomics, genetic and breeding data can be constructed using the Tripal toolkit. Tripal is an open-source, internationally developed framework that implements FAIR data principles and is meant to ease the burden of constructing such websites for research communities. Use of a common, open framework improves the sustainability and manageability of such a site. Site developers can create extensions for their site and in turn share those extensions with others. One challenge that community databases often face is the need to provide tools for their users that analyze increasingly large datasets using multiple software tools strung together in a scientific workflow on complicated computational resources. The Tripal Galaxy module, a ‘plug-in’ for Tripal, meets this need through integration of Tripal with the Galaxy Project workflow management system. Site developers can create workflows appropriate to the needs of their community using Galaxy and then share those for execution on their Tripal sites via automatically constructed, but configurable, web forms or using an application programming interface to power web-based analytical applications. The Tripal Galaxy module helps reduce duplication of effort by allowing site developers to spend time constructing workflows and building their applications rather than rebuilding infrastructure for job management of multi-step applications.


2017 ◽  
Author(s):  
Daniele Pierpaolo Colobraro ◽  
Paolo Romano

Due to the fragmentation of microbial information and the several branches of human activity encompassed by microorganism applications, a comprehensive approach for merging information on microbes is needed. Although online service providers collect a range of data on microorganisms and provide services for microbial Biological Resource Centres (mBRCs), such services are still limited both in contents and aims. The USMI Galaxy Demonstrator (UGD), an implementation of the Galaxy framework exploiting the XML-based Microbiological Common Language (MCL), is meant to give researchers integrated access to enriched information from microbial catalogues, as well as to help mBRC curators in validating and enriching the contents of their catalogues. Researchers and mBRC curators may exploit the UGD to avoid manual, potentially long, searches on the web and to identify and select microorganisms of interest. UGD tools are written in Python, version 2.7. They enrich the basic information provided by catalogues with related taxonomy, literature, sequence and chemical compound data retrieved from some of the main databases on the basis of the strain number, i.e. the unique identifier for a given culture, and the species names. The data is retrieved by querying database Web Services using either the Simple Object Access Protocol (SOAP) or the Representational State Transfer (REST) access protocols. The MCL format provides a versatile way to archive and exchange data among mBRCs. Galaxy is a well-known, open, web-based platform which offers many tools to retrieve, manage and analyze different kinds of information arising from any life science domain. By exploiting Galaxy's flexibility, UGD implements some tools and workflows that can be used to find and integrate a range of information on microorganisms.
UGD tools integrate basic information which may support mBRC staff in the insertion of all fundamental strain information in a proper format allowing integration and interoperability with external databases. They also extend the output by adding information on source materials, including species and strain numbers, and retrieve microorganisms that use a given compound or enzyme in any metabolic pathway, returning the accession number, synonyms, links to external databases, taxon name, and strain number for the requested molecule.


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Andreas Friedrich ◽  
Erhan Kenar ◽  
Oliver Kohlbacher ◽  
Sven Nahnsen

Big data bioinformatics aims at drawing biological conclusions from huge and complex biological datasets. Added value from the analysis of big data, however, is only possible if the data is accompanied by accurate metadata annotation. Particularly in high-throughput experiments, intelligent approaches are needed to keep track of the experimental design, including the conditions that are studied as well as information that might be interesting for failure analysis or further experiments in the future. In addition to the management of this information, means for an integrated design and interfaces for structured data annotation are urgently needed by researchers. Here, we propose a factor-based experimental design approach that enables scientists to easily create large-scale experiments with the help of a web-based system. We present a novel implementation of a web-based interface allowing the collection of arbitrary metadata. To exchange and edit information we provide a spreadsheet-based, human-readable format. Subsequently, sample sheets with identifiers and metainformation for data generation facilities can be created. Data files created after measurement of the samples can be uploaded to a datastore, where they are automatically linked to the previously created experimental design model.
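
The idea of a factor-based design followed by a sample sheet can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`full_factorial`, `to_sample_sheet`), the factor names, the `S001`-style identifiers, and the tab-separated layout are all assumptions made for illustration. The sketch enumerates every combination of factor levels, adds replicates, and writes the result as a spreadsheet-style table.

```python
import csv
import io
from itertools import product


def full_factorial(factors, replicates=1):
    """Return one dict per sample, covering every combination of factor levels."""
    names = list(factors)
    samples = []
    for levels in product(*(factors[name] for name in names)):
        for rep in range(1, replicates + 1):
            sample = dict(zip(names, levels))
            sample["replicate"] = rep
            # Hypothetical identifier scheme: running number, zero-padded.
            sample["sample_id"] = f"S{len(samples) + 1:03d}"
            samples.append(sample)
    return samples


def to_sample_sheet(samples):
    """Serialise the design as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(samples[0]), delimiter="\t")
    writer.writeheader()
    writer.writerows(samples)
    return buffer.getvalue()


# Hypothetical factors for illustration only.
factors = {
    "genotype": ["wild_type", "mutant"],
    "treatment": ["control", "drug"],
    "timepoint_h": [0, 24, 48],
}
design = full_factorial(factors, replicates=2)
sheet = to_sample_sheet(design)
```

With these assumed factors the design contains 2 × 2 × 3 combinations with two replicates each, so 24 samples. Data files produced later could be linked back to the design through the `sample_id` column.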


2017 ◽  
Author(s):  
Bérénice Batut ◽  
Kévin Gravouil ◽  
Clémence Defois ◽  
Saskia Hiltemann ◽  
Jean-François Brugère ◽  
...  

Abstract Background A new generation of sequencing platforms, coupled with numerous bioinformatics tools, has led to rapid technological progress in metagenomics and metatranscriptomics to investigate complex microorganism communities. Nevertheless, a combination of different bioinformatic tools remains necessary to draw conclusions out of microbiota studies. Modular and user-friendly tools would greatly improve such studies. Findings We therefore developed ASaiM, an Open-Source Galaxy-based framework dedicated to microbiota data analyses. ASaiM provides a curated collection of tools to explore and visualize taxonomic and functional information from raw amplicon, metagenomic or metatranscriptomic sequences. To guide different analyses, several customizable workflows are included. All workflows are supported by tutorials and Galaxy interactive tours to guide the users through the analyses step by step. ASaiM is implemented as a Galaxy Docker flavour. It is scalable to many thousand datasets, but can also be used on a normal PC. The associated source code is available under the Apache 2 license at https://github.com/ASaiM/framework and documentation can be found online (http://asaim.readthedocs.io/). Conclusions Based on the Galaxy framework, ASaiM offers sophisticated analyses to scientists without command-line knowledge. ASaiM provides a powerful framework to easily and quickly explore microbiota data in a reproducible and transparent environment.


2020 ◽  
Author(s):  
Colin Farrell ◽  
Michael Thompson ◽  
Anela Tosevska ◽  
Adewale Oyetunde ◽  
Matteo Pellegrini

Abstract Background Bisulfite sequencing is commonly employed to measure DNA methylation. Processing bisulfite sequencing data is often challenging due to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. Findings We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.


2017 ◽  
Vol 109 (1) ◽  
pp. 39-50 ◽  
Author(s):  
Matīss Rikters ◽  
Mark Fishel ◽  
Ondřej Bojar

Abstract In this article, we describe a tool for visualizing the output and attention weights of neural machine translation systems and for estimating confidence about the output based on the attention. Our aim is to help researchers and developers better understand the behaviour of their NMT systems without the need for any reference translations. Our tool includes command line and web-based interfaces that allow systematic evaluation of translation outputs from various engines and experiments. We also present a web demo of our tool with examples of good and bad translations: http://ej.uz/nmt-attention.

