BiasAway: command-line and web server to generate nucleotide composition-matched DNA background sequences

Author(s):  
Aziz Khan ◽  
Rafael Riudavets Puig ◽  
Paul Boddie ◽  
Anthony Mathelier

Abstract

Motivation: Accurate motif enrichment analyses depend on the choice of background DNA sequences, which should ideally match the sequence composition of the foreground sequences. Matching is important to avoid false-positive enrichment due to sequence biases in the genome, such as GC bias. Relying on an appropriate set of background sequences is therefore crucial for enrichment analysis.

Results: We developed BiasAway, a command-line tool with a dedicated, easy-to-use web server, to generate synthetic sequences matching any k-mer nucleotide composition or to select genomic DNA sequences matching the mononucleotide composition of the foreground sequences, through four different models. For genomic sequences, we provide precomputed partitions of genomes from nine species with five different bin sizes to generate appropriate genomic background sequences.

Availability and implementation: BiasAway source code is freely available from Bitbucket (https://bitbucket.org/CBGR/biasaway) and can be easily installed using bioconda or pip. The web server is available at https://biasaway.uio.no and detailed documentation is available at https://biasaway.readthedocs.io.

Supplementary information: Supplementary data are available at Bioinformatics online.
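The two ideas named in the abstract, synthetic backgrounds that preserve composition and genomic backgrounds selected to match composition, can be illustrated with a minimal Python sketch. This is not BiasAway's code or API: the function names, the ten GC bins, and the one-to-one sampling are assumptions made for illustration, and only the mononucleotide (k = 1) case is shown.

```python
import random
from collections import Counter


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def shuffle_mononucleotides(seq, rng):
    """Synthetic background: permute the bases, preserving 1-mer composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def gc_bin(seq, n_bins=10):
    """Index of the GC-content bin a sequence falls into (hypothetical binning)."""
    return min(int(gc_fraction(seq) * n_bins), n_bins - 1)


def pick_gc_matched(foreground, candidates, rng, n_bins=10):
    """Genomic background: for each foreground sequence, draw one unused
    candidate from the same GC bin; foreground sequences without a match
    are skipped."""
    pools = {}
    for cand in candidates:
        pools.setdefault(gc_bin(cand, n_bins), []).append(cand)
    for pool in pools.values():
        rng.shuffle(pool)
    background = []
    for fg in foreground:
        pool = pools.get(gc_bin(fg, n_bins), [])
        if pool:
            background.append(pool.pop())
    return background


rng = random.Random(0)  # fixed seed so the example is reproducible
shuffled = shuffle_mononucleotides("ACGTGGGCCA", rng)
matched = pick_gc_matched(["GGCCGGCCAT"], ["ATATATATAT", "GCGCGCGCAT"], rng)
```

Shuffling guarantees identical base counts but destroys higher-order structure; preserving k-mer composition for k > 1 requires a more careful shuffling procedure than the one sketched here.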

2015 ◽  
Vol 14 ◽  
pp. CIN.S26470 ◽  
Author(s):  
Richard P. Finney ◽  
Qing-Rong Chen ◽  
Cu V. Nguyen ◽  
Chih Hao Hsu ◽  
Chunhua Yan ◽  
...  

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command-line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview . The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview .


2020 ◽  
Vol 36 (10) ◽  
pp. 3263-3265 ◽  
Author(s):  
Lucas Czech ◽  
Pierre Barbera ◽  
Alexandros Stamatakis

Abstract

Summary: We present genesis, a library for working with phylogenetic data, and gappa, an accompanying command-line tool for conducting typical analyses on such data. The tools target phylogenetic trees and phylogenetic placements, sequences, taxonomies and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested and field-proven.

Availability and implementation: Both genesis and gappa are written in modern C++11, and are freely available under GPLv3 at http://github.com/lczech/genesis and http://github.com/lczech/gappa.

Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (21) ◽  
pp. 4405-4407 ◽  
Author(s):  
Steven Monger ◽  
Michael Troup ◽  
Eddie Ip ◽  
Sally L Dunwoodie ◽  
Eleni Giannoulatou

Abstract

Motivation: In silico prediction tools are essential for identifying variants which create or disrupt cis-splicing motifs. However, there are limited options for genome-scale discovery of splice-altering variants.

Results: We have developed Spliceogen, a highly scalable pipeline integrating predictions from some of the individually best performing models for splice motif prediction: MaxEntScan, GeneSplicer, ESRseq and Branchpointer.

Availability and implementation: Spliceogen is available as a command-line tool which accepts VCF/BED inputs and handles both single nucleotide variants (SNVs) and indels (https://github.com/VCCRI/Spliceogen). SNV databases with prediction scores are also available, covering all possible SNVs at all genomic positions within all Gencode-annotated multi-exon transcripts.

Supplementary information: Supplementary data are available at Bioinformatics online.
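To make the SNV/indel distinction concrete, the following Python sketch classifies each alternate allele of a single VCF data line by comparing REF and ALT lengths. It is an illustration only and is not taken from Spliceogen; the function name and the "other" category are hypothetical, and header lines, symbolic alleles and multi-nucleotide substitutions are deliberately not analysed further.

```python
from typing import List, Tuple


def classify_vcf_record(line: str) -> List[Tuple[str, int, str, str, str]]:
    """Return (chrom, pos, ref, alt, kind) for every ALT allele on one
    tab-separated VCF data line (not a '#' header line)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _identifier, ref, alts = fields[:5]
    records = []
    for alt in alts.split(","):
        if alt in (".", "*") or alt.startswith("<"):
            kind = "other"   # missing, spanning-deletion or symbolic allele
        elif len(ref) == 1 and len(alt) == 1:
            kind = "SNV"     # single base replaced by a single base
        elif len(ref) != len(alt):
            kind = "indel"   # length change: insertion or deletion
        else:
            kind = "other"   # same-length multi-base substitution
        records.append((chrom, int(pos), ref, alt, kind))
    return records


example = classify_vcf_record("chr1\t100\t.\tA\tG,AT\t.\t.\t.")
```

A multi-allelic line yields one tuple per alternate allele, so downstream scoring can treat each allele independently.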


Author(s):  
Michael Milton ◽  
Natalie Thorne

Abstract

Summary: aCLImatise is a utility for automatically generating tool definitions compatible with bioinformatics workflow languages, by parsing command-line help output. aCLImatise also has an associated database called the aCLImatise Base Camp, which provides thousands of pre-computed tool definitions.

Availability and implementation: The latest aCLImatise source code is available within a GitHub organisation, under the GPL-3.0 license: https://github.com/aCLImatise. In particular, documentation for the aCLImatise Python package is available at https://aclimatise.github.io/CliHelpParser/, and the aCLImatise Base Camp is available at https://aclimatise.github.io/BaseCamp/.

Supplementary information: Supplementary data are available at Bioinformatics online.
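The general idea of turning help text into a structured tool description can be sketched in a few lines of Python. The parser below is a toy written for this illustration: it is not aCLImatise's parser or API, the function name and dictionary keys are hypothetical, and it assumes argparse-style help in which option lines start with a dash and the description is separated by at least two spaces or wrapped onto the next line.

```python
import re


def parse_help(text):
    """Extract option names, whether a value is expected, and the
    description from argparse-style help text (simplified heuristic)."""
    options = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("-"):
            # A wrapped description follows an option line that had none.
            if (options and stripped and line.startswith(" ")
                    and not options[-1]["description"]):
                options[-1]["description"] = stripped
            continue
        parts = re.split(r"\s{2,}", stripped, maxsplit=1)
        spec = parts[0]
        description = parts[1] if len(parts) > 1 else ""
        # Flag names such as -o or --output, but not dashes inside words.
        names = re.findall(r"(?<![\w-])--?[A-Za-z][\w-]*", spec)
        # A metavar like FILE, <file> or [N] after a flag implies a value.
        takes_value = bool(re.search(r"[ =][A-Z<\[]", spec))
        options.append({"names": names,
                        "takes_value": takes_value,
                        "description": description})
    return options


EXAMPLE_HELP = """usage: tool [-h] [-o FILE] [--verbose]

optional arguments:
  -h, --help            show this help message and exit
  -o FILE, --output FILE
                        write results to FILE
  --verbose             print progress messages
"""

parsed = parse_help(EXAMPLE_HELP)
```

Real help output varies far more than this heuristic allows, which is part of why a database of pre-computed definitions is useful.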


2020 ◽  
Vol 36 (9) ◽  
pp. 2934-2935 ◽  
Author(s):  
Yi Zheng ◽  
Fangqing Zhao

Abstract

Summary: Circular RNAs (circRNAs) have been shown to have unique compositions and splicing events distinct from canonical mRNAs. However, there is no visualization tool designed for the exploration of complex splicing patterns in circRNA transcriptomes. Here, we present CIRI-vis, a Java command-line tool for quantifying and visualizing circRNAs by integrating the alignments and junctions of circular transcripts. CIRI-vis can be applied to visualize the internal structure and isoform abundance of circRNAs and perform circRNA transcriptome comparison across multiple samples.

Availability and implementation: https://sourceforge.net/projects/ciri/files/CIRI-vis.

Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Fábio K Mendes ◽  
Dan Vanderpool ◽  
Ben Fulton ◽  
Matthew W Hahn

Abstract

Motivation: Genome sequencing projects have revealed frequent gains and losses of genes between species. Previous versions of our software, Computational Analysis of gene Family Evolution (CAFE), have allowed researchers to estimate parameters of gene gain and loss across a phylogenetic tree. However, the underlying model assumed that all gene families had the same rate of evolution, despite evidence suggesting a large amount of variation in rates among families.

Results: Here, we present CAFE 5, a completely re-written software package with numerous performance and user-interface enhancements over previous versions. These include improved support for multithreading, the explicit modeling of rate variation among families using gamma-distributed rate categories, and command-line arguments that preclude the use of accessory scripts.

Availability and implementation: CAFE 5 source code, documentation, test data and a detailed manual with examples are freely available at https://github.com/hahnlab/CAFE5/releases.

Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Aleksandra I Jarmolinska ◽  
Anna Gambin ◽  
Joanna I Sulkowska

Abstract

Summary: The biggest hurdle in studying topology in biopolymers is the steep learning curve for actually seeing the knots in structure visualization. Knot_pull is a command-line utility designed to simplify this process: it presents the user with a smoothing trajectory for provided structures (any number and length of protein, RNA or chromatin chains in PDB, CIF or XYZ format), and calculates the knot type (including presence of any links, and slipknots when a subchain is specified).

Availability and implementation: Knot_pull works under Python >=2.7 and is system independent. Source code and documentation are available at http://github.com/dzarmola/knot_pull under GNU GPL license and also include a wrapper script for PyMOL for easier visualization. Examples of smoothing trajectories can be found at: https://www.youtube.com/watch?v=IzSGDfc1vAY.

Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (12) ◽  
pp. 3902-3904
Author(s):  
Timothy O’Connor ◽  
Charles E Grant ◽  
Mikael Bodén ◽  
Timothy L Bailey

Abstract

Motivation: Identifying the genes regulated by a given transcription factor (TF) (its ‘target genes’) is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF’s binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene.

Results: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF’s binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene’s promoter, achieving median precision above 60%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data are not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions.

Availability and implementation: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org.

Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Judith Neukamm ◽  
Alexander Peltzer ◽  
Kay Nieselt

Abstract

Motivation: In ancient DNA research, the authentication of ancient samples based on specific features remains a crucial step in data analysis. Because of this central importance, researchers lacking deeper programming knowledge should be able to run a basic damage authentication analysis. Such software should be user-friendly and easy to integrate into an analysis pipeline.

Results: DamageProfiler is a Java-based, stand-alone software tool to determine damage patterns in ancient DNA. The results are provided in various file formats and plots for further processing. DamageProfiler has an intuitive graphical interface as well as a command-line interface that allows the tool to be easily embedded into an analysis pipeline.

Availability: All of the source code is freely available on GitHub (https://github.com/Integrative-Transcriptomics/DamageProfiler).

Supplementary information: Supplementary data are available at Bioinformatics online.
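As background for what a damage pattern can look like in practice, the Python sketch below computes one commonly reported ancient-DNA signal: the per-position frequency of C-to-T mismatches near the 5' end of reads. The abstract does not specify which statistics DamageProfiler reports, so the choice of signal is an assumption, as are the function name and the input format (ungapped reference/read string pairs in read orientation). DamageProfiler itself is a Java tool; this is not its code.

```python
def c_to_t_profile(pairs, window=5):
    """For each of the first `window` read positions, return the fraction
    of reference C bases that appear as T in the read, or None where no
    reference C was observed at that position."""
    c_total = [0] * window
    c_to_t = [0] * window
    for ref, read in pairs:
        for i, (ref_base, read_base) in enumerate(zip(ref[:window], read[:window])):
            if ref_base == "C":
                c_total[i] += 1
                if read_base == "T":
                    c_to_t[i] += 1
    return [hits / total if total else None
            for hits, total in zip(c_to_t, c_total)]


# Three toy alignments: (reference segment, read), both 5' to 3'.
toy_pairs = [("CACG", "TACG"), ("CCGT", "CTGT"), ("GCAT", "GCAT")]
profile = c_to_t_profile(toy_pairs, window=3)
```

In authentic ancient samples this frequency is typically elevated at the first few positions and decays toward the read interior, which is what a damage plot visualizes.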


2019 ◽  
Vol 36 (7) ◽  
pp. 2306-2307 ◽  
Author(s):  
Sergii Domanskyi ◽  
Carlo Piermarocchi ◽  
George I Mias

Abstract

Summary: PyIOmica is an open-source Python package focusing on integrating longitudinal multiple omics datasets, characterizing and categorizing temporal trends. The package includes multiple bioinformatics tools including data normalization, annotation, categorization, visualization and enrichment analysis for gene ontology terms and pathways. Additionally, the package includes an implementation of visibility graphs to visualize time series as networks.

Availability and implementation: PyIOmica is implemented as a Python package (pyiomica), available for download and installation through the Python Package Index (https://pypi.python.org/pypi/pyiomica), and can be deployed using the Python import function following installation. PyIOmica has been tested on Mac OS X, Unix/Linux and Microsoft Windows. The application is distributed under an MIT license. Source code for each release is also available for download on Zenodo (https://doi.org/10.5281/zenodo.3548040).

Supplementary information: Supplementary data are available at Bioinformatics online.
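A visibility graph maps each time point to a node and connects two nodes when they can "see" each other over the intervening data. The Python sketch below uses the natural visibility criterion, under which points a and b are linked if every point between them lies strictly below the straight line joining them. Whether PyIOmica uses this variant is an assumption, and the function name is hypothetical; the sketch does not call or reproduce the pyiomica API.

```python
def natural_visibility_edges(times, values):
    """Return the edges (i, j), i < j, of the natural visibility graph of
    a time series given as parallel lists of strictly increasing times
    and their values."""
    n = len(values)
    edges = []
    for a in range(n):
        for b in range(a + 1, n):
            visible = True
            for c in range(a + 1, b):
                # Height at time c of the straight line from point a to point b.
                line_height = values[b] + (values[a] - values[b]) * (
                    (times[b] - times[c]) / (times[b] - times[a]))
                if values[c] >= line_height:
                    visible = False  # point c blocks the line of sight
                    break
            if visible:
                edges.append((a, b))
    return edges


edges = natural_visibility_edges([0, 1, 2, 3], [1, 3, 2, 4])
```

Consecutive points are always connected, so the resulting network is connected by construction; peaks in the series tend to become high-degree hubs. The triple loop is cubic in the series length, which is acceptable for short longitudinal series but not for long ones.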

