FlaGs and webFlaGs: discovering novel biology through the analysis of gene neighbourhood conservation

Bioinformatics ◽

10.1093/bioinformatics/btaa788 ◽

2020 ◽

Author(s):

Chayan Kumar Saha ◽

Rodrigo Sanches Pires ◽

Harald Brolin ◽

Maxence Delannoy ◽

Gemma Catherine Atkinson

Keyword(s):

Phylogenetic Tree ◽

Supplementary Information ◽

Evolutionary Analysis ◽

Gene Conservation ◽

Supplementary Data ◽

Web Tool ◽

Cluster Evolution ◽

Graphical Visualization ◽

Molecular Evolutionary Analysis ◽

The Web

Abstract Summary Analysis of conservation of gene neighbourhoods over different evolutionary levels is important for understanding operon and gene cluster evolution, and predicting functional associations. Our tool FlaGs (standing for Flanking Genes) takes a list of NCBI protein accessions as input, clusters neighbourhood-encoded proteins into homologous groups using sensitive sequence searching, and outputs a graphical visualization of the gene neighbourhood and its conservation, along with a phylogenetic tree annotated with flanking gene conservation. FlaGs has demonstrated utility for molecular evolutionary analysis, having uncovered a new toxin–antitoxin system in prokaryotes and bacteriophages. The web tool version of FlaGs (webFlaGs) can optionally include a BLASTP search against a reduced RefSeq database to generate an input accession list and analyse neighbourhood conservation within the same run. Availability and implementation FlaGs can be downloaded from https://github.com/GCA-VH-lab/FlaGs or run online at http://www.webflags.se/. Supplementary information Supplementary data are available at Bioinformatics online.
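The idea of scoring flanking gene conservation can be illustrated with a minimal sketch. This is not the FlaGs implementation: it assumes the homologous clusters have already been computed by some sequence search, and the function name, accessions and cluster IDs are all hypothetical.

```python
from collections import Counter

def cluster_conservation(neighbourhoods, protein_to_cluster):
    """Count, for each homologous cluster, how many neighbourhoods contain it.

    neighbourhoods: dict mapping a query accession to its flanking protein IDs.
    protein_to_cluster: dict mapping a flanking protein ID to a cluster ID
    (assumed to be precomputed; proteins without a cluster are ignored).
    """
    counts = Counter()
    for flanking in neighbourhoods.values():
        # A set ensures each cluster is counted once per neighbourhood.
        clusters = {protein_to_cluster[p] for p in flanking if p in protein_to_cluster}
        counts.update(clusters)
    return counts

# Hypothetical toy data: three queries and their flanking proteins.
neighbourhoods = {
    "query_A": ["p1", "p2", "p3"],
    "query_B": ["p4", "p5"],
    "query_C": ["p6", "p7", "p8"],
}
protein_to_cluster = {"p1": 1, "p2": 2, "p4": 1, "p5": 3, "p6": 1, "p7": 2}

conservation = cluster_conservation(neighbourhoods, protein_to_cluster)
# Clusters ordered from most to least widely conserved.
ranked = [cluster for cluster, _ in conservation.most_common()]
```

In this toy example, cluster 1 flanks all three queries, which is the kind of signal that would suggest a functional association worth investigating.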

Predicting Functional Associations using Flanking Genes (FlaGs)

10.1101/362095 ◽

2018 ◽

Cited By ~ 2

Author(s):

Chayan Kumar Saha ◽

Rodrigo Sanches Pires ◽

Harald Brolin ◽

Maxence Delannoy ◽

Gemma Catherine Atkinson

Keyword(s):

Phylogenetic Tree ◽

Gene Cluster ◽

Evolutionary Analysis ◽

Gene Conservation ◽

Link Type ◽

Cluster Evolution ◽

Graphical Visualization ◽

Encoded Proteins ◽

Molecular Evolutionary Analysis

Abstract Analysis of conservation of gene neighbourhoods over different evolutionary levels is important for understanding operon and gene cluster evolution, and predicting functional associations. Our tool FlaGs (Flanking Genes) takes a list of NCBI protein accessions as input, clusters neighbourhood-encoded proteins into homologous groups using sensitive sequence searching, and outputs a graphical visualization of the gene neighbourhood and its conservation, along with a phylogenetic tree annotated with flanking gene conservation. FlaGs has demonstrated utility for molecular evolutionary analysis, having uncovered a new toxin-antitoxin system in prokaryotes and bacteriophages. FlaGs can be downloaded from https://github.com/GCA-VH-lab/FlaGs or run at www.webflags.se.

VarMap: a web tool for mapping genomic coordinates to protein sequence and structure and retrieving protein structural annotations

Bioinformatics ◽

10.1093/bioinformatics/btz482 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4854-4856 ◽

Cited By ~ 8

Author(s):

James D Stephenson ◽

Roman A Laskowski ◽

Andrew Nightingale ◽

Matthew E Hurles ◽

Janet M Thornton

Keyword(s):

Protein Sequence ◽

Structural Information ◽

Protein Structures ◽

Supplementary Information ◽

Supplementary Data ◽

Web Tool ◽

Genomic Variants ◽

Structural Context ◽

Pathogenic Variants ◽

Transcript Evidence

Abstract Motivation Understanding the protein structural context and patterning on proteins of genomic variants can help to separate benign from pathogenic variants and reveal molecular consequences. However, mapping genomic coordinates to protein structures is non-trivial, complicated by alternative splicing and transcript evidence. Results Here we present VarMap, a web tool for mapping a list of chromosome coordinates to canonical UniProt sequences and associated protein 3D structures, including validation checks, and annotating them with structural information. Availability and implementation https://www.ebi.ac.uk/thornton-srv/databases/VarMap. Supplementary information Supplementary data are available at Bioinformatics online.
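The simplest case of the coordinate mapping described above can be sketched as follows. This is an illustrative assumption, not VarMap's method: it handles a single forward-strand transcript with known coding exon intervals, and it ignores alternative splicing, reverse-strand genes and transcript evidence, which are exactly the complications the abstract mentions. The function name and the coordinates are hypothetical.

```python
def genomic_to_residue(pos, cds_exons):
    """Map a genomic position to (residue number, codon position), both 1-based.

    cds_exons: coding exon intervals as (start, end) tuples, 1-based inclusive,
    sorted in transcript order on the forward strand.
    Returns None if the position is not in any coding exon.
    """
    offset = 0  # number of coding bases in the exons already passed
    for start, end in cds_exons:
        if start <= pos <= end:
            cds_index = offset + (pos - start)  # 0-based index within the CDS
            return cds_index // 3 + 1, cds_index % 3 + 1
        offset += end - start + 1
    return None

# Hypothetical transcript with two coding exons (6 and 9 bases long).
exons = [(100, 105), (200, 208)]
first_base = genomic_to_residue(100, exons)
second_exon_start = genomic_to_residue(200, exons)
intronic = genomic_to_residue(150, exons)
```

Here position 200 is the seventh coding base, so it falls on the first base of the third codon, while position 150 lies in the intron and maps to nothing.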

MAJIQ-SPEL: web-tool to interrogate classical and complex splicing variations from RNA-Seq data

Bioinformatics ◽

10.1093/bioinformatics/btx565 ◽

2017 ◽

Vol 34 (2) ◽

pp. 300-302 ◽

Cited By ~ 2

Author(s):

Christopher J Green ◽

Matthew R Gazzara ◽

Yoseph Barash

Keyword(s):

Experimental Validation ◽

Ucsc Genome Browser ◽

Supplementary Information ◽

Supplementary Data ◽

Rna Seq ◽

Web Tool ◽

Rt Pcr ◽

Design Algorithm ◽

Gene Isoforms ◽

Downstream Analysis

Abstract Summary Analysis of RNA sequencing (RNA-Seq) data has highlighted the fact that most genes undergo alternative splicing (AS) and that these patterns are tightly regulated. Many of these events are complex, resulting in numerous possible isoforms that quickly become difficult to visualize, interpret and experimentally validate. To address these challenges we developed MAJIQ-SPEL, a web-tool that takes as input local splicing variations (LSVs) quantified from RNA-Seq data and provides users with visualization and quantification of gene isoforms associated with those. Importantly, MAJIQ-SPEL is able to handle both classical (binary) and complex, non-binary, splicing variations. Using a matching primer design algorithm it also suggests to users possible primers for experimental validation by RT-PCR and displays those, along with the matching protein domains affected by the LSV, on UCSC Genome Browser for further downstream analysis. Availability and implementation Program and code will be available at http://majiq.biociphers.org/majiq-spel. Supplementary information Supplementary data are available at Bioinformatics online.

eFORGE v2.0: updated analysis of cell type-specific signal in epigenomic data

Bioinformatics ◽

10.1093/bioinformatics/btz456 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4767-4769 ◽

Cited By ~ 9

Author(s):

Charles E Breeze ◽

Alex P Reynolds ◽

Jenny van Dongen ◽

Ian Dunham ◽

John Lazar ◽

...

Keyword(s):

Supplementary Information ◽

Supplementary Data ◽

Cell Type ◽

Web Tool ◽

Methylation Analysis ◽

450K Array ◽

Composition Effects ◽

Epigenome Editing ◽

Cell Type Specific ◽

Dna Methylation Analysis

Abstract Summary The Illumina Infinium EPIC BeadChip is a new high-throughput array for DNA methylation analysis, extending the earlier 450k array by over 400 000 new sites. Previously, a method named eFORGE was developed to provide insights into cell type-specific and cell-composition effects for 450k data. Here, we present a significantly updated and improved version of eFORGE that can analyze both EPIC and 450k array data. New features include analysis of chromatin states, transcription factor motifs and DNase I footprints, providing tools for epigenome-wide association study interpretation and epigenome editing. Availability and implementation eFORGE v2.0 is implemented as a web tool available from https://eforge.altiusinstitute.org and https://eforge-tf.altiusinstitute.org/. Supplementary information Supplementary data are available at Bioinformatics online.

HaploGrouper: a generalized approach to haplogroup classification

Bioinformatics ◽

10.1093/bioinformatics/btaa729 ◽

2020 ◽

Author(s):

Anuradha Jagadeesan ◽

S Sunna Ebenesersdóttir ◽

Valdis B Guðmundsdóttir ◽

Elisabet Linda Thordardottir ◽

Kristjan H S Moore ◽

...

Keyword(s):

Mitochondrial Dna ◽

Phylogenetic Tree ◽

Y Chromosome ◽

State Of The Art ◽

Supplementary Information ◽

Sequence Variants ◽

Use Case ◽

Supplementary Data ◽

Human Mitochondrial Dna ◽

Comparable Accuracy

Abstract Motivation We introduce HaploGrouper, a versatile software to classify haplotypes into haplogroups on the basis of a known phylogenetic tree. A typical use case for this software is the assignment of haplogroups to human mitochondrial DNA (mtDNA) or Y-chromosome haplotypes. Existing state-of-the-art haplogroup-calling software is typically hard-wired to work only with either mtDNA or Y-chromosome haplotypes from humans. Results HaploGrouper exhibits comparable accuracy in these instances and has the advantage of being able to assign haplogroups to any kind of haplotypes from any species—given an extant annotated phylogenetic tree defined by sequence variants. Availability and implementation The software is available at the following URL https://gitlab.com/bio_anth_decode/haploGrouper. Supplementary information Supplementary data are available at Bioinformatics online.
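Classification against a tree defined by sequence variants can be pictured with a deliberately simple sketch. It is an assumption-laden illustration, not HaploGrouper's scoring scheme: it descends from the root for as long as some child has all of its defining variants present in the haplotype. The function name, haplogroup labels and variant names are hypothetical.

```python
def assign_haplogroup(haplotype, children, defining_variants, root="root"):
    """Greedy descent through a variant-annotated haplogroup tree.

    haplotype: set of variant names observed in the sample.
    children: dict mapping a node to a list of its child nodes.
    defining_variants: dict mapping each non-root node to its defining variants.
    Returns the deepest node reached.
    """
    node = root
    while True:
        # Children whose defining variants are all present in the haplotype.
        matches = [c for c in children.get(node, [])
                   if defining_variants[c] <= haplotype]
        if not matches:
            return node
        # Prefer the child supported by the most variants; name breaks ties.
        node = max(matches, key=lambda c: (len(defining_variants[c]), c))

# Hypothetical toy tree and variant names.
children = {"root": ["A", "B"], "A": ["A1", "A2"]}
defining_variants = {
    "A": {"m100T"},
    "B": {"m200C"},
    "A1": {"m300G"},
    "A2": {"m400A", "m500T"},
}

call = assign_haplogroup({"m100T", "m300G", "m999A"}, children, defining_variants)
```

A real classifier would also need to tolerate missing data and recurrent or back mutations, which this strict all-variants-present rule does not.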

QPARSE: searching for long-looped or multimeric G-quadruplexes potentially distinctive and druggable

Bioinformatics ◽

10.1093/bioinformatics/btz569 ◽

2019 ◽

Cited By ~ 1

Author(s):

Michele Berselli ◽

Enrico Lavezzo ◽

Stefano Toppo

Keyword(s):

Human Gene ◽

State Of The Art ◽

Comprehensive Analysis ◽

Supplementary Information ◽

Gene Promoters ◽

Supplementary Data ◽

Stem Loop ◽

Hiv 1 ◽

Rna And Dna ◽

The Web

Abstract Motivation G-quadruplexes (G4s) are non-canonical nucleic acid conformations that are widespread in all kingdoms of life and are emerging as important regulators both in RNA and DNA. Recently, two new higher-order architectures have been reported: adjacent interacting G4s, and G4s with stable long loops forming stem-loop structures. As there are no specialized tools to identify these conformations, we developed QPARSE. Results QPARSE can exhaustively search for degenerate potential quadruplex-forming sequences (PQSs) containing bulges and/or mismatches at the genomic level, as well as either multimeric or long-looped PQS (MPQS and LLPQS respectively). While its assessment vs. known reference datasets is comparable with the state-of-the-art, what is more interesting is its performance in the identification of MPQS and LLPQS that present algorithms are not designed to search for. We report a comprehensive analysis of MPQS in human gene promoters and the analysis of LLPQS on three experimentally validated case studies from HIV-1, BCL2, and hTERT. Availability QPARSE is freely accessible on the web at http://www.medcomp.medicina.unipd.it/qparse/index or downloadable from GitHub as a Python 2.7 program at https://github.com/B3rse/qparse. Supplementary information Supplementary data are available at Bioinformatics online.
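For contrast with the degenerate, multimeric and long-looped motifs described above, the widely used canonical PQS pattern (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) can be searched with a plain regular expression. The sketch below is not QPARSE's algorithm: it allows no bulges or mismatches, handles only the DNA alphabet on one strand, and uses a hypothetical function name.

```python
import re

# Canonical motif only: four G-runs of length >= 3 with three loops of 1-7 nt.
CANONICAL_PQS = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

def find_canonical_pqs(sequence):
    """Return (start, end, match) tuples for non-overlapping canonical PQS hits.

    Coordinates are 0-based and end-exclusive, as in Python slicing.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in CANONICAL_PQS.finditer(seq)]

hits = find_canonical_pqs("ttGGGAGGGTGGGAGGGtt")
no_hits = find_canonical_pqs("ttGGGAGGGtt")
```

Because `finditer` reports non-overlapping matches only, this pattern cannot enumerate alternative foldings or adjacent interacting G4s, which is part of what motivates a dedicated tool.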

MMseqs2 desktop and local web server app for fast, interactive sequence searches

Bioinformatics ◽

10.1093/bioinformatics/bty1057 ◽

2019 ◽

Vol 35 (16) ◽

pp. 2856-2858 ◽

Cited By ~ 17

Author(s):

Milot Mirdita ◽

Martin Steinegger ◽

Johannes Söding

Keyword(s):

Protein Sequence ◽

Response Times ◽

Web Server ◽

Supplementary Information ◽

Supplementary Data ◽

Server Application ◽

The Web

Abstract Summary The MMseqs2 desktop and web server app facilitates interactive sequence searches through custom protein sequence and profile databases on personal workstations. By eliminating MMseqs2’s runtime overhead, we reduced response times to a few seconds at sensitivities close to BLAST. Availability and implementation The app is easy to install for non-experts. GPLv3-licensed code, pre-built desktop app packages for Windows, MacOS and Linux, Docker images for the web server application and a demo web server are available at https://search.mmseqs.com. Supplementary information Supplementary data are available at Bioinformatics online.

RBPSponge: genome-wide identification of lncRNAs that sponge RBPs

Bioinformatics ◽

10.1093/bioinformatics/btz448 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4760-4763 ◽

Cited By ~ 7

Author(s):

Saber HafezQorani ◽

Aissa Houdjedj ◽

Mehmet Arici ◽

Abdesselam Said ◽

Hilal Kazan

Keyword(s):

Binding Sites ◽

Regulatory Network ◽

Target Genes ◽

Rna Binding ◽

Rna Binding Protein ◽

Supplementary Information ◽

Web Tool ◽

Genome Wide ◽

Non Coding Rnas ◽

The Web

Abstract Summary Long non-coding RNAs (lncRNAs) can act as molecular sponges or decoys for an RNA-binding protein (RBP) through their RBP-binding sites, thereby modulating the expression of all target genes of the corresponding RBP of interest. Here, we present a web tool named RBPSponge to explore lncRNAs based on their potential to act as a sponge for an RBP of interest. RBPSponge identifies the occurrences of RBP-binding sites and CLIP peaks on lncRNAs, and enables users to run statistical analyses to investigate the regulatory network between lncRNAs, RBPs and targets of RBPs. Availability and implementation The web server is available at https://www.RBPSponge.com. Supplementary information Supplementary data are available at Bioinformatics online.

DoubleRecViz: a web-based tool for visualizing transcript–gene–species tree reconciliation

Bioinformatics ◽

10.1093/bioinformatics/btaa882 ◽

2020 ◽

Author(s):

Esaie Kuitche ◽

Yanchun Qi ◽

Nadia Tahiri ◽

Jack Parmer ◽

Aïda Ouangraoua

Keyword(s):

Phylogenetic Tree ◽

Phylogenetic Trees ◽

Source Code ◽

Species Tree ◽

Supplementary Information ◽

Dynamic Visualization ◽

Supplementary Data ◽

Web Based ◽

Tree Reconciliation ◽

Transcript Gene

Abstract Motivation A phylogenetic tree reconciliation is a mapping of one phylogenetic tree onto another which represents the co-evolution of two sets of taxa (e.g. parasite–host co-evolution, gene–species co-evolution). The reconciliation framework was extended to allow modeling the co-evolution of three sets of taxa such as transcript–gene–species co-evolutions. Several web-based tools have been developed for the display and manipulation of phylogenetic trees and co-phylogenetic trees involving two trees, but there currently exists no tool for visualizing the joint reconciliation between three phylogenetic trees. Results Here, we present DoubleRecViz, a web-based tool for visualizing double reconciliations between phylogenetic trees at three levels: transcript, gene and species. DoubleRecViz extends the RecPhyloXML model—developed for gene–species tree reconciliation—to represent joint transcript–gene and gene–species tree reconciliations. It is implemented using the Dash library, which is a toolbox that provides dynamic visualization functionalities for web data visualization in Python. Availability and implementation DoubleRecViz is available through a web server at https://doublerecviz.cobius.usherbrooke.ca. The source code and information about installation procedures are also available at https://github.com/UdeS-CoBIUS/DoubleRecViz. Supplementary information Supplementary data are available at Bioinformatics online.

eMPRess: a systematic cophylogeny reconciliation tool

Bioinformatics ◽

10.1093/bioinformatics/btaa978 ◽

2020 ◽

Author(s):

Santi Santichaivekin ◽

Qing Yang ◽

Jingyi Liu ◽

Ross Mawhorter ◽

Justin Jiang ◽

...

Keyword(s):

Phylogenetic Tree ◽

Supplementary Information ◽

Supplementary Data ◽

Tree Reconciliation ◽

Loss Model ◽

Software Program

Abstract Summary We describe eMPRess, a software program for phylogenetic tree reconciliation under the duplication-transfer-loss model that systematically addresses the problems of choosing event costs and selecting representative solutions, enabling users to make more robust inferences. Availability and implementation eMPRess is freely available at http://www.cs.hmc.edu/empress. Supplementary information Supplementary data are available at Bioinformatics online.
