IGLOSS: iterative gapless local similarity search

Braslav Rabar; Maja Zagorščak; Strahil Ristov; Martin Rosenzweig; Pavle Goldstein

doi:10.1093/bioinformatics/btz086

Bioinformatics ◽

10.1093/bioinformatics/btz086 ◽

2019 ◽

Vol 35 (18) ◽

pp. 3491-3492

Author(s):

Braslav Rabar ◽

Maja Zagorščak ◽

Strahil Ristov ◽

Martin Rosenzweig ◽

Pavle Goldstein

Keyword(s):

Parameter Estimation ◽

Similarity Search ◽

Sequence Similarity ◽

Web Server ◽

Supplementary Information ◽

Local Similarity ◽

Supplementary Data ◽

Matching Algorithm ◽

Local Sequence ◽

Sequence Patterns

Abstract. Summary: Searching for local sequence patterns is one of the basic tasks in bioinformatics. Sequence patterns might have structural, functional or some other relevance, and numerous methods have been developed to detect and analyze them. These methods often depend on the wealth of information already collected. The explosion in the number of newly available sequences calls for novel methods to explore local sequence similarity. We have developed a new method for iterative motif scanning that will look for ungapped sequence patterns similar to a submitted query. Using careful parameter estimation and an adaptation of a fast string-matching algorithm, the method performs significantly better in this context than the existing software. Availability and implementation: The IGLOSS web server is available at http://compbioserv.math.hr/igloss/. Supplementary information: Supplementary data are available at Bioinformatics online.
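To make the idea of an ungapped scan concrete, here is a minimal Python sketch. It is not the IGLOSS algorithm: the abstract does not describe the scoring scheme, the parameter estimation, or the string-matching adaptation, so the function name `ungapped_hits`, the match/mismatch scores, and the threshold are all assumptions chosen only for illustration.

```python
def ungapped_hits(query, sequence, min_score, match=1, mismatch=-1):
    """Score every window of len(query) in `sequence` against `query`
    position by position (no gaps) and keep windows scoring >= min_score.
    The +1/-1 scoring is a placeholder assumption, not the IGLOSS scoring."""
    m = len(query)
    hits = []
    for start in range(len(sequence) - m + 1):
        window = sequence[start:start + m]
        # Compare aligned positions only; no insertions or deletions.
        score = sum(match if a == b else mismatch
                    for a, b in zip(query, window))
        if score >= min_score:
            hits.append((start, window, score))
    return hits


hits = ungapped_hits("HEAGAW", "PAWHEAEHEAGAWGHE", min_score=4)
print(hits)
```

In an iterative setting, the windows found in one round could be used to build a refined query for the next round; how IGLOSS does this is not stated in the abstract.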

Minimally-overlapping words for sequence similarity search

Bioinformatics ◽

10.1093/bioinformatics/btaa1054 ◽

2020 ◽

Author(s):

Martin C Frith ◽

Laurent Noé ◽

Gregory Kucherov

Keyword(s):

Similarity Search ◽

Sequence Similarity ◽

Random Sequence ◽

Supplementary Information ◽

Sequence Similarity Search ◽

Supplementary Data ◽

Huge Data ◽

Open Questions ◽

Seeding Method ◽

Genetic Sequences

Abstract. Motivation: Analysis of genetic sequences is usually based on finding similar parts of sequences, e.g. DNA reads and/or genomes. For big data, this is typically done via “seeds”: simple similarities (e.g. exact matches) that can be found quickly. For huge data, sparse seeding is useful, where we only consider seeds at a subset of positions in a sequence. Results: Here we study a simple sparse-seeding method: using seeds at positions of certain “words” (e.g. ac, at, gc, or gt). Sensitivity is maximized by using words with minimal overlaps. That is because, in a random sequence, minimally-overlapping words are anti-clumped. We provide evidence that this is often superior to acclaimed “minimizer” sparse-seeding methods. Our approach can be unified with design of inexact (spaced and subset) seeds, further boosting sensitivity. Thus, we present a promising approach to sequence similarity search, with open questions on how to optimize it. Availability and implementation: Software to design and test minimally-overlapping words is freely available at https://gitlab.com/mcfrith/noverlap. Supplementary information: Supplementary data are available at Bioinformatics online.
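The word-based selection of seed positions can be sketched in a few lines of Python. This is only an illustration of the idea using the example words from the abstract (ac, at, gc, gt); the function name `seed_positions` and the test sequence are assumptions, and the sketch does not reproduce the authors' software.

```python
def seed_positions(seq, words):
    """Return the start indices in `seq` where any word in `words` begins.
    Only these positions would be considered as seed starts."""
    seq = seq.lower()
    lengths = {len(w) for w in words}
    hits = []
    for i in range(len(seq)):
        # A position qualifies if some word starts exactly here.
        if any(seq[i:i + n] in words for n in lengths):
            hits.append(i)
    return hits


WORDS = {"ac", "at", "gc", "gt"}  # example words from the abstract
positions = seed_positions("gattacagtc", WORDS)
print(positions)  # [1, 4, 7]
```

Every word in this example set begins with a or g and ends with c or t, so a word can never start at the position where another one ends. Selected positions are therefore never adjacent, which is one simple way to see the "anti-clumped" behaviour the abstract mentions.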

COVID-19 Docking Server: a meta server for docking small molecules, peptides and antibodies against potential targets of COVID-19

Bioinformatics ◽

10.1093/bioinformatics/btaa645 ◽

2020 ◽

Vol 36 (20) ◽

pp. 5109-5111 ◽

Cited By ~ 10

Author(s):

Ren Kong ◽

Guangbo Yang ◽

Rui Xue ◽

Ming Liu ◽

Feng Wang ◽

...

Keyword(s):

Life Cycle ◽

Drug Discovery ◽

Small Molecules ◽

Web Server ◽

Supplementary Information ◽

Binding Modes ◽

Supplementary Data ◽

Virus Life Cycle ◽

Ligand Interactions ◽

New Type

Abstract. Motivation: The coronavirus disease 2019 (COVID-19), caused by a new type of coronavirus, emerged from China and has led to thousands of deaths globally since December 2019. Although many groups have engaged in studying the newly emerged virus and searching for a treatment for COVID-19, the understanding of COVID-19 target–ligand interactions represents a key challenge. Herein, we introduce COVID-19 Docking Server, a web server that predicts the binding modes between COVID-19 targets and ligands including small molecules, peptides and antibodies. Results: Structures of proteins involved in the virus life cycle were collected or constructed based on the homologs of coronavirus, and prepared for docking. The meta-platform provides a free and interactive tool for the prediction of COVID-19 target–ligand interactions and subsequent drug discovery for COVID-19. Availability and implementation: http://ncov.schanglab.org.cn. Supplementary information: Supplementary data are available at Bioinformatics online.

HotSpot3D web server: an integrated resource for mutation analysis in protein 3D structures

Bioinformatics ◽

10.1093/bioinformatics/btaa258 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3944-3946 ◽

Cited By ~ 2

Author(s):

Shanyu Chen ◽

Xiaoyu He ◽

Ruilin Li ◽

Xiaohong Duan ◽

Beifang Niu

Keyword(s):

Mutation Analysis ◽

Web Server ◽

Cancer Genome ◽

The Cancer Genome Atlas ◽

Supplementary Information ◽

Supplementary Data ◽

3D Structures ◽

One Stop ◽

Cancer Genome Atlas ◽

Genome Atlas

Abstract. Motivation: HotSpot3D is a widely used software tool for identifying mutation hotspots on the 3D structures of proteins. To further assist users, we developed a new HotSpot3D web server to make this software more versatile, convenient and interactive. Results: The HotSpot3D web server performs data pre-processing, clustering, visualization and log-viewing in one stop. Users can interactively explore each cluster and easily re-visualize the mutational clusters within browsers. We also provide a database that allows users to search and visualize proximal mutations from 33 cancers in The Cancer Genome Atlas. Availability and implementation: http://niulab.scgrid.cn/HotSpot3D/. Supplementary information: Supplementary data are available at Bioinformatics online.

SEABED: Small molEcule activity scanner weB servicE baseD

Bioinformatics ◽

10.1093/bioinformatics/btu709 ◽

2014 ◽

Vol 31 (5) ◽

pp. 773-775 ◽

Cited By ~ 3

Author(s):

Carlos Fenollosa ◽

Marcel Otón ◽

Pau Andrio ◽

Jorge Cortés ◽

Modesto Orozco ◽

...

Keyword(s):

Web Service ◽

Web Server ◽

Software As A Service ◽

Supplementary Information ◽

Supplementary Data ◽

Ensemble Docking ◽

Web Tools ◽

Protein Mutants ◽

User Friendly ◽

Hybrid Docking

Abstract. Motivation: The SEABED web server integrates a variety of docking and QSAR techniques in a user-friendly environment. SEABED goes beyond the basic docking and QSAR web tools and implements extended functionalities like receptor preparation, library editing, flexible ensemble docking, hybrid docking/QSAR experiments or virtual screening on protein mutants. SEABED is not a monolithic workflow tool but a Software as a Service platform. Availability and implementation: SEABED is a free web server available at http://www.bsc.es/SEABED. No registration is required. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Estimating the power of sequence covariation for detecting conserved RNA structure

Bioinformatics ◽

10.1093/bioinformatics/btaa080 ◽

2020 ◽

Vol 36 (10) ◽

pp. 3072-3076 ◽

Cited By ~ 11

Author(s):

Elena Rivas ◽

Jody Clements ◽

Sean R Eddy

Keyword(s):

Secondary Structure ◽

Sequence Alignment ◽

Rna Structure ◽

Rna Secondary Structure ◽

Source Code ◽

Web Server ◽

Supplementary Information ◽

Supplementary Data ◽

Detection Power ◽

Non Coding Rnas

Abstract. Pairwise sequence covariations are a signal of conserved RNA secondary structure. We describe a method for distinguishing when lack of covariation signal can be taken as evidence against a conserved RNA structure, as opposed to when a sequence alignment merely has insufficient variation to detect covariations. We find that alignments for several long non-coding RNAs previously shown to lack covariation support do have adequate covariation detection power, providing additional evidence against their proposed conserved structures. Availability and implementation: The R-scape web server is at eddylab.org/R-scape, with a link to download the source code. Supplementary information: Supplementary data are available at Bioinformatics online.

The HMMER Web Server for Protein Sequence Similarity Search

Current Protocols in Bioinformatics ◽

10.1002/cpbi.40 ◽

2017 ◽

Vol 60 (1) ◽

Cited By ~ 26

Author(s):

Ananth Prakash ◽

Matt Jeffryes ◽

Alex Bateman ◽

Robert D. Finn

Keyword(s):

Protein Sequence ◽

Similarity Search ◽

Sequence Similarity ◽

Web Server ◽

Sequence Similarity Search ◽

Protein Sequence Similarity

MUFold-SSW: a new web server for predicting protein secondary structures, torsion angles and turns

Bioinformatics ◽

10.1093/bioinformatics/btz712 ◽

2019 ◽

Cited By ~ 2

Author(s):

Chao Fang ◽

Zhaoyu Li ◽

Dong Xu ◽

Yi Shang

Keyword(s):

Sequence Similarity ◽

Protein Secondary Structure ◽

Web Server ◽

Experimental Tests ◽

Secondary Structures ◽

Supplementary Information ◽

Beta Turns ◽

Torsion Angles ◽

Protein Secondary Structures ◽

Protein Functions

Abstract. Motivation: Protein secondary structure and backbone torsion angle prediction can provide important information for predicting protein 3D structures and protein functions. Our new methods MUFold-SS, MUFold-Angle, MUFold-BetaTurn and MUFold-GammaTurn, developed based on advanced deep neural networks, achieved state-of-the-art performance for predicting secondary structures, backbone torsion angles, beta-turns and gamma-turns, respectively. An easy-to-use web service will provide the community a convenient way to use these methods for research and development. Results: MUFold-SSW, a new web server, is presented. It provides predictions of protein secondary structures, torsion angles, beta-turns and gamma-turns for a given protein sequence. This server implements MUFold-SS, MUFold-Angle, MUFold-BetaTurn and MUFold-GammaTurn, which performed well for both easy targets (proteins with weak sequence similarity in PDB) and hard targets (proteins without detectable similarity in PDB) in various experimental tests, achieving results better than or comparable with those of existing methods. Availability and implementation: MUFold-SSW is accessible at http://mufold.org/mufold-ss-angle. Supplementary information: Supplementary data are available at Bioinformatics online.

The TMCrys server for supporting crystallization of transmembrane proteins

Bioinformatics ◽

10.1093/bioinformatics/btz108 ◽

2019 ◽

Vol 35 (20) ◽

pp. 4203-4204

Author(s):

Julia K Varga ◽

Gábor E Tusnády

Keyword(s):

Membrane Proteins ◽

Structure Determination ◽

Web Server ◽

Transmembrane Proteins ◽

Prediction Method ◽

Supplementary Information ◽

Supplementary Data ◽

Successful Completion ◽

Determination Process

Abstract. Motivation: Due to their special properties, the structures of transmembrane proteins are extremely hard to determine. Several methods exist to predict the propensity of successful completion of the structure determination process. However, available predictors incorporate data from all kinds of proteins, hence they can hardly differentiate between crystallizable and non-crystallizable membrane proteins. Results: We implemented a web server to simplify running the TMCrys prediction method, which was developed specifically to separate crystallizable and non-crystallizable membrane proteins. Availability and implementation: http://tmcrys.enzim.ttk.mta.hu. Supplementary information: Supplementary data are available at Bioinformatics online.

BLANT—fast graphlet sampling tool

Bioinformatics ◽

10.1093/bioinformatics/btz603 ◽

2019 ◽

Vol 35 (24) ◽

pp. 5363-5364

Author(s):

Sridevi Maharaj ◽

Brennan Tracy ◽

Wayne B Hayes

Keyword(s):

Functional Similarity ◽

Supplementary Information ◽

Local Alignment ◽

Supplementary Data ◽

Input Graph ◽

Sequence Alignments ◽

Ppi Networks ◽

Local Sequence ◽

Taxonomic Trees

Abstract. Summary: BLAST creates local sequence alignments by first building a database of small k-letter sub-sequences called k-mers. Identical k-mers from different regions provide ‘seeds’ for longer local alignments. This seed-and-extend heuristic makes BLAST extremely fast and has led to its almost exclusive use despite the existence of more accurate, but slower, algorithms. In this paper, we introduce the Basic Local Alignment for Networks Tool (BLANT). BLANT is the analog of BLAST, but for networks: given an input graph, it samples small, induced, k-node sub-graphs called k-graphlets. Graphlets have been used to classify networks, quantify structure, align networks both locally and globally, identify topology-function relationships and build taxonomic trees without the use of sequences. Given an input network, BLANT produces millions of graphlet samples in seconds, orders of magnitude faster than existing methods. BLANT offers sampled graphlets in various forms: distributions of graphlets or their orbits; graphlet degree or graphlet orbit degree vectors, the latter being compatible with ORCA; or an index to be used as the basis for seed-and-extend local alignments. We demonstrate BLANT’s usefulness by using its indexing mode to find functional similarity between yeast and human PPI networks. Availability and implementation: BLANT is written in C and is available at https://github.com/waynebhayes/BLANT/releases. Supplementary information: Supplementary data are available at Bioinformatics online.
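The k-mer seeding step that the abstract attributes to BLAST can be illustrated with a small Python sketch. It is a simplified illustration only, not BLAST's or BLANT's implementation: the function names `kmer_index` and `find_seeds`, the example sequences, and the choice of k = 3 are assumptions, and the extension step that follows seeding is omitted.

```python
from collections import defaultdict


def kmer_index(seq, k):
    """Map each k-mer in `seq` to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def find_seeds(query, target, k):
    """Return (query_pos, target_pos) pairs where identical k-mers occur.
    Each pair is a candidate seed that a later step could try to extend."""
    index = kmer_index(target, k)
    return [(i, j)
            for i in range(len(query) - k + 1)
            for j in index.get(query[i:i + k], [])]


seeds = find_seeds("ACGTAC", "TTACGTT", 3)
print(seeds)  # [(0, 2), (1, 3), (3, 1)]
```

Seeds (0, 2) and (1, 3) lie on the same diagonal (target position minus query position equals 2), which is the kind of signal a seed-and-extend method would follow up with a longer local alignment.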

MMseqs2 desktop and local web server app for fast, interactive sequence searches

Bioinformatics ◽

10.1093/bioinformatics/bty1057 ◽

2019 ◽

Vol 35 (16) ◽

pp. 2856-2858 ◽

Cited By ~ 17

Author(s):

Milot Mirdita ◽

Martin Steinegger ◽

Johannes Söding

Keyword(s):

Protein Sequence ◽

Response Times ◽

Web Server ◽

Supplementary Information ◽

Supplementary Data ◽

Server Application ◽

The Web

Abstract. Summary: The MMseqs2 desktop and web server app facilitates interactive sequence searches through custom protein sequence and profile databases on personal workstations. By eliminating MMseqs2’s runtime overhead, we reduced response times to a few seconds at sensitivities close to BLAST. Availability and implementation: The app is easy to install for non-experts. GPLv3-licensed code, pre-built desktop app packages for Windows, MacOS and Linux, Docker images for the web server application and a demo web server are available at https://search.mmseqs.com. Supplementary information: Supplementary data are available at Bioinformatics online.
