Topology-independent and global protein structure alignment through an FFT-based algorithm

Bioinformatics ◽ 10.1093/bioinformatics/btz609 ◽ 2019

Author(s): Zeyu Wen ◽ Jiahua He ◽ Sheng-You Huang

Keyword(s): Protein Structure ◽ Search Algorithm ◽ Pairwise Alignment ◽ Structure Alignment ◽ Protein Structure Alignment ◽ Computationally Efficient ◽ Search Feature ◽ Test Sets ◽ Suboptimal Alignment ◽ Difficult Cases

Abstract

Motivation: Protein structure alignment is one of the fundamental problems in computational structure biology. A variety of algorithms have been developed to address this important issue in the past decade. However, due to their heuristic nature, current structure alignment methods may suffer from suboptimal alignment and/or over-fragmentation and thus lead to a biologically wrong alignment in some cases. To overcome these limitations, we have developed an accurate topology-independent and global structure alignment method through an FFT-based exhaustive search algorithm, which is referred to as FTAlign.

Results: Our FTAlign algorithm was extensively tested on six commonly used datasets and compared with seven state-of-the-art structure alignment approaches: TMalign, DeepAlign, Kpax, 3DCOMB, MICAN, SPalignNS and CLICK. FTAlign outperformed the other methods in reproducing manually curated alignments and obtained high success rates of 96.7% and 90.0% on two gold-standard benchmarks, MALIDUP and MALISAM, respectively. Moreover, FTAlign also achieved the overall best performance in terms of biologically meaningful structure overlap (SO) and TMscore on both the sequential alignment test sets (MALIDUP, MALISAM and 64 difficult cases from HOMSTRAD) and the non-sequential sets (MALIDUP-NS, MALISAM-NS and 199 topology-different cases), with a larger advantage on non-sequential alignment. Despite its global search feature, FTAlign is also computationally efficient and can normally complete a pairwise alignment within one second.

Availability and implementation: http://huanglab.phys.hust.edu.cn/ftalign/.
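The abstract does not spell out FTAlign's scoring function or how rotations are sampled, so the following is only a generic sketch of what an FFT-based exhaustive translational search can look like. It assumes both structures are reduced to occupied cells on a small periodic grid and scores a translation by the number of overlapping cells. The names `voxelize` and `best_translation`, the grid size, and the overlap score are illustrative assumptions, not part of FTAlign.

```python
import numpy as np

def voxelize(points, size):
    """Mark the grid cell containing each point (coordinates rounded, wrapped)."""
    grid = np.zeros((size, size, size))
    for x, y, z in np.rint(points).astype(int):
        grid[x % size, y % size, z % size] = 1.0
    return grid

def best_translation(fixed, moving, size=16):
    """Score every cyclic grid translation of `moving` at once.

    By the correlation theorem, corr[t] = sum_x fixed[x + t] * moving[x],
    i.e. the number of cells that overlap after shifting `moving` by t.
    """
    f = np.fft.fftn(voxelize(fixed, size))
    m = np.fft.fftn(voxelize(moving, size))
    corr = np.fft.ifftn(f * np.conj(m)).real
    shift = np.unravel_index(np.argmax(corr), corr.shape)
    return tuple(int(s) for s in shift), int(round(corr[shift]))

# Toy usage: `moving` is `fixed` displaced by (-1, -2, -3) grid cells.
fixed = [(2, 3, 4), (5, 3, 4), (5, 6, 4), (5, 6, 8), (7, 6, 8)]
moving = [(x - 1, y - 2, z - 3) for x, y, z in fixed]
shift, overlap = best_translation(fixed, moving)
```

A single pair of forward transforms and one inverse transform evaluates all translations for a given orientation, which is why an exhaustive search of this kind can stay fast. A full method would repeat the step over sampled rotations and use a more informative score than raw cell overlap.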


OPTIMAL PAIRWISE ALIGNMENT OF FIXED PROTEIN STRUCTURES IN SUBQUADRATIC TIME

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720011005562 ◽ 2011 ◽ Vol 09 (03) ◽ pp. 367-382 ◽ Cited By ~ 7

Author(s): ALEKSANDAR POLEKSIC

Keyword(s): Protein Structure ◽ Structural Alignment ◽ Protein Structures ◽ Dynamic Programming Algorithm ◽ Pairwise Alignment ◽ Structure Alignment ◽ Programming Algorithm ◽ Protein Structure Alignment ◽ Running Time ◽ Speed Accuracy

The problem of finding an optimal structural alignment for a pair of superimposed proteins is often amenable to the Smith–Waterman dynamic programming algorithm, which runs in time proportional to the product of lengths of the sequences being aligned. While the quadratic running time is acceptable for computing a single alignment of two fixed protein structures, the time complexity becomes a bottleneck when running the Smith–Waterman routine multiple times in order to find a globally optimal superposition and alignment of the input proteins. We present a subquadratic running time algorithm capable of computing an alignment that optimizes one of the most widely used measures of protein structure similarity, defined as the number of pairs of residues in two proteins that can be superimposed under a predefined distance cutoff. The algorithm presented in this article can be used to significantly improve the speed–accuracy tradeoff in a number of popular protein structure alignment methods.
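For orientation, the quadratic-time baseline that the abstract contrasts against can be sketched as follows. This is not the subquadratic algorithm of the paper, and it uses a plain gap-free-penalty dynamic program rather than a full Smith–Waterman scoring scheme: for two already superimposed coordinate lists, it finds the largest set of order-preserving residue pairs whose distance is within a cutoff. The function name and the toy coordinates are assumptions made for illustration.

```python
import math

def max_pairs_within_cutoff(a, b, cutoff):
    """O(len(a) * len(b)) dynamic program over superimposed coordinates.

    Returns the maximum number of sequential residue pairs (i, j) with
    distance(a[i], b[j]) <= cutoff, together with one optimal pairing.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close = math.dist(a[i - 1], b[j - 1]) <= cutoff
            score[i][j] = max(score[i - 1][j], score[i][j - 1],
                              score[i - 1][j - 1] + (1 if close else 0))
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        close = math.dist(a[i - 1], b[j - 1]) <= cutoff
        if close and score[i][j] == score[i - 1][j - 1] + 1:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs

# Toy usage with made-up coordinates and a 2.0 distance cutoff.
a = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (11.4, 0, 0)]
b = [(0.5, 0, 0), (3.9, 1, 0), (20, 0, 0), (11.0, 0.5, 0)]
count, pairs = max_pairs_within_cutoff(a, b, 2.0)
```

Because a search for the best superposition calls a routine like this once per candidate transformation, the cost of each call dominates the total running time, which is the bottleneck the abstract describes.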


PROTEIN STRUCTURE ALIGNMENT AND FAST SIMILARITY SEARCH USING LOCAL SHAPE SIGNATURES

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720004000533 ◽ 2004 ◽ Vol 02 (01) ◽ pp. 215-239 ◽ Cited By ~ 4

Author(s): TOLGA CAN ◽ YUAN-FANG WANG

Keyword(s): Protein Structure ◽ Protein Structures ◽ Structure Alignment ◽ Protein Structure Alignment ◽ Specific Information ◽ Alignment Algorithm ◽ Screening Process ◽ Domain Specific ◽ Local Sequence ◽ Shape Signatures

We present a new method for conducting protein structure similarity searches, which improves on the efficiency of some existing techniques. Our method is grounded in the theory of differential geometry on 3D space curve matching. We generate shape signatures for proteins that are invariant, localized, robust, compact, and biologically meaningful. The invariance of the shape signatures allows us to improve similarity searching efficiency by adopting a hierarchical coarse-to-fine strategy. We index the shape signatures using an efficient hashing-based technique. With the help of this technique we screen out unlikely candidates and perform detailed pairwise alignments only for the small number of candidates that survive the screening process. Unlike other hashing-based techniques, our technique employs domain-specific information (not just geometric information) in constructing the hash key, and hence is better tuned to the domain of biology. Furthermore, the invariance, localization, and compactness of the shape signatures allow us to utilize a well-known local sequence alignment algorithm for aligning two protein structures. One measure of the efficacy of the proposed technique is that we were able to perform structure alignment queries 36 times faster (on average) than a well-known method while keeping the quality of the query results at an approximately similar level.
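The abstract does not define the shape signatures, so the sketch below only illustrates the general idea of a local, rigid-motion-invariant descriptor of a 3D space curve that can be quantized into a hash key. It assumes a backbone trace given as a list of 3D points and uses the bend angle and dihedral angle of four consecutive points as discrete stand-ins for curvature and torsion. The function names and the 30-degree bin width are assumptions, and no domain-specific information is included in the key.

```python
import math

def _sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _norm(u):
    return math.sqrt(_dot(u, u))

def local_signature(p0, p1, p2, p3):
    """Bend angle and dihedral angle (degrees) of four consecutive points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    cos_bend = _dot(b1, b2) / (_norm(b1) * _norm(b2))
    bend = math.degrees(math.acos(max(-1.0, min(1.0, cos_bend))))
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    dihedral = math.degrees(math.atan2(_norm(b2) * _dot(b1, n2), _dot(n1, n2)))
    return bend, dihedral

def hash_keys(trace, bin_deg=30):
    """Quantize each local signature into a small integer tuple usable as a hash key."""
    return [tuple(math.floor(angle / bin_deg)
                  for angle in local_signature(*trace[i:i + 4]))
            for i in range(len(trace) - 3)]

# Toy usage: keys are unchanged by a rigid motion of the whole trace.
trace = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 1, 1)]
moved = [(-y + 5, x - 2, z + 3) for x, y, z in trace]  # rotate 90 degrees about z, then translate
same = hash_keys(trace) == hash_keys(moved)
```

Since the keys depend only on the local geometry, two structures can be compared through their key sequences without first superimposing them, which is what makes a screening pass over a hash index and a sequence-style alignment of signatures possible.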


Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic

Bioinformatics ◽ 10.1093/bioinformatics/btv580 ◽ 2015 ◽ Vol 32 (3) ◽ pp. 370-377 ◽ Cited By ~ 11

Author(s): Peter Brown ◽ Wayne Pullan ◽ Yuedong Yang ◽ Yaoqi Zhou

Keyword(s): Protein Structure ◽ Structure Alignment ◽ Protein Structure Alignment


Implementation of a Parallel Protein Structure Alignment Service on Cloud

International Journal of Genomics ◽ 10.1155/2013/439681 ◽ 2013 ◽ Vol 2013 ◽ pp. 1-8 ◽ Cited By ~ 17

Author(s): Che-Lun Hung ◽ Yaw-Ling Lin

Keyword(s): Protein Structure ◽ Programming Model ◽ Protein Structures ◽ Structure Alignment ◽ Evolutionary Relationships ◽ Protein Structure Alignment ◽ Alignment Algorithm ◽ Cloud Platform ◽ Computational Performance ◽ Refinement Algorithm

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
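The MapReduce decomposition described above can be illustrated with a small in-process sketch. It does not use Hadoop and does not reproduce the paper's alignment or refinement algorithms: the map step scores one query/target pair, and the reduce step keeps the best-scoring target per query. The scoring function `toy_score`, which compares secondary-structure strings position by position, is a placeholder assumption standing in for a real structure alignment.

```python
from collections import defaultdict
from itertools import product

def toy_score(a, b):
    """Placeholder similarity: fraction of positions with identical labels."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))

def map_phase(pair):
    """Map: one (query, target) pair -> (query id, (score, target id))."""
    (query_id, query), (target_id, target) = pair
    return query_id, (toy_score(query, target), target_id)

def reduce_phase(mapped):
    """Reduce: group values by query id and keep the best-scoring target."""
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)
    return {key: max(values) for key, values in grouped.items()}

# Toy usage with made-up secondary-structure strings.
queries = {"q1": "HHHEEC", "q2": "EEEECC"}
targets = {"t1": "HHHEEE", "t2": "EEEECH"}
pairs = product(queries.items(), targets.items())
best = reduce_phase(map(map_phase, pairs))
```

Each map call depends only on its own pair, so in a distributed setting the pairs can be spread across worker nodes and only the small (score, target id) records need to be shuffled to the reducers.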
