OPTIMAL PAIRWISE ALIGNMENT OF FIXED PROTEIN STRUCTURES IN SUBQUADRATIC TIME

The problem of finding an optimal structural alignment for a pair of superimposed proteins is often amenable to the Smith–Waterman dynamic programming algorithm, which runs in time proportional to the product of lengths of the sequences being aligned. While the quadratic running time is acceptable for computing a single alignment of two fixed protein structures, the time complexity becomes a bottleneck when running the Smith–Waterman routine multiple times in order to find a globally optimal superposition and alignment of the input proteins. We present a subquadratic running time algorithm capable of computing an alignment that optimizes one of the most widely used measures of protein structure similarity, defined as the number of pairs of residues in two proteins that can be superimposed under a predefined distance cutoff. The algorithm presented in this article can be used to significantly improve the speed–accuracy tradeoff in a number of popular protein structure alignment methods.
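
The quadratic baseline referred to in this abstract can be illustrated with a minimal sketch. It assumes the two proteins are already superimposed and given as lists of (x, y, z) coordinates, and it counts the largest number of order-preserving residue pairs lying within a distance cutoff. The function name `max_pairs_within_cutoff` and the simple 0/1 scoring are assumptions made for illustration; this is not the subquadratic algorithm of the article, whose details the abstract does not give, and a full Smith–Waterman routine would use general match and gap scores.

```python
import math

def max_pairs_within_cutoff(a, b, cutoff):
    """Quadratic-time baseline (illustrative, not the article's algorithm).

    Returns the largest number of order-preserving residue pairs
    (a[i], b[j]) whose distance is at most `cutoff`. `a` and `b` are
    lists of (x, y, z) coordinates that are already superimposed.
    """
    n, m = len(a), len(b)
    # best[i][j] = optimum for the prefixes a[:i] and b[:j]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close = math.dist(a[i - 1], b[j - 1]) <= cutoff
            best[i][j] = max(
                best[i - 1][j],                      # leave a[i-1] unaligned
                best[i][j - 1],                      # leave b[j-1] unaligned
                best[i - 1][j - 1] + (1 if close else 0),
            )
    return best[n][m]
```

The table has (n + 1)(m + 1) cells, which is the product-of-lengths cost that becomes a bottleneck when the routine is repeated for many candidate superpositions.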

GADP-align: A genetic algorithm and dynamic programming-based method for structural alignment of proteins

Bioimpacts ◽

10.34172/bi.2021.37 ◽

2020 ◽

Vol 11 (4) ◽

pp. 271-279

Author(s):

Soraya Mirzaei ◽

Jafar Razmara ◽

Shahriar Lotfi

Keyword(s):

Genetic Algorithm ◽

Dynamic Programming ◽

Protein Structure ◽

Hybrid Method ◽

Structural Alignment ◽

Dynamic Programming Algorithm ◽

Structure Alignment ◽

Programming Algorithm ◽

Programming Technique ◽

Iterative Dynamic Programming

Introduction: Similarity analysis of protein structures is a fundamental step toward understanding the relationships between proteins. The primary step in structural alignment is finding the correspondence between the residues of two structures that optimizes the scoring function. An exhaustive search for such a correspondence is intractable. Methods: In this paper, a hybrid method named GADP-align is proposed for pairwise protein structure alignment. The method searches for an optimal alignment by combining a genetic algorithm with an iterative dynamic programming technique. To this end, it first creates an initial map of correspondence between the secondary structure elements (SSEs) of the two proteins. Then, a genetic algorithm combined with an iterative dynamic programming algorithm is employed to optimize the alignment. Results: GADP-align was used to align 10 'difficult to align' protein pairs in order to evaluate its performance. The experimental study shows that the proposed hybrid method produces highly accurate alignments in comparison with methods that rely on the dynamic programming technique alone. Furthermore, the proposed method avoids the local optima caused by an unsuitable initial guess of the corresponding residues. Conclusion: The findings of this paper demonstrate that combining the genetic algorithm with the dynamic programming technique yields highly accurate alignments between a protein pair by exploring the global alignment space and avoiding entrapment in locally optimal alignments.

PROTEIN STRUCTURE ALIGNMENT AND FAST SIMILARITY SEARCH USING LOCAL SHAPE SIGNATURES

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720004000533 ◽

2004 ◽

Vol 02 (01) ◽

pp. 215-239 ◽

Cited By ~ 4

Author(s):

TOLGA CAN ◽

YUAN-FANG WANG

Keyword(s):

Protein Structure ◽

Protein Structures ◽

Structure Alignment ◽

Protein Structure Alignment ◽

Specific Information ◽

Alignment Algorithm ◽

Screening Process ◽

Domain Specific ◽

Local Sequence ◽

Shape Signatures

We present a new method for conducting protein structure similarity searches, which improves on the efficiency of some existing techniques. Our method is grounded in the theory of differential geometry on 3D space curve matching. We generate shape signatures for proteins that are invariant, localized, robust, compact, and biologically meaningful. The invariance of the shape signatures allows us to improve similarity searching efficiency by adopting a hierarchical coarse-to-fine strategy. We index the shape signatures using an efficient hashing-based technique. With the help of this technique, we screen out unlikely candidates and perform detailed pairwise alignments only for the small number of candidates that survive the screening process. In contrast to other hashing-based techniques, our technique employs domain-specific information (not just geometric information) in constructing the hash key and hence is better tuned to the domain of biology. Furthermore, the invariance, localization, and compactness of the shape signatures allow us to use a well-known local sequence alignment algorithm for aligning two protein structures. One measure of the efficacy of the proposed technique is that we were able to perform structure alignment queries 36 times faster on average than a well-known method while keeping the quality of the query results at an approximately similar level.
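
To make the idea of an invariant, localized signature concrete, the following sketch computes two standard differential-geometry quantities along a backbone trace: a discrete (Menger) curvature from three consecutive points and a dihedral angle from four. Both are unchanged by rotation and translation. The function names and the choice of these two quantities are assumptions for illustration; the abstract does not specify how the authors' signatures or hash keys are built.

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _norm(u):
    return math.sqrt(_dot(u, u))

def menger_curvature(p, q, r):
    """Curvature of the circle through three distinct points (0 if collinear)."""
    a, b, c = _norm(_sub(q, p)), _norm(_sub(r, q)), _norm(_sub(r, p))
    twice_area = _norm(_cross(_sub(q, p), _sub(r, p)))
    return 2.0 * twice_area / (a * b * c)  # equals 4 * area / (a * b * c)

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four consecutive points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b1) / _norm(b1)
    return math.atan2(y, x)

def shape_signature(points):
    """One (curvature, dihedral) pair per window of four consecutive points."""
    return [
        (menger_curvature(points[i], points[i + 1], points[i + 2]),
         dihedral(points[i], points[i + 1], points[i + 2], points[i + 3]))
        for i in range(len(points) - 3)
    ]
```

Because each entry depends only on a short window of the chain, the resulting sequence of values can be compared with ordinary sequence alignment, or quantized into bins to serve as a lookup key, without first superimposing the structures.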

Implementation of a Parallel Protein Structure Alignment Service on Cloud

International Journal of Genomics ◽

10.1155/2013/439681 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 17

Author(s):

Che-Lun Hung ◽

Yaw-Ling Lin

Keyword(s):

Protein Structure ◽

Programming Model ◽

Protein Structures ◽

Structure Alignment ◽

Evolutionary Relationships ◽

Protein Structure Alignment ◽

Alignment Algorithm ◽

Cloud Platform ◽

Computational Performance ◽

Refinement Algorithm

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.

TOP: a new method for protein structure comparisons and similarity searches

Journal of Applied Crystallography ◽

10.1107/s0021889899012339 ◽

2000 ◽

Vol 33 (1) ◽

pp. 176-183 ◽

Cited By ~ 149

Author(s):

Guoguang Lu

Keyword(s):

User Interface ◽

Protein Structure ◽

Protein Structures ◽

Three Dimensional ◽

Data Bank ◽

Structure Alignment ◽

Dimensional Structure ◽

Protein Structure Alignment ◽

Protein Structure Analysis ◽

Structure Comparison

In order to facilitate the three-dimensional structure comparison of proteins, software for making comparisons and searching for similarities to protein structures in databases has been developed. The program identifies the residues that share similar positions of both main-chain and side-chain atoms between two proteins. The unique functions of the software also include database processing via Internet- and Web-based servers for different types of users. The developed method and its friendly user interface cope with many of the problems that frequently occur in protein structure comparisons, such as detecting structurally equivalent residues, misalignment caused by coincident match of Cα atoms, circular sequence permutations, tedious repetition of access, maintenance of the most recent database, and inconvenience of user interface. The program is also designed to cooperate with other tools in structural bioinformatics, such as the 3DB Browser software [Prilusky (1998). Protein Data Bank Q. Newslett. 84, 3–4] and the SCOP database [Murzin, Brenner, Hubbard & Chothia (1995). J. Mol. Biol. 247, 536–540], for convenient molecular modelling and protein structure analysis. A similarity ranking score of 'structure diversity' is proposed in order to estimate the evolutionary distance between proteins based on the comparisons of their three-dimensional structures. The function of the program has been utilized as a part of an automated program for multiple protein structure alignment. In this paper, the algorithm of the program and results of systematic tests are presented and discussed.

MatAlign: PRECISE PROTEIN STRUCTURE COMPARISON BY MATRIX ALIGNMENT

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720006002417 ◽

2006 ◽

Vol 04 (06) ◽

pp. 1197-1216 ◽

Cited By ~ 18

Author(s):

ZEYAR AUNG ◽

KIAN-LEE TAN

Keyword(s):

Protein Structure ◽

Protein Structures ◽

Scoring Function ◽

Structure Alignment ◽

Supplementary Information ◽

Protein Structure Alignment ◽

Initial Alignment ◽

Structure Comparison ◽

Structural Database ◽

Step Algorithm

We propose a detailed protein structure alignment method named "MatAlign". It is a two-step algorithm. Firstly, we represent 3D protein structures as 2D distance matrices, and align these matrices by means of dynamic programming in order to find the initially aligned residue pairs. Secondly, we refine the initial alignment iteratively into the optimal one according to an objective scoring function. We compare our method against DALI and CE, which are among the most accurate and the most widely used of the existing structural comparison tools. On the benchmark set of 68 protein structure pairs by Fischer et al., MatAlign provides better alignment results, according to four different criteria, than both DALI and CE in a majority of cases. MatAlign also performs as well in structural database search as DALI does, and much better than CE does. MatAlign is about two to three times faster than DALI, and has about the same speed as CE. The software and the supplementary information for this paper are available at . .
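
The first step described here, representing each structure as a 2D distance matrix and aligning the matrices by dynamic programming, can be sketched under stated assumptions. The abstract does not say how matrix rows are scored against each other, so this sketch assumes a simple score: compare the distances from each residue to its sequence neighbours within a window `w`. The names `profile`, `profile_score`, `initial_alignment` and the parameters `w`, `tol`, `gap` are hypothetical and are not taken from MatAlign.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between residues (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def profile(dm, i, w):
    """Distances from residue i to neighbours i-w..i+w (None past chain ends)."""
    n = len(dm)
    return [dm[i][i + k] if 0 <= i + k < n else None for k in range(-w, w + 1)]

def profile_score(pa, pb, tol):
    """Assumed score: positive when two profiles agree within `tol` on average."""
    diffs = [abs(x - y) for x, y in zip(pa, pb)
             if x is not None and y is not None]
    return tol - sum(diffs) / len(diffs)  # k = 0 is always present

def initial_alignment(a, b, w=3, tol=1.0, gap=0.5):
    """Global dynamic programming over row profiles; returns aligned (i, j) pairs."""
    da, db = distance_matrix(a), distance_matrix(b)
    pa = [profile(da, i, w) for i in range(len(a))]
    pb = [profile(db, j, w) for j in range(len(b))]
    n, m = len(a), len(b)
    s = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] - gap
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] - gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = s[i - 1][j - 1] + profile_score(pa[i - 1], pb[j - 1], tol)
            s[i][j] = max(match, s[i - 1][j] - gap, s[i][j - 1] - gap)
    # Trace back from the corner to recover the aligned residue pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        match = s[i - 1][j - 1] + profile_score(pa[i - 1], pb[j - 1], tol)
        if s[i][j] == match:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif s[i][j] == s[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]
```

Since distance matrices do not change under rigid motion, this step needs no prior superposition; the pairs it returns would serve only as the starting point for the iterative refinement that the abstract describes as the second step.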

ALIGNING MULTIPLE PROTEIN STRUCTURES BY DETERMINISTIC ANNEALING

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720005001351 ◽

2005 ◽

Vol 03 (04) ◽

pp. 837-860 ◽

Cited By ~ 5

Author(s):

TIANSHOU ZHOU ◽

LUONAN CHEN ◽

YUN TANG ◽

XIANGSUN ZHANG

Keyword(s):

Protein Structure ◽

Structure Prediction ◽

Protein Structures ◽

Structure Alignment ◽

Deterministic Annealing ◽

Protein Structure Alignment ◽

Protein Chain ◽

Fold Family ◽

Multiple Protein ◽

Wide Range

Protein structure alignment plays a key role in protein structure prediction and fold family classification. An efficient, mathematically formulated method for multiple protein structure alignment is presented, based on the deterministic annealing technique. The alignment problem is mapped onto a nonlinear continuous optimization problem (NCOP) with a common consensus chain, matching assignment matrices, and atomic coordinates as variables. At each step in the annealing procedure, the NCOP is decomposed into as many subproblems as there are protein chains; each subproblem is an independent pairwise structure alignment between a protein chain and the consensus chain and hence can be solved efficiently by parallel computation. The proposed method is robust with respect to the choice of iteration parameters for a wide range of proteins, and performs well in both multiple and pairwise structure alignment cases compared with existing alignment methods.
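
A generic ingredient of deterministic annealing is a temperature-controlled soft assignment that is gradually sharpened into a one-to-one matching. The sketch below shows only that ingredient, for one chain against a consensus chain. The Gaussian-style weighting, the row normalization, and the function name `soft_assignment` are assumptions; the abstract does not give the actual form of the NCOP or its update rules.

```python
import math

def soft_assignment(chain, consensus, temperature):
    """Row-stochastic matching matrix (illustrative assumption).

    M[i][j] is the weight with which residue i of `chain` is matched to
    position j of `consensus`; weights decay with squared distance and
    sharpen as `temperature` is lowered.
    """
    rows = []
    for p in chain:
        d2 = [math.dist(p, q) ** 2 for q in consensus]
        lowest = min(d2)  # shift by the minimum for numerical stability
        w = [math.exp(-(d - lowest) / temperature) for d in d2]
        total = sum(w)
        rows.append([x / total for x in w])
    return rows
```

At a high temperature every consensus position receives a similar weight, which smooths the objective; lowering the temperature step by step moves each row toward a single dominant match.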

Bayesian comparison of protein structures using partial Procrustes distance

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2016-0014 ◽

2017 ◽

Vol 16 (4) ◽

Cited By ~ 2

Author(s):

Nasim Ejlali ◽

Mohammad Reza Faghihi ◽

Mehdi Sadeghi

Keyword(s):

Protein Structure ◽

Statistical Methods ◽

Bayesian Model ◽

Protein Structures ◽

Structure Alignment ◽

Protein Structure Alignment ◽

Geometric Information ◽

Procrustes Distance

An important topic in bioinformatics is protein structure alignment. Some statistical methods have been proposed for this problem, but most of them align two protein structures based on global geometric information without considering the effect of neighbourhood in the structures. In this paper, we provide a Bayesian model to align protein structures by considering the effect of both local and global geometric information of protein structures. Local geometric information is incorporated into the model through the partial Procrustes distance of small substructures. These substructures are composed of

A Visualization Tool to Evaluate Pairwise Protein Structure Alignment Algorithms

10.1101/342899 ◽

2018 ◽

Author(s):

Shalini Bhattacharjee ◽

Asish Mukhopadhyay

Keyword(s):

Protein Structure ◽

Fundamental Problem ◽

Protein Structures ◽

Structural Bioinformatics ◽

Structure Alignment ◽

Spatial Proximity ◽

Protein Structure Alignment ◽

Visualization Tool ◽

Novel Approach ◽

Alignment Algorithms

The alignment of two protein structures is a fundamental problem in structural bioinformatics. In this paper, we propose a novel approach to measure the effectiveness of a sample of three such algorithms, DALI, TM-align and EDAlignsse. The underlying premise of our approach is that structural proximity should translate into spatial proximity.

Topology-independent and global protein structure alignment through an FFT-based algorithm

Bioinformatics ◽

10.1093/bioinformatics/btz609 ◽

2019 ◽

Author(s):

Zeyu Wen ◽

Jiahua He ◽

Sheng-You Huang

Keyword(s):

Protein Structure ◽

Search Algorithm ◽

Pairwise Alignment ◽

Structure Alignment ◽

Protein Structure Alignment ◽

Computationally Efficient ◽

Search Feature ◽

Test Sets ◽

Suboptimal Alignment ◽

Difficult Cases

Motivation: Protein structure alignment is one of the fundamental problems in computational structure biology. A variety of algorithms have been developed to address this important issue in the past decade. However, due to their heuristic nature, current structure alignment methods may suffer from suboptimal alignment and/or over-fragmentation and thus lead to a biologically wrong alignment in some cases. To overcome these limitations, we have developed an accurate topology-independent and global structure alignment method through an FFT-based exhaustive search algorithm, which is referred to as FTAlign. Results: Our FTAlign algorithm was extensively tested on six commonly used datasets and compared with seven state-of-the-art structure alignment approaches, TMalign, DeepAlign, Kpax, 3DCOMB, MICAN, SPalignNS and CLICK. It was shown that FTAlign outperformed the other methods in reproducing manually curated alignments and obtained a high success rate of 96.7 and 90.0% on two gold-standard benchmarks, MALIDUP and MALISAM, respectively. Moreover, FTAlign also achieved the overall best performance in terms of biologically meaningful structure overlap (SO) and TMscore on both the sequential alignment test sets including MALIDUP, MALISAM and 64 difficult cases from HOMSTRAD, and the non-sequential sets including MALIDUP-NS, MALISAM-NS, 199 topology-different cases, where FTAlign especially showed more advantage for non-sequential alignment. Despite its global search feature, FTAlign is also computationally efficient and can normally complete a pairwise alignment within one second. Availability and implementation: http://huanglab.phys.hust.edu.cn/ftalign/.
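
Why an FFT makes an exhaustive search affordable can be shown with a one-dimensional toy example: the overlap score for every cyclic shift of one signal against another is obtained from a few transforms instead of one pass per shift. This is only the general correlation-theorem idea. How FTAlign encodes structures and searches over 3D rotations and translations is not described in the abstract, and the functions below (`fft`, `circular_correlation`) are an assumed, self-contained illustration restricted to input lengths that are powers of two.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    With inverse=True the exponent sign is flipped but the result is
    NOT divided by len(x); the caller applies that scaling.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def circular_correlation(f, g):
    """corr[s] = sum_i f[i] * g[(i + s) % n] for all shifts s (real inputs)."""
    n = len(f)
    spectrum = [a.conjugate() * b for a, b in zip(fft(f), fft(g))]
    return [v.real / n for v in fft(spectrum, inverse=True)]
```

All n shift scores come out of three transforms at O(n log n) cost, compared with O(n^2) for scoring each shift separately; taking the index of the largest entry then gives the best-scoring shift.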

PAIRWISE PROTEIN STRUCTURE ALIGNMENT BASED ON AN ORIENTATION-INDEPENDENT BACKBONE REPRESENTATION

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972000400082x ◽

2004 ◽

Vol 02 (04) ◽

pp. 699-717 ◽

Cited By ~ 7

Author(s):

JIEPING YE ◽

RAVI JANARDAN ◽

SONGTAO LIU

Keyword(s):

Dynamic Programming ◽

Protein Structure ◽

Protein Structures ◽

Rigid Motion ◽

The Other ◽

Structure Alignment ◽

Evolutionary Relationships ◽

Protein Structure Alignment ◽

Initial Alignment ◽

Protein Backbones

Determining structural similarities between proteins is an important problem since it can help identify functional and evolutionary relationships. In this paper, an algorithm is proposed to align two protein structures. Given the protein backbones, the algorithm finds a rigid motion of one backbone onto the other such that large substructures are matched. The algorithm uses a representation of the backbones that is independent of their relative orientations in space and applies dynamic programming to this representation to compute an initial alignment, which is then refined iteratively. Experiments indicate that the algorithm is competitive with two well-known algorithms, namely DALI and LOCK.
