Sequence Specific Dihedral Angle Distribution: Application in Protein Structure Prediction and Evaluation

1970 ◽  
Vol 19 (2) ◽  
pp. 217-226
S. M. Minhaz Ud-Dean ◽  
Mahdi Muhammad Moosa

Protein structure prediction and evaluation is one of the major fields of computational biology. Estimation of dihedral angle can provide information about the acceptability of both theoretically predicted and experimentally determined structures. Here we report on the sequence specific dihedral angle distribution of high resolution protein structures available in PDB and have developed Sasichandran, a tool for sequence specific dihedral angle prediction and structure evaluation. This tool will allow evaluation of a protein structure in pdb format from the sequence specific distribution of Ramachandran angles. Additionally, it will allow retrieval of the most probable Ramachandran angles for a given sequence along with the sequence specific data. Key words: Torsion angle, φ-ψ distribution, sequence specific ramachandran plot, Ramasekharan, protein structure appraisal D.O.I. 10.3329/ptcb.v19i2.5439 Plant Tissue Cult. & Biotech. 19(2): 217-226, 2009 (December)

Arun G. Ingale

To predict the structure of protein from a primary amino acid sequence is computationally difficult. An investigation of the methods and algorithms used to predict protein structure and a thorough knowledge of the function and structure of proteins are critical for the advancement of biology and the life sciences as well as the development of better drugs, higher-yield crops, and even synthetic bio-fuels. To that end, this chapter sheds light on the methods used for protein structure prediction. This chapter covers the applications of modeled protein structures and unravels the relationship between pure sequence information and three-dimensional structure, which continues to be one of the greatest challenges in molecular biology. With this resource, it presents an all-encompassing examination of the problems, methods, tools, servers, databases, and applications of protein structure prediction, giving unique insight into the future applications of the modeled protein structures. In this chapter, current protein structure prediction methods are reviewed for a milieu on structure prediction, the prediction of structural fundamentals, tertiary structure prediction, and functional imminent. The basic ideas and advances of these directions are discussed in detail.

2019 ◽  
Jie Hou

Protein structure prediction is one of the most important scientific problems in the field of bioinformatics and computational biology. The availability of protein three-dimensional (3D) structure is crucial for studying biological and cellular functions of proteins. The importance of four major sub-problems in protein structure prediction have been clearly recognized. Those include, first, protein secondary structure prediction, second, protein fold recognition, third, protein quality assessment, and fourth, multi-domain assembly. In recent years, deep learning techniques have proved to be a highly effective machine learning method, which has brought revolutionary advances in computer vision, speech recognition and bioinformatics. In this dissertation, five contributions are described. First, DNSS2, a method for protein secondary structure prediction using one-dimensional deep convolution network. Second, DeepSF, a method of applying deep convolutional network to classify protein sequence into one of thousands known folds. Third, CNNQA and DeepRank, two deep neural network approaches to systematically evaluate the quality of predicted protein structures and select the most accurate model as the final protein structure prediction. Fourth, MULTICOM, a protein structure prediction system empowered by deep learning and protein contact prediction. Finally, SAXSDOM, a data-assisted method for protein domain assembly using small-angle X-ray scattering data. All the methods are available as software tools or web servers which are freely available to the scientific community.

2020 ◽  
Ronald Ayoub ◽  
Yugyung Lee

AbstractProtein structure prediction is a long-standing unsolved problem in molecular biology that has seen renewed interest with the recent success of deep learning with AlphaFold at CASP13. While developing and evaluating protein structure prediction methods, researchers may want to identify the most similar known structures to their predicted structures. These predicted structures often have low sequence and structure similarity to known structures. We show how RUPEE, a purely geometric protein structure search, is able to identify the structures most similar to structure predictions, regardless of how they vary from known structures, something existing protein structure searches struggle with. RUPEE accomplishes this through the use of a novel linear encoding of protein structures as a sequence of residue descriptors. Using a fast Needleman-Wunsch algorithm, RUPEE is able to perform alignments on the sequences of residue descriptors for every available structure. This is followed by a series of increasingly accurate structure alignments from TM-align alignments initialized with the Needleman-Wunsch residue descriptor alignments to standard TM-align alignments of the final results. By using alignment normalization effectively at each stage, RUPEE also can execute containment searches in addition to full-length searches to identify structural motifs within proteins. We compare the results of RUPEE to mTM-align, SSM, CATHEDRAL and VAST using a benchmark derived from the protein structure predictions submitted to CASP13. RUPEE identifies better alignments on average with respect to RMSD and TM-score as well as Q-score and SSAP-score, scores specific to SSM and CATHEDRAL, respectively. Finally, we show a sample of the top-scoring alignments that RUPEE identified that none of the other protein structure searches we compared to were able to identify.The RUPEE protein structure search is available at Code and data are available at

Jiaxi Liu ◽  

The prediction of protein three-dimensional structure from amino acid sequence has been a challenge problem in bioinformatics, owing to the many potential applications for robust protein structure prediction methods. Protein structure prediction is essential to bioscience, and its research results are important for other research areas. Methods for the prediction an才d design of protein structures have advanced dramatically. The prediction of protein structure based on average hydrophobic values is discussed and an improved genetic algorithm is proposed to solve the optimization problem of hydrophobic protein structure prediction. An adjustment operator is designed with the average hydrophobic value to prevent the overlapping of amino acid positions. Finally, some numerical experiments are conducted to verify the feasibility and effectiveness of the proposed algorithm by comparing with the traditional HNN algorithm.

2019 ◽  
Vol 17 (06) ◽  
pp. 1940013
Ahmed Bin Zaman ◽  
Amarda Shehu

An important goal in template-free protein structure prediction is how to control the quality of computed tertiary structures of a target amino-acid sequence. Despite great advances in algorithmic research, given the size, dimensionality, and inherent characteristics of the protein structure space, this task remains exceptionally challenging. It is current practice to aim to generate as many structures as can be afforded so as to increase the likelihood that some of them will reside near the sought but unknown biologically-active/native structure. When operating within a given computational budget, this is impractical and uninformed by any metrics of interest. In this paper, we propose instead to equip algorithms that generate tertiary structures, also known as decoy generation algorithms, with memory of the protein structure space that they explore. Specifically, we propose an evolving, granularity-controllable map of the protein structure space that makes use of low-dimensional representations of protein structures. Evaluations on diverse target sequences that include recent hard CASP targets show that drastic reductions in storage can be made without sacrificing decoy quality. The presented results make the case that integrating a map of the protein structure space is a promising mechanism to enhance decoy generation algorithms in template-free protein structure prediction.

2020 ◽  
Vol 10 (1) ◽  
Kun Tian ◽  
Xin Zhao ◽  
Xiaogeng Wan ◽  
Stephen S.-T. Yau

AbstractProtein structure can provide insights that help biologists to predict and understand protein functions and interactions. However, the number of known protein structures has not kept pace with the number of protein sequences determined by high-throughput sequencing. Current techniques used to determine the structure of proteins are complex and require a lot of time to analyze the experimental results, especially for large protein molecules. The limitations of these methods have motivated us to create a new approach for protein structure prediction. Here we describe a new approach to predict of protein structures and structure classes from amino acid sequences. Our prediction model performs well in comparison with previous methods when applied to the structural classification of two CATH datasets with more than 5000 protein domains. The average accuracy is 92.5% for structure classification, which is higher than that of previous research. We also used our model to predict four known protein structures with a single amino acid sequence, while many other existing methods could only obtain one possible structure for a given sequence. The results show that our method provides a new effective and reliable tool for protein structure prediction research.

Leonardo de Lima Corrêa ◽  
Márcio Dorn

Tertiary protein structure prediction in silico is currently a challenging problem in Structural Bioinformatics and can be classified according to the computational complexity theory as an NP-hard problem. Determining the 3-D structure of a protein is both experimentally expensive, and time-consuming. The agent-based paradigm has been shown a useful technique for the applications that have repetitive and time-consuming activities, knowledge share and management, such as integration of different knowledge sources and modeling of complex systems, supporting a great variety of domains. This chapter provides an integrated view and insights about the protein structure prediction area concerned to the usage, application and implementation of multi-agent systems to predict the protein structures or to support and coordinate the existing predictors, as well as it is advantages, issues, needs, and demands. It is noteworthy that there is a great need for works related to multi-agent and agent-based paradigms applied to the problem due to their excellent suitability to the problem.

2021 ◽  
Vol 22 (11) ◽  
pp. 5553
Subash C Pakhrin ◽  
Bikash Shrestha ◽  
Badri Adhikari ◽  
Dukka B KC

Obtaining an accurate description of protein structure is a fundamental step toward understanding the underpinning of biology. Although recent advances in experimental approaches have greatly enhanced our capabilities to experimentally determine protein structures, the gap between the number of protein sequences and known protein structures is ever increasing. Computational protein structure prediction is one of the ways to fill this gap. Recently, the protein structure prediction field has witnessed a lot of advances due to Deep Learning (DL)-based approaches as evidenced by the success of AlphaFold2 in the most recent Critical Assessment of protein Structure Prediction (CASP14). In this article, we highlight important milestones and progresses in the field of protein structure prediction due to DL-based methods as observed in CASP experiments. We describe advances in various steps of protein structure prediction pipeline viz. protein contact map prediction, protein distogram prediction, protein real-valued distance prediction, and Quality Assessment/refinement. We also highlight some end-to-end DL-based approaches for protein structure prediction approaches. Additionally, as there have been some recent DL-based advances in protein structure determination using Cryo-Electron (Cryo-EM) microscopy based, we also highlight some of the important progress in the field. Finally, we provide an outlook and possible future research directions for DL-based approaches in the protein structure prediction arena.

2021 ◽  
Vol 22 (11) ◽  
pp. 6032
Donghyuk Suh ◽  
Jai Woo Lee ◽  
Sun Choi ◽  
Yoonji Lee

The new advances in deep learning methods have influenced many aspects of scientific research, including the study of the protein system. The prediction of proteins’ 3D structural components is now heavily dependent on machine learning techniques that interpret how protein sequences and their homology govern the inter-residue contacts and structural organization. Especially, methods employing deep neural networks have had a significant impact on recent CASP13 and CASP14 competition. Here, we explore the recent applications of deep learning methods in the protein structure prediction area. We also look at the potential opportunities for deep learning methods to identify unknown protein structures and functions to be discovered and help guide drug–target interactions. Although significant problems still need to be addressed, we expect these techniques in the near future to play crucial roles in protein structural bioinformatics as well as in drug discovery.

2013 ◽  
Vol 2013 ◽  
pp. 1-15 ◽  
Mahmood A. Rashid ◽  
M. A. Hakim Newton ◽  
Md. Tamjidul Hoque ◽  
Abdul Sattar

Protein structure prediction (PSP) is computationally a very challenging problem. The challenge largely comes from the fact that the energy function that needs to be minimised in order to obtain the native structure of a given protein is not clearly known. A high resolution20×20energy model could better capture the behaviour of the actual energy function than a low resolution energy model such as hydrophobic polar. However, the fine grained details of the high resolution interaction energy matrix are often not very informative for guiding the search. In contrast, a low resolution energy model could effectively bias the search towards certain promising directions. In this paper, we develop a genetic algorithm that mainly uses a high resolution energy model for protein structure evaluation but uses a low resolution HP energy model in focussing the search towards exploring structures that have hydrophobic cores. We experimentally show that this mixing of energy models leads to significant lower energy structures compared to the state-of-the-art results.

Sign in / Sign up

Export Citation Format

Share Document