NEPRE: a Scoring Function for Protein Structures based on Neighbourhood Preference

2018 ◽  
Author(s):  
Siyuan Liu ◽  
Xilun Xiang ◽  
Haiguang Liu

ABSTRACT Protein structure prediction relies on two major components: a method to generate good models that are close to the native structure, and a scoring function that can select the good models. Based on statistics from known structures in the Protein Data Bank, a statistical energy function is derived to reflect amino acid neighbourhood preferences. The neighbourhood of an amino acid is defined by its contacting residues, and the energy function is determined by the neighbouring residue types and their relative positions. A scoring algorithm, Nepre, has been implemented, and its performance was tested with several decoy sets. The results show that the Nepre program can be applied in model ranking to improve the success rate of structure predictions.
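As an illustration of how a statistical energy can be derived from neighbourhood counts, the sketch below applies the inverse Boltzmann relation, a common choice for knowledge-based potentials. The abstract does not give Nepre's actual formula, so the function names, the pseudocount, and the toy contact list are all assumptions, and the relative-position term that Nepre uses is deliberately left out.

```python
import math
from collections import Counter

def pair_energies(observed_pairs, pseudocount=1.0):
    """Return {(centre, neighbour): energy in kT} from observed contacts.

    Hypothetical helper, not the Nepre implementation.
    E(a, b) = -ln( P_obs(a, b) / (P(a) * P(b)) )
    """
    pairs = list(observed_pairs)
    total = len(pairs)
    pair_counts = Counter(pairs)
    centre_counts = Counter(a for a, _ in pairs)
    neighbour_counts = Counter(b for _, b in pairs)
    n_cells = len(centre_counts) * len(neighbour_counts)

    energies = {}
    for a in centre_counts:
        for b in neighbour_counts:
            # The pseudocount keeps unseen pairs from producing log(0).
            p_obs = (pair_counts[(a, b)] + pseudocount) / (total + pseudocount * n_cells)
            # Reference state: residue types pair up independently.
            p_ref = (centre_counts[a] / total) * (neighbour_counts[b] / total)
            energies[(a, b)] = -math.log(p_obs / p_ref)
    return energies

def score_model(contacts, energies):
    """Sum pair energies over a model's contacts; lower means more native-like."""
    return sum(energies.get(pair, 0.0) for pair in contacts)

# Toy "database": hydrophobic and charged pairs are over-represented.
toy_contacts = (
    [("LEU", "VAL")] * 8 + [("LEU", "LYS")] * 2
    + [("ASP", "LYS")] * 8 + [("ASP", "VAL")] * 2
)
toy_energies = pair_energies(toy_contacts)
```

Pairs seen more often than the reference state predicts receive negative energies, so a decoy whose contacts resemble the database statistics scores lower than one that does not.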

2020 ◽  
Author(s):  
Lu-yun Wu ◽  
Xia-yu Xia ◽  
Xian-ming Pan

Abstract Protein structure resolution has lagged far behind sequence determination, as it is often laborious and time-consuming to resolve an individual protein structure – and more often than not even impossible. For computational prediction, owing to the lack of detailed knowledge of the driving forces of folding, how to design an energy function is still an open question. Furthermore, an effective criterion to evaluate the performance of the energy function is also lacking. Here we present a novel knowledge-based energy scoring function that simply considers the interactions of peptide bonds, rather than, as is conventional, the residues or atoms, as the most important energy contribution. This energy scoring was evaluated by selecting the X-ray structure from a large number of possibilities. It not only outperforms the best of the previously published statistical potentials, but also has very low computational expense. In addition, we suggest an alternative criterion to evaluate the performance of the energy scoring function, measured by the template modeling score of the selected rank-one model. We argue that the comparison should allow for some deviation between the X-ray and predicted structures. Collectively, this accurate and simple energy scoring function, together with the optimized criterion, will significantly advance computational protein structure prediction.


2020 ◽  
Author(s):  
Kun Tian ◽  
Xin Zhao ◽  
Xiaogeng Wan ◽  
Stephen Yau

Abstract Background Protein structure can provide insights that help biologists to predict and understand protein functions and interactions. However, the number of known protein structures has not kept pace with the number of protein sequences determined by high-throughput sequencing. Current techniques used to determine the structure of proteins, such as X-ray crystallography and Nuclear Magnetic Resonance (NMR) spectroscopy, are complex and may require a lot of time to analyze the experimental results, especially for large protein molecules. The limitations of these methods have motivated us to create a new approach for protein structure prediction. Results Here we describe a new approach that uses integration and analysis of torsion angle information from the Protein Data Bank to enable prediction of protein structures from amino acid sequences. Our prediction model performed well in comparison with previous methods when applied to the structural classification of two CATH datasets with more than 5000 protein domains. This new prediction model performs well with an average of 92.5% accuracy for structure classification, which is higher than the previous research. We also used our model to predict four known protein structures with a single amino acid sequence, while many other existing methods could only obtain one possible structure for a given sequence. Conclusions The results show that our method provides a new effective and reliable tool for protein structure prediction research.


2020 ◽  
Vol 36 (12) ◽  
pp. 3758-3765 ◽  
Author(s):  
Xiaoqiang Huang ◽  
Robin Pearce ◽  
Yang Zhang

Abstract Motivation Protein structure and function are essentially determined by how the side-chain atoms interact with each other. Thus, accurate protein side-chain packing (PSCP) is a critical step toward protein structure prediction and protein design. Despite the importance of the problem, however, the accuracy and speed of current PSCP programs are still not satisfactory. Results We present FASPR for fast and accurate PSCP by using an optimized scoring function in combination with a deterministic searching algorithm. The performance of FASPR was compared with four state-of-the-art PSCP methods (CISRR, RASP, SCATD and SCWRL4) on both native and non-native protein backbones. For the assessment on native backbones, FASPR achieved a good performance by correctly predicting 69.1% of all the side-chain dihedral angles using a stringent tolerance criterion of 20°, compared favorably with SCWRL4, CISRR, RASP and SCATD which successfully predicted 68.8%, 68.6%, 67.8% and 61.7%, respectively. Additionally, FASPR achieved the highest speed for packing the 379 test protein structures in only 34.3 s, which was significantly faster than the control methods. For the assessment on non-native backbones, FASPR showed an equivalent or better performance on I-TASSER predicted backbones and the backbones perturbed from experimental structures. Detailed analyses showed that the major advantage of FASPR lies in the optimal combination of the dead-end elimination and tree decomposition with a well optimized scoring function, which makes FASPR of practical use for both protein structure modeling and protein design studies. Availability and implementation The web server, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/FASPR and https://github.com/tommyhuangthu/FASPR. Supplementary information Supplementary data are available at Bioinformatics online.
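The accuracy figure quoted above (the share of side-chain dihedral angles predicted within a 20° tolerance) can be sketched as follows. This is a minimal illustration rather than FASPR's evaluation code: the function names are hypothetical, and the treatment of symmetric side chains, which the abstract does not describe, is ignored.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def fraction_correct(predicted, native, tolerance_deg=20.0):
    """Fraction of predicted dihedral angles within tolerance of the native ones."""
    if len(predicted) != len(native):
        raise ValueError("predicted and native must have the same length")
    if not predicted:
        raise ValueError("at least one angle is required")
    hits = sum(
        1 for p, n in zip(predicted, native)
        if angular_difference(p, n) <= tolerance_deg
    )
    return hits / len(predicted)

# -175 and 175 degrees are only 10 degrees apart once periodicity is handled.
example_accuracy = fraction_correct([60.0, -175.0, 100.0], [65.0, 175.0, 150.0])
```

The wrap-around in `angular_difference` matters because dihedral angles are periodic; a plain subtraction would count -175° versus 175° as a 350° error.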


2015 ◽  
Vol 32 (6) ◽  
pp. 843-849 ◽  
Author(s):  
Rhys Heffernan ◽  
Abdollah Dehzangi ◽  
James Lyons ◽  
Kuldip Paliwal ◽  
Alok Sharma ◽  
...  

Abstract Motivation: Solvent exposure of amino acid residues of proteins plays an important role in understanding and predicting protein structure, function and interactions. Solvent exposure can be characterized by several measures including solvent accessible surface area (ASA), residue depth (RD) and contact numbers (CN). More recently, an orientation-dependent contact number called half-sphere exposure (HSE) was introduced by separating the contacts within upper and down half spheres defined according to the Cα-Cβ (HSEβ) vector or neighboring Cα-Cα vectors (HSEα). HSEα calculated from protein structures was found to better describe the solvent exposure over ASA, CN and RD in many applications. Thus, a sequence-based prediction is desirable, as most proteins do not have experimentally determined structures. To our best knowledge, there is no method to predict HSEα and only one method to predict HSEβ. Results: This study developed a novel method for predicting both HSEα and HSEβ (SPIDER-HSE) that achieved a consistent performance for 10-fold cross validation and two independent tests. The correlation coefficients between predicted and measured HSEβ (0.73 for upper sphere, 0.69 for down sphere and 0.76 for contact numbers) for the independent test set of 1199 proteins are significantly higher than existing methods. Moreover, predicted HSEα has a higher correlation coefficient (0.46) to the stability change by residue mutants than predicted HSEβ (0.37) and ASA (0.43). The results, together with its easy Cα-atom-based calculation, highlight the potential usefulness of predicted HSEα for protein structure prediction and refinement as well as function prediction. Availability and implementation: The method is available at http://sparks-lab.org. Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
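To make the HSEβ definition concrete, the sketch below counts the Cα atoms around one residue and splits them into the half-sphere that contains the Cα-Cβ vector (up) and the opposite one (down). It computes the measure from known coordinates and is not the SPIDER-HSE predictor; the function name, the 13 Å default radius, and the toy coordinates are assumptions.

```python
def half_sphere_exposure(ca_coords, cb_coords, index, radius=13.0):
    """Return (up, down) C-alpha counts around residue `index`.

    ca_coords, cb_coords: lists of (x, y, z) tuples in angstroms.
    `radius` is an assumed cutoff, not a value taken from the abstract.
    """
    ca = ca_coords[index]
    cb = cb_coords[index]
    # Direction of the side chain: the C-alpha to C-beta vector.
    direction = tuple(b - a for a, b in zip(ca, cb))
    up = down = 0
    for j, other in enumerate(ca_coords):
        if j == index:
            continue
        offset = tuple(o - a for a, o in zip(ca, other))
        if sum(c * c for c in offset) > radius * radius:
            continue  # outside the sphere
        # The sign of the dot product decides which half-sphere the atom is in.
        if sum(d * c for d, c in zip(direction, offset)) >= 0:
            up += 1
        else:
            down += 1
    return up, down

# Toy coordinates: residue 0 sits at the origin with its side chain along +z.
toy_ca = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0), (0.0, 0.0, -5.0),
          (3.0, 0.0, 4.0), (0.0, 0.0, 20.0)]
toy_cb = [(0.0, 0.0, 1.5)] + [(0.0, 0.0, 0.0)] * 4
toy_up, toy_down = half_sphere_exposure(toy_ca, toy_cb, 0)
```

The contact number (CN) mentioned in the abstract is the sum of the two counts, so HSE can be read as an orientation-aware split of CN.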


Author(s):  
Jiaxi Liu ◽  

The prediction of protein three-dimensional structure from amino acid sequence has been a challenging problem in bioinformatics, owing to the many potential applications for robust protein structure prediction methods. Protein structure prediction is essential to bioscience, and its research results are important for other research areas. Methods for the prediction and design of protein structures have advanced dramatically. The prediction of protein structure based on average hydrophobic values is discussed, and an improved genetic algorithm is proposed to solve the optimization problem of hydrophobic protein structure prediction. An adjustment operator is designed with the average hydrophobic value to prevent the overlapping of amino acid positions. Finally, some numerical experiments are conducted to verify the feasibility and effectiveness of the proposed algorithm by comparison with the traditional HNN algorithm.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Kun Tian ◽  
Xin Zhao ◽  
Xiaogeng Wan ◽  
Stephen S.-T. Yau

Abstract Protein structure can provide insights that help biologists to predict and understand protein functions and interactions. However, the number of known protein structures has not kept pace with the number of protein sequences determined by high-throughput sequencing. Current techniques used to determine the structure of proteins are complex and require a lot of time to analyze the experimental results, especially for large protein molecules. The limitations of these methods have motivated us to create a new approach for protein structure prediction. Here we describe a new approach to predict protein structures and structure classes from amino acid sequences. Our prediction model performs well in comparison with previous methods when applied to the structural classification of two CATH datasets with more than 5000 protein domains. The average accuracy is 92.5% for structure classification, which is higher than that of previous research. We also used our model to predict four known protein structures with a single amino acid sequence, while many other existing methods could only obtain one possible structure for a given sequence. The results show that our method provides a new effective and reliable tool for protein structure prediction research.


Author(s):  
CHANDRAYANI N. ROKDE ◽  
DR.MANALI KSHIRSAGAR

Protein structure prediction (PSP) from amino acid sequence is one of the high-focus problems in bioinformatics today. This is because the biological function of a protein is determined by its three-dimensional structure. Understanding protein structures is vital for determining the function of a protein and its interactions with DNA, RNA and enzymes. Thus, protein structure prediction is a fundamental area of computational biology. Its importance is heightened by the large amounts of sequence data coming from PDB (Protein Data Bank) and by the fact that the experimental methods used to determine protein structures, such as X-ray crystallography or Nuclear Magnetic Resonance (NMR), remain very expensive and time-consuming. In this paper, different types of protein structures and methods for their prediction are described.


2009 ◽  
Vol 19 (2) ◽  
pp. 217-226
Author(s):  
S. M. Minhaz Ud-Dean ◽  
Mahdi Muhammad Moosa

Protein structure prediction and evaluation is one of the major fields of computational biology. Estimation of dihedral angle can provide information about the acceptability of both theoretically predicted and experimentally determined structures. Here we report on the sequence specific dihedral angle distribution of high resolution protein structures available in PDB and have developed Sasichandran, a tool for sequence specific dihedral angle prediction and structure evaluation. This tool will allow evaluation of a protein structure in pdb format from the sequence specific distribution of Ramachandran angles. Additionally, it will allow retrieval of the most probable Ramachandran angles for a given sequence along with the sequence specific data. Key words: Torsion angle, φ-ψ distribution, sequence specific ramachandran plot, Ramasekharan, protein structure appraisal D.O.I. 10.3329/ptcb.v19i2.5439 Plant Tissue Cult. & Biotech. 19(2): 217-226, 2009 (December)
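For readers unfamiliar with how a dihedral (torsion) angle such as φ or ψ is obtained from a structure, here is a minimal sketch that computes it from four atomic positions using the standard atan2 formulation. It is not taken from the tool described above; the function names and test coordinates are illustrative assumptions.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180 to 180) defined by four points.

    For phi, the four atoms are C(i-1), N(i), CA(i), C(i);
    for psi, they are N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)  # central bond, the rotation axis
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    b1_hat = tuple(c / b1_len for c in b1)
    x = _dot(n1, n2)
    y = _dot(b1_hat, _cross(n1, n2))
    return math.degrees(math.atan2(y, x))

# Central bond along +z; the last atom is rotated a quarter turn from the first.
quarter_turn = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Using `atan2` rather than `acos` keeps the sign of the angle, which is needed to tell apart regions of the Ramachandran plot that differ only in handedness.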


2014 ◽  
Vol 70 (a1) ◽  
pp. C491-C491
Author(s):  
Jürgen Haas ◽  
Alessandro Barbato ◽  
Tobias Schmidt ◽  
Steven Roth ◽  
Andrew Waterhouse ◽  
...  

Computational modeling and prediction of three-dimensional macromolecular structures and complexes from their sequence has been a long-standing goal in structural biology. Over the last two decades, a paradigm shift has occurred: starting from a large "knowledge gap" between the huge number of protein sequences and the small number of experimentally known structures, today some form of structural information – either experimental or computational – is available for the majority of amino acids encoded by common model organism genomes. Methods for structure modeling and prediction have made substantial progress over the last decades, and template-based homology modeling techniques have matured to a point where they are now routinely used to complement experimental techniques. However, computational modeling and prediction techniques often fall short in accuracy compared to high-resolution experimental structures, and it is often difficult to convey the expected accuracy and structural variability of a specific model. Retrospectively assessing the quality of blind structure prediction in comparison to experimental reference structures allows benchmarking the state of the art in structure prediction and identifying areas which need further development. The Critical Assessment of Structure Prediction (CASP) experiment has for the last 20 years assessed the progress in the field of protein structure modeling based on predictions for ca. 100 blind prediction targets per experiment, which are carefully evaluated by human experts. The "Continuous Model EvaluatiOn" (CAMEO) project aims to provide a fully automated blind assessment for prediction servers based on weekly pre-released sequences of the Protein Data Bank (PDB). CAMEO has been made possible by the development of novel scoring methods such as lDDT, which are robust against domain movements to allow for automated continuous structure comparison without human intervention.


2019 ◽  
Vol 35 (22) ◽  
pp. 4862-4865 ◽  
Author(s):  
Mohammed AlQuraishi

Abstract Summary: Computational prediction of protein structure from sequence is broadly viewed as a foundational problem of biochemistry and one of the most difficult challenges in bioinformatics. Once every two years the Critical Assessment of protein Structure Prediction (CASP) experiments are held to assess the state of the art in the field in a blind fashion, by presenting predictor groups with protein sequences whose structures have been solved but have not yet been made publicly available. The first CASP was organized in 1994, and the latest, CASP13, took place last December, when for the first time the industrial laboratory DeepMind entered the competition. DeepMind's entry, AlphaFold, placed first in the Free Modeling (FM) category, which assesses methods on their ability to predict novel protein folds (the Zhang group placed first in the Template-Based Modeling (TBM) category, which assesses methods on predicting proteins whose folds are related to ones already in the Protein Data Bank). DeepMind's success generated significant public interest. Their approach builds on two ideas developed in the academic community during the preceding decade: (i) the use of co-evolutionary analysis to map residue co-variation in protein sequence to physical contact in protein structure, and (ii) the application of deep neural networks to robustly identify patterns in protein sequence and co-evolutionary couplings and convert them into contact maps. In this Letter, we contextualize the significance of DeepMind's entry within the broader history of CASP, relate AlphaFold's methodological advances to prior work, and speculate on the future of this important problem.

