A novel score for highly accurate and efficient prediction of native protein structures

Abstract: Protein structure resolution has lagged far behind sequence determination, as resolving an individual protein structure is often laborious and time-consuming, and in many cases impossible. For computational prediction, how to design an energy function is still an open question, owing to the lack of detailed knowledge of the driving forces of folding. Furthermore, an effective criterion for evaluating the performance of an energy function is also lacking. Here we present a novel knowledge-based energy scoring function that treats the interactions of peptide bonds, rather than those of residues or atoms as is conventional, as the most important energy contribution. The scoring function was evaluated by its ability to select the X-ray structure from a large number of candidate structures. It not only outperforms the best of the previously published statistical potentials but also has very low computational expense. In addition, we suggest an alternative criterion for evaluating the performance of an energy scoring function: the template modeling score of the rank-one selected structure. We argue that the comparison should allow for some deviation between the X-ray and predicted structures. Together, this accurate and simple energy scoring function and the optimized criterion will significantly advance computational protein structure prediction.
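
The abstract proposes judging a scoring function by the template modeling score (TM-score) of its rank-one pick rather than by exact recovery of the X-ray structure. As a rough illustration of why this tolerates small deviations, the sketch below evaluates the standard TM-score formula for a single, already superposed model. The function name `tm_score` and its inputs are hypothetical, and the full TM-score additionally maximizes this sum over superpositions, which is not done here.

```python
def tm_score(distances, target_length):
    """TM-score for one fixed superposition (illustrative sketch only).

    distances: distances in angstroms between aligned C-alpha pairs.
    target_length: number of residues in the reference (native) structure.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    # Length-dependent distance scale d0 from the TM-score definition.
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    # Guard against very small scales for short chains (assumption of this sketch).
    d0 = max(d0, 0.5)
    # Each aligned pair contributes between 0 and 1; unaligned residues contribute 0.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

A model that matches the native structure exactly scores 1.0, while small per-residue deviations lower the score smoothly instead of counting as outright failures.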

NEPRE: a Scoring Function for Protein Structures based on Neighbourhood Preference

10.1101/463554 ◽

2018 ◽

Author(s):

Siyuan Liu ◽

Xilun Xiang ◽

Haiguang Liu

Keyword(s):

Protein Structure ◽

Amino Acid ◽

Energy Function ◽

Structure Prediction ◽

Protein Structures ◽

Scoring Function ◽

Data Bank ◽

Native Structure ◽

Scoring Algorithm ◽

Model Ranking

Abstract: Protein structure prediction relies on two major components: a method to generate good models that are close to the native structure, and a scoring function that can select the good models. Based on statistics from known structures in the Protein Data Bank, a statistical energy function is derived to reflect amino acid neighbourhood preferences. The neighbourhood of an amino acid is defined by its contacting residues, and the energy function is determined by the neighbouring residue types and their relative positions. A scoring algorithm, Nepre, has been implemented and its performance was tested with several decoy sets. The results show that the Nepre program can be applied in model ranking to improve the success rate of structure predictions.
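
To make the idea of a statistical energy derived from neighbourhood preferences concrete, here is a minimal sketch of a generic inverse-Boltzmann pair potential. It is not the Nepre implementation: it uses only residue types (Nepre also uses relative positions), and the function names, the pseudocount, and the toy data are assumptions introduced for illustration.

```python
import math
from collections import Counter

def derive_pair_energies(contacts, pseudocount=1.0):
    """Derive energies from observed (centre_type, neighbour_type) contacts.

    energy = -ln(observed / expected), where the expected count assumes
    that centre and neighbour types pair up independently.
    """
    contacts = list(contacts)
    total = len(contacts)
    pair_counts = Counter(contacts)
    centre_counts = Counter(a for a, _ in contacts)
    neighbour_counts = Counter(b for _, b in contacts)
    energies = {}
    for a in centre_counts:
        for b in neighbour_counts:
            expected = centre_counts[a] * neighbour_counts[b] / total
            observed = pair_counts[(a, b)]
            # The pseudocount keeps the logarithm finite for unseen pairs.
            energies[(a, b)] = -math.log(
                (observed + pseudocount) / (expected + pseudocount)
            )
    return energies

def score_model(model_contacts, energies, default=0.0):
    """Sum the pair energies over a model's contacts; lower is better."""
    return sum(energies.get(pair, default) for pair in model_contacts)

# Toy training data (hypothetical): L prefers V neighbours, K prefers K.
training = ([("L", "V")] * 8 + [("L", "K")] * 2
            + [("K", "V")] * 2 + [("K", "K")] * 8)
energies = derive_pair_energies(training)
```

Ranking decoys then amounts to computing `score_model` for each candidate and sorting in ascending order.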

FASPR: an open-source tool for fast and accurate protein side-chain packing

Bioinformatics ◽

10.1093/bioinformatics/btaa234 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3758-3765 ◽

Cited By ~ 6

Author(s):

Xiaoqiang Huang ◽

Robin Pearce ◽

Yang Zhang

Keyword(s):

Protein Structure ◽

Protein Design ◽

Structure Prediction ◽

Protein Structures ◽

Scoring Function ◽

Supplementary Information ◽

Side Chain ◽

Chain Packing ◽

And Function ◽

Side Chain Packing

Abstract. Motivation: Protein structure and function are essentially determined by how the side-chain atoms interact with each other. Thus, accurate protein side-chain packing (PSCP) is a critical step toward protein structure prediction and protein design. Despite the importance of the problem, however, the accuracy and speed of current PSCP programs are still not satisfactory. Results: We present FASPR for fast and accurate PSCP by using an optimized scoring function in combination with a deterministic searching algorithm. The performance of FASPR was compared with four state-of-the-art PSCP methods (CISRR, RASP, SCATD and SCWRL4) on both native and non-native protein backbones. For the assessment on native backbones, FASPR achieved a good performance by correctly predicting 69.1% of all the side-chain dihedral angles using a stringent tolerance criterion of 20°, comparing favorably with SCWRL4, CISRR, RASP and SCATD, which successfully predicted 68.8%, 68.6%, 67.8% and 61.7%, respectively. Additionally, FASPR achieved the highest speed, packing the 379 test protein structures in only 34.3 s, which was significantly faster than the control methods. For the assessment on non-native backbones, FASPR showed an equivalent or better performance on I-TASSER predicted backbones and the backbones perturbed from experimental structures. Detailed analyses showed that the major advantage of FASPR lies in the optimal combination of dead-end elimination and tree decomposition with a well-optimized scoring function, which makes FASPR of practical use for both protein structure modeling and protein design studies. Availability and implementation: The web server, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/FASPR and https://github.com/tommyhuangthu/FASPR. Supplementary information: Supplementary data are available at Bioinformatics online.
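
The accuracy figures above count a side-chain dihedral angle as correct when it falls within 20° of the native value. A minimal sketch of such a check follows; the function names are hypothetical, the inclusive `<=` comparison is an assumption, and special handling of symmetric side chains is deliberately left out. The one subtle step is that angles wrap around at ±180°.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)

def fraction_within_tolerance(predicted, native, tolerance=20.0):
    """Fraction of predicted dihedral angles within `tolerance` of native."""
    if len(predicted) != len(native):
        raise ValueError("angle lists must have the same length")
    if not predicted:
        raise ValueError("no angles to compare")
    hits = sum(
        1 for p, n in zip(predicted, native)
        if angular_difference(p, n) <= tolerance
    )
    return hits / len(predicted)
```

For example, 179° and -179° differ by only 2° once wrap-around is taken into account, so that pair counts as a correct prediction.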

Protein structure determination using chemical shifts

10.7287/peerj.preprints.374v1 ◽

2014 ◽

Author(s):

Anders S Christensen

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Energy Function ◽

Chemical Shifts ◽

Protein A ◽

Protein Structures ◽

Protein Structure Determination ◽

Coarse Grained ◽

X Ray ◽

Small Proteins

In this thesis, a method for protein structure determination using chemical shifts is presented. The method is implemented in the open source PHAISTOS protein simulation framework. It combines sampling from a generative model with a coarse-grained force field and an energy function that includes chemical shifts. The method is benchmarked on folding simulations of five small proteins. In four cases the resulting structures are in excellent agreement with experimental data; the fifth case fails, likely due to inaccuracies in the energy function. For the Chymotrypsin Inhibitor protein, a structure is determined using only chemical shifts recorded and assigned through automated processes. The CA-RMSD to the experimental X-ray structure is 1.1 Å. Additionally, the method is combined with very sparse NOE restraints and evolutionary distance restraints and tested on several protein structures of more than 100 residues. For Rhodopsin (225 residues) a structure is found at 2.5 Å CA-RMSD from the experimental X-ray structure, and a structure is determined for the Savinase protein (269 residues) with 2.9 Å CA-RMSD from the experimental X-ray structure.

Protein structure determination using chemical shifts

10.7287/peerj.preprints.374 ◽

2014 ◽

Author(s):

Anders S Christensen

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Energy Function ◽

Chemical Shifts ◽

Protein A ◽

Protein Structures ◽

Protein Structure Determination ◽

Coarse Grained ◽

X Ray ◽

Small Proteins

Sequence Specific Dihedral Angle Distribution: Application in Protein Structure Prediction and Evaluation

Plant Tissue Culture and Biotechnology ◽

10.3329/ptcb.v19i2.5439 ◽

2009 ◽

Vol 19 (2) ◽

pp. 217-226

Author(s):

S. M. Minhaz Ud-Dean ◽

Mahdi Muhammad Moosa

Keyword(s):

Protein Structure ◽

Dihedral Angle ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Protein Structures ◽

Angle Distribution ◽

Ramachandran Plot ◽

Specific Data ◽

Specific Distribution ◽

Structure Evaluation

Protein structure prediction and evaluation is one of the major fields of computational biology. Estimates of dihedral angles can provide information about the acceptability of both theoretically predicted and experimentally determined structures. Here we report on the sequence-specific dihedral angle distribution of high-resolution protein structures available in the PDB and have developed Sasichandran, a tool for sequence-specific dihedral angle prediction and structure evaluation. This tool allows evaluation of a protein structure in PDB format against the sequence-specific distribution of Ramachandran angles. Additionally, it allows retrieval of the most probable Ramachandran angles for a given sequence along with the sequence-specific data. Key words: Torsion angle, φ-ψ distribution, sequence specific Ramachandran plot, Ramasekharan, protein structure appraisal. D.O.I. 10.3329/ptcb.v19i2.5439. Plant Tissue Cult. & Biotech. 19(2): 217-226, 2009 (December)
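
As an illustration of the quantities involved, the sketch below computes a dihedral angle from four atom positions and looks up the most populated φ-ψ bin for a residue type. It is not the Sasichandran tool: the function names, the 30° bin width, and the input format are assumptions. For φ the four atoms are C(i-1), N, CA, C; for ψ they are N, CA, C, N(i+1).

```python
import math
from collections import Counter

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def most_probable_bin(observations, residue_type, bin_width=30.0):
    """Centre (phi, psi) of the most populated bin for one residue type.

    observations: iterable of (residue_type, phi, psi) in degrees.
    Returns None when the residue type was never observed.
    """
    n_bins = int(360 / bin_width)
    counts = Counter()
    for res, phi, psi in observations:
        if res == residue_type:
            i = math.floor((phi + 180.0) / bin_width) % n_bins
            j = math.floor((psi + 180.0) / bin_width) % n_bins
            counts[(i, j)] += 1
    if not counts:
        return None
    (i, j), _ = counts.most_common(1)[0]
    return (-180.0 + (i + 0.5) * bin_width, -180.0 + (j + 0.5) * bin_width)
```

Keying the histogram on a longer sequence context instead of a single residue type would be closer to a sequence-specific distribution; a single type is used here only to keep the sketch short.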

AlphaFold at CASP13

Bioinformatics ◽

10.1093/bioinformatics/btz422 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4862-4865 ◽

Cited By ~ 48

Author(s):

Mohammed AlQuraishi

Keyword(s):

Protein Structure ◽

Protein Sequence ◽

Structure Prediction ◽

Computational Prediction ◽

Data Bank ◽

Academic Community ◽

Physical Contact ◽

Evolutionary Analysis ◽

History Of ◽

First Time

Abstract. Summary: Computational prediction of protein structure from sequence is broadly viewed as a foundational problem of biochemistry and one of the most difficult challenges in bioinformatics. Once every two years the Critical Assessment of protein Structure Prediction (CASP) experiments are held to assess the state of the art in the field in a blind fashion, by presenting predictor groups with protein sequences whose structures have been solved but have not yet been made publicly available. The first CASP was organized in 1994, and the latest, CASP13, took place last December, when for the first time the industrial laboratory DeepMind entered the competition. DeepMind's entry, AlphaFold, placed first in the Free Modeling (FM) category, which assesses methods on their ability to predict novel protein folds (the Zhang group placed first in the Template-Based Modeling (TBM) category, which assesses methods on predicting proteins whose folds are related to ones already in the Protein Data Bank). DeepMind's success generated significant public interest. Their approach builds on two ideas developed in the academic community during the preceding decade: (i) the use of co-evolutionary analysis to map residue co-variation in protein sequence to physical contact in protein structure, and (ii) the application of deep neural networks to robustly identify patterns in protein sequence and co-evolutionary couplings and convert them into contact maps. In this Letter, we contextualize the significance of DeepMind's entry within the broader history of CASP, relate AlphaFold's methodological advances to prior work, and speculate on the future of this important problem.

Prediction of Structural and Functional Aspects of Protein

Advances in Secure Computing, Internet Services, and Applications - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-4666-4940-8.ch016 ◽

2014 ◽

pp. 317-333

Author(s):

Arun G. Ingale

Keyword(s):

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Tertiary Structure ◽

Protein Structures ◽

Three Dimensional ◽

Dimensional Structure ◽

Sequence Information ◽

Predict Protein Structure ◽

Basic Ideas

Predicting the structure of a protein from its primary amino acid sequence is computationally difficult. An investigation of the methods and algorithms used to predict protein structure and a thorough knowledge of the function and structure of proteins are critical for the advancement of biology and the life sciences as well as the development of better drugs, higher-yield crops, and even synthetic bio-fuels. To that end, this chapter sheds light on the methods used for protein structure prediction. It covers the applications of modeled protein structures and examines the relationship between pure sequence information and three-dimensional structure, which continues to be one of the greatest challenges in molecular biology. It presents an all-encompassing examination of the problems, methods, tools, servers, databases, and applications of protein structure prediction, giving unique insight into the future applications of the modeled protein structures. In this chapter, current protein structure prediction methods are reviewed for a milieu on structure prediction, the prediction of structural fundamentals, tertiary structure prediction, and functional imminent. The basic ideas and advances of these directions are discussed in detail.

Protein Structure Determination in Living Cells

International Journal of Molecular Sciences ◽

10.3390/ijms20102442 ◽

2019 ◽

Vol 20 (10) ◽

pp. 2442 ◽

Cited By ~ 2

Author(s):

Teppei Ikeya ◽

Peter Güntert ◽

Yutaka Ito

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Structure Prediction ◽

Structural Information ◽

Nuclear Overhauser Effect ◽

Protein Structures ◽

Three Dimensional ◽

Structural Data ◽

Sample Tube ◽

In Cells

To date, in-cell NMR has elucidated various aspects of protein behaviour by associating structures in physiological conditions. However, most current studies using this method have deduced protein states in cells exclusively from ‘indirect’ structural information, such as peak patterns and chemical shift changes, rather than from ‘direct’ data that explicitly include interatomic distances and angles. To fully understand the functions and physical properties of proteins inside cells, it is indispensable to obtain explicit structural data or determine three-dimensional (3D) structures of proteins in cells. Whilst the short lifetime of cells in a sample tube, low sample concentrations, and massive background signals make it difficult to observe NMR signals from proteins inside cells, several methodological advances help to overcome these problems. Paramagnetic effects have an outstanding potential for in-cell structural analysis. The combination of a limited amount of experimental in-cell data with software for ab initio protein structure prediction opens an avenue to visualise 3D protein structures inside cells. Conventional nuclear Overhauser effect spectroscopy (NOESY)-based structure determination is advantageous for elucidating the conformations of side-chain atoms of proteins as well as global structures. In this article, we review current progress in the structure analysis of proteins in living systems and discuss the feasibility of future work.
