A New Clustering and Nomenclature for Beta Turns Derived from High-Resolution Protein Structures

2018 ◽  
Author(s):  
Maxim Shapovalov ◽  
Slobodan Vucetic ◽  
Roland L. Dunbrack

Abstract: Protein loops connect regular secondary structures and contain 4-residue beta turns, which represent 63% of the residues in loops. The commonly used classification of beta turns (Type I, I’, II, II’, VIa1, VIa2, VIb, and VIII) was developed in the 1970s and 1980s from analysis of a small number of proteins of average resolution, and represents only two thirds of beta turns observed in proteins (with a generic class Type IV representing the rest). We present a new clustering of beta turn conformations from a set of 13,030 turns from 1078 ultra-high resolution protein structures (≤1.2 Å). Our clustering is derived from applying the DBSCAN and k-medoids algorithms to this data set with a metric commonly used in directional statistics applied to the set of dihedral angles from the second and third residues of each turn. We define 18 turn types compared to the 8 classical turn types in common use. We propose a new 2-letter nomenclature for all 18 beta-turn types using Ramachandran region names for the two central residues (e.g., ‘A’ and ‘D’ for alpha regions on the left side of the Ramachandran map and ‘a’ and ‘d’ for equivalent regions on the right-hand side; classical Type I turns are ‘AD’ turns and Type I’ turns are ‘ad’). We identify 11 new types of beta turn, 5 of which are sub-types of classical beta turn types. Up-to-date statistics, probability densities of conformations, and sequence profiles of beta turns in loops were collected and analyzed. A library of turn types, BetaTurnLib18, and cross-platform software, BetaTurnTool18, which identifies turns in an input protein structure, are freely available and redistributable from dunbrack.fccc.edu/betaturn and github.com/sh-maxim/BetaTurn18. Given the ubiquitous nature of beta turns, this comprehensive study updates understanding of beta turns and should also provide useful tools for protein structure determination, refinement, and prediction programs.
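The abstract above mentions a directional-statistics metric on dihedral angles without giving its form. As a minimal sketch, assuming the chord-style distance 2(1 − cos Δθ) averaged over the angles of a turn (a common choice for angular data, not confirmed here as the authors' exact formula), the distance between two turns could be computed as follows. The function names and the choice of which angles go into each turn are hypothetical.

```python
import math

def dihedral_distance(turn_a_deg, turn_b_deg):
    """Mean of 2 * (1 - cos(delta)) over paired dihedral angles (degrees).

    Assumed form of the angular metric; it is periodic, so -180 and 180
    are treated as the same angle.
    """
    if len(turn_a_deg) != len(turn_b_deg):
        raise ValueError("both turns need the same number of dihedral angles")
    terms = [
        2.0 * (1.0 - math.cos(math.radians(a - b)))
        for a, b in zip(turn_a_deg, turn_b_deg)
    ]
    return sum(terms) / len(terms)

def pairwise_distance_matrix(turns_deg):
    """Symmetric matrix of distances between all turns in a list."""
    return [[dihedral_distance(a, b) for b in turns_deg] for a in turns_deg]
```

A matrix built this way could be handed to any clustering routine that accepts precomputed distances, for example scikit-learn's `DBSCAN` with `metric="precomputed"`.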


2019 ◽  
Author(s):  
Xian Wei ◽  
Zhicheng Li ◽  
Shijian Li ◽  
Xubiao Peng ◽  
Qing Zhao

Abstract: Protein structure determination by nuclear magnetic resonance (NMR) is one of the most extensively studied problems due to its increasing importance in biological function analysis. We adopt a novel method, based on one of the matrix completion (MC) techniques, the Riemannian approach, to solve the protein structure determination problem. We formulate the protein structure in terms of a low-rank matrix, which can be solved as an optimization problem on the Riemannian spectrahedron manifold whose objective function has been delimited with the derived boundary condition. Two efficient algorithms in the Riemannian approach, the trust-region (Tr) algorithm and the conjugate gradient (Cg) algorithm, are used to reconstruct protein structures. We first use the two algorithms in a toy model and show that the Tr algorithm is more robust. Afterwards, we rebuild the protein structure from the NOE distance information deposited in the NMR Restraints Grid (http://restraintsgrid.bmrb.wisc.edu/NRG/MRGridServlet). A dataset with both X-ray crystallographic structures and NMR structures deposited in the Protein Data Bank (PDB) is used to statistically evaluate the performance of our method. By comparing both our rebuilt structures and their NMR counterparts with the “standard” X-ray structures, we conclude that our rebuilt structures have similar (sometimes even smaller) RMSDs relative to the “standard” X-ray structures in contrast with the reference NMR structures. We also validate our method by comparing the Z-scores of our rebuilt structures with those of the reference structures using the Protein Structure Validation Software suite. All the validation scores indicate that the Riemannian approach in MC techniques is valid for reconstructing protein structures from NOE distance information. The software based on the Riemannian approach is freely available at https://github.com/xubiaopeng/Protein_Recon_MCRiemman.

Author summary: Matrix completion is a technique widely used in many areas, such as global positioning in sensor networks, collaborative filtering in recommendation systems, and face recognition. In biology, distance geometry used to be a popular method for reconstructing protein structures from NMR experiments. However, due to the low quality of the reconstructed results, those methods were replaced by other dynamic methods such as ARIA, CYANA and UNIO. Recently, a new MC technique named the Riemannian approach was introduced and proved mathematically, which prompts us to apply it to protein structure determination from NMR measurements. In this paper, by combining the Riemannian approach with some post-processing procedures, we reconstruct protein structures from the incomplete distance information measured by NMR. By evaluating our results and comparing them with the corresponding PDB NMR deposits, we show that the current Riemannian approach is valid and at least comparable with (if not better than) the state-of-the-art methods in NMR structure determination.
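To illustrate why a protein structure can be posed as a low-rank matrix problem, the sketch below converts a complete matrix of squared inter-atomic distances into a Gram matrix by double centering, the standard identity from classical multidimensional scaling. For points in three dimensions this Gram matrix has rank at most 3. It is not the authors' Riemannian algorithm and it does not handle missing entries; the function names are hypothetical.

```python
def squared_distance_matrix(points):
    """Squared Euclidean distances between all pairs of 3D points."""
    return [
        [sum((p[k] - q[k]) ** 2 for k in range(3)) for q in points]
        for p in points
    ]

def gram_from_sq_distances(d2):
    """Double centering: G_ij = -1/2 * (d2_ij - mean_i - mean_j + mean_all).

    The result equals the matrix of dot products between the coordinates
    after their centroid has been moved to the origin.
    """
    n = len(d2)
    row_mean = [sum(row) / n for row in d2]  # equals column means (symmetric)
    total_mean = sum(row_mean) / n
    return [
        [-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
         for j in range(n)]
        for i in range(n)
    ]
```

Matrix completion methods aim to recover such a matrix when only some of the distances, for example those derived from NOE restraints, are known.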



2008 ◽  
Vol 2 (1) ◽  
pp. 37-49 ◽  
Author(s):  
Kevin Campbell ◽  
Lukasz Kurgan

Development of accurate β-turn (beta-turn) type prediction methods would contribute towards the prediction of tertiary protein structure and would provide useful insights and inputs for fold recognition and drug design. Only one existing sequence-only method is available for the prediction of beta-turn types (for type I and II) for entire protein chains, while the proposed method allows for prediction of type I, II, IV, VII, and non-specific (NS) beta-turns, filling in the gap. The proposed predictor, which is based solely on protein sequence, is shown to provide similar performance to other sequence-only methods for prediction of beta-turns and beta-turn types. The main advantage of the proposed method is the simplicity and interpretability of the underlying model. We developed novel sequence-based features that allow identifying beta-turn types and differentiating them from non-beta-turns. The features, which are based on tetrapeptides (entire beta-turns) rather than a window centered over the predicted residues as in the case of recent competing methods, provide a more biologically sound model. They include 12 features based on collocation of amino acid pairs, focusing on amino acids (Gly, Asp, and Asn) that are known to be predisposed to form beta-turns. At the same time, our model also includes features that are geared towards exclusion of non-beta-turns, which are based on amino acids known to be strongly detrimental to formation of beta-turns (Met, Ile, Leu, and Val).
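The tetrapeptide-based idea described above can be sketched as a sliding window of four residues in which turn-favoring and turn-disfavoring amino acids are counted. This is a simplified illustration only: the abstract does not define the 12 collocation features, so the two counts and the function name below are hypothetical stand-ins.

```python
# Residue groups named in the abstract (one-letter codes).
TURN_FAVORING = set("GDN")      # Gly, Asp, Asn
TURN_DISFAVORING = set("MILV")  # Met, Ile, Leu, Val

def tetrapeptide_features(sequence):
    """Return (start, tetrapeptide, n_favoring, n_disfavoring) per window."""
    seq = sequence.upper()
    features = []
    # Slide a window of exactly four residues along the chain.
    for start in range(len(seq) - 3):
        window = seq[start:start + 4]
        favoring = sum(res in TURN_FAVORING for res in window)
        disfavoring = sum(res in TURN_DISFAVORING for res in window)
        features.append((start, window, favoring, disfavoring))
    return features
```

Sequences shorter than four residues yield an empty list, since they contain no complete tetrapeptide.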



Structure ◽  
2008 ◽  
Vol 16 (2) ◽  
pp. 181-195 ◽  
Author(s):  
Nathan Alexander ◽  
Ahmad Al-Mestarihi ◽  
Marco Bortolus ◽  
Hassane Mchaourab ◽  
Jens Meiler


2010 ◽  
Vol 43 (1) ◽  
pp. 65-158 ◽  
Author(s):  
Kutti R. Vinothkumar ◽  
Richard Henderson

Abstract: In reviewing the structures of membrane proteins determined up to the end of 2009, we present in words and pictures the most informative examples from each family. We group the structures together according to their function and architecture to provide an overview of the major principles and variations on the most common themes. The first structures, determined 20 years ago, were those of naturally abundant proteins with limited conformational variability, and each membrane protein structure determined was a major landmark. With the advent of complete genome sequences and efficient expression systems, there has been an explosion in the rate of membrane protein structure determination, with many classes represented. New structures are published every month and more than 150 unique membrane protein structures have been determined. This review analyses the reasons for this success, discusses the challenges that still lie ahead, and presents a concise summary of the key achievements with illustrated examples selected from each class.



2021 ◽  
Vol 11 (Suppl_1) ◽  
pp. S13-S13
Author(s):  
Valery Novoseletsky ◽  
Mikhail Lozhnikov ◽  
Grigoriy Armeev ◽  
Aleksandr Kudriavtsev ◽  
Alexey Shaytan ◽  
...  

Background: Protein structure determination using X-ray free-electron lasers (XFELs) involves analyzing and merging a large number of snapshot diffraction patterns. Convolutional neural networks (CNNs) are widely used to solve computer vision problems such as image classification, and can be used for diffraction pattern analysis. However, protein structure determination using CNNs alone has not yet been solved. Methods: We simulated the diffraction patterns using the Condor software library and obtained more than 1000 diffraction patterns for each structure, with simulation parameters resembling those of real experiments. To classify diffraction patterns, we tried two approaches that are widely known in the area of image classification: a classic VGG network and residual networks. Results: 1. Recognition of a protein class (GPCRs vs globins). Globins and GPCR-like proteins are typical α-helical proteins. Each of these protein families has a large number of representatives (including those with known structure), but we used only 8 structures from each family. 12,000 diffraction patterns were used for training and 4,000 patterns for testing. Results indicate that all considered networks are able to recognize the protein family type with high accuracy. 2. Recognition of the number of protein molecules in the liposome. We considered the use of liposomes as carriers of membrane or globular proteins for sample delivery in XFEL experiments in order to improve the X-ray beam hit rate. Three sets of diffractograms for liposomes of various radii were calculated, including diffractograms for empty liposomes, liposomes loaded with 5 bacteriorhodopsin molecules, and liposomes loaded with 10 bacteriorhodopsin molecules. The training set consisted of 23625 diffraction patterns, and the test set of 7875 patterns. We found that all networks used in our study were able to identify the number of protein molecules in liposomes independent of the liposome radius. Our findings make this approach rather promising for the use of liposomes as protein carriers in XFEL experiments. Conclusion: Thus, the performed numerical experiments show that the use of neural network algorithms for the recognition of diffraction images from single macromolecular particles makes it possible to determine changes in the structure at the angstrom scale.



2021 ◽  
Vol 8 (3) ◽  
pp. 103-111
Author(s):  
Krishna R Gupta ◽  
Uttam Patle ◽  
Uma Kabra ◽  
P. Mishra ◽  
Milind J Umekar

Three-dimensional protein structure prediction from amino acid sequence has been a thought-provoking task for decades, but it is of pivotal importance because it provides a better understanding of protein function. In recent years, the methods for prediction of protein structures have advanced considerably. Computational techniques and the growth of protein sequence and structure databases have influenced the laborious protein structure determination process. Still, there is no single method that can predict all protein structures. In this review, we describe the four stages of protein structure determination. We also explore the current techniques used to uncover protein structure and highlight the most suitable method for a given protein.



2020 ◽  
Author(s):  
Xiaogen Zhou ◽  
Yang Li ◽  
Chengxin Zhang ◽  
Wei Zheng ◽  
Guijun Zhang ◽  
...  

Abstract: Progress in cryo-electron microscopy (cryo-EM) has provided the potential for large-size protein structure determination. However, the solution rate for multi-domain proteins remains low due to the difficulty in modeling inter-domain orientations. We developed DEMO-EM, an automatic method to assemble multi-domain structures from cryo-EM maps through a progressive structural refinement procedure combining rigid-body domain fitting and flexible assembly simulations with deep neural network inter-domain distance profiles. The method was tested on a large-scale benchmark set of proteins containing up to twelve continuous and discontinuous domains with medium-to-low-resolution density maps, where DEMO-EM produced models with correct inter-domain orientations (TM-score >0.5) for 98% of cases and significantly outperformed the state-of-the-art methods. DEMO-EM was applied to the SARS-CoV-2 coronavirus genome and generated models with an average TM-score/RMSD of 0.97/1.4 Å to the deposited structures. These results demonstrated an efficient pipeline that enables automated and reliable large-scale multi-domain protein structure modeling with atomic-level accuracy from cryo-EM maps.
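The TM-score threshold quoted above refers to a length-normalized similarity measure. As a rough sketch, the commonly cited definition sums 1 / (1 + (d_i / d0)^2) over aligned residue pairs and divides by the target length L, with d0 = 1.24 (L − 15)^(1/3) − 1.8. The function below evaluates that expression for one given superposition only; the real score is the maximum over superpositions. The 0.5 Å floor on d0 for short chains and the function name are assumptions of this sketch.

```python
def tm_score_fixed_superposition(distances, target_length):
    """TM-score-style sum for one fixed superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the target structure.
    """
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5  # assumed floor; the formula is undefined for short chains
    d0 = max(d0, 0.5)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

Because unaligned residues contribute nothing to the sum, a model covering only half of the target cannot exceed a value of 0.5 under this expression.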



2014 ◽  
Author(s):  
Anders S Christensen

In this thesis, a method for protein structure determination using chemical shifts is presented. The method is implemented in the open source PHAISTOS protein simulation framework. The method combines sampling from a generative model with a coarse-grained force field and an energy function that includes chemical shifts. The method is benchmarked on folding simulations of five small proteins. In four cases the resulting structures are in excellent agreement with experimental data; the fifth case fails, likely due to inaccuracies in the energy function. For the Chymotrypsin Inhibitor protein, a structure is determined using only chemical shifts recorded and assigned through automated processes. The CA-RMSD to the experimental X-ray structure is 1.1 Å. Additionally, the method is combined with very sparse NOE restraints and evolutionary distance restraints and tested on several protein structures of more than 100 residues. For Rhodopsin (225 residues) a structure is found at 2.5 Å CA-RMSD from the experimental X-ray structure, and a structure is determined for the Savinase protein (269 residues) with 2.9 Å CA-RMSD from the experimental X-ray structure.
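The CA-RMSD values reported above measure the root-mean-square deviation between matching Cα atoms of two structures. A minimal sketch follows, assuming the two coordinate lists are already optimally superimposed and list the same residues in the same order; the superposition step itself (for example with the Kabsch algorithm) is left out, and the function name is hypothetical.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMSD in angstroms between two pre-superimposed lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Without the superposition step, the value returned is an upper bound on the optimal RMSD rather than the figure usually reported.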



2020 ◽  
Author(s):  
Jiahua He ◽  
Sheng-You Huang

Abstract: Advances in microscopy instruments and image processing algorithms have led to an increasing number of cryo-EM maps. However, building accurate models for the EM maps at 3-5 Å resolution remains a challenging and time-consuming process. With the rapid growth of deposited EM maps, there is an increasing gap between the maps and reconstructed/modeled 3-dimensional (3D) structures. Therefore, automatic reconstruction of atomic-accuracy full-atom structures from EM maps is pressingly needed. Here, we present a semi-automatic de novo structure determination method using a deep learning-based framework, named DeepMM, which builds atomic-accuracy all-atom models from cryo-EM maps at near-atomic resolution. In our method, the main-chain and Cα positions as well as their amino acid and secondary structure types are predicted in the EM map using Densely Connected Convolutional Networks. DeepMM was extensively validated on 40 simulated maps at 5 Å resolution and 30 experimental maps at 2.6-4.8 Å resolution as well as an EMDB-wide data set of 2931 experimental maps at 2.6-4.9 Å resolution, and compared with state-of-the-art algorithms including RosettaES, MAINMAST, and Phenix. Overall, our DeepMM algorithm obtained a significant improvement over existing methods in terms of both accuracy and coverage in building full-length protein structures on all test sets, demonstrating the efficacy and general applicability of DeepMM.

Availability: https://github.com/JiahuaHe/DeepMM

Supplementary information: Supplementary data are available.


