Computing structure-based lipid accessibility of membrane proteins with mp_lipid_acc in RosettaMP

2016
Author(s):
Julia Koehler Leman
Sergey Lyskov
Richard Bonneau

Abstract Background: Membrane proteins are vastly underrepresented in structural databases, which has led to a lack of computational tools and the corresponding inappropriate use of tools designed for soluble proteins. For membrane proteins, lipid accessibility is an essential property. Even though programs are available for sequence-based prediction of lipid accessibility and structure-based identification of solvent-accessible surface area, the latter does not distinguish between water accessible and lipid accessible residues in membrane proteins. Results: Here we present mp_lipid_acc, the first method to identify lipid accessible residues from the protein structure, implemented in the RosettaMP framework and available as a webserver. Our method uses protein structures transformed in membrane coordinates, for instance from PDBTM or OPM databases, and a defined membrane thickness to classify lipid accessibility of residues. mp_lipid_acc is applicable to both α-helical and β-barrel membrane proteins of diverse architectures with or without water-filled pores and uses a concave hull algorithm for classification. We further provide a manually curated benchmark dataset, on which our method achieves prediction accuracies of 90%. Conclusion: We present a novel tool to classify lipid accessibility from the protein structure, which is applicable to proteins of diverse architectures and achieves prediction accuracies of 90% on a manually curated database. mp_lipid_acc is part of the Rosetta software suite, available at www.rosettacommons.org. The webserver is available at http://rosie.graylab.jhu.edu/mp_lipid_acc/submit and the benchmark dataset is available at http://tinyurl.com/mp-lipid-acc-dataset. Supplementary information: Supplementary information is available at BMC Bioinformatics.
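The role of the membrane thickness in the abstract above can be illustrated with a minimal sketch. It is not the mp_lipid_acc implementation and does not attempt the concave hull step; it only shows a plausible first filter, assuming the structure is already in membrane coordinates with the membrane centred at z = 0 and its normal along the z axis. The function names, the `(residue_number, z)` input format and the 30 Å default thickness are hypothetical.

```python
from typing import List, Tuple


def in_membrane_slab(z: float, thickness: float) -> bool:
    """Return True if a z-coordinate lies inside a slab centred at z = 0."""
    return abs(z) <= thickness / 2.0


def split_by_slab(
    residues: List[Tuple[int, float]], thickness: float = 30.0
) -> Tuple[List[int], List[int]]:
    """Split (residue_number, z) pairs into those inside and outside the slab.

    Assumption: z is the residue's coordinate along the membrane normal,
    e.g. the z of its C-alpha atom in a membrane-transformed structure.
    """
    inside = [num for num, z in residues if in_membrane_slab(z, thickness)]
    outside = [num for num, z in residues if not in_membrane_slab(z, thickness)]
    return inside, outside


if __name__ == "__main__":
    demo = [(1, -20.0), (2, -5.0), (3, 14.9), (4, 15.0), (5, 18.0)]
    print(split_by_slab(demo, thickness=30.0))
```

Only residues that pass such a filter would be candidates for a lipid-facing versus pore-facing decision; that decision is the part the authors handle with the concave hull.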


2015
Vol 32 (6)
pp. 843-849
Author(s):  
Rhys Heffernan ◽  
Abdollah Dehzangi ◽  
James Lyons ◽  
Kuldip Paliwal ◽  
Alok Sharma ◽  
...  

Abstract Motivation: Solvent exposure of amino acid residues of proteins plays an important role in understanding and predicting protein structure, function and interactions. Solvent exposure can be characterized by several measures including solvent accessible surface area (ASA), residue depth (RD) and contact numbers (CN). More recently, an orientation-dependent contact number called half-sphere exposure (HSE) was introduced by separating the contacts within upper and down half spheres defined according to the Cα-Cβ (HSEβ) vector or neighboring Cα-Cα vectors (HSEα). HSEα calculated from protein structures was found to better describe the solvent exposure over ASA, CN and RD in many applications. Thus, a sequence-based prediction is desirable, as most proteins do not have experimentally determined structures. To the best of our knowledge, there is no method to predict HSEα and only one method to predict HSEβ. Results: This study developed a novel method for predicting both HSEα and HSEβ (SPIDER-HSE) that achieved a consistent performance for 10-fold cross validation and two independent tests. The correlation coefficients between predicted and measured HSEβ (0.73 for upper sphere, 0.69 for down sphere and 0.76 for contact numbers) for the independent test set of 1199 proteins are significantly higher than existing methods. Moreover, predicted HSEα has a higher correlation coefficient (0.46) to the stability change by residue mutants than predicted HSEβ (0.37) and ASA (0.43). The results, together with its easy Cα-atom-based calculation, highlight the potential usefulness of predicted HSEα for protein structure prediction and refinement as well as function prediction. Availability and implementation: The method is available at http://sparks-lab.org. Supplementary information: Supplementary data are available at Bioinformatics online.
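To make the half-sphere exposure definition concrete, the following sketch counts Cα neighbours in the upper and lower half spheres around one residue, using the Cα-Cβ vector as the dividing direction (the HSEβ variant). It computes the measure from known coordinates; it is not the SPIDER-HSE predictor, which works from sequence. The function name, the coordinate format and the 13 Å radius are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def hse_beta(ca: List[Vec], cb: List[Vec], i: int, radius: float = 13.0) -> Tuple[int, int]:
    """Return (up, down) C-alpha neighbour counts for residue i.

    'up' counts neighbours on the side the CA->CB vector points to;
    'down' counts the rest. The radius is an assumed cutoff in angstroms.
    """
    direction = _sub(cb[i], ca[i])
    up = down = 0
    for j, other in enumerate(ca):
        if j == i:
            continue
        offset = _sub(other, ca[i])
        if math.sqrt(_dot(offset, offset)) > radius:
            continue  # outside the sphere, not a contact
        if _dot(direction, offset) >= 0.0:
            up += 1
        else:
            down += 1
    return up, down


if __name__ == "__main__":
    ca_atoms = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0), (0.0, 0.0, -5.0), (0.0, 0.0, 20.0)]
    cb_atoms = [(0.0, 0.0, 1.5), (0.0, 1.5, 5.0), (0.0, 1.5, -5.0), (0.0, 1.5, 20.0)]
    up, down = hse_beta(ca_atoms, cb_atoms, 0)
    print(up, down, up + down)  # the sum is the plain contact number
```

The sum of the two counts recovers the orientation-independent contact number mentioned in the abstract.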



2018
Vol 35 (15)
pp. 2578-2584
Author(s):  
Eduardo Mayol ◽  
Mercedes Campillo ◽  
Arnau Cordomí ◽  
Mireia Olivella

Abstract Motivation: The number of available membrane protein structures has markedly increased in recent years and, in parallel, so has the reliability of the methods to detect transmembrane (TM) segments. In the present report, we characterized inter-residue interactions in α-helical membrane proteins using a dataset of 3462 TM helices from 430 proteins. This is by far the largest analysis published to date. Results: Our analysis of residue–residue interactions in TM segments of membrane proteins shows that almost all interactions involve aliphatic residues and Phe. There is a lack of polar–polar, polar–charged and charged–charged interactions except for those between Thr or Ser sidechains and the backbone carbonyl of aliphatic and Phe residues. The results are discussed in the context of the preferences of amino acids to be in the protein core or exposed to the lipid bilayer and to occupy specific positions along the TM segment. Comparison to datasets of β-barrel membrane proteins and of α-helical globular proteins unveils the specific patterns of interactions and residue composition characteristic of α-helical membrane proteins that are the clue to understanding their structure. Availability and implementation: Results data and datasets used are available at http://lmc.uab.cat/TMalphaDB/interactions.php. Supplementary information: Supplementary data are available at Bioinformatics online.



2010
Vol 43 (1)
pp. 65-158
Author(s):  
Kutti R. Vinothkumar ◽  
Richard Henderson

Abstract In reviewing the structures of membrane proteins determined up to the end of 2009, we present in words and pictures the most informative examples from each family. We group the structures together according to their function and architecture to provide an overview of the major principles and variations on the most common themes. The first structures, determined 20 years ago, were those of naturally abundant proteins with limited conformational variability, and each membrane protein structure determined was a major landmark. With the advent of complete genome sequences and efficient expression systems, there has been an explosion in the rate of membrane protein structure determination, with many classes represented. New structures are published every month and more than 150 unique membrane protein structures have been determined. This review analyses the reasons for this success, discusses the challenges that still lie ahead, and presents a concise summary of the key achievements with illustrated examples selected from each class.



2018
Author(s):  
Maxim Shapovalov ◽  
Slobodan Vucetic ◽  
Roland L. Dunbrack

Abstract Protein loops connect regular secondary structures and contain 4-residue beta turns which represent 63% of the residues in loops. The commonly used classification of beta turns (Type I, I’, II, II’, VIa1, VIa2, VIb, and VIII) was developed in the 1970s and 1980s from analysis of a small number of proteins of average resolution, and represents only two thirds of beta turns observed in proteins (with a generic class Type IV representing the rest). We present a new clustering of beta turn conformations from a set of 13,030 turns from 1078 ultra-high resolution protein structures (≤1.2 Å). Our clustering is derived from applying the DBSCAN and k-medoids algorithms to this data set with a metric commonly used in directional statistics applied to the set of dihedral angles from the second and third residues of each turn. We define 18 turn types compared to the 8 classical turn types in common use. We propose a new 2-letter nomenclature for all 18 beta-turn types using Ramachandran region names for the two central residues (e.g., ‘A’ and ‘D’ for alpha regions on the left side of the Ramachandran map and ‘a’ and ‘d’ for equivalent regions on the right-hand side; classical Type I turns are ‘AD’ turns and Type I’ turns are ‘ad’). We identify 11 new types of beta turn, 5 of which are sub-types of classical beta turn types. Up-to-date statistics, probability densities of conformations, and sequence profiles of beta turns in loops were collected and analyzed. A library of turn types, BetaTurnLib18, and cross-platform software, BetaTurnTool18, which identifies turns in an input protein structure, are freely available and redistributable from dunbrack.fccc.edu/betaturn and github.com/sh-maxim/BetaTurn18. Given the ubiquitous nature of beta turns, this comprehensive study updates understanding of beta turns and should also provide useful tools for protein structure determination, refinement, and prediction programs.
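Clustering dihedral angles needs a distance that respects periodicity, so that 179° and −179° count as close. The abstract does not spell out its metric; the sketch below assumes one common choice from directional statistics, the squared chord length 2(1 − cos Δθ), averaged over the angles that describe a turn. The function names and the tuple-of-degrees input are hypothetical.

```python
import math
from typing import Sequence


def angular_sq_distance(a_deg: float, b_deg: float) -> float:
    """Squared chord length between two angles on the unit circle.

    Equals 2 * (1 - cos(delta)); ranges from 0 (same angle) to 4 (opposite).
    """
    return 2.0 * (1.0 - math.cos(math.radians(a_deg - b_deg)))


def turn_distance(t1: Sequence[float], t2: Sequence[float]) -> float:
    """Average angular distance over matching dihedral angles of two turns."""
    if len(t1) != len(t2) or not t1:
        raise ValueError("turns must have the same, non-zero number of angles")
    return sum(angular_sq_distance(a, b) for a, b in zip(t1, t2)) / len(t1)


if __name__ == "__main__":
    # Wrap-around: 179 and -179 degrees are only 2 degrees apart.
    print(angular_sq_distance(179.0, -179.0), angular_sq_distance(0.0, 2.0))
    print(turn_distance((-60.0, -30.0, -90.0, 0.0), (-65.0, -25.0, -95.0, 5.0)))
```

A function like `turn_distance` could be supplied to a density-based or medoid-based clustering routine as a precomputed pairwise distance.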



2021
Author(s):  
Sandeep Kaur ◽  
Neblina Sikta ◽  
Andrea Schafferhans ◽  
Nicola Bordin ◽  
Mark J. Cowley ◽  
...  

Abstract Motivation: Variant analysis is a core task in bioinformatics that requires integrating data from many sources. This process can be helped by using 3D structures of proteins, which offer a spatial context that can provide insight into how variants affect function. Many available tools can help with mapping variants onto structures, but each has specific restrictions, with the result that many researchers fail to benefit from valuable insights that could be gained from structural data. Results: To address this, we have created a streamlined system for incorporating 3D structures into variant analysis. Variants can be specified via URLs that are easy to read and write, and use the notation recommended by the Human Genome Variation Society (HGVS). For example, ‘https://aquaria.app/SARS-CoV-2/S/?N501Y’ specifies the N501Y variant of SARS-CoV-2 S protein. In addition to mapping variants onto structures, our system provides summary information from multiple external resources, including COSMIC, CATH-FunVar, and PredictProtein. Furthermore, our system identifies and summarizes structures containing the variant, as well as the variant-position. Our system supports essentially any mutation for any well-studied protein, and uses all available structural data (including models inferred via very remote homology) integrated into a system that is fast and simple to use. By giving researchers easy, streamlined access to a wealth of structural information during variant analysis, our system will help in revealing novel insights into the molecular mechanisms underlying protein function in health and disease. Availability: Our resource is freely available at the project home page (https://aquaria.app). After peer review, the code will be openly available via a GPL version 2 license at https://github.com/ODonoghueLab/Aquaria.
PSSH2, the database of sequence-to-structure alignments, is also freely available for download at https://zenodo.org/record/[email protected] Supplementary information: None.
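The URL convention in the abstract can be illustrated by pulling the variant out of the example address. This sketch is not part of Aquaria; it assumes the variant sits in the query string and handles only single-residue substitutions written with one-letter codes (optionally prefixed with `p.`), which is a small subset of HGVS notation. The function name and the returned dictionary layout are hypothetical.

```python
import re
from typing import Dict, List, Union
from urllib.parse import urlsplit

# One-letter substitution such as N501Y, optionally written p.N501Y.
_SUBSTITUTION = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z])$")


def parse_variant_url(url: str) -> Dict[str, Union[str, int, List[str]]]:
    """Split a variant URL into its path segments and substitution parts."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    match = _SUBSTITUTION.match(parts.query)
    if match is None:
        raise ValueError(f"unsupported variant notation: {parts.query!r}")
    ref, position, alt = match.groups()
    return {"path": segments, "ref": ref, "position": int(position), "alt": alt}


if __name__ == "__main__":
    print(parse_variant_url("https://aquaria.app/SARS-CoV-2/S/?N501Y"))
```

For the example URL, the path segments name the organism and protein, and the query carries the reference residue, position and replacement residue.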



2020
Vol 36 (12)
pp. 3758-3765
Author(s):  
Xiaoqiang Huang ◽  
Robin Pearce ◽  
Yang Zhang

Abstract Motivation Protein structure and function are essentially determined by how the side-chain atoms interact with each other. Thus, accurate protein side-chain packing (PSCP) is a critical step toward protein structure prediction and protein design. Despite the importance of the problem, however, the accuracy and speed of current PSCP programs are still not satisfactory. Results We present FASPR for fast and accurate PSCP by using an optimized scoring function in combination with a deterministic searching algorithm. The performance of FASPR was compared with four state-of-the-art PSCP methods (CISRR, RASP, SCATD and SCWRL4) on both native and non-native protein backbones. For the assessment on native backbones, FASPR achieved a good performance by correctly predicting 69.1% of all the side-chain dihedral angles using a stringent tolerance criterion of 20°, compared favorably with SCWRL4, CISRR, RASP and SCATD which successfully predicted 68.8%, 68.6%, 67.8% and 61.7%, respectively. Additionally, FASPR achieved the highest speed for packing the 379 test protein structures in only 34.3 s, which was significantly faster than the control methods. For the assessment on non-native backbones, FASPR showed an equivalent or better performance on I-TASSER predicted backbones and the backbones perturbed from experimental structures. Detailed analyses showed that the major advantage of FASPR lies in the optimal combination of the dead-end elimination and tree decomposition with a well optimized scoring function, which makes FASPR of practical use for both protein structure modeling and protein design studies. Availability and implementation The web server, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/FASPR and https://github.com/tommyhuangthu/FASPR. Supplementary information Supplementary data are available at Bioinformatics online.
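The accuracy figures above rest on comparing predicted and native dihedral angles within a 20° tolerance, which requires handling the wrap-around at ±180°. The sketch below shows that comparison in its simplest per-angle form. It is not FASPR's evaluation code: details such as per-residue aggregation or symmetric side chains are not given in the abstract and are ignored here, and the function names are hypothetical.

```python
from typing import Sequence


def angle_diff(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def fraction_within_tolerance(
    predicted: Sequence[float], native: Sequence[float], tolerance: float = 20.0
) -> float:
    """Fraction of predicted angles within `tolerance` degrees of the native ones."""
    if len(predicted) != len(native) or not predicted:
        raise ValueError("inputs must have the same, non-zero length")
    hits = sum(1 for p, n in zip(predicted, native) if angle_diff(p, n) <= tolerance)
    return hits / len(predicted)


if __name__ == "__main__":
    pred = [60.0, 175.0, -60.0, 180.0]
    nat = [65.0, -175.0, -100.0, 10.0]
    print(fraction_within_tolerance(pred, nat))
```

Without the wrap-around step, a prediction of 175° against a native −175° would be scored as 350° off rather than 10°.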



2018
Vol 35 (14)
pp. 2403-2410
Author(s):  
Jack Hanson ◽  
Kuldip Paliwal ◽  
Thomas Litfin ◽  
Yuedong Yang ◽  
Yaoqi Zhou

Abstract Motivation: Sequence-based prediction of one dimensional structural properties of proteins has been a long-standing subproblem of protein structure prediction. Recently, prediction accuracy has been significantly improved due to the rapid expansion of protein sequence and structure libraries and advances in deep learning techniques, such as residual convolutional networks (ResNets) and Long-Short-Term Memory Cells in Bidirectional Recurrent Neural Networks (LSTM-BRNNs). Here we leverage an ensemble of LSTM-BRNN and ResNet models, together with predicted residue-residue contact maps, to continue the push towards the attainable limit of prediction for 3- and 8-state secondary structure, backbone angles (θ, τ, ϕ and ψ), half-sphere exposure, contact numbers and solvent accessible surface area (ASA). Results: The new method, named SPOT-1D, achieves similar, high performance on a large validation set and test set (≈1000 proteins in each set), suggesting robust performance for unseen data. For the large test set, it achieves 87% and 77% in 3- and 8-state secondary structure prediction and 0.82 and 0.86 in correlation coefficients between predicted and measured ASA and contact numbers, respectively. Comparison to current state-of-the-art techniques reveals substantial improvement in secondary structure and backbone angle prediction. In particular, 44% of 40-residue fragment structures constructed from predicted backbone Cα-based θ and τ angles are less than 6 Å root-mean-squared-distance from their native conformations, nearly 20% better than the next best. The method is expected to be useful for advancing protein structure and function prediction. Availability and implementation: SPOT-1D and its data are available at: http://sparks-lab.org/. Supplementary information: Supplementary data are available at Bioinformatics online.



2017
Author(s):  
Ibrahim Tanyalcin ◽  
Julien Ferte ◽  
Taushif Khan ◽  
Carla Al Assaf

Abstract Summary: One of the main goals of proteomics is to understand how point mutations impact protein structure. Visualization and clustering of point mutations in a user-defined 3-dimensional space can give researchers new insights and hypotheses about a mutation’s mechanism of action. Availability and Implementation: We have developed an interactive I-PV add-on called INDORIL to visualize point mutations. Indoril can be downloaded from http://[email protected][email protected] Supplementary Information: Please refer to the supplementary section and http://www.i-pv.org.



2020
Author(s):  
Ronald Ayoub ◽  
Yugyung Lee

Abstract Protein structure prediction is a long-standing unsolved problem in molecular biology that has seen renewed interest with the recent success of deep learning with AlphaFold at CASP13. While developing and evaluating protein structure prediction methods, researchers may want to identify the most similar known structures to their predicted structures. These predicted structures often have low sequence and structure similarity to known structures. We show how RUPEE, a purely geometric protein structure search, is able to identify the structures most similar to structure predictions, regardless of how they vary from known structures, something existing protein structure searches struggle with. RUPEE accomplishes this through the use of a novel linear encoding of protein structures as a sequence of residue descriptors. Using a fast Needleman-Wunsch algorithm, RUPEE is able to perform alignments on the sequences of residue descriptors for every available structure. This is followed by a series of increasingly accurate structure alignments from TM-align alignments initialized with the Needleman-Wunsch residue descriptor alignments to standard TM-align alignments of the final results. By using alignment normalization effectively at each stage, RUPEE also can execute containment searches in addition to full-length searches to identify structural motifs within proteins. We compare the results of RUPEE to mTM-align, SSM, CATHEDRAL and VAST using a benchmark derived from the protein structure predictions submitted to CASP13. RUPEE identifies better alignments on average with respect to RMSD and TM-score as well as Q-score and SSAP-score, scores specific to SSM and CATHEDRAL, respectively. Finally, we show a sample of the top-scoring alignments that RUPEE identified that none of the other protein structure searches we compared to were able to identify. The RUPEE protein structure search is available at https://ayoubresearch.com.
Code and data are available at https://github.com/rayoub/rupee.
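The alignment step described in the abstract, Needleman-Wunsch over sequences of residue descriptors, can be sketched with a textbook global-alignment score. This is not RUPEE's code: the scoring values are arbitrary, the descriptors are stood in for by plain characters, and RUPEE's actual descriptor encoding and speed optimizations are not reproduced. The function name and parameters are hypothetical.

```python
from typing import Hashable, Sequence


def needleman_wunsch_score(
    a: Sequence[Hashable],
    b: Sequence[Hashable],
    match: int = 1,
    mismatch: int = -1,
    gap: int = -1,
) -> int:
    """Global alignment score of two descriptor sequences (linear gap penalty).

    Uses two rows of the dynamic-programming table, so memory is O(len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Characters stand in for per-residue descriptors.
    print(needleman_wunsch_score("HHHEE", "HHEE"))
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
```

Because the elements only need to support equality, the same function works on lists of integer descriptor codes as well as strings.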



2006
Vol 04 (06)
pp. 1197-1216
Author(s):  
ZEYAR AUNG ◽  
KIAN-LEE TAN

We propose a detailed protein structure alignment method named "MatAlign". It is a two-step algorithm. Firstly, we represent 3D protein structures as 2D distance matrices, and align these matrices by means of dynamic programming in order to find the initially aligned residue pairs. Secondly, we refine the initial alignment iteratively into the optimal one according to an objective scoring function. We compare our method against DALI and CE, which are among the most accurate and the most widely used of the existing structural comparison tools. On the benchmark set of 68 protein structure pairs by Fischer et al., MatAlign provides better alignment results, according to four different criteria, than both DALI and CE in a majority of cases. MatAlign also performs as well in structural database search as DALI does, and much better than CE does. MatAlign is about two to three times faster than DALI, and has about the same speed as CE. The software and the supplementary information for this paper are available at . .
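The first step in the abstract, representing a 3D structure as a 2D distance matrix, can be sketched as follows. This covers only the representation, not MatAlign's dynamic-programming alignment of matrices or its refinement stage. It assumes one coordinate per residue (for example the Cα atom), and the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_matrix(coords: Sequence[Point]) -> List[List[float]]:
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
    for row in distance_matrix(points):
        print(row)
```

A distance matrix is unchanged by rotating or translating the structure, which is why two structures can be compared through their matrices without superimposing them first.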


