Assessment of Globularity of Protein Structures via Minimum Volume Ellipsoids and Voxel-Based Atom Representation

Crystals ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 1539
Author(s):  
Mateusz Banach

A computer algorithm for assessment of globularity of protein structures is presented. By enclosing the input protein in a minimum volume ellipsoid (MVEE) and calculating a profile measuring how voxelized space within this shape (cubes on a uniform grid) is occupied by atoms, it is possible to estimate how well the molecule resembles a globule. For any protein to satisfy the proposed globularity criterion, its ellipsoid profile (EP) should first confirm that atoms adequately fill the ellipsoid’s center. This property should then propagate towards the surface of the ellipsoid, although with diminishing importance. It is not required to compute the molecular surface. Globular status (full or partial) is assigned to proteins with values of their ellipsoid profiles, called here the ellipsoid indexes (EI), above certain levels. Due to structural outliers which may considerably distort the measurements, a companion method for their detection and reduction of their influence is also introduced. It is based on kernel density estimation and is shown to work well as an optional input preparation step for MVEE. Finally, the complete workflow is applied to over two thousand representatives of SCOP 2.08 domain superfamilies, surveying the landscape of tertiary structure of proteins from the Protein Data Bank.
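The two ingredients named in this abstract, an enclosing ellipsoid and a voxel occupancy profile, can be illustrated with a minimal sketch. It is not the author's implementation. It assumes Khachiyan's iterative algorithm for the ellipsoid (followed by a rescaling so that every point is enclosed), treats a voxel as occupied when an atom center falls inside it, and reports the occupied fraction inside nested, scaled copies of the ellipsoid. The function names, the 2.0 voxel edge, the scale levels, and the random test coordinates are all hypothetical; the paper's actual EP and EI definitions are not reproduced here.

```python
import numpy as np

def enclosing_ellipsoid(points, tol=1e-4, max_iter=5000):
    """Approximate minimum volume enclosing ellipsoid (Khachiyan's algorithm).

    Returns (A, c) such that (x - c)^T A (x - c) <= 1 for every input point.
    """
    n, d = points.shape
    Q = np.vstack([points.T, np.ones(n)])      # lifted points, shape (d+1, n)
    u = np.full(n, 1.0 / n)                    # weights over the points
    for _ in range(max_iter):
        X = (Q * u) @ Q.T
        M = np.einsum("ij,ji->i", Q.T @ np.linalg.inv(X), Q)
        j = int(np.argmax(M))
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        converged = np.linalg.norm(new_u - u) < tol
        u = new_u
        if converged:
            break
    c = points.T @ u                           # ellipsoid center
    cov = (points.T * u) @ points - np.outer(c, c)
    A = np.linalg.inv(cov) / d
    # Assumption: rescale so the iterative approximation encloses all points.
    D = points - c
    A /= np.einsum("ij,jk,ik->i", D, A, D).max()
    return A, c

def occupancy_profile(points, A, c, voxel=2.0, levels=(0.25, 0.5, 0.75, 1.0)):
    """Fraction of occupied voxels inside scaled copies of the ellipsoid."""
    half = np.sqrt(np.diag(np.linalg.inv(A)))  # bounding-box half-widths
    n = np.maximum(np.ceil(2.0 * half / voxel).astype(int), 1)
    origin = c - half
    occupied = np.zeros(tuple(n), dtype=bool)
    idx = np.clip(np.floor((points - origin) / voxel).astype(int), 0, n - 1)
    occupied[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    grid = np.stack(np.meshgrid(*[np.arange(k) for k in n], indexing="ij"), axis=-1)
    D = origin + (grid + 0.5) * voxel - c      # voxel centers relative to c
    r2 = np.einsum("...i,ij,...j->...", D, A, D)
    profile = []
    for s in levels:
        mask = r2 <= s * s                     # voxels inside the scaled ellipsoid
        profile.append(float(occupied[mask].mean()) if mask.any() else float("nan"))
    return profile

# Hypothetical input: random coordinates standing in for atom positions.
rng = np.random.default_rng(0)
atoms = rng.normal(size=(200, 3)) * np.array([10.0, 6.0, 4.0])
A, c = enclosing_ellipsoid(atoms)
profile = occupancy_profile(atoms, A, c)
print(profile)
```

In this sketch a compact, globule-like point cloud would be expected to show high occupancy at the innermost levels, while an elongated or hollow one would not; thresholds for any globularity decision would have to come from the paper itself.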

A knowledge of the three-dimensional structure of proteins is an essential prerequisite for the design of new molecules. When the tertiary structure is not available from high-resolution X-ray or n.m.r. analysis, the success of prediction is improved by using a relational database of known protein structures. This can be searched to provide information on secondary structure motifs and domains which are recognized by characteristic sequence patterns and which are assembled as ‘spare parts’ by using computer graphics. Similar techniques can be used to give approximate structures for amino-acid replacements, deletions and insertions introduced by mutagenesis. The resulting structures are optimized by using interactive graphics, energy minimization and molecular dynamics.


2017 ◽  
Vol 2017 ◽  
pp. 1-12 ◽  
Author(s):  
Ke Yan ◽  
Bing Wang ◽  
Holun Cheng ◽  
Zhiwei Ji ◽  
Jing Huang ◽  
...  

Molecular skin surface (MSS), proposed by Edelsbrunner, is a C2-continuous smooth surface modeling approach for biological macromolecules. Compared to the traditional methods of molecular surface representation (e.g., the solvent excluded surface), MSS has distinctive advantages, including having no self-intersection and being decomposable and transformable. To further promote MSS in the field of bioinformatics, a transformation between different MSS representations that mimics macromolecular dynamics is needed. The transformation process helps biologists understand macromolecular dynamics visually at the atomic level, which is important in studying protein structures and binding sites for optimizing drug design. However, modeling the transformation between different MSSs suffers from high computational cost, because the traditional approaches reconstruct every intermediate MSS from the respective intermediate union of balls. In this study, we propose a novel computational framework named general MSS transformation framework (GMSSTF), which transforms between two MSSs without the assistance of the union of balls. To evaluate the effectiveness of GMSSTF, we applied it to the popular public database PDB (Protein Data Bank) and compared the existing MSS algorithms with and without GMSSTF. The simulation results show that the proposed GMSSTF effectively improves the computational efficiency and is potentially useful for macromolecular dynamic simulations.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Alejandro Miguel Cisneros-Martínez ◽  
Arturo Becerra ◽  
Antonio Lazcano

Abstract To date, only a handful of duplicated genes have been described in RNA viruses. This shortage can be attributed to different factors, including the high mutation rate of RNA viruses, which would make a large genome more prone to acquiring deleterious mutations. This may explain why sequence-based approaches have only found duplications in their most recent evolutionary history. To detect earlier duplications, we performed protein tertiary structure comparisons for every RNA virus family represented in the Protein Data Bank. We present a list of thirty pairs of possible paralogs with <30 per cent sequence identity. It is argued that these pairs are the outcome of six duplication events. These include the α and β subunits of the fungal toxin KP6 present in the dsRNA Ustilago maydis virus (family Totiviridae), the SARS-CoV (Coronaviridae) nsp3 domains SUD-N, SUD-M and X-domain, the Picornavirales (families Picornaviridae, Dicistroviridae, Iflaviridae and Secoviridae) capsid proteins VP1, VP2 and VP3, and the Enterovirus (family Picornaviridae) 3C and 2A cysteine-proteases. Protein tertiary structure comparisons may reveal more duplication events as more three-dimensional protein structures are determined, and they suggest that, although still rare, gene duplications may be more frequent in RNA viruses than previously thought. Keywords: gene duplications; RNA viruses.


2019 ◽  
Author(s):  
Broto Chakrabarty ◽  
Nita Parekh

Abstract Tandemly repeated structural motifs in proteins form highly stable structural folds and provide multiple binding sites associated with diverse functional roles. The tertiary structure and function of these proteins are determined by the type and copy number of the repeating units. Each repeat type exhibits a unique pattern of intra- and inter-repeat unit interactions that is well captured by the topological features in the network representation of protein structures. Here we present an improved version of our graph-based algorithm, PRIGSA, with structure-based validation and filtering steps incorporated for accurate detection of tandem structural repeats. The algorithm integrates available knowledge on repeat families with de novo prediction to detect repeats in single monomer chains as well as in multimeric protein complexes. Three levels of performance evaluation are presented: comparison with state-of-the-art algorithms on a benchmark dataset of repeat and non-repeat proteins, accuracy in the detection of members of 13 known repeat families reported in UniProt, and execution on the complete Protein Data Bank to show its ability to identify previously uncharacterized proteins. Executing it on the PDB yields a ∼3-fold increase in the coverage of the members of the 13 known families and identifies 3,408 novel uncharacterized structural repeat proteins. URL: http://bioinf.iiit.ac.in/PRIGSA2/.


2017 ◽  
Author(s):  
Spencer Bliven ◽  
Aleix Lafita ◽  
Althea Parker ◽  
Guido Capitani ◽  
Jose M Duarte

Abstract A correct assessment of the quaternary structure of proteins is a fundamental prerequisite to understanding their function, physico-chemical properties and mode of interaction with other proteins. Currently about 90% of structures in the Protein Data Bank are crystal structures, in which the correct quaternary structure is embedded in the crystal lattice among a number of crystal contacts. Computational methods are required to 1) classify all protein-protein contacts in crystal lattices as biologically relevant or crystal contacts and 2) provide an assessment of how the biologically relevant interfaces combine into a biological assembly. In our previous work we addressed the first problem with our EPPIC (Evolutionary Protein Protein Interface Classifier) method. Here, we present our solution to the second problem with a new method that combines the interface classification results with symmetry and topology considerations. The new algorithm enumerates all possible valid assemblies within the crystal using a graph representation of the lattice and predicts the most probable biological unit based on the pairwise interface scoring. Our method achieves 85% precision on a new dataset of 1,481 biological assemblies with consensus of PDB annotations. Although almost the same precision is achieved by PISA, currently the most popular quaternary structure assignment method, we show that, due to the fundamentally different approach to the problem, the two methods are complementary and could be combined to improve biological assembly assignments. The software for the automatic assessment of protein assemblies (EPPIC version 3) has been made available through a web server at http://www.eppic-web.org.

Author summary X-ray diffraction experiments are the main experimental technique to reveal the detailed atomic 3-dimensional structure of proteins. In these experiments, proteins are packed into crystals, an environment that is far away from their native solution environment. Determining which parts of the structure reflect the protein’s state in the cell rather than being artifacts of the crystal environment can be a difficult task. How the different protein subunits assemble together in solution is known as the quaternary structure. Finding the correct quaternary structure is important both for understanding protein oligomerization and for understanding protein-protein interactions at large. Here we present a new method to automatically determine the quaternary structure of proteins given their crystal structure. We provide a theoretical basis for properties that correct protein assemblies should possess, and provide a systematic evaluation of all possible assemblies according to these properties. The method provides guidance to the experimental structural biologist as well as to structural bioinformaticians analyzing protein structures in bulk. Assemblies are provided for all proteins in the Protein Data Bank through a public website and database that is updated weekly as new structures are released.


Author(s):  
CHANDRAYANI N. ROKDE ◽  
DR.MANALI KSHIRSAGAR

Protein structure prediction (PSP) from amino acid sequence is one of the most actively studied problems in bioinformatics today. This is because the biological function of a protein is determined by its three-dimensional structure. Understanding protein structures is vital for determining the function of a protein and its interactions with DNA, RNA and enzymes. Thus, protein structure is a fundamental area of computational biology. Its importance is heightened by the large amounts of sequence data coming from the PDB (Protein Data Bank) and by the fact that experimental methods used to determine protein structures, such as X-ray crystallography or Nuclear Magnetic Resonance (NMR), remain very expensive and time-consuming. In this paper, different types of protein structures and methods for their prediction are described.


2021 ◽  
Vol 7 ◽  
Author(s):  
Vasam Manjveekar Prabantu ◽  
Nagarajan Naveenkumar ◽  
Narayanaswamy Srinivasan

The interactions between residues in a protein tertiary structure can be studied effectively using the protein structure network (PSN) approach. A PSN is a node-edge representation of the structure, with nodes representing residues and edges representing interactions between residues. In this study, we have employed weighted PSNs to understand the influence of disease-causing mutations on proteins of known 3D structure. We have used manually curated information on disease mutations from UniProtKB/Swiss-Prot and the corresponding protein structures of the wildtype and disease variant from the Protein Data Bank. The PSNs of the wildtype and the disease-causing mutant are compared to analyse the variation of global and local dissimilarity in the overall network and at specific sites. We study how a mutation at a given site can affect the structural network at a distant site which may be involved in the function of the protein. We discuss specific examples of disease cases where the protein structures undergo limited structural divergence in their backbone but have large dissimilarity in their all-atom networks and, vice versa, cases where large conformational alterations are observed while the overall network is retained. We analyse the effect of variation of network parameters that characterize alteration of function or stability.
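The node-edge idea described above can be sketched in a few lines. This is an illustrative assumption, not the authors' weighting scheme: here an edge weight is simply the number of atom pairs from two residues lying within a distance cutoff, global dissimilarity is the total change in edge weight, and local dissimilarity is the per-residue change in weighted degree. The function names, the 4.5 cutoff, and the toy coordinates are hypothetical.

```python
import numpy as np

def weighted_psn(residues, cutoff=4.5):
    """Build a weighted residue network.

    residues: list of (n_atoms, 3) coordinate arrays, one per residue.
    Returns a symmetric matrix W where W[i, j] counts atom pairs of
    residues i and j that lie within `cutoff` of each other.
    """
    n = len(residues)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = residues[i][:, None, :] - residues[j][None, :, :]
            dist = np.linalg.norm(diff, axis=-1)
            W[i, j] = W[j, i] = np.count_nonzero(dist <= cutoff)
    return W

def compare_networks(W_a, W_b):
    """Return (global, local) dissimilarity between two networks of equal size."""
    global_diff = np.abs(W_a - W_b).sum() / 2.0           # each edge counted once
    local_diff = np.abs(W_a.sum(axis=1) - W_b.sum(axis=1))  # weighted-degree change
    return global_diff, local_diff

# Toy example: three single-atom "residues" on a line; the variant moves the last one away.
wildtype = [np.array([[0.0, 0.0, 0.0]]), np.array([[3.0, 0.0, 0.0]]), np.array([[6.0, 0.0, 0.0]])]
variant = [np.array([[0.0, 0.0, 0.0]]), np.array([[3.0, 0.0, 0.0]]), np.array([[20.0, 0.0, 0.0]])]

global_diff, local_diff = compare_networks(weighted_psn(wildtype), weighted_psn(variant))
print(global_diff, local_diff)
```

With real structures the residue coordinate arrays would come from a structure parser, and both networks must be built over the same residue numbering for the comparison to be meaningful.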


1998 ◽  
Vol 54 (6) ◽  
pp. 1085-1094 ◽  
Author(s):  
Helge Weissig ◽  
Ilya N. Shindyalov ◽  
Philip E. Bourne

Databases containing macromolecular structure data provide a crystallographer with important tools for use in solving, refining and understanding the functional significance of their protein structures. Given this importance, this paper briefly summarizes past progress by outlining the features of the significant number of relevant databases developed to date. One recent database, PDB+, containing all current and obsolete structures deposited with the Protein Data Bank (PDB) is discussed in more detail. PDB+ has been used to analyze the self-consistency of the current (1 January 1998) corpus of over 7000 structures. A summary of those findings is presented (a full discussion will appear elsewhere) in the form of global and temporal trends within the data. These trends indicate that challenges exist if crystallographers are to provide the community with complete and consistent structural results in the future. It is argued that better information management practices are required to meet these challenges.


2018 ◽  
Vol 19 (11) ◽  
pp. 3405 ◽  
Author(s):  
Emanuel Peter ◽  
Jiří Černý

In this article, we present a method for the enhanced molecular dynamics simulation of protein and DNA systems called potential of mean force (PMF)-enriched sampling. The method uses partitions derived from the potentials of mean force, which we determined from DNA and protein structures in the Protein Data Bank (PDB). We define a partition function from a set of PDB-derived PMFs, which efficiently compensates for the error introduced by the assumption of a homogeneous partition function from the PDB datasets. The bias based on the PDB-derived partitions is added in the form of a hybrid Hamiltonian using a renormalization method, which adds the PMF-enriched gradient to the system depending on a linear weighting factor and the underlying force field. We validated the method using simulations of dialanine, the folding of TrpCage, and the conformational sampling of the Dickerson–Drew DNA dodecamer. Our results show the potential for the PMF-enriched simulation technique to enrich the conformational space of biomolecules along their order parameters, while we also observe a considerable speed increase in the sampling by factors ranging from 13.1 to 82. The novel method can effectively be combined with enhanced sampling or coarse-graining methods to enrich conformational sampling with a partition derived from the PDB.
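As background for how a potential of mean force can be derived from a set of observed structures, the following minimal sketch applies standard Boltzmann inversion, PMF(x) = -k_B T ln P(x), to a histogram of an order parameter. This is a generic technique assumed here for illustration; the abstract does not specify the authors' exact derivation, partition function, or renormalization, and none of those are reproduced. The function name, the bin count, and the sample values are hypothetical.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def pmf_from_samples(values, bins=50, temperature=300.0):
    """Boltzmann-invert a histogram of an order parameter.

    Returns (bin_centers, pmf) with the PMF minimum shifted to zero.
    Empty bins get an infinite PMF.
    """
    counts, edges = np.histogram(values, bins=bins)
    prob = counts / counts.sum()
    with np.errstate(divide="ignore"):       # log(0) -> -inf for empty bins
        pmf = -K_B * temperature * np.log(prob)
    pmf -= pmf[np.isfinite(pmf)].min()       # set the most populated bin to zero
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, pmf

# Hypothetical order-parameter values (for example, a distance or angle per structure).
samples = np.array([0.0, 0.0, 0.0, 1.0])
centers, pmf = pmf_from_samples(samples, bins=2)
print(centers, pmf)
```

A bias force along the order parameter could then be obtained as the negative gradient of such a PMF; how that gradient is weighted against the underlying force field is specific to the published method.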

