Structural Bioinformatics: Computational Software and Databases for the Evaluation of Protein Structure

2018 ◽  
Vol 9 (2) ◽  
pp. 94-101
Author(s):  
Ayisha Amanullah ◽  
Suad Naheed

Databases are computerized platforms where information is stored and can be retrieved easily by public users. Biological databases are repositories of biological data. These data libraries contain facts and figures related to various research disciplines, including genomics, proteomics, microarray technology, metabolomics and phylogenetics. Biological databases give access to a broad collection of essential biological information, ranging from the function, structure and localization of genes and the clinical consequences of mutations to similarity indices among biological sequences and structures. Many kinds of biological databases are now available on the web. This review focuses on biological databases and bioinformatics tools for protein structure analysis, and also describes the search schemes available in different structural databases. Web-based databases offer information on 3D protein structures at many levels and of many types. For the biological functions and 3D structures of various proteins, these databases provide a wide range of useful links, schematic diagrams and strategies for detailed analysis of the structures of proteins and other macromolecules. The 3D structures stored in structural databases are determined and visualized using X-ray crystallography, electron microscopy and NMR spectroscopy. Structural biologists regularly submit large numbers of protein structures, which are updated and curated by subject experts. The most familiar biological databases that store 3D structures of proteins and other macromolecules include PDB, 3D Genomics, CATH and SCOP. These databases contain valuable information on overall protein structures, domain and motif structures, protein-protein complexes and complexes of proteins with other biomolecules.

Author(s):  
N. Srinivasan ◽  
G. Agarwal ◽  
R. M. Bhaskara ◽  
R. Gadkari ◽  
O. Krishnadev ◽  
...  

In the post-genomic era, biological databases are growing at a tremendous rate. Despite rapid accumulation of biological information, functions and other biological properties of many putative gene products of various organisms remain either unknown or obscure. This paper examines how strategic integration of large biological databases and combinations of various biological information helps address some of the fundamental questions on protein structure, function and interactions. New developments in function recognition by remote homology detection and strategic use of sequence databases aid recognition of functions of newly discovered proteins. Knowledge of 3-D structures and combined use of sequences and 3-D structures of homologous protein domains expand the ability of remote homology detection enormously. The authors also demonstrate how combined consideration of functions of individual domains of multi-domain proteins helps in recognizing gross biological attributes. This paper also discusses a few cases in which combining disparate biological datasets or disparate biological information yields new insights about protein-protein interactions between a host and a pathogen. Finally, the authors discuss how combining low-resolution structural data of gigantic multi-component assemblies, obtained using cryoEM studies, with atomic-level 3-D structures of the components is effective in inferring finer features of the assembly.


Author(s):  
Elena S. Boltanova ◽  
Maria P. Imekova ◽  

Around the world, it is customary to create biological databases of different types. Initially, databases for the investigation of crimes were the most widespread. Later, once their potential and benefits, including for medicine, had been assessed, databases for other areas appeared. Russia was no exception. In Russia, however, unlike in other states, the activities of biological databases created for purposes other than solving crimes are practically unregulated. This article analyzes the legal regulation of biobanks in the Russian Federation and abroad. Special attention is paid to the classification of biobanks. The purpose of the study is to determine whether legislative regulation of their activities is feasible, and to identify patterns in such regulation. To achieve this goal, the authors studied extensive regulatory material, including EU directives and national regulations of the EU member states. The methodological basis of the study comprised general scientific and special scientific research methods; among the latter, the comparative legal method and the formal legal method were widely used. The comparative legal analysis establishes that the EU countries show a high level of legislative activity in determining the legal regime of biological databases. All countries recognize the specifics of such a legal regime, which can largely be explained by the special legal nature of biological samples and biological data.
In this regard, the following issues related to the activities of biological databases are reflected in law throughout the EU countries: the procedure for their creation; the procedure for receiving, processing, storing and transmitting biological samples and the data obtained from them; the rights and obligations of database creators and of persons who have provided their biological samples and biological data about themselves; and a set of measures aimed at protecting the rights and interests of donors and third parties. It appears that Russia should adopt a similar approach to regulating biological databases established for purposes other than the investigation of crimes. Particular attention should be paid to biological databases used for research. In the Russian Federation, they are created, as a rule, at the local level. Their main drawback is that they are separate sources of limited biological information that function independently of each other, whereas comprehensive information concentrated in one place could bring invaluable benefits for Russian science and medicine as a whole. However, this requires the establishment of an appropriate legal framework.


2019 ◽  
Vol 47 (W1) ◽  
pp. W471-W476 ◽  
Author(s):  
Rasim Murat Aydınkal ◽  
Onur Serçinoğlu ◽  
Pemra Ozbek

ProSNEx (Protein Structure Network Explorer) is a web service for construction and analysis of Protein Structure Networks (PSNs) alongside amino acid flexibility, sequence conservation and annotation features. ProSNEx constructs a PSN by adding nodes to represent residues and edges between these nodes using user-specified interaction distance cutoffs for either carbon-alpha, carbon-beta or atom-pair contact networks. Different types of weighted networks can also be constructed by using one of (i) the residue-residue interaction energies in the format returned by gRINN, resulting in a Protein Energy Network (PEN); (ii) the dynamical cross correlations from a coarse-grained Normal Mode Analysis (NMA) of the protein structure; (iii) interaction strength. Upon construction of the network, common network metrics (such as node centralities) as well as shortest paths between nodes and k-cliques are calculated. Moreover, additional features of each residue in the form of conservation scores and mutation/natural variant information are included in the analysis. In this way, the tool offers an enhanced and direct comparison of network-based residue metrics with other types of biological information. ProSNEx is free and open to all users without login requirement at http://prosnex-tool.com.
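The cutoff-based network construction described in this abstract can be illustrated with a minimal Python sketch. It is not ProSNEx code: the function names, the 7.0 Å default cutoff and the toy coordinates are assumptions chosen for illustration, and only an unweighted carbon-alpha contact network is shown.

```python
from collections import deque
from math import dist


def build_contact_network(ca_coords, cutoff=7.0):
    """Map each residue index to the set of residues whose C-alpha lies within `cutoff`."""
    graph = {i: set() for i in range(len(ca_coords))}
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if dist(ca_coords[i], ca_coords[j]) <= cutoff:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def degree_centrality(graph):
    """Fraction of the other residues that each residue is in contact with."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def shortest_path(graph, source, target):
    """Breadth-first search; returns a list of residue indices, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    # Hypothetical coordinates: four residues spaced 3.8 Å apart on a line.
    toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    network = build_contact_network(toy_coords, cutoff=4.0)
    print(degree_centrality(network))
    print(shortest_path(network, 0, 3))
```

Breadth-first search suffices here because the edges are unweighted; a weighted network such as an energy-based one would need a weighted shortest-path algorithm instead.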


2000 ◽  
Vol 33 (1) ◽  
pp. 176-183 ◽  
Author(s):  
Guoguang Lu

In order to facilitate the three-dimensional structure comparison of proteins, software for making comparisons and searching for similarities to protein structures in databases has been developed. The program identifies the residues that share similar positions of both main-chain and side-chain atoms between two proteins. The unique functions of the software also include database processing via Internet- and Web-based servers for different types of users. The developed method and its friendly user interface cope with many of the problems that frequently occur in protein structure comparisons, such as detecting structurally equivalent residues, misalignment caused by coincident match of Cα atoms, circular sequence permutations, tedious repetition of access, maintenance of the most recent database, and inconvenience of user interface. The program is also designed to cooperate with other tools in structural bioinformatics, such as the 3DB Browser software [Prilusky (1998). Protein Data Bank Q. Newslett. 84, 3–4] and the SCOP database [Murzin, Brenner, Hubbard & Chothia (1995). J. Mol. Biol. 247, 536–540], for convenient molecular modelling and protein structure analysis. A similarity ranking score of 'structure diversity' is proposed in order to estimate the evolutionary distance between proteins based on the comparisons of their three-dimensional structures. The function of the program has been utilized as a part of an automated program for multiple protein structure alignment. In this paper, the algorithm of the program and results of systematic tests are presented and discussed.


Author(s):  
Denise Fukumi Tsunoda ◽  
Heitor Silvério Lopes ◽  
Ana Tereza Vasconcelos

Bioinformatics means solving problems arising from biology using methods from computer science. The National Center for Biotechnology Information (www.ncbi.nih.gov) defines bioinformatics as: “…the field of science in which biology, computer science, and information technology merge into a single discipline...There are three important sub-disciplines within bioinformatics: the development of new algorithms and statistics with which to access relationships among members of large data sets; the analysis and interpretation of various types of data including nucleotide and amino acid sequences, protein domains, and protein structures; and the development and implementation of tools that enable efficient access and management of different types of information.”


2007 ◽  
Vol 40 (4) ◽  
pp. 773-777 ◽  
Author(s):  
B. Balamurugan ◽  
M. N. A. Md. Roshan ◽  
B. Shaahul Hameed ◽  
K. Sumathi ◽  
R. Senthilkumar ◽  
...  

A computing engine, the Protein Structure Analysis Package (PSAP), has been developed to calculate and display various hidden structural and functional features of three-dimensional protein structures. The proposed computing engine has several utilities to enable structural biologists to analyze three-dimensional protein molecules and provides an easy-to-use Web interface to compute and visualize the necessary features dynamically on the client machine. Users need to provide the Protein Data Bank (PDB) identification code or upload three-dimensional atomic coordinates from the client machine. For visualization, the free molecular graphics programs RasMol and Jmol are deployed in the computing engine. Furthermore, the computing engine is interfaced with an up-to-date local copy of the PDB. The atomic coordinates are updated every week and hence users can access all the structures available in the PDB. The computing engine is free and is accessible online at http://iris.physics.iisc.ernet.in/psap/.
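As a rough illustration of what working with uploaded atomic coordinates involves, the following Python sketch reads carbon-alpha positions from fixed-column ATOM records of the legacy PDB text format. It is not part of PSAP: `CaAtom`, `parse_ca_atoms` and the demo helper `make_atom_line` are hypothetical names, and the coordinate values in the demo are made up.

```python
from typing import Iterable, List, NamedTuple


class CaAtom(NamedTuple):
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_ca_atoms(pdb_lines: Iterable[str]) -> List[CaAtom]:
    """Collect C-alpha atoms from ATOM records (HETATM records, e.g. calcium ions, are skipped)."""
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":  # atom name, columns 13-16
            continue
        atoms.append(
            CaAtom(
                res_name=line[17:20].strip(),  # residue name, columns 18-20
                chain_id=line[21],             # chain identifier, column 22
                res_seq=int(line[22:26]),      # residue number, columns 23-26
                x=float(line[30:38]),          # x coordinate, columns 31-38
                y=float(line[38:46]),          # y coordinate, columns 39-46
                z=float(line[46:54]),          # z coordinate, columns 47-54
            )
        )
    return atoms


def make_atom_line(serial, name, res_name, chain_id, res_seq, x, y, z):
    """Demo-only helper: format a minimal ATOM record with the fixed column layout."""
    return (
        f"ATOM  {serial:>5} {(' ' + name):<4} {res_name:>3} {chain_id}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


if __name__ == "__main__":
    demo = [
        make_atom_line(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
        make_atom_line(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    ]
    for atom in parse_ca_atoms(demo):
        print(atom)
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in this format can run together without a separating space.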


2020 ◽  
Vol 48 (W1) ◽  
pp. W132-W139
Author(s):  
Sumaiya Iqbal ◽  
David Hoksza ◽  
Eduardo Pérez-Palma ◽  
Patrick May ◽  
Jakob B Jespersen ◽  
...  

Human genome sequencing efforts have greatly expanded, and a plethora of missense variants identified both in patients and in the general population is now publicly accessible. Interpretation of the molecular-level effect of missense variants, however, remains challenging and requires a particular investigation of amino acid substitutions in the context of protein structure and function. Answers to questions like ‘Is a variant perturbing a site involved in key macromolecular interactions and/or cellular signaling?’, or ‘Is a variant changing an amino acid located at the protein core or part of a cluster of known pathogenic mutations in 3D?’ are crucial. Motivated by these needs, we developed MISCAST (missense variant to protein structure analysis web suite; http://miscast.broadinstitute.org/). MISCAST is an interactive and user-friendly web server to visualize and analyze missense variants in protein sequence and structure space. Additionally, a comprehensive set of protein structural and functional features has been aggregated in MISCAST from multiple databases, and displayed on structures alongside the variants to provide users with the biological context of the variant location in an integrated platform. We further made the annotated data and protein structures readily downloadable from MISCAST to foster advanced offline analysis of missense variants by a wide biological community.


2002 ◽  
Vol 01 (01) ◽  
pp. 187-211 ◽  
Author(s):  
SARASWATHI VISHVESHWARA ◽  
K. V. BRINDA ◽  
N. KANNAN

The sequences and structures of a large body of proteins are becoming increasingly available. It is desirable to explore mathematical tools for efficient extraction of information from such sources. The principles of graph theory, earlier applied in fields such as electrical engineering and computer networks, are now being adopted to investigate protein structure, folding, stability, function and dynamics. This review gives a brief account of relevant graphs and graph theoretic concepts. The concepts of protein graph construction are discussed. The manner in which graphs are analyzed and parameters relevant to protein structure are extracted is explained. The structural and biological information derived from protein structures using these methods is presented.


1995 ◽  
Vol 28 (5) ◽  
pp. 624-630 ◽  
Author(s):  
X.-J. Zhang ◽  
B. W. Matthews

EDPDB is a Fortran program that simplifies the analysis of protein structure and makes it easy to extract various types of geometrical and biologically relevant information for the molecule both in isolation and in its crystallographic context. EDPDB offers a large set of functions by which the user can evaluate, select and manipulate the coordinates of protein structures. Types of calculation available include the determination of solvent accessibility, bond lengths and torsion angles, determination of the van der Waals volume of a group of atoms, determination of the best-fit plane through a set of points, evaluation of crystal contacts between a molecule in a crystal and all symmetry-related molecules, and the determination of 'hinge-bending' motion between protein domains. It is also possible to compare different structures, to perform coordinate manipulations and to edit coordinate files. The program augments the graphic analysis of protein structure by allowing the user to construct a simple set of commands that will rapidly screen an entire structure. It may also make special purpose analyses feasible without complicated programming.
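Two of the geometric quantities listed in this abstract, bond lengths and torsion angles, can be computed from atomic coordinates in a few lines. The sketch below is written in Python rather than Fortran and is not taken from EDPDB; the function names and the test points are assumptions made for illustration.

```python
from math import atan2, degrees, dist, sqrt


def _sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))


def _dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def bond_length(p, q):
    """Euclidean distance between two atoms."""
    return dist(p, q)


def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond, in the range (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return degrees(atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points: the first and last atoms are a quarter turn apart about the z axis.
    print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    print(bond_length((0, 0, 0), (0, 3, 4)))
```

Using `atan2` on the projected vectors, rather than `acos` of a single dot product, keeps the sign of the angle and avoids loss of precision near 0° and 180°.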


2021 ◽  
Author(s):  
Alexander Derry ◽  
Kristy A. Carpenter ◽  
Russ B. Altman

The three-dimensional structures of proteins are crucial for understanding their molecular mechanisms and interactions. Machine learning algorithms that are able to learn accurate representations of protein structures are therefore poised to play a key role in protein engineering and drug development. The accuracy of such models in deployment is directly influenced by training data quality. The use of different experimental methods for protein structure determination may introduce bias into the training data. In this work, we evaluate the magnitude of this effect across three distinct tasks: estimation of model accuracy, protein sequence design, and catalytic residue prediction. Most protein structures are derived from X-ray crystallography, nuclear magnetic resonance (NMR), or cryo-electron microscopy (cryo-EM); we trained each model on datasets consisting of either all three structure types or of only X-ray data. We find that across these tasks, models consistently perform worse on test sets derived from NMR and cryo-EM than they do on test sets of structures derived from X-ray crystallography, but that the difference can be mitigated when NMR and cryo-EM structures are included in the training set. Importantly, we show that including all three types of structures in the training set does not degrade test performance on X-ray structures, and in some cases even increases it. Finally, we examine the relationship between model performance and the biophysical properties of each method, and recommend that the biochemistry of the task of interest should be considered when composing training sets.

