Searching techniques for databases of protein secondary structures

This paper summarizes the findings of a recent, British Library-funded research project into computer techniques for searching the three-dimensional protein structures that occur in the Protein Data Bank. The work focuses on the secondary structures of proteins and utilizes both angular and distance geometric information. Algorithms are presented for the automatic identification of secondary structure elements, of secondary structure motifs and of proteins with similar secondary structures.


Revisiting Chameleon Sequences in the Protein Data Bank

Algorithms ◽

10.3390/a11080114 ◽

2018 ◽

Vol 11 (8) ◽

pp. 114 ◽

Cited By ~ 3

Author(s):

Mihaly Mezei

Keyword(s):

Protein Data Bank ◽

Protein Structures ◽

Data Bank ◽

Secondary Structures ◽

Steady Growth ◽

Periodic Repetition

The steady growth of the Protein Data Bank (PDB) suggests the periodic repetition of searches for sequences that form different secondary structures in different protein structures; these are called chameleon sequences. This paper presents a fast (n log(n)) algorithm for such searches and reports the results on all protein structures in the PDB. The longest such sequence found consists of 20 residues.
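
The idea of a chameleon search can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple sort-then-scan approach, a hypothetical input format of aligned (sequence, secondary structure) strings with 'H' for helix and 'E' for strand, and a hypothetical criterion that a k-mer counts as a chameleon when it occurs once as all helix and once as all strand.

```python
from itertools import groupby


def find_chameleons(chains, k):
    """Return sorted k-mers seen as all-helix in one place and all-strand in another.

    chains: iterable of (sequence, secondary_structure) string pairs of equal
    length, with 'H' for helix and 'E' for strand (an assumed encoding).
    """
    occurrences = []
    for seq, ss in chains:
        if len(seq) != len(ss):
            raise ValueError("sequence and structure strings must align")
        for i in range(len(seq) - k + 1):
            occurrences.append((seq[i:i + k], ss[i:i + k]))

    # Sorting brings identical k-mers next to each other; this is the
    # n log n step of the sketch (n = number of k-mer occurrences).
    occurrences.sort()

    hits = []
    for kmer, group in groupby(occurrences, key=lambda pair: pair[0]):
        states = {ss for _, ss in group}
        if "H" * k in states and "E" * k in states:
            hits.append(kmer)
    return hits


# Toy, made-up data: the fragment VLAAG is helical in one chain, a strand in the other.
chains = [
    ("MKVLAAGIVE", "CCHHHHHHCC"),
    ("TTVLAAGSSQ", "CEEEEEECCC"),
]
print(find_chameleons(chains, 5))  # ['VLAAG']
```

Sorting keeps the sketch close to the n log(n) cost quoted in the abstract; a hash map keyed by k-mer would be an equally reasonable design choice.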


FTIR Analysis of Conformational Changes in the Secondary Structure of Ovalbumin: Effect of pH and Cosolvent Sugar-free Natura

Science & Technology Journal ◽

10.22232/stj.2020.08.01.10 ◽

2020 ◽

Vol 8 (1) ◽

pp. 78-83

Author(s):

P. Agalya ◽

V. Velusamy

Keyword(s):

Secondary Structure ◽

Conformational Changes ◽

Three Dimensional ◽

Protein Secondary Structure ◽

Secondary Structures ◽

Structural Elements ◽

Random Coils ◽

Protein Secondary Structures ◽

Derivative Analysis ◽

Influence Of Ph

α-helices, β-sheets, β-turns, and random coils are the three-dimensional local segments that constitute a protein's secondary structure. Molecular vibrations of proteins are sensitive to the structural organization of peptide chains; hence, Fourier transform infrared (FTIR) spectroscopy is one of the recognized techniques for the identification of protein secondary structures. However, the lower-frequency region of FTIR, especially the amide VI bands (in the region 590–490 cm⁻¹), is little studied for proteins. Further, the effect of sugar-free Natura on ovalbumin stability has not yet been studied, to our knowledge. The present study examines the conformational changes in the secondary structure of ovalbumin (OVA) protein under the influence of pH variations (2, 5, 7, 9, and 12) and the inclusion of the cosolvent sugar-free Natura (SFN). From the primary absorption spectra of the amide VI bands, second derivative analysis is used to quantify the secondary structural elements of the protein, and the conformational changes are analyzed from these. The results show that conformational changes occur between the two major secondary structures of OVA, α-helix and β-sheet, due to variation of pH and inclusion of the cosolvent. The results also confirm the denaturation of OVA in the presence of SFN irrespective of pH.


Statistical dependence of protein secondary structure on amino acid bigrams

Chemical Industry and Chemical Engineering Quarterly ◽

10.2298/ciceq0601082z ◽

2006 ◽

Vol 12 (1) ◽

pp. 82-85

Author(s):

Miodrag Zivkovic ◽

Sasa Malkov ◽

Snezana Zaric ◽

Milena Vujosevic-Janicic ◽

Jelena Tomasevic ◽

...

Keyword(s):

Amino Acid ◽

Secondary Structure ◽

Amino Acid Sequence ◽

Protein Secondary Structure ◽

Data Bank ◽

Secondary Structures ◽

Statistical Dependence ◽

Conditional Probabilities ◽

Protein Secondary Structures ◽

Protein Data Bank Database

The statistical dependence of protein secondary structure on amino acid bigram frequencies was studied. Proteins in the PDBSELECT subset of the Protein Data Bank database were investigated. Protein secondary structures were determined using DSSP software. The conditional probabilities of protein secondary structures were calculated and presented. The results on bigrams show the frequencies of all the possible bigrams in all secondary structure types. These results elucidate some factors important for the prediction of the secondary structures of proteins based on the amino acid sequence.
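
As a rough illustration of the kind of statistic described, the sketch below estimates the conditional probability of a secondary-structure pair given an amino acid bigram. The input format (aligned sequence and structure strings) and the choice to condition on the two-letter structure pair are assumptions made for this example; the abstract does not specify how the paper defines its conditional probabilities.

```python
from collections import Counter, defaultdict


def bigram_structure_probabilities(chains):
    """Estimate P(structure pair | amino acid bigram) from aligned strings.

    chains: iterable of (sequence, secondary_structure) string pairs of
    equal length (a hypothetical input format for this sketch).
    """
    counts = defaultdict(Counter)
    for seq, ss in chains:
        if len(seq) != len(ss):
            raise ValueError("sequence and structure strings must align")
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]][ss[i:i + 2]] += 1

    # Normalize the counts of each bigram into relative frequencies.
    return {
        bigram: {state: n / sum(c.values()) for state, n in c.items()}
        for bigram, c in counts.items()
    }


# Toy, made-up data with H = helix, E = strand, C = coil.
toy_chains = [("AAGA", "HHCH"), ("AAG", "HEC")]
probabilities = bigram_structure_probabilities(toy_chains)
print(probabilities["AA"])  # {'HH': 0.5, 'HE': 0.5}
```

In practice the structure strings would come from an assignment program such as DSSP, which the abstract names; the parsing of its output is left out here.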


2StrucCompare: a webserver for visualizing small but noteworthy differences between protein tertiary structures through interrogation of the secondary structure content

Nucleic Acids Research ◽

10.1093/nar/gkz456 ◽

2019 ◽

Vol 47 (W1) ◽

pp. W477-W481 ◽

Cited By ~ 4

Author(s):

Elliot D Drew ◽

Robert W Janes

Keyword(s):

Secondary Structure ◽

Protein Structures ◽

Secondary Structures ◽

Residue Level ◽

Side Chain ◽

Related Protein ◽

Tertiary Structures ◽

Secondary Structure Content ◽

Protein Secondary Structures

2StrucCompare is a webserver whose primary aim is to visualize subtle but functionally important differences between two related protein structures, either of the same protein or related homologues, with similar or functionally different tertiary structures. At the heart of the package is identifying and visualizing differences between conformations at the secondary structure and at the residue level, such as contact differences or side chain conformational differences found between two protein chains. The protein secondary structures are determined according to four established methods (DSSP, STRIDE, P-SEA and STICKS), and as each employs different assignment strategies, small conformational differences between the two structures can give rise to paired residues being denoted as having different secondary structure features with the different methods. 2StrucCompare captures both the large and more subtle differences found between structures, enabling visualization of these differences that could be key to an understanding of a protein's function. 2StrucCompare is freely accessible at http://2struccompare.cryst.bbk.ac.uk/index.php


PAP: a protein analysis package

Journal of Applied Crystallography ◽

10.1107/s0021889890004228 ◽

1990 ◽

Vol 23 (5) ◽

pp. 434-436 ◽

Cited By ~ 7

Author(s):

T. Callahan ◽

W. B. Gleason ◽

T. P. Lybrand

Keyword(s):

Protein Data Bank ◽

Digital Equipment Corporation ◽

Protein Structures ◽

Three Dimensional ◽

Data Bank ◽

Scatter Plot ◽

Linear Sequence ◽

Analysis Package ◽

Temperature Factors ◽

Digital Equipment

A program package has been assembled for the analysis of protein coordinates which are in the Brookhaven Protein Data Bank (PDB) format. These programs can be used to make two types of φ–ψ plots: a Ramachandran-style scatter plot, and a plot of φ and ψ values as a function of the linear sequence. Programs are also available for the display of distance diagonal plots for proteins. Two protein structures can be compared and the resulting r.m.s. differences in the structures plotted as a function of sequence. Temperature factors can be analyzed and plotted as a function of the linear sequence. In addition, various utilities are supplied for splitting PDB files which contain multiple subunits into individual files and also for renumbering PDB files. A utility is also provided for converting Amber-style PDB files into standard PDB files. Priestle's program RIBBON [J. Appl. Cryst. (1988), 21, 572–576] has been converted to run in a stand-alone mode with interactive rotation of the three-dimensional ribbon picture. Programs are Silicon Graphics four-dimensional level and have been tested on 4D70/GT and personal Iris workstations, although programs which give Postscript output have been converted to run on Digital Equipment Corporation VAX computers and Sun workstations.
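
The φ and ψ values behind such plots are backbone dihedral angles. The following is a minimal sketch of the standard four-point dihedral calculation, not code from the PAP package; the function names are hypothetical, and coordinates are assumed to be plain (x, y, z) tuples already read from a PDB file. For residue i, φ is the dihedral of C(i-1), N(i), CA(i), C(i) and ψ is that of N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points in 3D."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Synthetic points, chosen only to exercise the cis (0 degree) and 90 degree cases.
print(dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))  # 0.0
print(dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))  # 90.0
```

Using `atan2` rather than `acos` keeps the sign of the angle, which a Ramachandran-style plot needs to separate, for example, negative from positive φ.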


Structural relation matching: an algorithm to identify structural patterns into RNAs and their interactions

Journal of Integrative Bioinformatics ◽

10.1515/jib-2020-0039 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Michela Quadrini

Keyword(s):

Hydrogen Bonding ◽

Secondary Structure ◽

Nucleotide Sequence ◽

Thermus Thermophilus ◽

Three Dimensional ◽

Secondary Structures ◽

Structural Effect ◽

Biological Processes ◽

Structural Pattern ◽

Rna Molecules

RNA molecules play crucial roles in various biological processes. Their three-dimensional configurations determine their functions and, in turn, influence their interactions with other molecules. RNAs and their interaction structures, the so-called RNA–RNA interactions, can be abstracted in terms of secondary structures, i.e., a list of the nucleotide bases paired by hydrogen bonding within the nucleotide sequence. Each secondary structure, in turn, can be abstracted into cores and shadows. Both are determined by collapsing nucleotides and arcs properly. We formalize all of these abstractions as arc diagrams, whose arcs determine loops. A secondary structure, represented by an arc diagram, is pseudoknot-free if its arc diagram does not present any crossing among arcs; otherwise, it is said to be pseudoknotted. In this study, we face the problem of identifying a given structural pattern within secondary structures, or the associated cores or shadows, of both RNAs and RNA–RNA interactions, characterized by arbitrary pseudoknots. These abstractions are mapped into a matrix, whose elements represent the relations among loops. Therefore, we face the problem of taking advantage of matrices and submatrices. The algorithms, implemented in Python, work in polynomial time. We test our approach on a set of 16S ribosomal RNAs with inhibitors of Thermus thermophilus, and we quantify the structural effect of the inhibitors.
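
The pseudoknot-free condition on arc diagrams can be sketched in a few lines. This example is illustrative only and is not the paper's matrix-based algorithm; it assumes a hypothetical representation in which each arc is a pair of nucleotide positions, and it uses a simple pairwise crossing test.

```python
from itertools import combinations


def arcs_cross(a, b):
    """True if arcs a and b cross, i.e. i < k < j < l after ordering."""
    (i, j), (k, l) = sorted((tuple(sorted(a)), tuple(sorted(b))))
    return i < k < j < l


def is_pseudoknot_free(arcs):
    """True if no two arcs of the diagram cross each other."""
    return not any(arcs_cross(a, b) for a, b in combinations(arcs, 2))


# Nested and side-by-side arcs do not cross; interleaved arcs do.
print(is_pseudoknot_free([(1, 10), (2, 5), (6, 9)]))  # True
print(is_pseudoknot_free([(1, 5), (3, 8)]))           # False
```

The pairwise test is quadratic in the number of arcs, which is enough for a sketch; a stack-based scan over positions would be the usual choice for long sequences.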


In Silico Study of Secondary Structure of Hemoglobin Protein

Research Journal of Pharmacy and Technology ◽

10.52711/0974-360x.2021.01080 ◽

2021 ◽

pp. 6245-6249

Author(s):

Roma Chandra

Keyword(s):

Secondary Structure ◽

Protein Sequence ◽

Structure Prediction ◽

Tertiary Structure ◽

Secondary Structure Prediction ◽

Three Dimensional ◽

Protein Secondary Structure ◽

Alpha Helix ◽

Prediction Methods ◽

Protein Secondary Structures

Protein structure prediction is one of the important goals in the area of bioinformatics and biotechnology. Prediction methods include structure prediction of both secondary and tertiary structures of proteins. Protein secondary structure prediction infers knowledge related to the presence of helices, sheets and coils in a polypeptide chain, whereas protein tertiary structure prediction infers knowledge related to the three-dimensional structures of proteins. Protein secondary structures represent the possible motifs or regular expressions, represented as patterns, that are predicted from the primary protein sequence in the form of alpha helices, beta strands and coils. Secondary structure prediction is useful as it infers information related to the structure and function of an unknown protein sequence. There are various secondary structure prediction methods used to predict helices, sheets and coils. Based on these methods, there are various prediction tools under study. This study includes prediction of hemoglobin using various tools. The results produced inferred knowledge with reference to the percentage of amino acids participating in helices, sheets and coils. PHD and DSC produced the best results out of all the tools used.


Use of synchrotron-based FTIR microspectroscopy to determine protein secondary structures of raw and heat-treated brown and golden flaxseeds: A novel approach

Canadian Journal of Animal Science ◽

10.4141/a05-004 ◽

2005 ◽

Vol 85 (4) ◽

pp. 437-448 ◽

Cited By ~ 21

Author(s):

P. Yu ◽

J. J. McKinnon ◽

H. W. Soita ◽

C. R. Christensen ◽

D. A. Christensen

Keyword(s):

Secondary Structure ◽

Matrix Protein ◽

Protein Secondary Structure ◽

Secondary Structures ◽

Heat Processing ◽

Lower Percentage ◽

Protein Secondary Structures ◽

Novel Approach ◽

Α Helix ◽

Β Sheet

The objectives of the study were to use synchrotron Fourier transform infrared microspectroscopy (S-FTIR) as a novel approach to: (1) reveal ultra-structural chemical features of protein secondary structures of flaxseed tissues affected by variety (golden and brown) and heat processing (raw and roasted), and (2) quantify protein secondary structures using Gaussian and Lorentzian methods of multi-component peak modeling. By using multi-component peak modeling at the protein amide I region of 1700–1620 cm⁻¹, the results showed that the golden flaxseed contained a relatively higher percentage of α-helix (47.1 vs. 36.9%), a lower percentage of β-sheet (37.2 vs. 46.3%) and a higher (P < 0.05) ratio of α-helix to β-sheet than the brown flaxseed (1.3 vs. 0.8). Roasting reduced (P < 0.05) the percentage of α-helix (from 47.1 to 36.1%), increased the percentage of β-sheet (from 37.2 to 49.8%) and reduced the α-helix to β-sheet ratio (1.3 to 0.7) of the golden flaxseed tissues. However, roasting did not affect the percentage and ratio of α-helix and β-sheet in the brown flaxseed tissue. No significant differences were found in the quantification of protein secondary structures between the Gaussian and Lorentzian methods. These results demonstrate the potential of highly spatially resolved S-FTIR to localize relatively pure protein in the tissue and reveal protein secondary structures at a cellular level. The results indicated relative differences in protein secondary structures between flaxseed varieties and differences in the sensitivity of protein secondary structure to heat processing. Further study is needed to understand the relationship between protein secondary structure and protein digestion and utilization of flaxseed and to investigate whether the changes in the relative amounts of protein secondary structures are primarily responsible for differences in protein availability.
Key words: Synchrotron, FTIR microspectroscopy, flaxseeds, intrinsic structural matrix, protein secondary structures, protein nutritive value
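
The α-helix to β-sheet ratios quoted in the abstract follow directly from the reported percentages, as the short check below shows. The helper name and the dictionary labels are chosen for this example only; the numbers are the ones given in the abstract.

```python
def helix_to_sheet_ratio(helix_pct, sheet_pct):
    """Ratio of α-helix to β-sheet percentages, rounded to one decimal."""
    return round(helix_pct / sheet_pct, 1)


# (α-helix %, β-sheet %) as reported in the abstract.
reported = {
    "golden": (47.1, 37.2),
    "brown": (36.9, 46.3),
    "golden (roasted)": (36.1, 49.8),
}

ratios = {name: helix_to_sheet_ratio(*pcts) for name, pcts in reported.items()}
print(ratios)  # {'golden': 1.3, 'brown': 0.8, 'golden (roasted)': 0.7}
```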


Enriched Conformational Sampling of DNA and Proteins with a Hybrid Hamiltonian Derived from the Protein Data Bank

International Journal of Molecular Sciences ◽

10.3390/ijms19113405 ◽

2018 ◽

Vol 19 (11) ◽

pp. 3405 ◽

Cited By ~ 3

Author(s):

Emanuel Peter ◽

Jiří Černý

Keyword(s):

Partition Function ◽

Protein Data Bank ◽

Protein Structures ◽

Data Bank ◽

Weighting Factor ◽

Potential Of Mean Force ◽

Conformational Space ◽

Dynamics Simulation ◽

Conformational Sampling ◽

Speed Increase

In this article, we present a method for the enhanced molecular dynamics simulation of protein and DNA systems called potential of mean force (PMF)-enriched sampling. The method uses partitions derived from the potentials of mean force, which we determined from DNA and protein structures in the Protein Data Bank (PDB). We define a partition function from a set of PDB-derived PMFs, which efficiently compensates for the error introduced by the assumption of a homogeneous partition function from the PDB datasets. The bias based on the PDB-derived partitions is added in the form of a hybrid Hamiltonian using a renormalization method, which adds the PMF-enriched gradient to the system depending on a linear weighting factor and the underlying force field. We validated the method using simulations of dialanine, the folding of TrpCage, and the conformational sampling of the Dickerson–Drew DNA dodecamer. Our results show the potential for the PMF-enriched simulation technique to enrich the conformational space of biomolecules along their order parameters, while we also observe a considerable speed increase in the sampling by factors ranging from 13.1 to 82. The novel method can effectively be combined with enhanced sampling or coarse-graining methods to enrich conformational sampling with a partition derived from the PDB.


Three-Dimensional Graph Matching to Identify Secondary Structure Correspondence of Medium-Resolution Cryo-EM Density Maps

Biomolecules ◽

10.3390/biom11121773 ◽

2021 ◽

Vol 11 (12) ◽

pp. 1773

Author(s):

Bahareh Behkamal ◽

Mahmoud Naghibzadeh ◽

Mohammad Reza Saberi ◽

Zeinab Amiri Tehranizadeh ◽

Andrea Pagnani ◽

...

Keyword(s):

Secondary Structure ◽

Graph Matching ◽

Protein Complexes ◽

Three Dimensional ◽

Protein Structure Determination ◽

Secondary Structures ◽

Computational Method ◽

Target Sequence ◽

Matching Problem ◽

Medium Resolution

Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. However, cryo-EM reconstructions are limited to medium resolution (~4–10 Å) in some cases. In this resolution range, a cryo-EM density map can hardly be used to directly determine the structure of proteins at atomic-level resolution, or even at the level of their amino acid residue backbones. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such as α-helices and β-sheets are observable. Consequently, finding the mapping of the secondary structures of the modeled structure (SSEs-A) to the cryo-EM map (SSEs-C) is one of the primary concerns in cryo-EM modeling. To address this issue, this study proposes a novel automatic computational method to identify SSE correspondence in three-dimensional (3D) space. Initially, through a modeling of the target sequence with the aid of extracting highly reliable features from a generated 3D model and map, the SSE matching problem is formulated as a 3D vector matching problem. Afterward, the 3D vector matching problem is transformed into a 3D graph matching problem. Finally, a similarity-based voting algorithm combined with the principle of least conflict (PLC) concept is developed to obtain the SSE correspondence. To evaluate the accuracy of the method, a testing set of 25 experimental and simulated maps with a maximum of 65 SSEs is selected. Comparative studies are also conducted to demonstrate the superiority of the proposed method over some state-of-the-art techniques. The results demonstrate that the method is efficient, robust, and works well in the presence of errors in the predicted secondary structures of the cryo-EM images.
