A comparison between internal protein nanoenvironments of α-helices and β-sheets

Secondary structure elements are found in almost all protein structures revealed so far. In general, more β-sheets than α-helices are found inside protein structures. For example, considering the PDB, DSSP and Stride definitions for secondary structure elements and using the consensus among them, we found 60,727 helices in 4,376 chains identified in all-α structures and 129,440 helices in 7,898 chains identified in all-α and α + β structures. For β-sheets, we identified 837,345 strands in 184,925 β-sheets located within 50,803 chains of all-β structures and 1,541,961 strands in 355,431 β-sheets located within 86,939 chains in all-β and α + β structures (data extracted on February 1, 2019). In this paper we first address a full characterization of the nanoenvironment found at β-sheet locations and then compare those characteristics with the ones we already published for α-helical secondary structure elements. For this characterization we use, as in our previous work on α-helical nanoenvironments, a set of STING protein structure descriptors. As in the previous work, we assume that we will be able to show that there is a set of protein structure parameters/attributes/descriptors that can fully describe the nanoenvironment around β-sheets, and that appropriate statistical analysis will point to significant changes in the values of those parameters when loci inside and outside a defined secondary structure element are compared. While the univariate analysis is straightforward and intuitively understood, it is severely limited in coverage: it could be applied successfully in at most 25% of the studied cases. Identifying the main descriptors for a specific secondary structure element (SSE) by means of the multivariate MANOVA test is a strong statistical tool for complete discrimination among the SSEs, and it proved to be the approach with the highest coverage.
The complete description of the nanoenvironment can be understood by analogy with a key-lock system, in which all of the lock's mini cylinders must combine their elevations (controlled by a matching key) to open the lock. The main idea is as follows: a set of descriptors (the cylinders in the key-lock example) must precisely combine their values (the elevations) to form and maintain a specific secondary structure element nanoenvironment (the condition required for a key to open a lock).
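
The key-lock analogy can be illustrated with a minimal sketch. The descriptor names and numeric ranges below are hypothetical placeholders, not actual STING descriptors or values from the study; the only point shown is that every descriptor must fall inside its range at the same time, just as every cylinder must sit at the right elevation.

```python
# Hypothetical descriptor ranges for one SSE type (illustrative values only,
# not taken from the paper or from STING).
HELIX_RANGES = {
    "hydrophobicity": (0.2, 0.8),
    "contact_density": (5.0, 12.0),
    "electrostatic_potential": (-30.0, 10.0),
}


def matches_nanoenvironment(descriptors, ranges):
    """Return True only if every descriptor lies inside its allowed range."""
    return all(
        name in descriptors and low <= descriptors[name] <= high
        for name, (low, high) in ranges.items()
    )


# All "cylinders" at the right elevation: the lock opens.
inside = {
    "hydrophobicity": 0.5,
    "contact_density": 8.0,
    "electrostatic_potential": -5.0,
}
# A single descriptor out of range is enough to fail the match.
one_off = dict(inside, contact_density=20.0)
```

Under these assumed ranges, `matches_nanoenvironment(inside, HELIX_RANGES)` holds while the same call with `one_off` does not, which mirrors why a single descriptor examined in isolation has limited coverage.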

Protein Structure Abstraction and Automatic Clustering Using Secondary Structure Element Sequences

Computational Science and Its Applications – ICCSA 2005 - Lecture Notes in Computer Science ◽

10.1007/11424826_136 ◽

2005 ◽

pp. 1284-1292 ◽

Cited By ~ 1

Author(s):

Sung Hee Park ◽

Chan Yong Park ◽

Dae Hee Kim ◽

Seon Hee Park ◽

Jeong Seop Sim

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Secondary Structure Element ◽

Automatic Clustering

Hermes: an ensemble machine learning architecture for protein secondary structure prediction

10.1101/640656 ◽

2019 ◽

Author(s):

Larry Bliss ◽

Ben Pascoe ◽

Samuel K Sheppard

Keyword(s):

Machine Learning ◽

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Cross Validation ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Lower Boundary ◽

Protein Secondary Structure ◽

Homologous Proteins

Motivation: Protein structure prediction, which combines theoretical chemistry and bioinformatics, is an increasingly important technique in biotechnology and biomedical research, for example in the design of novel enzymes and drugs. Here, we present a new ensemble bi-layered machine learning architecture that builds directly on ten existing pipelines, providing rapid, high-accuracy, 3-state secondary structure prediction of proteins. Results: After training on 1348 solved protein structures, we evaluated the model with four independent datasets: JPRED4, compiled by the authors of the successful predictor of the same name, and CASP11, CASP12 and CASP13, assembled by the Critical Assessment of protein Structure Prediction consortium, which runs biennial experiments focused on objective testing of predictors. These rigorous, pre-established protocols included 7-fold cross-validation and blind testing. This led to a mean Hermes accuracy of 95.5%, significantly (p<0.05) better than the ten previously published models analysed in this paper. Furthermore, Hermes yielded a reduction in standard deviation, fewer lower boundary outliers, and reduced dependency on solved structures of homologous proteins, as measured by NEFF score. This architecture provides advantages over other pipelines, while remaining accessible to users at any level of bioinformatics experience. Availability and Implementation: The source code for Hermes is freely available at: https://github.com/HermesPrediction/Hermes. This page also includes the cross-validation with corresponding models, and all training/testing data presented in this study with predictions and accuracy.
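
To make the idea of combining several 3-state predictors concrete, the following is a minimal sketch. It uses a plain per-residue majority vote, which is a stand-in chosen for illustration and not the bi-layered machine learning architecture that Hermes implements; the function names and example strings are assumptions. The second function computes the per-residue agreement usually reported as Q3 accuracy.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine equal-length 3-state strings (H, E, C) position by position."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same non-zero count and length")
    consensus = []
    for column in zip(*predictions):
        # most_common(1) picks the most frequent state in this column;
        # on a tie, the state seen first wins.
        consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Three hypothetical base predictions for a six-residue fragment.
base = ["HHHECC", "HHEECC", "HHHEEC"]
consensus = majority_vote(base)
```

A learned second layer, as described in the abstract, would replace the vote with a model trained on the base predictors' outputs; the vote is only the simplest possible combiner.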

Literature Survey of Protein Secondary Structure Prediction

Jurnal Teknologi ◽

10.11113/jt.v34.642 ◽

2012 ◽

Author(s):

Satya Nanda Vel Arjunan ◽

Safaai Deris ◽

Rosli Md Illias

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Large Scale ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Protein Secondary Structure ◽

Fundamental Theory ◽

Protein Secondary Structure Prediction ◽

General Guide

In the wake of large-scale DNA sequencing projects, accurate tools are needed to predict protein structures. The problem of predicting protein structure from DNA sequence remains fundamentally unsolved even after more than three decades of intensive research. In this paper, the fundamental theory of protein structure will be presented as a general guide to protein secondary structure prediction research. An overview of the state of the art in sequence analysis and some principles of the methods involved will be described. Key words: protein secondary structure prediction; neural networks.

Prediction of Protein Secondary Structure

Jurnal Teknologi ◽

10.11113/jt.v35.605 ◽

2012 ◽

Author(s):

Satya Nanda Vel Arjunan ◽

Safaai Deris ◽

Rosli Md Illias

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Large Scale ◽

Secondary Structure Prediction ◽

State Of The Art ◽

Protein Structures ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

General Guide

In the wake of large-scale DNA sequencing projects, accurate tools are needed to predict protein structures. The problem of predicting protein structure from DNA sequence remains fundamentally unsolved even after more than three decades of intensive research. In this paper, the fundamental theory of protein structure will be presented as a general guide to protein secondary structure prediction research. An overview of the state of the art in sequence analysis and some principles of the methods involved will be described. Key words: protein secondary structure prediction; neural networks.

FAST SIMILARITY SEARCH FOR PROTEIN 3D STRUCTURES USING TOPOLOGICAL PATTERN MATCHING BASED ON SPATIAL RELATIONS

International Journal of Neural Systems ◽

10.1142/s0129065705000244 ◽

2005 ◽

Vol 15 (04) ◽

pp. 287-296 ◽

Cited By ~ 2

Author(s):

SUNG-HEE PARK ◽

KEUN HO RYU ◽

DAVID GILBERT

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Similarity Search ◽

Spatial Databases ◽

Topological String ◽

Protein Structures ◽

Structural Similarity ◽

Spatial Relations ◽

3D Structures ◽

Comparison Results

Similarity search for protein 3D structures becomes complex and computationally expensive because the size of protein structure databases continues to grow tremendously. Fast structural similarity search systems are now required for practical use in protein structure classification, whereas existing comparison systems do not deliver comparison results in time. Our approach uses multi-step processing that consists of a preprocessing step to represent the geometry of protein structures with spatial objects, a filter step to generate a small candidate set using approximate topological string matching, and a refinement step to compute a structural alignment. This paper describes the preprocessing and filtering for fast similarity search using the discovery of topological patterns of secondary structure elements based on spatial relations. Our system is fully implemented using Oracle 8i Spatial. We have previously shown [1] that our approach is faster than other approaches such as DALI. This work shows that discovering topological relations of secondary structure elements in protein structures by using the spatial relations of spatial databases is practical for fast structural similarity search for proteins.
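
The filter step of such a filter-and-refine pipeline can be sketched as follows. This is a simplified illustration under stated assumptions: each structure is reduced to a plain string of secondary structure symbols (H for helix, E for strand), and ordinary Levenshtein distance stands in for the paper's approximate topological string matching, which additionally encodes spatial relations. The database entries and the threshold are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def filter_candidates(query, database, max_distance):
    """Keep entries whose SSE string is within max_distance edits of the query."""
    return [
        name for name, sse in database.items()
        if edit_distance(query, sse) <= max_distance
    ]


# Hypothetical SSE strings for three structures.
database = {"protA": "HEEHEH", "protB": "HHHHHH", "protC": "HEEHEE"}
candidates = filter_candidates("HEEHEH", database, max_distance=1)
```

Only the surviving candidates would then be passed to the expensive refinement step that computes a full structural alignment.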

Improving protein structure prediction by deep learning and computational optimization

10.32469/10355/76251 ◽

2019 ◽

Author(s):

Jie Hou

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Secondary Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Protein Secondary Structure ◽

Scattering Data ◽

Protein Secondary Structure Prediction

Protein structure prediction is one of the most important scientific problems in the field of bioinformatics and computational biology. The availability of protein three-dimensional (3D) structure is crucial for studying biological and cellular functions of proteins. The importance of four major sub-problems in protein structure prediction has been clearly recognized: first, protein secondary structure prediction; second, protein fold recognition; third, protein quality assessment; and fourth, multi-domain assembly. In recent years, deep learning techniques have proved to be a highly effective machine learning method, which has brought revolutionary advances in computer vision, speech recognition and bioinformatics. In this dissertation, five contributions are described. First, DNSS2, a method for protein secondary structure prediction using a one-dimensional deep convolutional network. Second, DeepSF, a method that applies a deep convolutional network to classify a protein sequence into one of thousands of known folds. Third, CNNQA and DeepRank, two deep neural network approaches to systematically evaluate the quality of predicted protein structures and select the most accurate model as the final protein structure prediction. Fourth, MULTICOM, a protein structure prediction system empowered by deep learning and protein contact prediction. Finally, SAXSDOM, a data-assisted method for protein domain assembly using small-angle X-ray scattering data. All the methods are freely available to the scientific community as software tools or web servers.

By using MADALINE Learning with Back Propagation and Keras to Predict the Protein Secondary Structure

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b4964.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 4878-4882

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Back Propagation ◽

Protein Secondary Structure ◽

Sigmoid Function ◽

Crucial Component ◽

Main Motive

Understanding intermediate protein structure prediction is a crucial component in finding the function of amino acid residues. This paper focuses on intermediate protein structure, using feed-forward and feedback methods and extending the concept of the sliding window. Prediction of secondary structure is a very broad problem in bioinformatics. It can be reduced by predicting or unfolding protein structures, which could yield valuable results in the medical sciences. Our main motive is to improve the accuracy of secondary structure prediction and to minimize the error. Experimentally, we use a multilayer ADALINE (MADALINE) network for learning, Keras with TensorFlow to train the weight matrix, and a sigmoid function with back propagation to calculate the output. The results are more prominent than those of existing methods and improve the accuracy of secondary structure prediction.
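
The sliding-window idea mentioned in the abstract can be sketched in a few lines. This is an illustrative encoding only, assuming one-hot residue features, zero padding at the sequence ends, and an odd window width; it is not the authors' implementation, and the training of a MADALINE or Keras model on these features is left out.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """20-dimensional indicator vector; padding or unknown residues give all zeros."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def sliding_windows(sequence, width):
    """Yield one flattened feature vector per residue, centred on that residue."""
    if width % 2 == 0:
        raise ValueError("width must be odd so the window has a centre residue")
    half = width // 2
    padded = "-" * half + sequence + "-" * half
    for i in range(len(sequence)):
        window = padded[i:i + width]
        yield [x for residue in window for x in one_hot(residue)]


def sigmoid(z):
    """Logistic activation mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Each of the 4 residues becomes a vector of 3 * 20 = 60 features.
features = list(sliding_windows("ACDK", width=3))
```

Each feature vector would be the input for predicting the state of its centre residue, with the sigmoid applied to the weighted sums inside the network.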

On the use of direct-coupling analysis with a reduced alphabet of amino acids combined with super-secondary structure motifs for protein fold prediction

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab027 ◽

2021 ◽

Vol 3 (2) ◽

Author(s):

Bernat Anton ◽

Mireia Besalú ◽

Oriol Fornes ◽

Jaume Bonet ◽

Alexis Molina ◽

...

Keyword(s):

Amino Acids ◽

Protein Structure ◽

Secondary Structure ◽

Protein Structures ◽

Three Dimensional ◽

Direct Coupling ◽

Dimensional Structure ◽

Coupling Analysis ◽

Multiple Sequence ◽

Direct Coupling Analysis

Direct-coupling analysis (DCA) for studying the coevolution of residues in proteins has been widely used to predict the three-dimensional structure of a protein from its sequence. We present RADI/raDIMod, a variation of the original DCA algorithm that groups chemically equivalent residues combined with super-secondary structure motifs to model protein structures. Interestingly, the simplification produced by grouping amino acids into only two groups (polar and non-polar) is still representative of the physicochemical nature that characterizes the protein structure and it is in line with the role of hydrophobic forces in protein-folding funneling. As a result of a compressed alphabet, the number of sequences required for the multiple sequence alignment is reduced. The number of long-range contacts predicted is limited; therefore, our approach requires the use of neighboring sequence-positions. We use the prediction of secondary structure and motifs of super-secondary structures to predict local contacts. We use RADI and raDIMod, a fragment-based protein structure modelling, achieving near native conformations when the number of super-secondary motifs covers >30–50% of the sequence. Interestingly, although different contacts are predicted with different alphabets, they produce similar structures.
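
The two-group alphabet reduction described above can be sketched as a simple mapping. The membership of the non-polar set below is an assumption made for illustration; the abstract states only that residues are grouped into polar and non-polar, and the grouping actually used by RADI may differ.

```python
# Assumed non-polar residues; the paper's exact assignment may differ.
NON_POLAR = set("AVLIMFWPGC")


def reduce_alphabet(sequence):
    """Map each residue to 'n' (non-polar) or 'p' (polar); keep gaps as '-'."""
    return "".join(
        "-" if residue == "-" else ("n" if residue in NON_POLAR else "p")
        for residue in sequence.upper()
    )


# One aligned row of a hypothetical multiple sequence alignment.
reduced = reduce_alphabet("MKT-AYIAK")
```

Applying the mapping to every row of an alignment shrinks the number of possible residue pairs per column pair from 20 × 20 to 2 × 2, which is why fewer sequences are needed to estimate the couplings.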

Sequence Specific Dihedral Angle Distribution: Application in Protein Structure Prediction and Evaluation

Plant Tissue Culture and Biotechnology ◽

10.3329/ptcb.v19i2.5439 ◽

2009 ◽

Vol 19 (2) ◽

pp. 217-226

Author(s):

S. M. Minhaz Ud-Dean ◽

Mahdi Muhammad Moosa

Keyword(s):

Protein Structure ◽

Dihedral Angle ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Protein Structures ◽

Angle Distribution ◽

Ramachandran Plot ◽

Specific Data ◽

Specific Distribution ◽

Structure Evaluation

Protein structure prediction and evaluation is one of the major fields of computational biology. Estimation of dihedral angle can provide information about the acceptability of both theoretically predicted and experimentally determined structures. Here we report on the sequence specific dihedral angle distribution of high resolution protein structures available in PDB and have developed Sasichandran, a tool for sequence specific dihedral angle prediction and structure evaluation. This tool will allow evaluation of a protein structure in PDB format from the sequence specific distribution of Ramachandran angles. Additionally, it will allow retrieval of the most probable Ramachandran angles for a given sequence along with the sequence specific data. Key words: Torsion angle, φ-ψ distribution, sequence specific Ramachandran plot, Ramasekharan, protein structure appraisal. DOI: 10.3329/ptcb.v19i2.5439. Plant Tissue Cult. & Biotech. 19(2): 217-226, 2009 (December).
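
For readers unfamiliar with how Ramachandran angles are obtained, the sketch below computes a dihedral angle from four atom positions using the standard projection-and-atan2 formula. It is a generic illustration, not code from the Sasichandran tool; the function names are assumptions. For the backbone, φ uses the atoms C(i-1), N(i), CA(i), C(i) and ψ uses N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond p1 -> p2.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    # atan2 keeps the sign and stays well defined near 0 and 180 degrees.
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))
```

Using `atan2` on the two projected components, rather than `acos` on a single dot product, preserves the sign of the angle, which is what distinguishes the left and right halves of a Ramachandran plot.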

A Thorough Theoretical Exploration of Intriguing Characteristics of Cyclo[18]carbon: Geometry, Bonding Nature, Aromaticity, Weak Interaction, Reactivity, Excited States, Vibrations, Molecular Dynamics and Various Molecular Properties

10.26434/chemrxiv.11320130.v1 ◽

2019 ◽

Cited By ~ 2

Author(s):

Tian Lu ◽

Qinxue Chen ◽

Zeyu Liu

Keyword(s):

Molecular Dynamics ◽

Excited States ◽

Electronic Excitation ◽

Ring Structure ◽

Molecular Properties ◽

Molecular Vibration ◽

Full Characterization ◽

Long Time ◽

Almost All

Although cyclo[18]carbon has long been investigated theoretically and experimentally, only very recently was it prepared and directly observed by means of STM/AFM in the condensed phase (Kaiser et al., Science, 365, 1299 (2019)). The unique ring structure and dual 18-center π delocalization feature give cyclo[18]carbon a variety of unusual characteristics and properties, which are well worth exploring. In this work, we present an extremely comprehensive and detailed investigation of almost all aspects of cyclo[18]carbon, including (1) geometric characteristics, (2) bonding nature, (3) electron delocalization and aromaticity, (4) intermolecular interaction, (5) reactivity, (6) electronic excitation and UV/Vis spectrum, (7) molecular vibration and IR/Raman spectrum, (8) molecular dynamics, (9) response to external field, (10) electron ionization, affinity and accompanying processes, and (11) various molecular properties. We believe that our full characterization of cyclo[18]carbon will greatly deepen researchers' understanding of this system, and thereby help them to utilize it in practice and design its various valuable derivatives.
