Advances in Protein Super-Secondary Structure Prediction and Application to Protein Structure Prediction

Improving protein structure prediction by deep learning and computational optimization

10.32469/10355/76251 ◽

2019 ◽

Author(s):

Jie Hou

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Secondary Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Protein Secondary Structure ◽

Scattering Data ◽

Protein Secondary Structure Prediction

Protein structure prediction is one of the most important problems in bioinformatics and computational biology. The availability of a protein's three-dimensional (3D) structure is crucial for studying its biological and cellular functions. Four major sub-problems in protein structure prediction have been clearly recognized: (1) protein secondary structure prediction, (2) protein fold recognition, (3) protein quality assessment, and (4) multi-domain assembly. In recent years, deep learning has proved to be a highly effective machine learning technique and has brought revolutionary advances in computer vision, speech recognition and bioinformatics. This dissertation describes five contributions. First, DNSS2, a method for protein secondary structure prediction using a one-dimensional deep convolutional network. Second, DeepSF, a method that applies a deep convolutional network to classify a protein sequence into one of thousands of known folds. Third, CNNQA and DeepRank, two deep neural network approaches that systematically evaluate the quality of predicted protein structures and select the most accurate model as the final prediction. Fourth, MULTICOM, a protein structure prediction system empowered by deep learning and protein contact prediction. Finally, SAXSDOM, a data-assisted method for protein domain assembly using small-angle X-ray scattering data. All the methods are freely available to the scientific community as software tools or web servers.


Protein Structure Prediction: Assembly of Secondary Structure Elements by Basin-Hopping

ChemPhysChem ◽

10.1002/cphc.201402247 ◽

2014 ◽

Vol 15 (15) ◽

pp. 3378-3390 ◽

Cited By ~ 1

Author(s):

Falk Hoffmann ◽

Ioan Vancea ◽

Sanjay G. Kamat ◽

Birgit Strodel

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Basin Hopping


Hermes: an ensemble machine learning architecture for protein secondary structure prediction

10.1101/640656 ◽

2019 ◽

Author(s):

Larry Bliss ◽

Ben Pascoe ◽

Samuel K Sheppard

Keyword(s):

Machine Learning ◽

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Cross Validation ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Lower Boundary ◽

Protein Secondary Structure ◽

Homologous Proteins

Abstract. Motivation: Protein structure predictions that combine theoretical chemistry and bioinformatics are an increasingly important technique in biotechnology and biomedical research, for example in the design of novel enzymes and drugs. Here, we present a new ensemble bi-layered machine learning architecture that builds directly on ten existing pipelines, providing rapid, high-accuracy, three-state secondary structure prediction of proteins. Results: After training on 1348 solved protein structures, we evaluated the model with four independent datasets: JPRED4, compiled by the authors of the successful predictor of the same name, and CASP11, CASP12 and CASP13, assembled by the Critical Assessment of protein Structure Prediction consortium, which runs biennial experiments focused on objective testing of predictors. These rigorous, pre-established protocols included 7-fold cross-validation and blind testing. This led to a mean Hermes accuracy of 95.5%, significantly (p<0.05) better than the ten previously published models analysed in this paper. Furthermore, Hermes yielded a reduction in standard deviation, fewer lower-boundary outliers, and reduced dependency on solved structures of homologous proteins, as measured by NEFF score. This architecture provides advantages over other pipelines while remaining accessible to users at any level of bioinformatics experience. Availability and Implementation: The source code for Hermes is freely available at: https://github.com/HermesPrediction/Hermes. This page also includes the cross-validation with corresponding models, and all training/testing data presented in this study with predictions and accuracy.
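The 7-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the Hermes code: the function names `k_fold_indices` and `train_test_splits` are hypothetical, and the folds here are contiguous and unshuffled, which is an assumption made for brevity (a real protocol would typically shuffle or group homologous proteins together).

```python
def k_fold_indices(n_items, k):
    """Split range(n_items) into k contiguous folds whose sizes differ by at most one."""
    if not 1 <= k <= n_items:
        raise ValueError("k must be between 1 and n_items")
    base, extra = divmod(n_items, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds each take one additional item.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_test_splits(n_items, k):
    """Yield (train, test) index lists, holding out each fold once."""
    folds = k_fold_indices(n_items, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical usage with the training-set size quoted in the abstract.
folds = k_fold_indices(1348, 7)
fold_sizes = [len(f) for f in folds]
```

With 1348 structures and 7 folds, each structure is held out exactly once, and every model is trained on the remaining six folds.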


Literature Survey of Protein Secondary Structure Prediction

Jurnal Teknologi ◽

10.11113/jt.v34.642 ◽

2012 ◽

Author(s):

Satya Nanda Vel Arjunan ◽

Safaai Deris ◽

Rosli Md Illias

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Large Scale ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Protein Secondary Structure ◽

Fundamental Theory ◽

Protein Secondary Structure Prediction ◽

General Guide

In the wake of large-scale DNA sequencing projects, accurate tools are needed to predict protein structures. The problem of predicting protein structure from DNA sequence remains fundamentally unsolved even after more than three decades of intensive research. In this paper, the fundamental theory of protein structure is presented as a general guide to protein secondary structure prediction research. An overview of the state of the art in sequence analysis and some principles of the methods involved are described. Key words: protein secondary structure prediction; neural networks.


Prediction of Protein Secondary Structure

Jurnal Teknologi ◽

10.11113/jt.v35.605 ◽

2012 ◽

Author(s):

Satya Nanda Vel Arjunan ◽

Safaai Deris ◽

Rosli Md Illias

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Large Scale ◽

Secondary Structure Prediction ◽

State Of The Art ◽

Protein Structures ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

General Guide

In the wake of large-scale DNA sequencing projects, accurate tools are needed to predict protein structures. The problem of predicting protein structure from DNA sequence remains fundamentally unsolved even after more than three decades of intensive research. In this paper, the fundamental theory of protein structure is presented as a general guide to protein secondary structure prediction research. An overview of the state of the art in sequence analysis and some principles of the methods involved are described. Key words: protein secondary structure prediction; neural networks.


Enhancing fragment-based protein structure prediction by customising fragment cardinality according to local secondary structure

BMC Bioinformatics ◽

10.1186/s12859-020-3491-0 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Jad Abbass ◽

Jean-Christophe Nebel

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Local Secondary Structure


Reconstruction and Stability of Secondary Structure Elements in the Context of Protein Structure Prediction

Biophysical Journal ◽

10.1016/j.bpj.2009.02.057 ◽

2009 ◽

Vol 96 (11) ◽

pp. 4399-4408 ◽

Cited By ~ 5

Author(s):

Alexei A. Podtelezhnikov ◽

David L. Wild

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Protein Structure Prediction ◽

Structure Prediction


Deep Learning Mechanism Augmented with 16-Hybrid Cellular Automata for Secondary Structure Prediction

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b6458.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 490-493

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Amino Acid ◽

Secondary Structure ◽

Human Body ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Learning Mechanism ◽

Hybrid Cellular Automata ◽

Cellular Development

Proteins play various roles in the human body, including cellular development, reproduction, endurance and regulation. Based on the structure of genes, a great deal of information about the human body can be extracted, and it is easier to extract information from a structure than from a sequence. Identifying protein structure helps in drug design. The secondary structure, to some extent, indicates the effect of amino acid changes and helps explain the cause of an individual's disease. Based on the protein structure derived from DNA, a doctor can suggest medicines without side effects for a patient. We have developed a classifier, DL-16-MACA, which can predict the secondary structure of amino acid sequences of different lengths. The prediction considers three classes: helix (H), strand (E) and coil (C). For the helix class, the sensitivity and percentage accuracy are 0.923 and 90.6%, respectively. For the strand class, they are 0.852 and 85.55%, respectively. For the coil class, they are 0.789 and 77.1%, respectively. The percentage accuracy when tested with PDB datasets is 85.4%, which is substantially comparable with the existing literature.
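As a rough illustration of the metrics quoted above, the following sketch computes per-class sensitivity and overall three-state accuracy from a predicted and an observed label string. The abstract does not say how its per-class "percentage accuracy" is defined, so only sensitivity (recall) and overall accuracy are shown; the function names and the toy strings are hypothetical.

```python
def per_class_sensitivity(predicted, observed, classes="HEC"):
    """Fraction of residues of each observed class that were predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    result = {}
    for c in classes:
        total = sum(1 for o in observed if o == c)
        hits = sum(1 for p, o in zip(predicted, observed) if o == c and p == c)
        # None marks a class that never occurs in the observed labels.
        result[c] = hits / total if total else None
    return result


def overall_accuracy(predicted, observed):
    """Percentage of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy example (made-up labels, not data from the paper).
observed = "HHHHEEECCC"
predicted = "HHHEEEECCH"
sensitivity = per_class_sensitivity(predicted, observed)
accuracy = overall_accuracy(predicted, observed)
```

In this toy case, 8 of 10 residues match, and the helix class recovers 3 of its 4 observed residues.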


Adjusting Local Conformational Sampling For Fragment Assembly Protein Structure Prediction Based On Secondary Structure Complexity

10.1109/imcet53404.2021.9665455 ◽

2021 ◽

Author(s):

Jad Abbass ◽

Jean-Christophe Nebel

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Conformational Sampling ◽

Fragment Assembly ◽

Structure Complexity


Assigning Secondary Structure in Proteins using AI

10.1101/2021.02.02.429329 ◽

2021 ◽

Author(s):

Jisna Vellara Antony ◽

Prayagh Madhu ◽

Jayaraj Pottekkattuvalappil Balakrishnan

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Test Accuracy ◽

Protein Fragments ◽

Functional Understanding ◽

Prediction Systems ◽

Structure Assignment

Abstract: Knowledge about protein structure assignment enriches the structural and functional understanding of proteins. Accurate and reliable structure assignment data is crucial for secondary structure prediction systems. Since the 1980s, various methods based on hydrogen bond analysis and atomic coordinate geometry, followed by machine learning, have been employed in protein structure assignment. However, the assignment process becomes challenging when atoms are missing from protein files. We develop a multi-class classifier program named DLFSA for assigning protein Secondary Structure Elements (SSE) using Convolutional Neural Networks (CNN). A fast and efficient GPU-based parallel procedure extracts fragments from protein files. The model implemented in this work is trained with a subset of protein fragments and achieves 88.1% train accuracy and 82.5% test accuracy. Our model uses only Cα coordinates for secondary structure assignment. The model has also been successfully tested on a few full-length proteins. Results from the fragment-based studies demonstrate the feasibility of applying deep learning solutions to structure assignment problems.
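The idea of working on fragments built from Cα coordinates alone can be sketched as follows. This is a simple CPU illustration under stated assumptions, not the DLFSA procedure: the abstract gives neither the fragment length nor the input features, so the sliding window, the `length` parameter and the distance-matrix representation are all hypothetical choices.

```python
import math


def extract_fragments(ca_coords, length):
    """Return every run of `length` consecutive C-alpha coordinates (sliding window)."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # A chain shorter than `length` yields no fragments.
    return [ca_coords[i:i + length] for i in range(len(ca_coords) - length + 1)]


def distance_matrix(fragment):
    """Pairwise Euclidean distances between the C-alpha atoms of one fragment."""
    return [[math.dist(a, b) for b in fragment] for a in fragment]


# Toy chain: four atoms spaced 3.8 angstroms apart along the x axis (made-up data).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
fragments = extract_fragments(coords, 3)
first_matrix = distance_matrix(fragments[0])
```

A distance matrix is invariant to rotation and translation of the fragment, which is one reason it is a common input representation for coordinate-based classifiers.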
