DeepPrime2Sec: Deep Learning for Protein Secondary Structure Prediction from the Primary Sequences

Abstract

Motivation: Here we investigate deep learning-based prediction of protein secondary structure from the protein primary sequence. We study the role of different features in this task, including one-hot vectors, biophysical features, protein sequence embedding (ProtVec), deep contextualized embedding (known as ELMo), and the Position Specific Scoring Matrix (PSSM). In addition to the role of features, we evaluate various deep learning architectures, including the following models/mechanisms and certain combinations of them: Bidirectional Long Short-Term Memory (BiLSTM), convolutional neural network (CNN), highway connections, attention mechanism, recurrent neural random fields, and gated multi-scale CNN. Our results suggest that the PSSM concatenated with one-hot vectors is the most important feature set for the task of secondary structure prediction.

Results: Using the CNN-BiLSTM network, we achieved an accuracy of 69.9%, and 70.4% using an ensemble of the top-k models, for 8-class protein secondary structure prediction on the CB513 dataset, the most challenging dataset for protein secondary structure prediction. Through error analysis on the best-performing model, we showed that misclassification is significantly more common at positions that undergo secondary structure transitions, which is most likely due to inaccurate assignments of the secondary structure at the boundary regions. Notably, when amino acids at secondary structure transitions are ignored in the evaluation, the accuracy increases to 90.3%. Furthermore, the best-performing model mostly mistook similar structures for one another, indicating that the deep learning model inferred high-level information on the secondary structure.

Availability: The developed software, called DeepPrime2Sec, and the datasets used are available at http://llp.berkeley.edu/[email protected]
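Two ideas from this abstract can be sketched in a few lines: concatenating a one-hot vector with a PSSM row for each residue, and recomputing accuracy after excluding positions at secondary structure transitions. The sketch below is illustrative only and is not the DeepPrime2Sec code. The residue ordering, the function names, and the definition of a transition position (any residue whose label differs from an immediate neighbour) are all assumptions made for this example.

```python
from typing import List, Sequence

# Assumption: the 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue: str) -> List[float]:
    """Return a 20-dim one-hot vector; unknown residues map to all zeros."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def build_features(sequence: str, pssm: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate each residue's one-hot vector with its PSSM row."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    return [one_hot(res) + list(row) for res, row in zip(sequence, pssm)]


def transition_mask(labels: str) -> List[bool]:
    """Mark positions whose label differs from the previous or next label.

    This boundary definition is an assumption for illustration.
    """
    n = len(labels)
    return [
        (i > 0 and labels[i] != labels[i - 1])
        or (i < n - 1 and labels[i] != labels[i + 1])
        for i in range(n)
    ]


def accuracy(true: str, pred: str, skip_transitions: bool = False) -> float:
    """Per-residue accuracy, optionally ignoring transition positions."""
    if len(true) != len(pred):
        raise ValueError("label strings must have the same length")
    mask = transition_mask(true) if skip_transitions else [False] * len(true)
    kept = [(t, p) for t, p, m in zip(true, pred, mask) if not m]
    if not kept:
        return 0.0
    return sum(t == p for t, p in kept) / len(kept)


# Toy labels: the single error sits on a helix-to-strand boundary.
true_labels = "HHHHEEEECC"
pred_labels = "HHHEEEEECC"
overall = accuracy(true_labels, pred_labels)
interior = accuracy(true_labels, pred_labels, skip_transitions=True)
```

In this toy case the only misclassified residue lies on a boundary, so the interior-only accuracy is higher than the overall accuracy, which mirrors the direction of the effect reported in the abstract.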

OCLSTM: Optimized convolutional and long short-term memory neural network model for protein secondary structure prediction

PLoS ONE ◽

10.1371/journal.pone.0245982 ◽

2021 ◽

Vol 16 (2) ◽

pp. e0245982

Author(s):

Yawu Zhao ◽

Yihui Liu

Keyword(s):

Neural Network ◽

Secondary Structure ◽

Structure Prediction ◽

Short Term Memory ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Short Term ◽

Protein Secondary Structure Prediction ◽

Term Memory ◽

Long Short Term Memory

Protein secondary structure prediction is extremely important for determining the spatial structure and function of proteins. In this paper, we apply an optimized convolutional neural network and a long short-term memory neural network to protein secondary structure prediction; the combined model is called OCLSTM. We use the optimized convolutional neural network to extract local features between amino acid residues. We then use a bidirectional long short-term memory neural network to extract the long-range interactions between the internal residues of the protein sequence in order to predict the protein structure. Experiments are performed on the CASP10, CASP11, CASP12, CB513, and 25PDB datasets, where the model reaches 84.68%, 82.36%, 82.91%, 84.21%, and 85.08%, respectively. Experimental results show that the model can achieve better results.
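The two-stage idea described here, a convolution that captures local context followed by a bidirectional recurrent pass that captures longer-range context, can be illustrated with a deliberately simplified sketch. It is not the OCLSTM implementation: it works on a single scalar channel, and it replaces the LSTM cell with a plain tanh recurrence so that it stays short. The function names and the weights `w_x` and `w_h` are hypothetical.

```python
import math
from typing import List, Tuple


def conv1d_same(x: List[float], kernel: List[float]) -> List[float]:
    """Slide an odd-length kernel over x with zero padding (same output length)."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        total = 0.0
        for j, w in enumerate(kernel):
            pos = i + j - half
            if 0 <= pos < len(x):  # positions outside the sequence count as zero
                total += w * x[pos]
        out.append(total)
    return out


def recurrent_scan(x: List[float], w_x: float, w_h: float) -> List[float]:
    """Simple tanh recurrence; a stand-in for an LSTM cell, without gates."""
    h, states = 0.0, []
    for value in x:
        h = math.tanh(w_x * value + w_h * h)
        states.append(h)
    return states


def bidirectional_scan(
    x: List[float], w_x: float = 0.5, w_h: float = 0.5
) -> List[Tuple[float, float]]:
    """Pair each position's left-to-right state with its right-to-left state."""
    forward = recurrent_scan(x, w_x, w_h)
    backward = recurrent_scan(x[::-1], w_x, w_h)[::-1]
    return list(zip(forward, backward))


signal = [1.0, 2.0, 3.0, 4.0]                 # one toy feature channel
local = conv1d_same(signal, [1.0, 1.0, 1.0])  # local window sums
context = bidirectional_scan(local)           # (forward, backward) state per residue
```

Each element of `context` combines information from residues on both sides of a position, which is the property a bidirectional layer contributes on top of the purely local convolution.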

Review for "DNSS2 : Improved ab initio protein secondary structure prediction using advanced deep learning architectures"

10.1002/prot.26007/v2/review2 ◽

2020 ◽

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Ab Initio ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Learning Architectures

Decision letter for "DNSS2 : Improved ab initio protein secondary structure prediction using advanced deep learning architectures"

10.1002/prot.26007/v2/decision1 ◽

2020 ◽

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Ab Initio ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Learning Architectures

OneHotEncoding and LSTM based Deep Learning Models for Protein Secondary Structure Prediction

10.22541/au.159739202.25582842 ◽

2020 ◽

Author(s):

Vamsidhar Enireddy ◽

C Karthikeyan

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Learning Models ◽

Protein Secondary Structure Prediction

Protein Secondary Structure Prediction Based on Deep Learning

DEStech Transactions on Engineering and Technology Research ◽

10.12783/dtetr/ismii2017/16664 ◽

2017 ◽

Author(s):

Lin Zheng ◽

Hong-ling Li ◽

Nan Wu ◽

Li Ao

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction

DeepACLSTM: deep asymmetric convolutional long short-term memory neural models for protein secondary structure prediction

BMC Bioinformatics ◽

10.1186/s12859-019-2940-0 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 16

Author(s):

Yanbu Guo ◽

Weihua Li ◽

Bingyi Wang ◽

Huiqing Liu ◽

Dongming Zhou

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

Short Term Memory ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Neural Models ◽

Short Term ◽

Protein Secondary Structure Prediction ◽

Term Memory ◽

Long Short Term Memory

An enhanced protein secondary structure prediction using deep learning framework on hybrid profile based features

Applied Soft Computing ◽

10.1016/j.asoc.2019.105926 ◽

2020 ◽

Vol 86 ◽

pp. 105926 ◽

Cited By ~ 1

Author(s):

Prince Kumar ◽

Sanjay Bankapur ◽

Nagamma Patil

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Learning Framework

DNSS2 : Improved ab initio protein secondary structure prediction using advanced deep learning architectures

Proteins Structure Function and Bioinformatics ◽

10.1002/prot.26007 ◽

2020 ◽

Author(s):

Zhiye Guo ◽

Jie Hou ◽

Jianlin Cheng

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Ab Initio ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Protein Secondary Structure Prediction ◽

Learning Architectures

Improving protein structure prediction by deep learning and computational optimization

10.32469/10355/76251 ◽

2019 ◽

Author(s):

Jie Hou

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Secondary Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Protein Secondary Structure ◽

Scattering Data ◽

Protein Secondary Structure Prediction

Protein structure prediction is one of the most important scientific problems in the field of bioinformatics and computational biology. The availability of protein three-dimensional (3D) structure is crucial for studying the biological and cellular functions of proteins. The importance of four major sub-problems in protein structure prediction has been clearly recognized. These are, first, protein secondary structure prediction; second, protein fold recognition; third, protein quality assessment; and fourth, multi-domain assembly. In recent years, deep learning techniques have proved to be a highly effective machine learning method, bringing revolutionary advances in computer vision, speech recognition, and bioinformatics. In this dissertation, five contributions are described. First, DNSS2, a method for protein secondary structure prediction using a one-dimensional deep convolutional network. Second, DeepSF, a method that applies a deep convolutional network to classify a protein sequence into one of thousands of known folds. Third, CNNQA and DeepRank, two deep neural network approaches that systematically evaluate the quality of predicted protein structures and select the most accurate model as the final protein structure prediction. Fourth, MULTICOM, a protein structure prediction system empowered by deep learning and protein contact prediction. Finally, SAXSDOM, a data-assisted method for protein domain assembly using small-angle X-ray scattering data. All the methods are available as software tools or web servers that are freely available to the scientific community.