THE SEQUENCE ATTRIBUTES METHOD FOR DETERMINING CORRELATIONS BETWEEN AMINO ACID SEQUENCE AND PROTEIN SECONDARY STRUCTURE

Author(s):  
Gregory E. Arnold ◽  
A. Keith Dunker ◽  
Susan J. Johns ◽  
Richard J. Douthart
2006 ◽  
Vol 12 (1) ◽  
pp. 82-85
Author(s):  
Miodrag Zivkovic ◽  
Sasa Malkov ◽  
Snezana Zaric ◽  
Milena Vujosevic-Janicic ◽  
Jelena Tomasevic ◽  
...  

The statistical dependence of protein secondary structure on amino acid bigram frequencies was studied. Proteins in the PDBSELECT subset of the Protein Data Bank database were investigated. Protein secondary structures were determined using DSSP software. The conditional probabilities of protein secondary structures were calculated and presented. The results on bigrams show the frequencies of all the possible bigrams in all secondary structure types. These results elucidate some factors important for the prediction of the secondary structures of proteins based on the amino acid sequence.
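The conditional probabilities described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each protein is given as two equal-length strings (amino acid sequence and DSSP-style structure labels), and it assumes that a bigram is assigned the structure label of its first residue, a convention the abstract does not specify. The function name `conditional_structure_probs` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_structure_probs(pairs):
    """Estimate P(structure label | amino acid bigram).

    pairs: iterable of (sequence, structure) strings of equal length.
    Assumption: a bigram takes the label of its first residue.
    Returns {bigram: {label: probability}}.
    """
    counts = defaultdict(Counter)
    for seq, ss in pairs:
        if len(seq) != len(ss):
            raise ValueError("sequence and structure must have equal length")
        # Slide a window of width 2 along the sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]][ss[i]] += 1

    probs = {}
    for bigram, label_counts in counts.items():
        total = sum(label_counts.values())
        probs[bigram] = {label: n / total for label, n in label_counts.items()}
    return probs


# Toy input, invented for illustration only (not PDBSELECT data).
toy = [("AAKAA", "HHHEE"), ("AAG", "HEC")]
probs = conditional_structure_probs(toy)
print(probs["AA"])
```

On real data the pairs would come from DSSP output for each chain, and rare bigrams would likely need smoothing before the probabilities are used for prediction.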


In bioinformatics, predicting the secondary structure of a protein from its primary amino acid sequence is very difficult, and the problem has a large impact on science and medicine. The hardest part is learning the most effective and correct protein features to improve prediction. Here, we apply a deep learning model to improve structure prediction. The core contribution of this paper is a group of recurrent neural networks (RNNs) that can capture high-level relational features from a pair of input and target protein sequences. The paper compares the different types of recurrent units used in RNNs, with emphasis on the more advanced units that incorporate a gating mechanism: the long short-term memory (LSTM) unit and the more recently introduced gated recurrent unit (GRU). These recurrent units were evaluated on the task of predicting protein secondary structure from an amino acid sequence. The dataset was taken from a publicly available database server (RCSB), and the study shows that the LSTM unit performs better than the GRU on long protein sequences.
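To make the difference between the two gated units concrete, the following is a minimal sketch of one time step of each, written for a single scalar hidden unit. It uses the standard textbook update equations, not the model from the paper. The parameter names (`wz`, `uz`, `bz`, and so on) are hypothetical, and the GRU follows the convention in which the update gate `z` weights the candidate state; some libraries use the opposite convention.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One GRU step for a scalar input x and scalar hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Blend the old state and the candidate state.
    return (1.0 - z) * h + z * h_cand


def lstm_step(x, h, c, p):
    """One LSTM step; c is the separate cell state that the GRU lacks."""
    i = sigmoid(p["wi"] * x + p["ui"] * h + p["bi"])  # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h + p["bf"])  # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h + p["bo"])  # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h + p["bg"])  # candidate cell value
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


# All-zero parameters (illustrative only): every gate evaluates to 0.5.
zero_gru = dict.fromkeys(
    ["wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh"], 0.0)
zero_lstm = dict.fromkeys(
    ["wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg"], 0.0)

gru_h = gru_step(1.0, 1.0, zero_gru)
lstm_h, lstm_c = lstm_step(1.0, 0.0, 2.0, zero_lstm)
```

The LSTM carries two states (`h` and `c`) through three gates, while the GRU carries one state through two gates. A real secondary structure predictor would use vector-valued states, learned weights, and an encoded amino acid at each position as `x`.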

