IPSNN: Identification of Protein Structure based Neural Network

Author(s):  
Hongxuan Hua ◽  
Lina Yang ◽  
Pu Wei ◽  
Cheng Zhong ◽  
Xichun Li ◽  
Yuan Yan Tang

The spatial structure of a protein reflects its biological function and activity mechanism. Predicting the secondary structure of a protein is the basis for predicting its spatial structure. Traditional methods based on statistics and sequential patterns do not achieve high accuracy. This paper discusses the application of a BN-GRU neural network to protein structure prediction. The main idea is to construct a Gated Recurrent Unit (GRU) neural network, which can learn long-term dependencies and handles long sequences better than traditional methods. Batch normalization (BN) is then combined with the GRU to construct a new network. The Position Specific Scoring Matrix (PSSM) is combined with other features to build a completely new feature set. It can be shown that applying BN to the GRU improves the accuracy of the results. The idea in this paper can also be applied to similarity analysis of other sequences.
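To make the BN-plus-GRU idea concrete, here is a minimal sketch of a single GRU time step in which the input projections are batch-normalized before entering the gates. The abstract does not say where BN is placed in the network, so normalizing only the input-to-hidden terms is an assumption, as are the function names (`batch_norm`, `bn_gru_step`), the omission of biases and of BN's learnable scale and shift, and the update convention used for the new hidden state.

```python
import math

def batch_norm(rows, eps=1e-5):
    """Normalize each column of a batch (list of rows) to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / n for c, m in zip(cols, means)]
    return [[(v - m) / math.sqrt(s + eps) for v, m, s in zip(row, means, variances)]
            for row in rows]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bn_gru_step(x, h, params):
    """One GRU step for a batch.

    x: batch of input vectors, h: batch of previous hidden states.
    params: (Wz, Uz, Wr, Ur, Wh, Uh) weight matrices (hypothetical names).
    Assumption: BN is applied to the input projections only.
    """
    Wz, Uz, Wr, Ur, Wh, Uh = params
    xz, xr, xh = (batch_norm(matmul(x, W)) for W in (Wz, Wr, Wh))
    hz, hr = matmul(h, Uz), matmul(h, Ur)
    # Update gate z and reset gate r.
    z = [[sigmoid(a + b) for a, b in zip(ra, rb)] for ra, rb in zip(xz, hz)]
    r = [[sigmoid(a + b) for a, b in zip(ra, rb)] for ra, rb in zip(xr, hr)]
    # Candidate state uses the reset-gated previous hidden state.
    gated_h = [[ri * hi for ri, hi in zip(r_row, h_row)] for r_row, h_row in zip(r, h)]
    gh = matmul(gated_h, Uh)
    cand = [[math.tanh(a + b) for a, b in zip(ra, rb)] for ra, rb in zip(xh, gh)]
    # Convention used here: h_new = (1 - z) * h_old + z * candidate.
    return [[(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z_row, h_row, c_row)]
            for z_row, h_row, c_row in zip(z, h, cand)]
```

In a sequence model this step would be applied once per residue position, with `x` holding per-residue feature vectors such as PSSM rows.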


2014 ◽  
Vol 6 (17) ◽  
pp. 6721-6726 ◽  
Author(s):  
Vincent Hall ◽  
Anthony Nash ◽  
Alison Rodger

SSNN is a self-organising map neural network approach for estimating protein structure from circular dichroism (CD) spectra. The method for using SSNN is described here, and SSNN is compared with CDSSTR, a well-known methodology for finding secondary structures from CD. SSNN compares well with similar methodologies.


2021 ◽  
Vol 11 (Suppl_1) ◽  
pp. S13-S13
Author(s):  
Valery Novoseletsky ◽  
Mikhail Lozhnikov ◽  
Grigoriy Armeev ◽  
Aleksandr Kudriavtsev ◽  
Alexey Shaytan ◽  
...  

Background: Protein structure determination using an X-ray free-electron laser (XFEL) involves analyzing and merging a large number of snapshot diffraction patterns. Convolutional neural networks (CNNs) are widely used to solve numerous computer vision problems, e.g. image classification, and can be used for diffraction pattern analysis. However, the task of protein structure determination using CNNs alone has not yet been solved. Methods: We simulated the diffraction patterns using the Condor software library and obtained more than 1000 diffraction patterns for each structure with simulation parameters resembling real ones. To classify diffraction patterns, we tried two approaches that are widely known in the area of image classification: a classic VGG network and residual networks. Results: 1. Recognition of a protein class (GPCRs vs globins). Globins and GPCR-like proteins are typical α-helical proteins. Each of these protein families has a large number of representatives (including those with known structure), but we used only 8 structures from each family. 12,000 diffraction patterns were used for training and 4,000 patterns for testing. Results indicate that all considered networks are able to recognize the protein family type with high accuracy. 2. Recognition of the number of protein molecules in the liposome. We considered the use of liposomes as carriers of membrane or globular proteins for sample delivery in XFEL experiments in order to improve the X-ray beam hit rate. Three sets of diffractograms for liposomes of various radii were calculated, including diffractograms for empty liposomes, liposomes loaded with 5 bacteriorhodopsin molecules, and liposomes loaded with 10 bacteriorhodopsin molecules. The training set consisted of 23,625 diffraction patterns, and the test set of 7,875 patterns. We found that all networks used in our study were able to identify the number of protein molecules in liposomes independent of the liposome radius.
Our findings make this approach rather promising for the usage of liposomes as protein carriers in XFEL experiments. Conclusion: Thus, the performed numerical experiments show that the use of neural network algorithms for the recognition of diffraction images from single macromolecular particles makes it possible to determine changes in the structure at the angstrom scale.


2021 ◽  
Author(s):  
Yong-Chang Xu ◽  
Tian-Jun ShangGuan ◽  
Xue-Ming Ding ◽  
Ngaam J. Cheung

The amino acid sequence of a protein contains all the necessary information to specify its shape, which dictates its biological activities. However, it is challenging and expensive to experimentally determine the three-dimensional structure of proteins. The backbone torsion angles, as an important structural constraint, play a critical role in protein structure prediction, and accurately predicting the angles can considerably advance tertiary structure prediction by accelerating efficient sampling of the large conformational space for low-energy structures. On account of the rapid growth of protein databases and striking breakthroughs in deep learning algorithms, computational advances allow us to extract knowledge from large-scale data to address key biological questions. Here we propose evolutionary signatures that are computed from protein sequence profiles, and a deep neural network, termed ESIDEN, that adopts a straightforward architecture of recurrent neural networks with a small number of learnable parameters. The proposed ESIDEN is validated on three benchmark datasets: D2020, TEST2016/2018, and the CASPs datasets. On D2020, using the combination of the four novel features and basic features, ESIDEN achieves a mean absolute error (MAE) of 15.8 and 20.1 for ϕ and ψ, respectively. Compared with the best methods to date, we show that ESIDEN significantly improves the angle ψ, decreasing the MAE by more than 2 degrees on both TEST2016 and TEST2018, and achieves a comparable MAE for the angle ϕ, although it adopts a simple architecture and fewer learnable parameters. On fifty-nine template-free modeling targets in the CASPs, ESIDEN achieves high accuracy, reducing the MAEs by about 0.4 and more than 2.5 degrees on average for the torsion angles ϕ and ψ, respectively.
Using the predicted torsion angles, we infer the tertiary structures of four representative template-free modeling targets, which achieve high precision with regard to the root-mean-square deviation and TM-score when compared with the native structures. The results demonstrate that ESIDEN can make accurate predictions of the torsion angles by leveraging the evolutionary signatures rather than widely used classical features. The proposed evolutionary signatures could also be used as alternative features in predicting residue-residue distance, protein structure, and protein-ligand binding sites. Moreover, the high-precision torsion angles predicted by ESIDEN can be used to accurately infer protein tertiary structures, and ESIDEN could potentially pave the way to improved protein structure prediction.
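Because ϕ and ψ are periodic, an MAE over torsion angles is usually computed on the wrapped difference, so that a prediction of 179° against a true value of -179° counts as a 2° error and not 358°. The abstract does not state the exact definition it uses, so the following is only a sketch of that common convention; the function name `angular_mae` is hypothetical.

```python
def angular_mae(pred_deg, true_deg):
    """Mean absolute error between angles in degrees, respecting 360-degree wrap-around."""
    if len(pred_deg) != len(true_deg) or not pred_deg:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred_deg, true_deg):
        d = abs(p - t) % 360.0
        # The shorter way around the circle is the angular error.
        total += min(d, 360.0 - d)
    return total / len(pred_deg)
```

For example, `angular_mae([179.0], [-179.0])` evaluates to 2.0 under this convention.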


Author(s):  
Sheshang Degadwala ◽  
Dhairya Vyas ◽  
Harsh S Dave

In the field of bioinformatics, protein structure classification is a major task. Known proteins have been classified according to their level, features, function, amino acids, family, and superfamily. Protein structures are divided into four types: all-α, all-β, α+β, and α/β. Performing this classification with standard approaches is a difficult and tedious task. A number of modern machine intelligence techniques, such as Support Vector Machine, Random Forest, Artificial Neural Network, Decision Tree, and Naïve Bayes Classifier, have been proposed in the literature. Our present objective is to develop a system that performs better than previous predictors for protein structure classification by considering the distances among the different amino acid residues. To examine the performance of the proposed work, different datasets are used.


2016 ◽  
Author(s):  
Noah Fleming ◽  
Benjamin Kinsella ◽  
Christopher Ing

Abstract: A large number of human diseases result from disruptions to protein structure and function caused by missense mutations. Computational methods are frequently employed to assist in the prediction of protein stability upon mutation. These methods utilize a combination of protein sequence data, protein structure data, empirical energy functions, and physicochemical properties of amino acids. In this work, we present the first use of dynamic protein structural features to improve stability predictions upon mutation. This is achieved through the use of a set of timeseries extracted from microsecond-timescale atomistic molecular dynamics simulations of proteins. Standard machine learning algorithms using the mean, variance, and histograms of these timeseries were found to be 60-70% accurate in stability classification based on experimental ΔΔG or protein-chaperone interaction measurements. A recurrent neural network with full treatment of timeseries data was found to be 80% accurate according to the F1 score. The performance of our models was found to be equal to or better than that of two recently developed machine learning methods for binary classification as well as two industry-standard stability prediction algorithms. In addition to classification, understanding the molecular basis of protein stability disruption due to disease-causing mutations is a significant challenge that impedes the development of drugs and therapies that may be used to treat genetic diseases. The use of dynamic structural features allows for novel insight into the molecular basis of protein disruption by mutation in a diverse set of soluble proteins. To assist in the interpretation of machine learning results, we present a technique for determining the importance of features to a recurrent neural network using Garson’s method.
We propose a novel extension of neural interpretation diagrams by implementing Garson’s method to scale each node in the neural interpretation diagram according to its relative importance to the network.
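As an illustration of the kind of calculation Garson’s method involves, the sketch below uses a commonly cited form of the algorithm for a feed-forward network with one hidden layer and a single output: each input’s share of the absolute weight into a hidden unit is scaled by that unit’s absolute output weight, summed over hidden units, and normalized. The abstract does not describe how the authors adapt this to a recurrent network, so this is not their implementation; the function name `garson_importance` and the weight layout are assumptions.

```python
def garson_importance(w_in, w_out):
    """Relative importance of each input under a common form of Garson's algorithm.

    w_in[i][j]: weight from input i to hidden unit j.
    w_out[j]:   weight from hidden unit j to the single output.
    Returns a list of non-negative importances that sum to 1.
    """
    n_in, n_hid = len(w_in), len(w_out)
    # Total absolute incoming weight for each hidden unit.
    col_sums = [sum(abs(w_in[i][j]) for i in range(n_in)) for j in range(n_hid)]
    raw = []
    for i in range(n_in):
        score = 0.0
        for j in range(n_hid):
            if col_sums[j] > 0.0:  # skip hidden units with no incoming weight
                score += abs(w_in[i][j]) / col_sums[j] * abs(w_out[j])
        raw.append(score)
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all weights are zero; importance is undefined")
    return [s / total for s in raw]
```

Values like these could then be used to size the nodes of a neural interpretation diagram, in the spirit of the extension described above.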


1992 ◽  
Vol 03 (supp01) ◽  
pp. 227-233
Author(s):  
Joseph D. Bryngelson

Attempts to predict protein tertiary structure, through neural network or other means, generally try to optimize some potential function or other “score” over a set of structures. This paper develops a formalism that addresses the question: What are the accuracy requirements for a potential function that predicts protein structure? The results of a simple model calculation with this formalism are also presented. The paper closes with a discussion of the implications of these results for practical structure prediction.

