Evolutionary Trace Annotation Server: automated enzyme function prediction in protein structures using 3D templates

2009 ◽  
Vol 25 (11) ◽  
pp. 1426-1427 ◽  
Author(s):  
R. Matthew Ward ◽  
E. Venner ◽  
B. Daines ◽  
S. Murray ◽  
S. Erdin ◽  
...  
2017 ◽  
Author(s):  
Evangelia I. Zacharaki

Background. The availability of large databases containing high-resolution three-dimensional (3D) models of proteins, together with functional annotation, allows advanced supervised machine learning techniques to be used for automatic protein function prediction. Methods. In this work, novel shape features are extracted that represent protein structure as local (per amino acid) distributions of angles and of amino acid distances. Each of the multi-channel feature maps is fed into a deep convolutional neural network (CNN) for function prediction, and the outputs are fused through Support Vector Machines (SVM) or a correlation-based k-nearest neighbor classifier. Two architectures are investigated, employing either one CNN per multi-channel feature set or one CNN per image channel. Results. Cross-validation experiments on enzymes (n = 44,661) from the PDB database achieved 90.1% correct classification, demonstrating the effectiveness of the proposed method for automatic function annotation of protein structures. Discussion. The automatic prediction of protein function can provide quick annotations on extensive datasets, opening the path for relevant applications such as pharmacological target identification.
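
To make the idea of a local (per amino acid) distance distribution concrete, the following is a minimal sketch and not the authors' feature extractor. It assumes each residue is reduced to a single 3D point (for example a C-alpha position) and that the feature for a residue is a histogram of its distances to all other residues. The function names, the toy coordinates, and the bin edges are all illustrative assumptions.

```python
import math

def pairwise_distances(coords):
    """Return the symmetric matrix of Euclidean distances between points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def distance_histogram(distances, bin_edges):
    """Count distances falling into half-open bins [edge_k, edge_{k+1})."""
    counts = [0] * (len(bin_edges) - 1)
    for d in distances:
        for k in range(len(counts)):
            if bin_edges[k] <= d < bin_edges[k + 1]:
                counts[k] += 1
                break
    return counts

def per_residue_features(coords, bin_edges):
    """One histogram per residue, built from its distances to every other residue."""
    dist = pairwise_distances(coords)
    features = []
    for i, row in enumerate(dist):
        others = row[:i] + row[i + 1:]  # drop the zero self-distance
        features.append(distance_histogram(others, bin_edges))
    return features

# Hypothetical toy chain of four residues (coordinates in angstroms).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]
bin_edges = [0.0, 4.0, 8.0, 12.0]  # assumed bins, chosen only for illustration
features = per_residue_features(coords, bin_edges)
```

Stacking such per-residue histograms row by row gives a 2D array that could be treated as one image-like channel, which is the kind of input a CNN expects.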


2020 ◽  
Vol 118 (3) ◽  
pp. 533a
Author(s):  
Safyan Aman Memon ◽  
Kinaan Aamir Khan ◽  
Hammad Naveed

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Vladimir Gligorijević ◽  
P. Douglas Renfrew ◽  
Tomasz Kosciolek ◽  
Julia Koehler Leman ◽  
Daniel Berenberg ◽  
...  

The rapid increase in the number of proteins in sequence databases and the diversity of their functions challenge computational approaches for automated function prediction. Here, we introduce DeepFRI, a Graph Convolutional Network for predicting protein functions by leveraging sequence features extracted from a protein language model and protein structures. It outperforms current leading methods and sequence-based Convolutional Neural Networks and scales to the size of current sequence repositories. Augmenting the training set of experimental structures with homology models allows us to significantly expand the number of predictable functions. DeepFRI has significant de-noising capability, with only a minor drop in performance when experimental structures are replaced by protein models. Class activation mapping allows function predictions at an unprecedented resolution, allowing site-specific annotations at the residue-level in an automated manner. We show the utility and high performance of our method by annotating structures from the PDB and SWISS-MODEL, making several new confident function predictions. DeepFRI is available as a webserver at https://beta.deepfri.flatironinstitute.org/.
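
As a rough illustration of what a graph convolution over protein residues does, here is a minimal sketch of one generic layer. It is not DeepFRI's implementation. It assumes the structure is given as a residue contact map with self-loops and that each residue carries a small feature vector. The row normalisation, the toy adjacency, the features, and the weights are all assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def graph_conv(adjacency, features, weights):
    """One graph convolution: ReLU(row-normalised adjacency x features x weights)."""
    # Row normalisation averages each residue's features over its neighbours
    # (self-loops mean a residue also counts as its own neighbour).
    normalised = [[v / sum(row) for v in row] for row in adjacency]
    mixed = matmul(matmul(normalised, features), weights)
    return [[max(0.0, v) for v in row] for row in mixed]

# Hypothetical three-residue chain: residue 1 contacts residues 0 and 2.
adjacency = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
features = [[2.0, 0.0], [0.0, 2.0], [4.0, 0.0]]  # assumed per-residue inputs
weights = [[1.0, -1.0], [0.0, 1.0]]              # assumed, untrained weights
hidden = graph_conv(adjacency, features, weights)
```

In a trained network the weights are learned, several such layers are stacked, and the per-residue outputs are pooled into a single vector before the function classifier.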


Author(s):  
Brian Y. Chen ◽  
Drew H. Bryant ◽  
Amanda E. Cruess ◽  
Joseph H. Bylund ◽  
Viacheslav Y. Fofanov ◽  
...  

2019 ◽  
Author(s):  
Vladimir Gligorijevic ◽  
P. Douglas Renfrew ◽  
Tomasz Kosciolek ◽  
Julia Koehler Leman ◽  
Daniel Berenberg ◽  
...  

The large number of available sequences and the diversity of protein functions challenge current experimental and computational approaches to determining and predicting protein function. We present a deep learning Graph Convolutional Network (GCN) for predicting protein functions and concurrently identifying functionally important residues. This model is initially trained using experimentally determined structures from the Protein Data Bank (PDB) but has significant de-noising capability, with only a minor drop in performance observed when structure predictions are used. We take advantage of this denoising property to train the model on > 200,000 protein structures, including many homology-predicted structures, greatly expanding the reach and applications of the method. Our model learns general structure-function relationships by robustly predicting functions of proteins with ≤ 40% sequence identity to the training set. We show that our GCN architecture predicts functions more accurately than Convolutional Neural Networks trained on sequence data alone and previous competing methods. Using class activation mapping, we automatically identify structural regions at the residue-level that lead to each function prediction for every confidently predicted protein, advancing site-specific function prediction. We use our method to annotate PDB and SWISS-MODEL proteins, making several new confident function predictions spanning both fold and function classifications.
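
The residue-level attribution described above can be sketched with the plain class activation mapping formula: each residue's score for a class is the weighted sum of its feature channels, rectified and rescaled. This is a simplified stand-in and not necessarily the variant the authors use. The feature values and class weights below are invented for illustration.

```python
def class_activation_map(residue_features, class_weights):
    """Score each residue for one class and rescale scores to the range [0, 1]."""
    raw = [
        sum(w * f for w, f in zip(class_weights, feats))
        for feats in residue_features
    ]
    rectified = [max(0.0, r) for r in raw]  # keep only positive evidence
    peak = max(rectified)
    if peak == 0.0:
        return rectified  # no residue supports this class
    return [r / peak for r in rectified]

# Hypothetical final-layer features for three residues and two channels.
residue_features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
class_weights = [2.0, -1.0]  # assumed classifier weights for one function
saliency = class_activation_map(residue_features, class_weights)
```

Residues with scores near 1 are the ones that contribute most to the prediction for that function under this simplified scheme.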


2014 ◽  
Vol 39 (8) ◽  
pp. 363-371 ◽  
Author(s):  
Matthew P. Jacobson ◽  
Chakrapani Kalyanaraman ◽  
Suwen Zhao ◽  
Boxue Tian

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3095 ◽  
Author(s):  
Shervine Amidi ◽  
Afshine Amidi ◽  
Dimitrios Vlachakis ◽  
Nikos Paragios ◽  
Evangelia I. Zacharaki

The number of protein structures in the PDB database has increased more than 15-fold since 1999. The creation of computational models predicting enzymatic function is of major importance, since such models provide the means to better understand the behavior of newly discovered enzymes when catalyzing chemical reactions. Until now, single-label classification has been widely performed for predicting enzymatic function, limiting the application to enzymes performing unique reactions and introducing errors when multi-functional enzymes are examined. Indeed, some enzymes may perform different reactions and can hence be directly associated with multiple enzymatic functions. In the present work, we propose a multi-label enzymatic function classification scheme that combines structural and amino acid sequence information. We investigate two fusion approaches (at the feature level and at the decision level) and assess the methodology for general enzymatic function prediction, indicated by the first digit of the enzyme commission (EC) code (six main classes), on 40,034 enzymes from the PDB database. The proposed single-label and multi-label models correctly predict the actual functional activities in 97.8% and 95.5% (based on Hamming loss) of the cases, respectively. In addition, the multi-label model predicts all possible enzymatic reactions in 85.4% of the multi-labeled enzymes when the number of reactions is unknown. Code and datasets are available at https://figshare.com/s/a63e0bafa9b71fc7cbd7.
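
The two kinds of multi-label score mentioned above can be illustrated with a short sketch. It assumes each enzyme is encoded as a six-element 0/1 vector, one slot per main EC class. Hamming loss is the fraction of individual label slots predicted wrongly, and the exact-match ratio is the fraction of enzymes whose whole label set is predicted correctly. The labels and predictions are made up, and this is not the authors' evaluation code.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that differ between truth and prediction."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total

def exact_match_ratio(y_true, y_pred):
    """Fraction of samples whose complete label vector is predicted correctly."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

# Hypothetical labels for four enzymes over the six main EC classes.
y_true = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],  # a multi-functional enzyme
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0],
]
y_pred = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # misses one of the two functions
    [0, 0, 0, 0, 1, 1],  # adds a spurious function
    [0, 0, 0, 1, 0, 1],  # adds a spurious function
]
loss = hamming_loss(y_true, y_pred)
hamming_accuracy = 1.0 - loss
exact = exact_match_ratio(y_true, y_pred)
```

The gap between the two numbers in this toy case shows why a Hamming-based accuracy is typically higher than the rate at which every reaction of an enzyme is predicted correctly.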

