Protein Homology Analysis for Function Prediction with Parallel Sub-Graph Isomorphism

2013 ◽  
pp. 386-399 ◽  
Author(s):  
Alper Küçükural ◽  
Andras Szilagyi ◽  
O. Ugur Sezerman ◽  
Yang Zhang

To annotate the biological function of a protein molecule, it is essential to have information on its 3D structure. Many successful methods for function prediction are based on determining structurally conserved regions, because functional residues have been shown to be more conserved than others during protein evolution. Since the 3D conformation of a protein can be represented by a contact map graph, graph matching algorithms are often employed to identify the conserved residues in weakly homologous protein pairs. However, general graph matching is computationally expensive because graph similarity searching is essentially an NP-hard problem. Parallel implementations of graph matching are often exploited to speed up the process. In this chapter, the authors review theoretical and computational approaches from graph theory and the recently developed graph matching algorithms for protein function prediction.
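
As an illustration of the contact map graph and of why exhaustive matching becomes expensive, a minimal Python sketch follows. It is not the chapter's method: the 8 Å distance cutoff, the toy coordinates, and the function names `contact_map_graph` and `find_embedding` are assumptions made for this example, and the matcher is a naive brute-force search rather than any of the algorithms reviewed.

```python
import math
from itertools import combinations, permutations


def contact_map_graph(coords, cutoff=8.0):
    """Build an undirected graph as a dict of adjacency sets.

    Residues i and j are linked when the distance between their
    coordinates is below `cutoff` (8.0 is an assumed, illustrative value).
    """
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) < cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def find_embedding(pattern, target):
    """Brute-force search for a mapping of pattern nodes onto target nodes
    such that every pattern edge is also a target edge.

    All ordered selections of target nodes are tried, so the running time
    grows factorially with graph size; this is only usable on toy inputs.
    """
    p_nodes = sorted(pattern)
    for image in permutations(sorted(target), len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if all(mapping[v] in target[mapping[u]]
               for u in p_nodes for v in pattern[u]):
            return mapping
    return None


# Hypothetical coordinates (in angstroms) for four residues on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_graph = contact_map_graph(toy_coords)

# A triangle pattern: three mutually contacting residues.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
triangle_match = find_embedding(triangle, toy_graph)
```

The factorial number of candidate mappings in `find_embedding` is the cost that motivates heuristic and parallel approaches; each candidate mapping can be checked independently, which is one reason the problem lends itself to parallelization.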


Author(s):  
Amelia Villegas-Morcillo ◽  
Stavros Makrodimitris ◽  
Roeland C H J van Ham ◽  
Angel M Gomez ◽  
Victoria Sanchez ◽  
...  

Abstract. Motivation: Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data which are not available for this task. However, a very large amount of protein sequences without functional labels is available. Results: We applied an existing deep sequence model that had been pretrained in an unsupervised setting on the supervised task of protein molecular function prediction. We found that this complex feature representation is effective for this task, outperforming hand-crafted features such as one-hot encoding of amino acids, k-mer counts, secondary structure and backbone angles. Also, it partly negates the need for complex prediction models, as a two-layer perceptron was enough to achieve competitive performance in the third Critical Assessment of Functional Annotation benchmark. We also show that combining this sequence representation with protein 3D structure information does not lead to performance improvement, hinting that 3D structure is also potentially learned during the unsupervised pretraining. Availability and implementation: Implementations of all used models can be found at https://github.com/stamakro/GCN-for-Structure-and-Function. Supplementary information: Supplementary data are available at Bioinformatics online.
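
For readers unfamiliar with the hand-crafted baselines named in the abstract, the following Python sketch shows what one-hot encoding of amino acids and k-mer counting typically look like. It is an illustration only and is not taken from the authors' repository: the function names, the 20-letter alphabet ordering, the choice to encode non-standard residues as all-zero rows, and the toy sequence are assumptions.

```python
from collections import Counter

# The 20 standard amino acids; the ordering here is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode each residue as a 20-dimensional indicator vector.

    Residues outside the standard alphabet get an all-zero row
    (an assumption made for this sketch).
    """
    rows = []
    for aa in sequence:
        row = [0] * len(AMINO_ACIDS)
        idx = AA_INDEX.get(aa)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    return rows


def kmer_counts(sequence, k=3):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


toy_sequence = "MKVLAAKV"  # hypothetical sequence, not a real protein
toy_one_hot = one_hot(toy_sequence)
toy_dimers = kmer_counts(toy_sequence, k=2)
```

Both representations are fixed by hand and ignore sequence context, which is the contrast the abstract draws with representations learned during unsupervised pretraining.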


2020 ◽  
Author(s):  
Amelia Villegas-Morcillo ◽  
Stavros Makrodimitris ◽  
Roeland C.H.J. van Ham ◽  
Angel M. Gomez ◽  
Victoria Sanchez ◽  
...  

Abstract. Motivation: Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data which are not available for this task. However, a very large amount of protein sequences without functional labels is available. Results: We applied an existing deep sequence model that had been pre-trained in an unsupervised setting on the supervised task of protein function prediction. We found that this complex feature representation is effective for this task, outperforming hand-crafted features such as one-hot encoding of amino acids, k-mer counts, secondary structure and backbone angles. Also, it partly negates the need for deep prediction models, as a two-layer perceptron was enough to achieve state-of-the-art performance in the third Critical Assessment of Functional Annotation benchmark. We also show that combining this sequence representation with protein 3D structure information does not lead to performance improvement, hinting that three-dimensional structure is also potentially learned during the unsupervised pre-training. Availability: Implementations of all used models can be found at https://github.com/stamakro/GCN-for-Structure-and-Function. Contact: ameliavm@ugr.es. Supplementary information: Supplementary data are available online.


Molecules ◽  
2017 ◽  
Vol 22 (10) ◽  
pp. 1732 ◽  
Author(s):  
Renzhi Cao ◽  
Colton Freitas ◽  
Leong Chan ◽  
Miao Sun ◽  
Haiqing Jiang ◽  
...  

2008 ◽  
Vol 9 (1) ◽  
pp. 350 ◽  
Author(s):  
Xiaoyu Jiang ◽  
Naoki Nariai ◽  
Martin Steffen ◽  
Simon Kasif ◽  
Eric D Kolaczyk
