Combining pairwise sequence similarity and support vector machines for remote protein homology detection

Author(s):  
Li Liao ◽  
William Stafford Noble
2007 ◽  
Vol 1 ◽  
pp. BBI.S315 ◽  
Author(s):  
Zhi Qun Tang ◽  
Hong Huang Lin ◽  
Hai Lei Zhang ◽  
Lian Yi Han ◽  
Xin Chen ◽  
...  

Various computational methods have been used to predict protein and peptide function from sequence. A particular challenge is deriving functional properties from sequences that show low or no homology to proteins of known function. Recently, a machine learning method, the support vector machine (SVM), has been explored for predicting the functional class of proteins and peptides from properties derived from the amino acid sequence, independent of sequence similarity. It has shown promising potential across a wide spectrum of protein and peptide classes, including some low- and non-homologous proteins. This method can thus be explored as a potential tool to complement alignment-based, clustering-based, and structure-based methods for predicting protein function. This article reviews the strategies, current progress, and underlying difficulties in using SVMs to predict the functional class of proteins. The relevant software and web servers are described, and the reported prediction performance of these methods is presented.
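
The abstract does not say which sequence-derived properties are used. As a minimal sketch, assume the simplest alignment-free descriptor, amino acid composition: every sequence, whatever its length, is mapped to a fixed 20-dimensional vector that a classifier such as an SVM could take as input. The function name `composition` and the example sequence are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(sequence):
    """Return the fraction of each standard residue in the sequence.

    Illustrative descriptor only (an assumption): the reviewed methods
    may use richer physicochemical properties.
    """
    counts = Counter(sequence.upper())
    # Non-standard symbols (e.g. X) are ignored in the denominator.
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]

# Hypothetical example sequence, not taken from any dataset.
features = composition("MKVLAAGIVGL")
```

Because the vector length does not depend on the sequence length or on any alignment, sequences with no detectable homology can still be compared in the same feature space.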


Author(s):  
NAZAR M. ZAKI ◽  
SAFAAI DERIS ◽  
ROSLI M. ILLIAS

A few years ago, Jaakkola and Haussler published a method that combines generative and discriminative approaches for detecting protein homologies. The method is a variant of support vector machines that uses a new kernel function called the Fisher kernel. They begin by training a generative hidden Markov model for a protein family. Using the model, they then derive a vector of features, called Fisher scores, for each sequence, and use a support vector machine with the Fisher scores to detect protein homologies. In this paper, we revisit the idea of using a discriminative approach, in particular support vector machines, for protein homology detection. In place of the Fisher scoring method, however, we present a new Hidden Markov Model Combining Scores approach, in which six scoring algorithms are combined to extract features from a protein sequence. Experiments show that our method improves on previous methods for homology detection of protein domains.
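
To make the Fisher score idea concrete, the sketch below uses a deliberately reduced generative model: a single state with fixed emission probabilities, rather than a trained profile HMM. Under that assumption the expected counts are just the observed residue counts, and each score component takes the commonly cited form "count divided by emission probability, minus sequence length". The function name, the toy three-letter alphabet, and the probabilities are all hypothetical; this is not the authors' combined-scores method.

```python
from collections import Counter

def fisher_scores(sequence, emission_probs):
    """Fisher score vector for a one-state generative model.

    emission_probs maps each residue to its (hypothetical) emission
    probability. The component for residue a is n_a / theta_a - L,
    where n_a is the count of a and L is the sequence length.
    Assumes every residue in the sequence appears in emission_probs.
    """
    counts = Counter(sequence)
    length = len(sequence)
    # Sort by residue so the feature order is the same for every sequence.
    return [counts[a] / p - length for a, p in sorted(emission_probs.items())]

# Toy alphabet and probabilities, chosen only for illustration.
theta = {"A": 0.5, "C": 0.25, "G": 0.25}
typical = fisher_scores("AACG", theta)   # composition matches the model
atypical = fisher_scores("AAAA", theta)  # over-represents A
```

A sequence whose composition matches the model yields a score vector of zeros, while a sequence that departs from it yields nonzero components; these fixed-length vectors are what a support vector machine would then be trained on.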

