A Feature Generation Algorithm with Applications to Biological Sequence Classification

Author(s):  
Rezarta Islamaj Dogan ◽  
Lise Getoor ◽  
W John Wilbur
2007 ◽  
Vol 17 (05) ◽  
pp. 369-381 ◽  
Author(s):  
BRITTA MERSCH ◽  
TOBIAS GLASMACHERS ◽  
PETER MEINICKE ◽  
CHRISTIAN IGEL

Oligo kernels for biological sequence classification have high discriminative power. A new parameterization of the K-mer oligo kernel is presented in which every oligomer of length K is weighted individually. Choosing these parameters in a task-specific way increases classification performance and reveals information about discriminative features. To adapt the multiple kernel parameters based on cross-validation, the covariance matrix adaptation evolution strategy (CMA-ES) is proposed. It is applied to optimize trimer oligo kernels for the detection of bacterial gene starts. The resulting kernels lead to higher classification rates, and the adapted parameters reveal the importance of particular triplets for classification, for example those occurring in the Shine-Dalgarno sequence.
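
To make the idea of individually weighted oligomers concrete, here is a minimal sketch of a position-aware trimer kernel with one parameter per oligomer. It is an illustration under stated assumptions, not the authors' exact formulation: the abstract does not give the kernel's formula, so the Gaussian position smoothing, the multiplicative per-oligomer weight, and the names `oligo_positions`, `weighted_oligo_kernel`, `weights`, and `sigma` are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def oligo_positions(seq, k=3):
    """Map each K-mer in `seq` to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def weighted_oligo_kernel(s, t, weights=None, sigma=1.0, k=3):
    """Sketch of a K-mer kernel with one weight per oligomer.

    Assumed form (not taken from the paper):
        k(s, t) = sum_w weight[w] * sum_{p in S_w} sum_{q in T_w}
                  exp(-(p - q)^2 / (4 * sigma^2))
    where S_w and T_w are the positions of oligomer w in s and t.
    Oligomers missing from `weights` default to weight 1.0.
    """
    weights = weights or {}
    pos_s = oligo_positions(s, k)
    pos_t = oligo_positions(t, k)
    total = 0.0
    for oligo, ps in pos_s.items():
        qs = pos_t.get(oligo)
        if not qs:
            continue  # oligomer does not occur in t, contributes nothing
        w = weights.get(oligo, 1.0)
        for p in ps:
            for q in qs:
                total += w * math.exp(-((p - q) ** 2) / (4.0 * sigma ** 2))
    return total


if __name__ == "__main__":
    # Hypothetical weighting that emphasizes AGG, a triplet found in the
    # Shine-Dalgarno consensus AGGAGG.
    print(weighted_oligo_kernel("TTAGGAGGT", "CAAGGAGGA", weights={"AGG": 2.0}))
```

In the setting the abstract describes, the per-oligomer parameters would be tuned by an evolution strategy against a cross-validation score; here they are simply passed in as a dictionary so that the effect of up-weighting a single triplet is easy to inspect.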

