Background:
Prediction of protein subcellular location is a meaningful task that has attracted
much attention in recent years. In particular, the number of new protein sequences yielded by high-throughput
sequencing technology in the post-genomic era has increased explosively.
Objective:
Protein subcellular localization prediction based solely on sequence data remains a
challenging problem in computational biology.
Methods:
In this paper, three sets of evolutionary features are derived from the position-specific scoring
matrix, which has shown great potential in other bioinformatics problems. A fusion model is built
from the optimal combination of parameters. Finally, principal component analysis and a support vector
machine classifier are applied to predict protein subcellular localization on the NNPSL and
Cell-PLoc 2.0 datasets.
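As a rough illustration of how a fixed-length feature vector can be derived from a position-specific scoring matrix, consider the minimal sketch below. It is not the feature extraction used in this paper: the abstract does not name the three feature sets, so the logistic scaling, the column averaging, and the function name `pssm_column_means` are all assumptions chosen for illustration. The sketch only shows the general idea of collapsing an L x 20 matrix, whose row count depends on sequence length, into a vector of constant size that a classifier can consume.

```python
import math


def pssm_column_means(pssm):
    """Collapse an L x N PSSM into a fixed-length N-dimensional vector.

    Hypothetical helper for illustration only; not one of the paper's
    three feature sets. `pssm` is a list of rows, one per residue.
    """
    if not pssm:
        raise ValueError("PSSM must contain at least one row")
    n_cols = len(pssm[0])
    if any(len(row) != n_cols for row in pssm):
        raise ValueError("all PSSM rows must have the same number of columns")

    # Assumption: squash raw scores into (0, 1) with a logistic function.
    scaled = [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in pssm]

    # Average each column over all residues, so the output length
    # no longer depends on the sequence length L.
    length = len(scaled)
    return [sum(row[j] for row in scaled) / length for j in range(n_cols)]


# Toy 2 x 4 matrix; a real PSSM has 20 columns, one per amino acid.
toy_pssm = [
    [0, 2, -1, 0],
    [0, 2, 1, 0],
]
features = pssm_column_means(toy_pssm)
```

Vectors of this kind, one per protein, could then be concatenated with other feature sets and passed to dimensionality reduction and classification.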
Results:
Our experimental results show that the proposed method markedly improves prediction
accuracy, and that features derived from the PSI-BLAST profile alone are well suited to protein subcellular
localization prediction.