Feature selection for human membrane protein type classification using filter methods
<span lang="EN-US">As the number of protein sequences in databases grows, effective and efficient techniques are needed to make these data meaningful. The features extracted from these sequences include redundant and irrelevant ones that lower classification accuracy and increase the running time of computational algorithms. In this paper, we select the best features using the Minimum Redundancy Maximum Relevance (mRMR) and Correlation-based Feature Selection (CFS) methods. Two datasets of human membrane proteins, S1 and S2, are used. After features have been selected by mRMR and CFS, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers are used to classify the membrane proteins. Performance is measured using accuracy, specificity, sensitivity, and F-measure. The proposed approach achieves 76% accuracy on S1 and 73% accuracy on S2. Our proposed methods present competitive results when compared with previous work on membrane protein classification</span><span>.</span>
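The four evaluation measures named above can be illustrated with a minimal sketch. It assumes each membrane protein type is scored one-vs-rest against all other types, which the abstract does not specify. The function name `one_vs_rest_metrics` and the class labels "A", "B", and "C" are hypothetical, and the toy labels are not data from S1 or S2.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity, specificity and F-measure for one class.

    Assumption: the class `positive` is scored against all other classes
    (one-vs-rest); the source does not state how multi-class scores are formed.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # correctly predicted as the positive class
        elif pred == positive:
            fp += 1          # predicted positive, actually another class
        elif truth == positive:
            fn += 1          # missed a positive example
        else:
            tn += 1          # correctly predicted as not positive

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on empty denominators.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy labels for three hypothetical membrane protein types.
y_true = ["A", "A", "A", "B", "B", "C", "C", "C"]
y_pred = ["A", "A", "B", "B", "C", "C", "C", "A"]
scores = one_vs_rest_metrics(y_true, y_pred, positive="A")
print(scores)
```

For class "A" in this toy example there are 2 true positives, 1 false positive, 1 false negative, and 4 true negatives, so accuracy is 6/8 and specificity is 4/5.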