Background:
Detecting DNA-binding proteins (DBPs) with biological and chemical methods is time-consuming and
expensive.
Objective:
In recent years, the rise of computational biology methods based on Machine Learning (ML) has greatly improved the detection
efficiency of DBPs.
Method:
In this study, a Multiple Kernel-based Fuzzy SVM Model with Support Vector Data Description (MK-FSVM-SVDD) is proposed to
predict DBPs. First, six features are extracted from the protein sequence. Second, multiple kernels are constructed from these sequence features.
Then, the kernels are integrated by Centered Kernel Alignment-based Multiple Kernel Learning (CKA-MKL). Next, fuzzy membership
scores of the training samples are calculated with Support Vector Data Description (SVDD). Finally, an FSVM is trained and employed to detect new DBPs.
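
The kernel-integration and membership steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and makes several assumptions: the toy data, the RBF kernels, and every function name are hypothetical; the kernel weights come from a simple alignment-proportional heuristic rather than the optimization solved in CKA-MKL; and the fuzzy memberships use the distance to the class centroid in kernel space, which stands in for the SVDD sphere center without solving the SVDD problem.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center(K):
    """Center a kernel matrix in feature space: H K H, H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    """Centered kernel alignment between two kernel matrices."""
    K1c, K2c = center(K1), center(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def combine_kernels(kernels, y):
    """Heuristic (assumption): weight each kernel by its alignment
    with the ideal kernel y y^T, then normalize the weights to sum to 1."""
    ideal = np.outer(y, y)
    a = np.array([max(alignment(K, ideal), 0.0) for K in kernels])
    w = a / a.sum()
    K = sum(wi * Ki for wi, Ki in zip(w, kernels))
    return K, w

def fuzzy_membership(K, y, delta=1e-3):
    """Simplified stand-in for SVDD: membership decreases with the
    kernel-space distance of a sample to its own class centroid."""
    s = np.empty(len(y))
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        Kc = K[np.ix_(idx, idx)]
        # Squared distance to the centroid, computed from kernel values only.
        d2 = np.diag(Kc) - 2.0 * Kc.mean(axis=1) + Kc.mean()
        d = np.sqrt(np.maximum(d2, 0.0))
        s[idx] = 1.0 - d / (d.max() + delta)
    return s

# Toy data (hypothetical): one informative feature view, one pure-noise view.
rng = np.random.default_rng(0)
y = np.array([1] * 20 + [-1] * 20)
X_info = rng.normal(size=(40, 5)) + 1.5 * y[:, None]
X_noise = rng.normal(size=(40, 5))

kernels = [rbf_kernel(X_info, 0.1), rbf_kernel(X_noise, 0.1)]
K, w = combine_kernels(kernels, y)   # combined kernel and per-kernel weights
s = fuzzy_membership(K, y)           # one membership score per training sample
```

In a fuller pipeline, the combined matrix `K` and the scores `s` could be handed to a weighted SVM, for example scikit-learn's `SVC(kernel="precomputed")` with `s` passed as `sample_weight` to `fit`. That is one way to approximate an FSVM, not necessarily the solver used in the study.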
Results:
Our model is tested on several benchmark datasets. Compared with other methods, MK-FSVM-SVDD achieves the best Matthews
Correlation Coefficient (MCC) on PDB186 (0.7250) and PDB2272 (0.5476).
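
For reference, MCC is computed from the four entries of the binary confusion matrix. The short sketch below uses the standard definition; the function name and the example counts are illustrative and are not taken from the study. Returning 0 when the denominator vanishes is a common convention, assumed here.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention (assumption): undefined cases are reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Illustrative counts only: 90 true positives, 80 true negatives,
# 20 false positives, 10 false negatives.
score = mcc(90, 80, 20, 10)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and unlike accuracy it remains informative when the two classes are imbalanced.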
Conclusion:
We conclude that MK-FSVM-SVDD is more suitable than a standard SVM as a classifier for DNA-binding protein
identification.