grDNA-Prot: The Prediction of DNA-Binding Proteins Based on Physicochemical Properties of Amino Acids and Support Vector Machine

The support vector machine (SVM) is used in the classification of sonar signals and DNA-binding proteins. Our study on the classification of sonar signals shows that SVM produces a better result than other classification methods, which is consistent with the findings of other studies. The testing accuracy of classification is 95.19%, compared with 90.4% from a multilayered neural network and 82.7% from a nearest neighbor classifier. On the classification of DNA-binding proteins, SVM gives a testing accuracy of 82.32%, which is slightly better than that obtained in an earlier study of SVM classification of protein–protein interactions. Hence, our study indicates the usefulness of SVM in the identification of DNA-binding proteins. Further improvements in the SVM algorithm and parameters are suggested.

Identification of DNA-Binding Proteins by Multiple Kernel Support Vector Machine and Sequence Information

Current Proteomics ◽

10.2174/1570164616666190417100509 ◽

2020 ◽

Vol 17 (4) ◽

pp. 302-310

Author(s):

Yijie Ding ◽

Feng Chen ◽

Xiaoyi Guo ◽

Jijun Tang ◽

Hongjie Wu

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Computational Method ◽

Support Vector ◽

Sequence Information ◽

Data Sets ◽

Multiple Kernel ◽

Kernel Support Vector Machine

Background: DNA-binding proteins play an important role in multiple biomolecular functions. However, traditional experimental methods for identifying DNA-binding proteins are still time consuming and extremely expensive. Objective: In the past several years, various computational methods have been developed to detect DNA-binding proteins. However, most of them do not integrate multiple sources of information. Methods: In this study, we propose a novel computational method that predicts DNA-binding proteins from sequence information with a two-step Multiple Kernel Support Vector Machine (MK-SVM). First, we extract several features and construct multiple kernels. Then, the kernels are linearly combined by Multiple Kernel Learning (MKL). Finally, an SVM model is built on the combined kernel to predict DNA-binding proteins. Results: The proposed method is tested on two benchmark data sets. Compared with other existing methods, our approach is comparable, and on some data sets better. Conclusion: We conclude that MK-SVM is more suitable than a common SVM as the classifier for DNA-binding protein identification.
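The kernel-combination step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy feature views, the RBF kernel choice, the `gamma` values, and the fixed weights are all assumptions made for illustration, whereas MKL would learn the weights from data.

```python
import math

def rbf_kernel_matrix(X, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [[math.exp(-gamma * sq_dist(a, b)) for b in X] for a in X]

def combine_kernels(kernels, weights):
    """Weighted sum of equally sized kernel matrices (linear combination)."""
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical toy data: two feature views of the same three proteins.
view_a = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8]]
view_b = [[1.0], [0.9], [0.1]]

# One kernel per feature view; the gamma values are arbitrary assumptions.
kernels = [rbf_kernel_matrix(view_a, 1.0), rbf_kernel_matrix(view_b, 0.5)]

# Fixed weights stand in for the weights an MKL procedure would learn.
weights = [0.6, 0.4]
K_combined = combine_kernels(kernels, weights)
```

A combined matrix of this kind could then be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.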

Extracting Sequence Features to Predict DNA-Binding Proteins Using Support Vector Machine

2013 International Conference on Computational and Information Sciences ◽

10.1109/iccis.2013.48 ◽

2013 ◽

Cited By ~ 2

Author(s):

Xin Ma ◽

Lefu Hu

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Sequence Features

Identifying DNA-binding proteins by combining support vector machine and PSSM distance transformation

BMC Systems Biology ◽

10.1186/1752-0509-9-s1-s10 ◽

2015 ◽

Vol 9 (Suppl 1) ◽

pp. S10 ◽

Cited By ~ 42

Author(s):

Ruifeng Xu ◽

Jiyun Zhou ◽

Hongpeng Wang ◽

Yulan He ◽

Xiaolong Wang ◽

...

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Distance Transformation

Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information

Computational and Mathematical Methods in Medicine ◽

10.1155/2013/524502 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 6

Author(s):

Xin Ma ◽

Jiansheng Wu ◽

Xiaoyun Xue

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

Query Protein ◽

DNA Binding Proteins ◽

Evolutionary Information ◽

Support Vector ◽

Sequence Information ◽

Novel Approach ◽

Matthews Correlation Coefficient

DNA-binding proteins are fundamentally important in understanding cellular processes. Thus, the identification of DNA-binding proteins has important practical applications in various fields, such as drug design. We propose a novel approach for predicting DNA-binding proteins using only sequence information. The prediction model developed in this study is constructed with the support vector machine-sequential minimal optimization (SVM-SMO) algorithm in conjunction with a hybrid feature. The hybrid feature incorporates an evolutionary information feature, a physicochemical property feature, and two novel attributes. These two attributes use DNA-binding residues and nonbinding residues in a query protein to obtain a DNA-binding propensity and a nonbinding propensity. The results demonstrate that our SVM-SMO model achieves a Matthews correlation coefficient (MCC) of 0.67 and 89.6% overall accuracy, with 88.4% sensitivity and 90.8% specificity. Performance comparisons on various features indicate that the two novel attributes contribute to the performance improvement. In addition, our SVM-SMO model outperforms state-of-the-art methods on an independent test dataset.

newDNA-Prot: Prediction of DNA-binding proteins by employing support vector machine and a comprehensive sequence representation

Computational Biology and Chemistry ◽

10.1016/j.compbiolchem.2014.09.002 ◽

2014 ◽

Vol 52 ◽

pp. 51-59 ◽

Cited By ~ 14

Author(s):

Yanping Zhang ◽

Jun Xu ◽

Wei Zheng ◽

Chen Zhang ◽

Xingye Qiu ◽

...

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Sequence Representation

gDNA-Prot: Predict DNA-binding proteins by employing support vector machine and a novel numerical characterization of protein sequence

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2016.06.002 ◽

2016 ◽

Vol 406 ◽

pp. 8-16 ◽

Cited By ~ 2

Author(s):

Yan-ping Zhang ◽

Wuyunqiqige ◽

Wei Zheng ◽

Shuyi Liu ◽

Chunguang Zhao

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Protein Sequence ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector ◽

Numerical Characterization

Predicting DNA binding proteins using support vector machine with hybrid fractal features

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2013.10.009 ◽

2014 ◽

Vol 343 ◽

pp. 186-192 ◽

Cited By ~ 17

Author(s):

Xiao-Hui Niu ◽

Xue-Hai Hu ◽

Feng Shi ◽

Jing-Bo Xia

Keyword(s):

Support Vector Machine ◽

DNA Binding ◽

Binding Proteins ◽

DNA Binding Proteins ◽

Support Vector

MK-FSVM-SVDD: A Multiple Kernel-based Fuzzy SVM Model for Predicting DNA-binding Proteins via Support Vector Data Description

Current Bioinformatics ◽

10.2174/1574893615999200607173829 ◽

2020 ◽

Vol 15 ◽

Author(s):

Yi Zou ◽

Hongjie Wu ◽

Xiaoyi Guo ◽

Li Peng ◽

Yijie Ding ◽

...

Keyword(s):

DNA Binding ◽

Binding Proteins ◽

Detection Efficiency ◽

DNA Binding Proteins ◽

Support Vector ◽

Support Vector Data Description ◽

Vector Data ◽

Data Description ◽

Multiple Kernel ◽

SVM Model

Background: Detecting DNA-binding proteins (DBPs) with biological and chemical methods is time consuming and expensive. Objective: In recent years, the rise of computational biology methods based on Machine Learning (ML) has greatly improved the detection efficiency of DBPs. Method: In this study, a Multiple Kernel-based Fuzzy SVM Model with Support Vector Data Description (MK-FSVM-SVDD) is proposed to predict DBPs. First, six features are extracted from the protein sequence. Second, multiple kernels are constructed from these sequence features. Then, the kernels are integrated by Centered Kernel Alignment-based Multiple Kernel Learning (CKA-MKL). Next, fuzzy membership scores of the training samples are calculated with Support Vector Data Description (SVDD). Finally, an FSVM is trained and employed to detect new DBPs. Results: Our model is tested on several benchmark datasets. Compared with other methods, MK-FSVM-SVDD achieves the best Matthews Correlation Coefficient (MCC) on PDB186 (0.7250) and PDB2272 (0.5476). Conclusion: We conclude that MK-FSVM-SVDD is more suitable than a common SVM as the classifier for DNA-binding protein identification.
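The centered kernel alignment idea mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the paper's CKA-MKL procedure: the toy kernels and labels are hypothetical, and setting the weights proportional to each kernel's alignment with the ideal label kernel is an assumed heuristic standing in for the optimization used in the study.

```python
import math

def center(K):
    """Center a kernel matrix: subtract row and column means, add grand mean."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)] for i in range(n)]

def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def alignment(K1, K2):
    """Centered kernel alignment; assumes neither centered matrix is all zeros."""
    C1, C2 = center(K1), center(K2)
    return frobenius(C1, C2) / math.sqrt(frobenius(C1, C1) * frobenius(C2, C2))

# Hypothetical labels (+1 binding, -1 non-binding) and the ideal kernel y y^T.
y = [1, 1, -1, -1]
K_ideal = [[a * b for b in y] for a in y]

# Two hypothetical kernels: one reflecting the class structure, one uninformative.
K_good = [[1.0, 0.9, 0.1, 0.1],
          [0.9, 1.0, 0.1, 0.1],
          [0.1, 0.1, 1.0, 0.9],
          [0.1, 0.1, 0.9, 1.0]]
K_identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

scores = [alignment(K, K_ideal) for K in (K_good, K_identity)]
# Assumed heuristic: kernel weights proportional to alignment scores.
kernel_weights = [s / sum(scores) for s in scores]
```

Under this heuristic, a kernel that agrees more closely with the label structure receives a larger weight in the combined kernel.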
