KK-DBP: A Multi-Feature Fusion Method for DNA-Binding Protein Identification Based on Random Forest

2021 ◽  
Vol 12 ◽  
Author(s):  
Yuran Jia ◽  
Shan Huang ◽  
Tianjiao Zhang

A DNA-binding protein (DBP) is a protein with a specialized DNA-binding domain that is associated with many important molecular biological mechanisms. The rapid development of computational methods has made it possible to predict DBPs on a large scale; however, existing methods do not fully integrate DBP-related features, which leads to imprecise predictions. In this article, we develop a DNA-binding protein identification method called KK-DBP. To improve prediction accuracy, we propose a feature extraction method that fuses multiple position-specific scoring matrix (PSSM) features. The experimental results show a prediction accuracy of 81.22% on the independent test dataset PDB186, which is the highest among existing methods.
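
The abstract does not say which PSSM features KK-DBP fuses, so the following is only a minimal sketch of the general idea of feature fusion: two PSSM-derived descriptors are computed separately and concatenated into one vector. The descriptor choices (column means and a lag-1 auto-covariance), the function names, and the toy matrix are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List

# A PSSM is a matrix with one row per residue. Real PSSMs have 20 columns
# (one per amino acid); the toy matrix below uses 2 columns to stay readable.
Matrix = List[List[float]]


def column_means(pssm: Matrix) -> List[float]:
    """Average score of each column over all residues."""
    n_rows = len(pssm)
    n_cols = len(pssm[0])
    return [sum(row[j] for row in pssm) / n_rows for j in range(n_cols)]


def lag_autocovariance(pssm: Matrix, lag: int = 1) -> List[float]:
    """Per-column covariance between residues that are `lag` positions apart."""
    n_rows = len(pssm)
    if not 0 < lag < n_rows:
        raise ValueError("lag must be between 1 and the number of rows - 1")
    means = column_means(pssm)
    pairs = n_rows - lag
    return [
        sum(
            (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
            for i in range(pairs)
        ) / pairs
        for j in range(len(means))
    ]


def fuse_pssm_features(pssm: Matrix, lag: int = 1) -> List[float]:
    """Hypothetical fusion step: concatenate the two descriptors."""
    return column_means(pssm) + lag_autocovariance(pssm, lag)


# Made-up scores, not real PSSM output.
toy_pssm = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
fused = fuse_pssm_features(toy_pssm)
print(len(fused), fused)
```

Because both descriptors have a fixed length regardless of sequence length, proteins of different sizes map to vectors of the same dimension, which is what a classifier such as the random forest named in the title would need as input.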

PLoS ONE ◽  
2019 ◽  
Vol 14 (9) ◽  
pp. e0221829 ◽  
Author(s):  
Margaret A. Gustafson ◽  
Elizabeth M. McCormick ◽  
Lalith Perera ◽  
Matthew J. Longley ◽  
Renkui Bai ◽  
...  

2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Ruifeng Xu ◽  
Jiyun Zhou ◽  
Bin Liu ◽  
Lin Yao ◽  
Yulan He ◽  
...  

DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotides, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Many methods have been proposed to date, but most of them rely on a single classifier and cannot make full use of the large number of negative samples to improve predictive performance. This study proposes a predictor called enDNA-Prot for DNA-binding protein identification that employs ensemble learning. Experimental results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot, with improvements in the range of 3.97–9.52% in accuracy (ACC) and 0.08–0.19 in Matthews correlation coefficient (MCC). Furthermore, when the benchmark dataset was expanded with negative samples, enDNA-Prot outperformed the three existing methods by 2.83–16.63% in ACC and 0.02–0.16 in MCC. These results indicate that enDNA-Prot is an effective method for DNA-binding protein identification and that expanding the training dataset with negative samples can improve its performance. For the convenience of experimental scientists, we developed a user-friendly web server for enDNA-Prot, which is freely accessible to the public.
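
The comparison above is reported in ACC and MCC. As a reminder of how these two metrics are computed from binary predictions, here is a small sketch using their standard definitions; the labels and predictions are invented for illustration and are unrelated to the enDNA-Prot results.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for labels coded as 1 (binding) and 0 (non-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Made-up example with more negatives than positives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
acc_value = accuracy(*counts)
mcc_value = mcc(*counts)
print(counts, acc_value, mcc_value)
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated than ACC when negative samples greatly outnumber positive ones, which is the setting the abstract describes.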


2014 ◽  
Vol 34 (1) ◽  
pp. 8-17 ◽  
Author(s):  
Bin Liu ◽  
Jinghao Xu ◽  
Shixi Fan ◽  
Ruifeng Xu ◽  
Jiyun Zhou ◽  
...  

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 66545-66556 ◽  
Author(s):  
Xiangzheng Fu ◽  
Wen Zhu ◽  
Bo Liao ◽  
Lijun Cai ◽  
Lihong Peng ◽  
...  

2010 ◽  
Vol 222 (03) ◽  
Author(s):  
S Degen ◽  
S Kuhfittig-Kulle ◽  
JH Schulte ◽  
F Westermann ◽  
A Schramm ◽  
...  
