Background:
We study proteins with binding functions in Elymus nutans. Identifying
protein function is essential to biological research, and machine learning methods have been widely
used for protein function prediction.
Methods:
We used BLAST to annotate the functions of Elymus nutans proteins. For proteins that BLAST
could not annotate, we applied machine learning methods, focusing on the identification of proteins
with binding functions. Features were extracted by four algorithms and then selected with a mutual
information estimator. Three classifiers were constructed based on the K-nearest neighbour and
gradient boosting algorithms.
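The pipeline described above (feature extraction, mutual-information feature selection, classification) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the abstract does not name the four feature extraction algorithms, so amino acid composition stands in for them; the mutual information estimator is a simple binned plug-in estimate; only a K-nearest neighbour classifier is shown (gradient boosting is omitted); and the sequences and labels are made-up toy data, not data from this study.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Amino acid composition: fraction of each standard residue (assumed feature)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def mutual_information(values, labels, bins=2):
    """Plug-in mutual information (in nats) between one binned feature and the labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant feature: put everything in bin 0
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint, px, py = Counter(zip(binned, labels)), Counter(binned), Counter(labels)
    return sum(c / n * math.log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


def select_features(X, y, k):
    """Return the indices of the k features with the highest mutual information."""
    scores = [mutual_information([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical toy data: invented sequences and labels, for illustration only.
toy = [
    ("GKGKGKTSGK", "binding"), ("GGKTGKSGKG", "binding"), ("GKSGGKTGKK", "binding"),
    ("LLVVAALLVV", "other"), ("AVLLVAVLAL", "other"), ("VVLLAAVVLA", "other"),
]
X = [composition(seq) for seq, _ in toy]
y = [label for _, label in toy]

selected = select_features(X, y, k=4)            # keep the 4 most informative features
X_sel = [[row[j] for j in selected] for row in X]


def classify(seq):
    """Extract features, keep the selected ones, and classify with KNN."""
    features = composition(seq)
    return knn_predict(X_sel, y, [features[j] for j in selected])


prediction = classify("GKTGGKSGKA")
print(prediction)
```

In practice, a library implementation such as scikit-learn's mutual information scorer and its K-nearest neighbour and gradient boosting classifiers would replace these hand-written helpers.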
Results and Conclusion:
Experimental results show that 848 proteins have ATP binding
function, 113 have heme binding function, 315 have zinc-ion binding function, 135 have GTP binding
function, and 21 have ADP binding function. Furthermore, we predicted the functions of 10 protein
sequences for which no functional annotation could be obtained by sequence alignment against seven
well-known protein databases. Of these, seven sequences are predicted to have ATP binding function,
one heme binding function, one zinc-ion binding function, and one GTP binding function.