Bioluminescent proteins prediction with voting strategy.
Background: Bioluminescence is a unique and significant phenomenon in nature. It is important for the lifecycle of some organisms and is valuable in biomedical research, including gene expression analysis and bioluminescence imaging. In recent years, researchers have developed a number of methods for predicting bioluminescent proteins (BLPs). These methods have become more accurate, but there is still room for improvement. Method: In this paper, we propose a new BLP prediction method based on a voting algorithm. We used four feature extraction methods based on the amino acid sequence, extracting 314-dimensional features in total from amino acid composition, physicochemical properties and k-spacer amino acid pair composition. To obtain the highest Matthews correlation coefficient (MCC) and thereby establish the optimal prediction model, we used a voting algorithm to build the model. To create the best-performing model, we discuss the selection of base classifiers and vote-counting rules. Results: Our proposed model achieved 93.4% accuracy, 93.4% sensitivity and 91.7% specificity on the test set, which is better than the other methods compared. We also used our model-building method to improve a previous prediction of bioluminescent proteins in three lineages, resulting in greatly improved accuracy.
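The ingredients named in the Method section can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names, the 20-letter residue alphabet, the tie-breaking rule and the simple-majority vote are all assumptions made for illustration, and the feature dimensions here are not meant to reproduce the 314 features used in the paper. The sketch shows amino acid composition, a k-spaced pair composition, hard majority voting over base-classifier outputs, and the MCC used to compare models.

```python
from itertools import product
from math import sqrt

# Assumption: the 20 standard amino acids; other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    seq = seq.upper()
    n = sum(seq.count(a) for a in AMINO_ACIDS)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def k_spaced_pair_composition(seq, k):
    """Frequency of residue pairs with k residues between them (400 values)."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - k - 1):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
            total += 1
    return [c / total if total else 0.0 for c in counts.values()]


def majority_vote(predictions):
    """Hard voting over base classifiers.

    predictions: one list of 0/1 labels per classifier, all the same length.
    A sample is labelled 1 only if more than half the classifiers say 1
    (assumption: ties go to class 0).
    """
    n_clf = len(predictions)
    return [1 if 2 * sum(votes) > n_clf else 0 for votes in zip(*predictions)]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical outputs of three base classifiers on three test samples.
    votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
    print(votes, mcc([1, 0, 1], votes))
```

In such a setup, different combinations of base classifiers and vote-counting rules would be compared by their MCC on held-out data, and the combination with the highest value kept.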