Machine learning prediction of antiviral-HPV protein interactions for anti-HPV pharmacotherapy
Background: Persistent infection with high-risk types of Human Papillomavirus (HPV) can cause diseases including cervical and oropharyngeal cancers. Nonetheless, there is so far no effective pharmacotherapy for infection with high-risk HPV types, and it therefore remains a severe threat to women's health. Methods: Following a drug repositioning strategy, we trained and benchmarked multiple machine learning models to predict potentially effective antiviral drugs for HPV infection. From an antiviral-target interaction dataset, we generated a high-dimensional feature set of drug-target interaction pairs and used it to train the predictive models. Results: We optimized the models, measured their predictive performance on a dataset of 182 antiviral-target interaction pairs, all approved by the United States Food and Drug Administration, and benchmarked the models against one another. Among the Support Vector Machine, Random Forest, AdaBoost, Naive Bayes, K-Nearest Neighbors, and Logistic Regression classifiers, the optimized Support Vector Machine and K-Nearest Neighbors classifiers were the two best predictors, with precision scores of 0.80 and 0.85, respectively. Applying these two predictors together, we predicted 58 antiviral-HPV protein interactions from 846 antiviral-HPV protein associations. Conclusions: Our work provides drug candidates for anti-HPV drug discovery. To our knowledge, this is the first HPV-oriented computational drug repositioning study.
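The precision metric and the joint use of two predictors described in the Results can be illustrated with a minimal sketch. It rests on assumptions: the abstract does not state how the two predictors were combined, so the agreement rule below (a pair is kept only when both classifiers predict an interaction) is hypothetical, as are the function names and the toy labels, which are invented for illustration and are not data from the study.

```python
from typing import List, Sequence


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Precision = TP / (TP + FP); returns 0.0 if nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def consensus(pred_a: Sequence[int], pred_b: Sequence[int]) -> List[int]:
    """Assumed combination rule: predict 1 only when both predictors say 1."""
    return [int(a == 1 and b == 1) for a, b in zip(pred_a, pred_b)]


# Toy data, invented for illustration (1 = interaction, 0 = no interaction).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
svm_pred = [1, 0, 1, 0, 1, 0, 1, 0]  # stand-in for SVM output
knn_pred = [1, 0, 1, 1, 0, 0, 0, 1]  # stand-in for KNN output

svm_precision = precision(y_true, svm_pred)
knn_precision = precision(y_true, knn_pred)
both = consensus(svm_pred, knn_pred)
both_precision = precision(y_true, both)

print(svm_precision, knn_precision, both, both_precision)
```

In this toy case the agreement rule keeps fewer pairs than either predictor alone and makes no false positive calls, which is the usual motivation for requiring two high-precision classifiers to agree.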