Abstract
Background
Recently, machine learning (ML) has become attractive for genomic prediction, but its superiority in genomic prediction and the choice of the optimal ML method still need to be investigated.
Results
In this study, 2566 Chinese Yorkshire pigs with records for reproduction traits were used. They were genotyped with the GenoBaits Porcine SNP 50K and PorcineSNP50 panels. Four ML methods were implemented: support vector regression (SVR), kernel ridge regression (KRR), random forest (RF) and Adaboost.R2. Their genomic prediction abilities were assessed through 20 replicates of five-fold cross-validation. The ML methods significantly outperformed genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP) and the Bayesian method BayesHE. On average, the prediction accuracy of the ML methods was 19.3%, 15.0% and 20.8% higher than that of GBLUP, ssGBLUP and BayesHE, with improvements ranging from 8.9–24.0%, 7.6–17.5% and 11.1–24.6%, respectively. In addition, the ML methods yielded smaller mean squared error (MSE) and mean absolute error (MAE) in all scenarios. ssGBLUP yielded an average improvement of 3.7% over GBLUP, and the performance of BayesHE was close to that of GBLUP. Among the four ML methods, SVR and KRR had the most robust prediction abilities: they yielded higher accuracies, lower bias, lower MSE and MAE, and computing efficiency comparable to that of GBLUP. RF showed the lowest prediction ability and computational efficiency among the ML methods.
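The evaluation scheme described above (20 replicates of five-fold cross-validation, scored by accuracy, MSE and MAE) can be sketched as follows. This is only an illustrative outline using the Python standard library, not the authors' pipeline: the function names are hypothetical, taking accuracy as the Pearson correlation between observed and predicted values is an assumption, and so is the formula used for relative improvement.

```python
import random
from math import sqrt


def repeated_kfold(n, k=5, replicates=20, seed=0):
    """Yield (replicate, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(replicates):
        idx = list(range(n))
        rng.shuffle(idx)  # new random partition in every replicate
        folds = [idx[f::k] for f in range(k)]
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield rep, f, train, test


def mean(xs):
    return sum(xs) / len(xs)


def pearson(y, yhat):
    """Pearson correlation, used here as the accuracy measure (assumption)."""
    my, mp = mean(y), mean(yhat)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, yhat))
    vy = sum((a - my) ** 2 for a in y)
    vp = sum((b - mp) ** 2 for b in yhat)
    return cov / sqrt(vy * vp)


def mse(y, yhat):
    """Mean squared error between observed and predicted values."""
    return mean([(a - b) ** 2 for a, b in zip(y, yhat)])


def mae(y, yhat):
    """Mean absolute error between observed and predicted values."""
    return mean([abs(a - b) for a, b in zip(y, yhat)])


def relative_gain(acc_new, acc_ref):
    """Percentage improvement of one accuracy over a reference (assumption)."""
    return 100.0 * (acc_new - acc_ref) / acc_ref


# Count the train/test splits produced for 2566 animals.
n_splits = sum(1 for _ in repeated_kfold(2566, k=5, replicates=20))
```

In such a scheme, each model would be fitted on the training indices of every split and scored on the held-out indices, and the metrics would then be averaged over all splits before comparing methods.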
Conclusion
Our findings demonstrated that ML methods are more efficient than traditional genomic selection methods and could be new options for genomic prediction.