learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data
We introduce the R-package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial (MET) breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package offers the possibility of incorporating weather data from field weather stations, or can retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated in daily windows based on naive (for instance, daily windows with a fixed number of days) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient boosted trees, random forests, stacked ensemble models, and multi-layer perceptrons. These prediction models can be evaluated via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with MET experimental data in a user-friendly way. The package is fully open source and accessible on GitHub.