32 Cross validation of best linear unbiased predictions of breeding values using an efficient leave-one-out strategy
Abstract Efficient strategies have been developed for leave-one-out cross validation (LOOCV) of predicted phenotypes to evaluate the accuracy of genomic predictions in a simple model with an overall mean and either marker effects or animal genetic effects. For such a model, the correlation between the predicted and the observed phenotype is identical to the correlation between the observed phenotype and the estimated breeding value (EBV). When the model is more complex, with multiple fixed and random effects, the correlation between the observed and predicted phenotype can still be obtained efficiently by LOOCV, but it is no longer equal to the correlation between the observed phenotype and the EBV, which is the statistic of interest. The objective here was to develop and evaluate an efficient LOOCV method for EBV, or for predictions of other random effects, under a general mixed linear model. The approach is based on treating all effects in the model as random, with large variances assigned to the fixed effects. Naïve LOOCV requires inverting the (n - 1) x (n - 1) phenotypic covariance matrix for each of the n training data sets, where n is the number of observations. Our method obtains these inverses efficiently from the inverse of the phenotypic covariance matrix for all n observations. We compared naïve LOOCV of EBV, in which phenotypes are pre-corrected for fixed effects estimated from the training data, with the new efficient LOOCV. The new efficient LOOCV for EBV was 962 times faster than naïve LOOCV. Prediction accuracies from the two strategies were the same (0.20). Funded by USDA-NIFA grant # 2017-67007-26144.
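The idea of recovering every leave-one-out inverse from the single inverse of the full phenotypic covariance matrix can be illustrated with a standard block-inverse identity. The sketch below is not taken from the abstract and may differ from the authors' implementation. It assumes one record per individual, a genomic covariance matrix `G`, a fixed-effect design matrix `X` whose effects are treated as random with an arbitrarily chosen large variance of 1e4, and a residual variance of 1. All names and the simulated data are hypothetical.

```python
import numpy as np


def naive_loo_ebv(V, G, y):
    """LOO EBV by solving one (n - 1) x (n - 1) system per left-out record."""
    n = len(y)
    ebv = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        V_train = V[np.ix_(keep, keep)]
        # EBV of animal i predicted from all records except its own.
        ebv[i] = G[i, keep] @ np.linalg.solve(V_train, y[keep])
    return ebv


def fast_loo_ebv(V, G, y):
    """LOO EBV from a single inverse of the full n x n covariance matrix.

    With P = inv(V) and a = P @ y, the block-inverse identity gives
    inv(V[-i, -i]) @ y[-i] = a[-i] - P[-i, i] * a[i] / P[i, i],
    so EBV_i = (G a)_i - a_i * (G P)_ii / P_ii.
    """
    P = np.linalg.inv(V)
    a = P @ y
    diag_GP = np.einsum("ij,ji->i", G, P)  # diagonal of G @ P
    return G @ a - a * diag_GP / np.diag(P)


# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, m = 60, 200
M = rng.standard_normal((n, m))                # hypothetical marker covariates
G = 0.5 * (M @ M.T) / m                        # genetic covariance, variance 0.5
X = np.column_stack([np.ones(n), rng.standard_normal(n)])  # mean + one covariate

# Fixed effects treated as random with a large variance (1e4, arbitrary).
V = G + 1e4 * (X @ X.T) + 1.0 * np.eye(n)

u = M @ rng.standard_normal(m) * np.sqrt(0.5 / m)   # breeding values, Var = G
y = X @ np.array([2.0, 0.5]) + u + rng.standard_normal(n)

ebv_naive = naive_loo_ebv(V, G, y)
ebv_fast = fast_loo_ebv(V, G, y)
print(np.allclose(ebv_naive, ebv_fast, atol=1e-6))
```

In this sketch both routines use the same covariance matrix, so they agree up to numerical error. The naïve comparison described in the abstract instead pre-corrects phenotypes for fixed effects, so it is not reproduced here.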