Categorization of birth weight phenotypes for inclusion in genetic evaluations using a Deep Neural Network
Abstract
Birth weight serves as a valuable indicator of the economically relevant trait of calving ease, and erroneous birth weight records could impact genetic evaluations for calving ease. The objective of the current study was to evaluate the use of deep neural networks (DNN) for classifying contemporary groups based on the method used to generate birth weight phenotypes. Contemporary groups (CG; n = 120,000,000) ranging between 10 and 250 animals were simulated assuming 12 data collection and CG formation scenarios that could impact CG phenotypic variance, including weights recorded with a digital scale (REAL), weights estimated with a hoof tape (TAPE), and weights that were fabricated (FAB). The performance of 8 activation functions (AF; ReLU, sigmoid, exponential, ReLU6, softmax, softplus, leaky ReLU, and tanh) was evaluated. Four hidden layers were used with 7 different scenarios relative to the number of neurons. Simulations were replicated 10 times. In general, accuracy (proportion of correct predictions) was similar across AF and numbers of neurons, with mean correlations ranging between 0.91 and 0.99. The ReLU, sigmoid, exponential, and ReLU6 AF had the greatest consistency (mean pairwise correlation among replicates), with an average correlation greater than 0.85. Independent of the number of neurons used, the sigmoid function produced the highest accuracy (0.99) and consistency (0.93). The model with the greatest accuracy and consistency was then applied to real birth weight data supplied by the American Hereford Association. In the real data, phenotypic variance was lowest for FAB CG (2.65 kg²), largest for REAL CG (15.84 kg²), and intermediate for TAPE CG (6.84 kg²). To investigate the potential impact of FAB data on routine genetic evaluations, CG classified as FAB in 90% or more of the replicates were removed from the evaluation for calving ease, and the ranks of the resulting genetic predictions were compared with those obtained when no records were removed.
The removal of FAB CG had a moderate impact on the prediction of calving ease expected progeny differences, primarily for animals with intermediate to high accuracy. Results suggest that a well-trained DNN can be effectively used to classify data based on quality metrics prior to inclusion in routine genetic evaluation.
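The removal rule described above (dropping CG classified as FAB in 90% or more of the replicates) can be illustrated with a minimal sketch. The CG identifiers, the per-replicate predictions, and the helper names `fab_fraction` and `split_by_fab_consensus` are hypothetical and serve only as an illustration; this is not the authors' code.

```python
from collections import Counter

# Hypothetical per-replicate predictions for four contemporary groups (CG).
# Each list holds the class predicted in each of 10 replicates.
predictions = {
    "cg_001": ["FAB"] * 10,
    "cg_002": ["FAB"] * 9 + ["TAPE"],
    "cg_003": ["REAL"] * 7 + ["FAB"] * 3,
    "cg_004": ["FAB"] * 8 + ["REAL"] * 2,
}


def fab_fraction(labels):
    """Proportion of replicates in which a CG was classified as FAB."""
    return Counter(labels)["FAB"] / len(labels)


def split_by_fab_consensus(preds, threshold=0.9):
    """Return (kept, removed) CG ids.

    A CG is removed when its FAB fraction is at least `threshold`
    (assumed here to be 0.9, matching the 90% rule in the text).
    """
    removed = sorted(
        cg for cg, labels in preds.items() if fab_fraction(labels) >= threshold
    )
    kept = sorted(cg for cg in preds if cg not in removed)
    return kept, removed


kept, removed = split_by_fab_consensus(predictions)
print("removed:", removed)
print("kept:", kept)
```

In this toy input, only the CG flagged as FAB in at least 9 of the 10 replicates are removed; a CG flagged in 8 of 10 replicates is retained.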