Accuracies of genomic predictions for disease resistance of striped catfish to Edwardsiella ictaluri using artificial intelligence algorithms

Assessments of genomic prediction accuracies using machine and deep learning methods are currently not available or very limited in aquaculture species. The principal aim of this study was to examine the predictive performance of these new methods for disease resistance to Edwardsiella ictaluri in a population of striped catfish Pangasianodon hypophthalmus and to make comparisons with four common methods, i.e., pedigree-based best linear unbiased prediction (PBLUP), genomic-based best linear unbiased prediction (GBLUP), single-step GBLUP (ssGBLUP) and a non-linear Bayesian approach (notably BayesR). Our analyses using machine learning (i.e., KAML) and deep learning (i.e., DeepGP) together with the four common methods (PBLUP, GBLUP, ssGBLUP and BayesR) were conducted for two main disease resistance traits (survival status, coded as 0 and 1, and survival time, i.e., the number of days that the animals remained alive after the challenge test) in a pedigree consisting of 560 individual animals (490 offspring and 70 parents) genotyped for 14,154 single nucleotide polymorphisms (SNPs). The results showed that KAML outperformed GBLUP and ssGBLUP, increasing the prediction accuracies for both traits by 5.1–47.7%. However, the prediction accuracies obtained from KAML were comparable to those estimated using BayesR. Imputation of missing genotypes using AlphaFamImpute increased the prediction accuracies by 0.2–33.2% in all the methods used. On the other hand, there were no significant increases in the prediction accuracies for either survival status or survival time when multivariate models were used instead of univariate analyses. Interestingly, the genomic prediction accuracies based on only highly significant SNPs (P < 0.00001) did not differ greatly from those obtained from the whole set of 14,154 SNPs. In all our analyses, the accuracies of genomic prediction were somewhat higher for survival time than for survival status (0/1 data). It is concluded that there are prospects for the application of genomic selection to increase disease resistance to Edwardsiella ictaluri in striped catfish breeding programs.

Accuracies of genomic predictions for disease resistance of striped catfish to Edwardsiella ictaluri using artificial intelligence algorithms

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab361 ◽

2021 ◽

Author(s):

Nguyen Thanh Vu ◽

Tran Huu Phuc ◽

Kim Thi Phuong Oanh ◽

Nguyen Van Sang ◽

Trinh Thi Trang ◽

...

Keyword(s):

Machine Learning ◽

Disease Resistance ◽

Survival Time ◽

Genomic Prediction ◽

Edwardsiella Ictaluri ◽

Learning Methods ◽

Linear Unbiased Prediction ◽

Survival Status ◽

Striped Catfish ◽

Best Linear Unbiased

Abstract: Assessments of genomic prediction accuracies using artificial intelligence (AI) algorithms (i.e., machine and deep learning methods) are currently not available or very limited in aquaculture species. The principal aim of this study was to examine the predictive performance of these new methods for disease resistance to Edwardsiella ictaluri in a population of striped catfish Pangasianodon hypophthalmus and to make comparisons with four common methods, i.e., pedigree-based best linear unbiased prediction (PBLUP), genomic-based best linear unbiased prediction (GBLUP), single-step GBLUP (ssGBLUP) and a non-linear Bayesian approach (notably BayesR). Our analyses using machine learning (i.e., ML-KAML) and deep learning (i.e., DL-MLP and DL-CNN) together with the four common methods (PBLUP, GBLUP, ssGBLUP and BayesR) were conducted for two main disease resistance traits (survival status, coded as 0 and 1, and survival time, i.e., the number of days that the animals remained alive after the challenge test) in a pedigree consisting of 560 individual animals (490 offspring and 70 parents) genotyped for 14,154 single nucleotide polymorphisms (SNPs). The results using 6,470 SNPs after quality control showed that machine learning methods outperformed PBLUP, GBLUP and ssGBLUP, increasing the prediction accuracies for both traits by 9.1–15.4%. However, the prediction accuracies obtained from machine learning methods were comparable to those estimated using BayesR. Imputation of missing genotypes using AlphaFamImpute increased the prediction accuracies by 5.3–19.2% in all the methods and data used. On the other hand, there were insignificant decreases (0.3–5.6%) in the prediction accuracies for both survival status and survival time when multivariate models were used instead of univariate analyses. Interestingly, the genomic prediction accuracies based on only highly significant SNPs (P < 0.00001; 318–400 SNPs for survival status and 1,362–1,589 SNPs for survival time) were somewhat lower (by 0.3–15.6%) than those obtained from the whole set of 6,470 SNPs. In most of our analyses, the accuracies of genomic prediction were somewhat higher for survival time than for survival status (0/1 data). It is concluded that although there are prospects for the application of genomic selection to increase disease resistance to Edwardsiella ictaluri in striped catfish breeding programs, these methods should be evaluated further in independent families/populations as more data accumulate in future generations, to avoid possible biases in the genetic parameter estimates and prediction accuracies for the disease resistance traits studied in this population of striped catfish P. hypophthalmus.

Genomic Prediction Using Alternative Strategies of Weighted Single-Step Genomic BLUP for Yearling Weight and Carcass Traits in Hanwoo Beef Cattle

Genes ◽

10.3390/genes12020266 ◽

2021 ◽

Vol 12 (2) ◽

pp. 266

Author(s):

Hossein Mehrban ◽

Masoumeh Naserkheil ◽

Deuk Hwan Lee ◽

Chungil Cho ◽

Taejeong Choi ◽

...

Keyword(s):

Quantitative Trait Loci ◽

Beef Cattle ◽

Genomic Prediction ◽

Quantitative Trait ◽

Carcass Traits ◽

Best Linear Unbiased Prediction ◽

Single Step ◽

Linear Unbiased Prediction ◽

Single Nucleotide ◽

Best Linear Unbiased

The weighted single-step genomic best linear unbiased prediction (GBLUP) method has been proposed to exploit information from genotyped and non-genotyped relatives, allowing the use of weights for single-nucleotide polymorphisms in the construction of the genomic relationship matrix. The purpose of this study was to investigate the accuracy of genetic prediction using the following single-trait best linear unbiased prediction methods in Hanwoo beef cattle: pedigree-based (PBLUP), un-weighted (ssGBLUP), and weighted (WssGBLUP) single-step genomic methods. We also assessed the impact of alternative single and window weighting methods according to their effects on the traits of interest. The data comprised 15,796 phenotypic records for yearling weight (YW) and 5622 records for carcass traits (backfat thickness: BFT, carcass weight: CW, eye muscle area: EMA, and marbling score: MS). The genotypic data included 6616 animals for YW and 5134 for carcass traits, genotyped for 43,950 single-nucleotide polymorphisms. The ssGBLUP showed significant improvement in genomic prediction accuracy for carcass traits (71%) and yearling weight (99%) compared to the pedigree-based method. The window weighting procedures performed better than single SNP weighting for CW (11%), EMA (11%), MS (3%), and YW (6%), whereas no gain in accuracy was observed for BFT. In addition, the improvement in accuracy of window WssGBLUP over the un-weighted method was low for BFT and MS, while for CW, EMA, and YW it amounted to gains of 22%, 15%, and 20%, respectively, which indicates the presence of relevant quantitative trait loci for these traits. These findings indicate that WssGBLUP is an appropriate method for traits with a large quantitative trait loci effect.
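To make the idea of SNP weights in the genomic relationship matrix concrete, here is a minimal sketch assuming the widely used form G = Z D Z' / (2 Σ p(1 − p)), where Z holds centred allele counts and D is a diagonal matrix of SNP weights. The function name, the toy genotypes, and the rescaling of weights to an average of 1 are illustrative assumptions, not details taken from the abstract; the single-step blending of G with the pedigree relationship matrix is not shown.

```python
import numpy as np

def weighted_grm(genotypes, weights=None):
    """Genomic relationship matrix G = Z D Z' / (2 * sum(p * (1 - p))).

    genotypes: (n_animals, n_snps) array of allele counts coded 0/1/2.
    weights:   optional per-SNP weights (the diagonal of D);
               None gives D = I, i.e. the un-weighted matrix.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0            # observed allele frequency per SNP
    Z = M - 2.0 * p                     # centre each SNP column
    if weights is None:
        d = np.ones(M.shape[1])
    else:
        d = np.asarray(weights, dtype=float)
        d = d * d.size / d.sum()        # assumption: rescale weights to average 1
    scale = 2.0 * np.sum(p * (1.0 - p))
    return (Z * d) @ Z.T / scale        # Z D Z' without building D explicitly

# Hypothetical toy data: 5 animals, 20 SNPs.
rng = np.random.default_rng(0)
geno = rng.integers(0, 3, size=(5, 20))
G_unweighted = weighted_grm(geno)
G_weighted = weighted_grm(geno, weights=rng.uniform(0.5, 2.0, size=20))
```

With equal weights the function returns the un-weighted matrix, so any change in predictions comes only from how unevenly the weights are distributed across SNPs or windows.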

Accounting for Group-Specific Allele Effects and Admixture in Genomic Predictions: Theory and Experimental Evaluation in Maize

Genetics ◽

10.1534/genetics.120.303278 ◽

2020 ◽

Vol 216 (1) ◽

pp. 27-41

Author(s):

Simon Rio ◽

Laurence Moreau ◽

Alain Charcosset ◽

Tristan Mary-Huard

Keyword(s):

Genomic Prediction ◽

Prediction Accuracy ◽

Prediction Models ◽

Best Linear Unbiased Prediction ◽

Linear Unbiased Prediction ◽

Modeling Group ◽

A Genome ◽

Specific Allele ◽

Best Linear Unbiased ◽

Unbiased Prediction

Populations structured into genetic groups may display group-specific linkage disequilibrium, mutations, and/or interactions between quantitative trait loci and the genetic background. These factors lead to heterogeneous marker effects affecting the efficiency of genomic prediction, especially for admixed individuals. Such individuals have a genome that is a mosaic of chromosome blocks from different origins, and may be of interest to combine favorable group-specific characteristics. We developed two genomic prediction models adapted to the prediction of admixed individuals in presence of heterogeneous marker effects: multigroup admixed genomic best linear unbiased prediction random individual (MAGBLUP-RI), modeling the ancestry of alleles; and multigroup admixed genomic best linear unbiased prediction random allele effect (MAGBLUP-RAE), modeling group-specific distributions of allele effects. MAGBLUP-RI can estimate the segregation variance generated by admixture while MAGBLUP-RAE can disentangle the variability that is due to main allele effects from the variability that is due to group-specific deviation allele effects. Both models were evaluated for their genomic prediction accuracy using a maize panel including lines from the Dent and Flint groups, along with admixed individuals. Based on simulated traits, both models proved their efficiency to improve genomic prediction accuracy compared to standard GBLUP models. For real traits, a clear gain was observed at low marker densities whereas it became limited at high marker densities. The interest of including admixed individuals in multigroup training sets was confirmed using simulated traits, but was variable using real traits. Both MAGBLUP models and admixed individuals are of interest whenever group-specific SNP allele effects exist.

Improvement of Genomic Prediction in Advanced Wheat Breeding Lines by Including Additive × additive Epistasis

10.21203/rs.3.rs-424490/v1 ◽

2021 ◽

Author(s):

Miguel Angel Raffo ◽

Pernille Sarup ◽

Xiangyu Guo ◽

Huiming Liu ◽

Jeppe Reitan Andersen ◽

...

Keyword(s):

Genomic Prediction ◽

Cross Validation ◽

Best Linear Unbiased Prediction ◽

Wheat Breeding ◽

Breeding Cycle ◽

Linear Unbiased Prediction ◽

Breeding Lines ◽

Genetic Merit ◽

Best Linear Unbiased ◽

Unbiased Prediction

Abstract: Epistasis is the principal non-additive genetic effect in inbred wheat lines and can be used to develop cultivars based on total genetic merit. Correct models for variance components (VCs) estimation are needed to disentangle the genetic architecture of complex traits in wheat. We aimed to i) evaluate the performance of extended genomic best linear unbiased prediction (EG-BLUP) and the natural and orthogonal interactions approach (NOIA) for VCs estimation in a commercial wheat-breeding population, and ii) investigate whether including epistasis in genomic prediction enhances predictive ability (PA) for wheat breeding lines. In total, 2,060 sixth-generation (F6) lines from the Nordic Seed A/S breeding company were phenotyped for grain yield over 21 year × location combinations in Denmark, and genotyped using a 15K Illumina BeadChip. Four models were used to estimate VCs and heritability at plot level: i) Baseline, ii) Genomic best linear unbiased prediction (G-BLUP), iii) EG-BLUP, and iv) NOIA. Narrow- and broad-sense heritabilities estimated with G-BLUP were 0.15 and 0.31, respectively. EG-BLUP and NOIA failed to achieve an orthogonal partition of genetic variances. Even though NOIA removed the Hardy-Weinberg equilibrium assumption, both models yielded very similar estimates, indicating that linkage disequilibrium causes the lack of orthogonality. The PA was studied using leave-one-line-out and leave-one-breeding-cycle-out cross-validations. Both EG-BLUP and NOIA increased PA significantly (16.5%) compared to G-BLUP in leave-one-line-out cross-validation. However, no improvement from including epistasis was observed in the leave-one-breeding-cycle-out cross-validation. We conclude that although the variance partition into orthogonal genetic effects was not possible, epistatic models can be useful to enhance predictions of total genetic merit.
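The two cross-validation schemes named in this abstract differ only in what defines a held-out group: a single line, or a whole breeding cycle. A minimal sketch of that splitting logic follows; the function name and the toy cycle labels are hypothetical, and the abstract does not say how the authors implemented their splits.

```python
import numpy as np

def leave_one_group_out(groups):
    """Yield (held_out_label, train_idx, valid_idx) for each distinct group label."""
    groups = np.asarray(groups)
    for label in np.unique(groups):
        valid_idx = np.flatnonzero(groups == label)   # records in the held-out group
        train_idx = np.flatnonzero(groups != label)   # everything else trains the model
        yield label, train_idx, valid_idx

# Hypothetical example: six records from three breeding cycles.
cycles = ["c1", "c1", "c2", "c2", "c3", "c3"]
splits = list(leave_one_group_out(cycles))

for label, train_idx, valid_idx in splits:
    print(label, train_idx, valid_idx)
```

Passing line identifiers instead of cycle labels gives leave-one-line-out. Holding out a whole cycle removes the closest relatives from the training set, which is one plausible reason a gain seen with line-level splits may not carry over.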

A Benchmarking Between Deep Learning, Support Vector Machine and Bayesian Threshold Best Linear Unbiased Prediction for Predicting Ordinal Traits in Plant Breeding

G3 Genes|Genome|Genetics ◽

10.1534/g3.118.200998 ◽

2018 ◽

pp. g3.200998.2018 ◽

Cited By ~ 13

Author(s):

Osval A. Montesinos-López ◽

Javier Martín-Vallejo ◽

José Crossa ◽

Daniel Gianola ◽

Carlos M. Hernández-Suárez ◽

...

Keyword(s):

Support Vector Machine ◽

Deep Learning ◽

Plant Breeding ◽

Best Linear Unbiased Prediction ◽

Support Vector ◽

Learning Support ◽

Linear Unbiased Prediction ◽

Ordinal Traits ◽

Best Linear Unbiased ◽

Unbiased Prediction

Weighted Genomic Best Linear Unbiased Prediction for Carcass Traits in Hanwoo Cattle

Genes ◽

10.3390/genes10121019 ◽

2019 ◽

Vol 10 (12) ◽

pp. 1019 ◽

Cited By ~ 3

Author(s):

Bryan Irvine Lopez ◽

Seung-Hwan Lee ◽

Jong-Eun Park ◽

Dong-Hyun Shin ◽

Jae-Don Oh ◽

...

Keyword(s):

Genomic Prediction ◽

Carcass Traits ◽

Best Linear Unbiased Prediction ◽

Birth Date ◽

Muscle Area ◽

Eye Muscle ◽

Linear Unbiased Prediction ◽

Best Linear Unbiased ◽

Hanwoo Cattle ◽

Unbiased Prediction

The genomic best linear unbiased prediction (GBLUP) method has been widely used in routine genomic evaluation; it assumes a common variance for all single nucleotide polymorphisms (SNPs). However, this assumption is unlikely to hold for traits influenced by major SNPs. Hence, the present study aimed to improve the accuracy of GBLUP by using the weighted GBLUP (WGBLUP), which gives more weight to important markers, for various carcass traits of Hanwoo cattle: backfat thickness (BFT), carcass weight (CWT), eye muscle area (EMA), and marbling score (MS). Linear and different nonlinearA SNP weighting procedures under WGBLUP were evaluated and compared with unweighted GBLUP and the traditional pedigree-based method (PBLUP). WGBLUP methods were assessed over ten iterations. Phenotypic data from 10,215 animals from different commercial herds that were slaughtered at approximately 30 months of age were used. All these animals were genotyped using the Illumina Bovine 50k SNP chip and were divided into a training and a validation population by birth date, with 1 November 2015 as the cut-off. Genomic prediction accuracies obtained with the nonlinearA weighting methods were higher than those of the linear weighting for all traits. Moreover, unlike with linear methods, no sudden drops in accuracy were noted after the peak was reached with nonlinearA methods. The average accuracies were 0.37, 0.49, 0.40, and 0.37 using PBLUP, and 0.62, 0.74, 0.67, and 0.65 using GBLUP, for BFT, CWT, EMA, and MS, respectively. Moreover, these accuracies of genomic prediction were further increased by 4.84% and 2.70% for BFT and CWT, respectively, by using the nonlinearA method under the WGBLUP model. For EMA and MS, WGBLUP was as accurate as GBLUP. Our results indicate that WGBLUP using a nonlinearA weighting method provides improved predictions for CWT and BFT, suggesting that the advantage of WGBLUP over the other models from weighting selected SNPs appears to be trait-dependent.
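The average accuracies reported in this abstract can be converted into relative gains of GBLUP over PBLUP with simple arithmetic. The sketch below uses only the eight accuracies quoted above; expressing the gain relative to the PBLUP accuracy is an assumption about how such percentages are usually stated, not something the abstract specifies.

```python
traits = ["BFT", "CWT", "EMA", "MS"]
pblup = [0.37, 0.49, 0.40, 0.37]   # average PBLUP accuracies from the abstract
gblup = [0.62, 0.74, 0.67, 0.65]   # average GBLUP accuracies from the abstract

def relative_gain(new, reference):
    """Percentage gain of `new` over `reference`, relative to `reference`."""
    return 100.0 * (new - reference) / reference

gains = {t: round(relative_gain(g, p), 1) for t, p, g in zip(traits, pblup, gblup)}
print(gains)
```

On these figures the gain from pedigree to genomic relationships is roughly 51% to 76% depending on the trait, which gives a sense of scale for the much smaller additional change attributed to SNP weighting.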

Benchmarking between item based collaborative filtering algorithm and genomic best linear unbiased prediction (GBLUP) model in terms of prediction accuracy for wheat and maize//Estudio comparativo en términos de capacidad predictiva para datos de trigo y maíz entre el algoritmo de filtrado colaborativo y el modelo genómico mejor predictor lineal insesgado (GBLUP)

Biotecnia ◽

10.18633/biotecnia.v22i2.1255 ◽

2020 ◽

Vol 22 (2) ◽

pp. 136-146

Author(s):

Osval A. Montesinos-López ◽

Emeterio Franco-Pérez ◽

Francisco J. Luna-Vázquez ◽

Josafat Salinas-Ruiz ◽

Sara Sandoval-Carrillo ◽

...

Keyword(s):

Collaborative Filtering ◽

Genomic Prediction ◽

Prediction Models ◽

Best Linear Unbiased Prediction ◽

Linear Unbiased Prediction ◽

Collaborative Filtering Algorithm ◽

New Methodologies ◽

Best Linear Unbiased ◽

Better Than ◽

Unbiased Prediction

Aim/background: In view of the growing demand for food, new methodologies are needed to improve genomic selection (GS) and obtain more productive plant varieties, and there is empirical evidence that GS is revolutionizing plant breeding for food production around the world. Methods: Because prediction models play a key role in GS, Montesinos-López et al. (2018) proposed the item-based collaborative filtering (IBCF) algorithm for genomic prediction. In this paper we compare the IBCF algorithm with the most popular genomic prediction model, genomic best linear unbiased prediction (GBLUP). Results: We found that GBLUP is superior to the IBCF model, but IBCF is competitive with GBLUP because it produced very similar predictions, with the large advantage that it is extremely efficient in terms of implementation time. Conclusions: GBLUP is better than the IBCF algorithm, but IBCF is more than 400 times more efficient than the GBLUP model in terms of implementation time. Limitations: The main limitation of the study is that it was performed in univariate terms, and it is possible that IBCF will perform better with multivariate data.

Evaluation of the efficiency of genomic versus pedigree predictions for growth and wood quality traits in Scots pine

BMC Genomics ◽

10.1186/s12864-020-07188-4 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Ainhoa Calleja-Rodriguez ◽

Jin Pan ◽

Tomas Funda ◽

Zhiqiang Chen ◽

John Baison ◽

...

Keyword(s):

Scots Pine ◽

Genomic Prediction ◽

Wood Quality ◽

Selection Response ◽

Best Linear Unbiased Prediction ◽

Snp Markers ◽

Progeny Testing ◽

Quality Traits ◽

Linear Unbiased Prediction ◽

Best Linear Unbiased

Abstract. Background: Genomic selection (GS) or genomic prediction is a promising approach for tree breeding to obtain higher genetic gains by shortening the time of progeny testing in breeding programs. As proof-of-concept for Scots pine (Pinus sylvestris L.), a genomic prediction study was conducted with 694 individuals representing 183 full-sib families that were genotyped with genotyping-by-sequencing (GBS) and phenotyped for growth and wood quality traits. A total of 8719 SNPs were used to compare different genomic and pedigree prediction models. Additionally, four prediction efficiency methods were used to evaluate the impact of genomic breeding value estimations by assigning diverse ratios of training and validation sets, as well as several subsets of SNP markers. Results: Genomic Best Linear Unbiased Prediction (GBLUP) and Bayesian Ridge Regression (BRR) combined with the expectation maximization (EM) imputation algorithm showed slightly higher prediction efficiencies than Pedigree Best Linear Unbiased Prediction (PBLUP) and Bayesian LASSO, with some exceptions. A subset of approximately 6000 SNP markers was enough to provide prediction efficiencies similar to those of the full set of 8719 markers. Additionally, prediction efficiencies of genomic models were enough to achieve a selection response 50–143% higher than that of traditional pedigree-based selection. Conclusions: Although prediction efficiencies were similar for genomic and pedigree models, the relative selection response was doubled for genomic models by assuming that earlier selections can be done at the seedling stage, reducing the progeny testing time and thus shortening the breeding cycle length roughly by 50%.
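The conclusion that similar accuracies can still double the selection response follows from the standard breeder's equation per unit time, R/year = i · r · σ_A / L, where i is selection intensity, r is accuracy, σ_A is the additive genetic standard deviation and L is the cycle length. The sketch below assumes i and σ_A are the same for both schemes; the function name and the numbers are hypothetical and are not taken from the paper.

```python
def relative_response_per_year(acc_genomic, acc_pedigree, cycle_genomic, cycle_pedigree):
    """Ratio of genomic to pedigree response per year.

    Derived from R/year = i * r * sigma_A / L, assuming the same selection
    intensity i and additive SD sigma_A in both schemes, so that only the
    accuracy r and the cycle length L differ.
    """
    return (acc_genomic / acc_pedigree) * (cycle_pedigree / cycle_genomic)

# Hypothetical values: equal accuracies, cycle length halved from 20 to 10 years.
ratio = relative_response_per_year(0.5, 0.5, 10.0, 20.0)
print(ratio)
```

With equal accuracies the ratio depends only on the cycle lengths, so halving the cycle doubles the response per year.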

Genomic selection in American mink (Neovison vison) using a single-step genomic best linear unbiased prediction model for size and quality traits graded on live mink

Journal of Animal Science ◽

10.1093/jas/skab003 ◽

2021 ◽

Vol 99 (1) ◽

Author(s):

Trine M Villumsen ◽

Guosheng Su ◽

Bernt Guldbrandtsen ◽

Torben Asp ◽

Mogens S Lund

Keyword(s):

Body Weight ◽

Genomic Selection ◽

Genomic Prediction ◽

Best Linear Unbiased Prediction ◽

Single Step ◽

Quality Traits ◽

Linear Unbiased Prediction ◽

Best Linear Unbiased ◽

Regression Slopes ◽

Unbiased Prediction

Abstract: Genomic selection relies on single-nucleotide polymorphisms (SNPs), which are often collected using medium-density SNP arrays. In mink, no such array is available; instead, genotyping by sequencing (GBS) can be used to generate marker information. Here, we evaluated the effect of genomic selection for mink using GBS. We compared the estimated breeding values (EBVs) from single-step genomic best linear unbiased prediction (SSGBLUP) models to the EBVs from ordinary pedigree-based BLUP models. We analyzed seven size and quality traits from the live grading of brown mink. The phenotype data consisted of ~20,600 records for the seven traits from mink born between 2013 and 2016. Genotype data included 2,103 mink born between 2010 and 2014, mostly breeding animals. In total, 28,336 SNP markers from 391 scaffolds were available for genomic prediction. The pedigree file included 29,212 mink. The predictive ability was assessed by the correlation (r) between progeny trait deviation (PTD) and EBV, and the regression of PTD on EBV, using 5-fold cross-validation. For each fold, one-fifth of the animals born in 2014 formed the validation set. For all traits, the SSGBLUP model resulted in higher accuracies than the BLUP model. The average increase in accuracy was 15% (ranging from 3% for fur clarity to 28% for body weight). For three traits (body weight, silky appearance of the under wool, and guard hair thickness), the difference in r between the two models was significant (P < 0.05). For all traits, the regression slopes of PTD on EBV from SSGBLUP models were closer to 1 than the regression slopes from BLUP models, indicating that SSGBLUP models resulted in less bias of EBV for selection candidates than the BLUP models. However, the regression coefficients did not differ significantly. In conclusion, the SSGBLUP model is superior to the conventional BLUP model in the accurate selection of superior animals and would thus increase genetic gain in a selective breeding program. In addition, this study shows that GBS data work well in genomic prediction in mink, demonstrating the potential of GBS for genomic selection in livestock species.
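The two validation statistics described in this abstract, the correlation between PTD and EBV and the slope of the regression of PTD on EBV, can be computed for one validation fold as in the minimal sketch below. The function name and the toy values are hypothetical; the paper's own software and any corrections applied to PTD are not described in the abstract.

```python
import numpy as np

def validation_stats(ptd, ebv):
    """Return (r, slope): correlation of PTD with EBV and slope of PTD regressed on EBV."""
    ptd = np.asarray(ptd, dtype=float)
    ebv = np.asarray(ebv, dtype=float)
    r = np.corrcoef(ptd, ebv)[0, 1]
    # Least-squares slope of PTD on EBV = cov(PTD, EBV) / var(EBV).
    slope = np.cov(ptd, ebv)[0, 1] / np.var(ebv, ddof=1)
    return r, slope

# Hypothetical validation fold in which the EBVs are twice as dispersed as the PTDs.
ebv = np.array([0.1, 0.4, 0.2, 0.9, 0.5])
ptd = 0.5 * ebv + 1.0
r, slope = validation_stats(ptd, ebv)
print(r, slope)
```

In this toy case the correlation is perfect but the slope is 0.5, which shows why the two statistics are reported separately: r measures ranking ability, while a slope away from 1 signals over- or under-dispersed EBVs.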
