training population
Recently Published Documents

TOTAL DOCUMENTS: 91 (FIVE YEARS: 48)
H-INDEX: 14 (FIVE YEARS: 5)

Author(s):  
Habtamu Ayalew ◽  
Joshua D Anderson ◽  
Nick Krom ◽  
Yuhong Tang ◽  
Twain J Butler ◽  
...  

Abstract Triticale, a hybrid species between wheat and rye, is one of the newest additions to the plant kingdom with a very short history of improvement. It has very limited genomic resources because of its large and complex genome. The objectives of this study were to generate dense marker data; to characterize genetic diversity, population structure, and linkage disequilibrium (LD); and to estimate accuracies of commonly used genomic selection (GS) models for forage yield of triticale. Genotyping-by-sequencing (GBS), using PstI and MspI restriction enzymes to reduce genome complexity, was performed on a triticale diversity panel (n = 289). After filtering for biallelic loci with more than 70% genome coverage and minor allele frequency (MAF) > 0.05, de novo variant calling identified 16,378 single nucleotide polymorphism (SNP) markers. Sequences of these variants were mapped to the wheat and rye reference genomes to infer their homologous groups and chromosome positions. About 45% (7,430) and 58% (9,500) of the de novo identified SNPs were mapped to the wheat and rye reference genomes, respectively. Interestingly, 28.9% (2,151) of the 7,430 SNPs were mapped to the D genome of hexaploid wheat, indicating substantial substitution of the R genome with the D genome in cultivated triticale. About 27% of marker pairs were in significant LD with an average r2 > 0.18 (P < 0.05). Genome-wide LD declined rapidly to r2 < 0.1 beyond 10 kb physical distance. The three sub-genomes (A, B, and R) showed comparable LD decay patterns. Genetic diversity and population structure analyses identified five distinct clusters. Genotype grouping did not follow the prior winter vs. spring type classification, although one of the clusters was largely dominated by winter triticale. Genomic selection accuracies were estimated for forage yield using three commonly used models with different training population sizes and marker densities. Genomic selection accuracy increased with increasing training population size, while the gain in accuracy tended to plateau at marker densities of 2,000 SNPs or more. Average GS accuracy was about 0.52, indicating the potential of using GS in triticale forage yield improvement.
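
To make the LD decay summary concrete, the sketch below (not the authors' pipeline) computes pairwise r2 between SNP dosage vectors and averages it within physical-distance bins. The genotype matrix, positions, and bin edges are simulated placeholders.

```python
# Hedged sketch: genome-wide LD decay summarized as mean r^2 per distance bin.
# Genotypes and positions are simulated stand-ins, not the triticale GBS data.
import numpy as np

rng = np.random.default_rng(0)
n_lines, n_snps = 289, 500                       # panel size from the abstract; SNP count reduced
geno = rng.integers(0, 3, size=(n_lines, n_snps)).astype(float)   # 0/1/2 allele dosages
pos = np.sort(rng.integers(0, 1_000_000, size=n_snps))            # bp positions on one chromosome

corr = np.corrcoef(geno, rowvar=False)           # pairwise correlation between SNP dosages
r2 = corr ** 2                                   # LD measured as squared correlation

dist = np.abs(pos[:, None] - pos[None, :])       # pairwise physical distance
iu = np.triu_indices(n_snps, k=1)                # each SNP pair once
bins = [0, 10_000, 50_000, 100_000, 500_000, 1_000_000]
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (dist[iu] >= lo) & (dist[iu] < hi)
    if mask.any():
        print(f"{lo // 1000:>4}-{hi // 1000:<4} kb: mean r2 = {r2[iu][mask].mean():.3f}")
```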


2021 ◽  
Author(s):  
Xiangyu Guo ◽  
Ahmed Jahoor ◽  
Just Jensen ◽  
Pernille Sarup

Abstract The objectives were to investigate the prediction of malting quality (MQ) phenotypes in different locations using information from metabolomic spectra, and to compare prediction ability using different models and different training population (TP) sizes. A total of 2,667 plots of 564 malting spring barley lines from three years and two locations were included. Five MQ traits were measured in wort produced from each individual plot. The metabolomic features (MFs) used were 24,018 NMR intensities measured on each wort sample. The statistical analyses involved a metabolomic best linear unbiased prediction (MBLUP) model and a partial least squares regression (PLSR) model. Predictive ability within location and across locations was compared using cross-validation. The proportion of variance in MQ traits that could be explained by MF effects was above 0.9 for all traits. Prediction accuracy increased with increasing TP size, but once the TP size reached 1,000, the rate of increase was negligible. The number of components considered in the PLSR model affected its performance, and 20 components were optimal. Accuracy for individual plots and line means ranged from 0.722 to 0.865 using leave-one-line-out cross-validation and from 0.517 to 0.817 using leave-one-location-out cross-validation. In conclusion, metabolomic prediction of MQ traits using MFs is possible, the prediction accuracy is high, and MBLUP is better than PLSR if the training population is larger than 100. The results have significant implications for practical barley breeding for malting quality.
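
The sketch below illustrates PLSR from high-dimensional metabolomic features to a quality trait with the 20 components the abstract reports as optimal; the feature matrix and trait values are simulated, not the NMR spectra used in the study, and the fold layout only approximates the leave-one-line-out scheme.

```python
# Hedged sketch: PLSR prediction of one malting-quality trait from metabolomic features.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(1)
n_plots, n_features = 600, 2000                  # reduced from 2,667 plots x 24,018 NMR intensities
X = rng.normal(size=(n_plots, n_features))       # metabolomic features (MFs)
beta = rng.normal(size=n_features) * (rng.random(n_features) < 0.05)   # sparse true effects
y = X @ beta + rng.normal(scale=5.0, size=n_plots)                     # one MQ trait

pls = PLSRegression(n_components=20)             # 20 components, as reported optimal
pred = cross_val_predict(pls, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=1))
print("cross-validated predictive correlation:", np.corrcoef(pred.ravel(), y)[0, 1].round(3))
```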


Author(s):  
Edwin Lauer ◽  
James Holland ◽  
Fikret Isik

Abstract Genomic prediction has the potential to significantly increase the rate of genetic gain in tree breeding programs. In this study, a clonally replicated population (n = 2,063) was used to train a genomic prediction model. The model was validated both within the training population and in a separate population (n = 451). Prediction abilities from random (20% vs. 80%) cross-validation within the training population were 0.56 for height and 0.78 for stem form. Removing all full-sib relatives from the training population reduced genomic prediction ability by roughly 50% for both traits. The average prediction ability across all 451 individual trees was 0.29 for height and 0.57 for stem form. The degree of genetic linkage (full-sib family, half-sib family, unrelated) between the training and validation sets had a strong impact on prediction ability for stem form but not for height. A dominant dwarfing allele on linkage group 5, the first to be reported in a conifer species, was discovered via GWAS and conferred a mean height reduction of 0.33 m; however, the QTL was family specific. The rapid decay of LD, large genome size, and inconsistencies in marker-QTL linkage phase suggest that large, diverse training populations are needed for genomic selection in Pinus taeda L.
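
A minimal illustration of how prediction ability is typically estimated in this kind of design: a whole-genome regression model is trained on one set and its predictions are correlated with observations in a held-out set. Ridge regression stands in for the authors' model, and the marker data, effect sizes, and population split are assumptions.

```python
# Hedged sketch: genomic prediction ability as the correlation between predicted and
# observed values in a separate validation population. All data are simulated.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_train, n_valid, n_markers = 2063, 451, 3000    # population sizes from the abstract
M = rng.binomial(2, 0.3, size=(n_train + n_valid, n_markers)).astype(float)  # marker dosages
effects = rng.normal(scale=0.05, size=n_markers)
y = M @ effects + rng.normal(scale=1.0, size=n_train + n_valid)              # e.g. tree height

model = Ridge(alpha=1000.0).fit(M[:n_train], y[:n_train])   # ridge as a GBLUP-like stand-in
pred = model.predict(M[n_train:])
print("prediction ability (r):", np.corrcoef(pred, y[n_train:])[0, 1].round(2))
```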


2021 ◽  
Vol 12 ◽  
Author(s):  
Woo-Keun Seo ◽  
Hyo Suk Nam ◽  
Jong-Won Chung ◽  
Young Dae Kim ◽  
Keon-Ha Kim ◽  
...  

Background and Purpose: The definition of successful reperfusion therapy should be comprehensive and validated beyond the grade of recanalization. This study aimed to develop a novel scoring system for defining successful recanalization after endovascular thrombectomy. Methods: We analyzed data from consecutive acute stroke patients who were eligible for reperfusion therapy within 24 h of onset and who underwent mechanical thrombectomy, using a nationwide multicenter stroke registry. A new score was produced from predictors directly linked to the procedure in order to evaluate the performance of the thrombectomy procedure. Results: In total, 446 patients in the training population and 222 patients in the validation population were analyzed. From the potential components of the score, four items were selected: Emergency Room-to-puncture time (T), adjuvant devices used (A), procedural intracranial bleeding (B), and post-thrombectomy reperfusion status [Thrombolysis in Cerebral Infarction (TICI)]. Using these items, the TAB-TICI score was developed; in the training population it showed good discrimination of early neurological aggravation [AUC 0.73, 95% confidence interval (CI) 0.67–0.78, P < 0.01] and of favorable outcomes (AUC 0.69, 95% CI 0.64–0.75, P < 0.01). The stability of the TAB-TICI score was confirmed by external validation and sensitivity analyses. The TAB-TICI score and its derived grade of successful recanalization were significantly associated with the volume of thrombectomy cases at each site and in each admission year. Conclusion: The TAB-TICI score is a valid and easy-to-use tool to more comprehensively define successful recanalization after endovascular thrombectomy in acute stroke patients with large vessel occlusion.
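
As a rough illustration of how the discrimination of such a composite score is quantified, the sketch below sums binary score components and computes the area under the ROC curve against a simulated outcome. The component weights, prevalences, and outcome model are assumptions, not the TAB-TICI derivation itself.

```python
# Hedged sketch: evaluating the discrimination of an additive clinical score with AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 446                                          # training population size from the abstract
er_to_puncture_delayed = rng.integers(0, 2, n)   # T: prolonged ER-to-puncture time (1 = yes)
adjuvant_devices = rng.integers(0, 2, n)         # A: adjuvant devices used
procedural_bleeding = rng.integers(0, 2, n)      # B: procedural intracranial bleeding
poor_tici = rng.integers(0, 2, n)                # TICI: incomplete reperfusion

score = er_to_puncture_delayed + adjuvant_devices + procedural_bleeding + 2 * poor_tici  # assumed weights
risk = 1 / (1 + np.exp(-(score - 2)))            # higher score -> higher assumed risk
outcome = rng.binomial(1, risk)                  # early neurological aggravation (1 = yes)

print("AUC:", round(roc_auc_score(outcome, score), 2))
```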


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 28-28
Author(s):  
Jorge Hidalgo ◽  
Daniela Lourenco ◽  
Shogo Tsuruta ◽  
Yutaka Masuda ◽  
Vivian Breen ◽  
...  

Abstract The objectives of this research were to investigate trends in the accuracy of genomic predictions over time in a broiler population accumulating data, and to test whether data from distant generations are useful for maintaining the accuracy of genomic predictions in selection candidates. The data contained 820k phenotypes for a growth trait (GROW), 200k for two feed efficiency traits (FE1 and FE2), and 42k for a dissection trait (DT). The pedigree included 1.2M animals across 7 years, of which over 100k from the last 4 years were genotyped. Accuracy was calculated by the linear regression method. Before genotypes became available for the training populations, accuracy was nearly stable despite the accumulation of phenotypes and pedigrees. When the first year of genomic data was included in the training population, accuracy increased by 56, 77, 39, and 111% for GROW, FE1, FE2, and DT, respectively. With genomic information, accuracies increased every year except the last one, when they declined for GROW and FE2. The decay of accuracy over time was evaluated in the progeny, grand-progeny, and great-grand-progeny of the training populations. Without genotypes, the average decline in accuracy across traits was 41% from progeny to grand-progeny and 19% from grand-progeny to great-grand-progeny. With genotypes, the average decline across traits was 14% from progeny to grand-progeny and 2% from grand-progeny to great-grand-progeny. Accuracies in the last 3 generations were the same when the training population included 5 or 2 years of data, and a marginal decrease was observed when the training population included only 1 year of data. Training sets including genomic information provided greater accuracy and persistence of genomic predictions than training sets without genomic data. The two most recent years of data were enough to maintain the accuracy of predictions in selection candidates.
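
The "linear regression method" mentioned above compares breeding values predicted with a reduced (partial) dataset against those predicted once additional data are added (whole). The sketch below computes the usual bias, dispersion, and relative-accuracy statistics from simulated EBV vectors; it is an illustrative stand-in, not the authors' evaluation.

```python
# Hedged sketch: LR-method validation statistics from partial vs. whole predictions.
import numpy as np

rng = np.random.default_rng(4)
n_candidates = 5000
true_bv = rng.normal(size=n_candidates)
ebv_partial = 0.6 * true_bv + rng.normal(scale=0.8, size=n_candidates)   # before later data
ebv_whole = 0.9 * true_bv + rng.normal(scale=0.4, size=n_candidates)     # after later data

bias = ebv_whole.mean() - ebv_partial.mean()                             # ~0 if unbiased
dispersion = np.cov(ebv_whole, ebv_partial)[0, 1] / np.var(ebv_partial, ddof=1)  # slope, ~1 ideally
acc_ratio = np.corrcoef(ebv_whole, ebv_partial)[0, 1]   # often read as acc(partial)/acc(whole)

print(f"bias={bias:.3f}  dispersion={dispersion:.3f}  relative accuracy={acc_ratio:.3f}")
```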


2021 ◽  
Vol 12 ◽  
Author(s):  
Julio Isidro y Sánchez ◽  
Deniz Akdemir

Genomic selection (GS) is becoming an essential tool in breeding programs because of its role in increasing genetic gain per unit time. The design of the training set (TRS) is one of the key steps in implementing GS in plant and animal breeding programs, mainly because (i) TRS optimization is critical for the efficiency and effectiveness of GS; (ii) breeders test genotypes in multi-year and multi-location trials to select the best-performing ones, and in this framework TRS optimization can help decrease the number of genotypes to be tested and therefore reduce phenotyping cost and time; and (iii) optimally selected TRSs yield better prediction accuracies than arbitrary ones. Here, we review the lessons learned from TRS optimization studies in plants and their impact on crop breeding, and we discuss features that are important for the success of TRS optimization under different scenarios, including the major challenges associated with optimization in GS: population size, the relationship between the training and test set (TS), updating of the TRS, and the different packages and algorithms available for implementing TRS optimization in GS. Finally, we describe general guidelines for improving the rate of genetic gain by maximizing the use of TRS optimization within the GS framework.
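
To give a flavor of the optimization criteria discussed in this literature, the sketch below greedily selects a training set from a candidate pool so that it is genomically close to a target (test) set while penalizing redundancy among selected lines. This is a simplified relatedness-based proxy, not the CDmean or PEVmean criteria themselves, and all inputs are simulated.

```python
# Hedged sketch: greedy, relatedness-based training-set (TRS) selection for a target set.
import numpy as np

rng = np.random.default_rng(5)
n, p = 400, 1000
M = rng.binomial(2, 0.4, size=(n, p)).astype(float)
M -= M.mean(axis=0)
G = M @ M.T / p                                  # simple genomic relationship matrix

target = np.arange(300, 400)                     # genotypes we want to predict
candidates = list(range(300))                    # pool from which the TRS is drawn
trs, trs_size = [], 100

for _ in range(trs_size):                        # greedy forward selection
    def gain(i):
        to_target = G[i, target].mean()          # reward closeness to the target set
        to_trs = G[i, trs].mean() if trs else 0.0
        return to_target - 0.5 * to_trs          # penalize redundancy within the TRS
    best = max(candidates, key=gain)
    trs.append(best)
    candidates.remove(best)

print("TRS size:", len(trs), " mean G(TRS, target):", G[np.ix_(trs, target)].mean().round(3))
```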


2021 ◽  
Author(s):  
Rujian Sun ◽  
Bincheng Sun ◽  
Yu Tian ◽  
Shanshan Su ◽  
Yong Zhang ◽  
...  

Abstract Microarray technology enables rapid, accurate, and economical genotyping. Here, using resequencing data from 2,214 representative soybean accessions, we developed the ZDX1 high-throughput functional soybean array, containing 158,959 SNPs that cover 90.92% of soybean genes and sites related to agronomically important traits. We genotyped 817 soybean accessions using the ZDX1 array, including parental lines, non-parental lines, and progeny from a practical breeding pipeline. Non-parental lines had the highest genetic diversity, and 235 SNPs were found to be fixed in the progeny. Accessions with previously unrecognized soybean cyst nematode resistance and early maturity were identified using allele combinations. Notably, the breeding index was a good indicator for progeny selection: superior progeny were derived from crosses between more distantly related parents in which at least one parent had a higher breeding index. Based on this rule, two varieties were developed in a directed manner. Meanwhile, redundant parents were screened out and potential combinations were formulated. GBLUP analysis showed that markers in genic regions gave higher accuracy for predicting four agronomic traits than either whole-genome or intergenic markers. Using progeny to expand the training population increased the prediction accuracy of breeding selection by 32.1%. Collectively, our work provides a versatile array for high-accuracy selection and prediction of both parents and progeny, which can greatly accelerate soybean breeding.
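
A compact sketch of the kind of GBLUP prediction referred to above: a VanRaden-type genomic relationship matrix is built from a marker subset and used to predict unphenotyped accessions. The genotypes, phenotypes, and variance ratio are simulated assumptions, not the ZDX1 data; comparing genic vs. intergenic subsets would simply mean rebuilding G from different marker columns.

```python
# Hedged sketch: GBLUP-style prediction from a genomic relationship matrix (GRM).
import numpy as np

rng = np.random.default_rng(6)
n, p = 817, 4000                                 # 817 genotyped accessions; marker count reduced
freq = rng.uniform(0.05, 0.5, p)
M = rng.binomial(2, freq, size=(n, p)).astype(float)
Z = M - 2 * freq                                 # centered genotypes
G = Z @ Z.T / (2 * np.sum(freq * (1 - freq)))    # VanRaden-type GRM
G += np.eye(n) * 1e-3                            # numerical stabilization

u_true = np.linalg.cholesky(G) @ rng.normal(size=n)      # breeding values consistent with G
y = u_true + rng.normal(scale=1.0, size=n)               # an agronomic trait

train, test = np.arange(0, 650), np.arange(650, n)
lam = 1.0                                                # assumed sigma_e^2 / sigma_u^2
alpha = np.linalg.solve(G[np.ix_(train, train)] + lam * np.eye(len(train)),
                        y[train] - y[train].mean())
u_pred = G[np.ix_(test, train)] @ alpha                  # GBLUP of the unphenotyped accessions
print("prediction accuracy:", np.corrcoef(u_pred, u_true[test])[0, 1].round(2))
```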


Author(s):  
Timothy G. Eckard ◽  
Story F.P. Miraldi ◽  
Karen Y. Peck ◽  
Matthew A. Posner ◽  
Steven J. Svoboda ◽  
...  

ABSTRACT Context: Lower extremity bone stress injuries (BSI) place a significant burden on the health and readiness of the US Armed Forces. Objective: To determine whether pre-injury baseline performance on an expanded and automated 22-item version of the Landing Error Scoring System (LESS-22) is associated with the incidence of BSI in a military training population. Design: Prospective cohort study. Setting: US Military Academy at West Point. Participants: 2,235 incoming cadets (510 females, 22.8%). Main Outcome Measures: Multivariable Poisson regression models were used to produce adjusted incidence rate ratios (IRR) quantifying the association between pre-injury LESS scores and the BSI incidence rate during follow-up, adjusted for pertinent risk factors. Risk factors were included as covariates in the final model if the 95% confidence interval (95% CI) for the crude IRR did not contain 1.00. Results: A total of 54 BSI occurred during the study period, giving an overall incidence rate of 0.07 BSI per 1,000 person-days (95% CI: 0.05, 0.09). The mean number of exposure days was 345.4 (SD 61.12, range 3–368). The final model was adjusted for sex and BMI and yielded an adjusted IRR for the LESS-22 score of 1.06 (95% CI: 1.002, 1.13; p = 0.04), indicating that each additional LESS error documented at baseline was associated with a 6.0% increase in the incidence rate of BSI during the follow-up period. In addition, six individual LESS-22 items, including two newly added items, were significantly associated with BSI incidence. Conclusions: This study provides evidence that performance on the expanded and automated version of the LESS is associated with BSI incidence in a military training population. These results suggest that the automated LESS-22 may be a scalable solution for screening military training populations for BSI risk.
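
The sketch below shows the general form of such a model: a Poisson GLM of injury counts with a log person-time offset, from which the adjusted IRR per one-point increase in LESS-22 score is the exponentiated coefficient. The cohort data, covariate effects, and coefficients are simulated, not the West Point data.

```python
# Hedged sketch: multivariable Poisson regression with an exposure offset, yielding
# an adjusted incidence rate ratio (IRR) per additional LESS-22 error.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 2235                                          # cadets in the cohort
df = pd.DataFrame({
    "less22": rng.integers(2, 15, n),             # baseline LESS-22 error count
    "female": rng.integers(0, 2, n),
    "bmi": rng.normal(24, 2.5, n),
    "days": rng.integers(3, 369, n),              # follow-up days (person-time exposure)
})
rate = np.exp(-9.5 + 0.06 * df.less22 + 0.3 * df.female)   # assumed injuries per person-day
df["bsi"] = rng.poisson(rate * df.days)                    # simulated BSI counts

model = smf.glm("bsi ~ less22 + female + bmi", data=df,
                family=sm.families.Poisson(),
                offset=np.log(df["days"])).fit()
print("adjusted IRR per LESS point:", np.exp(model.params["less22"]).round(3))
```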


Author(s):  
Jorge Hidalgo ◽  
Daniela Lourenco ◽  
Shogo Tsuruta ◽  
Yutaka Masuda ◽  
Vivian Breen ◽  
...  

Abstract Accuracy of genomic predictions is an important component of the selection response. The objectives of this research were: 1) to investigate trends in prediction accuracy over time in a broiler population with accumulating phenotypes, genotypes, and pedigree; and 2) to test whether data from distant generations are useful for maintaining prediction accuracy in selection candidates. The data contained 820K phenotypes for a growth trait (GT), 200K for two feed efficiency traits (FE1 and FE2), and 42K for a carcass yield trait (CY). The pedigree included 1,252,619 birds hatched over seven years, of which 154,318 from the last four years were genotyped. Training populations were constructed by sequentially adding one year of data, and the persistence of accuracy over time was evaluated using predictions for birds hatched in the three generations following the training populations or in the subsequent years. In the first generation of validation, before genotypes became available for the training populations (the first three years of data), accuracies remained almost stable with successive additions of phenotypes and pedigree to the accumulated dataset. Including one year of genotypes in addition to four years of phenotypes and pedigree in the training population led to increases in accuracy of 54% for GT, 76% for FE1, 110% for CY, and 38% for FE2; on average, 74% of the increase was due to genomics. Prediction accuracies declined faster without than with genomic information in the training populations. When genotypes were unavailable, the average decline in prediction accuracy across traits was 41% from the first to the second generation of validation and 51% from the second to the third generation of validation. When genotypes were available, the average decline across traits was 14% from the first to the second generation of validation and 3% from the second to the third generation of validation. Prediction accuracies in the last three generations were the same when the training population included five or two years of data, and a decrease of ~7% was observed when the training population included only one year of data. Training sets including genomic information provided greater accuracy and persistence of genomic predictions than training sets without genomic data. The two most recent years of pedigree, phenotypic, and genomic data were sufficient to maintain prediction accuracy in selection candidates. Similar conclusions were obtained using validation populations defined per year.
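
The sketch below mimics the forward-validation layout described above: training sets that accumulate one year of data at a time are used to predict birds hatched in the following year, and the accuracy trend is tracked. Ridge regression stands in for the genomic evaluation model, and the marker data, trait, and hatch years are simulated assumptions.

```python
# Hedged sketch: forward validation with year-by-year accumulating training sets.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(8)
n, p, years = 4000, 1500, 7
M = rng.binomial(2, 0.3, size=(n, p)).astype(float)        # marker dosages
effects = rng.normal(scale=0.05, size=p)
y = M @ effects + rng.normal(scale=1.0, size=n)            # e.g. the growth trait
hatch_year = rng.integers(1, years + 1, n)

for cutoff in range(3, years):                             # add one year of data at a time
    train = hatch_year <= cutoff
    test = hatch_year == cutoff + 1                        # candidates from the next year
    pred = Ridge(alpha=500.0).fit(M[train], y[train]).predict(M[test])
    acc = np.corrcoef(pred, y[test])[0, 1]
    print(f"train years 1-{cutoff} -> predict year {cutoff + 1}: r = {acc:.2f}")
```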


Biology ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 756
Author(s):  
Wentao Zhang ◽  
Kerry Boyle ◽  
Anita Brule-Babel ◽  
George Fedak ◽  
Peng Gao ◽  
...  

Fusarium head blight (FHB) resistance is quantitatively inherited, controlled by multiple minor-effect genes, and strongly affected by genotype-by-environment interaction. This makes genomic selection (GS), which uses genome-wide molecular marker data to predict genetic breeding values, a promising approach for selecting superior lines with better resistance. However, various factors can affect GS accuracy, and a better understanding of how these factors act could help ensure the success of applying GS to improve FHB resistance in wheat. In this study, we performed a comprehensive evaluation of factors that affect GS accuracy using a multi-parental population designed for FHB resistance. We found that larger sample sizes gave better accuracies. Training populations designed with CDmean-based optimization algorithms achieved significantly higher accuracies than random sampling, whereas the mean of prediction error variance (PEVmean) performed worst. Different genomic selection models performed similarly in accuracy. Including previously known large-effect quantitative trait loci (QTL) as fixed effects in the GS model considerably improved predictive ability. Multi-trait models had almost no effect, while the multi-environment model outperformed the single-environment model for prediction across different environments. Comparing within- and across-family prediction, better accuracies were obtained when the training population was more closely related to the testing population. However, achieving good accuracy for GS prediction across populations remains a challenging issue for GS application.
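
As a simplified two-stage illustration of treating a known large-effect QTL as a fixed effect in a GS model (not the authors' exact model), the sketch below fits the QTL-linked marker unpenalized first and then shrinks the remaining genome-wide markers in a ridge term fitted to the residuals. All genotypes, effect sizes, and the train/test split are simulated assumptions.

```python
# Hedged sketch: genomic prediction with a known large-effect QTL as a fixed effect.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(9)
n, p = 600, 2000
M = rng.binomial(2, 0.4, size=(n, p)).astype(float)
qtl = M[:, 0:1]                                   # assume marker 0 tags the known large-effect QTL
y = 2.0 * qtl.ravel() + M[:, 1:] @ rng.normal(scale=0.03, size=p - 1) + rng.normal(size=n)

train, test = np.arange(0, 450), np.arange(450, n)
fixed = LinearRegression().fit(qtl[train], y[train])        # unpenalized QTL (fixed) effect
resid = y[train] - fixed.predict(qtl[train])
poly = Ridge(alpha=500.0).fit(M[train][:, 1:], resid)       # polygenic background on residuals

pred = fixed.predict(qtl[test]) + poly.predict(M[test][:, 1:])
print("accuracy with QTL as fixed effect:", np.corrcoef(pred, y[test])[0, 1].round(2))
```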

