Prediction Strategies for Leveraging Information of Associated Traits under Single- and Multi-Trait Approaches in Soybeans

Agriculture ◽  
2020 ◽  
Vol 10 (8) ◽  
pp. 308
Author(s):  
Reyna Persa ◽  
Arthur Bernardeli ◽  
Diego Jarquin

The availability of molecular markers has revolutionized conventional ways to improve genotypes in plant and animal breeding through genome-based predictions. Several models and methods have been developed to leverage genomic information in the prediction context, allowing more efficient ways to screen and select superior genotypes. In plant breeding, grain yield (yield) is usually the main trait driving the selection of superior genotypes; however, in many cases, information on associated traits is also routinely collected and can potentially be used to enhance selection. In this research, we considered different prediction strategies to leverage the information of the associated traits (AT; full: all traits observed for the same genotype; partial: some traits observed for the same genotype) under an alternative single-trait model and the multi-trait approach. The alternative single-trait model included the information of the AT for yield prediction via the phenotypic covariances, while the multi-trait model jointly analyzed all the traits. The performance of these strategies was assessed using the marker and phenotypic information from the Soybean Nested Association Mapping (SoyNAM) project observed in Nebraska in 2012. The results showed that the alternative single-trait strategy, which combines the marker information and the information of the AT, outperforms the multi-trait model by around 12% and the conventional single-trait strategy (baseline) by 25%. When no information on the AT was available for the genotypes in the testing sets, the multi-trait model reduced the baseline results by around 6%. For the cases where genotypes were partially observed (i.e., some traits observed but not others for the same genotype), the multi-trait strategy showed improvements of around 6% for yield and between 2% and 9% for the other traits. Hence, when yield drives the selection of superior genotypes, the single-trait and multi-trait genomic prediction will achieve significant improvements when some genotypes have been fully or partially tested, with the alternative single-trait model delivering the best results. These results provide empirical evidence of the usefulness of the AT for improving the predictive ability of prediction models for breeding applications.


2019 ◽  
Vol 15 ◽  
pp. 117693431984002 ◽  
Author(s):  
Reka Howard ◽  
Diego Jarquin

Prediction techniques are important in plant breeding because they provide a tool for selection that is more efficient and economical than traditional phenotypic and pedigree-based selection. Conventional genomic prediction models use molecular marker information to predict the phenotype. With the development of new phenomics techniques, we have the opportunity to collect image data on the plants and to extend the traditional genomic prediction models by incorporating a diverse set of information collected on the plants. In our research, we developed a hybrid matrix model that incorporates molecular marker and canopy coverage information as a weighted linear combination to predict grain yield for the soybean nested association mapping (SoyNAM) panel. To obtain the testing and training sets, we clustered the individuals based on their marker and canopy information using 2 different clustering techniques, and we compared 5 different cross-validation schemes. The results showed that the predictive ability of the models was highest when both the canopy and marker information were included, and lowest when only the canopy information was included.
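As a rough illustration of what a weighted linear combination of marker and canopy information can look like, the sketch below blends two similarity matrices element by element. It is not the authors' implementation: the function name `hybrid_matrix`, the weight `w`, and the toy 2×2 matrices are assumptions made for this example, and the abstract does not say how the weight was chosen.

```python
def hybrid_matrix(marker_sim, canopy_sim, w):
    """Blend two square similarity matrices as w * marker + (1 - w) * canopy.

    All names here are hypothetical; the weight w is assumed to lie in [0, 1].
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if len(marker_sim) != len(canopy_sim):
        raise ValueError("matrices must have the same number of rows")
    return [
        [w * g + (1.0 - w) * c for g, c in zip(g_row, c_row)]
        for g_row, c_row in zip(marker_sim, canopy_sim)
    ]


# Toy similarity matrices for two individuals (made-up values).
G = [[1.0, 0.2], [0.2, 1.0]]  # marker-based similarity
C = [[1.0, 0.6], [0.6, 1.0]]  # canopy-based similarity

H = hybrid_matrix(G, C, w=0.5)
```

Setting `w=1.0` recovers the marker-only matrix and `w=0.0` the canopy-only matrix, which correspond to the two single-source models the abstract compares against.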



2021 ◽  
Vol 13 (15) ◽  
pp. 8247
Author(s):  
Dimitrios N. Vlachostergios ◽  
Christos Noulas ◽  
Anastasia Kargiotidou ◽  
Dimitrios Baxevanos ◽  
Evangelia Tigka ◽  
...  

Lentil is a versatile and profitable pulse crop with high nutritional food and feed values. The objectives of the study were to determine suitable locations for high yield and quality in terms of production and/or breeding, and to identify promising genotypes. For this reason, five lentil genotypes were evaluated in a multi-location network consisting of ten diverse sites for two consecutive growing seasons, for seed yield (SY), other agronomic traits, crude protein (CP), cooking time (CT) and crude protein yield (CPY). A significant diversification and specialization of the locations was identified with regards to SY, CP, CT and CPY. Different locations showed optimal values for each trait. Locations E4 and E3, followed by E10, were “ideal” for SY; locations E1, E3 and E7 were ideal for high CP; and the “ideal” locations for CT were E3 and E5, followed by E2. Therefore, the scope of the cultivation determined the optimum locations for lentil cultivation. The GGE-biplot analysis revealed different discriminating abilities and representativeness among the locations for the identification of the most productive and stable genotypes. Location E3 (Orestiada, Region of Thrace) was recognized as being optimal for lentil breeding, as it was the “ideal” or close to “ideal” for the selection of superior genotypes for SY, CP, CT and CPY. Adaptable genotypes (cv. Dimitra, Samos) showed a high SY along with excellent values for CP, CT and CPY, and are suggested either for cultivation in many regions or to be exploited in breeding programs.



2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Menelaos Pavlou ◽  
Gareth Ambler ◽  
Rumana Z. Omar

Background: Clustered data arise in research when patients are clustered within larger units. Generalised Estimating Equations (GEE) and Generalised Linear Mixed Models (GLMM) can be used to provide marginal and cluster-specific inference and predictions, respectively. Methods: Confounding by Cluster (CBC) and Informative Cluster Size (ICS) are two complications that may arise when modelling clustered data. CBC can arise when the distribution of a predictor variable (termed ‘exposure’) varies between clusters, causing confounding of the exposure-outcome relationship. ICS means that the cluster size conditional on covariates is not independent of the outcome. In both situations, standard GEE and GLMM may provide biased or misleading inference, and modifications have been proposed. However, both CBC and ICS are routinely overlooked in the context of risk prediction, and their impact on the predictive ability of the models has been little explored. We study the effect of CBC and ICS on the predictive ability of risk models for binary outcomes when GEE and GLMM are used. We examine whether two simple approaches to handle CBC and ICS, which involve adjusting for the cluster mean of the exposure and the cluster size, respectively, can improve the accuracy of predictions. Results: Both CBC and ICS can be viewed as violations of the assumptions in the standard GLMM; the random effects are correlated with exposure for CBC and with cluster size for ICS. Based on these principles, we simulated data subject to CBC/ICS. The simulation studies suggested that the predictive ability of models derived from standard GLMM and GEE that ignore CBC/ICS was affected. Marginal predictions were found to be mis-calibrated. Adjusting for the cluster mean of the exposure or the cluster size improved calibration, discrimination and the overall predictive accuracy of marginal predictions by explaining part of the between-cluster variability. The presence of CBC/ICS did not affect the accuracy of conditional predictions. We illustrate these concepts using real data from a multicentre study with potential CBC. Conclusion: Ignoring CBC and ICS when developing prediction models for clustered data can affect the accuracy of marginal predictions. Adjusting for the cluster mean of the exposure or the cluster size can improve the predictive accuracy of marginal predictions.
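The two adjustments named in the abstract, the cluster mean of the exposure and the cluster size, are derived covariates that can be computed before any model is fitted. The sketch below shows only that data-preparation step, under assumed field names (`cluster`, `exposure`) and made-up records; fitting the GEE or GLMM itself is not shown and would depend on the software used.

```python
from collections import defaultdict


def add_cluster_covariates(rows):
    """Return copies of rows with two extra (hypothetical) fields:
    'cluster_mean_exposure' and 'cluster_size'."""
    totals = defaultdict(float)
    sizes = defaultdict(int)
    # First pass: accumulate the exposure total and count per cluster.
    for row in rows:
        totals[row["cluster"]] += row["exposure"]
        sizes[row["cluster"]] += 1
    # Second pass: attach the cluster-level summaries to every row.
    augmented = []
    for row in rows:
        c = row["cluster"]
        augmented.append(
            {
                **row,
                "cluster_mean_exposure": totals[c] / sizes[c],
                "cluster_size": sizes[c],
            }
        )
    return augmented


# Made-up records: two patients in cluster A, one in cluster B.
patients = [
    {"cluster": "A", "exposure": 1.0},
    {"cluster": "A", "exposure": 3.0},
    {"cluster": "B", "exposure": 10.0},
]
augmented = add_cluster_covariates(patients)
```

Either new column could then be entered as an additional predictor alongside the exposure in the prediction model.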



2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Michelle Louise Gatt ◽  
Maria Cassar ◽  
Sandra C. Buttigieg

Purpose: The purpose of this paper is to identify and analyse the readmission risk prediction tools reported in the literature and their benefits for healthcare organisations and management. Design/methodology/approach: Readmission risk prediction is a growing topic of interest, with the aim of identifying patients, in particular those suffering from chronic diseases such as congestive heart failure, chronic obstructive pulmonary disease and diabetes, who are at risk of readmission. Several models have been developed with different levels of predictive ability. A structured and extensive literature search of several databases was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses strategy, and this yielded a total of 48,984 records. Findings: Forty-three articles were selected for full-text and extensive review after following the screening process and according to the eligibility criteria. About 34 unique readmission risk prediction models were identified, whose predictive ability ranged from poor to good (c statistic 0.5–0.86). Readmission rates ranged between 3.1% and 74.1% depending on the risk category. This review shows that readmission risk prediction is a complex process, is still relatively new as a concept and is poorly understood. It confirms that readmission prediction models achieve significant accuracy in identifying patients at higher risk of such an event within a specific context. Research limitations/implications: Since most prediction models were developed for specific populations, conditions or hospital settings, the generalisability and transferability of the predictions across wider or other contexts may be difficult to achieve. Therefore, the value of prediction models remains limited to hospital management. Future research is indicated in this regard. Originality/value: This review is the first to cover readmission risk prediction tools that have been published in the literature since 2011, thereby providing an assessment of the relevance of this crucial KPI to health organisations and managers.



2021 ◽  
Vol 12 (3) ◽  
pp. 221
Author(s):  
John K.M. Kuwornu ◽  
Chutiporn Anutariya ◽  
Attaphongse Taparugssanagorn ◽  
Sumanya Ngandee


Risks ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 159
Author(s):  
Sunghwa Park ◽  
Hyunsok Kim ◽  
Janghan Kwon ◽  
Taeil Kim

In this paper, we use a logit model to predict the probability of default for Korean shipping companies. We explore numerous financial ratios to find predictors of a shipping firm’s failure and construct four default prediction models. The results suggest that a model with industry-specific indicators outperforms the other models in predictive ability. This finding indicates that utilizing information about the unique financial characteristics of the shipping industry may enhance the performance of default prediction models. Given the importance of the shipping industry in the Korean economy, this study can benefit both policymakers and market participants.



Author(s):  
Eva–Maria Walz ◽  
Marlon Maranan ◽  
Roderick van der Linden ◽  
Andreas H. Fink ◽  
Peter Knippertz

Current numerical weather prediction models show limited skill in predicting low-latitude precipitation. To aid future improvements, be it with better dynamical or statistical models, we propose a well-defined benchmark forecast. We use arguably the best currently available high-resolution, gauge-calibrated, gridded precipitation product, the Integrated Multi-Satellite Retrievals for GPM (Global Precipitation Measurement) (IMERG) “final run”, in a ±15-day window around the date of interest to build an empirical climatological ensemble forecast. This window size is an optimal compromise between statistical robustness and flexibility to represent seasonal changes. We refer to this benchmark as Extended Probabilistic Climatology (EPC) and compute it on a 0.1°×0.1° grid for 40°S–40°N and the period 2001–2019. In order to reduce and standardize information, a mixed Bernoulli-Gamma distribution is fitted to the empirical EPC, which hardly affects predictive performance. The EPC is then compared to 1-day ensemble predictions from the European Centre for Medium-Range Weather Forecasts (ECMWF) using standard verification scores. With respect to rainfall amount, ECMWF performs only slightly better than EPC over most of the low latitudes and worse over high-mountain and dry oceanic areas as well as over tropical Africa, where the lack of skill is also evident in independent station data. For rainfall occurrence, EPC is superior over most oceanic, coastal, and mountain regions, although the better potential predictive ability of ECMWF indicates that this is mostly due to calibration problems. To encourage the use of the new benchmark, we provide the data, scripts, and an interactive webtool to the scientific community.
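The window-based construction of a climatological ensemble can be sketched for a single grid cell as follows. This is a simplified illustration rather than the authors' code: the function names, the 365-day calendar, the wrap-around within the same year at the year boundary, the inclusion of every year in the record, and the toy data are all assumptions made for the example.

```python
def epc_ensemble(daily_by_year, day_of_year, half_window=15, days_in_year=365):
    """Collect, from every year, the values within +/- half_window days
    of day_of_year (0-based). Each collected value is one ensemble member.

    Simplification: the window wraps around within the same year.
    """
    members = []
    for year in sorted(daily_by_year):
        series = daily_by_year[year]
        for offset in range(-half_window, half_window + 1):
            members.append(series[(day_of_year + offset) % days_in_year])
    return members


def prob_of_rain(members, threshold=0.0):
    """Fraction of ensemble members exceeding a (hypothetical) rain threshold."""
    return sum(1 for value in members if value > threshold) / len(members)


# Toy record for one grid cell: rain on odd days in 2001, a dry 2002.
toy = {
    2001: [0.0 if d % 2 == 0 else 1.0 for d in range(365)],
    2002: [0.0] * 365,
}
ens = epc_ensemble(toy, day_of_year=100)
p_rain = prob_of_rain(ens)
```

With a ±15-day window each year contributes 31 members, so a 19-year record such as 2001–2019 would give 589 members per grid cell under these assumptions.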



2021 ◽  
Vol 45 ◽  
Author(s):  
Anna Regina Tiago Carneiro ◽  
Osvaldo Toshiyuki Hamawaki ◽  
Ana Paula Oliveira Nogueira ◽  
Arthur Felipe Eustáquio e Silva ◽  
Raphael Lemes Hamawaki ◽  
...  

Selection indexes aggregate information on multiple characters and can therefore carry out the selection of a set of variables simultaneously. The objective was to verify the genetic potential of agronomic traits and to select soybean F3:4 progenies based on different selection strategies. A total of 123 progenies and the parents were sown in randomized blocks with two replications. The selection gains obtained with the two indexes, the sum of “ranks” and the genotype-ideotype, were lower for all characters than the gains from direct and indirect selection. The rank sum index stood out for achieving the highest total gain, 37.11%. The genotype-ideotype index obtained a lower gain (-0.48%) for the character number of days to flowering than the sum of “ranks” index (-0.54%) and reached a negative gain of -1.82% for the attribute insertion height of the first pod. The genetic potential of the F3:4 population is high and allows different selection strategies to be applied to reach superior genotypes. The progenies UFU 72, UFU 116, UFU 86, UFU 45, UFU 117, UFU 56, UFU 5, UFU 106, UFU 6, UFU 4, UFU 73, UFU 101, UFU 96, UFU 90, UFU 123, UFU 116, UFU 88, UFU 65, UFU 70, UFU 3, UFU 69 and UFU 37 were selected by both selection indexes. The UFU 72, UFU 90, UFU 88 and UFU 69 progenies are agronomically superior in direct and indirect selection as well as in the Mulamba and Mock (1978) sum of “ranks” and genotype-ideotype selections.
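A sum-of-ranks index of the kind attributed to Mulamba and Mock (1978) ranks every genotype on each trait and adds the ranks, so that the lowest total marks the best overall candidate. The sketch below is an unweighted version under stated assumptions: the progeny labels, trait names, values and the desired direction for each trait are made up for illustration, and ties are broken by sort order rather than by averaged ranks.

```python
def rank_sum_index(scores, higher_is_better):
    """Return {genotype: sum of per-trait ranks}; a lower sum is better.

    scores: {genotype: {trait: value}}
    higher_is_better: {trait: bool}, the assumed desired direction per trait.
    """
    totals = {genotype: 0 for genotype in scores}
    for trait, descending in higher_is_better.items():
        # Rank 1 goes to the most desirable value for this trait.
        ordered = sorted(scores, key=lambda g: scores[g][trait], reverse=descending)
        for rank, genotype in enumerate(ordered, start=1):
            totals[genotype] += rank
    return totals


# Hypothetical progenies and trait values.
progenies = {
    "P1": {"yield": 3.2, "days_to_flowering": 38},
    "P2": {"yield": 2.8, "days_to_flowering": 40},
    "P3": {"yield": 3.5, "days_to_flowering": 50},
}
index = rank_sum_index(progenies, {"yield": True, "days_to_flowering": False})
best_first = sorted(index, key=index.get)
```

Here P1 ranks second for yield and first for earliness, giving it the lowest rank sum even though P3 has the highest yield.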


