Genetic analysis of longitudinal height data using random regression

2009 ◽  
Vol 39 (10) ◽  
pp. 1939-1948 ◽  
Author(s):  
Chunkao Wang ◽  
Bengt Andersson ◽  
Patrik Waldmann

Genetic analysis of forest longitudinal height data using random regression (RR) has the potential to be attractive to tree breeders because of its advantages for selection at early ages. Our study provides an example of implementation of RR to forest tree height growth data. The data set comes from the Swedish Scots pine (Pinus sylvestris L.) breeding program with a pedigree over three generations and consists of 899 trees with reconstructed phenotypic height records for 16 years. Legendre polynomials and B-splines were used as base functions in RR models. The restricted maximum likelihood method was employed to estimate (co)variance parameters. Results show that heritability increased with age, except for early ages (years 1 to 4). In general, slightly higher heritabilities were found for the RR model than for the single-trait and paired-trait analyses for most ages. Moreover, the heritabilities obtained with B-splines as the base function tended to be somewhat higher than those obtained with Legendre polynomials. The RR method provides a promising approach for estimating genetic parameters of longitudinal data that can be used in early selection. However, application to real data from other species and to simulated data is needed before general breeding recommendations can be established.
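
As a rough illustration of the Legendre base functions mentioned above, the sketch below builds normalized Legendre polynomials over the 16 measurement years and maps a covariance matrix of random regression coefficients to age-specific genetic variances. The polynomial order, the normalization, and every number in `K` are assumptions chosen for illustration; none of them come from the study.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_basis(ages, order):
    """Return an (n_ages x order+1) matrix of normalized Legendre polynomials."""
    ages = np.asarray(ages, dtype=float)
    # Rescale ages to the interval [-1, 1], where Legendre polynomials are defined.
    t = 2.0 * (ages - ages.min()) / (ages.max() - ages.min()) - 1.0
    # Columns are P_0(t), P_1(t), ..., P_order(t).
    phi = legendre.legvander(t, order)
    # Normalization sqrt((2k + 1) / 2), an assumed convention for this sketch.
    k = np.arange(order + 1)
    return phi * np.sqrt((2 * k + 1) / 2.0)


ages = np.arange(1, 17)              # years 1 to 16, as in the abstract
Phi = legendre_basis(ages, order=2)  # order 2 is an assumption

# Hypothetical covariance matrix of the additive genetic RR coefficients.
K = np.array([[4.0, 0.5, 0.0],
              [0.5, 1.0, 0.1],
              [0.0, 0.1, 0.3]])

# Genetic (co)variances between all pairs of ages implied by K.
G = Phi @ K @ Phi.T
genetic_variance = np.diag(G)        # one value per age
```

Dividing `genetic_variance` by the corresponding phenotypic variance at each age would give an age-specific heritability curve of the kind the abstract describes.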

Author(s):  
M D MacNeil ◽  
J W Buchanan ◽  
M L Spangler ◽  
E Hay

Abstract The objective of this study was to evaluate the effects of various data structures on the genetic evaluation for the binary phenotype of reproductive success. The data were simulated based on an existing pedigree and an underlying fertility phenotype with a heritability of 0.10. A data set of complete observations was generated for all cows. This data set was then modified mimicking the culling of cows when they first failed to reproduce, cows having a missing observation at either their second or fifth opportunity to reproduce as if they had been selected as donors for embryo transfer, and censoring records following the sixth opportunity to reproduce as in a cull-for-age strategy. The data were analyzed using a third order polynomial random regression model. The EBV of interest for each animal was the sum of the age-specific EBV over the first 10 observations (reproductive success at ages 2-11). Thus, the EBV might be interpreted as the genetic expectation of number of calves produced when a female is given ten opportunities to calve. Culling open cows resulted in the EBV for 3 year-old cows being reduced from 8.27 ± 0.03 when open cows were retained to 7.60 ± 0.02 when they were culled. The magnitude of this effect decreased as cows grew older when they first failed to reproduce and were subsequently culled. Cows that did not fail over the 11 years of simulated data had an EBV of 9.43 ± 0.01 and 9.35 ± 0.01 based on analyses of the complete data and the data in which cows that failed to reproduce were culled, respectively. Cows that had a missing observation for their second record had a significantly reduced EBV, but the corresponding effect at the fifth record was negligible. The current study illustrates that culling and management decisions, and particularly those that impact the beginning of the trajectory of sustained reproductive success, can influence both the magnitude and accuracy of resulting EBV.
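
The summed EBV described above can be sketched as follows: age-specific EBVs are obtained by evaluating a third-order polynomial at each age, then added over the ten opportunities (ages 2-11). The use of a Legendre basis and the coefficient values are assumptions for illustration; the abstract only states that a third-order polynomial random regression model was used.

```python
import numpy as np
from numpy.polynomial import legendre


def cumulative_ebv(coef_ebv, ages):
    """Return age-specific EBVs and their sum for one animal."""
    ages = np.asarray(ages, dtype=float)
    # Rescale ages to [-1, 1] before evaluating the polynomials.
    t = 2.0 * (ages - ages.min()) / (ages.max() - ages.min()) - 1.0
    # Unnormalized Legendre polynomials P_0..P_3 (assumed basis).
    Phi = legendre.legvander(t, len(coef_ebv) - 1)
    age_specific = Phi @ np.asarray(coef_ebv, dtype=float)
    return age_specific, float(age_specific.sum())


ages = np.arange(2, 12)                 # ages 2 to 11, ten opportunities
coef_ebv = [0.85, 0.03, -0.02, 0.01]    # hypothetical coefficient EBVs
age_specific, total = cumulative_ebv(coef_ebv, ages)
```

Because the sum is linear in the coefficients, the same total can be obtained as the column sums of the basis matrix multiplied by the coefficient vector.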


Author(s):  
Luiz Fernando Brito ◽  
Felipe Gomes da Silva ◽  
Hinayah Rojas de Oliveira ◽  
Nadson Souza ◽  
Giovani Caetano ◽  
...  

2019 ◽  
Vol 9 (10) ◽  
pp. 3369-3380 ◽  
Author(s):  
Mehdi Momen ◽  
Malachy T. Campbell ◽  
Harkamal Walia ◽  
Gota Morota

Recent advancements in phenomics coupled with increased output from sequencing technologies can create the platform needed to rapidly increase abiotic stress tolerance of crops, which increasingly face productivity challenges due to climate change. In particular, high-throughput phenotyping (HTP) enables researchers to generate large-scale data with temporal resolution. Recently, a random regression model (RRM) was used to model a longitudinal rice projected shoot area (PSA) dataset in an optimal growth environment. However, the utility of RRM is still unknown for phenotypic trajectories obtained from stress environments. Here, we sought to apply RRM to forecast the rice PSA in control and water-limited conditions under various longitudinal cross-validation scenarios. To this end, genomic Legendre polynomials and B-spline basis functions were used to capture PSA trajectories. Prediction accuracy declined slightly for the water-limited plants compared to control plants. Overall, RRM delivered reasonable prediction performance and yielded better prediction than the baseline multi-trait model. The difference between the results obtained using Legendre polynomials and that using B-splines was small; however, the former yielded a higher prediction accuracy. Prediction accuracy for forecasting the last five time points was highest when the entire trajectory from earlier growth stages was used to train the basis functions. Our results suggested that it was possible to decrease phenotyping frequency by only phenotyping every other day in order to reduce costs while minimizing the loss of prediction accuracy. This is the first study showing that RRM could be used to model changes in growth over time under abiotic stress conditions.


2016 ◽  
Vol 51 (11) ◽  
pp. 1848-1856
Author(s):  
Alessandro Haiduck Padilha ◽  
Jaime Araujo Cobuci ◽  
Darlene dos Santos Daltro ◽  
José Braccini Neto

Abstract The objective of this work was to verify the gain in reliability of estimated breeding values (EBVs), when random regression models are applied instead of conventional 305-day lactation models, using fat and protein yield records of Brazilian Holstein cattle for future genetic evaluations. Data set contained 262,426 test-day fat and protein yield records, and 30,228 fat and protein lactation records at 305 days from first lactation. Single trait random regression models using Legendre polynomials and single trait lactation models were applied. Heritability for 305-day yield from lactation models was 0.24 (fat) and 0.17 (protein), and from random regression models was 0.20 (fat) and 0.21 (protein). Spearman correlations of EBVs, between lactation models and random regression models, for 305-day yield, ranged from 0.86 to 0.97 and 0.86 to 0.98 (bulls), and from 0.80 to 0.89 and 0.81 to 0.86 (cows), for fat and protein, respectively. Average increase in reliability of EBVs for 305-day yield of bulls ranged from 2 to 16% (fat) and from 4 to 26% (protein), and average reliability of cows ranged from 24 to 38% (fat and protein), which is higher than in the lactation models. Random regression models using Legendre polynomials will improve genetic evaluations of Brazilian Holstein cattle due to the reliability increase of EBVs, in comparison with 305-day lactation models.


2017 ◽  
Vol 13 (Especial 2) ◽  
pp. 222-234
Author(s):  
Lorrayne Gomes ◽  
Milena Vieira Lima ◽  
Jeferson Corrêa Ribeiro ◽  
Andreia Santos Cezário ◽  
Eliandra Maria Bianchini Oliveira ◽  
...  

In animal breeding, new methodologies can be applied in statistical analysis to improve genetic evaluation and, for this reason, they have been the subject of several studies. In recent years, several studies have aimed to develop models with functions that adjust more flexibly to different variables. A set of functions known as spline functions has attracted the attention of researchers. The purpose of this review is to discuss the use of spline functions applied to growth data in animal breeding. Splines are segmented regression functions joined at points known as joint points (knots); they can improve the curvature of models and, therefore, the fit of the function. These functions have useful properties such as their interpolatory nature, fewer multicollinearity problems, linearity in the parameters and the ability to extend the approximation domain, all of which provide estimates over a wide range of possible values. There are three types of spline functions: natural spline functions, smoothing spline functions (or nonparametric regression) and B-spline functions. The latter are the most widely applied in animal breeding, mainly as alternatives to random regression models (RRM) that use Legendre polynomials. The matrices formed by RRMs with the use of B-spline functions or Legendre polynomials are sparser and easier to invert. The use of spline functions has therefore intensified in recent years, as studies have sought to improve the fit with fewer model parameters. Further studies will make it possible to improve the methodology and to find new applications for spline functions.
Colloquium Agrariae, vol. 13, n. Especial 2, Jan–Jun, 2017, p. 222-234. ISSN: 1809-8215. DOI: 10.5747/ca.2017.v13.nesp2.000229


2019 ◽  
Vol 36 (9) ◽  
pp. 2069-2085 ◽  
Author(s):  
Sohta A Ishikawa ◽  
Anna Zhukova ◽  
Wataru Iwasaki ◽  
Olivier Gascuel

Abstract The reconstruction of ancestral scenarios is widely used to study the evolution of characters along phylogenetic trees. One commonly uses the marginal posterior probabilities of the character states, or the joint reconstruction of the most likely scenario. However, marginal reconstructions provide users with state probabilities, which are difficult to interpret and visualize, whereas joint reconstructions select a unique state for every tree node and thus do not reflect the uncertainty of inferences. We propose a simple and fast approach, which is in between these two extremes. We use decision-theory concepts (namely, the Brier score) to associate each node in the tree to a set of likely states. A unique state is predicted in tree regions with low uncertainty, whereas several states are predicted in uncertain regions, typically around the tree root. To visualize the results, we cluster the neighboring nodes associated with the same states and use graph visualization tools. The method is implemented in the PastML program and web server. The results on simulated data demonstrate the accuracy and robustness of the approach. PastML was applied to the phylogeography of Dengue serotype 2 (DENV2), and the evolution of drug resistances in a large HIV data set. These analyses took a few minutes and provided convincing results. PastML retrieved the main transmission routes of human DENV2 and showed the uncertainty of the human-sylvatic DENV2 geographic origin. With HIV, the results show that resistance mutations mostly emerge independently under treatment pressure, but resistance clusters are found, corresponding to transmissions among untreated patients.
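
One way to read the Brier-score idea above is sketched below: for a single node, states are ranked by marginal posterior probability, and the number of retained states k is chosen so that a uniform distribution over the top k states is closest, in squared error, to the marginal probabilities. This is an illustrative reading of the abstract, not PastML's code; the function name and the example probabilities are hypothetical.

```python
import numpy as np


def likely_states(marginal_probs):
    """Return indices of the retained states and the corresponding Brier-type score."""
    p = np.asarray(marginal_probs, dtype=float)
    order = np.argsort(-p)          # states from most to least probable
    sorted_p = p[order]
    best_k, best_score = 1, np.inf
    for k in range(1, len(p) + 1):
        # Candidate prediction: the top k states are equally likely, the rest impossible.
        target = np.zeros(len(p))
        target[:k] = 1.0 / k
        score = float(np.sum((sorted_p - target) ** 2))
        # Strict improvement required, so ties favor the smaller set.
        if score < best_score - 1e-12:
            best_k, best_score = k, score
    return sorted(order[:best_k].tolist()), best_score


confident_node, _ = likely_states([0.90, 0.05, 0.05])   # one state dominates
uncertain_node, _ = likely_states([0.45, 0.45, 0.10])   # two states compete
```

With these inputs the first node keeps a single state and the second keeps two, which mirrors the behavior described for low-uncertainty and high-uncertainty regions of the tree.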


2020 ◽  
Vol 636 ◽  
pp. 19-33 ◽  
Author(s):  
AM Edwards ◽  
JPW Robinson ◽  
JL Blanchard ◽  
JK Baum ◽  
MJ Plank

Size spectra are recommended tools for detecting the response of marine communities to fishing or to management measures. A size spectrum succinctly describes how a property, such as abundance or biomass, varies with body size in a community. Required data are often collected in binned form, such as numbers of individuals in 1 cm length bins. Numerous methods have been employed to fit size spectra, but most give biased estimates when tested on simulated data, and none account for the data’s bin structure (breakpoints of bins). Here, we used 8 methods to fit an annual size-spectrum exponent, b, to an example data set (30 yr of the North Sea International Bottom Trawl Survey). The methods gave conflicting conclusions regarding b declining (the size spectrum steepening) through time, and so any resulting advice to ecosystem managers will be highly dependent upon the method used. Using simulated data, we showed that ignoring the bin structure gives biased estimates of b, even for high-resolution data. However, our extended likelihood method, which explicitly accounts for the bin structure, accurately estimated b and its confidence intervals, even for coarsely collected data. We developed a novel visualisation method that accounts for the bin structure and associated uncertainty, provide recommendations concerning different data types and have created an R package (sizeSpectra) to reproduce all results and encourage use of our methods. This work is also relevant to wider applications where a power-law distribution (the underlying distribution for a size spectrum) is fitted to binned data.
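
The idea of a likelihood that accounts for bin structure can be sketched as follows, assuming a bounded power-law density proportional to x^b between the lowest and highest bin breaks. Each bin's probability is the integral of the density between its breakpoints, and the counts enter a multinomial log-likelihood. This is a minimal sketch under those assumptions, not the sizeSpectra implementation; the function names, the grid search, and the example counts are hypothetical.

```python
import numpy as np


def binned_neg_loglik(b, breaks, counts):
    """Negative log-likelihood of binned counts under a bounded power law x**b."""
    breaks = np.asarray(breaks, dtype=float)
    counts = np.asarray(counts, dtype=float)
    e = b + 1.0
    if abs(e) < 1e-12:
        cum = np.log(breaks)          # antiderivative of x**-1
    else:
        cum = breaks ** e / e         # antiderivative of x**b
    # Probability of each bin, normalized over the full range of the breaks.
    probs = np.diff(cum) / (cum[-1] - cum[0])
    return -float(np.sum(counts * np.log(probs)))


def fit_b(breaks, counts, grid=np.linspace(-4.0, -0.5, 3501)):
    """Grid-search estimate of b (a simple stand-in for a numerical optimizer)."""
    nll = [binned_neg_loglik(b, breaks, counts) for b in grid]
    return float(grid[int(np.argmin(nll))])


breaks = [1, 2, 4, 8, 16]             # hypothetical bin breakpoints
counts = [800, 400, 200, 100]         # hypothetical counts per bin
b_hat = fit_b(breaks, counts)
```

The example counts are exactly proportional to the bin probabilities for b = -2, so the estimate should land at or very near -2; treating each count as if it sat at a bin midpoint would not generally recover the exponent this way.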


2019 ◽  
Author(s):  
Mehdi Momen ◽  
Malachy T. Campbell ◽  
Harkamal Walia ◽  
Gota Morota

Abstract Recent advancements in phenomics coupled with increased output from sequencing technologies can create the platform needed to rapidly increase abiotic stress tolerance of crops, which increasingly face productivity challenges due to climate change. In particular, high-throughput phenotyping (HTP) enables researchers to generate large-scale data with temporal resolution. Recently, a random regression model (RRM) was used to model a longitudinal rice projected shoot area (PSA) dataset in an optimal growth environment. However, the utility of RRM is still unknown for phenotypic trajectories obtained from stress environments. Here, we sought to apply RRM to forecast the rice PSA in control and water-limited conditions under various longitudinal cross-validation scenarios. To this end, genomic Legendre polynomials and B-spline basis functions were used to capture PSA trajectories. Prediction accuracy declined slightly for the water-limited plants compared to control plants. Overall, RRM delivered reasonable prediction performance and yielded better prediction than the baseline multi-trait model. The difference between the results obtained using Legendre polynomials and that using B-splines was small; however, the former yielded a higher prediction accuracy. Prediction accuracy for forecasting the last five time points was highest when the entire trajectory from earlier growth stages was used to train the basis functions. Our results suggested that it was possible to decrease phenotyping frequency by only phenotyping every other day in order to reduce costs while minimizing the loss of prediction accuracy. This is the first study showing that RRM could be used to model changes in growth over time under abiotic stress conditions.

