Modelling and analysis of incomplete and short lactations

2003 ◽  
Vol 76 (1) ◽  
pp. 19-25 ◽  
Author(s):  
F Jaffrézic ◽  
P Minini

Abstract: Advantages of the use of test-day records for genetic evaluation of dairy cattle are now widely accepted. In particular, longitudinal models such as random regression avoid using ad hoc extrapolation procedures to reconstruct complete lactations as they provide individual predictions even for incomplete data. However, these predictions and parameter estimates obtained in the model do not take into account the lactation length. This can be an important drawback for phenotypic and genetic analysis of milk production of cows with shorter lactations. The aim of this paper is to propose a methodology that would correct these predictions, weighting them by the probability at each point in time of each cow being dried off. The proposed procedure is easy to implement and calculations are fast to compute. A simulation study and an application on real data for daily milk records show that the proposed methodology provides a more accurate estimation for individual cumulative production as well as genetic values, and avoids predicting negative productions at the end of the lactation as is often the case with random regression models.
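The correction described in this abstract can be pictured with a minimal Python sketch. It assumes, as one reading that the abstract does not spell out, that each predicted daily yield is multiplied by the probability that the cow is still in milk on that day (one minus the dried-off probability) before summing. The function name and all numbers are hypothetical and are not taken from the paper.

```python
def weighted_cumulative_yield(pred_daily, p_dried_off):
    """Sum daily predictions, each down-weighted by the chance the cow is already dry.

    Assumption: weight = 1 - P(dried off by that day). This is an illustrative
    interpretation, not the authors' published estimator.
    """
    if len(pred_daily) != len(p_dried_off):
        raise ValueError("inputs must have the same length")
    return sum(y * (1.0 - p) for y, p in zip(pred_daily, p_dried_off))


# Hypothetical predictions (kg/day) late in lactation; the last one is negative,
# as can happen with an unconstrained random regression curve.
pred = [12.0, 8.0, 4.0, 1.0, -2.0]
# Hypothetical probabilities that the cow has been dried off by each day.
p_dry = [0.0, 0.25, 0.5, 1.0, 1.0]

naive_total = sum(pred)
weighted_total = weighted_cumulative_yield(pred, p_dry)
```

In this toy case the days on which the cow is certainly dry contribute nothing, so the negative prediction no longer affects the cumulative total.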

PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0233200
Author(s):  
Michel Henriques de Souza ◽  
José Domingos Pereira Júnior ◽  
Skarlet De Marco Steckling ◽  
Jussara Mencalha ◽  
Fabíola dos Santos Dias ◽  
...  

The evaluation of cultivars using multi-environment trials (MET) is an important step in plant breeding programs. One of the objectives of these evaluations is to understand the genotype by environment interaction (GEI). A method of determining the effect of GEI on the performance of cultivars is based on studies of adaptability and stability. Initial studies were based on linear regression; however, these methodologies have limitations, mainly in trials with genetic or statistical imbalance, heterogeneity of residual variances, and genetic covariance. An alternative would be the use of random regression models (RRM), in which the behavior of the genotypes is characterized as a reaction norm using longitudinal data or repeated measurements and information regarding a covariance function. The objective of this work was the application of RRM in the study of the behavior of common bean cultivars using a MET, based on Legendre polynomials and genotype-ideotype distances. We used a set of 13 trials, which were classified as unfavorable or favorable environments. The results revealed that RRM enables the prediction, with high accuracy, of the genotypic values of cultivars in environments where they were not evaluated, thereby circumventing the imbalance of the experiments. From these values, it was possible to measure the genotypic adaptability according to ideotypes, according to their reaction norms. In addition, the stability of the cultivars can be interpreted as variation in the behavior of the ideotype. The use of ideotypes based on real data allowed a better comparison of the performance of cultivars across environments. The use of RRM in plant breeding is a good alternative to understand the behavior of cultivars in a MET, especially when we want to quantify the adaptability and stability of genotypes.


2005 ◽  
Vol 81 (2) ◽  
pp. 233-238 ◽  
Author(s):  
G. Banos ◽  
G. Arsenos ◽  
Z. Abas ◽  
Z. Basdagianni

Abstract: Parameters of daily milk yield during the first three lactations of Chios ewes were estimated with random regression models. Data consisted of 42 675 test-day records of 7121 ewes from 75 flocks that had lambed between 1998 and 2000. Models fitted fourth order fixed regressions on Legendre polynomials of the number of days post partum and fourth order random regressions on the individual animal. (Co)variance components were estimated with Gibbs sampling. Lactations were analysed separately. The four eigenvalues accounted for 0·80 to 0·84, 0·11 to 0·15, 0·04 to 0·05 and about 0·01 of the animal variance, respectively, depending on lactation number. Animal variance estimates, including genetic and, partly, permanent environment effects, were high at the beginning of each lactation and decreased as lactation progressed, suggesting that the animal effect is most important for early daily records. Residual variance was highest in the middle of lactation, suggesting that non-systematic environmental factors play a bigger role at that time. Animal correlation estimates between daily yield records ranged from 0·26 to 0·99, were highest for adjacent days and decreased for days further apart. The decline had a different shape in the three lactations and was more evident in the first, suggesting that the three lactations may be biologically distinct traits. Animal correlation estimates between daily and total lactation milk yield ranged from 0·61 to 0·98 and were highest in the middle and lowest towards the end of lactation. Early lactation daily yield had an animal correlation of 0·70 to 0·80 with total lactation milk yield, in all three lactations. Results of this study suggest that daily milk yield records in the early stages of lactation may be useful for selection of ewes with high producing ability and accurate prediction of total lactation milk yield.
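The Legendre covariates used in random regressions like these can be built with a short recurrence. The Python sketch below assumes that "fourth order" means four coefficients (polynomial degrees 0 to 3), which is consistent with the four eigenvalues reported, and it uses unnormalized polynomials on days rescaled to [-1, 1]. The day range of 5 to 305 and the function name are hypothetical; the abstract gives neither.

```python
def legendre_basis(day, day_min, day_max, n_coef=4):
    """Return Legendre polynomials P0..P(n_coef-1) evaluated at a rescaled day.

    Days are mapped linearly to [-1, 1]; higher degrees come from Bonnet's
    recurrence (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x).
    Unnormalized polynomials are an assumption; some software rescales them.
    """
    x = 2.0 * (day - day_min) / (day_max - day_min) - 1.0
    basis = [1.0, x]
    for n in range(1, n_coef - 1):
        basis.append(((2 * n + 1) * x * basis[n] - n * basis[n - 1]) / (n + 1))
    return basis[:n_coef]


# Hypothetical lactation window of 5 to 305 days post partum.
row_start = legendre_basis(5, 5, 305)
row_mid = legendre_basis(155, 5, 305)
row_end = legendre_basis(305, 5, 305)
```

Each test-day record contributes one such row to the design matrices of the fixed and random regressions; the random regression coefficients on these covariates are what the (co)variance components describe.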


Animals ◽  
2020 ◽  
Vol 10 (11) ◽  
pp. 2115
Author(s):  
Juan Vicente Delgado Bermejo ◽  
Francisco Antonio Limón Pérez ◽  
Francisco Javier Navas González ◽  
Jose Manuel León Jurado ◽  
Javier Fernández Álvarez ◽  
...  

A total of 137,927 controls of 22,932 Murciano-Granadina first lactation goats (measured between 1996–2016) were evaluated to determine the influence of the number of kids, season, year and farm on total milk yield, daily milk yield, lactation length, total production of fat and protein and percentages of fat and protein. All factors analyzed had a significant effect on the variables studied, except for the influence of the number of kids on the percentages of fat and protein, where the variation was very small. Goats with two offspring produced nearly 15% more milk, fat and protein per lactation compared to goats with single kids. Kiddings occurring in summer–autumn resulted in average milk, fat and protein yields nearly 14, 19 and 23% higher when compared to winter–spring kiddings. Lactation curves were evaluated to determine the effects of the number of kids and season, using the linearized version of the model of Wood in random regression analyses. Peak yield increased by about 0.3 kg per additional offspring at kidding, but persistence was higher in goats with single offspring. The kidding season significantly influenced the lactation curve shape. Hence summer-kidding goats were more productive, and peak occurred earlier, while a higher persistence was observed in goats kidding during autumn.
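For readers unfamiliar with the model of Wood, it is commonly written as y(t) = a t^b exp(-c t), and taking logarithms gives the linear form ln y(t) = ln a + b ln t - c t. The Python sketch below evaluates both forms and the peak implied by them (day b / c). The parameter values are hypothetical and are not estimates from this study, whose exact covariates are not given in the abstract.

```python
import math


def wood_yield(t, a, b, c):
    """Wood's lactation curve: a * t**b * exp(-c * t), with t in days in milk."""
    return a * t**b * math.exp(-c * t)


def wood_log_yield(t, a, b, c):
    """Linearized form: ln y = ln a + b * ln t - c * t (linear in ln a, b and c)."""
    return math.log(a) + b * math.log(t) - c * t


def wood_peak(a, b, c):
    """Day of peak yield (b / c) and the yield on that day."""
    t_peak = b / c
    return t_peak, wood_yield(t_peak, a, b, c)


# Hypothetical parameters, chosen only to give a plausible curve shape.
a, b, c = 1.5, 0.2, 0.004
peak_day, peak_yield = wood_peak(a, b, c)
```

Because the linearized form is linear in ln a, b and c, those three quantities can serve as regression coefficients, which is what makes the model usable inside a random regression analysis.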


2000 ◽  
Vol 70 (3) ◽  
pp. 407-415 ◽  
Author(s):  
S. Brotherstone ◽  
I. M. S. White ◽  
K. Meyer

Abstract: Random regression models have been advocated for the analysis of test day records in dairy cattle. The effectiveness of a random regression analysis depends on the function used to model the data. To investigate functions suitable for the analysis of daily milk yield, test day milk yields of 7860 first lactation Holstein Friesian cows were analysed using random regression models involving three types of curves. Each analysis fitted the same curve to model overall trends through a fixed regression and random deviations due to animals. Curves included orthogonal polynomials, fitted to order 3 (quadratic), 4 (cubic) and 5 (quartic), respectively, a three-parameter parametric curve and a five-parameter parametric curve. Sets of random regression coefficients were fitted to model both animals’ genetic effects and permanent environmental effects. Temporary measurement errors were assumed independently but heterogeneously distributed, and assigned to one of 12 classes. Results showed that the measurement error variances were generally lowest around peak lactation, and higher at the beginning and end of lactation. Parametric curves yielded the highest likelihoods, but produced negative genetic associations between yield in early lactation and later lactation yields, while positive genetic correlations across the entire lactation were estimated with all models involving orthogonal polynomials. The fit of models using orthogonal polynomials to model test day yield was improved by including higher order fixed regressions.


Author(s):  
Jeremy Freese

This article presents a method and program for identifying poorly fitting observations for maximum-likelihood regression models for categorical dependent variables. After estimating a model, the program leastlikely will list the observations that have the lowest predicted probabilities of observing the value of the outcome category that was actually observed. For example, when run after estimating a binary logistic regression model, leastlikely will list the observations with a positive outcome that had the lowest predicted probabilities of a positive outcome and the observations with a negative outcome that had the lowest predicted probabilities of a negative outcome. These can be considered the observations in which the outcome is most surprising given the values of the independent variables and the parameter estimates and, like observations with large residuals in ordinary least squares regression, may warrant individual inspection. Use of the program is illustrated with examples using binary and ordered logistic regression.
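The idea behind leastlikely can be illustrated outside its original software. The Python sketch below is not the program described in the article; it is a hypothetical re-expression of the same logic for a binary outcome, taking predicted probabilities of a positive outcome as given. The function name, arguments and data are assumptions made for illustration.

```python
def least_likely(outcomes, p_positive, n=5):
    """For each outcome value (0 and 1), return indices of the n observations
    whose observed outcome had the lowest predicted probability.

    outcomes   : list of 0/1 observed values
    p_positive : list of predicted P(outcome = 1) from any fitted binary model
    """
    result = {}
    for value in (0, 1):
        scored = [
            # Probability the model assigned to what was actually observed.
            (p if y == 1 else 1.0 - p, i)
            for i, (y, p) in enumerate(zip(outcomes, p_positive))
            if y == value
        ]
        result[value] = [i for _, i in sorted(scored)[:n]]
    return result


# Hypothetical outcomes and fitted probabilities.
y = [1, 1, 0, 0, 1, 0]
p = [0.9, 0.2, 0.1, 0.7, 0.6, 0.4]
surprising = least_likely(y, p, n=1)
```

Here observation 1 is a positive outcome that the model considered unlikely, and observation 3 is a negative outcome that the model considered unlikely; these are the cases that may warrant individual inspection.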


2021 ◽  
Vol 13 (7) ◽  
pp. 168781402110277
Author(s):  
Yankai Hou ◽  
Zhaosheng Zhang ◽  
Peng Liu ◽  
Chunbao Song ◽  
Zhenpo Wang

Accurate estimation of the degree of battery aging is essential to ensure safe operation of electric vehicles. In this paper, using real-world vehicles and their operational data, a battery aging estimation method is proposed based on a dual-polarization equivalent circuit (DPEC) model and multiple data-driven models. The DPEC model and the forgetting factor recursive least-squares method are used to determine the battery system’s ohmic internal resistance, with outliers being filtered using boxplots. Furthermore, eight common data-driven models are used to describe the relationship between battery degradation and the factors influencing this degradation, and these models are analyzed and compared in terms of both estimation accuracy and computational requirements. The results show that the gradient descent tree regression, XGBoost regression, and LightGBM regression models are more accurate than the other methods, with root mean square errors of less than 6.9 mΩ. The AdaBoost and random forest regression models are regarded as alternative groups because of their relative instability. The linear regression, support vector machine regression, and k-nearest neighbor regression models are not recommended because of poor accuracy or excessively high computational requirements. This work can serve as a reference for subsequent battery degradation studies based on real-time operational data.
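The forgetting factor recursive least-squares update mentioned in this abstract can be sketched in a few lines of Python. The example is heavily simplified: it estimates a single resistance R from the relation voltage drop = R × current, whereas the paper identifies the parameters of a dual-polarization equivalent circuit. The function name, the forgetting factor of 0.98, the initial values and the data are all hypothetical.

```python
def ffrls_scalar(xs, ys, lam=0.98, theta0=0.0, p0=1000.0):
    """Recursive least squares with forgetting factor for the model y = theta * x.

    lam    : forgetting factor in (0, 1]; smaller values discount old data faster
    theta0 : initial parameter guess
    p0     : initial covariance (large = little trust in theta0)
    """
    theta, p = theta0, p0
    for x, y in zip(xs, ys):
        gain = p * x / (lam + x * p * x)   # how strongly to react to this sample
        theta += gain * (y - x * theta)    # correct by the prediction error
        p = (p - gain * x * p) / lam       # update covariance, inflating by 1/lam
    return theta


# Hypothetical noise-free data: currents in A and voltage drops for R = 0.05 ohm.
currents = [10.0, 12.0, 8.0, 15.0, 11.0, 9.0]
true_r = 0.05
drops = [true_r * i for i in currents]
r_hat = ffrls_scalar(currents, drops)
```

With real vehicle data the measurements are noisy and the resistance drifts, which is the situation the forgetting factor is meant for: older samples are progressively discounted so the estimate can track slow changes.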


Stats ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. 28-45
Author(s):  
Vasili B.V. Nagarjuna ◽  
R. Vishnu Vardhan ◽  
Christophe Chesneau

In this paper, a new five-parameter distribution is proposed using the functionalities of the Kumaraswamy generalized family of distributions and the features of the power Lomax distribution. It is named the Kumaraswamy generalized power Lomax distribution. In a first approach, we derive its main probability and reliability functions, with a visualization of its modeling behavior by considering different parameter combinations. As a prime quality, the corresponding hazard rate function is very flexible; it possesses decreasing, increasing and inverted (upside-down) bathtub shapes. Decreasing-increasing-decreasing shapes are also observed. Some important characteristics of the Kumaraswamy generalized power Lomax distribution are derived, including moments, entropy measures and order statistics. The second approach is statistical. The maximum likelihood estimates of the parameters are described and a brief simulation study shows their effectiveness. Two real data sets are taken to show how the proposed distribution can be applied concretely; parameter estimates are obtained and fitting comparisons are performed with other well-established Lomax-based distributions. The Kumaraswamy generalized power Lomax distribution turns out to be the best, capturing fine details in the structure of the data considered.
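The construction named in this abstract can be sketched by composing two standard pieces. The Python example below assumes the usual Kumaraswamy generalized form F(x) = 1 - (1 - G(x)^a)^b and the usual power Lomax baseline G(x) = 1 - (λ / (λ + x^β))^α for x > 0. The abstract itself states no formulas, so this parameterization, the function names and the parameter values are assumptions and may differ from the paper's notation.

```python
def power_lomax_cdf(x, alpha, beta, lam):
    """Baseline power Lomax CDF (assumed form): 1 - (lam / (lam + x**beta))**alpha."""
    if x <= 0:
        return 0.0
    return 1.0 - (lam / (lam + x**beta)) ** alpha


def kgpl_cdf(x, a, b, alpha, beta, lam):
    """Kumaraswamy generalized CDF applied to the power Lomax baseline.

    Five positive parameters: a, b (Kumaraswamy shape) and alpha, beta, lam
    (power Lomax). Setting a = b = 1 recovers the baseline.
    """
    g = power_lomax_cdf(x, alpha, beta, lam)
    return 1.0 - (1.0 - g**a) ** b


# Hypothetical parameter values, for illustration only.
params = dict(a=2.0, b=0.5, alpha=2.0, beta=1.5, lam=1.0)
values = [kgpl_cdf(x, **params) for x in (0.0, 0.5, 1.0, 2.0, 5.0)]
```

The two extra shape parameters a and b are what give the family its additional flexibility over the three-parameter baseline.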


Author(s):  
Moritz Berger ◽  
Gerhard Tutz

Abstract: A flexible semiparametric class of models is introduced that offers an alternative to classical regression models for count data, such as the Poisson and negative binomial models, as well as to more general models accounting for excess zeros that are also based on fixed distributional assumptions. The model allows the data themselves to determine the distribution of the response variable but, in its basic form, uses a parametric term that specifies the effect of explanatory variables. In addition, an extended version is considered, in which the effects of covariates are specified nonparametrically. The proposed model and traditional models are compared in simulations and in several real data applications from the areas of health and social science.

