Best Linear Unbiased Latent Values Predictors for Finite Population Linear Models with Different Error Sources

2021 ◽ Vol 39 (4) ◽ pp. 571-586
Author(s): German MORENO ◽ Julio M. SINGER ◽ Edward J. STANEK III

We develop best linear unbiased predictors (BLUP) of the latent values of labeled sample units selected from a finite population when the measurement errors arise from two distinct sources: endogenous, exogenous, or both. The usual target parameters are the population mean, the latent value associated with a labeled unit, or the latent value of the unit that will appear in a given position in the sample. We show how both types of measurement error affect the within-unit covariance matrices and indicate how the finite population BLUP may be obtained via the standard software packages employed to fit mixed models, whether the exogenous and endogenous measurement errors are heteroskedastic or homoskedastic.
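As a minimal illustration of the mixed-model machinery the abstract refers to (a textbook balanced one-way setup with known variance components and homoskedastic errors, not the paper's finite-population estimator), the BLUP of each unit's latent deviation shrinks the unit mean toward the overall mean:

```python
import numpy as np

# Sketch of classical mixed-model BLUPs; all parameter values are assumed
# for illustration. Standard mixed-model software automates this computation.
rng = np.random.default_rng(0)
n_units, m = 5, 4                               # labeled units, measurements each
X = np.ones((n_units * m, 1))                   # intercept-only fixed effects
Z = np.kron(np.eye(n_units), np.ones((m, 1)))   # unit-membership design

sigma_u2 = 2.0   # between-unit (latent value) variance
sigma_e2 = 1.0   # combined measurement-error variance
u = rng.normal(0, np.sqrt(sigma_u2), n_units)
y = X @ [10.0] + Z @ u + rng.normal(0, np.sqrt(sigma_e2), n_units * m)

G = sigma_u2 * np.eye(n_units)
R = sigma_e2 * np.eye(n_units * m)
V = Z @ G @ Z.T + R                             # marginal covariance of y

Vinv = np.linalg.inv(V)
beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)  # GLS/BLUE of the mean
u_blup = G @ Z.T @ Vinv @ (y - X @ beta)                # BLUP of latent deviations
latent_pred = beta.item() + u_blup                      # predicted latent values
```

In this balanced case the BLUP reduces to the familiar shrinkage form u_i = k(ȳ_i − ȳ) with k = σ_u²/(σ_u² + σ_e²/m), which is why mixed-model fitters recover it directly.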

2017 ◽ Vol 15 (1) ◽ pp. 126-150
Author(s): Yongge Tian

Abstract Matrix mathematics provides a powerful tool set for addressing statistical problems; in particular, the theory of matrix ranks and inertias has been developed into an effective methodology for simplifying complicated matrix expressions and for establishing equalities and inequalities that occur in statistical analysis. This paper describes how to establish exact formulas for the ranks and inertias of covariance matrices of predictors and estimators of parameter spaces in general linear models (GLMs), and how to use these formulas in the statistical analysis of GLMs. We first derive analytical expressions for the best linear unbiased predictors/best linear unbiased estimators (BLUPs/BLUEs) of all unknown parameters in the model by solving a constrained quadratic matrix-valued function optimization problem, and present some well-known results on ordinary least-squares predictors/ordinary least-squares estimators (OLSPs/OLSEs). We then establish some fundamental rank and inertia formulas for covariance matrices related to BLUPs/BLUEs and OLSPs/OLSEs, and use them to characterize a variety of equalities and inequalities for these covariance matrices. As applications, we use the equalities and inequalities to compare the covariance matrices of BLUPs/BLUEs with those of OLSPs/OLSEs. The formulations of BLUPs/BLUEs and OLSPs/OLSEs and of their covariance matrices under GLMs provide direct access, as a standard example, to a very simple algebraic treatment of predictors and estimators in linear regression analysis, which leads to deep insight into the linear nature of GLMs and gives an efficient way of summarizing the results.
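A small numerical sketch of the kind of covariance comparison the abstract describes (an assumed random GLM instance, not the paper's general rank/inertia formulas): the Gauss-Markov theorem forces Cov(OLSE) − Cov(BLUE) to be nonnegative definite, which shows up in the inertia (signs of the eigenvalues) of the difference.

```python
import numpy as np

# Assumed toy GLM: y = X beta + e with Cov(e) = Sigma (positive definite).
rng = np.random.default_rng(1)
n, p = 30, 3
X = rng.normal(size=(n, p))
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)          # well-conditioned error covariance

Sinv = np.linalg.inv(Sigma)
cov_blue = np.linalg.inv(X.T @ Sinv @ X)  # Cov of the BLUE (GLS estimator)
H = np.linalg.inv(X.T @ X) @ X.T
cov_olse = H @ Sigma @ H.T                # Cov of the OLSE

diff = cov_olse - cov_blue
eigs = np.linalg.eigvalsh(diff)
# inertia = (number of positive, number of negative) eigenvalues
inertia = (int((eigs > 1e-10).sum()), int((eigs < -1e-10).sum()))
```

The negative index of inertia of the difference is zero for every choice of X and Sigma; the rank of the difference is what the paper's formulas characterize exactly.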


2018 ◽ Vol 620 ◽ pp. A168
Author(s): G. Valle ◽ M. Dell’Omodarme ◽ P. G. Prada Moroni ◽ S. Degl’Innocenti

Aims. We aim to perform a theoretical investigation of the direct impact of measurement errors in the observational constraints on the recovered age of stars in the main sequence (MS) and red giant branch (RGB) phases. We assumed that a mix of classical (effective temperature Teff and metallicity [Fe/H]) and asteroseismic (Δν and νmax) constraints was available for the objects. Methods. Artificial stars were sampled from a reference isochrone and subjected to random Gaussian perturbations in their observational constraints to simulate observational errors. The ages of these synthetic objects were then recovered by means of a Markov chain Monte Carlo approach over a grid of pre-computed stellar models. To account for observational uncertainties, the grid covers different values of the initial helium abundance and of the mixing-length parameter, which act as nuisance parameters in the age estimation. Results. The differences between the recovered and true ages were modelled as a function of the errors in the observables, by means of both linear models and projection pursuit regression models. The first class of statistical models provides an easily generalizable result, whose robustness is checked with the second method. From the linear models we find that no single error source dominates in all the evolutionary phases. Assuming typical observational uncertainties, the most important error source in the reconstructed MS age is the effective temperature of the star: an offset of 75 K accounts for an underestimation of the stellar age of 0.4 to 0.6 Gyr between the initial and terminal MS. An error of 2.5% in νmax proved to be the second most important source of uncertainty, accounting for about −0.3 Gyr, while the 0.1 dex error in [Fe/H] proved particularly important only at the end of the MS, producing an age error of −0.4 Gyr.
For the RGB phase the dominant source of uncertainty is νmax, causing an underestimation of about 0.6 Gyr; the offsets in the effective temperature and in Δν caused an underestimation and an overestimation, respectively, of 0.3 Gyr. We find that the inference from the linear models is a good proxy for that from the projection pursuit regression models, so inference from linear models can be safely used thanks to its broader generalizability. Finally, we explored the impact on the age estimates of adding the luminosity to the observational constraints discussed above. For computational reasons we assumed a 2.5% error in luminosity, much lower than the average error in the Gaia DR2 catalogue. Even in this optimistic case, however, adding the luminosity does not increase the precision of the age estimates. Moreover, the luminosity proved to be a major contributor to the variability in the estimated ages, accounting for an error of about −0.3 Gyr in the explored evolutionary phases.
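A toy sketch of the statistical step only (the stellar-grid age recovery itself is not reproduced): Gaussian perturbations of the observables are regressed against the resulting age errors with an ordinary linear model, mimicking how the paper attributes the recovered-age bias to each error source. The sensitivity coefficients used to generate the synthetic age errors are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
d_teff  = rng.normal(0, 75, n)     # K
d_feh   = rng.normal(0, 0.1, n)    # dex
d_dnu   = rng.normal(0, 0.01, n)   # relative error in Delta nu
d_numax = rng.normal(0, 0.025, n)  # relative error in nu_max

# hypothetical linear response of the age error (Gyr) to each perturbation
age_err = (-0.006 * d_teff - 4.0 * d_feh + 20.0 * d_dnu - 12.0 * d_numax
           + rng.normal(0, 0.05, n))

# attribute the age error to the individual sources via least squares
Xmat = np.column_stack([d_teff, d_feh, d_dnu, d_numax])
coef, *_ = np.linalg.lstsq(Xmat, age_err, rcond=None)
```

With enough synthetic stars the fitted coefficients recover the assumed sensitivities, which is what makes the linear model an easily generalizable summary of the error budget.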


2014 ◽ Vol 14 (9) ◽ pp. 2321-2335
Author(s): N. M. Neykov ◽ P. N. Neytchev ◽ W. Zucchini

Abstract. Stochastic daily precipitation models are commonly used to generate scenarios of climate variability or change on a daily timescale. The standard models consist of two components describing the occurrence and intensity series, respectively. Binary logistic regression is used to fit the occurrence data, and the intensity series is modeled using a continuous-valued right-skewed distribution, such as gamma, Weibull or lognormal. The precipitation series is then modeled using the joint density, and standard software for generalized linear models can be used to perform the computations. A drawback of these precipitation models is that they do not produce a sufficiently heavy upper tail for the distribution of daily precipitation amounts; they tend to underestimate the frequency of large storms. In this study, we adapted the approach of Furrer and Katz (2008) based on hybrid distributions in order to correct for this shortcoming. In particular, we applied hybrid gamma–generalized Pareto (GP) and hybrid Weibull–GP distributions to develop a stochastic precipitation model for daily rainfall at Ihtiman in western Bulgaria. We report the results of simulations designed to compare the models based on the hybrid distributions and those based on the standard distributions. Some potential difficulties are outlined.
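A minimal sketch of a hybrid intensity distribution of the kind the abstract adapts (all parameter values assumed for illustration, not the fitted Ihtiman model): rainfall amounts follow a gamma body, and amounts exceeding a threshold are redrawn from a generalized Pareto (GP) tail to supply the heavier upper tail that the standard distributions lack.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_hybrid(n, shape=0.8, scale=6.0, u=25.0, xi=0.15, gp_scale=8.0):
    """Gamma(shape, scale) body with a GP(xi, gp_scale) tail above threshold u."""
    x = rng.gamma(shape, scale, n)
    tail = x > u
    # inverse-CDF draws from the GP distribution for the exceedances
    q = rng.uniform(size=tail.sum())
    x[tail] = u + gp_scale / xi * ((1.0 - q) ** (-xi) - 1.0)
    return x

amounts = sample_hybrid(100_000)   # simulated daily precipitation amounts (mm)
```

Only the exceedances over the threshold are affected, so the bulk of the simulated precipitation distribution is unchanged while large-storm frequencies increase.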


2014 ◽ Vol 70 (a1) ◽ pp. C1269-C1269
Author(s): Ethan Merritt

Tools for validating structural models of proteins are relatively mature and widely implemented. New protein crystallographers are introduced early on to the importance of monitoring conformance with expected φ/ψ values, favored rotamers, and local stereochemistry. The protein model is validated by the PDB at the time of deposition using criteria that are also available in the standard software packages used to refine the model being deposited. By contrast, crystallographers are typically much less familiar with procedures to validate key non-protein components of the model – cofactors, substrates, inhibitors, etc. It has been estimated that as many as a third of all ligands in the PDB exhibit preventable errors of some sort, ranging from minor deviations in expected bond angles to wholly implausible placement in the binding pocket. Following recommendations from the wwPDB Validation Task Force, the PDB recently began validating ligand geometry as an integral part of deposition processing. This means that many crystallographers will soon receive for the first time a "grade" on the quality of ligands in the structure they have just deposited. Some will be surprised, as I was following my first PDB deposition of 2014, at how easily bad ligand geometry can slip through the cracks in supposedly robust structure refinement protocols that their lab has used for many years. I will illustrate use of current tools for generating ligand restraints to guide model refinement. One is the jligand+coot+cprodrg pipeline integrated into the CCP4 suite. Another is the Grade web server provided as a community resource by Global Phasing Ltd. Furthermore I will show examples from recent in-house refinements of how things can still go wrong even if you do use these tools, and how we recovered. The new PDB deposition checks may expose errors in your ligand descriptions after the fact. This presentation may help you avoid introducing those errors in the first place.


2009 ◽ Vol 39 (1) ◽ pp. 61-80
Author(s): José Garrido ◽ Jun Zhou

Abstract Generalized linear models (GLMs) are gaining popularity as a statistical analysis method for insurance data. For segmented portfolios, as in car insurance, the question of credibility arises naturally: how many observations are needed in a risk class before the GLM estimators can be considered credible? In this paper we study the limited fluctuations credibility of the GLM estimators, as well as the extended case of generalized linear mixed models (GLMMs). We show how credibility depends on the sample size, the distribution of covariates and the link function. This provides a mechanism to obtain confidence intervals for the GLM and GLMM estimators.
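A back-of-envelope sketch of classical limited-fluctuations credibility, the notion the paper extends to GLM and GLMM estimators; the Poisson claim-count case below is just the textbook special case, with assumed coverage probability p and tolerance r.

```python
import numpy as np

z = 1.644854           # standard normal quantile for two-sided coverage p = 0.90
r = 0.05               # tolerated relative fluctuation around the mean

# full-credibility standard: smallest expected claim count lambda such that
# P(|N - lambda| <= r * lambda) >= p for N ~ Poisson(lambda)
lam = (z / r) ** 2     # about 1082 expected claims

# quick simulation check that the stated coverage is achieved
rng = np.random.default_rng(4)
counts = rng.poisson(lam, size=20_000)
coverage = (np.abs(counts - lam) <= r * lam).mean()
```

The same within-r-with-probability-p criterion, applied to GLM estimators instead of a raw claim count, is what yields the sample-size conditions and confidence intervals the abstract describes.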

