Using empirical covariance matrix in enhancing prediction accuracy of linear models with Missing Information

Author(s):  
Ahmadreza Moradipari ◽  
Sina Shahsavari ◽  
Ashkan Esmaeili ◽  
Farokh Marvasti
2013 ◽  
Vol 55 (3) ◽  
pp. 643-652
Author(s):  
Gauss M. Cordeiro ◽  
Denise A. Botter ◽  
Alexsandro B. Cavalcanti ◽  
Lúcia P. Barroso

2012 ◽  
Vol 01 (02) ◽  
pp. 1150006 ◽  
Author(s):  
WALID HACHEM ◽  
PHILIPPE LOUBATON ◽  
XAVIER MESTRE ◽  
JAMAL NAJIM ◽  
PASCAL VALLET

In array processing, a common problem is to estimate the angles of arrival of K deterministic sources impinging on an array of M antennas, from N observations of the source signal, corrupted by Gaussian noise. In the so-called subspace methods, the problem reduces to estimating a quadratic form (called the "localization function") of a certain projection matrix related to the source signal empirical covariance matrix. The estimates of the angles of arrival are then obtained by taking the K deepest local minima of the estimated localization function. Recently, a new subspace estimation method has been proposed, in the context where the number of available samples N is of the same order of magnitude as the number of sensors M. In this context, the traditional subspace methods tend to fail because they are based on the empirical covariance matrix of the observations, which is a poor estimate of the source signal covariance matrix. The new subspace method is based on a consistent estimator of the localization function in the regime where M and N tend to +∞ at the same rate. However, the consistency of the angle estimator was not addressed, and the purpose of this paper is to prove this consistency in that asymptotic regime. To this end, we prove that the event in which the singular values of an M × N Gaussian information-plus-noise matrix escape from certain intervals has probability decreasing at rate [Formula: see text] for all p. A regularization trick is also introduced, which makes it possible to confine these singular values to certain intervals and to use standard tools such as the Poincaré inequality to characterize any moments of the estimator. These results are believed to be of independent interest.
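The traditional subspace approach described in this abstract can be sketched as follows. This is a minimal illustration of the classical estimate built from the empirical covariance matrix, not the consistent estimator the paper analyses. The uniform linear array with half-wavelength spacing, the function names, and the angle grid are all assumptions introduced here.

```python
import numpy as np

def steering(theta, M):
    # Assumed geometry: uniform linear array, half-wavelength spacing.
    # Returns a unit-norm steering vector for the angle theta (radians).
    return np.exp(1j * np.pi * np.arange(M) * np.sin(theta)) / np.sqrt(M)

def localization_function(Y, K, grid):
    # Y is the M x N matrix of observations, K the number of sources.
    M, N = Y.shape
    R_hat = Y @ Y.conj().T / N              # empirical covariance matrix
    _, eigvecs = np.linalg.eigh(R_hat)      # eigenvalues in ascending order
    E_noise = eigvecs[:, : M - K]           # eigenvectors of the M-K smallest eigenvalues
    A = np.stack([steering(t, M) for t in grid], axis=1)
    # Quadratic form a(theta)^H Pi a(theta), with Pi the projector onto
    # the estimated noise subspace, evaluated at every grid angle.
    return np.sum(np.abs(E_noise.conj().T @ A) ** 2, axis=0)

def deepest_minima(values, K):
    # Indices of the K deepest strict local minima, sorted by position.
    interior = np.arange(1, len(values) - 1)
    is_min = (values[interior] < values[interior - 1]) & (
        values[interior] < values[interior + 1]
    )
    candidates = interior[is_min]
    best = candidates[np.argsort(values[candidates])[:K]]
    return np.sort(best)
```

Given an observation matrix `Y` and a grid of candidate angles, `grid[deepest_minima(localization_function(Y, K, grid), K)]` gives the angle estimates. When N is not much larger than M, this is the estimate that the abstract says tends to fail.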


Author(s):  
Juan Gómez-Sanchis ◽  
Emilio Soria-Olivas ◽  
Marcelino Martinez-Sober ◽  
Jose Blasco ◽  
Juan Guerrero ◽  
...  

This work presents a new approach for one of the main problems in the analysis of atmospheric phenomena, the prediction of atmospheric concentrations of different elements. The proposed methodology is more efficient than other classical approaches and is used in this work to predict tropospheric ozone concentration. The relevance of this problem stems from the fact that excessive ozone concentrations may cause several problems related to public health. Previous research by the authors of this work has shown that the classical approach to this problem (linear models) does not achieve satisfactory results in tropospheric ozone concentration prediction. The authors’ approach is based on Machine Learning (ML) techniques, which include algorithms related to neural networks, fuzzy systems and advanced statistical techniques for data processing. In this work, the authors focus on one of the main ML techniques, namely, neural networks. These models demonstrate their suitability for this problem both in terms of prediction accuracy and information extraction.


2019 ◽  
Vol 7 (1) ◽  
pp. 78-91
Author(s):  
Stephen Haslett

When sample survey data with complex design (stratification, clustering, unequal selection or inclusion probabilities, and weighting) are used for linear models, estimation of model parameters and their covariance matrices becomes complicated. Standard fitting techniques for sample surveys either model conditional on survey design variables, or use only design weights based on inclusion probabilities, essentially assuming zero error covariance between all pairs of population elements. Design properties that link two units are not used. However, if the population error structure is correlated, an unbiased estimate of the linear model error covariance matrix for the sample is needed for efficient parameter estimation. By making simultaneous use of sampling structure and design-unbiased estimates of the population error covariance matrix, the paper develops best linear unbiased estimation (BLUE) type extensions to standard design-based and joint design- and model-based estimation methods for linear models. The analysis covers both with-replacement and without-replacement sample designs. It recognises that estimation for with-replacement designs requires generalized inverses when any unit is selected more than once. This and the use of Hadamard products to link sampling and population error covariance matrix properties are central topics of the paper. Model-based linear model parameter estimation is also discussed.
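To illustrate why a generalized inverse is needed when a unit is selected more than once, here is a minimal sketch of textbook generalized least squares using the Moore-Penrose pseudo-inverse. It is not the paper's design-based estimator and does not include the Hadamard-product step. The function name `gls_pinv` and the tolerance are assumptions introduced here.

```python
import numpy as np

def gls_pinv(X, y, V, rcond=1e-10):
    # Textbook GLS: beta = (X' V^+ X)^+ X' V^+ y.
    # V is the error covariance matrix of the sampled rows. A unit drawn
    # twice yields two identical rows and columns in V, so V is singular
    # and the ordinary inverse is replaced by a pseudo-inverse.
    V_pinv = np.linalg.pinv(V, rcond=rcond)
    XtVi = X.T @ V_pinv
    cov_beta = np.linalg.pinv(XtVi @ X, rcond=rcond)  # covariance of beta
    beta = cov_beta @ (XtVi @ y)
    return beta, cov_beta
```

In this sketch, a sample containing a repeated unit gives the same coefficients as ordinary GLS applied to the distinct units, because the repeated row carries no additional information about the error.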


1986 ◽  
Vol 23 (A) ◽  
pp. 355-368 ◽  
Author(s):  
T. P. Speed

The standard ANOVA models with random effects for multi-indexed arrays of random variables with an arbitrary nesting structure on the indices are considered from the viewpoint of symmetry. It is found that the covariance matrix of such an array has sufficient symmetry to permit viewing the usual components of variance as a generalised spectrum and the linear models of random effects as a generalised spectral decomposition.


2013 ◽  
Vol 2013 ◽  
pp. 1-22
Author(s):  
C. Z. W. Hassell Sweatman ◽  
G. C. Wake ◽  
A. B. Pleasants ◽  
C. A. McLean ◽  
A. M. Sheppard

The statistical application considered here arose in epigenomics, linking the DNA methylation proportions measured at specific genomic sites to characteristics such as phenotype or birth order. It was found that the distribution of errors in the proportions of chemical modification (methylation) on DNA, measured at CpG sites, may be successfully modelled by a Laplace distribution which is perturbed by a Hermite polynomial. We use a linear model with such a response function. Hence, the response function is known, or assumed well estimated, but fails to be differentiable in the classical sense due to the modulus function. Our problem was to estimate coefficients for the linear model and the corresponding covariance matrix and to compare models with varying numbers of coefficients. The linear model coefficients may be found using the (derivative-free) simplex method, as in quantile regression. However, this theory does not yield a simple expression for the covariance matrix of the coefficients of the linear model. Assuming response functions which are 𝒞² except where the modulus function attains zero, we derive simple formulae for the covariance matrix and a log-likelihood ratio statistic, using generalized calculus. These original formulae enable a generalized analysis of variance and further model comparisons.

