Quantile Regression with Measurement Errors and Missing Data

2017 ◽  
pp. 165-183
Author(s):  
Ying Wei
2020 ◽  
pp. 096228022094153
Author(s):  
Yongxin Bai ◽  
Maozai Tian ◽  
Man-Lai Tang ◽  
Wing-Yan Lee

In this paper, we consider variable selection for ultra-high-dimensional quantile regression models with missing data and measurement errors in covariates. Specifically, we correct the bias in the loss function caused by measurement error by applying the orthogonal quantile regression approach, and we remove the bias caused by missing data using inverse probability weighting. A nonconvex Atan penalized estimation method is proposed for simultaneous variable selection and estimation. With a proper choice of the regularization parameter and under some relaxed conditions, we show that the proposed estimator enjoys the oracle properties. The choice of smoothing parameters is also discussed. The performance of the proposed variable selection procedure is assessed by Monte Carlo simulation studies. We further demonstrate the proposed procedure with a breast cancer data set.
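As a rough illustration of two ingredients named in this abstract, the sketch below implements the quantile check loss, an inverse-probability-weighted sample quantile, and an Atan-type penalty. It is not the authors' estimator: the orthogonal correction for measurement error is omitted, the observation probabilities `prob` are assumed known rather than estimated, and the penalty form `lam * (gamma + 2/pi) * arctan(|beta| / gamma)` is the one commonly given for the Atan penalty, not one stated in the abstract. All function names are hypothetical.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))


def ipw_quantile(y, observed, prob, tau):
    """Minimise the inverse-probability-weighted check loss over the observed responses.

    y        : responses (entries for unobserved rows are ignored, may be NaN)
    observed : 1 if the row is observed, 0 otherwise
    prob     : assumed-known probability that each row is observed
    """
    mask = np.asarray(observed).astype(bool)
    y_obs = np.asarray(y, dtype=float)[mask]
    weights = 1.0 / np.asarray(prob, dtype=float)[mask]
    # The weighted check loss is piecewise linear, so a minimiser lies at an observed value.
    candidates = np.unique(y_obs)
    losses = [np.sum(weights * check_loss(y_obs - q, tau)) for q in candidates]
    return candidates[int(np.argmin(losses))]


def atan_penalty(beta, lam, gamma):
    """Atan-type penalty (assumed form): lam * (gamma + 2/pi) * arctan(|beta| / gamma)."""
    beta = np.asarray(beta, dtype=float)
    return lam * (gamma + 2.0 / np.pi) * np.arctan(np.abs(beta) / gamma)


# Toy usage: the largest observed value was observed with low probability,
# so it receives a large weight and pulls the weighted median upward.
y = np.array([1.0, 2.0, 3.0, 4.0, np.nan])
observed = np.array([1, 1, 1, 1, 0])
prob = np.array([1.0, 1.0, 1.0, 0.25, 0.25])
print(ipw_quantile(y, observed, prob, tau=0.5))
```

In a regression setting the same weights would multiply the check loss of the residuals, and the penalty would be added coordinate-wise to the coefficient vector.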


2021 ◽  
Vol 68 (1) ◽  
pp. 17-46
Author(s):  
Adam Korczyński

Statistical practice requires various imperfections resulting from the nature of data to be addressed. Data containing different types of measurement errors and irregularities, such as missing observations, have to be modelled. The study presented in the paper concerns the application of the expectation-maximisation (EM) algorithm to calculate maximum likelihood estimates, using an autoregressive model as an example. The model describes a process that is observed only through measurements with a certain level of precision and through more than one data series. The studied series are affected by measurement error and interrupted in some time periods, which makes the information available for parameter estimation, and later for prediction, less precise. The presented technique aims to compensate for missing data in time series, where the missing data appear as breaks in the source of the signal. The EM algorithm has been adjusted to a hybrid version, supplemented by the Newton-Raphson method, which allows the estimation of more complex models. The formulation of the substantive model of an autoregressive process affected by noise is outlined, as well as the adjustment introduced to overcome the issue of missing data. The extended version of the algorithm has been verified using data sampled from a model serving as an example of the examined process. The verification demonstrated that the joint EM and Newton-Raphson algorithm converged within a relatively small number of iterations and restored the information lost due to missing data, providing more accurate predictions than the original algorithm. The study also features an application of the supplemented algorithm to empirical data (the calculation of forecasted demand for newspapers).
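An autoregressive process observed with noise and with gaps can be written as a state-space model, and an EM algorithm for such a model typically builds its E-step on Kalman filtering. The sketch below shows only that filtering step for a single AR(1) series, skipping the measurement update whenever an observation is missing. It is an illustration under assumed notation (`phi`, `q`, `r`, `m0`, `p0` are hypothetical names, and the parameters are taken as known); it does not reproduce the paper's hybrid EM and Newton-Raphson procedure or its multi-series model.

```python
import numpy as np


def kalman_filter_ar1(y, phi, q, r, m0=0.0, p0=1.0):
    """Filter the model x_t = phi * x_{t-1} + w_t, y_t = x_t + v_t.

    w_t ~ N(0, q) is process noise and v_t ~ N(0, r) is measurement error.
    NaN entries in y mark missing observations; for those time points the
    prediction is carried forward without a measurement update.
    """
    y = np.asarray(y, dtype=float)
    means = np.empty(len(y))
    variances = np.empty(len(y))
    m, p = m0, p0
    for t, obs in enumerate(y):
        # Prediction step from the AR(1) dynamics.
        m_pred = phi * m
        p_pred = phi ** 2 * p + q
        if np.isnan(obs):
            # Missing observation: keep the prediction, uncertainty grows.
            m, p = m_pred, p_pred
        else:
            # Measurement update with Kalman gain k.
            k = p_pred / (p_pred + r)
            m = m_pred + k * (obs - m_pred)
            p = (1.0 - k) * p_pred
        means[t] = m
        variances[t] = p
    return means, variances


# Toy usage: one observation followed by a two-step break in the signal.
means, variances = kalman_filter_ar1([1.0, np.nan, np.nan], phi=0.5, q=1.0, r=1.0)
print(means, variances)
```

During the break the filtered mean decays towards zero at rate `phi` and the variance increases, which is the loss of precision the abstract describes.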


2024 ◽  
Author(s):  
Mengli Zhang ◽  
Lan Xue ◽  
Carmen D. Tekwe ◽  
Yang Bai ◽  
Annie Qu

2022 ◽  
Vol 9 (3) ◽  
pp. 0-0

Missing data is a near-universal complication in most research fields and introduces uncertainty into data analysis. It can arise for many reasons, such as mishandled samples, observations that could not be collected, measurement errors, deleted aberrant values, or simply shortcomings of the study. The field of nutrition is no exception to the missing-data problem. Most often the problem is handled by substituting means or medians computed from the existing data, an approach that needs improvement. The paper proposes a hybrid scheme combining multiple imputation by chained equations (MICE) and an artificial neural network (ANN), called extended ANN, to locate and analyze missing values and to impute them in a given dataset. The proposed mechanism analyzes the blank entries and fills them by examining their neighboring records, in order to improve the accuracy of the dataset. To validate the scheme, the extended ANN is compared against various recent algorithms in terms of both efficiency and accuracy of the results.
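The chained-equations idea behind MICE can be sketched in a few lines: start from mean-filled values, then repeatedly re-predict each incomplete column from the others. The version below deliberately uses ordinary least squares as the per-column model in place of the paper's ANN, and it produces a single deterministic imputation rather than the multiple randomised imputations of full MICE. The function name and parameters are hypothetical.

```python
import numpy as np


def chained_imputation(X, n_iter=10):
    """Single-imputation sketch of chained equations with linear models.

    X is a 2-D array with NaN marking missing entries. Each incomplete
    column is regressed on all other columns (plus an intercept) using the
    rows where it was observed, and its missing entries are replaced by
    the fitted predictions. The cycle is repeated n_iter times.
    """
    X = np.array(X, dtype=float)
    missing = np.isnan(X)
    # Initial fill: column means of the observed entries.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not missing[:, j].any():
                continue
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(filled)), others])
            obs = ~missing[:, j]
            coef, *_ = np.linalg.lstsq(design[obs], filled[obs, j], rcond=None)
            filled[missing[:, j], j] = design[missing[:, j]] @ coef
    return filled


# Toy usage: the second column is exactly twice the first, with one gap.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, np.nan], [5.0, 10.0]]
print(chained_imputation(data))
```

Swapping the least-squares fit for a small neural network per column would move this closer to the hybrid scheme described above, at the cost of extra dependencies and tuning.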


SIMULATION ◽  
2020 ◽  
Vol 96 (10) ◽  
pp. 825-839
Author(s):  
Hao Cheng

Missing data is almost inevitable for various reasons in many applications. For hierarchical latent variable models, there are usually two kinds of missing data problems: manifest variables with incomplete observations, and latent variables that cannot be observed directly. Missing data in manifest variables can be handled by different methods. For latent variables, several kinds of partial least squares (PLS) algorithms have been widely used to estimate the values of the latent variables. In this paper, we not only combine traditional linear regression type PLS algorithms with missing data handling methods, but also introduce quantile regression to improve the performance of PLS algorithms when the relationships among manifest and latent variables are not fixed but vary with the quantile of interest. Thus, we can obtain an overall view of the variables' relationships at different levels. The main challenges lie in how to introduce quantile regression into PLS algorithms correctly and how well the PLS algorithms perform when manifest variables have missing values. Through simulation studies, we compare all the PLS algorithms with missing data handling methods in different settings, and finally build a business sophistication hierarchical latent variable model based on real data.


2020 ◽  
Vol 29 (9) ◽  
pp. 2697-2716
Author(s):  
Hélène Jacqmin-Gadda ◽  
Anaïs Rouanet ◽  
Robert D Mba ◽  
Viviane Philipps ◽  
Jean-François Dartigues

Quantile regressions are increasingly used to provide population norms for quantitative variables. Indeed, they do not require any Gaussian assumption for the response and allow its entire distribution to be characterized through different quantiles. Quantile regressions are especially useful for providing norms of cognitive scores in the elderly, which may help general practitioners identify subjects with an unexpectedly low cognitive level in routine examinations. These norms may be estimated from cohorts of elderly subjects using quantile regression for longitudinal data, but this requires properly accounting for selection by death, dropout and intermittent missing data. In this work, we extend the weighted estimating equation approach to estimate conditional quantiles in the population currently alive from mortal cohorts with dropout and intermittent missing data. Suitable weight estimation procedures are provided for both monotone and intermittent missing data and under two missing-at-random assumptions, according to whether the observation probability given that the subject is alive depends on the survival time (p-MAR assumption) or not (u-MAR assumption). Inference is performed through subject-level bootstrap. The method is validated in a simulation study and applied to the French cohort Paquid to estimate quantiles of a cognitive test in the elderly population currently alive. On the one hand, the simulations show that the u-MAR analysis is quite robust when the true missingness mechanism is p-MAR. This is a useful result because computation of suitable weights for intermittent missing data under the p-MAR assumption is intractable. On the other hand, the simulations highlight, along with the real data analysis, the usefulness of suitable weights for intermittent missing data. This method is implemented in the R package weightQuant.
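The subject-level bootstrap mentioned in this abstract resamples whole subjects, so that all repeated measurements of a drawn subject enter the bootstrap sample together. The sketch below shows that resampling scheme with an arbitrary statistic. It is an illustration only: the function and argument names are hypothetical, the statistic used in the example is a plain median rather than a weighted quantile regression fit, and it does not reproduce the weightQuant package.

```python
import numpy as np


def subject_bootstrap(subject_ids, values, statistic, n_boot=200, seed=0):
    """Bootstrap a statistic by resampling subjects (clusters) with replacement.

    subject_ids : subject identifier for each row (repeated for longitudinal data)
    values      : measurement for each row
    statistic   : function mapping a 1-D array of values to a number
    """
    rng = np.random.default_rng(seed)
    subject_ids = np.asarray(subject_ids)
    values = np.asarray(values, dtype=float)
    subjects = np.unique(subject_ids)
    # Pre-compute which rows belong to each subject.
    rows_by_subject = {s: np.flatnonzero(subject_ids == s) for s in subjects}
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        drawn = rng.choice(subjects, size=len(subjects), replace=True)
        # A subject drawn twice contributes all of its rows twice.
        rows = np.concatenate([rows_by_subject[s] for s in drawn])
        estimates[b] = statistic(values[rows])
    return estimates


# Toy usage: three subjects with unequal numbers of visits.
ids = [1, 1, 1, 2, 2, 3]
scores = [28.0, 27.0, 25.0, 30.0, 29.0, 22.0]
boot = subject_bootstrap(ids, scores, np.median, n_boot=500)
print(np.percentile(boot, [2.5, 97.5]))
```

Resampling rows instead of subjects would ignore the within-subject correlation of repeated measurements, which is why the resampling unit here is the subject.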

