Quantile Regression with Measurement Errors and Missing Data

In this paper, we consider variable selection for ultra-high-dimensional quantile regression models with missing data and measurement errors in covariates. Specifically, we correct the bias in the loss function caused by measurement error by applying the orthogonal quantile regression approach, and we remove the bias caused by missing data using inverse probability weighting. A nonconvex Atan penalized estimation method is proposed for simultaneous variable selection and estimation. With a proper choice of the regularization parameter and under some relaxed conditions, we show that the proposed estimator enjoys the oracle properties. The choice of smoothing parameters is also discussed. The performance of the proposed variable selection procedure is assessed by Monte Carlo simulation studies. We further demonstrate the proposed procedure with a breast cancer data set.
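
The two ingredients named in this abstract, inverse probability weighting and the Atan penalty, can be illustrated with a minimal sketch. The function names below are hypothetical, the observation probabilities are assumed to be known or already estimated, and the penalty uses the form lam * (gamma + 2/pi) * arctan(|b| / gamma) that is commonly given for Atan regularization; the abstract itself does not state the formula. The orthogonal measurement error correction is not included.

```python
import math

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def ipw_quantile_loss(y, x, beta, tau, observed, prob):
    """Average check loss over complete cases, each weighted by 1 / prob.

    observed[i] is True when case i is fully observed; prob[i] is the
    (assumed known) probability that case i is observed.
    """
    total = 0.0
    for yi, xi, di, pi in zip(y, x, observed, prob):
        if di:  # incomplete cases contribute nothing
            resid = yi - sum(b * v for b, v in zip(beta, xi))
            total += check_loss(resid, tau) / pi
    return total / len(y)

def atan_penalty(b, lam, gamma):
    """Assumed Atan penalty form: lam * (gamma + 2/pi) * arctan(|b| / gamma)."""
    return lam * (gamma + 2.0 / math.pi) * math.atan(abs(b) / gamma)

def penalized_objective(y, x, beta, tau, observed, prob, lam, gamma):
    """IPW check loss plus the Atan penalty summed over all coefficients."""
    loss = ipw_quantile_loss(y, x, beta, tau, observed, prob)
    return loss + sum(atan_penalty(b, lam, gamma) for b in beta)
```

Minimizing penalized_objective over beta would require a nonconvex optimizer, which this sketch leaves out.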

Smoothed and Corrected Score Approach to Censored Quantile Regression With Measurement Errors

Journal of the American Statistical Association ◽

10.1080/01621459.2014.989323 ◽

2015 ◽

Vol 110 (512) ◽

pp. 1670-1683 ◽

Cited By ~ 6

Author(s):

Yuanshan Wu ◽

Yanyuan Ma ◽

Guosheng Yin

Keyword(s):

Quantile Regression ◽

Measurement Errors ◽

Censored Quantile Regression ◽

Corrected Score

Predicting in multivariate incomplete time series. Application of the expectation-maximisation algorithm supplemented by the Newton-Raphson method

Przegląd Statystyczny ◽

10.5604/01.3001.0015.0376 ◽

2021 ◽

Vol 68 (1) ◽

pp. 17-46

Author(s):

Adam Korczyński

Keyword(s):

Time Series ◽

Missing Data ◽

Em Algorithm ◽

Measurement Errors ◽

Parameters Estimation ◽

Maximum Likelihood Estimates ◽

Original Algorithm ◽

Expectation Maximisation ◽

Newton Raphson ◽

Raphson Method

Statistical practice requires that various imperfections resulting from the nature of data be addressed. Data containing different types of measurement errors and irregularities, such as missing observations, have to be modelled. The study presented in the paper concerns the application of the expectation-maximisation (EM) algorithm to calculate maximum likelihood estimates, using an autoregressive model as an example. The model describes a process observed only through measurements with a certain level of precision and through more than one data series. The studied series are affected by measurement error and interrupted in some time periods, which makes the information available for parameter estimation, and later for prediction, less precise. The presented technique aims to compensate for missing data in time series. The missing data appear in the form of breaks in the source of the signal. The EM algorithm has been adjusted to a hybrid version, supplemented by the Newton-Raphson method. This technique allows the estimation of more complex models. The formulation of the substantive model of an autoregressive process affected by noise is outlined, as well as the adjustment introduced to overcome the issue of missing data. The extended version of the algorithm has been verified using data sampled from a model serving as an example of the examined process. The verification demonstrated that the joint EM and Newton-Raphson algorithm converged in a relatively small number of iterations and restored the information lost due to missing data, providing more accurate predictions than the original algorithm. The study also features an application of the supplemented algorithm to empirical data (the calculation of forecast demand for newspapers).

A predictive leverage statistic for quantile regression with measurement errors

Communications in Statistics - Simulation and Computation ◽

10.1080/03610918.2016.1204455 ◽

2016 ◽

Vol 46 (8) ◽

pp. 6385-6398

Author(s):

Edmore Ranganai ◽

Saralees Nadarajah

Keyword(s):

Quantile Regression ◽

Measurement Errors

Partially linear support vector orthogonal quantile regression with measurement errors

Journal of the Korean Data and Information Science Society ◽

10.7465/jkdi.2015.26.1.209 ◽

2015 ◽

Vol 26 (1) ◽

pp. 209-216 ◽

Cited By ~ 3

Author(s):

Changha Hwang

Keyword(s):

Quantile Regression ◽

Measurement Errors ◽

Support Vector ◽

Partially Linear

Partially Functional Linear Quantile Regression With Measurement Errors

Statistica Sinica ◽

10.5705/ss.202021.0246 ◽

2024 ◽

Author(s):

Mengli Zhang ◽

Lan Xue ◽

Carmen D. Tekwe ◽

Yang Bai ◽

Annie Qu

Keyword(s):

Quantile Regression ◽

Measurement Errors

Futuristic Prediction of Missing Value Imputation Methods Using Extended ANN

International Journal of Business Analytics ◽

10.4018/ijban.292055 ◽

2022 ◽

Vol 9 (3) ◽

pp. 0-0

Keyword(s):

Data Analysis ◽

Missing Data ◽

Measurement Errors ◽

Missing Values ◽

Missing Value ◽

Hybrid Schemes ◽

Imputation Methods ◽

Research Fields ◽

Data Missing ◽

The Given

Missing data is a universal complication in most research fields and introduces uncertainty into data analysis. It can occur for many reasons, such as mishandled samples, failure to collect an observation, measurement errors, deletion of aberrant values, or simply insufficient study. The field of nutrition is no exception to the problem of missing data. Most frequently, this problem is handled by computing means or medians from the existing datasets, an approach that needs improvement. The paper proposes a hybrid scheme of MICE and ANN, known as extended ANN, to search for and analyze missing values and perform imputation in a given dataset. The proposed mechanism analyzes the blank entries efficiently and fills them in by examining their neighboring records, in order to improve the accuracy of the dataset. To validate the proposed scheme, the extended ANN is compared against various recent algorithms and mechanisms in terms of the efficiency and accuracy of the results.
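
For context, the mean or median imputation that this abstract describes as the usual approach can be sketched as follows. This is only the baseline, not the proposed extended ANN, and the function name and the use of None to mark a missing entry are assumptions made for illustration.

```python
from statistics import mean, median

def impute_column(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError("strategy must be 'mean' or 'median'")
    # Observed entries are kept unchanged; only the gaps are filled.
    return [fill if v is None else v for v in values]
```

Every gap in a column receives the same fill value, so the imputed column has less spread than the complete data would have had. That is one reason model-based schemes such as MICE are often preferred.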

Hierarchical likelihood methods for nonlinear and generalized linear mixed models with missing data and measurement errors in covariates

Journal of Multivariate Analysis ◽

10.1016/j.jmva.2012.02.011 ◽

2012 ◽

Vol 109 ◽

pp. 42-51 ◽

Cited By ~ 5

Author(s):

Maengseok Noh ◽

Lang Wu ◽

Youngjo Lee

Keyword(s):

Missing Data ◽

Mixed Models ◽

Measurement Errors ◽

Generalized Linear Mixed Models ◽

Linear Mixed Models ◽

Likelihood Methods ◽

Hierarchical Likelihood

Comparison of partial least square algorithms in hierarchical latent variable model with missing data

SIMULATION ◽

10.1177/0037549720944467 ◽

2020 ◽

Vol 96 (10) ◽

pp. 825-839

Author(s):

Hao Cheng

Keyword(s):

Missing Data ◽

Quantile Regression ◽

Latent Variables ◽

Latent Variable ◽

Partial Least Square ◽

Least Square ◽

Latent Variable Model ◽

Data Handling ◽

Variable Model ◽

Handling Methods

Missing data is almost inevitable for various reasons in many applications. For hierarchical latent variable models, there usually exist two kinds of missing data problems: manifest variables with incomplete observations, and latent variables which cannot be observed directly. Missing data in manifest variables can be handled by different methods. For latent variables, there exist several kinds of partial least square (PLS) algorithms which have been widely used to estimate the values of latent variables. In this paper, we not only combine traditional linear regression type PLS algorithms with missing data handling methods, but also introduce quantile regression to improve the performance of PLS algorithms when the relationships among manifest and latent variables are not fixed but vary with the quantile of interest. Thus, we can get an overall view of the variables' relationships at different levels. The main challenges lie in how to introduce quantile regression into PLS algorithms correctly and how well the PLS algorithms perform when manifest variables are missing. Through simulation studies, we compare all the PLS algorithms with missing data handling methods in different settings, and finally build a business sophistication hierarchical latent variable model based on real data.

Quantile regression for incomplete longitudinal data with selection by death

Statistical Methods in Medical Research ◽

10.1177/0962280220909986 ◽

2020 ◽

Vol 29 (9) ◽

pp. 2697-2716

Author(s):

Hélène Jacqmin-Gadda ◽

Anaïs Rouanet ◽

Robert D Mba ◽

Viviane Philipps ◽

Jean-François Dartigues

Keyword(s):

Missing Data ◽

Longitudinal Data ◽

Quantile Regression ◽

Real Data ◽

Missing At Random ◽

The Elderly ◽

R Package ◽

Estimating Equation ◽

Cognitive Test ◽

Quantile Regressions

Quantile regressions are increasingly used to provide population norms for quantitative variables. Indeed, they do not require any Gaussian assumption for the response and allow its entire distribution to be characterized through different quantiles. Quantile regressions are especially useful for providing norms of cognitive scores in the elderly that may help general practitioners to identify subjects with an unexpectedly low cognitive level in routine examinations. These norms may be estimated from cohorts of elderly subjects using quantile regression for longitudinal data, but this requires properly accounting for selection by death, dropout and intermittent missing data. In this work, we extend the weighted estimating equation approach to estimate conditional quantiles in the population currently alive from mortal cohorts with dropout and intermittent missing data. Suitable weight estimation procedures are provided for both monotone and intermittent missing data and under two missing-at-random assumptions, according to whether the observation probability given that the subject is alive depends on the survival time (p-MAR assumption) or not (u-MAR assumption). Inference is performed through subject-level bootstrap. The method is validated in a simulation study and applied to the French cohort Paquid to estimate quantiles of a cognitive test in the elderly population currently alive. On the one hand, the simulations show that the u-MAR analysis is quite robust when the true missingness mechanism is p-MAR. This is a useful result because computation of suitable weights for intermittent missing data under the p-MAR assumption is intractable. On the other hand, the simulations highlight, along with the real data analysis, the usefulness of suitable weights for intermittent missing data. This method is implemented in the R package weightQuant.
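
The idea of weighting observed responses to recover a population quantile can be illustrated in the simplest case of an unconditional quantile. The sketch below is not the weightQuant implementation. It assumes each observed value comes with a weight, for example the inverse of its estimated observation probability, and returns the smallest value whose cumulative weight reaches the fraction tau of the total. That value is a minimizer of the weighted check loss.

```python
def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight is at least tau * total weight."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal in length")
    pairs = sorted(zip(values, weights))
    target = tau * sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return value
    return pairs[-1][0]  # reached only through floating-point rounding
```

With equal weights this reduces to an ordinary empirical quantile. Giving more weight to subjects who were less likely to be observed pulls the estimate toward the values they represent.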
