A subspace ensemble framework for classification with high dimensional missing data

2016 ◽  
Vol 28 (4) ◽  
pp. 1309-1324 ◽  
Author(s):  
Hang Gao ◽  
Songlei Jian ◽  
Yuxing Peng ◽  
Xinwang Liu
2020 ◽  
pp. 096228022094153
Author(s):  
Yongxin Bai ◽  
Maozai Tian ◽  
Man-Lai Tang ◽  
Wing-Yan Lee

In this paper, we consider variable selection for ultra-high dimensional quantile regression models with missing data and measurement errors in covariates. Specifically, we correct the bias in the loss function caused by measurement error by applying the orthogonal quantile regression approach, and we remove the bias caused by missing data using inverse probability weighting. A nonconvex Atan penalized estimation method is proposed for simultaneous variable selection and estimation. With a proper choice of the regularization parameter and under some relaxed conditions, we show that the proposed estimator enjoys the oracle properties. The choice of smoothing parameters is also discussed. The performance of the proposed variable selection procedure is assessed by Monte Carlo simulation studies. We further demonstrate the proposed procedure on a breast cancer data set.
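To make the ingredients named in the abstract concrete, the following is a minimal sketch of an inverse-probability-weighted quantile check loss combined with an Atan-type penalty. It is not the authors' estimator: the orthogonal-regression correction for measurement error is omitted, the observation probabilities are assumed to be known rather than estimated, and the penalty form λ(γ + 2/π)·arctan(|β|/γ) is an assumption taken from the general Atan-penalty literature, not from this abstract. All function and parameter names are hypothetical.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))


def atan_penalty(beta, lam, gamma):
    """Atan-type penalty (assumed form): lam * (gamma + 2/pi) * arctan(|beta| / gamma)."""
    beta = np.asarray(beta, dtype=float)
    return lam * (gamma + 2.0 / np.pi) * np.arctan(np.abs(beta) / gamma)


def ipw_penalized_objective(beta, X, y, observed, prob_observed, tau, lam, gamma):
    """IPW-weighted check loss plus Atan penalty.

    observed[i] is 1 if row i is fully observed, else 0.
    prob_observed[i] is the (assumed known) probability that row i is observed.
    Rows with observed == 0 contribute nothing, so they may contain NaN.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = np.asarray(observed).astype(bool)
    prob_observed = np.asarray(prob_observed, dtype=float)

    # Residuals are computed on complete rows only.
    residuals = y[mask] - X[mask] @ np.asarray(beta, dtype=float)
    # Each complete row is up-weighted by 1 / P(observed).
    weights = 1.0 / prob_observed[mask]
    # Average over the full sample size, not just the complete rows.
    loss = np.sum(weights * check_loss(residuals, tau)) / len(y)
    return loss + np.sum(atan_penalty(beta, lam, gamma))
```

In practice the observation probabilities would be estimated (for example with a logistic model on fully observed variables), and the objective would be passed to a numerical optimizer suited to nonconvex, non-smooth problems.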


2020 ◽  
Vol 13 (1) ◽  
pp. 148-151
Author(s):  
Kristóf Muhi ◽  
Zsolt Csaba Johanyák

In most cases, a dataset obtained through observation, measurement, etc. cannot be used directly to train a machine-learning-based system because of the unavoidable presence of missing data, inconsistencies, and a high dimensional feature space. Additionally, the individual features can have quite different data types and ranges. For this reason, a data preprocessing step is nearly always necessary before the data can be used. This paper gives a short review of the typical methods applicable to the preprocessing and dimensionality reduction of raw data.
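As an illustration of the kind of preprocessing chain described above, here is a minimal sketch covering missing values, differing feature ranges, and dimensionality reduction. The abstract does not name specific methods, so the choices here (column-mean imputation, standardization, and PCA via singular value decomposition) are assumptions, as is the hypothetical function name `preprocess`. The sketch handles numeric features only and assumes every column has at least one observed value.

```python
import numpy as np


def preprocess(X, n_components):
    """Impute, standardize, and project numeric data onto its top principal components."""
    X = np.asarray(X, dtype=float)

    # 1. Missing data: replace each NaN with the mean of its column.
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(np.isnan(X), col_means, X)

    # 2. Differing ranges: standardize each column to zero mean and unit variance.
    std = X_filled.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled to avoid division by zero
    Z = (X_filled - X_filled.mean(axis=0)) / std

    # 3. High dimensionality: PCA via SVD, keeping the first n_components directions.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[:n_components].T
```

For example, `preprocess(data, n_components=2)` maps a numeric table with gaps to a two-column representation. Categorical features and inconsistent records, which the abstract also mentions, would need separate handling such as encoding and validation rules.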

