A subspace ensemble framework for classification with high dimensional missing data

In this paper, we consider variable selection for ultra-high dimensional quantile regression models with missing data and measurement errors in the covariates. Specifically, we correct the bias in the loss function caused by measurement error by applying the orthogonal quantile regression approach, and we remove the bias caused by missing data using inverse probability weighting. A nonconvex Atan penalized estimation method is proposed for simultaneous variable selection and estimation. With a proper choice of the regularization parameter and under some relaxed conditions, we show that the proposed estimator enjoys the oracle properties. The choice of smoothing parameters is also discussed. The performance of the proposed variable selection procedure is assessed by Monte Carlo simulation studies. We further demonstrate the proposed procedure on a breast cancer data set.
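
The two ingredients named in the abstract, inverse probability weighting and the Atan penalty, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the argument layout, and the assumption that the observation probabilities `obs_prob` are already known are all hypothetical, the measurement-error correction (orthogonal quantile regression) is omitted, and the penalty uses the commonly cited Atan form λ(γ + 2/π)·arctan(|β|/γ).

```python
import numpy as np

def check_loss(residual, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    residual = np.asarray(residual, dtype=float)
    return residual * (tau - (residual < 0))

def atan_penalty(beta, lam, gamma):
    """Atan penalty, applied elementwise (assumed form, see lead-in)."""
    beta = np.asarray(beta, dtype=float)
    return lam * (gamma + 2.0 / np.pi) * np.arctan(np.abs(beta) / gamma)

def ipw_penalized_objective(beta, X, y, observed, obs_prob, tau, lam, gamma):
    """IPW-weighted check loss plus Atan penalty (illustrative only).

    observed : 0/1 indicator that a row is fully observed
    obs_prob : assumed-known probability that each row is observed
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    obs_prob = np.asarray(obs_prob, dtype=float)
    keep = np.asarray(observed).astype(bool)
    # Only complete rows contribute; each is up-weighted by 1 / obs_prob.
    residual = y[keep] - X[keep] @ beta
    loss = np.sum(check_loss(residual, tau) / obs_prob[keep]) / len(y)
    return loss + np.sum(atan_penalty(beta, lam, gamma))
```

In practice the observation probabilities would themselves have to be estimated, and minimizing this nonconvex objective needs a dedicated algorithm; the sketch only evaluates the objective.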

High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity

The Annals of Statistics ◽

10.1214/12-aos1018 ◽

2012 ◽

Vol 40 (3) ◽

pp. 1637-1664 ◽

Cited By ~ 72

Author(s):

Po-Ling Loh ◽

Martin J. Wainwright

Keyword(s):

Missing Data ◽

High Dimensional ◽

High Dimensional Regression

Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach

Lecture Notes in Computer Science - Advances in Swarm Intelligence ◽

10.1007/978-3-319-41000-5_26 ◽

2016 ◽

pp. 259-270 ◽

Cited By ~ 13

Author(s):

Collins Leke ◽

Tshilidzi Marwala

Keyword(s):

Neural Network ◽

Missing Data ◽

Swarm Intelligence ◽

Deep Neural Network ◽

High Dimensional ◽

Network Approach ◽

Neural Network Approach ◽

Missing Data Estimation ◽

High Dimensional Datasets ◽

Data Estimation

Total Variation Regularized Weighted Tensor Ring Decomposition for Missing Data Recovery in High-Dimensional Optical Remote Sensing Images

IEEE Geoscience and Remote Sensing Letters ◽

10.1109/lgrs.2021.3069895 ◽

2021 ◽

pp. 1-5

Author(s):

Minghua Wang ◽

Qiang Wang ◽

Jocelyn Chanussot ◽

Danfeng Hong

Keyword(s):

Remote Sensing ◽

Missing Data ◽

Total Variation ◽

Data Recovery ◽

High Dimensional ◽

Optical Remote Sensing ◽

Remote Sensing Images

Dimensionality Reduction Methods Used in Machine Learning

Műszaki Tudományos Közlemények ◽

10.33894/mtk-2020.13.27 ◽

2020 ◽

Vol 13 (1) ◽

pp. 148-151

Author(s):

Kristóf Muhi ◽

Zsolt Csaba Johanyák

Keyword(s):

Machine Learning ◽

Missing Data ◽

Dimensionality Reduction ◽

Feature Space ◽

Data Preprocessing ◽

Short Review ◽

High Dimensional ◽

Data Types ◽

Reduction Methods ◽

The Individual

Abstract: In most cases, a dataset obtained through observation, measurement, etc. cannot be used directly to train a machine-learning-based system because of the unavoidable presence of missing data, inconsistencies, and a high-dimensional feature space. Additionally, the individual features can have quite different data types and ranges. For this reason, a data preprocessing step is nearly always necessary before the data can be used. This paper gives a short review of the typical methods applicable to the preprocessing and dimensionality reduction of raw data.
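
As a rough illustration of the kind of pipeline this abstract refers to, the Python sketch below chains three common steps: mean imputation of missing values, standardization of feature ranges, and principal component analysis via the singular value decomposition. The choice of these particular methods and the function name `preprocess_and_reduce` are assumptions made for illustration; the abstract does not say which methods the paper reviews.

```python
import numpy as np

def preprocess_and_reduce(X, n_components):
    """Impute, standardize, and project onto the top principal components."""
    X = np.asarray(X, dtype=float)

    # 1. Mean imputation: replace each NaN with its column mean.
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(np.isnan(X), col_means, X)

    # 2. Standardization: zero mean and unit variance per feature.
    std = X_filled.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    Z = (X_filled - X_filled.mean(axis=0)) / std

    # 3. PCA via SVD: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T
```

For example, `preprocess_and_reduce(X, 2)` maps an `n x p` array with some missing entries to an `n x 2` array of component scores. Mean imputation is the simplest option and can distort variances when many values are missing.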