Missing Data Imputation using Machine Learning Algorithm for Supervised Learning

<p style="text-align: justify;">The article aims to develop a machine-learning algorithm that predicts whether students in the Industrial Engineering course at the Federal University of Amazonas will graduate, based on their performance data. The methodology uses a dataset of 364 students admitted between 2007 and 2019 and considers characteristics that can directly or indirectly affect each student's graduation: type of high school, number of semesters taken, grade-point average, lockouts, dropouts and course terminations. During data treatment, several characteristics that did not add value to the output of the algorithm were removed manually, resulting in a dataset of 2184 instances. Logistic regression, MLP and XGBoost models were then developed and compared; each predicts a binary output (graduation or non-graduation) for each student, using 70% of the dataset for training and 30% for testing. This made it possible to identify a relationship between the six attributes explored, and the best model achieved 94.15% accuracy on its predictions.</p>
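The 70/30 evaluation protocol described in the abstract can be sketched as follows. This is a minimal illustration only: the data is synthetic, the two features (`gpa`, `dropouts`) and every function name are hypothetical, and the plain-Python logistic regression stands in for the models the authors actually trained.

```python
import math
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, epochs=200, lr=0.1):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    w = [0.0] * len(rows[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in rows:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (graduates) or 0 (does not graduate) for one feature vector."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


def accuracy(w, b, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(w, b, x) == y for x, y in rows) / len(rows)


# Synthetic stand-in data (an assumption, not the study's dataset):
# two scaled features and a label derived from them.
rng = random.Random(42)
data = []
for _ in range(200):
    gpa = rng.uniform(0.0, 1.0)
    dropouts = rng.uniform(0.0, 1.0)
    data.append(([gpa, dropouts], 1 if gpa > dropouts else 0))

train, test = train_test_split(data, test_fraction=0.3, seed=1)
w, b = fit_logistic(train)
test_accuracy = accuracy(w, b, test)
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.3f}")
```

Accuracy is measured only on the held-out 30%, so it estimates performance on students the model has not seen.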

Missing data imputation using machine learning based methods to improve HCC survival prediction

2020 28th Signal Processing and Communications Applications Conference (SIU) ◽ 10.1109/siu49456.2020.9302222 ◽ 2020

Author(s): Mehmethan Yumus ◽ Merve Apaydin ◽ Ali Degirmenci ◽ Omer Karal

Keyword(s): Machine Learning ◽ Missing Data ◽ Survival Prediction ◽ Data Imputation ◽ Missing Data Imputation

EvoImputer: An evolutionary approach for Missing Data Imputation and feature selection in the context of supervised learning

Knowledge-Based Systems ◽ 10.1016/j.knosys.2021.107734 ◽ 2021 ◽ pp. 107734

Author(s): Shatha Awawdeh ◽ Hossam Faris ◽ Hazem Hiary

Keyword(s): Feature Selection ◽ Missing Data ◽ Supervised Learning ◽ Evolutionary Approach ◽ Data Imputation ◽ Missing Data Imputation

Adaptive Deep Incremental Learning — Assisted Missing Data Imputation for Streaming Data

Journal of Interconnection Networks ◽ 10.1142/s021926592143009x ◽ 2021

Author(s): C. V. S. R. Syavasya ◽ M. A. Lakshmi

Keyword(s): Missing Data ◽ Incremental Learning ◽ Missing Values ◽ Learning Algorithm ◽ Streaming Data ◽ Stochastic Gradient Descent ◽ Data Imputation ◽ Imputation Model ◽ Missing Data Imputation ◽ Hidden Neurons

With the rapid growth of application data streams, accurate data analysis is essential for effective real-time decision making. Data stream applications often encounter missing values, which degrade the performance of classification models. Several imputation models have adopted deep learning algorithms to estimate missing values; however, the lack of parameter and structure tuning in classification degrades imputation performance. This work presents a missing data imputation model that uses an adaptive deep incremental learning algorithm for streaming applications. The proposed approach comprises two main processes: enhancing the deep incremental learning algorithm, and applying the enhanced algorithm to imputation. First, the approach tunes the learning rate with both the Adaptive Moment Estimation (Adam) and Stochastic Gradient Descent (SGD) optimizers and tunes the number of hidden neurons. Second, it applies the enhanced deep incremental learning algorithm to estimate the imputed values in two steps: (i) an imputation process that predicts the missing values based on temporal proximity, and (ii) generation of a complete IoT dataset by imputing the missing values from the predicted values. The experimental results show that the proposed imputation model transforms the incomplete dataset into a complete dataset with minimal error.
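The idea of imputing from temporal proximity can be illustrated with a much simpler stand-in than the paper's deep incremental learning model: fill each missing reading with the observed value closest in time. The function name, the sample timestamps and the sensor values below are all hypothetical, and this nearest-in-time fill is an assumption made for illustration, not the authors' method.

```python
def impute_by_temporal_proximity(timestamps, values):
    """Fill None entries with the observed value closest in time.

    Ties go to whichever observation appears first in the input.
    """
    observed = [(t, v) for t, v in zip(timestamps, values) if v is not None]
    if not observed:
        raise ValueError("need at least one observed value")
    completed = []
    for t, v in zip(timestamps, values):
        if v is None:
            # Pick the observation with the smallest time gap to t.
            _, v = min(observed, key=lambda tv: abs(tv[0] - t))
        completed.append(v)
    return completed


# Hypothetical sensor stream: None marks a missing reading.
timestamps = [0, 1, 2, 3, 4, 10]
readings = [20.0, None, 21.0, None, None, 25.0]
completed = impute_by_temporal_proximity(timestamps, readings)
print(completed)  # [20.0, 20.0, 21.0, 21.0, 21.0, 25.0]
```

A learned model would replace the `min` lookup with a prediction, but the two steps match the abstract: predict each missing value from readings close in time, then write the predictions back to produce a complete dataset.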
