Imbalanced-type Incomplete Data Fuzzy Modeling and Missing Value Imputations

One way to deal with missing values or incomplete data is to impute the data using the EM algorithm. The need to process large data sets quickly motivates applying parallel computing to the serial EM program. In the parallel program architecture used in this study, the controller interacts only with the EM module, while the EM module itself makes intensive use of the matrix and vector modules. Parallelization is done with OpenMP in the EM module, which gives the parallel program a shorter compute time than the serial program. Compared with 2 (two) threads, parallel computing with 4 (four) threads increases speedup and reduces compute time but also reduces efficiency.
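That speedup can rise while efficiency falls as threads are added follows from the usual definitions of the two quantities. The sketch below assumes the standard definitions (speedup is serial time divided by parallel time, efficiency is speedup divided by the number of threads). The function names and the timings are hypothetical, invented for illustration, and are not results from the study.

```python
def speedup(serial_time: float, parallel_time: float) -> float:
    """Speedup S = T_serial / T_parallel."""
    if serial_time <= 0 or parallel_time <= 0:
        raise ValueError("times must be positive")
    return serial_time / parallel_time


def efficiency(serial_time: float, parallel_time: float, threads: int) -> float:
    """Efficiency E = S / p, i.e. the speedup obtained per thread."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return speedup(serial_time, parallel_time) / threads


# Hypothetical timings in seconds (assumed values, not measured data).
serial = 12.0
parallel_by_threads = {2: 7.5, 4: 5.0}

for threads, parallel in parallel_by_threads.items():
    s = speedup(serial, parallel)
    e = efficiency(serial, parallel, threads)
    print(f"threads={threads} speedup={s:.2f} efficiency={e:.2f}")
```

With these assumed timings, four threads finish sooner than two, yet each thread contributes less, so efficiency drops even though speedup grows.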

Fuzzy Modeling for Imprecise and Incomplete Data

Statistical Analysis of Clinical Data on a Pocket Calculator, Part 2 - SpringerBriefs in Statistics; doi:10.1007/978-94-007-4704-3_10; 2012; pp. 35-40

Author(s): Ton J. Cleophas, Aeilko H. Zwinderman

Keyword(s): Incomplete Data, Fuzzy Modeling

Comparison of Algorithms for Clustering Incomplete Data

Foundations of Computing and Decision Sciences; doi:10.2478/fcds-2014-0007; 2014; Vol 39 (2); pp. 107-127; Cited By ~ 6

Author(s): Artur Matyja, Krzysztof Siminski

Keyword(s): Data Analysis, Incomplete Data, Missing Values, Real Data, Complete Data, The Other, Data Sets, Missing Value, Comparison Of Algorithms, New Algorithms

Abstract: Missing values are not uncommon in real data sets. Algorithms and methods designed for the analysis of complete data sets cannot always be applied to data with missing values. To use existing methods for complete data, data sets with missing values are preprocessed. The other solution is to create new algorithms dedicated to data sets with missing values. The objective of our research is to compare preprocessing techniques with specialised algorithms and to find the settings in which each is most advantageous.
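The two families contrasted in this abstract can be illustrated on a distance computation, the core step of many clustering methods. The sketch below is an assumption-laden illustration: mean imputation stands in for the preprocessing route and the partial distance strategy stands in for a specialised method. Neither is confirmed here as one of the techniques the paper evaluates, and all function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


def mean_impute(rows: Sequence[Row]) -> List[List[float]]:
    """Preprocessing route: replace each None with its column's observed mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Ordinary Euclidean distance between two complete rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def partial_distance(a: Row, b: Row) -> float:
    """Specialised route: use only jointly observed attributes and rescale
    the squared sum to the full number of attributes."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("rows share no observed attribute")
    scale = len(a) / len(pairs)
    return math.sqrt(scale * sum((x - y) ** 2 for x, y in pairs))


# Toy data (assumed values); None marks a missing entry.
data = [
    [1.0, 2.0, None],
    [2.0, None, 6.0],
    [3.0, 4.0, 9.0],
]

filled = mean_impute(data)
print("after imputation:", euclidean(filled[0], filled[2]))
print("partial distance:", partial_distance(data[0], data[2]))
```

The first route lets any complete-data algorithm run unchanged but treats imputed values as if they were observed. The second leaves the data untouched but requires the algorithm itself to be adapted.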

A Global Clustering Approach Using Hybrid Optimization for Incomplete Data Based on Interval Reconstruction of Missing Value

International Journal of Intelligent Systems; doi:10.1002/int.21752; 2015; Vol 31 (4); pp. 297-313; Cited By ~ 5

Author(s): Liyong Zhang, Wei Lu, Xiaodong Liu, Witold Pedrycz, Chongquan Zhong, ...

Keyword(s): Incomplete Data, Hybrid Optimization, Missing Value, Clustering Approach

Automatic missing value imputation for cleaning phase of diabetic’s readmission prediction model

International Journal of Electrical and Computer Engineering (IJECE); doi:10.11591/ijece.v12i2.pp2001-2013; 2022; Vol 12 (2); pp. 2001

Author(s): Jesmeen Mohd Zebaral Hoque, Jakir Hossen, Shohel Sayeed, Chy. Mohammed Tawsif K., Jaya Ganesan, ...

Keyword(s): Incomplete Data, Missing Values, Prediction Models, Low Cost, Support Vector, Data Sampling, Data Set, Missing Value, Missing Value Imputation, Proper Analysis

The healthcare industry has recently begun generating large volumes of data. If hospitals can make use of these data, they can more easily predict outcomes and provide better treatment at early stages and at low cost. Here, data analytics (DA) was used to support correct decisions through proper analysis and prediction. However, inappropriate data may lead to flawed analysis and thus to unacceptable conclusions. Hence, transforming the improper data in the data set into useful data is essential. A machine learning (ML) technique was used to overcome the issues caused by incomplete data. A new architecture, automatic missing value imputation (AMVI), was developed to predict missing values in the dataset, including data sampling and feature selection. Four prediction models (logistic regression, support vector machine (SVM), AdaBoost, and random forest) were selected from well-known classification algorithms. The performance of the complete AMVI architecture was evaluated using a structured data set obtained from the UCI repository. An accuracy of around 90% was achieved. Cross-validation also confirmed that the trained ML model is suitable and not over-fitted. The trained model is developed from the dataset and does not depend on a specific environment. It trains and selects the best-performing model for the data available.

Takagi-Sugeno Modeling of Incomplete Data for Missing Value Imputation With the Use of Alternate Learning

IEEE Access; doi:10.1109/access.2020.2991669; 2020; Vol 8; pp. 83633-83644

Author(s): Xiaochen Lai, Liyong Zhang, Xin Liu

Keyword(s): Incomplete Data, Missing Value, Missing Value Imputation, Takagi Sugeno

AN APPLICATION OF GENETIC ALGORITHM FOR CLUSTERING OBSERVATIONS WITH INCOMPLETE DATA

Indonesian Journal of Statistics and Its Applications; doi:10.29244/ijsa.v1i1.48; 2017; Vol 1 (1); pp. 13-23

Author(s): Frisca Rizki Ananda, Asep Saefuddin, Bagus Sartono

Keyword(s): Genetic Algorithm, Cluster Analysis, Incomplete Data, Similarity Index, Complete Data, Missing Value, Incomplete Observations, Common Strategy, Distance Approach, Iris Data

Cluster analysis is a method for classifying observations into several clusters. A common strategy for clustering observations uses distance as a similarity index. However, the distance approach cannot be applied when the data are incomplete. A genetic algorithm involving variance (GACV) is applied to solve this problem. This study employed GACV on the Iris data introduced by Sir Ronald Fisher. Clustering of incomplete data was carried out on data produced by deleting some values from the Iris data. The algorithm was developed in R 3.0.2 and gave satisfactory results for clustering complete data, with 95.99% sensitivity and 98% consistency. GACV can be applied to cluster observations with missing values without filling in the missing values or excluding those observations. Performance on clustering incomplete observations is also satisfactory but tends to decrease as the proportion of incomplete values increases. The proportion of incomplete values should be less than or equal to 40% to obtain sensitivity and consistency of at least 90%. Keywords: Cluster Analysis, Genetic Algorithm, Incomplete Data.

Normalization and outlier removal in class center-based firefly algorithm for missing value imputation

Journal Of Big Data; doi:10.1186/s40537-021-00518-7; 2021; Vol 8 (1)

Author(s): Heru Nugroho, Nugraha Priya Utama, Kridanto Surendro

Keyword(s): Incomplete Data, Statistical Power, Firefly Algorithm, Missing Values, Outlier Removal, Processing Stage, Missing Value, Missing Value Imputation, Almost All, True Values

Abstract: A missing value is one of the factors that often cause incomplete data in almost all studies, even those that are well designed and controlled. It can also decrease a study’s statistical power or result in inaccurate estimates and conclusions. Hence, data normalization and missing value handling are considered major problems in the data pre-processing stage, while classification algorithms are adopted to handle numerical features. When the observed data contain outliers, the estimated missing values are sometimes unreliable or even differ greatly from the true values. Therefore, this study proposes combining normalization and outlier removal before imputing missing values with the class center-based firefly algorithm (ON + C3FA). Standard imputation techniques such as mean, random value, regression, multiple imputation, KNN imputation, and decision tree (DT)-based missing value imputation were used for comparison with the proposed method. Experimental results on the sonar dataset showed the effect of normalization and outlier removal on the methods. With the proposed method (ON + C3FA), AUC, accuracy, F1-Score, Precision, Recall, and AUC-PR were 0.972, 0.906, 0.906, 0.908, 0.906, and 0.61, respectively. The results showed that combining normalization and outlier removal in C3-FA (ON + C3FA) was an efficient technique for recovering actual data when handling missing values, and it also outperformed the methods of previous studies, with r and RMSE values of 0.935 and 0.02. Meanwhile, the Dks value obtained with this technique was 0.04, which indicated that it could maintain the values or distribution accuracy.
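The reason outliers hurt imputation, and why removing them first can help, can be sketched on a single numeric column. This is an illustration only and not the C3-FA method: plain mean imputation is swapped in for the firefly-based imputer, the Tukey fence rule (1.5 times the interquartile range) and min-max scaling are assumed choices for outlier removal and normalization, the order of the steps is an assumption, and all names and data are hypothetical.

```python
import math
import statistics
from typing import List, Optional, Sequence

Column = List[Optional[float]]


def remove_outliers(values: Column, k: float = 1.5) -> Column:
    """Drop observed values outside the Tukey fences; keep missing entries."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(observed, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v is None or low <= v <= high]


def min_max(values: Column) -> Column:
    """Scale observed values to [0, 1]; missing entries stay None."""
    observed = [v for v in values if v is not None]
    lo, hi = min(observed), max(observed)
    if hi == lo:
        raise ValueError("column is constant; min-max scaling is undefined")
    return [None if v is None else (v - lo) / (hi - lo) for v in values]


def mean_fill(values: Column) -> List[float]:
    """Stand-in imputer: fill None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


def rmse(truth: Sequence[float], estimates: Sequence[float]) -> float:
    """Root mean squared error between true and imputed values."""
    if len(truth) != len(estimates) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((t - e) ** 2 for t, e in zip(truth, estimates)) / len(truth)
    )


# Toy column (assumed values): two missing entries and one gross outlier.
column = [0.2, 0.3, None, 0.25, 0.35, 9.0, 0.3, None, 0.28, 0.32]

plain = mean_fill(column)                    # the outlier inflates the fill
robust = mean_fill(remove_outliers(column))  # the outlier is dropped first
cleaned = mean_fill(min_max(remove_outliers(column)))

print("fill without outlier removal:", plain[2])
print("fill with outlier removal:   ", robust[2])
```

In this toy column the single extreme value pulls the plain mean far away from the bulk of the data, so every imputed entry inherits that distortion. Removing it first keeps the fill value within the range of the typical observations.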

Imputation of Incomplete Data Based on Attribute Cross Fitting Model and Iterative Missing Value Variables

Advances in Neural Networks – ISNN 2020 - Lecture Notes in Computer Science; doi:10.1007/978-3-030-64221-1_15; 2020; pp. 167-175

Author(s): Jinchong Zhu, Liyong Zhang, Xiaochen Lai, Genglin Zhang

Keyword(s): Incomplete Data, Missing Value, Fitting Model

Missing Value Imputations by Rule-Based Incomplete Data Fuzzy Modeling

Comparison of Serial and Parallel Computation on Predicting Missing Data with EM Algorithm
