A New Heuristic Approach for Treating Missing Value: ABCimp

2019 ◽  
Vol 25 (6) ◽  
pp. 48-54
Author(s):  
Pinar Cihan ◽  
Zeynep Banu Ozger

Missing values in datasets present an important problem for traditional and modern statistical methods. Many statistical methods have been developed to analyze complete datasets. However, most real-world datasets contain missing values. Therefore, in recent years, many methods have been developed to overcome the missing value problem. Heuristic methods have become popular in this field due to their superior performance in many other optimization problems. This paper introduces a new approach, based on the Artificial Bee Colony algorithm, for missing value imputation in four real-world discrete datasets. In the proposed Artificial Bee Colony Imputation (ABCimp) method, Bayesian Optimization is integrated into the Artificial Bee Colony algorithm. The performance of the proposed technique is compared with six other well-known methods: Mean, Median, k Nearest Neighbor (k-NN), Multivariate Imputation by Chained Equations (MICE), Singular Value Decomposition (SVD), and MissForest (MF). The classification error and root mean square error are used as the criteria for evaluating the performance of the imputation methods, and the Naive Bayes algorithm is used as the classifier. The empirical results show that ABCimp performs better than the other popular imputation methods at missing rates ranging from 3% to 15%.
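The evaluation idea in this abstract (hide known values, impute them, then score the imputation with root mean square error) can be illustrated with a minimal sketch. It covers only the Mean and Median baselines, not ABCimp itself; the function names `impute` and `rmse_on_missing` and the toy data are assumptions made for illustration, not taken from the paper.

```python
import math
import statistics

def impute(column, strategy="mean"):
    """Fill None entries with the mean or median of the observed entries."""
    observed = [v for v in column if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in column]

def rmse_on_missing(truth, masked, imputed):
    """RMSE computed only over the positions that were masked out."""
    errors = [(t - i) ** 2
              for t, m, i in zip(truth, masked, imputed) if m is None]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical toy column: two known values are hidden to simulate missingness.
truth = [1.0, 2.0, 2.0, 3.0, 10.0, 2.0]
masked = [1.0, None, 2.0, 3.0, 10.0, None]

rmse_mean = rmse_on_missing(truth, masked, impute(masked, "mean"))
rmse_median = rmse_on_missing(truth, masked, impute(masked, "median"))
print(rmse_mean, rmse_median)
```

In this toy column the outlier 10.0 pulls the mean upward, so the median fill lands closer to the hidden values. A full comparison along the lines of the abstract would repeat the masking at several missing rates and also measure downstream classification error.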

2017 ◽  
Author(s):  
Runmin Wei ◽  
Jingye Wang ◽  
Erik Jia ◽  
Tianlu Chen ◽  
Yan Ni ◽  
...  

Abstract. Left-censored missing values commonly exist in targeted metabolomics datasets and can be considered as missing not at random (MNAR). Improper data processing procedures for missing values will cause adverse impacts on subsequent statistical analyses. However, few imputation methods have been developed and applied to the situation of MNAR in the field of metabolomics. Thus, a practical left-censored missing value imputation method is urgently needed. We have developed an iterative Gibbs sampler based left-censored missing value imputation approach (GSimp). We compared GSimp with three other imputation methods on two real-world targeted metabolomics datasets and one simulation dataset using our imputation evaluation pipeline. The results show that GSimp outperforms other imputation methods in terms of imputation accuracy, observation distribution, univariate and multivariate analyses, and statistical sensitivity. The R code for GSimp, evaluation pipeline, vignette, real-world and simulated targeted metabolomics datasets are available at: https://github.com/WandeRum/GSimp.

Author summary. Missing values caused by the limit of detection/quantification (LOD/LOQ) were widely observed in mass spectrometry (MS)-based targeted metabolomics studies and could be recognized as missing not at random (MNAR). MNAR leads to biased parameter estimations and jeopardizes subsequent statistical analyses in different aspects, such as distorting sample distribution and impairing statistical power. Although a wide range of missing value imputation methods has been developed for -omics studies, only a limited number of them are designed appropriately for the situation of MNAR. To alleviate problems caused by MNAR and facilitate targeted metabolomics studies, we developed a Gibbs sampler based missing value imputation approach, called GSimp, which is publicly accessible on GitHub. We compared our method with existing approaches using an imputation evaluation pipeline on real-world and simulated metabolomics datasets to demonstrate the superiority of our method from different perspectives.
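To make the notion of left-censored imputation concrete, the sketch below fills values missing below a detection limit by drawing from a normal distribution truncated at that limit. This is a deliberately simplified, single-variable illustration and not the GSimp algorithm: the function name `impute_left_censored`, the toy data, and the choice to fit the normal to the observed values only (which overestimates the true mean) are all assumptions made here.

```python
import random
from statistics import NormalDist, mean, stdev

def impute_left_censored(values, lod, seed=0):
    """Replace None entries with draws from a normal truncated above at `lod`.

    Simplifying assumption: the normal is fitted to the observed values only.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    dist = NormalDist(mean(observed), stdev(observed))
    upper = dist.cdf(lod)  # probability mass lying below the detection limit
    filled = []
    for v in values:
        if v is None:
            # Inverse-CDF sampling restricted to the region below `lod`;
            # the small lower bound keeps the argument strictly positive.
            u = rng.uniform(1e-12, upper)
            v = dist.inv_cdf(u)
        filled.append(v)
    return filled

# Hypothetical metabolite intensities; None marks values below the LOD.
values = [5.2, 6.1, None, 4.8, None, 7.0]
lod = 4.5
filled = impute_left_censored(values, lod)
print(filled)
```

Every imputed value falls at or below the detection limit, which is the property that mean or median imputation cannot guarantee for MNAR data. A Gibbs-sampler approach would additionally condition each draw on the other variables and iterate until the imputations stabilize.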


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Chunhua Ju ◽  
Chonghuan Xu

Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on the K-means clustering algorithm. In the process of clustering, we use the artificial bee colony (ABC) algorithm to overcome the local optimum problem caused by K-means. After that, we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on a benchmark dataset, MovieLens, and a real-world dataset indicates that our new collaborative filtering approach based on a user clustering algorithm outperforms many other recommendation methods.
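The user-similarity step can be sketched as follows, assuming that "modified cosine similarity" refers to the common mean-centred (adjusted) cosine computed over co-rated items; the abstract does not give the exact formula, so this is one plausible reading. The function name `adjusted_cosine` and the ratings are hypothetical.

```python
import math

def adjusted_cosine(ratings_a, ratings_b):
    """Mean-centred cosine similarity between two users over co-rated items.

    Each argument maps item id -> rating. Returns 0.0 when the users share
    no items or when either user's centred ratings are all zero.
    """
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    norm_a = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    norm_b = math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return num / (norm_a * norm_b)

# Hypothetical users assumed to belong to the same cluster.
alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 3, "m3": 2}
carol = {"m1": 1, "m2": 3, "m3": 5}

sim_ab = adjusted_cosine(alice, bob)
sim_ac = adjusted_cosine(alice, carol)
print(sim_ab, sim_ac)
```

Centring on each user's mean removes differences in rating scale, so alice and bob come out as highly similar even though bob rates less extremely, while carol's reversed preferences give a negative score.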


2012 ◽  
Vol 3 (4) ◽  
pp. 19-33 ◽  
Author(s):  
Harish Sharma ◽  
Jagdish Chand Bansal ◽  
K. V. Arya ◽  
Kusum Deep

The Artificial Bee Colony (ABC) optimization algorithm is a relatively simple and recent population-based probabilistic approach for global optimization. ABC has outperformed some Nature Inspired Algorithms (NIAs) on test problems as well as real-world optimization problems. This paper presents an attempt to modify ABC to make it less prone to getting stuck at local optima and more computationally efficient. In the case of local convergence, adding some external potential solutions may help the swarm escape the local valley, and if the algorithm is taking too long to converge, deleting some swarm members may help to speed up convergence. Therefore, in this paper a dynamic swarm size strategy for ABC is proposed. The proposed strategy is named the Dynamic Swarm Artificial Bee Colony algorithm (DSABC). To show the performance of DSABC, it is tested on 16 global optimization problems of different complexities and a popular real-world optimization problem, namely the Lennard-Jones potential energy minimization problem. The simulation results show that the proposed strategy outperforms the basic ABC and three recent variants of ABC, namely the Gbest-Guided ABC, Best-So-Far ABC and Modified ABC.
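For readers unfamiliar with the basic ABC loop that DSABC modifies, here is a minimal sketch of the employed, onlooker, and scout phases on a simple sphere function. It keeps the swarm size fixed, so the dynamic swarm strategy of the paper is not shown; the function names, the parameter values, and the simplification of replacing every exhausted source in a cycle are assumptions made for illustration.

```python
import random

def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

def abc_minimize(f, dim, bounds, n_sources=10, limit=20, iterations=200, seed=0):
    """Basic fixed-size ABC sketch; assumes f returns non-negative costs."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_source():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def try_improve(i):
        # Perturb one coordinate of source i relative to a random other source,
        # then keep the candidate only if it is better (greedy selection).
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        cand[d] += rng.uniform(-1, 1) * (sources[i][d] - sources[k][d])
        cand[d] = min(hi, max(lo, cand[d]))
        cost = f(cand)
        if cost < costs[i]:
            sources[i], costs[i], trials[i] = cand, cost, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    costs = [f(s) for s in sources]
    trials = [0] * n_sources
    i_best = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[i_best][:], costs[i_best]

    for _ in range(iterations):
        for i in range(n_sources):               # employed bee phase
            try_improve(i)
        fitness = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):               # onlooker bee phase
            i = rng.choices(range(n_sources), weights=fitness)[0]
            try_improve(i)
        for i in range(n_sources):               # scout bee phase
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = f(sources[i])
                trials[i] = 0
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:            # memorize the best source so far
            best, best_cost = sources[i_best][:], costs[i_best]
    return best, best_cost

best, best_cost = abc_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, best_cost)
```

A dynamic swarm variant in the spirit of the abstract would grow or shrink `sources` during the run, adding new sources when the swarm stagnates and removing some when convergence is slow.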


2012 ◽  
Vol 22 (7-8) ◽  
pp. 1447-1459 ◽  
Author(s):  
Marisa da Silva Maximiano ◽  
Miguel A. Vega-Rodríguez ◽  
Juan A. Gómez-Pulido ◽  
Juan M. Sánchez-Pérez
