Missing Value Imputation Using ANN Optimized by Genetic Algorithm

2018 ◽  
Vol 5 (2) ◽  
pp. 41-57 ◽  
Author(s):  
Anjana Mishra ◽  
Bighnaraj Naik ◽  
Suresh Kumar Srichandan

Missing values arise in almost all serious statistical analyses and create numerous problems when processing data in databases. In real-world applications, information may be missing because of instrumental errors, optional fields, non-response to some survey questions, data entry errors, and so on. Most data mining techniques require complete data without any missing information, which has led researchers to develop efficient methods for handling missing values. Missing-data handling has long been an active research area in various domains. The objective of this article is to handle missing data using an evolutionary (genetic) algorithm, including some relatively simple methodologies that can often yield reasonable results. The proposed method uses a genetic algorithm and a multi-layer perceptron (MLP) to predict missing data with higher accuracy.
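The abstract does not describe how the genetic algorithm and the MLP are combined, so the following is only a minimal sketch under one common assumption: the genetic algorithm searches the weight vector of a small one-hidden-layer perceptron that predicts a missing attribute from the observed ones. The function names, network size, and GA settings are all hypothetical and are not taken from the article.

```python
import math
import random

def mlp_predict(w, x, hidden):
    """One-hidden-layer perceptron with tanh hidden units and a linear output.
    Layout of w: per hidden unit (n_in weights + bias), then hidden output
    weights, then the output bias."""
    n_in = len(x)
    out_w = w[hidden * (n_in + 1):]
    y = out_w[hidden]  # output bias
    for h in range(hidden):
        base = h * (n_in + 1)
        z = w[base + n_in] + sum(w[base + i] * x[i] for i in range(n_in))
        y += out_w[h] * math.tanh(z)
    return y

def mse(w, rows, hidden):
    """Mean squared error over (inputs, target) pairs from complete records."""
    return sum((mlp_predict(w, x, hidden) - t) ** 2 for x, t in rows) / len(rows)

def train_ga(rows, hidden=3, pop_size=30, generations=100, seed=0):
    """Evolve MLP weights; returns (best weights, initial best MSE, final MSE)."""
    rng = random.Random(seed)
    n_w = hidden * (len(rows[0][0]) + 2) + 1
    pop = [[rng.uniform(-1, 1) for _ in range(n_w)] for _ in range(pop_size)]
    start = min(mse(w, rows, hidden) for w in pop)
    for _ in range(generations):
        pop.sort(key=lambda w: mse(w, rows, hidden))
        elite = pop[: pop_size // 2]          # elitism: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_w)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied to roughly 20% of the genes.
            child = [g + rng.gauss(0, 0.1) if rng.random() < 0.2 else g
                     for g in child]
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda w: mse(w, rows, hidden))
    return best, start, mse(best, rows, hidden)
```

To impute, one would train on the records where the attribute is present and then call `mlp_predict(best, observed_inputs, hidden)` for each record where it is missing. Because the better half of the population is carried over unchanged, the final training error cannot exceed the best error in the initial population.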


Author(s):  
Tian Lu ◽  
Qinxue Chen ◽  
Zeyu Liu

Although cyclo[18]carbon has long been investigated both theoretically and experimentally, only very recently was it prepared and directly observed by means of STM/AFM in the condensed phase (Kaiser et al., Science, 365, 1299 (2019)). The unique ring structure and dual 18-center π delocalization give cyclo[18]carbon a variety of unusual characteristics and properties that are well worth exploring. In this work, we present a comprehensive and detailed investigation of almost all aspects of cyclo[18]carbon, including (1) geometric characteristics, (2) bonding nature, (3) electron delocalization and aromaticity, (4) intermolecular interaction, (5) reactivity, (6) electronic excitation and UV/Vis spectrum, (7) molecular vibration and IR/Raman spectrum, (8) molecular dynamics, (9) response to an external field, (10) electron ionization, affinity, and accompanying processes, and (11) various molecular properties. We believe that our full characterization of cyclo[18]carbon will greatly deepen researchers' understanding of this system and thereby help them to use it in practice and to design valuable derivatives.



METRON ◽  
2021 ◽  
Author(s):  
Paolo Mariani ◽  
Andrea Marletta

Social media has become a widespread element of people’s everyday life and is used to communicate and generate content. Among the several ways to react to social media content, “Likes” are critical. Indeed, they convey preferences, which drive existing markets or allow the creation of new ones. Nevertheless, appreciation indicators have some complex features, for example the interpretation of the absence of “Likes”. In this case, the lack of approval may be considered a specific behaviour. The present study aimed to determine whether the absence of Likes may indicate a specific behaviour by contextualizing the treatment of missing data applied to real cases. We provided a practical strategy for extracting more knowledge from social media data, whose synthesis raises several measurement problems. We proposed an approach based on the disambiguation of missing data into two modalities: “Dislike” and “Nothing”. Finally, a data pre-processing technique was suggested to increase the signal of social media data.


2021 ◽  
Vol 40 (4) ◽  
pp. 8493-8500
Author(s):  
Yanwei Du ◽  
Feng Chen ◽  
Xiaoyi Fan ◽  
Lei Zhang ◽  
Henggang Liang

As the number of goods to be loaded increases, the number of possible loading schemes grows exponentially. Determining a loading scheme from experience is time-consuming and inefficient. The genetic algorithm is a search heuristic used to solve optimization problems in computer science and artificial intelligence. A genetic algorithm can effectively select the optimal loading scheme but is unable to make full use of the weight and volume capacity of the cargo and truck. In this paper, we propose a hybrid genetic and fuzzy logic based cargo-loading decision-making model that focuses on achieving maximum profit with maximum utilization of the weight and volume capacity of the cargo and truck. First, the components of the goods stowage problem in the distribution center are analyzed systematically, which lays the foundation for a reasonable classification of the problem and the establishment of its mathematical model. Second, the paper abstracts and defines the goods loading problem in the distribution center, establishes a mathematical model for optimizing single-vehicle three-dimensional goods loading, and designs a genetic algorithm to solve the model. Finally, Matlab is used to solve the cargo loading optimization model, and the good performance of the algorithm is verified by an example. The performance evaluation shows that the proposed hybrid system achieves better outcomes than the standard SA model, GA method, and TS strategy.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nishith Kumar ◽  
Md. Aminul Hoque ◽  
Masahiro Sugimoto

Mass spectrometry is a modern and sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional, large-scale matrix (samples × metabolites) of quantified data that often contains missing cells as well as outliers, which originate for several reasons, including technical and biological sources. Although several missing data imputation techniques are described in the literature, the conventional techniques address only the missing value problem. They do not address outliers, so outliers in the dataset decrease the accuracy of the imputation. We developed a new kernel weight function-based missing data imputation technique that resolves the problems of both missing values and outliers. We evaluated the performance of the proposed method and of other conventional and recently developed imputation techniques using both artificially generated data and experimentally measured data, in both the absence and the presence of different rates of outliers. Performance on both artificial data and real metabolomics data indicates the superiority of the proposed kernel weight-based missing data imputation technique over the existing alternatives. For user convenience, an R package of the proposed technique was developed, which is available at https://github.com/NishithPaul/tWLSA.


2004 ◽  
Vol 19 (2) ◽  
pp. 718-723 ◽  
Author(s):  
P. Kumar ◽  
V.K. Chandna ◽  
M.S. Thomas

2010 ◽  
Vol 19 (01) ◽  
pp. 107-121 ◽  
Author(s):  
JUAN CARLOS FIGUEROA GARCÍA ◽  
DUSKO KALENATIC ◽  
CESAR AMILCAR LÓPEZ BELLO

This paper presents a proposal based on an evolutionary algorithm for imputing missing observations in time series. A genetic algorithm based on the minimization of an error function derived from their autocorrelation function, mean, and variance is presented. All methodological aspects of the genetic structure are presented. An extended description of the design of the fitness function is provided. Four application examples are provided and solved by using the proposed method.
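As a rough illustration of the idea, the sketch below evolves candidate fill values for the gaps of a series and scores each candidate by how far the mean, variance, and lag-1 autocorrelation of the filled series are from target values. It is not the paper's fitness function: the use of lag 1 only, the unweighted squared-error form, and the assumption that the target statistics are supplied by the caller (for example, estimated from a complete reference stretch) are all choices made here for illustration, as are the function names and GA settings.

```python
import random

def summary(xs):
    """Mean, variance and lag-1 autocorrelation of a complete series."""
    n = len(xs)
    m = sum(xs) / n
    dev = [x - m for x in xs]
    var = sum(d * d for d in dev) / n
    acf1 = sum(a * b for a, b in zip(dev, dev[1:])) / (n * var) if var > 0 else 0.0
    return m, var, acf1

def error(candidate, series, gaps, target):
    """Squared distance between the filled series' statistics and the targets."""
    filled = list(series)
    for idx, value in zip(gaps, candidate):
        filled[idx] = value
    return sum((s - t) ** 2 for s, t in zip(summary(filled), target))

def impute_ga(series, target, pop_size=40, generations=100, seed=0):
    """Fill the None entries of series; returns (filled series, final error).
    Assumes the series contains at least one gap (None)."""
    rng = random.Random(seed)
    gaps = [i for i, x in enumerate(series) if x is None]
    observed = [x for x in series if x is not None]
    lo, hi = min(observed), max(observed)
    obs_mean = sum(observed) / len(observed)

    def score(candidate):
        return error(candidate, series, gaps, target)

    # Seed the population with plain mean imputation so that, with elitism,
    # the result can never score worse than that baseline.
    pop = [[obs_mean] * len(gaps)]
    pop += [[rng.uniform(lo, hi) for _ in gaps] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            k = rng.randrange(len(child))
            child[k] += rng.gauss(0.0, 0.1 * (hi - lo))         # mutate one gene
            children.append(child)
        pop = elite + children
    best = min(pop, key=score)
    filled = list(series)
    for idx, value in zip(gaps, best):
        filled[idx] = value
    return filled, score(best)
```

Each chromosome holds one real value per missing observation, so observed values are never altered. A fuller version would likely weight the three error terms and use several autocorrelation lags.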


2015 ◽  
Vol 21 (S4) ◽  
pp. 218-223 ◽  
Author(s):  
D. Dowsett

Two techniques for use with SIMION [1] are presented: boundary matching and genetic optimization. The first allows systems that were previously difficult or impossible to simulate in SIMION to be simulated with great accuracy. The second allows any system to be rapidly and robustly optimized using a parallelized genetic algorithm. Each method will be described along with examples of real-world applications.


Author(s):  
Caio Ribeiro ◽  
Alex A. Freitas

Longitudinal datasets of human ageing studies usually have a high volume of missing data, and one way to handle missing values in a dataset is to replace them with estimations. However, there are many methods to estimate missing values, and no single method is the best for all datasets. In this article, we propose a data-driven missing value imputation approach that performs a feature-wise selection of the best imputation method, using known information in the dataset to rank the five methods we selected, based on their estimation error rates. We evaluated the proposed approach in two sets of experiments: a classifier-independent scenario, where we compared the applicabilities and error rates of each imputation method; and a classifier-dependent scenario, where we compared the predictive accuracy of Random Forest classifiers generated with datasets prepared using each imputation method and a baseline approach of doing no imputation (letting the classification algorithm handle the missing values internally). Based on our results from both sets of experiments, we concluded that the proposed data-driven missing value imputation approach generally resulted in models with more accurate estimations for missing data and better performing classifiers, in longitudinal datasets of human ageing. We also observed that imputation methods devised specifically for longitudinal data had very accurate estimations. This reinforces the idea that using the temporal information intrinsic to longitudinal data is a worthwhile endeavour for machine learning applications, and that can be achieved through the proposed data-driven approach.
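The feature-wise selection step can be sketched as follows, under simplifying assumptions: only two candidate estimators (column mean and column median) stand in for the five methods the article ranks, the ranking uses leave-one-out mean absolute error on the known values, and every feature is assumed to have at least two known values. All names here are hypothetical.

```python
from statistics import mean, median

# Hypothetical candidate estimators: each maps a feature's known values
# to a single fill value. The article's five methods are not reproduced.
METHODS = {"mean": mean, "median": median}

def rank_methods(known):
    """Hide each known value in turn, estimate it from the remaining ones,
    and return method names ordered by mean absolute error (best first)."""
    errors = {}
    for name, estimate in METHODS.items():
        residuals = [abs(v - estimate(known[:i] + known[i + 1:]))
                     for i, v in enumerate(known)]
        errors[name] = mean(residuals)
    return sorted(errors, key=errors.get)

def impute_featurewise(rows):
    """Fill None cells column by column with that column's best-ranked method.
    Returns (filled rows, {column index: chosen method name})."""
    filled = [list(r) for r in rows]
    chosen = {}
    for j in range(len(rows[0])):
        known = [r[j] for r in rows if r[j] is not None]
        best = rank_methods(known)[0]
        chosen[j] = best
        fill = METHODS[best](known)
        for r in filled:
            if r[j] is None:
                r[j] = fill
    return filled, chosen
```

With the rows `[[1, 10], [2, 12], [3, None], [None, 11], [100, 13]]`, the first column contains an outlier, so the median ranks first there, while the mean ranks first for the second column. This shows how different features can end up with different imputation methods.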

