Optimasi Bobot K-Means Clustering untuk Mengatasi Missing Value dengan Menggunakan Algoritma Genetica [Optimizing K-Means Clustering Weights to Handle Missing Values Using a Genetic Algorithm]

Missing values require preprocessing with imputation techniques to produce complete data. Imputation requires appropriate initial weights, because the resulting data are replacement data. Choosing the optimal weights and a suitable value of K in the K-Means Imputation (KMI) method is a major problem, and a poor choice increases the error. A combined model of a genetic algorithm (GA) and KMI, known as GAKMI, is used to determine the optimal weights for each data cluster that contains missing values. The GA selects the weights using real-number encoding on the chromosomes. In the hybrid GA and KMI model, clustering uses the sum of the Euclidean distances of each data point from its cluster center. Algorithm performance is measured with a fitness function whose optimum is the smallest MSE value. Experiments on hepatitis data show that the GA is efficient at finding the optimal initial weights in a large search space. With MSE = 0.044 at K = 3 and the 5th replication, GAKMI yields a low error rate for the hepatitis data, which has mixed attributes. In imputation-level testing, GAKMI produced r = 0.526, higher than the other methods, and it is therefore considered the best among the imputation techniques compared.
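The two quantities this abstract names, the sum of Euclidean distances from each data point to its cluster center and the MSE used as the fitness value, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy data are hypothetical, and the GA loop that would search over weights is omitted.

```python
import math

def sum_of_distances(points, centers, labels):
    """Sum of Euclidean distances from each point to its assigned cluster center."""
    return sum(math.dist(p, centers[k]) for p, k in zip(points, labels))

def mse(true_values, imputed_values):
    """Mean squared error between held-out true values and their imputed replacements."""
    n = len(true_values)
    return sum((t - i) ** 2 for t, i in zip(true_values, imputed_values)) / n

# Toy data (hypothetical): four 2-D points in two clusters.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]

total = sum_of_distances(points, centers, labels)  # each point is 1.0 from its center
error = mse([1.0, 2.0], [1.0, 4.0])                # one exact value, one off by 2.0
```

A GA in this setting would treat `error` as the quantity to minimize, so a candidate set of weights that yields a lower MSE counts as fitter.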

Experimental Programming of Genetic Algorithms for the “Casse-Tête” Problem

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.292513 ◽

2022 ◽

Vol 13 (1) ◽

pp. 0-0

Keyword(s):

Genetic Algorithm ◽

Fitness Function ◽

Search Space ◽

Rank Test ◽

Mutation Operators ◽

Signed Rank ◽

Significant Difference ◽

Chromosome Representation ◽

Wilcoxon Signed Rank Test ◽

Signed Rank Test

The Casse-tête board puzzle consists of an n×n grid covered with n^2 tokens. m<n^2 tokens are deleted from the grid so that each row and column of the grid contains an even number of remaining tokens. The size of the search space is exponential. This study used a genetic algorithm (GA) to design and implement solutions for the board puzzle. The chromosome representation is a matrix of binary permutations. Variants for two crossover operators and two mutation operators were presented. The study experimented with and compared four possible operator combinations. Additionally, it compared GA and simulated annealing (SA)-based solutions, finding a 100% success rate (SR) for both. However, the GA-based model was more effective in solving larger instances of the puzzle than the SA-based model. The GA-based model was found to be considerably more efficient than the SA-based model when measured by the number of fitness function evaluations (FEs). The Wilcoxon signed-rank test confirms a significant difference among FEs in the two models (p=0.038).
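The parity condition of the puzzle lends itself to a simple fitness measure. The sketch below counts the rows and columns that hold an odd number of remaining tokens, so a value of 0 means the board satisfies the condition. It is an assumed formulation for illustration only; the function name `parity_violations` and the 0/1 grid encoding are hypothetical, and the fitness function used in the study may differ.

```python
def parity_violations(grid):
    """Count rows and columns holding an odd number of remaining tokens (0 = solved)."""
    n = len(grid)
    odd_rows = sum(sum(row) % 2 for row in grid)
    odd_cols = sum(sum(grid[r][c] for r in range(n)) % 2 for c in range(n))
    return odd_rows + odd_cols

# 4x4 board with m = 6 tokens removed (1 = token remains, 0 = token removed).
solved = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]

# Removing one more token breaks the parity of one row and one column.
broken = [row[:] for row in solved]
broken[1][1] = 0

solved_score = parity_violations(solved)
broken_score = parity_violations(broken)
```

Counting violations rather than returning a pass/fail flag gives the GA a gradient to follow: a board with two odd lines is closer to a solution than one with six.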

Variable Search Space Converging Genetic Algorithm for Solving System of Non-linear Equations

Journal of Intelligent Systems ◽

10.1515/jisys-2019-0233 ◽

2020 ◽

Vol 30 (1) ◽

pp. 142-164 ◽

Cited By ~ 1

Author(s):

Venkatesh SS ◽

Deepak Mishra

Keyword(s):

Genetic Algorithm ◽

Optimization Problems ◽

Fitness Function ◽

Linear Equations ◽

Search Space ◽

New Variant ◽

Non Linear ◽

Coarse Search ◽

Non Linear Equations ◽

Digit Length

This paper introduces a new variant of the Genetic Algorithm, developed to handle multivariable, multi-objective optimization problems with very large search spaces, such as solving systems of non-linear equations. It is an integer-coded Genetic Algorithm with conventional crossover and mutation, but inversely the algorithm varies its search space by changing its digit length on every cycle, performing a fine search followed by a coarse search. Its solution to the optimization problem converges to a precise value over the cycles. Every equation of the system is treated as a single minimization objective function. Multiple objectives are converted to a single fitness function by summing their absolute values. Some difficult test functions for optimization and applications are used to evaluate the algorithm. The results show that the algorithm is capable of producing promising and precise results.
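The conversion of several equations into one fitness value, as described in this abstract, can be sketched in a few lines. The two-equation system below is a hypothetical example chosen for illustration; it does not come from the paper, and the integer coding and digit-length scheme are not reproduced here.

```python
def fitness(equations, x):
    """Sum of absolute residuals; it is zero exactly when x solves every equation."""
    return sum(abs(f(x)) for f in equations)

# Hypothetical system: x^2 + y^2 = 4 and x = y, each written as f(x, y) = 0.
system = [
    lambda v: v[0] ** 2 + v[1] ** 2 - 4.0,
    lambda v: v[0] - v[1],
]

at_guess = fitness(system, (1.0, 1.0))            # |1 + 1 - 4| + |0|
at_root = fitness(system, (2 ** 0.5, 2 ** 0.5))   # close to zero at the true root
```

Because every residual enters as an absolute value, residuals of opposite sign cannot cancel, so a fitness near zero implies that each equation is nearly satisfied.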

A Route Planning’s Method for Micro UAV Based on Improved Genetic Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.602-605.1348 ◽

2014 ◽

Vol 602-605 ◽

pp. 1348-1351 ◽

Cited By ~ 2

Author(s):

Ling Lu ◽

Hua Zhang

Keyword(s):

Genetic Algorithm ◽

Environmental Impact ◽

Penalty Function ◽

Fitness Function ◽

Search Space ◽

Improved Genetic Algorithm ◽

Path Constraints ◽

Simulation Results ◽

Return Path

Because a micro UAV is susceptible to wind during flight, this paper puts forward an ergodic track strategy for replanning the return path. By adding coverage flight and sacrificing path, the strategy achieves safe recovery of the voyage. A genetic algorithm is used to solve the problem, and a dynamic penalty function is introduced to improve the fitness function, which effectively reduces the search space. The simulation results show that the method can generate a safe track that satisfies the path constraints.

Regional Land Use and Transportation Planning with a Genetic Algorithm

Transportation Research Record Journal of the Transportation Research Board ◽

10.3141/1831-24 ◽

2003 ◽

Vol 1831 (1) ◽

pp. 210-218 ◽

Cited By ~ 4

Author(s):

Richard Balling ◽

Michael Lowry ◽

Mitsuru Saito

Keyword(s):

Genetic Algorithm ◽

Land Use ◽

Regional Planning ◽

Traffic Congestion ◽

Transportation Planning ◽

Fitness Function ◽

Search Space ◽

Metropolitan Region ◽

Search Spaces ◽

Regional Land

A new approach to regional land use and transportation planning, which uses a genetic algorithm as an integrated optimization tool, is presented. The approach is illustrated by applying it to the Wasatch Front Metropolitan Region, which consists of four counties in the state of Utah. This genetic algorithm-based approach was applied earlier to the twin cities of Provo and Orem in Utah, but here it is adapted to regional planning. Three issues make regional planning particularly difficult: (a) individual cities have significant planning autonomy, (b) the search space of possible plans is immense, and (c) preferences between competing objectives vary among stakeholders. The approach used here addresses the first issue by the way the problem is formulated. The second issue is addressed with a genetic algorithm. Such algorithms are particularly well suited to problems with large search spaces. The third issue is addressed by using a multiobjective fitness function in the genetic algorithm. It was found that a genetic algorithm could produce a set of nondominated future land use scenarios and street plans for a region, from which regional planners can make a selection. Execution of the algorithm to produce 100 plans per generation for 100 generations took about 4 days with a high-end personal computer. Interesting trends for reducing change and traffic congestion were discovered.

An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function

Applied Computational Intelligence and Soft Computing ◽

10.1155/2012/945401 ◽

2012 ◽

Vol 2012 ◽

pp. 1-11 ◽

Cited By ~ 3

Author(s):

Satoko Kikuchi ◽

Goutam Chakraborty

Keyword(s):

Genetic Algorithm ◽

State Of The Art ◽

Fitness Function ◽

Computational Cost ◽

Search Space ◽

Search Efficiency ◽

Genome Sequences ◽

Modified Genetic Algorithm ◽

Correct Sequence ◽

Np Problem

To decode a long genome sequence, shotgun sequencing is the state-of-the-art technique. It needs to properly sequence a very large number, sometimes as large as millions, of short partially readable strings (fragments). Arranging those fragments in correct sequence is known as fragment assembling, which is an NP-problem. Presently used methods require enormous computational cost. In this work, we have shown how our modified genetic algorithm (GA) could solve this problem efficiently. In the proposed GA, the length of the chromosome, which represents the volume of the search space, is reduced with advancing generations, and thereby improves search efficiency. We also introduced a greedy mutation, by swapping nearby fragments using some heuristics, to improve the fitness of chromosomes. We compared results with Parsons’ algorithm which is based on GA too. We used fragments with partial reads on both sides, mimicking fragments in real genome assembling process. In Parsons’ work base-pair array of the whole fragment is known. Even then, we could obtain much better results, and we succeeded in restructuring contigs covering 100% of the genome sequences.

Implicit Representation in Genetic Algorithms Using Redundancy

Evolutionary Computation ◽

10.1162/evco.1997.5.3.277 ◽

1997 ◽

Vol 5 (3) ◽

pp. 277-302 ◽

Cited By ~ 16

Author(s):

Anne M. Raich ◽

Jamshid Ghaboussi

Keyword(s):

Genetic Algorithm ◽

Genetic Algorithms ◽

Fitness Function ◽

Search Space ◽

Function Evaluation ◽

Diverse Population ◽

Implicit Representation ◽

Essential Information ◽

Redundant Representation ◽

Better Than

A new representation combining redundancy and implicit fitness constraints is introduced that performs better than a simple genetic algorithm (GA) and a structured GA in experiments. The implicit redundant representation (IRR) consists of a string that is over-specified, allowing for sections of the string to remain inactive during function evaluation. The representation does not require the user to prespecify the number of parameters to evaluate or the location of these parameters within the string. This information is obtained implicitly by the fitness function during the GA operations. The good performance of the IRR can be attributed to several factors: less disruption of existing fit members due to the increased probability of crossovers and mutation affecting only redundant material; discovery of fit members through the conversion of redundant material into essential information; and the ability to enlarge or reduce the search space dynamically by varying the number of variables evaluated by the fitness function. The IRR GA provides a more biologically parallel representation that maintains a diverse population throughout the evolution process. In addition, the IRR provides the necessary flexibility to represent unstructured problem domains that do not have the explicit constraints required by fixed representations.

MTC: Minimizing Time and Cost of Cloud Task Scheduling based on Customers and Providers Needs using Genetic Algorithm

International Journal of Intelligent Systems and Applications ◽

10.5815/ijisa.2021.02.03 ◽

2021 ◽

Vol 13 (2) ◽

pp. 38-51

Author(s):

Nasim Soltani Soulegan ◽

Behrang Barekatain ◽

Behzad Soleimani Neysiani

Keyword(s):

Genetic Algorithm ◽

Cloud Computing ◽

Heterogeneous Computing ◽

Virtual Machines ◽

Heuristic Algorithms ◽

Fitness Function ◽

Statistical Tests ◽

Search Space ◽

Considerable Effect ◽

Cloud Services

Cloud computing is considered a pattern for distributed and heterogeneous computing derived from many resources, in which requests aim to share resources. Cloud computing has recently been ranked among the top technologies globally, and its workloads must be scheduled well to maximize providers' profit and improve service quality for their customers. Scheduling specifies how users' requests are assigned to virtual machines, and it plays a vital role in the efficiency and capability of the system. Its objective is to achieve high throughput, completing jobs in minimum time and to the highest standard. Scheduling jobs in heterogeneous distributed systems is an NP-hard problem that is not solvable in polynomial time for real-time scheduling. The time complexity grows exponentially with the number of jobs, and this problem has a considerable effect on the quality of cloud services and on providers' efficiency. Optimizing scheduling-related parameters with heuristic and meta-heuristic algorithms can reduce the search space complexity and execution time. This study presents a fitness function to minimize the time and cost parameters. The proposed method uses a multi-purpose weighted genetic algorithm that covers six basic parameters (utility, task execution cost, response time, wait time, makespan, and throughput) to provide comprehensive optimization. The proposed approach improved response and wait times, throughput, makespan, and utility by 16, 9, 7, and 8 percent, respectively, with a reduction of only one cost unit, which is negligible. As a result, both providers and users will experience better services. The statistical tests show that the achieved improvement is valid for 94% of experiments.
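A weighted fitness over time and cost, in the spirit of this abstract, might look like the sketch below. It is a simplified illustration under several assumptions that are not in the source: only two of the six parameters are modeled (makespan and execution cost), a chromosome is taken to be a list that maps each task to a virtual machine, task durations are the same on every machine, and the weights `w_time` and `w_cost` are arbitrary.

```python
def makespan(assignment, durations, n_vms):
    """Finish time of the busiest virtual machine."""
    loads = [0.0] * n_vms
    for task, vm in enumerate(assignment):
        loads[vm] += durations[task]
    return max(loads)

def total_cost(assignment, durations, price_per_unit):
    """Execution cost: each task's duration times the price of the VM it runs on."""
    return sum(durations[t] * price_per_unit[vm] for t, vm in enumerate(assignment))

def fitness(assignment, durations, price_per_unit, w_time=0.5, w_cost=0.5):
    """Weighted sum to minimize; the weights are hypothetical."""
    time_term = makespan(assignment, durations, len(price_per_unit))
    cost_term = total_cost(assignment, durations, price_per_unit)
    return w_time * time_term + w_cost * cost_term

durations = [4.0, 2.0, 6.0]   # three tasks
prices = [1.0, 2.0]           # two VMs, the second one twice as expensive
assignment = [0, 0, 1]        # tasks 0 and 1 on VM 0, task 2 on VM 1

score = fitness(assignment, durations, prices)
```

Shifting weight from `w_cost` to `w_time` favors schedules that balance load across machines even when the faster placement is more expensive, which is the provider-versus-customer trade-off the abstract describes.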

Improved Genetic Algorithm Based Classification

International Journal of Computer Science and Informatics ◽

10.47893/ijcsi.2012.1040 ◽

2012 ◽

pp. 211-217

Author(s):

Keshavamurthy B. N ◽

Asad Mohammed Khan ◽

Durga Toshniwal

Keyword(s):

Genetic Algorithm ◽

Evolutionary Algorithms ◽

Probabilistic Models ◽

Fitness Function ◽

Optimal Solution ◽

Search Space ◽

Improved Genetic Algorithm ◽

Data Set ◽

Useful Knowledge ◽

Bayes Algorithm

Classification is a supervised learning technique in data mining that extracts hidden, useful knowledge from large databases by predicting class values from the values of the predictor attributes. Among the various techniques, the most widely discussed are decision trees, probabilistic models, and evolutionary algorithms. Probabilistic and evolutionary techniques have recently received the most attention: probabilistic models yield high accuracy when the problem has no attribute dependency, while evolutionary algorithms quickly find an optimal solution in a large search space when little is known about the problem and the problem possesses attribute dependency. Although each model involves trade-offs, there is still scope to improve on the existing approaches. The proposed approach improves an evolutionary technique, the genetic algorithm, by improving the fitness function parameters. We also compare the genetic algorithm results with Naïve Bayes results. The experiments use the nursery data set from the UCI Machine Learning Repository.

Research and Analysis on Gray Genetic Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.427-429.1514 ◽

2013 ◽

Vol 427-429 ◽

pp. 1514-1517

Author(s):

Zong Jiang Wang

Keyword(s):

Genetic Algorithm ◽

Parameter Space ◽

Fitness Function ◽

Search Space ◽

Small Sample ◽

Reference Sequence ◽

Match Method ◽

Image Match

To address the slow speed of image matching, this article proposes the gray genetic algorithm (GGA), a new fast image matching method that combines gray connection theory with a genetic algorithm. The method first determines the parameter space of the problem and, by coding the parameter space and initializing the string collection, obtains several initial points to be matched. The reference sequence and the comparison sequence are then constructed separately from the template image and from a histogram search of the current subimage's information. Next, the fitness function is established using the gray interrelatedness between these two sequences as the reference. On this basis, the initialized string collection evolves gradually toward the optimal region of the search space through genetic operations such as selection, crossover, and mutation, and finally approaches the optimum matching position. Because the GGA fully exploits the small-sample property of gray connection theory and the computational parallelism of the genetic algorithm, the timeliness of image matching is distinctly enhanced while a certain matching precision is retained.

Research on Application of Intelligence Schedule of City Public Traffic Vehicles Based on Genetic Algorithm

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.706-708.1902 ◽

2013 ◽

Vol 706-708 ◽

pp. 1902-1906

Author(s):

Rui Chen ◽

Liang Fang

Keyword(s):

Genetic Algorithm ◽

Fitness Function ◽

Search Space ◽

Uniform Grid ◽

Penalty Function Method ◽

Improved Genetic Algorithm ◽

Start Time ◽

True Value ◽

Coding Method ◽

Research On Application

Taking into account the interests of both passengers and the agency, this paper adopts a real-valued coding method with the start time as the variable and uses the penalty function method to fold a variety of constraints into the objective function when constructing the fitness function, which simplifies the calculation. Simulation results are then obtained by using the improved Genetic Algorithm to solve the non-uniform grid schedule. The results show that the improved Genetic Algorithm can find an approximately optimal result in the huge search space and greatly increases computational efficiency.
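The penalty function method mentioned in this abstract can be illustrated with a small sketch: a real-valued chromosome holds vehicle start times, and a constraint violation is added to the objective as a weighted penalty. Everything specific here is an assumption for illustration and is not taken from the paper: the average-headway objective, the maximum-headway constraint, and the penalty weight are all stand-ins.

```python
def headways(start_times):
    """Gaps between consecutive departures."""
    return [b - a for a, b in zip(start_times, start_times[1:])]

def penalized_cost(start_times, max_headway, penalty_weight=100.0):
    """Stand-in objective (average headway) plus a penalty for gaps above max_headway."""
    gaps = headways(start_times)
    objective = sum(gaps) / len(gaps)
    violation = sum(max(0.0, g - max_headway) for g in gaps)
    return objective + penalty_weight * violation

# Real-valued chromosomes: departure times in minutes (hypothetical).
feasible = [0.0, 10.0, 20.0, 30.0]     # every gap is 10 minutes
infeasible = [0.0, 10.0, 25.0, 30.0]   # one gap of 15 exceeds the 12-minute limit

ok_cost = penalized_cost(feasible, max_headway=12.0)
bad_cost = penalized_cost(infeasible, max_headway=12.0)
```

Both schedules have the same average headway, so the difference in cost comes entirely from the penalty term. This is what lets an unconstrained GA steer away from infeasible schedules without a separate repair step.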
