Optimasi Bobot K-Means Clustering untuk Mengatasi Missing Value dengan Menggunakan Algoritma Genetica (Optimizing K-Means Clustering Weights to Handle Missing Values Using a Genetic Algorithm)

2021 ◽  
Vol 8 (4) ◽  
pp. 745
Author(s):  
Bain Khusnul Khotimah ◽  
Muhammad Syarief ◽  
Miswanto Miswanto ◽  
Herry Suprajitno

Missing values require preprocessing with an imputation technique to produce complete data. Imputation needs suitable initial weights, because the data it produces are replacement values. Choosing the optimal weights and a suitable value of K in the K-Means Imputation (KMI) method is a major problem, and a poor choice increases the error. A combined model of the genetic algorithm (GA) and KMI, known as GAKMI, is used to determine the optimal weights for each data cluster that contains missing values. The GA selects the weights using real-number encoding of the chromosomes. In the hybrid GA and KMI model, clustering uses the sum of the Euclidean distances of each data point from its cluster center. Performance is measured with a fitness function that is optimal at the smallest MSE. Experiments on hepatitis data show that the GA is efficient at finding optimal initial weights in a large search space. An MSE of 0.044 at K = 3 and the 5th replication shows that GAKMI yields a low error rate for hepatitis data with mixed attributes. In imputation-level testing, GAKMI produced r = 0.526, higher than the other imputation methods compared, and is therefore considered the best of them.
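The fitness idea in this abstract (weights that are scored by the MSE of the imputed values) can be sketched as follows. This is a minimal illustration, not the GAKMI implementation: the abstract does not say how the weights enter the distance, so the per-attribute weights, the fixed centroids, and the names `weighted_distance`, `impute`, and `mse_fitness` are all assumptions.

```python
import math

def weighted_distance(row, centroid, weights):
    """Weighted Euclidean distance over the observed (non-None) attributes."""
    return math.sqrt(sum(
        w * (x - c) ** 2
        for x, c, w in zip(row, centroid, weights)
        if x is not None
    ))

def impute(row, centroids, weights):
    """Fill missing attributes (None) with the values of the nearest centroid."""
    nearest = min(centroids, key=lambda c: weighted_distance(row, c, weights))
    return [c if x is None else x for x, c in zip(row, nearest)]

def mse_fitness(weights, incomplete, truth, centroids):
    """MSE over the imputed cells only; a GA would minimise this over weights."""
    errors = []
    for row, full in zip(incomplete, truth):
        filled = impute(row, centroids, weights)
        errors += [(f - t) ** 2 for x, f, t in zip(row, filled, full) if x is None]
    return sum(errors) / len(errors)

# Hypothetical toy data: two cluster centers, two records with one hidden value.
centroids = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
truth = [[0.2, 0.9, 0.0], [0.9, 0.3, 1.0]]
incomplete = [[0.2, 0.9, None], [0.9, 0.3, None]]

equal = mse_fitness([1.0, 1.0, 1.0], incomplete, truth, centroids)
tuned = mse_fitness([1.0, 0.1, 1.0], incomplete, truth, centroids)
```

In this toy case, down-weighting the second attribute moves the first record to the other cluster and removes its imputation error, which is the kind of improvement a GA searching over real-valued weight chromosomes would be rewarded for.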

2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

The Casse-tête board puzzle consists of an n×n grid covered with n^2 tokens. From the grid, m < n^2 tokens are deleted so that each row and column contains an even number of remaining tokens. The size of the search space is exponential. This study used a genetic algorithm (GA) to design and implement solutions for the board puzzle. The chromosome representation is a matrix of binary permutations. Variants of two crossover operators and two mutation operators were presented, and the four possible operator combinations were tested and compared. The study also compared GA-based and simulated annealing (SA)-based solutions, finding a 100% success rate (SR) for both. However, the GA-based model was more effective than the SA-based model on larger instances of the puzzle, and considerably more efficient when measured by the number of fitness function evaluations (FEs). The Wilcoxon signed-rank test confirms a significant difference between the FEs of the two models (p=0.038).
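The puzzle constraint described above lends itself to a simple fitness function. The sketch below is one plausible formulation, not the one used in the study: it assumes the grid is stored as a 0/1 matrix (1 means the token remains), and the names `parity_violations` and `fitness` are hypothetical.

```python
def parity_violations(grid):
    """Number of rows plus columns that hold an odd number of remaining tokens."""
    rows = sum(sum(row) % 2 for row in grid)
    cols = sum(sum(col) % 2 for col in zip(*grid))
    return rows + cols

def fitness(grid, m):
    """0 means solved: exactly m tokens deleted and every row/column is even."""
    n = len(grid)
    deleted = n * n - sum(map(sum, grid))
    return parity_violations(grid) + abs(deleted - m)

# 4x4 example with m = 4: deleting a 2x2 block keeps every row and column even.
full = [[1] * 4 for _ in range(4)]
solved = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
one_deleted = [[0, 1, 1, 1]] + [[1] * 4 for _ in range(3)]
```

A GA or SA run would minimise this value and stop when it reaches 0.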


2020 ◽  
Vol 30 (1) ◽  
pp. 142-164 ◽  
Author(s):  
Venkatesh SS ◽  
Deepak Mishra

This paper introduces a new variant of the genetic algorithm, developed to handle multivariable, multi-objective optimization problems with very large search spaces, such as solving systems of nonlinear equations. It is an integer-coded genetic algorithm with conventional crossover and mutation, but with an inverse algorithm that varies its search space by varying its digit length on every cycle, performing a fine search followed by a coarse search. Its solution to the optimization problem converges to a precise value over the cycles. Each equation of the system is treated as a single minimization objective function. Multiple objectives are converted to a single fitness function by summing their absolute values. Some difficult test functions for optimization and applications are used to evaluate the algorithm. The results show that the algorithm is capable of producing promising and precise results.
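The conversion of a system of equations into a single fitness value by summing absolute residuals can be illustrated with a small example. The two-equation system below is made up for illustration and does not come from the paper; `residuals` and `fitness` are hypothetical names.

```python
def residuals(x, y):
    """Left-hand sides of a toy system, each written as f_i(x, y) = 0."""
    return (
        x ** 2 + y ** 2 - 25,  # circle of radius 5
        x - y - 1,             # line y = x - 1
    )

def fitness(x, y):
    """Sum of absolute residuals; it is 0 exactly at a solution of the system."""
    return sum(abs(r) for r in residuals(x, y))

at_root = fitness(4, 3)    # (4, 3) satisfies both equations
off_root = fitness(0, 0)   # |−25| + |−1|
```

Each equation contributes one non-negative term, so minimising the sum drives all equations toward zero at the same time.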


2014 ◽  
Vol 602-605 ◽  
pp. 1348-1351 ◽  
Author(s):  
Ling Lu ◽  
Hua Zhang

Because micro UAVs are susceptible to wind during flight, this paper puts forward an ergodic track strategy for replanning the return path. By adding coverage flight and sacrificing path, it achieves safe recovery of the voyage. A genetic algorithm is used to solve the problem, and a dynamic penalty function is introduced to improve the fitness function and effectively reduce the search space. The simulation results show that the method can generate safe tracks that satisfy the path constraints.
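A dynamic penalty makes constraint violations cheap in early generations and expensive later. The abstract does not give the formula, so the sketch below uses one common form, cost + (c * generation)^alpha * violation, purely as an assumption; `penalised_cost`, `c`, and `alpha` are hypothetical.

```python
def penalised_cost(path_cost, violation, generation, c=0.5, alpha=2.0):
    """Path cost plus a penalty whose weight grows with the generation number.

    `violation` is a non-negative measure of how far the path breaks its
    constraints (0 for a feasible path).
    """
    return path_cost + (c * generation) ** alpha * violation

feasible = penalised_cost(10.0, 0.0, generation=50)
early = penalised_cost(10.0, 2.0, generation=1)
late = penalised_cost(10.0, 2.0, generation=4)
```

Feasible paths are unaffected, while the same infeasible path scores worse as the run proceeds, which gradually pushes the population out of the infeasible part of the search space.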


2003 ◽  
Vol 1831 (1) ◽  
pp. 210-218 ◽  
Author(s):  
Richard Balling ◽  
Michael Lowry ◽  
Mitsuru Saito

A new approach to regional land use and transportation planning, which uses a genetic algorithm as an integrated optimization tool, is presented. The approach is illustrated by applying it to the Wasatch Front Metropolitan Region, which consists of four counties in the state of Utah. This genetic algorithm-based approach was applied earlier to the twin cities of Provo and Orem in Utah, but here it is adapted to regional planning. Three issues make regional planning particularly difficult: (a) individual cities have significant planning autonomy, (b) the search space of possible plans is immense, and (c) preferences between competing objectives vary among stakeholders. The approach used here addresses the first issue by the way the problem is formulated. The second issue is addressed with a genetic algorithm. Such algorithms are particularly well suited to problems with large search spaces. The third issue is addressed by using a multiobjective fitness function in the genetic algorithm. It was found that a genetic algorithm could produce a set of nondominated future land use scenarios and street plans for a region, from which regional planners can make a selection. Execution of the algorithm to produce 100 plans per generation for 100 generations took about 4 days with a high-end personal computer. Interesting trends for reducing change and traffic congestion were discovered.
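The notion of a nondominated set can be made concrete with a short filter. This sketch assumes every objective is minimised and uses made-up (change, congestion) scores; it is not the paper's fitness function, and `dominates` and `nondominated` are hypothetical names.

```python
def dominates(a, b):
    """True if plan a is no worse than b on every objective and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(plans):
    """Keep the plans that no other plan dominates (the Pareto front)."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]

# Hypothetical (change, congestion) scores for four candidate plans.
plans = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = nondominated(plans)
```

Here (3, 3) is dropped because (2, 2) is better on both objectives; the remaining plans are trade-offs among which planners would choose.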


2012 ◽  
Vol 2012 ◽  
pp. 1-11 ◽  
Author(s):  
Satoko Kikuchi ◽  
Goutam Chakraborty

To decode a long genome sequence, shotgun sequencing is the state-of-the-art technique. It needs to properly sequence a very large number, sometimes millions, of short, partially readable strings (fragments). Arranging those fragments in the correct sequence is known as fragment assembly, which is an NP-problem. Methods in current use require enormous computational cost. In this work, we show how our modified genetic algorithm (GA) can solve this problem efficiently. In the proposed GA, the length of the chromosome, which represents the volume of the search space, is reduced with advancing generations, which improves search efficiency. We also introduce a greedy mutation that swaps nearby fragments using heuristics to improve the fitness of chromosomes. We compared our results with Parsons’ algorithm, which is also based on a GA. We used fragments with partial reads on both sides, mimicking fragments in a real genome assembly process, whereas in Parsons’ work the base-pair array of the whole fragment is known. Even so, we obtained much better results and succeeded in reconstructing contigs covering 100% of the genome sequences.


1997 ◽  
Vol 5 (3) ◽  
pp. 277-302 ◽  
Author(s):  
Anne M. Raich ◽  
Jamshid Ghaboussi

A new representation combining redundancy and implicit fitness constraints is introduced that performs better than a simple genetic algorithm (GA) and a structured GA in experiments. The implicit redundant representation (IRR) consists of a string that is over-specified, allowing for sections of the string to remain inactive during function evaluation. The representation does not require the user to prespecify the number of parameters to evaluate or the location of these parameters within the string. This information is obtained implicitly by the fitness function during the GA operations. The good performance of the IRR can be attributed to several factors: less disruption of existing fit members due to the increased probability of crossovers and mutation affecting only redundant material; discovery of fit members through the conversion of redundant material into essential information; and the ability to enlarge or reduce the search space dynamically by varying the number of variables evaluated by the fitness function. The IRR GA provides a more biologically parallel representation that maintains a diverse population throughout the evolution process. In addition, the IRR provides the necessary flexibility to represent unstructured problem domains that do not have the explicit constraints required by fixed representations.


2021 ◽  
Vol 13 (2) ◽  
pp. 38-51
Author(s):  
Nasim Soltani Soulegan ◽  
Behrang Barekatain ◽  
Behzad Soleimani Neysiani

Cloud computing is a pattern for distributed, heterogeneous computing that draws on many resources, with requests that aim to share those resources. Cloud computing has recently been ranked among the top technologies globally, and it must be scheduled well to maximize providers’ profit and improve service quality for their customers. Scheduling specifies how users’ requests are assigned to virtual machines, and it plays a vital role in the efficiency and capability of the system. Its objective is to maximize throughput, or to complete jobs in minimum time and to the highest standard. Scheduling jobs in heterogeneous distributed systems is an NP-hard problem that cannot be solved in polynomial time for real-time scheduling. The time complexity of jobs grows exponentially, and this problem considerably affects the quality of cloud services and providers’ efficiency. Optimizing scheduling-related parameters with heuristic and meta-heuristic algorithms can reduce the search space complexity and execution time. This study presents a fitness function to minimize time and cost parameters. The proposed method uses a multi-objective weighted genetic algorithm covering six basic parameters: utility, task execution cost, response time, wait time, makespan, and throughput, to provide comprehensive optimization. The proposed approach improved response and wait times, throughput, makespan, and utility by 16, 9, 7, and 8 percent, respectively, with a change of only one cost unit, which is negligible. As a result, both providers and users will experience better services. The statistical tests show that the achieved improvement is valid for 94% of experiments.
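A weighted fitness over the six parameters named in this abstract could look like the sketch below. The abstract does not give the formula, weights, or normalisation, so everything here is an assumption: a plain weighted sum in which cost and time terms are added and utility and throughput are subtracted, with hypothetical names and numbers.

```python
MINIMISE = ("cost", "response_time", "wait_time", "makespan")
MAXIMISE = ("utility", "throughput")

def weighted_fitness(metrics, weights):
    """Lower is better: penalise cost/time terms, reward utility and throughput."""
    score = 0.0
    for name in MINIMISE:
        score += weights[name] * metrics[name]
    for name in MAXIMISE:
        score -= weights[name] * metrics[name]
    return score

# Hypothetical, already-normalised metric values for two candidate schedules.
weights = {name: 1.0 for name in MINIMISE + MAXIMISE}
schedule_a = {"cost": 4, "response_time": 2, "wait_time": 1,
              "makespan": 10, "utility": 3, "throughput": 5}
schedule_b = dict(schedule_a, makespan=8)

fit_a = weighted_fitness(schedule_a, weights)
fit_b = weighted_fitness(schedule_b, weights)
```

In practice the metrics would need to be scaled to comparable ranges before weighting, otherwise the term with the largest units dominates the sum.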


Author(s):  
Keshavamurthy B. N ◽  
Asad Mohammed Khan ◽  
Durga Toshniwal

Classification is a supervised learning technique in data mining used to extract hidden, useful knowledge from large databases by predicting class values from the values of the predicting attributes. Of the various techniques, the most widely discussed include decision trees, probabilistic models, and evolutionary algorithms. Recently, probabilistic and evolutionary techniques have received the most attention, because probabilistic models yield high accuracy when the problem has no attribute dependency, while evolutionary algorithms find optimal solutions over a large search space very quickly when little is known about the problem and the problem possesses attribute dependency. Although each model has tradeoffs, there is still scope to improve on existing approaches. The proposed approach improves an evolutionary technique, the genetic algorithm, by improving the fitness function parameters. We also compare the genetic algorithm results with Naïve Bayes results. For the experimental work we used the nursery data set from the UCI Machine Learning Repository.


2013 ◽  
Vol 427-429 ◽  
pp. 1514-1517
Author(s):  
Zong Jiang Wang

To address slow image matching, this article proposes the gray genetic algorithm (GGA), a new fast image matching method that combines gray connection theory with the genetic algorithm. The method first determines the parameter space of the problem, encodes it, and initializes the string set to obtain several initial points to be matched. The reference sequence and the comparison sequence are then constructed from the histogram information of the template image and of the current subimage being searched. Next, the fitness function is established on the gray interrelatedness between these two sequences. On this basis, the initialized string set evolves step by step toward the optimal region of the search space through genetic operations such as selection, crossover, and mutation, and finally approaches the optimal matching position arbitrarily closely. Because the GGA makes full use of the small-sample character of gray connection theory and the parallelism of genetic algorithm computation, the speed of image matching is clearly improved while a certain matching precision is maintained.


2013 ◽  
Vol 706-708 ◽  
pp. 1902-1906
Author(s):  
Rui Chen ◽  
Liang Fang

Considering the benefits of both passengers and the agency, this paper adopts real-value coding with the start time as the variable, and uses the penalty function method to add the various constraints to the objective function when constructing the fitness function, which simplifies the calculation. Simulation results are then obtained by using the improved genetic algorithm to solve the non-uniform grid schedule. The results show that the improved genetic algorithm can find an approximately optimal result in the huge search space and greatly increases computational efficiency.

