An Improved Genetic Algorithm for Document Clustering on the Cloud

This article presents a modified genetic algorithm for text document clustering on the cloud. Traditional genetic-algorithm approaches to document clustering represent chromosomes as sequences of cluster centroids and do not split centroids during crossover. This limits the variation the algorithm can introduce into the population, so it tends to become trapped in local minima. In the proposed approach, a crossover point may also fall inside a cluster centroid, which allows some centroids to be modified. This helps the algorithm escape local minima and find better solutions than the traditional approaches. Moreover, instead of running a single genetic algorithm, as the traditional approaches do, the article partitions the population and runs a genetic algorithm on each partition. This makes it possible to run different parts of the algorithm simultaneously on different virtual machines in a cloud environment. Experimental results show that the accuracy of the proposed approach is at least 4% higher than that of the other approaches.
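
The two ideas in this abstract, a crossover point that may fall inside a centroid and a population split into independently evolved partitions, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the flat chromosome layout, the single-point crossover, the round-robin partitioning, and every function name here (`flatten`, `unflatten`, `crossover`, `partition`) are assumptions made for illustration.

```python
import random

def flatten(centroids):
    """Concatenate k centroids (each a list of d floats) into one flat chromosome."""
    return [gene for centroid in centroids for gene in centroid]

def unflatten(chromosome, dim):
    """Split a flat chromosome back into centroids of length dim."""
    return [chromosome[i:i + dim] for i in range(0, len(chromosome), dim)]

def crossover(parent_a, parent_b, dim, inside_centroid=True, rng=random):
    """Single-point crossover on two flat chromosomes of equal length.

    inside_centroid=False restricts the cut to centroid boundaries
    (the traditional scheme described in the abstract); True allows any
    gene position, so a centroid can be split and therefore modified.
    Assumes at least two centroids per chromosome.
    """
    n = len(parent_a)
    if inside_centroid:
        point = rng.randrange(1, n)                # any interior gene position
    else:
        point = dim * rng.randrange(1, n // dim)   # multiples of dim only
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b, point

def partition(population, n_parts):
    """Round-robin split of a population into n_parts sub-populations.

    Each part could then be evolved by its own genetic algorithm,
    for example on a separate virtual machine.
    """
    return [population[i::n_parts] for i in range(n_parts)]
```

With boundary-only cuts every child centroid is copied whole from one parent; with free cuts the centroid containing the cut point mixes genes from both parents, which is the extra source of variation the abstract refers to.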

An improved genetic algorithm for task scheduling in the cloud environments using the priority queues: Formal verification, simulation, and statistical testing

Journal of Systems and Software ◽ 10.1016/j.jss.2016.07.006 ◽ 2017 ◽ Vol 124 ◽ pp. 1-21 ◽ Cited By ~ 115

Author(s): Bahman Keshanchi ◽ Alireza Souri ◽ Nima Jafari Navimipour

Keyword(s): Genetic Algorithm ◽ Formal Verification ◽ Task Scheduling ◽ Priority Queues ◽ Statistical Testing ◽ Improved Genetic Algorithm ◽ Cloud Environments

An Improved Genetic Algorithm for Document Clustering with Semantic Similarity Measure

2008 Fourth International Conference on Natural Computation ◽ 10.1109/icnc.2008.374 ◽ 2008 ◽ Cited By ~ 3

Author(s): Wei Song ◽ Soon Cheol Park

Keyword(s): Genetic Algorithm ◽ Semantic Similarity ◽ Similarity Measure ◽ Document Clustering ◽ Improved Genetic Algorithm ◽ Semantic Similarity Measure

Hybrid Combination of Error Back Propagation and Genetic Algorithm for Text Document Clustering

International Journal of Computer Trends and Technology ◽ 10.14445/22312803/ijctt-v68i11p109 ◽ 2020 ◽ Vol 68 (11) ◽ pp. 64-68

Author(s): Ashwani Mathur

Keyword(s): Genetic Algorithm ◽ Document Clustering ◽ Back Propagation ◽ Error Back Propagation ◽ Hybrid Combination ◽ Text Document

A Solution to Reconstruct Cross-Cut Shredded Text Documents Based on Character Recognition and Genetic Algorithm

Abstract and Applied Analysis ◽ 10.1155/2014/829602 ◽ 2014 ◽ Vol 2014 ◽ pp. 1-11 ◽ Cited By ~ 3

Author(s): Hedong Xu ◽ Jing Zheng ◽ Ziwei Zhuang ◽ Suohai Fan

Keyword(s): Genetic Algorithm ◽ Character Recognition ◽ Clustering Algorithm ◽ Feature Matching ◽ Travelling Salesman Problem ◽ Improved Genetic Algorithm ◽ Text Documents ◽ Matching Algorithm ◽ Text Document ◽ Line Spacing

The reconstruction of destroyed paper documents has attracted growing interest in recent years. The topic is relevant to forensics, investigative sciences, and archeology. Previous research on the reconstruction of cross-cut shredded text documents (RCCSTD) is mainly based on likelihood and traditional heuristic algorithms. This paper presents a feature-matching algorithm based on character recognition, using an established database of letters, that reconstructs the shredded document by row clustering, intrarow splicing, and interrow splicing. Row clustering is performed by a clustering algorithm according to the clustering vectors of the fragments. Intrarow splicing, treated as a travelling salesman problem, is solved by an improved genetic algorithm. Finally, the document is reconstructed by interrow splicing according to the line spacing and the proximity of the fragments. Computational experiments suggest that the presented algorithm achieves high precision and efficiency, and that it may be useful for cross-cut shredded text documents of different sizes.
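
Treating intrarow splicing as a travelling salesman problem amounts to choosing the left-to-right order of fragments that minimizes the total mismatch between neighbouring edges. The Python sketch below shows only that objective. The `mismatch` matrix is a hypothetical input (the abstract does not say how edge costs are computed), the ordering is scored as an open path rather than a closed tour, and exhaustive search over permutations stands in for the paper's improved genetic algorithm, which is not specified here.

```python
from itertools import permutations

def splice_cost(order, mismatch):
    """Total cost of placing fragments left to right in the given order.

    mismatch[a][b] is an assumed cost of putting fragment b immediately
    to the right of fragment a (lower means a better edge match).
    """
    return sum(mismatch[a][b] for a, b in zip(order, order[1:]))

def best_order_bruteforce(mismatch):
    """Exhaustive search over all orderings; feasible only for a few fragments.

    A genetic algorithm would search the same space with the same cost
    function when the number of fragments makes enumeration impractical.
    """
    n = len(mismatch)
    return min(permutations(range(n)), key=lambda order: splice_cost(order, mismatch))
```

In a genetic-algorithm setting, `splice_cost` would play the role of the fitness function, and each chromosome would be one permutation of the fragments in a row.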

Improved Genetic Algorithm for Monitoring of Virtual Machines in Cloud Environment

Smart Intelligent Computing and Applications - Smart Innovation, Systems and Technologies ◽ 10.1007/978-981-13-1927-3_34 ◽ 2018 ◽ pp. 319-326 ◽ Cited By ~ 2

Author(s): Sayantani Basu ◽ G. Kannayaram ◽ Somula Ramasubbareddy ◽ C. Venkatasubbaiah

Keyword(s): Genetic Algorithm ◽ Virtual Machines ◽ Cloud Environment ◽ Improved Genetic Algorithm

An improved genetic algorithm using greedy strategy toward task scheduling optimization in cloud environments

Neural Computing and Applications ◽ 10.1007/s00521-019-04119-7 ◽ 2019 ◽ Vol 32 (6) ◽ pp. 1531-1541 ◽ Cited By ~ 10

Author(s): Zhou Zhou ◽ Fangmin Li ◽ Huaxi Zhu ◽ Houliang Xie ◽ Jemal H. Abawajy ◽ ...

Keyword(s): Genetic Algorithm ◽ Task Scheduling ◽ Improved Genetic Algorithm ◽ Scheduling Optimization ◽ Greedy Strategy ◽ Cloud Environments

Deadline-Constrained Cost-Effective Load-Balanced Improved Genetic Algorithm for Workflow Scheduling

International Journal of Information Technology and Web Engineering ◽ 10.4018/ijitwe.2021100101 ◽ 2021 ◽ Vol 16 (4) ◽ pp. 1-34

Author(s): Sandeep Kumar Bothra ◽ Sunita Singhal ◽ Hemlata Goyal

Keyword(s): Genetic Algorithm ◽ Virtual Machines ◽ Cost Effective ◽ Critical Issue ◽ Scientific Workflow ◽ Workflow Scheduling ◽ Improved Genetic Algorithm ◽ Greedy Strategy ◽ Mutation Operators ◽ Load Balanced

Resource scheduling in a cloud computing environment is important for executing scientific workflows cost-effectively under a deadline constraint. Although various researchers have proposed meta-heuristic and heuristic approaches to this critical issue, none meets strict deadline conditions while balancing load among machines. This article proposes an improved genetic algorithm that initializes the population with a greedy strategy. The greedy strategy assigns each task to an underloaded virtual machine instead of assigning tasks to machines at random. In general workflow scheduling, task dependency is tested after each crossover and mutation operator of the genetic algorithm; here the authors test it only after the mutation operation, which yields better results. The proposed model also considers the booting time and performance variation of virtual machines. The authors compared the algorithm with previously developed heuristics and metaheuristics and found that it increases hit rate and load balance. It also reduces execution time and cost.
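
The greedy initialization described above can be sketched in Python as follows. This is an illustrative reading, not the authors' code: "underloaded" is interpreted as "the virtual machine with the smallest accumulated execution time so far", task dependencies, booting time, and performance variation are ignored, and the names `greedy_assign`, `task_lengths`, and `vm_speeds` are hypothetical.

```python
def greedy_assign(task_lengths, vm_speeds):
    """Build one chromosome by sending each task to the least-loaded VM.

    task_lengths: work units per task (assumed input).
    vm_speeds: work units each VM processes per time unit (assumed input).
    Returns (assignment, loads), where assignment[i] is the VM index
    chosen for task i and loads[j] is the total execution time on VM j.
    """
    loads = [0.0] * len(vm_speeds)
    assignment = []
    for length in task_lengths:
        # Pick the VM with the smallest accumulated load; ties go to the lowest index.
        vm = min(range(len(loads)), key=lambda i: loads[i])
        assignment.append(vm)
        loads[vm] += length / vm_speeds[vm]
    return assignment, loads
```

For example, `greedy_assign([4, 2, 2, 1], [1.0, 1.0])` returns `([0, 1, 1, 0], [5.0, 4.0])`. A purely greedy rule yields a single deterministic chromosome, so a full initial population would presumably need some additional source of diversity, such as shuffling the task order; that step is an assumption and is not described in the abstract.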
