An Improved Genetic Algorithm for Text Clustering
The genetic algorithm (GA) is a self-adapted probability search method used to solve optimization problems, which has been applied widely in science and engineering. In this paper, we propose an improved variable string length genetic algorithm (IVGA) for text clustering. Our algorithm has been exploited for automatically evolving the optimal number of clusters as well as providing proper data set clustering. The chromosome is encoded by special indices to indicate the location of each gene. More effective version of evolutional steps can automatically adjust the influence between the diversity of the population and selective pressure during generations. The superiority of the improved genetic algorithm over conventional variable string length genetic algorithm (VGA) is demonstrated by providing proper text clustering.