Hybrid Combination of Error Back Propagation and Genetic Algorithm for Text Document Clustering

Author(s):  
Ashwani Mathur
2018 ◽  
Vol 8 (4) ◽  
pp. 20-28
Author(s):  
Ruksana Akter ◽  
Yoojin Chung

This article presents a modified genetic algorithm for text document clustering on the cloud. Traditional genetic algorithm approaches to document clustering represent chromosomes based on cluster centroids and do not divide cluster centroids during crossover operations. This limits the algorithm's ability to introduce variation into the population, so it tends to become trapped in local minima. In the proposed approach, a crossover point may be selected even at a position inside a cluster centroid, which allows some cluster centroids to be modified. This helps the algorithm escape local minima and find better solutions than the traditional approaches. Moreover, instead of running a single genetic algorithm as the traditional approaches do, this article partitions the population and runs a genetic algorithm on each partition. This makes it possible to run different parts of the algorithm simultaneously on different virtual machines in cloud environments. Experimental results also demonstrate that the accuracy of the proposed approach is at least 4% higher than that of the other approaches.
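The two ideas in this abstract, a crossover cut that may fall inside a centroid and a population split into independently evolved partitions, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`flatten`, `unflatten`, `crossover_any_point`, `partition`), the single-point crossover scheme, the round-robin split, and the toy centroids are all assumptions.

```python
import random

def flatten(centroids):
    """Concatenate k centroids (each a list of d floats) into one flat chromosome."""
    return [gene for centroid in centroids for gene in centroid]

def unflatten(chromosome, dim):
    """Split a flat chromosome back into centroids of length dim."""
    return [chromosome[i:i + dim] for i in range(0, len(chromosome), dim)]

def crossover_any_point(parent_a, parent_b, dim, rng):
    """Single-point crossover where the cut may fall inside a centroid."""
    flat_a, flat_b = flatten(parent_a), flatten(parent_b)
    # Any gene boundary is allowed, not only multiples of dim, so the
    # centroid that contains the cut is rebuilt from both parents.
    cut = rng.randrange(1, len(flat_a))
    child_a = flat_a[:cut] + flat_b[cut:]
    child_b = flat_b[:cut] + flat_a[cut:]
    return unflatten(child_a, dim), unflatten(child_b, dim), cut

def partition(population, n_parts):
    """Split a population into n_parts sub-populations (round-robin, an assumption)."""
    return [population[i::n_parts] for i in range(n_parts)]

rng = random.Random(0)
parent_a = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]  # two hypothetical 3-dimensional centroids
parent_b = [[5.0, 5.0, 5.0], [9.0, 9.0, 9.0]]
child_a, child_b, cut = crossover_any_point(parent_a, parent_b, dim=3, rng=rng)
islands = partition(list(range(10)), 3)  # each part could be evolved on its own VM
```

Each sub-population returned by `partition` could then be handed to a separate genetic algorithm run; how the partial results are merged is not described in the abstract and is left open here.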


Author(s):  
Laith Mohammad Abualigah ◽  
Essam Said Hanandeh ◽  
Ahamad Tajudin Khader ◽  
Mohammed Abdallh Otair ◽  
Shishir Kumar Shandilya

Background: The volume of text documents on Internet pages keeps increasing, and handling such a large amount of information is difficult. Text clustering is a common optimization problem in which a large collection of text is organized into a set of coherent clusters of similar documents. Aims: This paper presents a novel local clustering technique, β-hill climbing, which is modeled to solve the text document clustering problem by partitioning similar documents into the same cluster. Methods: The β parameter is the primary innovation of the β-hill climbing technique. It was introduced to balance local and global search. Local search methods such as the k-medoid and k-means techniques have been applied successfully to the text document clustering problem. Results: Experiments were conducted on eight standard benchmark text datasets with different characteristics taken from the Laboratory of Computational Intelligence (LABIC). The results showed that the proposed β-hill climbing achieved better results than the original hill climbing technique on the text clustering problem. Conclusion: Adding the β operator to hill climbing improves text clustering performance.
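A rough sketch of how a β operator can be added to hill climbing for clustering is given below. It is an assumption-laden illustration, not the paper's method: the solution encoding (one cluster label per document), the sum-of-squared-distances objective, the values of `beta` and `iterations`, and the toy two-dimensional points are all chosen here for brevity, and the paper may use a different representation and similarity measure.

```python
import random

def cost(points, labels, k):
    """Sum of squared distances from each point to the mean of its cluster."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j in range(dim):
            sums[c][j] += p[j]
    total = 0.0
    for p, c in zip(points, labels):
        for j in range(dim):
            total += (p[j] - sums[c][j] / counts[c]) ** 2
    return total

def beta_hill_climbing(points, k, beta=0.05, iterations=500, seed=0):
    """Hill climbing with a neighbourhood move plus a beta (random reset) operator."""
    rng = random.Random(seed)
    current = [rng.randrange(k) for _ in points]
    current_cost = cost(points, current, k)
    for _ in range(iterations):
        candidate = current[:]
        # Neighbourhood move: reassign one random document (local search).
        candidate[rng.randrange(len(points))] = rng.randrange(k)
        # Beta operator: reset each label with probability beta (exploration).
        for i in range(len(candidate)):
            if rng.random() < beta:
                candidate[i] = rng.randrange(k)
        candidate_cost = cost(points, candidate, k)
        # Accept only candidates that are no worse than the current solution.
        if candidate_cost <= current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost

# Toy stand-in for document vectors: two loose groups in 2-D.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels, final_cost = beta_hill_climbing(points, k=2)
```

With `beta=0` the loop reduces to plain hill climbing, which is one way to see the role of the β parameter as a knob between local and global search.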


Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1082
Author(s):  
Fanqiang Meng

Risk and security are two symmetric descriptions of the uncertainty of the same system. If risk early warning is carried out in time, the security capability of the system can be improved. A safety early warning model based on fuzzy c-means clustering (FCM) and a back-propagation neural network was established, and a genetic algorithm was introduced to optimize the connection weights and other properties of the neural network, so as to construct a safety early warning system for a coal mining face. The system was applied to a coal face in Shandong, China, with 46 groups of data as samples. First, the original data were clustered by FCM: the input space was fuzzily partitioned, and the samples were grouped into three categories. Then, the clustered data were used as the input of the neural network for training and prediction. The back-propagation neural network and the neural network optimized by the genetic algorithm were each trained and verified many times. The results show that the early warning model can realize the prediction and early warning of the safety condition of the working face, and that the neural network model optimized by the genetic algorithm performs better than the traditional back-propagation artificial neural network model, with higher prediction accuracy and faster convergence. The established early warning model and method can provide a reference and basis for the prediction, early warning, and risk management of coal mine production safety, so that hidden dangers of working face accidents can be discovered as early as possible and eliminated in time, and the accident probability reduced as far as possible.
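The FCM step mentioned in this abstract can be illustrated with the standard fuzzy c-means updates. The sketch below is not the authors' code: the function name `fcm`, the fuzzifier `m = 2`, the fixed iteration count, the random initial memberships, and the one-dimensional toy samples are assumptions made for the example.

```python
import random

def fcm(points, c, m=2.0, iterations=50, seed=0):
    """Standard fuzzy c-means: returns cluster centres and the membership matrix."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iterations):
        # Centre update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        for i in range(n):
            dist = [max(sum((points[i][d] - centers[j][d]) ** 2
                            for d in range(dim)) ** 0.5, 1e-12)
                    for j in range(c)]
            for j in range(c):
                u[i][j] = 1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
    return centers, u

# Hypothetical 1-D samples; three clusters mirror the three categories in the abstract.
samples = [[0.1], [0.2], [0.15], [1.0], [1.1], [0.95], [2.0], [2.1], [1.9]]
centers, memberships = fcm(samples, c=3)
```

Each row of `memberships` gives the degree to which a sample belongs to each cluster. How these memberships are fed into the neural network is not specified in the abstract, so that step is omitted.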


2013 ◽  
Vol 2013 ◽  
pp. 1-8
Author(s):  
Haisheng Song ◽  
Ruisong Xu ◽  
Yueliang Ma ◽  
Gaofei Li

The back propagation neural network (BPNN) algorithm can be used as a supervised classifier for remote sensing images. But its defects are obvious: it easily falls into local minima, converges slowly, and makes it difficult to determine the number of hidden layer nodes. The genetic algorithm (GA) has the advantage of global optimization and does not easily fall into local minima, but its local search capability is poor. This paper uses GA to generate the initial structure of the BPNN. Then, a stable, efficient, and fast BP classification network is obtained by fine adjustment with the improved BP algorithm. Finally, we use the hybrid algorithm to classify remote sensing images and compare it with the improved BP algorithm and the traditional maximum likelihood classification (MLC) algorithm. Experimental results show that the hybrid algorithm outperforms both the improved BP algorithm and the MLC algorithm.
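The general hybrid pattern described here, a global GA search followed by local refinement with back propagation, might look roughly like the sketch below. It is a simplified assumption, not the paper's algorithm: the GA here only searches initial weights of a fixed 2-3-1 network (the paper's "initial structure" may also cover hidden layer size), plain gradient descent stands in for the "improved BP" variant, and the XOR data, population size, mutation rate, and learning rate are arbitrary illustrative choices.

```python
import math
import random

DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
HIDDEN = 3
N_WEIGHTS = 3 * HIDDEN + HIDDEN + 1  # per hidden unit: 2 weights + bias; output: HIDDEN weights + bias

def forward(w, x):
    """Return the tanh hidden activations and the sigmoid output for input x."""
    hidden = [math.tanh(w[3 * j] * x[0] + w[3 * j + 1] * x[1] + w[3 * j + 2])
              for j in range(HIDDEN)]
    base = 3 * HIDDEN
    z = sum(w[base + j] * hidden[j] for j in range(HIDDEN)) + w[base + HIDDEN]
    z = max(-60.0, min(60.0, z))  # guard against overflow in exp
    return hidden, 1.0 / (1.0 + math.exp(-z))

def loss(w):
    """Mean squared error over the toy data set."""
    return sum((forward(w, x)[1] - y) ** 2 for x, y in DATA) / len(DATA)

def gradient(w):
    """Back propagation: gradient of the mean squared error w.r.t. every weight."""
    g = [0.0] * N_WEIGHTS
    base = 3 * HIDDEN
    for x, y in DATA:
        hidden, out = forward(w, x)
        dz = 2.0 * (out - y) * out * (1.0 - out) / len(DATA)
        for j in range(HIDDEN):
            g[base + j] += dz * hidden[j]
            dpre = dz * w[base + j] * (1.0 - hidden[j] ** 2)
            g[3 * j] += dpre * x[0]
            g[3 * j + 1] += dpre * x[1]
            g[3 * j + 2] += dpre
        g[base + HIDDEN] += dz
    return g

def ga_initial_weights(rng, pop_size=20, generations=30, sigma=0.3):
    """Global search: evolve weight vectors and return the fittest one."""
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        next_pop = [pop[0][:], pop[1][:]]  # elitism: keep the two best unchanged
        while len(next_pop) < pop_size:
            a, b = rng.sample(pop[:pop_size // 2], 2)  # parents from the better half
            child = []
            for gene_a, gene_b in zip(a, b):
                gene = rng.choice((gene_a, gene_b))  # uniform crossover
                if rng.random() < 0.2:               # Gaussian mutation
                    gene += rng.gauss(0.0, sigma)
                child.append(gene)
            next_pop.append(child)
        pop = next_pop
    return min(pop, key=loss)

def backprop_finetune(w, steps=2000, lr=0.5):
    """Local refinement by gradient descent; returns the best weights seen."""
    w = w[:]
    best, best_loss = w[:], loss(w)
    for _ in range(steps):
        g = gradient(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
        current = loss(w)
        if current < best_loss:
            best, best_loss = w[:], current
    return best

rng = random.Random(0)
w_ga = ga_initial_weights(rng)       # GA supplies the starting point
w_final = backprop_finetune(w_ga)    # BP makes the fine adjustments
```

The split of work follows the abstract's reasoning: the GA supplies a starting point from a broad search, and gradient descent then does the fine local adjustment that the GA is poor at.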

