Chaotic Tornadogenesis Optimization Algorithm for Data Clustering Problems

Author(s):  
Ravi Kumar Saidala ◽  
Nagaraju Devarakonda

This article describes clustering as an attractive and major task in data mining, in which a particular set of objects is grouped by similarity according to some criterion. Among the numerous algorithms, k-Means is one of the best and most efficient at addressing clustering problems. An expert system is considered good only if it returns the optimal data clusters. The challenge of optimal clustering lies in finding the optimal number of clusters and identifying all the data groups correctly, which is an NP-hard problem. Recently, a new optimization algorithm, the Tornadogenesis Optimization Algorithm (TOA), was developed to address these problems. However, the standard TOA is too often trapped in local optima and suffers from premature convergence. To overcome this, the article proposes a chaotic variant, CTOA. The main objective of embedding chaotic maps into standard TOA is to compute and automatically adapt the internal parameters. The proposed CTOA is first benchmarked on standard mathematical functions and later applied to 10 data clustering problems. The graphical and statistical results, along with comparisons, illustrate the capabilities of CTOA in terms of accuracy and robustness.
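The abstract does not say which chaotic maps CTOA uses or which internal parameters they drive. As a rough illustration of the general idea only, the sketch below iterates the logistic map, a commonly used chaotic map, and rescales its output into a parameter range. The function names, the starting value, and the range 0.1 to 0.9 are assumptions made for this example, not details taken from the article.

```python
def logistic_map(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate x_{k+1} = r * x_k * (1 - x_k) and return the visited values."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    values = []
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_parameter(x: float, low: float, high: float) -> float:
    """Rescale a chaotic value in [0, 1] to a hypothetical parameter range."""
    return low + (high - low) * x


# Hypothetical usage: one parameter value per optimizer iteration.
sequence = logistic_map(x0=0.7, steps=5)
params = [chaotic_parameter(x, low=0.1, high=0.9) for x in sequence]
```

In a chaotic variant of a metaheuristic, a sequence like `params` would typically replace a fixed or uniformly random control parameter, one value per iteration.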

2012 ◽  
Vol 3 (1) ◽  
pp. 1-20
Author(s):  
Amit Banerjee

In this paper, a multi-objective genetic algorithm for data clustering based on the robust fuzzy least trimmed squares estimator is presented. The proposed clustering methodology addresses two critical issues in unsupervised data clustering: the ability to produce a meaningful partition in noisy data, and the requirement that the number of clusters be known a priori. The multi-objective genetic algorithm-driven clustering technique optimizes the number of clusters as well as the cluster assignment and cluster prototypes. A two-parameter, mapped, fixed-point coding scheme is used to represent the assignment of data into the true retained set and the noisy trimmed set, and the optimal number of clusters in the retained set. A three-objective criterion is also used as the minimization functional for the multi-objective genetic algorithm. Results on well-known data sets from the literature suggest that the proposed methodology is superior to conventional fuzzy clustering algorithms that assume a known value for the optimal number of clusters.


2021 ◽  
Vol 25 (3) ◽  
pp. 605-626
Author(s):  
Chen Zhao ◽  
Zhongxin Liu ◽  
Zengqiang Chen ◽  
Yao Ning

The krill herd algorithm (KHA) is an emerging nature-inspired approach that has been successfully applied to optimization. However, KHA may get stuck in local optima owing to its poor exploitation. In this paper, the orthogonal learning (OL) mechanism is incorporated for the first time to enhance the performance of KHA, yielding an improved method named the orthogonal krill herd algorithm (OKHA). Compared with the existing hybridizations of KHA, OKHA can discover more useful information from historical data and construct a more promising solution. The proposed algorithm is applied to solve CEC2017 numerical problems, and its robustness is verified based on the simulation results. Moreover, OKHA is applied to tackle data clustering problems selected from the UCI Machine Learning Repository. The experimental results illustrate that OKHA is superior to or at least competitive with other representative clustering techniques.


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4086
Author(s):  
Tribhuvan Singh ◽  
Nitin Saxena ◽  
Manju Khurana ◽  
Dilbag Singh ◽  
Mohamed Abdalla ◽  
...  

The k-means algorithm is a widely accepted clustering method. However, its performance depends heavily on the initial cluster centers. Moreover, because of its weak exploration capability, it easily becomes stuck at local optima. Recently, a new metaheuristic called the Moth Flame Optimizer (MFO) was proposed to handle complex problems. MFO simulates the navigation behavior of moths in nature, known as transverse orientation. In various research works, the performance of MFO has been found quite satisfactory. This paper suggests a novel heuristic approach based on MFO to solve data clustering problems. To validate the competitiveness of the proposed approach, various experiments have been conducted using Shape and UCI benchmark datasets. The proposed approach is compared with five state-of-the-art algorithms over twelve datasets. The mean performance of the proposed algorithm is superior on 10 datasets and comparable on the remaining two. The analysis of experimental results confirms the efficacy of the suggested approach.
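The abstract does not state which objective the MFO-based approach minimizes. A common choice when a metaheuristic searches over cluster centers is the within-cluster sum of squared distances, and the sketch below uses it, as an assumption, to show how strongly the score depends on where the centers are placed. The data points and both sets of centers are made up for illustration.

```python
def sum_squared_error(points, centers):
    """Sum, over all points, of the squared distance to the nearest center."""
    total = 0.0
    for point in points:
        total += min(
            sum((p - c) ** 2 for p, c in zip(point, center))
            for center in centers
        )
    return total


# Two well-separated groups of points (illustrative data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

# Centers placed inside each group versus centers placed between the groups.
good_centers = [(0.0, 0.5), (10.0, 0.5)]
poor_centers = [(5.0, 0.0), (5.0, 1.0)]

good_score = sum_squared_error(points, good_centers)
poor_score = sum_squared_error(points, poor_centers)
```

A metaheuristic would treat a candidate solution as a flat vector of center coordinates and use a function like `sum_squared_error` as its fitness; lower is better.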


2018 ◽  
Vol 14 (1) ◽  
pp. 11-23
Author(s):  
Lin Zhang ◽  
Yanling He ◽  
Huaizhi Wang ◽  
Hui Liu ◽  
Yufei Huang ◽  
...  

Background: RNA methylome has been discovered as an important layer of gene regulation and can be profiled directly with count-based measurements from high-throughput sequencing data. Although the detailed regulatory circuit of the epitranscriptome remains uncharted, clustering effect in methylation status among different RNA methylation sites can be identified from transcriptome-wide RNA methylation profiles and may reflect the epitranscriptomic regulation. Count-based RNA methylation sequencing data has unique features, such as low reads coverage, which calls for novel clustering approaches.
Objective: Besides the low reads coverage, it is also necessary to keep the integer property to approach clustering analysis of count-based RNA methylation sequencing data.
Method: We proposed a nonparametric generative model together with its Gibbs sampling solution for clustering analysis. The proposed approach implements a beta-binomial mixture model to capture the clustering effect in methylation level with the original count-based measurements rather than an estimated continuous methylation level. Besides, it adopts a nonparametric Dirichlet process to automatically determine an optimal number of clusters so as to avoid the common model selection problem in clustering analysis.
Results: When tested on the simulated system, the method demonstrated improved clustering performance over hierarchical clustering, K-means, MClust, NMF and EMclust. It also revealed on real dataset two novel RNA N6-methyladenosine (m6A) co-methylation patterns that may be induced directly by METTL14 and WTAP, which are two known regulatory components of the RNA m6A methyltransferase complex.
Conclusion: Our proposed DPBBM method not only properly handles the count-based measurements of RNA methylation data from sites of very low reads coverage, but also learns an optimal number of clusters adaptively from the data analyzed.
Availability: The source code and documents of DPBBM R package are freely available through the Comprehensive R Archive Network (CRAN): https://cran.r-project.org/web/packages/DPBBM/.
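To make the beta-binomial component concrete, the sketch below evaluates the standard beta-binomial log-probability of observing k methylated reads out of n total reads at one site. It is written in Python for illustration and is not the DPBBM R package's code or API; the function and parameter names are hypothetical, and the Dirichlet process and Gibbs sampling steps are not shown.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_binom(n: int, k: int) -> float:
    """Log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def beta_binomial_logpmf(k: int, n: int, alpha: float, beta: float) -> float:
    """Log P(k | n, alpha, beta) = log C(n,k) + log B(k+alpha, n-k+beta) - log B(alpha, beta)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return log_binom(n, k) + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


# With alpha = beta = 1 the distribution is uniform over 0..n.
uniform_prob = math.exp(beta_binomial_logpmf(2, 4, 1.0, 1.0))

# Probabilities over all possible counts sum to one.
total_prob = sum(math.exp(beta_binomial_logpmf(k, 10, 2.0, 3.0)) for k in range(11))
```

Working directly with the counts k and n, rather than the ratio k/n, is what lets a model of this kind treat a site with 2 of 4 reads differently from one with 200 of 400.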


Author(s):  
Prachi Agrawal ◽  
Talari Ganesh ◽  
Ali Wagdy Mohamed

This article proposes a novel binary version of the recently developed Gaining Sharing knowledge-based optimization algorithm (GSK) to solve binary optimization problems. The GSK algorithm is based on the concept of how humans acquire and share knowledge during their life span. The binary version, named the novel binary Gaining Sharing knowledge-based optimization algorithm (NBGSK), depends mainly on two binary stages: a binary junior gaining sharing stage and a binary senior gaining sharing stage with knowledge factor 1. These two stages enable NBGSK to explore and exploit the search space efficiently and effectively to solve problems in binary space. Moreover, to enhance the performance of NBGSK and prevent the solutions from becoming trapped in local optima, NBGSK with population size reduction (PR-NBGSK) is introduced. It decreases the population size gradually with a linear function. The proposed NBGSK and PR-NBGSK are applied to a set of knapsack instances with small and large dimensions, and the results show that NBGSK and PR-NBGSK are more efficient and effective in terms of convergence, robustness, and accuracy.
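The abstract says only that PR-NBGSK shrinks the population with a linear function. One plausible form, assumed here rather than taken from the article, interpolates linearly from a maximum size to a minimum size as the evaluation budget is consumed. All names and numbers in the sketch are hypothetical.

```python
def population_size(evals_used: int, max_evals: int, n_max: int, n_min: int) -> int:
    """Linearly interpolate the population size from n_max down to n_min."""
    if not 0 <= evals_used <= max_evals:
        raise ValueError("evals_used must lie between 0 and max_evals")
    fraction = evals_used / max_evals
    return round(n_max + (n_min - n_max) * fraction)


# Hypothetical schedule: 100 individuals at the start, 12 at the end.
schedule = [population_size(e, 1000, n_max=100, n_min=12) for e in range(0, 1001, 250)]
```

After each generation, an algorithm using such a schedule would drop its worst individuals until the population matches the scheduled size.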

