KIDBSCAN: A New Efficient Data Clustering Algorithm

Data clustering is essential for many data-analytics applications. Although the literature offers many data clustering algorithms, there is always room for more efficient ones, because the volume of data and its use keep growing. Clustering may be applied to any data format, such as text, images, audio, or video. Given the increasing use of digital images, this work presents a clustering algorithm for digital images that is based on colour distance and an Improvised DBSCAN (IDBSCAN) algorithm. The proposed IDBSCAN eliminates the tedious process of choosing the initial parameters ε and MinPts by setting them automatically. Performance is analysed in terms of clustering accuracy, precision, recall, F-measure, and time consumption. The proposed work outperforms the existing approaches with reasonable time consumption.
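The abstract does not say how IDBSCAN derives its parameters, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain Euclidean distance between RGB triples as the colour distance, an assumed fixed `MIN_PTS` of 4, and a common stand-in heuristic that takes the radius from the largest jump in the sorted k-nearest-neighbour distances. The pixel values and all function names are hypothetical.

```python
import math

def colour_distance(p, q):
    """Euclidean distance between two RGB triples (assumed colour distance)."""
    return math.dist(p, q)

def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted ascending."""
    result = []
    for i, p in enumerate(points):
        d = sorted(colour_distance(p, q) for j, q in enumerate(points) if j != i)
        result.append(d[k - 1])
    return sorted(result)

def estimate_eps(points, k):
    """Heuristic: take the k-distance just before the largest jump in the curve."""
    kd = k_distances(points, k)
    gaps = [kd[i + 1] - kd[i] for i in range(len(kd) - 1)]
    return kd[gaps.index(max(gaps))]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points)
                if colour_distance(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:   # only core points expand the cluster
                queue.extend(nb)
    return labels

# Hypothetical pixels: four reddish, four bluish, one grey outlier.
pixels = [
    (250, 10, 10), (245, 12, 8), (252, 8, 14), (248, 15, 11),
    (10, 10, 250), (12, 14, 246), (8, 9, 252), (15, 11, 248),
    (128, 128, 128),
]
MIN_PTS = 4                                  # assumed, not from the paper
eps = estimate_eps(pixels, k=MIN_PTS - 1)    # radius chosen from the data
labels = dbscan(pixels, eps, MIN_PTS)
print(labels)
```

On this toy input the red and blue pixels form two clusters and the grey pixel is left as noise, without a hand-picked radius.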

Efficient data clustering algorithm designed using a heuristic approach

International Journal of Data Analysis Techniques and Strategies ◽ 10.1504/ijdats.2021.10037313 ◽ 2021 ◽ Vol 13 (1/2) ◽ pp. 3

Author(s): Meeta Singh ◽ Deepa Bura ◽ Poonam Nandal

Keyword(s): Data Clustering ◽ Clustering Algorithm ◽ Heuristic Approach ◽ Efficient Data

An Improved Pigeon-Inspired Optimization for Clustering Analysis Problems

International Journal of Computational Intelligence and Applications ◽ 10.1142/s1469026817500146 ◽ 2017 ◽ Vol 16 (02) ◽ pp. 1750014 ◽ Cited By ~ 5

Author(s): Haiyun Li ◽ Haifeng Li ◽ Xin Chen ◽ Kaibin Wei

Keyword(s): Clustering Analysis ◽ Data Clustering ◽ Clustering Algorithm ◽ Heuristic Algorithms ◽ Initial Population ◽ Local Optima ◽ Parametric Control ◽ Object Based ◽ Efficient Data ◽ Efficient Alternative

Clustering is an important technique in data mining that attempts to partition a set of objects into clusters based on the values of their attributes. K-means is a simple and efficient data clustering algorithm. However, it depends heavily on the initial solution and is easily trapped in local optima. In contrast, meta-heuristic algorithms perform well at escaping local optima. In this paper, we propose an improved pigeon-inspired optimization (IPIO) algorithm to resolve this problem. The algorithm uses an object-based initialization method to generate the initial population and introduces a parametric control strategy to steer the flying direction. In addition, the climb process of the monkey algorithm (MA), with dimension-by-dimension improvement, is adopted to strengthen the local search ability. Experiments on six real datasets are conducted to validate the effectiveness of IPIO. The experimental results show that IPIO is an efficient alternative for solving the clustering analysis problem.
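The dependence on the initial solution can be seen on a very small example. The sketch below is not IPIO; it is plain Lloyd-style K-means run twice on four hypothetical points with two different, hand-picked starting centres. All names and data are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

def kmeans(points, centres, max_iter=100):
    """Lloyd's algorithm from given starting centres; returns (centres, SSE)."""
    for _ in range(max_iter):
        clusters = [[] for _ in centres]
        for p in points:
            idx = min(range(len(centres)), key=lambda c: sq_dist(p, centres[c]))
            clusters[idx].append(p)
        # Keep the old centre if a cluster happens to be empty.
        new_centres = [mean(cl) if cl else centres[i]
                       for i, cl in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    sse = sum(min(sq_dist(p, c) for c in centres) for p in points)
    return centres, sse

# Four corners of a wide rectangle: the natural split is left versus right.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

good_centres, good_sse = kmeans(points, [(0, 0), (4, 0)])
bad_centres, bad_sse = kmeans(points, [(2, 0), (2, 1)])

print(good_centres, good_sse)  # left/right split
print(bad_centres, bad_sse)    # top/bottom split, a worse local optimum
```

Both runs converge, but the second start is stuck with a top/bottom split whose sum of squared errors is sixteen times larger. This is the kind of trap that population-based or improved initialization schemes aim to avoid.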

Tree-Based Algorithm for Stable and Efficient Data Clustering

Informatics ◽ 10.3390/informatics7040038 ◽ 2020 ◽ Vol 7 (4) ◽ pp. 38

Author(s): Hasan Aljabbouli ◽ Abdullah Albizri ◽ Antoine Harfouche

Keyword(s): Data Structure ◽ Data Clustering ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Convergence Properties ◽ Insertion Technique ◽ Efficient Data ◽ Tree Data ◽ Tree Data Structure ◽ Nearest Neighbor Searches

The K-means algorithm is a well-known and widely used clustering algorithm due to its simplicity and convergence properties. However, one of the drawbacks of the algorithm is its instability. This paper presents improvements to the K-means algorithm using a K-dimensional tree (Kd-tree) data structure. The proposed Kd-tree is utilized as a data structure to enhance the choice of initial centers of the clusters and to reduce the number of the nearest neighbor searches required by the algorithm. The developed framework also includes an efficient center insertion technique leading to an incremental operation that overcomes the instability problem of the K-means algorithm. The results of the proposed algorithm were compared with those obtained from the K-means algorithm, K-medoids, and K-means++ in an experiment using six different datasets. The results demonstrated that the proposed algorithm provides superior and more stable clustering solutions.
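To illustrate why a Kd-tree can cut down nearest neighbor searches, here is a minimal sketch of a Kd-tree build and a pruned nearest neighbor query. It is a textbook construction, not the framework described in the paper, and the sample points and function names are hypothetical.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, target, best=None):
    """Return the stored point closest to target, pruning far subtrees."""
    if node is None:
        return best
    if best is None or math.dist(target, node["point"]) < math.dist(target, best):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < math.dist(target, best):
        best = nearest(far, target, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))
```

For the query `(9, 2)` the search never descends into the left half of the tree, because the splitting plane is farther away than the best candidate already found. In a K-means setting the same pruning applies when each sample looks up its closest centre.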

Balanced Data Clustering Algorithm for Both Hard and Soft Clustering

International Journal of Computer Sciences and Engineering ◽ 10.26438/ijcse/v6i2.176183 ◽ 2018 ◽ Vol 6 (2) ◽ pp. 176-183

Author(s): Purnendu Das ◽ Bishwa Ranjan Roy ◽ Saptarshi Paul ◽ ...

Keyword(s): Data Clustering ◽ Clustering Algorithm ◽ Soft Clustering

Tree-ART2 Learning Model for Spatial Clustering in Second Dimension

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.543-547.1934 ◽ 2014 ◽ Vol 543-547 ◽ pp. 1934-1938

Author(s): Ming Xiao

Keyword(s): Network Model ◽ Spatial Data ◽ Data Clustering ◽ Clustering Algorithm ◽ Spatial Clustering ◽ Adaptive Resonance Theory ◽ Spatial Distance ◽ Resonance Theory ◽ Adaptive Resonance ◽ Vector Module

For clustering two-dimensional spatial data, Adaptive Resonance Theory (ART) suffers from pattern drift and the loss of vector-modulus information, and it adapts poorly to irregularly distributed spatial data. A Tree-ART2 (TART2) network model is proposed to address these problems. It retains the memory of the old model, which maintains the spatial-distance constraint, by learning and adjusting the LTM pattern and the amplitude information of the vector. Introducing a tree structure into the model also reduces the subjective dependence on the vigilance parameter and decreases the occurrence of pattern mixing. Comparative experiments show that the TART2 network has higher plasticity and adaptability.

Improved Fuzzy C-Means Clustering for Transformer Fault Diagnosis Using Dissolved Gas Analysis Data

Energies ◽ 10.3390/en11092344 ◽ 2018 ◽ Vol 11 (9) ◽ pp. 2344 ◽ Cited By ~ 6

Author(s): Enwen Li ◽ Linong Wang ◽ Bin Song ◽ Siliang Jian

Keyword(s): Fault Diagnosis ◽ Membership Function ◽ Data Clustering ◽ Clustering Algorithm ◽ Gas Analysis ◽ Dissolved Gas ◽ Fuzzy C Means ◽ Dissolved Gas Analysis ◽ Fcm Clustering ◽ Transformer Fault

Dissolved gas analysis (DGA) of the oil allows transformer fault diagnosis and status monitoring. Fuzzy c-means (FCM) clustering is an effective pattern recognition method, but it exhibits poor clustering accuracy for dissolved gas data and usually fails to classify transformer faults correctly afterwards. The existing feasible approach combines the FCM clustering algorithm with other intelligent algorithms, such as neural networks and support vector machines. This combination enables good classification; however, the algorithm complexity is greatly increased. In this paper, the FCM clustering algorithm itself is improved and clustering analysis of DGA data is realized. First, the non-monotonicity of the traditional clustering membership function with respect to the sample distance, and its several local extrema, are discussed; these mainly explain the poor classification accuracy of DGA data clustering. Then, an exponential form of the membership function is proposed to obtain monotonicity with respect to distance, thereby improving the dissolved gas data clustering. Likewise, a similarity function to determine the degree of membership is derived. Test results for large datasets show that the improved clustering algorithm can be successfully applied to DGA-data-based transformer fault detection.
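The non-monotonicity mentioned above can be reproduced with the standard FCM membership formula, u_i(x) = 1 / Σ_k (d_i / d_k)^(2/(m-1)), where d_i is the distance from sample x to centre i and m is the fuzzifier. The sketch below evaluates it for two hypothetical one-dimensional centres; the paper's exponential membership function is not reproduced here because the abstract does not give its form.

```python
def fcm_membership(x, centres, m=2.0):
    """Standard FCM membership of a 1-D sample x in each cluster centre."""
    dists = [abs(x - c) for c in centres]
    if 0.0 in dists:
        # The sample coincides with a centre: full membership there.
        return [1.0 if d == 0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

centres = [0.0, 1.0]            # hypothetical cluster centres
for x in [0.0, 0.5, 1.0, 2.0, 5.0]:
    u0 = fcm_membership(x, centres)[0]
    print(f"x = {x:3.1f}  membership in centre 0 = {u0:.3f}")
```

Moving x from 1 to 2 to 5 takes it steadily farther from centre 0, yet its membership in that centre rises from 0 to 0.2 to about 0.39. A membership function that decreases monotonically with distance would not show this behaviour.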
