A spatially constrained clustering algorithm with no prior knowledge of the number of clusters

This paper presents a spatially constrained clustering algorithm. It is a distributed clustering approach that uses a set of spatial constraints to fine-tune the distances between agents of the system and so strengthen the data passing among them. The method increases interconnectivity among agents and clusters, which improves the overall communicative functionality of the multi-robot system. It establishes loosely coupled connections among the clusters, and these implicit interconnections allow the clusters to receive and transmit information within the multi-agent system. In other words, the algorithm assigns each agent to the cluster with the lowest cost of local communication with its peers. A probabilistic proof of the acquired optimality shows that the decentralized method improves the communicative agility of the swarm. The common assumption of full knowledge of the agents’ initial locations is fully relaxed compared with former methods. Consequently, the algorithm’s reliability and efficiency are confirmed. Furthermore, the method’s efficacy in passing information will improve the functionality of higher-level swarm operations, such as task assignment and swarm flocking. Analytical investigations and simulations of highly populated swarms support the claimed efficiency and coherence.

K-Means Cloning: Adaptive Spherical K-Means Clustering

Algorithms ◽ 10.3390/a11100151 ◽ 2018 ◽ Vol 11 (10) ◽ pp. 151 ◽ Cited By ~ 5

Author(s): Abdel-Rahman Hedar ◽ Abdel-Monem Ibrahim ◽ Alaa Abdel-Hakim ◽ Adel Sewisy

Keyword(s): Prior Knowledge ◽ Clustering Algorithm ◽ Negative Impact ◽ The Other ◽ Cell Cloning ◽ Number Of Clusters ◽ Cluster Set ◽ Initial Identification ◽ Novel Method ◽ Clustering Data

We propose a novel method for adaptive K-means clustering that overcomes two problems of the traditional K-means algorithm. First, it does not require prior knowledge of the number of clusters. Second, the initial identification of the cluster elements has no negative impact on the final generated clusters. Inspired by cell cloning in microorganism cultures, each added data sample causes the existing cluster ‘colonies’ to evaluate, together with the other clusters, various merging or splitting actions in order to reach the optimum cluster set. The proposed algorithm is suitable for clustering data in isolated or overlapping compact spherical clusters. Experimental results support the effectiveness of this clustering algorithm.

Method for determining optimal number of clusters in K-means clustering algorithm

Journal of Computer Applications ◽ 10.3724/sp.j.1087.2010.01995 ◽ 2010 ◽ Vol 30 (8) ◽ pp. 1995-1998 ◽ Cited By ~ 18

Author(s): Shi-bing ZHOU ◽ Zhen-yuan XU ◽ Xu-qing TANG

Keyword(s): Clustering Algorithm ◽ Optimal Number ◽ Number Of Clusters ◽ Optimal Number Of Clusters

A novel bidirectional clustering algorithm based on local density

Scientific Reports ◽ 10.1038/s41598-021-93244-2 ◽ 2021 ◽ Vol 11 (1)

Author(s): Baicheng Lyu ◽ Wenhua Wu ◽ Zhiqiang Hu

Keyword(s): Clustering Algorithm ◽ Local Density ◽ Clustering Algorithms ◽ Cluster Number ◽ Denoising Method ◽ Number Of Clusters ◽ Data Points ◽ Cutoff Distance ◽ Large Clusters ◽ Small Clusters

With the wide application of cluster analysis, the number of clusters is gradually increasing, as is the difficulty of selecting the indicators used to judge the number of clusters. Small clusters are crucial to discovering the extreme characteristics of data samples, but current clustering algorithms focus mainly on analyzing large clusters. In this paper, a bidirectional clustering algorithm based on local density (BCALoD) is proposed. BCALoD establishes the connection between data points based on local density, automatically determines the number of clusters, is more sensitive to small clusters, and reduces the number of adjusted parameters to a minimum. Based on the robustness of the cluster number to noise, a denoising method suitable for BCALoD is proposed. A different cutoff distance and cutoff density are assigned to each data cluster, which improves clustering performance. The clustering ability of BCALoD is verified on randomly generated datasets and city light satellite images.

A dynamic genetic clustering algorithm for automatic choice of the number of clusters

2011 9th IEEE International Conference on Control and Automation (ICCA) ◽ 10.1109/icca.2011.6137921 ◽ 2011 ◽ Cited By ~ 2

Author(s): Hong He ◽ Yonghong Tan

Keyword(s): Clustering Algorithm ◽ Number Of Clusters ◽ Genetic Clustering ◽ Automatic Choice

Self-Adaptive K-Means Based on a Covering Algorithm

Complexity ◽ 10.1155/2018/7698274 ◽ 2018 ◽ Vol 2018 ◽ pp. 1-16 ◽ Cited By ~ 1

Author(s): Yiwen Zhang ◽ Yuanyuan Zhou ◽ Xing Guo ◽ Jintao Wu ◽ Qiang He ◽ ...

Keyword(s): Large Scale ◽ Clustering Algorithm ◽ Real Data ◽ Second Phase ◽ Data Sets ◽ Number Of Clusters ◽ Large Scale Data ◽ Long Time ◽ Two Phases ◽ Selection Of

The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time. However, the number of clusters k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers. This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means). The C-K-means algorithm not only produces efficient and accurate clustering results but also self-adaptively provides a reasonable number of clusters based on the data features. It includes two phases: the initialization by the covering algorithm (CA) and the Lloyd iteration of K-means. The first phase executes the CA, which self-organizes and recognizes the number of clusters k based on the similarities in the data. It requires neither the number of clusters to be prespecified nor the initial centers to be manually selected. It therefore has a “blind” feature, that is, k is not preselected. The second phase performs the Lloyd iteration based on the results of the first phase. The C-K-means algorithm thus combines the advantages of CA and K-means. Experiments carried out on the Spark platform verify the good scalability of the C-K-means algorithm, which can effectively handle large-scale data clustering. Extensive experiments on real data sets show that the accuracy and efficiency of the C-K-means algorithm exceed those of existing algorithms under both sequential and parallel conditions.

ENTROPY-BASED CLUSTER VALIDATION AND ESTIMATION OF THE NUMBER OF CLUSTERS IN GENE EXPRESSION DATA

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720012500114 ◽ 2012 ◽ Vol 10 (05) ◽ pp. 1250011

Author(s): NATALIA NOVOSELOVA ◽ IGOR TOM

Keyword(s): Gene Expression ◽ Gene Expression Data ◽ Clustering Algorithm ◽ Selection Procedure ◽ Biological Knowledge ◽ Consensus Clustering ◽ Expression Data ◽ Cluster Validation ◽ Number Of Clusters ◽ Validity Measure

Many external and internal validity measures have been proposed to estimate the number of clusters in gene expression data, but as a rule they do not analyze the stability of the groupings produced by a clustering algorithm. Building on the approach that assesses the predictive power or stability of a partitioning, we propose a new measure of cluster validation and a selection procedure to determine a suitable number of clusters. The validity measure is based on an estimate of the "clearness" of the consensus matrix, which is the result of a resampling clustering scheme, or consensus clustering. In the proposed selection procedure, the stable clustering result is determined with reference to the validity measure under the null hypothesis, which encodes the absence of clusters. The final number of clusters is selected by analyzing the distance between the validity plots for the initial and permuted data sets. We applied the selection procedure to evaluate the clustering results on several datasets. The proposed procedure produced an accurate and robust estimate of the number of clusters, in agreement with the biological knowledge and gold standards of cluster quality.
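The consensus matrix mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not define its "clearness" measure, its base clustering algorithm, or its resampling scheme, so everything below is an assumption made for illustration: the helper names `two_means_1d` and `consensus_matrix` are hypothetical, the base clusterer is a simple 1-D two-means, and resampling draws subsets without replacement. Each entry (i, j) is the fraction of resamples containing both items in which they received the same label.

```python
import random

def two_means_1d(values, max_iter=100):
    """Hypothetical base clusterer: split 1-D values into two groups (labels 0/1)."""
    centers = [min(values), max(values)]  # deterministic start at the extremes
    labels = []
    for _ in range(max_iter):
        # Assignment step: label each value by its nearer center.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def consensus_matrix(data, n_resamples=50, sample_size=4, seed=0):
    """Fraction of co-sampled runs in which items i and j share a cluster."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]  # times i and j got the same label
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn together
    for _ in range(n_resamples):
        idx = rng.sample(range(n), sample_size)  # subsample without replacement
        labels = two_means_1d([data[i] for i in idx])
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    # Pairs that were never drawn together are reported as 0.0 (an assumption).
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

# Toy data: two well-separated groups of three values each.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
matrix = consensus_matrix(data)
```

With well-separated groups, entries for pairs in the same group sit near 1 and entries for pairs in different groups sit near 0. Intuitively, a matrix dominated by such extreme values reflects a stable partitioning, while many intermediate values suggest an unstable one.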

TCLUST: Trimming Approach of Robust Clustering Method

Malaysian Journal of Fundamental and Applied Sciences ◽ 10.11113/mjfas.v8n4.154 ◽ 2014 ◽ Vol 8 (4)

Author(s): Muhamad Alias Md. Jedi ◽ Robiah Adnan

Keyword(s): Clustering Algorithm ◽ Likelihood Function ◽ R Package ◽ Clustering Method ◽ Number Of Clusters ◽ Robust Clustering ◽ Scatter Matrix ◽ Group Assignment ◽ Log Likelihood ◽ Clustering Approach

TCLUST is a statistical clustering method based on a modification of the trimmed k-means clustering algorithm. It is called a “crisp” clustering approach because each observation is either eliminated or assigned to a group. TCLUST strengthens the group assignment by placing constraints on the cluster scatter matrices. The emphasis in this paper is on restricting the eigenvalues, λ, of the scatter matrix. The idea of imposing constraints is to maximize the log-likelihood function of the spurious-outlier model. A review of different robust clustering approaches is presented as a comparison to the TCLUST method. This paper discusses the nature of the TCLUST algorithm, how to determine the number of clusters or groups properly, and how to measure the strength of group assignment. Finally, the R package for TCLUST implements the different types of scatter restriction, making the algorithm more flexible in the choice of the number of clusters and the trimming proportion.
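The "eliminate or assign" behaviour described above can be sketched with plain trimmed k-means, the method TCLUST is said to modify. This is not TCLUST itself: it has no eigenvalue constraints on scatter matrices and no likelihood, and the function name `trimmed_kmeans`, the caller-supplied initial centers, and the label `-1` for eliminated observations are all assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def trimmed_kmeans(points, init_centers, alpha, max_iter=100):
    """Trimmed k-means sketch: each pass discards the alpha fraction of
    points farthest from their nearest center before updating the centers."""
    centers = list(init_centers)
    k = len(centers)
    n_trim = int(alpha * len(points))  # number of observations to eliminate
    labels = []
    for _ in range(max_iter):
        # Nearest center (and squared distance to it) for every point.
        nearest = []
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            nearest.append((sq_dist(p, centers[j]), j))
        # Trimming step: eliminate the n_trim worst-fitting points.
        order = sorted(range(len(points)), key=lambda i: nearest[i][0], reverse=True)
        trimmed = set(order[:n_trim])
        labels = [-1 if i in trimmed else nearest[i][1] for i in range(len(points))]
        # Update step: centers are means of the retained members only.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Toy data: two compact groups plus one gross outlier.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11),
          (50, 50)]
centers, labels = trimmed_kmeans(points, [(0, 0), (11, 11)], alpha=0.15)
```

With `alpha=0.15` and nine points, one observation is eliminated per pass. On this toy data that is the outlier at (50, 50), so the second center stays at the middle of its group instead of being pulled toward the outlier.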

SISTEM APLIKASI BERBASIS OPTIMASI METODE ELBOW UNTUK PENENTUAN CLUSTERING PELANGGAN [Application System Based on Elbow Method Optimization for Determining Customer Clustering]

JOUTICA ◽ 10.30736/jti.v3i1.196 ◽ 2018 ◽ Vol 3 (1) ◽ pp. 117 ◽ Cited By ~ 1

Author(s): Elly Muningsih ◽ Sri Kiswati

Keyword(s): Data Center ◽ Clustering Algorithm ◽ Visual Basic ◽ Customer Management ◽ Number Of Clusters ◽ Transaction Data ◽ Development Method ◽ Popular Method ◽ Or Groups ◽ Cluster 2

Customers are a very important asset for a company, and having loyal customers is essential to its progress. This study aims to help companies, especially online shops, achieve better customer management by identifying and grouping customers into several clusters, or groups, that reveal the characteristics of their loyalty to the company. The method used in this research is K-Means, one of the best and most popular clustering algorithms. To overcome the weakness of K-Means in determining the number of clusters, we use the Elbow method, which compares candidate numbers of clusters by calculating the SSE (sum of squared errors) for each candidate value. The research starts by collecting the necessary data for processing. Of 478 transaction records in total, 73 remained after data cleaning. The data were then processed with RapidMiner software for 2 to 10 clusters to find the center of each cluster. The calculated SSE values show that the best number of clusters is 3. The end result of the research is a Visual Basic application program that is expected to make grouping, or clustering, customers easier. The software was developed using the Waterfall method.
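The Elbow procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' RapidMiner workflow: the toy data, the helper names, the deterministic farthest-first initialization, and the "largest change in slope" rule for locating the bend are all assumptions. The study itself scanned 2 to 10 clusters, while the toy example scans 1 to 4.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(group) for coord in zip(*group))

def farthest_first(points, k):
    """Assumed deterministic init: repeatedly add the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_sse(points, k, max_iter=100):
    """Run Lloyd-style K-means and return the sum of squared errors (SSE)."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def pick_elbow(sse_by_k):
    """Assumed heuristic: the k where the SSE drop shrinks the most afterwards."""
    ks = sorted(sse_by_k)
    best_k, best_bend = None, float("-inf")
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        bend = (sse_by_k[prev] - sse_by_k[k]) - (sse_by_k[k] - sse_by_k[nxt])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Toy data: three compact groups of four points each.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 0), (11, 0), (10, 1), (11, 1),
          (0, 10), (1, 10), (0, 11), (1, 11)]
sse = {k: kmeans_sse(points, k) for k in range(1, 5)}
best_k = pick_elbow(sse)
```

On this toy data the SSE falls steeply up to three clusters and barely changes afterwards, so the bend is at k = 3. In practice the curve is often inspected visually rather than with a fixed rule.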

Spatially Constrained Clustering to Define Geographical Rating Territories

Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods ◽ 10.5220/0006118100820088 ◽ 2017

Author(s): Shengkun Xie ◽ Anna T. Lawniczak ◽ Zizhen Wang

Keyword(s): Constrained Clustering ◽ Spatially Constrained Clustering
