hard clustering
Recently Published Documents


TOTAL DOCUMENTS

73
(FIVE YEARS 22)

H-INDEX

13
(FIVE YEARS 1)

2022 ◽  
Vol 10 (4) ◽  
pp. 544-553
Author(s):  
Ratna Kurniasari ◽  
Rukun Santoso ◽  
Alan Prahutama

Effective communication between the government and society is essential to good governance. The government provides a channel for public complaints through an online aspiration and complaint service called “LaporGub..!”. To make incoming reports easier to group, report topics are identified with clustering. Text mining converts the text data into numeric data so that it can be processed further. Clustering methods are classified as soft (fuzzy) clustering or hard clustering. Hard clustering divides the data into clusters strictly, with no overlapping membership between clusters. Soft clustering can place a data point in several clusters, each with a degree of membership. Because of these graded memberships, fuzzy grouping gives more natural results than hard clustering: objects on the boundary between several classes are not forced to fit entirely into one class, and each object is instead assigned a degree of membership. Fuzzy c-means has the advantage of placing cluster centers more precisely than other clustering methods, because it refines the centers iteratively. The best number of clusters is the one with the maximum silhouette coefficient. A word cloud, a form of text data visualization, is used to determine the dominant topic in each cluster. The results show that fuzzy c-means reaches its maximum silhouette coefficient with three clusters. The first cluster yields a word cloud about road conditions (449 reports), the second about COVID assistance (964 reports), and the third about farmers' fertilizer (176 reports). COVID assistance is therefore the topic with the most reports.
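
The difference between graded and strict membership can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it applies the standard fuzzy c-means membership formula to fixed centers, and the function names, sample points, and fuzzifier m = 2 are assumptions chosen for the demonstration.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return one row of membership degrees per point; each row sums to 1.

    Standard fuzzy c-means rule: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)).
    """
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # A point lying exactly on a center belongs fully to that center.
            hit = dists.index(0.0)
            rows.append([1.0 if i == hit else 0.0 for i in range(len(centers))])
            continue
        rows.append([1.0 / sum((d_i / d_k) ** power for d_k in dists)
                     for d_i in dists])
    return rows

def harden(memberships):
    """Collapse soft memberships to hard labels by taking the largest degree."""
    return [row.index(max(row)) for row in memberships]

# Hypothetical data: two centers and three points, one of them on the boundary.
centers = [(0.0, 0.0), (4.0, 0.0)]
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
soft = fcm_memberships(points, centers)
hard = harden(soft)
```

The midpoint (2, 0) receives equal degrees in both clusters, whereas hardening forces it into a single one; that forced choice is the information hard clustering discards.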


2021 ◽  
Vol 15 (2) ◽  
pp. 385-392
Author(s):  
Muhamad Budiman Johra

Developing regions to reduce disparities and ensure equity is one of the seven development agendas of the RPJMN IV 2020-2024. Each region has different potential, both physical and non-physical. These differences form the basis for grouping villages so that village development becomes more targeted. In general, clustering methods fall into two groups: hard clustering and soft clustering. In hard clustering, each object is assigned to exactly one group. A popular hard clustering method is K-Means. In soft clustering, by contrast, an object is not assigned to only one group. Fuzzy K-Means (FCM) is one soft clustering method and is an extension of K-Means. FCM works by giving each object a probability that in essence describes the object's membership in each cluster.
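
For reference, the hard assignment that K-Means performs can be sketched as follows. This is a generic textbook version (Lloyd's iteration) under assumed names and toy data; it is not taken from the study.

```python
import math

def assign(points, centers):
    """Hard assignment: each point gets the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]

def update(points, labels, centers):
    """Move each center to the mean of its members; keep empty clusters as-is."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centers.append(old)
            continue
        new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centers

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update until the centers stop moving."""
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical toy data with two obvious groups and hand-picked starting centers.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels, centers = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
```

Every point ends up with exactly one label, which is what distinguishes this from the membership degrees produced by a fuzzy method.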


Author(s):  
Naghmeh Pakgohar ◽  
Javad Eshaghi Rad ◽  
Gholam Hossein Gholami ◽  
Ahmad Alijanpour ◽  
David W. Roberts

2021 ◽  
Vol 5 (1) ◽  
pp. 141-160
Author(s):  
Nurafiza Thamrin ◽  
Arie Wahyu Wijayanto

The National Medium Term Development Plan 2020-2024 states that one of the visions of national development is to accelerate the distribution of welfare and justice. Cluster analysis groups objects into several smaller groups such that the objects in one group share similar characteristics. This study was conducted to find the best clustering method and to group cities in Java by level of welfare. The methods compared were hard clustering methods, namely K-Means, K-Medoids (PAM and CLARA), and Hierarchical Agglomerative clustering, and a soft clustering method, Fuzzy C-Means. The study uses the elbow method, the silhouette method, and the gap statistic to determine the optimal number of clusters. Evaluation with the silhouette coefficient, Dunn index, connectivity coefficient, and Sw/Sb ratio showed that the best method was Agglomerative clustering with Ward linkage, which produced three clusters. The first cluster consists of 27 cities with moderate welfare, the second of 16 cities with high welfare, and the third of 76 cities with low welfare. With these clustering results, city governments in Java should be able to make better welfare policies based on the dominant indicators found in each cluster.
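
As a rough illustration of how a silhouette coefficient ranks competing partitions, the sketch below computes the mean silhouette from its usual definition, s = (b - a) / max(a, b), using only the standard library. The function name and the four-point data set are assumptions; the study's own computation is not described in the abstract.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette over all points; requires at least two clusters.

    a = mean distance to the other members of the point's own cluster
    b = smallest mean distance to the members of any other cluster
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster scores 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical 1-D data: two tight pairs far apart.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = mean_silhouette(points, [0, 0, 1, 1])  # pairs kept together
bad = mean_silhouette(points, [0, 1, 0, 1])   # pairs split apart
```

The partition that keeps each pair together scores close to 1 and the one that splits them scores below 0, so choosing the labeling (or the number of clusters) with the highest mean silhouette favors compact, well-separated groups.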


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0241888
Author(s):  
Thanasan Intarakumthornchai ◽  
Ramil Kesvarakul

Worldwide chicken egg production has increased by 60%, expanding the egg industry for farmers and traders. Double yolk (DY) eggs are priced around 35% higher than single yolk (SY) eggs of the same size. Although separating DY from SY eggs increases revenue, the gain is offset by the higher cost of skilled labor for sorting. Normally, separating double yolk eggs requires an expert who judges the weight and shape of each egg, and the error rate is still high. The purpose of this research is to detect double yolk chicken eggs from the weight and the size ratio of the egg using fuzzy logic, and to develop a low-cost prototype that reduces the cost of separation. K-means clustering is first used to separate DY from SY eggs. However, the error from this technique is as high as 15.05% because it is a hard clustering method. Therefore, the zone where DY and SY eggs overlap in weight and size ratio is handled with a fuzzy logic algorithm to reduce the error. The errors from fuzzy logic depend on the input membership functions (MF). This research selects triangular MFs for weight with low = 65 g, medium = 75 g, and high = 85 g, and triangular MFs for the egg's size ratio with low = 1.30, medium = 1.40, and high = 1.50. This choice does not give the minimum total error, but it gives a low rate of detecting a double yolk when the egg is actually SY: 1.43% of all eggs. The algorithm is applied in a double yolk egg detection prototype built on the Mbed platform, with a load cell measuring the weight and an OpenMV CAM measuring the size ratio of the egg.
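
The triangular membership functions for weight might look like the sketch below. Only the peak values (65 g, 75 g, 85 g) come from the abstract; the width of each triangle is not stated there, so the feet are assumed to sit 10 g either side of the peak, and the function and variable names are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Peaks are from the abstract; the feet (peak +/- 10 g) are an ASSUMPTION.
WEIGHT_MF = {
    "low": (55.0, 65.0, 75.0),
    "medium": (65.0, 75.0, 85.0),
    "high": (75.0, 85.0, 95.0),
}

def fuzzify_weight(grams):
    """Map an egg weight in grams to a degree in each linguistic class."""
    return {name: triangular(grams, *abc) for name, abc in WEIGHT_MF.items()}

degrees = fuzzify_weight(72.0)
```

Under these assumed widths, a 72 g egg is partly "low" and mostly "medium", which is how the fuzzy approach represents the overlap zone that a hard K-means boundary would cut in two.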


2020 ◽  
Vol 39 (2) ◽  
pp. 464-471
Author(s):  
J.A. Adeyiga ◽  
S.O. Olabiyisi ◽  
E.O. Omidiora

Several criminal profiling systems have been developed to assist law enforcement agencies in solving crimes, but the techniques employed in most of these systems cannot cluster criminals by their behavioral characteristics. This paper reviews clustering techniques used in criminal profiling and selects one fuzzy clustering algorithm (Expectation Maximization) and two hard clustering algorithms (K-Means and Hierarchical). The selected algorithms were implemented with the WEKA software package and tested on real-life data to produce "profiles" of criminal activity and criminal behavior. Their performance was evaluated using cluster accuracy and time complexity. The results show that the Expectation Maximization algorithm achieved a cluster accuracy of 90.5% in 8.5 s, while K-Means achieved 62.6% in 0.09 s and Hierarchical achieved 51.9% in 0.11 s. In conclusion, the soft clustering algorithm performs better than the hard clustering algorithms in analyzing criminal data. Keywords: Clustering Algorithm, Profiling, Crime, Membership value
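
The "membership value" that Expectation Maximization produces is the responsibility computed in its E-step. The sketch below shows that step alone for a one-dimensional Gaussian mixture with fixed parameters; it is an illustrative assumption about the model form, not WEKA's implementation, and all names and numbers are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def e_step(xs, weights, means, stds):
    """Return, for each x, the posterior probability of each component."""
    resp = []
    for x in xs:
        joint = [w * gaussian_pdf(x, mu, s)
                 for w, mu, s in zip(weights, means, stds)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp

# Hypothetical mixture: two equally weighted unit-variance components.
resp = e_step([0.0, 2.0, 4.0], weights=[0.5, 0.5], means=[0.0, 4.0], stds=[1.0, 1.0])
```

A point near one mean is assigned almost entirely to that component, while the point halfway between them is split evenly; K-Means would give the same halfway point a single label.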


2020 ◽  
Vol 13 (2) ◽  
pp. 234-239
Author(s):  
Wang Meng ◽  
Dui Hongyan ◽  
Zhou Shiyuan ◽  
Dong Zhankui ◽  
Wu Zige

Background: Clustering is one of the most important data mining methods. K-means (c-means) and its derivatives have been a hotspot of clustering research in recent years. Clustering methods can be divided into two categories according to how they treat uncertainty: hard clustering and soft clustering. Within k-means research, Hard C-Means clustering (HCM) belongs to hard clustering, while Fuzzy C-Means clustering (FCM) belongs to soft clustering. Linear separability is a major challenge for clustering and classification algorithms, and further improvement is required in the big data era. Objective: The RKM algorithm based on fuzzy roughness is also a hot topic in current research. Rough set theory and fuzzy theory are powerful tools for describing uncertainty and are the same in essence. Therefore, RKM can be kernelized by means of KFCM. In this paper, we put forward a Kernel Rough K-Means algorithm (KRKM) to solve nonlinear problems with RKM. KRKM extends RKM's ability to process complex data and addresses the uncertainty problem of soft clustering. Methods: This paper sets out the procedure of the KRKM algorithm. Clustering accuracy was then compared using data sets from the UCI repository. The experimental results show that KRKM improves clustering accuracy compared with the RKM algorithm. Results: The classification precision of both KFCM and KRKM improved. KRKM was slightly more precise than KFCM, indicating that KRKM is an attractive alternative clustering algorithm with good results on nonlinear clustering problems. Conclusion: Comparison with the precision of the KFCM algorithm shows that KRKM has a slight advantage in clustering accuracy. KRKM is one of the effective clustering algorithms that can be selected for nonlinear clustering.
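
Kernelizing a k-means-style method usually comes down to replacing the Euclidean distance with a distance measured in the kernel's feature space. The sketch below shows the generic kernel trick for that distance with a Gaussian (RBF) kernel. The abstract does not say which kernel or which variant KRKM uses, so the kernel choice, the sigma value, and the function names are all assumptions.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def kernel_sq_distance(x, v, kernel=rbf_kernel):
    """Squared feature-space distance ||phi(x) - phi(v)||^2 via the kernel trick.

    Expanding the square gives K(x, x) - 2 K(x, v) + K(v, v), so the mapping
    phi never has to be computed explicitly.
    """
    return kernel(x, x) - 2.0 * kernel(x, v) + kernel(v, v)

# Hypothetical point and cluster center.
d2 = kernel_sq_distance((0.0, 0.0), (1.0, 0.0))
```

For the Gaussian kernel K(x, x) = 1, so the expression reduces to 2(1 - K(x, v)); substituting it for the Euclidean distance in the assignment step is what lets a centroid-based method separate clusters that are not linearly separable in the input space.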


2020 ◽  
Vol 2 (2) ◽  
pp. 111-119
Author(s):  
Dr. Akey Sungheetha ◽  
Dr. Rajesh Sharma R

Edge detection is one of the important stages in applications of machine vision, computer vision, and image processing. It is most commonly used where attributes must be extracted or detected. Because manual diagnosis from medical images acquired by CT (computed tomography) and MRI (magnetic resonance imaging) is tedious and time consuming, this paper puts forward a method for detecting edges in CT and MRI images that employs the Gabor transform together with soft and hard clustering. The proposed method is well suited to images with dynamic variations. The technique is evaluated on 4500 MRI instances and 3000 CT instances. The results, in terms of figure of merit (FOM) and misclassification rate (MCR), are compared with other standard approaches to demonstrate the method's performance.


2020 ◽  
Vol 23 (1) ◽  
pp. 79-89
Author(s):  
Quy Hoang Van ◽  
Huy Tran Van ◽  
Huy Ngo Hoang ◽  
Tuyet Dao Van ◽  
Sergey Ablameyko

The efficient manifold ranking (EMR) algorithm is used quite effectively in content-based image retrieval (CBIR) for large image databases, where images are represented by multiple low-level features that describe color, texture, and shape. The EMR algorithm determines anchor points of the image database with k-means hard clustering, and the accuracy of the ranking depends strongly on the selected anchor points. This paper describes a new result based on a modified Fuzzy C-Means (FCM) clustering algorithm that selects anchor points in a large database in order to make manifold ranking more efficient, especially for large databases. Experiments demonstrate the effectiveness of the proposed algorithm for building an anchor graph: the set of anchor points determined by this novel lvdc-FCM algorithm increases the effectiveness of manifold ranking and the quality of the image query results retrieved by the CBIR system.

