ANALYSIS OF PUBLIC REPORT TRENDS ON “LAPORGUB..!” IN CENTRAL JAVA PROVINCE USING TEXT MINING WITH FUZZY C-MEANS CLUSTERING

Ratna Kurniasari; Rukun Santoso; Alan Prahutama

doi:10.14710/j.gauss.v10i4.33101

Jurnal Gaussian ◽

10.14710/j.gauss.v10i4.33101 ◽

2022 ◽

Vol 10 (4) ◽

pp. 544-553

Author(s):

Ratna Kurniasari ◽

Rukun Santoso ◽

Alan Prahutama

Keyword(s):

Text Mining ◽

Cluster Center ◽

Text Data ◽

Fuzzy C Means ◽

Word Cloud ◽

Silhouette Coefficient ◽

Degree Of Membership ◽

Fuzzy C Means Clustering ◽

Hard Clustering ◽

The Government

Effective communication between the government and society is essential to achieving good governance. The government provides a channel for public complaints through an online aspiration and complaint service called “LaporGub..!”. To make incoming reports easier to group, report topics are identified by clustering. Text mining is used to convert text data into numeric data so that it can be processed further. Clustering methods are classified as soft (fuzzy) clustering or hard clustering. Hard clustering divides data into clusters strictly, without any overlapping membership between clusters. Soft clustering can place a data point in several clusters, each with a degree of membership. Because objects at the boundary between several classes are not forced to fit fully into one class but are instead assigned a degree of membership in each, fuzzy grouping gives more natural results than hard clustering. Fuzzy c-means has the advantage of placing cluster centers more precisely than other clustering methods, because it refines the cluster centers iteratively. The best number of clusters is chosen as the one with the maximum silhouette coefficient. A word cloud, a form of text data visualization, is used to determine the dominant topic in each cluster. The results show that the silhouette coefficient for fuzzy c-means clustering is highest with three clusters. The first cluster produces a word cloud about road conditions (449 reports), the second a word cloud about COVID assistance (964 reports), and the third a word cloud about fertilizer for farmers (176 reports). COVID assistance is therefore the topic with the most reports.
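
The iterative refinement of cluster centers and the degrees of membership described in this abstract can be illustrated with a minimal sketch of the standard fuzzy c-means updates. This is not the authors' code: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the Euclidean distance, the random initialisation, and the toy two-dimensional data are all assumptions chosen for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            weights = [u[i][k] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        # Membership update from the distances to the new centers.
        new_u = []
        for p in points:
            dists = [math.dist(p, v) for v in centers]
            if any(d == 0.0 for d in dists):
                # The point sits exactly on one or more centers:
                # split full membership among those centers.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                s = sum(hits)
                new_u.append([h / s for h in hits])
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                s = sum(inv)
                new_u.append([v / s for v in inv])
        change = max(
            abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb)
        )
        u = new_u
        if change < tol:
            break
    return centers, u


# Toy data (hypothetical): two well-separated groups of three points.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(data, c=2)
# Hard labels, if needed, take the cluster with the highest membership.
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
```

In a text mining setting the tuples would be document vectors (for example term weights) rather than 2-D points; the update rules stay the same.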

Implementation of Fuzzy C-Means for Clustering the Majelis Ulama Indonesia (MUI) Fatwa Documents

Jurnal Online Informatika ◽

10.15575/join.v6i1.591 ◽

2021 ◽

Vol 6 (1) ◽

pp. 79

Author(s):

Fajar Rohman Hariri

Keyword(s):

Text Mining ◽

Islamic Law ◽

Clustering Method ◽

Fuzzy C Means ◽

Silhouette Coefficient ◽

Best Value ◽

Fuzzy C Means Clustering

Since the Indonesian Ulema Council (MUI) was established in 1975, the institution has issued 201 edicts (fatwas) covering various fields. Text mining is a technique for extracting hidden information from text data, and clustering is one text mining method. The present study applies the Fuzzy C-Means Clustering method to MUI fatwa documents to group existing fatwas by the similarity of the issues they discuss. The Silhouette Coefficient is used to evaluate the resulting clusters; the best value, 0.0982, is obtained with 10 clusters. Grouping fatwas by the similarity of the issues discussed can make searching for an Islamic legal ruling in Indonesia easier and faster.
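
The silhouette coefficient used here to judge cluster quality can be sketched as follows. This is an illustrative implementation of the usual definition, not the study's code: it assumes Euclidean distance and hard labels (for fuzzy clustering, one common choice is to assign each object to its highest-membership cluster, which the abstract does not confirm). The function name and the toy data are hypothetical.

```python
import math


def silhouette_coefficient(points, labels):
    """Mean silhouette width over all points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0.0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two tight pairs far apart.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_coefficient(data, [0, 0, 1, 1])  # pairs kept together
bad = silhouette_coefficient(data, [0, 1, 0, 1])   # pairs split apart
```

Values near 1 indicate compact, well-separated clusters and values near 0 indicate overlapping ones, so a best value of 0.0982 suggests that the fatwa clusters overlap heavily.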

Improved Fuzzy FCM-LI Algorithm

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.765-767.670 ◽

2013 ◽

Vol 765-767 ◽

pp. 670-673

Author(s):

Li Bo Hou

Keyword(s):

Real Time ◽

Clustering Algorithm ◽

Feature Analysis ◽

Cluster Center ◽

High Dimensional ◽

Fuzzy C Means ◽

Sample Data ◽

Fuzzy C Means Clustering ◽

Fcm Clustering ◽

Np Hard Problem

The fuzzy C-means (FCM) clustering algorithm is one of the most widely applied algorithms in unsupervised pattern recognition. However, the iterative process of FCM requires many calculations, especially when the feature vectors are high-dimensional; using the clustering algorithm to partition such data is not only inefficient but may also suffer from the "curse of dimensionality." To address this problem, this paper analyzes the fuzzy C-means clustering algorithm on high-dimensional features, where finding the cluster centers is an NP-hard problem. To improve the effectiveness and real-time performance of fuzzy C-means clustering in high-dimensional feature analysis, it combines FCM with the landmark isometric mapping (L-ISOMAP) algorithm and proposes an improved algorithm, FCM-LI. The samples are first analyzed preliminarily; the clustering results and the correlation of the sample data are then used with L-ISOMAP to reduce the dimension, and further analysis on that basis yields the final results. Experimental results demonstrate the effectiveness and real-time performance of the FCM-LI algorithm in high-dimensional feature analysis.

Cluster Analysis on Dengue Incidence and Weather Data Using K-Medoids and Fuzzy C-Means Clustering Algorithms (Case Study: Spread of Dengue in the DKI Jakarta Province)

Journal of Mathematical and Fundamental Sciences ◽

10.5614/j.math.fund.sci.2021.53.3.9 ◽

2022 ◽

Vol 53 (3) ◽

pp. 466-486

Author(s):

Cindy Cindy ◽

Cynthia Cynthia ◽

Valentino Vito ◽

Devvi Sarwinda ◽

Bevina Desjwiandra Handari ◽

...

Keyword(s):

Time Series ◽

Time Series Data ◽

Clustering Algorithms ◽

Capital City ◽

Weather Data ◽

Series Data ◽

Weather Factors ◽

Fuzzy C Means ◽

Fuzzy C Means Clustering ◽

The Government

In Indonesia, Dengue incidence tends to increase every year but has fluctuated in recent years. The potential for Dengue outbreaks in DKI Jakarta, the capital city, deserves serious attention. Weather factors are suspected of being associated with the incidence of Dengue in Indonesia. This research used weather and Dengue incidence data for five regions of DKI Jakarta, Indonesia, from December 30, 2008, to January 2, 2017. The study applied a clustering approach to time-series and non-time-series data using K-Medoids and Fuzzy C-Means Clustering. The clustering results for the non-time-series data showed a positive correlation between the number of Dengue incidents and both average relative humidity and amount of rainfall, whereas Dengue incidence and average temperature were negatively correlated. Moreover, clustering the time-series data showed that rainfall patterns most closely resembled those of Dengue incidence, so rainfall can be used to estimate Dengue incidence. Both results suggest that the government could use weather data to predict possible spikes in Dengue hemorrhagic fever (DHF) incidence, especially at the onset of the rainy season, and to alert the public to the greater probability of a Dengue outbreak.

Comparison of Soft and Hard Clustering: A Case Study on Welfare Level in Cities on Java Island

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i1p141-160 ◽

2021 ◽

Vol 5 (1) ◽

pp. 141-160

Author(s):

Nurafiza Thamrin ◽

Arie Wahyu Wijayanto

Keyword(s):

Cluster Analysis ◽

National Development ◽

Optimal Number ◽

Medium Term ◽

Silhouette Coefficient ◽

Hard Clustering ◽

The Government ◽

High Welfare ◽

Optimal Number Of Clusters

The National Medium Term Development Plan 2020-2024 states that one of the visions of national development is to accelerate the distribution of welfare and justice. Cluster analysis groups objects into several smaller groups so that the objects in each group have similar characteristics. This study was conducted to find the best clustering method and to classify cities in Java by their level of welfare. The methods compared were hard clustering methods such as K-Means, K-Medoids (PAM and CLARA), and Hierarchical Agglomerative clustering, as well as soft clustering methods such as Fuzzy C-Means. The study uses the elbow method, the silhouette method, and the gap statistic to determine the optimal number of clusters. Evaluation with the silhouette coefficient, Dunn index, connectivity coefficient, and Sw/Sb ratio shows that the best method is Agglomerative clustering with Ward linkage, which produces three clusters. The first cluster consists of 27 cities with moderate welfare, the second of 16 cities with high welfare, and the third of 76 cities with low welfare. With these clustering results, city governments in Java should be able to make better welfare policies based on the dominant indicators found in each cluster.

Big Data Clustering Using Improvised Fuzzy C-Means Clustering

Revue d intelligence artificielle ◽

10.18280/ria.340604 ◽

2020 ◽

Vol 34 (6) ◽

pp. 701-708

Author(s):

Venkat Rayala ◽

Satyanarayan Reddy Kalli

Keyword(s):

Big Data ◽

Comparative Analysis ◽

Research Work ◽

Cluster Center ◽

Adjusted Rand Index ◽

Postal Service ◽

Data Types ◽

Normalized Mutual Information ◽

Fuzzy C Means ◽

Fuzzy C Means Clustering

Clustering has emerged as a powerful mechanism for analyzing the massive data generated by modern applications; its main aim is to group objects into clusters so that each object falls into a particular category. However, clustering big data poses various challenges. Deep learning has been a powerful paradigm for big data analysis, but it requires a huge number of samples to train the model, which is time consuming and expensive. This can be avoided through a fuzzy approach. In this research work, we design and develop an Improvised Fuzzy C-Means (IFCM), which combines an encoder-decoder Convolutional Neural Network (CNN) model with the Fuzzy C-Means (FCM) technique to enhance the clustering mechanism. The encoder-decoder CNN is used for feature learning and faster computation. In the general FCM, we introduce a function that measures the distance between the cluster center and each instance, which helps achieve better clustering; we then introduce an Optimized Encoder Decoder (OED) CNN model to improve performance and speed up computation. To evaluate the proposed mechanism, three datasets are used, namely Modified National Institute of Standards and Technology (MNIST), Fashion MNIST, and United States Postal Service (USPS), and evaluation is carried out with the performance metrics Accuracy, Adjusted Rand Index (ARI), and Normalized Mutual Information (NMI). A comparative analysis on each dataset shows that IFCM outperforms the existing model.

Application of artificial fish swarm optimization semi-supervised kernel fuzzy clustering algorithm in network intrusion

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-179935 ◽

2020 ◽

Vol 39 (2) ◽

pp. 1619-1626

Author(s):

Yongsheng Zong ◽

Guoyan Huang

Keyword(s):

Intrusion Detection ◽

Detection Rate ◽

Clustering Algorithm ◽

Probabilistic Constraints ◽

Cluster Center ◽

Swarm Optimization ◽

Fuzzy C Means ◽

Network Intrusion ◽

Fuzzy C Means Clustering ◽

Artificial Fish Swarm

Clustering algorithms based on unsupervised learning have a low intrusion detection rate, while those based on supervised learning lack sufficient training samples. A semi-supervised kernel fuzzy C-means clustering algorithm based on artificial fish swarm optimization (AFSA-KFCM) is proposed. First, a kernel function replaces the distance function in the traditional semi-supervised fuzzy C-means clustering algorithm to define a new objective function, thus improving the probabilistic constraints of the fuzzy C-means algorithm. Then, the artificial fish swarm algorithm, which has strong global optimization ability, is used to address KFCM's sensitivity to the initial cluster centers and its tendency to fall into local extrema, thus improving both the convergence speed and the classification performance. Test results on the Wine and IRIS public datasets show that the AFSA-KFCM clustering algorithm is superior to the traditional algorithm in clustering accuracy and time efficiency. Experimental results on the KDDCUP99 data further show that the algorithm achieves a good detection rate and false detection rate in intrusion detection.
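
The replacement of the distance function by a kernel can be written out as a short derivation. This is a sketch under an assumption the abstract does not state: a Gaussian kernel with width \sigma, with the cluster center v_k kept in the input space. Here x_i is a sample and \phi is the implicit feature map of the kernel K.

```latex
\begin{aligned}
d_K^2(x_i, v_k) &= \lVert \phi(x_i) - \phi(v_k) \rVert^2 \\
                &= K(x_i, x_i) - 2K(x_i, v_k) + K(v_k, v_k) \\
                &= 2\bigl(1 - K(x_i, v_k)\bigr),
\qquad K(x, v) = \exp\!\left(-\frac{\lVert x - v \rVert^2}{2\sigma^2}\right)
\end{aligned}
```

The last step uses K(x, x) = 1, which holds for the Gaussian kernel; with another kernel the second line is the general form.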

An Improved FCM Algorithm Based on Subtractive Clustering for Power Load Classification

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.986-987.206 ◽

2014 ◽

Vol 986-987 ◽

pp. 206-210 ◽

Cited By ~ 1

Author(s):

Rui Dong ◽

Min Xiang Huang

Keyword(s):

Experimental Analysis ◽

Global Search ◽

Cluster Center ◽

Subtractive Clustering ◽

Local Optima ◽

Fcm Algorithm ◽

Fuzzy C Means ◽

Power Load ◽

Fuzzy C Means Clustering ◽

Random Initialization

FCM is currently used in many power load classification tasks, but it has some shortcomings. This paper presents an algorithm based on subtractive clustering and an improved Fuzzy C-means clustering (SUB-FCM) to address them. The algorithm uses subtractive clustering to initialize the cluster center matrix, which replaces the random initialization of FCM, improves the global search ability, and avoids falling into local optima. Experimental analysis found that the algorithm also accelerates convergence and gives better clustering results. It can be applied effectively to power load classification.
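
The idea of initializing the cluster centers with subtractive clustering instead of at random can be sketched as below. This follows the usual potential-based formulation of subtractive clustering with a simplified stopping rule (a fixed number of centers); the paper's exact variant is not given in the abstract, so the function name, the radius `ra`, the factor 1.5 for the second radius, and the toy data are all assumptions.

```python
import math


def subtractive_centers(points, n_centers, ra=1.0, rb_factor=1.5):
    """Pick initial centers by subtractive clustering (fixed-count stop rule)."""
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (rb_factor * ra) ** 2

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Potential of each point: high where many neighbours lie within about ra.
    potential = [
        sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points
    ]
    centers = []
    for _ in range(n_centers):
        best = max(range(len(points)), key=potential.__getitem__)
        best_potential = potential[best]
        centers.append(points[best])
        # Lower the potential near the chosen center so that the next
        # pick lands in a different dense region.
        potential = [
            pot - best_potential * math.exp(-beta * sqdist(p, points[best]))
            for p, pot in zip(points, potential)
        ]
    return centers


# Toy data (hypothetical): a denser group near the origin and a second group.
data = [
    (0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (4.9, 5.0),
]
initial_centers = subtractive_centers(data, n_centers=2)
```

The returned centers would then replace the random starting point of FCM. Because they are computed deterministically from the data density, repeated runs start from the same place.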

A Novel Segmentation Method for Brain MRI Using a Block-Based Integrated Fuzzy C-Means Clustering Algorithm

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2020.2970 ◽

2020 ◽

Vol 10 (3) ◽

pp. 579-585

Author(s):

Hui Zhang ◽

Hongjie Zhang

Keyword(s):

Clustering Algorithm ◽

Brain Diseases ◽

Cluster Center ◽

Large Sample Size ◽

Large Sample ◽

Fcm Algorithm ◽

Fuzzy C Means ◽

Fuzzy C Means Clustering ◽

Final Cluster ◽

Block Based

Accurate segmentation of brain tissue has important guiding significance and practical application value for the diagnosis of brain diseases. Brain magnetic resonance imaging (MRI) has the characteristics of high dimensionality and large sample size. Such datasets create considerable computational complexity in image processing. To efficiently process large sample data, this article integrates the proposed block clustering strategy with the classic fuzzy C-means clustering (FCM) algorithm and proposes a block-based integrated FCM clustering algorithm (BI-FCM). The algorithm first performs block processing on each image and then clusters each subimage using the FCM algorithm. The cluster centers for all subimages are again clustered using FCM to obtain the final cluster center. Finally, the distance from each pixel to the final cluster center is obtained, and the corresponding division is performed according to the distance. The dataset used in this experiment is the Simulated Brain Database (SBD). The results show that the BI-FCM algorithm addresses the large sample processing problem well, and the theory is simple and effective.

Fuzzy C-Means Clustering Algorithm Based on Coefficient of Variation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.998-999.873 ◽

2014 ◽

Vol 998-999 ◽

pp. 873-877

Author(s):

Zhen Bo Wang ◽

Bao Zhi Qiu

Keyword(s):

Coefficient Of Variation ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Real Data ◽

Cluster Center ◽

Data Set ◽

Fuzzy C Means ◽

Initial Cluster ◽

Fuzzy C Means Clustering ◽

The Impact

To reduce the impact of irrelevant attributes on clustering results and to increase the importance of relevant attributes, this paper proposes a fuzzy C-means clustering algorithm based on the coefficient of variation (CV-FCM). In the algorithm, the coefficient of variation is used to assign a different weight to each attribute in the data set, and the magnitude of the weight expresses the importance of that attribute to the clusters. In addition, because the fuzzy C-means clustering algorithm is sensitive to the initial cluster center values, a method for selecting the initial cluster centers based on maximum distance is introduced on top of the coefficient-of-variation weighting. Experiments on real data sets show that the algorithm selects cluster centers effectively and gives clustering results superior to those of general fuzzy C-means clustering algorithms.
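
The two ingredients named in this abstract, attribute weights from the coefficient of variation and initial centers chosen by maximum distance, can be sketched as follows. The abstract gives no formulas, so this sketch makes its own assumptions: weights are the per-attribute coefficients of variation normalised to sum to 1, the distance is a weighted Euclidean distance, and centers are picked greedily (farthest from the data mean first, then farthest from the centers already chosen). All function names and the toy data are hypothetical, and attribute means are assumed to be nonzero.

```python
import math
import statistics


def cv_weights(points):
    """Attribute weights proportional to each column's coefficient of variation."""
    columns = list(zip(*points))
    # Coefficient of variation: standard deviation divided by |mean|.
    cvs = [statistics.pstdev(col) / abs(statistics.fmean(col)) for col in columns]
    total = sum(cvs)
    return [cv / total for cv in cvs]


def weighted_distance(p, q, weights):
    """Euclidean distance with one weight per attribute."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))


def max_distance_centers(points, c, weights):
    """Greedy farthest-point choice of c initial centers."""
    mean = tuple(statistics.fmean(col) for col in zip(*points))
    # First center: the point farthest from the overall mean.
    centers = [max(points, key=lambda p: weighted_distance(p, mean, weights))]
    while len(centers) < c:
        # Next center: the point whose nearest chosen center is farthest away.
        centers.append(max(
            points,
            key=lambda p: min(weighted_distance(p, v, weights) for v in centers),
        ))
    return centers


# Toy data (hypothetical): attribute 0 varies widely, attribute 1 barely at all.
data = [(1.0, 10.0), (2.0, 10.1), (9.0, 9.9), (10.0, 10.0)]
weights = cv_weights(data)
initial_centers = max_distance_centers(data, 2, weights)
```

On this toy data nearly all the weight goes to the first attribute, so the nearly constant second attribute has little influence on the distances or on the choice of initial centers.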

MAPPING AREAS WITH TRANSMIGRATION POTENTIAL IN KARTASURA SUBDISTRICT USING THE FUZZY C-MEANS (FCM) CLUSTERING METHOD

Jurnal Teknologi Informasi dan Komunikasi (TIKomSiN) ◽

10.30646/tikomsin.v6i1.347 ◽

2018 ◽

Vol 6 (1) ◽

Author(s):

Mawar Hardiyanti ◽

Yustina Retno Wahyu Utami ◽

Wawan Laksito Yuly Saptomo

Keyword(s):

Secondary Data ◽

Well Being ◽

Fuzzy C Means ◽

Program Testing ◽

Mapping System ◽

Design Build ◽

Economic Background ◽

Fuzzy C Means Clustering ◽

Fcm Clustering ◽

The Government

In an effort to improve the well-being of Indonesia, one of the government's policies that needs to be implemented is the transmigration program. In general, the government offers the transmigration program to all members of society without knowing the economic background of the participants and their families, so the program has not been well targeted. Against this background, the research problem is how to design, build, develop, and implement Fuzzy C-Means Clustering in a regional mapping system for classifying areas with potential transmigrants in Kartasura. Data were obtained through interviews at the population administration office of Kartasura Subdistrict, observation, and a literature study. This research uses secondary data taken from the 2015 Kartasura Subdistrict report by BPS (Statistics Indonesia) of Sukoharjo Regency. The results show that the Fuzzy C-Means method can be applied to a system for mapping areas with potential transmigrants in Kartasura and can optimize the government's work in implementing the transmigration program. Cluster validation using the MPC method on alternative data criteria for 2014 and 2015 indicates that three clusters is the valid number of clusters.

Keywords: Classification, Fuzzy C-Means, Transmigration
