Clustering Penerima Beasiswa Yayasan Untuk Mahasiswa Menggunakan Metode K-Means [Clustering of Foundation Scholarship Recipients for Students Using the K-Means Method]

Milk is an important part of the diet for meeting nutritional needs, for both children and adults. Indonesia has many fresh-milk producers, but their output does not meet national demand. Data mining is a field of computer science that is widely used in research, and one of its techniques is clustering, a method for grouping data. Clustering tends to work better with larger amounts of data. The data used are provincial data for Indonesia from 2000 to 2017, obtained from the Central Statistics Agency. The study groups provinces into two clusters of milk production: high-producing and low-producing regions. From 27 records of fresh milk production in Indonesia, two provinces fall into the high-level cluster: West Java and East Java. The other 25, along with 7 provinces that were not included in the K-Means Clustering Algorithm calculation, fall into the low-level cluster.

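DATA MINING DALAM PENGELOMPOKAN JENIS DAN JUMLAH PEMBAGIAN ZAKAT DENGAN MENGGUNAKAN METODE CLUSTERING K-MEANS (STUDI KASUS: BADAN AMIL ZAKAT KOTA BENGKULU) [Data Mining for Grouping the Types and Amounts of Zakat Distribution Using the K-Means Clustering Method (Case Study: Badan Amil Zakat Kota Bengkulu)]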

JURNAL TEKNOLOGI INFORMASI ◽

10.36294/jurti.v1i2.298 ◽

2018 ◽

Vol 1 (2) ◽

pp. 211

Author(s):

Prahasti Prahasti

Keyword(s):

Data Mining ◽

Data Processing ◽

Clustering Algorithm ◽

Test Results ◽

Clustering Method ◽

Center Point

Abstract - This research applies data mining to group the types and recipients of zakat. It uses the k-means clustering algorithm, with the input data grouped by education and type of work in the distribution of zakat. Clusters are formed by using centroid values to find the center point closest to each data item. In k-means clustering, processing stops at the iteration in which the data no longer change cluster (fixed data). Testing was done with an experiment in the RapidMiner software using the k-means clustering method, which consists of input units, data processing units, and output units, with k-means clustering grouping data 1-2-1-1, 1-2-1-2, and 3-4-3-4. The tests show that the distribution of zakat differs from cluster to cluster. The test results are displayed in a scatter graph. Keywords - Data Mining, K-Means Clustering, Zakat
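
The loop this abstract describes (assign each item to its nearest centroid, recompute the centroids, stop once no item changes cluster) can be sketched in a few lines of Python. This is a generic illustration, not the RapidMiner workflow used in the study; the function name `kmeans`, the toy points, and the starting centroids are all assumptions made for the example.

```python
import math

def kmeans(points, initial_centroids, max_iter=100):
    """Plain k-means: stop when no point changes cluster (hypothetical helper)."""
    centroids = [tuple(c) for c in initial_centroids]  # copy, do not mutate input
    assignments = None
    for _ in range(max_iter):
        # Assign each point to the nearest centroid (Euclidean distance).
        new_assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # memberships are fixed, so the centroids are fixed too
        assignments = new_assignments
        # Recompute each centroid as the mean of its current members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids

# Toy data (made up for illustration): two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centers = kmeans(points, [(1, 1), (8, 8)])
```

With real data the starting centroids are usually chosen at random or by a seeding heuristic, and the result can depend on that choice.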

Dengue Disease Detection using K- Means, Hierarchical, Kohonen- SOM Clustering

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j9066.0881019 ◽

2019 ◽

Vol 8 (10) ◽

pp. 904-907

Keyword(s):

Data Mining ◽

Clustering Algorithm ◽

Research Work ◽

Data Mining Algorithm ◽

Clustering Method ◽

Dengue Disease ◽

Related Data ◽

Som Algorithm ◽

Som Clustering ◽

Kohonen Som

Data mining is the process of extracting useful, new information from existing databases; it concerns both the procedure of mining facts from data and the kinds of patterns that can be mined. This proposed work aims to detect and categorize the illness of people affected by dengue using data mining techniques, mainly clustering. Clustering finds groups of related data in a dataset and splits the data into sub-classes. In this research, clustering is used to categorize the age groups of people affected by this mosquito-borne viral infection, with the K-Means, Hierarchical Clustering, and Kohonen-SOM algorithms implemented in the Tanagra tool. Scientists use data mining algorithms to prevent and defend against diseases such as dengue. This paper applies these algorithms to the clustering of dengue fever cases in Tanagra to find which of them gives the best results.

An effective and efficient hierarchical K-means clustering algorithm

International Journal of Distributed Sensor Networks ◽

10.1177/1550147717728627 ◽

2017 ◽

Vol 13 (8) ◽

pp. 155014771772862 ◽

Cited By ~ 8

Author(s):

Jianpeng Qi ◽

Yanwei Yu ◽

Lihong Wang ◽

Jinglei Liu ◽

Yingjie Wang

Keyword(s):

Data Mining ◽

Clustering Algorithm ◽

Hierarchical Optimization ◽

Clustering Method ◽

Number Of Clusters ◽

Computation Cost ◽

Optimization Principle ◽

Pruning Strategy ◽

Efficiency And Effectiveness ◽

Synthetic Datasets

K-means plays an important role in many fields of data mining. However, k-means is often sensitive to its random seed selection. Motivated by this, this article proposes an optimized k-means clustering method, named k*-means, along with three optimization principles. First, we propose a hierarchical optimization principle initialized with k* seeds ([Formula: see text]) to reduce the risk of random seed selection, and then use the proposed “top-n nearest clusters merging” to merge the nearest clusters in each round until the number of clusters reaches [Formula: see text]. Second, we propose an “optimized update principle” that updates incrementally from the moved points instead of recalculating the mean and [Formula: see text] of each cluster in every k-means iteration, to minimize computation cost. Third, we propose a “cluster pruning strategy” to improve the efficiency of k-means; it omits the farther clusters to shrink the adjustable space in each iteration. Experiments on real UCI and synthetic datasets verify the efficiency and effectiveness of the proposed algorithm.
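
The idea behind updating from moved points, rather than recomputing each mean from scratch, can be illustrated with the standard running-mean identity. The abstract does not give the paper's exact update rule, so the sketch below is an assumption about the general technique, and the names `add_point` and `remove_point` are hypothetical.

```python
import math

def add_point(mean, n, x):
    """Mean and size of a cluster of n points after point x joins it."""
    new_mean = tuple((n * m + xi) / (n + 1) for m, xi in zip(mean, x))
    return new_mean, n + 1

def remove_point(mean, n, x):
    """Mean and size of a cluster of n points after point x leaves it."""
    if n <= 1:
        raise ValueError("cannot remove the last point of a cluster")
    new_mean = tuple((n * m - xi) / (n - 1) for m, xi in zip(mean, x))
    return new_mean, n - 1

# Toy example (made up): (4, 4) moves from cluster A to cluster B.
# A = {(0, 0), (2, 2), (4, 4)} has mean (2, 2); B = {(6, 6), (8, 8)} has mean (7, 7).
mean_a, size_a = remove_point((2.0, 2.0), 3, (4, 4))
mean_b, size_b = add_point((7.0, 7.0), 2, (4, 4))
```

Each update costs time proportional to the number of dimensions, regardless of how many points the cluster holds, which is where the saving over a full recomputation comes from.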

Computational analysis of incremental clustering approaches for Large Data

International Journal of Computers and Communications ◽

10.46300/91013.2021.15.3 ◽

2021 ◽

Vol 15 ◽

pp. 14-18

Author(s):

Arun Pratap Singh Kushwah ◽

Shailesh Jaloree ◽

Ramjeevan Singh Thakur

Keyword(s):

Data Mining ◽

Clustering Algorithm ◽

Computational Analysis ◽

Large Data ◽

Distance Functions ◽

Spatial Density ◽

Incremental Clustering ◽

Clustering Method ◽

Density Method ◽

Incremental Approach

Clustering is a data mining approach that helps find the underlying hidden structure in a dataset. K-means is a clustering method that uses distance functions to find similarities or dissimilarities between instances. DBSCAN is a clustering algorithm that discovers clusters of arbitrary shape and size in huge volumes of data using a spatial density method. These two approaches are classical methods for efficient clustering, but they underperform when the data in the database is updated frequently, so incremental (gradual) clustering approaches are preferred in this environment. In this paper, an incremental approach to clustering is introduced that uses K-means and DBSCAN to handle new data added dynamically to the database at intervals.

Application of K-Means Clustering Algorithm for Determination of Fire-Prone Areas Utilizing Hotspots in West Kalimantan Province

International Journal of Advances in Data and Information Systems ◽

10.25008/ijadis.v1i1.7 ◽

2020 ◽

Vol 1 (1) ◽

pp. 9-16

Author(s):

Nabila Amalia Khairani ◽

Edi Sutoyo

Keyword(s):

Data Mining ◽

Forest Fires ◽

Clustering Algorithm ◽

Social Aspects ◽

Mining Method ◽

Clustering Method ◽

West Kalimantan ◽

A Value ◽

The Impact

Forest and land fires are disasters that occur often in Indonesia. In 2007, 2012, and 2015, forest fires in Sumatra and Kalimantan attracted global attention because they brought smog pollution to neighboring countries. One of the regions with the most fire hotspots is West Kalimantan Province. Forest and land fires affect health, especially in the communities around the fires, as well as economic and social life. One way to address this is to identify where the fires are located and analyze their causes. Given this impact, the purpose of this study is to apply clustering with the k-means algorithm to determine the hotspot-prone areas in West Kalimantan Province, and to evaluate the resulting clusters. Data mining is a suitable way to obtain information on hotspot areas. The data mining method used is clustering, because it can turn hotspot data into information about which areas are prone to hotspots. The k-means algorithm groups data based on similar characteristics. The hotspot data are grouped into 3 clusters: cluster 0 contains 284 hotspots in hazardous areas, 215 hotspots fall in non-prone areas, and 129 fall in very vulnerable areas. The clustering results were then evaluated using the Davies-Bouldin Index (DBI), which gave a value of 3.112, indicating that the 3-cluster result was not optimal.
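
For readers unfamiliar with the Davies-Bouldin Index, the sketch below computes it in its usual form: for each cluster, take the worst ratio of combined within-cluster scatter to centroid separation against any other cluster, then average over clusters. Lower values mean tighter, better-separated clusters. The function name and the toy clusters are assumptions for illustration only; the hotspot data from the study are not reproduced here.

```python
import math

def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters, each a list of points."""
    centroids = [
        tuple(sum(col) / len(pts) for col in zip(*pts)) for pts in clusters
    ]
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = [
        sum(math.dist(p, c) for p in pts) / len(pts)
        for pts, c in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k

# Toy clusters (made up): the same shapes, far apart versus close together.
separated = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
crowded = [[(0, 0), (0, 2)], [(1, 0), (1, 2)]]
dbi_separated = davies_bouldin(separated)
dbi_crowded = davies_bouldin(crowded)
```

On these toy inputs the well-separated pair scores far lower than the crowded pair, which matches the reading in the abstract that a comparatively high DBI signals a poor clustering. Libraries such as scikit-learn ship their own implementation (`sklearn.metrics.davies_bouldin_score`).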

Analysis of Data Mining Using K-Means Clustering Algorithm for Product Grouping

IJIIS: International Journal of Informatics and Information Systems ◽

10.47738/ijiis.v3i1.3 ◽

2020 ◽

Vol 3 (1) ◽

pp. 12-22

Author(s):

Mohammad Imron ◽

Uswatun Hasanah ◽

Bahrul Humaidi

Keyword(s):

Data Mining ◽

Data Processing ◽

Clustering Algorithm ◽

Processing Technology ◽

Clustering Method ◽

Data Mining Techniques ◽

Sales Data ◽

Using Data

Rizki Barokah Store sells a variety of daily necessities such as food, drinks, snacks, and toiletries. However, the store often suffers from a build-up of product stock that leads to products expiring, caused by errors in stock decisions. In addition, although a large amount of sales data is stored in its database, the store has not mined or grouped the data to learn the potential of its products, even though such data processing can already be done with data mining techniques. To overcome these problems, this research uses the clustering method with the K-means algorithm. Its purpose is to group products into those in high demand and those in low demand, and to advise on product stock.

Application of K-Means Clustering Algorithm for Determination of Fire-Prone Areas Utilizing Hotspots in West Kalimantan Province

International Journal of Advances in Data and Information Systems ◽

10.25008/ijadis.v1i1.13 ◽

2020 ◽

Vol 1 (1) ◽

pp. 9-16 ◽

Cited By ~ 1

Author(s):

Nabila Amalia Khairani ◽

Edi Sutoyo

Keyword(s):

Data Mining ◽

Forest Fires ◽

Clustering Algorithm ◽

Social Aspects ◽

Mining Method ◽

Clustering Method ◽

West Kalimantan ◽

A Value ◽

The Impact

Forest and land fires are disasters that occur often in Indonesia. In 2007, 2012, and 2015, forest fires in Sumatra and Kalimantan attracted global attention because they brought smog pollution to neighboring countries. One of the regions with the most fire hotspots is West Kalimantan Province. Forest and land fires affect health, especially in the communities around the fires, as well as economic and social life. One way to address this is to identify where the fires are located and analyze their causes. Given this impact, the purpose of this study is to apply clustering with the k-means algorithm to determine the hotspot-prone areas in West Kalimantan Province, and to evaluate the resulting clusters. Data mining is a suitable way to obtain information on hotspot areas. The data mining method used is clustering, because it can turn hotspot data into information about which areas are prone to hotspots. The k-means algorithm groups data based on similar characteristics. The hotspot data are grouped into 3 clusters: cluster 0 contains 284 hotspots in hazardous areas, 215 hotspots fall in non-prone areas, and 129 fall in very vulnerable areas. The clustering results were then evaluated using the Davies-Bouldin Index (DBI), which gave a value of 3.112, indicating that the 3-cluster result was not optimal.

K-MEANS CLUSTERING ALGORITHM FOR SERVICE DATA ANALYSIS BASED ON CUSTOMERS COMBINATION

Unes journal of Information System ◽

10.31933/ujis.3.1.001-007.2018 ◽

2018 ◽

Vol 3 (1) ◽

pp. 001

Author(s):

Zulhendra Zulhendra ◽

Gunadi Widi Nurcahyo ◽

Julius Santony

Keyword(s):

Data Mining ◽

Data Analysis ◽

Clustering Algorithm ◽

Customer Complaints ◽

Using Data ◽

Clustering Data ◽

Service Data ◽

Selection Of

This study uses a data mining method, K-Means Clustering. Data mining can be used to analyze fairly large datasets; here the aim is to enable Indocomputer to understand and classify service data based on customer complaints, using the Weka software. The K-Means Clustering algorithm is used to predict or classify complaints about hardware damage at Indocomputer Payakumbuh. It can also identify the laptop brands most often serviced at Indocomputer Payakumbuh, as a recommendation to consumers when selecting a laptop.

A hierarchical clustering method for random intervals based on a similarity measure

Computational Statistics ◽

10.1007/s00180-021-01121-3 ◽

2021 ◽

Author(s):

Ana Belén Ramos-Guajardo

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Real Life ◽

Stopping Criterion ◽

Clustering Method ◽

Bootstrap Test ◽

Empirical Performance ◽

Random Intervals ◽

Expected Values

Abstract: A new clustering method for random intervals that are measured in the same units over the same group of individuals is provided. It takes into account the similarity degree between the expected values of the random intervals that can be analyzed by means of a two-sample similarity bootstrap test. Thus, the expectations of each pair of random intervals are compared through that test and a p-value matrix is finally obtained. The suggested clustering algorithm considers such a matrix where each p-value can be seen at the same time as a kind of similarity between the random intervals. The algorithm is iterative and includes an objective stopping criterion that leads to statistically similar clusters that are different from each other. Some simulations to show the empirical performance of the proposal are developed and the approach is applied to two real-life situations.
