kmeans algorithm
Recently Published Documents


TOTAL DOCUMENTS

28
(FIVE YEARS 14)

H-INDEX

4
(FIVE YEARS 0)

2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Guozhang Li ◽  
Rayner Alfred ◽  
Xue Wang

In the 21st century, as the country's level of higher education continues to improve, enrollment at colleges and universities nationwide is rising year by year. Managing information for such a large number of students has doubled the workload and pressure on consultants at many universities. Rapid advances in computer software and hardware have, in turn, driven the informatization of universities, and the student management system is the core and foundation of the entire school education management system. This study presents a student behavior analysis model based on clustering technology and assesses the feasibility of applying the KMEANS algorithm to campus data mining. A cluster analysis algorithm divides students into groups according to their characteristics, and data analysis and association rule mining are then performed on each group. In addition, a decision tree algorithm predicts students' future development status from their historical and current data. This helps the school understand students' situations in real time, issue predictions and warnings for possible situations, provide personalized applications for teachers and students, and support management decision-making. The experimental analysis indicates that applying the clustering-based student behavior analysis model increased the efficiency of student education by 17%. The limitations of existing student behavior analysis and clustering research leave room for applying the KMEANS algorithm; the paper analyzes, discusses, and summarizes the methods and approaches to enrich academic research in this area.
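The abstract does not give implementation details, so the following is only a minimal sketch of plain K-means (Lloyd's iterations) as it might be used to group students. The function name `kmeans`, the two features, and the sample values are hypothetical and are not taken from the paper.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k randomly chosen points as initial centroids
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical per-student features: (library visits per week, study hours per day).
students = [(1.0, 1.0), (1.2, 1.2), (0.9, 0.9), (8.0, 8.0), (8.2, 8.2), (7.9, 7.9)]
labels, centroids = kmeans(students, k=2)
print(labels, centroids)
```

Each resulting group could then be passed to a separate step such as association rule mining, as the abstract describes; that step is not sketched here.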


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Enchang Sun ◽  
Hanxing Qu ◽  
Yongyi Yuan ◽  
Meng Li ◽  
Zhuwei Wang ◽  
...  

With the increasing application of unmanned aerial vehicles (UAVs), UAV-based base stations (BSs) have been widely used. In situations where there are no ground BSs, such as mountainous areas and isolated islands, or where BSs are out of service, as in disaster areas, UAV-based networks can be rapidly deployed. In this paper, we propose a framework for UAV deployment, power control, and channel allocation for device-to-device (D2D) users, which is used for underlay D2D communication in UAV-based networks. First, the number and locations of UAVs are iteratively optimized by the particle swarm optimization (PSO)-Kmeans algorithm. After UAV deployment, this study maximizes the energy efficiency (EE) of D2D pairs while ensuring the quality of service (QoS). To solve this optimization problem, the adaptive mutation salp swarm algorithm (AMSSA) is proposed, which adopts a population variation strategy, dynamic leader-follower numbers and position updates, and a Q-learning strategy. Finally, simulation results show that the PSO-Kmeans algorithm can achieve better communication quality for cellular users (CUEs) with fewer UAVs than the PSO algorithm. The AMSSA has excellent global searching ability and local mining ability; it not only outperforms other benchmark schemes but also comes closer to the optimal performance of D2D pairs in terms of EE.


2021 ◽  
Vol 15 ◽  
Author(s):  
Isaac Goicovich ◽  
Paulo Olivares ◽  
Claudio Román ◽  
Andrea Vázquez ◽  
Cyril Poupon ◽  
...  

Fiber clustering methods are typically used in brain research to study the organization of white matter bundles from large diffusion MRI tractography datasets. These methods enable exploratory bundle inspection using visualization and other methods that require identifying brain white matter structures in individuals or a population. Some applications, such as real-time visualization and inter-subject clustering, need fast and high-quality intra-subject clustering algorithms. This work proposes a parallel algorithm using a General Purpose Graphics Processing Unit (GPGPU) for fiber clustering based on the FFClust algorithm. The proposed GPGPU implementation exploits data parallelism using both multicore and GPU fine-grained parallelism present in commodity architectures, including current laptops and desktop computers. Our approach implements all FFClust steps in parallel, improving execution times in all of them. In addition, our parallel approach includes a parallel Kmeans++ algorithm implementation and defines a new variant of Kmeans++ to reduce the impact of choosing outliers as initial centroids. The results show that our approach provides clustering quality results very similar to FFClust, and it requires an execution time of 3.5 s for processing about a million fibers, achieving a speedup of 11.5 times compared to FFClust.
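For readers unfamiliar with Kmeans++, the sketch below shows the standard sequential seeding step, in which each new centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen. It is not the parallel GPGPU implementation or the outlier-resistant variant proposed in the paper; the function names are hypothetical, and the code assumes the input contains at least k distinct points.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, seed=0):
    """Standard k-means++ seeding (assumes at least k distinct points)."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]  # first centroid: uniform at random
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        # Next centroid: sampled with probability proportional to d2, so
        # far-away points are favored and already-chosen points have weight 0.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids

seeds = kmeans_pp_init([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)], k=2)
print(seeds)
```

Because isolated points have large squared distances, this standard seeding can pick outliers as initial centroids, which is the issue the paper's variant is said to address.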


Author(s):  
V. R. Elangovan ◽  
A. J. Rajeswari Joe ◽  
D. Akila ◽  
K. Hema Shankari ◽  
G. Suseendran

Author(s):  
Ruddy Cahyanto ◽  
Antonius Rachmat Chrismanto ◽  
Danny Sebastian

Clustering is a data mining technique that groups data sets into clusters of similar data. One of the algorithms commonly used for clustering is K-Means. However, the K-Means algorithm has several weaknesses, one of which is the random factor in initial centroid selection, so the cluster results are inconsistent even when tested with exactly the same data. The Modified K-Means algorithm focuses on selecting the initial centroids to overcome the inconsistency of cluster results in the K-Means method. The test was conducted using the sentipol dataset and focused only on comment data. The number of clusters was set to 3, based on the number of existing comment labels (positive, negative, and neutral). The test results show that the Modified K-Means algorithm produces a better purity value than the K-Means algorithm: Modified K-Means produces an average purity of 0.42, while K-Means produces an average purity of 0.391. In tests of the random factor, conducted 5 times with the same attributes and test data, the cluster results of the Modified K-Means algorithm did not change, so the resulting purity value was also the same. With the K-Means algorithm, by contrast, the cluster results changed in each test, so the purity value is also likely to change.
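The abstract reports purity without defining it. Under the usual definition, each cluster is credited with the count of its most frequent true label, and the total is divided by the number of items. The sketch below assumes that definition; the function name and the toy labels are hypothetical, not the sentipol data.

```python
from collections import Counter

def purity(cluster_labels, true_labels):
    """Fraction of items that carry the majority true label of their cluster."""
    per_cluster = {}
    for c, t in zip(cluster_labels, true_labels):
        per_cluster.setdefault(c, Counter())[t] += 1
    # Sum the size of the majority class in every cluster.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(true_labels)

# Toy example with 3 clusters and the three comment labels named in the abstract.
clusters = [0, 0, 0, 1, 1, 2]
truth = ["positive", "positive", "negative", "negative", "negative", "neutral"]
print(purity(clusters, truth))  # majority counts 2 + 2 + 1 over 6 items, i.e. 5/6
```

Purity ranges up to 1.0 for clusters that each contain a single label, so the reported averages of 0.42 and 0.391 indicate heavily mixed clusters under this definition.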


2020 ◽  
Vol 39 (5) ◽  
pp. 6993-7004
Author(s):  
Lu Han ◽  
Zhi Su ◽  
Jing Lin

An ever-increasing number of ordinal variables is being collected by the Personal Credit Reference System in China; however, the system struggles to analyze this kind of data, for which Euclidean distance cannot be calculated. In this study, we put forward a hybrid KNN algorithm based on the Sugeno measure and prove that the error of this algorithm is smaller than that of Euclidean distance. Furthermore, we use real data obtained from the Personal Credit Reference System to perform experiments and obtain the user's initial portrait. Through comparisons with the Kmeans algorithm and with other distance measures in the KNN algorithm, we find that the hybrid KNN algorithm is more suitable for clustering personal credit data.


2020 ◽  
Vol 7 (2) ◽  
pp. 391
Author(s):  
Issa Arwani

Data clustering in a DBMS is more efficient when it is performed directly inside the DBMS itself, because the DBMS supports good data management. SQL-Kmeans is a method that has previously been used to integrate the K-means clustering algorithm into a DBMS using SQL. However, this method inherits the weaknesses of the K-means algorithm itself: the many iterations needed to reach convergence and suboptimal clustering accuracy, both resulting from random initialization of the initial centroids. The Median Initial Centroid (MIC)-Kmeans algorithm is a development of the K-means algorithm that can provide an optimal solution for determining the initial centroids, which affects both accuracy and the number of iterations. Given these advantages, MIC-Kmeans was chosen in this study as an alternative algorithm to be integrated into data clustering performed directly in the DBMS using SQL. The integration process comprises 4 stages: initializing the dataset table, mapping the MIC-Kmeans algorithm onto SQL and the dataset table, designing the SQL for each mapping result, and implementing the SQL design in a MySQL stored procedure. The test results show that the SQL MIC-Kmeans method reduces the number of iterations by 43% and the time required to reach convergence by 39% compared with the SQL-Kmeans method. In addition, the average silhouette coefficient of the SQL MIC-Kmeans method is 0.79, which falls in the strong structure category (values from 0.7 to 1), while the average silhouette coefficient of the SQL-Kmeans method is 0.68, which falls in the medium structure category (values from 0.5 to 0.7).
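The silhouette coefficient used in the comparison above can be illustrated with a short sketch. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the score is (b - a) / max(a, b). The code below follows this common definition with Euclidean distance; the function name and data are hypothetical, it is written in Python rather than the SQL stored procedures used in the paper, and it assumes at least two clusters and Python 3.8 or later for `math.dist`.

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette coefficient (Euclidean distance, at least 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, so dividing by len - 1 is correct).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

score = mean_silhouette([(0.0,), (1.0,), (10.0,), (11.0,)], [0, 0, 1, 1])
print(round(score, 3))
```

For these two tight, well-separated toy clusters the average is close to 0.9, which would fall in the strong structure range (0.7 to 1) quoted in the abstract.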

