An evolutionary algorithm for clustering data streams with a variable number of clusters

2017 ◽  
Vol 67 ◽  
pp. 228-238 ◽  
Author(s):  
Jonathan de Andrade Silva ◽  
Eduardo Raul Hruschka ◽  
João Gama
Author(s):  
ASHFAQUR RAHMAN ◽  
BRIJESH VERMA

This paper presents an algorithm to generate an ensemble classifier by joint optimization of accuracy and diversity. For an ensemble classifier to be more accurate, its base classifiers are expected to be both accurate and diverse (i.e., complementary in terms of errors). We adopt a multi-objective evolutionary algorithm (MOEA) for joint optimization of accuracy and diversity on our recently developed nonuniform layered cluster oriented ensemble classifier (NULCOEC). In NULCOEC, the data set is partitioned into a variable number of clusters at different layers. Base classifiers are then trained on the clusters at different layers. The performance of NULCOEC is a function of the vector of the number of layers and clusters. The research presented in this paper investigates the implications of applying an MOEA to generate NULCOEC. The accuracy and diversity of the ensemble classifier are expressed as functions of layers and clusters. An MOEA then searches for the combinations of layers and clusters that yield the nondominated set of (accuracy, diversity) pairs. We have obtained the results of single-objective optimization (i.e., optimizing either accuracy or diversity) and compared them with the results of the MOEA on sixteen UCI data sets. The results show that the MOEA can improve the performance of the ensemble classifier.
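
The notion of a nondominated set of (accuracy, diversity) pairs can be illustrated with a minimal sketch. It assumes both objectives are maximized; the function names and the candidate scores are hypothetical and are not taken from the paper, and the sketch shows only the dominance filter, not the evolutionary search itself.

```python
from typing import List, Tuple

# Each candidate is scored as (accuracy, diversity); both are assumed to be maximized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def nondominated(points: List[Point]) -> List[Point]:
    """Keep every point that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical scores for four layer/cluster configurations (illustrative values only).
candidates = [(0.90, 0.10), (0.85, 0.30), (0.80, 0.20), (0.70, 0.40)]
front = nondominated(candidates)
```

Here `(0.80, 0.20)` is dropped because `(0.85, 0.30)` is better on both objectives, while the other three candidates each represent a different trade-off and remain in the front.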


2022 ◽  
Vol 10 (4) ◽  
pp. 583-593
Author(s):  
Syiva Multi Fani ◽  
Rukun Santoso ◽  
Suparti Suparti

Social media is computer-based technology that facilitates the sharing of ideas, thoughts, and information through the building of virtual networks and communities. Twitter is one of the most popular social media platforms in Indonesia, with 78 million users. Businesses rely heavily on Twitter for advertising. By knowing which types of tweet content their followers retweet most, businesses can use that content as a means of advertising to Twitter users. In this study, text mining is applied to cluster tweets from the @bliblidotcom Twitter account using the K-means method, with the best number of clusters obtained from the Silhouette Coefficient method, in order to determine the types of tweet content that are most retweeted by @bliblidotcom followers. Tweets with the most retweets and favorites are discount offers and flash sales, so Blibli Indonesia could use this kind of tweet for advertising on Twitter; prize quiz tweets are also liked by followers of the @bliblidotcom account.
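
Choosing the number of clusters for K-means by the Silhouette Coefficient can be sketched as follows. This is a toy illustration under stated assumptions, not the study's pipeline: it uses hypothetical 2-D points instead of vectorized tweet text, a deterministic farthest-point initialization chosen only to make the example reproducible, and hypothetical function names.

```python
import math
from typing import Dict, List, Tuple

Vec = Tuple[float, ...]


def kmeans(points: List[Vec], k: int, iters: int = 50) -> List[int]:
    """Plain K-means; returns one cluster label per point."""
    # Assumption: deterministic farthest-point initialization (not from the study).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def mean_silhouette(points: List[Vec], labels: List[int]) -> float:
    """Mean silhouette over all points; singleton clusters contribute 0."""
    clusters: Dict[int, List[Vec]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical data: three well-separated groups of three points each.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
best_labels = kmeans(points, best_k)
```

On this toy data the highest mean silhouette is reached at three clusters, matching the three groups built into the points. With real tweet data the points would first have to be turned into numeric vectors, a step the sketch leaves out.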


2013 ◽  
Vol 462-463 ◽  
pp. 438-442
Author(s):  
Ming Gu

A neural network with quadratic junctions is described, and its structure, properties, and unsupervised learning rules are discussed. An ART-based hierarchical clustering algorithm using this kind of neural network is proposed. The algorithm can both determine the number of clusters and cluster the data. A 2-D artificial data set is used to illustrate the proposed algorithm and to compare its effectiveness with that of the K-means algorithm.


2002 ◽  
Vol 3 (2) ◽  
pp. 23-27 ◽  
Author(s):  
Daniel Barbará