Association rule mining based fuzzy manta ray foraging optimization algorithm for frequent itemset generation from social media

Author(s):  
N. Lakshmi ◽  
M. Krishnamurthy

2020 ◽  
Vol 54 (3) ◽  
pp. 365-382
Author(s):  
Praveen Kumar Gopagoni ◽  
Mohan Rao S K

Purpose: Association rule mining generates patterns and correlations from a database, but it requires long scanning times, and the computational cost of generating the rules is high. In addition, the candidate rules generated by traditional association rule mining pose a major challenge in terms of time and space, and the process is lengthy. To tackle these issues in existing methods and to produce privacy rules, the paper proposes grid-based privacy association rule mining.

Design/methodology/approach: The primary aim of the research is to design and develop a distributed elephant herding optimization (EHO) for grid-based privacy association rule mining from the database. Rule generation proceeds in two steps. In the first step, the rules are generated using the apriori algorithm, an effective association rule mining algorithm. Conventionally, association rules are extracted from the input database on the basis of support and confidence; in the proposed model, these are replaced with holo-entropy and probability-based confidence. In the second step, the generated rules are passed to the grid-based privacy rule mining, which produces privacy-dependent rules based on a novel optimization algorithm and grid-based fitness. The novel optimization algorithm is developed by integrating the distributed concept into the EHO algorithm.

Findings: The method was evaluated on the retail, chess, T10I4D100K and T40I10D100K databases taken from the Frequent Itemset Mining Dataset Repository to demonstrate the effectiveness of the distributed grid-based privacy association rule mining. The proposed method outperformed the existing methods by offering a higher degree of privacy and utility; moreover, the distributed nature of the association rule mining facilitates parallel processing and generates the privacy rules without much computational burden. The rate of hiding capacity, the rate of information preservation and the rate of false rules generated for the proposed method are 0.4468, 0.4488 and 0.0654, respectively, which are better than those of the existing rule mining methods.

Originality/value: Data mining is performed in a distributed manner through grids that subdivide the input data, and the rules are framed using apriori-based association mining, a modification of the standard apriori algorithm in which holo-entropy and probability-based confidence replace support and confidence. The mined rules do not assure privacy; hence, grid-based privacy rules are employed that use adaptive elephant herding optimization (AEHO) to generate the privacy rules. The AEHO adds an adaptive nature to the standard EHO, which renders the global optimal solution.



2021 ◽  
Author(s):  
Erna Hikmawati ◽  
Nur Ulfa Maulidevi ◽  
Kridanto Surendro

The process of extracting data to obtain useful information is known as data mining. One of the most promising and widely used techniques for this extraction process is association rule mining. This technique is used to identify interesting relationships between sets of items in a dataset and to predict associative behavior for new data. The first step in association rule mining is the determination of the frequent itemset that will be involved in the rule formation process. In this step, a threshold, known as the minimum support, is used to eliminate items that do not belong in the frequent itemset. The threshold also plays an important role in determining the number of rules generated, and setting it wrongly can cause association rule mining to fail to obtain any rules. Currently, the minimum support value is determined by the user, which is a challenge that becomes worse for users unfamiliar with the characteristics of the dataset. In this study, a method was proposed to determine the minimum support value based on the characteristics of the dataset. This required certain criteria to be used as thresholds, which led to more adaptive rules according to the needs of the user. The results showed that 6 of the 8 datasets yielded a rule with lift ratio > 1 using the minimum threshold value determined through this method.
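The role of the minimum support threshold and the lift ratio described above can be illustrated with a minimal sketch. It is not the method proposed in the study: the toy transactions, the function names (`support`, `frequent_itemsets`, `lift`) and the brute-force enumeration are all assumptions made for illustration, using only the standard definitions of support and lift.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets with support >= min_support.

    Illustrative only: real miners such as apriori prune candidates
    instead of enumerating every combination.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            s = support(combo, transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result


def lift(antecedent, consequent, transactions):
    """Lift of the rule antecedent -> consequent (standard definition)."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / (
        support(antecedent, transactions) * support(consequent, transactions)
    )


# Hypothetical toy dataset, not taken from the study.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"tea"}),
]

# A moderate threshold keeps some itemsets; a threshold that is too high
# leaves nothing to build rules from.
moderate = frequent_itemsets(transactions, min_support=0.4)
too_high = frequent_itemsets(transactions, min_support=0.7)
print(len(moderate), len(too_high))
print(lift({"bread"}, {"butter"}, transactions))
```

On this toy data, raising the threshold from 0.4 to 0.7 removes every itemset, which mirrors the failure mode the abstract attributes to a badly chosen minimum support. A lift above 1 for a rule indicates that its two sides co-occur more often than independence would predict.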



2022 ◽  
Vol 1 ◽  
Author(s):  
Agostinetto Giulia ◽  
Sandionigi Anna ◽  
Bruno Antonia ◽  
Pescini Dario ◽  
Casiraghi Maurizio

Boosted by the exponential growth of microbiome-based studies, analyzing microbiome patterns is now a hot topic with many fields of application. In particular, the use of machine learning techniques is increasing in microbiome studies, providing deep insights into microbial community composition. In this context, in order to investigate microbial patterns from 16S rRNA metabarcoding data, we explored the effectiveness of the Association Rule Mining (ARM) technique, a supervised-machine learning procedure, to extract patterns (in this work, intended as groups of species or taxa) from microbiome data. ARM can generate huge amounts of data, making it challenging to remove spurious information and visualize results. Our work sheds light on the strengths and weaknesses of the pattern mining strategy in the study of microbial patterns, in particular from 16S rRNA microbiome datasets, applying ARM to real case studies and providing guidelines for future usage. Our results highlighted issues related to the type of input and the use of metadata in microbial pattern extraction, identifying the key steps that must be considered to apply ARM consciously to 16S rRNA microbiome data. To promote the use of ARM and the visualization of microbiome patterns, we developed microFIM (microbial Frequent Itemset Mining), a versatile Python tool that facilitates the use of ARM by integrating common microbiome outputs, such as taxa tables. microFIM implements interest measures to remove spurious information and merges the results of ARM analysis with the common microbiome outputs, providing similar microbiome strategies that help scientists to integrate ARM in microbiome applications. With this work, we aimed to create a bridge between microbial ecology researchers and the ARM technique, making researchers aware of the strengths and weaknesses of the association rule mining approach.



2021 ◽  
Vol 2 (2) ◽  
pp. 3-21
Author(s):  
Yassine Drias ◽  
Habiba Drias

This article presents a data mining study carried out on social media users in the context of COVID-19 and offers four main contributions. The first is the construction of a COVID-19 dataset composed of tweets posted by users during the first stages of the virus propagation. The second offers a sample of the interactions between users on topics related to the pandemic. The third is a sentiment analysis, which explores the evolution of emotions over time, while the fourth is an association rule mining task. The indicators determined by statistics and the results obtained from sentiment analysis and association rule mining are eloquent. For instance, signs of an upcoming worldwide economic crisis were clearly detected at an early stage in this study. Overall, the results are promising and can be exploited in the prediction of the aftermath of COVID-19 and similar crises in the future.



2021 ◽  
Vol 11 (14) ◽  
pp. 6512
Author(s):  
Jonathan Ayebakuro Orama ◽  
Joan Borràs ◽  
Antonio Moreno

Tourists who visit a city for the first time may find it difficult to decide on places to visit, as the amount of information on the Web about cultural and leisure activities may be large. Recommender systems address this problem by suggesting the points of interest that best fit the user’s preferences. This paper presents a novel recommender system that leverages tweets to build user profiles, taking into account not only their personal preferences but also their travel habits. Association rules, which are mined from the previous visits of users documented on Twitter, are used to make the final recommendations of places to visit. The system has been applied to data from the city of Barcelona, and the results show that the use of the social media-based clustering procedure increases its performance according to several relevant metrics.


