Ant Miner

2020 ◽  
Vol 11 (2) ◽  
pp. 47-64
Author(s):  
Bijaya Kumar Nanda ◽  
Satchidananda Dehuri

Discovering classification rules from large data is an important task of data mining and is gaining considerable attention. This article presents a novel ant miner for classification rule mining. Our ant miner is inspired by research on the behavior of real ant colonies, simulated annealing, and some data mining concepts and principles. Here we present a Michigan style approach for single objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from the UCI repository. Our experimental outcomes confirm that ant miner-HMC (Hybrid Michigan Style Classification) is significantly better than ant-miner-MC (Michigan Style Classification).

Author(s):  
Bijaya Kumar Nanda ◽  
Satchidananda Dehuri

In data mining, extracting classification rules from large data is an important task that is gaining considerable attention. This article presents a novel ant miner for classification rule mining. The ant miner is inspired by research on the behaviour of real ant colonies, simulated annealing, and some data mining concepts and principles. This paper presents a Pittsburgh style approach for single objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from the UCI repository. The experimental outcomes confirm that ant miner-HPB (Hybrid Pittsburgh Style Classification) is significantly better than ant-miner-PB (Pittsburgh Style Classification).


Author(s):  
Juan Luis Olmo ◽  
José Raúl Romero ◽  
Sebastián Ventura

Ant programming is a kind of automatic programming that generates computer programs by using the ant colony metaheuristic as the search technique. It has demonstrated good generalization ability for the extraction of comprehensible classifiers. To date, three ant programming algorithms for classification rule mining have been proposed in the literature: two of them are devoted to regular classification, differing mainly in the optimization approach, single-objective or multi-objective, while the third one is focused on imbalanced domains. This chapter collects these algorithms, presenting different experimental studies that confirm the aptitude of this metaheuristic to address this data-mining task.


Data Mining ◽  
2011 ◽  
pp. 191-208 ◽  
Author(s):  
Rafael S. Parpinelli ◽  
Heitor S. Lopes ◽  
Alex A. Freitas

This work proposes an algorithm for rule discovery called Ant-Miner (Ant Colony-Based Data Miner). The goal of Ant-Miner is to extract classification rules from data. The algorithm is based on recent research on the behavior of real ant colonies as well as on some data mining concepts. We compare the performance of Ant-Miner with the performance of the well-known C4.5 algorithm on six public domain data sets. The results provide evidence that: (a) Ant-Miner is competitive with C4.5 with respect to predictive accuracy; and (b) the rule sets discovered by Ant-Miner are simpler (smaller) than the rule sets discovered by C4.5.


Author(s):  
Dipankar Dutta ◽  
Jaya Sil

Classification is one of the most studied areas of data mining; it produces classification rules during training or learning. Classification rule mining, an important data-mining task, extracts significant rules for the classification of objects. In this chapter, class-specific rules are represented in IF <Antecedent> THEN <Consequent> form. With the popularity of soft computing methods, researchers have explored different soft computing tools for rule discovery. The genetic algorithm (GA) is one such tool. Over time, new GA techniques for forming classification rules have been invented. In this chapter, the authors focus on understanding the evolution of GA in classification rule mining to obtain an optimal rule set that builds an efficient classifier.
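The IF <Antecedent> THEN <Consequent> form mentioned above can be illustrated with a minimal sketch. This is not the chapter's GA encoding: the `Rule` class, the `classify` helper, and the weather-style attributes are hypothetical names chosen only for illustration, and the antecedent is assumed to be a conjunction of attribute-value equality tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """IF <antecedent> THEN <consequent>; hypothetical illustration only."""
    antecedent: tuple  # (attribute, value) pairs that must all hold
    consequent: str    # class label predicted when the antecedent holds

    def covers(self, record):
        # A record is covered when every condition in the antecedent matches.
        return all(record.get(attr) == val for attr, val in self.antecedent)


def classify(rules, record, default=None):
    """Return the consequent of the first rule that covers the record."""
    for rule in rules:
        if rule.covers(record):
            return rule.consequent
    return default


# Made-up rule list and records, not taken from any cited work.
rules = [
    Rule((("outlook", "sunny"), ("humidity", "high")), "stay_in"),
    Rule((("outlook", "overcast"),), "go_out"),
]
label_a = classify(rules, {"outlook": "sunny", "humidity": "high"})
label_b = classify(rules, {"outlook": "rainy", "humidity": "low"}, default="unknown")
```

An ordered rule list with a default class is only one possible way to turn a rule set into a classifier; a rule-mining algorithm would search for the antecedents rather than have them written by hand.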


Author(s):  
V. Jinubala ◽  
P. Jeyakumar

Data mining is an emerging research field in the analysis of agricultural data. The most important problem in extracting knowledge from agricultural data is missing attribute values in the selected data set. If such deficiencies are present, the data set needs to be cleaned during preprocessing in order to obtain functional data. The main objective of this paper is to analyse the effectiveness of various imputation methods in producing a complete data set that is more useful for applying data mining techniques, and to present a comparative analysis of imputation methods for handling missing values. The pest data set of the rice crop, collected throughout Maharashtra state under the Crop Pest Surveillance and Advisory Project (CROPSAP) during 2009-2013, was used for the analysis. Different methodologies, namely deletion of rows, mean and median imputation, linear regression, and predictive mean matching, were analysed for imputation of missing values. The comparative analysis shows that predictive mean matching was better than the other methods and effective for imputation of missing values in large data sets.
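Two of the simpler strategies named above, row deletion and mean or median imputation, can be sketched in a few lines. This is a minimal illustration that assumes missing values are encoded as `None`; the function names and the sample values are hypothetical and are not drawn from the CROPSAP data. Linear regression and predictive mean matching are not shown.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Fill None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def drop_incomplete_rows(rows):
    """Row deletion: keep only rows in which no attribute is missing."""
    return [row for row in rows if all(v is not None for v in row)]


# Made-up column with two missing entries.
counts = [4, None, 10, 1, None, 5]
mean_filled = impute(counts, "mean")
median_filled = impute(counts, "median")
complete_rows = drop_incomplete_rows([(1, 2), (3, None), (5, 6)])
```

Row deletion shrinks the data set, while mean and median imputation keep every row but flatten the variance of the filled attribute, which is one reason such methods are compared against model-based alternatives.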


2015 ◽  
Vol 2015 ◽  
pp. 1-14 ◽  
Author(s):  
Agustín Ortíz Díaz ◽  
José del Campo-Ávila ◽  
Gonzalo Ramos-Jiménez ◽  
Isvani Frías Blanco ◽  
Yailé Caballero Mota ◽  
...  

The treatment of large data streams in the presence of concept drifts is one of the main challenges in the field of data mining, particularly when the algorithms have to deal with concepts that disappear and then reappear. This paper presents a new algorithm, called Fast Adapting Ensemble (FAE), which adapts very quickly to both abrupt and gradual concept drifts, and has been specifically designed to deal with recurring concepts. FAE processes the learning examples in blocks of the same size, but it does not have to wait for the batch to be complete in order to adapt its base classification mechanism. FAE incorporates a drift detector to improve the handling of abrupt concept drifts and stores a set of inactive classifiers that represent old concepts, which are activated very quickly when these concepts reappear. We compare our new algorithm with various well-known learning algorithms on common benchmark datasets. The experiments show promising results from the proposed algorithm (regarding accuracy and runtime) in handling different types of concept drifts.


2018 ◽  
Vol 7 (2) ◽  
pp. 100-105
Author(s):  
Simranjit Kaur ◽  
Seema Baghla

Online shopping is a shopping channel for purchasing various items through an online medium. Data mining is defined as a process used to extract usable data from a larger set of raw data. The data set was extracted from demographic profiles and a questionnaire, and the gathered data were investigated by association. The method of shopping changed completely with the arrival of internet technology. Association rule mining, one of the important problems of data mining, has been used here. The goal of association rule mining is to detect relationships or associations between specific values of categorical variables in large data sets.


Data mining is a procedure for extracting information from large data. Data mining approaches include classification, association rules, clustering, etc. Data mining is applied in four stages: data sources, data extrapolation / gathering, modeling, and deploying modules. Classification is a data mining method for predicting the group membership of data instances. It is a useful method with vast applications for classifying the different types of data used in almost every field. Classification assigns a class label to a set of undetermined cases. In this survey, we discuss Bayesian classification, rule-based classification, decision trees, and neural networks.


A huge amount of data is generated every minute on the internet. This data is of no use until we can extract useful information from it. Data mining is the process of extracting useful information or knowledge from this huge amount of data, which can then be used for various purposes. Discovering association rules is one of the most important data mining tasks. Association rules take the IF-THEN form. The leftmost part of the rule, i.e. the IF part, is called the antecedent and defines the condition; the rightmost part, i.e. the THEN part, is called the consequent and defines the result. In this paper, we present an overview and comparison of the Apriori, Apriori PT and Frequent Itemsets algorithms of the association component in the Tanagra tool. We analyzed performance based on execution time and memory used for different numbers of instances, support values and rule lengths on the Spambase dataset. The results show that when the support value is increased, Apriori PT takes less execution time and Apriori takes less memory. When the number of instances is reduced, Frequent Itemsets outperforms the others in both memory and execution time. When the rule length is increased, the Apriori algorithm performs better than Apriori PT and Frequent Itemsets.
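The support parameter and the antecedent/consequent structure discussed above can be made concrete with a small sketch. It uses the standard definitions of support and confidence for a rule; it is not the Tanagra implementation or any of the compared algorithms, and the function names and the toy transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Made-up market-basket data, not the Spambase dataset.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

Raising the minimum support threshold prunes more candidate itemsets early, which is why support is a natural parameter to vary when comparing the execution time and memory use of such algorithms.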

