Ant Programming Algorithms for Classification

Author(s):  
Juan Luis Olmo ◽  
José Raúl Romero ◽  
Sebastián Ventura

Ant programming is a kind of automatic programming that generates computer programs by using the ant colony metaheuristic as the search technique. It has demonstrated good generalization ability for the extraction of comprehensible classifiers. To date, three ant programming algorithms for classification rule mining have been proposed in the literature: two of them are devoted to regular classification, differing mainly in the optimization approach, single-objective or multi-objective, while the third one is focused on imbalanced domains. This chapter collects these algorithms, presenting different experimental studies that confirm the aptitude of this metaheuristic to address this data-mining task.

Author(s):  
Bijaya Kumar Nanda ◽  
Satchidananda Dehuri

Extracting classification rules from large data sets is an important data mining task that is gaining considerable attention. This article presents a novel ant miner for classification rule mining. The ant miner is inspired by research on the behaviour of real ant colonies, simulated annealing, and some data mining concepts and principles. This paper presents a Pittsburgh-style approach for single-objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from the UCI repository. The experimental outcomes confirm that ant miner-HPB (Hybrid Pittsburgh Style Classification) is significantly better than ant-miner-PB (Pittsburgh Style Classification).


2020 ◽  
Vol 11 (2) ◽  
pp. 47-64
Author(s):  
Bijaya Kumar Nanda ◽  
Satchidananda Dehuri

Discovering classification rules from large data sets is an important data mining task that is gaining considerable attention. This article presents a novel ant miner for classification rule mining. Our ant miner is inspired by research on the behavior of real ant colonies, simulated annealing, and some data mining concepts and principles. Here we present a Michigan-style approach for single-objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from the UCI repository. Our experimental outcomes confirm that ant miner-HMC (Hybrid Michigan Style Classification) is significantly better than ant-miner-MC (Michigan Style Classification).


2021 ◽  
Vol 2 (2) ◽  
pp. 3-21
Author(s):  
Yassine Drias ◽  
Habiba Drias

This article presents a data mining study carried out on social media users in the context of COVID-19 and offers four main contributions. The first is the construction of a COVID-19 dataset composed of tweets posted by users during the first stages of the virus propagation. The second offers a sample of the interactions between users on topics related to the pandemic. The third is a sentiment analysis, which explores the evolution of emotions over time, while the fourth is an association rule mining task. The statistical indicators and the results obtained from sentiment analysis and association rule mining are telling. For instance, signs of an upcoming worldwide economic crisis were clearly detected at an early stage in this study. Overall, the results are promising and can be exploited in predicting the aftermath of COVID-19 and similar crises in the future.


Author(s):  
Dipankar Dutta ◽  
Jaya Sil

Classification is one of the most studied areas of data mining, and it produces classification rules during training or learning. Classification rule mining, an important data-mining task, extracts significant rules for the classification of objects. In this chapter, class-specific rules are represented in the form IF <Antecedent> THEN <Consequent>. With the popularity of soft computing methods, researchers have explored different soft computing tools for rule discovery. The genetic algorithm (GA) is one such tool. Over time, new GA techniques for forming classification rules have been devised. In this chapter, the authors focus on understanding the evolution of GA in classification rule mining, with the goal of obtaining an optimal rule set that builds an efficient classifier.
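The IF <Antecedent> THEN <Consequent> form mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the chapter's implementation: the names `Rule`, `covers`, and `classify`, the first-match policy, and the toy attributes are all assumptions introduced here.

```python
from dataclasses import dataclass


# Hypothetical names for illustration only; none of them come from the chapter.
@dataclass
class Rule:
    antecedent: dict  # attribute name -> required value (the IF part)
    consequent: str   # predicted class label (the THEN part)

    def covers(self, record: dict) -> bool:
        """True when every antecedent condition holds for the record."""
        return all(record.get(attr) == value
                   for attr, value in self.antecedent.items())


def classify(rules, record, default=None):
    """Return the consequent of the first rule that covers the record."""
    for rule in rules:
        if rule.covers(record):
            return rule.consequent
    return default


# Toy rule set with made-up attributes, used only to exercise the sketch.
rules = [
    Rule({"outlook": "sunny", "humidity": "high"}, "no"),
    Rule({"outlook": "overcast"}, "yes"),
]
```

A rule-discovery method such as a GA would search over the contents of `antecedent`; the sketch only shows how a finished rule set might be applied.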


Author(s):  
E.A. Derkach ◽  
O.I. Guseva

Objectives: to compare the accuracy of the F.P. Hadlock equations and the V.N. Demidov computer programs in determining gestational age and fetal weight in the third trimester of gestation. Materials: 328 patients at 36 to 42 weeks of gestation were examined. Ultrasonography was performed 0 to 5 days before delivery. Results: the mean error in determining gestational age was 12.5 days with the Hadlock equation and 4.4 days with the Demidov program (a 2.8-fold difference). An error within 4 days occurred on average in 23.1% of observations with the Hadlock equation and in 65.9% with the Demidov program (a 2.9-fold difference). An error of more than 10 days occurred in 51.7% and 8.2% of observations, respectively (a 6.3-fold difference). For fetal weight, the mean error was 281.0 g with the Hadlock equation and 182.5 g with the Demidov program (a 54% difference). A small error, not exceeding 200 g, occurred in 48.1% of cases with the Hadlock equation and in 64.0% with the Demidov program (a 33.1% difference). An error exceeding 500 g was recorded in 18% (Hadlock) and 4.3% (Demidov) of cases, respectively (a 4.2-fold difference). Conclusions: the V.N. Demidov computer program determines gestational age and fetal weight with high accuracy in the third trimester of pregnancy.


Author(s):  
Suma B. ◽  
Shobha G.

Privacy-preserving data mining has become the focus of attention of government statistical agencies and the database security research community, who are concerned with preventing privacy disclosure during data mining. Repositories of large datasets include sensitive rules that need to be concealed from unauthorized access. Hence, association rule hiding emerged as one of the powerful techniques for hiding sensitive knowledge that exists in data before it is published. In this paper, we present a constraint-based optimization approach for hiding a set of sensitive association rules, using a well-structured integer linear program formulation. The proposed approach reduces the database sanitization problem to an instance of the integer linear programming problem. The solution of the integer linear program determines the transactions that need to be sanitized in order to conceal the sensitive rules while minimizing the impact of sanitization on the non-sensitive rules. We also present a heuristic sanitization algorithm that performs hiding by reducing the support or the confidence of the sensitive rules. The results of the experimental evaluation of the proposed approach on real-life datasets indicate the promising performance of the approach in terms of side effects on the original database.
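To make the idea of hiding a rule by reducing its support concrete, here is a naive greedy sketch. It is neither the paper's integer linear program nor its heuristic; the function name `hide_rule`, the choice of which item to delete, and the toy transactions are assumptions made purely for illustration.

```python
def hide_rule(transactions, antecedent, consequent, min_support):
    """Greedy sketch: delete one consequent item from supporting
    transactions until the rule's support falls below min_support.
    Hypothetical helper, not the algorithm from the paper."""
    rule_items = set(antecedent) | set(consequent)
    victim = sorted(consequent)[0]  # arbitrary but deterministic choice
    sanitized = [set(t) for t in transactions]  # work on copies
    n = len(sanitized)
    for t in sanitized:
        # Recount how many transactions still support the whole rule.
        count = sum(1 for s in sanitized if rule_items <= s)
        if count / n < min_support:
            break  # the rule is already hidden
        if rule_items <= t:
            t.discard(victim)
    return sanitized


# Toy database with made-up items, used only to exercise the sketch.
original = [{"a", "b"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
sanitized = hide_rule(original, {"a"}, {"b"}, min_support=0.5)
remaining = sum(1 for t in sanitized if {"a", "b"} <= t) / len(sanitized)
```

A sketch like this ignores side effects on non-sensitive rules, which is the cost the abstract says the optimization formulation is designed to minimize.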


Data mining is the process of extracting useful information from various repositories such as relational databases, transaction databases, spatial databases, temporal and time-series databases, data warehouses, and the World Wide Web. The functionalities of data mining include characterization and discrimination, classification and prediction, association rule mining, cluster analysis, and evolutionary analysis. Association rule mining is one of the most important data mining techniques and aims at extracting interesting relationships within the data. In this paper we study various association rule mining algorithms, compare them using synthetic data sets, and report the results obtained from the experimental analysis.
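Association rule mining algorithms typically rank candidate rules by two standard measures, support and confidence. The sketch below computes both on a toy transaction list; the function names and the data are assumptions for illustration and are not taken from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Toy market-basket data, made up for this example.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})       # 2 of 4 transactions
c = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 with bread
```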


Edulib ◽  
2018 ◽  
Vol 8 (2) ◽  
pp. 194
Author(s):  
Lilis Syarifah ◽  
Imas Sukaesih Sitanggang ◽  
Pudji Muljono

The thesis is a student research report completed as a graduation requirement for a Master's program. The choice of topic and advisors influences how the study is carried out. The choice of topic can therefore contribute to the quality of the academic institution; however, the large number of thesis documents in the repository makes it difficult to find information about advisors' expertise and about which topics have been studied frequently or rarely. Association rule mining can be used to mine information about related items. This study aims to analyze advising patterns in the Master's program in Agriculture, based on supervisors and their research topics, using thesis metadata from the IPB repository and the text of thesis summaries, with a data mining approach. The data were collected from the repository of the Bogor Agricultural University website and processed using the R programming language. The results show that the most frequent supervisor association occurred at a support value of 0.00793, equivalent to 7 theses, and that four popular topics were botanical insecticide, global warming, upland rice, and land use change. The results could serve as a reference for suggesting future research topics or appropriate supervisors in agriculture.

