A Fast Boosting Based Incremental Genetic Algorithm for Mining Classification Rules in Large Datasets

A genetic algorithm (GA) is a search technique based purely on the process of natural evolution. It is widely used by the data mining community for classification rule discovery in complex domains. During learning, it makes several passes over the data set to determine the accuracy of candidate rules, which makes it an extremely I/O-intensive and slow process. Applying a GA is particularly difficult when the training data set becomes too large and is not fully available. This paper proposes an incremental genetic algorithm based on boosting, which constructs a weak ensemble of classifiers in a fast, incremental manner and thereby tries to reduce the learning cost considerably.
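
To make the I/O cost described in this abstract concrete, the following minimal sketch scores a single IF-THEN rule with one full pass over the data and then counts how many records a non-incremental GA would read. It does not reproduce the paper's incremental boosting method; the rule representation (`condition`, `label`), the function names, and the toy data are all assumptions made for illustration.

```python
def rule_accuracy(rule, dataset):
    """Accuracy of one IF-THEN rule, computed with one full pass over the data."""
    covered = 0
    correct = 0
    for features, label in dataset:          # every record is read once
        if rule["condition"](features):      # the IF part fires
            covered += 1
            if label == rule["label"]:       # the THEN part is right
                correct += 1
    return correct / covered if covered else 0.0


def records_read(population, generations, n_records):
    """Records scanned if every rule is re-scored in every generation."""
    return population * generations * n_records


# Hypothetical toy data: (features, class label) pairs.
dataset = [({"x": 1}, "a"), ({"x": 2}, "a"), ({"x": 3}, "b"), ({"x": 4}, "b")]
rule = {"condition": lambda f: f["x"] <= 3, "label": "a"}

print(rule_accuracy(rule, dataset))
print(records_read(population=50, generations=100, n_records=len(dataset)))
```

Because the number of records read grows with population size, generation count, and data set size together, scoring rules on smaller chunks of data is one plausible way to reduce the cost, which is the direction the abstract points to.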

Modeling Applications and Theoretical Innovations in Interdisciplinary Evolutionary Computation, 2013, pp. 46-53. DOI: 10.4018/978-1-4666-3628-6.ch004

Author(s): Periasamy Vivekanandan, Raju Nedunchezhian

Keyword(s): Data Mining, Genetic Algorithm, Large Datasets, Training Data, Classification Rule, Classification Rules, Slow Process, Ensemble Of Classifiers, Data Set, Mining Community

Diagnosis of Diseases: Classification Rule Discovery from Medical Data using Genetic Algorithm with Suppressor Mutation

2020 International Conference on System, Computation, Automation and Networking (ICSCAN), 2020. DOI: 10.1109/icscan49426.2020.9262429

Author(s): E. Thamizhselvi, Geetha Vaithianathan

Keyword(s): Genetic Algorithm, Medical Data, Suppressor Mutation, Classification Rule, Rule Discovery, Diagnosis Of Diseases

Classification rule discovery using variant genetic algorithm

2017 International Conference on Circuits, Controls, and Communications (CCUBE), 2017. DOI: 10.1109/ccube.2017.8394151

Author(s): T Shobha, R J Anandhi

Keyword(s): Genetic Algorithm, Classification Rule, Rule Discovery

A Genetic Algorithm Approach for Discovering Tuned Fuzzy Classification Rules with Intra- and Inter-Class Exceptions

Journal of Intelligent Systems, 2016, Vol 25 (2), pp. 263-282. DOI: 10.1515/jisys-2015-0136. Cited By ~ 3

Author(s): Renu Bala, Saroj Ratnoo

Keyword(s): Genetic Algorithm, Fuzzy Rule, Fuzzy Classification, Rule Base, Second Phase, Classification Rules, Linguistic Terms, Data Set, Actual Distribution, Genetic Algorithm Approach

Fuzzy rule-based systems (FRBSs) are proficient in dealing with cognitive uncertainties like vagueness and ambiguity imperative to real-world decision-making situations. Fuzzy classification rules (FCRs) based on fuzzy logic provide a framework for flexible, human-like reasoning involving linguistic variables. Appropriate membership functions (MFs) and a suitable number of linguistic terms – according to the actual distribution of data – are useful to strengthen the knowledge base (rule base [RB] + data base [DB]) of FRBSs. An RB is expected to be accurate and interpretable, and a DB must contain appropriate fuzzy constructs (type of MFs, number of linguistic terms, and positioning of parameters of MFs) for the success of any FRBS. Moreover, it would be fascinating to know how a system behaves in some rare/exceptional circumstances and what action ought to be taken in situations where generalized rules cease to work. In this article, we propose a three-phased approach for discovery of FCRs augmented with intra- and inter-class exceptions. In the first phase, a pre-processing algorithm is suggested to tune the DB in terms of the MFs and the number of linguistic terms for each attribute of a data set. The second phase discovers FCRs employing a genetic algorithm approach. Subsequently, intra- and inter-class exceptions are incorporated in the rules in the third phase. The proposed approach is illustrated on an example data set and further validated on six UCI machine learning repository data sets. The results show that the approach has been able to discover more accurate, interpretable, and interesting rules. The rules with intra-class exceptions tell us about the unique objects of a category, and rules with inter-class exceptions enable us to make the right decision in exceptional circumstances.

Identification of Poison using C4.5 Algorithm

International Journal of Scientific Research in Science Engineering and Technology, 2020, pp. 218-222. DOI: 10.32628/ijsrset207247

Author(s): Lai Lai Yee, Myo Ma Ma

Keyword(s): Data Mining, Test Data, Knowledge Worker, Training Data, Independent Data, Classification Rules, Natural Evolution, C4.5 Algorithm, Other Information

Data mining is the task of discovering interesting patterns from large amounts of data, where the data can be stored in databases, data warehouses, or other information repositories. It can be viewed as a result of the natural evolution of information technology. The key point is that data mining is the application of these and other AI and statistical techniques to common business problems in a fashion that makes these techniques available to the skilled knowledge worker as well as the trained statistics professional. This paper presents a classification system for toxicology using C4.5. First, the input data are randomly partitioned into two independent sets, a training set and a test set: two thirds of the data are allocated to the training set and the remaining one third to the test set. In the C4.5 algorithm step, the training data are used to derive the classification rules. In the classification step, the test data are used to estimate the accuracy of those rules. If the accuracy is considered acceptable, the rules can be applied to the classification of new data.
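
The hold-out procedure in this abstract (random two-thirds/one-third partition, then accuracy estimation on the test part) can be sketched as follows. This is only an illustration: the function names, the `dose_mg` attribute, and the toy records are hypothetical, and the `classify` function is a hand-written stand-in, not a tree or rule set learned by C4.5.

```python
import random


def split_two_thirds(records, seed=0):
    """Randomly partition records into a 2/3 training set and a 1/3 test set."""
    shuffled = list(records)                  # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)     # seeded for repeatability
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def accuracy(classify, test_set):
    """Fraction of (features, label) pairs that classify() labels correctly."""
    if not test_set:
        return 0.0
    correct = sum(1 for features, label in test_set if classify(features) == label)
    return correct / len(test_set)


# Hypothetical toy records: (features, class label).
records = [({"dose_mg": d}, "toxic" if d >= 50 else "safe") for d in range(0, 90, 10)]
train, test = split_two_thirds(records)


def classify(features):
    """Stand-in for rules that a learner such as C4.5 would derive from `train`."""
    return "toxic" if features["dose_mg"] >= 50 else "safe"


print(len(train), len(test), accuracy(classify, test))
```

Keeping the test records out of training is what makes the accuracy figure an estimate of performance on new data rather than a measure of fit.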

CoABCMiner: An Algorithm for Cooperative Rule Classification System Based on Artificial Bee Colony

International Journal of Artificial Intelligence Tools, 2016, Vol 25 (01), pp. 1550028. DOI: 10.1142/s0218213015500281. Cited By ~ 12

Author(s): Mete Celik, Fehim Koylu, Dervis Karaboga

Keyword(s): Artificial Bee Colony, Statistical Tests, Rule Learning, Machine Learning Algorithms, Classification Model, Classification Rule, Data Sets, Classification Rules, Data Set, Bee Colony

In data mining, classification rule learning extracts knowledge in the form of IF-THEN rules, which are comprehensible and readable. It is a challenging problem due to the complexity of data sets. Various meta-heuristic machine learning algorithms have been proposed for rule learning. Cooperative rule learning is the process of discovering all classification rules concurrently in a single run. In this paper, a novel cooperative rule learning algorithm, called CoABCMiner, based on Artificial Bee Colony is introduced. The proposed algorithm handles the training data set and discovers the classification model containing the rule list. Token competition, a new updating strategy used in the onlooker and employed bee phases, and a new scout bee mechanism are proposed in CoABCMiner to achieve cooperative learning of different rules belonging to different classes. We compared the results of CoABCMiner with several state-of-the-art algorithms using 14 benchmark data sets. Nonparametric statistical tests, such as the Friedman test, post hoc tests, and contrast estimation based on medians, are performed. Nonparametric tests determine the similarity of the control algorithm to other algorithms on multiple problems. A sensitivity analysis of CoABCMiner is conducted. It is concluded that CoABCMiner can efficiently discover classification rules for the data sets used in the experiments.
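
The abstract names token competition without describing it. The sketch below shows the idea as it is commonly described in the rule-learning literature: each training instance is a token, stronger rules seize the tokens they cover first, and rules left with no tokens are dropped as redundant. It is a simplified, assumed version; how CoABCMiner applies token competition is not stated here, and the rule tuples and toy data are hypothetical.

```python
def token_competition(rules, instances):
    """Keep only rules that seize at least one not-yet-covered instance.

    rules: list of (name, fitness, covers) where covers(instance) -> bool.
    Returns a list of (name, tokens_seized), strongest rule first.
    """
    seized = [False] * len(instances)         # one token per training instance
    survivors = []
    # Stronger rules choose first, so weaker duplicates find nothing left.
    for name, _fitness, covers in sorted(rules, key=lambda r: r[1], reverse=True):
        tokens = 0
        for i, instance in enumerate(instances):
            if not seized[i] and covers(instance):
                seized[i] = True
                tokens += 1
        if tokens > 0:
            survivors.append((name, tokens))
    return survivors


# Hypothetical toy setting: instances are plain numbers.
instances = [1, 2, 3, 4, 5, 6]
rules = [
    ("small", 0.9, lambda x: x <= 3),
    ("tiny", 0.5, lambda x: x <= 2),          # covers a subset of "small"
    ("large", 0.7, lambda x: x >= 4),
]

print(token_competition(rules, instances))
```

In this toy run the rule named `tiny` is removed because every instance it covers has already been claimed by a fitter rule, which is how the mechanism encourages rules to cover different parts of the data.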

UNCERTAINTY HANDLING IN DISASTER MANAGEMENT USING HIERARCHICAL ROUGH SET GRANULATION

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences, 2015, Vol II-3/W5, pp. 271-276. DOI: 10.5194/isprsannals-ii-3-w5-271-2015. Cited By ~ 1

Author(s): H. Sheikhian, M. R. Delavar, A. Stein

Keyword(s): Decision Making, Granular Computing, Seismic Vulnerability, Quality Measures, Rule Extraction, Training Data, Classification Rules, Earthquake Intensity, Data Set, The North

Uncertainty is one of the main concerns in geospatial data analysis. It affects different parts of decision making based on such data. In this paper, a new methodology to handle uncertainty for multi-criteria decision making problems is proposed. It integrates hierarchical rough granulation and rule extraction to build an accurate classifier. Rough granulation provides information granules with a detailed quality assessment. The granules are the basis for the rule extraction in granular computing, which applies quality measures on the rules to obtain the best set of classification rules. The proposed methodology is applied to assess seismic physical vulnerability in Tehran. Six effective criteria reflecting building age, height and material, topographic slope and earthquake intensity of the North Tehran fault have been tested. The criteria were discretized and the data set was granulated using a hierarchical rough method, where the best describing granules are determined according to the quality measures. The granules are fed into the granular computing algorithm resulting in classification rules that provide the highest prediction quality. This detailed uncertainty management resulted in 84% accuracy in prediction in a training data set. It was applied next to the whole study area to obtain the seismic vulnerability map of Tehran. A sensitivity analysis proved that earthquake intensity is the most effective criterion in the seismic vulnerability assessment of Tehran.

A Novel Approach for Classifying MANETs Attacks with a Neutrosophic Intelligent System based on Genetic Algorithm

Security and Communication Networks, 2018, Vol 2018, pp. 1-10. DOI: 10.1155/2018/5828517. Cited By ~ 3

Author(s): Haitham Elwahsh, Mona Gamal, A. A. Salama, I. M. El-Henawy

Keyword(s): Genetic Algorithm, Ad Hoc, Intelligent System, Fitness Function, Mathematical Formulation, Training Data, Data Set, Novel Approach, Learning Capabilities, Hoc Networks

Designing an effective intrusion detection system (IDS) for Mobile Ad Hoc Networks (MANETs) has recently become a requirement because of the amount of indeterminacy and doubt that exists in that environment. The neutrosophic system is a discipline that provides a mathematical formulation for the indeterminacy found in such complex situations. Neutrosophic rules compute with symbols instead of numeric values, making a good base for symbolic reasoning. These symbols should be carefully designed, as they form the propositional base for the neutrosophic rules (NR) in the IDS. Each attack is determined by membership, nonmembership, and indeterminacy degrees in the neutrosophic system. This research proposes MANET attack inference by a hybrid framework of Self-Organized Features Maps (SOFM) and genetic algorithms (GA). The hybrid utilizes the unsupervised learning capabilities of the SOFM to define the MANET neutrosophic conditional variables. The neutrosophic variables, along with the training data set, are fed into the genetic algorithm to find the fittest neutrosophic rule set from a number of initial subattacks according to the fitness function. This method is designed to detect unknown attacks in MANETs. The simulation and experiments are conducted on the KDD-99 network attack data available in the UCI machine-learning repository for further processing in knowledge discovery. The experiments demonstrated the feasibility of the proposed hybrid, with an average accuracy of 99.3608%, which is more accurate than other IDSs found in the literature.

Immunity-Based Genetic Algorithm for Classification Rule Discovery

Lecture Notes in Computer Science - Advances in Natural Computation, 2005, pp. 727-734. DOI: 10.1007/11539117_103

Author(s): Ziqiang Wang, Dexian Zhang

Keyword(s): Genetic Algorithm, Classification Rule, Rule Discovery

Efficient Classification Rule Mining for Breast Cancer Detection

Research Advances in the Integration of Big Data and Smart Computing - Advances in Computational Intelligence and Robotics, 2016, pp. 50-63. DOI: 10.4018/978-1-4666-8737-0.ch003

Author(s): Sufal Das, Hemanta Kumar Kalita

Keyword(s): Breast Cancer, Genetic Algorithm, Association Rule, Breast Cancer Detection, Classification Rule, Classification Rules, Optimal Solutions, Rule Mining, Detect Breast Cancer, Binary Strings

Breast cancer is the second largest cause of cancer deaths among women. Mainly, this disease is a tumor-related cause of death in women. Early detection of breast cancer may protect women from death. Various computational methods have been utilized to enhance diagnostic procedures. In this paper, we present a genetic algorithm (GA) based association rule mining method that can be applied to detect breast cancer efficiently. In this work, each solution is represented as a chromosome and passed to GA-based rule mining. Association rules, which imply classification rules, are encoded as binary strings to represent chromosomes. Finally, optimal solutions are found by the developed GA-based approach, which utilizes a feedback linkage between feature selection and association rules.
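
The abstract says rules are encoded as binary strings but gives no layout. One common scheme, assumed here purely for illustration, uses one bit per allowed attribute value and a final bit for the class; an attribute whose bits are all 1 is treated as unconstrained. The attribute names, value lists, and class labels below are hypothetical and are not taken from the paper.

```python
# Hypothetical attribute domains; the paper's actual features are not given.
ATTRIBUTES = {
    "clump_thickness": ["low", "medium", "high"],
    "cell_size": ["uniform", "irregular"],
}
CLASSES = ["benign", "malignant"]


def decode(chromosome):
    """Turn a bit string into (conditions, class label)."""
    pos = 0
    conditions = {}
    for name, values in ATTRIBUTES.items():
        flags = chromosome[pos:pos + len(values)]
        pos += len(values)
        allowed = [v for v, flag in zip(values, flags) if flag == "1"]
        if len(allowed) < len(values):        # all ones means "don't care"
            conditions[name] = allowed
    label = CLASSES[int(chromosome[pos])]     # last bit selects the class
    return conditions, label


def matches(conditions, record):
    """True if the record satisfies every condition in the rule's IF part."""
    return all(record[name] in allowed for name, allowed in conditions.items())


# "001" -> clump_thickness is high, "11" -> any cell_size, "1" -> malignant.
conditions, label = decode("001111")
record = {"clump_thickness": "high", "cell_size": "uniform"}

print(conditions, label, matches(conditions, record))
```

A fixed-length layout like this lets standard bit-flip mutation and crossover produce valid rules without any repair step, which is one reason binary encodings are popular for GA-based rule mining.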
