Improving Modular Classification Rule Induction with G-Prism Using Dynamic Rule Term Boundaries

Author(s): Manal Almutairi, Frederic Stahl, Max Bramer
2017, Vol 16 (04), pp. 1750034
Author(s): Fadi Thabtah, Firuz Kamalov

A typical predictive approach in data mining that produces If-Then knowledge for decision making is rule-based classification. Rule-based classification includes a large number of algorithms that fall under the categories of covering, greedy, rule induction, and associative classification. These approaches have shown promising results due to the simplicity of the models they generate and the user’s ability to understand and maintain them. Phishing is an emerging online threat in the web security domain that calls for anti-phishing models with rules, so that users can easily differentiate among website types. This paper critically analyses recent research studies on the use of predictive models with rules for phishing detection and evaluates the applicability of these approaches to phishing. To this end, we experimentally evaluate four rule-based classifiers belonging to the greedy, associative classification, and rule induction approaches on real phishing datasets with respect to different evaluation measures. Moreover, we assess the derived classifiers and contrast them with well-known classic classification algorithms, including Bayes Net and Simple Logistics. The aim of the comparison is to determine the pros and cons of predictive models with rules and reveal their actual performance in detecting phishing activities. The results showed that eDRI, a recent greedy algorithm, not only generates useful models but is also highly competitive in predictive accuracy and runtime when employed as an anti-phishing tool.


2012, Vol 28 (4), pp. 451-478
Author(s): Frederic Stahl, Max Bramer

Abstract: The fast increase in the size and number of databases demands data mining approaches that are scalable to large amounts of data. This has led to the exploration of parallel computing technologies in order to perform data mining tasks concurrently using several processors. Parallelization seems to be a natural and cost-effective way to scale up data mining technologies. One of the most important of these data mining technologies is the classification of newly recorded data. This paper surveys advances in parallelization in the field of classification rule induction.

