Comparative analysis of attribute selection measures used for attribute selection in decision tree induction

Author(s):  
A. S. Bhatt
2019 ◽  
Vol 46 (3) ◽  
pp. 325-339
Author(s):  
Muhammad Shaheen ◽  
Tanveer Zafar ◽  
Sajid Ali Khan

Selecting an attribute for placement at an appropriate position in the decision tree (e.g. the root of the tree) is an important decision. Many attribute selection measures, such as Information Gain, Gini Index and Entropy, have been developed for this purpose. The suitability of an attribute generally depends on the diversity of its values, its relevance and its dependency. Different attribute selection measures use different criteria for measuring the suitability of an attribute. Diversity Index is a classical statistical measure for determining the diversity of values and, to our knowledge, it has never been used as an attribute selection method. In this article, we propose a novel attribute selection method for decision tree classification. In the proposed scheme, the average of Information Gain, Gini Index and Diversity Index is used to assign a weight to each attribute. The attribute with the highest average value is selected for the classification. We have empirically tested our proposed algorithm on the classification of different data sets of scientific journals and conferences. We have developed a web-based application named JC-Rank that makes use of our proposed algorithm. We have also compared the results of our proposed technique with some existing decision tree classification algorithms.
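
The averaging scheme described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so several things here are assumptions rather than the authors' method: the Gini Index is taken as the reduction in Gini impurity after a split, the Diversity Index is taken as Simpson's index (1 minus the sum of squared value proportions) computed on the attribute's own values, and the three scores are averaged without any normalization. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(values):
    """Gini impurity, 1 - sum(p^2), of a list of values."""
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())


def partition(rows, labels, attr):
    """Group the class labels by the value each row takes for attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return list(groups.values())


def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting."""
    n = len(labels)
    groups = partition(rows, labels, attr)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)


def gini_gain(rows, labels, attr):
    """Assumption: the Gini score is the impurity reduction of the split."""
    n = len(labels)
    groups = partition(rows, labels, attr)
    return gini(labels) - sum(len(g) / n * gini(g) for g in groups)


def diversity_index(rows, attr):
    """Assumption: Simpson's diversity of the attribute's own values."""
    return gini([row[attr] for row in rows])


def average_score(rows, labels, attr):
    """Unweighted mean of the three measures (normalization is assumed absent)."""
    return (
        information_gain(rows, labels, attr)
        + gini_gain(rows, labels, attr)
        + diversity_index(rows, attr)
    ) / 3.0


def select_attribute(rows, labels, attrs):
    """Return the attribute with the highest average score."""
    return max(attrs, key=lambda a: average_score(rows, labels, a))


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["yes", "yes", "no", "no"]
best = select_attribute(rows, labels, ["outlook", "windy"])
```

On this toy data both attributes have the same diversity of values, so the choice is driven by Information Gain and the Gini reduction, and `best` is `"outlook"`. Because the three measures live on different scales, a real implementation would likely need to decide whether and how to normalize them before averaging.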


Author(s):  
Abdullah M. Al Ghoson

Decision tree induction and clustering are two of the most prevalent data mining techniques, used separately or together in many business applications. Most commercial data mining software tools provide these two techniques, but few of them satisfy business needs. Many criteria and factors bear on choosing the most appropriate software for a particular organization. This paper aims to provide a comparative analysis of three popular data mining software tools, namely SAS® Enterprise Miner, SPSS Clementine, and IBM DB2® Intelligent Miner, based on four main criteria: performance, functionality, usability, and auxiliary task support.

