Integrating Noun-Based Feature Ranking and Selection Methods with Arabic Text Associative Classification Approach

2014 ◽  
Vol 39 (11) ◽  
pp. 7807-7822 ◽  
Author(s):  
Abdullah S. Ghareb ◽  
Abdul Razak Hamdan ◽  
Azuraliza Abu Bakar
2018 ◽  
Vol 8 (2) ◽  
pp. 1-24 ◽  
Author(s):  
Abdullah Saeed Ghareb ◽  
Azuraliza Abu Bakar ◽  
Qasem A. Al-Radaideh ◽  
Abdul Razak Hamdan

Filtering large amounts of data is an important step in data mining, particularly when categorizing unstructured, high-dimensional data. A feature selection process is therefore needed to reduce the high-dimensional space to a small subset of relevant dimensions that represent the best features for text categorization. In this article, three enhanced filter feature selection methods, Category Relevant Feature Measure, Modified Category Discriminated Measure, and Odd Ratio2, are proposed. These methods combine relevant information about features both within and across categories. The effectiveness of the proposed methods with Naïve Bayes and associative classification is evaluated using traditional text categorization measures, namely macro-averaged precision, recall, and F-measure. Experiments are conducted on three Arabic text datasets used for text categorization. The experimental results show that the proposed methods achieve better or comparable results relative to 12 well-known traditional methods.
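The macro-averaged measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `macro_scores` is hypothetical, and the sketch assumes that the macro F-measure is the unweighted mean of per-category F1 values (some papers instead compute F from the macro-averaged precision and recall).

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) for single-label predictions.

    Assumption: macro F1 is the unweighted mean of per-category F1 scores.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # true category gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    if n == 0:
        return 0.0, 0.0, 0.0
    # Every category counts equally, regardless of how many documents it has.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    truth = ["sport", "sport", "economy", "economy"]
    predicted = ["sport", "economy", "economy", "economy"]
    print(macro_scores(truth, predicted))
```

Because each category contributes equally to the average, small categories influence the macro scores as much as large ones, which is one reason macro-averaging is often reported for text categorization.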


2014 ◽  
Vol 41 (15) ◽  
pp. 6945-6958 ◽  
Author(s):  
Shobeir Fakhraei ◽  
Hamid Soltanian-Zadeh ◽  
Farshad Fotouhi

2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Zhongmei Zhou

A good classifier correctly predicts new data whose class label is unknown, so constructing a high-accuracy classifier is important. Classification techniques are therefore very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach has two major deficiencies. First, it generates a very large number of associative classification rules, especially when the minimum support is set low, which makes it difficult to select a high-quality rule set for classification. Second, the accuracy of associative classification depends on the settings of the minimum support and the minimum confidence. By comparison, some improved traditional rule-based classification approaches often produce a classification rule set that plays an important role in prediction. Such approaches thus achieve not only better efficiency than associative classification but also higher accuracy. In this paper, we put forward a new classification approach called CMR (classification based on multiple classification rules). CMR combines the advantages of both associative classification and rule-based classification. Our experimental results show that CMR achieves higher accuracy than some traditional rule-based classification methods.
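To make the role of the minimum support and minimum confidence thresholds concrete, the following minimal sketch computes both quantities for a single rule of the form "itemset → class label". It is not the CMR algorithm. The function name `rule_support_confidence` and the toy data are hypothetical, and the sketch assumes the common definitions: support is the fraction of all records that contain the itemset and carry the label, and confidence is that count divided by the number of records containing the itemset.

```python
def rule_support_confidence(records, antecedent, label):
    """Return (support, confidence) of the rule `antecedent -> label`.

    `records` is a list of (items, class_label) pairs, where `items` is an
    iterable of attribute values (hypothetical layout for illustration).
    """
    antecedent = frozenset(antecedent)
    n_total = len(records)
    n_antecedent = 0  # records containing every item of the antecedent
    n_rule = 0        # ... that also carry the rule's class label
    for items, cls in records:
        if antecedent <= set(items):
            n_antecedent += 1
            if cls == label:
                n_rule += 1
    support = n_rule / n_total if n_total else 0.0
    confidence = n_rule / n_antecedent if n_antecedent else 0.0
    return support, confidence


if __name__ == "__main__":
    data = [
        ({"a", "b"}, "x"),
        ({"a"}, "x"),
        ({"a", "c"}, "y"),
        ({"b"}, "y"),
    ]
    support, confidence = rule_support_confidence(data, {"a"}, "x")
    # A rule is kept only if it clears both user-chosen thresholds.
    min_support, min_confidence = 0.25, 0.6
    print(support, confidence, support >= min_support and confidence >= min_confidence)
```

Lowering `min_support` admits rules backed by fewer records, which is why the number of candidate rules grows quickly at low thresholds.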


1982 ◽  
Vol 1 (2) ◽  
pp. 91-96 ◽  
Author(s):  
J. W. H. Swanepoel

In many studies the experimenter has under consideration several (two or more) alternatives, and is studying them in order to determine which is the best (with regard to certain specified criteria of “goodness”). Such an experimenter does not wish basically to test hypotheses, or construct confidence intervals, or perform regression analyses (though these may be appropriate parts of his analysis); he does wish to select the best of several alternatives, and the major part of his analysis should therefore be directed towards this goal. It is precisely for this problem that ranking and selection procedures were developed. This paper presents an overview of some recent work in this field, with emphasis on aspects important to experimenters confronted with selection problems. Fixed sample size and sequential procedures for both the indifference zone and subset formulations of the selection problem are discussed.

