TEXT CLASSIFICATION USING MODIFIED MULTI CLASS ASSOCIATION RULE

2016 ◽  
Vol 78 (8-2) ◽  
Author(s):  
Siti Sakira Kamaruddin ◽  
Yuhanis Yusof ◽  
Husniza Husni ◽  
Mohammad Hayel Al Refai

This paper presents text classification using a modified Multi-Class Association Rule method. The method is based on Associative Classification, which combines classification with association rule discovery. Although previous work has shown that Associative Classification produces better classification accuracy than typical classifiers, studies applying it to text classification are limited because of the high dimensionality of text data, which results in an exponential number of generated classification rules. To overcome this problem, the Multi-Class Association Rule method was enhanced in two stages. In stage one, the frequent patterns are represented using a proposed vertical data format to reduce the text dimensionality problem; in stage two, the generated rules are pruned using a proposed Partial Rule Match to reduce the number of generated rules. The proposed method was tested on a text classification problem, and the results show that it performed better than the existing method in terms of classification accuracy and number of generated rules.
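To illustrate the general idea of a vertical data format, here is a minimal sketch in which each term maps to the set of document ids that contain it, so the support of a term set is the size of an intersection. This is the generic tid-set layout, not necessarily the format proposed in the paper; the function names and the toy documents are assumptions made for illustration.

```python
from collections import defaultdict

def to_vertical(docs):
    """Map each term to the set of document ids (tid-set) that contain it."""
    vertical = defaultdict(set)
    for doc_id, terms in enumerate(docs):
        for term in terms:
            vertical[term].add(doc_id)
    return dict(vertical)

def support(vertical, itemset):
    """Support count of an itemset: size of the intersection of its tid-sets."""
    tidsets = [vertical.get(term, set()) for term in itemset]
    if not tidsets:
        return 0
    return len(set.intersection(*tidsets))

# Hypothetical toy corpus: each document is a set of terms.
docs = [
    {"rule", "mining", "text"},
    {"rule", "text"},
    {"mining", "data"},
    {"rule", "mining", "text", "data"},
]
vertical = to_vertical(docs)
print(support(vertical, {"rule", "text"}))    # 3
print(support(vertical, {"mining", "data"}))  # 2
```

With this layout, counting the support of a longer pattern needs only set intersections over the stored tid-sets rather than another scan of the documents.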

2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

Associative Classification (AC) or Class Association Rule (CAR) mining is a very efficient method for the classification problem. It can build comprehensible classification models in the form of a list of simple IF-THEN classification rules from the available data. In this paper, we present a new and improved discrete version of the Crow Search Algorithm (CSA), called NDCSA-CAR, to mine the Class Association Rules. The goal of this article is to improve the data classification accuracy and the simplicity of classifiers. The authors applied the proposed NDCSA-CAR algorithm to eleven benchmark datasets and compared its results with traditional algorithms and recent well-known rule-based classification algorithms. The experimental results show that the proposed algorithm outperformed other rule-based approaches in all evaluated criteria.
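A classifier built from a list of IF-THEN rules can be sketched as an ordered rule list in which the first matching rule decides the class. The sketch below shows only that general form; it does not implement NDCSA-CAR or any rule search, and the rules, attribute names and default class are hypothetical.

```python
def classify(rules, default_class, record):
    """Return the label of the first rule whose antecedent matches the record."""
    for antecedent, label in rules:
        # The IF part matches when all its attribute=value pairs occur in the record.
        if antecedent.items() <= record.items():
            return label
    return default_class

# Hypothetical ordered rule list: (IF antecedent, THEN class label).
rules = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "overcast"}, "yes"),
]

record = {"outlook": "sunny", "humidity": "high", "wind": "weak"}
print(classify(rules, "yes", record))  # no
```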


2020 ◽  
Vol 10 (20) ◽  
pp. 7013
Author(s):  
Jamolbek Mattiev ◽  
Branko Kavsek

Building accurate and compact classifiers in real-world applications is one of the crucial tasks in data mining nowadays. In this paper, we propose a new method that can reduce the number of class association rules produced by classical class association rule classifiers, while maintaining an accurate classification model that is comparable to the ones generated by state-of-the-art classification algorithms. More precisely, we propose a new associative classifier that selects “strong” class association rules based on overall coverage of the learning set. The advantage of the proposed classifier is that it generates significantly fewer rules on bigger datasets compared to traditional classifiers while maintaining the classification accuracy. We also discuss how the overall coverage of such classifiers affects their classification accuracy. Experiments measuring classification accuracy, number of classification rules and other relevance measures such as precision, recall and F-measure on 12 real-life datasets from the UCI ML repository (Dua, D.; Graff, C. UCI Machine Learning Repository. Irvine, CA: University of California, 2019) show that our method was comparable to 8 other well-known rule-based classification algorithms. It achieved the second-highest average accuracy (84.9%) and the best result in terms of average number of rules among all classification methods. Although it did not achieve the best classification accuracy, our method proved to produce compact and understandable classifiers by exhaustively searching the entire example space.
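Selecting rules by coverage of the learning set can be sketched as a greedy pass over rules that are already ranked: a rule is kept only if it correctly covers at least one training example that no earlier rule covered. This is a generic database-coverage sketch, not the authors' exact procedure; the ranking, the stopping condition and the toy data are assumptions.

```python
def select_by_coverage(ranked_rules, examples):
    """Greedily keep rules that correctly cover still-uncovered examples."""
    uncovered = set(range(len(examples)))
    selected = []
    for antecedent, label in ranked_rules:
        covered = {
            i for i in uncovered
            if antecedent <= examples[i][0] and examples[i][1] == label
        }
        if covered:
            selected.append((antecedent, label))
            uncovered -= covered
        if not uncovered:
            break  # the whole learning set is covered
    return selected, uncovered

# Hypothetical learning set: (set of items, class label).
examples = [
    ({"a", "b"}, "pos"),
    ({"a", "c"}, "pos"),
    ({"b", "c"}, "neg"),
    ({"c"}, "neg"),
]
# Hypothetical rules, assumed to be pre-sorted from strongest to weakest.
ranked_rules = [
    ({"a"}, "pos"),
    ({"a", "b"}, "pos"),  # adds no new coverage, so it is skipped
    ({"c"}, "neg"),
    ({"b"}, "neg"),
]
selected, uncovered = select_by_coverage(ranked_rules, examples)
print(len(selected), len(uncovered))  # 2 0
```

In this toy run, two of the four candidate rules are enough to cover every training example, which is the kind of reduction in rule count the abstract describes.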


Author(s):  
Farida Nur Khasanah ◽  
Fhira Nhita

Weather change affects people's activities around the world, including in Indonesia. Bandung Regency in particular has a high intensity of rainfall compared with other regions. Most people in Bandung Regency earn their livelihoods in industry and agriculture, both of which are closely tied to the weather. Weather prediction serves as a reference so that people can prepare for likely weather conditions before carrying out their activities. One data mining method used to predict weather is association rule mining. Within this method, the Frequent Pattern Growth (FP-Growth) algorithm is used to determine the patterns linking weather attributes with rainfall. The output of FP-Growth is a set of association rules, which are then used as input to the classification process; that process produces forecasts by rainfall category with the aim of maximum accuracy. The highest performance of the FP-Growth rules, based on their confidence value, is 92%.
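The confidence of an association rule A → B is the fraction of transactions containing A that also contain B. The sketch below computes it by direct counting; it does not implement FP-Growth itself, and the weather items and transactions are hypothetical, not the study's data.

```python
def confidence(transactions, antecedent, consequent):
    """confidence(A -> B) = count(A and B together) / count(A)."""
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    if n_antecedent == 0:
        return 0.0
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    return n_both / n_antecedent

# Hypothetical daily observations encoded as attribute=value items.
transactions = [
    {"humidity=high", "pressure=low", "rain=heavy"},
    {"humidity=high", "rain=heavy"},
    {"humidity=high", "rain=none"},
    {"humidity=low", "rain=none"},
]
conf = confidence(transactions, {"humidity=high"}, {"rain=heavy"})
print(round(conf, 3))  # 0.667
```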


Author(s):  
Padmavati Shrivastava ◽  
Uzma Ansari

Text mining is an emerging technology that can be used to augment existing data in corporate databases by making unstructured text data available for analysis. The incredible increase in online documents, which has been mostly due to the expanding internet, has renewed the interest in automated document classification and data mining. The demand for text classification to aid the analysis and management of text is increasing. Text is cheap, but information, in the form of knowing what classes a text belongs to, is expensive. Text classification is the process of classifying documents into predefined categories based on their content. Automatic classification of text can provide this information at low cost, but the classifiers themselves must be built with expensive human effort, or trained from texts which have themselves been manually classified. Both classification and association rule mining are indispensable to practical applications. For association rule mining, the target of discovery is not pre-determined, while for classification rule mining there is one and only one predetermined target. Thus, great savings and conveniences to the user could result if the two mining techniques can somehow be integrated. In this paper, such an integrated framework, called associative classification, is used for text categorization. The algorithm presented here for text classification uses words as features to derive a feature set from preclassified text documents. The concept of the Naïve Bayes classifier is then applied to the derived features for final classification.
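The final step, applying Naïve Bayes to word features, can be sketched with a small multinomial model using add-one smoothing. The sketch skips the associative feature-derivation step described in the abstract and simply treats every training word as a feature; the function names and the training documents are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(labelled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in labelled_docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict_nb(model, words):
    """Pick the class with the highest log prior plus log likelihood."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical preclassified documents.
training = [
    ("goal match team".split(), "sport"),
    ("team win goal".split(), "sport"),
    ("market stock price".split(), "finance"),
    ("stock bank market".split(), "finance"),
]
model = train_nb(training)
print(predict_nb(model, ["goal", "team"]))    # sport
print(predict_nb(model, ["stock", "price"]))  # finance
```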


2021 ◽  
Vol 13 (4) ◽  
pp. 547
Author(s):  
Wenning Wang ◽  
Xuebin Liu ◽  
Xuanqin Mou

For both traditional classification and current popular deep learning methods, the limited sample classification problem is very challenging, and the lack of samples is an important factor affecting the classification performance. Our work includes two aspects. First, the unsupervised data augmentation for all hyperspectral samples not only improves the classification accuracy greatly with the newly added training samples, but also further improves the classification accuracy of the classifier by optimizing the augmented test samples. Second, an effective spectral structure extraction method is designed, and the effective spectral structure features have a better classification accuracy than the true spectral features.


2021 ◽  
Vol 7 (2) ◽  
pp. 128
Author(s):  
Siriporn Sawangarreerak ◽  
Putthiporn Thanathamathee

Identifying fraudulent financial statements is important in open innovation to help users analyze financial statements and make investment decisions. It also helps users be aware of the occurrence of fraud in financial statements by considering the associated pattern. This study aimed to find associated fraud patterns in financial ratios from financial statements on the Stock Exchange of Thailand using discretization of the financial ratios and frequent pattern growth (FP-Growth) association rule mining to find associated patterns. We found nine associated patterns in financial ratios related to fraudulent financial statements. This study is different from others that have analyzed the occurrence of fraud by using mathematics for each financial item. Moreover, this study discovered six financial items related to fraud: (1) gross profit, (2) primary business income, (3) ratio of primary business income to total assets, (4) ratio of capitals and reserves to total debt, (5) ratio of long-term debt to total capital and reserves, and (6) ratio of accounts receivable to primary business income. The three other financial items that were different from other studies to be focused on were (1) ratio of gross profit to primary business profit, (2) ratio of long-term debt to total assets, and (3) total assets.
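Discretization turns each numeric financial ratio into a categorical item so that frequent-pattern mining can treat a statement as a transaction. The abstract does not say which discretization scheme was used, so the sketch below assumes fixed cut points; the ratio names, cut points and labels are all hypothetical.

```python
from bisect import bisect_right

def discretize(value, cut_points, labels):
    """Return the label of the interval that the value falls into."""
    return labels[bisect_right(cut_points, value)]

def to_items(record, schema):
    """Convert a record of numeric ratios into a set of 'name=bin' items."""
    return {
        f"{name}={discretize(record[name], cuts, labels)}"
        for name, (cuts, labels) in schema.items()
    }

# Hypothetical schema: sorted cut points and one more label than cut points.
schema = {
    "debt_to_assets": ([0.3, 0.6], ["low", "mid", "high"]),
    "gross_margin": ([0.2], ["low", "high"]),
}
record = {"debt_to_assets": 0.72, "gross_margin": 0.15}
items = to_items(record, schema)
print(sorted(items))  # ['debt_to_assets=high', 'gross_margin=low']
```

Each resulting item set can then be passed to an association rule miner such as FP-Growth.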


2017 ◽  
Vol 7 (1.1) ◽  
pp. 19
Author(s):  
T. Nusrat Jabeen ◽  
M. Chidambaram ◽  
G. Suseendran

Security and privacy have emerged as serious concerns, as business professionals do not wish to share their classified transaction data. In earlier work, secured sharing of transaction databases was carried out. The performance of those methods is enhanced further by introducing the Security and Privacy aware Large Database Association Rule Mining (SPLD-ARM) framework. Improved Secured Association Rule Mining (ISARM) is then introduced for the horizontal and vertical segmentation of huge databases. Next, k-anonymization methods, namely suppression-based and generalization-based anonymization, are employed to guarantee privacy. Finally, the Diffie-Hellman encryption algorithm is presented to safeguard sensitive information and to let the storage service provider work on encrypted information. The Diffie-Hellman algorithm is used to improve the overall quality of the system by generating secured keys, so that the actual data is protected more efficiently. The proposed technique is implemented in a Java simulation environment, and the results indicate that it achieves privacy in addition to security.
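The key-generation role of Diffie-Hellman can be illustrated with a toy exchange in which two parties derive the same shared secret from public values. The sketch is in Python rather than the Java environment the abstract mentions, it shows only the key agreement and not how the framework encrypts data, and the tiny prime and private keys are assumptions chosen for readability and offer no security.

```python
# Toy public parameters (assumed for illustration; far too small to be secure).
p = 23  # prime modulus
g = 5   # generator

def public_key(private_key):
    """Value a party publishes: g^private mod p."""
    return pow(g, private_key, p)

def shared_secret(their_public, my_private):
    """Secret each party derives from the other's public value."""
    return pow(their_public, my_private, p)

# Hypothetical private keys for the two parties.
a_private, b_private = 4, 3
a_public = public_key(a_private)  # 4
b_public = public_key(b_private)  # 10

secret_a = shared_secret(b_public, a_private)
secret_b = shared_secret(a_public, b_private)
print(secret_a, secret_b)  # 18 18
```

Both sides arrive at the same value without ever transmitting a private key; in practice that value would be fed into a key derivation step before being used with a cipher.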

