A Novel Efficient Classification Algorithm Based on Class Association Rules

This paper proposes a novel classification algorithm based on class association rules. First, the algorithm mines frequent items and rules in a single phase. It then ranks the rules that pass the support and confidence thresholds with a global sorting method that uses a series of parameters: confidence, support, antecedent cardinality, class distribution frequency, item row order and rule antecedent length. The classifier is built from rule items that do not overlap in the training phase and for which each training instance is covered by only a single rule. Experimental results on 8 datasets from the UCI ML Repository show that the proposed algorithm is highly competitive with the C4.5, CBA, CMAR and CPAR algorithms in terms of classification accuracy and efficiency. The algorithm thus offers a usable associative classification technique for data mining.
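The threshold filtering and global rule ranking described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` class, the function names and the tie-breaking order (higher confidence, then higher support, then shorter antecedent) are assumptions made for illustration, and the abstract's remaining criteria (class distribution frequency, item row order) are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical class association rule: antecedent items -> class label."""
    antecedent: frozenset  # items on the left-hand side
    label: str             # predicted class
    support: float
    confidence: float


def rank_rules(rules, min_support, min_confidence):
    """Keep rules that pass both thresholds, then sort them globally."""
    kept = [r for r in rules
            if r.support >= min_support and r.confidence >= min_confidence]
    # Assumed precedence: higher confidence, then higher support,
    # then shorter antecedent. sorted() is stable, so remaining ties
    # keep their input order.
    return sorted(kept,
                  key=lambda r: (-r.confidence, -r.support, len(r.antecedent)))


def classify(ranked, instance, default):
    """Predict with the highest-ranked rule whose antecedent the instance contains."""
    for r in ranked:
        if r.antecedent <= instance:
            return r.label
    return default
```

Because only the first matching rule fires, each instance is labeled by a single rule, which mirrors the single-rule coverage the abstract mentions.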

Association Rule and Quantitative Association Rule Mining among Infrequent Items

Rare Association Rule Mining and Knowledge Discovery ◽

10.4018/978-1-60566-754-6.ch002 ◽

2010 ◽

pp. 15-32 ◽

Cited By ~ 1

Author(s):

Ling Zhou ◽

Stephen Yau

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Rule Mining ◽

Transactional Databases ◽

Frequent Items ◽

Increasing Demand ◽

Quantitative Association Rule

Association rule mining among frequent items has been extensively studied in data mining research. In recent years, however, there has been an increasing demand for mining infrequent items (such as rare but expensive items). Since interesting relationships among infrequent items have not been discussed much in the literature, in this chapter the authors propose two simple, practical and effective schemes to mine association rules among rare items. Their algorithms can also be applied to frequent items with bounded length. Experiments are performed on the well-known IBM synthetic database. The authors' schemes compare favorably to Apriori and FP-growth under the conditions evaluated. In addition, they explore quantitative association rule mining among infrequent items in transactional databases by associating quantities of items; some interesting examples are drawn to illustrate the significance of such mining.
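To make the idea of rules among infrequent items concrete, the following Python sketch computes support and confidence for pairs of rare items. It is a generic brute-force illustration, not either of the two schemes proposed in the chapter; the function name `rare_pair_rules` and the `max_support` and `min_confidence` parameters are hypothetical.

```python
from itertools import combinations


def rare_pair_rules(transactions, max_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules between rare items.

    An item counts as rare here when its relative support is at most
    max_support (an assumed definition for this sketch).
    """
    n = len(transactions)
    item_count = {}
    for t in transactions:
        for item in set(t):
            item_count[item] = item_count.get(item, 0) + 1

    rare = sorted(i for i, c in item_count.items() if c / n <= max_support)

    rules = []
    for a, b in combinations(rare, 2):
        both = sum(1 for t in transactions if a in t and b in t)
        if both == 0:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, both / n, confidence))
    return rules
```

A rule such as caviar -> champagne can have low support yet high confidence, which is why a minimum-support filter tuned for frequent items would discard it.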

A Hybrid Classification Approach Based on Decision Tree and Naïve Bays Methods

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2014100104 ◽

2014 ◽

Vol 4 (4) ◽

pp. 61-72

Author(s):

Saed A. Muqasqas ◽

Qasem A. Al Radaideh ◽

Bilal A. Abul-Huda

Keyword(s):

Data Mining ◽

Decision Tree ◽

Classification Accuracy ◽

Classification Methods ◽

Hybrid Classifier ◽

Classification Approach ◽

Classification Technique ◽

Average Accuracy ◽

Proposed Model ◽

Hybrid Classification

Data classification, as one of the main tasks of data mining, has an important role in many fields. Classification techniques differ mainly in the accuracy of their models, which depends on the method adopted during the learning phase. Several researchers have attempted to enhance classification accuracy by combining different classification methods in the same learning process, resulting in a hybrid classifier. In this paper, the authors propose and build a hybrid classifier technique based on Naïve Bayes and C4.5 classifiers. The main goal of the proposed model is to reduce the complexity of the NBTree technique, which is a well-known hybrid classification technique, and to improve the overall classification accuracy. Thirty-six UCI datasets were used in the evaluation. Results have shown that the proposed technique significantly outperforms the NBTree technique and some other classifiers proposed in the literature in terms of classification accuracy. The proposed classification approach yields an overall average accuracy of 85.70% over the 36 datasets.

Integrating Data Mining Techniques for Naïve Bayes Classification: Applications to Medical Datasets

Computation ◽

10.3390/computation9090099 ◽

2021 ◽

Vol 9 (9) ◽

pp. 99

Author(s):

Pannapa Changpetch ◽

Apasiri Pitpeng ◽

Sasiprapa Hiriote ◽

Chumpol Yuangyai

Keyword(s):

Data Mining ◽

Association Rules ◽

Classification Accuracy ◽

Naive Bayes ◽

Classification Tree ◽

Naïve Bayes ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Association Rules Analysis

In this study, we designed a framework in which three techniques, namely classification tree, association rules analysis (ASA), and the naïve Bayes classifier, were combined to improve the performance of the latter. A classification tree was used to discretize quantitative predictors into categories, and ASA was used to generate interactions in a fully realized way, as discretized variables and interactions are key to improving the classification accuracy of the naïve Bayes classifier. We applied our methodology to three medical datasets to demonstrate the efficacy of the proposed method. The results showed that our methodology outperformed the existing techniques for all the illustrated datasets. Although our focus here was on medical datasets, our proposed methodology is equally applicable to datasets in many other areas.

Association rules for data mining in item classification algorithm: Web service approach

2012 Second International Conference on Digital Information and Communication Technology and it's Applications (DICTAP) ◽

10.1109/dictap.2012.6215408 ◽

2012 ◽

Cited By ~ 1

Author(s):

Manop Phankokkruad

Keyword(s):

Data Mining ◽

Web Service ◽

Association Rules ◽

Classification Algorithm

A Semisupervised Cascade Classification Algorithm

Applied Computational Intelligence and Soft Computing ◽

10.1155/2016/5919717 ◽

2016 ◽

Vol 2016 ◽

pp. 1-14 ◽

Cited By ~ 3

Author(s):

Stamatis Karlos ◽

Nikos Fazakis ◽

Sotiris Kotsiantis ◽

Kyriakos Sgarbas

Keyword(s):

Data Mining ◽

Classification Accuracy ◽

Feature Space ◽

Training Phase ◽

Classification Methods ◽

Cascade Classifier ◽

Base Classifier ◽

Novel Approach ◽

Semisupervised Classification ◽

Benchmark Datasets

Classification is one of the most important tasks of data mining techniques, which have been adopted by several modern applications. The shortage of labeled data in the majority of these applications has shifted interest towards semisupervised methods. Under such schemes, using collected unlabeled data together with a clearly smaller set of labeled examples leads to similar or even better classification accuracy than supervised algorithms, which use labeled examples exclusively during the training phase. This paper presents a novel approach for improving semisupervised classification using the Cascade Classifier technique. The main characteristic of the Cascade Classifier strategy is the use of a base classifier to enlarge the feature space by adding either the predicted class or the class probability distribution of the initial data. The second-level classifier is supplied with the new dataset and extracts the decision for each instance. In this work, a self-trained NB∇C4.5 classifier algorithm is presented, which combines the characteristics of Naive Bayes as a base classifier with the speed of C4.5 for final classification. We performed an in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets and found that the presented technique has better accuracy in most cases.
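The feature-space enlargement step of the cascade strategy can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' NB∇C4.5 code: `TinyNaiveBayes` is a hypothetical categorical naive Bayes with add-one smoothing that stands in for the base classifier, `cascade_augment` is a hypothetical helper, and the self-training loop and the second-level C4.5 learner are omitted.

```python
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Categorical naive Bayes with add-one smoothing (illustrative base classifier)."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_count = Counter(labels)
        self.value_count = defaultdict(Counter)  # (feature index, class) -> value counts
        self.values = defaultdict(set)           # feature index -> distinct values seen
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.value_count[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def predict_proba(self, row):
        """Return class probabilities in the order of self.classes."""
        total = sum(self.class_count.values())
        scores = []
        for c in self.classes:
            p = self.class_count[c] / total
            for j, v in enumerate(row):
                p *= (self.value_count[(j, c)][v] + 1) / (
                    self.class_count[c] + len(self.values[j]))
            scores.append(p)
        norm = sum(scores)
        return [s / norm for s in scores]


def cascade_augment(base, rows):
    """Append the base classifier's class probabilities to each feature row."""
    return [list(row) + base.predict_proba(row) for row in rows]
```

The augmented rows would then be passed to the second-level learner, which sees both the original attributes and the base classifier's probability estimates.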

Exploring Associative Classification Technique Using Weighted Utility Association Rules for Predictive Analytics

High Performance Architecture and Grid Computing - Communications in Computer and Information Science ◽

10.1007/978-3-642-22577-2_24 ◽

2011 ◽

pp. 169-178 ◽

Cited By ~ 1

Author(s):

Mamta Punjabi ◽

Vineet Kushwaha ◽

Rashmi Ranjan

Keyword(s):

Association Rules ◽

Predictive Analytics ◽

Associative Classification ◽

Weighted Utility ◽

Classification Technique

P1064 Using Data Mining to Predict Bleeding Events caused by Novel Oral Anticoagulants

EP Europace ◽

10.1093/europace/euaa162.278 ◽

2020 ◽

Vol 22 (Supplement_1) ◽

Author(s):

W R Chiou ◽

M C Hsieh ◽

H N Chuang ◽

C C Huang ◽

J Y Chuang ◽

...

Keyword(s):

Data Mining ◽

Neural Networks ◽

Decision Tree ◽

Association Rules ◽

Classification Accuracy ◽

Prediction Error ◽

Oral Anticoagulants ◽

Omission Error ◽

Bleeding Events ◽

Accurate Model

Background: Novel oral anticoagulants (NOACs) are important in preventing thromboembolism in atrial fibrillation (AF) patients. Bleeding risk has traditionally been evaluated by the HAS-BLED score. Data mining is a relatively new discipline that has sprung up at the confluence of several other disciplines, driven primarily by the growth of large databases.

Purpose: This study aimed to find a useful predictive model by data mining to assess the risk that rivaroxaban, an antithrombotic drug, causes bleeding in AF patients. The seven parameters of the HAS-BLED score were used to predict the effect of rivaroxaban on bleeding tendency in AF patients; the prediction may help clinicians choose appropriate treatments, avoid complications from bleeding events and reduce the incidence of health damage.

Methods: In a multicenter retrospective study, we identified patients with AF who were treated with rivaroxaban for more than 1 month between December 1, 2011 and November 30, 2016. After preprocessing, the data were used to train and test data mining models. This study evaluated four models: association rules, neural networks, Bayesian classification, and decision trees.

Result: Of the 872 enrolled cases, 432 were in any of the bleeding groups and 432 were in the non-bleeding randomized control group. After comparing the overall classification accuracy, omission error and over-prediction error, the decision tree proved to be the most accurate model for bleeding prediction. Its overall classification accuracy is 77%, the omission error is 15%, the over-prediction error is 21.9%, and the AUC score is 0.84. The results show that the model has good discriminative ability and visibility of decision rules.

Conclusion: Among several data mining models, the decision tree proved to be the most accurate model for bleeding prediction. The conclusion of this study can be used as a reference to support decision making before anticoagulation treatment, and it suggests future research comparing the efficacy of bleeding prediction between the HAS-BLED score and the decision tree.

Data mining comparison

Model                    Omission error  Commission error  Overall accuracy  AUC score  Ranking
Decision tree            15.0%           21.90%            77.00%            0.84       1
Association rules        16.8%           27.20%            76.50%            0.81       2
Neural networks          12.0%           26.40%            78.20%            0.83       3
Bayesian classification  16.1%           27.50%            76.50%            0.83       4

Interactive Data Mining: A Short Background Study on Effective Interaction and Visualization by Association Rules

2nd International conference on Innovative Engineering Technologies (ICIET'2015) August 7-8, 2015 Bangkok (Thailand) ◽

10.15242/iie.e0815001 ◽

2015 ◽

Keyword(s):

Data Mining ◽

Association Rules ◽

Effective Interaction ◽

Interactive Data Mining ◽

Interactive Data

How Useful Can Be Data Mining For A Continuos Speech Therapist’s Education?

Balkan Region Conference on Engineering and Business Education ◽

10.2478/cplbu-2014-0050 ◽

2014 ◽

Vol 1 (1) ◽

pp. 339-342

Author(s):

Mirela Danubianu ◽

Dragos Mircea Danubianu

Keyword(s):

Data Mining ◽

Information And Communication Technology ◽

Association Rules ◽

Communication Technology ◽

Speech Therapy ◽

Proper Treatment ◽

Speech Impairments ◽

Information And Communication ◽

Specific Education

Speech therapy can be viewed as a business in the logopaedic area that aims to offer services for correcting language. Proper treatment of speech impairments improves the efficiency of therapy; to achieve it, a therapist must continuously learn how to adjust therapy methods to the patient's characteristics. The use of Information and Communication Technology in this area has allowed a large amount of data to be collected on various aspects of treatment. These data can be used in a data mining process to find useful and usable patterns and models that help therapists improve their specific education. Clustering, classification or association rules can provide unexpected information that helps to complete the therapist's knowledge and to adapt the therapy to the patient's needs.

Research on Classification Algorithm in Data Mining

2017 3rd International Conference on Environment, Biology, Medicine and Computer Applications (ICEBMCA 2017) ◽

10.25236/icebmca.2017.23 ◽

2017 ◽

Keyword(s):

Data Mining ◽

Classification Algorithm
