Class Probability Estimation
Recently Published Documents

TOTAL DOCUMENTS: 15 (FIVE YEARS: 1)
H-INDEX: 7 (FIVE YEARS: 0)

2012 ◽ Vol 26 ◽ pp. 239-245
Author(s): Liangxiao Jiang ◽ Zhihua Cai ◽ Dianhong Wang ◽ Harry Zhang

Author(s): Han Liang ◽ Yuhong Yan ◽ Harry Zhang

In machine learning and data mining, traditional learning models aim for high classification accuracy. In many practical applications, however, such as medical diagnosis, accurate class probability prediction is more desirable than classification accuracy. Although it is known that decision trees can be adapted into class probability estimators in a variety of ways, and the resulting models are uniformly called Probability Estimation Trees (PETs), the performance of these PETs in class probability estimation has not yet been investigated. We begin by empirically studying PETs in terms of class probability estimation, measured by Log Conditional Likelihood (LCL). We also compare a PET called C4.4 with other representative models, including Naïve Bayes, Naïve Bayes Tree, Bayesian Network, KNN and SVM, in terms of LCL. From our experiments, we draw several valuable conclusions. First, among the tree-based models, C4.4 yields the most precise class probability predictions as measured by LCL; we provide an explanation for this and reveal the nature of LCL. Second, C4.4 also performs best compared with the non-tree-based models. Finally, LCL does not dominate another well-established relevant metric, AUC, which suggests that different decision-tree learning models should be used for different objectives. Our experiments are conducted on 36 UCI sample sets, and all models are run within a machine learning platform, Weka. We also explore an approach to improving the class probability estimation of Naïve Bayes Tree. We propose a greedy and recursive learning algorithm in which, at each step, LCL is used as the scoring function to expand the decision tree. The algorithm uses Naïve Bayes models created at the leaves to estimate the class probabilities of test samples, so the whole tree encodes the posterior class probability in its structure. One benefit of improving class probability estimation is that both classification accuracy and AUC can potentially be improved as well.
We call the new model LCL Tree (LCLT). Our experiments on 33 UCI sample sets show that LCLT significantly outperforms state-of-the-art learning models, such as Naïve Bayes Tree, in accurate class probability prediction measured by LCL, as well as in classification accuracy and AUC.
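The abstract's evaluation metric, Log Conditional Likelihood (LCL), is the sum over test instances of the log-probability the model assigns to each instance's true class. The paper gives no code; the sketch below is a minimal, hypothetical illustration of that definition, assuming per-instance probability distributions are represented as dicts from class label to estimated probability (the names `log_conditional_likelihood`, `probs`, and `eps` are my own).

```python
import math

def log_conditional_likelihood(probs, labels, eps=1e-9):
    """LCL: sum of log P(true class | x) over all test instances.

    probs  -- list of dicts mapping class label -> estimated probability
              (a hypothetical representation; any per-instance
              distribution works)
    labels -- list of true class labels, aligned with probs
    eps    -- floor to avoid log(0) when a model assigns zero probability
    """
    total = 0.0
    for dist, y in zip(probs, labels):
        # Clip the estimate at eps so a confidently wrong model is
        # penalised heavily but finitely.
        total += math.log(max(dist.get(y, 0.0), eps))
    return total

# Example: two test instances; the first is predicted well, the second less so.
probs = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]
labels = ["a", "b"]
lcl = log_conditional_likelihood(probs, labels)
```

LCL is always non-positive and is maximised (at 0) only when the model assigns probability 1 to every true class, which is why it rewards calibrated probability estimates rather than mere classification accuracy.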


Author(s): Liangxiao Jiang ◽ Chaoqun Li ◽ Zhihua Cai

Traditionally, the performance of a classifier is measured by its classification accuracy or error rate. In fact, probability-based classifiers also produce class probability estimates (the probability that a test instance belongs to the predicted class). This information is often ignored in classification as long as the class with the highest estimated probability is identical to the actual class. In many data mining applications, however, classification accuracy and error rate are not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihoods (class probabilities) of buying certain products. Thus, accurate class probability estimates are often required to make optimal decisions. In this paper, we first review some state-of-the-art probability-based classifiers and empirically investigate their class probability estimation performance. From our experimental results, we conclude that C4.4 is an attractive algorithm for class probability estimation. We then present a locally weighted version of C4.4 that improves its class probability estimation by combining locally weighted learning with C4.4. We call the improved algorithm locally weighted C4.4, or simply LWC4.4. We test LWC4.4 experimentally on the 36 UCI data sets selected by Weka. The experimental results show that LWC4.4 significantly outperforms C4.4 in terms of class probability estimation.
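Locally weighted learning, as combined with C4.4 here, builds a separate model for each test instance from a distance-weighted neighbourhood of the training data. The paper provides no code, and the sketch below is only a loose illustration of the locally weighted step: where LWC4.4 would grow a C4.4 tree (unpruned, with Laplace correction at the leaves) on the weighted neighbourhood, this stand-in degenerates the local "tree" to a Laplace-corrected weighted class frequency. The function name, the linear kernel, and the parameter `k` are all my own assumptions.

```python
import math
from collections import Counter

def lw_class_probs(train_X, train_y, x, k=5):
    """Locally weighted class-probability sketch (a stand-in for LWC4.4).

    train_X -- list of numeric feature tuples
    train_y -- list of class labels, aligned with train_X
    x       -- the test instance's feature tuple
    k       -- neighbourhood size (hypothetical default)
    """
    # Rank training instances by Euclidean distance to the test point
    # and keep the k nearest.
    nearest = sorted(
        (math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y)
    )[:k]
    # Linear weighting: closer neighbours get larger weights
    # (a hypothetical kernel; LWC4.4's actual weighting may differ).
    d_max = nearest[-1][0] or 1.0
    weights = Counter()
    for d, yi in nearest:
        weights[yi] += 1.0 - d / (2 * d_max)
    # Laplace correction over all classes, as C4.4 applies at its leaves,
    # so no class ever gets probability exactly 0.
    classes = set(train_y)
    total = sum(weights.values())
    return {c: (weights[c] + 1) / (total + len(classes)) for c in classes}

# Example: two well-separated clusters; a query near the first cluster
# should receive most of the probability mass for class "a".
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]
local_probs = lw_class_probs(train_X, train_y, (0, 0), k=3)
```

The design point the abstract relies on is that the local model is fitted lazily, per query, so the probability estimate adapts to the region of feature space around each test instance.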

